Neural networks - CUDAKernel/setConstantMemory - the data supplied is too big for constant 'hintsD'

On R2015a with Parallel Computing Toolbox and Neural Network Toolbox. Using the following code with GPU Nvidia GeForce GTX 980 Ti:

net1 = feedforwardnet(20);
net1.trainFcn = 'trainscg';
x = inputs(1:4284,2:2000)'; % if I reduce this to 2:1900, it will work
t = double(targets'); % casting to double for GPU
t = t(:,1:4284);
% preparing for GPU
xg = nndata2gpu(x);
tg = nndata2gpu(t);
net1.input.processFcns = {'mapminmax'};
net1.output.processFcns = {'mapminmax'};
net2 = configure(net1,x,t); % Configure with MATLAB arrays
net2 = train(net2,xg,tg);

As you can see, this is not a big dataset. When I run this, it generates this error:

Error using parallel.gpu.CUDAKernel/setConstantMemory
The data supplied is too big for constant 'hintsD'.
Error in nnGPU.codeHints (line 33)
setConstantMemory(hints.yKernel,'hintsD',hints.double);
Error in nncalc.setup2 (line 13)
calcHints = calcMode.codeHints(calcHints);
Error in nncalc.setup (line 17)
[calcLib,calcNet] = nncalc.setup2(calcMode,calcNet,calcData,calcHints);
Error in network/train (line 357)
[calcLib,calcNet,net,resourceText] = nncalc.setup(calcMode,net,data);

gpuDevice is showing this:

Name: 'GeForce GTX 980 Ti'
Index: 1
ComputeCapability: '5.2'
SupportsDouble: 1
DriverVersion: 8
ToolkitVersion: 6.5000
MaxThreadsPerBlock: 1024
MaxShmemPerBlock: 49152
MaxThreadBlockSize: [1024 1024 64]
MaxGridSize: [2.1475e+09 65535 65535]
SIMDWidth: 32
TotalMemory: 6.4425e+09
AvailableMemory: 5.1520e+09
MultiprocessorCount: 22
ClockRateKHz: 1139500
ComputeMode: 'Default'
GPUOverlapsTransfers: 1
KernelExecutionTimeout: 1
CanMapHostMemory: 1
DeviceSupported: 1
DeviceSelected: 1

As noted in the code above, if I reduce x marginally, it will run. I don't understand why data of this size would generate a memory error? Am I missing a step in preparing this for GPU?
Kshitij Singh answered.
2025-11-20
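Instead of converting the data with nndata2gpu and configuring the network by hand, pass the plain MATLAB arrays to train and request GPU execution with the 'useGPU' option:

net1 = feedforwardnet(20);                    % one hidden layer with 20 neurons
net1.trainFcn = 'trainscg';                   % scaled conjugate gradient training
x = inputs(1:4284,2:2000)';                   % 1999 input elements x 4284 samples
t = double(targets'); % casting to double for GPU
t = t(:,1:4284);                              % keep the same 4284 samples as x
net1.input.processFcns = {'mapminmax'};
net1.output.processFcns = {'mapminmax'};
% train configures the network and moves the data to the GPU itself
net1 = train(net1,x,t,'useGPU','yes');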
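
On the question of why a dataset this small can trigger the error: the message comes from setConstantMemory, which writes to CUDA constant memory, not to the 6 GB of main memory listed under TotalMemory. CUDA devices provide 64 KB of constant memory in total, and the 'hintsD' symbol inside the toolbox kernel may be declared smaller than that. The sketch below only works through the arithmetic for the 64 KB ceiling. That the size of hints.double grows with the number of input elements is an assumption, inferred from the observation that 2:1900 trains and 2:2000 does not; the actual layout of the hints array is internal to the toolbox and is not shown in this thread.

% CUDA constant memory is 64 KB per device. This limit is independent of
% the TotalMemory / AvailableMemory values reported by gpuDevice.
constantBytes  = 64 * 1024;                   % 65536 bytes
bytesPerDouble = 8;
maxDoubles = constantBytes / bytesPerDouble;  % most doubles that could ever fit

% Input counts taken from the question (columns of 'inputs' become rows of x).
nWorking = numel(2:1900);                     % input elements that trained
nFailing = numel(2:2000);                     % input elements that raised the error

fprintf('Constant memory holds at most %d doubles.\n', maxDoubles);
fprintf('Inputs that trained: %d, inputs that failed: %d\n', nWorking, nFailing);

If the assumption holds, the limit depends on the number of input elements (rows of x) rather than on the number of samples, which would explain why trimming a hundred columns from inputs is enough to get past the error while the 4284 samples are not the issue.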