Function approximation: Neural network great 'on paper' but when simulated, results are very bad?

Question

I need some help with NN because I don't understand what happened. One hidden layer, I=4, H=1:20, O=1. I run each net architecture 10 times with different initial weights (left at the default initnw). I have 34 datasets in total, which were divided 60/20/20 when using the Levenberg-Marquardt algorithm. Mse_goal = 0.01*mean(var(t',1)); I calculate NMSE and R^2, choose the best R^2, and for that net check the performance of each subsample, the regression plots, and the RMSE. R^2 is usually around 0.95, and R for each subset around 0.98... But when I simulate the network with a completely new set of data, the estimates deviate quite a lot. It is not because of extrapolation. Data are normalized with mapminmax; the transfer functions are tansig and purelin.

Trainbr was actually my first choice, since I have a small dataset and trainbr doesn't need a validation set (MATLAB R2015a), but it is awfully slow. I ran a net with trainbr and we are talking hours versus minutes with trainlm.

I've read a ton of Greg Heath's posts and tutorials and found very valuable information there; however, still nothing. I see no way out.

% Solve an Input-Output Fitting problem with a Neural Network
% Script generated by Neural Fitting app
% Created 09-Aug-2016 18:33:13
%
% This script assumes these variables are defined:
%   MP_UA_K - input data.
%   UA_K    - target data.
close all, clear all
load varUA_K
x = MP_UA_K;
t = UA_K;

var_t = mean(var(t',1));            % target variance
[inputs,obs] = size(x);
hiddenLayerSize = 20;               % max number of neurons
numNN = 10;                         % number of training runs
neurons = [1:hiddenLayerSize]';
training_no = 1:numNN;
obs_no = 1:obs;

nets = cell(hiddenLayerSize,numNN);
trainOutputs = cell(hiddenLayerSize,numNN);
valOutputs = cell(hiddenLayerSize,numNN);
testOutputs = cell(hiddenLayerSize,numNN);
Y_all = cell(hiddenLayerSize,numNN);
performance = zeros(hiddenLayerSize,numNN);
trainPerformance = zeros(hiddenLayerSize,numNN);
valPerformance = zeros(hiddenLayerSize,numNN);
testPerformance = zeros(hiddenLayerSize,numNN);
e = zeros(numNN,obs);
e_all = cell(hiddenLayerSize,numNN);
NMSE = zeros(hiddenLayerSize,numNN);
r_train = zeros(hiddenLayerSize,numNN);
r_val = zeros(hiddenLayerSize,numNN);
r_test = zeros(hiddenLayerSize,numNN);
r = zeros(hiddenLayerSize,numNN);
Rsq = zeros(hiddenLayerSize,numNN);

for j = 1:hiddenLayerSize
    % Choose a Training Function
    % For a list of all training functions type: help nntrain
    % 'trainlm' is usually fastest.
    % 'trainbr' takes longer but may be better for challenging problems.
    % 'trainscg' uses less memory. Suitable in low memory situations.
    trainFcn = 'trainbr';  % Bayesian Regularization backpropagation.

    % Create a Fitting Network
    net = fitnet(j,trainFcn);

    % Choose Input and Output Pre/Post-Processing Functions
    % For a list of all processing functions type: help nnprocess
    net.input.processFcns = {'removeconstantrows','mapminmax'};
    net.output.processFcns = {'removeconstantrows','mapminmax'};

    % Setup Division of Data for Training, Validation, Testing
    % For a list of all data division functions type: help nndivide
    % Data are sorted by the dependent variable; roughly every third
    % point is used for testing.
    net.divideFcn = 'divideind';   % Divide data by index
    net.divideMode = 'sample';     % Divide up every sample
    net.divideParam.trainInd = [1:3:34,2:3:34];
    % net.divideParam.valInd = [5:5:30];
    net.divideParam.testInd = [3:3:34];

    mse_goal = 0.01*var_t;

    % Choose a Performance Function
    % For a list of all performance functions type: help nnperformance
    net.performFcn = 'mse';  % Mean Squared Error
    net.trainParam.goal = mse_goal;

    % Choose Plot Functions
    % For a list of all plot functions type: help nnplot
    net.plotFcns = {'plotperform','plottrainstate','ploterrhist', ...
        'plotregression','plotfit'};

    for i = 1:numNN
        % Train the Network
        net = configure(net,x,t);
        disp(['No. of hidden nodes ' num2str(j) ', Training ' ...
            num2str(i) '/' num2str(numNN)])
        [nets{j,i},tr{j,i}] = train(net,x,t);
        y = nets{j,i}(x);
        e(i,:) = gsubtract(t,y);
        e_all{j,i} = e(i,:);

        trainTargets = t .* tr{j,i}.trainMask{1};
        %valTargets  = t .* tr{j,i}.valMask{1};
        testTargets  = t .* tr{j,i}.testMask{1};

        trainPerformance(j,i) = perform(net,trainTargets,y);
        %valPerformance(j,i)  = perform(net,valTargets,y);
        testPerformance(j,i)  = perform(net,testTargets,y);
        performance(j,i) = perform(net,t,y);

        rmse_train(j,i) = sqrt(trainPerformance(j,i));
        %rmse_val(j,i)  = sqrt(valPerformance(j,i));
        rmse_test(j,i)  = sqrt(testPerformance(j,i));
        rmse(j,i) = sqrt(performance(j,i));

        % outputs of all networks
        Y_all{j,i} = y;
        trainOutputs{j,i} = y .* tr{j,i}.trainMask{1};
        %valOutputs{j,i}  = y .* tr{j,i}.valMask{1};
        testOutputs{j,i}  = y .* tr{j,i}.testMask{1};

        [r(j,i)] = regression(t,y);
        [r_train(j,i)] = regression(trainTargets,trainOutputs{j,i});
        %[r_val(j,i)]  = regression(valTargets,valOutputs{j,i});
        [r_test(j,i)]  = regression(testTargets,testOutputs{j,i});

        NMSE(j,i) = mse(e_all{j,i})/mean(var(t',1));  % normalized mse
        % coefficient of determination
        Rsq(j,i) = 1 - NMSE(j,i);
    end

    [minperf_train,I_train] = min(trainPerformance',[],1);
    minperf_train = minperf_train'; I_train = I_train';
    % [minperf_val,I_valid] = min(valPerformance',[],1);
    % minperf_val = minperf_val'; I_valid = I_valid';
    [minperf_test,I_test] = min(testPerformance',[],1);
    minperf_test = minperf_test'; I_test = I_test';
    [minperf,I_perf] = min(performance',[],1);
    minperf = minperf'; I_perf = I_perf';
    [maxRsq,I_Rsq] = max(Rsq',[],1);
    maxRsq = maxRsq'; I_Rsq = I_Rsq';
    [train_min,train_min_I] = min(minperf_train,[],1);
    % [val_min,val_min_I] = min(minperf_val,[],1);
    [test_min,test_min_I] = min(minperf_test,[],1);
    [perf_min,perf_min_I] = min(minperf,[],1);
    [Rsq_max,Rsq_max_I] = max(maxRsq,[],1);
end

figure(4)
hold on
xlabel('observation no.')
ylabel('targets')
scatter(obs_no,trainTargets,'b')
% scatter(obs_no,valTargets,'g')
scatter(obs_no,testTargets,'r')
hold off

figure(5)
hold on
xlabel('neurons')
ylabel('min. performance')
plot(neurons,minperf_train,'b',neurons,minperf_test,'r',neurons,minperf,'k')
hold off

figure(6)
hold on
xlabel('neurons')
ylabel('max Rsq')
scatter(neurons,maxRsq,'k')
hold off

% View the Network
%view(net)

% Plots
% Uncomment these lines to enable various plots.
%figure, plotperform(tr)
%figure, plottrainstate(tr)
%figure, ploterrhist(e)
%figure, plotregression(t,y)
%figure, plotfit(net,x,t)

% Deployment
% Change the (false) values to (true) to enable the following code blocks.
% See the help for each generation function for more information.
savefig(figure(4),'figure4.fig')   % was "save figure(4).fig", which saves
savefig(figure(5),'figure5.fig')   % the workspace, not the figure
savefig(figure(6),'figure6.fig')
if (false)
    % Generate MATLAB function for neural network for application
    % deployment in MATLAB scripts or with MATLAB Compiler and Builder
    % tools, or simply to examine the calculations your trained neural
    % network performs.
    genFunction(net,'nn_UA_K_BR');
    y = nn_UA_K_BR(x);
end
% save all workspace variables to a separate file for further analysis
save ws_UA_K_BR

Expert Answer

Prashant Kumar answered 2025-11-20

% I need some help with NN because I don't understand what happened.
% One hidden layer, I=4, H=1:20, O=1. I run each net architecture 10
% times with different initial weights (left default initnw). I have in
% total 34 datasets
 
 
Do you mean data points N = 34?

It typically takes ~ 10 to 30 data points per dimension to adequately
characterize a distribution. For a 4-D distribution I'd recommend

40 <~ Ntrn <~ 120

% which were divided 60/20/20 when using Levenberg-Marquardt

Ntrn = 34 - 2*round(0.2*34) = 20

Hub = (Ntrn - O)/(I + O + 1) = (20 - 1)/(4 + 1 + 1) ~ 3.2

(Hub is the largest H for which the number of weights, Nw = (I+1)*H + (H+1)*O,
does not exceed the number of training equations Ntrn*O.) This indicates you
really don't have enough data to adequately characterize a 4-D distribution.
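For reference, here is that bookkeeping as a small MATLAB sketch (a minimal illustration of the Nw <= Ntrn*O bound above; the variable names are mine):

I = 4;  O = 1;                    % inputs, outputs
N = 34;                           % total data points
Ntrn = N - 2*round(0.2*N);        % = 20 after a 60/20/20 split
Ntrneq = Ntrn*O;                  % number of training equations
% weights in an I-H-O net: Nw = (I+1)*H + (H+1)*O
% requiring Nw <= Ntrneq yields the upper bound
Hub = (Ntrneq - O)/(I + O + 1)    % = 19/6 ~ 3.2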

You should consider

 1. Dimensionality reduction
 2. k-fold crossvalidation (a sketch follows this list)
 3. Adding new data with the same mean and covariance (stdv +
    correlations) matrix
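For item 2, something like the following, reusing x (4xN) and t (1xN) from your script; a minimal sketch in which the fold count k = 5 and H = 3 are illustrative choices of mine, not prescriptions:

% k-fold crossvalidation via divideind (no validation subset)
N = size(x,2);  k = 5;  H = 3;
ind = randperm(N);                      % shuffle the sample order once
NMSEcv = zeros(k,1);
for fold = 1:k
    testInd  = ind(fold:k:end);         % held-out points for this fold
    trainInd = setdiff(ind,testInd);    % everything else trains
    net = fitnet(H,'trainlm');
    net.divideFcn = 'divideind';
    net.divideParam.trainInd = trainInd;
    net.divideParam.valInd   = [];      % no early-stopping subset
    net.divideParam.testInd  = testInd;
    net = train(net,x,t);
    ytest = net(x(:,testInd));
    NMSEcv(fold) = mse(t(:,testInd)-ytest)/mean(var(t',1));
end
mean(NMSEcv)                            % crossvalidated NMSE estimate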
% algorithm. Mse_goal = 0.01*mean(var(t',1)), I calculate NMSE and R^2,
% choose best R^2, for that check performance of each subsample, check
% regression plots, check rmse. R^2 is usually around 0.95; R for each
% subset 0.98... But when I simulate network with completely new set of
% data, estimations deviate quite a lot. It is not because of
% extrapolation.
 
 
No. It probably is. Your training data subset is insufficiently large for 4 dimensions.

I would begin with minimizing H with dividetrain (a sketch follows). Then consider k-fold crossvalidation.
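A minimal sketch of that first step, reusing x, t, and the MSE goal from your script (Hmax and the trial count are my illustrative choices):

% Smallest H that reaches the MSE goal with ALL data used for training
Hmax = 10;  Ntrials = 10;
mse_goal = 0.01*mean(var(t',1));
bestH = NaN;
for H = 1:Hmax
    for trial = 1:Ntrials               % several random initializations
        net = fitnet(H,'trainlm');
        net.divideFcn = 'dividetrain';  % no val/test split
        net.trainParam.goal = mse_goal;
        net = configure(net,x,t);       % re-randomize the weights
        net = train(net,x,t);
        if perform(net,t,net(x)) <= mse_goal
            bestH = H;                  % goal met at this size
            break
        end
    end
    if ~isnan(bestH), break, end
end
bestH                                   % NaN if no size met the goal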
% Data are normalized with mapminmax, transfer functions tansig,
% purelin.

% Trainbr was my first choice actually, since I have small dataset and
% trainbr doesn't need validation set (Matlab2015a), but it is awfully
% slow. I ran a net with trainbr and we are talking hours versus minutes
% with trainlm.
This may be a BUG. Let MATLAB know. What version are you using?
>> ver
% I've read a ton of Greg Heath's posts and tutorials and found very
% valuable information there, however, still nothing. I see no way out.
It typically takes ~ 10 to 30 data points per dimension to adequately characterize a distribution. I suggest calculating the means and stdvs of each data set to see how representative your training data is of the total 4-D distribution that includes the new datasets. 2-D or 3-D color-coded projections may be helpful.
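A minimal sketch of that comparison, assuming the new inputs are in a 4-by-Nnew matrix xnew (a name of mine, not from your script):

% Per-variable mean and stdv of the original vs. the new inputs
stats_old = [mean(x,2)    std(x,0,2)];     % 4x2: [mean stdv] per input
stats_new = [mean(xnew,2) std(xnew,0,2)];
disp([stats_old stats_new])
% 2-D color-coded projection onto the first two inputs
figure, hold on
scatter(x(1,:),x(2,:),'b')                 % original data
scatter(xnew(1,:),xnew(2,:),'r')           % new data
xlabel('input 1'), ylabel('input 2')
legend('original data','new data'), hold off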

 

