How can I find the right neural network architecture

nitin_ch - 2021-07-15T09:54:58+00:00
Question: How can I find the right neural network architecture

  I am trying to learn how to use neural networks to fit functions. I have read a little about the subject, but I am still not sure how to find the right architecture (the number of neurons in the hidden layer). I use networks with 1 hidden layer, and my training algorithms are 'trainlm' and 'trainbr'. Currently I am aware of 4 problems that can occur:

  + Algorithm reaches a local minimum: the best training performance (tr.best_perf) is too large?
  + Overfitting: the best validation performance (tr.best_vperf) is much larger than the best training performance (tr.best_perf)?
  + Underfitting: the best validation performance (tr.best_vperf), the best training performance (tr.best_perf) and the best test performance (tr.best_tperf) are of similar size, but they are all still too large.
  + Extrapolating: the best test error (tr.best_tperf) is much larger than the other two.

  Currently, I have a loop that examines networks with 1 to 50 neurons. Each network size (e.g. 20 neurons) is trained 10 times, and the run with the lowest training performance (tr.best_perf) is chosen in order to avoid a local minimum. Afterwards, I store tr.best_tperf, tr.best_vperf and tr.best_perf of that network in an array. Finally, I compare those 50 networks to each other and take the one with the lowest error, with error = max([tr.best_tperf, tr.best_vperf, tr.best_perf]).

  The other way to go would be to train each network size (e.g. 20 neurons) 10 times and choose the run with the lowest error, with error = max([tr.best_tperf, tr.best_vperf, tr.best_perf]). Then I store this error for each network size in a vector. Finally, I choose the network with the lowest element of that vector.

  Can someone tell me which way is the correct way? I really appreciate any help you can provide.

Expert Answer

Prashant Kumar answered on 2025-11-20

Search the NEWSREADER and ANSWERS using
 
 
   fitnet Hmin Hmax Ntrials

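The goal is to minimize the number of hidden nodes subject to an upper bound on the training subset error MSEtrn:

 MSEtrn <= 0.01*mean(var(targettrn',1))

 MSEtrn <= 0.01*var(targettrn,1)          for a 1-dim target

This yields a training subset Rsquaretrn of at least 0.99, because Rsquaretrn = 1 - MSEtrn/mean(var(targettrn',1)).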
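
As a minimal sketch of how the bound and the resulting Rsquaretrn relate, the snippet below uses a small made-up target matrix. The values of `targettrn` and the name `MSEref` are assumptions for illustration only; they do not come from the posts referred to above.

```matlab
% Hypothetical training targets, O x Ntrn (2 outputs, 5 examples)
targettrn = [1 2 3 4 5; 2 4 6 8 10];

% Reference error of the naive constant model (output = target mean).
% var(...,1) normalizes by N instead of N-1; the transpose makes var
% work along the examples for each output row.
MSEref = mean(var(targettrn', 1));

% Training goal: leave at most 1% of the target variance unexplained
MSEgoal = 0.01 * MSEref;

% A net that just reaches the goal has Rsquaretrn = 0.99
MSEtrn     = MSEgoal;
Rsquaretrn = 1 - MSEtrn / MSEref;
```

With `fitnet`, a goal of this kind would typically be passed through `net.trainParam.goal`.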
 
Many of the posts don't have the training subset subscript trn and/or may have used t instead of target. So, there are probably a jillion variations posted including
 MSEgoal = 0.01*vart1

The best way I have found to obtain relatively unbiased results is to use 2 loops.

 1. Outer loop over the number of hidden nodes, Hmin:dH:Hmax,
    with Hmax <= Hub, the upper bound for not
    having more unknown weights, Nw, than training
    equations, Ntrneq.

 2. Inner loop over Ntrials >= 10 different
    random distributions of initial weights.
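
The upper bound Hub can be sketched as follows. This assumes a standard fully connected net with one hidden layer and bias terms, so that Nw = (I+1)*H + (H+1)*O for I inputs, H hidden nodes and O outputs; the problem sizes are made up for illustration.

```matlab
% Hypothetical problem sizes, for illustration only
I    = 3;     % input dimension
O    = 1;     % output dimension
Ntrn = 70;    % number of training examples

Ntrneq = Ntrn * O;                           % number of training equations

% Solving (I+1)*H + (H+1)*O <= Ntrneq for H gives the bound
Hub = floor((Ntrneq - O) / (I + O + 1));

% Check the weight count at the bound and one node above it
Nw = @(H) (I + 1) * H + (H + 1) * O;
fprintf('Hub = %d, Nw(Hub) = %d, Nw(Hub+1) = %d, Ntrneq = %d\n', ...
        Hub, Nw(Hub), Nw(Hub + 1), Ntrneq);
```

Any Hmax up to Hub keeps the number of unknown weights at or below the number of training equations.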
Nets are initially ranked by their validation subset performance. Then unbiased estimates of performance are obtained from the test subset performance.
However, I usually rank the nets by their combined nontraining validation AND test subset performance.
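
A rough sketch of the double loop with ranking by validation error is given below. It is not a copy of any posted example: the data set (`simplefit_dataset` from the toolbox), the search range and the variable names `Hvec`, `perfVal` and `perfTst` are assumptions, and the code needs the Deep Learning Toolbox to run.

```matlab
% Sketch only. x is I x N, t is O x N (toolbox example data).
[x, t] = simplefit_dataset;
[I, N] = size(x);
O      = size(t, 1);

% Training set size under the default 70/15/15 dividerand split
Ntrn = N - 2 * round(0.15 * N);
Hub  = floor((Ntrn * O - O) / (I + O + 1));   % keeps Nw <= Ntrneq

Hmin = 1;  dH = 1;  Hmax = min(10, Hub);      % hypothetical search range
Ntrials = 10;
Hvec    = Hmin:dH:Hmax;

perfVal = zeros(Ntrials, numel(Hvec));        % validation MSE per (trial, H)
perfTst = zeros(Ntrials, numel(Hvec));        % test MSE per (trial, H)

rng(0);                                       % reproducible random numbers
for j = 1:numel(Hvec)                         % outer loop: hidden nodes
    for i = 1:Ntrials                         % inner loop: initial weights
        net = fitnet(Hvec(j), 'trainlm');     % new net, new random weights
        net.trainParam.showWindow = false;    % suppress the training GUI
        [net, tr] = train(net, x, t);
        perfVal(i, j) = tr.best_vperf;
        perfTst(i, j) = tr.best_tperf;
    end
end

% Rank by validation error, then read the winner's test error
[~, k] = min(perfVal(:));
[ibest, jbest] = ind2sub(size(perfVal), k);
fprintf('H = %d, trial %d: val MSE = %g, test MSE = %g\n', ...
        Hvec(jbest), ibest, perfVal(k), perfTst(k));
```

Because the test subset plays no part in the ranking here, the winner's test MSE can be read as the performance estimate. Ranking on a combination of validation and test error would mean the test subset is no longer independent of the selection.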
Again, I have jillions of examples posted in the NEWSREADER and ANSWERS. The best search words are probably
        Hmin Hmax Ntrials

