java - Annealing on a multi-layered neural network: XOR experiments


I'm a beginner in this concept and have tried to train a feed-forward type neural network (topology of 2x2x1):

bias and weight range of each neuron    XOR test inputs ----> output
[-1,1]                                  1,1 ----> 0.9
                                        1,0 ----> 0.8
                                        0,1 ----> -0.1
                                        0,0 ----> 0.1
[-10,10]                                1,1 ----> 0.24
                                        1,0 ----> 0.67
                                        0,1 ----> -0.54
                                        0,0 ----> 0.10
[-4,4]                                  1,1 ----> -0.02
                                        1,0 ----> 0.80
                                        0,1 ----> 0.87
                                        0,0 ----> -0.09

So the range of [-4,4] seems better than the others.

Question: is there a way to find proper limits for the weights and biases relative to the temperature limits and the temperature decrease rate?

Note: I'm trying two approaches here. The first randomizes all weights and biases at once on each trial. The second randomizes a single weight and a single bias on each trial. (There are 50 iterations before the temperature is decreased.) The single-weight change gives worse results.
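The two strategies in the note can be sketched as follows. This is only an illustration, not the asker's code: the class and method names are made up, the uniform step in [-t, +t) follows the weight update given in the question, and clamping to a `limit` is an assumption about how the weight/bias range is enforced.

```java
import java.util.Random;

// Hypothetical helper; names and the clamping behaviour are assumptions.
final class Perturbation {
    // Uniform step in [-t, +t), then clamp the result to [-limit, +limit].
    static double step(double value, double t, double limit, Random rng) {
        double candidate = value + (rng.nextDouble() * 2.0 - 1.0) * t;
        return Math.max(-limit, Math.min(limit, candidate));
    }

    // Strategy 1: move every parameter in the same trial.
    static double[] perturbAll(double[] params, double t, double limit, Random rng) {
        double[] out = params.clone();
        for (int i = 0; i < out.length; i++) {
            out[i] = step(out[i], t, limit, rng);
        }
        return out;
    }

    // Strategy 2: move one randomly chosen parameter per trial.
    static double[] perturbOne(double[] params, double t, double limit, Random rng) {
        double[] out = params.clone();
        int i = rng.nextInt(out.length);
        out[i] = step(out[i], t, limit, rng);
        return out;
    }
}
```

To mirror "a single weight and a single bias", `perturbOne` would be called once on the weight array and once on the bias array. Both methods return a copy, so a rejected candidate can simply be discarded.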

(n+1) is the next value, (n) is the value before.
tempmax = 2.0
tempmin = 0.1 -----> as it approaches zero, the error of the XOR output approaches 0
temp(n+1) = temp(n) / 1.001
weight update: w(n+1) = w(n) + (float)(Math.random()*t*2.0f - t*1.0f); // t is the temperature
(same for the bias update)
iterations per temperature = 50
random numbers: Java's Math.random() method (are its spectral properties appropriate for annealing?)
transition probability: (1.0f/(1.0f + Math.exp(((candidate state error) - (old error))/temp)))
neuron activation function: Math.tanh()
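As a rough check of what these settings imply, here is a minimal sketch of the acceptance rule and the cooling schedule listed above. The class and method names are hypothetical; the formulas are the ones from the question.

```java
// Hypothetical helper; only the two formulas come from the question.
final class AnnealingSchedule {
    // Logistic transition probability: 1 / (1 + exp((candidateError - oldError) / temp)).
    static double acceptProbability(double oldError, double candidateError, double temp) {
        return 1.0 / (1.0 + Math.exp((candidateError - oldError) / temp));
    }

    // Counts the temperature levels visited when cooling by temp(n+1) = temp(n) / divisor.
    static int coolingLevels(double tempMax, double tempMin, double divisor) {
        int levels = 0;
        for (double t = tempMax; t > tempMin; t /= divisor) {
            levels++;
        }
        return levels;
    }
}
```

Two things follow from the formulas themselves. A candidate with exactly the same error is accepted with probability 0.5, and a worse candidate is never accepted with probability above 0.5, however high the temperature. Cooling from 2.0 to 0.1 with a divisor of 1.001 passes through about 3,000 temperature levels, so at 50 iterations per level a run is roughly 150,000 trials.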

I tried many times and the results are the same. Is reannealing a solution for evading deeper local minima?

I need a suitable weight/bias range/limit according to the total number of neurons, the number of layers, and the starting/ending temperature. A 3x6x5x6x1 network can count a 3-bit input and give the output, and it can approximate a step function, but I always need to play with the ranges.

For this training data set, the output error is too big (193 data points, 2 inputs, 1 output):

193 2 1 0.499995 0.653846 1 0.544418 0.481604 1 0.620200 0.320118 1 0.595191 0.404816 0 0.404809 0.595184 1 0.171310 0.636142 0 0.014323 0.403392 0 0.617884 0.476556 0 0.391548 0.478424 1 0.455912 0.721618 0 0.615385 0.500005 0 0.268835 0.268827 0 0.812761 0.187243 0 0.076923 0.499997 1 0.769231 0.500006 0 0.650862 0.864223 0 0.799812 0.299678 1 0.328106 0.614848 0 0.591985 0.722088 0 0.692308 0.500005 1 0.899757 0.334418 0 0.484058 0.419839 1 0.200188 0.700322 0 0.863769 0.256940 0 0.384615 0.499995 1 0.457562 0.508439 0 0.515942 0.580161 0 0.844219 0.431535 1 0.456027 0.529379 0 0.235571 0.104252 0 0.260149 0.400644 1 0.500003 0.423077 1 0.544088 0.278382 1 0.597716 0.540480 0 0.562549 0.651021 1 0.574101 0.127491 1 0.545953 0.731052 0 0.649585 0.350424 1 0.607934 0.427886 0 0.499995 0.807692 1 0.437451 0.348979 0 0.382116 0.523444 1 1 0.500000 1 0.731165 0.731173 1 0.500002 0.038462 0 0.683896 0.536585 1 0.910232 0.581604 0 0.499998 0.961538 1 0.903742 0.769772 1 0.543973 0.470621 1 0.593481 0.639914 1 0.240659 0.448408 1 0.425899 0.872509 0 0 0.500000 0 0.500006 0.269231 1 0.155781 0.568465 0 0.096258 0.230228 0 0.583945 0.556095 0 0.550746 0.575954 0 0.680302 0.935290 1 0.693329 0.461550 1 0.500005 0.192308 0 0.230769 0.499994 1 0.721691 0.831791 0 0.621423 0.793156 1 0.735853 0.342415 0 0.402284 0.459520 1 0.589105 0.052045 0 0.189081 0.371208 0 0.533114 0.579952 0 0.251594 0.871762 1 0.764429 0.895748 1 0.499994 0.730769 0 0.415362 0.704317 0 0.422537 0.615923 1 0.337064 0.743842 1 0.560960 0.806496 1 0.810919 0.628792 1 0.319698 0.064710 0 0.757622 0.393295 0 0.577463 0.384077 0 0.349138 0.135777 1 0.165214 0.433402 0 0.241631 0.758362 0 0.118012 0.341772 1 0.514072 0.429271 1 0.676772 0.676781 0 0.294328 0.807801 0 0.153846 0.499995 0 0.500005 0.346154 0 0.307692 0.499995 0 0.615487 0.452168 0 0.466886 0.420048 1 0.440905 0.797064 1 0.485928 0.570729 0 0.470919 0.646174 1 0.224179 0.315696 0 0.439040 0.193504 0 0.408015 0.277912 1 0.316104 0.463415 0 
0.278309 0.168209 1 0.214440 0.214435 1 0.089768 0.418396 1 0.678953 0.767832 1 0.080336 0.583473 1 0.363783 0.296127 1 0.474240 0.562183 0 0.313445 0.577267 0 0.416055 0.443905 1 0.529081 0.353826 0 0.953056 0.687662 1 0.534725 0.448035 1 0.469053 0.344394 0 0.759341 0.551592 0 0.705672 0.192199 1 0.385925 0.775385 1 0.590978 0.957385 1 0.406519 0.360086 0 0.409022 0.042615 0 0.264147 0.657585 1 0.758369 0.241638 1 0.622380 0.622388 1 0.321047 0.232168 0 0.739851 0.599356 0 0.555199 0.366750 0 0.608452 0.521576 0 0.352098 0.401168 0 0.530947 0.655606 1 0.160045 0.160044 0 0.455582 0.518396 0 0.881988 0.658228 0 0.643511 0.153547 1 0.499997 0.576923 0 0.575968 0.881942 0 0.923077 0.500003 0 0.449254 0.424046 1 0.839782 0.727039 0 0.647902 0.598832 1 0.444801 0.633250 1 0.392066 0.572114 1 0.242378 0.606705 1 0.136231 0.743060 1 0.711862 0.641568 0 0.834786 0.566598 1 0.846154 0.500005 1 0.538462 0.500002 1 0.379800 0.679882 0 0.584638 0.295683 1 0.459204 0.540793 0 0.331216 0.430082 0 0.672945 0.082478 0 0.671894 0.385152 1 0.046944 0.312338 0 0.499995 0.884615 0 0.542438 0.491561 1 0.540796 0.459207 1 0.828690 0.363858 1 0.785560 0.785565 0 0.686555 0.422733 1 0.231226 0.553456 1 0.465275 0.551965 0 0.378577 0.206844 0 0.567988 0.567994 0 0.668784 0.569918 1 0.384513 0.547832 1 0.288138 0.358432 1 0.432012 0.432006 1 0.424032 0.118058 1 0.296023 0.703969 1 0.525760 0.437817 1 0.748406 0.128238 0 0.775821 0.684304 1 0.919664 0.416527 0 0.327055 0.917522 1 0.985677 0.596608 1 0.356489 0.846453 0 0.500005 0.115385 1 0.377620 0.377612 0 0.559095 0.202936 0 0.410895 0.947955 1 0.187239 0.812757 1 0.768774 0.446544 0 0.614075 0.224615 0 0.350415 0.649576 0 0.160218 0.272961 1 0.454047 0.268948 1 0.306671 0.538450 0 0.323228 0.323219 1 0.839955 0.839956 1 0.636217 0.703873 0 0.703977 0.296031 0 0.662936 0.256158 0 0.100243 0.665582 1

I highly doubt that any strict rules exist for this problem. First of all, the limits/bounds of the weights depend strictly on the input data representation, the activation functions, the number of neurons, and the output function. What you can rely on here are rules of thumb, in the best possible scenario.

First, let's consider the initial weight values in classical algorithms. The basic idea of weight scaling is to use values in the range of [-1,1] for small layers, and for large ones to divide by the square root of the number of units in the larger layer. More sophisticated methods are described by Bishop (1995). From such a rule of thumb we can deduce that a reasonable range (which is an order of magnitude bigger than the initial guess) would be something in the form of [-10,10]/sqrt(neurons_count_in_the_lower_layer).
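A minimal sketch of that rule of thumb in Java. The class and method names are hypothetical, and "lower layer" is read here as the layer that feeds the weights in question, which is an assumption.

```java
// Hypothetical helper for the [-scale, scale] / sqrt(lower layer size) rule of thumb.
final class WeightRange {
    // Symmetric bound for weights fed by a layer with lowerLayerUnits units.
    static double bound(double scale, int lowerLayerUnits) {
        if (lowerLayerUnits <= 0) {
            throw new IllegalArgumentException("layer must have at least one unit");
        }
        return scale / Math.sqrt(lowerLayerUnits);
    }

    // One bound per weight layer of a topology such as {2, 2, 1}.
    static double[] boundsFor(int[] topology, double scale) {
        double[] bounds = new double[topology.length - 1];
        for (int i = 1; i < topology.length; i++) {
            bounds[i - 1] = bound(scale, topology[i - 1]);
        }
        return bounds;
    }
}
```

For the 2x2x1 network with a scale of 10, both weight layers are fed by 2 units, so the bound is 10/sqrt(2), about ±7.07 for each. This is only a starting point for experiments, not a guarantee.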

Unfortunately, to the best of my knowledge, the temperature choice is much more complex, as it is a data-dependent factor rather than a topology-based one. In some papers there have been suggestions for values for specific time series prediction tasks, but nothing general. For simulated annealing "in general" (not applied to NN training), many heuristic choices have been proposed, e.g.:

If we know the maximum distance (cost function difference) between one neighbour and another, then we can use this information to calculate a starting temperature. Another method, suggested in (13. Rayward-Smith, V.J., Osman, I.H., Reeves, C.R., Smith, G.D. 1996. Modern Heuristic Search Methods. John Wiley & Sons.), is to start with a very high temperature and cool it rapidly until about 60% of worst solutions are being accepted. This forms the real starting temperature, and it can now be cooled more slowly. A similar idea, suggested in (5. Dowsland, K.A. 1995. Simulated Annealing. In Modern Heuristic Techniques for Combinatorial Problems (ed. Reeves, C.R.), McGraw-Hill, 1995), is to rapidly heat the system until a certain proportion of worse solutions are accepted, and then slow cooling can start. This can be seen to be similar to how physical annealing works, in that the material is heated until it is liquid and then cooling begins (i.e. once the material is a liquid, it is pointless carrying on heating it). [from lecture notes of the University of Nottingham]
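The "heat until a given proportion of worse solutions is accepted" idea can be sketched like this. Note the substitution: the sketch uses the standard Metropolis rule exp(-delta/T) instead of the logistic rule from the question, because the logistic rule never accepts a worse move with probability above 0.5, so a 60% target could not be reached with it. The class name, method names, and the idea of sampling error increases from random neighbour moves beforehand are assumptions.

```java
// Hypothetical calibration helper; uses Metropolis acceptance, not the question's rule.
final class StartTemperature {
    // Mean probability of accepting the sampled worsening moves at temperature temp.
    static double meanAcceptance(double[] worseDeltas, double temp) {
        double sum = 0.0;
        for (double delta : worseDeltas) {
            sum += Math.exp(-delta / temp);
        }
        return sum / worseDeltas.length;
    }

    // Multiplies the temperature by factor until the target fraction is accepted.
    static double heatUntil(double[] worseDeltas, double target, double startTemp, double factor) {
        if (worseDeltas.length == 0 || startTemp <= 0.0 || factor <= 1.0
                || target <= 0.0 || target >= 1.0) {
            throw new IllegalArgumentException("invalid calibration arguments");
        }
        double temp = startTemp;
        while (meanAcceptance(worseDeltas, temp) < target) {
            temp *= factor;
        }
        return temp;
    }
}
```

Here `worseDeltas` would hold positive error increases measured on a batch of random trial moves from the initial network. The returned temperature is then used as the start of the slow cooling phase.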

But the choice of the best one for your application has to be based on numerous tests, as with most things in machine learning. If you are dealing with a problem where you are really concerned about having the best-trained neural network, it seems reasonable to take an interest in Extreme Learning Machines (ELM), where neural network training is conducted as a global optimization procedure, which guarantees the best possible solution (under the regularized cost function used). Simulated annealing, as an iterative, greedy process (like backpropagation), cannot guarantee anything; there are only heuristics and rules of thumb.
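To make the ELM remark concrete, here is a minimal sketch under stated assumptions: the hidden weights and biases are drawn randomly once and never trained, and only the output weights are computed, as the ridge-regularized least-squares solution beta = (H^T H + lambda I)^-1 H^T y. The class name, method names, and the plain Gaussian elimination solver are illustrative choices, not taken from any library. The optimality applies to the output weights for the given fixed hidden layer.

```java
// Hypothetical ELM sketch: fixed random hidden layer, closed-form output weights.
final class ElmSketch {
    // Solves A x = b by Gaussian elimination with partial pivoting.
    static double[] solve(double[][] a, double[] b) {
        int n = b.length;
        double[][] m = new double[n][n + 1];
        for (int i = 0; i < n; i++) {
            System.arraycopy(a[i], 0, m[i], 0, n);
            m[i][n] = b[i];
        }
        for (int col = 0; col < n; col++) {
            int pivot = col;
            for (int r = col + 1; r < n; r++) {
                if (Math.abs(m[r][col]) > Math.abs(m[pivot][col])) pivot = r;
            }
            double[] tmp = m[col]; m[col] = m[pivot]; m[pivot] = tmp;
            for (int r = col + 1; r < n; r++) {
                double f = m[r][col] / m[col][col];
                for (int c = col; c <= n; c++) m[r][c] -= f * m[col][c];
            }
        }
        double[] x = new double[n];
        for (int i = n - 1; i >= 0; i--) {
            double s = m[i][n];
            for (int c = i + 1; c < n; c++) s -= m[i][c] * x[c];
            x[i] = s / m[i][i];
        }
        return x;
    }

    // Hidden activations tanh(w x + bias), plus a constant 1 acting as output bias.
    static double[] hidden(double[][] w, double[] bias, double[] x) {
        double[] h = new double[w.length + 1];
        for (int j = 0; j < w.length; j++) {
            double s = bias[j];
            for (int k = 0; k < x.length; k++) s += w[j][k] * x[k];
            h[j] = Math.tanh(s);
        }
        h[w.length] = 1.0;
        return h;
    }

    // Output weights from the regularized normal equations; rows of hs are hidden vectors.
    static double[] fitOutput(double[][] hs, double[] y, double lambda) {
        int n = hs[0].length;
        double[][] a = new double[n][n];
        double[] b = new double[n];
        for (int s = 0; s < hs.length; s++) {
            for (int i = 0; i < n; i++) {
                b[i] += hs[s][i] * y[s];
                for (int j = 0; j < n; j++) a[i][j] += hs[s][i] * hs[s][j];
            }
        }
        for (int i = 0; i < n; i++) a[i][i] += lambda;
        return solve(a, b);
    }
}
```

In use, `hidden` is applied to every training input to build the rows of `hs`, `fitOutput` is called once with a small positive `lambda`, and the network output for a new input is the dot product of the returned weights with its hidden vector.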

