Homework "Artificial Neural Networks"
In this homework we shall construct a simple artificial neural network which will be trained to interpolate a tabulated function.
The network is an ordinary three-layer neural network with one neuron in the input layer, several neurons in the hidden layer, and one neuron in the output layer:

                synapse↘[ hidden neuron ]
                       /                 \
                 axon→/                   \w1
                     /                  w2 \
x ⟶ [identity neuron]---[hidden neuron]-----[summation neuron] ⟶ y=Fp(x)
                     \                     /
                      \                   /w3
                       \                 /
                        [ hidden neuron ]

Here the input neuron is the identity neuron: it simply sends the input, a real number x, to all hidden neurons without any modification.
The output neuron is a summation neuron: it sums the outputs of the hidden neurons and sends the result to the output.
The hidden neurons are ordinary neurons: neuron number i transforms its input signal, x, into its output signal, yi, as
yi=f((x-ai)/bi)*wi,
where f is the activation function (the same for all hidden neurons) and where ai, bi, wi are the parameters of the neuron number i.
The network response Fp(x) is then given as
Fp(x) = ∑i f((x-ai)/bi)*wi
The activation function can be
• a Gaussian wavelet, f(x)=x*exp(-x²),
• a Gaussian, f(x)=exp(-x²),
• a wavelet, f(x)=cos(5x)*exp(-x²),
• any other suitable function.
The whole network then functions as one big non-linear multi-parameter function y=Fp(x), where p={ai,bi,wi}i=1..n is the set of parameters of the network.
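The response formula above can be sketched in a few lines of C#. This is only an illustration, not the course's `ann` class: the class name `AnnSketch`, the use of a plain `double[]` instead of the course's `vector` type, and the parameter layout p = {a0,b0,w0, a1,b1,w1, ...} are all assumptions made for this example.

```csharp
using System;

// Hypothetical helper (not the course's `ann` class): evaluates Fp(x)
// for parameters stored as p = {a0,b0,w0, a1,b1,w1, ...} (assumed layout).
public static class AnnSketch {
    // Gaussian wavelet activation, f(u) = u*exp(-u^2)
    public static double GaussianWavelet(double u) {
        return u * Math.Exp(-u * u);
    }

    // Fp(x) = sum_i f((x - a_i)/b_i) * w_i
    public static double Response(Func<double,double> f, double[] p, double x) {
        if (p.Length % 3 != 0)
            throw new ArgumentException("p must hold triples (a_i, b_i, w_i)");
        double sum = 0;
        for (int i = 0; i < p.Length; i += 3) {
            double a = p[i], b = p[i + 1], w = p[i + 2];
            sum += f((x - a) / b) * w;   // output of hidden neuron i
        }
        return sum;
    }
}
```

With a single hidden neuron, p = {0,1,2}, the call `AnnSketch.Response(AnnSketch.GaussianWavelet, p, 1.0)` evaluates 2*f(1) = 2*exp(-1).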
Given the tabulated function, {xk,yk}k=1..N, the training of the network consists of tuning its parameters to minimize the cost function
C(p) = ∑k=1..N (Fp(xk) - yk)²,
that is, training is a minimization of C(p) in the space of the network parameters. This minimization should be done with your own minimization routine.
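As an illustration, the cost function could be evaluated as in the following sketch. The class name `CostSketch`, the `double[]` arrays (in place of the course's `vector`), and the parameter layout p = {a0,b0,w0, a1,b1,w1, ...} are assumptions of this example; the minimization routine itself is not shown.

```csharp
using System;

// Hypothetical helper: least-squares cost of the network on a table {xs, ys}.
// Assumed parameter layout: p = {a0,b0,w0, a1,b1,w1, ...}.
public static class CostSketch {
    // Fp(x) = sum_i f((x - a_i)/b_i) * w_i
    public static double Response(Func<double,double> f, double[] p, double x) {
        double sum = 0;
        for (int i = 0; i < p.Length; i += 3)
            sum += f((x - p[i]) / p[i + 1]) * p[i + 2];
        return sum;
    }

    // C(p) = sum_k (Fp(x_k) - y_k)^2
    public static double Cost(Func<double,double> f, double[] p,
                              double[] xs, double[] ys) {
        if (xs.Length != ys.Length)
            throw new ArgumentException("xs and ys must have the same length");
        double c = 0;
        for (int k = 0; k < xs.Length; k++) {
            double d = Response(f, p, xs[k]) - ys[k];
            c += d * d;
        }
        return c;
    }
}
```

A minimizer would then be handed the function p ↦ `CostSketch.Cost(f, p, xs, ys)` together with a starting guess for p.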
Minimization methods based on a numerical gradient might not work well here, so you might need to use the analytic gradient of the network (it is analytic, I believe, provided the activation functions are analytic).
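For reference, the analytic gradient follows from the chain rule applied to the response and cost formulas above. The shorthand u_{ik} = (x_k - a_i)/b_i is introduced here only for brevity and is not part of the original notation.

```latex
\begin{align*}
u_{ik} &= \frac{x_k - a_i}{b_i}, \qquad
F_p(x_k) = \sum_i w_i\, f(u_{ik}), \\
\frac{\partial F_p(x_k)}{\partial w_i} &= f(u_{ik}), \\
\frac{\partial F_p(x_k)}{\partial a_i} &= -\frac{w_i}{b_i}\, f'(u_{ik}), \\
\frac{\partial F_p(x_k)}{\partial b_i} &= -\frac{w_i}{b_i}\, u_{ik}\, f'(u_{ik}), \\
\frac{\partial C}{\partial p_j} &= 2 \sum_{k=1}^{N} \bigl(F_p(x_k) - y_k\bigr)\,
  \frac{\partial F_p(x_k)}{\partial p_j}.
\end{align*}
```

For the Gaussian wavelet f(u)=u*exp(-u²) the derivative entering these expressions is f'(u)=(1-2u²)*exp(-u²).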
A class to keep your network could look like this (put "public" and other access modifiers wherever needed):

class ann{
   int n; /* number of hidden neurons */
   Func<double,double> f = x => x*Exp(-x*x); /* activation function */
   vector p; /* network parameters */
   ann(int n){/* constructor */}
   double response(double x){
      /* return the response of the network to the input signal x */
   }
   void train(vector x,vector y){
      /* train the network to interpolate the given table {x,y} */
   }
}

Train your network to approximate some interesting function, for example
g(x)=Cos(5*x-1)*Exp(-x*x)
sampled at several points on [-1,1]. The Gaussian wavelets are (probably) the best activation functions for this task.
(3 points) Modify the previous exercise so that the network, after training, can also return the first and second derivatives and the anti-derivative of the approximant to the tabulated function. A Gaussian wavelet could be a good activation function here, as both of its derivatives and its anti-derivative are analytic.
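For the Gaussian wavelet the required expressions can be written out explicitly. The following is a worked derivation under the assumption that the activation function is f(u)=u*exp(-u²); the symbols u_i and G are shorthand introduced here.

```latex
\begin{align*}
f(u) &= u\, e^{-u^2}, &
f'(u) &= (1 - 2u^2)\, e^{-u^2}, \\
f''(u) &= (4u^3 - 6u)\, e^{-u^2}, &
G(u) &\equiv \int f(u)\, du = -\tfrac{1}{2}\, e^{-u^2}.
\end{align*}
With $u_i = (x - a_i)/b_i$ the network quantities become
\begin{align*}
F_p'(x) &= \sum_i \frac{w_i}{b_i}\, f'(u_i), \\
F_p''(x) &= \sum_i \frac{w_i}{b_i^2}\, f''(u_i), \\
\int F_p(x)\, dx &= \sum_i w_i\, b_i\, G(u_i) + \text{const}.
\end{align*}
```

The factors 1/b_i, 1/b_i², and b_i come from the chain rule, du_i/dx = 1/b_i, and from the substitution dx = b_i du_i in the integral.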
(1 point) Implement an artificial neural network that can be trained to approximate a solution to the differential equation
Φ[y(x)]≡Φ(y'',y',y,x)=0,
(where Φ is generally a non-linear function of its arguments) on an interval [a,b] with the boundary conditions at a given point c,
y(c)=yc, y'(c)=y'c,
where c∈[a,b] and yc and y'c are given numbers.
The cost function that penalizes deviations from satisfying the differential equation and the boundary conditions might be
C(p) = ∫_a^b Φ[Fp(x)]² dx + α (Fp(c)-yc)² + β (Fp'(c)-y'c)² ,
where α and β are parameters that specify the relative contribution of the boundary conditions to the cost function. Tuning these parameters might improve convergence of the training procedure.
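One possible way to evaluate this cost function numerically is sketched below. Everything in it is an assumption made for illustration: the class name `OdeCostSketch`, the Gaussian-wavelet activation, the parameter layout p = {a0,b0,w0, a1,b1,w1, ...}, and the trapezoid rule with m subintervals used to approximate the integral. Any other quadrature could be substituted.

```csharp
using System;

// Hypothetical sketch of the ODE cost function; not a prescribed implementation.
public static class OdeCostSketch {
    // Gaussian wavelet f(u) = u*exp(-u^2) and its first two derivatives
    static double F0(double u) { return u * Math.Exp(-u * u); }
    static double F1(double u) { return (1 - 2 * u * u) * Math.Exp(-u * u); }
    static double F2(double u) { return (4 * u * u * u - 6 * u) * Math.Exp(-u * u); }

    // Returns {Fp(x), Fp'(x), Fp''(x)} for p = {a0,b0,w0, a1,b1,w1, ...}
    public static double[] Eval(double[] p, double x) {
        double y = 0, dy = 0, ddy = 0;
        for (int i = 0; i < p.Length; i += 3) {
            double ai = p[i], bi = p[i + 1], wi = p[i + 2];
            double u = (x - ai) / bi;
            y   += wi * F0(u);
            dy  += wi * F1(u) / bi;          // chain rule: du/dx = 1/bi
            ddy += wi * F2(u) / (bi * bi);
        }
        return new double[] { y, dy, ddy };
    }

    // C(p) = integral_a^b phi(y'',y',y,x)^2 dx
    //        + alpha*(Fp(c)-yc)^2 + beta*(Fp'(c)-dyc)^2,
    // with the integral approximated by the trapezoid rule on m subintervals.
    public static double Cost(Func<double,double,double,double,double> phi,
                              double[] p, double a, double b,
                              double c, double yc, double dyc,
                              double alpha, double beta, int m) {
        double h = (b - a) / m, integral = 0;
        for (int k = 0; k <= m; k++) {
            double x = a + k * h;
            double[] v = Eval(p, x);
            double r = phi(v[2], v[1], v[0], x);       // residual of the ODE at x
            double weight = (k == 0 || k == m) ? 0.5 : 1.0;
            integral += weight * r * r * h;
        }
        double[] vc = Eval(p, c);
        double e0 = vc[0] - yc, e1 = vc[1] - dyc;      // boundary-condition errors
        return integral + alpha * e0 * e0 + beta * e1 * e1;
    }
}
```

Here `phi` takes its arguments in the same order as Φ(y'',y',y,x) above, so an equation such as y''+y=0 would be passed as `(ddy, dy, y, x) => ddy + y`.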