
Homework "Minimization"

  1. (6 points) Newton's method with numerical gradient, numerical Hessian matrix and back-tracking linesearch

    Implement the Newton minimization method with numerical gradient, numerical Hessian matrix, and back-tracking linesearch.

    Something like this,

    public static vector newton(
    	Func<vector,double> φ, /* objective function */
    	vector x,              /* starting point */
    	double acc=1e-3        /* accuracy goal, on exit |∇φ| should be < acc */
    ){
    	const double λmin=1.0/1024; /* smallest step factor the linesearch may take */
    	do{ /* Newton's iterations */
    		var ∇φ = gradient(φ,x);
    		if(∇φ.norm() < acc) break; /* job done */
    		var H = hessian(φ,x);
    		var QRH = givensQR(H);   /* QR decomposition */
    		var dx = QRH.solve(-∇φ); /* Newton's step */
    		double λ=1,φx=φ(x);
    		do{ /* back-tracking linesearch */
    			if( φ(x+λ*dx) < φx ) break; /* good step: accept */
    			if( λ < λmin ) break; /* accept anyway */
    			λ/=2;
    		}while(true);
    		x+=λ*dx;
    	}while(true);
    	return x;
    }//newton
    
    Hints:
    1. It seems that in this code you can save one φ-evaluation per step: is this true?
    2. You might limit the number of Newton's steps to, say, 1000 to stop the infinite loop in case the convergence criterion cannot be reached (one possible way to do this is sketched below).
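
    One possible realization of hint 2 (a sketch only: it reuses the gradient, hessian and givensQR routines from this homework and assumes the vector and matrix classes from the earlier homeworks),

    public static vector newton(
    	Func<vector,double> φ, vector x,
    	double acc=1e-3, int maxsteps=1000 /* give up after maxsteps iterations */
    ){
    	const double λmin=1.0/1024;
    	int steps=0;
    	while(steps++ < maxsteps){
    		var ∇φ = gradient(φ,x);
    		if(∇φ.norm() < acc) break; /* job done */
    		var dx = givensQR(hessian(φ,x)).solve(-∇φ); /* Newton's step */
    		double λ=1, φx=φ(x);
    		while( φ(x+λ*dx) >= φx && λ >= λmin ) λ/=2; /* back-tracking */
    		x += λ*dx;
    	}
    	return x;
    }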

    The numerical gradient of the objective function φ(x) can be calculated like this,

    public static vector gradient(Func<vector,double> φ,vector x){
    	vector ∇φ = new vector(x.size);
    	double φx = φ(x); /* no need to recalculate at each step */
    	for(int i=0;i<x.size;i++){
    		double dx=Max(Abs(x[i]),1)*Pow(2,-26); /* 2⁻²⁶≈√ε: near-optimal step for forward differences; Max, Abs, Pow from System.Math */
    		x[i]+=dx;
    		∇φ[i]=(φ(x)-φx)/dx;
    		x[i]-=dx; /* restore the coordinate */
    	}
    	return ∇φ;
    }
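
    As a quick sanity check (a sketch: the test function is an arbitrary choice, and the vector class is assumed to have a component-wise constructor),

    // ∇φ of φ(x,y)=x²+3y² at (1,2) should be (2x,6y)=(2,12)
    Func<vector,double> φtest = v => v[0]*v[0] + 3*v[1]*v[1];
    vector g = gradient(φtest, new vector(1.0,2.0));
    Console.Error.WriteLine($"∇φ ≈ ({g[0]},{g[1]}), exact (2,12)");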
    

    The Hessian matrix of the objective function φ(x) can be calculated like this,

    public static matrix hessian(Func<vector,double> φ,vector x){
    	matrix H=new matrix(x.size);
    	vector ∇φx=gradient(φ,x);
    	for(int j=0;j<x.size;j++){
    		double dx=Max(Abs(x[j]),1)*Pow(2,-13); /* 2⁻¹³≈√(2⁻²⁶): the numerical gradient itself is only forward-difference accurate */
    		x[j]+=dx;
    		vector d∇φ=gradient(φ,x)-∇φx;
    		for(int i=0;i<x.size;i++) H[i,j]=d∇φ[i]/dx; /* dx is a scalar here */
    		x[j]-=dx;
    	}
    	//return H;
    	return (H+H.T)/2; // you think?
    }
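
    A similar sanity check for the Hessian (again a sketch with an arbitrary quadratic test function, whose exact Hessian is constant),

    // φ(x,y)=x²+xy+3y² has the exact Hessian {{2,1},{1,6}}
    Func<vector,double> φq = v => v[0]*v[0] + v[0]*v[1] + 3*v[1]*v[1];
    matrix H = hessian(φq, new vector(0.5,-0.5));
    Console.Error.WriteLine($"H[0,0]={H[0,0]} (exact 2), H[0,1]={H[0,1]} (exact 1), H[1,1]={H[1,1]} (exact 6)");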
    

    Find a minimum of the Rosenbrock valley function, f(x,y) = (1-x)² + 100(y-x²)².

    Find a minimum of Himmelblau's function, f(x,y) = (x²+y-11)² + (x+y²-7)².

    Record the number of steps it takes for the algorithm to reach the minimum.
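
    A possible driver (a sketch: the starting point is arbitrary, and the effort is recorded here by counting φ-evaluations through a wrapper, though one could equally count the steps inside newton itself),

    int calls=0;
    Func<vector,double> rosenbrock = v => {
    	calls++;
    	double x=v[0], y=v[1];
    	return (1-x)*(1-x) + 100*(y-x*x)*(y-x*x);
    };
    vector min = newton(rosenbrock, new vector(-1.0,2.0), acc:1e-4);
    Console.WriteLine($"Rosenbrock: minimum ≈ ({min[0]},{min[1]}), φ-calls = {calls}");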

  2. (3 points) Higgs boson discovery

    In 2012 CERN announced the discovery of a previously unknown boson with mass 125.3±0.6 GeV/c² [Wikipedia].

    In one of the experiments a certain cross-section involving the Higgs boson (something like H→γγ) was measured and the corresponding data (with the background subtracted, which is why some data points are negative) are given as [Physics Letters B Volume 726, pages 88-119 (2013)] [arxiv:1207.7235]

    # energy E[GeV], signal σ(E) [certain units], experimental uncertainty Δσ [same units]
    101 -0.25 2.0
    103 -0.30 2.0
    105 -0.15 1.9
    107 -1.71 1.9
    109  0.81 1.9
    111  0.65 1.9
    113 -0.91 1.9
    115  0.91 1.9
    117  0.96 1.6
    119 -2.52 1.6
    121 -1.01 1.6
    123  2.01 1.6
    125  4.83 1.6
    127  4.58 1.6
    129  1.26 1.3
    131  1.01 1.3
    133 -1.26 1.3
    135  0.45 1.3
    137  0.15 1.3
    139 -0.91 1.3
    141 -0.81 1.1
    143 -1.41 1.1
    145  1.36 1.1
    147  0.50 1.1
    149 -0.45 1.1
    151  1.61 1.1
    153 -2.21 1.1
    155 -1.86 0.9
    157  1.76 0.9
    159 -0.50 0.9
    

    Fit the Breit-Wigner function (where A is the scale factor, m is the mass and Γ is the width of the resonance),

    F(E|m,Γ,A) = A/[(E-m)²+Γ²/4] ,
    
    to the data and determine the mass and the experimental width of the Higgs boson (the experimental width is only an upper limit on the physical width).

    You should fit the data by minimizing (using your own minimizer) the deviation function

    D(m,Γ,A) = Σᵢ[(F(Eᵢ|m,Γ,A)-σᵢ)/Δσᵢ]² .
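
    In code the fit amounts to wrapping D into a Func<vector,double> with p=(m,Γ,A) and handing it to your newton minimizer. A minimal sketch, assuming the data have been read into the lists energy, signal and error as shown below, that genlist exposes its length as size, and that the initial guess (m≈125, Γ≈2, A≈10) is just an eyeball estimate from the table,

    Func<double,double,double,double,double> F
    	= (E,m,Γ,A) => A/((E-m)*(E-m)+Γ*Γ/4);
    Func<vector,double> D = p => {
    	double sum=0;
    	for(int i=0;i<energy.size;i++){
    		double d=(F(energy[i],p[0],p[1],p[2])-signal[i])/error[i];
    		sum+=d*d;
    	}
    	return sum;
    };
    vector pmin = newton(D, new vector(125.0,2.0,10.0), acc:1e-3);
    Console.Error.WriteLine($"m = {pmin[0]} GeV, Γ = {pmin[1]} GeV, A = {pmin[2]}");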
    

    You can send the data into the standard input stream of your program like this,

    mono main.exe < higgs.data.txt 1> out.txt 2> log
    
    and you can read the data table into generic lists (either our genlist or the System.Collections.Generic.List) from the standard input like this,

    var energy = new genlist<double>();
    var signal = new genlist<double>();
    var error  = new genlist<double>();
    var separators = new char[] {' ','\t'};
    var options = StringSplitOptions.RemoveEmptyEntries;
    do{
            string line=Console.In.ReadLine();
            if(line==null)break;
            string[] words=line.Split(separators,options);
            energy.add(double.Parse(words[0]));
            signal.add(double.Parse(words[1]));
            error .add(double.Parse(words[2]));
    }while(true);
    

    Plot your fit together with the experimental data.
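
    For the plot one can, for example, print the fitted curve on a fine grid to the standard output stream and plot it together with the data points and their error bars (a sketch: it assumes the F and pmin from the fitting sketch above; the grid simply spans the measured energy range),

    for(double E=101; E<=159; E+=1.0/8)
    	Console.WriteLine($"{E} {F(E,pmin[0],pmin[1],pmin[2])}");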

  3. (1 point) Central instead of forward finite difference approximation for the derivatives

    In Part A, when estimating the derivatives numerically, we used (for simplicity of programming) the forward difference approximations for the derivatives,

    f'(x) ≈ [f(x+δx)-f(x)]/δx ,
    
    f''(x) ≈ [f'(x+δx)-f'(x)]/δx .
    
    However, the central difference approximations might be a better choice,

    f'(x) ≈ [f(x+δx)-f(x-δx)]/[2δx] ,
    
    f''(x) ≈ [f(x+δx)-2f(x)+f(x-δx)]/δx² .
    
    So, implement the central finite difference approximations for the derivatives, compare the resulting algorithm with the one in Part A, and find out which is better. You must take advantage of the fact that the estimates of the first and second derivatives share the function evaluations (a sketch of the central-difference gradient is given below).
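
    A minimal sketch of the central-difference gradient (the step size 2⁻¹⁷≈∛ε is the usual near-optimal choice for central differences; sharing the φ(x±δx) evaluations with the second-derivative formula is the actual exercise),

    public static vector gradientC(Func<vector,double> φ, vector x){
    	vector ∇φ = new vector(x.size);
    	for(int i=0;i<x.size;i++){
    		double dx=Max(Abs(x[i]),1)*Pow(2,-17);
    		x[i]+=dx;   double φplus =φ(x);
    		x[i]-=2*dx; double φminus=φ(x);
    		∇φ[i]=(φplus-φminus)/(2*dx);
    		x[i]+=dx; /* restore the coordinate */
    	}
    	return ∇φ;
    }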