Newton-Raphson solver

This section deals with the programming of the Newton-Raphson method, in any dimension. Given a function $f: \R^n \rightarrow \R^m$ that is differentiable with respect to each of its variables, $f$ and its derivatives can be written in the following way:

$$f:\left[\begin{array}{c} x_1 \\ \vdots \\ x_n \end{array}\right] \rightarrow \left[\begin{array}{c} f_1(x_1,\dots,x_n) \\ \vdots \\ f_m(x_1,\dots,x_n) \end{array}\right]$$

$$H:\left[\begin{array}{c} x_1 \\ \vdots \\ x_n \end{array}\right] \rightarrow \left[\begin{array}{c} \frac{\partial f_i(x_1,\dots,x_n)}{\partial x_j} \end{array}\right]_{1 \le i \le m,\ 1 \le j \le n}$$

The $H$ function is called the Jacobian matrix of $f$. The idea of the Newton-Raphson algorithm is that, given a point $X$, the best direction leading to a root of the function is the one given by the slope of $f$ at $X$. By following this direction, the algorithm expects to move closer to a root of $f$.

Given a vector $U$ representing the current position, the algorithm computes $f(U)$, the value of $f$ at $U$, as well as $H(U)$, the Jacobian matrix of the derivatives of $f$ at $U$. A step $V$ is then chosen such that:

$$f(U+V) = 0 \quad \text{where } f(U+V) \text{ is approximated by } f(U) + \underbrace{H(U)}_{\text{matrix}} \times \underbrace{V}_{\text{vector}}$$

Then, $U$ is replaced by $U+V$, and this operation is repeated until either $U$ converges or a maximum number of iterations is reached.
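As a quick illustration of the idea (not part of the exercise itself), in dimension 1 the step reduces to the classical update $x \leftarrow x - f(x)/f'(x)$; the arbitrary test function used below converges towards $\sqrt{2}$ in a handful of iterations.

    # Illustration only: Newton-Raphson in dimension 1 on f(x) = x**2 - 2.
    def f(x):
        return x**2 - 2

    def df(x):
        return 2 * x

    x = 1.0                    # arbitrary starting point
    for _ in range(5):
        x = x - f(x) / df(x)   # one Newton-Raphson step
        print(x)               # approaches sqrt(2) = 1.41421356...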

  • Questions:
    1. Write the (very simple) equation linking $H$, $U$ and $V$.
    2. Write the Newton-Raphson method in a generic way:
    def Newton_Raphson(f, J, U0, N, epsilon):   # returns the final position U
    

    where f is the function under study, J is the function computing its Jacobian matrix, U0 is the starting position of the algorithm, N is the maximum number of steps allowed during the algorithm, and epsilon is the tolerance used to decide that the algorithm has converged.

One requirement is to be able to compute the values taken by the function $f$ and all its derivatives at a priori any point of the domain. Therefore, it is necessary to specify $f$ and its derivatives in the form of functions. Python allows functions to be passed as parameters; these parameters may be provided as functions defined either with the keyword lambda or with simple def definitions.
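For instance, a function from $\R^2$ to $\R^2$ and its Jacobian could be passed in the following form (the particular function used here is purely illustrative, not imposed by the exercise):

    import numpy as np

    # Example only: an arbitrary function from R^2 to R^2 and its Jacobian,
    # both written as ordinary Python functions of a numpy vector.
    def f(U):
        x, y = U
        return np.array([x**2 + y**2 - 1.0,    # f_1(x, y)
                         x - y])               # f_2(x, y)

    def J(U):
        x, y = U
        return np.array([[2.0 * x, 2.0 * y],   # partial derivatives of f_1
                         [1.0,     -1.0]])     # partial derivatives of f_2

    # The same functions could also be written with lambda, e.g.
    # f = lambda U: np.array([U[0]**2 + U[1]**2 - 1.0, U[0] - U[1]])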

The algorithm requires solving several linear systems whose matrices $M$ may be singular (i.e. $\det(M) \approx 0$). Prefer the function numpy.linalg.lstsq in order to avoid the resulting numerical problems.
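For example, the least-squares solution of a linear system $MV \approx b$ with a possibly singular matrix $M$ can be obtained as sketched below; note that lstsq returns a tuple whose first element is the solution vector.

    import numpy as np

    # Solve M * V ~ b in the least-squares sense; this still succeeds when
    # M is singular, where numpy.linalg.solve would raise an error.
    M = np.array([[1.0, 2.0],
                  [2.0, 4.0]])          # a singular matrix (det = 0)
    b = np.array([1.0, 2.0])
    V, residuals, rank, sv = np.linalg.lstsq(M, b, rcond=None)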

One of the drawbacks of this algorithm is its tendency to diverge in many cases. It is therefore imperative to limit the number of steps taken, and to detect whether the algorithm has actually converged.
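Putting these elements together, the requested function could take a shape similar to the minimal sketch below; the stopping criterion chosen here (the norm of the step V compared with epsilon) and the returned value are only one possible convention.

    import numpy as np

    def Newton_Raphson(f, J, U0, N, epsilon):
        """Minimal sketch: at most N Newton-Raphson steps starting from U0,
        stopping as soon as the step V becomes smaller than epsilon."""
        U = np.array(U0, dtype=float)
        for _ in range(N):
            # V solves J(U) * V = -f(U); lstsq copes with singular Jacobians.
            V = np.linalg.lstsq(J(U), -f(U), rcond=None)[0]
            U = U + V
            if np.linalg.norm(V) < epsilon:   # convergence detected
                break
        return U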

  • Questions:
    1. Propose a very simple test protocol for a function $f: \R \longrightarrow \R$.
    2. Enhance your Newton-Raphson implementation to include backtracking.