Newton-Raphson solver

This section deals with the programming of the Newton-Raphson method, in any dimension. Given a function $f: \R^n \rightarrow \R^m$ that is differentiable along both dimensions, it is possible to write $f$ and its derivatives in the following way :

$$f:\left[\begin{array}{c} x_1 \\\ \vdots \\\ x_n \end{array}\right] \rightarrow \left[\begin{array}{c} f_1(x_1,\dots,x_n)\\\ \vdots\\\ f_m(x_1,\dots,x_n)\end{array}\right]$$

$$H:\left[\begin{array}{c} x_1 \\\ \vdots \\\ x_n \end{array}\right] \rightarrow \left[\begin{array}{c} \\\ \frac{\partial f_i(x_1,\dots,x_n)}{\partial x_j} \\\ \end{array}\right]$$

The $H$ function is called the Jacobian matrix of $f$. The idea of the Newton-Raphson algorithm consists in considering that, given a point $X$, the best direction leading to a root of the function is the direction given by the slope of $f$ at $X$. By following this direction, the algorithm assumes that it moves closer to a root of $f$.

Given a vector $U$ representing the current position, the algorithm computes $f(U)$, a vector representing the value of $f$ at $U$, as well as $H(U)$ the Jacobian matrix of the derivatives of $f$ at $U + V$ is chosen such that:

$f(U+V)=0$ where $f(U+V)$ is approximated by $f(U) + \underbrace{H(U)}_{matrix} \times \underbrace{V}_{vector}$

Then, $U$ is replaced by $U+V$, and this operation is repeated until either $U$ converges or a maximal number of iterations is reached.

  • Questions :
    1. Write the (very simple) equation linking $H$, $U$ and $V$.
    2. Write the Newton-Raphson method in a generic way :
    function U = Newton_Raphson(f, J, U0, N, epsilon)
    

    where f is the function under study, J is its Jacobian matrix, U0 is the starting position of the algorithm, N is the maximal number of steps allowed during the algorithm and epsilon measures the convergence of the algorithm.

One requirement is to be able to compute the values taken by the function $f$ and all its derivatives in a priori any point of the plane. Therefore, it is necessary to specify $f$ and its derivatives in the form of functions. Python allows programming using functional parameters. These parameters may be provided by functions defined either with the keyword lambda, or with simple def definitions.

The algorithm requires the resolution of several linear systems with matrices $M$ that are possibly singular (i.e $det(M) \approx 0$. Prefer the function numpy.linalg.lstsq in order to avoid such numerical problems.

One of the drawbacks of this algorithm is its tendency to diverge in numerous cases. It is therefore imperative to limit the number of steps taken, and to detect to what extent the algorithm has converged.

  • Questions :
    1. Propose a very simple test protocol for a function $f: \R \longrightarrow \R$.
    2. Enhance your implementation to include backtracking inside your Newton-Raphson implementation.