pysolvegn.solve_gauss_newton#

solve_gauss_newton(residual_func, jacobian_func, parameters, *, weight=None, loss=None, max_iterations=None, max_time=None, ftol=None, xtol=None, gtol=None, atol=None, ptol=None, callback=None, update_func=None, verbosity=0, history=False)[source]#

Solve the least squares problem using the Gauss-Newton method with robust cost functions.

The function accepts multiple terms in the least squares problem, each with its own residual function, Jacobian function, weight, and loss function, and solves the following optimization problem

\[\min_{\mathbf{p}} \frac{1}{2} \sum_{i} w_i \sum_j \rho_i\left(\| \mathbf{R}_{i,j}(\mathbf{p}) \|^2\right)\]

where \(w_i\) is a weight for each sub least squares problem, and \(\rho_i\) is a robust cost function for each sub least squares problem.

For more details on the optimization problem and the algorithm, please refer to the documentation.
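Restricted to a single term with linear loss and unit weight, the problem above reduces to classic Gauss-Newton. The following NumPy sketch illustrates the idea (it is an independent re-implementation, not the library's code; `gauss_newton_sketch` is a hypothetical helper):

```python
import numpy as np

def gauss_newton_sketch(residual_func, jacobian_func, p0, max_iterations=50, xtol=1e-10):
    # Illustrative Gauss-Newton loop: single term, linear loss, unit weight.
    p = np.asarray(p0, dtype=float).copy()
    for _ in range(max_iterations):
        r = residual_func(p)              # shape (n_residuals,)
        J = jacobian_func(p)              # shape (n_residuals, n_parameters)
        # Normal equations of the linearized problem: (J^T J) dp = -J^T r
        dp = np.linalg.solve(J.T @ J, -J.T @ r)
        p = p + dp
        # Same form as the xtol criterion documented in the parameters.
        if np.linalg.norm(dp) < xtol * (xtol + np.linalg.norm(p)):
            break
    return p

# Fit y = a * exp(-b * t) to noise-free synthetic data.
t = np.linspace(0.0, 2.0, 20)
y = 2.0 * np.exp(-1.5 * t)
residual = lambda p: p[0] * np.exp(-p[1] * t) - y
jacobian = lambda p: np.column_stack([np.exp(-p[1] * t),
                                      -p[0] * t * np.exp(-p[1] * t)])
p_fit = gauss_newton_sketch(residual, jacobian, [1.0, 1.0])
```

The same `residual` and `jacobian` callables follow the shape contract expected by solve_gauss_newton.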

Note

By convention, we assume that the first sub least squares problem (i.e. \(i=0\)) is the main least squares problem containing the data residuals, and the remaining sub least squares problems (i.e. \(i \geq 1\)) are regularization terms.

Important

At least one of the stopping criteria (i.e. max_iterations, max_time, ftol, xtol, gtol, atol, ptol) must be provided to ensure the optimization process will stop.

Parameters:
  • residual_func (Union[Callable, Sequence[Callable]]) – The function(s) to compute the residuals for each term in the least squares problem. Can be a single function or a sequence of functions. Each callable represents the i-th term \(\mathbf{R}_i(\mathbf{p})\) in the least squares problem and should take the parameters (array-like with shape (n_parameters,)) as input and return the residuals as a 1D numpy array with shape (n_residuals_i,) representing each \(\mathbf{R}_{i,j}(\mathbf{p})\) where \(j=0,\dots,n_{\text{residuals},i}-1\).

  • jacobian_func (Union[Callable, Sequence[Callable]]) – The function(s) to compute the Jacobian matrices for each term in the least squares problem. Can be a single function or a sequence of functions. Each callable represents the i-th term \(\mathbf{J}_i(\mathbf{p})\) in the least squares problem such that \(\mathbf{J}_i(\mathbf{p}) = \frac{\partial \mathbf{R}_i(\mathbf{p})}{\partial \mathbf{p}}\) and should take the parameters (array-like with shape (n_parameters,)) as input and return the Jacobian matrix as a 2D numpy array with shape (n_residuals_i, n_parameters) representing the Jacobian of each residual \(\mathbf{R}_{i,j}(\mathbf{p})\) where \(j=0,\dots,n_{\text{residuals},i}-1\).

  • parameters (ArrayLike) – The starting parameters of the optimization with shape (n_parameters,). The array will not be modified by this function; a copy of the parameters is used for the optimization process.

  • weight (Optional[Union[Real, Sequence[Real]]], optional (default=None)) – The weight(s) for each term in the least squares problem. Can be a single value or a sequence of values. Each weight \(w_i\) applies to the corresponding term in the least squares problem and should be a non-negative real number. If a single value is provided, it will be used for all terms in the least squares problem. If None is provided, it will be set to 1.0.

  • loss (Optional[Union[str, Sequence[str]]], optional (default=None)) – The loss function(s) for each term in the least squares problem. Can be a single string or a sequence of strings. Each loss function \(\rho_i\) is a robust cost function for the corresponding term in the least squares problem and should be one of the following strings: [“linear”, “soft_l1”, “cauchy”, “arctan”]. If a single string is provided, it will be used for all terms in the least squares problem. If None is provided, it will be set to “linear”.

  • max_iterations (Optional[Integral], optional (default=None)) – Stop criterion by the number of iterations. The optimization process is stopped when the number of iterations exceeds max_iterations. If None, no limit on the number of iterations is considered.

  • max_time (Optional[Real], optional (default=None)) – Stop criterion by the elapsed time of optimization. The optimization process is stopped when the time elapsed since the beginning of the optimization exceeds max_time seconds. If None, no limit on the computation time is considered.

  • ftol (Optional[Real], optional (default=None)) – Stop criterion by the change of the cost function value. The optimization process is stopped when dF < ftol * F where F is the cost function value and dF is the change of the cost function value between two iterations. If None, this criterion is not considered.

  • xtol (Optional[Real], optional (default=None)) – Stop criterion by the change of the parameters. The optimization process is stopped when ||dp|| < xtol * (xtol + ||p||) where p is the parameters and dp is the change of the parameters between two iterations. If None, this criterion is not considered.

  • gtol (Optional[Real], optional (default=None)) – Stop criterion by the optimality value. The optimization process is stopped when the optimality verifies norm(b, ord=numpy.inf) < gtol where b is the scaled second term. If None, this criterion is not considered.

  • atol (Optional[Real], optional (default=None)) – Stop criterion by the absolute cost function value. The optimization process is stopped when F < atol where F is the cost function value. If None, this criterion is not considered.

  • ptol (Optional[Real], optional (default=None)) – Stop criterion by the parameters value. The optimization process is stopped when ||p|| < ptol where p is the parameters. If None, this criterion is not considered.

  • callback (Optional[Callable[[Dict], bool]], optional (default=None)) – A callback function that is called at the end of each iteration of the optimization process. The callback function should take a dictionary similar to the one returned in the history of the optimization process as input. If the return value of the callback function is True (or None), the optimization process will continue. If the return value is False, the optimization process will be stopped. If None, no callback function is used.

  • update_func (Optional[Callable[[numpy.ndarray, numpy.ndarray], ArrayLike]], optional (default=None)) – A function that is called at the end of each iteration of the optimization process to compute the parameters for the next iteration. The function should take the current parameters and update (p, dp) as inputs and return the parameters with shape (n_parameters,) to use for the next iteration. If None, the parameters for the next iteration will be computed as p + dp where p is the current parameters and dp is the update computed by solving the linear system.

  • verbosity (Integral, optional (default=0)) – The level of verbosity for logging the optimization process:
      0: No logging (default).
      1: Log only the final results of the optimization process.
      2: Log the results at each iteration of the optimization process.
      3: Log detailed information about the optimization process.

  • history (bool) – If True, the function will also return a tuple containing the history of the optimization process.
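Before calling the solver, it is often worth verifying that each jacobian_func is consistent with its residual_func. A generic central finite-difference check can be sketched as follows (this is not part of pysolvegn; `check_jacobian` is a hypothetical helper):

```python
import numpy as np

def check_jacobian(residual_func, jacobian_func, p, eps=1e-6, atol=1e-5):
    # Compare the analytic Jacobian against central finite differences.
    p = np.asarray(p, dtype=float)
    J_analytic = np.asarray(jacobian_func(p), dtype=float)
    J_numeric = np.empty_like(J_analytic)
    for k in range(p.size):
        step = np.zeros_like(p)
        step[k] = eps
        J_numeric[:, k] = (residual_func(p + step) - residual_func(p - step)) / (2 * eps)
    return np.allclose(J_analytic, J_numeric, atol=atol)

# Example: residuals of a quadratic model r_j(p) = p0 + p1*t_j + p2*t_j**2 - y_j.
t = np.linspace(-1.0, 1.0, 11)
y = 0.5 - t + 2.0 * t**2
residual = lambda p: p[0] + p[1] * t + p[2] * t**2 - y
jacobian = lambda p: np.column_stack([np.ones_like(t), t, t**2])
ok = check_jacobian(residual, jacobian, [0.1, 0.2, 0.3])  # → True
```

A mismatch between the two Jacobians typically indicates a sign or indexing error in jacobian_func and will silently degrade the convergence of the solver.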

Returns:

  • parameters (numpy.ndarray) – The parameters that minimize the least squares problem with shape (n_parameters,).

  • history (Tuple[Dict], optional) – Only returned if history is True. A tuple containing the history of the optimization process. Each element of the tuple is a dictionary containing the keys described below.

Return type:

Union[ndarray, Tuple[ndarray, Tuple[Dict]]]

Notes

The history contains the following keys:

  • “iteration”: Integer representing the iteration number.

  • “costs”: List of floats representing the cost function value for each term in the least squares problem at the current iteration, computed as \(\frac{1}{2} \sum_j \rho_i\left(\| \mathbf{R}_{i,j}(\mathbf{p}) \|^2\right)\) for each term \(i\).

  • “cost”: Float representing the cost function value at the current iteration computed as \(\frac{1}{2} \sum_i w_i \sum_j \rho_i\left(\| \mathbf{R}_{i,j}(\mathbf{p}) \|^2\right)\).

  • “parameters”: Numpy array representing the parameters at the current iteration.

  • “residuals”: The residuals of the data term (i.e. the first term in the least squares problem) at the current iteration.
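The relation between the per-term “costs” and the total “cost” entries follows directly from the definitions above: the total is the weighted sum of the per-term values. A small sketch (the entry values and weights are illustrative, not library output):

```python
import numpy as np

def total_cost(costs, weights):
    # Combine per-term costs into the total cost: cost = sum_i w_i * costs_i.
    return float(np.dot(weights, costs))

# A hypothetical history entry for a problem with two terms (data + one regularizer).
entry = {
    "iteration": 3,
    "costs": [1.25, 0.10],         # per-term costs: 0.5 * sum_j rho_i(||R_ij||^2)
    "parameters": np.array([0.7, -0.2]),
}
weights = [1.0, 10.0]              # w_0 for the data term, w_1 for the regularizer
entry["cost"] = total_cost(entry["costs"], weights)  # → 2.25
```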

The optimization process iteratively solves a linearized version of the least squares problem. Each loop iteration proceeds as follows:

While NOT converged:
    1. Build the system M Δp = -b from 'residual_func', 'jacobian_func', 'loss' and 'weight'.
    2. Call the 'callback' function and STOP if it returns False.
    3. Check the stopping criteria and STOP if any of them is satisfied.
    4. Otherwise, solve the linear system M Δp = -b for Δp.
    5. Update the parameters as p = p + Δp (or using 'update_func' if provided).
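For the single-term, linear-loss case, the loop above, including the callback and update_func hooks, can be sketched as follows (an illustration, not the library's implementation; the stopping criteria of step 3 are omitted):

```python
import numpy as np

def gn_loop(residual_func, jacobian_func, p0, *, max_iterations=50,
            callback=None, update_func=None):
    # Sketch of the documented loop: build, callback, check, solve, update.
    p = np.asarray(p0, dtype=float).copy()
    for iteration in range(max_iterations):
        r = residual_func(p)
        J = jacobian_func(p)
        # 1. Build the system M Δp = -b (here M = J^T J, b = J^T r).
        M, b = J.T @ J, J.T @ r
        # 2. Call the callback and STOP if it returns False.
        state = {"iteration": iteration, "cost": 0.5 * float(r @ r),
                 "parameters": p.copy()}
        if callback is not None and callback(state) is False:
            break
        # 3. Stopping criteria (ftol, xtol, gtol, ...) would be checked here.
        # 4. Solve the linear system M Δp = -b for Δp.
        dp = np.linalg.solve(M, -b)
        # 5. Update the parameters, delegating to update_func if provided.
        p = p + dp if update_func is None else np.asarray(update_func(p, dp),
                                                          dtype=float)
    return p

# One-step convergence on a linear model y = p0 + p1 * t.
t = np.linspace(0.0, 1.0, 5)
y = 1.0 + 3.0 * t
residual = lambda p: p[0] + p[1] * t - y
jacobian = lambda p: np.column_stack([np.ones_like(t), t])
seen = []
p_fit = gn_loop(residual, jacobian, [0.0, 0.0], max_iterations=3,
                callback=lambda s: seen.append(s["iteration"]) or True)
```

An update_func such as `lambda p, dp: p + 0.5 * dp` would implement a damped step at step 5 instead of the default full step.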

Version#

  • 0.0.1: Initial version.