{
  "__type": "IngestedDoc",
  "__tag": 4010,
  "_content": {
    "Notes": {
      "__type": "Section",
      "__tag": 4015,
      "children": [],
      "title": [],
      "level": 0,
      "target": null
    },
    "Warns": {
      "__type": "Section",
      "__tag": 4015,
      "children": [],
      "title": [],
      "level": 0,
      "target": null
    },
    "Raises": {
      "__type": "Section",
      "__tag": 4015,
      "children": [],
      "title": [],
      "level": 0,
      "target": null
    },
    "Yields": {
      "__type": "Section",
      "__tag": 4015,
      "children": [],
      "title": [],
      "level": 0,
      "target": null
    },
    "Methods": {
      "__type": "Section",
      "__tag": 4015,
      "children": [],
      "title": [],
      "level": 0,
      "target": null
    },
    "Returns": {
      "__type": "Section",
      "__tag": 4015,
      "children": [],
      "title": [],
      "level": 0,
      "target": null
    },
    "Summary": {
      "__type": "Section",
      "__tag": 4015,
      "children": [
        {
          "__type": "Paragraph",
          "__tag": 4045,
          "children": [
            {
              "__type": "Text",
              "__tag": 4046,
              "value": "Trust Region Reflective algorithm for least-squares optimization."
            }
          ]
        }
      ],
      "title": [],
      "level": 0,
      "target": null
    },
    "Receives": {
      "__type": "Section",
      "__tag": 4015,
      "children": [],
      "title": [],
      "level": 0,
      "target": null
    },
    "Warnings": {
      "__type": "Section",
      "__tag": 4015,
      "children": [],
      "title": [],
      "level": 0,
      "target": null
    },
    "Attributes": {
      "__type": "Section",
      "__tag": 4015,
      "children": [],
      "title": [],
      "level": 0,
      "target": null
    },
    "Parameters": {
      "__type": "Section",
      "__tag": 4015,
      "children": [],
      "title": [],
      "level": 0,
      "target": null
    },
    "Extended Summary": {
      "__type": "Section",
      "__tag": 4015,
      "children": [
        {
          "__type": "Paragraph",
          "__tag": 4045,
          "children": [
            {
              "__type": "Text",
              "__tag": 4046,
              "value": "The algorithm is based on ideas from the paper "
            },
            {
              "__type": "CitationReference",
              "__tag": 4063,
              "label": "STIR"
            },
            {
              "__type": "Text",
              "__tag": 4046,
              "value": ". The main idea is to account for the presence of the bounds by appropriate scaling of the variables (or, equivalently, changing a trust-region shape). Let's introduce a vector v:"
            }
          ]
        },
        {
          "__type": "Blockquote",
          "__tag": 4059,
          "children": []
        },
        {
          "__type": "DefList",
          "__tag": 4033,
          "children": [
            {
              "__type": "DefListItem",
              "__tag": 4037,
              "dt": {
                "__type": "Paragraph",
                "__tag": 4045,
                "children": [
                  {
                    "__type": "Text",
                    "__tag": 4046,
                    "value": "v[i] = ub[i] - x[i] if g[i] < 0 and ub[i] < np.inf; x[i] - lb[i] if g[i] > 0 and lb[i] > -np.inf; 1 otherwise"
                  }
                ]
              },
              "dd": []
            }
          ]
        },
        {
          "__type": "Paragraph",
          "__tag": 4045,
          "children": [
            {
              "__type": "Text",
              "__tag": 4046,
              "value": "where g is the gradient of the cost function and lb, ub are the bounds. The components of v are the distances to the bounds toward which the anti-gradient points (when that distance is finite). Define a scaling matrix D = diag(v**0.5). The first-order optimality conditions can then be stated as"
            }
          ]
        },
        {
          "__type": "Blockquote",
          "__tag": 4059,
          "children": [
            {
              "__type": "Paragraph",
              "__tag": 4045,
              "children": [
                {
                  "__type": "Text",
                  "__tag": 4046,
                  "value": "D^2 g(x) = 0."
                }
              ]
            }
          ]
        },
        {
          "__type": "Paragraph",
          "__tag": 4045,
          "children": [
            {
              "__type": "Text",
              "__tag": 4046,
              "value": "This means that the components of the gradient must be zero for strictly interior variables and must point inside the feasible region for variables on a bound."
            }
          ]
        },
        {
          "__type": "Paragraph",
          "__tag": 4045,
          "children": [
            {
              "__type": "Text",
              "__tag": 4046,
              "value": "Now consider this system of equations as a new optimization problem. If the point x is strictly interior (not on the bound), then the left-hand side is differentiable and the Newton step for it satisfies"
            }
          ]
        },
        {
          "__type": "Blockquote",
          "__tag": 4059,
          "children": [
            {
              "__type": "Paragraph",
              "__tag": 4045,
              "children": [
                {
                  "__type": "Text",
                  "__tag": 4046,
                  "value": "(D^2 H + diag(g) Jv) p = -D^2 g"
                }
              ]
            }
          ]
        },
        {
          "__type": "Paragraph",
          "__tag": 4045,
          "children": [
            {
              "__type": "Text",
              "__tag": 4046,
              "value": "where H is the Hessian matrix (or its J^T J approximation in least squares) and Jv is the Jacobian matrix of v, with components -1, 1 or 0, such that all elements of the matrix C = diag(g) Jv are non-negative. Introduce the change of variables x = D x_h (_h would be \"hat\" in LaTeX). In the new variables, the Newton step satisfies"
            }
          ]
        },
        {
          "__type": "Blockquote",
          "__tag": 4059,
          "children": [
            {
              "__type": "Paragraph",
              "__tag": 4045,
              "children": [
                {
                  "__type": "Text",
                  "__tag": 4046,
                  "value": "B_h p_h = -g_h,"
                }
              ]
            }
          ]
        },
        {
          "__type": "Paragraph",
          "__tag": 4045,
          "children": [
            {
              "__type": "Text",
              "__tag": 4046,
              "value": "where B_h = D H D + C, g_h = D g. In least squares B_h = J_h^T J_h, where J_h = J D. Note that J_h and g_h are the proper Jacobian and gradient with respect to the \"hat\" variables. To guarantee global convergence, we formulate a trust-region problem based on the Newton step in the new variables:"
            }
          ]
        },
        {
          "__type": "Blockquote",
          "__tag": 4059,
          "children": [
            {
              "__type": "Paragraph",
              "__tag": 4045,
              "children": [
                {
                  "__type": "Text",
                  "__tag": 4046,
                  "value": "0.5 * p_h^T B_h p_h + g_h^T p_h -> min, "
                },
                {
                  "__type": "Text",
                  "__tag": 4046,
                  "value": "||p_h||"
                },
                {
                  "__type": "Text",
                  "__tag": 4046,
                  "value": " <= Delta"
                }
              ]
            }
          ]
        },
        {
          "__type": "Paragraph",
          "__tag": 4045,
          "children": [
            {
              "__type": "Text",
              "__tag": 4046,
              "value": "In the original space B = H + D^{-1} C D^{-1}, and the equivalent trust-region problem is"
            }
          ]
        },
        {
          "__type": "Blockquote",
          "__tag": 4059,
          "children": [
            {
              "__type": "Paragraph",
              "__tag": 4045,
              "children": [
                {
                  "__type": "Text",
                  "__tag": 4046,
                  "value": "0.5 * p^T B p + g^T p -> min, "
                },
                {
                  "__type": "Text",
                  "__tag": 4046,
                  "value": "||D^{-1} p||"
                },
                {
                  "__type": "Text",
                  "__tag": 4046,
                  "value": " <= Delta"
                }
              ]
            }
          ]
        },
        {
          "__type": "Paragraph",
          "__tag": 4045,
          "children": [
            {
              "__type": "Text",
              "__tag": 4046,
              "value": "Here, the meaning of the matrix D becomes clearer: it alters the shape of the trust region so that large steps towards the bounds are not allowed. In the implementation, the trust-region problem is solved in \"hat\" space, but the bounds are handled in the original space (see below and read the code)."
            }
          ]
        },
        {
          "__type": "Paragraph",
          "__tag": 4045,
          "children": [
            {
              "__type": "Text",
              "__tag": 4046,
              "value": "Introducing the matrix D does not make it possible to ignore the bounds: the algorithm must keep the iterates strictly feasible (to preserve the differentiability mentioned above). The parameter theta controls how far a step is pulled back from the boundary (see the code for details)."
            }
          ]
        },
        {
          "__type": "Paragraph",
          "__tag": 4045,
          "children": [
            {
              "__type": "Text",
              "__tag": 4046,
              "value": "The algorithm uses another important trick. If the trust-region solution does not fit within the bounds, then a search direction reflected from the first bound encountered is considered. For motivation and analysis, refer to the "
            },
            {
              "__type": "CitationReference",
              "__tag": 4063,
              "label": "STIR"
            },
            {
              "__type": "Text",
              "__tag": 4046,
              "value": " paper (and other papers by the same authors). In practice, little justification is needed, because the algorithm simply chooses the best step among three: a constrained trust-region step, a reflected step, and a constrained Cauchy step (a minimizer along -g_h in \"hat\" space, or -D^2 g in the original space)."
            }
          ]
        },
        {
          "__type": "Paragraph",
          "__tag": 4045,
          "children": [
            {
              "__type": "Text",
              "__tag": 4046,
              "value": "Another feature is that the trust-region radius control strategy is modified to account for the appearance of the diagonal matrix C (called diag_h in the code)."
            }
          ]
        },
        {
          "__type": "Paragraph",
          "__tag": 4045,
          "children": [
            {
              "__type": "Text",
              "__tag": 4046,
              "value": "Note that all of the peculiarities described above disappear for problems without bounds (the algorithm becomes a standard trust-region algorithm very similar to those implemented in MINPACK)."
            }
          ]
        },
        {
          "__type": "Paragraph",
          "__tag": 4045,
          "children": [
            {
              "__type": "Text",
              "__tag": 4046,
              "value": "The implementation supports two methods of solving the trust-region problem. The first, called 'exact', applies an SVD to the Jacobian and then solves the problem very accurately using the algorithm described in "
            },
            {
              "__type": "CitationReference",
              "__tag": 4063,
              "label": "JJMore"
            },
            {
              "__type": "Text",
              "__tag": 4046,
              "value": ". It is not applicable to large problems. The second, called 'lsmr', uses the 2-D subspace approach (sometimes called \"indefinite dogleg\"), where the problem is solved in a subspace spanned by the gradient and the approximate Gauss-Newton step found by "
            },
            {
              "__type": "InlineCode",
              "__tag": 4051,
              "value": "scipy.sparse.linalg.lsmr"
            },
            {
              "__type": "Text",
              "__tag": 4046,
              "value": ". The 2-D trust-region problem is reformulated as a fourth-order algebraic equation and solved very accurately by "
            },
            {
              "__type": "InlineCode",
              "__tag": 4051,
              "value": "numpy.roots"
            },
            {
              "__type": "Text",
              "__tag": 4046,
              "value": ". The subspace approach makes it possible to solve very large problems (up to a couple of million residuals on a regular PC), provided the Jacobian matrix is sufficiently sparse."
            }
          ]
        }
      ],
      "title": [],
      "level": 0,
      "target": null
    },
    "Other Parameters": {
      "__type": "Section",
      "__tag": 4015,
      "children": [],
      "title": [],
      "level": 0,
      "target": null
    }
  },
  "_ordered_sections": [
    "Summary",
    "Extended Summary",
    "Parameters",
    "Attributes",
    "Methods",
    "Returns",
    "Yields",
    "Receives",
    "Other Parameters",
    "Raises",
    "Warns",
    "Warnings",
    "Notes"
  ],
  "item_file": "/scipy/optimize/_lsq/trf.py",
  "item_line": 0,
  "item_type": "module",
  "aliases": [
    "scipy.optimize._lsq.trf"
  ],
  "example_section_data": {
    "__type": "Section",
    "__tag": 4015,
    "children": [],
    "title": [],
    "level": 0,
    "target": null
  },
  "see_also": [],
  "signature": null,
  "references": [
    ".. [STIR] Branch, M.A., T.F. Coleman, and Y. Li, \"A Subspace, Interior,",
    "      and Conjugate Gradient Method for Large-Scale Bound-Constrained",
    "      Minimization Problems,\" SIAM Journal on Scientific Computing,",
    "      Vol. 21, Number 1, pp 1-23, 1999.",
    ".. [JJMore] More, J. J., \"The Levenberg-Marquardt Algorithm: Implementation",
    "    and Theory,\" Numerical Analysis, ed. G. A. Watson, Lecture"
  ],
  "qa": "scipy.optimize._lsq.trf",
  "arbitrary": [
    {
      "__type": "Section",
      "__tag": 4015,
      "children": [
        {
          "__type": "Paragraph",
          "__tag": 4045,
          "children": [
            {
              "__type": "Text",
              "__tag": 4046,
              "value": "Trust Region Reflective algorithm for least-squares optimization."
            }
          ]
        },
        {
          "__type": "Paragraph",
          "__tag": 4045,
          "children": [
            {
              "__type": "Text",
              "__tag": 4046,
              "value": "The algorithm is based on ideas from the paper "
            },
            {
              "__type": "CitationReference",
              "__tag": 4063,
              "label": "STIR"
            },
            {
              "__type": "Text",
              "__tag": 4046,
              "value": ". The main idea is to account for the presence of the bounds by appropriate scaling of the variables (or, equivalently, changing a trust-region shape). Let's introduce a vector v:"
            }
          ]
        },
        {
          "__type": "Blockquote",
          "__tag": 4059,
          "children": []
        },
        {
          "__type": "DefList",
          "__tag": 4033,
          "children": [
            {
              "__type": "DefListItem",
              "__tag": 4037,
              "dt": {
                "__type": "Paragraph",
                "__tag": 4045,
                "children": [
                  {
                    "__type": "Text",
                    "__tag": 4046,
                    "value": "v[i] = ub[i] - x[i] if g[i] < 0 and ub[i] < np.inf; x[i] - lb[i] if g[i] > 0 and lb[i] > -np.inf; 1 otherwise"
                  }
                ]
              },
              "dd": []
            }
          ]
        },
        {
          "__type": "Paragraph",
          "__tag": 4045,
          "children": [
            {
              "__type": "Text",
              "__tag": 4046,
              "value": "where g is the gradient of the cost function and lb, ub are the bounds. The components of v are the distances to the bounds toward which the anti-gradient points (when that distance is finite). Define a scaling matrix D = diag(v**0.5). The first-order optimality conditions can then be stated as"
            }
          ]
        },
        {
          "__type": "Blockquote",
          "__tag": 4059,
          "children": [
            {
              "__type": "Paragraph",
              "__tag": 4045,
              "children": [
                {
                  "__type": "Text",
                  "__tag": 4046,
                  "value": "D^2 g(x) = 0."
                }
              ]
            }
          ]
        },
        {
          "__type": "Paragraph",
          "__tag": 4045,
          "children": [
            {
              "__type": "Text",
              "__tag": 4046,
              "value": "This means that the components of the gradient must be zero for strictly interior variables and must point inside the feasible region for variables on a bound."
            }
          ]
        },
        {
          "__type": "Paragraph",
          "__tag": 4045,
          "children": [
            {
              "__type": "Text",
              "__tag": 4046,
              "value": "Now consider this system of equations as a new optimization problem. If the point x is strictly interior (not on the bound), then the left-hand side is differentiable and the Newton step for it satisfies"
            }
          ]
        },
        {
          "__type": "Blockquote",
          "__tag": 4059,
          "children": [
            {
              "__type": "Paragraph",
              "__tag": 4045,
              "children": [
                {
                  "__type": "Text",
                  "__tag": 4046,
                  "value": "(D^2 H + diag(g) Jv) p = -D^2 g"
                }
              ]
            }
          ]
        },
        {
          "__type": "Paragraph",
          "__tag": 4045,
          "children": [
            {
              "__type": "Text",
              "__tag": 4046,
              "value": "where H is the Hessian matrix (or its J^T J approximation in least squares) and Jv is the Jacobian matrix of v, with components -1, 1 or 0, such that all elements of the matrix C = diag(g) Jv are non-negative. Introduce the change of variables x = D x_h (_h would be \"hat\" in LaTeX). In the new variables, the Newton step satisfies"
            }
          ]
        },
        {
          "__type": "Blockquote",
          "__tag": 4059,
          "children": [
            {
              "__type": "Paragraph",
              "__tag": 4045,
              "children": [
                {
                  "__type": "Text",
                  "__tag": 4046,
                  "value": "B_h p_h = -g_h,"
                }
              ]
            }
          ]
        },
        {
          "__type": "Paragraph",
          "__tag": 4045,
          "children": [
            {
              "__type": "Text",
              "__tag": 4046,
              "value": "where B_h = D H D + C, g_h = D g. In least squares B_h = J_h^T J_h, where J_h = J D. Note that J_h and g_h are the proper Jacobian and gradient with respect to the \"hat\" variables. To guarantee global convergence, we formulate a trust-region problem based on the Newton step in the new variables:"
            }
          ]
        },
        {
          "__type": "Blockquote",
          "__tag": 4059,
          "children": [
            {
              "__type": "Paragraph",
              "__tag": 4045,
              "children": [
                {
                  "__type": "Text",
                  "__tag": 4046,
                  "value": "0.5 * p_h^T B_h p_h + g_h^T p_h -> min, "
                },
                {
                  "__type": "Text",
                  "__tag": 4046,
                  "value": "||p_h||"
                },
                {
                  "__type": "Text",
                  "__tag": 4046,
                  "value": " <= Delta"
                }
              ]
            }
          ]
        },
        {
          "__type": "Paragraph",
          "__tag": 4045,
          "children": [
            {
              "__type": "Text",
              "__tag": 4046,
              "value": "In the original space B = H + D^{-1} C D^{-1}, and the equivalent trust-region problem is"
            }
          ]
        },
        {
          "__type": "Blockquote",
          "__tag": 4059,
          "children": [
            {
              "__type": "Paragraph",
              "__tag": 4045,
              "children": [
                {
                  "__type": "Text",
                  "__tag": 4046,
                  "value": "0.5 * p^T B p + g^T p -> min, "
                },
                {
                  "__type": "Text",
                  "__tag": 4046,
                  "value": "||D^{-1} p||"
                },
                {
                  "__type": "Text",
                  "__tag": 4046,
                  "value": " <= Delta"
                }
              ]
            }
          ]
        },
        {
          "__type": "Paragraph",
          "__tag": 4045,
          "children": [
            {
              "__type": "Text",
              "__tag": 4046,
              "value": "Here, the meaning of the matrix D becomes clearer: it alters the shape of the trust region so that large steps towards the bounds are not allowed. In the implementation, the trust-region problem is solved in \"hat\" space, but the bounds are handled in the original space (see below and read the code)."
            }
          ]
        },
        {
          "__type": "Paragraph",
          "__tag": 4045,
          "children": [
            {
              "__type": "Text",
              "__tag": 4046,
              "value": "Introducing the matrix D does not make it possible to ignore the bounds: the algorithm must keep the iterates strictly feasible (to preserve the differentiability mentioned above). The parameter theta controls how far a step is pulled back from the boundary (see the code for details)."
            }
          ]
        },
        {
          "__type": "Paragraph",
          "__tag": 4045,
          "children": [
            {
              "__type": "Text",
              "__tag": 4046,
              "value": "The algorithm uses another important trick. If the trust-region solution does not fit within the bounds, then a search direction reflected from the first bound encountered is considered. For motivation and analysis, refer to the "
            },
            {
              "__type": "CitationReference",
              "__tag": 4063,
              "label": "STIR"
            },
            {
              "__type": "Text",
              "__tag": 4046,
              "value": " paper (and other papers by the same authors). In practice, little justification is needed, because the algorithm simply chooses the best step among three: a constrained trust-region step, a reflected step, and a constrained Cauchy step (a minimizer along -g_h in \"hat\" space, or -D^2 g in the original space)."
            }
          ]
        },
        {
          "__type": "Paragraph",
          "__tag": 4045,
          "children": [
            {
              "__type": "Text",
              "__tag": 4046,
              "value": "Another feature is that the trust-region radius control strategy is modified to account for the appearance of the diagonal matrix C (called diag_h in the code)."
            }
          ]
        },
        {
          "__type": "Paragraph",
          "__tag": 4045,
          "children": [
            {
              "__type": "Text",
              "__tag": 4046,
              "value": "Note that all of the peculiarities described above disappear for problems without bounds (the algorithm becomes a standard trust-region algorithm very similar to those implemented in MINPACK)."
            }
          ]
        },
        {
          "__type": "Paragraph",
          "__tag": 4045,
          "children": [
            {
              "__type": "Text",
              "__tag": 4046,
              "value": "The implementation supports two methods of solving the trust-region problem. The first, called 'exact', applies an SVD to the Jacobian and then solves the problem very accurately using the algorithm described in "
            },
            {
              "__type": "CitationReference",
              "__tag": 4063,
              "label": "JJMore"
            },
            {
              "__type": "Text",
              "__tag": 4046,
              "value": ". It is not applicable to large problems. The second, called 'lsmr', uses the 2-D subspace approach (sometimes called \"indefinite dogleg\"), where the problem is solved in a subspace spanned by the gradient and the approximate Gauss-Newton step found by "
            },
            {
              "__type": "InlineCode",
              "__tag": 4051,
              "value": "scipy.sparse.linalg.lsmr"
            },
            {
              "__type": "Text",
              "__tag": 4046,
              "value": ". The 2-D trust-region problem is reformulated as a fourth-order algebraic equation and solved very accurately by "
            },
            {
              "__type": "InlineCode",
              "__tag": 4051,
              "value": "numpy.roots"
            },
            {
              "__type": "Text",
              "__tag": 4046,
              "value": ". The subspace approach makes it possible to solve very large problems (up to a couple of million residuals on a regular PC), provided the Jacobian matrix is sufficiently sparse."
            }
          ]
        }
      ],
      "title": [],
      "level": 0,
      "target": null
    },
    {
      "__type": "Section",
      "__tag": 4015,
      "children": [
        {
          "__type": "Citation",
          "__tag": 4064,
          "label": "STIR",
          "children": [
            {
              "__type": "Paragraph",
              "__tag": 4045,
              "children": [
                {
                  "__type": "Text",
                  "__tag": 4046,
                  "value": "Branch, M.A., T.F. Coleman, and Y. Li, \"A Subspace, Interior, and Conjugate Gradient Method for Large-Scale Bound-Constrained Minimization Problems,\" SIAM Journal on Scientific Computing, Vol. 21, Number 1, pp 1-23, 1999."
                }
              ]
            }
          ]
        },
        {
          "__type": "Citation",
          "__tag": 4064,
          "label": "JJMore",
          "children": [
            {
              "__type": "Paragraph",
              "__tag": 4045,
              "children": [
                {
                  "__type": "Text",
                  "__tag": 4046,
                  "value": "More, J. J., \"The Levenberg-Marquardt Algorithm: Implementation and Theory,\" Numerical Analysis, ed. G. A. Watson, Lecture"
                }
              ]
            }
          ]
        }
      ],
      "title": [
        {
          "__type": "Text",
          "__tag": 4046,
          "value": "References"
        }
      ],
      "level": 0,
      "target": null
    }
  ],
  "local_refs": []
}