Trust region
This subpackage contains trust region methods.
See also
- Step size - step size selection methods like Barzilai-Borwein and Polyak's step size.
- Line search - line search methods.
Classes:
- CubicRegularization – Cubic regularization.
- Dogleg – Dogleg trust region algorithm.
- LevenbergMarquardt – Levenberg-Marquardt trust region algorithm.
- TrustCG – Trust region via Steihaug-Toint Conjugate Gradient method.
- TrustRegionBase – Abstract base class for the trust region modules above.
CubicRegularization
Bases: torchzero.modules.trust_region.trust_region.TrustRegionBase
Cubic regularization.
Parameters:
- hess_module (Module | None) – A module that maintains a Hessian approximation (not the Hessian inverse!). This includes all full-matrix quasi-Newton methods, ``tz.m.Newton`` and ``tz.m.GaussNewton``. When using quasi-Newton methods, set ``inverse=False`` when constructing them.
- eta (float, default: 0.0) – if the ratio of actual to predicted reduction is larger than this, the step is accepted. When ``hess_module`` is GaussNewton, this can be set to 0.
- nplus (float, default: 3.5) – increase factor on successful steps.
- nminus (float, default: 0.25) – decrease factor on unsuccessful steps.
- rho_good (float, default: 0.99) – if the ratio of actual to predicted reduction is larger than this, the trust region size is multiplied by ``nplus``.
- rho_bad (float, default: 0.0001) – if the ratio of actual to predicted reduction is less than this, the trust region size is multiplied by ``nminus``.
- init (float, default: 1) – initial trust region value.
- maxiter (float, default: 100) – maximum number of iterations when solving the cubic subproblem.
- eps (float, default: 1e-08) – epsilon for the solver.
- update_freq (int, default: 1) – frequency of updating the Hessian.
- max_attempts (int, default: 10) – maximum number of trust region size reductions per step. A zero update vector is returned when this limit is exceeded.
- fallback (bool) – if ``True``, when ``hess_module`` maintains a Hessian inverse which can't be inverted efficiently, it will be inverted anyway. When ``False`` (default), a ``RuntimeError`` will be raised instead.
- inner (Chainable | None, default: None) – preconditioning is applied to the output of this module.
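The acceptance and resizing parameters above can be read as one rule applied after each trial step. The following is a minimal sketch of that rule, not torchzero's implementation: the function name ``update_radius``, its arguments ``actual`` and ``predicted``, and the handling of a non-positive predicted reduction are assumptions made for illustration. The keyword defaults mirror the signature defaults listed above.

```python
def update_radius(radius, actual, predicted, eta=0.0, nplus=3.5,
                  nminus=0.25, rho_good=0.99, rho_bad=1e-4):
    """Return (new_radius, accepted) for one trial step.

    actual:    f(x) - f(x + step), the observed decrease of the objective
    predicted: the decrease promised by the local model
    """
    # Assumption: a model that predicts no decrease counts as a failed step.
    if predicted <= 0:
        return radius * nminus, False

    rho = actual / predicted      # ratio of actual to predicted reduction
    accepted = rho > eta          # the step is kept only if rho exceeds eta

    if rho > rho_good:            # model was accurate: grow the region
        radius *= nplus
    elif rho < rho_bad:           # model was poor: shrink the region
        radius *= nminus
    return radius, accepted
```

With these defaults, a step whose ratio falls between ``rho_bad`` and ``rho_good`` is accepted and leaves the trust region size unchanged.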
Examples:
Cubic regularized Newton

.. code-block:: python

    import torchzero as tz

    # model is an existing torch.nn.Module
    opt = tz.Modular(
        model.parameters(),
        tz.m.CubicRegularization(tz.m.Newton()),
    )
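For intuition, a cubic-regularized step minimizes the quadratic model plus a cubic penalty, m(s) = g·s + ½ s·H·s + (σ/3)‖s‖³. The sketch below solves the one-dimensional case in closed form. The names ``cubic_model_1d``, ``cubic_step_1d`` and ``sigma`` are hypothetical, and the iterative solver controlled by ``maxiter`` and ``eps`` in the library may work quite differently.

```python
import math

def cubic_model_1d(s, g, h, sigma):
    """Value of the 1-D cubic model g*s + h*s^2/2 + sigma*|s|^3/3."""
    return g * s + 0.5 * h * s * s + sigma / 3.0 * abs(s) ** 3

def cubic_step_1d(g, h, sigma):
    """Global minimizer of the 1-D cubic model; requires sigma > 0."""
    # On the descent side s = -sign(g) * t with t >= 0, stationarity reads
    # sigma * t^2 + h * t - |g| = 0; take the non-negative root.
    t = (-h + math.sqrt(h * h + 4.0 * sigma * abs(g))) / (2.0 * sigma)
    # Step against the gradient sign (copysign treats g == 0.0 as positive).
    return -math.copysign(t, g)
```

Note that the formula still returns a nonzero step when ``g`` is zero and ``h`` is negative, which is how the cubic term lets the method escape along negative curvature.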
Source code in torchzero/modules/trust_region/cubic_regularization.py
Dogleg
Bases: torchzero.modules.trust_region.trust_region.TrustRegionBase
Dogleg trust region algorithm.
Parameters:
- hess_module (Module | None) – A module that maintains a Hessian approximation (not the Hessian inverse!). This includes all full-matrix quasi-Newton methods, ``tz.m.Newton`` and ``tz.m.GaussNewton``. When using quasi-Newton methods, set ``inverse=False`` when constructing them.
- eta (float, default: 0.0) – if the ratio of actual to predicted reduction is larger than this, the step is accepted. When ``hess_module`` is GaussNewton, this can be set to 0.
- nplus (float, default: 2) – increase factor on successful steps.
- nminus (float, default: 0.25) – decrease factor on unsuccessful steps.
- rho_good (float, default: 0.75) – if the ratio of actual to predicted reduction is larger than this, the trust region size is multiplied by ``nplus``.
- rho_bad (float, default: 0.25) – if the ratio of actual to predicted reduction is less than this, the trust region size is multiplied by ``nminus``.
- init (float, default: 1) – initial trust region value.
- update_freq (int, default: 1) – frequency of updating the Hessian.
- max_attempts (int, default: 10) – maximum number of trust region size reductions per step. A zero update vector is returned when this limit is exceeded.
- inner (Chainable | None, default: None) – preconditioning is applied to the output of this module.
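As a reminder of what a dogleg step is, here is a minimal sketch of the textbook construction, assuming a positive diagonal Hessian so that no linear solver is needed. The function ``dogleg_step`` and the ``h_diag`` argument are hypothetical and only illustrate the idea; they are not part of torchzero.

```python
import math

def norm(v):
    return math.sqrt(sum(x * x for x in v))

def dogleg_step(g, h_diag, radius):
    """Dogleg step for gradient g and a positive diagonal Hessian h_diag."""
    # Full Newton step: solve H p = -g (trivial for a diagonal H).
    p_newton = [-gi / hi for gi, hi in zip(g, h_diag)]
    if norm(p_newton) <= radius:
        return p_newton                      # Newton step fits in the region

    # Cauchy point: minimizer of the quadratic model along -g.
    gg = sum(gi * gi for gi in g)
    gHg = sum(hi * gi * gi for gi, hi in zip(g, h_diag))
    p_cauchy = [-(gg / gHg) * gi for gi in g]
    if norm(p_cauchy) >= radius:
        # Even the Cauchy point is outside: steepest descent to the boundary.
        return [-(radius / math.sqrt(gg)) * gi for gi in g]

    # Otherwise walk from the Cauchy point towards the Newton step and stop
    # where the path crosses the boundary: ||p_cauchy + tau * d|| = radius.
    d = [pn - pc for pn, pc in zip(p_newton, p_cauchy)]
    a = sum(di * di for di in d)
    b = 2.0 * sum(pc * di for pc, di in zip(p_cauchy, d))
    c = sum(pc * pc for pc in p_cauchy) - radius ** 2
    tau = (-b + math.sqrt(b * b - 4.0 * a * c)) / (2.0 * a)
    return [pc + tau * di for pc, di in zip(p_cauchy, d)]
```

The path is only well defined when the Hessian approximation is positive definite, which is why the sketch restricts ``h_diag`` to positive entries.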
Source code in torchzero/modules/trust_region/dogleg.py
LevenbergMarquardt
Bases: torchzero.modules.trust_region.trust_region.TrustRegionBase
Levenberg-Marquardt trust region algorithm.
Parameters:
- hess_module (Module | None) – A module that maintains a Hessian approximation (not the Hessian inverse!). This includes all full-matrix quasi-Newton methods, ``tz.m.Newton`` and ``tz.m.GaussNewton``. When using quasi-Newton methods, set ``inverse=False`` when constructing them.
- y (float, default: 0) – when ``y=0``, the identity matrix is added to the Hessian; when ``y=1``, the diagonal of the Hessian approximation is added. Values in between interpolate. This should only be used with Gauss-Newton.
- eta (float, default: 0.0) – if the ratio of actual to predicted reduction is larger than this, the step is accepted. When ``hess_module`` is ``Newton`` or ``GaussNewton``, this can be set to 0.
- nplus (float, default: 3.5) – increase factor on successful steps.
- nminus (float, default: 0.25) – decrease factor on unsuccessful steps.
- rho_good (float, default: 0.99) – if the ratio of actual to predicted reduction is larger than this, the trust region size is multiplied by ``nplus``.
- rho_bad (float, default: 0.0001) – if the ratio of actual to predicted reduction is less than this, the trust region size is multiplied by ``nminus``.
- init (float, default: 1) – initial trust region value.
- update_freq (int, default: 1) – frequency of updating the Hessian.
- max_attempts (int, default: 10) – maximum number of trust region size reductions per step. A zero update vector is returned when this limit is exceeded.
- fallback (bool, default: False) – if ``True``, when ``hess_module`` maintains a Hessian inverse which can't be inverted efficiently, it will be inverted anyway. When ``False`` (default), a ``RuntimeError`` will be raised instead.
- inner (Chainable | None, default: None) – preconditioning is applied to the output of this module.
Examples:
Gauss-Newton with Levenberg-Marquardt trust region

.. code-block:: python

    import torchzero as tz

    # model is an existing torch.nn.Module
    opt = tz.Modular(
        model.parameters(),
        tz.m.LevenbergMarquardt(tz.m.GaussNewton()),
    )

LM-SR1

.. code-block:: python

    opt = tz.Modular(
        model.parameters(),
        tz.m.LevenbergMarquardt(tz.m.SR1(inverse=False)),
    )

First-order trust region (the Hessian is assumed to be the identity)

.. code-block:: python

    opt = tz.Modular(
        model.parameters(),
        tz.m.LevenbergMarquardt(tz.m.Identity()),
    )
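The effect of the ``y`` parameter described above can be sketched as a damping term added to the Hessian approximation. The sketch assumes that values between 0 and 1 interpolate linearly between the identity and the diagonal, and it uses a hypothetical damping coefficient ``lam``; how the library derives that coefficient from the trust region size is not stated here.

```python
def lm_damped_matrix(H, lam, y=0.0):
    """Return H + lam * ((1 - y) * I + y * diag(H)) for a square list-of-lists H."""
    damped = [row[:] for row in H]   # copy so the input is left untouched
    for i in range(len(H)):
        # y=0 adds lam to the diagonal, y=1 adds lam * H[i][i] instead.
        damped[i][i] += lam * ((1.0 - y) + y * H[i][i])
    return damped
```

Scaling the damping by the diagonal (``y=1``) makes the step invariant to rescaling of individual parameters, which is the usual motivation for Marquardt's variant.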
Source code in torchzero/modules/trust_region/levenberg_marquardt.py
TrustCG
Bases: torchzero.modules.trust_region.trust_region.TrustRegionBase
Trust region via Steihaug-Toint Conjugate Gradient method.
.. note::
    If you wish to use the exact Hessian, use the matrix-free :code:`tz.m.NewtonCGSteihaug`,
    which only uses Hessian-vector products. While passing ``tz.m.Newton`` to this
    module is possible, it is usually less efficient.
Parameters:
- hess_module (Module | None) – A module that maintains a Hessian approximation (not the Hessian inverse!). This includes all full-matrix quasi-Newton methods, ``tz.m.Newton`` and ``tz.m.GaussNewton``. When using quasi-Newton methods, set ``inverse=False`` when constructing them.
- eta (float, default: 0.0) – if the ratio of actual to predicted reduction is larger than this, the step is accepted. When ``hess_module`` is GaussNewton, this can be set to 0.
- nplus (float, default: 3.5) – increase factor on successful steps.
- nminus (float, default: 0.25) – decrease factor on unsuccessful steps.
- rho_good (float, default: 0.99) – if the ratio of actual to predicted reduction is larger than this, the trust region size is multiplied by ``nplus``.
- rho_bad (float, default: 0.0001) – if the ratio of actual to predicted reduction is less than this, the trust region size is multiplied by ``nminus``.
- init (float, default: 1) – initial trust region value.
- update_freq (int, default: 1) – frequency of updating the Hessian.
- reg (int, default: 0) – regularization parameter for conjugate gradient.
- max_attempts (int, default: 10) – maximum number of trust region size reductions per step. A zero update vector is returned when this limit is exceeded.
- boundary_tol (float | None, default: 1e-06) – the trust region only increases when the suggested step's norm is at least ``(1-boundary_tol)*trust_region``. This prevents increasing the trust region when the solution is not on the boundary.
- prefer_exact (bool, default: True) – when the exact solution can be easily calculated without CG (e.g. the Hessian is stored as a scaled identity), uses the exact solution. If False, always uses CG.
- inner (Chainable | None, default: None) – preconditioning is applied to the output of this module.
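The ``boundary_tol`` condition above amounts to a one-line check. In this sketch the helper name ``may_grow`` is hypothetical, and treating ``None`` as "always allow growth" is an assumption; the text above only says the parameter may be ``None``.

```python
def may_grow(step_norm, trust_region, boundary_tol=1e-6):
    """True if the step is close enough to the boundary to allow an increase."""
    if boundary_tol is None:
        # Assumption: None disables the boundary check entirely.
        return True
    return step_norm >= (1.0 - boundary_tol) * trust_region
```

A step that ends well inside the region says nothing about whether a larger region would help, so growth is reserved for steps that were actually cut off by the boundary.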
Examples:
Trust-SR1

.. code-block:: python

    import torchzero as tz

    # model is an existing torch.nn.Module
    opt = tz.Modular(
        model.parameters(),
        tz.m.TrustCG(hess_module=tz.m.SR1(inverse=False)),
    )
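For readers unfamiliar with the Steihaug-Toint method, the following is a minimal textbook sketch in plain Python. It approximately minimizes g·x + ½ x·H·x subject to ‖x‖ ≤ radius using only Hessian-vector products. The names ``steihaug_cg``, ``hvp`` and ``to_boundary`` are hypothetical and the code is not taken from torchzero.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def to_boundary(x, d, radius):
    """Move from x (inside the region) along d until the norm equals radius."""
    a, b, c = dot(d, d), 2.0 * dot(x, d), dot(x, x) - radius ** 2
    tau = (-b + math.sqrt(b * b - 4.0 * a * c)) / (2.0 * a)
    return [xi + tau * di for xi, di in zip(x, d)]

def steihaug_cg(hvp, g, radius, tol=1e-10, maxiter=None):
    """hvp(v) must return the Hessian-vector product H v as a list."""
    x = [0.0] * len(g)
    r = list(g)                      # residual H x + g at x = 0
    d = [-ri for ri in r]            # first search direction: steepest descent
    rr = dot(r, r)
    for _ in range(maxiter or len(g)):
        if math.sqrt(rr) < tol:
            break                    # converged inside the region
        Hd = hvp(d)
        dHd = dot(d, Hd)
        if dHd <= 0.0:
            # Non-positive curvature: follow d all the way to the boundary.
            return to_boundary(x, d, radius)
        alpha = rr / dHd
        x_next = [xi + alpha * di for xi, di in zip(x, d)]
        if math.sqrt(dot(x_next, x_next)) >= radius:
            # The CG iterate leaves the region: stop on the boundary.
            return to_boundary(x, d, radius)
        x = x_next
        r = [ri + alpha * hdi for ri, hdi in zip(r, Hd)]
        rr_next = dot(r, r)
        d = [-ri + (rr_next / rr) * di for ri, di in zip(r, d)]
        rr = rr_next
    return x
```

Because the iteration stops as soon as it meets negative curvature or the boundary, it works with indefinite approximations such as SR1, which is one reason the two are commonly paired.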
Source code in torchzero/modules/trust_region/trust_cg.py
TrustRegionBase
Bases: torchzero.core.module.Module, abc.ABC
Methods:
- trust_region_apply – Solves the trust region subproblem and outputs ``Var`` with the solution direction.
- trust_region_update – Updates the state of this module after H or B have been updated, if necessary.
- trust_solve – Solves Hx=g with a trust region penalty/bound defined by ``radius``.
Source code in torchzero/modules/trust_region/trust_region.py
trust_region_apply
Solves the trust region subproblem and outputs ``Var`` with the solution direction.
Source code in torchzero/modules/trust_region/trust_region.py
trust_region_update
trust_region_update(var: Var, H: LinearOperator | None) -> None
Updates the state of this module after H or B have been updated, if necessary.
trust_solve
trust_solve(f: float, g: Tensor, H: LinearOperator, radius: float, params: list[Tensor], closure: Callable, settings: Mapping[str, Any]) -> Tensor
Solves Hx=g with a trust region penalty/bound defined by ``radius``.