Smoothing
This subpackage contains smoothing-based optimization modules, currently Laplacian and Gaussian smoothing.
Classes:
- GradientSampling – Samples and aggregates gradients and values at perturbed points.
- LaplacianSmoothing – Applies Laplacian smoothing, which can improve generalization, using a fast Fourier transform solver.
GradientSampling
Bases: torchzero.core.reformulation.Reformulation
Samples and aggregates gradients and values at perturbed points.
This module can be used for Gaussian homotopy and gradient sampling methods.
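As a rough illustration of the idea behind Gaussian homotopy (not torchzero's implementation), the objective f(x) is replaced by the smoothed objective E[f(x + sigma * eps)] with eps drawn from a standard normal distribution, and its gradient is estimated by averaging gradients at perturbed points. The sketch below uses only the Python standard library; the names `grad_f` and `smoothed_grad` and the toy objective are hypothetical and chosen for illustration.

```python
import math
import random


def grad_f(x):
    """Gradient of the toy objective f(x) = sum(x_i^2 + sin(5 * x_i))."""
    return [2.0 * xi + 5.0 * math.cos(5.0 * xi) for xi in x]


def smoothed_grad(x, sigma=1.0, n=100, include_x0=True, seed=None):
    """Monte Carlo estimate of the gradient of E[f(x + sigma * eps)].

    Averages grad_f over n Gaussian perturbations of x, optionally
    including the gradient at the un-perturbed point.
    """
    rng = random.Random(seed)
    grads = [grad_f(x)] if include_x0 else []
    for _ in range(n):
        perturbed = [xi + sigma * rng.gauss(0.0, 1.0) for xi in x]
        grads.append(grad_f(perturbed))
    # "mean" aggregation: element-wise average over all sampled gradients
    return [sum(col) / len(grads) for col in zip(*grads)]


x = [0.3, -1.2]
g_exact = grad_f(x)  # includes the oscillating 5 * cos(5 * x) term
g_smooth = smoothed_grad(x, sigma=1.0, n=1000, seed=0)  # close to 2 * x
```

With a large sigma the oscillating term is mostly averaged out, so the smoothed gradient stays close to that of the underlying quadratic. Shrinking sigma over time recovers the original objective.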
Parameters:
- modules (Chainable | None, default: None) – modules that optimize the modified objective. If None, the gradient of the modified objective is returned as the update. Defaults to None.
- sigma (float, default: 1.0) – initial magnitude of the perturbations. Defaults to 1.
- n (int, default: 100) – number of perturbations per step. Defaults to 100.
- aggregate (str, default: 'mean') – how to aggregate values and gradients. Defaults to 'mean'.
  - "mean" - uses the mean of the gradients, as in Gaussian homotopy.
  - "max" - uses the element-wise maximum of the gradients.
  - "min" - uses the element-wise minimum of the gradients.
  - "min-norm" - picks the gradient with the lowest norm.
- distribution (Literal, default: 'gaussian') – distribution for random perturbations. Defaults to 'gaussian'.
- include_x0 (bool, default: True) – whether to include the gradient at the un-perturbed point. Defaults to True.
- fixed (bool, default: True) – if True, perturbations are not replaced by new random perturbations until the termination criteria are satisfied. Defaults to True.
- pre_generate (bool, default: True) – if True, perturbations are pre-generated before each step. This requires more memory to store all of them, but ensures they do not change when the closure is evaluated multiple times. Defaults to True.
- termination (TerminationCriteriaBase | Sequence[TerminationCriteriaBase] | None, default: None) – a termination criteria module. When the termination criteria are satisfied, sigma is multiplied by decay, and new perturbations are generated if fixed is True. Defaults to None.
- decay (float, default: 0.6666666666666666) – sigma multiplier applied when the termination criteria are satisfied. Defaults to 2/3.
- reset_on_termination (bool, default: True) – whether to reset the states of all other modules on termination. Defaults to True.
- sigma_strategy (str | None, default: None) – strategy for adapting sigma. If the condition is satisfied, sigma is multiplied by sigma_nplus, otherwise it is multiplied by sigma_nminus. Defaults to None.
  - "grad-norm" - at least sigma_target gradients should have a lower norm than the gradient at the un-perturbed point.
  - "value" - at least sigma_target values (losses) should be lower than the value at the un-perturbed point.
  - None - doesn't use adaptive sigma.
  This introduces a side effect to the closure, so it should be left at None if you use a trust region or line search to optimize the modified objective.
- sigma_target (int, default: 0.2) – number of elements that must satisfy the condition in sigma_strategy. Defaults to 0.2.
- sigma_nplus (float, default: 1.3333333333333333) – sigma multiplier when the sigma_strategy condition is satisfied. Defaults to 4/3.
- sigma_nminus (float, default: 0.6666666666666666) – sigma multiplier when the sigma_strategy condition is not satisfied. Defaults to 2/3.
- seed (int | None, default: None) – random seed. Defaults to None.
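The aggregate modes and the sigma adaptation rule described in the parameters above can be sketched in plain Python. This is an illustrative reading of the documented behavior, not torchzero's code: the helper names `aggregate` and `adapt_sigma` are hypothetical, and `target_count` is treated as an integer count, which is an assumption about how sigma_target is interpreted.

```python
import math


def aggregate(grads, mode="mean"):
    """Combine a list of gradient vectors (lists of floats) into one."""
    if mode == "mean":
        return [sum(col) / len(grads) for col in zip(*grads)]
    if mode == "max":
        return [max(col) for col in zip(*grads)]
    if mode == "min":
        return [min(col) for col in zip(*grads)]
    if mode == "min-norm":
        # pick the whole gradient vector with the smallest Euclidean norm
        return min(grads, key=lambda g: math.sqrt(sum(v * v for v in g)))
    raise ValueError(f"unknown aggregate mode: {mode!r}")


def adapt_sigma(sigma, x0_score, perturbed_scores, target_count=1,
                nplus=4 / 3, nminus=2 / 3):
    """Grow sigma if enough perturbed points beat the un-perturbed one.

    Scores are gradient norms ("grad-norm") or losses ("value").
    target_count is an integer count here (an assumption, see lead-in).
    """
    n_better = sum(1 for s in perturbed_scores if s < x0_score)
    return sigma * (nplus if n_better >= target_count else nminus)


grads = [[1.0, -4.0], [3.0, 0.0], [-2.0, 1.0]]
for mode in ("mean", "max", "min", "min-norm"):
    print(mode, aggregate(grads, mode))

# one of three perturbed scores is below the un-perturbed score of 2.0
print(adapt_sigma(1.0, 2.0, [3.0, 1.5, 2.5], target_count=1))
```

Note that "max" and "min" mix components from different samples, whereas "min-norm" returns one sampled gradient unchanged.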
Source code in torchzero/modules/smoothing/sampling.py
LaplacianSmoothing
Bases: torchzero.core.transform.Transform
Applies Laplacian smoothing, which can improve generalization, using a fast Fourier transform solver.
Parameters:
- sigma (float, default: 1) – controls the amount of smoothing. Defaults to 1.
- layerwise (bool, default: True) – if True, applies smoothing to each parameter's gradient separately; otherwise applies it to all gradients concatenated into a single vector. Defaults to True.
- min_numel (int, default: 4) – minimum number of elements in a parameter to apply Laplacian smoothing to. Only has an effect if layerwise is True. Defaults to 4.
- target (str, default: 'update') – what to set on var.
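To make the FFT-based solve concrete, here is a minimal sketch following the formulation in the paper cited under Reference: the smoothed gradient g_s solves (I - sigma * L) g_s = g, where L is the one-dimensional discrete Laplacian with periodic boundary. Because that matrix is circulant, the system is diagonal in the Fourier basis. The code is not torchzero's implementation; it uses a naive DFT from the standard library as a stand-in for an FFT, and the names `dft` and `laplacian_smooth` are hypothetical.

```python
import cmath
import math


def dft(values, inverse=False):
    """Naive O(n^2) discrete Fourier transform (stand-in for an FFT)."""
    n = len(values)
    sign = 1.0 if inverse else -1.0
    out = []
    for k in range(n):
        acc = 0j
        for j, v in enumerate(values):
            acc += v * cmath.exp(sign * 2j * math.pi * j * k / n)
        out.append(acc / n if inverse else acc)
    return out


def laplacian_smooth(grad, sigma=1.0):
    """Solve (I - sigma * L) g_s = grad, with L the periodic 1D Laplacian."""
    n = len(grad)
    # Eigenvalues of the circulant matrix I - sigma * L in the Fourier basis:
    # 1 + 2 * sigma * (1 - cos(2 * pi * k / n)), all at least 1.
    denom = [1.0 + 2.0 * sigma * (1.0 - math.cos(2.0 * math.pi * k / n))
             for k in range(n)]
    spectrum = dft(grad)
    smoothed = dft([s / d for s, d in zip(spectrum, denom)], inverse=True)
    return [z.real for z in smoothed]


grad = [1.0, -3.0, 2.0, 0.5, -1.5, 4.0]
smoothed = laplacian_smooth(grad, sigma=1.0)
```

The zero-frequency eigenvalue is 1, so the sum of the gradient is preserved while higher-frequency components are damped. Larger sigma damps them more strongly.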
Examples:
Laplacian Smoothing Gradient Descent optimizer, as in the paper:

.. code-block:: python

    import torchzero as tz

    # `model` is assumed to be an existing torch.nn.Module
    opt = tz.Modular(
        model.parameters(),
        tz.m.LaplacianSmoothing(),
        tz.m.LR(1e-2),
    )

Reference
Osher, S., Wang, B., Yin, P., Luo, X., Barekat, F., Pham, M., & Lin, A. (2022). Laplacian smoothing gradient descent. Research in the Mathematical Sciences, 9(3), 55.