ocp.v1.training.preservation_policies module
Defines policies for when a checkpoint is preserved.
PreservationPolicy
-
class orbax.checkpoint.experimental.v1.training.preservation_policies.PreservationPolicy(*args, **kwargs)
Bases: Protocol
A policy that defines when checkpoints should be preserved.
-
should_preserve(checkpoints, *, context)
Indicates which checkpoints should be preserved.
- Return type:
Sequence[bool]
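As a rough illustration of the protocol above, the sketch below defines a local stand-in with the same method signature and a toy policy that satisfies it. It does not import Orbax. The `CheckpointInfo` class, its `step` field, the `EvenSteps` policy, and the idea that the result holds one boolean per checkpoint in input order are all assumptions made for this example; the reference above does not describe the element type of `checkpoints`.

```python
from dataclasses import dataclass
from typing import Any, List, Protocol, Sequence


@dataclass
class CheckpointInfo:
    """Hypothetical stand-in for one entry of `checkpoints`."""
    step: int


class PreservationPolicySketch(Protocol):
    """Local mirror of the documented signature (not the Orbax class)."""

    def should_preserve(
        self, checkpoints: Sequence[CheckpointInfo], *, context: Any
    ) -> Sequence[bool]:
        ...


class EvenSteps:
    """Hypothetical policy: preserve checkpoints whose step is even."""

    def should_preserve(self, checkpoints, *, context):
        # Assumption: one boolean per checkpoint, in input order.
        return [c.step % 2 == 0 for c in checkpoints]


def run_policy(policy: PreservationPolicySketch, steps: Sequence[int]) -> List[bool]:
    """Wraps plain step numbers and asks the policy for its mask."""
    infos = [CheckpointInfo(step=s) for s in steps]
    return list(policy.should_preserve(infos, context=None))


mask = run_policy(EvenSteps(), [1, 2, 3, 4])
print(mask)  # [False, True, False, True]
```

Because the stand-in is a Protocol, `EvenSteps` does not need to inherit from it; matching the method signature is enough for static type checkers.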
PreserveAll
-
class orbax.checkpoint.experimental.v1.training.preservation_policies.PreserveAll
Preserves all checkpoints.
-
should_preserve(checkpoints, *, context)
Indicates which checkpoints should be preserved.
- Return type:
Sequence[bool]
LatestN
-
class orbax.checkpoint.experimental.v1.training.preservation_policies.LatestN(n=None)
Preserves the last n checkpoints. Preserves all checkpoints if n is None.
-
should_preserve(checkpoints, *, context)
Indicates which checkpoints should be preserved.
- Return type:
Sequence[bool]
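The LatestN rule can be sketched in plain Python as a mask over an ordered list of steps. This is a minimal illustration rather than the library's implementation; the helper name `latest_n_mask` is hypothetical, and it assumes the input is ordered from oldest to newest.

```python
from typing import List, Optional, Sequence


def latest_n_mask(steps: Sequence[int], n: Optional[int]) -> List[bool]:
    """Marks the last n entries of `steps` (assumed ordered oldest to newest)."""
    if n is None:
        # n=None means every checkpoint is preserved.
        return [True] * len(steps)
    first_kept = max(len(steps) - n, 0)
    return [i >= first_kept for i in range(len(steps))]


keep_two = latest_n_mask([10, 20, 30, 40], n=2)
keep_all = latest_n_mask([10, 20, 30, 40], n=None)
print(keep_two)  # [False, False, True, True]
print(keep_all)  # [True, True, True, True]
```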
EveryNSeconds
-
class orbax.checkpoint.experimental.v1.training.preservation_policies.EveryNSeconds(interval_secs)
Ensures that checkpoints are preserved at intervals of at least interval_secs seconds.
-
should_preserve(checkpoints, *, context)
Indicates which checkpoints should be preserved.
- Return type:
Sequence[bool]
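One plausible reading of a time-interval policy is sketched below: a checkpoint is kept once at least `interval_secs` seconds have passed since the previously kept one. The helper name `every_n_seconds_mask`, the use of plain `datetime` timestamps, and the choice to always keep the first checkpoint are assumptions for this example, not documented behavior.

```python
import datetime
from typing import List, Optional, Sequence


def every_n_seconds_mask(
    times: Sequence[datetime.datetime], interval_secs: int
) -> List[bool]:
    """Keeps a checkpoint when enough time has passed since the last kept one."""
    interval = datetime.timedelta(seconds=interval_secs)
    mask: List[bool] = []
    last_kept: Optional[datetime.datetime] = None
    for t in times:
        # Assumption: the first checkpoint is always kept.
        keep = last_kept is None or (t - last_kept) >= interval
        mask.append(keep)
        if keep:
            last_kept = t
    return mask


start = datetime.datetime(2024, 1, 1)
times = [start + datetime.timedelta(seconds=s) for s in (0, 30, 60, 90, 200)]
mask = every_n_seconds_mask(times, interval_secs=60)
print(mask)  # [True, False, True, False, True]
```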
EveryNSteps
-
class orbax.checkpoint.experimental.v1.training.preservation_policies.EveryNSteps(interval_steps, exact_interval=True, max_to_keep=None)
Preserves checkpoints after at least N steps.
-
should_preserve(checkpoints, *, context)
Indicates which checkpoints should be preserved.
- Return type:
Sequence[bool]
CustomSteps
-
class orbax.checkpoint.experimental.v1.training.preservation_policies.CustomSteps(steps)
Preserves checkpoints at the given steps.
-
should_preserve(checkpoints, *, context)
Indicates which checkpoints should be preserved.
- Return type:
Sequence[bool]
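Preserving checkpoints at given steps amounts to a membership test. The sketch below shows that idea on plain step numbers; `custom_steps_mask` is a hypothetical helper, not part of the library.

```python
from typing import Iterable, List, Sequence


def custom_steps_mask(checkpoint_steps: Sequence[int], steps: Iterable[int]) -> List[bool]:
    """Marks checkpoints whose step appears in the requested `steps`."""
    wanted = set(steps)  # set lookup keeps the membership test cheap
    return [s in wanted for s in checkpoint_steps]


mask = custom_steps_mask([100, 200, 300, 400], steps=[200, 400, 999])
print(mask)  # [False, True, False, True]
```

A requested step with no matching checkpoint (999 here) simply has no effect in this sketch.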
BestN
-
class orbax.checkpoint.experimental.v1.training.preservation_policies.BestN(*, get_metric_fn, reverse=False, n=None, keep_checkpoints_without_metrics=True)
A policy that preserves the best checkpoints, ranked by get_metric_fn.
- get_metric_fn:
A function that accepts a nested tree of metrics and returns a scalar value
representing the value used for ranking checkpoints.
- reverse:
If False (default), checkpoints are sorted in ascending order of the value
returned by get_metric_fn. If True, checkpoints are sorted in descending
order. This matches the semantics of the built-in sorted() function.
- n:
The number of checkpoints to preserve. If None, all checkpoints are
preserved. If 0, no checkpoints are preserved.
-
should_preserve(checkpoints, *, context)
Indicates which checkpoints should be preserved.
- Return type:
Sequence[bool]
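To make the interaction of get_metric_fn, reverse, and n concrete, here is a small standalone sketch. It assumes that the policy keeps the last n entries of the sorted order; the reference above does not state which end is kept, so treat that as an assumption. The helper `best_n_mask` and the metric trees are hypothetical, and keep_checkpoints_without_metrics is not modeled.

```python
from typing import Any, Callable, List, Optional, Sequence


def best_n_mask(
    metrics: Sequence[Any],
    get_metric_fn: Callable[[Any], float],
    reverse: bool = False,
    n: Optional[int] = None,
) -> List[bool]:
    """Ranks checkpoints by get_metric_fn and keeps the last n in sorted order."""
    if n is None:
        return [True] * len(metrics)
    if n == 0:
        return [False] * len(metrics)
    order = sorted(
        range(len(metrics)),
        key=lambda i: get_metric_fn(metrics[i]),
        reverse=reverse,
    )
    # Assumption: the "best" checkpoints sit at the end of the sorted order.
    kept = set(order[-n:])
    return [i in kept for i in range(len(metrics))]


accuracy_trees = [
    {"eval": {"accuracy": 0.71}},
    {"eval": {"accuracy": 0.90}},
    {"eval": {"accuracy": 0.85}},
]
by_accuracy = best_n_mask(accuracy_trees, lambda m: m["eval"]["accuracy"], n=2)

loss_trees = [{"loss": 0.9}, {"loss": 0.3}, {"loss": 0.5}]
by_loss = best_n_mask(loss_trees, lambda m: m["loss"], reverse=True, n=2)

print(by_accuracy)  # [False, True, True]
print(by_loss)      # [False, True, True]
```

Under this assumption, reverse=False suits metrics where higher is better, and reverse=True suits metrics such as loss where lower is better.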
LatestDuration
-
class orbax.checkpoint.experimental.v1.training.preservation_policies.LatestDuration(duration)
Preserves checkpoints that are newer than the given duration.
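For example, to retain checkpoints saved within the last 24 hours:
import datetime
from orbax.checkpoint.experimental.v1.training.preservation_policies import LatestDuration
policy = LatestDuration(datetime.timedelta(hours=24))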
-
should_preserve(checkpoints, *, context)
Indicates which checkpoints should be preserved.
- Return type:
Sequence[bool]
EveryNStepsClosest
-
class orbax.checkpoint.experimental.v1.training.preservation_policies.EveryNStepsClosest(interval_steps, max_to_keep=None)
Preserves checkpoints at steps closest to absolute multiples of N.
This policy maps each checkpoint to its closest nominal target step on a grid
defined by interval_steps (i.e. k * interval_steps). For each nominal
target, the closest available checkpoint is preserved.
This avoids the error accumulation/drift that can occur with
EveryNSteps(exact_interval=False) when checkpoints are irregular.
The last checkpoint is always preserved for final model state and efficient
recovery.
-
should_preserve(checkpoints, *, context)
Indicates which checkpoints should be preserved.
- Return type:
Sequence[bool]
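The grid mapping described for EveryNStepsClosest can be sketched as follows. Each step is assigned to its nearest multiple of interval_steps, the closest checkpoint per multiple is kept, and the last checkpoint is always kept. The helper name `closest_to_grid_mask`, the tie-breaking (a step exactly halfway between two targets rounds up, and the earlier checkpoint wins an equal-distance tie), and the omission of max_to_keep are assumptions of this example.

```python
from typing import Dict, List, Sequence


def closest_to_grid_mask(steps: Sequence[int], interval_steps: int) -> List[bool]:
    """Keeps, for each grid target k * interval_steps, the closest checkpoint."""
    best: Dict[int, int] = {}  # grid target -> index of closest checkpoint so far
    for i, step in enumerate(steps):
        # Nearest multiple of interval_steps, using integer arithmetic.
        k = (step + interval_steps // 2) // interval_steps
        target = k * interval_steps
        j = best.get(target)
        if j is None or abs(step - target) < abs(steps[j] - target):
            best[target] = i
    kept = set(best.values())
    if steps:
        kept.add(len(steps) - 1)  # the last checkpoint is always preserved
    return [i in kept for i in range(len(steps))]


irregular = [0, 130, 190, 260, 320, 390]
mask = closest_to_grid_mask(irregular, interval_steps=100)
print(mask)  # [True, True, True, False, True, True]
```

Here steps 260 and 320 both map to the target 300, and 320 wins because it is closer. Since targets are absolute multiples of interval_steps, an irregular checkpoint does not shift the targets that follow it.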
AnyPreservationPolicy
-
class orbax.checkpoint.experimental.v1.training.preservation_policies.AnyPreservationPolicy(policies)
Applies multiple preservation policies and preserves a checkpoint if any of them preserves it.
-
should_preserve(checkpoints, *, context)
Indicates which checkpoints should be preserved.
- Return type:
Sequence[bool]
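Combining policies with "any" semantics is an element-wise OR over the masks that each policy returns. The sketch below shows this with local stand-in classes; `AnyOf`, `KeepLast`, and `KeepSteps` are hypothetical names for this example, and checkpoints are represented as plain step numbers rather than the library's checkpoint objects.

```python
from typing import Any, List, Sequence


class KeepLast:
    """Hypothetical policy: keep only the final checkpoint."""

    def should_preserve(self, checkpoints: Sequence[int], *, context: Any) -> List[bool]:
        return [i == len(checkpoints) - 1 for i in range(len(checkpoints))]


class KeepSteps:
    """Hypothetical policy: keep checkpoints at the listed steps."""

    def __init__(self, steps):
        self.steps = set(steps)

    def should_preserve(self, checkpoints: Sequence[int], *, context: Any) -> List[bool]:
        return [s in self.steps for s in checkpoints]


class AnyOf:
    """Keeps a checkpoint if at least one wrapped policy keeps it."""

    def __init__(self, policies):
        self.policies = list(policies)

    def should_preserve(self, checkpoints: Sequence[int], *, context: Any) -> List[bool]:
        masks = [p.should_preserve(checkpoints, context=context) for p in self.policies]
        if not masks:
            return [False] * len(checkpoints)
        # Element-wise OR across the per-policy masks.
        return [any(flags) for flags in zip(*masks)]


combined = AnyOf([KeepLast(), KeepSteps([100])])
mask = combined.should_preserve([100, 200, 300], context=None)
print(mask)  # [True, False, True]
```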
PreservationContext
-
class orbax.checkpoint.experimental.v1.training.preservation_policies.PreservationContext
Additional properties for making a preservation decision.
-
__eq__(other)
Return self==value.
-
__hash__ = None
-
__init__()