AbstractCheckpointManager
Abstract class to manage checkpoints: AbstractCheckpointManager.
- class orbax.checkpoint.AbstractCheckpointManager(*args, **kwargs)
Interface to manage checkpoints.
Allows a user to save and restore objects for which a Checkpointer implementation exists (e.g. PyTreeCheckpointer for PyTrees). The class keeps track of multiple checkpointable objects in the following structure:
    path/to/directory/    (top-level directory)
      0/    (step)
        params/    (first saveable)
          ...
        metadata/    (second saveable)
          ...
      1/    (step)
        ...
      2/    (step)
        ...
      ...
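
The layout above can be explored with standard-library Python. The following is a minimal sketch, not Orbax code: `list_steps` and `list_items` are hypothetical helpers, and the sketch assumes step directories are named by bare integers exactly as in the diagram (a real manager may use prefixes or temporary directories).

```python
import tempfile
from pathlib import Path


def list_steps(directory: Path) -> list[int]:
    """Return the integer-named subdirectories of `directory`, sorted."""
    return sorted(
        int(p.name) for p in directory.iterdir() if p.is_dir() and p.name.isdigit()
    )


def list_items(directory: Path, step: int) -> list[str]:
    """Return the names of the saveable items stored under one step."""
    return sorted(p.name for p in (directory / str(step)).iterdir() if p.is_dir())


# Build a throwaway directory tree that mimics the documented structure.
root = Path(tempfile.mkdtemp())
for step in (0, 1, 2):
    for item in ("params", "metadata"):
        (root / str(step) / item).mkdir(parents=True)

steps = list_steps(root)
items = list_items(root, 0)
print(steps)  # [0, 1, 2]
print(items)  # ['metadata', 'params']
```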
- abstract property directory: Path
Returns the top-level directory containing checkpoints for all items.
- Return type:
Path
- abstractmethod all_steps(read=False)
Returns all steps tracked by the manager.
- Parameters:
read (bool) – If True, forces a read directly from the storage location. Otherwise, a cached result can be returned.
- Return type:
Sequence[int]
- Returns:
A sequence of steps (integers).
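
The difference between a cached result and a forced read can be illustrated with a small stand-in class. This is a sketch under stated assumptions, not the Orbax implementation: `CachedStepLister` is a hypothetical name, and it treats integer-named subdirectories as steps.

```python
import tempfile
from pathlib import Path
from typing import Optional, Sequence


class CachedStepLister:
    """Hypothetical helper illustrating cached vs. forced reads of steps."""

    def __init__(self, directory: Path):
        self._directory = directory
        self._cache: Optional[list[int]] = None  # filled on first read

    def _read_from_storage(self) -> list[int]:
        return sorted(
            int(p.name)
            for p in self._directory.iterdir()
            if p.is_dir() and p.name.isdigit()
        )

    def all_steps(self, read: bool = False) -> Sequence[int]:
        # Re-scan storage when asked to, or when nothing is cached yet.
        if read or self._cache is None:
            self._cache = self._read_from_storage()
        return list(self._cache)

    def reload(self) -> None:
        # Drop the cache so the next call re-scans the directory.
        self._cache = None


root = Path(tempfile.mkdtemp())
(root / "0").mkdir()
lister = CachedStepLister(root)
first = lister.all_steps()           # scans the directory and caches [0]

(root / "1").mkdir()                 # external change the cache cannot see
cached = lister.all_steps()          # still the cached result
fresh = lister.all_steps(read=True)  # forced read picks up step 1
```

In this sketch, `reload()` only discards the cache, so the next `all_steps()` call behaves like a forced read.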
- abstractmethod latest_step()
Returns the latest step saved.
Returns None if no steps have been saved.
- Return type:
Optional[int]
- Returns:
A step (int) or None if no steps are present.
- abstractmethod best_step()
Returns the best step saved, as defined by options.best_fn.
Returns None if no steps have been saved.
- Return type:
Optional[int]
- Returns:
A step (int) or None if no steps are present.
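
The contract shared by latest_step and best_step (an integer step, or None when nothing has been saved) can be sketched in plain Python. The functions below are illustrative stand-ins, not the library's code. In particular, this sketch assumes that `best_fn` maps a step's metrics to a number and that a higher number is better; the actual ranking direction is determined by the manager's options.

```python
from typing import Any, Callable, Mapping, Optional


def latest_step(steps: list[int]) -> Optional[int]:
    """Return the largest step, or None when no steps exist."""
    return max(steps) if steps else None


def best_step(
    metrics_by_step: Mapping[int, Mapping[str, Any]],
    best_fn: Callable[[Mapping[str, Any]], float],
) -> Optional[int]:
    """Return the step whose metrics score highest under best_fn, or None."""
    if not metrics_by_step:
        return None
    return max(metrics_by_step, key=lambda step: best_fn(metrics_by_step[step]))


# Hypothetical metrics recorded at three steps.
metrics = {0: {"accuracy": 0.71}, 1: {"accuracy": 0.88}, 2: {"accuracy": 0.83}}

print(latest_step(list(metrics)))                   # 2
print(best_step(metrics, lambda m: m["accuracy"]))  # 1
print(latest_step([]))                              # None
```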
- abstractmethod reload()
Reloads internal properties.
Resets internal cache of checkpoint steps, in case the directory managed by this object has been updated externally.
- abstractmethod reached_preemption(step)
Returns True if a preemption sync point has been reached.
- Return type:
bool
- abstractmethod should_save(step)
Returns True if a checkpoint should be saved for the current step.
This depends on the previous step and the save interval.
- Parameters:
step (int) – The current step.
- Return type:
bool
- Returns:
True if the checkpoint should be saved.
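
One way a save decision can depend on the previous step and a save interval is sketched below. This is an illustrative policy only: the free function and the parameter names `last_saved_step` and `save_interval_steps` are assumptions made for the example, and a concrete manager may apply additional conditions.

```python
from typing import Optional


def should_save(
    step: int, last_saved_step: Optional[int], save_interval_steps: int
) -> bool:
    """Illustrative policy only; parameter names are assumptions."""
    if last_saved_step is None:
        return True  # nothing saved yet, so save the first step seen
    if step <= last_saved_step:
        return False  # never re-save an old or already saved step
    return step % save_interval_steps == 0


# Simulate a training loop that consults the policy at every step.
last_saved = None
saved = []
for step in range(7):
    if should_save(step, last_saved, save_interval_steps=3):
        saved.append(step)
        last_saved = step

print(saved)  # [0, 3, 6]
```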
- abstractmethod restore(step, *args, **kwargs)
Restores the given step.
- Return type:
Union[Any,Mapping[str,Any],CompositeArgs]
- abstractmethod item_metadata(step)
Returns metadata for all known items.
- Return type:
Union[Any,Mapping[str,Any],CompositeArgs]
- abstractmethod metadata(step=None)
Returns StepMetadata for the specified step, or RootMetadata for all steps.
If step is specified, only return StepMetadata for that step. Otherwise, return RootMetadata.
- Parameters:
step (int | None) – Step for which to retrieve StepMetadata. If None, returns RootMetadata.
- Return type:
StepMetadata | RootMetadata
- Returns:
Metadata for the specified step (StepMetadata), or all steps (RootMetadata).
- abstractmethod metrics(step)
Returns metrics for step, if present.
- Return type:
Optional[Any]
- abstractmethod wait_until_finished()
Blocks until any incomplete save operations are completed.
Note that this method will typically be a no-op if all checkpointers are synchronous, since old checkpoints are already cleaned up immediately after completing save, and there is no background thread to wait for.
If some checkpointers are of type AsyncCheckpointer, however, this method will wait until each of these checkpointers is finished.
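
The blocking behavior described above can be illustrated with plain threads. `BackgroundSaver` below is a hypothetical stand-in for an asynchronous checkpointer, written only to show the pattern: `save` returns immediately, and `wait_until_finished` joins whatever is still running.

```python
import threading
import time


class BackgroundSaver:
    """Hypothetical stand-in for an asynchronous checkpointer."""

    def __init__(self):
        self._threads: list[threading.Thread] = []
        self.completed: list[int] = []

    def save(self, step: int) -> None:
        # Return immediately; the "write" continues in a background thread.
        thread = threading.Thread(target=self._write, args=(step,))
        thread.start()
        self._threads.append(thread)

    def _write(self, step: int) -> None:
        time.sleep(0.05)  # simulate slow storage
        self.completed.append(step)

    def wait_until_finished(self) -> None:
        # Block until every outstanding background write has completed.
        for thread in self._threads:
            thread.join()
        self._threads.clear()


saver = BackgroundSaver()
saver.save(0)
saver.save(1)
saver.wait_until_finished()
finished = sorted(saver.completed)
```

With no outstanding threads, `wait_until_finished` in this sketch returns at once, which mirrors the no-op case for synchronous checkpointers.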
- abstractmethod check_for_errors()
Checks for any outstanding errors in completed asynchronous save operations.
Delegates to underlying Checkpointer.
- abstractmethod close()
Waits for outstanding operations to finish and closes Checkpointers.
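
A failure in a background save does not surface at the call site, which is why an explicit error check and a close step are useful. The sketch below shows that pattern with `concurrent.futures`. `ErrorTrackingSaver` and its `fail` flag are hypothetical and exist only for illustration; they are not part of Orbax.

```python
from concurrent.futures import Future, ThreadPoolExecutor


class ErrorTrackingSaver:
    """Hypothetical saver that records errors from background writes."""

    def __init__(self):
        self._executor = ThreadPoolExecutor(max_workers=1)
        self._futures: list[Future] = []

    def save(self, step: int, fail: bool = False) -> None:
        # Submit the write and return without waiting for it.
        self._futures.append(self._executor.submit(self._write, step, fail))

    @staticmethod
    def _write(step: int, fail: bool) -> None:
        if fail:
            raise OSError(f"could not write step {step}")

    def check_for_errors(self) -> None:
        # Inspect only saves that have already finished; keep the rest pending.
        pending, errors = [], []
        for future in self._futures:
            if not future.done():
                pending.append(future)
            elif future.exception() is not None:
                errors.append(future.exception())
        self._futures = pending
        if errors:
            raise errors[0]

    def close(self) -> None:
        # Wait for outstanding writes, then release the worker thread.
        self._executor.shutdown(wait=True)


saver = ErrorTrackingSaver()
saver.save(0)
saver.save(1, fail=True)
saver.close()  # waits for both background writes

try:
    saver.check_for_errors()
    message = None
except OSError as error:
    message = str(error)
```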
- classmethod __subclasshook__(other)
Abstract classes can override this to customize issubclass().
This is invoked early on by abc.ABCMeta.__subclasscheck__(). It should return True, False or NotImplemented. If it returns NotImplemented, the normal algorithm is used. Otherwise, it overrides the normal algorithm (and the outcome is cached).
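
The hook mechanism described above is standard `abc` behavior and can be demonstrated independently of Orbax. In the sketch below, `SupportsLatestStep`, `DuckManager`, and `Unrelated` are hypothetical classes; the example does not describe how AbstractCheckpointManager itself implements the hook.

```python
from abc import ABC, abstractmethod


class SupportsLatestStep(ABC):
    """Hypothetical ABC that accepts any class defining latest_step()."""

    @abstractmethod
    def latest_step(self): ...

    @classmethod
    def __subclasshook__(cls, other):
        if cls is SupportsLatestStep:
            # Structural check: look for the method anywhere in the MRO.
            if any("latest_step" in vars(base) for base in other.__mro__):
                return True
        return NotImplemented  # fall back to the normal algorithm


class DuckManager:  # does not inherit from SupportsLatestStep
    def latest_step(self):
        return 3


class Unrelated:
    pass


is_duck = issubclass(DuckManager, SupportsLatestStep)
is_unrelated = issubclass(Unrelated, SupportsLatestStep)
is_instance = isinstance(DuckManager(), SupportsLatestStep)
print(is_duck, is_unrelated, is_instance)  # True False True
```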