ocp.v1.multihost module

Multihost utilities.

orbax.checkpoint.experimental.v1.multihost.is_pathways_backend()
Return type:

bool

async orbax.checkpoint.experimental.v1.multihost.sync_global_processes(key, *, operation_id, timeout=None, processes=None, record_event_name='/jax/checkpoint/sync_global_devices_duration_sec')

Barrier to sync concurrent processes.

NOTE: The barrier name must be unique: no process should wait on the same barrier name more than once.

Parameters:
  • key (str) – barrier name. Must be unique.

  • operation_id (str) – The barrier name will be prefixed with the operation id.

  • timeout (Optional[int]) – Timeout in seconds.

  • processes (Optional[Collection[int]]) – If None, waits across all processes and devices. Otherwise, creates a barrier only across the devices associated with the given processes.

  • record_event_name (str) – The name of the event to record the duration of the synchronization.
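
The uniqueness requirement on `key` can be met in several ways. The following is a minimal sketch of one of them, assuming every process reaches its barriers in the same order. `unique_barrier_key` and `wait_at_barrier` are hypothetical helpers, not part of Orbax, and the barrier function is passed in as `sync_fn` so the sketch runs without a multi-process setup.

```python
import asyncio
import itertools

# Hypothetical helper state, not part of Orbax: a per-process counter that
# makes every generated key distinct. All processes must request keys in the
# same order so that they agree on the key used for each barrier.
_barrier_counter = itertools.count()


def unique_barrier_key(prefix: str) -> str:
    """Returns a key that this process has not produced before."""
    return f"{prefix}_{next(_barrier_counter)}"


async def wait_at_barrier(sync_fn, prefix: str, *, operation_id: str, timeout=None) -> str:
    """Awaits `sync_fn` with a fresh key and returns the key that was used.

    `sync_fn` is assumed to accept the same arguments as the
    `sync_global_processes` signature documented above.
    """
    key = unique_barrier_key(prefix)
    await sync_fn(key, operation_id=operation_id, timeout=timeout)
    return key
```

In a real program, `sync_fn` would presumably be `orbax.checkpoint.experimental.v1.multihost.sync_global_processes`, awaited from within an existing event loop.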

orbax.checkpoint.experimental.v1.multihost.is_primary_host(primary_host)

orbax.checkpoint.experimental.v1.multihost.process_count()
Return type:

int

orbax.checkpoint.experimental.v1.multihost.process_index()
Return type:

int
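
One typical use of a process index and process count is to split work across processes. The sketch below shows a round-robin split, assuming the index lies in the range `0 .. process_count - 1`. `items_for_process` is a hypothetical helper, not part of Orbax, and takes the two values as plain integers so it can be run on a single machine.

```python
def items_for_process(items, process_index: int, process_count: int) -> list:
    """Returns the round-robin share of `items` for one process.

    Hypothetical helper: `process_index` and `process_count` are assumed to be
    the values reported by the functions documented above.
    """
    if process_count <= 0 or not 0 <= process_index < process_count:
        raise ValueError("process_index must be in [0, process_count)")
    # Process i takes items i, i + process_count, i + 2 * process_count, ...
    return list(items)[process_index::process_count]
```

Each item is assigned to exactly one process, so the shares of all processes together cover the input without overlap.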