gyoza.client

gyoza server API client.

Provides both the GyozaClient class for explicit construction and a pre-configured gyoza_client singleton for use by other modules.

Environment variables

GYOZA_SERVER_URL

Base URL of the gyoza server. Defaults to http://localhost:5555.

GYOZA_API_KEY

API key for authentication. Required.
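As a rough illustration of how these two variables might be resolved, the sketch below reads them from an environment-style mapping and applies the documented default URL. `load_settings` is a hypothetical helper, not part of gyoza, and raising an error on a missing key is an assumption about how "Required" is enforced.

```python
import os
from typing import Mapping

DEFAULT_SERVER_URL = "http://localhost:5555"  # documented default


def load_settings(env: Mapping[str, str] = os.environ) -> dict[str, str]:
    """Resolve the server URL and API key from an environment-style mapping.

    Hypothetical helper for illustration; not part of the gyoza package.
    """
    base_url = env.get("GYOZA_SERVER_URL", DEFAULT_SERVER_URL)
    api_key = env.get("GYOZA_API_KEY")
    if not api_key:
        # Assumption: a missing key is treated as a configuration error.
        raise RuntimeError("GYOZA_API_KEY is required but not set")
    return {"base_url": base_url, "api_key": api_key}
```

The returned values line up with the `base_url` and `api_key` constructor parameters of GyozaClient.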

class gyoza.client.GyozaClient(base_url, api_key=None, timeout=30.0)

Bases: object

HTTP client covering all gyoza server API endpoints.

Parameters:
  • base_url (str) – Base URL of the gyoza server (e.g. "http://localhost:5555").

  • api_key (str | None) – API key sent as the X-API-Key header on every request.

  • timeout (float) – Per-request timeout in seconds.

upsert_definition(op_dict)

Create or replace an OpDefinition on the server.

Parameters:

op_dict (dict[str, Any]) – Serialised OpDefinition payload (from OpDefinition.to_dict()).

Returns:

The created or updated OpDefinition as returned by the server.

Return type:

dict[str, Any]

Raises:

httpx.HTTPStatusError – On non-2xx responses.

get_definition(name, version=None)

Fetch an OpDefinition by name and optional version.

Parameters:
  • name (str) – OpDefinition identifier.

  • version (str | None) – Specific version to retrieve; omit to get the latest.

Returns:

The OpDefinition data.

Return type:

dict[str, Any]

Raises:

httpx.HTTPStatusError – 404 if the definition does not exist.
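Callers that treat a missing definition as a normal outcome could wrap the 404 case along these lines. `get_definition_or_none` is a hypothetical helper, and `client` is assumed to be a GyozaClient. The status code is read from the exception's `response` attribute, which httpx.HTTPStatusError provides, so the sketch does not need to import httpx itself.

```python
from typing import Any, Optional


def get_definition_or_none(client: Any, name: str,
                           version: Optional[str] = None) -> Optional[dict[str, Any]]:
    """Return the definition, or None when the server answers 404.

    Hypothetical helper; any error other than a 404 is re-raised unchanged.
    """
    try:
        return client.get_definition(name, version=version)
    except Exception as exc:
        # httpx.HTTPStatusError exposes the failed response as `exc.response`.
        response = getattr(exc, "response", None)
        if getattr(response, "status_code", None) == 404:
            return None
        raise
```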

list_definitions()

Fetch all registered OpDefinitions (all versions).

Returns:

All OpDefinitions stored on the server.

Return type:

list[dict[str, Any]]

Raises:

httpx.HTTPStatusError – On non-2xx responses.

create_run(image, inputs, *, priority=5, constraints=None, retry_policy=None, event_delivery=None)

Create an ad-hoc OpRun without an OpDefinition.

Parameters:
  • image (str) – Docker image reference to execute.

  • inputs (dict[str, Any]) – Input parameters for the run.

  • priority (int) – Scheduling priority (higher value = higher priority).

  • constraints (dict[str, Any] | None) – Hardware requirements override.

  • retry_policy (dict[str, Any] | None) – Retry behaviour override.

  • event_delivery (dict[str, Any] | None) – Event delivery configuration override.

Returns:

The created OpRun.

Return type:

dict[str, Any]

Raises:

httpx.HTTPStatusError – On non-2xx responses.
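Everything after `inputs` is keyword-only, which the following sketch tries to make visible. `submit_adhoc_run` and its `urgent` flag are hypothetical, and the value 9 is only an assumed "high" priority because this reference does not state the valid range.

```python
from typing import Any, Optional


def submit_adhoc_run(client: Any, image: str, inputs: dict[str, Any], *,
                     urgent: bool = False,
                     retry_policy: Optional[dict[str, Any]] = None) -> dict[str, Any]:
    """Submit an ad-hoc run; hypothetical wrapper around create_run."""
    # 5 is the documented default priority. Assumption: 9 is an accepted
    # higher value; check the server for the real range.
    priority = 9 if urgent else 5
    # priority and the override arguments must be passed by keyword.
    return client.create_run(image, inputs, priority=priority,
                             retry_policy=retry_policy)
```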

create_run_from_definition(name, inputs, *, version=None, priority=5)

Create an OpRun from a registered OpDefinition.

Parameters:
  • name (str) – OpDefinition identifier.

  • inputs (dict[str, Any]) – Input parameters for the run.

  • version (str | None) – Specific definition version to use; omit for latest.

  • priority (int) – Scheduling priority (higher value = higher priority).

Returns:

The created OpRun.

Return type:

dict[str, Any]

Raises:

httpx.HTTPStatusError – 404 if the definition does not exist, 400 on invalid inputs.

get_run(run_id)

Fetch an OpRun by ID.

Parameters:

run_id (str) – Unique identifier of the OpRun.

Returns:

Full OpRun data including state, events, and attempts.

Return type:

dict[str, Any]

Raises:

httpx.HTTPStatusError – 404 if the run does not exist.
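A common use of get_run is waiting for a run to finish. The sketch below polls until a terminal state is seen. It assumes that the state is exposed under a `"state"` key and that the terminal state names match the COMPLETED, FAILED, and CANCELLED event types; OpRunState is the authoritative source for both. `wait_for_run` is a hypothetical helper.

```python
import time
from typing import Any

# Assumption: terminal state names mirror the event types that trigger
# state transitions; see OpRunState for the real values.
TERMINAL_STATES = {"COMPLETED", "FAILED", "CANCELLED"}


def wait_for_run(client: Any, run_id: str, *, interval: float = 2.0,
                 max_polls: int = 150) -> dict[str, Any]:
    """Poll get_run until the run reaches a terminal state."""
    for _ in range(max_polls):
        run = client.get_run(run_id)
        # Assumption: the run's state is stored under the "state" key.
        if run.get("state") in TERMINAL_STATES:
            return run
        time.sleep(interval)
    raise TimeoutError(f"run {run_id} not finished after {max_polls} polls")
```

Bounding the loop with `max_polls` keeps a stuck run from blocking the caller indefinitely.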

update_run(run_id, *, state=None, outputs=None, priority=None)

Partially update an OpRun’s state, outputs, or priority.

Parameters:
  • run_id (str) – Unique identifier of the OpRun.

  • state (str | None) – New state value (see OpRunState).

  • outputs (dict[str, Any] | None) – Output data to attach to the run.

  • priority (int | None) – New scheduling priority.

Returns:

The updated OpRun.

Return type:

dict[str, Any]

Raises:

httpx.HTTPStatusError – 404 if not found, 400 on invalid state transition.

retry_run(run_id)

Trigger a retry for a failed OpRun.

Parameters:

run_id (str) – Unique identifier of the OpRun.

Returns:

The OpRun reset to PENDING with an incremented attempt counter.

Return type:

dict[str, Any]

Raises:

httpx.HTTPStatusError – 404 if not found, 400 if max attempts reached.

add_event(run_id, event_type, msg, payload=None)

Append an event to the current attempt of an OpRun.

Called by the worker during execution to report progress, completion, or failure. Special event types (STARTED, COMPLETED, FAILED, CANCELLED) trigger state transitions on the run.

Parameters:
  • run_id (str) – Unique identifier of the OpRun.

  • event_type (str) – Event type string (see EventType).

  • msg (str | int) – Human-readable message or progress value (0–100 for PROGRESS).

  • payload (dict[str, Any] | None) – Optional event-specific data (e.g. {"outputs": {...}} for COMPLETED, {"error_message": "..."} for FAILED).

Returns:

The updated OpRun.

Return type:

dict[str, Any]

Raises:

httpx.HTTPStatusError – 404 if not found, 400 if the run is locked or event is invalid.
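The event sequence a worker might send for one run can be sketched as follows. `run_with_events` and the `work` callback are hypothetical; the event type strings and payload shapes are the ones described above, while the message texts are placeholders.

```python
from typing import Any, Callable


def run_with_events(client: Any, run_id: str,
                    work: Callable[[Callable[[int], None]], dict[str, Any]]) -> dict[str, Any]:
    """Run `work` and report lifecycle events for `run_id`.

    `work` receives a progress callback (0-100) and returns the outputs.
    """
    client.add_event(run_id, "STARTED", "worker started")

    def report(percent: int) -> None:
        # For PROGRESS events, msg carries the numeric progress value.
        client.add_event(run_id, "PROGRESS", percent)

    try:
        outputs = work(report)
    except Exception as exc:
        # Report the failure, then let the original error propagate.
        client.add_event(run_id, "FAILED", "worker failed",
                         {"error_message": str(exc)})
        raise
    return client.add_event(run_id, "COMPLETED", "worker finished",
                            {"outputs": outputs})
```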

poll_events(run_id, after=None)

Poll events for an OpRun, optionally only those after a given index.

Parameters:
  • run_id (str) – Unique identifier of the OpRun.

  • after (int | None) – Return only events with index greater than this value. Omit to retrieve all events from the beginning.

Returns:

{"events": [...]} where each entry has id, type, t, msg, and state.

Return type:

dict[str, Any]

Raises:

httpx.HTTPStatusError – 404 if the run does not exist.
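Incremental polling with the `after` cursor might look like the sketch below. It assumes that each event's `id` field is the index that `after` is compared against, which this reference implies but does not state outright. `fetch_new_events` is a hypothetical helper.

```python
from typing import Any, Optional


def fetch_new_events(client: Any, run_id: str,
                     after: Optional[int] = None) -> tuple[list[dict[str, Any]], Optional[int]]:
    """Return (new_events, cursor) for events newer than `after`."""
    response = client.poll_events(run_id, after=after)
    events = response.get("events", [])
    if events:
        # Assumption: "id" is the event index used by the `after` filter.
        after = events[-1]["id"]
    return events, after
```

Passing the returned cursor back in on the next call keeps each poll limited to events that have not been seen yet.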

get_attempts(run_id)

Fetch all execution attempts for an OpRun.

Parameters:

run_id (str) – Unique identifier of the OpRun.

Returns:

Ordered list of all OpAttempts for this run.

Return type:

list[dict[str, Any]]

Raises:

httpx.HTTPStatusError – 404 if the run does not exist.

list_workers()

Fetch all workers registered with the server.

Returns:

All workers with their resources, tags, and status.

Return type:

list[dict[str, Any]]

Raises:

httpx.HTTPStatusError – On non-2xx responses.

heartbeat(worker_id, resources, tags=None, running_ops=None)

Register or update a worker via heartbeat.

Parameters:
  • worker_id (str) – Unique identifier for the worker.

  • resources (dict[str, Any]) – Worker resources with keys cpu_cores, ram_mb, and gpus (list of {"id": int, "vram_mb": int, "tags": [...]}).

  • tags (list[str] | None) – Worker capability tags (e.g. ["gpu", "high-mem"]).

  • running_ops (list[dict[str, Any]] | None) – Currently executing ops as WorkerOpRun dicts.

Returns:

The created or updated Worker object.

Return type:

dict[str, Any]

Raises:

httpx.HTTPStatusError – On non-2xx responses.
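The `resources` shape described above can be assembled with a small helper such as this one. `build_resources` is hypothetical; numbering GPUs from zero and leaving their tags empty are assumptions made only for the example.

```python
from typing import Any


def build_resources(cpu_cores: int, ram_mb: int,
                    gpu_vram_mb: list[int]) -> dict[str, Any]:
    """Build a heartbeat `resources` dict with the documented keys."""
    gpus = [
        # Assumption: GPU ids are sequential from 0 and carry no tags.
        {"id": index, "vram_mb": vram, "tags": []}
        for index, vram in enumerate(gpu_vram_mb)
    ]
    return {"cpu_cores": cpu_cores, "ram_mb": ram_mb, "gpus": gpus}
```

The result can be passed as the `resources` argument of heartbeat.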

claim_ops(worker_id)

Request work allocation for a worker.

Parameters:

worker_id (str) – Unique identifier of the worker requesting ops.

Returns:

List of claimed WorkerOpRun objects (may be empty).

Return type:

list[dict[str, Any]]

Raises:

httpx.HTTPStatusError – On non-2xx responses.
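One worker cycle that combines heartbeat and claim_ops could be sketched as below. The ordering (heartbeat first, then claim) and feeding claimed WorkerOpRun dicts back as `running_ops` are assumptions about how a worker uses these calls, not behaviour stated in this reference. `worker_tick` is a hypothetical helper.

```python
from typing import Any


def worker_tick(client: Any, worker_id: str, resources: dict[str, Any],
                running_ops: list[dict[str, Any]]) -> list[dict[str, Any]]:
    """One scheduling cycle: send a heartbeat, then claim new ops."""
    # Assumption: the worker reports its current ops before asking for more.
    client.heartbeat(worker_id, resources, running_ops=running_ops)
    claimed = client.claim_ops(worker_id)
    # The claimed list may be empty when no work is available.
    running_ops.extend(claimed)
    return claimed
```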

close()

Close the underlying HTTP connection pool.

Return type:

None
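For explicitly constructed clients, the standard library's `contextlib.closing` is one way to make sure close() runs even when a request raises. `count_workers` and the `make_client` factory are hypothetical names used only for this sketch.

```python
from contextlib import closing
from typing import Any, Callable


def count_workers(make_client: Callable[[], Any]) -> int:
    """Create a client, count registered workers, and always close it."""
    # closing() calls client.close() on exit, including on exceptions.
    with closing(make_client()) as client:
        return len(client.list_workers())
```

With the real package, `make_client` would be something like a lambda that constructs a GyozaClient with the desired base URL and API key.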

Unified HTTP client for the gyoza server API.

Used by the worker (heartbeat, claim ops, send events) and the CLI deployment pipeline (upsert definitions, create runs).

class gyoza.client._client.GyozaClient(base_url, api_key=None, timeout=30.0)

Bases: object

Documented identically to gyoza.client.GyozaClient above; see that entry for the parameters and methods.