embedl_hub.core.invoke package#

Invocation components and result types.

Re-exports#

exception embedl_hub.core.invoke.InvokeError[source]#

Bases: RuntimeError

Raised when an invocation job fails.

class embedl_hub.core.invoke.ONNXRuntimeInvocationResult(artifact_dir: Path | None, devices: dict[str, DeviceLog], run_log: RunLog | None, output: dict[str, ndarray], output_file: LoggedArtifact)[source]#

Bases: ComponentOutput

Output from the ONNXRuntimeInvoker component.

Parameters:
  • output – The output data from the model invocation, mapping output tensor names to NumPy arrays.

  • output_file – The artifact containing the serialised output .npz file.

output: dict[str, ndarray]#
output_file: LoggedArtifact#
class embedl_hub.core.invoke.ONNXRuntimeInvoker(*, name: str | None = None, device: str | None = None, use_compiled_names: bool = False)[source]#

Bases: Component

Component that runs inference on ONNX models using embedl-onnxruntime.

Runs embedl-onnxruntime run-inference on a remote device over SSH to execute inference on an ONNX model and exports the output tensors.

Device-specific parameters (embedl_onnxruntime_path, cli_args) are configured via EmbedlONNXRuntimeConfig on the device.

run(ctx: HubContext, model: ONNXRuntimeCompiledModel, input_data: dict[str, ndarray], *, device: str | None = None, use_compiled_names: bool = False) ONNXRuntimeInvocationResult[source]#

Run inference on an ONNX model via embedl-onnxruntime.

Keyword arguments override the defaults set in the constructor. If a keyword argument is not provided here, the value from the constructor is used.

Parameters:
  • ctx – The execution context with device configuration.

  • model – An ONNXRuntimeCompiledModel whose path artifact points to an ONNX model.

  • input_data – Dictionary mapping input tensor names to NumPy arrays. Uploaded as a compressed .npz file.

  • device – Name of the target device; overrides the default set in the constructor.

  • use_compiled_names – If False (default) and the compiler renamed input/output tensors, both original and compiled names are accepted for inputs, and outputs are returned with original names. Set to True to skip all remapping and only accept compiled names. Only used on qai_hub devices.

Returns:

An ONNXRuntimeInvocationResult with the output artifact.

run_type: ClassVar[RunType] = 'INFERENCE'#
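A hedged usage sketch: the .npz round-trip below mirrors how run() packages input_data for upload, using only NumPy. The ONNXRuntimeInvoker call itself is shown commented out because ctx, model, and the device configuration come from the surrounding pipeline; the device name used there is hypothetical.

```python
import io

import numpy as np

# input_data maps input tensor names to NumPy arrays, as run() expects.
input_data = {"images": np.zeros((1, 3, 224, 224), dtype=np.float32)}

# The invoker uploads this dict as a compressed .npz file; the round-trip
# below mirrors that packaging.
buf = io.BytesIO()
np.savez_compressed(buf, **input_data)
buf.seek(0)
restored = {name: arr for name, arr in np.load(buf).items()}

# With a configured HubContext (ctx) and a compiled model, the call would be:
# invoker = ONNXRuntimeInvoker(name="invoke", device="my-ssh-device")
# result = invoker.run(ctx, model, input_data)
# result.output       # dict[str, np.ndarray]
# result.output_file  # logged .npz artifact
```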
class embedl_hub.core.invoke.TFLiteInvocationResult(artifact_dir: Path | None, devices: dict[str, DeviceLog], run_log: RunLog | None, output: dict[str, ndarray], output_file: LoggedArtifact)[source]#

Bases: ComponentOutput

Output of a TFLite inference step.

Parameters:
  • output – Dictionary mapping output tensor names to NumPy arrays.

  • output_file – The logged output .npz artifact.

output: dict[str, ndarray]#
output_file: LoggedArtifact#
class embedl_hub.core.invoke.TFLiteInvoker(*, name: str | None = None, device: str | None = None, use_compiled_names: bool = False)[source]#

Bases: Component

Run inference on a compiled TFLite model.

Dispatches to a device-specific implementation based on the configured device type.

run(ctx: HubContext, model: TFLiteCompiledModel, input_data: dict[str, ndarray], *, device: str | None = None, use_compiled_names: bool = False) TFLiteInvocationResult[source]#

Run inference on a compiled TFLite model.

Keyword arguments override the defaults set in the constructor. If a keyword argument is not provided here, the value from the constructor is used.

Parameters:
  • ctx – The execution context with device configuration.

  • model – The compiled TFLite model (from TFLiteCompiler).

  • input_data – Dictionary mapping input tensor names to NumPy arrays.

  • device – Name of the target device.

  • use_compiled_names – If False (default), input names are remapped to the compiled model’s names before submission and output names are mapped back to the originals. Set to True to skip remapping and use the compiled names directly.

Returns:

A TFLiteInvocationResult with inference output.

run_type: ClassVar[RunType] = 'INFERENCE'#
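A toy sketch of the use_compiled_names=False remapping described above. The original/compiled name pair here is hypothetical; in practice the mapping comes from the compiled model.

```python
import numpy as np

# Hypothetical original -> compiled name pair; real pairs come from the
# compiled model, not from user code.
compiled_name_of = {"image": "serving_default_image:0"}
original_name_of = {v: k for k, v in compiled_name_of.items()}

input_data = {"image": np.ones((1, 224, 224, 3), dtype=np.float32)}

# use_compiled_names=False: input names are remapped to the compiled
# model's names before submission...
submitted = {compiled_name_of[k]: v for k, v in input_data.items()}

# ...and names on the returned tensors are mapped back to the originals.
returned = {original_name_of[k]: v for k, v in submitted.items()}
```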
class embedl_hub.core.invoke.TensorRTInvocationResult(artifact_dir: Path | None, devices: dict[str, DeviceLog], run_log: RunLog | None, output: dict[str, ndarray], output_file: LoggedArtifact)[source]#

Bases: ComponentOutput

Output from the TensorRTInvoker component.

Parameters:
  • output – The output data from the model invocation, mapping output tensor names to NumPy arrays.

  • output_file – The artifact containing the serialised output JSON file produced by trtexec --exportOutput.

output: dict[str, ndarray]#
output_file: LoggedArtifact#
class embedl_hub.core.invoke.TensorRTInvoker(*, name: str | None = None, device: str | None = None)[source]#

Bases: Component

Component that runs inference on TensorRT engines using trtexec.

Runs trtexec on a remote device over SSH to execute inference on a compiled .trt / .engine / .plan engine and exports the output tensors.

Device-specific parameters (trtexec_path, trtexec_cli_args) are configured via TrtexecConfig on the device. Per-component overrides can be set via provider_config_overrides.

run(ctx: HubContext, model: TensorRTCompiledModel, input_data: dict[str, ndarray], *, device: str | None = None) TensorRTInvocationResult[source]#

Run inference on a compiled TensorRT engine via trtexec.

Keyword arguments override the defaults set in the constructor. If a keyword argument is not provided here, the value from the constructor is used.

Parameters:
  • ctx – The execution context with device configuration.

  • model – A TensorRTCompiledModel whose path artifact points to a compiled TensorRT engine.

  • input_data – Dictionary mapping input tensor names to NumPy arrays.

  • device – Name of the target device; overrides the default set in the constructor.

Returns:

A TensorRTInvocationResult with the output artifact.

run_type: ClassVar[RunType] = 'INFERENCE'#
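A hedged sketch of turning a trtexec --exportOutput JSON dump into the dict[str, ndarray] shape that TensorRTInvocationResult.output exposes. The one-entry-per-tensor {name, dimensions, values} layout is an assumption about the trtexec export format; verify it against your trtexec version.

```python
import json

import numpy as np


def parse_trtexec_output(text: str) -> dict[str, np.ndarray]:
    """Parse an --exportOutput JSON dump into name -> NumPy array.

    Assumes each entry looks like {"name": ..., "dimensions": "1x4",
    "values": [...]} (flat values, "x"-separated dimensions).
    """
    outputs = {}
    for entry in json.loads(text):
        shape = tuple(int(d) for d in entry["dimensions"].split("x"))
        values = np.asarray(entry["values"], dtype=np.float32)
        outputs[entry["name"]] = values.reshape(shape)
    return outputs


# Minimal synthetic dump with one output tensor.
sample = json.dumps(
    [{"name": "logits", "dimensions": "1x4", "values": [0.1, 0.2, 0.3, 0.4]}]
)
outputs = parse_trtexec_output(sample)
```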