embedl_hub.core.compile package#

Compiler components and result types.

Re-exports#

exception embedl_hub.core.compile.CompileError[source]#

Bases: RuntimeError

Raised when a compile job fails or times out.

class embedl_hub.core.compile.CompileResult(model_path: Path, job_id: str | None = None, device: str | None = None)[source]#

Bases: object

Result of a successful compile job.

device: str | None = None#
job_id: str | None = None#
model_path: Path#
class embedl_hub.core.compile.ONNXRuntimeCompiledModel(artifact_dir: Path | None, devices: dict[str, DeviceLog], run_log: RunLog | None, path: LoggedArtifact, input: LoggedArtifact | None = None, input_name_mapping: dict[str, str] | None = None, output_name_mapping: dict[str, str] | None = None)[source]#

Bases: CompiledModel

Output from the ONNXRuntimeCompiler component.

Extends CompiledModel with optional input/output tensor name mappings that record how the compiler renamed tensors.

input_name_mapping: dict[str, str] | None = None#
output_name_mapping: dict[str, str] | None = None#
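A short sketch of using output_name_mapping to look up compiled-model outputs by their original tensor names. The mapping direction (original name to compiler-assigned name) is an assumption for illustration; the docs only say the mappings record how the compiler renamed tensors:

```python
# Assumed direction: keys are original ONNX tensor names, values are the
# names the compiler assigned. Both dicts here are illustrative data.
output_name_mapping = {"logits": "output_0"}
compiled_outputs = {"output_0": [0.1, 0.9]}  # e.g. raw inference results

by_original_name = {
    original: compiled_outputs[renamed]
    for original, renamed in output_name_mapping.items()
}
print(by_original_name["logits"])  # [0.1, 0.9]
```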
class embedl_hub.core.compile.ONNXRuntimeCompiler(*, name: str | None = None, device: str | None = None, input_shape: tuple[int, ...] | None = None, calibration_data: Path | dict[str, ndarray] | None = None, calibration_method: Literal['minmax', 'entropy'] | None = None, per_channel: bool = False, quantize_io: bool = False)[source]#

Bases: Component

Component that compiles ONNX models to ONNX Runtime format.

Supports two device types:

  • qai_hub devices: compile via the Qualcomm AI Hub cloud service.

  • embedl_onnxruntime devices: compile via the embedl-onnxruntime CLI on a remote device over SSH.

run(ctx: HubContext, onnx_path: Path, *, device: str | None = None, input_shape: tuple[int, ...] | None = None, calibration_data: Path | dict[str, ndarray] | None = None, calibration_method: Literal['minmax', 'entropy'] | None = None, per_channel: bool = False, quantize_io: bool = False) ONNXRuntimeCompiledModel[source]#

Compile an ONNX model to ONNX Runtime format.

Keyword arguments override the defaults set in the constructor; any argument omitted here falls back to its constructor default.

Parameters:
  • ctx – The execution context with device configuration.

  • onnx_path – Path to the input ONNX model.

  • device – Name of the target device (overrides the constructor default).

  • input_shape – Input shape tuple, e.g. (1, 3, 224, 224). Only used on qai_hub devices.

  • calibration_data – Calibration data for post-training quantization: either a path to a directory of .npy files or a dict mapping model input names to NumPy arrays.

  • calibration_method – Calibration algorithm ('minmax' or 'entropy'). Only used on embedl_onnxruntime devices.

  • per_channel – If True, quantize weights per channel. Only used on embedl_onnxruntime devices.

  • quantize_io – If True, quantize model I/O tensors to INT8. Only used on qai_hub devices.

Returns:

An ONNXRuntimeCompiledModel with the path to the compiled model.

run_type: ClassVar[RunType] = 'COMPILE'#
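The calibration_data parameter accepts either a directory of .npy files or an in-memory dict. A minimal sketch of the dict form, assuming a model with a single input named "images" (the input name, shapes, and device name are illustrative, not part of the API):

```python
import numpy as np

# A small calibration batch keyed by model input name (assumption: the model
# has one NCHW input called "images" of shape 3x224x224; 8 samples here).
calibration_data = {
    "images": np.random.rand(8, 3, 224, 224).astype(np.float32),
}

# Hypothetical invocation; requires a configured HubContext (`ctx`) and a
# registered embedl_onnxruntime device name, neither of which exists here:
# compiler = ONNXRuntimeCompiler(
#     device="my-ssh-device",
#     calibration_method="minmax",
#     per_channel=True,
# )
# result = compiler.run(ctx, Path("model.onnx"), calibration_data=calibration_data)
```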
class embedl_hub.core.compile.TFLiteCompiledModel(artifact_dir: Path | None, devices: dict[str, DeviceLog], run_log: RunLog | None, path: LoggedArtifact, input: LoggedArtifact | None = None, input_name_mapping: dict[str, str] | None = None, output_name_mapping: dict[str, str] | None = None)[source]#

Bases: CompiledModel

Output from the TFLiteCompiler component.

Extends CompiledModel with optional input/output tensor name mappings that record how the compiler renamed tensors.

input_name_mapping: dict[str, str] | None = None#
output_name_mapping: dict[str, str] | None = None#
class embedl_hub.core.compile.TFLiteCompiler(*, name: str | None = None, device: str | None = None, input_shape: tuple[int, ...] | None = None, calibration_data: Path | dict[str, ndarray] | None = None, quantize_io: bool = False)[source]#

Bases: Component

Component that compiles ONNX models to TFLite format.

Supports both local compilation (via onnx2tf) and cloud compilation via Qualcomm AI Hub.

run(ctx: HubContext, onnx_path: Path, *, device: str | None = None, input_shape: tuple[int, ...] | None = None, calibration_data: Path | dict[str, ndarray] | None = None, quantize_io: bool = False) TFLiteCompiledModel[source]#

Compile an ONNX model to TFLite.

Keyword arguments override the defaults set in the constructor; any argument omitted here falls back to its constructor default.

Parameters:
  • ctx – The execution context with device configuration.

  • onnx_path – Path to the input ONNX model.

  • device – Name of the target device (overrides the constructor default).

  • input_shape – Input shape tuple. Only used on qai_hub devices.

  • calibration_data – Calibration data for quantization: either a path to a directory of .npy files or a dict mapping model input names to NumPy arrays. Only used on qai_hub devices.

  • quantize_io – If True, quantize model I/O tensors. Only used on qai_hub devices.

Returns:

A TFLiteCompiledModel with the path to the compiled model.

run_type: ClassVar[RunType] = 'COMPILE'#
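The keyword-override rule documented for run() can be sketched as follows; resolve is a hypothetical helper for illustration, not part of the embedl_hub API:

```python
# A run-time keyword wins; an omitted (None) keyword falls back to the
# value set in the compiler's constructor.
def resolve(run_value, ctor_default):
    return run_value if run_value is not None else ctor_default


ctor_device = "qai-hub-device"        # as in TFLiteCompiler(device=...)
print(resolve(None, ctor_device))     # falls back to the constructor default
print(resolve("local", ctor_device))  # run() keyword takes precedence
```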
class embedl_hub.core.compile.TensorRTCompiledModel(artifact_dir: Path | None, devices: dict[str, DeviceLog], run_log: RunLog | None, path: LoggedArtifact, input: LoggedArtifact | None = None)[source]#

Bases: CompiledModel

Output from the TensorRTCompiler component.

Parameters:

graph_file – The artifact containing the engine graph JSON (from --exportLayerInfo), if available.

class embedl_hub.core.compile.TensorRTCompiler(*, name: str | None = None, device: str | None = None, tensorrt_version: str | None = None, calib_path: Path | None = None)[source]#

Bases: Component

Component that compiles ONNX models to TensorRT engines.

Runs trtexec on a remote device over SSH to convert an .onnx model into a .trt engine file.

Provider-specific parameters (trtexec_path, trtexec_cli_args) are configured via TrtexecConfig on the device. Per-component overrides can be set via provider_config_overrides.

run(ctx: HubContext, onnx_path: Path, *, device: str | None = None, tensorrt_version: str | None = None, calib_path: Path | None = None) TensorRTCompiledModel[source]#

Compile an ONNX model to a TensorRT engine.

Keyword arguments override the defaults set in the constructor. If a keyword argument is not provided here, the constructor default is used.

Parameters:
  • ctx – The execution context with device configuration.

  • onnx_path – Local path to the input ONNX model.

  • device – Name of the target device (overrides the constructor default).

  • tensorrt_version – The TensorRT version to target. When None, the compiler auto-detects the version by running trtexec --help on the remote device.

  • calib_path – Optional local path to a TensorRT INT8 calibration .cache file containing pre-computed per-tensor dynamic ranges. The cache must be generated externally using the TensorRT Python API (e.g. trt.IInt8EntropyCalibrator2) from your calibration dataset.

Returns:

A TensorRTCompiledModel with the compiled engine path.

run_type: ClassVar[RunType] = 'COMPILE'#
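An end-to-end sketch of a TensorRT compile call. The context, device name, and file paths are all hypothetical; only the suffix convention at the end is runnable here:

```python
from pathlib import Path

onnx_path = Path("model.onnx")
calib_path = Path("ranges.cache")  # pre-computed INT8 per-tensor dynamic ranges

# Hypothetical invocation (assumption: `ctx` is a configured HubContext and
# "orin-devkit" names a device with a TrtexecConfig; neither exists here):
# compiler = TensorRTCompiler(device="orin-devkit")
# result = compiler.run(
#     ctx,
#     onnx_path,
#     calib_path=calib_path if calib_path.exists() else None,
# )
# print(result.path)  # LoggedArtifact for the compiled engine

# By convention the compiled engine carries a .trt suffix:
engine_path = onnx_path.with_suffix(".trt")
print(engine_path)  # model.trt
```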