embedl_hub.core.compile package#
Compiler components and result types.
Re-exports#
- CompileError — Raised on compilation failure.
- CompileResult — Container for compilation results.
- ONNXRuntimeCompiler — ONNX Runtime compiler component.
- ONNXRuntimeCompiledModel — Output of ONNX Runtime compilation.
- TensorRTCompiler — TensorRT compiler component.
- TensorRTCompiledModel — Output of TensorRT compilation.
- TFLiteCompiler — TFLite compiler component.
- TFLiteCompiledModel — Output of TFLite compilation.
- exception embedl_hub.core.compile.CompileError[source]#
Bases: RuntimeError
Raised when a compile job fails or times out.
- class embedl_hub.core.compile.CompileResult(model_path: Path, job_id: str | None = None, device: str | None = None)[source]#
Bases: object
Result of a successful compile job.
- device: str | None = None#
- job_id: str | None = None#
- model_path: Path#
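Since every compiler on this page raises CompileError on failure or timeout, callers typically wrap run() in a try/except. A minimal sketch of that pattern; the compiler instance, HubContext, and model path are assumed to be set up elsewhere:

```python
from pathlib import Path


def compile_or_none(compiler, ctx, onnx_path: Path):
    """Call a compiler component's run(), returning None on failure.

    `compiler` is any compiler from this page (e.g. a TFLiteCompiler) and
    `ctx` is a HubContext; both are assumed to be created elsewhere.
    """
    # Lazy import keeps this helper importable without embedl_hub installed.
    from embedl_hub.core.compile import CompileError

    try:
        return compiler.run(ctx, onnx_path)
    except CompileError as err:
        # Raised when the compile job fails or times out.
        print(f"compilation failed: {err}")
        return None
```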
- class embedl_hub.core.compile.ONNXRuntimeCompiledModel(artifact_dir: Path | None, devices: dict[str, DeviceLog], run_log: RunLog | None, path: LoggedArtifact, input: LoggedArtifact | None = None, input_name_mapping: dict[str, str] | None = None, output_name_mapping: dict[str, str] | None = None)[source]#
Bases: CompiledModel
Output from the ONNXRuntimeCompiler component.
Extends CompiledModel with optional input/output tensor name mappings that record how the compiler renamed tensors.
- input_name_mapping: dict[str, str] | None = None#
- output_name_mapping: dict[str, str] | None = None#
- class embedl_hub.core.compile.ONNXRuntimeCompiler(*, name: str | None = None, device: str | None = None, input_shape: tuple[int, ...] | None = None, calibration_data: Path | dict[str, ndarray] | None = None, calibration_method: Literal['minmax', 'entropy'] | None = None, per_channel: bool = False, quantize_io: bool = False)[source]#
Bases: Component
Component that compiles ONNX models to ONNX Runtime format.
Supports two device types:
- ``qai_hub`` devices: compile via the Qualcomm AI Hub cloud service.
- ``embedl_onnxruntime`` devices: compile via the embedl-onnxruntime CLI on a remote device over SSH.
- run(ctx: HubContext, onnx_path: Path, *, device: str | None = None, input_shape: tuple[int, ...] | None = None, calibration_data: Path | dict[str, ndarray] | None = None, calibration_method: Literal['minmax', 'entropy'] | None = None, per_channel: bool = False, quantize_io: bool = False) ONNXRuntimeCompiledModel[source]#
Compile an ONNX model to ONNX Runtime format.
Keyword arguments override the defaults set in the constructor. If a keyword argument is not provided here, the constructor default is used.
- Parameters:
ctx – The execution context with device configuration.
onnx_path – Path to the input ONNX model.
device – Name of the target device (overrides the constructor default).
input_shape – Input shape tuple, e.g. (1, 3, 224, 224). Only used on qai_hub devices.
calibration_data – Calibration data for post-training quantization: a path to a directory of .npy files or a dict mapping model input names to NumPy arrays.
calibration_method – Calibration algorithm ('minmax' or 'entropy'). Only used on embedl_onnxruntime devices.
per_channel – If True, quantize weights per channel. Only used on embedl_onnxruntime devices.
quantize_io – If True, quantize model I/O tensors to INT8. Only used on qai_hub devices.
- Returns:
An ONNXRuntimeCompiledModel with the path to the compiled model.
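As a sketch of how the pieces above fit together: build the dict form of calibration_data from NumPy arrays, then run the compiler. The device name and the way ctx is obtained are assumptions; the constructor and run() keywords follow the signatures documented above.

```python
from pathlib import Path

import numpy as np


def as_calibration_data(samples: dict[str, np.ndarray]) -> dict[str, np.ndarray]:
    """Coerce calibration samples into the dict form run() accepts:
    model input name -> float32 NumPy array."""
    return {name: np.asarray(arr, dtype=np.float32) for name, arr in samples.items()}


def compile_quantized(ctx, onnx_path: Path):
    """Compile with per-channel INT8 quantization on an embedl_onnxruntime device."""
    # Lazy import so the helper above stays importable without embedl_hub.
    from embedl_hub.core.compile import ONNXRuntimeCompiler

    compiler = ONNXRuntimeCompiler(
        device="my-ssh-device",       # assumed device name
        calibration_method="minmax",  # only used on embedl_onnxruntime devices
        per_channel=True,
    )
    calib = as_calibration_data({"input": np.random.rand(8, 3, 224, 224)})
    # Keyword arguments here would override the constructor defaults.
    return compiler.run(ctx, onnx_path, calibration_data=calib)
```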
- class embedl_hub.core.compile.TFLiteCompiledModel(artifact_dir: Path | None, devices: dict[str, DeviceLog], run_log: RunLog | None, path: LoggedArtifact, input: LoggedArtifact | None = None, input_name_mapping: dict[str, str] | None = None, output_name_mapping: dict[str, str] | None = None)[source]#
Bases: CompiledModel
Output of a TFLite compilation step.
Extends CompiledModel with optional input/output tensor name mappings that record how the compiler renamed tensors.
- input_name_mapping: dict[str, str] | None = None#
- output_name_mapping: dict[str, str] | None = None#
- class embedl_hub.core.compile.TFLiteCompiler(*, name: str | None = None, device: str | None = None, input_shape: tuple[int, ...] | None = None, calibration_data: Path | dict[str, ndarray] | None = None, quantize_io: bool = False)[source]#
Bases: Component
Compile an ONNX model to TFLite format.
Supports both local compilation (via onnx2tf) and cloud compilation via Qualcomm AI Hub.
- run(ctx: HubContext, onnx_path: Path, *, device: str | None = None, input_shape: tuple[int, ...] | None = None, calibration_data: Path | dict[str, ndarray] | None = None, quantize_io: bool = False) TFLiteCompiledModel[source]#
Compile an ONNX model to TFLite.
Keyword arguments override the defaults set in the constructor. If a keyword argument is not provided here, the constructor default is used.
- Parameters:
ctx – The execution context with device configuration.
onnx_path – Path to the input ONNX model.
device – Name of the target device.
input_shape – Input shape tuple. Only used on qai_hub devices.
calibration_data – Calibration data for quantization. Only used on qai_hub devices.
quantize_io – If True, quantize model I/O tensors. Only used on qai_hub devices.
- Returns:
A TFLiteCompiledModel with the path to the compiled model.
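A corresponding sketch for the TFLite path, using the directory form of calibration_data. The listing helper is plain pathlib; the qai_hub device name is an assumption:

```python
from pathlib import Path


def list_calibration_files(calib_dir: Path) -> list[Path]:
    """List the .npy files in a calibration directory, sorted by name."""
    return sorted(calib_dir.glob("*.npy"))


def compile_tflite(ctx, onnx_path: Path, calib_dir: Path):
    """Compile ONNX -> TFLite on a qai_hub device with quantized I/O."""
    # Lazy import so the helper above stays importable without embedl_hub.
    from embedl_hub.core.compile import TFLiteCompiler

    compiler = TFLiteCompiler(device="my-qai-hub-device")  # assumed device name
    if not list_calibration_files(calib_dir):
        raise ValueError(f"no .npy calibration files in {calib_dir}")
    return compiler.run(
        ctx,
        onnx_path,
        input_shape=(1, 3, 224, 224),  # only used on qai_hub devices
        calibration_data=calib_dir,
        quantize_io=True,
    )
```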
- class embedl_hub.core.compile.TensorRTCompiledModel(artifact_dir: Path | None, devices: dict[str, DeviceLog], run_log: RunLog | None, path: LoggedArtifact, input: LoggedArtifact | None = None)[source]#
Bases: CompiledModel
Output from the TensorRTCompiler component.
- Parameters:
graph_file – The artifact containing the engine graph JSON (from --exportLayerInfo), if available.
- class embedl_hub.core.compile.TensorRTCompiler(*, name: str | None = None, device: str | None = None, tensorrt_version: str | None = None, calib_path: Path | None = None)[source]#
Bases: Component
Component that compiles ONNX models to TensorRT engines.
Runs trtexec on a remote device over SSH to convert an .onnx model into a .trt engine file.
Provider-specific parameters (trtexec_path, trtexec_cli_args) are configured via TrtexecConfig on the device. Per-component overrides can be set via provider_config_overrides.
- run(ctx: HubContext, onnx_path: Path, *, device: str | None = None, tensorrt_version: str | None = None, calib_path: Path | None = None) TensorRTCompiledModel[source]#
Compile an ONNX model to a TensorRT engine.
Keyword arguments override the defaults set in the constructor. If a keyword argument is not provided here, the constructor default is used.
- Parameters:
ctx – The execution context with device configuration.
onnx_path – Local path to the input ONNX model.
device – Name of the target device (overrides the constructor default).
tensorrt_version – The TensorRT version to target. When None, the compiler auto-detects the version by running trtexec --help on the remote device.
calib_path – Optional local path to a TensorRT INT8 calibration .cache file containing pre-computed per-tensor dynamic ranges. The cache must be generated externally using the TensorRT Python API (e.g. trt.IInt8EntropyCalibrator2) from your calibration dataset.
- Returns:
A TensorRTCompiledModel with the compiled engine path.