API Documentation
Module contents
Python package to make AI models deployment-ready for any hardware.
- class embedl_deploy.TransformationPlan(model: GraphModule, matches: dict[str, dict[str, PatternMatch]] = <factory>)
Bases: object
Editable transformation plan.
Returned by get_transformation_plan(). The matches dict maps input_node_name → pattern_class_name → PatternMatch. Set match.apply = False to skip specific matches before calling apply_transformation_plan().
- matches: dict[str, dict[str, PatternMatch]]
Nested dict of discovered matches, keyed by the last matched node’s name and the pattern class name.
- model: GraphModule
Deep copy of the original graph (not yet modified by replacements).
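As a sketch of how the nested matches dict can be edited before applying a plan, the snippet below disables every match of one pattern class. The PatternMatch dataclass here is an illustrative stand-in (only its apply flag matters for plan editing), and the node and pattern names are hypothetical:

```python
from dataclasses import dataclass

# Illustrative stand-in for embedl_deploy's PatternMatch; only the
# `apply` flag is relevant when editing a TransformationPlan.
@dataclass
class PatternMatch:
    apply: bool = True

# Shape of plan.matches: input_node_name -> pattern_class_name -> PatternMatch
matches = {
    "conv1": {"ConvBNActPattern": PatternMatch()},
    "fc": {"LinearActPattern": PatternMatch()},
}

# Skip every LinearActPattern match before apply_transformation_plan().
for per_pattern in matches.values():
    for pattern_name, match in per_pattern.items():
        if pattern_name == "LinearActPattern":
            match.apply = False
```

The same loop works on a real plan, since editing a plan only means flipping apply flags; the graph itself is not touched until the plan is applied.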
- class embedl_deploy.TransformationResult(model: GraphModule, report: TransformationReport, matches: list[PatternMatch])
Bases: object
Result of applying a transformation plan.
Returned by apply_transformation_plan().
- matches: list[PatternMatch]
The actual PatternMatch objects that were applied.
- model: GraphModule
The transformed model with fused / quantized modules.
- report: TransformationReport
Summary of what was applied and what was skipped.
- embedl_deploy.apply_transformation_plan(plan: TransformationPlan) → TransformationResult
Apply the enabled matches from plan.
Only matches with apply=True are applied via replace(). The plan’s model is modified in place (it is already a deep copy created by get_transformation_plan()).
- Parameters:
plan – The plan to apply (from get_transformation_plan()).
- Returns:
A TransformationResult containing model (transformed), report (summary), and matches (applied matches).
- Raises:
ValueError – If any nodes are included in more than one enabled pattern.
Example:
result = apply_transformation_plan(plan)
print(result.report)
torch.onnx.export(result.model, x, "deployed.onnx")
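The ValueError above guards against plans in which two enabled matches claim the same node. A minimal sketch of such a consistency check (the pattern names and node sets are hypothetical, not the library's internal representation):

```python
from collections import Counter

def overlapping_nodes(enabled_matches: dict[str, set[str]]) -> list[str]:
    """Return node names claimed by more than one enabled match."""
    counts = Counter(
        node for nodes in enabled_matches.values() for node in nodes
    )
    return sorted(node for node, count in counts.items() if count > 1)

# Two enabled matches both claim conv/bn: this plan is inconsistent,
# the situation in which apply_transformation_plan() raises ValueError.
bad_plan = {
    "ConvBNActPattern": {"conv", "bn", "relu"},
    "ConvBNPattern": {"conv", "bn"},
}
clash = overlapping_nodes(bad_plan)
```

Setting apply=False on one of the two overlapping matches before applying the plan resolves the conflict.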
- embedl_deploy.get_transformation_plan(graph_module: GraphModule, patterns: Sequence[Pattern]) → TransformationPlan
Find all non-overlapping tree-based pattern matches.
Deep-copies graph_module so the original is never modified. Each pattern’s match() must return PatternMatch objects with a populated tree_match. Overlapping matches are resolved by marking later overlaps as apply=False.
Returns a TransformationPlan that can be inspected and edited before calling apply_transformation_plan().
- Parameters:
graph_module – The traced graph module to analyse.
patterns – Patterns to search for. Order matters: patterns are matched in sequence, and earlier matches claim nodes first. Supply the longest (most specific) patterns first so that they take priority over shorter ones when sub-graphs overlap.
- Returns:
A TransformationPlan containing graph_module (a deep copy) and matches (a nested dict of discovered pattern matches).
Example:
from torch import fx
from embedl_deploy import get_transformation_plan
from embedl_deploy.tensorrt import TENSORRT_PATTERNS

graph_module = fx.symbolic_trace(model)
plan = get_transformation_plan(
    graph_module, patterns=TENSORRT_PATTERNS
)
for node, pats in plan.matches.items():
    for name, match in pats.items():
        print(f"{node}: {name} apply={match.apply}")
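The ordering rule stated for patterns (earlier matches claim nodes first; later overlapping matches are marked apply=False) can be sketched as a greedy node-claiming pass. The pattern names and node sets below are hypothetical stand-ins for real matches:

```python
# Greedy overlap resolution: walk candidate matches in order; a match is
# enabled only if none of its nodes have already been claimed.
claimed: set[str] = set()
candidate_matches = [
    ("ConvBNActPattern", {"conv", "bn", "relu"}),  # longest pattern first
    ("ConvBNPattern", {"conv", "bn"}),             # overlaps the first
    ("LinearPattern", {"fc"}),                     # disjoint, still enabled
]
resolved = []
for name, nodes in candidate_matches:
    apply = claimed.isdisjoint(nodes)
    if apply:
        claimed |= nodes
    resolved.append((name, apply))
```

This is why supplying the longest patterns first matters: had ConvBNPattern appeared before ConvBNActPattern, the shorter fusion would have claimed conv and bn and blocked the more specific one.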
- embedl_deploy.transform(model: Module | GraphModule, patterns: Sequence[Pattern]) → TransformationResult
Apply pattern transformations to model in one step.
Conversion patterns (is_conversion=True) are applied iteratively until no new matches are found, then fusion patterns are matched and applied in a single pass. The original model is never modified; a deep copy is made internally by get_transformation_plan().
- Parameters:
model – The model to transform. If model is an nn.Module it will be traced with symbolic_trace().
patterns – Patterns to match and apply. Order matters: patterns are matched in sequence, and earlier matches claim nodes first. Supply the longest (most specific) patterns first so that they take priority over shorter ones when sub-graphs overlap.
- Returns:
A TransformationResult containing model (transformed), report (summary), and matches (applied matches).
Example:
from embedl_deploy import transform
from embedl_deploy.tensorrt import TENSORRT_PATTERNS

deployable_model = transform(model, patterns=TENSORRT_PATTERNS).model
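The two-phase strategy transform() describes (conversion passes repeated until a fixed point, then fusion in a single pass) can be sketched with toy passes. The "patterns" below are plain functions over tuples of op names, not embedl_deploy Pattern objects:

```python
def until_fixpoint(ops, conversions):
    """Re-apply conversion passes until no pass changes anything."""
    while True:
        changed = False
        for convert in conversions:
            new_ops = convert(ops)
            if new_ops != ops:
                ops, changed = new_ops, True
        if not changed:
            return ops

def fuse_once(ops, fusions):
    """Fusion passes run once each, in order."""
    for fuse in fusions:
        ops = fuse(ops)
    return ops

def convert_gelu(ops):
    # Toy conversion: rewrite every "gelu" op into an "erf_gelu" form.
    return tuple("erf_gelu" if op == "gelu" else op for op in ops)

def fuse_conv_bn(ops):
    # Toy fusion: merge adjacent conv + bn pairs into one "conv_bn" op.
    out, i = [], 0
    while i < len(ops):
        if ops[i] == "conv" and i + 1 < len(ops) and ops[i + 1] == "bn":
            out.append("conv_bn")
            i += 2
        else:
            out.append(ops[i])
            i += 1
    return tuple(out)

ops = ("conv", "bn", "gelu", "linear", "gelu")
ops = until_fixpoint(ops, [convert_gelu])
ops = fuse_once(ops, [fuse_conv_bn])
```

Running conversions to a fixed point first means fusion patterns only ever see the converted op set, so a fusion never has to match both the original and the converted form of a sub-graph.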
Subpackages
- embedl_deploy.quantize package
- embedl_deploy.tensorrt package
- embedl_deploy.tensorrt.modules package
- embedl_deploy.tensorrt.patterns package
  - AdaptiveAvgPoolPattern
  - ConvBNActPattern
  - ConvBNAddActPattern
  - ConvBNPattern
  - DecomposeMultiheadAttentionPattern
  - FlattenLinearToConv1x1Pattern
  - LayerNormPattern
  - LinearActPattern
  - LinearPattern
  - MHAInProjectionPattern
  - RemoveAssertPattern
  - RemoveDeadAssertPattern
  - RemoveIdentityAdaptiveAvgPoolPattern
  - RemoveIdentityPattern
  - ScaledDotProductAttentionPattern
  - StemConvBNActMaxPoolPattern
  - Pattern lists
- embedl_deploy.version package