API Documentation

Module contents

Python package to make AI models deployment-ready for any hardware.

class embedl_deploy.TransformationPlan(model: GraphModule, matches: dict[str, dict[str, PatternMatch]] = <factory>)

Bases: object

Editable transformation plan.

Returned by get_transformation_plan(). The matches dict maps input_node_name → pattern_class_name → PatternMatch. Set match.apply = False to skip specific matches before calling apply_transformation_plan().

matches: dict[str, dict[str, PatternMatch]]

Nested dict of discovered matches, keyed by the last matched node’s name and the pattern class name.

model: GraphModule

Deep copy of the original graph (not yet modified by replacements).
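
As a minimal sketch of editing a plan, the helper below walks the nested matches dict and sets apply = False on every match of one pattern class. It relies only on the documented dict layout and the apply attribute. The helper name disable_pattern, the node names, and the pattern class names are hypothetical, and SimpleNamespace objects stand in for real PatternMatch instances so the snippet runs without embedl_deploy installed.

```python
from types import SimpleNamespace
from typing import Any, Dict, List


def disable_pattern(matches: Dict[str, Dict[str, Any]], pattern_class_name: str) -> List[str]:
    """Set apply=False on every enabled match of one pattern class.

    Returns the outer keys (node names) whose match was switched off.
    """
    disabled = []
    for node_name, by_pattern in matches.items():
        match = by_pattern.get(pattern_class_name)
        if match is not None and match.apply:
            match.apply = False
            disabled.append(node_name)
    return disabled


# Stand-in data shaped like plan.matches; every name here is made up.
matches = {
    "relu_1": {"HypotheticalConvReLUPattern": SimpleNamespace(apply=True)},
    "relu_2": {"HypotheticalConvReLUPattern": SimpleNamespace(apply=True)},
    "add_1": {"HypotheticalAddPattern": SimpleNamespace(apply=True)},
}

skipped = disable_pattern(matches, "HypotheticalConvReLUPattern")
print(skipped)  # ['relu_1', 'relu_2']
```

With a real plan, the same helper would presumably be called as disable_pattern(plan.matches, ...) before passing the plan to apply_transformation_plan().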

class embedl_deploy.TransformationResult(model: GraphModule, report: TransformationReport, matches: list[PatternMatch])

Bases: object

Result of applying a transformation plan.

Returned by apply_transformation_plan().

matches: list[PatternMatch]

The actual PatternMatch objects that were applied.

model: GraphModule

The transformed model with fused / quantized modules.

report: TransformationReport

Summary of what was applied and what was skipped.

embedl_deploy.apply_transformation_plan(plan: TransformationPlan) → TransformationResult

Apply the enabled matches from plan.

Only matches with apply=True are applied via replace(). The plan’s model is modified in place (it is already a deep copy created by get_transformation_plan()).

Parameters:

plan – The plan to apply (from get_transformation_plan()).

Returns:

A TransformationResult containing model (transformed), report (summary), and matches (applied matches).

Raises:

ValueError – If any nodes are included in more than one enabled pattern.

Example:

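import torch
from embedl_deploy import apply_transformation_plan
# plan comes from get_transformation_plan(); x is assumed to be an example input for the model.
result = apply_transformation_plan(plan)
print(result.report)
torch.onnx.export(result.model, x, "deployed.onnx")

embedl_deploy.get_transformation_plan(graph_module: GraphModule, patterns: Sequence[Pattern]) → TransformationPlan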

Find all non-overlapping tree-based pattern matches.

Deep-copies graph_module so the original is never modified. Each pattern’s match() must return PatternMatch objects with a populated tree_match. Overlapping matches are resolved by marking later overlaps as apply=False.

Returns a TransformationPlan that can be inspected and edited before calling apply_transformation_plan().
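
The overlap rule described above (earlier matches claim nodes first, later overlapping matches are kept but marked apply=False) can be illustrated with a small stand-alone sketch. This is not the library's implementation: the Candidate class, its nodes field, the resolve_overlaps function, and the pattern and node names are all hypothetical stand-ins chosen for illustration.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Set


@dataclass
class Candidate:
    """Stand-in for a match: a pattern name plus the node names it covers."""
    pattern: str
    nodes: FrozenSet[str]
    apply: bool = True


def resolve_overlaps(candidates: List[Candidate]) -> List[Candidate]:
    """Walk candidates in order; the first one to touch a node claims it."""
    claimed: Set[str] = set()
    for cand in candidates:
        if cand.nodes & claimed:
            cand.apply = False      # overlaps an earlier match: keep it, but disabled
        else:
            claimed |= cand.nodes   # claim every node of this match
    return candidates


def flags(order):
    """Build fresh candidates in the given order and report their apply flags."""
    cands = [Candidate(name, frozenset(nodes)) for name, nodes in order]
    return [(c.pattern, c.apply) for c in resolve_overlaps(cands)]


long_match = ("ConvBNReLU", ["conv1", "bn1", "relu1"])
short_match = ("ConvBN", ["conv1", "bn1"])

print(flags([long_match, short_match]))  # [('ConvBNReLU', True), ('ConvBN', False)]
print(flags([short_match, long_match]))  # [('ConvBN', True), ('ConvBNReLU', False)]
```

The second call shows why pattern order matters: when the shorter pattern is tried first, it claims the shared nodes and the longer match ends up disabled.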

Parameters:
  • graph_module – The traced graph module to analyse.

  • patterns – Patterns to search for. Order matters: patterns are matched in sequence, and earlier matches claim nodes first. Supply longest (most specific) patterns first to ensure they take priority over shorter ones when sub-graphs overlap.

Returns:

A TransformationPlan containing model (a deep copy of graph_module) and matches (a nested dict of discovered pattern matches).

Example:

from torch import fx
from embedl_deploy import get_transformation_plan
from embedl_deploy.tensorrt import TENSORRT_PATTERNS

graph_module = fx.symbolic_trace(model)
plan = get_transformation_plan(
    graph_module, patterns=TENSORRT_PATTERNS
)
for node, pats in plan.matches.items():
    for name, match in pats.items():
        print(f"{node}: {name} apply={match.apply}")

embedl_deploy.transform(model: Module | GraphModule, patterns: Sequence[Pattern]) → TransformationResult

Apply pattern transformations to model in one step.

Conversion patterns (is_conversion=True) are applied iteratively until no new matches are found, then fusion patterns are matched and applied in a single pass. The original model is never modified: a deep copy is made internally by get_transformation_plan().
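
The two-phase control flow described above can be sketched on a toy model. The code below is an illustration under stated assumptions, not the library's implementation: the "graph" is just a list of op names, and run_two_phase, replace_token, fuse_sequence, and all op names are hypothetical. It also assumes the conversion rules eventually stop matching, otherwise the first loop would not terminate.

```python
from typing import Callable, List, Tuple

Graph = List[str]                             # toy "graph": a flat list of op names
Rule = Callable[[Graph], Tuple[Graph, int]]   # returns (new graph, matches applied)


def replace_token(old: str, new: List[str]) -> Rule:
    """Toy conversion: expand every occurrence of `old` into the ops in `new`."""
    def rule(graph: Graph) -> Tuple[Graph, int]:
        out: Graph = []
        count = 0
        for op in graph:
            if op == old:
                out.extend(new)
                count += 1
            else:
                out.append(op)
        return out, count
    return rule


def fuse_sequence(seq: List[str], fused: str) -> Rule:
    """Toy fusion: collapse each consecutive run equal to `seq` into one op."""
    def rule(graph: Graph) -> Tuple[Graph, int]:
        out: Graph = []
        count, i = 0, 0
        while i < len(graph):
            if graph[i:i + len(seq)] == seq:
                out.append(fused)
                i += len(seq)
                count += 1
            else:
                out.append(graph[i])
                i += 1
        return out, count
    return rule


def run_two_phase(graph: Graph, conversions: List[Rule], fusions: List[Rule]) -> Graph:
    graph = list(graph)                 # work on a copy; the input is left untouched
    while True:                         # phase 1: sweep until nothing new matches
        applied = 0
        for rule in conversions:
            graph, n = rule(graph)
            applied += n
        if applied == 0:
            break
    for rule in fusions:                # phase 2: one pass only
        graph, _ = rule(graph)
    return graph


graph = ["stage", "pool"]
conversions = [
    replace_token("block", ["conv", "bn", "relu"]),
    replace_token("stage", ["block", "block"]),
]
fusions = [fuse_sequence(["conv", "bn", "relu"], "conv_bn_relu")]

result = run_two_phase(graph, conversions, fusions)
print(result)  # ['conv_bn_relu', 'conv_bn_relu', 'pool']
```

The first sweep only expands "stage", which creates new "block" ops that the second sweep then expands. That is the reason for iterating conversions to a fixed point before the single fusion pass.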

Parameters:
  • model – The model to transform. If model is an nn.Module, it is traced with symbolic_trace().

  • patterns – Patterns to match and apply. Order matters: patterns are matched in sequence, and earlier matches claim nodes first. Supply longest (most specific) patterns first to ensure they take priority over shorter ones when sub-graphs overlap.

Returns:

A TransformationResult containing model (transformed), report (summary), and matches (applied matches).

Example:

from embedl_deploy import transform
from embedl_deploy.tensorrt import TENSORRT_PATTERNS

deployable_model = transform(model, patterns=TENSORRT_PATTERNS).model

Subpackages