API Documentation
Module contents
Python package to make AI models deployment-ready for any hardware.
- class embedl_deploy.TransformationPlan(model: GraphModule, matches: dict[str, dict[str, PatternMatch]] = <factory>)
Bases: object
Editable transformation plan.
Returned by get_transformation_plan(). The matches dict maps input_node_name → pattern_class_name → PatternMatch. Set match.apply = False to skip specific matches before calling apply_transformation_plan().
- matches: dict[str, dict[str, PatternMatch]]
Nested dict of discovered matches, keyed by the last matched node’s name and the pattern class name.
- model: GraphModule
Deep copy of the original graph (not yet modified by replacements).
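As a sketch of how the nested matches dict can be edited before applying a plan, the snippet below disables every match of one pattern class. The PatternMatch dataclass here is an illustrative stand-in (only its apply flag matters for plan editing), and the node and pattern names are hypothetical:

```python
from dataclasses import dataclass

# Illustrative stand-in for embedl_deploy's PatternMatch; only the
# `apply` flag is relevant when editing a TransformationPlan.
@dataclass
class PatternMatch:
    apply: bool = True

# Shape of plan.matches: input_node_name -> pattern_class_name -> PatternMatch
matches = {
    "conv1": {"ConvBNActPattern": PatternMatch()},
    "fc": {"LinearActPattern": PatternMatch()},
}

# Skip every LinearActPattern match before apply_transformation_plan().
for per_pattern in matches.values():
    for pattern_name, match in per_pattern.items():
        if pattern_name == "LinearActPattern":
            match.apply = False
```

The same loop works on a real plan, since editing a plan only means flipping apply flags; the graph itself is not touched until the plan is applied.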
- class embedl_deploy.TransformationResult(model: GraphModule, report: TransformationReport, matches: list[PatternMatch])
Bases: object
Result of applying a transformation plan.
Returned by apply_transformation_plan().
- matches: list[PatternMatch]
The actual PatternMatch objects that were applied.
- model: GraphModule
The transformed model with fused / quantized modules.
- report: TransformationReport
Summary of what was applied and what was skipped.
- embedl_deploy.apply_transformation_plan(plan: TransformationPlan) → TransformationResult
Apply the enabled matches from plan.
Only matches with apply=True are applied via replace(). The plan’s model is modified in place (it is already a deep copy created by get_transformation_plan()).
- Parameters:
plan – The plan to apply (from get_transformation_plan()).
- Returns:
A TransformationResult containing model (transformed), report (summary), and matches (applied matches).
- Raises:
ValueError – If any nodes are included in more than one enabled pattern.
Example:
result = apply_transformation_plan(plan)
print(result.report)
torch.onnx.export(result.model, x, "deployed.onnx")
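The ValueError above guards against plans in which two enabled matches claim the same node. A minimal sketch of such a consistency check (the pattern names and node sets are hypothetical, not the library's internal representation):

```python
from collections import Counter

def overlapping_nodes(enabled_matches: dict[str, set[str]]) -> list[str]:
    """Return node names claimed by more than one enabled match."""
    counts = Counter(
        node for nodes in enabled_matches.values() for node in nodes
    )
    return sorted(node for node, count in counts.items() if count > 1)

# Two enabled matches both claim conv/bn: this plan is inconsistent,
# the situation in which apply_transformation_plan() raises ValueError.
bad_plan = {
    "ConvBNActPattern": {"conv", "bn", "relu"},
    "ConvBNPattern": {"conv", "bn"},
}
clash = overlapping_nodes(bad_plan)
```

Setting apply=False on one of the two overlapping matches before applying the plan resolves the conflict.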
- embedl_deploy.get_transformation_plan(graph_module: GraphModule, patterns: Sequence[Pattern]) → TransformationPlan
Find all non-overlapping tree-based pattern matches.
Deep-copies graph_module so the original is never modified. Each pattern’s match() must return PatternMatch objects with a populated tree_match. Overlapping matches are resolved by marking later overlaps as apply=False.
Returns a TransformationPlan that can be inspected and edited before calling apply_transformation_plan().
- Parameters:
graph_module – The traced graph module to analyse.
patterns – Patterns to search for. Order matters: patterns are matched in sequence, and earlier matches claim nodes first. Supply the longest (most specific) patterns first so that they take priority over shorter ones when sub-graphs overlap.
- Returns:
A TransformationPlan containing graph_module (a deep copy) and matches (a nested dict of discovered pattern matches).
Example:
from torch import fx
from embedl_deploy import get_transformation_plan
from embedl_deploy.tensorrt import TENSORRT_PATTERNS

graph_module = fx.symbolic_trace(model)
plan = get_transformation_plan(
    graph_module, patterns=TENSORRT_PATTERNS
)
for node, pats in plan.matches.items():
    for name, match in pats.items():
        print(f"{node}: {name} apply={match.apply}")
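The ordering rule stated for patterns (earlier matches claim nodes first; later overlapping matches are marked apply=False) can be sketched as a greedy node-claiming pass. The pattern names and node sets below are hypothetical stand-ins for real matches:

```python
# Greedy overlap resolution: walk candidate matches in order; a match is
# enabled only if none of its nodes have already been claimed.
claimed: set[str] = set()
candidate_matches = [
    ("ConvBNActPattern", {"conv", "bn", "relu"}),  # longest pattern first
    ("ConvBNPattern", {"conv", "bn"}),             # overlaps the first
    ("LinearPattern", {"fc"}),                     # disjoint, still enabled
]
resolved = []
for name, nodes in candidate_matches:
    apply = claimed.isdisjoint(nodes)
    if apply:
        claimed |= nodes
    resolved.append((name, apply))
```

This is why supplying the longest patterns first matters: had ConvBNPattern appeared before ConvBNActPattern, the shorter fusion would have claimed conv and bn and blocked the more specific one.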
- embedl_deploy.transform(model: Module | GraphModule, patterns: Sequence[Pattern]) → TransformationResult
Apply pattern transformations to model in one step.
Conversion patterns (is_conversion=True) are applied iteratively until no new matches are found, then fusion patterns are matched and applied in a single pass. The original model is never modified; a deep copy is made internally by get_transformation_plan().
- Parameters:
model – The model to transform. If model is an nn.Module it will be traced with symbolic_trace().
patterns – Patterns to match and apply. Order matters: patterns are matched in sequence, and earlier matches claim nodes first. Supply the longest (most specific) patterns first so that they take priority over shorter ones when sub-graphs overlap.
- Returns:
A TransformationResult containing model (transformed), report (summary), and matches (applied matches).
Example:
from embedl_deploy import transform
from embedl_deploy.tensorrt import TENSORRT_PATTERNS

deployable_model = transform(model, patterns=TENSORRT_PATTERNS).model
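The two-phase strategy transform() describes (conversion passes repeated until a fixed point, then fusion in a single pass) can be sketched with toy passes. The "patterns" below are plain functions over tuples of op names, not embedl_deploy Pattern objects:

```python
def until_fixpoint(ops, conversions):
    """Re-apply conversion passes until no pass changes anything."""
    while True:
        changed = False
        for convert in conversions:
            new_ops = convert(ops)
            if new_ops != ops:
                ops, changed = new_ops, True
        if not changed:
            return ops

def fuse_once(ops, fusions):
    """Fusion passes run once each, in order."""
    for fuse in fusions:
        ops = fuse(ops)
    return ops

def convert_gelu(ops):
    # Toy conversion: rewrite every "gelu" op into an "erf_gelu" form.
    return tuple("erf_gelu" if op == "gelu" else op for op in ops)

def fuse_conv_bn(ops):
    # Toy fusion: merge adjacent conv + bn pairs into one "conv_bn" op.
    out, i = [], 0
    while i < len(ops):
        if ops[i] == "conv" and i + 1 < len(ops) and ops[i + 1] == "bn":
            out.append("conv_bn")
            i += 2
        else:
            out.append(ops[i])
            i += 1
    return tuple(out)

ops = ("conv", "bn", "gelu", "linear", "gelu")
ops = until_fixpoint(ops, [convert_gelu])
ops = fuse_once(ops, [fuse_conv_bn])
```

Running conversions to a fixed point first means fusion patterns only ever see the converted op set, so a fusion never has to match both the original and the converted form of a sub-graph.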
Subpackages
- embedl_deploy.quantize package
- embedl_deploy.tensorrt package
- embedl_deploy.tensorrt.modules package
- embedl_deploy.tensorrt.patterns package
  - AdaptiveAvgPoolPattern
  - ConvBNActPattern
  - ConvBNAddActPattern
  - ConvBNPattern
  - DecomposeMultiheadAttentionPattern
  - FlattenLinearToConv1x1Pattern
  - LayerNormPattern
  - LinearActPattern
  - LinearPattern
  - MHAInProjectionPattern
  - RemoveAssertPattern
  - RemoveDeadAssertPattern
  - RemoveIdentityAdaptiveAvgPoolPattern
  - RemoveIdentityPattern
  - ScaledDotProductAttentionPattern
  - StemConvBNActMaxPoolPattern
  - Pattern lists
- embedl_deploy.version package