embedl_deploy.tensorrt.patterns package#
Module contents:
Public re-exports of TensorRT pattern classes.
Users import from here:
from embedl_deploy.tensorrt.patterns import ConvBNPattern
- class embedl_deploy.tensorrt.patterns.ActAddPattern[source]#
Bases:
Pattern
Match Act → add(·, residual) and fuse into FusedActAdd.
Placing this pattern before the Conv-based fusion patterns prevents TensorRT from attempting to merge the upstream convolution into an activation-fused kernel when the activation output feeds a residual add. With Act → add absorbed into a single pointwise leaf, the upstream Conv → BN is matched by ConvBNPattern and quantized independently.
- graft#
alias of
FusedActAdd
- phase: Phase = 'fusion'#
Pipeline phase this pattern belongs to.
- tree: tuple[type[Module] | UnionType | Callable[[Node], bool] | Wildcard, ...] | Fork = Fork(inputs=((torch.nn.modules.activation.ReLU | torch.nn.modules.activation.ReLU6 | torch.nn.modules.activation.GELU | torch.nn.modules.activation.SiLU | torch.nn.modules.activation.Mish | torch.nn.modules.activation.Hardswish | torch.nn.modules.activation.Hardsigmoid | torch.nn.modules.activation.LeakyReLU | torch.nn.modules.activation.PReLU | torch.nn.modules.activation.ELU | torch.nn.modules.activation.Sigmoid | torch.nn.modules.activation.Tanh,), ()), operator=<function _is_add>, output=(), perms_override=None)#
The pattern topology to match, if using tree-based matching.
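The ordering effect described for ActAddPattern can be illustrated with a toy greedy matcher over a linear chain of op names. This is only a sketch under stated assumptions: the `claim` helper and the string op names are hypothetical, the residual branch is left out, and nothing here reflects how embedl_deploy actually walks an FX graph.

```python
def claim(chain, patterns):
    """Greedily claim non-overlapping spans, trying patterns in the given order."""
    owner = [None] * len(chain)
    for name, ops in patterns:
        n = len(ops)
        for i in range(len(chain) - n + 1):
            span = range(i, i + n)
            # A span is claimed only if it matches and no op in it is already owned.
            if chain[i:i + n] == ops and all(owner[j] is None for j in span):
                for j in span:
                    owner[j] = name
    return owner


chain = ["conv", "bn", "act", "add"]

act_add = ("ActAdd", ["act", "add"])
conv_bn_act = ("ConvBNAct", ["conv", "bn", "act"])
conv_bn = ("ConvBN", ["conv", "bn"])

# ActAdd first: the activation is absorbed with the add, Conv → BN stays separate.
first = claim(chain, [act_add, conv_bn_act, conv_bn])
# ConvBNAct first: the activation is pulled into the convolution group instead.
second = claim(chain, [conv_bn_act, act_add, conv_bn])
print(first)
print(second)
```

In the first ordering every op ends up in a group and the convolution is matched without its activation; in the second the add is left unclaimed.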
- class embedl_deploy.tensorrt.patterns.AdaptiveAvgPoolPattern[source]#
Bases:
Pattern
Match AdaptiveAvgPool2d and wrap in a fused module.
Although there is nothing to fuse, wrapping the pool in a recognized module allows the Q/DQ insertion pass to place quantize / dequantize stubs around it.
- graft#
alias of
FusedAdaptiveAvgPool2d
- phase: Phase = 'fusion'#
Pipeline phase this pattern belongs to.
- tree: tuple[type[Module] | UnionType | Callable[[Node], bool] | Wildcard, ...] | Fork = (<class 'torch.nn.modules.pooling.AdaptiveAvgPool2d'>,)#
The pattern topology to match, if using tree-based matching.
- class embedl_deploy.tensorrt.patterns.ConvBNActPattern[source]#
Bases:
Pattern
Match Conv2d → [BatchNorm2d] → Activation and fuse.
Any activation included in ActivationLike is accepted. The BatchNorm2d is optional.
- graft#
alias of
FusedConvBNAct
- phase: Phase = 'fusion'#
Pipeline phase this pattern belongs to.
- tree: tuple[type[Module] | UnionType | Callable[[Node], bool] | Wildcard, ...] | Fork = (<class 'torch.nn.modules.conv.Conv2d'>, Wildcard(check=<class 'torch.nn.modules.batchnorm.BatchNorm2d'>, quantifier='?', nodes=()), torch.nn.modules.activation.ReLU | torch.nn.modules.activation.ReLU6 | torch.nn.modules.activation.GELU | torch.nn.modules.activation.SiLU | torch.nn.modules.activation.Mish | torch.nn.modules.activation.Hardswish | torch.nn.modules.activation.Hardsigmoid | torch.nn.modules.activation.LeakyReLU | torch.nn.modules.activation.PReLU | torch.nn.modules.activation.ELU | torch.nn.modules.activation.Sigmoid | torch.nn.modules.activation.Tanh)#
The pattern topology to match, if using tree-based matching.
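The tree for ConvBNActPattern combines a plain type, an optional wildcard (quantifier '?') and a union of activation types. How such a sequence could be matched against a linear chain can be sketched as follows. The `Wild` dataclass and `match_chain` function are hypothetical stand-ins that work on type names as strings; the matcher is greedy with no backtracking, and the '*' and '+' quantifiers are assumed to mean "zero or more" and "one or more".

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Wild:
    """Hypothetical wildcard: a set of accepted names plus a quantifier."""
    names: frozenset
    quantifier: str  # '?' = 0 or 1, '*' = 0 or more, '+' = 1 or more


def match_chain(tree, chain, start=0):
    """Return the end index of a match of `tree` at chain[start:], or None."""
    pos = start
    for step in tree:
        if isinstance(step, Wild):
            count = 0
            limit = 1 if step.quantifier == "?" else len(chain)
            # Greedily consume accepted ops, up to the quantifier's limit.
            while pos < len(chain) and count < limit and chain[pos] in step.names:
                pos += 1
                count += 1
            if step.quantifier == "+" and count == 0:
                return None
        else:
            # A plain step is a set of accepted names and must match exactly one op.
            if pos >= len(chain) or chain[pos] not in step:
                return None
            pos += 1
    return pos


ACT = frozenset({"ReLU", "GELU", "SiLU"})
tree = (frozenset({"Conv2d"}), Wild(frozenset({"BatchNorm2d"}), "?"), ACT)

with_bn = match_chain(tree, ["Conv2d", "BatchNorm2d", "ReLU"])
without_bn = match_chain(tree, ["Conv2d", "GELU"])
no_act = match_chain(tree, ["Conv2d", "BatchNorm2d"])
print(with_bn, without_bn, no_act)
```

The chain with and without BatchNorm2d both match, while the chain that lacks an activation does not, which is the case ConvBNPattern is meant to pick up.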
- class embedl_deploy.tensorrt.patterns.ConvBNAddActPattern[source]#
Bases:
Pattern
Match Conv2d → BatchNorm2d → add(·, residual) → Activation.
Captures the tail of ResNet-style bottleneck blocks where the convolution path is element-wise added to a skip connection before the final activation.
- graft#
alias of
FusedConvBNAddAct
- phase: Phase = 'fusion'#
Pipeline phase this pattern belongs to.
- tree: tuple[type[Module] | UnionType | Callable[[Node], bool] | Wildcard, ...] | Fork = Fork(inputs=((<class 'torch.nn.modules.conv.Conv2d'>, <class 'torch.nn.modules.batchnorm.BatchNorm2d'>), ()), operator=<function _is_add>, output=(torch.nn.modules.activation.ReLU | torch.nn.modules.activation.ReLU6 | torch.nn.modules.activation.GELU | torch.nn.modules.activation.SiLU | torch.nn.modules.activation.Mish | torch.nn.modules.activation.Hardswish | torch.nn.modules.activation.Hardsigmoid | torch.nn.modules.activation.LeakyReLU | torch.nn.modules.activation.PReLU | torch.nn.modules.activation.ELU | torch.nn.modules.activation.Sigmoid | torch.nn.modules.activation.Tanh,), perms_override=None)#
The pattern topology to match, if using tree-based matching.
- class embedl_deploy.tensorrt.patterns.ConvBNPattern[source]#
Bases:
Pattern
Match Conv2d → [BatchNorm2d] (no activation) and fuse.
The BatchNorm2d is optional.
- graft#
alias of
FusedConvBN
- phase: Phase = 'fusion'#
Pipeline phase this pattern belongs to.
- tree: tuple[type[Module] | UnionType | Callable[[Node], bool] | Wildcard, ...] | Fork = (<class 'torch.nn.modules.conv.Conv2d'>, Wildcard(check=<class 'torch.nn.modules.batchnorm.BatchNorm2d'>, quantifier='?', nodes=()))#
The pattern topology to match, if using tree-based matching.
- class embedl_deploy.tensorrt.patterns.DecomposeMultiheadAttentionPattern[source]#
Bases:
Pattern
Decompose nn.MultiheadAttention into explicit sub-modules.
Replaces each MultiheadAttention node with three sub-modules visible in the FX graph: MHAInProjection, ScaledDotProductAttention, and nn.Linear (output projection, reused from the original MHA).
Only self-attention (_qkv_same_embed_dim=True, batch_first=True) without masks is supported. Unsupported configurations are skipped with a warning.
- graft: type[Module] | tuple[Callable[[TreeMatch], tuple[Module | Node | Callable[[GraphModule, tuple[Node, ...]], list[Node]], ...]], ...] = (<function _decompose_mha>,)#
The factories to make replacements for each matched tree, if used.
- phase: Phase = 'conversion'#
Pipeline phase this pattern belongs to.
- replace(pattern_match: PatternMatch) list[Node][source]#
Replace one matched occurrence in-place.
- Parameters:
pattern_match – The pattern match to replace.
- Returns:
The replacement nodes inserted into the graph.
- Raises:
ValueError – If the pattern has no graft.
TypeError – If the graft class constructor rejects the collected modules.
- tree: tuple[type[Module] | UnionType | Callable[[Node], bool] | Wildcard, ...] | Fork = (<function _is_supported_mha>,)#
The pattern topology to match, if using tree-based matching.
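The three stages that DecomposeMultiheadAttentionPattern exposes (in-projection, scaled dot-product attention, output projection) can be sketched numerically in plain Python. This is a simplified illustration, not the library's modules: it assumes a single head, no biases, no mask, and a packed projection weight of shape (3·E, E); all function names are hypothetical.

```python
import math


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(m):
    return [list(col) for col in zip(*m)]


def in_projection(x, w_packed):
    """Stage 1: one packed projection, then split the result into q, k, v."""
    e = len(x[0])
    qkv = matmul(x, transpose(w_packed))  # shape: (seq, 3 * e)
    return ([r[:e] for r in qkv], [r[e:2 * e] for r in qkv], [r[2 * e:] for r in qkv])


def scaled_dot_product_attention(q, k, v):
    """Stage 2: softmax(q @ k^T / sqrt(d)) @ v for a single head, no mask."""
    d = len(q[0])
    scores = [[s / math.sqrt(d) for s in row] for row in matmul(q, transpose(k))]
    weights = []
    for row in scores:
        m = max(row)  # subtract the row maximum for numerical stability
        exps = [math.exp(s - m) for s in row]
        total = sum(exps)
        weights.append([ex / total for ex in exps])
    return matmul(weights, v)


def out_projection(x, w_out):
    """Stage 3: the output linear layer (bias omitted)."""
    return matmul(x, transpose(w_out))


x = [[1.0, 0.0], [0.0, 1.0]]            # two tokens, embedding size 2
identity = [[1.0, 0.0], [0.0, 1.0]]
w_packed = identity * 3                  # q, k and v projections are all identity
q, k, v = in_projection(x, w_packed)
out = out_projection(scaled_dot_product_attention(q, k, v), identity)
print(out)
```

With identity projections the output equals the attention weights themselves, so each row sums to one and each token attends most strongly to itself.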
- class embedl_deploy.tensorrt.patterns.FlattenLinearToConv1x1Pattern[source]#
Bases:
Pattern
Replace Flatten (4D→2D) → Linear with Conv2d(1×1) → Flatten.
Many classification networks end with:
AdaptiveAvgPool2d → flatten → [Dropout → ReLU →] Linear
This conversion rewrites the tail into:
AdaptiveAvgPool2d → [Dropout → ReLU →] Conv2d(1×1) → flatten
Element-wise ops between flatten and Linear (activations, dropout) are absorbed by a Wildcard and moved in front of the Conv2d in the replacement.
The resulting Conv2d can then be matched by downstream fusion and Q/DQ patterns.
This is a structural conversion: it changes graph topology rather than collapsing a chain into a fused module. It must be applied before fusion patterns.
- graft: type[Module] | tuple[Callable[[TreeMatch], tuple[Module | Node | Callable[[GraphModule, tuple[Node, ...]], list[Node]], ...]], ...] = (<function _ew_nodes>, <function _reshape_and_conv>, <function keep_node.<locals>._get>)#
The factories to make replacements for each matched tree, if used.
- phase: Phase = 'conversion'#
Pipeline phase this pattern belongs to.
- tree: tuple[type[Module] | UnionType | Callable[[Node], bool] | Wildcard, ...] | Fork = (<function _is_flatten>, Wildcard(check=torch.nn.modules.dropout.Dropout | torch.nn.modules.dropout.Dropout2d | torch.nn.modules.activation.ReLU | torch.nn.modules.activation.ReLU6 | torch.nn.modules.activation.LeakyReLU | torch.nn.modules.activation.ELU | torch.nn.modules.activation.GELU | torch.nn.modules.activation.SiLU | torch.nn.modules.activation.Hardswish | torch.nn.modules.activation.Hardsigmoid, quantifier='*', nodes=()), <class 'torch.nn.modules.linear.Linear'>)#
The pattern topology to match, if using tree-based matching.
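The rewrite performed by FlattenLinearToConv1x1Pattern relies on a Linear layer over a flattened (C, 1, 1) feature map computing the same values as a 1×1 convolution whose weight is the Linear weight reshaped from (out, in) to (out, in, 1, 1). A minimal numeric check of that equivalence, assuming a 1×1 spatial size as produced by global pooling and using hypothetical pure-Python helpers rather than the library's replacement factories:

```python
def linear(x, weight, bias):
    """y[o] = sum_i weight[o][i] * x[i] + bias[o] for a flat vector x."""
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weight, bias)]


def conv1x1(fmap, weight4d, bias):
    """1x1 convolution over a (C, H, W) feature map given as nested lists."""
    c, h, w = len(fmap), len(fmap[0]), len(fmap[0][0])
    return [
        [[sum(weight4d[o][i][0][0] * fmap[i][y][x] for i in range(c)) + bias[o]
          for x in range(w)] for y in range(h)]
        for o in range(len(weight4d))
    ]


def flatten(fmap):
    """Flatten a (C, H, W) nested list into a flat vector in C, H, W order."""
    return [v for channel in fmap for row in channel for v in row]


# Pooled feature map of shape (C=3, H=1, W=1).
pooled = [[[0.5]], [[-1.0]], [[2.0]]]
weight = [[1.0, 2.0, 3.0], [0.0, -1.0, 0.5]]   # Linear weight, shape (out=2, in=3)
bias = [0.1, -0.2]

# Reshape the Linear weight (out, in) into a Conv2d weight (out, in, 1, 1).
weight4d = [[[[w]] for w in row] for row in weight]

via_linear = linear(flatten(pooled), weight, bias)     # flatten → Linear
via_conv = flatten(conv1x1(pooled, weight4d, bias))    # Conv2d(1×1) → flatten
print(via_linear, via_conv)
```

Both paths produce the same two output values, which is why the flatten can be moved behind the convolution.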
- class embedl_deploy.tensorrt.patterns.LayerNormPattern[source]#
Bases:
Pattern
Match LayerNorm and wrap in a fused module.
- graft#
alias of
FusedLayerNorm
- phase: Phase = 'fusion'#
Pipeline phase this pattern belongs to.
- tree: tuple[type[Module] | UnionType | Callable[[Node], bool] | Wildcard, ...] | Fork = (<class 'torch.nn.modules.normalization.LayerNorm'>,)#
The pattern topology to match, if using tree-based matching.
- class embedl_deploy.tensorrt.patterns.LinearActPattern[source]#
Bases:
Pattern
Match Linear → Activation and fuse.
Any activation included in ActivationLike is accepted.
- graft#
alias of
FusedLinearAct
- phase: Phase = 'fusion'#
Pipeline phase this pattern belongs to.
- tree: tuple[type[Module] | UnionType | Callable[[Node], bool] | Wildcard, ...] | Fork = (<class 'torch.nn.modules.linear.Linear'>, torch.nn.modules.activation.ReLU | torch.nn.modules.activation.ReLU6 | torch.nn.modules.activation.GELU | torch.nn.modules.activation.SiLU | torch.nn.modules.activation.Mish | torch.nn.modules.activation.Hardswish | torch.nn.modules.activation.Hardsigmoid | torch.nn.modules.activation.LeakyReLU | torch.nn.modules.activation.PReLU | torch.nn.modules.activation.ELU | torch.nn.modules.activation.Sigmoid | torch.nn.modules.activation.Tanh)#
The pattern topology to match, if using tree-based matching.
- class embedl_deploy.tensorrt.patterns.LinearPattern[source]#
Bases:
Pattern
Match a standalone Linear and wrap in a fused module.
- graft#
alias of
FusedLinear
- phase: Phase = 'fusion'#
Pipeline phase this pattern belongs to.
- tree: tuple[type[Module] | UnionType | Callable[[Node], bool] | Wildcard, ...] | Fork = (<class 'torch.nn.modules.linear.Linear'>,)#
The pattern topology to match, if using tree-based matching.
- class embedl_deploy.tensorrt.patterns.MHAInProjectionPattern[source]#
Bases:
Pattern
Match MHAInProjection and wrap in a fused module.
- graft#
alias of
FusedMHAInProjection
- phase: Phase = 'fusion'#
Pipeline phase this pattern belongs to.
- tree: tuple[type[Module] | UnionType | Callable[[Node], bool] | Wildcard, ...] | Fork = (<class 'embedl_deploy._internal.tensorrt.modules.attention.MHAInProjection'>,)#
The pattern topology to match, if using tree-based matching.
- class embedl_deploy.tensorrt.patterns.RemoveAssertPattern[source]#
Bases:
Pattern
Remove assertion subgraphs such as getattr → eq → _assert.
timm models often contain shape checks in the traced FX graph (for example assert x.dim() == 4). They are not runtime compute ops and can be safely erased for deployment graphs.
- graft: type[Module] | tuple[Callable[[TreeMatch], tuple[Module | Node | Callable[[GraphModule, tuple[Node, ...]], list[Node]], ...]], ...] = ()#
The factories to make replacements for each matched tree, if used.
- phase: Phase = 'conversion'#
Pipeline phase this pattern belongs to.
- tree: tuple[type[Module] | UnionType | Callable[[Node], bool] | Wildcard, ...] | Fork = (Wildcard(check=<function _is_assert_noise>, quantifier='+', nodes=()), <function _is_eq>, <function _is_assert>)#
The pattern topology to match, if using tree-based matching.
- class embedl_deploy.tensorrt.patterns.RemoveIdentityAdaptiveAvgPoolPattern[source]#
Bases:
Pattern
Remove AdaptiveAvgPool2d where output size equals input size.
When output_size == (H, W) of the incoming feature map, the pooling operation is a mathematical identity and can be safely erased. This is common in ConvNeXt-style architectures.
Assumes shapes have already been propagated into node.meta['tensor_meta'] (e.g. via a prior ShapeProp pass). Nodes with missing shape metadata are skipped with a warning.
- graft: type[Module] | tuple[Callable[[TreeMatch], tuple[Module | Node | Callable[[GraphModule, tuple[Node, ...]], list[Node]], ...]], ...] = ()#
The factories to make replacements for each matched tree, if used.
- phase: Phase = 'conversion'#
Pipeline phase this pattern belongs to.
- tree: tuple[type[Module] | UnionType | Callable[[Node], bool] | Wildcard, ...] | Fork = (<function _is_identity_adaptive_avg_pool>,)#
The pattern topology to match, if using tree-based matching.
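The identity claim behind RemoveIdentityAdaptiveAvgPoolPattern can be checked with a one-dimensional sketch of adaptive average pooling. The bin boundaries below follow the usual rule start = floor(i·n/out), end = ceil((i+1)·n/out); that rule and the helper name are assumptions made for illustration, and the same argument applies independently along H and W in the 2D case.

```python
import math


def adaptive_avg_pool_1d(values, output_size):
    """Average over the bin [floor(i*n/out), ceil((i+1)*n/out)) for each output i."""
    n = len(values)
    out = []
    for i in range(output_size):
        start = (i * n) // output_size
        end = math.ceil((i + 1) * n / output_size)
        window = values[start:end]
        out.append(sum(window) / len(window))
    return out


row = [3.0, -1.0, 4.0, 1.5]

# Output size equal to input size: every bin holds exactly one element.
same = adaptive_avg_pool_1d(row, output_size=len(row))
# Smaller output size: neighbouring elements are averaged.
halved = adaptive_avg_pool_1d(row, output_size=2)
print(same, halved)
```

When the output size equals the input size the result is the input itself, so removing the node does not change the computed values.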
- class embedl_deploy.tensorrt.patterns.RemoveIdentityPattern[source]#
Bases:
Pattern
Remove nn.Identity modules from the graph.
nn.Identity is a no-op module that passes its input through unchanged. These operations can be safely removed from the graph without affecting model behavior. This simplifies the graph for downstream optimization and fusion patterns.
- graft: type[Module] | tuple[Callable[[TreeMatch], tuple[Module | Node | Callable[[GraphModule, tuple[Node, ...]], list[Node]], ...]], ...] = ()#
The factories to make replacements for each matched tree, if used.
- phase: Phase = 'conversion'#
Pipeline phase this pattern belongs to.
- tree: tuple[type[Module] | UnionType | Callable[[Node], bool] | Wildcard, ...] | Fork = (<function _is_identity_passthrough>,)#
The pattern topology to match, if using tree-based matching.
- class embedl_deploy.tensorrt.patterns.ScaledDotProductAttentionPattern[source]#
Bases:
Pattern
Match ScaledDotProductAttention and wrap in a fused module.
- graft#
alias of
FusedScaledDotProductAttention
- phase: Phase = 'fusion'#
Pipeline phase this pattern belongs to.
- tree: tuple[type[Module] | UnionType | Callable[[Node], bool] | Wildcard, ...] | Fork = (<class 'embedl_deploy._internal.tensorrt.modules.attention.ScaledDotProductAttention'>,)#
The pattern topology to match, if using tree-based matching.
- class embedl_deploy.tensorrt.patterns.StemConvBNActMaxPoolPattern[source]#
Bases:
Pattern
Match Conv2d(3in, 7×7) → [BatchNorm2d] → Activation → MaxPool2d.
Captures the common classification-network stem. The convolution is constrained to in_channels == 3, kernel_size == (7, 7) so only the actual stem is matched, not arbitrary Conv→Act→Pool chains. The BatchNorm2d is optional.
- graft#
alias of
FusedConvBNActMaxPool
- phase: Phase = 'fusion'#
Pipeline phase this pattern belongs to.
- tree: tuple[type[Module] | UnionType | Callable[[Node], bool] | Wildcard, ...] | Fork = (<function _is_stem_conv>, Wildcard(check=<class 'torch.nn.modules.batchnorm.BatchNorm2d'>, quantifier='?', nodes=()), torch.nn.modules.activation.ReLU | torch.nn.modules.activation.ReLU6 | torch.nn.modules.activation.GELU | torch.nn.modules.activation.SiLU | torch.nn.modules.activation.Mish | torch.nn.modules.activation.Hardswish | torch.nn.modules.activation.Hardsigmoid | torch.nn.modules.activation.LeakyReLU | torch.nn.modules.activation.PReLU | torch.nn.modules.activation.ELU | torch.nn.modules.activation.Sigmoid | torch.nn.modules.activation.Tanh, <class 'torch.nn.modules.pooling.MaxPool2d'>)#
The pattern topology to match, if using tree-based matching.
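The constraint that StemConvBNActMaxPoolPattern places on its first node can be sketched as a plain predicate. `ConvSpec` and `is_stem_conv` are hypothetical stand-ins: the real check is the private `_is_stem_conv` function shown in the tree, which operates on FX nodes and whose body is not reproduced here.

```python
from dataclasses import dataclass


@dataclass
class ConvSpec:
    """Hypothetical stand-in for the Conv2d attributes the check inspects."""
    in_channels: int
    kernel_size: tuple


def is_stem_conv(conv):
    """Accept only a 7×7 convolution that reads a 3-channel input."""
    return conv.in_channels == 3 and conv.kernel_size == (7, 7)


stem = ConvSpec(in_channels=3, kernel_size=(7, 7))
small_kernel = ConvSpec(in_channels=3, kernel_size=(3, 3))
deep_layer = ConvSpec(in_channels=64, kernel_size=(7, 7))
print(is_stem_conv(stem), is_stem_conv(small_kernel), is_stem_conv(deep_layer))
```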