embedl_deploy.tensorrt package
Subpackages:
    embedl_deploy.tensorrt.modules package
    embedl_deploy.tensorrt.patterns package
        AdaptiveAvgPoolPattern
        ConvBNActPattern
        ConvBNAddActPattern
        ConvBNPattern
        DecomposeMultiheadAttentionPattern
        FlattenLinearToConv1x1Pattern
        LayerNormPattern
        LinearActPattern
        LinearPattern
        MHAInProjectionPattern
        RemoveAssertPattern
        RemoveDeadAssertPattern
        RemoveIdentityAdaptiveAvgPoolPattern
        RemoveIdentityPattern
        ScaledDotProductAttentionPattern
        StemConvBNActMaxPoolPattern
Module contents:
TensorRT backend — curated pattern lists and convenience API.
Quick start:
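import torch
from torchvision.models import resnet50

from embedl_deploy import transform
from embedl_deploy.tensorrt import TENSORRT_PATTERNS

# Build an untrained ResNet-50 and switch it to inference mode.
model = resnet50(weights=None).eval()
# Apply the default TensorRT conversions and fusions; the transformed
# module is read from the result's .model attribute.
deployed = transform(model, patterns=TENSORRT_PATTERNS).model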
Pattern lists
TENSORRT_CONVERSION_PATTERNS
    Structural conversions applied before fusion (e.g. rewriting
    Flatten→Linear as Conv1×1→Flatten).
TENSORRT_FUSION_PATTERNS
    Fusion-only patterns (Conv→BN→ReLU, Stem, residual, etc.).
TENSORRT_PATTERNS
    Union of conversions + fusions (the default for most users).
TENSORRT_QUANTIZED_PATTERNS
    Quantized variants (placeholder; not yet implemented).
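
To make the Conv→BN part of the fusion patterns more concrete, the following is a minimal pure-Python sketch of the standard batch-norm folding identity that such fusions generally rely on. It illustrates the general technique only and is not taken from embedl_deploy: the helpers conv1d, batch_norm, and fold_bn are hypothetical names, and a single-channel 1-D convolution stands in for a real Conv2d layer.

```python
import math


def conv1d(x, weight, bias):
    """Valid 1-D cross-correlation with one input and one output channel."""
    k = len(weight)
    return [
        sum(weight[j] * x[i + j] for j in range(k)) + bias
        for i in range(len(x) - k + 1)
    ]


def batch_norm(y, gamma, beta, mean, var, eps=1e-5):
    """Inference-mode batch norm using fixed running statistics."""
    scale = gamma / math.sqrt(var + eps)
    return [(v - mean) * scale + beta for v in y]


def fold_bn(weight, bias, gamma, beta, mean, var, eps=1e-5):
    """Return conv weight and bias that absorb the batch-norm affine map."""
    scale = gamma / math.sqrt(var + eps)
    return [w * scale for w in weight], (bias - mean) * scale + beta


# Illustrative values only (not from any real model).
x = [0.5, -1.0, 2.0, 3.0, -0.5]
weight, bias = [0.2, -0.4, 0.1], 0.3
gamma, beta, mean, var = 1.5, -0.2, 0.1, 0.8

# Unfused reference: convolution followed by batch norm.
reference = batch_norm(conv1d(x, weight, bias), gamma, beta, mean, var)

# Fused version: a single convolution with folded parameters.
fused_w, fused_b = fold_bn(weight, bias, gamma, beta, mean, var)
fused = conv1d(x, fused_w, fused_b)

max_abs_diff = max(abs(a - b) for a, b in zip(reference, fused))
print(f"max abs difference: {max_abs_diff:.2e}")
```

Because the fold is exact up to floating-point rounding, the fused and unfused outputs should agree closely. A comparison of this kind (on real tensors, with a tolerance) is one plausible way to sanity-check a transformed model against the original.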