Embedl Docs

User Guide

This guide covers the full Embedl Deploy optimization pipeline, from graph conversions through operator fusions to INT8 quantization, with working examples on ResNet50, ConvNeXt, and Vision Transformer (ViT).

  • Optimization Pipeline
    • Pipeline stages
    • One-shot API
    • Plan-based API
    • Pattern priority
    • Pattern groups
    • Verifying numerical equivalence
    • ONNX export and compilation
  • Graph Conversions
    • Built-in TensorRT conversions
    • When conversions matter
    • Running conversions only
  • Operator Fusions
    • Convolution fusions
    • Linear fusions
    • Attention fusions
    • Pooling fusions
    • Fusion summary by architecture
    • Running fusions only
  • Quantization
    • Quantization pipeline
    • Step 1: Transform and fuse
    • Step 2: Insert QDQ stubs
    • Step 3: Calibrate
    • QDQ point placement
    • Why pattern-aware QDQ matters
    • Full example: ResNet50 INT8 PTQ
    • Quantization-Aware Training (QAT)
  • Custom Patterns
    • Why customize?
    • Writing a custom pattern
    • Building a custom pattern list
    • Using the custom pattern list
    • Impact on benchmarks
  • Benchmark Results
    • Test setup
    • ResNet50
    • ConvNeXt
    • Vision Transformer (ViT-B/16)
    • Summary
    • Reproducing these results



© 2026 Embedl All rights reserved