TensorRT export

In this tutorial, we’ll walk through the process of performing 8-bit quantization on a simple model using TensorRT and Aidge. The steps include:

  • exporting the model

  • modifying the test script for quantization

  • preparing calibration data

  • running the quantization and profiling the quantized model

tutorial graph

Furthermore, as shown in this image but not demonstrated in this tutorial, Aidge allows the user to:

  • Add custom operators via the plugin interface

  • Facilitate the transformation of user data into calibration data

0. Requirements for this tutorial

To complete this tutorial, we highly recommend that you meet the following requirements:

  • You have completed the Aidge 101 tutorial

  • You have installed the aidge_export_tensorrt module

To compile the export on your machine, make sure that at least one of these two conditions is met:

  • Docker is installed (the export compilation chain can use Docker)

  • The packages required to support TensorRT 8.6 are installed
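
As a quick sanity check before starting, the sketch below tests whether a docker executable is on the PATH. The helper name docker_available is ours and is not part of Aidge. Finding the executable does not guarantee that the Docker daemon is running or that GPU access is configured.

```python
import shutil


def docker_available() -> bool:
    """Return True if a `docker` executable is found on the PATH."""
    return shutil.which("docker") is not None


if __name__ == "__main__":
    if docker_available():
        print("Docker found: the *_docker Makefile targets can be used.")
    else:
        print("Docker not found: install Docker or the TensorRT 8.6 packages.")
```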

1. Exporting the model

In this tutorial, we will export MobileNetV2, a lightweight convolutional neural network.

[1]:
!wget -c https://github.com/onnx/models/raw/main/validated/vision/classification/mobilenet/model/mobilenetv2-7.onnx
--2025-06-06 14:59:10--  https://github.com/onnx/models/raw/main/validated/vision/classification/mobilenet/model/mobilenetv2-7.onnx
Resolving github.com (github.com)... 140.82.112.4
Connecting to github.com (github.com)|140.82.112.4|:443... connected.
HTTP request sent, awaiting response... 302 Found
Location: https://media.githubusercontent.com/media/onnx/models/main/validated/vision/classification/mobilenet/model/mobilenetv2-7.onnx [following]
--2025-06-06 14:59:10--  https://media.githubusercontent.com/media/onnx/models/main/validated/vision/classification/mobilenet/model/mobilenetv2-7.onnx
Resolving media.githubusercontent.com (media.githubusercontent.com)... 185.199.109.133, 185.199.110.133, 185.199.108.133, ...
Connecting to media.githubusercontent.com (media.githubusercontent.com)|185.199.109.133|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 14246826 (14M) [application/octet-stream]
Saving to: ‘mobilenetv2-7.onnx’

mobilenetv2-7.onnx  100%[===================>]  13.59M  --.-KB/s    in 0.03s

2025-06-06 14:59:11 (432 MB/s) - ‘mobilenetv2-7.onnx’ saved [14246826/14246826]

For visualizing the model structure, we recommend using Netron. If you haven’t installed Netron yet, you can do so by executing the following command:

[2]:
# !pip install netron

Once installed, you can launch Netron to visualize the model:

[3]:
# import netron
# netron.start('mobilenetv2-7.onnx', 8080)

Then let’s export the model using the aidge_export_tensorrt module.

[4]:
# First, be sure that any previous exports are removed
!rm -rf export_trt
[5]:
import aidge_export_tensorrt

# Generate export for your model
# This function takes as argument the name of the export folder
# and the onnx file or the graphview of your model
aidge_export_tensorrt.export("export_trt", "mobilenetv2-7.onnx")

The export provides a Makefile with several options for using the export on your machine. You can generate a C++ export or a Python export.

You can also compile the export and/or the Python library with Docker if your host machine doesn’t have the required packages. In this tutorial, we generate the Python library of the export and use it in a Python script.

All of these options are summarized in the Makefile’s help (run make help in the export folder for more details).

[6]:
# Compile the export Python library by using docker
# and the Makefile provided in the export
!cd export_trt/ && make build_lib_python_docker

2. Modifying the test script for quantization

Next, modify test.py by adding nb_bits=8 to the Graph constructor and calling model.calibrate().

calibrate() can accept three arguments:

  • calibration_folder_path: to specify the path to your calibration folder

  • cache_file_path: to use your pre-built calibration cache

  • batch_size: to specify the batch size for calibration data
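
To illustrate how these three arguments might be combined, here is a minimal sketch that builds the keyword arguments and passes the cache only when it already exists on disk. The helper calibration_kwargs is hypothetical. We also assume that calibrate() accepts these names as keyword arguments and that the folder and cache can be given together; check the generated export before relying on either point.

```python
from pathlib import Path


def calibration_kwargs(calibration_folder, cache_file, batch_size=1):
    """Build keyword arguments for calibrate() (hypothetical helper).

    The cache path is included only if the cache file already exists.
    """
    kwargs = {
        "calibration_folder_path": str(calibration_folder),
        "batch_size": batch_size,
    }
    if Path(cache_file).is_file():
        kwargs["cache_file_path"] = str(cache_file)
    return kwargs


# Possible usage inside test.py, where `model` is the aidge_trt.Graph
# (assumes calibrate() accepts these keyword arguments):
# model.calibrate(**calibration_kwargs("calibration_folder", "calibration_cache"))
```

The paths in the usage comment mirror the calibration_folder and calibration_cache names used later in this tutorial.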

[7]:
%%writefile export_trt/test.py
"""Example test file for the TensorRT Python API."""

import build.lib.aidge_trt as aidge_trt
import numpy as np

if __name__ == '__main__':
    # Load the model
    model = aidge_trt.Graph("model.onnx", nb_bits=8)

    # Calibrate the model
    model.calibrate()

    # Initialize the model
    model.initialize()

    # Profile the model with 10 iterations
    model.profile(10)

    # Example of running inference
    # img: numpy.array = np.load("PATH TO NPY file")
    # output: numpy.array = model.run_sync([img])

Writing export_trt/test.py

3. Preparing the calibration dataset

To ensure accurate calibration, it’s essential to select representative samples. In this example, we will use a single 224x224 RGB image from the ImageNet dataset.

However, for practical applications, TensorRT suggests that “The amount of input data required is application-dependent, but experiments indicate that approximately 500 images are adequate for calibrating ImageNet classification networks”.
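
For a larger calibration set, one simple way to get a reasonably representative selection is to draw a random, reproducible subset of your images. The sketch below assumes the images are .jpg files in a single folder. The function name and its parameters are ours, and each selected image would still need to be preprocessed and converted into calibration data, which is not covered here.

```python
import random
from pathlib import Path


def pick_calibration_images(image_dir, count=500, seed=0):
    """Return up to `count` randomly chosen .jpg files from `image_dir`."""
    images = sorted(Path(image_dir).glob("*.jpg"))
    rng = random.Random(seed)  # fixed seed keeps the selection reproducible
    return rng.sample(images, min(count, len(images)))
```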

[8]:
# Create calibration folder
!cd export_trt/ && mkdir calibration_folder
[9]:
%matplotlib inline
import matplotlib.pyplot as plt
import matplotlib.image as mpimg

demo_img_path = './data/0.jpg'

img = mpimg.imread(demo_img_path)
imgplot = plt.imshow(img)
plt.show()

This image has been preprocessed and stored in the data/ folder as the 0.batch file. Information about the image’s shape is stored in the .info file.
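
Once the files have been copied into the calibration folder (the next cell), a small check like the following can catch a missing .info or .batch file before calibration starts. It is only a sketch: the helper name is ours, and it checks that the files exist, not that their contents are valid.

```python
from pathlib import Path


def check_calibration_folder(folder):
    """Return a list of problems found in the calibration folder (empty if none)."""
    folder = Path(folder)
    if not folder.is_dir():
        return [f"{folder} is not a directory"]
    problems = []
    if not (folder / ".info").is_file():
        problems.append("missing .info file")
    if not any(folder.glob("*.batch")):
        problems.append("no .batch files found")
    return problems
```

For example, check_calibration_folder("export_trt/calibration_folder") should return an empty list once both files are in place.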

[10]:
import shutil

shutil.copy("data/.info", "export_trt/calibration_folder/.info")
shutil.copy("data/0.batch", "export_trt/calibration_folder/0.batch")

4. Generating the quantized model

Finally, run the test script to quantize the model with the export Python library and profile it.

[11]:
!cd export_trt/ && make test_lib_python_docker

Following these steps lets you perform 8-bit quantization on your model. Once calibration has completed, its results can be reused as long as a calibration_cache file exists, which saves computational resources.

[12]:
!tail -n +0 export_trt/calibration_cache

After quantization, you can save the generated TensorRT engine using model.save("name_of_your_model"). The method saves the engine to a .trt file.

To load the engine for further applications, use model.load("name_of_your_model.trt") after instantiating a model.
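
Saving and loading can be combined into a simple load-or-build pattern, sketched below. The helper load_or_build is hypothetical and uses only the calibrate(), initialize(), save() and load() calls shown in this tutorial. It assumes that save() is called after initialize(), and it does not settle whether initialize() is also required after load(), so verify both points against your export.

```python
from pathlib import Path


def load_or_build(model, name):
    """Load `<name>.trt` into `model` if it exists, else calibrate, build and save.

    Returns "loaded" or "built" so the caller can log which path was taken.
    """
    engine = Path(f"{name}.trt")
    if engine.is_file():
        model.load(str(engine))
        return "loaded"
    model.calibrate()
    model.initialize()
    model.save(name)  # expected to write <name>.trt
    return "built"
```

Here model would be the aidge_trt.Graph instance created in test.py.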