We are going to demonstrate how to automatically generate a plugin for a custom kernel with Torch-TensorRT, using the new Python-native plugin system in TensorRT 10.7. Running the custom kernel as a native TensorRT plugin avoids the performance and resource overhead from a graph break.

Previously this involved a complex process of not only building a performant kernel but also setting it up to run in TensorRT (see: `Using Custom Kernels within TensorRT Engines with Torch-TensorRT <https://pytorch.org/TensorRT/tutorials/_rendered_examples/dynamo/custom_kernel_plugins.html>`_).

As of TensorRT 10.7, there is a new Python-native plugin system which greatly streamlines this process. This
plugin system also allows Torch-TensorRT to automatically generate the necessary conversion code to convert the
operation in PyTorch to TensorRT.

In addition, Torch-TensorRT provides a feature that automatically generates TensorRT plugins (see: `Automatically Generate a Plugin for a Custom Kernel <https://docs.pytorch.org/TensorRT/tutorials/_rendered_examples/dynamo/auto_generate_plugins.html>`_).

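As a rough sketch of that feature (the op id ``my::elementwise_mul`` is a hypothetical example; ``custom_op`` is the helper under ``torch_tensorrt.dynamo.conversion.plugins``), a single call can generate both the plugin and its converter for an already-registered PyTorch custom op:

.. code-block:: python

    import torch_tensorrt

    # Auto-generate a TensorRT plugin and a Torch-TensorRT converter for
    # the (hypothetical) PyTorch custom op "my::elementwise_mul".
    torch_tensorrt.dynamo.conversion.plugins.custom_op(
        "my::elementwise_mul", supports_dynamic_shapes=True
    )
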
However, the above methods generate a just-in-time (JIT) plugin, which might not satisfy the user's performance requirements.

To address this, Torch-TensorRT also supports automatic generation of a TensorRT AOT plugin, which wraps a function to define an Ahead-of-Time (AOT) implementation for a plugin that has already been registered.
This provides a performance boost compared to the JIT plugin.

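As a loose sketch of what that wrapping looks like (assuming the ``tensorrt.plugin`` module, imported as ``trtp``; the op id and function names are illustrative), the AOT implementation returns a precompiled kernel instead of a Python callable that is JIT-compiled at runtime:

.. code-block:: python

    from typing import Tuple, Union

    import tensorrt.plugin as trtp

    # Shape/dtype description for the (hypothetical) plugin "my::elementwise_mul".
    @trtp.register("my::elementwise_mul")
    def mul_plugin_desc(
        x: trtp.TensorDesc, y: trtp.TensorDesc
    ) -> Tuple[trtp.TensorDesc]:
        return x.like()

    # AOT implementation: returns the compiled kernel's name, its PTX (or
    # CUBIN), launch parameters, and any extra scalar arguments, so nothing
    # is JIT-compiled when the engine runs.
    @trtp.aot_impl("my::elementwise_mul")
    def mul_plugin_aot_impl(
        x: trtp.TensorDesc,
        y: trtp.TensorDesc,
        outputs: Tuple[trtp.TensorDesc],
        tactic: int,
    ) -> Tuple[
        Union[str, bytes], Union[str, bytes], trtp.KernelLaunchParams, trtp.SymExprs
    ]:
        ...  # compile the Triton kernel ahead of time and return its PTX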
"""
# %%
# Writing Custom Operators in PyTorch
# -----------------------------------------
#
# Here we define a Triton kernel that will later be compiled ahead of time into a TensorRT plugin.
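# A minimal sketch of what such a kernel might look like (an elementwise
# multiplication, as in the earlier tutorials; the kernel and argument names
# are illustrative):

import triton
import triton.language as tl


@triton.jit
def elementwise_mul_kernel(
    x_ptr, y_ptr, out_ptr, n_elements, BLOCK_SIZE: tl.constexpr
):
    # Each program instance handles one BLOCK_SIZE-wide slice of the inputs.
    pid = tl.program_id(axis=0)
    offsets = pid * BLOCK_SIZE + tl.arange(0, BLOCK_SIZE)
    mask = offsets < n_elements
    # Load both inputs, multiply elementwise, and store the result.
    x = tl.load(x_ptr + offsets, mask=mask)
    y = tl.load(y_ptr + offsets, mask=mask)
    tl.store(out_ptr + offsets, x * y, mask=mask)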