tag:github.com,2008:https://github.com/NVIDIA/TensorRT/releases Release notes from TensorRT 2026-02-03T22:19:46Z tag:github.com,2008:Repository/184657328/v10.15 2026-02-03T22:22:41Z TensorRT 10.15 Release <p>For more information, see the <a href="https://docs.nvidia.com/deeplearning/tensorrt/latest/getting-started/release-notes-10/10.15.1.html" rel="nofollow">TensorRT 10.15 Release Notes</a>.</p> <h1>Sample changes</h1> <ul> <li>Added two safety samples, sampleSafeMNIST and sampleSafePluginV3, to demonstrate how to use TensorRT with the safety workflow.</li> <li>Added trtSafeExec to accompany the safety workflow release.</li> <li>Added python/stream_writer to showcase how to serialize a TensorRT engine directly to a custom stream using the IStreamWriter interface, rather than writing to a file or a contiguous memory buffer.</li> <li>Added python/strongly_type_autocast to demonstrate how to convert FP32 ONNX models to mixed precision (FP32-FP16) using ModelOpt's AutoCast tool and then build the engine with TensorRT's Strong Typing mode.</li> <li>Added sampleCudla to demonstrate how to use the cuDLA API to run TensorRT engines on the Deep Learning Accelerator (DLA) hardware, which is available on NVIDIA Jetson and DRIVE platforms.</li> <li>Deprecated sampleCharRNN.</li> </ul> <h1>Plugin changes</h1> <ul> <li>Deprecated bertQKVToContextPlugin; it will be removed in a future release. 
No alternative is planned.</li> </ul> <h1>Parser changes</h1> <ul> <li>Added support for RotaryEmbedding, RMSNormalization and TensorScatter to improve LLM support.</li> <li>Added more specialized quantization ops for models quantized through TensorRT ModelOptimizer.</li> <li>Added the kREPORT_CAPABILITY_DLA flag to enable per-node validation when building DLA engines through TensorRT.</li> <li>Added the kENABLE_PLUGIN_OVERRIDE flag to enable TensorRT plugin override for nodes that share names with user plugins.</li> <li>Improved error reporting for models with multiple subgraphs, such as those containing Loop or Scan nodes.</li> </ul> <h1>Demo changes</h1> <ul> <li>demoDiffusion: Stable Diffusion 1.5, 2.0 and 2.1 pipelines have been deprecated and removed.</li> </ul> kevinch-nv tag:github.com,2008:Repository/184657328/v10.14 2026-01-28T17:30:19Z TensorRT 10.14 Release <h2>10.14 GA - 2025-11-7</h2> <ul> <li> <p>Sample changes</p> <ul> <li>Replaced all pycuda usages with cuda-python APIs</li> <li>Removed the efficientnet samples</li> <li>Deprecated tensorflow_object_detection and efficientdet samples</li> <li>Samples will no longer be released with the packages. 
The TensorRT GitHub repository will be the single source.</li> </ul> </li> <li> <p>Parsers:</p> <ul> <li>Added support for the <code>Attention</code> operator</li> <li>Improved refit for <code>ConstantOfShape</code> nodes</li> </ul> </li> <li> <p>Demos</p> <ul> <li>demoDiffusion: <ul> <li>Added support for the Cosmos-Predict2 text2image and video2world pipelines</li> </ul> </li> </ul> </li> </ul> asfiyab-nvidia tag:github.com,2008:Repository/184657328/v10.13.3 2025-09-09T00:16:35Z TensorRT 10.13.3 Release <p>See the <a href="https://docs.nvidia.com/deeplearning/tensorrt/latest/getting-started/release-notes.html" rel="nofollow">TensorRT 10.13.3 Release Notes</a> for more information.</p> <ul> <li>Added support for the TensorRT API Capture and Replay feature; see the <a href="https://docs.nvidia.com/deeplearning/tensorrt/latest/inference-library/advanced.html" rel="nofollow">developer guide</a> for more information.</li> </ul> <p>Demo changes</p> <ul> <li>Added support for the Flux Kontext pipeline.</li> </ul> kevinch-nv tag:github.com,2008:Repository/184657328/v10.13.2 2025-08-19T16:44:08Z TensorRT 10.13.2 Release <h2>10.13.2 GA - 2025-8-18</h2> <p>For more information, see the <a href="https://docs.nvidia.com/deeplearning/tensorrt/10.13.2/getting-started/release-notes.html" rel="nofollow">10.13.2 release notes.</a></p> <ul> <li>Added support for CUDA 13.0 and dropped support for CUDA 11.X</li> <li>Dropped support for Ubuntu 20.04</li> <li>Dropped support for Python versions &lt; 3.10 for samples and demos</li> </ul> kevinch-nv tag:github.com,2008:Repository/184657328/v10.13.0 2025-07-24T22:00:51Z TensorRT 10.13 Release <h2>10.13.0 GA - 2025-7-24</h2> <ul> <li>Plugin changes <ul> <li>Fixed a division-by-zero error in geluPlugin that occurred when the bias was omitted.</li> <li>Completed transition away from using static plugin field/attribute member variables in standard plugins. 
Static members are no longer needed because TRT currently does not access field information after plugin creators are destructed (deregistered from the plugin registry), nor does it access such information without a creator instance.</li> </ul> </li> <li>Sample changes <ul> <li>Deprecated the <code>yolov3_onnx</code> sample because the URL for the YOLO weights is unstable.</li> <li>Updated the <code>1_run_onnx_with_tensorrt</code> and <code>2_construct_network_with_layer_apis</code> samples to use <code>cuda-python</code> instead of <code>PyCUDA</code> to support the latest GPUs and CUDA versions.</li> </ul> </li> <li>Parser changes <ul> <li>Decreased memory usage when importing models with external weights</li> <li>Added <code>loadModelProto</code>, <code>loadInitializer</code> and <code>parseModelProto</code> APIs for IParser. These APIs are meant to be used to load user initializers when parsing ONNX models.</li> <li>Added <code>loadModelProto</code>, <code>loadInitializer</code> and <code>refitModelProto</code> APIs for IParserRefitter. These APIs are meant to be used to load user initializers when refitting ONNX models.</li> <li>Deprecated <code>IParser::parseWithWeightDescriptors</code>.</li> </ul> </li> </ul> kevinch-nv tag:github.com,2008:Repository/184657328/v10.12.0 2025-06-18T21:41:29Z TensorRT 10.12 Release <h2>10.12.0 GA - 2025-6-10</h2> <p>Key Features and Updates:</p> <ul> <li>Plugin changes <ul> <li>Migrated <code>IPluginV2</code>-descendent version 1 of <code>cropAndResizeDynamic</code>, to version 2, which implements <code>IPluginV3</code>.</li> <li>Note: The newer versions preserve the attributes and I/O of the corresponding older plugin version. 
The older plugin versions are deprecated and will be removed in a future release</li> <li>Deprecated the listed versions of the following plugins: <ul> <li><code>DecodeBbox3DPlugin</code> (version 1)</li> <li><code>DetectionLayer_TRT</code> (version 1)</li> <li><code>EfficientNMS_TRT</code> (version 1)</li> <li><code>FlattenConcat_TRT</code> (version 1)</li> <li><code>GenerateDetection_TRT</code> (version 1)</li> <li><code>GridAnchor_TRT</code> (version 1)</li> <li><code>GroupNormalizationPlugin</code> (version 1)</li> <li><code>InstanceNormalization_TRT</code> (version 2)</li> <li><code>ModulatedDeformConv2d</code> (version 1)</li> <li><code>MultilevelCropAndResize_TRT</code> (version 1)</li> <li><code>MultilevelProposeROI_TRT</code> (version 1)</li> <li><code>RPROI_TRT</code> (version 1)</li> <li><code>PillarScatterPlugin</code> (version 1)</li> <li><code>PriorBox_TRT</code> (version 1)</li> <li><code>ProposalLayer_TRT</code> (version 1)</li> <li><code>ProposalDynamic</code> (version 1)</li> <li><code>Region_TRT</code> (version 1)</li> <li><code>Reorg_TRT</code> (version 2)</li> <li><code>ResizeNearest_TRT</code> (version 1)</li> <li><code>ScatterND</code> (version 1)</li> <li><code>VoxelGeneratorPlugin</code> (version 1)</li> </ul> </li> </ul> </li> <li>Demo changes <ul> <li>Added <a href="/NVIDIA/TensorRT/blob/v10.12.0/demo/Diffusion#generate-an-image-with-stable-diffusion-v35-large-with-controlnet-guided-by-an-image-and-a-text-prompt">Image-to-Image</a> support for Stable Diffusion v3.5-large ControlNet models.</li> <li>Enabled download of <a href="https://huggingface.co/stabilityai/stable-diffusion-3.5-large-tensorrt" rel="nofollow">pre-exported ONNX models</a> for the Stable Diffusion v3.5-large pipeline.</li> </ul> </li> <li>Sample changes <ul> <li>Added two refactored python samples <a href="/NVIDIA/TensorRT/blob/v10.12.0/samples/python/refactored/1_run_onnx_with_tensorrt">1_run_onnx_with_tensorrt</a> and <a 
href="/NVIDIA/TensorRT/blob/v10.12.0/samples/python/refactored/2_construct_network_with_layer_apis">2_construct_network_with_layer_apis</a></li> </ul> </li> <li>Parser changes <ul> <li>Added support for integer-typed base tensors for <code>Pow</code> operations</li> <li>Added support for custom <code>MXFP8</code> quantization operations</li> <li>Added support for ellipses, diagonal, and broadcasting in <code>Einsum</code> operations</li> </ul> </li> </ul> akhilg-nv tag:github.com,2008:Repository/184657328/v10.11 2025-05-21T22:59:38Z TensorRT 10.11 Release <h2>10.11.0 GA - 2025-5-21</h2> <p>Key Features and Updates:</p> <ul> <li>Plugin changes <ul> <li>Migrated <code>IPluginV2</code>-descendent version 1 of <code>modulatedDeformConvPlugin</code>, to version 2, which implements <code>IPluginV3</code>.</li> <li>Migrated <code>IPluginV2</code>-descendent version 1 of <code>DisentangledAttention_TRT</code>, to version 2, which implements <code>IPluginV3</code>.</li> <li>Migrated <code>IPluginV2</code>-descendent version 1 of <code>MultiscaleDeformableAttnPlugin_TRT</code>, to version 2, which implements <code>IPluginV3</code>.</li> <li>Note: The newer versions preserve the attributes and I/O of the corresponding older plugin version. 
The older plugin versions are deprecated and will be removed in a future release.</li> </ul> </li> <li>Demo changes <ul> <li>demoDiffusion <ul> <li>Added support for Stable Diffusion 3.5-medium and 3.5-large pipelines in BF16 and FP16 precisions.</li> </ul> </li> </ul> </li> <li>Parser changes <ul> <li>Added <code>kENABLE_UINT8_AND_ASYMMETRIC_QUANTIZATION_DLA</code> parser flag to enable UINT8 asymmetric quantization on engines targeting DLA.</li> <li>Removed restriction that inputs to <code>RandomNormalLike</code> and <code>RandomUniformLike</code> must be tensors.</li> <li>Clarified limitations of scan outputs for <code>Loop</code> nodes.</li> </ul> </li> </ul> asfiyab-nvidia tag:github.com,2008:Repository/184657328/v10.10.0 2025-05-09T22:27:41Z TensorRT OSS v10.10.0 <h1>10.10.0 GA</h1> <p>For more information, see the <a href="https://docs.nvidia.com/deeplearning/tensorrt/latest/getting-started/release-notes.html#tensorrt-10-10-0" rel="nofollow">TensorRT 10.10.0 release notes</a>.</p> <p>Key Features and Updates:</p> <ul> <li>Demo changes <ul> <li>demoDiffusion <ul> <li>Added fp16 and fp8 LoRA support for demo diffusion’s SDXL and FLUX pipeline.</li> <li>Added fp16 ControlNet support for demo diffusion’s SDXL pipeline.</li> </ul> </li> </ul> </li> <li>Plugin changes <ul> <li>Deprecated the enum classes <a href="https://docs.nvidia.com/deeplearning/tensorrt/latest/_static/c-api/namespacenvinfer1.html#a6fb3932a2896d82a94c8783e640afb34" rel="nofollow">PluginVersion</a> &amp; <a href="https://docs.nvidia.com/deeplearning/tensorrt/latest/_static/c-api/namespacenvinfer1.html#a43c4159a19c23f74234f3c34124ea0c5" rel="nofollow">PluginCreatorVersion</a>. 
PluginVersion &amp; PluginCreatorVersion are used only in relation to IPluginV2-descendent plugin interfaces, which are all deprecated.</li> <li>Added the following APIs that enable users to obtain a list of all Plugin Creators hierarchically registered to a TensorRT Plugin Registry (<a href="https://docs.nvidia.com/deeplearning/tensorrt/latest/_static/c-api/classnvinfer1_1_1_i_plugin_registry.html" rel="nofollow">C++</a>, <a href="https://docs.nvidia.com/deeplearning/tensorrt/latest/_static/python-api/infer/Plugin/IPluginRegistry.html" rel="nofollow">Python</a>) instance. <ul> <li>C++ API: IPluginRegistry::getAllCreatorsRecursive()</li> <li>Python API: IPluginRegistry.all_creators_recursive</li> </ul> </li> </ul> </li> <li>Parser changes <ul> <li>Cleaned up log spam when the ONNX network contained a mixture of Plugins and LocalFunctions</li> <li>UINT8 constants are now properly imported for QuantizeLinear &amp; DequantizeLinear nodes</li> <li>Plugin fallback importer now also reads its namespace from a Node's domain field</li> </ul> </li> <li>Sample changes <ul> <li>Added support for the <a href="https://github.com/NVIDIA/TensorRT/tree/release/10.9/samples/python/python_plugin">python_plugin sample</a> to compile targets to Blackwell.</li> </ul> </li> </ul> poweiw tag:github.com,2008:Repository/184657328/v10.9.0 2025-03-11T22:00:06Z TensorRT OSS v10.9.0 <h1>10.9.0 GA</h1> <p>For more information, see the <a href="https://docs.nvidia.com/deeplearning/tensorrt/latest/getting-started/release-notes.html#tensorrt-10-9-0" rel="nofollow">TensorRT 10.9.0 release notes</a>.</p> <p>Key Features and Updates:</p> <ul> <li>Demo changes <ul> <li>demoDiffusion <ul> <li>Added Canny ControlNet support for the SDXL pipeline</li> </ul> </li> </ul> </li> <li>Plugin changes <ul> <li>Added a readme to the GroupNormalization plugin (<code>GroupNormalizationPlugin</code>) - <a href="https://github.com/NVIDIA/TensorRT/issues/4314" data-hovercard-type="issue" 
data-hovercard-url="/NVIDIA/TensorRT/issues/4314/hovercard">4314</a></li> <li>Fixed bug in <code>CustomQKVToContextPluginDynamic</code> version 3 where SM 100 was not considered a supported platform.</li> </ul> </li> <li>Parser changes <ul> <li>Added support for Python AOT plugins</li> <li>Added support for opset 21 GroupNorm - <a href="https://github.com/NVIDIA/TensorRT/issues/4336" data-hovercard-type="issue" data-hovercard-url="/NVIDIA/TensorRT/issues/4336/hovercard">4336</a></li> <li>Fixed support for opset 18+ ScatterND</li> </ul> </li> <li>Sample changes <ul> <li>Added a new sample <code>dds_faster_rcnn</code> which demonstrates how to handle data-dependent shaped outputs with <code>IOutputAllocator</code>.</li> </ul> </li> <li>Fixed issues: <ul> <li>Fixed streamReaderV2 Python API performance issue - <a href="https://github.com/NVIDIA/TensorRT/issues/4327" data-hovercard-type="issue" data-hovercard-url="/NVIDIA/TensorRT/issues/4327/hovercard">4327</a></li> </ul> </li> </ul> LeoZDong tag:github.com,2008:Repository/184657328/v10.8.0 2025-02-01T01:09:15Z TensorRT OSS v10.8.0 <h1>10.8.0 GA</h1> <p>For more information, see the <a href="https://docs.nvidia.com/deeplearning/tensorrt/release-notes/index.html#rel-10-8-0" rel="nofollow">TensorRT 10.8.0 release notes</a>.</p> <p>Key Features and Updates:</p> <ul> <li>Demo changes <ul> <li>demoDiffusion <ul> <li>Added <a href="/NVIDIA/TensorRT/blob/v10.8.0/demo/Diffusion#generate-an-image-guided-by-an-initial-image-and-a-text-prompt-using-flux">Image-to-Image</a> support for Flux-1.dev and Flux.1-schnell pipelines.</li> <li>Added <a href="/NVIDIA/TensorRT/blob/v10.8.0/demo/Diffusion#generate-an-image-guided-by-a-text-prompt-and-a-control-image-using-flux-controlnet">ControlNet</a> support for <a href="https://huggingface.co/black-forest-labs/FLUX.1-Canny-dev" rel="nofollow">FLUX.1-Canny-dev</a> and <a href="https://huggingface.co/black-forest-labs/FLUX.1-Depth-dev" rel="nofollow">FLUX.1-Depth-dev</a> pipelines. 
Native FP8 quantization is also supported for these pipelines.</li> <li>Added support for ONNX model export only mode. See <a href="/NVIDIA/TensorRT/blob/v10.8.0/demo/Diffusion#https:/gitlab-master.nvidia.com/TensorRT/Public/oss/-/tree/release/10.8/demo/Diffusion?ref_type=heads#use-separate-directories-for-individual-onnx-models">--onnx-export-only</a>.</li> <li>Added FP16, BF16, FP8, and FP4 support for all Flux Pipelines.</li> </ul> </li> </ul> </li> <li>Plugin changes <ul> <li>Added SM 100 and SM 120 support to bertQKVToContextPlugin. This enables demo/BERT on Blackwell GPUs.</li> </ul> </li> <li>Sample changes <ul> <li>Added a new <code>sampleEditableTimingCache</code> to demonstrate how to build an engine with the desired tactics by modifying the timing cache.</li> <li>Deleted the <code>sampleAlgorithmSelector</code> sample.</li> <li>Fixed <code>sampleOnnxMNIST</code> by updating the correct INT8 dynamic range.</li> </ul> </li> <li>Parser changes <ul> <li>Added support for <code>FLOAT4E2M1</code> types for quantized networks.</li> <li>Added support for dynamic axes and improved performance of <code>CumSum</code> operations.</li> <li>Fixed the import of local functions when their input tensor names aliased one from an outside scope.</li> <li>Added support for <code>Pow</code> ops with integer-typed exponent values.</li> </ul> </li> <li>Fixed issues <ul> <li>Fixed segmentation of boolean constant nodes - <a href="https://github.com/NVIDIA/TensorRT/issues/4224" data-hovercard-type="issue" data-hovercard-url="/NVIDIA/TensorRT/issues/4224/hovercard">4224</a>.</li> <li>Fixed accuracy issue when multiple optimization profiles were defined <a href="https://github.com/NVIDIA/TensorRT/issues/4250" data-hovercard-type="issue" data-hovercard-url="/NVIDIA/TensorRT/issues/4250/hovercard">4250</a>.</li> </ul> </li> </ul> yuanyao-nv