https://rustls.dev/ Zola en Sat, 07 Mar 2026 00:00:00 +0000 Benchmarking rustls 0.23.37 vs OpenSSL 3.6.1 vs BoringSSL on x86_64 Sat, 07 Mar 2026 00:00:00 +0000 Unknown https://rustls.dev/perf/2026-03-07-report/ https://rustls.dev/perf/2026-03-07-report/ <h3 id="system-configuration">System configuration</h3> <p>We ran the benchmarks on a bare-metal server with the following characteristics:</p> <ul> <li>OS: Debian 12 (Bookworm).</li> <li>C/C++ toolchains: GCC 12.2.0 and Clang 14.0.6.</li> <li>Rust toolchain: 1.94.0.</li> <li>CPU: <a href="https://www.intel.com/content/www/us/en/products/sku/214806/intel-xeon-e2386g-processor-12m-cache-3-50-ghz/specifications.html">Xeon E-2386G</a> (supporting AVX-512).</li> <li>Memory: 32GB.</li> <li>Extra configuration: hyper-threading disabled, dynamic frequency scaling disabled, CPU scaling governor set to <code>performance</code> for all cores.</li> </ul> <h3 id="versions">Versions</h3> <p>The benchmarking tool used for both OpenSSL and BoringSSL was <a href="https://github.com/ctz/openssl-bench/tree/82b86b22">openssl-bench 82b86b22</a>.</p> <p>This was built from source with its makefile.</p> <h4 id="boringssl">BoringSSL</h4> <p>The tested version of BoringSSL is <a href="https://github.com/google/boringssl/tree/30cd935">30cd935</a>, which was the most recent point on main when we started these measurements.</p> <p>BoringSSL was built from source with <code>CC=clang CXX=clang++ cmake -DCMAKE_BUILD_TYPE=Release</code>.</p> <h4 id="openssl">OpenSSL</h4> <p>The tested version of OpenSSL is <a href="https://github.com/openssl/openssl/tree/openssl-3.6.1">3.6.1</a>, which was the latest release at the time of writing.</p> <p>OpenSSL was built from source with <code>./Configure ; make -j12</code>.</p> <h4 id="rustls">Rustls</h4> <p>The tested version of rustls is <a href="https://github.com/rustls/rustls/releases/tag/v%2F0.23.37">0.23.37</a>, which was the latest stable release at the time of writing. 
This was used with aws-lc-rs 1.16.0 / aws-lc-sys 0.37.1.</p> <h3 id="measurements">Measurements</h3> <p>BoringSSL was tested with this command:</p> <pre data-lang="shell" style="background-color:#2b303b;color:#c0c5ce;" class="language-shell "><code class="language-shell" data-lang="shell"><span>~/bench/openssl-bench </span><span>$ BENCH_MULTIPLIER=16 setarch -R make measure BORINGSSL=1 </span></code></pre> <p>OpenSSL was tested with this command:</p> <pre data-lang="shell" style="background-color:#2b303b;color:#c0c5ce;" class="language-shell "><code class="language-shell" data-lang="shell"><span>~/bench/openssl-bench </span><span>$ BENCH_MULTIPLIER=16 setarch -R make measure </span></code></pre> <p>rustls was tested with this command:</p> <pre data-lang="shell" style="background-color:#2b303b;color:#c0c5ce;" class="language-shell "><code class="language-shell" data-lang="shell"><span>~/bench/rustls </span><span>$ BENCH_MULTIPLIER=16 setarch -R make -f admin/bench-measure.mk measure </span></code></pre> <h2 id="results">Results</h2> <p>Transfer measurements are in megabytes per second. 
Handshake units are handshakes per second.</p> <table><thead><tr><th></th><th>BoringSSL 30cd935</th><th>OpenSSL 3.6.1</th><th>rustls 0.23.37</th></tr></thead><tbody> <tr><td>transfer, 1.2, aes-128-gcm, sending</td><td>8291.9</td><td>6610.19</td><td>8133.85</td></tr> <tr><td>transfer, 1.2, aes-128-gcm, receiving</td><td>6722.26</td><td>7129.43</td><td>7946.96</td></tr> <tr><td>transfer, 1.3, aes-256-gcm, sending</td><td>7564.71</td><td>5844.35</td><td>7421.02</td></tr> <tr><td>transfer, 1.3, aes-256-gcm, receiving</td><td>6217.68</td><td>6237.52</td><td>7332.91</td></tr> <tr><td></td><td>BoringSSL 30cd935</td><td>OpenSSL 3.6.1</td><td>rustls 0.23.37</td></tr> <tr><td>full handshakes, 1.2, rsa, client</td><td>5657.81</td><td>3186.87</td><td>8122.99</td></tr> <tr><td>full handshakes, 1.2, rsa, server</td><td>1476.66</td><td>2154.41</td><td>2835.45</td></tr> <tr><td>full handshakes, 1.2, ecdsa, client</td><td>3545.32</td><td>2201.16</td><td>4344.30</td></tr> <tr><td>full handshakes, 1.2, ecdsa, server</td><td>9057.57</td><td>5226.96</td><td>13,524.99</td></tr> <tr><td>full handshakes, 1.3, rsa, client</td><td>3179.48</td><td>2219.64</td><td>4766.63</td></tr> <tr><td>full handshakes, 1.3, rsa, server</td><td>1302.03</td><td>1712.64</td><td>2357.36</td></tr> <tr><td>full handshakes, 1.3, ecdsa, client</td><td>2364.8</td><td>1642.88</td><td>3160.39</td></tr> <tr><td>full handshakes, 1.3, ecdsa, server</td><td>4981.38</td><td>3176.58</td><td>6786.26</td></tr> <tr><td></td><td>BoringSSL 30cd935</td><td>OpenSSL 3.6.1</td><td>rustls 0.23.37</td></tr> <tr><td>resumed handshakes, 1.2, client</td><td>45,390.3</td><td>21,136.9</td><td>63,870.34</td></tr> <tr><td>resumed handshakes, 1.2, server</td><td>44,429</td><td>22,647.1</td><td>72,480.52</td></tr> <tr><td>resumed handshakes, 1.3, client</td><td>4648.41</td><td>3594.88</td><td>6735.88</td></tr> <tr><td>resumed handshakes, 1.3, server</td><td>5687.32</td><td>3780</td><td>7249.08</td></tr> </tbody></table> <p><img 
src="/2026-03-07-transfer.svg" alt="graph of transfer speeds" /></p> <p><img src="/2026-03-07-full-handshake.svg" alt="graph of full handshakes" /></p> <p><img src="/2026-03-07-resumed-handshake.svg" alt="graph of resumed handshakes" /></p> Rustls and the Rust Foundation's Rust Innovation Lab Wed, 03 Sep 2025 00:00:00 +0000 Joe Birr-Pixton https://rustls.dev/blog/2025-09-03-rustls-and-rust-foundation/ https://rustls.dev/blog/2025-09-03-rustls-and-rust-foundation/ <p>As you may have seen, <a href="https://rustfoundation.org/media/rust-foundation-launches-rust-innovation-lab-with-rustls-as-inaugural-project/">Rustls is the first project in the Rust Foundation's new Rust Innovation Lab program</a>.</p> <p>Rustls is a project I started in May 2016 as I learned Rust. Since then it has grown from a casual project, to having multiple maintainers, hundreds of contributors, funded contributions, and finally multiple funded maintainers.</p> <p>As a project it has expanded to incorporate adjacent efforts, such as an <a href="https://github.com/rustls/rustls-openssl-compat">OpenSSL-compatible API</a>, a <a href="https://github.com/rustls/rustls-ffi">C API</a>, <a href="https://github.com/rustls/rustls-platform-verifier">deep integration into platform-specific certificate verifiers</a> and integrations with important Rust ecosystem crates such as <a href="http://github.com/rustls/tokio-rustls">tokio</a> and <a href="http://github.com/rustls/hyper-rustls">hyper</a>.</p> <p>Now it supports a significant amount of the crates ecosystem and applications with <em>billions</em> of users.</p> <p>Giving the Rustls project an administrative and legal home is the next step in that development.</p> <p>Users will see no change in the project's direction or personnel. 
If you rely on Rustls in a commercial context, we would love to talk about how we can address your needs, and how we can work together to support the project long-term.</p> <p>We want to thank ISRG for its <a href="https://www.memorysafety.org/initiative/rustls/">significant and ongoing support</a>, and <a href="https://www.sovereign.tech/tech/rustls">Sovereign Tech Agency</a> for their recent funding of the project.</p> <h1 id="q-a">Q+A</h1> <h2 id="why-do-this-now">Why do this now?</h2> <p>We want to make it easier for potential funding sources to support the project. In conversations, it became clear that a well-defined legal and governance status would make the project more attractive to funders.</p> <h2 id="why-the-rust-foundation">Why the Rust Foundation?</h2> <p>We feel the Rust Foundation shares our goals in promoting the Rust language and serving its users.</p> <h2 id="does-this-mean-rustls-is-being-funded-by-my-organization-s-membership-of-the-rust-foundation">Does this mean Rustls is being funded by my organization's membership of the Rust Foundation?</h2> <p>No. Only funding specifically for Rustls will be made available to the Rustls project. This does not affect funding to the Rust Project or other initiatives funded by the Rust Foundation.</p> <h2 id="does-this-change-rustls-technical-direction-or-personnel">Does this change Rustls' technical direction or personnel?</h2> <p>It does not. 
The existing maintainers have complete and unequivocal control over the Rustls project and project roadmap.</p> Benchmarking rustls 0.23.31 vs OpenSSL 3.5.1 vs BoringSSL on x86_64 Thu, 31 Jul 2025 00:00:00 +0000 Unknown https://rustls.dev/perf/2025-07-31-report/ https://rustls.dev/perf/2025-07-31-report/ <h3 id="system-configuration">System configuration</h3> <p>We ran the benchmarks on a bare-metal server with the following characteristics:</p> <ul> <li>OS: Debian 12 (Bookworm).</li> <li>C/C++ toolchains: GCC 12.2.0 and Clang 14.0.6.</li> <li>CPU: <a href="https://www.intel.com/content/www/us/en/products/sku/214806/intel-xeon-e2386g-processor-12m-cache-3-50-ghz/specifications.html">Xeon E-2386G</a> (supporting AVX-512).</li> <li>Memory: 32GB.</li> <li>Extra configuration: hyper-threading disabled, dynamic frequency scaling disabled, CPU scaling governor set to <code>performance</code> for all cores.</li> </ul> <h3 id="versions">Versions</h3> <p>The benchmarking tool used for both OpenSSL and BoringSSL was <a href="https://github.com/ctz/openssl-bench/tree/82b86b22">openssl-bench 82b86b22</a>.</p> <p>This was built from source with its makefile.</p> <h4 id="boringssl">BoringSSL</h4> <p>The tested version of BoringSSL is <a href="https://github.com/google/boringssl/tree/0.20250701.0">0.20250701.0</a>, which was the most recent point on master when we started these measurements.</p> <p>BoringSSL was built from source with <code>CC=clang CXX=clang++ cmake -DCMAKE_BUILD_TYPE=Release</code>.</p> <h4 id="openssl">OpenSSL</h4> <p>The tested version of OpenSSL is <a href="https://github.com/openssl/openssl/tree/openssl-3.5.1">3.5.1</a>, which was the latest release at the time of writing.</p> <p>OpenSSL was built from source with <code>./Configure ; make -j12</code>.</p> <h4 id="rustls">Rustls</h4> <p>The tested version of rustls is <a href="https://github.com/rustls/rustls/releases/tag/v%2F0.23.31">0.23.31</a>, which was the latest release at the time of writing. 
This was used with aws-lc-rs 1.13.1 / aws-lc-sys 0.29.0.</p> <h3 id="measurements">Measurements</h3> <p>BoringSSL was tested with this command:</p> <pre data-lang="shell" style="background-color:#2b303b;color:#c0c5ce;" class="language-shell "><code class="language-shell" data-lang="shell"><span>~/bench/openssl-bench </span><span>$ BENCH_MULTIPLIER=16 setarch -R make measure BORINGSSL=1 </span></code></pre> <p>OpenSSL was tested with this command:</p> <pre data-lang="shell" style="background-color:#2b303b;color:#c0c5ce;" class="language-shell "><code class="language-shell" data-lang="shell"><span>~/bench/openssl-bench </span><span>$ BENCH_MULTIPLIER=16 setarch -R make measure </span></code></pre> <p>rustls was tested with this command:</p> <pre data-lang="shell" style="background-color:#2b303b;color:#c0c5ce;" class="language-shell "><code class="language-shell" data-lang="shell"><span>~/bench/rustls </span><span>$ BENCH_MULTIPLIER=16 setarch -R make -f admin/bench-measure.mk measure </span></code></pre> <h2 id="results">Results</h2> <p>Transfer measurements are in megabytes per second. 
Handshake units are handshakes per second.</p> <table><thead><tr><th></th><th>BoringSSL 0.20250701.0</th><th>OpenSSL 3.5.1</th><th>rustls 0.23.31</th></tr></thead><tbody> <tr><td>transfer, 1.2, aes-128-gcm, sending</td><td>8575.27</td><td>6565.22</td><td>8074.82</td></tr> <tr><td>transfer, 1.2, aes-128-gcm, receiving</td><td>6986.81</td><td>7219.67</td><td>7952.68</td></tr> <tr><td>transfer, 1.3, aes-256-gcm, sending</td><td>7739.61</td><td>6093.27</td><td>7628.68</td></tr> <tr><td>transfer, 1.3, aes-256-gcm, receiving</td><td>6421.36</td><td>6472.3</td><td>7407.83</td></tr> <tr><td></td><td>BoringSSL 0.20250701.0</td><td>OpenSSL 3.5.1</td><td>rustls 0.23.31</td></tr> <tr><td>full handshakes, 1.2, rsa, client</td><td>5375.06</td><td>3251.54</td><td>8206.33</td></tr> <tr><td>full handshakes, 1.2, rsa, server</td><td>1447.33</td><td>2169</td><td>2857.81</td></tr> <tr><td>full handshakes, 1.2, ecdsa, client</td><td>3454.89</td><td>2195.55</td><td>4345.05</td></tr> <tr><td>full handshakes, 1.2, ecdsa, server</td><td>9096.44</td><td>5178.02</td><td>13618.81</td></tr> <tr><td>full handshakes, 1.3, rsa, client</td><td>3125.36</td><td>2222.21</td><td>4187.28</td></tr> <tr><td>full handshakes, 1.3, rsa, server</td><td>1285.88</td><td>1714.24</td><td>2273.13</td></tr> <tr><td>full handshakes, 1.3, ecdsa, client</td><td>2344.76</td><td>1650.56</td><td>2884.83</td></tr> <tr><td>full handshakes, 1.3, ecdsa, server</td><td>5113.83</td><td>3183.26</td><td>6229.71</td></tr> <tr><td></td><td>BoringSSL 0.20250701.0</td><td>OpenSSL 3.5.1</td><td>rustls 0.23.31</td></tr> <tr><td>resumed handshakes, 1.2, client</td><td>47,509.5</td><td>19,936.5</td><td>65,617.35</td></tr> <tr><td>resumed handshakes, 1.2, server</td><td>46,561.8</td><td>21,043.1</td><td>74,771.51</td></tr> <tr><td>resumed handshakes, 1.3, client</td><td>4695.79</td><td>3574.86</td><td>5614.4</td></tr> <tr><td>resumed handshakes, 1.3, server</td><td>5803.03</td><td>3771.28</td><td>6623.94</td></tr> </tbody></table> 
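<p>The "<em>N</em>x faster" and "<em>N</em>x slower" factors quoted in the comparison tables in these reports are plain ratios of two absolute measurements in the same unit. As an illustrative sketch -- this helper is not part of openssl-bench or the rustls benchmark tooling -- the derivation is:</p>

```rust
/// Illustrative helper (not part of the benchmark tooling): turns two
/// absolute measurements in the same unit (MB/s or handshakes/s) into the
/// "Nx faster" / "Nx slower" factor quoted in the comparison tables.
/// The factor is always expressed as a ratio >= 1, with a direction label.
fn change_factor(old: f64, new: f64) -> (f64, &'static str) {
    if new >= old {
        (new / old, "faster")
    } else {
        (old / new, "slower")
    }
}
```

<p>For example, BoringSSL's AES-128-GCM send throughput moving from 5043.04 MB/s to 8575.27 MB/s works out to a factor of about 1.7x faster.</p>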
<p><img src="/2025-07-31-transfer.svg" alt="graph of transfer speeds" /></p> <p><img src="/2025-07-31-full-handshake.svg" alt="graph of full handshakes" /></p> <p><img src="/2025-07-31-resumed-handshake.svg" alt="graph of resumed handshakes" /></p> <h3 id="notable-changes-since-last-time">Notable changes since <a href="/perf/2024-10-18-report">last time</a></h3> <h4 id="post-quantum-key-exchange">Post-quantum key exchange</h4> <p>OpenSSL and rustls now use X25519MLKEM768 post-quantum key exchange by default. BoringSSL is configured to do the same. This applies to all TLS1.3 handshakes.</p> <table><thead><tr><th></th><th>old</th><th></th><th>new</th></tr></thead><tbody> <tr><td></td><td>BoringSSL 76968bb3</td><td>➡️</td><td>BoringSSL 0.20250701.0</td></tr> <tr><td>full handshakes, 1.3, rsa, client</td><td>4813.91 hs/s</td><td>1.54x slower</td><td>3125.36 hs/s</td></tr> <tr><td></td><td>OpenSSL 3.3.2</td><td>➡️</td><td>OpenSSL 3.5.1</td></tr> <tr><td>full handshakes, 1.3, rsa, client</td><td>2788.76 hs/s</td><td>1.25x slower</td><td>2222.21 hs/s</td></tr> <tr><td></td><td>rustls 0.23.15</td><td>➡️</td><td>rustls 0.23.31</td></tr> <tr><td>full handshakes, 1.3, rsa, client</td><td>6803.93 hs/s</td><td>1.62x slower</td><td>4187.28 hs/s</td></tr> </tbody></table> <h4 id="boringssl-avx-512-aes-gcm">BoringSSL AVX-512 AES-GCM</h4> <p>BoringSSL now has AVX512-accelerated AES-GCM. Since last time, that looks like:</p> <table><thead><tr><th></th><th>old</th><th></th><th>new</th></tr></thead><tbody> <tr><td></td><td>BoringSSL 76968bb3</td><td>➡️</td><td>BoringSSL 0.20250701.0</td></tr> <tr><td>transfer, 1.2, aes-128-gcm, sending</td><td>5043.04 MB/s</td><td>1.7x faster</td><td>8575.27 MB/s</td></tr> </tbody></table> <h4 id="rustls-extension-optimizations">rustls extension optimizations</h4> <p>We spent some time improving our internal representation for TLS extensions. This applied to clients and servers, and all TLS versions. 
But it's most visible here in TLS1.2 performance because there aren't any cryptography changes masking it.</p> <table><thead><tr><th></th><th>old</th><th></th><th>new</th></tr></thead><tbody> <tr><td></td><td>rustls 0.23.15</td><td>➡️</td><td>rustls 0.23.31</td></tr> <tr><td>resumed handshakes, 1.2, client</td><td>64,722.55 hs/s</td><td>1.02x faster</td><td>65,617.35 hs/s</td></tr> <tr><td>resumed handshakes, 1.2, server</td><td>71,149.91 hs/s</td><td>1.05x faster</td><td>74,771.51 hs/s</td></tr> </tbody></table> Measuring and (slightly) Improving Post-Quantum Handshake Performance Tue, 17 Dec 2024 00:00:00 +0000 Unknown https://rustls.dev/perf/2024-12-17-pq-kx/ https://rustls.dev/perf/2024-12-17-pq-kx/ <p>To defend against the potential advent of "Cryptographically Relevant Quantum Computers" there is a move to using "hybrid" key exchange algorithms. These glue together a widely-deployed classical algorithm (like <a href="https://datatracker.ietf.org/doc/html/rfc7748">X25519</a>) and a new post-quantum-secure algorithm (like <a href="https://csrc.nist.gov/pubs/fips/203/final">ML-KEM</a>) and treat the result as one TLS-level key exchange algorithm (like <a href="https://datatracker.ietf.org/doc/draft-ietf-tls-ecdhe-mlkem/">X25519MLKEM768</a>).</p> <p>In this report, first we'll measure the additional cost of post-quantum-secure key exchange. Then we'll describe and measure an optimization we have implemented.</p> <h1 id="headline-measurements">Headline measurements</h1> <p>All these measurements are taken on our amd64 benchmarking machine, which has a <a href="https://www.intel.com/content/www/us/en/products/sku/214806/intel-xeon-e2386g-processor-12m-cache-3-50-ghz/specifications.html">Xeon E-2386G</a> CPU. 
We'll compare:</p> <ul> <li>rustls using post-quantum-insecure X25519 key exchange,</li> <li>rustls using post-quantum-secure X25519MLKEM768 key exchange, and</li> <li>OpenSSL 3.3.2 using post-quantum-insecure X25519 key exchange.</li> </ul> <p>All three are taken on the same hardware, and the latter measurements are from <a href="https://rustls.dev/perf/2024-10-18-report/">our previous report</a> -- which also contains reproduction instructions and describes what the benchmarks measure.</p> <p>One important thing to note is that post-quantum key exchange involves sending and receiving much larger messages than classical ones. Our benchmark design only covers CPU costs -- and does not include networking -- so real-world performance will be worse than these measurements.</p> <p><img src="https://rustls.dev/perf/2024-12-17-pq-kx/tls13-client-hs-openssl.svg" alt="client handshake performance results on amd64 architecture" /></p> <p><img src="https://rustls.dev/perf/2024-12-17-pq-kx/tls13-server-hs-openssl.svg" alt="server handshake performance results on amd64 architecture" /></p> <p>The cost of X25519MLKEM768 post-quantum key exchange is clearly visible for both clients and servers.</p> <p>We can see that the performance headroom that rustls has attained means we can <em>almost</em> completely absorb the extra cost of post-quantum key exchange, while still performing better than (post-quantum-insecure) OpenSSL -- with the exception of client resumption.</p> <p>We will do further comparative benchmarking in this area when OpenSSL gains post-quantum key exchange support.</p> <h1 id="sharing-x25519-setup-costs">Sharing X25519 setup costs</h1> <h2 id="background">Background</h2> <p>In TLS1.3, the client starts the key exchange in its first message (the <code>ClientHello</code>). 
The <code>ClientHello</code> includes both a description of which algorithms the client supports, and zero or more presumptive "key shares".</p> <p>The server then evaluates which algorithms it is willing to use, and either uses one of the presumptive key shares, or replies with a <code>HelloRetryRequest</code> which instructs the client to send a new <code>ClientHello</code> with a specific, mutually-acceptable key share.</p> <p>A <code>HelloRetryRequest</code> can be expensive, because it introduces an additional round trip into the handshake. It also means any work the client did for its presumptive key shares is wasted.</p> <p>It's therefore advantageous for a client to avoid <code>HelloRetryRequest</code>s, by:</p> <ul> <li> <p><strong>Having prior knowledge of the server's preferences.</strong> <a href="https://datatracker.ietf.org/doc/draft-ietf-tls-key-share-prediction/">draft-ietf-tls-key-share-prediction</a> is an effort to standardize a mechanism for a client to learn this out-of-band.</p> </li> <li> <p><strong>Remembering a server's preferences from a previous connection.</strong> rustls has done this since adding support for TLS1.3 in 2017. 
This generally means a client making many connections to one server may avoid repeated <code>HelloRetryRequest</code>s.</p> </li> <li> <p><strong>Sending many presumptive key shares.</strong> Though there's an obvious trade-off in terms of wasted computation and message size.</p> </li> <li> <p><strong>Following ecosystem preferences.</strong> <a href="https://datatracker.ietf.org/doc/html/rfc7748">X25519</a> key exchange is overwhelmingly preferred in TLS1.3 implementations, due to its performance and implementation quality.</p> </li> </ul> <p>The key shares in a <code>ClientHello</code> would look like:</p> <p><img src="https://rustls.dev/perf/2024-12-17-pq-kx/hybrid-only.svg" alt="diagram of TLS1.3 client key exchange with X25519MLKEM768" /></p> <p>At least for a transitional period, we want to avoid a <code>HelloRetryRequest</code> round trip when connecting to a server that hasn't been upgraded to support X25519MLKEM768. That means also offering a separate X25519 key share:</p> <p><img src="https://rustls.dev/perf/2024-12-17-pq-kx/hybrid-both.svg" alt="diagram of TLS1.3 client key exchange with X25519MLKEM768 and X25519" /></p> <p>However, this arrangement is not optimal. 
While X25519 setup is very fast, we are doing it twice and then we are guaranteed to throw away half of that work, because the server can only ever select one key share to use.</p> <p>Instead, we can do:</p> <p><img src="https://rustls.dev/perf/2024-12-17-pq-kx/hybrid-opt.svg" alt="diagram of TLS1.3 optimized client key exchange with X25519MLKEM768 and X25519" /></p> <p>This report measures the benefit of that optimization.</p> <p>This optimization is described further in <a href="https://www.ietf.org/archive/id/draft-ietf-tls-hybrid-design-11.html#name-transmitting-public-keys-an">draft-ietf-tls-hybrid-design</a> section 3.2.</p> <h2 id="micro-benchmarking">Micro benchmarking</h2> <p>First, we can micro-benchmark the time to construct and serialize a <code>ClientHello</code>, in a variety of situations:</p> <ul> <li>X25519 key share included only.</li> <li>X25519MLKEM768 and X25519 key shares, with the optimization.</li> <li>X25519MLKEM768 and X25519 key shares, without the optimization.</li> </ul> <p>We run this on two machines that cover both amd64 (Xeon E-2386G) and aarch64 (Ampere Altra Q80-30) architectures.</p> <p><img src="https://rustls.dev/perf/2024-12-17-pq-kx/microbench-amd64.svg" alt="micro benchmark results on amd64 architecture" /></p> <p><img src="https://rustls.dev/perf/2024-12-17-pq-kx/microbench-arm64.svg" alt="micro benchmark results on arm64 architecture" /></p> <p>From this we can see:</p> <ul> <li>There is a small but measurable benefit, as expected.</li> <li>ML-KEM-768 key generation costs are significantly more expensive than X25519.</li> </ul> <h2 id="whole-handshakes">Whole handshakes</h2> <p>Next, let's measure the same scenarios in the context of whole client handshakes. 
The remaining measurements are only done on our amd64 benchmark machine.</p> <p>The above optimization only affects the client's first message, so now we'll see whether the effect of the optimization is meaningful when compared to the rest of the computation a client must do.</p> <p><img src="https://rustls.dev/perf/2024-12-17-pq-kx/tls13-client-hs.svg" alt="client handshake performance results on amd64 architecture" /></p> <p>The difference is visible but small, as it has been diluted by other parts of the handshake. It is approximately 4.3% for resumptions, 2.8% for full RSA handshakes, and 2.6% for ECDSA handshakes.</p> Measuring and Improving rustls's Multithreaded Performance Thu, 28 Nov 2024 00:00:00 +0000 Unknown https://rustls.dev/perf/2024-11-28-threading/ https://rustls.dev/perf/2024-11-28-threading/ <h3 id="system-configuration">System configuration</h3> <p>We ran the benchmarks on a bare-metal server with the following characteristics:</p> <ul> <li>OS: Debian 12 (Bookworm).</li> <li>C/C++ toolchains: GCC 12.2.0 and Clang 14.0.6.</li> <li>CPU: Ampere Altra Q80-30, 80 cores.</li> <li>Memory: 128GB.</li> <li>All cores set to <code>performance</code> CPU frequency governor.</li> </ul> <p>This is the <a href="https://www.hetzner.com/dedicated-rootserver/matrix-rx/">Hetzner RX170</a>.</p> <h3 id="how-the-benchmarks-work">How the benchmarks work</h3> <p>Compared to <a href="/perf">previous reports</a>, the benchmark tool can now perform the same benchmarks in many threads simultaneously. Each thread runs the same benchmarking operation as before, and threads do not contend with each other <em>except</em> via the internals of the TLS library.</p> <p>As before, the benchmarking is performed by measuring a TLS client "connecting" to a TLS server over a memory buffer -- there is no network latency, system calls, or other overhead that would be present in a typical networked application. 
This arrangement actually should be the worst case for multithreaded testing: every thread should be working all the time (rather than waiting for IO) and therefore contention on any locks in the library under test should be maximal.</p> <h3 id="versions">Versions</h3> <p>The benchmarking tool used for both OpenSSL and BoringSSL was <a href="https://github.com/ctz/openssl-bench/tree/02249496d2963e2a0b694e7be3b37f0d85f8eccd">openssl-bench <code>02249496</code></a>.</p> <h4 id="boringssl-and-openssl">BoringSSL and OpenSSL</h4> <p>The version of BoringSSL and its build instructions are the same as <a href="https://rustls.dev/perf/2024-10-18-report/">the previous report</a>.</p> <p>We test OpenSSL 3.4.0 which is the latest release at the time of writing.</p> <p>We also include measurements using OpenSSL 3.0.14, as packaged by Debian as 3.0.14-1~deb12u2. This is included to observe the thread scalability improvements made between OpenSSL 3.0 and 3.4.</p> <h4 id="rustls">Rustls</h4> <p>The tested version of rustls was 0.23.16, which was the latest release when this work was started. 
This was used with aws-lc-rs 1.10.0 / aws-lc-sys 0.22.0.</p> <p>Additionally the following three commits were included, which affect the benchmark tool but do not affect the core crate:</p> <ul> <li><a href="https://github.com/rustls/rustls/commit/a5d510ea4e5a44611f49985bbaba84b6c4f51533">a5d510ea</a></li> <li><a href="https://github.com/rustls/rustls/commit/44522ad089add58bc7df54ec9903528ab6d5f64f">44522ad0</a></li> <li><a href="https://github.com/rustls/rustls/commit/d1c33f8641c1c69edc27d98047c38f7f852f55eb">d1c33f86</a></li> </ul> <h3 id="measurements">Measurements</h3> <p>BoringSSL was tested with this command:</p> <pre data-lang="shell" style="background-color:#2b303b;color:#c0c5ce;" class="language-shell "><code class="language-shell" data-lang="shell"><span>~/bench/openssl-bench </span><span>$ BENCH_MULTIPLIER=2 setarch -R make threads BORINGSSL=1 </span></code></pre> <p>OpenSSL 3.4.0 was tested with this command:</p> <pre data-lang="shell" style="background-color:#2b303b;color:#c0c5ce;" class="language-shell "><code class="language-shell" data-lang="shell"><span>~/bench/openssl-bench </span><span>$ BENCH_MULTIPLIER=2 setarch -R make threads </span></code></pre> <p>OpenSSL 3.0.14 was tested with this command:</p> <pre data-lang="shell" style="background-color:#2b303b;color:#c0c5ce;" class="language-shell "><code class="language-shell" data-lang="shell"><span>~/bench/openssl-bench </span><span>$ BENCH_MULTIPLIER=2 setarch -R make threads HOST_OPENSSL=1 </span></code></pre> <p>rustls was tested with this command:</p> <pre data-lang="shell" style="background-color:#2b303b;color:#c0c5ce;" class="language-shell "><code class="language-shell" data-lang="shell"><span>~/bench/rustls </span><span>$ BENCH_MULTIPLIER=2 setarch -R make -f admin/bench-measure.mk threads </span></code></pre> <h2 id="initial-results">Initial results</h2> <p>Since we now have three dimensions of measurement (implementation, thread count, test case), this is restricted to some selected test 
cases.</p> <p>We concentrate on server performance, as servers seem more likely to be run at large concurrencies.</p> <p>The graphs below are handshakes per second <em>per thread</em>. The ideal and expected shape should be a flat, horizontal line up to 80 threads, with a fall-off after that. A flat line means doubling the number of threads doubles the throughput; a declining line would mean less than that, and indicates additional threads reduce the per-thread performance.</p> <h3 id="full-tls1-3-handshakes-on-the-server">Full TLS1.3 handshakes on the server</h3> <p>This case is a common one for a TLS server, and excludes any need for state on the server side.</p> <p><img src="https://rustls.dev/perf/2024-11-28-threading/full-server.svg" alt="graph of TLS1.3 full handshake server performance" /></p> <p>This graph shows the OpenSSL 3 scalability problems, the improvements made between OpenSSL 3.0 and 3.4, and the absence of such problems in BoringSSL and rustls.</p> <h3 id="resumed-tls1-2-handshakes-on-the-server">Resumed TLS1.2 handshakes on the server</h3> <p>This covers both ticket-based resumption, and session-id-based resumption. 
The latter requires the server to maintain a cache across threads, so some "droop" is expected in these traces.</p> <p><img src="https://rustls.dev/perf/2024-11-28-threading/resumed-12-server.svg" alt="graph of TLS1.2 resumed handshake server performance" /></p> <p>Clearly something is not right with rustls's ticket resumption performance.</p> <h3 id="resumed-tls1-3-handshakes-on-the-server">Resumed TLS1.3 handshakes on the server</h3> <p>This covers only ticket-based resumption, so the server needs no state between threads.</p> <p><img src="https://rustls.dev/perf/2024-11-28-threading/resumed-13-server.svg" alt="graph of TLS1.3 resumed handshake server performance" /></p> <p>Again, rustls's ticket resumption performance is suspect.</p> <h1 id="improving-rustls-s-ticket-resumption-performance">Improving rustls's ticket resumption performance</h1> <p>In rustls we have a <code>TicketSwitcher</code>, which is responsible for rotating between ticket keys periodically. This is important because (especially in TLS1.2) compromise of the ticket key destroys the security of all past and future sessions.</p> <p>Unfortunately, a mutex was held around all ticket operations -- so that a thread finding the key needed to be rotated did not race another thread currently using that key.</p> <p>In <a href="https://github.com/rustls/rustls/pull/2193">PR #2193</a> this was improved to use an rwlock -- this means there is no contention at all in the common case.</p> <p>The improvement was released in <a href="https://github.com/rustls/rustls/releases/tag/v%2F0.23.17">rustls 0.23.17</a> and looks like:</p> <p><img src="https://rustls.dev/perf/2024-11-28-threading/resumed-12-server-postfix.svg" alt="graph of TLS1.2 resumed handshake server performance" /></p> <p><img src="https://rustls.dev/perf/2024-11-28-threading/resumed-13-server-postfix.svg" alt="graph of TLS1.3 resumed handshake server performance" /></p> <h1 id="measuring-worst-case-latency-at-high-concurrency">Measuring worst-case 
latency at high concurrency</h1> <p>The above measurements only record <em>average handshake throughput per thread</em>, which is fine for seeing how that changes with the number of threads. Another important measure is: what is the latency distribution of all handshakes performed, at high thread counts? What we're looking for here is evidence that no one handshake experiences poor performance.</p> <p>From here on, the version of rustls tested moved to <a href="https://github.com/rustls/rustls/commit/fc6b4a193b065604d10e16e79d601d8a30c18492"><code>fc6b4a19</code></a> (which is shortly after the 0.23.18 release) to add support for outputting individual handshake latencies in the benchmark tool.</p> <p>In the following test, we measure handshake latency when 80 threads are working at once. In the rustls test, this produces very satisfying <code>htop</code> output:</p> <p><img src="https://rustls.dev/perf/2024-11-28-threading/htop-80-99.png" alt="htop output, showing 80 cores at high utilization" /></p> <p>Note that the below charts use a base 10 logarithm x axis, to adequately show the four distributions in one figure.</p> <p><img src="https://rustls.dev/perf/2024-11-28-threading/latency-resume-tls12-server.svg" alt="graph of TLS1.2 resumed handshake server latency" /></p> <p>Here we can clearly see an improvement between OpenSSL 3.0 and OpenSSL 3.4. rustls has the tightest distribution, followed by BoringSSL. 
(OpenSSL 3.4's distribution may appear visually tighter than BoringSSL's, but the scale is logarithmic, so it is not.)</p> <p><img src="https://rustls.dev/perf/2024-11-28-threading/latency-resume-tls13-server.svg" alt="graph of TLS1.3 resumed handshake server latency" /></p> <p><img src="https://rustls.dev/perf/2024-11-28-threading/latency-fullhs-tls12-server.svg" alt="graph of TLS1.2 full handshake server latency" /></p> <p><img src="https://rustls.dev/perf/2024-11-28-threading/latency-fullhs-tls13-server.svg" alt="graph of TLS1.3 full handshake server latency" /></p> <p>The remaining charts show the same pattern: rustls has a tight distribution, followed by BoringSSL, followed by OpenSSL 3.4. OpenSSL 3.0 has the widest distribution.</p> Benchmarking rustls 0.23.15 vs OpenSSL 3.3.2 vs BoringSSL on ARM64 Thu, 31 Oct 2024 00:00:00 +0000 Unknown https://rustls.dev/perf/2024-10-31-arm64/ https://rustls.dev/perf/2024-10-31-arm64/ <h3 id="system-configuration">System configuration</h3> <p>We ran the benchmarks on a bare-metal server with the following characteristics:</p> <ul> <li>OS: Debian 12 (Bookworm).</li> <li>C/C++ toolchains: GCC 12.2.0 and Clang 14.0.6.</li> <li>CPU: Ampere Altra Q80-30, 80 cores.</li> <li>Memory: 128GB.</li> <li>All cores set to the <code>performance</code> CPU frequency governor.</li> </ul> <p>This is the <a href="https://www.hetzner.com/dedicated-rootserver/matrix-rx/">Hetzner RX170</a>.</p> <h3 id="versions">Versions</h3> <p>The benchmarking tool used for both OpenSSL and BoringSSL was <a href="https://github.com/ctz/openssl-bench/tree/d5de57d92d483169cabf8ec22c351fe3819ba656">openssl-bench d5de57d9</a>.</p> <p>This was built from source with its makefile.</p> <h4 id="boringssl">BoringSSL</h4> <p>The tested version of BoringSSL is <a href="https://github.com/google/boringssl/tree/76968bb3d5">76968bb3d5</a>, which was the most recent point on master when we started <a href="https://rustls.dev/perf/2024-10-18-report/">the previous measurements</a>.</p>
<p>BoringSSL was built from source with <code>CC=clang CXX=clang++ cmake -DCMAKE_BUILD_TYPE=Release</code>. clang is used here to <a href="https://issues.chromium.org/issues/42290529">avoid potential performance deficits relative to GCC</a> and for consistency with the x86 results.</p> <h4 id="openssl">OpenSSL</h4> <p>The tested version of OpenSSL is <a href="https://github.com/openssl/openssl/tree/openssl-3.3.2">3.3.2</a>, which was the latest release at the time of writing.</p> <p>OpenSSL was built from source with <code>./Configure ; make -j12</code>.</p> <h4 id="rustls">Rustls</h4> <p>The tested version of rustls was 0.23.15, which was the latest release at the time of writing. This was used with aws-lc-rs 1.10.0 / aws-lc-sys 0.22.0.</p> <p>Additionally, the following two commits were included; they affect the benchmark tool but not the core crate:</p> <ul> <li><a href="https://github.com/rustls/rustls/commit/13144a0aa391bbec55aa92ee020e88c2bb8c3ea8">13144a0a</a></li> <li><a href="https://github.com/rustls/rustls/commit/b553880a5f5caf58bbd2c43e4031e8c55d6da486">b553880a</a></li> </ul> <h3 id="measurements">Measurements</h3> <p>BoringSSL was tested with this command:</p> <pre data-lang="shell" style="background-color:#2b303b;color:#c0c5ce;" class="language-shell "><code class="language-shell" data-lang="shell"><span>~/bench/openssl-bench </span><span>$ BENCH_MULTIPLIER=16 setarch -R make measure BORINGSSL=1 </span></code></pre> <p>OpenSSL was tested with this command:</p> <pre data-lang="shell" style="background-color:#2b303b;color:#c0c5ce;" class="language-shell "><code class="language-shell" data-lang="shell"><span>~/bench/openssl-bench </span><span>$ BENCH_MULTIPLIER=16 setarch -R make measure </span></code></pre> <p>rustls was tested with this command:</p> <pre data-lang="shell" style="background-color:#2b303b;color:#c0c5ce;" class="language-shell "><code class="language-shell" data-lang="shell"><span>~/bench/rustls </span><span>$ BENCH_MULTIPLIER=16 
setarch -R make -f admin/bench-measure.mk measure </span></code></pre> <h2 id="results">Results</h2> <p>Transfer measurements are in megabytes per second. Handshake units are handshakes per second.</p> <table><thead><tr><th></th><th>BoringSSL 76968bb3</th><th>OpenSSL 3.3.2</th><th>rustls 0.23.15</th></tr></thead><tbody> <tr><td>transfer, 1.2, aes-128-gcm, sending</td><td>2211.53</td><td>2101.23</td><td>2077.19</td></tr> <tr><td>transfer, 1.2, aes-128-gcm, receiving</td><td>2250.93</td><td>2344.94</td><td>2173.4</td></tr> <tr><td>transfer, 1.3, aes-256-gcm, sending</td><td>1886.17</td><td>1741.07</td><td>1809.8</td></tr> <tr><td>transfer, 1.3, aes-256-gcm, receiving</td><td>1899.72</td><td>1953.49</td><td>1935.8</td></tr> <tr><td></td><td>BoringSSL 76968bb3</td><td>OpenSSL 3.3.2</td><td>rustls 0.23.15</td></tr> <tr><td>full handshakes, 1.2, rsa, client</td><td>1968.07</td><td>1588.54</td><td>4498.42</td></tr> <tr><td>full handshakes, 1.2, rsa, server</td><td>334.077</td><td>319.886</td><td>614.27</td></tr> <tr><td>full handshakes, 1.2, ecdsa, client</td><td>1527.73</td><td>1118.56</td><td>2154.06</td></tr> <tr><td>full handshakes, 1.2, ecdsa, server</td><td>3636.48</td><td>2950.54</td><td>8303.67</td></tr> <tr><td>full handshakes, 1.3, rsa, client</td><td>1861.15</td><td>1441.86</td><td>3986.81</td></tr> <tr><td>full handshakes, 1.3, rsa, server</td><td>330.484</td><td>312.446</td><td>599.39</td></tr> <tr><td>full handshakes, 1.3, ecdsa, client</td><td>1459.64</td><td>1045.98</td><td>2032.11</td></tr> <tr><td>full handshakes, 1.3, ecdsa, server</td><td>3252.58</td><td>2440.25</td><td>6212.45</td></tr> <tr><td></td><td>BoringSSL 76968bb3</td><td>OpenSSL 3.3.2</td><td>rustls 0.23.15</td></tr> <tr><td>resumed handshakes, 1.2, client</td><td>45452.2</td><td>18396.5</td><td>65267.61</td></tr> <tr><td>resumed handshakes, 1.2, server</td><td>43356.5</td><td>20426.4</td><td>65313.22</td></tr> <tr><td>resumed handshakes, 1.3, 
client</td><td>3969.88</td><td>3282.14</td><td>8443.11</td></tr> <tr><td>resumed handshakes, 1.3, server</td><td>3791.21</td><td>3071.61</td><td>7841.35</td></tr> </tbody></table> <p><img src="/2024-10-31-transfer.svg" alt="graph of transfer speeds" /></p> <p><img src="/2024-10-31-full-handshake.svg" alt="graph of full handshakes" /></p> <p><img src="/2024-10-31-resumed-handshake.svg" alt="graph of resumed handshakes" /></p> <h3 id="observations-on-results">Observations on results</h3> <p>rustls trails a little in throughput tests. The three underlying cryptography libraries (BoringSSL, aws-lc, OpenSSL) have their own benchmarking tools which confirm that there is little variance between them:</p> <pre data-lang="shell" style="background-color:#2b303b;color:#c0c5ce;" class="language-shell "><code class="language-shell" data-lang="shell"><span>~/bench/boringssl </span><span>$ LD_LIBRARY_PATH=. ./tool/bssl speed -filter AES-256-GCM </span><span>(...) </span><span>Did 139000 AES-256-GCM (16384 bytes) seal operations in 1004138us (138427.2 ops/sec): 2268.0 MB/s </span><span>(...) </span></code></pre> <pre data-lang="shell" style="background-color:#2b303b;color:#c0c5ce;" class="language-shell "><code class="language-shell" data-lang="shell"><span>~/bench/aws-lc </span><span>$ LD_LIBRARY_PATH=. ./tool/bssl speed -filter AES-256-GCM </span><span>(...) </span><span>Did 139000 EVP-AES-256-GCM encrypt (16384 bytes) operations in 1004522us (138374.3 ops/sec): 2267.1 MB/s </span><span>(...) </span></code></pre> <pre data-lang="shell" style="background-color:#2b303b;color:#c0c5ce;" class="language-shell "><code class="language-shell" data-lang="shell"><span>~/bench/openssl </span><span>$ LD_LIBRARY_PATH=. ./apps/openssl speed -aead -evp aes-256-gcm </span><span>(...) </span><span>Doing AES-256-GCM ops for 3s on 16384 size blocks: 434715 AES-256-GCM ops in 3.00s </span><span>(...) </span><span>The &#39;numbers&#39; are in 1000s of bytes per second processed. 
</span><span>type 2 bytes 31 bytes 136 bytes 1024 bytes 8192 bytes 16384 bytes </span><span>AES-256-GCM 13570.29k 168865.38k 623050.41k 1766296.58k 2320760.83k 2374123.52k </span></code></pre> <p>That is 2268, 2267 and 2264 MB/s for BoringSSL, aws-lc and OpenSSL respectively. Given these projects' shared lineage, it would not be surprising if the implementations were the same.</p> Benchmarking rustls 0.23.15 vs OpenSSL 3.3.2 vs BoringSSL on x86_64 Fri, 18 Oct 2024 00:00:00 +0000 Unknown https://rustls.dev/perf/2024-10-18-report/ https://rustls.dev/perf/2024-10-18-report/ <h3 id="system-configuration">System configuration</h3> <p>We ran the benchmarks on a bare-metal server with the following characteristics:</p> <ul> <li>OS: Debian 12 (Bookworm).</li> <li>C/C++ toolchains: GCC 12.2.0 and Clang 14.0.6.</li> <li>CPU: <a href="https://www.intel.com/content/www/us/en/products/sku/214806/intel-xeon-e2386g-processor-12m-cache-3-50-ghz/specifications.html">Xeon E-2386G</a> (supporting AVX-512).</li> <li>Memory: 32GB.</li> <li>Extra configuration: hyper-threading disabled, dynamic frequency scaling disabled, CPU scaling governor set to performance for all cores.</li> </ul> <h3 id="versions">Versions</h3> <p>The benchmarking tool used for both OpenSSL and BoringSSL was <a href="https://github.com/ctz/openssl-bench/tree/d5de57d92d483169cabf8ec22c351fe3819ba656">openssl-bench d5de57d9</a>.</p> <p>This was built from source with its makefile.</p> <h4 id="boringssl">BoringSSL</h4> <p>The tested version of BoringSSL is <a href="https://github.com/google/boringssl/tree/76968bb3d5">76968bb3d5</a>, which was the most recent point on master when we started these measurements.</p> <p>BoringSSL was built from source with <code>CC=clang CXX=clang++ cmake -DCMAKE_BUILD_TYPE=Release</code>. 
clang is used here to <a href="https://issues.chromium.org/issues/42290529">avoid potential performance deficits relative to GCC</a>.</p> <h4 id="openssl">OpenSSL</h4> <p>The tested version of OpenSSL is <a href="https://github.com/openssl/openssl/tree/openssl-3.3.2">3.3.2</a>, which was the latest release at the time of writing.</p> <p>OpenSSL was built from source with <code>./Configure ; make -j12</code>.</p> <h4 id="rustls">Rustls</h4> <p>The tested version of rustls was 0.23.15, which was the latest release at the time of writing. This was used with aws-lc-rs 1.10.0 / aws-lc-sys 0.22.0.</p> <p>Additionally, the following two commits were included; they affect the benchmark tool but not the core crate:</p> <ul> <li><a href="https://github.com/rustls/rustls/commit/13144a0aa391bbec55aa92ee020e88c2bb8c3ea8">13144a0a</a></li> <li><a href="https://github.com/rustls/rustls/commit/b553880a5f5caf58bbd2c43e4031e8c55d6da486">b553880a</a></li> </ul> <h3 id="measurements">Measurements</h3> <p>BoringSSL was tested with this command:</p> <pre data-lang="shell" style="background-color:#2b303b;color:#c0c5ce;" class="language-shell "><code class="language-shell" data-lang="shell"><span>~/bench/openssl-bench </span><span>$ BENCH_MULTIPLIER=16 setarch -R make measure BORINGSSL=1 </span></code></pre> <p>OpenSSL was tested with this command:</p> <pre data-lang="shell" style="background-color:#2b303b;color:#c0c5ce;" class="language-shell "><code class="language-shell" data-lang="shell"><span>~/bench/openssl-bench </span><span>$ BENCH_MULTIPLIER=16 setarch -R make measure </span></code></pre> <p>rustls was tested with this command:</p> <pre data-lang="shell" style="background-color:#2b303b;color:#c0c5ce;" class="language-shell "><code class="language-shell" data-lang="shell"><span>~/bench/rustls </span><span>$ BENCH_MULTIPLIER=16 setarch -R make -f admin/bench-measure.mk measure </span></code></pre> <h2 id="results">Results</h2> <p>Transfer measurements are in megabytes per second. 
Handshake units are handshakes per second.</p> <table><thead><tr><th></th><th>BoringSSL 76968bb3</th><th>OpenSSL 3.3.2</th><th>rustls 0.23.15</th></tr></thead><tbody> <tr><td>transfer, 1.2, aes-128-gcm, sending</td><td>5043.04</td><td>6560.79</td><td>8154.27</td></tr> <tr><td>transfer, 1.2, aes-128-gcm, receiving</td><td>4429.26</td><td>7192.17</td><td>7436.88</td></tr> <tr><td>transfer, 1.3, aes-256-gcm, sending</td><td>4332.5</td><td>5982.18</td><td>7094.13</td></tr> <tr><td>transfer, 1.3, aes-256-gcm, receiving</td><td>3872.34</td><td>6521.2</td><td>7278.15</td></tr> <tr><td></td><td>BoringSSL 76968bb3</td><td>OpenSSL 3.3.2</td><td>rustls 0.23.15</td></tr> <tr><td>full handshakes, 1.2, rsa, client</td><td>5470.01</td><td>3201.92</td><td>8227.61</td></tr> <tr><td>full handshakes, 1.2, rsa, server</td><td>1449.65</td><td>2159.59</td><td>2829.04</td></tr> <tr><td>full handshakes, 1.2, ecdsa, client</td><td>3451.51</td><td>2071.74</td><td>4369.39</td></tr> <tr><td>full handshakes, 1.2, ecdsa, server</td><td>9115.04</td><td>5196.8</td><td>12921.68</td></tr> <tr><td>full handshakes, 1.3, rsa, client</td><td>4813.91</td><td>2788.76</td><td>6803.93</td></tr> <tr><td>full handshakes, 1.3, rsa, server</td><td>1386.06</td><td>1913.38</td><td>2544.31</td></tr> <tr><td>full handshakes, 1.3, ecdsa, client</td><td>3177.49</td><td>1859.77</td><td>3937.7</td></tr> <tr><td>full handshakes, 1.3, ecdsa, server</td><td>7107.86</td><td>3938.47</td><td>8325.74</td></tr> <tr><td></td><td>BoringSSL 76968bb3</td><td>OpenSSL 3.3.2</td><td>rustls 0.23.15</td></tr> <tr><td>resumed handshakes, 1.2, client</td><td>45547.6</td><td>20703.8</td><td>64722.55</td></tr> <tr><td>resumed handshakes, 1.2, server</td><td>43985.3</td><td>22268.1</td><td>71149.91</td></tr> <tr><td>resumed handshakes, 1.3, client</td><td>9818.4</td><td>5328.6</td><td>10912.87</td></tr> <tr><td>resumed handshakes, 1.3, server</td><td>8600.76</td><td>4866.2</td><td>9500.11</td></tr> </tbody></table> <p><img 
src="/2024-10-18-transfer.png" alt="graph of transfer speeds" /></p> <p><img src="/2024-10-18-full-handshake.png" alt="graph of full handshakes" /></p> <p><img src="/2024-10-18-resumed-handshake.png" alt="graph of resumed handshakes" /></p> <h3 id="observations-on-results">Observations on results</h3> <p>AVX-512 support shows up twice in these results:</p> <ul> <li>rustls/aws-lc and OpenSSL's performance advantage in throughput tests is due to use of AVX-512F/VAES.</li> <li>rustls/aws-lc and OpenSSL's performance advantage in server-side full handshake tests is due to use of AVX-512IFMA-accelerated RSA.</li> </ul> <p>This support was contributed to the respective projects by Intel.</p> <p>TLS1.3 resumption is slower than TLS1.2 resumption because it includes a fresh key exchange.</p>
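<p>As a back-of-the-envelope illustration of that last point, the resumed-handshake rows from the table above can be compared directly. The sketch below is not part of the original report; the rates are copied from the x86_64 results table:</p>

```python
# Resumed-handshake rates (handshakes/sec), from the x86_64 results table.
resumed = {
    "BoringSSL": {("1.2", "client"): 45547.6,  ("1.2", "server"): 43985.3,
                  ("1.3", "client"): 9818.4,   ("1.3", "server"): 8600.76},
    "OpenSSL":   {("1.2", "client"): 20703.8,  ("1.2", "server"): 22268.1,
                  ("1.3", "client"): 5328.6,   ("1.3", "server"): 4866.2},
    "rustls":    {("1.2", "client"): 64722.55, ("1.2", "server"): 71149.91,
                  ("1.3", "client"): 10912.87, ("1.3", "server"): 9500.11},
}

# TLS1.3 resumption includes a fresh key exchange, so its handshake rate
# is a fraction of the TLS1.2 rate; compute that ratio per implementation.
for lib, rates in resumed.items():
    for role in ("client", "server"):
        ratio = rates[("1.2", role)] / rates[("1.3", role)]
        print(f"{lib} {role}: TLS1.2 resumption rate is {ratio:.1f}x TLS1.3's")
```

<p>For example, rustls completes roughly 7.5 times as many TLS1.2 resumptions per second on the server side as TLS1.3 resumptions, consistent with the extra key-exchange work.</p>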