JRuby HTTP parser improvements by headius · Pull Request #3838 · puma/puma

headius · 2025-12-13T07:02:47Z

This PR will contain improvements for the HTTP parser implementation for JRuby.

Pass bytes from the request handler through the parser loop and leaf methods, only copying when necessary.
Pre-allocate static header keys and avoid constructing them again while preparing the request hash.
Use a perfect hash to retrieve pre-allocated header keys from incoming bytes.
Refactor parser classes to reduce pointer dereferences.
Eliminate dead code.
~~Intern unusual headers for future calls using zero-allocation "fake string" logic~~

Contribution checklist:

I have reviewed the guidelines for contributing to this repository.
I have added (or updated) appropriate tests if this PR fixes a bug or adds a feature.
My pull request is 100 lines added/removed or less so that it can be easily reviewed.
If this PR doesn't need tests (docs change), I added [ci skip] to the title of the PR.
If this closes any issues, I have added "Closes #issue" to the PR description or my commit messages.
I have updated the documentation accordingly.
All new and existing tests passed, including Rubocop.

As mentioned in the deleted comment, the parser should be able to operate directly against the incoming bytes, rather than making a copy. This commit does so, parsing directly against the incoming bytes and passing them to the leaf methods, which make their own copies of the fields and values as appropriate.

headius · 2025-12-13T07:22:08Z

The first improvement, passing the read bytes through the parser without copying, is mostly done. I'm not quite clear on the life cycle of that buffer string, so my patch still makes defensive copies of the header values before putting them in the request hash. I have not looked at the C code to see if it is sharing the buffer contents with those strings. If anyone has insight here, I would appreciate it.
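For readers following along, here is a minimal sketch of the idea of parsing against the incoming bytes and copying only at the leaf. It is not the code from this PR: the class and method names are hypothetical, and a plain `byte[]` stands in for JRuby's buffer type.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical sketch; none of these names come from Puma's Java extension.
final class DirectParseSketch {
    // Leaf method: receives the shared read buffer plus a range and copies
    // only the bytes it needs to keep (the "defensive copy").
    static byte[] headerValue(byte[] buffer, int offset, int length) {
        return Arrays.copyOfRange(buffer, offset, offset + length);
    }

    // Stand-in for the parser loop: finds the value of a single
    // "name: value\r\n" line in place, without copying the buffer first.
    static byte[] parseValue(byte[] buffer) {
        int colon = 0;
        while (colon < buffer.length && buffer[colon] != ':') colon++;
        int start = Math.min(colon + 1, buffer.length);
        while (start < buffer.length && buffer[start] == ' ') start++;
        int end = start;
        while (end < buffer.length && buffer[end] != '\r') end++;
        return headerValue(buffer, start, end - start);
    }

    public static void main(String[] args) {
        byte[] buffer = "Host: example.com\r\n".getBytes(StandardCharsets.US_ASCII);
        byte[] value = parseValue(buffer);
        Arrays.fill(buffer, (byte) 'x'); // the caller is free to reuse its buffer
        System.out.println(new String(value, StandardCharsets.US_ASCII));
    }
}
```

Because the value is copied at the leaf, overwriting the read buffer afterwards does not affect it. Whether that copy can be dropped depends on the buffer's life cycle, which is the open question above.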

Next phase will be to pre-allocate the static header keys, similar to the C code, since the Java extension is currently recreating them every time.
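A rough sketch of what pre-allocating the static keys could look like. The `Key` class is a hypothetical stand-in for a runtime string object, and the enum members are illustrative rather than the extension's actual list.

```java
import java.nio.charset.StandardCharsets;
import java.util.IdentityHashMap;
import java.util.Map;

// Hypothetical sketch; Key stands in for a runtime string that is costly to rebuild.
final class PreallocatedKeysSketch {
    enum EnvKey { REQUEST_METHOD, REQUEST_PATH, HTTP_HOST }

    static final class Key {
        final byte[] bytes;
        Key(String s) { bytes = s.getBytes(StandardCharsets.US_ASCII); }
    }

    // Built once at class load, then reused for every request.
    static final Key[] KEYS = new Key[EnvKey.values().length];
    static {
        for (EnvKey k : EnvKey.values()) KEYS[k.ordinal()] = new Key(k.name());
    }

    // Preparing the request hash only looks keys up; it never constructs them.
    static Map<Key, String> buildEnv(String method, String host) {
        Map<Key, String> env = new IdentityHashMap<>();
        env.put(KEYS[EnvKey.REQUEST_METHOD.ordinal()], method);
        env.put(KEYS[EnvKey.HTTP_HOST.ordinal()], host);
        return env;
    }

    public static void main(String[] args) {
        Map<Key, String> env = buildEnv("GET", "example.com");
        System.out.println(env.get(KEYS[EnvKey.HTTP_HOST.ordinal()]));
    }
}
```

Every request's hash then shares the same key objects, so the per-request allocations are limited to the values.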

I'm not sure I can keep these aggregate changes to less than 100 lines of modifications, so please let me know if that will be a problem for reviewing.

headius · 2025-12-13T08:03:14Z

Related to preallocating the header strings: #3825

I wonder what other optimizations over the years have only been applied to the C code. 🤔

headius · 2025-12-14T07:03:45Z

I don't love the linear search for HTTP envs but it matches the C ext now.

nateberkopec · 2025-12-15T04:42:12Z

Legend! Thank you @headius

Working to get away from storing another reference to the runtime.

While most headers are either pre-allocated or cached as fstrings, header values and query strings must be allocated new for each request. By marking the incoming buffer as shared and allocating other strings as shared views into that buffer's bytes, we can avoid copying those bytes again for each string as is done in CRuby. Marking the original buffer as shared does mean the request logic will be forced to create a new ByteList and byte[] if further data is read, but this is no worse (and probably better) than making N copied slices of the buffer while parsing the request. Because of this copy-on-write behavior, the entire incoming buffer will be held in memory for the duration of the request, but most of that data would be in memory until processed anyway, potentially with a larger actual heap size than just a single large shared buffer.
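The copy-on-write trade-off described above can be sketched roughly as follows. This is a simplified model with hypothetical `Buffer` and `View` classes, not JRuby's `ByteList` or its string sharing logic.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical model of shared views plus copy-on-write; not JRuby's ByteList.
final class SharedBufferSketch {
    // A view into someone else's bytes: creating one copies nothing.
    static final class View {
        final byte[] bytes;
        final int offset;
        final int length;
        View(byte[] bytes, int offset, int length) {
            this.bytes = bytes; this.offset = offset; this.length = length;
        }
        String text() { return new String(bytes, offset, length, StandardCharsets.US_ASCII); }
    }

    // Request buffer that switches to copy-on-write once a view exists.
    static final class Buffer {
        byte[] bytes;
        int length;
        boolean shared;

        Buffer(byte[] initial) { bytes = initial; length = initial.length; }

        View slice(int offset, int len) {
            shared = true; // from now on the backing array must not be mutated
            return new View(bytes, offset, len);
        }

        void append(byte[] more) {
            if (shared || length + more.length > bytes.length) {
                // One copy of the whole buffer instead of N copied slices.
                bytes = Arrays.copyOf(bytes, length + more.length);
                shared = false;
            }
            System.arraycopy(more, 0, bytes, length, more.length);
            length += more.length;
        }
    }

    public static void main(String[] args) {
        Buffer buf = new Buffer("GET /a?b=c".getBytes(StandardCharsets.US_ASCII));
        View query = buf.slice(7, 3);
        buf.append(" HTTP/1.1".getBytes(StandardCharsets.US_ASCII));
        System.out.println(query.text() + " still backed by the old array: " + (query.bytes != buf.bytes));
    }
}
```

The view keeps the original array alive for as long as it is reachable, which is the memory-retention cost mentioned above.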

This removes the linear search for matching field names and replaces it with a perfect hash based on the standard HTTP/1.1 field names (plus some additions) and the supported CGI variables. The array of pre-allocated strings is passed through the parser from Http11. The EnvKey enum moves to its own top-level class, and now contains a gperf-generated perfect hash function to quickly find or reject incoming fields. Along with the EnvKey changes, the snake-upcasing of incoming fields is done lazily, and never writes to the original buffer. I have reverted a previous commit and restored the JRuby-specific assertions related to this change, and I believe the direct modification of the read buffer is a bug, albeit not a very important one since it only affects error messages for malformed HTTP requests.
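To illustrate the lookup described above, here is a toy version with five field names and a hand-picked hash. The real function is generated by gperf over a much larger key set; the hash, table size, and names below are assumptions made for this sketch only.

```java
import java.nio.charset.StandardCharsets;

// Toy perfect-hash lookup; the hash below is hand-picked, not gperf output.
final class FieldLookupSketch {
    static final String[] KEYS = { "HOST", "ACCEPT", "COOKIE", "CONNECTION", "USER_AGENT" };
    static final byte[][] TABLE = new byte[16][];

    // Maps 'a'-'z' to 'A'-'Z' and '-' to '_' on the fly; the input is never written.
    static int snakeUpcase(byte b) {
        if (b >= 'a' && b <= 'z') return b - 32;
        return b == '-' ? '_' : b;
    }

    // Length plus normalized first byte, masked to the table size.
    static int slot(byte[] buf, int off, int len) {
        return (len + snakeUpcase(buf[off])) & 15;
    }

    static {
        for (String key : KEYS) {
            byte[] k = key.getBytes(StandardCharsets.US_ASCII);
            int s = slot(k, 0, k.length);
            // Fail fast if the hash is not perfect for this key set.
            if (TABLE[s] != null) throw new IllegalStateException("collision: " + key);
            TABLE[s] = k;
        }
    }

    // Returns the pre-allocated key bytes, or null when the field is not a known one.
    static byte[] find(byte[] buf, int off, int len) {
        if (len == 0) return null;
        byte[] candidate = TABLE[slot(buf, off, len)];
        if (candidate == null || candidate.length != len) return null;
        for (int i = 0; i < len; i++) {
            if (snakeUpcase(buf[off + i]) != candidate[i]) return null;
        }
        return candidate;
    }

    public static void main(String[] args) {
        byte[] line = "user-agent: curl".getBytes(StandardCharsets.US_ASCII);
        byte[] key = find(line, 0, 10);
        System.out.println(new String(key, StandardCharsets.US_ASCII));
    }
}
```

A lookup costs one hash computation and at most one comparison, and unknown fields are usually rejected before any byte-by-byte comparison happens.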

headius · 2025-12-17T08:37:52Z

I've completed most of the items I found. Performance is only marginally better (roughly 4% in the numbers below), probably because the code generated by Ragel is just bad (I've run into similar performance limitations with the json library), but the memory throughput should be drastically reduced. A benchmark like the one linked from #3825 now only allocates objects for the benchmark itself, unusual headers, and values from the request (other than objects required for JRuby internals).

Benchmarks were run with a patched JRuby 10.1 that does not raise internal exceptions to wake a ConditionVariable and uses fast fixnum hashing. Numbers are peak performance seen after warmup stabilization.

Performance before this PR:

jruby 10.1.0.0-SNAPSHOT (3.4.5) 2025-12-15 8c2eca74b5 OpenJDK 64-Bit Server VM 25+36-LTS on 25+36-LTS +indy +jit [arm64-darwin]
Warming up --------------------------------------
               parse     7.103k i/100ms
Calculating -------------------------------------
               parse     70.925k (± 0.6%) i/s   (14.10 μs/i) -    355.150k in   5.007611s

After:

jruby 10.1.0.0-SNAPSHOT (3.4.5) 2025-12-15 8c2eca74b5 OpenJDK 64-Bit Server VM 25+36-LTS on 25+36-LTS +indy +jit [arm64-darwin]
Warming up --------------------------------------
               parse     7.401k i/100ms
Calculating -------------------------------------
               parse     73.935k (± 1.1%) i/s   (13.53 μs/i) -    370.050k in   5.005638s

The final item – using a "fake string" to cache and look up additional header fstrings – is trickier to do in JRuby than CRuby, as the fstring cache does not currently provide a way to perform a lookup of fstrings using only an array of bytes, and any temporary strings used for this purpose must be isolated across threads. If the Http11 object is only ever used in a single thread, such a temporary string could be constructed there at the cost of an additional string per request. This may be worth it to avoid constantly creating new strings for the remaining less-common headers, such as those used in the #3825 benchmark.
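As an illustration of the "fake string" idea in general terms, the sketch below uses a reusable, thread-local probe key to look up cached values by byte range without allocating on a hit. Everything here is hypothetical: it uses a plain `ConcurrentHashMap` rather than JRuby's fstring cache, and the cache is unbounded, which a real implementation would have to address.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch; not JRuby's fstring cache. The cache here is unbounded.
final class FakeKeyLookupSketch {
    // Key over a byte range. Cached keys own their bytes; the probe only borrows them.
    static final class BytesKey {
        byte[] bytes;
        int offset;
        int length;

        BytesKey point(byte[] b, int off, int len) {
            bytes = b; offset = off; length = len;
            return this;
        }

        @Override public int hashCode() {
            int h = 1;
            for (int i = 0; i < length; i++) h = 31 * h + bytes[offset + i];
            return h;
        }

        @Override public boolean equals(Object o) {
            if (!(o instanceof BytesKey)) return false;
            BytesKey other = (BytesKey) o;
            if (other.length != length) return false;
            for (int i = 0; i < length; i++) {
                if (bytes[offset + i] != other.bytes[other.offset + i]) return false;
            }
            return true;
        }
    }

    static final ConcurrentHashMap<BytesKey, String> CACHE = new ConcurrentHashMap<>();
    // One reusable probe per thread, so concurrent parsers never share it.
    static final ThreadLocal<BytesKey> PROBE = ThreadLocal.withInitial(BytesKey::new);

    static String lookup(byte[] buf, int off, int len) {
        BytesKey probe = PROBE.get().point(buf, off, len);
        String hit = CACHE.get(probe); // no allocation on a hit
        if (hit != null) return hit;
        // Miss: copy the bytes so the cached key never aliases the caller's buffer.
        byte[] owned = Arrays.copyOfRange(buf, off, off + len);
        String value = new String(owned, StandardCharsets.US_ASCII);
        String prior = CACHE.putIfAbsent(new BytesKey().point(owned, 0, len), value);
        return prior != null ? prior : value;
    }

    public static void main(String[] args) {
        byte[] first = "X-Trace-Id: 1".getBytes(StandardCharsets.US_ASCII);
        byte[] second = "  X-Trace-Id: 2".getBytes(StandardCharsets.US_ASCII);
        System.out.println(lookup(first, 0, 10) == lookup(second, 2, 10));
    }
}
```

The thread-local probe is one way to meet the isolation requirement. Constructing the probe per parser object, as suggested above, would be another.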

headius · 2025-12-17T09:10:16Z

An experiment with interning all incoming headers yielded a small gain (maybe) but I'm not sure it's worth the exposure. Given a large enough request with bogus headers, memory could be made to increase rapidly. I believe this behavior of Puma may be exposing an issue with the fstring cache in CRuby 4.0, as it produces very erratic numbers for the request parsing benchmark:

ruby 4.0.0dev (2025-11-18T23:50:36Z master 32b8f97b34) +PRISM [arm64-darwin24]
Warming up --------------------------------------
               parse    27.569k i/100ms
Calculating -------------------------------------
               parse    201.399k (±12.0%) i/s    (4.97 μs/i) -      1.020M in   5.142527s
ruby 4.0.0dev (2025-11-18T23:50:36Z master 32b8f97b34) +PRISM [arm64-darwin24]
Warming up --------------------------------------
               parse    15.937k i/100ms
Calculating -------------------------------------
               parse    118.456k (±12.5%) i/s    (8.44 μs/i) -    589.669k in   5.075580s
ruby 4.0.0dev (2025-11-18T23:50:36Z master 32b8f97b34) +PRISM [arm64-darwin24]
Warming up --------------------------------------
               parse    10.524k i/100ms
Calculating -------------------------------------
               parse     90.644k (± 9.4%) i/s   (11.03 μs/i) -    452.532k in   5.054908s
ruby 4.0.0dev (2025-11-18T23:50:36Z master 32b8f97b34) +PRISM [arm64-darwin24]
Warming up --------------------------------------
               parse     8.775k i/100ms
Calculating -------------------------------------
               parse     83.366k (± 2.0%) i/s   (12.00 μs/i) -    421.200k in   5.054538s
ruby 4.0.0dev (2025-11-18T23:50:36Z master 32b8f97b34) +PRISM [arm64-darwin24]
Warming up --------------------------------------
               parse     8.055k i/100ms
Calculating -------------------------------------
               parse     72.597k (± 4.0%) i/s   (13.77 μs/i) -    362.475k in   5.001894s
ruby 4.0.0dev (2025-11-18T23:50:36Z master 32b8f97b34) +PRISM [arm64-darwin24]
Warming up --------------------------------------
               parse     6.705k i/100ms
Calculating -------------------------------------
               parse    183.568k (±60.6%) i/s    (5.45 μs/i) -    549.810k in   5.004438s

I think I'm going to exclude that feature from the JRuby extension for now.

This is probably all I can do for the Ragel-based parser extension on JRuby. It's probably worth looking into a non-Ragel solution, especially for HTTP/1.1 which is quite easy to parse.

headius · 2025-12-18T18:30:28Z

@nateberkopec @schneems I'm not sure why that one JRuby timed out, nor why those unrelated jobs have been consistently failing.

The CRuby jobs have been failing since the first commit, and I did not touch any code used by the CRuby version of the gem. Is it some oddity with PR jobs?

I'll try to reproduce the hang locally but everything else passes and I don't know why my changes would have caused that sort of failure.

Previously the long form of the FString was cached and returned by longHashCode, but that value is actually just an int and can also be returned by hashCode. We can also pre-allocate and cache the fixnum version of the hash so it does not have to be created each time it is used from Ruby. Found while working on performance optimizations for Puma in puma/puma#3838.
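In outline, caching both hash forms might look like the sketch below. The class is hypothetical and a boxed `Long` stands in for the pre-allocated Ruby fixnum. JRuby's actual FString fields and method names are not shown in this thread and may differ.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical sketch; a boxed Long stands in for the cached Ruby fixnum.
final class CachedHashSketch {
    private final byte[] bytes;
    private final int hash;       // computed once at construction
    private final Long boxedHash; // boxed once, then handed out on every call

    CachedHashSketch(byte[] source) {
        this.bytes = source.clone();
        this.hash = Arrays.hashCode(this.bytes);
        this.boxedHash = Long.valueOf(hash);
    }

    // Both primitive forms return the same cached int value.
    @Override public int hashCode() { return hash; }
    long longHashCode() { return hash; }

    // Object form for callers that need a boxed value; no allocation per call.
    Long boxedHashCode() { return boxedHash; }

    @Override public boolean equals(Object o) {
        return o instanceof CachedHashSketch && Arrays.equals(bytes, ((CachedHashSketch) o).bytes);
    }

    public static void main(String[] args) {
        CachedHashSketch s = new CachedHashSketch("HTTP_HOST".getBytes(StandardCharsets.US_ASCII));
        System.out.println(s.hashCode() == s.longHashCode());
        System.out.println(s.boxedHashCode() == s.boxedHashCode());
    }
}
```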

When creating an FString for the first time, we should never trust that the incoming byte[] or ByteList are now ours to own. The caller might still have a reference to them that they continue to modify, resulting in a zombie FString that can't be found or that has incorrect contents. This came up while trying to implement the "fake string" optimization for uncached headers in puma/puma#3838. Holding a reference to a RubyString and its ByteList that could be updated in-place and then used to cache new headers led to those cached FStrings being modified directly.
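The failure mode and the fix can be shown with a small model. This is a sketch with a hypothetical intern table, not JRuby's FString code. The only point is that the interned object copies the caller's bytes instead of adopting them.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical intern table; illustrates the defensive copy only.
final class InternSketch {
    static final class Interned {
        private final byte[] bytes;
        private final int hash;

        Interned(byte[] source) {
            this.bytes = source.clone(); // never adopt the caller's array
            this.hash = Arrays.hashCode(this.bytes);
        }

        String text() { return new String(bytes, StandardCharsets.US_ASCII); }

        @Override public int hashCode() { return hash; }

        @Override public boolean equals(Object o) {
            return o instanceof Interned && Arrays.equals(bytes, ((Interned) o).bytes);
        }
    }

    static final ConcurrentHashMap<Interned, Interned> TABLE = new ConcurrentHashMap<>();

    static Interned intern(byte[] source) {
        Interned candidate = new Interned(source);
        Interned existing = TABLE.putIfAbsent(candidate, candidate);
        return existing != null ? existing : candidate;
    }

    public static void main(String[] args) {
        byte[] scratch = "X-One".getBytes(StandardCharsets.US_ASCII);
        Interned one = intern(scratch);
        // The caller reuses its scratch buffer for the next header name.
        System.arraycopy("X-Two".getBytes(StandardCharsets.US_ASCII), 0, scratch, 0, 5);
        Interned two = intern(scratch);
        System.out.println(one.text() + " " + two.text());
    }
}
```

Without the `clone()`, the first entry's contents would change underneath its cached hash when the scratch buffer is reused, leaving an entry that lookups for either name can no longer find.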

MSP-Greg · 2025-12-18T19:33:39Z

> The CRuby jobs have been failing since the first commit, and I did not touch any code used by the CRuby version of the gem. Is it some oddity with PR jobs?

Disregard the CRuby 'head' jobs. I hoped to have time for a better look at the issue. In the meantime, I'll post a fix so CI passes.

MSP-Greg · 2025-12-19T03:18:48Z

@headius

Well, it didn't turn out quite as I hoped, but the CI tests seem to be much better now...

headius · 2026-01-12T18:42:17Z

I've merged in recent changes from main to clean up CI. I'm satisfied with this round of changes and it generally performs much better on JRuby now than on CRuby.

headius · 2026-01-12T19:00:43Z

CI is green except for two truffleruby-head builds that are unrelated to my changes.

nateberkopec · 2026-02-01T08:33:08Z

Thanks again @headius

headius added 2 commits December 13, 2025 03:09

Implement snake_upcase_char as in the C ext

18b905e

Implement cached fstring env keys for JRuby ext

26b54f8

headius mentioned this pull request Dec 14, 2025

Puma 7 on JRuby is 20 times slower on TechEmpower #3788

Open

headius added 9 commits December 14, 2025 23:05

Simplify setup of http env strings

b78cf48

Eliminate unnecessary HttpParser class

ff3e6b1

Pass context to a few methods

fc53e29

Working to get away from storing another reference to the runtime.

Use cached strings for remaining request keys

4675ae0

Eliminate dead code

3d46090

More cleanup of env strings

d08ee4b

Reduce bytecode size throughout

d13fe15

headius force-pushed the direct_parsing branch from 63ccfec to 356a7a7 Compare December 17, 2025 08:20

headius mentioned this pull request Dec 18, 2025

Cache both hash forms for FString jruby/jruby#9143

Merged

headius mentioned this pull request Dec 18, 2025

Never trust external content for FString jruby/jruby#9145

Merged

Merge branch 'main' into direct_parsing

5f93869

headius marked this pull request as ready for review January 12, 2026 18:41

github-actions bot added the waiting-for-review label Jan 12, 2026

MSP-Greg merged commit dc947d9 into puma:main Jan 31, 2026
69 of 71 checks passed

dentarg removed the waiting-for-review label Feb 23, 2026

This was referenced Apr 10, 2026

[ruby-on-rails] Bump Puma to 8.0.0 hayat01sh1da/botpress-accuracy-checkers#90

Merged

[ruby-on-rails] Update Gem Dependencies hayat01sh1da/tutorials#223

Merged

MSP-Greg merged 13 commits into puma:main from headius:direct_parsing
