
Implement env["puma.mark_as_io_bound"]#3816

Merged
joshuay03 merged 1 commit into puma:main from byroot:low-priority-threads
Mar 7, 2026

Conversation

@byroot
Contributor

@byroot byroot commented Nov 16, 2025

Fix: #3777

Workers can now be marked as "IO bound". This state lasts until the end of the current request.

At the end of the request, the pool decides whether the "IO bound" thread rejoins the normal population or is abandoned.

Generally speaking, an "IO bound" thread no longer counts towards the max_threads configuration. However, there is a max_io_threads configuration, and any "IO bound" thread above that limit automatically counts as a "normal" thread.

For example, imagine a config of max_threads: 5, max_io_threads: 10:

If there are 3 normal threads and 12 IO threads, then Puma won't spawn any new threads, but the next IO-bound thread that completes will become "normal" again, so that we then have 4 normal threads and 11 IO threads.

If instead there are 5 normal threads and 2 IO threads, then when the next IO thread completes, it will exit.
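The accounting described above can be sketched as a small pure-Ruby helper. This is illustrative only; the names and structure are invented here and are not Puma's actual implementation:

```ruby
# Illustrative sketch of the rule described above, not Puma's actual code.
# IO-bound threads don't count toward max_threads, but any IO thread beyond
# max_io_threads is counted as a regular thread again.
MAX_THREADS = 5
MAX_IO_THREADS = 10

# What should the pool do when an IO-bound thread finishes its request?
def next_action(normal_count, io_count)
  if io_count > MAX_IO_THREADS || normal_count < MAX_THREADS
    # Either this thread was already being counted as "normal" (it was over
    # the IO limit), or there is room in the normal population: rejoin it.
    :rejoin_normal_pool
  else
    # Normal population is full and we're within the IO limit: abandon it.
    :exit
  end
end

next_action(3, 12) # => :rejoin_normal_pool (3 normal / 12 IO becomes 4 / 11)
next_action(5, 2)  # => :exit
```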

Member

@nateberkopec left a comment

I feel like I'm still chewing on the interface.

env["puma.deferred_request"].call

Right interface or not?

I'm mostly concerned about what happens when someone screws this up and puts CPU-bound work into the deferred pool, or if that work unexpectedly becomes CPU-bound. In what ways are they worse off than they were before? How can we correct from this condition, if at all?

This feature trades a risk of increased latency for more throughput. Is that a good tradeoff? Should we just say "f off, put these workloads on different machines and don't try to mix them?" Is the project fundamentally doomed?

Comment thread lib/puma/thread_pool.rb Outdated
@byroot
Contributor Author

byroot commented Nov 16, 2025

Right interface or not?

Looks good to me. I'm just not 100% certain about the terminology?

I'm mostly concerned about what happens when someone screws this up

Is this really a concern? I think we're all consenting adults here. People already routinely crank up max_thread to unreasonable levels, I don't think this is any worse or more dangerous, if anything it's less dangerous.

@jclusso
Contributor

jclusso commented Nov 17, 2025

People already routinely crank up max_thread to unreasonable levels, I don't think this is any worse or more dangerous, if anything it's less dangerous.

I can say I have done this while looking for a feature like this because we have a huge amount of requests that are waiting due to an API design that was optimized for the user at the cost of server resources.

Right interface or not?

If I understand this correctly, in a Rails controller if you want to defer the request you'd call env["puma.deferred_request"].call? Not the prettiest code, but I assume there is no way to add something nicer.

@byroot
Contributor Author

byroot commented Nov 17, 2025

if you want to defer the request you'd call env["puma.deferred_request"].call?

Well request.env["puma.deferred_request"].call but yes.

Not the prettiest code

That's usually how rack interfaces work. I do plan to integrate it better with Rails though.

@jclusso
Contributor

jclusso commented Nov 17, 2025

That's usually how rack interfaces work. I do plan to integrate it better with Rails though.

I figured as much: it would be on frameworks to make it work nicely.

@dentarg
Member

dentarg commented Nov 20, 2025

This looks pretty neat/simple to me.

@dentarg
Member

dentarg commented Nov 20, 2025

This feature trades a risk of increased latency for more throughput. Is that a good tradeoff?

I think that's exactly what's being asked for? #3777 (comment), #3777 (comment)

Should we just say "f off, put these workloads on different machines and don't try to mix them?"

I think providing this could be seen as along the lines of this earlier reply of yours @nateberkopec #3034 (comment)

there are several features we maintain like phased-restart which are intended to support a specific use-case of someone who wants to start a Puma server and just wants Puma to provide everything all by itself, without need for additional programs or tools

Is the project fundamentally doomed?

I don't think so if people are getting value from ActionController::Live already?

I can see how this can make it harder to reason about Puma, number of threads and capacity. "give me some extra threads, but please don't count them" :). I think I see this as something frameworks can use instead of Thread.new more or less?

Member

@nateberkopec left a comment

I'll remove my blocks on whether or not we should do this at all, but I do think the class and the env key need better names.

Comment thread lib/puma/request.rb Outdated
Comment thread lib/puma/thread_pool.rb Outdated
@github-actions github-actions Bot added waiting-for-changes Waiting on changes from the requestor and removed waiting-for-review Waiting on review from anyone labels Nov 20, 2025
@dentarg
Member

dentarg commented Nov 20, 2025

If Rails were to take advantage of this, how would Rails detect that it's supported by the running web server?

@byroot
Contributor Author

byroot commented Nov 20, 2025

how would Rails detect that it's supported by the running web server?

Like other optional rack features. Just check if the key is present in the env hash.
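That check can be sketched in a few lines of plain Ruby. The helper name and return values here are invented for illustration; only the env key is from the PR:

```ruby
# Minimal sketch of feature detection in a Rack app, as described above:
# the key is only set by a server that supports the feature.
def handle(env)
  if (mark = env["puma.mark_as_io_bound"])
    mark.call     # running on a Puma version with this feature
    :marked
  else
    :unsupported  # e.g. another server, or an older Puma
  end
end

handle({ "puma.mark_as_io_bound" => -> {} }) # => :marked
handle({})                                   # => :unsupported
```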

@byroot byroot force-pushed the low-priority-threads branch from cdaf88b to 0174cc9 Compare November 20, 2025 10:10
@byroot
Contributor Author

byroot commented Nov 20, 2025

I started some renaming and added a couple unit tests. I'm traveling today, so not sure how much more I'll be able to do, but it should already look a bit better.

@byroot byroot force-pushed the low-priority-threads branch from 0174cc9 to df18bf4 Compare November 20, 2025 10:44
@byroot byroot changed the title Implement prototype for deferred workers Implement env["puma.mark_as_io_bound"] Nov 20, 2025
@byroot byroot requested a review from nateberkopec November 20, 2025 10:46
Collaborator

@joshuay03 left a comment

Very keen for this!

Just some feedback on naming based on previous reviews.

We really should rename @workers in ThreadPool to @processors in a follow-up, it's confusing even in the current state.

Comment thread lib/puma/dsl.rb Outdated
Comment thread lib/puma/request.rb Outdated
Comment thread lib/puma/server.rb
Comment thread lib/puma/thread_pool.rb Outdated
Comment thread lib/puma/thread_pool.rb Outdated
@byroot
Contributor Author

byroot commented Nov 21, 2025

We really should rename @workers in ThreadPool to @processors in a follow-up, it's confusing even in the current state.

Agreed. But I wanted to minimize diff size for review purposes. If you are interested I can do a PR just for that, and then rebase this one on top.

@byroot byroot force-pushed the low-priority-threads branch from df18bf4 to e2253b9 Compare November 21, 2025 19:51
@byroot
Contributor Author

byroot commented Nov 21, 2025

@joshuay03 I addressed your feedback.

@byroot byroot force-pushed the low-priority-threads branch from e2253b9 to 55011b6 Compare November 21, 2025 19:55
@joshuay03
Collaborator

Agreed. But I wanted to minimize diff size for review purposes. If you are interested I can do a PR just for that, and then rebase this one on top.

Yeah fair, up to you whether you want to PR and rebase or wait and follow up.

I'll leave the final thumbs up on this to @nateberkopec

@byroot
Contributor Author

byroot commented Dec 13, 2025

Any news?

@adenta

adenta commented Dec 16, 2025

So if I'm reading this right, this would basically fix ActionController::Live? I love the API, but it's basically unusable in its current state.

@dentarg
Member

dentarg commented Dec 16, 2025

Can and should the number of io threads be included in the stats?

    # generate stats hash so as not to perform multiple locks
    # @return [Hash] hash containing stat info from ThreadPool
    def stats
      with_mutex do
        temp = @backlog_max
        @backlog_max = 0
        { backlog: @todo.size,
          running: @spawned,
          pool_capacity: @waiting + (@max - @spawned),
          busy_threads: @spawned - @waiting + @todo.size,
          backlog_max: temp
        }
      end
    end
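Purely as an illustration of this suggestion, here is one shape the stats hash could take with an extra key. The `:io_threads` key name is invented here, not necessarily what the PR ended up using, and the method is rewritten as a pure function so the arithmetic is visible:

```ruby
# Hypothetical sketch: the stats hash quoted above, extended with an
# IO-thread count. The :io_threads key name is invented for illustration.
def stats(spawned:, waiting:, todo_size:, io_threads:, max:, backlog_max:)
  { backlog: todo_size,
    running: spawned,
    pool_capacity: waiting + (max - spawned),
    busy_threads: spawned - waiting + todo_size,
    backlog_max: backlog_max,
    io_threads: io_threads }
end

stats(spawned: 5, waiting: 2, todo_size: 0, io_threads: 3, max: 5, backlog_max: 0)
# => { backlog: 0, running: 5, pool_capacity: 2, busy_threads: 3,
#      backlog_max: 0, io_threads: 3 }
```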

@byroot byroot force-pushed the low-priority-threads branch from 55011b6 to 0376659 Compare December 17, 2025 06:57
@byroot byroot force-pushed the low-priority-threads branch from 0376659 to bbea749 Compare January 20, 2026 07:35
@byroot
Contributor Author

byroot commented Jan 20, 2026

@dentarg @nateberkopec anything I can do here 🙏 ?

@dentarg
Member

dentarg commented Jan 20, 2026

@byroot update the stats?

@dentarg
Member

dentarg commented Jan 20, 2026

I just randomly came here because I wanted to connect this with the datastar-ruby gem that I happened to be looking at, and noticed this code/comment https://github.com/starfederation/datastar-ruby/blob/226deb42fda9d0ac0d2cf99e9d5438718420b786/lib/datastar/dispatcher.rb#L327-L332

        # Now launch the control thread that actually writes to the socket
        # We don't want to block the main thread, so that servers like Puma
        # which have a limited thread pool can keep serving other requests
        # Other streamers will push any StandardError exceptions to the queue
        # So we handle them here

Sounds like that gem could potentially make use of this mark_as_io_bound feature too?

@byroot
Contributor Author

byroot commented Jan 20, 2026

update the stats?

Right sorry, I guess I was asking if you had opinions on how to display it or if I should do what I think is best?

@byroot
Contributor Author

byroot commented Jan 20, 2026

Sounds like that gem could potentially make use of this mark_as_io_bound feature too?

Yes, that code is very similar to what AS::Live does.

@byroot byroot force-pushed the low-priority-threads branch from bbea749 to 76f98d3 Compare January 20, 2026 08:44
@byroot
Contributor Author

byroot commented Jan 20, 2026

update the stats?

Done!

@byroot
Contributor Author

byroot commented Feb 7, 2026

👋

@github-actions github-actions Bot added waiting-for-review Waiting on review from anyone and removed waiting-for-changes Waiting on changes from the requestor labels Feb 12, 2026
@byroot byroot force-pushed the low-priority-threads branch from 76f98d3 to edababf Compare March 3, 2026 08:22
@byroot
Contributor Author

byroot commented Mar 3, 2026

I rebased the PR again. How can I get this unblocked?

@joshuay03
Collaborator

I rebased the PR again. How can I get this unblocked?

Can you have a look at the latest failures. I'll aim to do a final review and merge this weekend.

Fix: puma#3777

This is only meant as a rough prototype to better illustrate
the idea. There are no extra tests to speak of, but the crux of
the logic is here.

Workers can now be marked as "IO bound".
This state lasts until the end of the current request.

At the end of the request, the pool decides whether the "IO bound" thread
rejoins the normal population or is abandoned.

Generally speaking, an "IO bound" thread no longer counts towards
the `max_threads` configuration. However, there is a `max_io_threads`
configuration, and any "IO bound" thread above that limit automatically
counts as a "normal" thread.

For example, imagine a config of `max_threads: 5, max_io_threads: 10`:

If there are 3 normal threads and 12 IO threads, then Puma won't spawn
any new threads, but the next IO-bound thread that completes will become "normal"
again, so that we then have 4 normal threads and 11 IO threads.

If instead there are 5 normal threads and 2 IO threads, then when the next
IO thread completes, it will exit.
@byroot byroot force-pushed the low-priority-threads branch from edababf to 4734d01 Compare March 4, 2026 20:42
@byroot
Contributor Author

byroot commented Mar 4, 2026

My bad, I didn't fix the merge conflict correctly. Expecting it to go green now.

@adenta

adenta commented Mar 5, 2026

Can someone comment on what happens when this gets merged? Does this magically make ActionController::Live non-blocking so it can be used with Puma without performance issues?

@byroot
Contributor Author

byroot commented Mar 5, 2026

Does this magically make ActionController::Live non-blocking so it can be used with Puma without performance issues?

No. I'll have to make changes on the Rails side first.

@joshuay03 joshuay03 dismissed nateberkopec’s stale review March 7, 2026 20:14

Requested changes have been addressed

Collaborator

@joshuay03 left a comment

I have some nits, but I'll follow up separately.

Thank you!

@joshuay03 joshuay03 merged commit bb60494 into puma:main Mar 7, 2026
80 checks passed
joshuay03 added a commit that referenced this pull request Mar 7, 2026
@byroot
Contributor Author

byroot commented Mar 7, 2026

😍

joshuay03 added a commit that referenced this pull request Mar 7, 2026
Comment thread lib/puma/dsl.rb
    # Any IO thread over the limit is counted as a regular thread, hence the above configuration also
    # allows for 3 regular threads and 7 IO threads for example.
    def max_io_threads(max)
      max = Integer(max)
Collaborator

Now wondering if we should allow this to be unbounded. Maybe something to come back to once it's in use out in the wild, i.e. Rails.

Contributor Author

I don't think unbounded is ever a good idea. Even something that uses very little CPU still uses some, and it's always better to not accept quite as many concurrent requests as you could (backpressure) than to accept too many.

@adenta

adenta commented Mar 10, 2026

Does this magically make Actioncable::Live non blocking so it can be used with Puma without performance issues?

No. I'll have to make changes on the Rails side first.

@byroot this is AMAZING. Is there an issue open for the rails changes? Would this be a Rails 8.2 sort of thing? About to start building some chat interfaces, and most everyone in non rubyland is standardizing on SSE. Would love to not need falcon.

@byroot
Contributor Author

byroot commented Mar 11, 2026

Is there an issue open for the rails changes?

Kinda, this change was driven by rails/rails#55132 (comment)

I still need to find the time to figure out the Rails side of it. Ideally you'd just use AC::Live like today with no change; only the underlying implementation would be way more sound. But I'm not yet 100% sure that's possible, I'll have to see when I attempt it. If it really isn't possible, then it will be a new API.

Would this be a Rails 8.2 sort of thing?

Yes.

About to start building some chat interfaces, and most everyone in non rubyland is standardizing on SSE. Would love to not need falcon.

I mean, you don't have to wait for Rails support. If you have this feature deployed, you can use it directly with request.env["puma.mark_as_io_bound"].call and then use the classic Rack streaming API.
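For instance, a minimal plain-Rack SSE endpoint along those lines might look like this. This is a sketch, not code from the PR; the `&.` guard keeps it a no-op on servers that don't set the key:

```ruby
# Sketch of a plain Rack SSE endpoint using the hook directly, per the
# comment above. Works without Rails support; on servers that don't
# provide the env key, the &. guard makes the call a no-op.
class SseApp
  def call(env)
    # Tell Puma this thread will mostly be waiting on IO from here on.
    env["puma.mark_as_io_bound"]&.call

    body = Enumerator.new do |y|
      3.times do |i|
        y << "data: tick #{i}\n\n"
        # sleep 1  # a real endpoint would pace output or block on events here
      end
    end
    [200, { "content-type" => "text/event-stream" }, body]
  end
end
```

Each chunk is written to the client as the server iterates the enumerable body, which is the standard Rack streaming mechanism.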


Labels

waiting-for-review Waiting on review from anyone

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Feature: Add API to "remove" current thread from the pool

6 participants