Description
I ran into an issue with Parallel.ForEach. It might be down to how I was using it, but if that's the case, it would be really nice if it could be improved so that one doesn't run into this issue so easily.
Problem: I have a list of a few hundred tasks to run in parallel. Each task can take anywhere from a few seconds to a day, and MaxDegreeOfParallelism has to be configured. While the tasks do start in parallel, I often end up with a bunch of them queued behind one another on a single thread.
Example:
Let's say 400 tasks on 4 threads. What I noticed is that I get 4 partitions of 100 tasks each. 3 partitions may finish in a few minutes, and then I am left with a single partition on one thread where a bunch of tasks wait behind one 2-hour task, and I may have 10 such tasks one after the other in the same partition.
Basically, a large number of tasks are left running on a single thread while the other partitions have already finished, so 3 threads are wasted.
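To make the arithmetic concrete, here is a rough model of the two scheduling strategies. It is only a sketch: it assumes equal contiguous partitions as observed above and is not a description of how Parallel.ForEach is actually implemented. The names `PartitionModel`, `StaticMakespan` and `SharedQueueMakespan` are made up for illustration.

```csharp
using System;
using System.Linq;

public static class PartitionModel
{
    // Assumed static model: split the list into `workers` contiguous chunks;
    // each worker runs its own chunk serially. Returns the finish time of the slowest worker.
    public static long StaticMakespan(long[] durations, int workers)
    {
        int chunk = (durations.Length + workers - 1) / workers;
        return Enumerable.Range(0, workers)
            .Select(w => durations.Skip(w * chunk).Take(chunk).Sum())
            .Max();
    }

    // Assumed shared-queue model: items are handed out in list order,
    // each one to whichever worker becomes free first.
    public static long SharedQueueMakespan(long[] durations, int workers)
    {
        var freeAt = new long[workers];
        foreach (var d in durations)
        {
            int next = Array.IndexOf(freeAt, freeAt.Min());
            freeAt[next] += d;
        }
        return freeAt.Max();
    }
}
```

With made-up durations of 300 three-second tasks, then ten 7200-second tasks, then 90 more three-second tasks on 4 workers, `StaticMakespan` gives 72270 seconds (the last partition holds all ten long tasks), while `SharedQueueMakespan` gives 21825 seconds.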
I created a barebones gist of my solution. With this idea I went from 2-3 hours (using Parallel.ForEach) to 45 minutes on my test data. On top of that, the order of execution is also predictable.
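The gist is not reproduced here. As a minimal sketch of one way to get this kind of behaviour (not necessarily what the gist does), a fixed number of workers can pull items one at a time from a shared queue, so a long task only ever blocks the worker running it. `QueueWorkers` and `RunAsync` are hypothetical names used for illustration.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class QueueWorkers
{
    // Runs `work` over `items` with exactly `maxParallel` workers.
    // Items are dequeued in list order, one at a time, by whichever worker is free.
    public static Task RunAsync<T>(IEnumerable<T> items, int maxParallel, Action<T> work)
    {
        var queue = new ConcurrentQueue<T>(items);

        var workers = Enumerable.Range(0, maxParallel)
            .Select(_ => Task.Factory.StartNew(() =>
            {
                // If `work` throws, this worker stops and the exception
                // surfaces through the task returned by Task.WhenAll.
                while (queue.TryDequeue(out var item))
                {
                    work(item);
                }
            },
            CancellationToken.None,
            TaskCreationOptions.LongRunning, // hint: these loops may run for hours
            TaskScheduler.Default))
            .ToArray();

        return Task.WhenAll(workers);
    }
}
```

A call such as `QueueWorkers.RunAsync(jobs, 4, job => job.Run()).Wait();` (where `jobs` and `Run` stand in for the real work items) starts the items in list order. A built-in alternative that may be worth trying is passing `Partitioner.Create(items, EnumerablePartitionerOptions.NoBuffering)` to Parallel.ForEach, which hands items to workers one at a time instead of in chunks.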