Use std based ctpl header only library #4126
PastaPastaPasta wants to merge 3 commits into dashpay:develop
Can you reproduce these results in multiple runs? Because for me it fluctuates a couple of percent up and down from one run to another, and it's not clear whether there are any performance gains or not tbh. On the bright side, there is no clear performance loss either :) Also, I like the idea of getting rid of boost dependencies if there is no difference in performance, but we might want to take it even further, pls see (and test) 14d095000c.
Yeah, idk, my "long tests", where the benching took about an hour total for each run, show the 2% gains, but I'm not sure if those are real or fake gains :) Getting rid of boost is enough of a justification.
Doesn't look like a lock-free queue mechanism. I used moodycamel in mine: https://github.com/cameron314/concurrentqueue
The ctpl::detail::Queue is not lock-free (it uses mutexes), but I think that's probably fine. Locking data structures can sometimes even be faster than non-locking data structures. Is there a reason why we should exclusively look for a non-locking data structure here?
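For readers who want a picture of what "uses mutexes" means here, this is a minimal sketch of a mutex-guarded queue. It is an illustration only, not the actual ctpl::detail::Queue source; the class name `LockedQueue` and its method names are assumptions made up for this example.

```cpp
#include <mutex>
#include <queue>
#include <utility>

// Hypothetical illustration of a locking queue; NOT the real ctpl code.
template <typename T>
class LockedQueue {
public:
    // Every push takes the mutex for the duration of the insert.
    void push(T value) {
        std::lock_guard<std::mutex> lock(mutex_);
        items_.push(std::move(value));
    }

    // Returns false if the queue is empty; otherwise moves the
    // front item into `out` and removes it from the queue.
    bool try_pop(T& out) {
        std::lock_guard<std::mutex> lock(mutex_);
        if (items_.empty()) {
            return false;
        }
        out = std::move(items_.front());
        items_.pop();
        return true;
    }

    bool empty() const {
        std::lock_guard<std::mutex> lock(mutex_);
        return items_.empty();
    }

private:
    mutable std::mutex mutex_;  // mutable so empty() can lock in a const method
    std::queue<T> items_;
};
```

With short critical sections like these, an uncontended lock is cheap, which is one reason a locking queue can hold its own against a lock-free one when the work per task dominates.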
Lock-free is preferred if producers don't need to coordinate and neither consistency nor linearity of results is required. I think there are lots of other bottlenecks overall, so I only see a marginal improvement, but I also had other issues with the boost one, documented in the issue. The best results, I think, will come with bulk enqueuing and dequeuing. If the requirements allow it, you will have far fewer busy-wait cycles with lock-free, and fewer CAS operations per acquire and release of a context. If designed right you should see 50-100% improvements in performance. It won't give that here (you have to start with lock-free and build around it), but lock-free is desirable when the parallelization allows it (communication and coordination are not required for things like signature checks).
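To make the bulk enqueue/dequeue point concrete, here is a rough sketch of batching: one synchronization step per batch instead of one per item. Note the assumptions: it is mutex-based and std-only, so it shows only the amortization idea, not lock-freedom, and it is not the API of the concurrentqueue library linked above. `BatchQueue`, `push_bulk` and `pop_bulk` are hypothetical names.

```cpp
#include <algorithm>
#include <cstddef>
#include <deque>
#include <iterator>
#include <mutex>
#include <utility>
#include <vector>

// Hypothetical batching queue: synchronization cost is paid once per batch.
template <typename T>
class BatchQueue {
public:
    // Enqueue a whole batch under a single lock acquisition.
    void push_bulk(std::vector<T> batch) {
        std::lock_guard<std::mutex> lock(mutex_);
        items_.insert(items_.end(),
                      std::make_move_iterator(batch.begin()),
                      std::make_move_iterator(batch.end()));
    }

    // Dequeue up to `max_items` items (in FIFO order) under a single
    // lock acquisition. Returns fewer items if the queue runs dry.
    std::vector<T> pop_bulk(std::size_t max_items) {
        std::vector<T> out;
        std::lock_guard<std::mutex> lock(mutex_);
        const std::size_t n = std::min(max_items, items_.size());
        out.reserve(n);
        for (std::size_t i = 0; i < n; ++i) {
            out.push_back(std::move(items_.front()));
            items_.pop_front();
        }
        return out;
    }

private:
    std::mutex mutex_;
    std::deque<T> items_;
};
```

A worker that calls `pop_bulk(64)` and then runs 64 signature checks touches the shared queue once instead of 64 times; the same reasoning applies to the number of CAS operations in a lock-free design.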
I ran some benchmarks and, according to my data, using std based ctpl makes the benchmarks ~2% faster on total time, min time, max time and median time. See https://docs.google.com/spreadsheets/d/1tw43VBd50fcyNPZ5sUV8F-GpRFgDmiSOO1qdV1wMpqw/edit?usp=sharing for data.
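Since the question in this thread is whether a ~2% gain is distinguishable from run-to-run fluctuation, here is a small sketch of one way to check. It is a crude heuristic, not a proper statistical test: it compares the median gain against the relative spread of each set of runs. All function names are hypothetical, and any timings fed to it would have to come from repeated runs of the real benchmark.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Median of a non-empty set of timings (takes a copy so it can sort).
double median(std::vector<double> samples) {
    std::sort(samples.begin(), samples.end());
    const std::size_t n = samples.size();
    return (n % 2 == 1) ? samples[n / 2]
                        : (samples[n / 2 - 1] + samples[n / 2]) / 2.0;
}

// Run-to-run noise estimate: (max - min) / median.
double relative_spread(const std::vector<double>& samples) {
    const auto mm = std::minmax_element(samples.begin(), samples.end());
    return (*mm.second - *mm.first) / median(samples);
}

// Fraction by which the candidate's median time is lower than the baseline's.
double relative_gain(const std::vector<double>& baseline,
                     const std::vector<double>& candidate) {
    const double b = median(baseline);
    return (b - median(candidate)) / b;
}

// True only if the gain is larger than the noisier of the two spreads.
bool gain_exceeds_noise(const std::vector<double>& baseline,
                        const std::vector<double>& candidate) {
    const double noise = std::max(relative_spread(baseline),
                                  relative_spread(candidate));
    return relative_gain(baseline, candidate) > noise;
}
```

If each configuration fluctuates by about 2% between runs, a 2% difference in medians fails this check, which matches the "not sure if those are real or fake gains" concern above.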