PERFORMANCE[30%+ improvement!]: filter_func can optionally take an array of events to make batched filters much faster#8428
|
@andrewvc @ph @jordansissel FYI not sure if I should be happy or sad here after all the work that went into Java exec ... but this looks pretty safe to merge to give |
|
Also, this version beats the Java exec by quite a margin in the baseline stdin->stdout case (17% faster) and with just a single fast filter like the UA one in place (10% faster) :( |
…ke batched filters much faster
c3eab94 to 9626c4e
|
Whoa, good catch! This change is low risk; we could release it in 6.0.x and even 5.6.x if we ever do another release. I am doing some testing and will report soon. |
|
oh sweet <3 🥇 |
Concerning the java execution, we will make it faster eventually, that should not block us from merging it with a feature flag. |
|
@ph "should" or "should not"? :D |
|
should not :) I did some testing: with a really simple pipeline configuration I got a 7% improvement with the code changes, but as soon as you add more plugins to the filter stage, I get the 30% speed improvement. |
ph left a comment
LGTM. This can go into multiple branches: master, 6.x, 6.0 and even 5.6. I don't see any reason not to do it.
|
@ph alrighty thanks, merging :) |
…ke batched filters much faster Fixes #8428
|
needs manual backport to 6.0 and 5.6, will do later |
|
Thank you for working on this :) |
|
np :) |
…ke batched filters much faster Fixes elastic#8428
…ke batched filters much faster Fixes elastic#8428 Fixes elastic#8444
This one is kinda frustrating when seen in relation to #8357 :( but a huge and trivial performance gain :)
Just thought I'd level the playing field a little when comparing Java and Ruby exec and well this is the result.
Simply fixing the batched filter execution in the Ruby execution gets us to within about 90% of the throughput of the pure Java exec in #8357. (I can add more scientific numbers tomorrow or so, but it's basically this for the Apache example: master 35k/s, this 50k/s and the full Java exec 56k/s.)
For baseline (and simple filters like only the UA filter) this version wins over master and the Java exec by a wide margin. Here it's master and Java exec both at ~300k/s and this one at ~380k/s.
To better understand this one, take a look at a compiled filter_func example:
=> there is no point turning the events into arrays one by one if we can simply turn the batch into an array and input that.
=> trivial fix by allowing Array to be passed through filter_func and adding the to_a method from #8357 to the batch implementations
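To make the idea concrete, here is a minimal sketch of the difference between per-event and batched calls. It is not the actual Logstash code: Event, Batch, UpcaseFilter, filter_one_by_one and filter_batched are hypothetical stand-ins, and the body of filter_func is only an assumption about what a compiled filter function roughly does.

```ruby
# Hypothetical stand-ins for illustration only; not the real Logstash classes.
Event = Struct.new(:message)

# Assumed batch wrapper exposing to_a, as the description suggests for the batch implementations.
class Batch
  def initialize(events)
    @events = events
  end

  def to_a
    @events.dup
  end
end

# Hypothetical filter that takes an array of events and returns an array of events.
class UpcaseFilter
  attr_reader :calls

  def initialize
    @calls = 0
  end

  def multi_filter(events)
    @calls += 1
    events.each { |e| e.message = e.message.upcase }
  end
end

# Accepts either a single event or an Array of events; wraps only when needed.
def filter_func(event_or_events, filters)
  events = event_or_events.is_a?(Array) ? event_or_events : [event_or_events]
  filters.each { |f| events = f.multi_filter(events) }
  events
end

# Per-event style: one filter_func call and one single-element array per event.
def filter_one_by_one(batch, filters)
  batch.to_a.flat_map { |event| filter_func(event, filters) }
end

# Batched style: the whole batch is turned into an array once and passed through.
def filter_batched(batch, filters)
  filter_func(batch.to_a, filters)
end
```

With a batch of N events and M filters, the per-event path makes N * M filter calls and allocates N temporary arrays, while the batched path makes M calls on a single array. That is likely where the gain grows as more plugins are added to the filter stage.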