Improve performance of interleave_primitive (-15% - 45%) / interleave_bytes (-10-25%) #7420
Thanks for checking! I wonder, is this screenshot from Compiler Explorer or a local tool?
I also pushed an improvement for
5c51c91 to 3850342
let mut offsets = BufferBuilder::<T::Offset>::new(indices.len() + 1);
offsets.append(T::Offset::from_usize(0).unwrap());
for (a, b) in indices {
let mut offsets = Vec::with_capacity(indices.len() + 1);
Vec and extend generate better code.
There are probably other places where this pattern can be applied as well @mbutrovich
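To illustrate the pattern discussed here, below is a minimal stdlib-only sketch, not the PR's actual code. The function names and the `lengths` input are hypothetical, and `i32` stands in for the offset type. It builds the same offsets once with a per-element push loop and once with `Vec::extend` over an iterator, which lets the compiler see the exact length up front. Overflow checks are omitted for brevity.

```rust
/// Hypothetical baseline: push one running total per element.
fn offsets_push_loop(lengths: &[usize]) -> Vec<i32> {
    let mut offsets = Vec::with_capacity(lengths.len() + 1);
    offsets.push(0i32);
    let mut total = 0i32;
    for &len in lengths {
        // Assumption: lengths fit in i32; a real implementation would check.
        total += len as i32;
        offsets.push(total);
    }
    offsets
}

/// Hypothetical variant: same result, but the running totals are
/// produced by an iterator and appended with a single `extend` call.
fn offsets_extend(lengths: &[usize]) -> Vec<i32> {
    let mut offsets = Vec::with_capacity(lengths.len() + 1);
    offsets.push(0i32);
    let mut total = 0i32;
    offsets.extend(lengths.iter().map(|&len| {
        total += len as i32;
        total
    }));
    offsets
}
```

Both functions return the same vector for any input; whether the `extend` form is actually faster in a given code base is best confirmed by benchmarking or by inspecting the generated assembly.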
It's Beyond Compare, but you can use your diff tool of choice with the two .txt dumps of the machine code. I use cargo-show-asm to generate the relevant snippets of code. I don't find that Compiler Explorer generates representative code for real projects: snippets put in there often don't reflect what the compiler does with large projects with complex CFG DAGs, external crates, LTO, and inlined functions.
builder.append(v)
}
builder.finish()
let nulls = BooleanBuffer::collect_bool(indices.len(), |i| {
This is a pattern that comes up often as well
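As a rough illustration of why a `collect_bool`-style constructor can beat appending one bit at a time, here is a stdlib-only sketch. `collect_bool_sketch` is a hypothetical stand-in, not arrow's implementation, and the 64-bit word layout is an assumption made for the example. It evaluates the predicate for each index and packs the results into whole words, so the output is written one word at a time rather than one bit at a time.

```rust
/// Hypothetical stand-in for a bit-packed boolean collector.
/// Bit `i % 64` of word `i / 64` holds `f(i)` (least significant bit first).
fn collect_bool_sketch(len: usize, mut f: impl FnMut(usize) -> bool) -> Vec<u64> {
    let mut words = Vec::with_capacity((len + 63) / 64);
    let mut i = 0;
    while i < len {
        // Process up to 64 indices per iteration.
        let chunk = (len - i).min(64);
        let mut packed = 0u64;
        for bit in 0..chunk {
            packed |= (f(i + bit) as u64) << bit;
        }
        words.push(packed);
        i += chunk;
    }
    words
}
```

In the context of the diff above, the closure would compute the validity of output slot `i`; here any `FnMut(usize) -> bool` works.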
Ah nice, thanks for the overview! I might try that in the future.
If you know the name of the function, or even part of it, you can specify all or part of it as well:
TY @pacak!


Which issue does this PR close?
Closes #7421
Rationale for this change
What changes are included in this PR?
Are there any user-facing changes?