C++: Fix join orders in virtual dispatch computation #21591
MathiasVP merged 1 commit into github:main
…ges and inline pairCand to avoid a giant tuple explosion.
Pull request overview
This PR targets performance in C/C++ virtual dispatch computation by improving transitive-closure and join-order behavior in TrackVirtualDispatch, aiming to reduce evaluation cost on large databases.
Changes:
- Replace fastTC(edges/2) with doublyBoundedFastTC(edges/2, isSource/1, isSink/1) for edgePlus.
- Inline/restructure the pairCand computation by factoring out mostSpecificForSource and adding bindingset/inline_late to influence evaluation order.
C++: Port github#21591 to microsoft
geoffw0 left a comment
QL changes LGTM.
I made a comment on the DCA run. I don't really know if it's worth investigating or not.
```ql
private predicate isSink(PathNode n) { n.isSink() }

private predicate edgePlus(PathNode n1, PathNode n2) =
  doublyBoundedFastTC(edges/2, isSource/1, isSink/1)(n1, n2)
```
I've never seen a definition of doublyBoundedFastTC, but I think I see what's going on. It's computing a transitive closure of edges, but limited to those edges that connect somewhere between sources and sinks (presumably via a forwards-backwards passes type algorithm).
Yep, exactly. doublyBoundedFastTC is also the HOP that's used when you evaluate any dataflow flowPath predicate.
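The bounding idea discussed here can be sketched in Python. This is only an illustration of the "forwards-backwards passes" reading given in the comment above, not the implementation behind doublyBoundedFastTC. The function names `reachable` and `doubly_bounded_tc` are hypothetical, and the assumption that the closure keeps every pair of nodes lying on some source-to-sink path is a guess at the semantics.

```python
from collections import defaultdict, deque


def reachable(starts, succ):
    """Return every node in `starts` plus everything reachable from them via `succ`."""
    seen = set(starts)
    queue = deque(starts)
    while queue:
        node = queue.popleft()
        for nxt in succ.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


def doubly_bounded_tc(edges, sources, sinks):
    """Transitive closure of `edges`, restricted to nodes that lie on some
    path from a source to a sink (assumed reading of the bound)."""
    succ, pred = defaultdict(set), defaultdict(set)
    for a, b in edges:
        succ[a].add(b)
        pred[b].add(a)

    # Forward pass from the sources, backward pass from the sinks.
    live = reachable(sources, succ) & reachable(sinks, pred)

    # Keep only edges whose endpoints both survive the two passes.
    pruned = defaultdict(set)
    for a, b in edges:
        if a in live and b in live:
            pruned[a].add(b)

    # Closure over the pruned graph: one search per surviving node.
    closure = set()
    for start in live:
        for end in reachable(pruned.get(start, ()), pruned):
            closure.add((start, end))
    return closure
```

For example, with edges `src -> a`, `a -> sink`, `a -> dead`, and `x -> y`, and with `src` as the only source and `sink` as the only sink, the sketch drops `dead`, `x`, and `y` before computing the closure. An unbounded closure would also have produced pairs involving those nodes.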
Thanks! I've replied to the comment. The TLDR is: It's not 😂
geoffw0 left a comment
I'm happy with your explanation. :)
This PR improves the performance of C/C++'s virtual dispatch computation in two ways:
- Use doublyBoundedFastTC instead of fastTC. We already have good bounds on the end-points, so we might as well make use of them here.
- Instead of computing pairCand up front and then joining it with the edgePlus relation, we start by computing edgePlus and then join it with the body of pairCand.

This has a major impact on large C/C++ databases at Microsoft (the time taken to compute virtual dispatch on a fresh DB is reduced by 10x in some cases).

[Attachments not preserved. Captions: before (canceled before completion); after (pairCand is now inlined); and then after a small join order fix.]