Skip to content

C++: Fix join orders in virtual dispatch computation#21591

Merged
MathiasVP merged 1 commit intogithub:mainfrom
MathiasVP:restrict-pair-cand
Mar 27, 2026
Merged

C++: Fix join orders in virtual dispatch computation#21591
MathiasVP merged 1 commit intogithub:mainfrom
MathiasVP:restrict-pair-cand

Conversation

@MathiasVP
Copy link
Contributor

This PR improves the performance of C/C++'s virtual dispatch computation in two ways:

  • A minor performance improvement: Switch to doublyBoundedFastTC instead of fastTC. We already have good bounds on the end-point so we might as well make use of them here.
  • A major performance improvement: Instead of computing the pairCand up front and then joining it with the edgePlus relation we start by computing edgePlus and then we join it with the body of pairCand. This has a major impact on large C/C++ databases at Microsoft (the time taken to compute virtual dispatch on a fresh DB is reduced by 10x in some cases).

Before (canceled before completion):

[2026-03-26 16:04:48] Tuple counts for DataFlowDispatch::TrackVirtualDispatch<noDisp>::pairCand/4#3fd492b3/4@696edeu1 after 4m39s:
  863412     ~0%        {3} r1 = SCAN `DataFlowDispatch::TrackVirtualDispatch<noDisp>::mostSpecific/2#8aef19c1` OUTPUT In.2, In.0, In.1
  863412     ~2%        {3}    | JOIN WITH DataFlowPrivate::TSourceCallable#54d42094 ON FIRST 1 OUTPUT Lhs.1, Rhs.1 'callable', Lhs.2
  1157251367 ~1024%     {4}    | JOIN WITH `DataFlowDispatch::TrackVirtualDispatch<d1>::qualifierOfVirtualCallImpl/3#5649cb98_201#join_rhs` ON FIRST 1 OUTPUT Rhs.1, Lhs.1 'callable', Lhs.2, Rhs.2
  1462970099 ~871%      {4}    | JOIN WITH `TypeTrackingImpl::TypeTracking<Location::Location,DataFlowDispatch::TrackVirtualDispatch<noDisp>::TtInput>::TypeTrack<DataFlowDispatch::TrackVirtualDispatch<noDisp>::qualifierSource>::Graph<DataFlowDispatch::TrackVirtualDispatch<noDisp>::qualifierOfVirtualCall>::PathNodeFwd.getNode/0#dispred#8901ad56_10#join_rhs` ON FIRST 1 OUTPUT Rhs.1 'p2', Lhs.1 'callable', Lhs.2, Lhs.3
  974405553  ~784%      {4}    | JOIN WITH `TypeTrackingImpl::TypeTracking<Location::Location,DataFlowDispatch::TrackVirtualDispatch<noDisp>::TtInput>::TypeTrack<DataFlowDispatch::TrackVirtualDispatch<noDisp>::qualifierSource>::Graph<DataFlowDispatch::TrackVirtualDispatch<noDisp>::qualifierOfVirtualCall>::reachRev/1#ab50895d` ON FIRST 1 OUTPUT Lhs.0 'p2', Lhs.1 'callable', Lhs.2, Lhs.3
  919333500  ~748%      {4}    | JOIN WITH `num#TypeTrackingImpl::TypeTracking<Location::Location,DataFlowDispatch::TrackVirtualDispatch<noDisp>::TtInput>::TypeTrack<DataFlowDispatch::TrackVirtualDispatch<noDisp>::qualifierSource>::Graph<DataFlowDispatch::TrackVirtualDispatch<noDisp>::qualifierOfVirtualCall>::TPathNodeSink#eea8db44_1#join_rhs` ON FIRST 1 OUTPUT Lhs.3, Lhs.1 'callable', Lhs.2, Lhs.0 'p2'
  919328485  ~781%      {4}    | JOIN WITH DataFlowPrivate::TNormalCall#0f63d738 ON FIRST 1 OUTPUT Lhs.2, Lhs.1 'callable', Lhs.3 'p2', Rhs.1 'call'
  5348269387 ~949%      {4}    | JOIN WITH `DataFlowDispatch::qualifierSourceImpl/2#9b5eb2f5_10#join_rhs` ON FIRST 1 OUTPUT Rhs.1, Lhs.1 'callable', Lhs.2 'p2', Lhs.3 'call'
  7135273052 ~909%      {4}    | JOIN WITH `TypeTrackingImpl::TypeTracking<Location::Location,DataFlowDispatch::TrackVirtualDispatch<noDisp>::TtInput>::TypeTrack<DataFlowDispatch::TrackVirtualDispatch<noDisp>::qualifierSource>::Graph<DataFlowDispatch::TrackVirtualDispatch<noDisp>::qualifierOfVirtualCall>::PathNodeFwd.getNode/0#dispred#8901ad56_10#join_rhs` ON FIRST 1 OUTPUT Rhs.1 'p1', Lhs.1 'callable', Lhs.2 'p2', Lhs.3 'call'
  7131364481 ~908%      {4}    | JOIN WITH `TypeTrackingImpl::TypeTracking<Location::Location,DataFlowDispatch::TrackVirtualDispatch<noDisp>::TtInput>::TypeTrack<DataFlowDispatch::TrackVirtualDispatch<noDisp>::qualifierSource>::Graph<DataFlowDispatch::TrackVirtualDispatch<noDisp>::qualifierOfVirtualCall>::PathNodeFwd.isSource/0#dispred#88b6728d` ON FIRST 1 OUTPUT Lhs.0 'p1', Lhs.1 'callable', Lhs.2 'p2', Lhs.3 'call'
  3333681000 ~877%      {4}    | JOIN WITH `TypeTrackingImpl::TypeTracking<Location::Location,DataFlowDispatch::TrackVirtualDispatch<noDisp>::TtInput>::TypeTrack<DataFlowDispatch::TrackVirtualDispatch<noDisp>::qualifierSource>::Graph<DataFlowDispatch::TrackVirtualDispatch<noDisp>::qualifierOfVirtualCall>::reachRev/1#ab50895d` ON FIRST 1 OUTPUT Lhs.0 'p1', Lhs.2 'p2', Lhs.1 'callable', Lhs.3 'call'
                        return r1

After (pairCand is now inlined):

[2026-03-26 16:21:55] (854s) Tuple counts for DataFlowDispatch::d5/1#fb855fd5/2@136c8ccu after 16.1s:
  282744     ~0%         {2} r1 = JOIN `TypeTrackingImpl::TypeTracking<Location::Location,DataFlowDispatch::TrackVirtualDispatch<d4>::TtInput>::TypeTrack<DataFlowDispatch::TrackVirtualDispatch<d4>::qualifierSource>::Graph<DataFlowDispatch::TrackVirtualDispatch<d4>::qualifierOfVirtualCall>::reachRev/1#60f80b02` WITH `TypeTrackingImpl::TypeTracking<Location::Location,DataFlowDispatch::TrackVirtualDispatch<d4>::TtInput>::TypeTrack<DataFlowDispatch::TrackVirtualDispatch<d4>::qualifierSource>::Graph<DataFlowDispatch::TrackVirtualDispatch<d4>::qualifierOfVirtualCall>::reachRev/1#60f80b02` ON FIRST 1 OUTPUT Lhs.0, Lhs.0
  
  32290878   ~1%         {2} r2 = `doublyBoundedFastTC:TypeTrackingImpl::TypeTracking<Location::Location,DataFlowDispatch::TrackVirtualDispatch<d4>::TtInput>::TypeTrack<DataFlowDispatch::TrackVirtualDispatch<d4>::qualifierSource>::Graph<DataFlowDispatch::TrackVirtualDispatch<d4>::qualifierOfVirtualCall>::edges/2#1a906743_10#higher_order_body:_DataFlowDispatch::TrackVirtualDispatch<d4>::isSink/1#41abce53_TypeTrackingImpl::TypeTracking<Locati__#higher_order_body:DataFlowDispatch::TrackVirtualDispatch<d4>::isSource/1#3bcd6175` UNION r1
  32290878   ~0%         {2}    | JOIN WITH `TypeTrackingImpl::TypeTracking<Location::Location,DataFlowDispatch::TrackVirtualDispatch<d4>::TtInput>::TypeTrack<DataFlowDispatch::TrackVirtualDispatch<d4>::qualifierSource>::Graph<DataFlowDispatch::TrackVirtualDispatch<d4>::qualifierOfVirtualCall>::reachRev/1#60f80b02` ON FIRST 1 OUTPUT Lhs.1, Lhs.0
  32197673   ~0%         {2}    | JOIN WITH `TypeTrackingImpl::TypeTracking<Location::Location,DataFlowDispatch::TrackVirtualDispatch<d4>::TtInput>::TypeTrack<DataFlowDispatch::TrackVirtualDispatch<d4>::qualifierSource>::Graph<DataFlowDispatch::TrackVirtualDispatch<d4>::qualifierOfVirtualCall>::PathNodeFwd.isSource/0#dispred#5abd6574` ON FIRST 1 OUTPUT Lhs.0, Lhs.1
  32197673   ~1%         {2}    | JOIN WITH `TypeTrackingImpl::TypeTracking<Location::Location,DataFlowDispatch::TrackVirtualDispatch<d4>::TtInput>::TypeTrack<DataFlowDispatch::TrackVirtualDispatch<d4>::qualifierSource>::Graph<DataFlowDispatch::TrackVirtualDispatch<d4>::qualifierOfVirtualCall>::reachRev/1#60f80b02` ON FIRST 1 OUTPUT Lhs.1, Lhs.0
  32097255   ~0%         {2}    | JOIN WITH `num#TypeTrackingImpl::TypeTracking<Location::Location,DataFlowDispatch::TrackVirtualDispatch<d4>::TtInput>::TypeTrack<DataFlowDispatch::TrackVirtualDispatch<d4>::qualifierSource>::Graph<DataFlowDispatch::TrackVirtualDispatch<d4>::qualifierOfVirtualCall>::TPathNodeSink#87fa68f5_1#join_rhs` ON FIRST 1 OUTPUT Lhs.1, Lhs.0
  32097255   ~3%         {2}    | JOIN WITH `TypeTrackingImpl::TypeTracking<Location::Location,DataFlowDispatch::TrackVirtualDispatch<d4>::TtInput>::TypeTrack<DataFlowDispatch::TrackVirtualDispatch<d4>::qualifierSource>::Graph<DataFlowDispatch::TrackVirtualDispatch<d4>::qualifierOfVirtualCall>::PathNodeFwd.getNode/0#dispred#191e1ea3` ON FIRST 1 OUTPUT Lhs.1, Rhs.1
  32097255   ~1%         {2}    | JOIN WITH `TypeTrackingImpl::TypeTracking<Location::Location,DataFlowDispatch::TrackVirtualDispatch<d4>::TtInput>::TypeTrack<DataFlowDispatch::TrackVirtualDispatch<d4>::qualifierSource>::Graph<DataFlowDispatch::TrackVirtualDispatch<d4>::qualifierOfVirtualCall>::PathNodeFwd.getNode/0#dispred#191e1ea3` ON FIRST 1 OUTPUT Rhs.1, Lhs.1
  32171721   ~1%         {3}    | JOIN WITH `DataFlowDispatch::TrackVirtualDispatch<d1>::qualifierOfVirtualCallImpl/3#5649cb98` ON FIRST 1 OUTPUT Rhs.1, Lhs.1, Rhs.2
  32171721   ~0%         {3}    | JOIN WITH DataFlowPrivate::TNormalCall#0f63d738 ON FIRST 1 OUTPUT Lhs.2, Lhs.1, Rhs.1 'call'
  1893840121 ~0%         {4}    | JOIN WITH `DataFlowDispatch::TrackVirtualDispatch<d4>::mostSpecific/2#3e05b0ea` ON FIRST 1 OUTPUT Lhs.1, Rhs.1, Lhs.2 'call', Rhs.2
  49769362   ~21935%     {2}    | JOIN WITH `DataFlowDispatch::qualifierSourceImpl/2#9b5eb2f5` ON FIRST 2 OUTPUT Lhs.3, Lhs.2 'call'
  471745     ~104%       {2}    | JOIN WITH DataFlowPrivate::TSourceCallable#54d42094 ON FIRST 1 OUTPUT Lhs.1 'call', Rhs.1 'result'
                          return r2

and then after a small join order fix:

[2026-03-26 17:06:05] (1050s) Tuple counts for DataFlowDispatch::d5/1#fb855fd5/2@6c7ad81n after 922ms:
  282744   ~1%         {2} r1 = JOIN `TypeTrackingImpl::TypeTracking<Location::Location,DataFlowDispatch::TrackVirtualDispatch<d4>::TtInput>::TypeTrack<DataFlowDispatch::TrackVirtualDispatch<d4>::qualifierSource>::Graph<DataFlowDispatch::TrackVirtualDispatch<d4>::qualifierOfVirtualCall>::reachRev/1#60f80b02` WITH `TypeTrackingImpl::TypeTracking<Location::Location,DataFlowDispatch::TrackVirtualDispatch<d4>::TtInput>::TypeTrack<DataFlowDispatch::TrackVirtualDispatch<d4>::qualifierSource>::Graph<DataFlowDispatch::TrackVirtualDispatch<d4>::qualifierOfVirtualCall>::reachRev/1#60f80b02` ON FIRST 1 OUTPUT Lhs.0, Lhs.0

  32290878 ~3%         {2} r2 = `doublyBoundedFastTC:TypeTrackingImpl::TypeTracking<Location::Location,DataFlowDispatch::TrackVirtualDispatch<d4>::TtInput>::TypeTrack<DataFlowDispatch::TrackVirtualDispatch<d4>::qualifierSource>::Graph<DataFlowDispatch::TrackVirtualDispatch<d4>::qualifierOfVirtualCall>::edges/2#1a906743_10#higher_order_body:_DataFlowDispatch::TrackVirtualDispatch<d4>::isSink/1#41abce53_TypeTrackingImpl::TypeTracking<Locati__#higher_order_body:DataFlowDispatch::TrackVirtualDispatch<d4>::isSource/1#3bcd6175` UNION r1
  32290878 ~0%         {2}    | JOIN WITH `TypeTrackingImpl::TypeTracking<Location::Location,DataFlowDispatch::TrackVirtualDispatch<d4>::TtInput>::TypeTrack<DataFlowDispatch::TrackVirtualDispatch<d4>::qualifierSource>::Graph<DataFlowDispatch::TrackVirtualDispatch<d4>::qualifierOfVirtualCall>::reachRev/1#60f80b02` ON FIRST 1 OUTPUT Lhs.1, Lhs.0
  32197673 ~3%         {2}    | JOIN WITH `TypeTrackingImpl::TypeTracking<Location::Location,DataFlowDispatch::TrackVirtualDispatch<d4>::TtInput>::TypeTrack<DataFlowDispatch::TrackVirtualDispatch<d4>::qualifierSource>::Graph<DataFlowDispatch::TrackVirtualDispatch<d4>::qualifierOfVirtualCall>::PathNodeFwd.isSource/0#dispred#5abd6574` ON FIRST 1 OUTPUT Lhs.1, Lhs.0
  32097255 ~3%         {2}    | JOIN WITH `num#TypeTrackingImpl::TypeTracking<Location::Location,DataFlowDispatch::TrackVirtualDispatch<d4>::TtInput>::TypeTrack<DataFlowDispatch::TrackVirtualDispatch<d4>::qualifierSource>::Graph<DataFlowDispatch::TrackVirtualDispatch<d4>::qualifierOfVirtualCall>::TPathNodeSink#87fa68f5_1#join_rhs` ON FIRST 1 OUTPUT Lhs.0, Lhs.1
  32097255 ~0%         {2}    | JOIN WITH `TypeTrackingImpl::TypeTracking<Location::Location,DataFlowDispatch::TrackVirtualDispatch<d4>::TtInput>::TypeTrack<DataFlowDispatch::TrackVirtualDispatch<d4>::qualifierSource>::Graph<DataFlowDispatch::TrackVirtualDispatch<d4>::qualifierOfVirtualCall>::PathNodeFwd.getNode/0#dispred#191e1ea3` ON FIRST 1 OUTPUT Rhs.1, Lhs.1
  32171721 ~0%         {3}    | JOIN WITH `DataFlowDispatch::TrackVirtualDispatch<d1>::qualifierOfVirtualCallImpl/3#5649cb98` ON FIRST 1 OUTPUT Rhs.1, Lhs.1, Rhs.2
  32171721 ~0%         {3}    | JOIN WITH DataFlowPrivate::TNormalCall#0f63d738 ON FIRST 1 OUTPUT Lhs.1, Lhs.2, Rhs.1 'call'
  49768855 ~21935%     {2}    | JOIN WITH `DataFlowDispatch::TrackVirtualDispatch<d4>::mostSpecificForSource/2#c1de398d` ON FIRST 2 OUTPUT Rhs.2, Lhs.2 'call'
  567835   ~146%       {2}    | JOIN WITH DataFlowPrivate::TSourceCallable#54d42094 ON FIRST 1 OUTPUT Lhs.1 'call', Rhs.1 'result'
                        return r2

…ges and inline pairCand to avoid a giant tuple explosion.
@MathiasVP MathiasVP requested a review from a team as a code owner March 26, 2026 17:39
Copilot AI review requested due to automatic review settings March 26, 2026 17:39
@github-actions github-actions bot added the C++ label Mar 26, 2026
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR targets performance in C/C++ virtual dispatch computation by improving transitive-closure and join-order behavior in TrackVirtualDispatch, aiming to reduce evaluation cost on large databases.

Changes:

  • Replace fastTC(edges/2) with doublyBoundedFastTC(edges/2, isSource/1, isSink/1) for edgePlus.
  • Inline/restructure pairCand computation by factoring out mostSpecificForSource and adding bindingset / inline_late to influence evaluation order.

@MathiasVP MathiasVP added the no-change-note-required This PR does not need a change note label Mar 26, 2026
LWSimpkins added a commit to microsoft/codeql that referenced this pull request Mar 26, 2026
Copy link
Contributor

@geoffw0 geoffw0 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

QL changes LGTM.

I made a comment on the DCA run. I don't really know if it's worth investigating or not.

private predicate isSink(PathNode n) { n.isSink() }

private predicate edgePlus(PathNode n1, PathNode n2) =
doublyBoundedFastTC(edges/2, isSource/1, isSink/1)(n1, n2)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've never seen a definition of doublyBoundedFastTC, but I think I see what's going on. It's computing a transitive closure of edges, but limited to those edges that connect somewhere between sources and sinks (presumably via a forwards-backwards passes type algorithm).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, exactly. doublyBoundedFastTC is also the HOP that's used when you evaluate any dataflow flowPath predicate.

@MathiasVP
Copy link
Contributor Author

QL changes LGTM.

I made a comment on the DCA run. I don't really know if it's worth investigating or not.

Thanks! I've replied to the comment. The TLDR is: It's not 😂

Copy link
Contributor

@geoffw0 geoffw0 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm happy with your explanation. :)

@MathiasVP MathiasVP merged commit 8fc914f into github:main Mar 27, 2026
22 of 23 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

C++ no-change-note-required This PR does not need a change note

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants