Data flow: Fix a bad join #16251

Merged Apr 19, 2024 (1 commit)

hvitved (Contributor) commented on Apr 18, 2024

Before

(726s) Cancelling evaluation of #12545 evaluator rec DataFlowImpl::Impl<TaintedPath::TaintedPath::Flow::C>::Subpaths::subpaths05/4#6d4ff244/4@2f7ff6v4: No demand for this layer anymore.
(726s) Tuple counts for DataFlowImpl::Impl<TaintedPath::TaintedPath::Flow::C>::Subpaths::subpaths05/4#6d4ff244/4@i1#2f7ff6v4 after 11m42s:
3969000     ~1572%     {4} r1 = SCAN `DataFlowImpl::Impl<TaintedPath::TaintedPath::Flow::C>::Subpaths::subpaths04/4#89effa1e` OUTPUT In.1 'par', In.0 'arg', In.2, In.3 'out'

3963564     ~1607%     {4} r2 = r1 AND NOT `DataFlowImpl::Impl<TaintedPath::TaintedPath::Flow::C>::PathNodeImpl.isHidden/0#dispred#8010d23f`(FIRST 1)

9998        ~0%        {4} r3 = JOIN r1 WITH `#DataFlowImpl::Impl<TaintedPath::TaintedPath::Flow::C>::Subpaths::localStepFromHidden/2#2db712ccPlus` ON FIRST 1 OUTPUT Rhs.1 'par', Lhs.1 'arg', Lhs.2, Lhs.3 'out'
0           ~0%        {4}    | AND NOT `DataFlowImpl::Impl<TaintedPath::TaintedPath::Flow::C>::PathNodeImpl.isHidden/0#dispred#8010d23f`(FIRST 1)

3963564     ~1607%     {4} r4 = r2 UNION r3
3963564     ~1607%     {4}    | AND NOT `DataFlowImpl::Impl<TaintedPath::TaintedPath::Flow::C>::PathNodeImpl.isHidden/0#dispred#8010d23f`(FIRST 1)

3963564     ~1472%     {5} r5 = SCAN r4 OUTPUT In.0 'par', In.2, In.1 'arg', In.3 'out', In.0 'par'

27551160000 ~1459%     {5} r6 = JOIN r4 WITH `#DataFlowImpl::Impl<TaintedPath::TaintedPath::Flow::C>::Subpaths::summaryCtxStep/1#1f7d276aPlus` ON FIRST 1 OUTPUT Rhs.1 'ret', Lhs.1 'arg', Lhs.2, Lhs.3 'out', Lhs.0 'par'
                       {5}    | AND NOT `DataFlowImpl::Impl<TaintedPath::TaintedPath::Flow::C>::PathNodeImpl.isHidden/0#dispred#8010d23f`(FIRST 1)
27551146500 ~1540%     {5}    | SCAN OUTPUT In.0 'ret', In.2, In.1 'arg', In.3 'out', In.4 'par'

27555109564 ~1540%     {5} r7 = r5 UNION r6
0           ~0%        {4}    | JOIN WITH `#DataFlowImpl::Impl<TaintedPath::TaintedPath::Flow::C>::Subpaths::localStepToHidden/2#6528ff55Plus` ON FIRST 2 OUTPUT Lhs.0 'ret', Lhs.2 'arg', Lhs.3 'out', Lhs.4 'par'
                       {4}    | AND NOT `DataFlowImpl::Impl<TaintedPath::TaintedPath::Flow::C>::PathNodeImpl.isHidden/0#dispred#8010d23f`(FIRST 1)
0           ~0%        {4}    | SCAN OUTPUT In.1 'arg', In.3 'par', In.0 'ret', In.2 'out'

                       {4} r8 = REWRITE `DataFlowImpl::Impl<TaintedPath::TaintedPath::Flow::C>::Subpaths::subpaths04/4#89effa1e` WITH TEST InOut.1 'par' = InOut.2
0           ~0%        {3}    | SCAN OUTPUT In.1 'par', In.0 'arg', In.3 'out'
0           ~0%        {3}    | AND NOT `DataFlowImpl::Impl<TaintedPath::TaintedPath::Flow::C>::PathNodeImpl.isHidden/0#dispred#8010d23f`(FIRST 1)

4500        ~0%        {3} r9 = JOIN `DataFlowImpl::Impl<TaintedPath::TaintedPath::Flow::C>::Subpaths::subpaths04/4#89effa1e_1203#join_rhs` WITH `#DataFlowImpl::Impl<TaintedPath::TaintedPath::Flow::C>::Subpaths::localStepFromHidden/2#2db712ccPlus` ON FIRST 2 OUTPUT Lhs.1 'par', Lhs.2 'arg', Lhs.3 'out'
0           ~0%        {3}    | AND NOT `DataFlowImpl::Impl<TaintedPath::TaintedPath::Flow::C>::PathNodeImpl.isHidden/0#dispred#8010d23f`(FIRST 1)

0           ~0%        {3} r10 = r8 UNION r9
                       {3}    | AND NOT `DataFlowImpl::Impl<TaintedPath::TaintedPath::Flow::C>::PathNodeImpl.isHidden/0#dispred#8010d23f`(FIRST 1)
0           ~0%        {4}    | SCAN OUTPUT In.0 'par', In.1 'arg', In.2 'out', In.0 'par'

3955564     ~1547%     {4} r11 = SCAN r4 OUTPUT In.0 'par', In.2, In.1 'arg', In.3 'out'
3948000     ~1541%     {4}    | JOIN WITH `#DataFlowImpl::Impl<TaintedPath::TaintedPath::Flow::C>::Subpaths::summaryCtxStep/1#1f7d276aPlus` ON FIRST 2 OUTPUT Lhs.1 'ret', Lhs.2 'arg', Lhs.3 'out', Lhs.0 'par'
3948000     ~1541%     {4}    | AND NOT `DataFlowImpl::Impl<TaintedPath::TaintedPath::Flow::C>::PathNodeImpl.isHidden/0#dispred#8010d23f`(FIRST 1)

3948000     ~1541%     {4} r12 = r10 UNION r11
                       {4}    | AND NOT `DataFlowImpl::Impl<TaintedPath::TaintedPath::Flow::C>::PathNodeImpl.isHidden/0#dispred#8010d23f`(FIRST 1)
3940000     ~1561%     {4}    | SCAN OUTPUT In.1 'arg', In.3 'par', In.0 'ret', In.2 'out'

3940000     ~1561%     {4} r13 = r7 UNION r12
                       return r13

After

Evaluated relational algebra for predicate DataFlowImpl::Impl<TaintedPath::TaintedPath::Flow::C>::Subpaths::subpaths05/4#6d4ff244@a42e25u4 on iteration 1 running pipeline base with tuple counts:
        4426831   ~4%    {4} r1 = SCAN `DataFlowImpl::Impl<TaintedPath::TaintedPath::Flow::C>::Subpaths::subpaths04/4#89effa1e` OUTPUT In.1, In.0, In.2, In.3
                     
        4421313   ~4%    {4} r2 = r1 AND NOT `DataFlowImpl::Impl<TaintedPath::TaintedPath::Flow::C>::PathNodeImpl.isHidden/0#dispred#8010d23f`(FIRST 1)
                     
          10477   ~4%    {4} r3 = JOIN r1 WITH `#DataFlowImpl::Impl<TaintedPath::TaintedPath::Flow::C>::Subpaths::localStepFromHidden/2#2db712ccPlus` ON FIRST 1 OUTPUT Rhs.1, Lhs.1, Lhs.2, Lhs.3
              0   ~0%    {4}    | AND NOT `DataFlowImpl::Impl<TaintedPath::TaintedPath::Flow::C>::PathNodeImpl.isHidden/0#dispred#8010d23f`(FIRST 1)
                     
        4421313   ~4%    {4} r4 = r2 UNION r3
        4421313   ~4%    {4}    | AND NOT `DataFlowImpl::Impl<TaintedPath::TaintedPath::Flow::C>::PathNodeImpl.isHidden/0#dispred#8010d23f`(FIRST 1)
                     
        4421313   ~0%    {4} r5 = SCAN r4 OUTPUT In.2, In.1, In.3, In.0
                     
                         {4} r6 = r5 AND NOT `DataFlowImpl::Impl<TaintedPath::TaintedPath::Flow::C>::PathNodeImpl.isHidden/0#dispred#8010d23f`(FIRST 1)
        4421313   ~1%    {4}    | SCAN OUTPUT In.3, In.0, In.1, In.2
                     
              0   ~0%    {4} r7 = JOIN r5 WITH `#DataFlowImpl::Impl<TaintedPath::TaintedPath::Flow::C>::Subpaths::localStepToHidden/2#6528ff55Plus#swapped` ON FIRST 1 OUTPUT Rhs.1, Lhs.1, Lhs.2, Lhs.3
                         {4}    | AND NOT `DataFlowImpl::Impl<TaintedPath::TaintedPath::Flow::C>::PathNodeImpl.isHidden/0#dispred#8010d23f`(FIRST 1)
              0   ~0%    {4}    | SCAN OUTPUT In.3, In.0, In.1, In.2
                     
        4421313   ~1%    {4} r8 = r6 UNION r7
        4421307   ~0%    {4}    | JOIN WITH `#DataFlowImpl::Impl<TaintedPath::TaintedPath::Flow::C>::Subpaths::summaryCtxStep/1#1f7d276aPlus` ON FIRST 2 OUTPUT Lhs.1, Lhs.2, Lhs.3, Lhs.0
                     
                         {4} r9 = REWRITE `DataFlowImpl::Impl<TaintedPath::TaintedPath::Flow::C>::Subpaths::subpaths04/4#89effa1e` WITH TEST InOut.1 = InOut.2
              0   ~0%    {4}    | SCAN OUTPUT In.1, In.0, In.2, In.3
              0   ~0%    {4}    | AND NOT `DataFlowImpl::Impl<TaintedPath::TaintedPath::Flow::C>::PathNodeImpl.isHidden/0#dispred#8010d23f`(FIRST 1)
                     
           5518   ~1%    {4} r10 = JOIN `DataFlowImpl::Impl<TaintedPath::TaintedPath::Flow::C>::Subpaths::subpaths04/4#89effa1e_1203#join_rhs` WITH `#DataFlowImpl::Impl<TaintedPath::TaintedPath::Flow::C>::Subpaths::localStepFromHidden/2#2db712ccPlus` ON FIRST 2 OUTPUT Lhs.1, Lhs.2, Lhs.1, Lhs.3
              0   ~0%    {4}    | AND NOT `DataFlowImpl::Impl<TaintedPath::TaintedPath::Flow::C>::PathNodeImpl.isHidden/0#dispred#8010d23f`(FIRST 1)
                     
              0   ~0%    {4} r11 = r9 UNION r10
                         {4}    | AND NOT `DataFlowImpl::Impl<TaintedPath::TaintedPath::Flow::C>::PathNodeImpl.isHidden/0#dispred#8010d23f`(FIRST 1)
              0   ~0%    {4}    | SCAN OUTPUT In.2, In.1, In.3, In.0
                         {4}    | AND NOT `DataFlowImpl::Impl<TaintedPath::TaintedPath::Flow::C>::PathNodeImpl.isHidden/0#dispred#8010d23f`(FIRST 1)
              0   ~0%    {4}    | SCAN OUTPUT In.3, In.1, In.2, In.0
                     
        4421313   ~1%    {4} r12 = SCAN r4 OUTPUT In.0, In.2, In.1, In.3
              0   ~0%    {4}    | JOIN WITH `#DataFlowImpl::Impl<TaintedPath::TaintedPath::Flow::C>::Subpaths::localStepToHidden/2#6528ff55Plus` ON FIRST 2 OUTPUT Lhs.0, Lhs.2, Lhs.3, Lhs.0
                         {4}    | AND NOT `DataFlowImpl::Impl<TaintedPath::TaintedPath::Flow::C>::PathNodeImpl.isHidden/0#dispred#8010d23f`(FIRST 1)
              0   ~0%    {4}    | SCAN OUTPUT In.3, In.1, In.2, In.0
                     
              0   ~0%    {4} r13 = r11 UNION r12
              0   ~0%    {4}    | JOIN WITH `DataFlowImpl::Impl<TaintedPath::TaintedPath::Flow::C>::PathNodeMid#f7155bb5` ON FIRST 1 OUTPUT Lhs.3, Lhs.1, Lhs.2, Lhs.0
                     
        4421307   ~0%    {4} r14 = r8 UNION r13
                         {4}    | AND NOT `DataFlowImpl::Impl<TaintedPath::TaintedPath::Flow::C>::PathNodeImpl.isHidden/0#dispred#8010d23f`(FIRST 1)
        4421307   ~0%    {4}    | SCAN OUTPUT In.1, In.3, In.0, In.2
                         return r14
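
One reading of the two logs above: in the "Before" pipeline the join with the `summaryCtxStep` closure is `ON FIRST 1` and emits about 27.5 billion tuples from a roughly 4 million tuple input (several thousand times as many), while in the "After" pipeline the same closure is joined `ON FIRST 2` and the tuple count stays near 4.4 million. The Python sketch below illustrates that general effect on toy data only. The relation contents, the helper `join_on_first`, and the names `step_plus` and `inputs` are hypothetical and do not reproduce the actual CodeQL relations or pipeline.

```python
from collections import defaultdict


def join_on_first(lhs, rhs, k):
    """Join lhs with rhs on their first k columns.

    Each result is the lhs tuple followed by the unmatched rhs columns.
    """
    index = defaultdict(list)
    for row in rhs:
        index[row[:k]].append(row[k:])
    return [left + rest for left in lhs for rest in index.get(left[:k], [])]


n = 50
# Toy stand-in for a transitive-closure relation: every pair (a, b) with a < b.
step_plus = [(a, b) for a in range(n) for b in range(a + 1, n)]
# Toy input relation with columns (first, second, payload).
inputs = [(a, a + 1, f"arg{a}") for a in range(n - 1)]

# Joining on one column enumerates every closure successor of `first`.
wide = join_on_first(inputs, step_plus, 1)
# Joining on two columns only checks that (first, second) is in the closure.
narrow = join_on_first(inputs, step_plus, 2)
# Filtering the wide result afterwards recovers the same answer.
filtered = [t[:3] for t in wide if t[1] == t[3]]

print(len(inputs), len(wide), len(narrow))
```

On this toy input the one-column join produces 1225 intermediate tuples from 49 inputs, while the two-column join produces 49; both routes reach the same final result, so only the intermediate size differs.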

owen-mc (Contributor) left a review comment and previously approved these changes Apr 18, 2024

This fixes the problem for me locally. My only hesitation is that I don't know enough about join orders to know if it definitely stops a bad join order being chosen, or if it just perturbs it to a good order in the two cases I checked.

hvitved force-pushed the dataflow/fix-bad-join2 branch from 9c8d660 to 339c40c, April 19, 2024 06:18
hvitved added the no-change-note-required label (This PR does not need a change note), Apr 19, 2024
hvitved requested a review from aschackmull, April 19, 2024 06:18
hvitved marked this pull request as ready for review, April 19, 2024 06:21

aschackmull (Contributor) left a comment

LGTM!

hvitved merged commit 18acad5 into github:main, Apr 19, 2024
26 checks passed
hvitved deleted the dataflow/fix-bad-join2 branch, April 19, 2024 07:49
hvitved restored the dataflow/fix-bad-join2 branch, April 24, 2024 10:22
Labels: DataFlow Library, no-change-note-required (This PR does not need a change note)
3 participants