Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

planner: remove redundant branches in the OR list | tidb-test=pr/2470 #58962

Open
wants to merge 11 commits into
base: master
Choose a base branch
from

Conversation

time-and-fate
Copy link
Member

@time-and-fate time-and-fate commented Jan 15, 2025

What problem does this PR solve?

Issue Number: close #58998

What changed and how does it work?

In the rule predicate_simplification, add a new step to remove redundant (duplicate) expressions from the OR list. It's more like the existing RemoveDupExprs() than PropagateConstant() for AND lists.

Specifically, it uses HashCode() to check the redundancy, like we did in RemoveDupExprs() and generateANDIndexMerge4NormalIndex(). And it recursively does this for nested AND lists.

Check List

Tests

  • Unit test
  • Integration test
  • Manual test (add detailed scripts or steps below)
  • No need to test
    • I checked and no code files have been changed.

Side effects

  • Performance regression: Consumes more CPU
  • Performance regression: Consumes more Memory
  • Breaking backward compatibility

Documentation

  • Affects user behaviors
  • Contains syntax changes
  • Contains variable changes
  • Contains experimental features
  • Changes MySQL compatibility

Release note

Please refer to Release Notes Language Style Guide to write a quality release note.

None

@ti-chi-bot ti-chi-bot bot added do-not-merge/needs-linked-issue release-note-none Denotes a PR that doesn't merit a release note. labels Jan 15, 2025
Copy link

ti-chi-bot bot commented Jan 15, 2025

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please ask for approval from time-and-fate, ensuring that each of them provides their approval before proceeding. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@ti-chi-bot ti-chi-bot bot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. sig/planner SIG: Planner labels Jan 15, 2025
@time-and-fate time-and-fate changed the title planner: remove redundant branches in the OR list planner: remove redundant branches in the OR list | tidb-test=pr/2470 Jan 15, 2025
Copy link

codecov bot commented Jan 15, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 73.4821%. Comparing base (f7c94c8) to head (fc9e6ac).
Report is 1 commits behind head on master.

Additional details and impacted files
@@               Coverage Diff                @@
##             master     #58962        +/-   ##
================================================
+ Coverage   73.0164%   73.4821%   +0.4657%     
================================================
  Files          1685       1684         -1     
  Lines        465917     465785       -132     
================================================
+ Hits         340196     342269      +2073     
+ Misses       104790     102578      -2212     
- Partials      20931      20938         +7     
Flag Coverage Δ
integration 42.8110% <100.0000%> (?)
unit 72.2363% <88.8888%> (+0.0086%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Components Coverage Δ
dumpling 52.6910% <ø> (ø)
parser ∅ <ø> (∅)
br 45.2780% <ø> (+0.0086%) ⬆️

@time-and-fate
Copy link
Member Author

/retest

hashCode := string(orItem.HashCode())
// 2-1. If it's not a duplicate, we need to keep this predicate.
if _, ok := dedupMap[hashCode]; !ok {
dedupMap[hashCode] = struct{}{}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor: the value of the map can be anything and may be just a simple constant like True.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Technically, yes. But I think using map[T]struct{} as a set has become a common practice in Golang.
https://stackoverflow.com/a/47544821


func recursiveRemoveRedundantORBranch(sctx base.PlanContext, predicate expression.Expression) expression.Expression {
_, tp := FindPredicateType(sctx, predicate)
if tp != orPredicate {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a recursive function (indirectly through removeRedundantORBranch). If you intend to cover the And case (like the code in the function) then you just need a general logic for OR AND lists.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't intend to cover the AND case (since that's already covered by other existing logic), I want to handle OR lists nested in another AND list.


drop table if exists t4;
create table t4(a int, b int, c int, d int, index iab(a,b), index iac(a,c), index iad(a,d));
explain format=brief select /*+ use_index_merge(t4) */ * from t4 where a = 1 and (b = 2 or c = 4 or b = 12 or c = 5 or d = 6 or c = 4 or c = 5 or d = 6);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you add more nested AND/OR cases since the code does that?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated.
I added several more cases in t1 since the logic of building index ranges has some extra functionality to simplify the ranges. So I think printing the expression in Selection can test the effect of this new rule more straightforwardly.
The t4 case is a simplified version of the original case from the user.

@@ -186,6 +186,7 @@ func splitCNF(conditions []expression.Expression) []expression.Expression {
func applyPredicateSimplification(sctx base.PlanContext, predicates []expression.Expression) []expression.Expression {
simplifiedPredicate := shortCircuitLogicalConstants(sctx, predicates)
simplifiedPredicate = mergeInAndNotEQLists(sctx, simplifiedPredicate)
removeRedundantORBranch(sctx, simplifiedPredicate)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any reason of the order of the sub-rule?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are no strong reasons. Actually, after I put this new sub-rule in a different order, the new test cases don't have any changes.
I have several minor considerations though:

  1. I try not to change the order of existing sub-rules.
  2. mergeInAndNotEQLists() simplifies the predicates by merging some of them, i.e., constructing some new expressions. Probably there will be new redundant expressions after that sub-rule, so put the new sub-rule after it might be a good idea.
  3. pruneEmptyORBranches() just removes useless OR branches, and should not produce new redundant expressions. So probably it's useless to put the new sub-rule after it.

Copy link

ti-chi-bot bot commented Jan 16, 2025

[LGTM Timeline notifier]

Timeline:

  • 2025-01-16 16:07:50.547467868 +0000 UTC m=+221142.002514016: ✖️🔁 reset by ghazalfamilyusa.

@time-and-fate
Copy link
Member Author

/retest

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
release-note-none Denotes a PR that doesn't merit a release note. sig/planner SIG: Planner size/L Denotes a PR that changes 100-499 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Remove redundant branches in the OR list in filters
2 participants