[fix](catalog) opt the count pushdown rule for iceberg/paimon/hive scan node #44038
What problem does this PR solve?
1. Opt the parallelism when doing count push-down optimization
Count push-down optimization is used to optimize queries such as `select count(*) from table`.
In this scenario, we can directly obtain the number of rows from the row count statistics
of the external table, or the metadata of the Parquet/ORC file,
without reading the actual file content, thereby speeding up such queries.
Currently, we support count push down optimization for Hive, Iceberg, and Paimon tables.
There are two ways to obtain the number of rows:
1. Obtain directly from statistics
For Iceberg tables, we can obtain the number of rows directly from statistics.
However, due to historical issues in Iceberg, if the table contains position/equality deletes,
this method cannot be used, because it could return an incorrect row count.
In that case, the optimization falls back to reading the row count from the file metadata.
2. Obtain from the metadata of the file
For Hive, Paimon, and some Iceberg tables, the number of rows can be obtained directly
from the metadata of the Parquet/ORC file.
For Text format tables, efficiency can also be improved by performing only row separation, without column separation.
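The choice between the two row-count sources above can be sketched as follows. This is a minimal, language-agnostic sketch in Python (Doris's actual implementation is in Java); the function and parameter names are illustrative, not actual Doris identifiers:

```python
def count_star_result(table_type, stats_row_count, has_deletes, file_meta_rows):
    """Pick the row-count source for a count(*) query (illustrative sketch).

    table_type: "iceberg", "hive", or "paimon"
    stats_row_count: row count from table-level statistics, or None
    has_deletes: True if the Iceberg table has position/equality deletes
    file_meta_rows: per-file row counts taken from Parquet/ORC metadata
    """
    # Way 1: Iceberg can answer directly from statistics, but only when
    # there are no position/equality deletes, which would make the
    # statistics-based count incorrect.
    if table_type == "iceberg" and stats_row_count is not None and not has_deletes:
        return stats_row_count
    # Way 2: fall back to summing per-file row counts from the file
    # metadata, without reading the actual file contents.
    return sum(file_meta_rows)
```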
In the task splitting logic, for count push-down optimization, the number of split tasks should comprehensively
consider the file format, the number of files, the parallelism, the number of BE nodes, and Local Shuffle:
count push-down optimization should avoid Local Shuffle, so the number of split tasks should be
greater than or equal to `parallelism * number of BE nodes`.
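The lower bound on the split count can be sketched as below. This is a simplified Python sketch under the assumption that each file can be cut into byte ranges; the real splitting logic also weighs the file format and other factors:

```python
def plan_splits(file_sizes, parallelism, num_be_nodes):
    """Cut files into at least parallelism * num_be_nodes split tasks.

    Each file is divided into enough byte ranges that the total number of
    splits reaches the target, so that count push-down queries can keep
    every pipeline busy without resorting to Local Shuffle.
    """
    target = parallelism * num_be_nodes
    # Ceiling division: splits needed per file to reach the target.
    splits_per_file = -(-target // len(file_sizes))
    splits = []
    for size in file_sizes:
        chunk = -(-size // splits_per_file)  # ceiling division
        start = 0
        while start < size:
            splits.append((start, min(chunk, size - start)))
            start += chunk
    return splits
```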
2. Fix the incorrect logic of count push-down optimization
In the previous code, count push-down optimization did not take effect for Iceberg and Paimon tables because the
CountPushDown information was not pushed to the FileFormatReader inside TableFormatReader. This PR fixes that problem.
3. Store the SessionVariable reference in FileQueryScanNode.
SessionVariable is a member of ConnectionContext, and ConnectionContext is a ThreadLocal variable.
In FileQueryScanNode, SessionVariable may in some cases be accessed from other threads,
where the ThreadLocal variable cannot be obtained.
Therefore, the SessionVariable reference is stored in FileQueryScanNode to prevent illegal access.
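The ThreadLocal pitfall described above can be reproduced with a small Python sketch (Doris's actual classes are Java; `ScanNode` here is a hypothetical stand-in for FileQueryScanNode): a value stored in thread-local storage is invisible to other threads, so the scan node must capture a plain reference at construction time.

```python
import threading

_ctx = threading.local()  # stands in for the ThreadLocal ConnectionContext

class ScanNode:
    """Hypothetical stand-in for FileQueryScanNode."""
    def __init__(self):
        # Capture the session variables while still on the connection
        # thread; later accesses may happen on worker threads where the
        # thread-local is unset.
        self.session_vars = getattr(_ctx, "session_vars", None)

def read_from_worker(node, out):
    # On a worker thread the thread-local is empty...
    out["thread_local"] = getattr(_ctx, "session_vars", None)
    # ...but the reference captured at construction is still available.
    out["captured"] = node.session_vars

_ctx.session_vars = {"parallelism": 8}
node = ScanNode()          # created on the connection thread
result = {}
t = threading.Thread(target=read_from_worker, args=(node, result))
t.start()
t.join()
```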
4. Independent FileSplitter class.
The FileSplitter class is a utility class that splits files into `Split` objects
according to different strategies. This PR does not modify the splitting strategy;
it only extracts this logic into a separate class so that it can be optimized later.
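A sketch of what such an extracted splitter utility might look like (the function name and signature are illustrative, not the actual FileSplitter API): the splitting strategy is passed in as a parameter, so it can be changed later without touching the scan node.

```python
def split_files(files, strategy):
    """Split (path, size) pairs into (path, offset, length) split tuples.

    strategy(size) returns the chunk length to use for a file of that
    size, making the splitting policy pluggable.
    """
    splits = []
    for path, size in files:
        chunk = max(1, strategy(size))
        start = 0
        while start < size:
            splits.append((path, start, min(chunk, size - start)))
            start += chunk
    return splits

# One possible strategy: fixed 128 MB chunks.
fixed_128mb = lambda size: 128 * 1024 * 1024
```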
Release note
None
Check List (For Author)
Test
Behavior changed:
Does this need documentation?
Check List (For Reviewer who merges this PR)