Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
[Improvement](shuffle) Use a knob to decide whether a serial exchange… (
apache#44676) … should be used This improvement was completed in apache#43199 and reverted by apache#44075 due to performance fallback. After fixing it, this improvement is re-submited. A new knob to control a exchange node should be serial or not. For example, a partitioned hash join should be executed like below: ``` ┌────────────────────────────┐ ┌────────────────────────────┐ │ │ │ │ │Exchange(HASH PARTITIONED N)│ │Exchange(HASH PARTITIONED N)│ │ │ │ │ └────────────────────────────┴─────────┬────────┴────────────────────────────┘ │ │ │ │ │ │ ┌──────▼──────┐ │ │ │ HASH JOIN │ │ │ └─────────────┘ ``` After turning on this knob, the real plan should be: ``` ┌──────────────────────────────┐ ┌──────────────────────────────┐ │ │ │ │ │ Exchange (HASH PARTITIONED 1)│ │ Exchange (HASH PARTITIONED 1)│ │ │ │ │ └────────────┬─────────────────┘ └────────────┬─────────────────┘ │ │ │ │ │ │ │ │ │ │ ┌──────────────▼─────────────────────┐ ┌──────────────▼─────────────────────┐ │ │ │ │ │ Local Exchange(HASH PARTITIONED N)│ │ Local Exchange(HASH PARTITIONED N)│ │ 1 -> N │ │ 1 -> N │ └────────────────────────────────────┴─────────┬────────┴────────────────────────────────────┴ │ │ │ │ │ │ ┌──────▼──────┐ │ │ │ HASH JOIN │ │ │ └─────────────┘ ``` For large cluster, X (mappers) * Y (reducers) rpc channels can be reduced to X (mappers) * Z (BEs).
- Loading branch information