Run DESCRIBE HISTORY in parallel to improve performance #917

Merged on Jul 20, 2024 (1 commit)

@mars-lan mars-lan commented Jul 20, 2024

🤔 Why?

Each DESCRIBE HISTORY command takes 1 to 5 seconds to complete. The time adds up quickly when running against a large number of tables.

🤓 What?

  • Run DESCRIBE HISTORY commands in batches on a thread pool to improve performance.
  • Add a create_connect_pool util method for code sharing.
  • Bump the LIMIT for DESCRIBE HISTORY from 50 to 100, since we're less constrained by time.
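
For readers who want to see the general shape of the first and third bullets, here is a minimal sketch of fanning DESCRIBE HISTORY queries out over a thread pool. It is not the code from this PR: `describe_history_query`, `batch_describe_history`, the `run_query` callable, and the `max_workers` default are all hypothetical names and values chosen for illustration, and the sketch does not attempt to reproduce `create_connect_pool`.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
from typing import Callable, Dict, Iterable, List, Optional

# The PR raises the DESCRIBE HISTORY limit from 50 to 100.
HISTORY_LIMIT = 100


def describe_history_query(table: str, limit: int = HISTORY_LIMIT) -> str:
    """Build the SQL statement for one table (hypothetical helper)."""
    return f"DESCRIBE HISTORY {table} LIMIT {limit}"


def batch_describe_history(
    tables: Iterable[str],
    run_query: Callable[[str], List[dict]],
    max_workers: int = 10,  # assumed value, not taken from the PR
) -> Dict[str, Optional[List[dict]]]:
    """Run one DESCRIBE HISTORY per table concurrently.

    `run_query` is a caller-supplied function that executes a SQL string
    and returns rows; it stands in for whatever connection handling the
    real extractor uses. A failed table maps to None instead of aborting
    the whole batch.
    """
    results: Dict[str, Optional[List[dict]]] = {}
    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        futures = {
            executor.submit(run_query, describe_history_query(table)): table
            for table in tables
        }
        for future in as_completed(futures):
            table = futures[future]
            try:
                results[table] = future.result()
            except Exception:
                results[table] = None
    return results


if __name__ == "__main__":
    # Stand-in executor that just echoes the query it was given.
    def fake_run_query(sql: str) -> List[dict]:
        return [{"query": sql}]

    history = batch_describe_history(["cat.schema.a", "cat.schema.b"], fake_run_query)
    print(sorted(history))
```

Because each query spends most of its time waiting on the warehouse rather than on the CPU, threads are a reasonable fit here. The worker count would need tuning against whatever concurrency the SQL endpoint tolerates.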

🧪 Tested?

Verified against a production instance with ~3000 tables and saw close to a 10x speedup (it took 7 minutes to get all last refresh dates).
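
As a rough sanity check on those figures, the arithmetic below backs out the implied sequential cost. It treats "close to 10x" as exactly 10 and "~3000" as exactly 3000, so the results are approximations rather than measurements.

```python
# Figures quoted in the PR description, rounded for the estimate.
tables = 3000
parallel_seconds = 7 * 60   # 7 minutes with the thread pool
speedup = 10                # "close to 10x", taken as exactly 10 here

# Implied sequential runtime and average cost per table.
sequential_seconds = parallel_seconds * speedup
per_table_seconds = sequential_seconds / tables

print(f"sequential: about {sequential_seconds / 60:.0f} minutes")
print(f"per table:  about {per_table_seconds:.1f} seconds")
```

The implied average of roughly 1.4 seconds per table sits inside the 1 to 5 second range quoted in the motivation.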

☑️ Checks

  • My PR contains actual code changes, and I have updated the version number in pyproject.toml.

github-actions bot commented Jul 20, 2024

☂️ Python Coverage

current status: ✅

Overall Coverage

| Lines | Covered | Coverage | Threshold | Status |
|-------|---------|----------|-----------|--------|
| 19301 | 18093   | 94%      | 85%       | 🟢     |

New Files

No new covered files...

Modified Files

| File | Coverage | Status |
|------|----------|--------|
| metaphor/unity_catalog/config.py | 100% | 🟢 |
| metaphor/unity_catalog/extractor.py | 97% | 🟢 |
| metaphor/unity_catalog/profile/extractor.py | 94% | 🟢 |
| metaphor/unity_catalog/utils.py | 89% | 🟢 |
| TOTAL | 95% | 🟢 |

updated for commit: de9f5ea by action🐍

codecov bot commented Jul 20, 2024

Codecov Report

Attention: Patch coverage is 90.24390% with 4 lines in your changes missing coverage. Please review.

Project coverage is 93.74%. Comparing base (b517c90) to head (ec534f3).

| Files | Patch % | Lines |
|-------|---------|-------|
| metaphor/unity_catalog/utils.py | 85.71% | 4 Missing ⚠️ |

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #917      +/-   ##
==========================================
+ Coverage   93.57%   93.74%   +0.16%     
==========================================
  Files         216      171      -45     
  Lines       19447    19299     -148     
==========================================
- Hits        18197    18091     -106     
+ Misses       1250     1208      -42     

@mars-lan mars-lan force-pushed the marslan/sc-27635/improve-describe-history-speed branch 3 times, most recently from c1bab61 to e0798dc Compare July 20, 2024 18:16
@mars-lan mars-lan force-pushed the marslan/sc-27635/improve-describe-history-speed branch from e0798dc to de9f5ea Compare July 20, 2024 18:17
@mars-lan mars-lan marked this pull request as ready for review July 20, 2024 18:42
@alyiwang alyiwang left a comment

LGTM

@mars-lan mars-lan merged commit 7dae5ad into main Jul 20, 2024
4 checks passed
@mars-lan mars-lan deleted the marslan/sc-27635/improve-describe-history-speed branch July 20, 2024 19:13