Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[sc-27549] Switch to using DiscoveryAPI for dbt.cloud #926

Conversation

usefulalgorithm
Copy link
Contributor

@usefulalgorithm usefulalgorithm commented Jul 23, 2024

🤔 Why?

DBT's discovery API exposes more information than manifest.json, plus it's more flexible for us to control what we want to get instead of having to download the entire file every time.

🤓 What?

  • Switch to using discovery API for dbt.cloud extractor.
  • Added script to download schema.graphql from DBT's Apollo server and generate graphql client code automatically.
  • Updated CI to respect coverage configurations in pyproject.toml
  • Ignore all generated files in coverage reporting.
  • Updated pre-commit hook to ignore bandit issues in generated files

Difference between using manifest.json and using discovery api

  • With manifest.json we could not get column types most of the time. With discovery api it is always available (at least during the tests i've run)
  • Discovery API is picking up columns that are not covered by parsing manifest.json. These models don't have any field when parsed with manifest.json, but have all the fields when done with DiscoveryAPI:
    • jaffle_shop.metricflow_time_spine
    • jaffle_shop.order_items
    • london_bike_analysis.cleaned_bike_rides_from_snapshot
    • london_bike_analysis.cycle_hire_snapshot
    • london_bike_analysis.raw_bike_hires
    • london_bike_analysis.rides_by_month_2017
    • london_bike_analysis.rides_by_month_start_station_2017
  • Parsing with manifest.json does not support metrics for v10 and up. With DiscoveryAPI metrics is available.

Script used for comparing MCEs: https://pastebin.com/zc9zE2GG

🧪 Tested?

  • Tested locally with staging configuration
  • Tested with production configuration
  • Unit test

☑️ Checks

  • My PR contains actual code changes, and I have updated the version number in pyproject.toml.

Copy link

github-actions bot commented Jul 23, 2024

☂️ Python Coverage

current status: ✅

Overall Coverage

Lines Covered Coverage Threshold Status
11556 10268 89% 85% 🟢

New Files

File Coverage Status
metaphor/dbt/cloud/parser/common.py 100% 🟢
metaphor/dbt/cloud/parser/dbt_macro_parser.py 97% 🟢
metaphor/dbt/cloud/parser/dbt_metric_parser.py 94% 🟢
metaphor/dbt/cloud/parser/dbt_node_parser.py 86% 🟢
metaphor/dbt/cloud/parser/dbt_source_parser.py 94% 🟢
metaphor/dbt/cloud/parser/dbt_test_parser.py 86% 🟢
metaphor/dbt/cloud/parser/parser.py 96% 🟢
metaphor/dbt/cloud/utils.py 90% 🟢
TOTAL 93% 🟢

Modified Files

File Coverage Status
metaphor/common/entity_id.py 100% 🟢
metaphor/dbt/cloud/client.py 88% 🟢
metaphor/dbt/cloud/extractor.py 88% 🟢
TOTAL 92% 🟢

updated for commit: 438f386 by action🐍

Copy link

codecov bot commented Jul 23, 2024

Codecov Report

Attention: Patch coverage is 91.44254% with 35 lines in your changes missing coverage. Please review.

Project coverage is 88.85%. Comparing base (a17afd9) to head (4966bdd).

Files Patch % Lines
metaphor/dbt/cloud/parser/dbt_node_parser.py 85.71% 17 Missing ⚠️
metaphor/dbt/cloud/parser/dbt_test_parser.py 85.50% 10 Missing ⚠️
metaphor/dbt/cloud/parser/parser.py 96.47% 3 Missing ⚠️
metaphor/dbt/cloud/parser/dbt_source_parser.py 93.75% 2 Missing ⚠️
metaphor/dbt/cloud/parser/dbt_macro_parser.py 97.29% 1 Missing ⚠️
metaphor/dbt/cloud/parser/dbt_metric_parser.py 94.44% 1 Missing ⚠️
metaphor/dbt/cloud/utils.py 90.00% 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #926      +/-   ##
==========================================
- Coverage   93.54%   88.85%   -4.69%     
==========================================
  Files         229      178      -51     
  Lines       20033    11556    -8477     
==========================================
- Hits        18739    10268    -8471     
+ Misses       1294     1288       -6     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@usefulalgorithm usefulalgorithm marked this pull request as ready for review July 24, 2024 18:34
@mars-lan
Copy link
Contributor

Let's exclude metaphor/dbt/cloud/discovery_api/* from coverage to avoid a big drop?

@usefulalgorithm
Copy link
Contributor Author

Let's exclude metaphor/dbt/cloud/discovery_api/* from coverage to avoid a big drop?

coverage on generated files looks alright, it's the parser not parsing macros and sources for jaffle_shop. I'll add a new test case today.

.pre-commit-config.yaml Outdated Show resolved Hide resolved
metaphor/dbt/cloud/discovery_api/codegen.sh Show resolved Hide resolved
@usefulalgorithm usefulalgorithm requested a review from mars-lan July 26, 2024 04:56
Copy link
Contributor

@mars-lan mars-lan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM at a high level. @elic-eon pls take a closer look

elic-eon
elic-eon previously approved these changes Jul 29, 2024
Copy link
Contributor

@elic-eon elic-eon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's go!

elic-eon
elic-eon previously approved these changes Jul 29, 2024
@usefulalgorithm
Copy link
Contributor Author

ci is failling because setuptools + poetry is broken:

pypa/setuptools#4519
AUTOMATIC1111/stable-diffusion-webui#16289

@usefulalgorithm usefulalgorithm merged commit 127ce75 into main Jul 29, 2024
4 checks passed
@usefulalgorithm usefulalgorithm deleted the tsung-julii/sc-27549/use-discovery-api-exclusively-for-dbt-cloud branch July 29, 2024 09:17
@mars-lan mars-lan mentioned this pull request Aug 5, 2024
1 task
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants