[sc-27549] Switch to using DiscoveryAPI for dbt.cloud #926

Merged
41 commits; the diff below shows changes from 36 of them
1c96ca1
[sc-27549] Switch to using DiscoveryAPI for dbt.cloud
usefulalgorithm Jul 23, 2024
4530f2f
use codegen
usefulalgorithm Jul 23, 2024
c09837d
rewrite pt 1
usefulalgorithm Jul 23, 2024
5952fc6
pt2
usefulalgorithm Jul 23, 2024
61248fd
refactor
usefulalgorithm Jul 24, 2024
f022589
refactor
usefulalgorithm Jul 24, 2024
eac3ee7
parse test
usefulalgorithm Jul 24, 2024
d1e76fb
finish implementation
usefulalgorithm Jul 24, 2024
662a7e1
ignore generated files in bandit
usefulalgorithm Jul 24, 2024
10263ae
paginate macro arguments query
usefulalgorithm Jul 24, 2024
fcf3c04
finish implementation
usefulalgorithm Jul 24, 2024
916e6ac
dont include codegen as dependency
usefulalgorithm Jul 24, 2024
e2c9f6a
rename files
usefulalgorithm Jul 24, 2024
b3aa3ed
use pascalcase query names
usefulalgorithm Jul 24, 2024
46b6779
remove unused stuff
usefulalgorithm Jul 24, 2024
a67e045
add test
usefulalgorithm Jul 24, 2024
ddb01d3
Merge remote-tracking branch 'origin/main' into tsung-julii/sc-27549/…
usefulalgorithm Jul 24, 2024
57e0fb3
bump version
usefulalgorithm Jul 24, 2024
3b54c4b
rename graphql_client to generated
usefulalgorithm Jul 25, 2024
6ad7866
add codegen.sh
usefulalgorithm Jul 25, 2024
197f14b
address comments
usefulalgorithm Jul 25, 2024
3f3b432
refactor tests
usefulalgorithm Jul 25, 2024
bea8204
add test input
usefulalgorithm Jul 25, 2024
cd12379
add test input
usefulalgorithm Jul 25, 2024
b18e0eb
do not ignore node columns
usefulalgorithm Jul 25, 2024
2c96e41
ignore generated files in coverage
usefulalgorithm Jul 25, 2024
bdbff8e
parse run result completion time
usefulalgorithm Jul 25, 2024
f25351e
update docs
usefulalgorithm Jul 25, 2024
bc0e7ed
address comments
usefulalgorithm Jul 25, 2024
cda2dfb
use pyproject.toml for precommit bandit
usefulalgorithm Jul 25, 2024
9c30769
add toml dep in precommit bandit
usefulalgorithm Jul 25, 2024
5496a34
fix
usefulalgorithm Jul 26, 2024
43f7212
fix time format
usefulalgorithm Jul 26, 2024
4966bdd
Merge remote-tracking branch 'origin/main' into tsung-julii/sc-27549/…
usefulalgorithm Jul 26, 2024
666e972
Merge branch 'main' into tsung-julii/sc-27549/use-discovery-api-exclu…
usefulalgorithm Jul 29, 2024
83d16fe
bump version
usefulalgorithm Jul 29, 2024
7321d48
lock setuptools version
usefulalgorithm Jul 29, 2024
80e9a21
explicitly install pyhive due to setuptool issue
usefulalgorithm Jul 29, 2024
f4b1131
force reinstall setuptools
usefulalgorithm Jul 29, 2024
8853e82
try
usefulalgorithm Jul 29, 2024
438f386
finally fix ci
usefulalgorithm Jul 29, 2024
3 changes: 1 addition & 2 deletions .github/workflows/ci.yml
@@ -50,10 +50,9 @@ jobs:
           poetry run mypy . --explicit-package-bases
           poetry run bandit -r . -c pyproject.toml

-      # TODO(SC-14236): Include __init__.py back to coverage after fixing async testing issues
       - name: Test
         run: |
-          poetry run coverage run --source=metaphor --omit='**/__init__.py' -m pytest
+          poetry run coverage run -m pytest
           poetry run coverage xml

       - name: Codecov
3 changes: 2 additions & 1 deletion .pre-commit-config.yaml
@@ -44,7 +44,8 @@ repos:
     rev: 1.7.8
     hooks:
       - id: bandit
-        args: ['--skip=B101,B106,B404,B603,B607,B608']
+        args: [-c, pyproject.toml]
+        additional_dependencies: ['bandit[toml]']

   - repo: https://github.com/pycqa/flake8
     rev: 7.0.0
17 changes: 17 additions & 0 deletions metaphor/common/entity_id.py
@@ -121,3 +121,20 @@ def dataset_normalized_name(
     return normalize_full_dataset_name(
         ".".join([part for part in [db, schema, table] if part is not None])
     )
+
+
+def parts_to_dataset_entity_id(
+    platform: DataPlatform,
+    account: Optional[str],
+    database: Optional[str] = None,
+    schema: Optional[str] = None,
+    table: Optional[str] = None,
+) -> EntityId:
+    """
+    converts parts of a dataset, its platform and account into a dataset entity ID
+    """
+    return to_dataset_entity_id(
+        dataset_normalized_name(database, schema, table),
+        platform,
+        account,
+    )
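
A minimal sketch of how the new helper composes its two building blocks. `normalize` and `parts_to_key` are hypothetical stand-ins: the real `normalize_full_dataset_name` and `to_dataset_entity_id` are not shown in this diff, and lowercasing as the normalization step is an assumption. Only the dot-join over non-`None` parts is taken from the context lines above.

```python
from typing import Optional, Tuple


def normalize(name: str) -> str:
    # Hypothetical stand-in for normalize_full_dataset_name; assumes lowercasing.
    return name.lower()


def dataset_normalized_name(
    db: Optional[str] = None,
    schema: Optional[str] = None,
    table: Optional[str] = None,
) -> str:
    # Same join as in the diff context: skip missing parts, dot-join the rest.
    return normalize(
        ".".join([part for part in [db, schema, table] if part is not None])
    )


def parts_to_key(
    platform: str,
    account: Optional[str],
    database: Optional[str] = None,
    schema: Optional[str] = None,
    table: Optional[str] = None,
) -> Tuple[str, Optional[str], str]:
    # Hypothetical stand-in for parts_to_dataset_entity_id: returns a plain
    # tuple instead of an EntityId so the sketch stays self-contained.
    return (platform, account, dataset_normalized_name(database, schema, table))


key = parts_to_key("SNOWFLAKE", None, "DB", "Schema", "orders")
print(key)
```

The point of the wrapper is that callers pass the raw parts in one call instead of normalizing the name themselves first.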
2 changes: 2 additions & 0 deletions metaphor/dbt/cloud/client.py
@@ -15,6 +15,7 @@ class DbtRun(NamedTuple):
     project_id: int
     job_id: int
     run_id: int
+    environment_id: int

     def __str__(self) -> str:
         return f"ID = {self.run_id}, project ID = {self.project_id}, job ID = {self.job_id}"
@@ -115,6 +116,7 @@ def get_last_successful_run(
                     project_id=run.get("project_id"),
                     job_id=run.get("job_definition_id"),
                     run_id=run.get("id"),
+                    environment_id=run.get("environment_id"),
                 )

             offset += page_size
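
A small sketch of what the extra field looks like in use. The key names (`project_id`, `job_definition_id`, `id`, `environment_id`) come from the diff; the `run_from_payload` helper and the sample record are hypothetical and only serve the illustration.

```python
from typing import Any, Dict, NamedTuple


class DbtRun(NamedTuple):
    project_id: int
    job_id: int
    run_id: int
    environment_id: int

    def __str__(self) -> str:
        return f"ID = {self.run_id}, project ID = {self.project_id}, job ID = {self.job_id}"


def run_from_payload(run: Dict[str, Any]) -> DbtRun:
    # Hypothetical helper: maps one run record to DbtRun using the same
    # key names the diff reads via run.get(...).
    return DbtRun(
        project_id=run["project_id"],
        job_id=run["job_definition_id"],
        run_id=run["id"],
        environment_id=run["environment_id"],
    )


# Made-up sample record, for illustration only.
sample = {"project_id": 1, "job_definition_id": 2, "id": 3, "environment_id": 4}
dbt_run = run_from_payload(sample)
print(dbt_run)
print(dbt_run.environment_id)
```

Unlike the `run.get(...)` calls in the diff, this sketch indexes the dictionary directly, so a missing key raises `KeyError` instead of storing `None` in a field annotated as `int`. Note that `__str__` is unchanged and does not print the environment ID.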
100 changes: 0 additions & 100 deletions metaphor/dbt/cloud/discovery_api.py

This file was deleted.

5 changes: 5 additions & 0 deletions metaphor/dbt/cloud/discovery_api/__init__.py
@@ -0,0 +1,5 @@
from .generated.client import Client as DiscoveryAPIClient

__all__ = [
    "DiscoveryAPIClient",
]
83 changes: 83 additions & 0 deletions metaphor/dbt/cloud/discovery_api/apollo-codegen-config.json
@@ -0,0 +1,83 @@
{
"schemaNamespace": "MySchema",
"schemaDownload": {
"downloadMethod": {
"introspection": {
"endpointURL": "https://metadata.cloud.getdbt.com/graphql",
"httpMethod": {
"POST": {}
},
"includeDeprecatedInputValues": false,
"outputFormat": "SDL"
}
},
"downloadTimeout": 60,
"headers": [],
"outputPath": "./schema.graphql"
},
"experimentalFeatures": {
"clientControlledNullability": true,
"legacySafelistingCompatibleOperations": true
},
"operationManifest": {
"generateManifestOnCodeGeneration": false,
"path": "/operation/identifiers/path",
"version": "persistedQueries"
},
"input": {
"operationSearchPaths": [
"/search/path/**/*.graphql"
],
"schemaSearchPaths": [
"/path/to/schema.graphqls"
]
},
"output": {
"operations": {
"absolute": {
"accessModifier": "internal",
"path": "/absolute/path"
}
},
"schemaTypes": {
"moduleType": {
"embeddedInTarget": {
"accessModifier": "public",
"name": "SomeTarget"
}
},
"path": "/output/path"
},
"testMocks": {
"swiftPackage": {
"targetName": "SchemaTestMocks"
}
}
},
"options": {
"additionalInflectionRules": [
{
"pluralization": {
"replacementRegex": "animals",
"singularRegex": "animal"
}
}
],
"cocoapodsCompatibleImportStatements": true,
"conversionStrategies": {
"enumCases": "none",
"fieldAccessors": "camelCase",
"inputObjects": "camelCase"
},
"deprecatedEnumCases": "exclude",
"operationDocumentFormat": [
"definition"
],
"pruneGeneratedFiles": false,
"schemaDocumentation": "exclude",
"selectionSetInitializers": {
"localCacheMutations": true
},
"warningsOnDeprecatedUsage": "exclude"
}
}
5 changes: 5 additions & 0 deletions metaphor/dbt/cloud/discovery_api/ariadne-codegen.toml
@@ -0,0 +1,5 @@
[tool.ariadne-codegen]
schema_path = "schema.graphql"
queries_path = "queries.graphql"
async_client = false
target_package_name = "generated"
38 changes: 38 additions & 0 deletions metaphor/dbt/cloud/discovery_api/codegen.md
@@ -0,0 +1,38 @@
# Generate GraphQL client code

## Requirements

- Python >= 3.9
- `ariadne-codegen`

## Usage

```bash
cd metaphor/dbt/cloud/discovery_api
./codegen.sh
```

## Existing files

### `codegen.sh`

Run this script to download the schema from dbt's Apollo server and generate the corresponding GraphQL client code.

### `queries.graphql`

The GraphQL queries that the extractor class executes.

### `apollo-codegen-config.json`

Copied from [Full Codegen Configuration Example](https://www.apollographql.com/docs/ios/code-generation/codegen-configuration/#full-codegen-configuration-example) on Apollo's site. The only modifications are:

- `endpointURL`
- `outputPath`

### `ariadne-codegen.toml`

Controls the behavior of `ariadne-codegen`.

### `schema.graphql`

The upstream dbt GraphQL schema. `codegen.sh` downloads a fresh copy from upstream on every run.
17 changes: 17 additions & 0 deletions metaphor/dbt/cloud/discovery_api/codegen.sh
@@ -0,0 +1,17 @@
#!/usr/bin/env bash

# The schema is fetched with `apollo-ios-cli`: https://www.apollographql.com/docs/ios/code-generation/codegen-cli/
# Despite its name, the tool is not iOS-only.
APOLLO_IOS_CLI_VERSION=1.14.0

wget -c \
"https://github.com/apollographql/apollo-ios/releases/download/${APOLLO_IOS_CLI_VERSION}/apollo-ios-cli.tar.gz" -O - | \
tar -xz

./apollo-ios-cli fetch-schema --path ./apollo-codegen-config.json

rm -f ./apollo-ios-cli

poetry run ariadne-codegen --config ariadne-codegen.toml
poetry run black .
poetry run isort .
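
The order of steps in the script can be sketched as a dry run in Python. `release_url` and `codegen_plan` are hypothetical helpers that only build and list the commands from `codegen.sh`; nothing is downloaded or executed.

```python
from typing import List

APOLLO_IOS_CLI_VERSION = "1.14.0"


def release_url(version: str) -> str:
    # Same URL template as in codegen.sh.
    return (
        "https://github.com/apollographql/apollo-ios/releases/download/"
        f"{version}/apollo-ios-cli.tar.gz"
    )


def codegen_plan(version: str) -> List[str]:
    # Hypothetical helper: lists the shell commands in order, without running them.
    return [
        f'wget -c "{release_url(version)}" -O - | tar -xz',
        "./apollo-ios-cli fetch-schema --path ./apollo-codegen-config.json",
        "rm -f ./apollo-ios-cli",
        "poetry run ariadne-codegen --config ariadne-codegen.toml",
        "poetry run black .",
        "poetry run isort .",
    ]


for step in codegen_plan(APOLLO_IOS_CLI_VERSION):
    print(step)
```

The schema fetch has to finish before `ariadne-codegen` runs, because `ariadne-codegen.toml` points `schema_path` at the downloaded `schema.graphql`.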