-
Notifications
You must be signed in to change notification settings - Fork 1.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[dagster-airbyte] Support client_id
and client_secret
in AirbyteCloudResource
#23451
Merged
maximearmstrong
merged 8 commits into
master
from
maxime/ds-396/support-client-id-and-secret-for-airbyte-cloud
Aug 13, 2024
Merged
Changes from all commits
Commits
Show all changes
8 commits
Select commit
Hold shift + click to select a range
ecaf4ff
Update AirbyteCloudResource to use client id & secret
maximearmstrong 0028969
Use setup_for_execution and update refresh methods
maximearmstrong 79a8a2b
Lint
maximearmstrong c28dd9c
Update docs and code snippets
maximearmstrong 04fe7c1
Update resource and test_assets_defs
maximearmstrong e11939d
Update tests
maximearmstrong 5b35b92
Test refresh access token
maximearmstrong f2fefcb
Move refresh token to make_request
maximearmstrong File filter
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -5,12 +5,14 @@ | |
import time | ||
from abc import abstractmethod | ||
from contextlib import contextmanager | ||
from datetime import datetime, timedelta | ||
from typing import Any, Dict, List, Mapping, Optional, cast | ||
|
||
import requests | ||
from dagster import ( | ||
ConfigurableResource, | ||
Failure, | ||
InitResourceContext, | ||
_check as check, | ||
get_dagster_logger, | ||
resource, | ||
|
@@ -19,13 +21,17 @@ | |
from dagster._core.definitions.resource_definition import dagster_maintained_resource | ||
from dagster._utils.cached_method import cached_method | ||
from dagster._utils.merger import deep_merge_dicts | ||
from pydantic import Field | ||
from pydantic import Field, PrivateAttr | ||
from requests.exceptions import RequestException | ||
|
||
from dagster_airbyte.types import AirbyteOutput | ||
|
||
DEFAULT_POLL_INTERVAL_SECONDS = 10 | ||
|
||
# The access token expire every 3 minutes in Airbyte Cloud. | ||
# Refresh is needed after 2.5 minutes to avoid the "token expired" error message. | ||
AIRBYTE_CLOUD_REFRESH_TIMEDELTA_SECONDS = 150 | ||
|
||
|
||
class AirbyteState: | ||
RUNNING = "running" | ||
|
@@ -94,7 +100,11 @@ def all_additional_request_params(self) -> Mapping[str, Any]: | |
raise NotImplementedError() | ||
|
||
def make_request( | ||
self, endpoint: str, data: Optional[Mapping[str, object]] = None, method: str = "POST" | ||
self, | ||
endpoint: str, | ||
data: Optional[Mapping[str, object]] = None, | ||
method: str = "POST", | ||
include_additional_request_params: bool = True, | ||
) -> Optional[Mapping[str, object]]: | ||
"""Creates and sends a request to the desired Airbyte REST API endpoint. | ||
|
||
|
@@ -120,10 +130,11 @@ def make_request( | |
if data: | ||
request_args["json"] = data | ||
|
||
request_args = deep_merge_dicts( | ||
request_args, | ||
self.all_additional_request_params, | ||
) | ||
if include_additional_request_params: | ||
request_args = deep_merge_dicts( | ||
request_args, | ||
self.all_additional_request_params, | ||
) | ||
|
||
response = requests.request( | ||
**request_args, | ||
|
@@ -244,7 +255,7 @@ def sync_and_poll( | |
|
||
|
||
class AirbyteCloudResource(BaseAirbyteResource): | ||
"""This resource allows users to programatically interface with the Airbyte Cloud API to launch | ||
"""This resource allows users to programmatically interface with the Airbyte Cloud API to launch | ||
syncs and monitor their progress. | ||
|
||
**Examples:** | ||
|
@@ -255,7 +266,8 @@ class AirbyteCloudResource(BaseAirbyteResource): | |
from dagster_airbyte import AirbyteResource | ||
|
||
my_airbyte_resource = AirbyteCloudResource( | ||
api_key=EnvVar("AIRBYTE_API_KEY"), | ||
client_id=EnvVar("AIRBYTE_CLIENT_ID"), | ||
client_secret=EnvVar("AIRBYTE_CLIENT_SECRET"), | ||
) | ||
|
||
airbyte_assets = build_airbyte_assets( | ||
|
@@ -269,15 +281,48 @@ class AirbyteCloudResource(BaseAirbyteResource): | |
) | ||
""" | ||
|
||
api_key: str = Field(..., description="The Airbyte Cloud API key.") | ||
client_id: str = Field(..., description="The Airbyte Cloud client ID.") | ||
client_secret: str = Field(..., description="The Airbyte Cloud client secret.") | ||
|
||
_access_token_value: Optional[str] = PrivateAttr(default=None) | ||
_access_token_timestamp: Optional[float] = PrivateAttr(default=None) | ||
|
||
def setup_for_execution(self, context: InitResourceContext) -> None: | ||
# Refresh access token when the resource is initialized | ||
self._refresh_access_token() | ||
|
||
@property | ||
def api_base_url(self) -> str: | ||
return "https://api.airbyte.com/v1" | ||
|
||
@property | ||
def all_additional_request_params(self) -> Mapping[str, Any]: | ||
return {"headers": {"Authorization": f"Bearer {self.api_key}", "User-Agent": "dagster"}} | ||
# Make sure the access token is refreshed before using it when calling the API. | ||
if self._needs_refreshed_access_token(): | ||
self._refresh_access_token() | ||
return { | ||
"headers": { | ||
"Authorization": f"Bearer {self._access_token_value}", | ||
"User-Agent": "dagster", | ||
} | ||
} | ||
|
||
def make_request( | ||
self, | ||
endpoint: str, | ||
data: Optional[Mapping[str, object]] = None, | ||
method: str = "POST", | ||
include_additional_request_params: bool = True, | ||
) -> Optional[Mapping[str, object]]: | ||
# Make sure the access token is refreshed before using it when calling the API. | ||
if include_additional_request_params and self._needs_refreshed_access_token(): | ||
self._refresh_access_token() | ||
return super().make_request( | ||
endpoint=endpoint, | ||
data=data, | ||
method=method, | ||
include_additional_request_params=include_additional_request_params, | ||
) | ||
|
||
def start_sync(self, connection_id: str) -> Mapping[str, object]: | ||
job_sync = check.not_none( | ||
|
@@ -306,6 +351,31 @@ def _should_forward_logs(self) -> bool: | |
# Airbyte Cloud does not support streaming logs yet | ||
return False | ||
|
||
def _refresh_access_token(self) -> None: | ||
response = check.not_none( | ||
self.make_request( | ||
endpoint="/applications/token", | ||
data={ | ||
"client_id": self.client_id, | ||
"client_secret": self.client_secret, | ||
}, | ||
# Must not pass the bearer access token when refreshing it. | ||
include_additional_request_params=False, | ||
) | ||
) | ||
self._access_token_value = str(response["access_token"]) | ||
self._access_token_timestamp = datetime.now().timestamp() | ||
|
||
def _needs_refreshed_access_token(self) -> bool: | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This behavior makes sense to me, this is low enough complexity that it feels ok to abstract from the user |
||
return ( | ||
not self._access_token_value | ||
or not self._access_token_timestamp | ||
or self._access_token_timestamp | ||
<= datetime.timestamp( | ||
datetime.now() - timedelta(seconds=AIRBYTE_CLOUD_REFRESH_TIMEDELTA_SECONDS) | ||
) | ||
) | ||
|
||
|
||
class AirbyteResource(BaseAirbyteResource): | ||
"""This resource allows users to programatically interface with the Airbyte REST API to launch | ||
|
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Oops, something went wrong.
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since this is a
@property
, we should avoid doing real work inside the fn body (like refreshing the access token). One different approach could be be to pull out the refresh into a separate fn like_get_or_refresh_access_token()
so that it's clear that invoking it may have a cost, and call that directly inmake_request
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Moved to
make_request
in f2fefcb