Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add extra data option to add_source #83

Merged
merged 6 commits into from
May 24, 2024
Merged

Conversation

joaquimadraz
Copy link
Contributor

Enables sending extra body data when creating a source. Useful for the crunchlake source.

@joaquimadraz
Copy link
Contributor Author

@arlind87 can you help with the build issues on this one?

@arlind87
Copy link
Contributor

@arlind87 can you help with the build issues on this one?

@joaquimadraz i will check on this but it seems some of the python versions are depricated and have not worked for a long time

@joaquimadraz joaquimadraz force-pushed the extra-data-to-sources branch from dd8030c to 5a96516 Compare May 23, 2024 13:54
"""Create a new Source on the given dataset and return its URL."""
sources_url = ds.user_url.catalogs['sources']
# Don't call Catalog.post here (which would force application/json);
# we want requests.Session to set multipart/form-data with a boundary.
new_source_url = ds.session.post(
sources_url, files={"uploaded_file": (filename, fp, mimetype)}
sources_url, files={"uploaded_file": (filename, fp, mimetype)}, data=data
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is this data argument? The POST method on the sources controller does not specify such argument.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The sources controller now expects data in the POST body which is where we are sending the required attributes
for the crunchlake source.

data is not a sources controller specific attribute. It's request body data for the POST request.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, got it, this is not a data query param, but the request data

@@ -53,13 +53,13 @@ def wait_for_batch_status(self, batch, status):
raise ValueError("The batch did not reach the '%s' state in the "
"given time. Please check again later." % status)

def add_source(self, ds, filename, fp, mimetype):
def add_source(self, ds, filename, fp, mimetype, data={}):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Best to default to None - data=None, in this context an empty dict is probably safe, but it is considered to be a bad smell.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch, definitely a bug waiting to happen

Copy link
Contributor

@jjdelc jjdelc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That data={} needa to change for data=None

@@ -53,13 +53,13 @@ def wait_for_batch_status(self, batch, status):
raise ValueError("The batch did not reach the '%s' state in the "
"given time. Please check again later." % status)

def add_source(self, ds, filename, fp, mimetype):
def add_source(self, ds, filename, fp, mimetype, data={}):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch, definitely a bug waiting to happen

"""Create a new Source on the given dataset and return its URL."""
sources_url = ds.user_url.catalogs['sources']
# Don't call Catalog.post here (which would force application/json);
# we want requests.Session to set multipart/form-data with a boundary.
new_source_url = ds.session.post(
sources_url, files={"uploaded_file": (filename, fp, mimetype)}
sources_url, files={"uploaded_file": (filename, fp, mimetype)}, data=data
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, got it, this is not a data query param, but the request data

@enkidulan
Copy link
Contributor

@jjdelc I've changed the default value for data argument

@enkidulan enkidulan requested a review from jjdelc May 23, 2024 17:53
@joaquimadraz joaquimadraz merged commit 9bb0733 into master May 24, 2024
5 checks passed
@joaquimadraz joaquimadraz deleted the extra-data-to-sources branch May 24, 2024 15:31
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants