feat: add custom Azure host/port to support custom blob endpoint #164

Merged: 1 commit, Jan 3, 2024

@jeqo jeqo commented Dec 20, 2023

About this change - What it does

Adds custom host and port configs to the Azure storage so that it can target a custom blob endpoint such as Azurite.

Resolves: #163

Why this way

Follows the same approach as S3 for setting a custom host/port.
Moved conn_string construction out of the initialization to be able to test it.
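
For illustration, a minimal sketch of what a standalone, testable connection-string builder could look like. The function name `build_conn_string`, its signature, and the one-entry suffix table are assumptions made for this example; only the `EndpointSuffix` and `BlobEndpoint` lines mirror the snippets quoted later in this conversation.

```python
from typing import List, Optional

# Illustrative subset (assumption); the real table in rohmu may list more clouds.
AZURE_ENDPOINT_SUFFIXES = {"public": "core.windows.net"}


def build_conn_string(
    account_name: str,
    account_key: str,
    azure_cloud: str = "public",
    host: Optional[str] = None,
    port: Optional[int] = None,
    protocol: str = "https",
) -> str:
    """Hypothetical helper: assemble an Azure Storage connection string."""
    conn: List[str] = [
        f"DefaultEndpointsProtocol={protocol}",
        f"AccountName={account_name}",
        f"AccountKey={account_key}",
    ]
    if not host and not port:
        # Default case: let the SDK derive the endpoint from the cloud suffix.
        conn.append(f"EndpointSuffix={AZURE_ENDPOINT_SUFFIXES[azure_cloud]}")
    else:
        # Custom endpoint, e.g. a local Azurite instance.
        conn.append(f"BlobEndpoint={protocol}://{host}:{port}/{account_name}")
    return ";".join(conn)


# Azurite's default blob port is 10000 and its default account is devstoreaccount1;
# the key here is a placeholder.
azurite = build_conn_string(
    "devstoreaccount1", "key", host="localhost", port=10000, protocol="http"
)
```

Because the builder is a pure function, a unit test can compare its return value against an expected string without creating a storage client.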

@jeqo jeqo force-pushed the jeqo/custom-azure-url branch 4 times, most recently from 9a54a02 to c0ce800 Compare December 20, 2023 16:12

codecov-commenter commented Dec 20, 2023

Codecov Report

Attention: 5 lines in your changes are missing coverage. Please review.

Comparison is base (f3b1f53) 71.97% compared to head (e5e4303) 72.44%.
Report is 5 commits behind head on main.

Files                            Patch %   Lines
rohmu/object_storage/azure.py    76.92%    3 Missing ⚠️
rohmu/object_storage/config.py   88.23%    2 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #164      +/-   ##
==========================================
+ Coverage   71.97%   72.44%   +0.47%     
==========================================
  Files          35       35              
  Lines        3928     4035     +107     
==========================================
+ Hits         2827     2923      +96     
- Misses       1101     1112      +11     


@jeqo jeqo requested a review from rominf December 20, 2023 16:32
@jeqo jeqo requested a review from rominf December 21, 2023 12:09
@jeqo jeqo force-pushed the jeqo/custom-azure-url branch 3 times, most recently from 2f2c3b9 to a7ae503 Compare December 28, 2023 11:02
@exg exg left a comment

Fixup commits should be folded.

@giacomo-alzetta-aiven

Overall LGTM but I agree with Emanuele that commit history could be cleaned up before merging.

@jeqo jeqo force-pushed the jeqo/custom-azure-url branch from 8d91072 to 2e7e68f Compare January 2, 2024 10:53
@jeqo jeqo requested a review from exg January 2, 2024 10:53

jeqo commented Jan 2, 2024

@exg @giacomo-alzetta-aiven thanks! Commits are folded now; please have another look.

        conn.append(f"EndpointSuffix={endpoint_suffix}")
    else:
        if not host or not port:
            raise InvalidConfigurationError("Custom host and port must be specified together")
Contributor

It would be better to use pydantic validators for custom validation, see https://docs.pydantic.dev/1.10/usage/validators/

Contributor Author

Good idea. Adding this on the latest commit, have a look

Contributor

The validation of azure_cloud can also be moved into the validator. Also, remember to fold fixup commits before pushing :)

Contributor Author

Sure, can move it to validators as well.

Re: fixups, I prefer to keep them separate for easier review, but can surely rebase/squash once approved if that's ok.

Contributor

That is also fine, and I used to recommend that too before. I just find it unnecessary given that GitHub provides the diff between the previous and current branch tip on force push (via the "Compare" button).

Contributor

Also, if a reviewer marks files as "viewed" while reviewing, GitHub will show them which files have changed when the PR is updated, so reviewers are not forced to re-read everything to discover what you updated.

Contributor Author

Thanks, will keep that in mind.
I've squashed all commits now for this change.

@jeqo jeqo force-pushed the jeqo/custom-azure-url branch 2 times, most recently from f4e5491 to 9ff51d4 Compare January 2, 2024 14:16
@jeqo jeqo requested a review from exg January 2, 2024 15:05
@jeqo jeqo force-pushed the jeqo/custom-azure-url branch from 9a3dd93 to aee7dee Compare January 2, 2024 16:55
    @root_validator
    @classmethod
    def host_and_port_must_be_set_together(cls, values: Dict[str, Any]) -> Dict[str, Any]:
        if "host" in values and "port" in values:
Contributor

This test is redundant because the fields have a default value and so they are always present. BTW, another option would be to introduce a HostInfo model, similar to ProxyInfo, containing the new fields. This way there would be no need for a custom validator, and it would be more consistent. WDYT?
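
To make the HostInfo idea concrete, here is a rough sketch using plain dataclasses as a stand-in for pydantic models. The names `HostInfo` and `AzureConfigSketch` and their fields are hypothetical; the shape of the existing ProxyInfo model is not shown in this conversation and is not reproduced here.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class HostInfo:
    """Hypothetical grouping of the custom endpoint fields; both are required."""

    host: str
    port: int


@dataclass(frozen=True)
class AzureConfigSketch:
    """Stand-in for the Azure config model; not the actual rohmu class."""

    account_name: str
    # None means "use the default cloud endpoint".
    host_info: Optional[HostInfo] = None


default_cfg = AzureConfigSketch(account_name="acct")
custom_cfg = AzureConfigSketch(
    account_name="acct", host_info=HostInfo(host="localhost", port=10000)
)

# A half-specified endpoint cannot be constructed at all, so no
# "set together" validator is needed on the outer model.
try:
    HostInfo(host="localhost")  # type: ignore[call-arg]
except TypeError as exc:
    half_specified_error = exc
```

The trade-off is a nested config shape that differs from the flat host/port fields used for S3.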

Contributor Author

Good catch, fixing that on the latest commit.

About HostInfo, I like the idea, but we could do that as a separate refactoring PR as there are many places where this type would fit. At the moment this is consistent with the S3 approach to set custom host/port.

Contributor

Agreed, I did not realize that this is based on the existing S3 config.

@jeqo jeqo force-pushed the jeqo/custom-azure-url branch from aee7dee to e5e4303 Compare January 2, 2024 18:17
@jeqo jeqo requested a review from exg January 2, 2024 18:19
    @root_validator
    @classmethod
    def host_and_port_must_be_set_together(cls, values: Dict[str, Any]) -> Dict[str, Any]:
        if not values["host"] or not values["port"]:
Contributor

This rejects (existing) configs where host and port are both missing, which is not correct. I think you want

(values["host"] is None) != (values["port"] is None)
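
As a quick illustration of why the `!=` form behaves differently, the sketch below evaluates it for all four combinations. The function name `only_one_is_set` is made up for this example.

```python
from typing import Optional


def only_one_is_set(host: Optional[str], port: Optional[int]) -> bool:
    """True when exactly one of host/port is None, i.e. the config is invalid."""
    return (host is None) != (port is None)


cases = {
    "neither": only_one_is_set(None, None),
    "both": only_one_is_set("localhost", 10000),
    "host only": only_one_is_set("localhost", None),
    "port only": only_one_is_set(None, 10000),
}

# For comparison: `not host or not port` is also true when both are None,
# which is how the earlier version rejected existing configs.
earlier_check_rejects_neither = (not None) or (not None)
```

Only the two mixed cases are flagged, while the earlier condition also flags the "neither" case.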

Contributor

Yeah, test_azure_config_host_port_set_together or a similar test should also verify that the valid cases do not raise an error.
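
A possible shape for such a test, sketched against a dataclass stand-in rather than the real pydantic model. `ConfigSketch` and `raises_value_error` are hypothetical names, and the real test in the PR may look different.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class ConfigSketch:
    """Stand-in for the config model; validation runs on construction."""

    host: Optional[str] = None
    port: Optional[int] = None

    def __post_init__(self) -> None:
        if (self.host is None) != (self.port is None):
            raise ValueError("host and port must be set together")


def raises_value_error(**kwargs: Any) -> bool:
    """Return True if building the config with these arguments fails validation."""
    try:
        ConfigSketch(**kwargs)
    except ValueError:
        return True
    return False


def test_host_port_set_together() -> None:
    # Valid: the minimal config and a fully specified endpoint must not raise.
    assert not raises_value_error()
    assert not raises_value_error(host="localhost", port=10000)
    # Invalid: only one of the two is given.
    assert raises_value_error(host="localhost")
    assert raises_value_error(port=10000)


test_host_port_set_together()
```

Covering the valid cases alongside the invalid ones is what would have caught the regression on configs that set neither field.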

Contributor Author

Oh, good catch! I added tests to validate the minimal case and fixed this.

@jeqo jeqo force-pushed the jeqo/custom-azure-url branch from e5e4303 to 04fe766 Compare January 3, 2024 12:58
@jeqo jeqo requested a review from exg January 3, 2024 12:59
@jeqo jeqo force-pushed the jeqo/custom-azure-url branch from 04fe766 to 579d154 Compare January 3, 2024 14:17
@giacomo-alzetta-aiven giacomo-alzetta-aiven left a comment

LGTM, but I will wait for @exg's review too before merging.

Comment on lines +124 to +128
    if not host and not port:
        endpoint_suffix = AZURE_ENDPOINT_SUFFIXES[azure_cloud]
        conn.append(f"EndpointSuffix={endpoint_suffix}")
    else:
        conn.append(f"BlobEndpoint={protocol}://{host}:{port}/{account_name}")
Contributor

nit: I guess we could do something like:

Suggested change
-    if not host and not port:
-        endpoint_suffix = AZURE_ENDPOINT_SUFFIXES[azure_cloud]
-        conn.append(f"EndpointSuffix={endpoint_suffix}")
-    else:
-        conn.append(f"BlobEndpoint={protocol}://{host}:{port}/{account_name}")
+    if not host and not port:
+        endpoint_suffix = AZURE_ENDPOINT_SUFFIXES[azure_cloud]
+        conn.append(f"EndpointSuffix={endpoint_suffix}")
+    elif host and port:
+        conn.append(f"BlobEndpoint={protocol}://{host}:{port}/{account_name}")
+    else:
+        raise ValueError("You must either specify both host and port or neither of them")

In most cases the AzureTransfer will be built using the get_transfer facade, where we have the pydantic validation, but I believe people can still create the transfer manually, so maybe validating this again here is better.

Contributor

So it is possible to create transfers bypassing the pydantic config models?

Contributor

I'm just saying that we are not indicating that AzureTransfer is private, so users could instantiate it directly instead of going through the factory methods that use from_model to create the instance from the pydantic configuration.
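
The gap being described can be sketched as follows. `ConfigSketch` and `TransferSketch` are hypothetical stand-ins and do not reflect the real AzureTransfer constructor signature; only the `from_model` name comes from this discussion.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ConfigSketch:
    """Stand-in for the validated config model."""

    host: Optional[str] = None
    port: Optional[int] = None

    def __post_init__(self) -> None:
        if (self.host is None) != (self.port is None):
            raise ValueError("host and port must be set together")


class TransferSketch:
    """Stand-in for a transfer class; NOT the real AzureTransfer signature."""

    def __init__(self, host: Optional[str] = None, port: Optional[int] = None) -> None:
        # No checks here: a direct caller can pass a half-specified endpoint.
        self.host = host
        self.port = port

    @classmethod
    def from_model(cls, model: ConfigSketch) -> "TransferSketch":
        # The model has already been validated by the time we get here.
        return cls(host=model.host, port=model.port)


# Direct construction is accepted although port is missing.
direct = TransferSketch(host="localhost")

# Going through the model rejects the same input.
try:
    TransferSketch.from_model(ConfigSketch(host="localhost"))
    factory_rejected = False
except ValueError:
    factory_rejected = True
```

Repeating the check inside the constructor would close this gap at the cost of validating twice on the factory path.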

I don't think this is a big deal, which is why I class this as nitpicking.

@giacomo-alzetta-aiven giacomo-alzetta-aiven merged commit 7107456 into main Jan 3, 2024
8 checks passed
@giacomo-alzetta-aiven giacomo-alzetta-aiven deleted the jeqo/custom-azure-url branch January 3, 2024 18:02
@giacomo-alzetta-aiven

Since we all agree let's merge this.

Successfully merging this pull request may close these issues.

Support custom blob endpoint for Azurite
5 participants