Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use Archive Token from requests instead of setting #29

Closed
wants to merge 1 commit into from

Conversation

LTDakin
Copy link
Contributor

@LTDakin LTDakin commented Sep 6, 2024

Archive Token from Request

  • add token capture middleware that grabs the authorization header from the request to store in a cache and use it when making archive requests in get_archive_url
  • update readme now that manual setting no longer needed for local dev
  • remove ARCHIVE_TOKEN from settings since its no longer needed

bonus:

  • moved api paths to api group, since they also are under the api/ tree

Warning: When deploying this PR remember to remove the ARCHIVE TOKEN setting from the helm charts or wherever we populate settings

Copy link
Collaborator

@mgdaily mgdaily left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think if we want to pass through the user's API token we need to think about it a bit more carefully. I don't think setting a global cache key is going to scale, unfortunately.

def __call__(self, request):
token = request.headers.get('Authorization')
if token:
cache.set('archive_token', token, timeout=None)
Copy link
Collaborator

@mgdaily mgdaily Sep 6, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So this cache doesn't differentiate between users. What happens if one user sends a request to the backend, the token gets set, and then another user overwrites the token by sending a request? It seems hard/impossible to guarantee that the token will be set correctly per user, per request with possibly hundreds or thousands of requests coming in at once.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I looked into it a bit and it seems like threading().local might be an option to store it? I read online some people store the request in the local thread so its accessible throughout the app. Wondering on your thoughts if this could be a solution?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rather than caching it, you can just place it in the request object so the things using it downstream have access to it. You can also just ignore using a middleware and get it out when you need it from the headers in the request.

Copy link
Contributor Author

@LTDakin LTDakin Sep 9, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thing is, I use it in a util function called get_fits thats called basically everywhere, so we'd have to pass the request object to every function as an argument. So I was looking for a global scope to store it in, unless theres a way to access the request object from anywhere already? Putting it in local() seems like a common thing people do

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I would just pass it where it needs to go... I haven't used local stuff before but maybe that is okay too?

@@ -92,7 +92,7 @@ def get_archive_url(basename: str, archive: str = settings.ARCHIVE_API) -> dict:
query_params = {'basename_exact': basename }

headers = {
'Authorization': f'Token {settings.ARCHIVE_API_TOKEN}'
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's fine to have a super user archive token here. I don't think we necessarily need to set it per-user, as the UI will only be requesting images that the user has access to.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The point of doing this is to have the users own token passed in so the permissions they have is enforced. Right now using a superuser token, anyone could manually make a datalab api request for data they don't have permissions for and get it, so its a security issue.

@LTDakin
Copy link
Contributor Author

LTDakin commented Oct 4, 2024

tested out threading local but its a little too much just to take care of having an auth token for archive requests. Theres going to a refactor of the data to create an ImageData class so this is better added to that as introducing that class will change a lot about how images are fetched and data worked on.

@LTDakin LTDakin closed this Oct 4, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants