Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[DR-3077] Log DRS resolution and URL signing to Bard #76

Merged
merged 5 commits into from
Sep 29, 2023

Conversation

nmalfroy
Copy link
Contributor

@nmalfroy nmalfroy commented Sep 19, 2023

This PR:

  • Adds a Bard client definition and client to send events to Bard for it to track all successful drs resolution calls. The EventProperties is interesting since it's a combination of set fields and random extra values. I needed to add a custom serializer to properly handle that
  • Adds a request interceptor to asynchronously send these events (these are non-blocking so there should be no perf impact other that this will add Runnables to the async queue). Only methods tagged with the newly created @TrackEvent annotation will be tracked.
  • Worth calling out again: only successful drs resolutions will be logged since this is to track what files folks are using. We should use the usual methods (e.g. grafana) for logging excessive 404s for instance
  • It's configurable but by default, the flag is set to NOT send events to Mixpanel when they reach Bard so the requests will be logged in the Warehouse.

At the end of the day, this ends up sending events to Bard that look like:

{
  "event": "drshub:api",
  "properties": {
    "appId": "drshub",
    "pushToMixpanel": false,
    "statusCode": 200,
    "requestUrl": "/api/v4/drs/resolve or /api/v4/gcs/getSignedUrl",
    "userProject": "<the x-user-project header value>",
    "ip": "<the X-Forwarded-For header value>",
    "forceAccessUrl": "<the drshub-force-access-url header value>"
  }
}

where the last three are only sent if there is a header value set. This should give us enough information (the userProject field in particular)

This is better reviewed commit-by-commit

@codecov
Copy link

codecov bot commented Sep 19, 2023

Codecov Report

Attention: 7 lines in your changes are missing coverage. Please review.

Comparison is base (b0b22b5) 83.26% compared to head (1afba40) 84.17%.

Additional details and impacted files
@@             Coverage Diff              @@
##                dev      #76      +/-   ##
============================================
+ Coverage     83.26%   84.17%   +0.91%     
- Complexity      207      228      +21     
============================================
  Files            33       36       +3     
  Lines           723      809      +86     
  Branches         69       75       +6     
============================================
+ Hits            602      681      +79     
- Misses           89       96       +7     
  Partials         32       32              
Files Coverage Δ
...c/main/java/bio/terra/drshub/config/WebConfig.java 100.00% <100.00%> (ø)
.../terra/drshub/controllers/DrsHubApiController.java 87.50% <ø> (ø)
...bio/terra/drshub/controllers/GcsApiController.java 100.00% <ø> (ø)
...ava/bio/terra/drshub/services/TrackingService.java 100.00% <100.00%> (ø)
...java/bio/terra/drshub/services/BardApiFactory.java 90.90% <90.90%> (ø)
...bio/terra/drshub/tracking/TrackingInterceptor.java 85.18% <85.18%> (ø)

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@nmalfroy nmalfroy force-pushed the dr-3077-nm-log-resolution-to-bard branch from 99ec736 to 8c77ebe Compare September 19, 2023 20:58
Copy link
Contributor

@okotsopoulos okotsopoulos left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am very close to being done with this review, just want to look more closely at the TrackingServiceTest and TrackingInterceptorTest :) here are my initial questions!

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Where did this definition come from? If Bard changes, I'm guessing this will be out of date, could that pose an issue?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I started from the swagger that gets generated from the api definition there. I ended up needing to do quite a bit of hand building to make it play nice with the code gen. I think for now it's ok but ultimately, yes we'll want some way to keep this in sync (thinking pact tests)

type: object
required:
- appId
Itentify:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a funny name :) do you know why this is Itentify instead of Identify, is the latter a reserved word?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's because I derped :). I'll update

@@ -30,4 +32,8 @@ public interface DrsHubConfigInterface {
Integer getPencilsDownSeconds();

Integer asyncThreads();

// If this is true, then we will track calls to the Bard API in MixPanel in addition to the
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: this might be nice as a docstring instead, I think that might make the comment more visible when viewing references in IDEs.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, is there a reason that we've named this slightly off from the existing convention of pushToMixpanel?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nope. I'll update!

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this ever going to be true?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Likely no, but it made it easier to test so I left it in as a potential knob

import org.springframework.web.client.RestTemplate;

@Service
public record BardApiFactory(DrsHubConfig drsHubConfig) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this a place where we'd benefit from sharing common HTTP clients? I wasn't 100% sure if it was relevant here. We made that switch in TDR and it helped our memory usage quite a bit:

DataBiosphere/jade-data-repo#1462

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmmm maybe. Let me look

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It can be treacherous to share api clients: #75

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Especially since each API Client has the bearer-token baked into it, and it can't be set per-request.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I'll leave it separate

@@ -25,6 +26,7 @@ public record DrsHubApiController(
implements DrsHubApi {

@Override
@TrackCall
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is neat!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought this was a cool approach :)


/**
* Add this annotation to a method to track its call as a user event in Bard. Note this should just
* be used within Controller classes
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why should this be used within Controller classes only, is it to guard against multiple logs for the same root event?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I wanted to log the explicit controller call since it maps nicely to a URL and HTTP method which is ultimately what we're interested (no value in tracking internal method calls)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cool! That could be helpful context to add in the docstring.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fair enough!


/**
* Call the syncProfile endpoint in Bard. This ensures that the Terra user is properly associated
* with a MixPanel user. Note that this is if mixpanel tracking is enabled.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not solid here on the behavior when mixpanel tracking isn't enabled: is this method never called, do the internals behave as no-ops, etc?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh that's branching that lives in Bard. We always log to the warehouse but we can opt out of also logging to mixpanel (which we want to do in this case in all likelihood to not explode our quota)

Exception ex) {
var path = request.getRequestURI();
var responseStatus = HttpStatus.valueOf(response.getStatus());
// Note: invalid responses will not be logged since there are no sessions to associate users
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I appreciate this comment!

Copy link
Contributor

@okotsopoulos okotsopoulos left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good! No critical blockers in my comments.

Comment on lines +58 to +62
addToPropertiesIfPresentInHeader(request, properties, "x-user-project", "userProject");
addToPropertiesIfPresentInHeader(request, properties, "X-Forwarded-For", "ip");
addToPropertiesIfPresentInHeader(
request, properties, "drshub-force-access-url", "forceAccessUrl");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am missing the intention behind pulling these specific header values into properties, but I think comment(s) would help (or breaking the code out further).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh that's because those are the only known headers (that I could find) and I didn't want to just pass along all headers since that could include some sensitive info (like auth tokens, etc.)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Got it! I think this would help grok-ability long-term to have in a code comment or something like that.

@@ -30,4 +32,8 @@ public interface DrsHubConfigInterface {
Integer getPencilsDownSeconds();

Integer asyncThreads();

// If this is true, then we will track calls to the Bard API in MixPanel in addition to the
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this ever going to be true?

import org.springframework.web.client.RestTemplate;

@Service
public record BardApiFactory(DrsHubConfig drsHubConfig) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It can be treacherous to share api clients: #75

import org.springframework.web.client.RestTemplate;

@Service
public record BardApiFactory(DrsHubConfig drsHubConfig) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Especially since each API Client has the bearer-token baked into it, and it can't be set per-request.

this.trackInMixpanel = config.trackInMixPanel();
}

@Async("asyncExecutor")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is using the same executor as the requests going to cut into how performant DRSHub is?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was wondering about that. I kinda doubt it since it's already set to Tomcat max threads. One more reason to bump the JDK and us virtual threads

Note: the Interceptor has an @async annotation, meaning that logging will not block the response.  This is also why I pulled in awaitility for the testing
- Add doc strings
- Fix typo on api spec
@nmalfroy nmalfroy force-pushed the dr-3077-nm-log-resolution-to-bard branch from 8c77ebe to 1afba40 Compare September 28, 2023 20:44
@sonarcloud
Copy link

sonarcloud bot commented Sep 28, 2023

Kudos, SonarCloud Quality Gate passed!    Quality Gate passed

Bug A 0 Bugs
Vulnerability A 0 Vulnerabilities
Security Hotspot A 0 Security Hotspots
Code Smell A 0 Code Smells

0.0% 0.0% Coverage
0.0% 0.0% Duplication

@nmalfroy nmalfroy merged commit ef79888 into dev Sep 29, 2023
14 checks passed
@nmalfroy nmalfroy deleted the dr-3077-nm-log-resolution-to-bard branch September 29, 2023 01:08
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants