Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for automatic APM tracers configuration #354

Merged
merged 14 commits into from
Oct 13, 2023

Conversation

nikita-tkachenko-datadog
Copy link
Collaborator

@nikita-tkachenko-datadog nikita-tkachenko-datadog commented Aug 10, 2023

Requirements for Contributing to this repository

  • Fill out the template below. Any pull request that does not include enough information to be reviewed in a timely manner may be closed at the maintainers' discretion.
  • The pull request must only fix one issue at the time.
  • The pull request must update the test suite to demonstrate the changed functionality.
  • After you create the pull request, all status checks must be pass before a maintainer reviews your contribution. For more details, please see CONTRIBUTING.

What does this PR do?

Adds support for injecting Datadog APM Tracers into Jenkins builds transparently to the user.
"Enable Datadog Test Visibility" checkbox should be set in job configuration UI in order to enable tracer injection for a specific job.

Description of the Change

A new JobProperty is added that stores tracer injection settings.
An EnvironmentContributor (for freestyle projects) and a StepContributor (for pipelines) are added as well.

If Java tracer injection is enabled one of the contributors downloads the tracer JAR (the JAR is downloaded to the node where the build is running) and augments job environment with MAVEN_OPTS/GRADLE_OPTS and Datadog environment variables.
Tracer JAR is stored inside the job workspace. By default it is downloaded only if no JAR has been downloaded previously, or if previous JAR is older than 12 hours (TTL is configurable).

Injection for other languages works similarly (a combination of installing the tracer and injecting it via environment variables).

Alternate Designs

  1. Configuring tracer injection centrally in Datadog plugin settings - rejected, as this is too coarse grained, customer might not want to run ALL of their jobs with the tracer, but likely only a subset of them.
  2. Adding custom a job/pipeline step - rejected, as this would require customer to alter their job/pipeline definitions, and is not likely to bring additional value (it is very unlikely that the customer might want to trigger multiple Maven/Gradle builds in the same job and only instrument some of them. Even in such situations the accepted design allows this by modifying the injected env variables).
  3. Using BuildWrapper instead of EnvironmentContributor - rejected, as using a build wrapper requires a dedicated UI control in addition to job property that is already there

Possible Drawbacks

Tracer injection will not work if a job starts Maven/Gradle build inside a container.
In this case customer can fallback to manual tracer configuration, the way this is done at the moment.

Verification Process

Verified manually using a dockerized Jenkins instance.
Checked the following scenarios:

  • freestyle build executed on master node
  • freestyle build executed on agent node
  • pipeline build executed on master node
  • pipeline build executed on agent node

Both Maven and Gradle builds were verified.

Additional Notes

Release Notes

Review checklist (to be filled by reviewers)

  • Feature or bug fix MUST have appropriate tests (unit, integration, etc...)
  • PR title must be written as a CHANGELOG entry (see why)
  • Files changes must correspond to the primary purpose of the PR as described in the title (small unrelated changes should have their own PR)
  • PR must have one changelog/ label attached. If applicable it should have the backward-incompatible label attached.
  • PR should not have do-not-merge/ label attached.
  • If Applicable, issue must have kind/ and severity/ labels attached at least.

@nikita-tkachenko-datadog nikita-tkachenko-datadog added the changelog/Added Added features results into a minor version bump label Aug 10, 2023
@github-actions github-actions bot added the documentation Documentation related changes label Aug 11, 2023
Base automatically changed from nikita-tkachenko/network-layer-update to master August 28, 2023 08:43
@nikita-tkachenko-datadog nikita-tkachenko-datadog force-pushed the nikita-tkachenko/tracer-autoconfig branch from 9344634 to 4cfc92f Compare August 29, 2023 11:30

final class JavaConfigurator implements TracerConfigurator {

private static final String TRACER_DISTRIBUTION_URL = "https://dtdg.co/latest-java-tracer";
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In other environments such as PCF, we've found it useful to allow for configuring the version of tracer used, instead of always downloading the latest version

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have added the possibility of overriding distribution URL via env var (configured either centrally for the Jenkins process, or via the new job property).
I would really like to push the clients to using the "latest" URL as it takes care of updating the tracer transparently for them, so I am intentionally not exposing this in UI.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, I agree it's fine to leave it out of the UI. Do you mind adding verification around this URL to ensure it is a valid datadog/tracer URL?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added verification to ensure we're either downloading from the Datadog's distribution site or from Maven Central (this can be useful since Datadog does not server some minor releases). In the latter case artifact path is also checked to ensure it's the tracer that we're downloading.

@nikita-tkachenko-datadog nikita-tkachenko-datadog changed the title Add support for automatic Java tracer configuration Add support for automatic APM tracers configuration Aug 31, 2023
public ShellCommandCallable(long timeoutMillis, String... command) {
this.timeoutMillis = timeoutMillis;
this.command = command;
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The SpotBugs bot noted this:

new org.datadog.jenkins.plugins.datadog.tracer.ShellCommandCallable(long, String[]) may expose internal representation by storing an externally mutable object into ShellCommandCallable.command

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added defensive copying of the array

Copy link

@urnfdog urnfdog left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've only just scratched the surface, I'll get back to this tomorrow. Feel free to ignore anything that sounds off.

StandardListBoxModel result = new StandardListBoxModel();
// If the users does not have permissions to list credentials, only list the current value
if (item == null) {
if (!Jenkins.get().hasPermission(Jenkins.ADMINISTER)) {
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it be possible to use an innate Jenkins/Hudson access control library to do this check? For example, if this is an AccessControlled object, can this be checkPermission(ADMINISTER)?

If not, does it make sense to make DatadogGlobalConfiguration an access controlled object? There's a small advantage in using this library in that it'll stay up to date with any major security changes to Jenkins itself, rather than doing our own check.

Apologies if I'm off on this, new to Jenkins.

Reference: https://www.jenkins.io/doc/developer/security/

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would say that we do use the Jenkins control library: we are checking the root ACL, that is we're verifying if the user has the permission to administer the Jenkins instance.

Implementing AccessControlled and calling checkPermission(ADMINISTER) works in a way that is very similar, the only difference is that the access controlled object can have its own ACL, which allows to manage access to it in a more fine-grained manner.

Technically it is possible, of course, to make the plugin config an AccessControlled object. I would argue, however, that changing the way authorisation works in the plugin is outside the scope of this particular feature.

The application key that is being added to the config is managed in a way that is consistent with the other protected config properties.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would argue, however, that changing the way authorisation works in the plugin is outside the scope of this particular feature.

Sounds good. Let's not do anything out of scope, especially if it's a concern for the broader plugin and can be captured in a separate review if necessary.

The application key that is being added to the config is managed in a way that is consistent with the other protected config properties.

I don't know how it works, but as long as it's not done in a way that's less secure than current standards those are questions for another place.

@@ -140,6 +141,9 @@ public class DatadogGlobalConfiguration extends GlobalConfiguration {
private Secret targetApiKey = null;
private String targetCredentialsApiKey = null;
private Secret usedApiKey = null;
private Secret targetApplicationKey = null;
private String targetCredentialsApplicationKey = null;
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a reason we want to keep this a String rather than a Secret?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This isn't the key itself, but an identifier of the Credentials that store the key.

* Populates the targetCredentialsApplicationKey field from the configuration screen with all of the valid credentials
*
* @param item - The context within which to list available credentials
* @param targetCredentialsApplicationKey - A String containing the application key as a credential
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are there length checks before the string params gets here? I.e. can someone throw in a 2,147,483,647 char key?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The String object is created by the Java Servlet mechanism that parses request parameters. This is outside of the plugin scope, we just get whatever is provided by the lower abstraction layers.

This is being submitted in a POST request, so there is no limit on the length of parameter. It is technically possible for someone to submit a parameter this long. It would create a 4GB string in the JVM heap, which either occupies a significant part of it, or lead to an OutOfMemoryError if the heap is not big enough

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In this regard it's identical to any other user-supplied config parameter

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks. It sounds like there are no unique vulnerabilities the this would be subject to in terms of inputs, and likely a failure would already occur out of scope much further upstream. Is it safe to say then that even though the APM tracer configuration is ingesting new inputs, it is not responsible for handling the sanitization and enforcing limits, delegating that to the servlet?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Correct, request parameter parsing is done by servlet.

&& !item.hasPermission(CredentialsProvider.USE_ITEM)) {
return result.includeCurrentValue(targetCredentialsApplicationKey);
}
}
Copy link

@urnfdog urnfdog Sep 6, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Informational: Are there unit tests for the permission logic here? I might have missed them in the test changes. Just a recommendation.

My main concern here around tests, outside of good practice for authorization code, is that the logic here is approximately "fail open" (your default state is the sensitive operation) - I only say approximate, because in case of an exception, as far as I know Java will stop execution.

However, this means the logic required some extra thought to figure out exactly the scenarios allowed for access and I worry a little bit about maintainability.

For example, say that Jenkins introduces a new permission type called "SUPERUSER" and we want to allow that too. Throwing this in requires the DeMorgan-ing our logic to !Jenkins.get().hasPermission(Jenkins.ADMINISTER) && !Jenkins.get().hasPermission(Jenkins.SUPERUSER), but I feel it's easy to make the mistake !Jenkins.get().hasPermission(Jenkins.ADMINISTER) || !Jenkins.get().hasPermission(Jenkins.SUPERUSER), which would be an access control error and grant unrestricted access.

Might get caught in code review, might not - but explicit tests would go a long way here.

Along those lines, if the ADMINISTER permission is inclusive of all item permissions, I wonder if it's easier to maintain if the logic is inverted, though this can still look a bit janky:

if (Jenkins.get().hasPermission(Jenkins.ADMINISTER)
        || (item != null 
                && (item.hasPermission(Item.EXTENDED_READ)
                        || item.hasPermission(CredentialsProvider.USE_ITEM))) {
        return result.includeEmptyValue()
                .includeMatchingAs(ACL.SYSTEM,
                        Jenkins.get(),
                        StringCredentials.class,
                        Collections.emptyList(),
                        CredentialsMatchers.instanceOf(StringCredentials.class))
                .includeCurrentValue(targetCredentialsApplicationKey);
}
return result.includeCurrentValue(targetCredentialsApplicationKey);

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree, that covering this with tests is a good idea, however I would argue that it should be done in a separate PR: in here I am simply introducing an additional field which is handled in a way identical to the others that existed before, and it would be best to change/verify the way permissions are checked in a centralised manner (that is, do it for all config properties).


DatadogClient.ClientType clientType = DatadogClient.ClientType.valueOf(datadogConfig.getReportWith());
switch (clientType) {
case HTTP:
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this with or without TLS? If not, followup feels like it's out of scope but it does appear that there's some sensitive items being sent over the wire (API Key/App key).

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The injected APM tracer will always use HTTPS. The plugin itself, when sending job data to Datadog, will use whatever URL is supplied by the user in the config settings, so it depends on whether Datadog allows unencrypted HTTP connections or not. I tried configuring URLs with http:// protocol, but the server side always seems to negotiate the usage of HTTPS.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Regarding will use whatever URL is supplied by the user in the config settings - so we don't necessary enforce this has to go to a datadog domain, and correct me if I'm wrong: this tracer isn't designed to enforce that. It provides the shape of data we expect, but we allow the user to send to any recipient? Because of that, it's a design decision to use HTTP by default, where the recipient ends up negotiating encryption?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The HTTP constant name might be misleading here, in reality the URL can be either http or https, depending on what the user configures.

Yes, technically the user can configure the plugin to send the data to any endpoint, not necessarily Datadog. I do not know if this is a valid or expected use case (might be useful if the user wants the data to go through some proxy), but it is technically possible to enter anything into the endpoint URL inputs: the validation logic simply checks that the url string is not empty and that it contains http substring.

The default values all point to Datadog endpoints, and use https protocol. If I change them and specify http, the remote end (Datadog backend) will still force the usage of https.

As a side note, URL config inputs, validation logic, and different client types were implemented before. In this feature I am using this pre-existing logic to ensure that the injected tracer uses the same config values as the plugin itself.

}
}

private void validateUserSuppliedTracerUrl(String distributionUrl) {
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While it looks like we validate the url, is there some way for the configurator to confirm that this is a validated copy of the tracer? Main thing guarding against here is a compromise of maven. I notice that there are some degree of shasum checks later on - does that apply here?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tracer available in Maven Central is signed with Datadog's PGP key. I have added signature verification for cases when we download it from Maven.

Copy link

@urnfdog urnfdog left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Have a lot of recommendations, but nothing glaringly significant with my current level of context. Feel free to push back on anything that doesn't make sense that you feel have other controls or are exceedingly unlikely. Thanks!

@nikita-tkachenko-datadog nikita-tkachenko-datadog force-pushed the nikita-tkachenko/tracer-autoconfig branch from 9e28ce4 to b68423f Compare September 27, 2023 14:59
@nikita-tkachenko-datadog nikita-tkachenko-datadog force-pushed the nikita-tkachenko/tracer-autoconfig branch from b68423f to b53ace7 Compare September 27, 2023 15:01
@nikita-tkachenko-datadog nikita-tkachenko-datadog merged commit f206645 into master Oct 13, 2023
15 checks passed
@nikita-tkachenko-datadog nikita-tkachenko-datadog deleted the nikita-tkachenko/tracer-autoconfig branch October 13, 2023 14:50
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
changelog/Added Added features results into a minor version bump documentation Documentation related changes
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants