-
Notifications
You must be signed in to change notification settings - Fork 2
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add otel integration #19
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good @BenoitRanque
A couple of questions in the review, and having OTEL_EXPORTER_OTLP_ENDPOINT
determine if the feature is on or not will be useful for transition to Beta.
I also have a strong desire to see all the SDKs output trace information to STDOUT if OTEL is disabled, or if the OTEL service can't be found. But this is a broader change that will apply to Rust SDK, etc. too so there's no reason to debate that here.
"@opentelemetry/api": "^1.7.0", | ||
"@opentelemetry/exporter-metrics-otlp-proto": "^0.48.0", | ||
"@opentelemetry/instrumentation-fastify": "^0.33.0", | ||
"@opentelemetry/instrumentation-http": "^0.48.0", | ||
"@opentelemetry/instrumentation-pino": "^0.35.0", | ||
"@opentelemetry/resources": "^1.21.0", | ||
"@opentelemetry/sdk-metrics": "^1.21.0", | ||
"@opentelemetry/sdk-node": "^0.48.0", | ||
"@opentelemetry/semantic-conventions": "^1.21.0", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Quite a few deps. Do we reference these all directly?
return connector.getCapabilities(configuration); | ||
const capabilitiesResponse = tracer.startActiveSpan( | ||
"getCapabilities", | ||
(span) => { | ||
const capabilitiesResponse = connector.getCapabilities(configuration); | ||
span.end(); | ||
return capabilitiesResponse; | ||
} | ||
); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I remember seeing some fastify middleware that wrapped the endpoints for OTEL automatically. Would this be viable? No worries if you want to keep it manual.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd like to know more about what the tracing output looks like on the terminal and how we can control the verbosity, if necessary.
Connectors like the NodeJS Lambda connector output information such as compile errors and function type errors on the terminal and if that just gets lost in reams of debug trace info, that would be a bad outcome.
if (options.otelExporterOtlpEndpoint) { | ||
initTelemetry( | ||
options.otelServiceName, | ||
options.otelExporterOtlpEndpoint | ||
); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a reason this wasn't added inside startServer
?
For users of the SDK that provide their own server startup action (that eventually calls startServer
), it would be nice if this was just done for them inside startServer
so they don't need to do it themselves.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You're right, this should go inside startServer
const queryResponse = await tracer.startActiveSpan( | ||
"handleQuery", | ||
async (span) => { | ||
const queryResponse = await connector.query( | ||
configuration, | ||
state, | ||
request.body | ||
); | ||
|
||
span.end(); | ||
|
||
return queryResponse; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
With this tracing pattern, what happens if connector.query
throws and span.end
does not get called?
I'm closing this PR in favour of #21, which forwardports the backport (lol 😵). That was easier than cherry picking the changes from the backport up here. |
This PR adds tracing and metrics, but there are open questions:
OTEL wants to be loaded before anything else.
The otel examples stress that OTEL libraries must be loaded before anything else, and that the require node flag should be used for this.
We may consider doing this in a standard dockerfile? For now, it's not been a problem, but users could have issues if they load some of our instrumented libraries before they load our SDK.
Our packaging spec suggests traces should be sent to the endpoint specified with
OTEL_EXPORTER_OTLP_ENDPOINT
, however the OTEL spec specifies that this should be the base url for telemetry, and traces should be sent to<base_url>/v1/traces
(or useOTEL_EXPORTER_OTLP_TRACES_ENDPOINT
for traces). We implement the OTEL spec with the expectation that the connector packaging spec will change. comment on RFCedit: this has now been clarified, and this implementation is correct.
I tried to test this with our infra, but I think the latest changes are not deployed yet, or I did something else wrong. Either way, this is not tested end-to-end, so caution is advised.
JIRA: NDC-236