Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

appender-tracing: Include trace ID and span ID in logs when nested in tracing crate's spans #2438

Open
wants to merge 9 commits into
base: main
Choose a base branch
from

Conversation

dylanahsmith
Copy link

Fixes #1378

Problem Summary

opentelemetry-appender-tracing was only including the trace & span IDs from scopes opened using the opentelemetry tracing API. However, given that opentelemetry-appender-tracing is meant to provide support for getting these events from the tracing crate, it seems more likely that the spans would also be created through the tracing crate, but the trace & span IDS from those spans weren't being included in the logs.

Changes

Similar to the previous fix attempt (#1394), the span/trace IDs are obtained from the OtelData span extension from that the opentelemetry-tracing adds.

Issues resolved from the previous fix attempt (#1394):

Additionally, I've made the following changes:

  • Added a regression test that ensures the trace & span IDs are consistent between the logs and tracing signals
  • I've made the new tracing feature a default feature, since it seems very likely that there will be tracing spans to correlate if opentelemetry-appender-tracing is being. This should help avoid a lot of confusion, but still allows users to opt-out of this feature.

Merge requirement checklist

  • CONTRIBUTING guidelines followed
  • Unit tests added/updated (if applicable)
  • Appropriate CHANGELOG.md files updated for non-trivial, user-facing changes
  • Changes in public API reviewed (if applicable)

@dylanahsmith dylanahsmith requested a review from a team as a code owner December 17, 2024 06:33
Copy link
Member

@cijothomas cijothomas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Requesting changes to discuss #2438 (comment)

@@ -17,6 +17,7 @@ tracing = { workspace = true, features = ["std"]}
tracing-core = { workspace = true }
tracing-log = { version = "0.2", optional = true }
tracing-subscriber = { workspace = true, features = ["registry", "std"] }
tracing-opentelemetry = { version = "0.28", optional = true }
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will there be a conflict if the application uses version 0.29 of tracing-opentelemetry, while the appender depends on version 0.28 as specified in this configuration?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

tracing-opentelemetry v0.29 hasn't been released, but I think I see what you are getting at.

tracing-opentelemetry v0.28 (latest release) depends on opentelemetry v0.27, so won't be compatible with opentelemetry v0.28 when it is released. It looks like typically all the opentelemetry crates would be bumped together so there is a version of all these crates that are compatible with each other. However, this means the point at which opentelemetry-appender-tracing v0.28 would be released there wouldn't be a compatible version of tracing-opentelemetry without coordination with that external crate.

Given that this is an experimental feature, I'm guessing it doesn't make sense to wait for tracing-opentelemetry v0.29 to be released before releasing opentelemetry-appender-tracing.

Removing the version constraint would probably work in practice with these unstable versions given that the constraint on opentelemetry crates should ensure the correct version is used. Although, potentially tracing-opentelemetry could introduce a breaking change that is incompatible with opentelemetry-appender-tracing ithout changing the opentelemetry v0.x version that it depends on, which I think would result in the incompatible version of tracing-opentelemetry to be chosen to use by cargo when a compatible version does exist.

I think the simplest thing to do for now (while things are unstable) when releasing a new v0.x version of opentelemetry crates would be to just optimistically bump the version of this optional tracing-opentelemetry dependency to its next v0.y version. E.g. when releasing opentelemetry v0.28.0, depend on tracing-opentelemetry v0.29, so that it can be made compatible with opentelemetry-appender-tracing without any further changes to opentelemetry-appender-tracing. If tracing-opentelemetry v0.29.0 isn't compatible, then v0.28.1 can be released of opentelemetry-appender-tracing could be released fixing this feature. However, it also won't result in an incompatible v0.30 version of tracing-opentelemetry being used when v0.29 is already working with this feature.

Functionally, you could think of a release of a new v0.x version of opentelemetry crates (including opentelemetry-appender-tracing) as removing this feature, then automatically reintroducing the new feature when the next v0.y version of tracing-opentelemetry is made. This could be done manually, which would allow for extra testing after the v0.y release of tracing-opentelemetry is made, but it just seems like it introduces unnecessary work and complicates the opentelemetry-appender-tracing release process.

Once things stabilize, this could be coordinated more carefully between opentelemetry-appender-tracing & tracing-opentelemetry using RC releases, such that the new major version of opentelemetry-appender-tracing is made compatible with a RC version of tracing-opentelemetry.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @dylanahsmith. I agree that optimistically bumping the version of tracing-opentelemetry to the next v0.y (e.g., v0.29 when releasing opentelemetry v0.28) seems like a practical approach for now, given the experimental nature of the feature. If compatibility issues arise, a quick patch release can address them.

Comment on lines +55 to +58
[patch.crates-io]
opentelemetry = { path = "opentelemetry" }
opentelemetry_sdk = { path = "opentelemetry-sdk" }
opentelemetry-stdout = { path = "opentelemetry-stdout" }
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like the build failure

     Checking opentelemetry-prometheus v0.27.0 (/home/runner/work/opentelemetry-rust/opentelemetry-rust/opentelemetry-prometheus)
error[E0624]: associated function `empty` is private
   --> opentelemetry-prometheus/src/lib.rs:287:33
    |
287 |             resource: Resource::empty(),
    |                                 ^^^^^ private associated function
    |
   ::: /home/runner/work/opentelemetry-rust/opentelemetry-rust/opentelemetry-sdk/src/resource/mod.rs:82:5
    |
82  |     pub(crate) fn empty() -> Self {
    |     ----------------------------- private associated function defined here

is due to opentelemetry-prometheus not using path dependencies for opentelemetry/opentelemetry_sdk and being incompatible with the workspace version of these crates (e.g. #2322 seems to be the cause of this incompatibility), so this PRs dependency override causes it to actually use the workspace versions which it can't compatible with.

Should opentelemetry-prometheus be using the workspace versions of these crates? If it isn't desired for all crates in this workspace to be tested against the workspace version, then should opentelemetry-appender-tracing also get tested against the latest release on crates.io?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

opentelemetry-prometheus is not in active development, and it won't always work with the local workspace version. Can we have this to ignore the dependency override?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we have this to ignore the dependency override?

I don't see a way to scope the patch so it doesn't apply to a crate in the workspace and cargo requires the patch be applied at the root of the workspace (i.e. not just to a specific crate in the workspace).

One idea would be to remove it as a member of the workspace, but continue to have CI test it (e.g. add another cargo test command to scripts/test.sh). A glob is used to specify the workspace members, but exclude = ["opentelemetry-*"] can be added to the [workspace] section. Would this be an acceptable solution to this problem? Or does any have any better ideas?

Copy link
Member

@lalitb lalitb Dec 19, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One idea would be to remove it as a member of the workspace, but continue to have CI test it

this looks fine to me, specifically since the fate of Prometheus crate is still under discussion #2451. @cijothomas @TommyCpp do you see any concerns, or better option here.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@dylanahsmith Can you do the above changes as part of this PR, so that the CI will be successful.

@lfdominguez
Copy link

lfdominguez commented Dec 28, 2024

Please merge this, working perfect for me:

Using this dependencies on my actix-web app:

# Tracing
tracing = "0.1"
tracing-log = "0.2"
opentelemetry = "0.27"
opentelemetry_sdk = { version = "0.27", features = ["rt-tokio-current-thread"] }
opentelemetry-otlp = "0.27"
opentelemetry-semantic-conventions = "0.27"
opentelemetry-appender-tracing = { version = "0.27", features = ["experimental_use_tracing_span_context"] }
tracing-opentelemetry = "0.28"
tracing-subscriber = { version = "0.3", features = ["registry", "env-filter"] }
tracing-bunyan-formatter = "0.3"
tracing-actix-web = { version = "0.7", features = ["opentelemetry_0_27"] }

[patch.crates-io]
opentelemetry = { package = "opentelemetry",git = "https://github.com/dylanahsmith/opentelemetry-rust.git", branch = "appender-tracing-fix-trace-context"  }
opentelemetry-appender-tracing = { package = "opentelemetry-appender-tracing", git = "https://github.com/dylanahsmith/opentelemetry-rust.git", branch = "appender-tracing-fix-trace-context" }
opentelemetry-http = { package = "opentelemetry-http", git = "https://github.com/dylanahsmith/opentelemetry-rust.git", branch = "appender-tracing-fix-trace-context"  }
opentelemetry-otlp = { package = "opentelemetry-otlp", git = "https://github.com/dylanahsmith/opentelemetry-rust.git", branch = "appender-tracing-fix-trace-context"  }
opentelemetry_sdk = { package = "opentelemetry_sdk", git = "https://github.com/dylanahsmith/opentelemetry-rust.git", branch = "appender-tracing-fix-trace-context"  }

And setup:

use std::sync::LazyLock;
use opentelemetry::KeyValue;
use opentelemetry::trace::{TraceError, TracerProvider};
use opentelemetry_appender_tracing::layer::OpenTelemetryTracingBridge;
use opentelemetry_otlp::{LogExporter, MetricExporter, SpanExporter, WithExportConfig};
use opentelemetry_sdk::{logs::{self as sdklogs}, runtime, trace::{self as sdktrace}, Resource};
use opentelemetry_sdk::logs::LoggerProvider;
use opentelemetry_sdk::metrics::{MetricError, SdkMeterProvider};
use opentelemetry_sdk::propagation::TraceContextPropagator;
use opentelemetry_semantic_conventions::resource;
use tracing_bunyan_formatter::{JsonStorageLayer};
use tracing_subscriber::{EnvFilter, Layer};
use tracing_subscriber::layer::SubscriberExt;
use crate::error::CustomError;
use crate::tracing_utils;

const APP_NAME: &str = crate::built_info::PKG_NAME;

static RESOURCE: LazyLock<Resource> =
    // LazyLock::new(|| Resource::new(vec![KeyValue::new(resource::SERVICE_NAME, APP_NAME)]));
    LazyLock::new(|| {
        Resource::builder()
            .with_service_name(APP_NAME)
            .build()
    });

pub fn init_telemetry() -> Result<(), CustomError> {
    opentelemetry::global::set_text_map_propagator(TraceContextPropagator::new());

    let log_tracer = tracing_utils::init_logs()?;
    let span_tracer = tracing_utils::init_traces()?;
    let metric_tracer = tracing_utils::init_metrics()?;

    opentelemetry::global::set_meter_provider(metric_tracer.clone());

    let filter_otel = EnvFilter::new("info")
        .add_directive("hyper=off".parse().unwrap())
        .add_directive("opentelemetry=off".parse().unwrap())
        .add_directive("opentelemetry_sdk=off".parse().unwrap())
        .add_directive("tonic=off".parse().unwrap())
        .add_directive("h2=off".parse().unwrap())
        .add_directive("reqwest=off".parse().unwrap());


    let otel_log_layer = OpenTelemetryTracingBridge::new(&log_tracer).with_filter(filter_otel);
    let otel_span_layer = tracing_opentelemetry::layer().with_tracer(span_tracer.tracer(APP_NAME));

    let env_filter = EnvFilter::try_from_default_env().unwrap_or(EnvFilter::new("info")
        .add_directive("opentelemetry=debug".parse().unwrap())
        .add_directive("opentelemetry_sdk=off".parse().unwrap())
    );

    let format = tracing_subscriber::fmt::format().pretty();
    let formatting_layer = tracing_subscriber::fmt::layer().event_format(format);


    let subscriber = tracing_subscriber::Registry::default()
        .with(env_filter)
        .with(otel_span_layer)
        .with(otel_log_layer)
        .with(JsonStorageLayer)
        .with(formatting_layer);

    tracing::subscriber::set_global_default(subscriber).map_err(|err| {
        CustomError::TracingGenericError(format!("Fail to setup global tracing config: {}", err.to_string()))
    })?;

    Ok(())
}

fn init_logs() -> Result<sdklogs::LoggerProvider, opentelemetry_sdk::logs::LogError> {
    let exporter = LogExporter::builder()
        .with_tonic()
        .with_endpoint("http://127.0.0.1:44317")
        .build()?;

    Ok(LoggerProvider::builder()
        // .with_batch_exporter(exporter, runtime::TokioCurrentThread)
        .with_batch_exporter(exporter)
        .with_resource(RESOURCE.clone())
        .build())
}

fn init_traces() -> Result<sdktrace::TracerProvider, TraceError> {
    let exporter = SpanExporter::builder()
        .with_tonic()
        .with_endpoint("http://127.0.0.1:44317")
        .build()?;

    Ok(sdktrace::TracerProvider::builder()
        .with_batch_exporter(exporter, runtime::TokioCurrentThread)
        .with_resource(RESOURCE.clone())
        .build())
}

fn init_metrics() -> Result<opentelemetry_sdk::metrics::SdkMeterProvider, MetricError> {
    let exporter = MetricExporter::builder()
        .with_tonic()
        .with_endpoint("http://127.0.0.1:44317")
        .build()?;

    // let reader = opentelemetry_sdk::metrics::PeriodicReader::builder(exporter, runtime::TokioCurrentThread)
    let reader = opentelemetry_sdk::metrics::PeriodicReader::builder(exporter)
        .with_interval(std::time::Duration::from_secs(1))
        .build();

    Ok(SdkMeterProvider::builder()
        .with_reader(reader)
        .with_resource(RESOURCE.clone())
        .build())
}

and set this on actix-web httpserver app setup:

.wrap(TracingLogger::default())

image

@VincentBaijot
Copy link

This correction is very nice.
I use it in a project and its awsome to be able to use instrument macro and see the trace and logs automatically correlated even in open telemetries.

Im waiting for it to be merged.

@cijothomas
Copy link
Member

Tagging for consideration in 0.28 though not strictly required for 0.28.

@cijothomas cijothomas added this to the 0.28.0 milestone Jan 21, 2025
@lalitb
Copy link
Member

lalitb commented Jan 21, 2025

@maximebedard - Would it be possible for you to help with the review, as changes are similar to #1394.
@dylanahsmith - Need to fix the CI build as discussed here.

@cijothomas cijothomas dismissed their stale review January 22, 2025 01:32

Feature flag is added now, so removing the block.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[Feature]: opentelemetry-appender-tracing should respect information inside tracing::Span
7 participants