
[Bug]: debug log shows me "Span written to the storage by the collector" but trace not found in jaeger #6200

Open
fissionlin opened this issue Nov 12, 2024 · 0 comments

What happened?

Jaeger v1.62.0
Commit 4b74462
Build 2024-10-07T09:16:33Z
Jaeger UI v1.62.0

We receive traces through an OTEL Collector and export them to Jaeger over OTLP, which Jaeger supports natively. The OTEL Collector's debug exporter clearly shows the trace, and the Jaeger debug log reports:

{"level":"debug","ts":1731406040.6013877,"caller":"app/span_processor.go:154","msg":"Span written to the storage by the collector","trace-id":"0979e8e44fa0e1f8e584a6d5034df85a"}

But:
1. In the Jaeger UI the service does not appear in the "Service" dropdown list, even though the trace carries the service.name attribute.
2. Querying that trace with the command below returns "not found":

curl localhost:16686/api/traces/0979e8e44fa0e1f8e584a6d5034df85a

{"data":null,"total":0,"limit":0,"offset":0,"errors":[{"code":404,"msg":"trace not found"}]}
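
As far as I understand, the "Service" dropdown in the UI is populated from the query service's /api/services endpoint, so it can also be checked directly. The expected response shape shown in the comment below is my assumption, not captured output:

# List the services known to the Jaeger query service (the same data the
# UI's "Service" dropdown is built from, as far as I know).
curl -s localhost:16686/api/services

# Assumed response shape: {"data":["service-a","service-b"],"total":2,...}

If our service name is missing here as well, that would match what we see in the dropdown.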

We use Elasticsearch as the SPAN_STORAGE_TYPE.

The Elasticsearch version is 8.15.3.
The OTEL Collector version is otelcol-contrib_0.112.0_linux_amd64.

The issue only affects traces from our own instrumentation; the traces emitted by Jaeger itself are query-able in the UI.

May I know under what circumstances the log would report that a trace was written successfully when it actually was not?
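
To narrow down whether the spans ever reach Elasticsearch, the storage can also be queried directly, bypassing the Jaeger query service. This is only a sketch: it assumes that the --es.index-prefix abc option from the deployment config below results in daily indices named abc-jaeger-span-YYYY-MM-DD and that the span documents carry a traceID field.

# Search the Jaeger span indices directly for the trace ID from the debug log
# (the index pattern and field name are assumptions based on the "abc" prefix).
curl -s 'http://172.22.133.10:9200/abc-jaeger-span-*/_search' \
  -H 'Content-Type: application/json' \
  -d '{"query": {"term": {"traceID": "0979e8e44fa0e1f8e584a6d5034df85a"}}}'

If this returns hits while the query API responds with 404, the problem is on the read path; if it returns nothing, the write never reached Elasticsearch despite the "Span written to the storage" log line.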

Steps to reproduce

The OTEL Collector's YAML config file is as follows:

extensions:
  health_check:
  pprof:
    endpoint: 0.0.0.0:1777
  zpages:
    endpoint: 0.0.0.0:55679

receivers:
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317
      http:
        endpoint: 0.0.0.0:4318
        cors:
          allowed_origins:
            - "http://*"
            - "https://*"

  # Host metrics
  hostmetrics:
    scrapers:
      cpu:
        metrics:
          system.cpu.utilization:
            enabled: true
      disk:
      load:
      filesystem:
        exclude_mount_points:
          mount_points:
            - /dev/*
            - /proc/*
            - /sys/*
            - /run/k3s/containerd/*
            - /var/lib/docker/*
            - /var/lib/kubelet/*
            - /snap/*
          match_type: regexp
        exclude_fs_types:
          fs_types:
            - autofs
            - binfmt_misc
            - bpf
            - cgroup2
            - configfs
            - debugfs
            - devpts
            - devtmpfs
            - fusectl
            - hugetlbfs
            - iso9660
            - mqueue
            - nsfs
            - overlay
            - proc
            - procfs
            - pstore
            - rpc_pipefs
            - securityfs
            - selinuxfs
            - squashfs
            - sysfs
            - tracefs
          match_type: strict
      memory:
        metrics:
          system.memory.utilization:
            enabled: true
      network:
      paging:
      processes:
      process:
        mute_process_exe_error: true
        mute_process_io_error: true
        mute_process_user_error: true

  # Elasticsearch metrics
  elasticsearch:
    metrics:
      elasticsearch.node.fs.disk.available:
        enabled: false
    nodes: ["_local"]
    indices: ["abc*"]
    endpoint: http://localhost:9200
    collection_interval: 10s

  # Collect own metrics
  prometheus:
    config:
      scrape_configs:
        - job_name: 'otel-collector'
          scrape_interval: 10s
          static_configs:
            - targets: ['0.0.0.0:8888']

  jaeger:
    protocols:
      grpc:
        endpoint: 0.0.0.0:14251
      thrift_binary:
        endpoint: 0.0.0.0:6834
      thrift_compact:
        endpoint: 0.0.0.0:6833
      thrift_http:
        endpoint: 0.0.0.0:14266

processors:
  batch:

connectors:
  spanmetrics:

exporters:
  debug:
    verbosity: detailed
  otlp:
    endpoint: "172.22.133.10:4315"
    tls:
      insecure: true
  otlphttp/prometheus:
    endpoint: "http://172.22.133.10:9090/api/v1/otlp"
    tls:
      insecure: true
  elasticsearch:
    endpoint: "http://172.22.133.10:9200"
    tls:
      insecure: true

service:
  pipelines:
    traces:
      receivers: [otlp]
      processors: [batch]
      exporters: [otlp, debug]

    metrics:
      receivers: [otlp]
      processors: [batch]
      exporters: [otlphttp/prometheus]

    logs:
      receivers: [otlp]
      processors: [batch]
      exporters: [elasticsearch]
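
To separate our own instrumentation from the collector-to-Jaeger path, a synthetic span can be pushed straight into the collector's OTLP HTTP receiver on port 4318. The payload below is a minimal sketch of the OTLP/HTTP JSON encoding; the service name, trace/span IDs and timestamps are arbitrary placeholder values made up for the test.

# Send one hand-written span to the collector's OTLP HTTP receiver.
curl -s http://localhost:4318/v1/traces \
  -H 'Content-Type: application/json' \
  -d '{
    "resourceSpans": [{
      "resource": {
        "attributes": [
          {"key": "service.name", "value": {"stringValue": "curl-smoke-test"}}
        ]
      },
      "scopeSpans": [{
        "spans": [{
          "traceId": "0af7651916cd43dd8448eb211c80319c",
          "spanId": "b7ad6b7169203331",
          "name": "smoke-test-span",
          "kind": 2,
          "startTimeUnixNano": "1731406000000000000",
          "endTimeUnixNano": "1731406001000000000"
        }]
      }]
    }]
  }'

If "curl-smoke-test" then shows up in the Jaeger UI while our application's spans still do not, the collector-to-Jaeger-to-Elasticsearch path is fine and the problem lies in our instrumentation; if it also goes missing, the issue is downstream of the collector.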

Expected behavior

The trace should be written in jaeger and should be query-able.

Relevant log output

Screenshot

No response

Additional context

No response

Jaeger backend version

1.62.0

SDK

Opentelemetry 1.30.0

Pipeline

OTEL SDK -> OTEL Collector -> Jaeger All-in-one -> Elasticsearch

Storage backend

Elasticsearch 8.15.3

Operating system

RHEL 9.4

Deployment model

Linux

Deployment configs

#!/bin/sh
export SPAN_STORAGE_TYPE=elasticsearch
nohup /root/source/jaeger-1.62.0-linux-amd64/jaeger-all-in-one \
  --collector.grpc-server.host-port 14251 \
  --collector.http-server.host-port 14266 \
  --collector.otlp.grpc.host-port 4315 \
  --collector.otlp.http.host-port 4316 \
  --processor.jaeger-compact.server-host-port 6833 \
  --processor.jaeger-binary.server-host-port 6834 \
  --es.server-urls http://172.22.133.10:9200 \
  --es.num-shards 3 \
  --es.num-replicas 1 \
  --es.index-prefix abc \
  --log-level=debug &
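
After starting all-in-one with the abc index prefix, a quick sanity check is to list the indices that were actually created and their document counts (the abc* index naming is my assumption of how the --es.index-prefix option is applied):

# List Jaeger's Elasticsearch indices under the configured prefix,
# with document counts and sizes.
curl -s 'http://172.22.133.10:9200/_cat/indices/abc*?v'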
fissionlin added the bug label on Nov 12, 2024
jacobtrombetta added a commit to spaceandtimelabs/sxt-proof-of-sql that referenced this issue Nov 18, 2024
# Rationale for this change
The latest Jaeger all-in-one Docker image, version `1.63.0`, has a bug.
After running benchmarks you cannot view the output from the UI -
jaegertracing/jaeger#6200. This PR updates the
documents referencing the Jaeger all-in-one image version to point to
the latest working version - `1.62.0`.

# What changes are included in this PR?
- Docs are updated to specify the latest working Jaeger all-in-one
Docker image version - `1.62.0`.

# Are these changes tested?
Yes