From 4e3b44bfd4519ed39bdf80ebfd5318f243930ba6 Mon Sep 17 00:00:00 2001
From: Adriana Villela <50256412+avillela@users.noreply.github.com>
Date: Thu, 20 Jun 2024 11:41:13 -0400
Subject: [PATCH 1/9] Add OTel Operator troubleshooting tips for
 auto-instrumentation. Ref issue #4723

---
 .../operator/troubleshooting/_index.md        |   8 +
 .../operator/troubleshooting/automatic.md     | 179 ++++++++++++++++++
 2 files changed, 187 insertions(+)
 create mode 100644 content/en/docs/kubernetes/operator/troubleshooting/_index.md
 create mode 100644 content/en/docs/kubernetes/operator/troubleshooting/automatic.md

diff --git a/content/en/docs/kubernetes/operator/troubleshooting/_index.md b/content/en/docs/kubernetes/operator/troubleshooting/_index.md
new file mode 100644
index 000000000000..8349c7457f97
--- /dev/null
+++ b/content/en/docs/kubernetes/operator/troubleshooting/_index.md
@@ -0,0 +1,8 @@
+---
+title: Troubleshooting the OpenTelemetry Operator for Kubernetes
+linkTitle: Troubleshooting
+description:
+  Contains a collection of tips for troubleshooting various aspects of the
+  OpenTelemetry Kubernetes Operator. For example, what to do when the target
+  allocator isn't scraping metrics.
+---
diff --git a/content/en/docs/kubernetes/operator/troubleshooting/automatic.md b/content/en/docs/kubernetes/operator/troubleshooting/automatic.md
new file mode 100644
index 000000000000..123e8092d632
--- /dev/null
+++ b/content/en/docs/kubernetes/operator/troubleshooting/automatic.md
@@ -0,0 +1,179 @@
+---
+title: Auto-instrumentation
+---
+
+If you're using the [OpenTelemetry Operator](/docs/kubernetes/operator)'s
+[auto-instrumentation](/docs/kubernetes/operator/automatic) capability and
+you're not seeing any traces or metrics, then there are a few troubleshooting
+steps that you can take to help you understand what’s going on and to get things
+back on track.
+
+## Troubleshooting Steps
+
+### 1- Check installation status
+
+After installing the `Instrumentation` resource, make sure that it _actually_
+installed correctly by running this command:
+
+```shell
+kubectl describe otelinst -n <namespace>
+```
+
+Where `<namespace>` is the namespace in which the `Instrumentation` resource is
+deployed.
+
+Your output should look something like this:
+
+```yaml
+Name: python-instrumentation
+Namespace: application
+Labels: app.kubernetes.io/managed-by=opentelemetry-operator
+Annotations: instrumentation.opentelemetry.io/default-auto-instrumentation-apache-httpd-image:
+  ghcr.io/open-telemetry/opentelemetry-operator/autoinstrumentation-apache-httpd:1.0.3
+  instrumentation.opentelemetry.io/default-auto-instrumentation-dotnet-image:
+  ghcr.io/open-telemetry/opentelemetry-operator/autoinstrumentation-dotnet:0.7.0
+  instrumentation.opentelemetry.io/default-auto-instrumentation-go-image:
+  ghcr.io/open-telemetry/opentelemetry-go-instrumentation/autoinstrumentation-go:v0.2.1-alpha
+  instrumentation.opentelemetry.io/default-auto-instrumentation-java-image:
+  ghcr.io/open-telemetry/opentelemetry-operator/autoinstrumentation-java:1.26.0
+  instrumentation.opentelemetry.io/default-auto-instrumentation-nodejs-image:
+  ghcr.io/open-telemetry/opentelemetry-operator/autoinstrumentation-nodejs:0.40.0
+  instrumentation.opentelemetry.io/default-auto-instrumentation-python-image:
+  ghcr.io/open-telemetry/opentelemetry-operator/autoinstrumentation-python:0.39b0
+API Version: opentelemetry.io/v1alpha1
+Kind: Instrumentation
+Metadata:
+  Creation Timestamp: 2023-07-28T03:42:12Z
+  Generation: 1
+  Resource Version: 3385
+  UID: 646661d5-a8fc-4b64-80b7-8587c9865f53
+Spec:
+...
+  Exporter:
+    Endpoint: http://otel-collector-collector.opentelemetry.svc.cluster.local:4318
+...
+  Propagators:
+    tracecontext
+    baggage
+  Python:
+    Image: ghcr.io/open-telemetry/opentelemetry-operator/autoinstrumentation-python:0.39b0
+    Resource Requirements:
+      Limits:
+        Cpu: 500m
+        Memory: 32Mi
+      Requests:
+        Cpu: 50m
+        Memory: 32Mi
+  Resource:
+  Sampler:
+Events:
+```
+
+### 2- Check the OpenTelemetry Operator Logs
+
+Check the OpenTelemetry Operator logs for errors, by running this command:
+
+```shell
+kubectl logs -l app.kubernetes.io/name=opentelemetry-operator --container manager -n opentelemetry-operator-system --follow
+```
+
+The logs should not show any errors related to auto-instrumentation.
+
+### 3- Check deployment order
+
+Order matters. The `Instrumentation` resource must be deployed before deploying
+the corresponding `Deployment` resource(s) being auto-instrumented.
+
+Consider the following auto-instrumentation annotation snippet:
+
+```yaml
+annotations:
+  instrumentation.opentelemetry.io/inject-python: 'true'
+```
+
+It tells the OpenTelemetry Operator to look for an `Instrumentation` resource in
+the pod’s namespace. It also tells the Operator to inject Python
+auto-instrumentation into the pod.
+
+When the pod starts up, the annotation tells the Operator to look for an
+`Instrumentation` resource in the pod’s namespace, and to inject Python
+auto-instrumentation into the pod. It adds an
+[init-container](https://kubernetes.io/docs/concepts/workloads/pods/init-containers/)
+to the application’s pod, called `opentelemetry-auto-instrumentation`, which is
+then used to injects the auto-instrumentation into the app container.
+
+But if the `Instrumentation` resource isn’t present by the time the `Deployment`
+is deployed, the `init-container` can’t be created. This means that if the
+`Deployment` resource is deployed _before_ you deploy the `Instrumentation`
+resource, the auto-instrumentation will fail to initialize.
+
+Check that the `opentelemetry-auto-instrumentation` `init-container` has started
+up correctly (or has even started up at all), by running the following command:
+
+```shell
+kubectl get events -n <namespace>
+```
+
+Which should result in output that looks something like this:
+
+```text
+53s Normal Created pod/py-otel-server-7f54bf4cbc-p8wmj Created container opentelemetry-auto-instrumentation
+53s Normal Started pod/py-otel-server-7f54bf4cbc-p8wmj Started container opentelemetry-auto-instrumentation
+```
+
+If the output is missing `Created` and/or `Started` entries for
+`opentelemetry-auto-instrumentation`, then it means that there is an issue with
+your auto-instrumentation configuration. This can be the result of any of the
+following:
+
+- The `Instrumentation` resource wasn’t installed (or wasn’t installed
+  properly).
+- The `Instrumentation` resource was installed _after_ the application was
+  deployed.
+- There’s an error in the auto-instrumentation annotation, or the annotation in
+  the wrong spot — see #4 below.
+
+You might also want to check the output of the events command for any errors, as
+these might help point to your issue.
+
+### 4- Check the auto-instrumentation configuration
+
+You’ve added the auto-instrumentation annotation, but did you do it correctly?
+Here are a couple of things to check for:
+
+- **Are you auto-instrumenting for the right language?** For example, did you
+  try to auto-instrument a Python application by adding a JavaScript
+  auto-instrumentation annotation instead?
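+
+  As a quick sketch, a Python app needs the Python annotation key; the
+  Node.js key (`instrumentation.opentelemetry.io/inject-nodejs`) is shown
+  here only for contrast:
+
+  ```yaml
+  # Correct for a Python app
+  annotations:
+    instrumentation.opentelemetry.io/inject-python: 'true'
+  # Wrong for a Python app (this key targets Node.js):
+  #   instrumentation.opentelemetry.io/inject-nodejs: 'true'
+  ```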
+- **Did you put the auto-instrumentation annotation in the right spot?** When
+  you’re defining a `Deployment` resource, there are two spots where you could
+  add annotations: `spec.metadata.annotations`, and
+  `spec.template.metadata.annotations`. The auto-instrumentation annotation
+  needs to be added to `spec.template.metadata.annotations`, otherwise _it won’t
+  work_.
+
+### 5- Check auto-instrumentation endpoint configuration
+
+The `spec.exporter.endpoint` configuration in the `Instrumentation` resource
+allows you to define the destination for your telemetry data. If you omit it, it
+defaults to `http://localhost:4317`. Unfortunately, that won’t send your output
+anywhere useful.
+
+If you’re sending out your instrumentation to a [Collector](/docs/collector/),
+the value of `spec.exporter.endpoint` should reference the name of your
+Collector
+[`Service`](https://kubernetes.io/docs/concepts/services-networking/service/).
+
+For example: `http://otel-collector.opentelemetry.svc.cluster.local:4318`.
+
+Where:
+
+- `otel-collector` is the name of the OTel Collector Kubernetes
+  [`Service`](https://kubernetes.io/docs/concepts/services-networking/service/)
+- In addition, if the Collector is running in a different namespace, you must
+  append `opentelemetry.svc.cluster.local` to the Collector’s service name,
+  where `opentelemetry` is the namespace in which the Collector resides (it can
+  be any namespace of your choosing).
+
+Finally, make sure that you are using the right Collector port. Normally, you
+can choose either `4317` (gRPC) or `4318` (HTTP); however, for
+[Python auto-instrumentation, you can only use `4318`](/docs/kubernetes/operator/automatic/#python).

From 7f66fa84bb32a48df4edf08f253e076f4148882b Mon Sep 17 00:00:00 2001
From: Adriana Villela <50256412+avillela@users.noreply.github.com>
Date: Thu, 20 Jun 2024 12:10:00 -0400
Subject: [PATCH 2/9] Apply suggestions from code review

Co-authored-by: Severin Neumann
---
 .../kubernetes/operator/troubleshooting/automatic.md | 12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/content/en/docs/kubernetes/operator/troubleshooting/automatic.md b/content/en/docs/kubernetes/operator/troubleshooting/automatic.md
index 123e8092d632..9d0841a9f14d 100644
--- a/content/en/docs/kubernetes/operator/troubleshooting/automatic.md
+++ b/content/en/docs/kubernetes/operator/troubleshooting/automatic.md
@@ -3,14 +3,14 @@ title: Auto-instrumentation
 ---
 
 If you're using the [OpenTelemetry Operator](/docs/kubernetes/operator)'s
-[auto-instrumentation](/docs/kubernetes/operator/automatic) capability and
+capability to inject [auto-instrumentation](/docs/kubernetes/operator/automatic) and
 you're not seeing any traces or metrics, then there are a few troubleshooting
 steps that you can take to help you understand what’s going on and to get things
 back on track.
 
 ## Troubleshooting Steps
 
-### 1- Check installation status
+### Check installation status
 
 After installing the `Instrumentation` resource, make sure that it _actually_
 installed correctly by running this command:
@@ -69,7 +69,7 @@ Spec:
 Events:
 ```
 
-### 2- Check the OpenTelemetry Operator Logs
+### Check the OpenTelemetry Operator Logs
 
 Check the OpenTelemetry Operator logs for errors, by running this command:
 
@@ -79,7 +79,7 @@ kubectl logs -l app.kubernetes.io/name=opentelemetry-operator --container manage
 ```
 
 The logs should not show any errors related to auto-instrumentation.
 
-### 3- Check deployment order
+### Check deployment order
 
 Order matters. 
The `Instrumentation` resource must be deployed before deploying the corresponding `Deployment` resource(s) being auto-instrumented. @@ -136,7 +136,7 @@ following: You might also want to check the output of the events command for any errors, as these might help point to your issue. -### 4- Check the auto-instrumentation configuration +### Check the auto-instrumentation configuration You’ve added the auto-instrumentation annotation, but did you do it correctly? Here are a couple of things to check for: @@ -151,7 +151,7 @@ Here are a couple of things to check for: needs to be added to `spec.template.metadata.annotations`, otherwise _it won’t work_. -### 5- Check auto-instrumentation endpoint configuration +### Check auto-instrumentation endpoint configuration The `spec.exporter.endpoint` configuration in the `Instrumentation` resource allows you to define the destination for your telemetry data. If you omit it, it From 306c8b898085b689968de94a3d1077dd074a5c2c Mon Sep 17 00:00:00 2001 From: Adriana Villela Date: Thu, 20 Jun 2024 12:13:03 -0400 Subject: [PATCH 3/9] Prettify --- .../docs/kubernetes/operator/troubleshooting/automatic.md | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/content/en/docs/kubernetes/operator/troubleshooting/automatic.md b/content/en/docs/kubernetes/operator/troubleshooting/automatic.md index 9d0841a9f14d..bcd4f36da02e 100644 --- a/content/en/docs/kubernetes/operator/troubleshooting/automatic.md +++ b/content/en/docs/kubernetes/operator/troubleshooting/automatic.md @@ -3,10 +3,10 @@ title: Auto-instrumentation --- If you're using the [OpenTelemetry Operator](/docs/kubernetes/operator)'s -capability to inject [auto-instrumentation](/docs/kubernetes/operator/automatic) and -you're not seeing any traces or metrics, then there are a few troubleshooting -steps that you can take to help you understand what’s going on and to get things -back on track. +capability to inject [auto-instrumentation](/docs/kubernetes/operator/automatic) +and you're not seeing any traces or metrics, then there are a few +troubleshooting steps that you can take to help you understand what’s going on +and to get things back on track. ## Troubleshooting Steps From 3b4fae9674350859ff75d3a51331386b950c8cb9 Mon Sep 17 00:00:00 2001 From: Adriana Villela <50256412+avillela@users.noreply.github.com> Date: Mon, 24 Jun 2024 11:25:38 -0400 Subject: [PATCH 4/9] Apply suggestions from code review Co-authored-by: Fabrizio Ferri-Benedetti --- .../operator/troubleshooting/automatic.md | 65 +++++++++---------- 1 file changed, 32 insertions(+), 33 deletions(-) diff --git a/content/en/docs/kubernetes/operator/troubleshooting/automatic.md b/content/en/docs/kubernetes/operator/troubleshooting/automatic.md index bcd4f36da02e..863c25880087 100644 --- a/content/en/docs/kubernetes/operator/troubleshooting/automatic.md +++ b/content/en/docs/kubernetes/operator/troubleshooting/automatic.md @@ -4,15 +4,14 @@ title: Auto-instrumentation If you're using the [OpenTelemetry Operator](/docs/kubernetes/operator)'s capability to inject [auto-instrumentation](/docs/kubernetes/operator/automatic) -and you're not seeing any traces or metrics, then there are a few -troubleshooting steps that you can take to help you understand what’s going on -and to get things back on track. +and you're not seeing any traces or metrics, follow these +troubleshooting steps to understand what’s going on. 
-## Troubleshooting Steps
+## Troubleshooting steps
 
 ### Check installation status
 
-After installing the `Instrumentation` resource, make sure that it _actually_
+After installing the `Instrumentation` resource, make sure that it
 installed correctly by running this command:
 
 ```shell
@@ -22,7 +21,7 @@ kubectl describe otelinst -n <namespace>
 Where `<namespace>` is the namespace in which the `Instrumentation` resource is
 deployed.
 
-Your output should look something like this:
+Your output should look like this:
 
 ```yaml
 Name: python-instrumentation
@@ -72,7 +71,7 @@ Events:
 
 ### Check the OpenTelemetry Operator Logs
 
-Check the OpenTelemetry Operator logs for errors, by running this command:
+Check the OpenTelemetry Operator logs for errors by running this command:
 
 ```shell
 kubectl logs -l app.kubernetes.io/name=opentelemetry-operator --container manager -n opentelemetry-operator-system --follow
@@ -82,8 +81,9 @@ The logs should not show any errors related to auto-instrumentation.
 
 ### Check deployment order
 
-Order matters. The `Instrumentation` resource must be deployed before deploying
-the corresponding `Deployment` resource(s) being auto-instrumented.
+Make sure the deployment order is correct. The `Instrumentation` resource must
+be deployed before deploying the corresponding `Deployment` resources that are
+auto-instrumented.
 
 Consider the following auto-instrumentation annotation snippet:
 
@@ -93,9 +93,9 @@ annotations:
   instrumentation.opentelemetry.io/inject-python: 'true'
 ```
 
-It tells the OpenTelemetry Operator to look for an `Instrumentation` resource in
-the pod’s namespace. It also tells the Operator to inject Python
-auto-instrumentation into the pod.
+The previous snippet tells the OpenTelemetry Operator to look for an
+`Instrumentation` resource in the pod’s namespace. It also tells the
+Operator to inject Python auto-instrumentation into the pod.
 
 When the pod starts up, the annotation tells the Operator to look for an
 `Instrumentation` resource in the pod’s namespace, and to inject Python
@@ -104,10 +104,10 @@ auto-instrumentation into the pod. It adds an
 to the application’s pod, called `opentelemetry-auto-instrumentation`, which is
 then used to injects the auto-instrumentation into the app container.
 
-But if the `Instrumentation` resource isn’t present by the time the `Deployment`
-is deployed, the `init-container` can’t be created. This means that if the
-`Deployment` resource is deployed _before_ you deploy the `Instrumentation`
-resource, the auto-instrumentation will fail to initialize.
+If the `Instrumentation` resource isn’t present by the time the `Deployment` is
+deployed, the `init-container` can’t be created. This means that if the
+`Deployment` resource is deployed before you deploy the `Instrumentation`
+resource, the auto-instrumentation fails to initialize.
 
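+For example, an apply order like the following avoids the problem (the
+manifest file names here are placeholders for wherever you keep yours):
+
+```shell
+# Deploy the Instrumentation resource first...
+kubectl apply -f instrumentation.yaml
+# ...then the application that gets auto-instrumented
+kubectl apply -f deployment.yaml
+```
+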
 Check that the `opentelemetry-auto-instrumentation` `init-container` has started
 up correctly (or has even started up at all), by running the following command:
 
 ```shell
 kubectl get events -n <namespace>
 ```
 
-Which should result in output that looks something like this:
+Which should result in output that looks like the following example:
 
 ```text
 53s Normal Created pod/py-otel-server-7f54bf4cbc-p8wmj Created container opentelemetry-auto-instrumentation
 53s Normal Started pod/py-otel-server-7f54bf4cbc-p8wmj Started container opentelemetry-auto-instrumentation
 ```
 
-If the output is missing `Created` and/or `Started` entries for
-`opentelemetry-auto-instrumentation`, then it means that there is an issue with
-your auto-instrumentation configuration. This can be the result of any of the
+If the output is missing `Created` or `Started` entries for
+`opentelemetry-auto-instrumentation`, there might be an issue with
+your auto-instrumentation configuration. This can be the result of any of the
 following:
 
-- The `Instrumentation` resource wasn’t installed (or wasn’t installed
-  properly).
-- The `Instrumentation` resource was installed _after_ the application was
+- The `Instrumentation` resource wasn’t installed or wasn’t installed
+  properly.
+- The `Instrumentation` resource was installed after the application was
   deployed.
-- There’s an error in the auto-instrumentation annotation, or the annotation in
-  the wrong spot — see #4 below.
+- There’s an error in the auto-instrumentation annotation, or the annotation in
+  the wrong spot. See the next section.
 
 You might also want to check the output of the events command for any errors, as
 these might help point to your issue.
 
 ### Check the auto-instrumentation configuration
 
-You’ve added the auto-instrumentation annotation, but did you do it correctly?
-Here are a couple of things to check for:
+The auto-instrumentation annotation might have not been added
+correctly. Check for the following:
 
-- **Are you auto-instrumenting for the right language?** For example, did you
-  try to auto-instrument a Python application by adding a JavaScript
-  auto-instrumentation annotation instead?
+- Are you auto-instrumenting for the right language? For example, did you
+  try to auto-instrument a Python application by adding a JavaScript
+  auto-instrumentation annotation instead?
-- **Did you put the auto-instrumentation annotation in the right spot?** When
-  you’re defining a `Deployment` resource, there are two spots where you could
-  add annotations: `spec.metadata.annotations`, and
+- Did you put the auto-instrumentation annotation in the right location? When
+  you’re defining a `Deployment` resource, there are two locations where you
+  could add annotations: `spec.metadata.annotations`, and
   `spec.template.metadata.annotations`. The auto-instrumentation annotation
-  needs to be added to `spec.template.metadata.annotations`, otherwise _it won’t
-  work_.
+  needs to be added to `spec.template.metadata.annotations`, otherwise it
+  doesn't work.
 
 ### Check auto-instrumentation endpoint configuration
 
 The `spec.exporter.endpoint` configuration in the `Instrumentation` resource
 allows you to define the destination for your telemetry data. If you omit it, it
-defaults to `http://localhost:4317`. Unfortunately, that won’t send your output
-anywhere useful.
+defaults to `http://localhost:4317`, which causes the data to be dropped.
 
-If you’re sending out your instrumentation to a [Collector](/docs/collector/),
-the value of `spec.exporter.endpoint` should reference the name of your
-Collector
+If you’re sending out your telemetry to a [Collector](/docs/collector/),
+the value of `spec.exporter.endpoint` must reference the name of your
+Collector
 [`Service`](https://kubernetes.io/docs/concepts/services-networking/service/). 
From bd2f43955ff128f36e1c340d4b12c13f3454c56e Mon Sep 17 00:00:00 2001 From: Adriana Villela <50256412+avillela@users.noreply.github.com> Date: Tue, 25 Jun 2024 12:36:09 -0400 Subject: [PATCH 5/9] Apply suggestions from code review Co-authored-by: Tiffany Hrabusa <30397949+tiffany76@users.noreply.github.com> --- .../operator/troubleshooting/automatic.md | 22 ++++++------------- 1 file changed, 7 insertions(+), 15 deletions(-) diff --git a/content/en/docs/kubernetes/operator/troubleshooting/automatic.md b/content/en/docs/kubernetes/operator/troubleshooting/automatic.md index 863c25880087..7c70348eeaab 100644 --- a/content/en/docs/kubernetes/operator/troubleshooting/automatic.md +++ b/content/en/docs/kubernetes/operator/troubleshooting/automatic.md @@ -11,7 +11,7 @@ troubleshooting steps to understand what’s going on. ### Check installation status -After installing the `Instrumentation` resource, make sure that it +After installing the `Instrumentation` resource, make sure that it is installed correctly by running this command: ```shell @@ -68,7 +68,7 @@ Spec: Events: ``` -### Check the OpenTelemetry Operator Logs +### Check the OpenTelemetry Operator logs Check the OpenTelemetry Operator logs for errors by running this command: @@ -91,16 +91,13 @@ annotations: instrumentation.opentelemetry.io/inject-python: 'true' ``` -The previous snippet tells the OpenTelemetry Operator to look for an -`Instrumentation` resource in the pod’s namespace. It also tells the -Operator to inject Python auto-instrumentation into the pod. When the pod starts up, the annotation tells the Operator to look for an `Instrumentation` resource in the pod’s namespace, and to inject Python auto-instrumentation into the pod. It adds an [init-container](https://kubernetes.io/docs/concepts/workloads/pods/init-containers/) -to the application’s pod, called `opentelemetry-auto-instrumentation`, which is -then used to injects the auto-instrumentation into the app container. +called `opentelemetry-auto-instrumentation` to the application’s pod, which is +then used to inject the auto-instrumentation into the app container. If the `Instrumentation` resource isn’t present by the time the `Deployment` is deployed, the `init-container` can’t be created. This means that if the @@ -130,7 +127,7 @@ following: properly. - The `Instrumentation` resource was installed after the application was deployed. -- There’s an error in the auto-instrumentation annotation, or the annotation in +- There’s an error in the auto-instrumentation annotation, or the annotation is in the wrong spot. See the next section. You might also want to check the output of the events command for any errors, as @@ -164,14 +161,9 @@ Collector For example: `http://otel-collector.opentelemetry.svc.cluster.local:4318`. -Where: +Where `otel-collector` is the name of the OTel Collector Kubernetes [`Service`](https://kubernetes.io/docs/concepts/services-networking/service/). -- `otel-collector` is the name of the OTel Collector Kubernetes - [`Service`](https://kubernetes.io/docs/concepts/services-networking/service/) -- In addition, if the Collector is running in a different namespace, you must - append `opentelemetry.svc.cluster.local` to the Collector’s service name, - where `opentelemetry` is the namespace in which the Collector resides (it can - be any namespace of your choosing). 
+In addition, if the Collector is running in a different namespace, you must append `opentelemetry.svc.cluster.local` to the Collector’s service name, where `opentelemetry` is the namespace in which the Collector resides. It can be any namespace of your choosing. Finally, make sure that you are using the right Collector port. Normally, you can choose either `4317` (gRPC) or `4318` (HTTP); however, for From 273c9d09be2812a9c5ed1348934439966aea5de3 Mon Sep 17 00:00:00 2001 From: Adriana Villela Date: Tue, 25 Jun 2024 21:02:45 -0400 Subject: [PATCH 6/9] Incorporate suggestions --- .../operator/troubleshooting/automatic.md | 83 ++++++++++++------- 1 file changed, 54 insertions(+), 29 deletions(-) diff --git a/content/en/docs/kubernetes/operator/troubleshooting/automatic.md b/content/en/docs/kubernetes/operator/troubleshooting/automatic.md index 7c70348eeaab..2425769d930d 100644 --- a/content/en/docs/kubernetes/operator/troubleshooting/automatic.md +++ b/content/en/docs/kubernetes/operator/troubleshooting/automatic.md @@ -4,15 +4,15 @@ title: Auto-instrumentation If you're using the [OpenTelemetry Operator](/docs/kubernetes/operator)'s capability to inject [auto-instrumentation](/docs/kubernetes/operator/automatic) -and you're not seeing any traces or metrics, follow these -troubleshooting steps to understand what’s going on. +and you're not seeing any traces or metrics, follow these troubleshooting steps +to understand what’s going on. ## Troubleshooting steps ### Check installation status -After installing the `Instrumentation` resource, make sure that it is -installed correctly by running this command: +After installing the `Instrumentation` resource, make sure that it is installed +correctly by running this command: ```shell kubectl describe otelinst -n @@ -80,9 +80,9 @@ The logs should not show any errors related to auto-instrumentation errors. ### Check deployment order -Make sure the deployment order is correct. The `Instrumentation` resource -must be deployed before deploying the corresponding `Deployment` resources -that are auto-instrumented. +Make sure the deployment order is correct. The `Instrumentation` resource must +be deployed before deploying the corresponding `Deployment` resources that are +auto-instrumented. Consider the following auto-instrumentation annotation snippet: @@ -91,7 +91,6 @@ annotations: instrumentation.opentelemetry.io/inject-python: 'true' ``` - When the pod starts up, the annotation tells the Operator to look for an `Instrumentation` resource in the pod’s namespace, and to inject Python auto-instrumentation into the pod. It adds an @@ -99,8 +98,8 @@ auto-instrumentation into the pod. It adds an called `opentelemetry-auto-instrumentation` to the application’s pod, which is then used to inject the auto-instrumentation into the app container. -If the `Instrumentation` resource isn’t present by the time the `Deployment` -is deployed, the `init-container` can’t be created. This means that if the +If the `Instrumentation` resource isn’t present by the time the `Deployment` is +deployed, the `init-container` can’t be created. This means that if the `Deployment` resource is deployed before you deploy the `Instrumentation` resource, the auto-instrumentation fails to initialize. @@ -119,34 +118,57 @@ Which should result in output that looks like the following example: ``` If the output is missing `Created` or `Started` entries for -`opentelemetry-auto-instrumentation`, there might be an issue with -your auto-instrumentation configuration. 
This can be the result of any of the +`opentelemetry-auto-instrumentation`, there might be an issue with your +auto-instrumentation configuration. This can be the result of any of the following: -- The `Instrumentation` resource wasn’t installed or wasn’t installed - properly. +- The `Instrumentation` resource wasn’t installed or wasn’t installed properly. - The `Instrumentation` resource was installed after the application was deployed. -- There’s an error in the auto-instrumentation annotation, or the annotation is in - the wrong spot. See the next section. +- There’s an error in the auto-instrumentation annotation, or the annotation is + in the wrong spot. See the next section. You might also want to check the output of the events command for any errors, as these might help point to your issue. +### Check the auto-instrumentation annotation + +Consider the following auto-instrumentation annotation snippet: + +```yaml +annotations: + instrumentation.opentelemetry.io/inject-python: 'true' +``` + +If your `Deployment` resource is deployed to a namespace called `application` +and you have an `Instrumentation` resource called `my-instrumentation` which is +deployed to a namespace called `opentelemetry`, then the above annotation will +not work. + +Instead, the annotation should be: + +```yaml +annotations: + instrumentation.opentelemetry.io/opentelemetry/inject-python: 'opentelemetry/my-instrumentation' +``` + +Where `opentelemetry` is the namesapce of the `Instrumentation` resource, and +`my-instrumentation` is the name of the `Instrumentation` resource. + ### Check the auto-instrumentation configuration -The auto-instrumentation annotation might have not been added -correctly. Check for the following: +The auto-instrumentation annotation might have not been added correctly. Check +for the following: -- Are you auto-instrumenting for the right language? For example, did you - try to auto-instrument a Python application by adding a JavaScript +- Are you auto-instrumenting for the right language? For example, did you try to + auto-instrument a Python application by adding a JavaScript auto-instrumentation annotation instead? - Did you put the auto-instrumentation annotation in the right location? When - you’re defining a `Deployment` resource, there are two locations where you could - add annotations: `spec.metadata.annotations`, and + you’re defining a `Deployment` resource, there are two locations where you + could add annotations: `spec.metadata.annotations`, and `spec.template.metadata.annotations`. The auto-instrumentation annotation - needs to be added to `spec.template.metadata.annotations`, otherwise it doesn't - work. + needs to be added to `spec.template.metadata.annotations`, otherwise it + doesn't work. ### Check auto-instrumentation endpoint configuration @@ -154,16 +176,19 @@ The `spec.exporter.endpoint` configuration in the `Instrumentation` resource allows you to define the destination for your telemetry data. If you omit it, it defaults to `http://localhost:4317`, which causes the data to be dropped. -If you’re sending out your telemetry to a [Collector](/docs/collector/), -the value of `spec.exporter.endpoint` must reference the name of your -Collector +If you’re sending out your telemetry to a [Collector](/docs/collector/), the +value of `spec.exporter.endpoint` must reference the name of your Collector [`Service`](https://kubernetes.io/docs/concepts/services-networking/service/). For example: `http://otel-collector.opentelemetry.svc.cluster.local:4318`. 
-Where `otel-collector` is the name of the OTel Collector Kubernetes [`Service`](https://kubernetes.io/docs/concepts/services-networking/service/).
+Where `otel-collector` is the name of the OTel Collector Kubernetes
+[`Service`](https://kubernetes.io/docs/concepts/services-networking/service/).
 
-In addition, if the Collector is running in a different namespace, you must append `opentelemetry.svc.cluster.local` to the Collector’s service name, where `opentelemetry` is the namespace in which the Collector resides. It can be any namespace of your choosing.
+In addition, if the Collector is running in a different namespace, you must
+append `opentelemetry.svc.cluster.local` to the Collector’s service name, where
+`opentelemetry` is the namespace in which the Collector resides. It can be any
+namespace of your choosing.
 
 Finally, make sure that you are using the right Collector port. Normally, you
 can choose either `4317` (gRPC) or `4318` (HTTP); however, for

From dec09d83035244a9e956bff0ea77f5a6be442cae Mon Sep 17 00:00:00 2001
From: Severin Neumann
Date: Fri, 28 Jun 2024 10:52:10 +0200
Subject: [PATCH 7/9] Apply suggestions from code review
---
 .../en/docs/kubernetes/operator/troubleshooting/automatic.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/content/en/docs/kubernetes/operator/troubleshooting/automatic.md b/content/en/docs/kubernetes/operator/troubleshooting/automatic.md
index 2425769d930d..8e9003e713b3 100644
--- a/content/en/docs/kubernetes/operator/troubleshooting/automatic.md
+++ b/content/en/docs/kubernetes/operator/troubleshooting/automatic.md
@@ -152,7 +152,7 @@ annotations:
   instrumentation.opentelemetry.io/opentelemetry/inject-python: 'opentelemetry/my-instrumentation'
 ```
 
-Where `opentelemetry` is the namesapce of the `Instrumentation` resource, and
+Where `opentelemetry` is the namespace of the `Instrumentation` resource, and
 `my-instrumentation` is the name of the `Instrumentation` resource.

From 0b950a75b8d685359a2d3be6bc8b9699b7ee34d8 Mon Sep 17 00:00:00 2001
From: Adriana Villela <50256412+avillela@users.noreply.github.com>
Date: Mon, 15 Jul 2024 17:47:24 -0400
Subject: [PATCH 8/9] Incorporate suggestions from @pavolloffay

---
 .../operator/troubleshooting/automatic.md     | 150 +++++++++++++++++-
 static/refcache.json                          |  12 ++
 2 files changed, 159 insertions(+), 3 deletions(-)

diff --git a/content/en/docs/kubernetes/operator/troubleshooting/automatic.md b/content/en/docs/kubernetes/operator/troubleshooting/automatic.md
index 8e9003e713b3..d9b678c7e598 100644
--- a/content/en/docs/kubernetes/operator/troubleshooting/automatic.md
+++ b/content/en/docs/kubernetes/operator/troubleshooting/automatic.md
@@ -1,5 +1,6 @@
 ---
 title: Auto-instrumentation
+cSpell:ignore: PYTHONPATH
 ---
 
 If you're using the [OpenTelemetry Operator](/docs/kubernetes/operator)'s
@@ -98,6 +99,125 @@ auto-instrumentation into the pod. It adds an
 called `opentelemetry-auto-instrumentation` to the application’s pod, which is
 then used to inject the auto-instrumentation into the app container.
 
+Which you can see when you run:
+
+```shell
+kubectl describe pod -n <your_app_namespace>
+```
+
+Where `<your_app_namespace>` is the namespace in which your pod is deployed. 
The
+resulting output should look like the following example, which shows what the
+pod spec may look like after auto-instrumentation injection:
+
+```text
+Name: py-otel-server-f89fdbc4f-mtsps
+Namespace: opentelemetry
+Priority: 0
+Service Account: default
+Node: otel-target-allocator-talk-control-plane/172.24.0.2
+Start Time: Mon, 15 Jul 2024 17:23:45 -0400
+Labels: app=my-app
+  app.kubernetes.io/name=py-otel-server
+  pod-template-hash=f89fdbc4f
+Annotations: instrumentation.opentelemetry.io/inject-python: true
+Status: Running
+IP: 10.244.0.10
+IPs:
+  IP: 10.244.0.10
+Controlled By: ReplicaSet/py-otel-server-f89fdbc4f
+Init Containers:
+  opentelemetry-auto-instrumentation-python:
+    Container ID: containerd://20ecf8766247e6043fcad46544dba08c3ef534ee29783ca552d2cf758a5e3868
+    Image: ghcr.io/open-telemetry/opentelemetry-operator/autoinstrumentation-python:0.45b0
+    Image ID: ghcr.io/open-telemetry/opentelemetry-operator/autoinstrumentation-python@sha256:3ed1122e10375d527d84c826728f75322d614dfeed7c3a8d2edd0d391d0e7973
+    Port: <none>
+    Host Port: <none>
+    Command:
+      cp
+      -r
+      /autoinstrumentation/.
+      /otel-auto-instrumentation-python
+    State: Terminated
+      Reason: Completed
+      Exit Code: 0
+      Started: Mon, 15 Jul 2024 17:23:51 -0400
+      Finished: Mon, 15 Jul 2024 17:23:51 -0400
+    Ready: True
+    Restart Count: 0
+    Limits:
+      cpu: 500m
+      memory: 32Mi
+    Requests:
+      cpu: 50m
+      memory: 32Mi
+    Environment: <none>
+    Mounts:
+      /otel-auto-instrumentation-python from opentelemetry-auto-instrumentation-python (rw)
+      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-x2nmj (ro)
+Containers:
+  py-otel-server:
+    Container ID: containerd://95fb6d06b08ead768f380be2539a93955251be6191fa74fa2e6e5616036a8f25
+    Image: otel-target-allocator-talk:0.1.0-py-otel-server
+    Image ID: docker.io/library/import-2024-07-15@sha256:a2ed39e9a39ca090fedbcbd474c43bac4f8c854336a8500e874bd5b577e37c25
+    Port: 8082/TCP
+    Host Port: 0/TCP
+    State: Running
+      Started: Mon, 15 Jul 2024 17:23:52 -0400
+    Ready: True
+    Restart Count: 0
+    Environment:
+      OTEL_NODE_IP: (v1:status.hostIP)
+      OTEL_POD_IP: (v1:status.podIP)
+      OTEL_METRICS_EXPORTER: console,otlp_proto_http
+      OTEL_LOGS_EXPORTER: otlp_proto_http
+      OTEL_PYTHON_LOGGING_AUTO_INSTRUMENTATION_ENABLED: true
+      PYTHONPATH: /otel-auto-instrumentation-python/opentelemetry/instrumentation/auto_instrumentation:/otel-auto-instrumentation-python
+      OTEL_TRACES_EXPORTER: otlp
+      OTEL_EXPORTER_OTLP_TRACES_PROTOCOL: http/protobuf
+      OTEL_EXPORTER_OTLP_METRICS_PROTOCOL: http/protobuf
+      OTEL_SERVICE_NAME: py-otel-server
+      OTEL_EXPORTER_OTLP_ENDPOINT: http://otelcol-collector.opentelemetry.svc.cluster.local:4318
+      OTEL_RESOURCE_ATTRIBUTES_POD_NAME: py-otel-server-f89fdbc4f-mtsps (v1:metadata.name)
+      OTEL_RESOURCE_ATTRIBUTES_NODE_NAME: (v1:spec.nodeName)
+      OTEL_PROPAGATORS: tracecontext,baggage
+      OTEL_RESOURCE_ATTRIBUTES: service.name=py-otel-server,service.version=0.1.0,k8s.container.name=py-otel-server,k8s.deployment.name=py-otel-server,k8s.namespace.name=opentelemetry,k8s.node.name=$(OTEL_RESOURCE_ATTRIBUTES_NODE_NAME),k8s.pod.name=$(OTEL_RESOURCE_ATTRIBUTES_POD_NAME),k8s.replicaset.name=py-otel-server-f89fdbc4f,service.instance.id=opentelemetry.$(OTEL_RESOURCE_ATTRIBUTES_POD_NAME).py-otel-server
+    Mounts:
+      /otel-auto-instrumentation-python from opentelemetry-auto-instrumentation-python (rw)
+      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-x2nmj (ro)
+Conditions:
+  Type Status
+  Initialized True
+  Ready True
+  ContainersReady True
+  PodScheduled True
+Volumes:
+  kube-api-access-x2nmj:
+    Type: 
Projected (a volume that contains injected data from multiple sources)
+    TokenExpirationSeconds: 3607
+    ConfigMapName: kube-root-ca.crt
+    ConfigMapOptional: <nil>
+    DownwardAPI: true
+  opentelemetry-auto-instrumentation-python:
+    Type: EmptyDir (a temporary directory that shares a pod's lifetime)
+    Medium:
+    SizeLimit: 200Mi
+QoS Class: Burstable
+Node-Selectors: <none>
+Tolerations: node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
+  node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
+Events:
+  Type Reason Age From Message
+  ---- ------ ---- ---- -------
+  Normal Scheduled 99s default-scheduler Successfully assigned opentelemetry/py-otel-server-f89fdbc4f-mtsps to otel-target-allocator-talk-control-plane
+  Normal Pulling 99s kubelet Pulling image "ghcr.io/open-telemetry/opentelemetry-operator/autoinstrumentation-python:0.45b0"
+  Normal Pulled 93s kubelet Successfully pulled image "ghcr.io/open-telemetry/opentelemetry-operator/autoinstrumentation-python:0.45b0" in 288.756166ms (5.603779501s including waiting)
+  Normal Created 93s kubelet Created container opentelemetry-auto-instrumentation-python
+  Normal Started 93s kubelet Started container opentelemetry-auto-instrumentation-python
+  Normal Pulled 92s kubelet Container image "otel-target-allocator-talk:0.1.0-py-otel-server" already present on machine
+  Normal Created 92s kubelet Created container py-otel-server
+  Normal Started 92s kubelet Started container py-otel-server
+```
+
 If the `Instrumentation` resource isn’t present by the time the `Deployment` is
 deployed, the `init-container` can’t be created. This means that if the
 `Deployment` resource is deployed before you deploy the `Instrumentation`
 resource, the auto-instrumentation fails to initialize.
@@ -107,10 +227,11 @@ Check that the `opentelemetry-auto-instrumentation` `init-container` has started
 up correctly (or has even started up at all), by running the following command:
 
 ```shell
-kubectl get events -n <namespace>
+kubectl get events -n <your_app_namespace>
 ```
 
-Which should result in output that looks like the following example:
+Where `<your_app_namespace>` is the namespace in which your pod is deployed. The
+resulting output should look like the following example:
 
 ```text
 53s Normal Created pod/py-otel-server-7f54bf4cbc-p8wmj Created container opentelemetry-auto-instrumentation
@@ -149,12 +270,21 @@ Instead, the annotation should be:
 ```yaml
 annotations:
-  instrumentation.opentelemetry.io/opentelemetry/inject-python: 'opentelemetry/my-instrumentation'
+  instrumentation.opentelemetry.io/inject-python: 'opentelemetry/my-instrumentation'
 ```
 
 Where `opentelemetry` is the namespace of the `Instrumentation` resource, and
 `my-instrumentation` is the name of the `Instrumentation` resource.
 
+[The possible values for the annotation can be](https://github.com/open-telemetry/opentelemetry-operator/blob/main/README.md?plain=1#L151-L156):
+
+- "true" - inject `Instrumentation` resource from the namespace.
+- "my-instrumentation" - name of `Instrumentation` CR instance in the current
+  namespace.
+- "my-other-namespace/my-instrumentation" - name and namespace of
+  `Instrumentation` CR instance in another namespace.
+- "false" - do not inject.
+
 ### Check the auto-instrumentation configuration
 
 The auto-instrumentation annotation might have not been added correctly. Check
 for the following:
@@ -193,3 +323,17 @@ namespace of your choosing.
 Finally, make sure that you are using the right Collector port. Normally, you
 can choose either `4317` (gRPC) or `4318` (HTTP); however, for
 [Python auto-instrumentation, you can only use `4318`](/docs/kubernetes/operator/automatic/#python).
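+
+To tie the endpoint pieces together, here is a minimal `Instrumentation`
+sketch; the resource name, namespaces, and `Service` name are examples, not
+required values:
+
+```yaml
+apiVersion: opentelemetry.io/v1alpha1
+kind: Instrumentation
+metadata:
+  name: python-instrumentation
+  namespace: application
+spec:
+  exporter:
+    # <service-name>.<collector-namespace>.svc.cluster.local:<port>
+    endpoint: http://otel-collector.opentelemetry.svc.cluster.local:4318
+```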
+
+### Check configuration sources
+
+Auto-instrumentation currently overrides Java's `JAVA_TOOL_OPTIONS`, Python's
+`PYTHONPATH`, and Node.js's `NODE_OPTIONS` when they are set in a Docker image
+or defined in a `ConfigMap`. This is a known issue, and as a result, these ways
+of setting these environment variables should be avoided until the issue is
+resolved.
+
+See reference issues for
+[Java](https://github.com/open-telemetry/opentelemetry-operator/issues/1814),
+[Python](https://github.com/open-telemetry/opentelemetry-operator/issues/1884),
+and
+[NodeJS](https://github.com/open-telemetry/opentelemetry-operator/issues/1393).
diff --git a/static/refcache.json b/static/refcache.json
index 873f53192e86..d9afeafefca5 100644
--- a/static/refcache.json
+++ b/static/refcache.json
@@ -4007,6 +4007,18 @@
     "StatusCode": 200,
     "LastSeen": "2024-01-18T19:37:11.461365-05:00"
   },
+  "https://github.com/open-telemetry/opentelemetry-operator/issues/1393": {
+    "StatusCode": 200,
+    "LastSeen": "2024-07-15T17:37:20.027422-04:00"
+  },
+  "https://github.com/open-telemetry/opentelemetry-operator/issues/1814": {
+    "StatusCode": 200,
+    "LastSeen": "2024-07-15T17:37:14.803673-04:00"
+  },
+  "https://github.com/open-telemetry/opentelemetry-operator/issues/1884": {
+    "StatusCode": 200,
+    "LastSeen": "2024-07-15T17:37:15.74053-04:00"
+  },
   "https://github.com/open-telemetry/opentelemetry-operator/issues/1906": {
     "StatusCode": 200,
     "LastSeen": "2024-05-13T07:25:18.846726619Z"

From 3de06064b09378b2af03105c141c43f545b61a75 Mon Sep 17 00:00:00 2001
From: Adriana Villela <50256412+avillela@users.noreply.github.com>
Date: Mon, 15 Jul 2024 17:51:20 -0400
Subject: [PATCH 9/9] Fix linting issues

---
 .../en/docs/kubernetes/operator/troubleshooting/automatic.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/content/en/docs/kubernetes/operator/troubleshooting/automatic.md b/content/en/docs/kubernetes/operator/troubleshooting/automatic.md
index d9b678c7e598..577a4782e8d7 100644
--- a/content/en/docs/kubernetes/operator/troubleshooting/automatic.md
+++ b/content/en/docs/kubernetes/operator/troubleshooting/automatic.md
@@ -336,4 +336,4 @@ See reference issues for
 [Java](https://github.com/open-telemetry/opentelemetry-operator/issues/1814),
 [Python](https://github.com/open-telemetry/opentelemetry-operator/issues/1884),
 and
-[NodeJS](https://github.com/open-telemetry/opentelemetry-operator/issues/1393).
+[Node.js](https://github.com/open-telemetry/opentelemetry-operator/issues/1393).
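+
+If your application needs one of these environment variables, a possible
+workaround (an assumption based on the issues above, not an official
+recommendation) is to set the variable directly in the container's `env`
+section of the `Deployment`, where the Operator can see it, rather than in
+the image or in a `ConfigMap`:
+
+```yaml
+# Hypothetical example: set NODE_OPTIONS in the pod spec instead of the image
+env:
+  - name: NODE_OPTIONS
+    value: '--max-old-space-size=4096'
+```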