- Set up namespace
- Create ConfigMap
- Create the Deployment
- Expected Outcome after running this sample
- Try it out yourself (Optional)
If this is the first example you are trying out, follow the Setup instructions to complete the prerequisites.
NOTE: As a reminder, if you completed the prerequisite steps successfully, the following requirements should have already been met -
- You should see the image name in `manifest.yaml` updated with the fully qualified Docker image name.
- The above-mentioned Docker image has been uploaded to Artifact Registry and is visible there in the Cloud Console.
If the above requirements are not met, please ensure that all the Setup instructions have been followed. You may need to perform a local build again.
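If you want to double-check, something like the following should list the uploaded image (the repository path is a placeholder; use the location and repository from your Setup steps):

```sh
gcloud artifacts docker images list <location>-docker.pkg.dev/<project_id>/<repository>
```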
Create a namespace in your cluster to run the collector:
```sh
export OTEL_NAMESPACE=otel-collector
kubectl create namespace $OTEL_NAMESPACE
```
The `otel-config.yaml` file contains a sample OpenTelemetry Collector config that is prepopulated with some of the receivers, exporters, and processors included in this project. Edit it to your desired configuration and create a ConfigMap from it in the namespace you created above:
```sh
cd deploy/gke/simple/
kubectl create configmap otel-config --from-file=./otel-config.yaml -n $OTEL_NAMESPACE
```
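As an optional sanity check, confirm the ConfigMap now exists in the namespace:

```sh
kubectl describe configmap otel-config -n $OTEL_NAMESPACE
```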
Create a Kubernetes service account to associate with your collector:

```sh
kubectl create serviceaccount otel-collector -n $OTEL_NAMESPACE
```
If your GKE cluster has Workload Identity enabled, which is on by default on GKE Autopilot, you will need to grant the OTel Collector's ServiceAccount the permission to send telemetry to Google Cloud. If you are not using Workload Identity, you can skip this section. Without Workload Identity, the collector will inherit the GKE Node's IAM permissions, which already grant the ability to write telemetry.
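If you are unsure whether Workload Identity is enabled, one way to check is the following (cluster name and location are placeholders; an empty result means it is not enabled):

```sh
gcloud container clusters describe <cluster_name> --location=<location> \
  --format="value(workloadIdentityConfig.workloadPool)"
```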
Set up your environment. If your service account and GKE cluster are in the same project, these two values are the same:

```sh
export PROJECT_ID=<your Google Cloud project ID>
export SERVICE_ACCOUNT_PROJECT=$PROJECT_ID
```
Create the Google Cloud service account:
```sh
gcloud iam service-accounts create otel-collector --project=${SERVICE_ACCOUNT_PROJECT}
```
Grant it permissions to write telemetry:
```sh
gcloud projects add-iam-policy-binding $PROJECT_ID \
  --member "serviceAccount:otel-collector@${SERVICE_ACCOUNT_PROJECT}.iam.gserviceaccount.com" \
  --role "roles/logging.logWriter"
gcloud projects add-iam-policy-binding $PROJECT_ID \
  --member "serviceAccount:otel-collector@${SERVICE_ACCOUNT_PROJECT}.iam.gserviceaccount.com" \
  --role "roles/cloudtrace.agent"
gcloud projects add-iam-policy-binding $PROJECT_ID \
  --member "serviceAccount:otel-collector@${SERVICE_ACCOUNT_PROJECT}.iam.gserviceaccount.com" \
  --role "roles/monitoring.metricWriter"
```
Grant the Kubernetes service account permission to act as the IAM service account:

```sh
gcloud iam service-accounts add-iam-policy-binding "otel-collector@${SERVICE_ACCOUNT_PROJECT}.iam.gserviceaccount.com" \
  --role roles/iam.workloadIdentityUser \
  --member "serviceAccount:${SERVICE_ACCOUNT_PROJECT}.svc.id.goog[${OTEL_NAMESPACE}/otel-collector]"
```
Annotate the Kubernetes service account to complete the setup:

```sh
kubectl annotate serviceaccount otel-collector \
  --namespace $OTEL_NAMESPACE \
  iam.gke.io/gcp-service-account=otel-collector@${SERVICE_ACCOUNT_PROJECT}.iam.gserviceaccount.com
```
NOTE: If you see permission denied errors, try deleting the collector pod to force it to pick up changes to the service account.
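For example (the pod name is a placeholder; the Deployment will recreate the pod automatically):

```sh
kubectl get pods -n $OTEL_NAMESPACE
kubectl delete pod <pod_name> -n $OTEL_NAMESPACE
```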
Create this manifest in your cluster with:
```sh
kubectl apply -f manifest.yaml -n $OTEL_NAMESPACE
```
After creating the deployment, you should verify that all pods created as part of the deployment are running -
```sh
kubectl get deployments -n $OTEL_NAMESPACE
```
If the pods are not running, try using `kubectl describe` on the failing pods to get the exact cause of the failure. You can also use `kubectl logs` to check the logs of the failing pod containers to pinpoint the cause.
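For example (the pod name is a placeholder taken from the `kubectl get pods` output):

```sh
kubectl get pods -n $OTEL_NAMESPACE
kubectl describe pod <pod_name> -n $OTEL_NAMESPACE
kubectl logs <pod_name> -n $OTEL_NAMESPACE
```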
See the troubleshooting guide for more information on some of the most common issues, such as authentication-related issues.
After a successful deployment of this sample, you have a GKE cluster with an OpenTelemetry Collector running on it. The collector is configured using the `otel-config` file.

Since there is no application currently running on the cluster, the collector is not receiving any telemetry data. The current configuration in the `otel-config` file does, however, scrape the collector itself for some metrics (see the `prometheus/self` receiver declared under `receivers`) and exports these to stdout and Google Cloud.
- To check for metrics being exported to stdout, run `kubectl logs <pod_name> -n $OTEL_NAMESPACE`.
- To check for metrics being exported to Google Cloud, open Metrics Explorer in the Google Cloud Console. Many metrics are emitted by the OpenTelemetry Collector itself, and most of them start with the prefix `otelcol`; you can search for this string in Metrics Explorer.
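As a quick sanity check from the command line, you can grep the stdout exporter output for these self-metrics (the pod name is a placeholder):

```sh
kubectl logs <pod_name> -n $OTEL_NAMESPACE | grep -i otelcol
```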
NOTE: You can change the configuration for the OpenTelemetry Collector to alter its behaviour.
In case you are not seeing the expected outcome or are running into errors, look at the troubleshooting guide for more information.
Thus far, we've set up an OpenTelemetry collector configured to scrape itself and send metrics to the configured exporters. In this section we will introduce external source(s) of telemetry data to see how the collector operates on these sources. The sources will include data representing metrics, logs and traces.
You should be able to follow all the steps mentioned in this README up to and including Verify the Deployment. This ensures that there are no permissions issues and that you are able to successfully run a GKE cluster and connect to it.
The example uses JSON file(s) containing telemetry data in OTLP format as the source of telemetry data. These files can contain metrics, traces, or logs - but each file should contain only a single type of telemetry data. Sample files are provided in the otlp-data folder. The collector running in the cluster will read these files and treat the data in them as if it were coming from a running application.
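For reference, a minimal OTLP JSON log payload looks roughly like this (a hand-written sketch to illustrate the shape; the provided sample files are more complete):

```json
{"resourceLogs":[{"resource":{"attributes":[{"key":"service.name","value":{"stringValue":"test-service"}}]},"scopeLogs":[{"logRecords":[{"timeUnixNano":"1700000000000000000","severityText":"INFO","body":{"stringValue":"hello from an OTLP JSON file"}}]}]}]}
```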
Before continuing, update the JSON files within the otlp-data folder with recent timestamps. This ensures that the telemetry data exported by the collector is recent enough to show up in the Google Cloud Console -
```sh
# Update logs JSON file
make update-timestamp-logs

# Update metrics JSON file
make update-timestamp-metrics

# Update traces JSON file
make update-timestamp-traces

# Shortcut command to update all the files
make update-timestamp-all
```
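Under the hood, these targets just need to rewrite the `timeUnixNano` fields to the current time; a rough shell equivalent (a sketch of the idea, not the actual Makefile recipe) would be:

```sh
# Rewrite every timeUnixNano field in the metrics file to the current time in
# nanoseconds (assumes GNU date and GNU sed; the actual Makefile may differ)
NOW_NANOS=$(date +%s%N)
sed -i -E "s/\"timeUnixNano\": ?\"[0-9]+\"/\"timeUnixNano\": \"${NOW_NANOS}\"/g" otlp-data/testdata-metrics.json
```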
We need to update the collector configuration in the `otel-config.yaml` file to add a receiver that can read telemetry data from the added JSON files.
Under the `receivers` section in the config file, add the following configuration -
```yaml
otlpjsonfile:
  start_at: "beginning"
  include:
    - "/mnt/testdata/metrics/*.json"
    - "/mnt/testdata/traces/*.json"
    - "/mnt/testdata/logs/*.json"
```
For more details about this particular receiver, check otlpjsonfilereceiver.
NOTE: The path in `include` points to where the file will be mounted within the Kubernetes cluster environment and is therefore different from where it is present on your local machine. More information on this in the next section.
Next, add this receiver to the traces, metrics, and logs pipelines, so your pipelines look like -

```yaml
traces:
  receivers: [otlp, otlpjsonfile]
  processors: [memory_limiter, batch, resourcedetection/gke]
  exporters: [googlecloud, logging]
metrics:
  receivers: [otlp, prometheus/self, otlpjsonfile]
  processors: [memory_limiter, batch, resourcedetection/gke]
  exporters: [googlecloud, logging]
logs:
  receivers: [otlp, otlpjsonfile]
  processors: [memory_limiter, batch, resourcedetection/gke]
  exporters: [googlecloud, logging]
```
For exporting logs to Google Cloud Platform, we need to further configure the collector with the log name and the GCP project ID. The project ID is used by the cloud exporter to create GCP log entries.
Add the following configuration to the `googlecloud` exporter -
```yaml
googlecloud:
  retry_on_failure:
    enabled: false
  log:
    default_log_name: otel-collector-builder-sample/gke-simple-demo # This could be anything
```
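If the collector cannot auto-detect the project ID from the environment, the `googlecloud` exporter also accepts it explicitly via its `project` option; for example (the project ID below is a placeholder):

```yaml
googlecloud:
  project: my-gcp-project-id # placeholder; replace with your GCP project ID
  retry_on_failure:
    enabled: false
  log:
    default_log_name: otel-collector-builder-sample/gke-simple-demo
```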
NOTE: If you are unsure about where exactly these snippets should be placed in the collector config file, check out otel-config-sample.yaml for reference.
Simply adding the JSON files within the directory does not give the running cluster access to these files. In order to access them within the Kubernetes environment, we will mount them as Kubernetes ConfigMaps for a new deployment -
- Make sure that there are no current deployments by running `kubectl get deployments -n $OTEL_NAMESPACE`. If there are active deployments, delete them using `kubectl delete`.
- Since we also updated our collector configuration, we will need to recreate the `otel-config` ConfigMap too. Delete the old ConfigMaps, if any, before proceeding.
  - To delete ConfigMaps, you can use `kubectl delete configmaps <ConfigMap name> -n $OTEL_NAMESPACE`.
- Recreate the `otel-config` ConfigMap with the updated configuration -
  ```sh
  kubectl create configmap otel-config --from-file=./otel-config.yaml -n $OTEL_NAMESPACE
  ```
- Create new ConfigMaps - one for each kind of telemetry data. These will be used to mount our test data in the Kubernetes cluster as volumes -
  - For metrics, create a ConfigMap named `otlp-test-data-metrics` -
    ```sh
    kubectl create configmap otlp-test-data-metrics --from-file=./otlp-data/testdata-metrics.json -n $OTEL_NAMESPACE
    ```
  - For traces, create a ConfigMap named `otlp-test-data-traces` -
    ```sh
    kubectl create configmap otlp-test-data-traces --from-file=./otlp-data/testdata-traces.json -n $OTEL_NAMESPACE
    ```
  - For logs, create a ConfigMap named `otlp-test-data-logs` -
    ```sh
    kubectl create configmap otlp-test-data-logs --from-file=./otlp-data/testdata-logs.json -n $OTEL_NAMESPACE
    ```
- Update the manifest.yaml to add `volume` and `volumeMount` configurations for the newly created ConfigMaps.
  - Update the `spec.volumes` section to add new `configMap` entries -
    ```yaml
    volumes:
      - configMap:
          name: otel-config
        name: otel-collector-config-vol
      # Add the following volume configurations
      - configMap:
          name: otlp-test-data-metrics # Volume for test metrics data
        name: test-data-vol-metrics # This could be changed to anything
      - configMap:
          name: otlp-test-data-traces # Volume for test traces data
        name: test-data-vol-traces # This could be changed to anything
      - configMap:
          name: otlp-test-data-logs # Volume for test logs data
        name: test-data-vol-logs # This could be changed to anything
    ```
  - Update the `spec.containers.volumeMounts` section to configure new volumeMount(s) for the added volumes -
    ```yaml
    volumeMounts:
      - name: otel-collector-config-vol
        mountPath: /conf
      # Add the following volumeMount configurations
      - name: test-data-vol-metrics # Should match the volume name in spec.volumes
        mountPath: /mnt/testdata/metrics # Where the metrics ConfigMap file will be mounted in the cluster
      - name: test-data-vol-traces # Should match the volume name in spec.volumes
        mountPath: /mnt/testdata/traces # Where the traces ConfigMap file will be mounted in the cluster
      - name: test-data-vol-logs # Should match the volume name in spec.volumes
        mountPath: /mnt/testdata/logs # Where the logs ConfigMap file will be mounted in the cluster
    ```
- Deploy the collector using the new deployment manifest -
  ```sh
  kubectl apply -f manifest.yaml -n $OTEL_NAMESPACE
  ```
- Verify that telemetry data is now being emitted by using `kubectl logs` -
  ```sh
  # Get the pod name(s)
  kubectl get pods -n $OTEL_NAMESPACE

  # Get the logs from a pod
  kubectl logs <pod_container_name> -n $OTEL_NAMESPACE
  ```
In case there are issues, follow the steps in the troubleshooting guide to verify the presence and contents of the testdata files in the cluster.
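For a quick first check, you can list the mounted files directly from the running pod (the pod name is a placeholder; get it from `kubectl get pods`):

```sh
kubectl exec <pod_name> -n $OTEL_NAMESPACE -- ls /mnt/testdata/metrics /mnt/testdata/traces /mnt/testdata/logs
```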
NOTE: You will need to update the timestamps in testdata-metrics to be within 24 hours of the current time, otherwise the metrics will not show up in the Google Cloud Console's Metrics Explorer. You might also need to update the timestamps in the other testdata files in case they become old enough to not be recognized by Google Cloud.
At this point, we have several files acting as sources of telemetry data and a receiver configured within the OpenTelemetry Collector that watches these files and picks up any updates made to them. The collector also reads telemetry data from the files once, when they are initially mounted.
So there should already be telemetry data emitted and caught by the collector. The current collector configuration has two exporters to which this telemetry is exported -

- `googlecloud` - Log into your Google Cloud Console, then use the Trace List to look for traces, Metrics Explorer for metrics, or Cloud Logging for logs.
- `logging` - To view the telemetry data on stdout for the pod container, run `kubectl logs <pod_container_name> -n $OTEL_NAMESPACE`.
You might want to update the telemetry data being received by the collector to add new traces, metrics, or logs, or to update other attributes. This requires a change to the ConfigMap that was created from the test data file. To do this, you need to -
- Update the telemetry data in the desired data file - testdata-metrics.json, testdata-traces.json. Make sure that the JSON is not pretty-printed and remains minified.
- Update the corresponding ConfigMap. For instance, if you made changes to the `testdata-traces.json` file, you need to update `otlp-test-data-traces` -
  ```sh
  kubectl create configmap otlp-test-data-traces --from-file=./otlp-data/testdata-traces.json -n $OTEL_NAMESPACE --dry-run=client -o yaml | kubectl apply -f -
  ```
To check if the ConfigMap was updated, use the following command with the desired ConfigMap name -

```sh
kubectl describe configmaps otlp-test-data-traces -n $OTEL_NAMESPACE
```
NOTE: It may take some time (usually a few seconds) for the config file to be updated. Once the file(s) are updated within the cluster, changes in the telemetry output will be visible on all configured exporters. If they are not visible after some time, follow the steps in the troubleshooting guide to verify the contents of the file(s).
When you are done, you can clean up everything you've done with the following steps:
```sh
kubectl delete namespace $OTEL_NAMESPACE
```
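If you created the Google Cloud service account for Workload Identity earlier, you may also want to delete it:

```sh
gcloud iam service-accounts delete "otel-collector@${SERVICE_ACCOUNT_PROJECT}.iam.gserviceaccount.com" --project=${SERVICE_ACCOUNT_PROJECT}
```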