diff --git a/content/en/blog/2024/otel-collector-container-log-parser/index.md b/content/en/blog/2024/otel-collector-container-log-parser/index.md
index 295f5772943d..bde20a571523 100644
--- a/content/en/blog/2024/otel-collector-container-log-parser/index.md
+++ b/content/en/blog/2024/otel-collector-container-log-parser/index.md
@@ -1,13 +1,14 @@
 ---
 title: Introducing the new container log parser for OpenTelemetry Collector
-linkTitle: OTel Collector container log parser
+linkTitle: Collector container log parser
 date: 2024-05-16
 author: '[Christos Markou](https://github.com/ChrsMark) (Elastic)'
 cSpell:ignore: Christos containerd Filelog filelog Jaglowski kube Markou
 ---
 
-Filelog receiver is one of the most commonly used components of the
-OpenTelemetry Collector, as indicated by the most recent
+[Filelog receiver](https://github.com/open-telemetry/opentelemetry-collector-contrib/tree/main/receiver/filelogreceiver)
+is one of the most commonly used components of the
+[OpenTelemetry Collector](/docs/collector), as indicated by the most recent
 [survey](/blog/2024/otel-collector-survey/#otel-components-usage). According to
 the same survey, it's unsurprising that
 [Kubernetes is the leading platform for Collector deployment (80.6%)](/blog/2024/otel-collector-survey/#deployment-scale-and-environment).
@@ -41,30 +42,27 @@ in container log parsing.
 First of all we need to quickly recall the different container log formats that
 we can meet out there:
 
-### Docker container logs
-
-`{"log":"INFO: This is a docker log line","stream":"stdout","time":"2024-03-30T08:31:20.545192187Z"}`
-
-### cri-o logs
-
-`2024-04-13T07:59:37.505201169-05:00 stdout F This is a cri-o log line!`
-
-### Containerd logs
-
-`2024-04-22T10:27:25.813799277Z stdout F This is an awesome containerd log line!`
+- Docker container logs:
+  `{"log":"INFO: This is a docker log line","stream":"stdout","time":"2024-03-30T08:31:20.545192187Z"}`
+- cri-o logs:
+  `2024-04-13T07:59:37.505201169-05:00 stdout F This is a cri-o log line!`
+- Containerd logs:
+  `2024-04-22T10:27:25.813799277Z stdout F This is an awesome containerd log line!`
 
 We can notice that cri-o and containerd log formats are quite similar (both
 follow the CRI logging format) but with a small difference in the timestamp
 format.
 
 Consequently, in order to properly handle these 3 different formats we need 3
-different routes of stanza operators as we can see in the
+different routes of
+[stanza](https://github.com/open-telemetry/opentelemetry-collector-contrib/tree/main/pkg/stanza)
+operators as we can see in the
 [container parser operator issue](https://github.com/open-telemetry/opentelemetry-collector-contrib/issues/31959).
 
 In addition, the CRI format can provide partial logs which we would like to
 combine them into one at first place:
 
-```console
+```text
 2024-04-06T00:17:10.113242941Z stdout P This is a very very long line th
 2024-04-06T00:17:10.113242941Z stdout P at is really really long and spa
 2024-04-06T00:17:10.113242941Z stdout F ns across multiple log entries
@@ -105,7 +103,7 @@ receivers:
 ```
 
 That configuration is more than enough to properly parse the log line and
-extract all the useful K8s metadata.
+extract all the useful Kubernetes metadata.
 
 A log line
 `{"log":"INFO: This is a docker log line","stream":"stdout","time":"2024-03-30T08:31:20.545192187Z"}`