Add decoupleafterbatch converter to ensure decouple processor follows batch processor (#1255)

* Always put decouple processor first in pipeline

* Add converter to derive processors from a base

* Remove scratchpad code

* implement rules and test

* update tests

* improve tests for reviewers

* fix toggle for append predicate

* Fix typo in function comment

* Document converter and auto-configuration

* Document converter and auto-configuration

* rm errant test

* Add tests to clarify decouple->batch ill-formed chain

* Fix typo in test case description

* Improve name of predicate/helper

* Update collector/processor/decoupleprocessor/README.md

Co-authored-by: Adam Charrett <[email protected]>

* gofmt -s -w .

* restructure tests to extend coverage

* go mod tidy

* Update collector/internal/confmap/converter/decoupleafterbatchconverter/README.md

Co-authored-by: Adam Charrett <[email protected]>

* Add auto-config explaination to Collector

---------

Co-authored-by: Adam Charrett <[email protected]>
nslaughter and adcharre authored Apr 22, 2024
1 parent c1d07ac commit dff9bc6
Showing 12 changed files with 331 additions and 77 deletions.
66 changes: 12 additions & 54 deletions collector/README.md
@@ -90,68 +90,26 @@ from an S3 object using a CloudFormation template:
Loading configuration from S3 will require that the IAM role attached to your function includes read access to the relevant bucket.
## Auto-Configuration
Configuring the Lambda Collector to use the batch processor without a decouple processor after it can lead to performance issues and potential data loss when the execution environment is frozen. The OpenTelemetry Lambda Layer therefore automatically adds the decouple processor to the end of the chain if the batch processor is used and the decouple processor is not.
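To illustrate, here is a minimal sketch of the rewrite the layer performs (receivers and exporters are omitted for brevity; only the `processors` list matters here):
```yaml
# Configuration as written (batch without a trailing decouple):
service:
  pipelines:
    traces:
      processors: [batch]

# Effective configuration after auto-configuration:
#   processors: [batch, decouple]
```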
# Improving Lambda response times
At the end of a lambda function's execution, the OpenTelemetry client libraries will flush any pending spans/metrics/logs
to the collector before returning control to the Lambda environment. The collector's pipelines are synchronous, which
means that the lambda function's response is delayed until the data has been exported. This delay can amount to
hundreds of milliseconds.
To overcome this problem, the [decouple](./processor/decoupleprocessor/README.md) processor can be used to separate the
two ends of the collector's pipeline and allow the lambda function to complete while ensuring that any data is exported
before the Lambda environment is frozen.
Below is a sample configuration that uses the decouple processor:
```yaml
receivers:
  otlp:
    protocols:
      grpc:

exporters:
  logging:
    loglevel: debug
  otlp:
    endpoint: { backend endpoint }

processors:
  decouple:

service:
  pipelines:
    traces:
      receivers: [otlp]
      processors: [decouple]
      exporters: [logging, otlp]
```
As noted in the Auto-Configuration section above, you don't need to add the decouple processor to your configuration manually.
## Reducing Lambda runtime
If your lambda function is invoked frequently, it is also possible to pair the decouple processor with the batch
processor to reduce total lambda execution time at the expense of delaying the export of OpenTelemetry data.
When used with the batch processor, the decouple processor must be the last processor in the pipeline to ensure that
data is successfully exported before the lambda environment is frozen.
An example use of the batch and decouple processors:
```yaml
receivers:
  otlp:
    protocols:
      grpc:

exporters:
  logging:
    loglevel: debug
  otlp:
    endpoint: { backend endpoint }

processors:
  decouple:
  batch:
    timeout: 5m

service:
  pipelines:
    traces:
      receivers: [otlp]
      processors: [batch, decouple]
      exporters: [logging, otlp]
```
As stated in the Auto-Configuration section above, the OpenTelemetry Lambda Layer will automatically add the decouple processor to the end of the processor chain if the batch processor is used and the decouple processor is not. The result is the same whether you configure it manually or rely on auto-configuration.
1 change: 1 addition & 0 deletions collector/go.mod
@@ -20,6 +20,7 @@ replace cloud.google.com/go => cloud.google.com/go v0.107.0

require (
	github.com/golang-collections/go-datastructures v0.0.0-20150211160725-59788d5eb259
	github.com/google/go-cmp v0.6.0
	github.com/open-telemetry/opentelemetry-collector-contrib/confmap/provider/s3provider v0.92.0
	github.com/open-telemetry/opentelemetry-lambda/collector/lambdacomponents v0.91.0
	github.com/open-telemetry/opentelemetry-lambda/collector/lambdalifecycle v0.0.0-00010101000000-000000000000
3 changes: 2 additions & 1 deletion collector/internal/collector/collector.go
@@ -32,6 +32,7 @@ import (
"go.uber.org/zap/zapcore"

"github.com/open-telemetry/opentelemetry-lambda/collector/internal/confmap/converter/disablequeuedretryconverter"
"github.com/open-telemetry/opentelemetry-lambda/collector/internal/confmap/converter/decoupleafterbatchconverter"
)

// Collector runs a single otelcol as a go routine within the
@@ -68,7 +69,7 @@ func NewCollector(logger *zap.Logger, factories otelcol.Factories, version strin
		ResolverSettings: confmap.ResolverSettings{
			URIs:       []string{getConfig(l)},
			Providers:  mapProvider,
			Converters: []confmap.Converter{expandconverter.New(), disablequeuedretryconverter.New()},
			Converters: []confmap.Converter{expandconverter.New(), disablequeuedretryconverter.New(), decoupleafterbatchconverter.New()},
		},
	}
	cfgProvider, err := otelcol.NewConfigProvider(cfgSet)
collector/internal/confmap/converter/decoupleafterbatchconverter/README.md
@@ -0,0 +1,11 @@
# DecoupleAfterBatch Converter

The `DecoupleAfterBatch` converter automatically modifies the collector's configuration for the Lambda distribution. Its purpose is to ensure that a decouple processor is always present after a batch processor in a pipeline, in order to prevent potential data loss due to the Lambda environment being frozen.

## Behavior

The converter scans the collector's configuration and makes the following adjustments:

1. If a pipeline contains a batch processor with no decouple processor defined after it, the converter will automatically add a decouple processor to the end of the pipeline.

2. If a decouple processor is already defined after the last batch processor, or the pipeline contains no batch processor at all, the converter leaves the pipeline configuration unchanged (both rules are illustrated in the example below).
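For example, a minimal sketch (using `processor1` as a stand-in for any other processor):
```yaml
service:
  pipelines:
    traces:
      # Rule 1: no decouple after the last batch, so "decouple" is appended.
      processors: [processor1, batch]   # becomes [processor1, batch, decouple]
    metrics:
      # Rule 2: decouple already follows batch, so the pipeline is left as is.
      processors: [batch, decouple]
```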
@@ -0,0 +1,118 @@
// Copyright The OpenTelemetry Authors
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.

// Package decoupleafterbatchconverter implements a confmap.Converter that mutates Collector
// configurations to ensure the decouple processor is placed after the batch processor.
// It does this by appending a decouple processor to the end of any processor chain in which
// the last batch processor is not already followed by a decouple processor.
package decoupleafterbatchconverter

import (
	"context"
	"fmt"
	"strings"

	"go.opentelemetry.io/collector/confmap"
)

const (
	serviceKey        = "service"
	pipelinesKey      = "pipelines"
	processorsKey     = "processors"
	batchProcessor    = "batch"
	decoupleProcessor = "decouple"
)

type converter struct{}

// New returns a confmap.Converter that ensures a decouple processor follows the batch processor in each pipeline.
func New() confmap.Converter {
	return &converter{}
}

func (c converter) Convert(_ context.Context, conf *confmap.Conf) error {
	serviceVal := conf.Get(serviceKey)
	service, ok := serviceVal.(map[string]interface{})
	if !ok {
		return nil
	}

	pipelinesVal, ok := service[pipelinesKey]
	if !ok {
		return nil
	}

	pipelines, ok := pipelinesVal.(map[string]interface{})
	if !ok {
		return nil
	}

	// accumulates updates over the pipelines and applies them
	// once all pipeline configs are processed
	updates := make(map[string]interface{})
	for telemetryType, pipelineVal := range pipelines {
		pipeline, ok := pipelineVal.(map[string]interface{})
		if !ok {
			continue
		}

		processorsVal, ok := pipeline[processorsKey]
		if !ok {
			continue
		}

		processors, ok := processorsVal.([]interface{})
		if !ok {
			continue
		}

		// accumulate config updates
		if shouldAppendDecouple(processors) {
			processors = append(processors, decoupleProcessor)
			updates[fmt.Sprintf("%s::%s::%s::%s", serviceKey, pipelinesKey, telemetryType, processorsKey)] = processors
		}
	}

	// apply all updates
	if len(updates) > 0 {
		if err := conf.Merge(confmap.NewFromStringMap(updates)); err != nil {
			return err
		}
	}

	return nil
}

// shouldAppendDecouple reports whether the last batch processor in the chain is not
// already followed by a decouple processor, which Convert uses to decide whether to
// append one.
func shouldAppendDecouple(processors []interface{}) bool {
	var shouldAppendDecouple bool
	for _, processorVal := range processors {
		processor, ok := processorVal.(string)
		if !ok {
			continue
		}
		processorBaseName := strings.Split(processor, "/")[0]
		if processorBaseName == batchProcessor {
			shouldAppendDecouple = true
		} else if processorBaseName == decoupleProcessor {
			shouldAppendDecouple = false
		}
	}
	return shouldAppendDecouple
}
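Below is a minimal, hypothetical usage sketch of the converter. The `Example` function and its file placement are illustrative only, and because the package is internal it can only be compiled from within the collector module:
```go
package decoupleafterbatchconverter_test // illustrative placement alongside the converter package

import (
	"context"
	"fmt"

	"go.opentelemetry.io/collector/confmap"

	"github.com/open-telemetry/opentelemetry-lambda/collector/internal/confmap/converter/decoupleafterbatchconverter"
)

// Example applies the converter to a configuration whose traces pipeline ends
// with the batch processor and prints the resulting processor chain.
func Example() {
	conf := confmap.NewFromStringMap(map[string]interface{}{
		"service": map[string]interface{}{
			"pipelines": map[string]interface{}{
				"traces": map[string]interface{}{
					"processors": []interface{}{"batch"},
				},
			},
		},
	})

	if err := decoupleafterbatchconverter.New().Convert(context.Background(), conf); err != nil {
		panic(err)
	}

	// Expected to print: [batch decouple]
	fmt.Println(conf.Get("service::pipelines::traces::processors"))
}
```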
@@ -0,0 +1,153 @@
// Copyright The OpenTelemetry Authors
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.
package decoupleafterbatchconverter

import (
	"context"
	"testing"

	"go.opentelemetry.io/collector/confmap"

	"github.com/google/go-cmp/cmp"
)

func TestConvert(t *testing.T) {
	// Since this really tests differences in input, it's easier to read cases
	// without the repeated definition of other fields in the config.
	baseConf := func(input []interface{}) *confmap.Conf {
		return confmap.NewFromStringMap(map[string]interface{}{
			"service": map[string]interface{}{
				"pipelines": map[string]interface{}{
					"traces": map[string]interface{}{
						"processors": input,
					},
				},
			},
		})
	}

	testCases := []struct {
		name     string
		input    *confmap.Conf
		expected *confmap.Conf
		err      error
	}{
		// This test is first because it illustrates the difference between a rule that
		// appends a decouple processor whenever batch is present and the approach taken
		// here, which appends one only when the last batch processor is not already
		// followed by a decouple processor.
		{
			name:     "batch then decouple in middle of chain",
			input:    baseConf([]interface{}{"processor1", "batch", "decouple", "processor2"}),
			expected: baseConf([]interface{}{"processor1", "batch", "decouple", "processor2"}),
		},
		{
			name:     "no service",
			input:    confmap.New(),
			expected: confmap.New(),
		},
		{
			name: "no pipelines",
			input: confmap.NewFromStringMap(
				map[string]interface{}{
					"service": map[string]interface{}{
						"extensions": map[string]interface{}{},
					},
				},
			),
			expected: confmap.NewFromStringMap(
				map[string]interface{}{
					"service": map[string]interface{}{
						"extensions": map[string]interface{}{},
					},
				},
			),
		},
		{
			name: "no processors in chain",
			input: confmap.NewFromStringMap(
				map[string]interface{}{
					"service": map[string]interface{}{
						"extensions": map[string]interface{}{},
						"pipelines": map[string]interface{}{
							"traces": map[string]interface{}{},
						},
					},
				},
			),
			expected: confmap.NewFromStringMap(map[string]interface{}{
				"service": map[string]interface{}{
					"extensions": map[string]interface{}{},
					"pipelines": map[string]interface{}{
						"traces": map[string]interface{}{},
					},
				},
			}),
		},
		{
			name:     "batch processor in singleton chain",
			input:    baseConf([]interface{}{"batch"}),
			expected: baseConf([]interface{}{"batch", "decouple"}),
		},
		{
			name:     "batch processor present twice",
			input:    baseConf([]interface{}{"batch", "processor1", "batch"}),
			expected: baseConf([]interface{}{"batch", "processor1", "batch", "decouple"}),
		},
		{
			name:     "batch processor not present",
			input:    baseConf([]interface{}{"processor1", "processor2"}),
			expected: baseConf([]interface{}{"processor1", "processor2"}),
		},
		{
			name:     "batch sandwiched between other processors, no decouple",
			input:    baseConf([]interface{}{"processor1", "batch", "processor2"}),
			expected: baseConf([]interface{}{"processor1", "batch", "processor2", "decouple"}),
		},
		{
			name:     "batch and decouple already present in correct position",
			input:    baseConf([]interface{}{"processor1", "batch", "processor2", "decouple"}),
			expected: baseConf([]interface{}{"processor1", "batch", "processor2", "decouple"}),
		},
		{
			name:     "decouple and batch",
			input:    baseConf([]interface{}{"decouple", "batch"}),
			expected: baseConf([]interface{}{"decouple", "batch", "decouple"}),
		},
		{
			name:     "decouple then batch mixed with others in the pipeline",
			input:    baseConf([]interface{}{"processor1", "decouple", "processor2", "batch", "processor3"}),
			expected: baseConf([]interface{}{"processor1", "decouple", "processor2", "batch", "processor3", "decouple"}),
		},
	}

	for _, tc := range testCases {
		t.Run(tc.name, func(t *testing.T) {
			conf := tc.input
			expected := tc.expected

			c := New()
			err := c.Convert(context.Background(), conf)
			if err != tc.err {
				t.Errorf("unexpected error converting: %v", err)
			}
			if diff := cmp.Diff(expected.ToStringMap(), conf.ToStringMap()); diff != "" {
				t.Errorf("Convert() mismatch: (-want +got):\n%s", diff)
			}
		})
	}
}