Add decoupleafterbatch converter to ensure decouple processor follows batch processor (#1255)

* Always put decouple processor first in pipeline

* Add converter to derive processors from a base

* Remove scratchpad code

* implement rules and test

* update tests

* improve tests for reviewers

* fix toggle for append predicate

* Fix typo in function comment

* Document converter and auto-configuration

* Document converter and auto-configuration

* rm errant test

* Add tests to clarify decouple->batch ill-formed chain

* Fix typo in test case description

* Improve name of predicate/helper

* Update collector/processor/decoupleprocessor/README.md

Co-authored-by: Adam Charrett <[email protected]>

* gofmt -s -w .

* restructure tests to extend coverage

* go mod tidy

* Update collector/internal/confmap/converter/decoupleafterbatchconverter/README.md

Co-authored-by: Adam Charrett <[email protected]>

* Add auto-config explaination to Collector

---------

Co-authored-by: Adam Charrett <[email protected]>
nslaughter and adcharre authored Apr 22, 2024
1 parent c1d07ac commit dff9bc6
Showing 12 changed files with 331 additions and 77 deletions.
66 changes: 12 additions & 54 deletions collector/README.md
@@ -90,68 +90,26 @@ from an S3 object using a CloudFormation template:
Loading configuration from S3 will require that the IAM role attached to your function includes read access to the relevant bucket.
## Auto-Configuration
Configuring the Lambda Collector to use the batch processor without a decouple processor after it can lead to performance issues and potential data loss when the execution environment is frozen. The OpenTelemetry Lambda Layer therefore automatically adds the decouple processor to the end of the chain if the batch processor is used and the decouple processor is not.
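To illustrate, here is a minimal sketch of the rewrite the layer performs (receivers and exporters are omitted for brevity; only the `processors` list matters here):
```yaml
# Configuration as written (batch without a trailing decouple):
service:
  pipelines:
    traces:
      processors: [batch]

# Effective configuration after auto-configuration:
#   processors: [batch, decouple]
```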
# Improving Lambda response times
At the end of a lambda function's execution, the OpenTelemetry client libraries will flush any pending spans/metrics/logs
to the collector before returning control to the Lambda environment. The collector's pipelines are synchronous, which
means that the lambda function's response is delayed until the data has been exported. This delay can amount to
hundreds of milliseconds.
To overcome this problem, the [decouple](./processor/decoupleprocessor/README.md) processor can be used to separate the
two ends of the collector's pipeline and allow the lambda function to complete while ensuring that any data is exported
before the Lambda environment is frozen.
Below is a sample configuration that uses the decouple processor:
```yaml
receivers:
  otlp:
    protocols:
      grpc:

exporters:
  logging:
    loglevel: debug
  otlp:
    endpoint: { backend endpoint }

processors:
  decouple:

service:
  pipelines:
    traces:
      receivers: [otlp]
      processors: [decouple]
      exporters: [logging, otlp]
```
As noted in the Auto-Configuration section above, you don't need to add the decouple processor to your configuration manually.
## Reducing Lambda runtime
If your lambda function is invoked frequently, it is also possible to pair the decouple processor with the batch
processor to reduce total lambda execution time at the expense of delaying the export of OpenTelemetry data.
When used with the batch processor, the decouple processor must be the last processor in the pipeline to ensure that
data is successfully exported before the lambda environment is frozen.
An example use of the batch and decouple processors:
```yaml
receivers:
  otlp:
    protocols:
      grpc:

exporters:
  logging:
    loglevel: debug
  otlp:
    endpoint: { backend endpoint }

processors:
  decouple:
  batch:
    timeout: 5m

service:
  pipelines:
    traces:
      receivers: [otlp]
      processors: [batch, decouple]
      exporters: [logging, otlp]
```
As stated in the Auto-Configuration section above, the OpenTelemetry Lambda Layer will automatically add the decouple processor to the end of the processor chain if the batch processor is used and the decouple processor is not. The result is the same whether you configure it manually or rely on auto-configuration.
1 change: 1 addition & 0 deletions collector/go.mod
@@ -20,6 +20,7 @@ replace cloud.google.com/go => cloud.google.com/go v0.107.0

require (
	github.com/golang-collections/go-datastructures v0.0.0-20150211160725-59788d5eb259
	github.com/google/go-cmp v0.6.0
	github.com/open-telemetry/opentelemetry-collector-contrib/confmap/provider/s3provider v0.92.0
	github.com/open-telemetry/opentelemetry-lambda/collector/lambdacomponents v0.91.0
	github.com/open-telemetry/opentelemetry-lambda/collector/lambdalifecycle v0.0.0-00010101000000-000000000000
3 changes: 2 additions & 1 deletion collector/internal/collector/collector.go
@@ -32,6 +32,7 @@ import (
"go.uber.org/zap/zapcore"

"github.com/open-telemetry/opentelemetry-lambda/collector/internal/confmap/converter/disablequeuedretryconverter"
"github.com/open-telemetry/opentelemetry-lambda/collector/internal/confmap/converter/decoupleafterbatchconverter"
)

// Collector runs a single otelcol as a go routine within the
@@ -68,7 +69,7 @@ func NewCollector(logger *zap.Logger, factories otelcol.Factories, version strin
		ResolverSettings: confmap.ResolverSettings{
			URIs:       []string{getConfig(l)},
			Providers:  mapProvider,
			Converters: []confmap.Converter{expandconverter.New(), disablequeuedretryconverter.New()},
			Converters: []confmap.Converter{expandconverter.New(), disablequeuedretryconverter.New(), decoupleafterbatchconverter.New()},
		},
	}
	cfgProvider, err := otelcol.NewConfigProvider(cfgSet)
collector/internal/confmap/converter/decoupleafterbatchconverter/README.md
@@ -0,0 +1,11 @@
# DecoupleAfterBatch Converter

The `DecoupleAfterBatch` converter automatically modifies the collector's configuration for the Lambda distribution. Its purpose is to ensure that a decouple processor is always present after a batch processor in a pipeline, in order to prevent potential data loss due to the Lambda environment being frozen.

## Behavior

The converter scans the collector's configuration and makes the following adjustments:

1. If a pipeline contains a batch processor with no decouple processor defined after it, the converter will automatically add a decouple processor to the end of the pipeline.

2. If a decouple processor is already defined after the last batch processor, or the pipeline contains no batch processor at all, the converter leaves the pipeline configuration unchanged (both rules are illustrated in the example below).
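For example, a minimal sketch (using `processor1` as a stand-in for any other processor):
```yaml
service:
  pipelines:
    traces:
      # Rule 1: no decouple after the last batch, so "decouple" is appended.
      processors: [processor1, batch]   # becomes [processor1, batch, decouple]
    metrics:
      # Rule 2: decouple already follows batch, so the pipeline is left as is.
      processors: [batch, decouple]
```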
@@ -0,0 +1,118 @@
// Copyright The OpenTelemetry Authors
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.

// Package decoupleafterbatchconverter implements a confmap.Converter that mutates Collector
// configurations to ensure the decouple processor is placed after the batch processor.
// It does this by appending a decouple processor to the end of any processor chain in which
// the last batch processor is not already followed by a decouple processor.
package decoupleafterbatchconverter

import (
	"context"
	"fmt"
	"strings"

	"go.opentelemetry.io/collector/confmap"
)

const (
	serviceKey        = "service"
	pipelinesKey      = "pipelines"
	processorsKey     = "processors"
	batchProcessor    = "batch"
	decoupleProcessor = "decouple"
)

type converter struct{}

// New returns a confmap.Converter that ensures a decouple processor follows the batch processor in each pipeline.
func New() confmap.Converter {
	return &converter{}
}

func (c converter) Convert(_ context.Context, conf *confmap.Conf) error {
	serviceVal := conf.Get(serviceKey)
	service, ok := serviceVal.(map[string]interface{})
	if !ok {
		return nil
	}

	pipelinesVal, ok := service[pipelinesKey]
	if !ok {
		return nil
	}

	pipelines, ok := pipelinesVal.(map[string]interface{})
	if !ok {
		return nil
	}

	// accumulates updates over the pipelines and applies them
	// once all pipeline configs are processed
	updates := make(map[string]interface{})
	for telemetryType, pipelineVal := range pipelines {
		pipeline, ok := pipelineVal.(map[string]interface{})
		if !ok {
			continue
		}

		processorsVal, ok := pipeline[processorsKey]
		if !ok {
			continue
		}

		processors, ok := processorsVal.([]interface{})
		if !ok {
			continue
		}

		// accumulate config updates
		if shouldAppendDecouple(processors) {
			processors = append(processors, decoupleProcessor)
			updates[fmt.Sprintf("%s::%s::%s::%s", serviceKey, pipelinesKey, telemetryType, processorsKey)] = processors
		}
	}

	// apply all updates
	if len(updates) > 0 {
		if err := conf.Merge(confmap.NewFromStringMap(updates)); err != nil {
			return err
		}
	}

	return nil
}

// shouldAppendDecouple reports whether the last batch processor in the chain is not
// already followed by a decouple processor, which Convert uses to decide whether to
// append one.
func shouldAppendDecouple(processors []interface{}) bool {
	var shouldAppendDecouple bool
	for _, processorVal := range processors {
		processor, ok := processorVal.(string)
		if !ok {
			continue
		}
		processorBaseName := strings.Split(processor, "/")[0]
		if processorBaseName == batchProcessor {
			shouldAppendDecouple = true
		} else if processorBaseName == decoupleProcessor {
			shouldAppendDecouple = false
		}
	}
	return shouldAppendDecouple
}
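Below is a minimal, hypothetical usage sketch of the converter. The `Example` function and its file placement are illustrative only, and because the package is internal it can only be compiled from within the collector module:
```go
package decoupleafterbatchconverter_test // illustrative placement alongside the converter package

import (
	"context"
	"fmt"

	"go.opentelemetry.io/collector/confmap"

	"github.com/open-telemetry/opentelemetry-lambda/collector/internal/confmap/converter/decoupleafterbatchconverter"
)

// Example applies the converter to a configuration whose traces pipeline ends
// with the batch processor and prints the resulting processor chain.
func Example() {
	conf := confmap.NewFromStringMap(map[string]interface{}{
		"service": map[string]interface{}{
			"pipelines": map[string]interface{}{
				"traces": map[string]interface{}{
					"processors": []interface{}{"batch"},
				},
			},
		},
	})

	if err := decoupleafterbatchconverter.New().Convert(context.Background(), conf); err != nil {
		panic(err)
	}

	// Expected to print: [batch decouple]
	fmt.Println(conf.Get("service::pipelines::traces::processors"))
}
```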
@@ -0,0 +1,153 @@
// Copyright The OpenTelemetry Authors
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.
package decoupleafterbatchconverter

import (
	"context"
	"testing"

	"go.opentelemetry.io/collector/confmap"

	"github.com/google/go-cmp/cmp"
)

func TestConvert(t *testing.T) {
	// Since this really tests differences in input, it's easier to read cases
	// without the repeated definition of other fields in the config.
	baseConf := func(input []interface{}) *confmap.Conf {
		return confmap.NewFromStringMap(map[string]interface{}{
			"service": map[string]interface{}{
				"pipelines": map[string]interface{}{
					"traces": map[string]interface{}{
						"processors": input,
					},
				},
			},
		})
	}

	testCases := []struct {
		name     string
		input    *confmap.Conf
		expected *confmap.Conf
		err      error
	}{
		// This test is first because it illustrates the difference between a rule that
		// appends a decouple processor whenever batch is present and the approach taken
		// here, which appends one only when the last batch processor is not already
		// followed by a decouple processor.
		{
			name:     "batch then decouple in middle of chain",
			input:    baseConf([]interface{}{"processor1", "batch", "decouple", "processor2"}),
			expected: baseConf([]interface{}{"processor1", "batch", "decouple", "processor2"}),
		},
		{
			name:     "no service",
			input:    confmap.New(),
			expected: confmap.New(),
		},
		{
			name: "no pipelines",
			input: confmap.NewFromStringMap(
				map[string]interface{}{
					"service": map[string]interface{}{
						"extensions": map[string]interface{}{},
					},
				},
			),
			expected: confmap.NewFromStringMap(
				map[string]interface{}{
					"service": map[string]interface{}{
						"extensions": map[string]interface{}{},
					},
				},
			),
		},
		{
			name: "no processors in chain",
			input: confmap.NewFromStringMap(
				map[string]interface{}{
					"service": map[string]interface{}{
						"extensions": map[string]interface{}{},
						"pipelines": map[string]interface{}{
							"traces": map[string]interface{}{},
						},
					},
				},
			),
			expected: confmap.NewFromStringMap(map[string]interface{}{
				"service": map[string]interface{}{
					"extensions": map[string]interface{}{},
					"pipelines": map[string]interface{}{
						"traces": map[string]interface{}{},
					},
				},
			}),
		},
		{
			name:     "batch processor in singleton chain",
			input:    baseConf([]interface{}{"batch"}),
			expected: baseConf([]interface{}{"batch", "decouple"}),
		},
		{
			name:     "batch processor present twice",
			input:    baseConf([]interface{}{"batch", "processor1", "batch"}),
			expected: baseConf([]interface{}{"batch", "processor1", "batch", "decouple"}),
		},
		{
			name:     "batch processor not present",
			input:    baseConf([]interface{}{"processor1", "processor2"}),
			expected: baseConf([]interface{}{"processor1", "processor2"}),
		},
		{
			name:     "batch sandwiched between other processors, no decouple",
			input:    baseConf([]interface{}{"processor1", "batch", "processor2"}),
			expected: baseConf([]interface{}{"processor1", "batch", "processor2", "decouple"}),
		},
		{
			name:     "batch and decouple already present in correct position",
			input:    baseConf([]interface{}{"processor1", "batch", "processor2", "decouple"}),
			expected: baseConf([]interface{}{"processor1", "batch", "processor2", "decouple"}),
		},
		{
			name:     "decouple and batch",
			input:    baseConf([]interface{}{"decouple", "batch"}),
			expected: baseConf([]interface{}{"decouple", "batch", "decouple"}),
		},
		{
			name:     "decouple then batch mixed with others in the pipeline",
			input:    baseConf([]interface{}{"processor1", "decouple", "processor2", "batch", "processor3"}),
			expected: baseConf([]interface{}{"processor1", "decouple", "processor2", "batch", "processor3", "decouple"}),
		},
	}

	for _, tc := range testCases {
		t.Run(tc.name, func(t *testing.T) {
			conf := tc.input
			expected := tc.expected

			c := New()
			err := c.Convert(context.Background(), conf)
			if err != tc.err {
				t.Errorf("unexpected error converting: %v", err)
			}
			if diff := cmp.Diff(expected.ToStringMap(), conf.ToStringMap()); diff != "" {
				t.Errorf("Convert() mismatch: (-want +got):\n%s", diff)
			}
		})
	}
}