-
Notifications
You must be signed in to change notification settings - Fork 243
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix static converters #2270
base: main
Are you sure you want to change the base?
Fix static converters #2270
Changes from 4 commits
b900f01
7e99944
1f9f8fa
1e769d5
fb283ef
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -91,13 +91,20 @@ func toremotewriteexporterConfig(cfg *remotewriteexporter.Config, forwardTo []st | |
defaultArgs := &prometheus.Arguments{} | ||
defaultArgs.SetToDefault() | ||
|
||
return &prometheus.Arguments{ | ||
args := &prometheus.Arguments{ | ||
IncludeTargetInfo: defaultArgs.IncludeTargetInfo, | ||
IncludeScopeInfo: defaultArgs.IncludeScopeInfo, | ||
IncludeScopeLabels: defaultArgs.IncludeScopeLabels, | ||
GCFrequency: cfg.StaleTime, | ||
GCFrequency: defaultArgs.GCFrequency, | ||
ForwardTo: forwardTo, | ||
AddMetricSuffixes: defaultArgs.AddMetricSuffixes, | ||
ResourceToTelemetryConversion: defaultArgs.ResourceToTelemetryConversion, | ||
} | ||
|
||
// Override default only if > 0 because GCFrequency of 0 is not allowed | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Same here, do we really need to check if the incoming config is invalid? Also, if we set a different GC frequency than what the user specified, we're technically not converting their exact config anymore. Maybe it'd make more sense to just refuse to convert the config, or to log a warning. But IMO we could just convert it as it is and let Alloy fail when it loads it. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I dug deeper into this one because I could not understand why it would be set to 0 by default and I found out that when we moved to Alloy we transformed some static components into empty shells: https://github.com/grafana/alloy/pull/55/files#diff-1b127540731f7403cbe5211de129eddfc35579d88d6bcbc371520642a9aa5486. The problem with this component is that the default values were set in the newRemoteWriteExporter() function. This StaleTime default value is actually supposed to be 15 minutes. I got rid of the check in the converter and set the correct value in the yaml component. This way, if you convert a config that uses the default yaml value, it will be correctly converted to 15m in Alloy. |
||
if cfg.StaleTime > 0 { | ||
args.GCFrequency = cfg.StaleTime | ||
} | ||
|
||
return args | ||
} |
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -3,6 +3,7 @@ package test_common | |
import ( | ||
"bufio" | ||
"bytes" | ||
"context" | ||
"fmt" | ||
"io/fs" | ||
"os" | ||
|
@@ -12,6 +13,7 @@ import ( | |
"strings" | ||
"testing" | ||
|
||
"github.com/grafana/alloy/internal/component" | ||
"github.com/grafana/alloy/internal/converter/diag" | ||
"github.com/grafana/alloy/internal/featuregate" | ||
alloy_runtime "github.com/grafana/alloy/internal/runtime" | ||
|
@@ -20,6 +22,7 @@ import ( | |
cluster_service "github.com/grafana/alloy/internal/service/cluster" | ||
http_service "github.com/grafana/alloy/internal/service/http" | ||
"github.com/grafana/alloy/internal/service/labelstore" | ||
"github.com/grafana/alloy/internal/service/livedebugging" | ||
remotecfg_service "github.com/grafana/alloy/internal/service/remotecfg" | ||
"github.com/prometheus/client_golang/prometheus" | ||
"github.com/stretchr/testify/require" | ||
|
@@ -214,14 +217,42 @@ func attemptLoadingAlloyConfig(t *testing.T, bb []byte) { | |
clusterService, | ||
labelstore.New(nil, prometheus.DefaultRegisterer), | ||
remotecfgService, | ||
livedebugging.New(), | ||
}, | ||
EnableCommunityComps: true, | ||
}) | ||
err = f.LoadSource(cfg, nil, "") | ||
|
||
// This is a bit of an ugly workaround but the spanmetrics connector is starting a go routine in its start function. | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I guess this is referring to the start function of OTel Collector components? Do we call this when we call Alloy's There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. yeah we often have the pattern that the New() function is calling the Update function and this creates some side-effects but only for the tests. Currently we start the Otel components in the Run() function and that's where the race problems appeared. This change: #2262 is making the otel components start in the New() function, which is solving the problem at runtime but it can be a bit more annoying for testing There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Thank you for pointing this out... I added a comment to the other PR. |
||
// If the goroutine is not stopped, it will result in a panic. To stop it we need to run the Alloy controller to run the component and then stop it. | ||
// We cannot run the Alloy controller for all tests because some components such as the statsd_exporter will panic after being stopped and some other components have wrong config. | ||
runAlloyController := false | ||
components, err := f.ListComponents("", component.InfoOptions{}) | ||
require.NoError(t, err) | ||
for _, component := range components { | ||
if component.ComponentName == "otelcol.connector.spanmetrics" { | ||
runAlloyController = true | ||
break | ||
} | ||
} | ||
|
||
if runAlloyController { | ||
ctx, cancel := context.WithCancel(context.Background()) | ||
done := make(chan struct{}) | ||
go func() { | ||
f.Run(ctx) | ||
close(done) | ||
}() | ||
defer func() { | ||
cancel() | ||
<-done | ||
}() | ||
} | ||
|
||
// Many components will fail to build as e.g. the cert files are missing, so we ignore these errors. | ||
// This is not ideal, but we still validate for other potential issues. | ||
if err != nil && strings.Contains(err.Error(), "Failed to build component") { | ||
// Tests that require the Alloy controller to run should not have build error else the components won't run. | ||
if !runAlloyController && err != nil && strings.Contains(err.Error(), "Failed to build component") { | ||
t.Log("ignoring error: " + err.Error()) | ||
return | ||
} | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If the user supplied a config with both auth methods, isn't that technically a problem with the config and not with the converters? I'm ok for this check to remain, but not sure if it's overkill to check for invalid configuration in the converter.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think that setting two auth methods is possible in static mode because you must use the
authentication_method
field to specify your method (default OAuth). The problem here is that no matter which auth method the user would set, the converter would always create the two auth blocks, resulting in an invalid config.