MeterProvider: Excessive creation of metrics instruments #2820
Comments
To add more context, if I were writing an adapter for the Go Prometheus Client I would need to write this caching myself inside the adapter to avoid duplicate registration of metrics. I don't know how heavy-weight the creation of instruments is in other client SDKs for metrics, so that could also be a consideration.
I'm not familiar with Prometheus - does that implementation not cache instruments/meters by default? The OTEL SDK for Go explicitly states that it does, and we released an adapter for it out-of-the-box. Since we generally need to support functional options that modify config on a per-request basis (e.g. setting a different meter provider for a specific operation), we took advantage of this and made the operation workflow invoke the provider on each call. Since our clients support functional options, I'm leaning towards taking a hard stance that Meter() and Instrument() calls through the smithy-go APIs should be idempotent on spec.
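As an illustration (a minimal sketch of my own, not code from this thread), the dedup behavior the OTEL Go SDK documents can be verified with a manual reader: asking for the same scope and instrument name twice yields handles that record into a single time series.

package main

import (
	"context"
	"fmt"

	sdkmetric "go.opentelemetry.io/otel/sdk/metric"
	"go.opentelemetry.io/otel/sdk/metric/metricdata"
)

func main() {
	ctx := context.Background()

	// a manual reader lets us collect in-process instead of exporting
	reader := sdkmetric.NewManualReader()
	provider := sdkmetric.NewMeterProvider(sdkmetric.WithReader(reader))

	// two lookups of the same scope and the same instrument name
	c1, _ := provider.Meter("example/scope").Int64Counter("example.calls")
	c2, _ := provider.Meter("example/scope").Int64Counter("example.calls")
	c1.Add(ctx, 1)
	c2.Add(ctx, 1)

	// both Add calls land on one instrument, so this reports a single
	// metric under "example/scope" with a value of 2
	var rm metricdata.ResourceMetrics
	_ = reader.Collect(ctx, &rm)
	for _, sm := range rm.ScopeMetrics {
		fmt.Println(sm.Scope.Name, len(sm.Metrics))
	}
}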
Using the Prometheus Go SDK, the flow is different: it's an error to register two metrics with the same unique identifier (or rather, it's an error to register two metrics which would result in the same time series), and there is no out-of-the-box mechanism to look up previously registered metrics to avoid creating new ones - it is up to the application to keep a reference to the metric. If we knew that for a given scope + name we would never get different configs (which sounds like what you're suggesting), a hypothetical Prometheus meter-provider adapter could keep these metrics persistent between calls.
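For reference, this is roughly that registration flow with the Prometheus Go client (my sketch, with made-up metric names, not the reporter's code): a second registration under the same fully-qualified name fails, and the usual workaround is to recover the existing collector from prometheus.AlreadyRegisteredError or to keep your own reference.

package main

import (
	"errors"
	"fmt"

	"github.com/prometheus/client_golang/prometheus"
)

// getOrRegisterCounterVec registers cv, or returns the collector that was
// already registered under the same fully-qualified name.
func getOrRegisterCounterVec(reg prometheus.Registerer, opts prometheus.CounterOpts, labels []string) *prometheus.CounterVec {
	cv := prometheus.NewCounterVec(opts, labels)
	if err := reg.Register(cv); err != nil {
		var are prometheus.AlreadyRegisteredError
		if errors.As(err, &are) {
			return are.ExistingCollector.(*prometheus.CounterVec)
		}
		panic(err) // a genuinely incompatible registration
	}
	return cv
}

func main() {
	opts := prometheus.CounterOpts{Name: "aws_sdk_call_attempts", Help: "SDK call attempts."}
	labels := []string{"operation"}

	a := getOrRegisterCounterVec(prometheus.DefaultRegisterer, opts, labels)
	b := getOrRegisterCounterVec(prometheus.DefaultRegisterer, opts, labels)

	a.WithLabelValues("ListBuckets").Inc()
	b.WithLabelValues("ListBuckets").Inc()

	// both lookups resolve to the same collector, so this prints true and
	// the single time series holds a value of 2
	fmt.Println(a == b)
}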
It also looks like the OTEL SDK has an adapter to register itself with a Prometheus registry - that might be the smoothest option if you're primarily targeting and testing against the OTEL SDKs.
I'm okay closing this if your primary target is OTEL metrics - the OTEL project has a Prometheus adapter which seems to work okay. For reference, a sketch of what that looks like is below; it assumes the caller is okay with hauling around the OTEL SDK for these purposes. A Prometheus adapter plugged directly into the AWS SDK which caches instruments wouldn't be too hard, and might avoid some allocations per SDK call. I'm not sure about what I'm doing with the histogram buckets - that might not actually be necessary.

package main
import (
	"context"
	"fmt"
	"io"
	"net/http/httptest"
	"os"

	"github.com/aws/aws-sdk-go-v2/config"
	"github.com/aws/aws-sdk-go-v2/service/s3"
	"github.com/aws/smithy-go/metrics/smithyotelmetrics"
	"go.opentelemetry.io/otel"
	otelprom "go.opentelemetry.io/otel/exporters/prometheus"
	sdkmetric "go.opentelemetry.io/otel/sdk/metric"

	"github.com/prometheus/client_golang/prometheus"
	"github.com/prometheus/client_golang/prometheus/promhttp"
)

func main() {
	if err := mainErr(); err != nil {
		fmt.Fprintf(os.Stderr, "error: %s\n", err)
		os.Exit(1)
	}
}

func mainErr() error {
	// set up our metric-exporter
	setupOTELExporter()

	// for demo purposes, scrape all prom metrics and dump to stdout
	defer scrapePromMetrics()

	ctx := context.Background()
	cfg, err := config.LoadDefaultConfig(ctx)
	if err != nil {
		return fmt.Errorf("loading aws config: %s", err)
	}

	s3c := s3.NewFromConfig(cfg, func(o *s3.Options) {
		o.MeterProvider = smithyotelmetrics.Adapt(otel.GetMeterProvider())
	})

	// make a few API calls
	_, err = s3c.ListBuckets(ctx, &s3.ListBucketsInput{})
	if err != nil {
		return fmt.Errorf("list buckets: %s", err)
	}
	_, err = s3c.ListBuckets(ctx, &s3.ListBucketsInput{})
	if err != nil {
		return fmt.Errorf("list buckets: %s", err)
	}
	_, err = s3c.ListBuckets(ctx, &s3.ListBucketsInput{})
	if err != nil {
		return fmt.Errorf("list buckets: %s", err)
	}
	return nil
}

func setupOTELExporter() {
	// create an otel metric-exporter associated with the
	// default prometheus registry
	metricExporter, err := otelprom.New(
		otelprom.WithNamespace("aws"),
		otelprom.WithoutScopeInfo(),
		// OTEL default buckets assume you're using milliseconds. Substitute defaults
		// appropriate for units of seconds.
		//
		// https://github.com/open-telemetry/opentelemetry-go/issues/5821
		otelprom.WithAggregationSelector(func(ik sdkmetric.InstrumentKind) sdkmetric.Aggregation {
			switch ik {
			case sdkmetric.InstrumentKindHistogram:
				return sdkmetric.AggregationExplicitBucketHistogram{
					Boundaries: prometheus.DefBuckets,
					NoMinMax:   false,
				}
			default:
				return sdkmetric.DefaultAggregationSelector(ik)
			}
		}),
	)
	if err != nil {
		panic(err)
	}

	// create a meter-provider associated with the exporter
	meterProvider := sdkmetric.NewMeterProvider(
		sdkmetric.WithReader(metricExporter),
	)
	otel.SetMeterProvider(meterProvider)
}

// for demo purposes, dump all prom metrics to stdout
func scrapePromMetrics() {
	req := httptest.NewRequest("GET", "/", nil)
	rw := httptest.NewRecorder()
	promhttp.Handler().ServeHTTP(rw, req)
	io.Copy(os.Stdout, rw.Result().Body)
}
Another difference between OTEL and Prometheus is that Prometheus requires the time-series labels to be defined at instrument creation, but it looks like the APIs provided by smithy-metrics defer that to observation time, which makes a direct Prometheus adapter more awkward. Anyway, I have a basic path forward. If I'm okay with the OTEL SDK as a dependency, the code is pretty straightforward. If I want a native Prometheus client integration, implementing my own middleware is likely going to be less work than fitting the smithy-metrics interfaces.
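To make that concrete, a small side-by-side sketch of my own (the metric names are made up): the Prometheus client fixes the label names when the vector is created, while the OTEL API accepts attributes at observation time.

package main

import (
	"context"

	"github.com/prometheus/client_golang/prometheus"
	"go.opentelemetry.io/otel"
	"go.opentelemetry.io/otel/attribute"
	"go.opentelemetry.io/otel/metric"
)

func main() {
	ctx := context.Background()

	// Prometheus: label names must be declared when the instrument is created...
	promCounter := prometheus.NewCounterVec(
		prometheus.CounterOpts{Name: "sdk_calls_total", Help: "SDK calls."},
		[]string{"operation"}, // fixed up front
	)
	prometheus.MustRegister(promCounter)
	// ...and only the label values are supplied at observation time.
	promCounter.WithLabelValues("ListBuckets").Inc()

	// OTEL: the instrument is created without attribute names
	// (the global provider here is a no-op; it just shows the API shape)...
	otelCounter, _ := otel.Meter("example/scope").Int64Counter(
		"sdk.calls",
		metric.WithDescription("SDK calls."),
	)
	// ...and arbitrary attributes can be attached per observation.
	otelCounter.Add(ctx, 1, metric.WithAttributes(
		attribute.String("operation", "ListBuckets"),
	))
}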
This issue was moved to a discussion.
You can continue the conversation there.
Acknowledgements
go get -u github.com/aws/aws-sdk-go-v2/...

Describe the bug
When supplying a MeterProvider to the s3 client, the underlying retryer seems to be creating metrics instruments on every API call.
Following the OTEL model (and other similar metrics libraries), I would have expected instruments to be created once, cached, and then to receive new observations on each API call.
For example, from the OTEL documentation on instrumenting a Go application: https://opentelemetry.io/docs/languages/go/instrumentation/#using-counters
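The excerpt isn't reproduced here, but the pattern that page describes is roughly the following (scope and instrument names are illustrative): create the instrument once, then record new observations on it for every call.

package main

import (
	"context"

	"go.opentelemetry.io/otel"
	"go.opentelemetry.io/otel/metric"
)

// the instrument is created once, up front, from the (global) meter...
var meter = otel.Meter("example.io/demo")

var apiCounter, _ = meter.Int64Counter(
	"api.counter",
	metric.WithDescription("Number of API calls."),
	metric.WithUnit("{call}"),
)

// ...and each request only adds a new observation to the existing instrument
func handleCall(ctx context.Context) {
	apiCounter.Add(ctx, 1)
}

func main() {
	handleCall(context.Background())
}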
Excerpt of my test code:
Full code sample:
Regression Issue
Expected Behavior
I expect my demo-meter-provider to log each instrument creation only once.
Current Behavior
The instruments were created three times - once per API call I made to the S3 service:
Reproduction Steps
Possible Solution
No response
Additional Information/Context
No response
AWS Go SDK V2 Module Versions Used
Compiler and Version used
go version go1.23.1 linux/amd64
Operating System and version
Arch Linux x86_64, Linux 6.10.10-arch1-1