desc = "error reading server preface: http2: frame too large" #646

Open
haseeb-aziz opened this issue Feb 15, 2023 · 20 comments

Comments

@haseeb-aziz

Hello,
I’m getting the error “Err: connection error: desc = "error reading server preface: http2: frame too large" {"grpc_log": true}” when I use this configuration in an EKS Kubernetes cluster.
The same configuration successfully exports Prometheus metrics to Uptrace when I use it outside the EKS Kubernetes cluster.

Configuration:

prometheus_simple:
  collection_interval: 10s
  endpoint: '10.XXX.XX.XXX:9090'
  metrics_path: '/metrics'
  use_service_account: false
  tls_enabled: false

exporters:
  otlp:
    endpoint: 10.XXX.X.XX:14317
    headers: { 'uptrace-dsn': 'http://[email protected]:14317/2' }
    tls:
      insecure: true

@TylerHelmuth
Member

TylerHelmuth commented Feb 15, 2023

@haseeb-aziz that error message is normally hiding a deeper issue. I've seen it when not being able to communicate properly with the grpc endpoint I'm trying to send data to.

Can you update your issue with your values.yaml? Which helm chart version are you using?

@TylerHelmuth
Member

@haseeb-aziz Please format your post as yaml

@haseeb-aziz
Author

haseeb-aziz commented Feb 16, 2023

These are the logs from the OpenTelemetry pod. I'm getting this error:

info exporterhelper/queued_retry.go:433 Exporting failed. Will retry the request after interval. {"kind": "exporter", "data_type": "metrics", "name": "otlp", "error": "rpc error: code = Unavailable desc = connection error: desc = "error reading server preface: http2: frame too large"", "interval": "38.375612262s"}

Please advise

@haseeb-aziz
Author

haseeb-aziz commented Feb 16, 2023

@TylerHelmuth Thanks
Helm chart version: 0.48.1

values.yaml file of opentelemetry

nameOverride: ""
fullnameOverride: ""

mode: "deployment"

presets:
  logsCollection:
    enabled: false
    includeCollectorLogs: false
    storeCheckpoints: false
  
  hostMetrics:
    enabled: false
  
  kubernetesAttributes:
    enabled: false
  
  clusterMetrics:
    enabled: false
  
  kubeletMetrics:
    enabled: false

configMap:
  create: true
config:
  exporters:
    logging: {}
  extensions:
    health_check: {}
    memory_ballast: {}
  processors:
    batch: {}
    memory_limiter: null
  receivers:
    otlp:
      protocols:
        grpc:
          endpoint: ${MY_POD_IP}:4317
        http:
          endpoint: ${MY_POD_IP}:4318
    prometheus_simple:
      collection_interval: 10s
      endpoint: 'XXX.XXX.XXX.147:9090'
      metrics_path: '/metrics'
      use_service_account: false
      tls_enabled: false   
    zipkin:
      endpoint: ${MY_POD_IP}:9411
  exporters:
    otlp:
      endpoint: XX.XXX.1.XXX:14317
      headers: { 'uptrace-dsn': 'http://[email protected]:14317/2' }
      tls:
        insecure: true 

  service:
    telemetry:
      metrics:
        address: ${MY_POD_IP}:9090
    extensions:
      - health_check
      - memory_ballast
    pipelines:
      metrics:
        exporters:
          - otlp
        receivers:
          - prometheus_simple
      
image:
  repository: otel/opentelemetry-collector-contrib
  pullPolicy: IfNotPresent
  tag: ""
  digest: ""
imagePullSecrets: []
command:
  name: otelcol-contrib
  extraArgs: []

serviceAccount:
  create: true
  annotations: {}
  name: ""

clusterRole:
  create: false
  annotations: {}
  name: ""
  rules: []
  clusterRoleBinding:
    annotations: {}
    name: ""

podSecurityContext: {}
securityContext: {}

nodeSelector: {}
tolerations: []
affinity: {}
topologySpreadConstraints: {}


priorityClassName: ""

extraEnvs: []
extraVolumes: []
extraVolumeMounts: []


ports:
  otlp:
    enabled: true
    containerPort: 4317
    servicePort: 4317
    hostPort: 4317
    protocol: TCP
    # nodePort: 30317
    appProtocol: grpc
  otlp-http:
    enabled: true
    containerPort: 4318
    servicePort: 4318
    hostPort: 4318
    protocol: TCP
  jaeger-compact:
    enabled: true
    containerPort: 6831
    servicePort: 6831
    hostPort: 6831
    protocol: UDP
  jaeger-thrift:
    enabled: true
    containerPort: 14268
    servicePort: 14268
    hostPort: 14268
    protocol: TCP
  jaeger-grpc:
    enabled: true
    containerPort: 14250
    servicePort: 14250
    hostPort: 14250
    protocol: TCP
  zipkin:
    enabled: true
    containerPort: 9411
    servicePort: 9411
    hostPort: 9411
    protocol: TCP
  metrics:
    enabled: false
    containerPort: 8888
    servicePort: 8888
    protocol: TCP


resources:
  limits:
    cpu: 256m
    memory: 512Mi

podAnnotations: {}

podLabels: {}

hostNetwork: false

dnsPolicy: ""


replicaCount: 1


revisionHistoryLimit: 10

annotations: {}

extraContainers: []

initContainers: []

lifecycleHooks: {}

service:
  type: ClusterIP
  annotations: {}

ingress:
  enabled: false
  additionalIngresses: []

podMonitor:
  enabled: false
  metricsEndpoints:
    - port: metrics
  extraLabels: {}

serviceMonitor:
  enabled: false
  metricsEndpoints:
    - port: metrics
  extraLabels: {}
  
podDisruptionBudget:
  enabled: false

autoscaling:
  enabled: false
  minReplicas: 1
  maxReplicas: 10
  targetCPUUtilizationPercentage: 80

rollout:
  rollingUpdate: {}
  strategy: RollingUpdate

prometheusRule:
  enabled: false
  groups: []
  defaultRules:
    enabled: false
  extraLabels: {}

statefulset:
  volumeClaimTemplates: []
  podManagementPolicy: "Parallel"

@haseeb-aziz
Author

haseeb-aziz commented Feb 16, 2023

@TylerHelmuth
I also checked the Uptrace pod using tcpdump, but Uptrace is not receiving any traffic from OpenTelemetry on that port.
The same configuration works when Uptrace and the OTel Collector are deployed on the same host using Docker Compose.

@povilasv
Contributor

povilasv commented Feb 17, 2023

This looks like an exporter misconfiguration. Perhaps the server is not gRPC, or it behaves weirdly? You can debug this by setting the environment variable GODEBUG=http2debug=2, which should dump the gRPC requests it is trying to send.

Reference: https://stackoverflow.com/a/44482155
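
As a complement to the GODEBUG output, it can help to look at the first bytes the endpoint sends back. The sketch below is a rough heuristic in Go using only the standard library. `classifyReply` is a hypothetical helper, and the sample byte strings are assumptions for illustration, not captured traffic. It tells apart an HTTP/1.x text reply, a TLS record, and something shaped like an HTTP/2 SETTINGS frame, which is the server preface a gRPC client expects:

```go
package main

import (
	"bytes"
	"fmt"
)

// classifyReply guesses which protocol produced the first bytes of a reply.
// Hypothetical helper for illustration; it is a heuristic, not a parser.
func classifyReply(b []byte) string {
	switch {
	case bytes.HasPrefix(b, []byte("HTTP/")):
		// Text status line: the port speaks HTTP/1.x, not gRPC.
		return "http1"
	case len(b) >= 2 && (b[0] == 0x15 || b[0] == 0x16) && b[1] == 0x03:
		// TLS alert (0x15) or handshake (0x16) record: the port expects TLS.
		return "tls"
	case len(b) >= 9:
		// HTTP/2 frame header: 3-byte length, type, flags, 4-byte stream ID.
		length := uint32(b[0])<<16 | uint32(b[1])<<8 | uint32(b[2])
		frameType := b[3]
		streamID := (uint32(b[5])<<24 | uint32(b[6])<<16 | uint32(b[7])<<8 | uint32(b[8])) & 0x7fffffff
		// A server preface is a SETTINGS frame (type 0x4) on stream 0
		// whose payload length is a multiple of 6 bytes.
		if frameType == 0x4 && streamID == 0 && length%6 == 0 && length <= 16384 {
			return "http2"
		}
	}
	return "unknown"
}

func main() {
	samples := []struct {
		name string
		data []byte
	}{
		{"HTTP/1.1 error page", []byte("HTTP/1.1 400 Bad Request\r\n")},
		{"TLS alert record", []byte{0x15, 0x03, 0x03, 0x00, 0x02, 0x02, 0x46}},
		{"empty SETTINGS frame", []byte{0, 0, 0, 0x04, 0, 0, 0, 0, 0}},
	}
	for _, s := range samples {
		fmt.Printf("%-22s -> %s\n", s.name, classifyReply(s.data))
	}
}
```

In practice the bytes could come from a packet capture of the exporter's connection. Anything other than "http2" suggests that the exporter's protocol or TLS setting does not match what is listening on that port.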

@qdongxu

qdongxu commented Apr 21, 2023

This may be caused by the gRPC client sending plaintext HTTP/2 to an HTTPS server.

I hit a similar issue when sending gRPC requests to an nginx gRPC proxy.

I got the 'http2: frame too large' error when dialing as below (I had intended to ignore any SSL verification):

	conn, err := grpc.Dial(*addr, grpc.WithTransportCredentials(insecure.NewCredentials()))

As observed with tcpdump, the client sends HTTP/2 frames without TLS encryption, and the server sends back a 404 BadRequest with error messages in the HTTP body, unencrypted, over HTTP/1.1. The client then reports: error reading server preface: http2: frame too large

But it succeeded this way (loading the server certificate):

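	// Needs: crypto/tls, crypto/x509, log, os,
	// google.golang.org/grpc and google.golang.org/grpc/credentials.
	proxyCA := "/var/tmp/fullchain.pem" // CA cert that signed the proxy
	f, err := os.ReadFile(proxyCA)
	if err != nil {
		log.Fatalf("read CA file: %v", err)
	}

	p := x509.NewCertPool()
	if !p.AppendCertsFromPEM(f) {
		log.Fatalf("no certificates found in %s", proxyCA)
	}
	tlsConfig := &tls.Config{
		RootCAs: p, // trust the CA that signed the proxy certificate
	}
	conn, err := grpc.Dial(*addr, grpc.WithTransportCredentials(credentials.NewTLS(tlsConfig)))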

It may not be the same cause. I'm posting this just for reference, since I was looking for the root cause and came across this issue.

@basch255

basch255 commented Jul 5, 2023

Are there any updates on this topic?
I ran into a similar error. I'm trying to connect two collectors via gRPC over nginx (secured by OIDC).

client-collector:

    exporters:
      otlp:
        endpoint: server-collector.example.de:443
        auth:
          authenticator: oauth2client
        tls:
          insecure_skip_verify: true

server-collector:

    receivers:
      otlp:
        protocols:
          grpc:
            auth:
              authenticator: oidc

ingress:

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  annotations:
    nginx.ingress.kubernetes.io/backend-protocol: GRPC
  name: server-collector
spec:
  ingressClassName: nginx
  rules:
    - host: server-collector.example.de
      http:
        paths:
          - backend:
              service:
                name: server-collector
                port:
                  number: 4317
            path: /
            pathType: ImplementationSpecific

@haseeb-aziz
Author

Use HTTP rather than gRPC. My problem was resolved by using HTTP.

@veyselsahin

@haseeb-aziz were you able to send metrics over HTTP instead of gRPC?

@haseeb-aziz
Author

haseeb-aziz commented Jul 8, 2023 via email

@xiaoqinglee

xiaoqinglee commented Jul 12, 2023

"http2: frame too large" will happen when you try to initialize an http2 connection (using grpc, for example) to a target port which is expecting http1.1 connections.

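To make the mechanism concrete, here is a minimal Go sketch using only the standard library. It assumes the server answered with a plain HTTP/1.1 status line (the response text is an illustrative assumption), and `frameLength` is a hypothetical helper, not part of any gRPC or HTTP/2 package. An HTTP/2 frame header is 9 bytes and starts with a 24-bit big-endian payload length, so a client that expects a SETTINGS frame reads the ASCII letters "HTT" as a length:

```go
package main

import "fmt"

// defaultMaxFrameSize is the initial SETTINGS_MAX_FRAME_SIZE in HTTP/2 (2^14).
const defaultMaxFrameSize = 16384

// frameLength interprets the first 3 bytes of b as the 24-bit big-endian
// length field of an HTTP/2 frame header. Hypothetical helper written for
// this illustration.
func frameLength(b []byte) (uint32, bool) {
	if len(b) < 3 {
		return 0, false
	}
	return uint32(b[0])<<16 | uint32(b[1])<<8 | uint32(b[2]), true
}

func main() {
	// Assumed reply from a server that only speaks HTTP/1.1 on this port.
	reply := []byte("HTTP/1.1 400 Bad Request\r\n")

	n, ok := frameLength(reply)
	if !ok {
		fmt.Println("reply too short")
		return
	}
	fmt.Printf("claimed frame length: %d bytes\n", n)
	fmt.Printf("exceeds default max of %d: %v\n", defaultMaxFrameSize, n > defaultMaxFrameSize)
}
```

The bytes 'H', 'T', 'T' are 0x48, 0x54, 0x54, which decode to 4740180. That is far above the 16384-byte default, so a client enforcing that limit rejects the "frame" as too large before it even looks at the frame type.
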
@shiw2021

Same error in the v2rayN client. After editing the config and deleting the fingerprint input (which used to be "chrome"), the error was gone.

@raniellyferreira

You are trying to connect with HTTP/2 over plain HTTP. Try again with a TLS connection and insecure_skip_verify = true.

@pedrogneri

I had the same issue: my gRPC server sits behind an nginx proxy that uses TLS.
So I just had to add TLS credentials to my Dial:

        grpcClientConn, err := grpc.Dial(os.Getenv("GRPC_SERVER_ADDR"), grpc.WithTransportCredentials(credentials.NewTLS(&tls.Config{})))

instead of

	conn, err := grpc.Dial(*addr, grpc.WithTransportCredentials(insecure.NewCredentials()))

@izavadynskyi

Hi guys, I have also faced the same issue.
Has anybody configured an external tempo-distributor endpoint to send gRPC traces (port 4317) via ingress-nginx?
We have a dedicated shared Tempo cluster in a separate EKS environment, and we use the OTLP HTTP port via the Tempo gateway to send black-box traces with OTel agents from applications hosted on other EKS clusters. This solution works for us, but for some services we need to use gRPC only.
So I tried to create a direct ingress endpoint to tempo-distributor port 4317, but ran into the following errors:

2023-12-22T10:25:07.675Z	warn	zapgrpc/zapgrpc.go:195	[core] [Channel #1 SubChannel #2] grpc: addrConn.createTransport failed to connect to {
  "Addr": "tempo-distributor.dev.observability.internal:80",
  "ServerName": "tempo-distributor.dev.observability.internal:80",
  "Attributes": null,
  "BalancerAttributes": null,
  "Type": 0,
  "Metadata": null
}. Err: connection error: desc = "error reading server preface: http2: frame too large"	{"grpc_log": true}

Ingress configuration :

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  annotations:
    kubernetes.io/ingress.class: nginx-internal
    meta.helm.sh/release-name: tempo
    meta.helm.sh/release-namespace: tempo
    nginx.ingress.kubernetes.io/backend-protocol: GRPC
    nginx.ingress.kubernetes.io/grpc-backend: "true"
  labels:
    app.kubernetes.io/component: distributor
    app.kubernetes.io/instance: tempo
    app.kubernetes.io/managed-by: Helm
    app.kubernetes.io/name: tempo
    app.kubernetes.io/version: 2.0.1
    helm.sh/chart: tempo-distributed-1.2.7
  name: tempo-distributor
  namespace: tempo
spec:
  rules:
  - host: tempo-distributor.dev.observability.internal
    http:
      paths:
      - backend:
          service:
            name: tempo-distributor
            port:
              number: 4317
        path: /
        pathType: Prefix

Opentelemetry collector config below:

kind: OpenTelemetryCollector
metadata:
  name: cloud
spec:
  config: |
    receivers:
      otlp:
        protocols:
          http:
          grpc:
            endpoint: 0.0.0.0:5555
    processors:
      batch:
        timeout: 1s
        send_batch_size: 1024
    exporters:
      logging:
        loglevel: info
      otlp:
        endpoint: tempo-distributor.dev.observability.internal:80
        tls:
          insecure: true
    service:
      pipelines:
        traces:
          receivers: [otlp]
          processors: [batch]
          exporters: [otlp] 

I verified this configuration against the local tempo-distributor service (sending traces directly from the OpenTelemetry Collector to the tempo-distributor service on port 4317, without the ingress), and everything works properly.

I would appreciate any help if somebody has used such an approach.

@izavadynskyi

I resolved it on the OpenTelemetry Collector side:

  config: |
    receivers:
      otlp:
        protocols:
          http:
          grpc:
            endpoint: 0.0.0.0:5555
    processors:
      batch:
        timeout: 1s
        send_batch_size: 1024
    exporters:
      logging:
        loglevel: info
      otlphttp:
        endpoint: http://tempo-gateway.dev.observability.internal:80/otlp
        tls:
          insecure: true
    service:
      pipelines:
        traces:
          receivers: [otlp]
          processors: [batch]
          exporters: [otlphttp]

In other words, receiving with receivers: [otlp] and exporting with exporters: [otlphttp] to the tempo-gateway ingress.

@OndrejValenta

OK, so in my case it was a certificate issue. Or rather, Loki was going through the proxy even though it shouldn't; it does not respect NO_PROXY.

Once I added the CA certificate to the trusted cert store and stopped skipping verification, all was good.

http_config:
  insecure_skip_verify: false
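
The difference between skipping verification and trusting the right CA can be sketched in Go with the standard library alone. `clientTLSConfig` and its parameters are hypothetical names for illustration; this is not Loki's or the Collector's actual configuration code:

```go
package main

import (
	"crypto/tls"
	"crypto/x509"
	"errors"
	"fmt"
)

// clientTLSConfig is a hypothetical helper. With caPEM set, the server
// certificate is verified against that bundle; with caPEM empty, the host's
// trust store is used. skipVerify disables verification entirely.
func clientTLSConfig(caPEM []byte, skipVerify bool) (*tls.Config, error) {
	cfg := &tls.Config{InsecureSkipVerify: skipVerify}
	if len(caPEM) == 0 {
		return cfg, nil // RootCAs == nil means "use the host's root CA set"
	}
	pool := x509.NewCertPool()
	if !pool.AppendCertsFromPEM(caPEM) {
		return nil, errors.New("no PEM certificates found in CA bundle")
	}
	cfg.RootCAs = pool
	return cfg, nil
}

func main() {
	// Verification on, host trust store.
	cfg, err := clientTLSConfig(nil, false)
	fmt.Println(cfg.InsecureSkipVerify, cfg.RootCAs == nil, err)

	// A bundle without any certificate is rejected instead of silently ignored.
	_, err = clientTLSConfig([]byte("not a certificate"), false)
	fmt.Println(err)
}
```

Keeping verification on and supplying the CA, as described above, means a proxy that intercepts the connection is either trusted explicitly or fails loudly with a certificate error.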

@fkamaliada

fkamaliada commented Apr 18, 2024

Same here (AWS EKS environment). Problem solved after changing otel collector config map from:

receivers:
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:55680
exporters:
  otlp/data-prepper:
    endpoint: data-prepper.opentelemetry:21890
    tls:
      insecure: true
  otlp/data-prepper-metrics:
    endpoint: data-prepper-metrics.opentelemetry:4900
    tls:
      insecure: true
service:
  pipelines:
    traces:
      receivers: [otlp]
      exporters: [otlp/data-prepper]
    metrics:
      receivers: [otlp]
      exporters: [otlp/data-prepper-metrics]

to:

receivers:
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:55680
exporters:
  otlphttp/data-prepper:
    endpoint: data-prepper.opentelemetry:21890
    tls:
      insecure: true
  otlphttp/data-prepper-metrics:
    endpoint: data-prepper-metrics.opentelemetry:4900
    tls:
      insecure: true
service:
  pipelines:
    traces:
      receivers: [otlp]
      exporters: [otlphttp/data-prepper]
    metrics:
      receivers: [otlp]
      exporters: [otlphttp/data-prepper-metrics]

@albertojnk

I was getting this error only in the production environment; local worked fine.
After a lot of investigation I figured out that when my gitlab-ci ran, it read the .env-prod file, not .env, and it turned out my .env-prod file had the wrong port in it: a port for an HTTP service instead of my gRPC one.

So, tl;dr: check your env files.
