Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[grpc storage]: Propagate tenant to grpc backend #6030

Merged
merged 3 commits into from
Oct 3, 2024

Conversation

frzifus
Copy link
Member

@frzifus frzifus commented Sep 30, 2024

Which problem is this PR solving?

  • Missing tenant information in a gRPC remote backend.
    image

Description of the changes

Before the gRPC plugins were removed in Jaeger 1.58, tenant information was distributed in via the internal context.
With this change, the tenant information is also propagated to a removed grpc backend.

How was this change tested?

  • Manually with jaeger-query, tempo-query and wireshark.

Checklist


cc @albertteoh do you think we can get this into the next jaeger release tomorrow? ^^

Copy link

codecov bot commented Sep 30, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 96.90%. Comparing base (99e9e0e) to head (20a31ba).
Report is 6 commits behind head on main.

Additional details and impacted files
@@           Coverage Diff           @@
##             main    #6030   +/-   ##
=======================================
  Coverage   96.90%   96.90%           
=======================================
  Files         349      349           
  Lines       16599    16600    +1     
=======================================
+ Hits        16086    16087    +1     
  Misses        329      329           
  Partials      184      184           
Flag Coverage Δ
badger_v1 7.98% <0.00%> (-0.01%) ⬇️
badger_v2 1.81% <0.00%> (-0.01%) ⬇️
cassandra-4.x-v1 15.74% <0.00%> (-0.01%) ⬇️
cassandra-4.x-v2 1.74% <0.00%> (-0.01%) ⬇️
cassandra-5.x-v1 15.74% <0.00%> (-0.01%) ⬇️
cassandra-5.x-v2 1.74% <0.00%> (-0.01%) ⬇️
elasticsearch-6.x-v1 18.69% <0.00%> (-0.01%) ⬇️
elasticsearch-7.x-v1 18.74% <0.00%> (-0.01%) ⬇️
elasticsearch-8.x-v1 18.93% <0.00%> (-0.02%) ⬇️
elasticsearch-8.x-v2 1.80% <0.00%> (-0.01%) ⬇️
grpc_v1 9.51% <100.00%> (-0.01%) ⬇️
grpc_v2 7.10% <0.00%> (-0.01%) ⬇️
kafka-v1 9.69% <0.00%> (-0.01%) ⬇️
kafka-v2 1.81% <0.00%> (-0.01%) ⬇️
memory_v2 1.80% <0.00%> (-0.02%) ⬇️
opensearch-1.x-v1 18.80% <0.00%> (-0.01%) ⬇️
opensearch-2.x-v1 18.78% <0.00%> (-0.01%) ⬇️
opensearch-2.x-v2 1.80% <0.00%> (-0.01%) ⬇️
tailsampling-processor 0.46% <0.00%> (-0.01%) ⬇️
unittests 95.70% <100.00%> (+<0.01%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@frzifus frzifus closed this Sep 30, 2024
@frzifus frzifus reopened this Oct 1, 2024
@frzifus frzifus marked this pull request as ready for review October 1, 2024 11:25
@frzifus frzifus requested a review from a team as a code owner October 1, 2024 11:25
@frzifus frzifus requested a review from pavolloffay October 1, 2024 11:25
// BearerTokenKey is the key name for the bearer token context value.
BearerTokenKey = "bearer.token"
// TenantKey is the key name for the x-tenant context value.
TenantKey = "x.tenant"
Copy link
Member Author

@frzifus frzifus Oct 1, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does it make sense to reuse the configured tenant header key?
Coming from --multi-tenancy.header=x-scope-orgid?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well.. That should already be the case. Since tenancy.NewClientUnaryInterceptor and tenancy.NewClientStreamInterceptor are registered.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Running Jaeger query without this patch:

SPAN_STORAGE_TYPE=grpc ./query --query.base-path=/ --grpc-storage.server=127.0.0.1:7777  --query.bearer-token-propagation=true  --multi-tenancy.enabled=true --multi-tenancy.header=x-scope-orgid --query.grpc-server.host-port=localhost:16685 --query.http-server.host-port=localhost:16686 --admin.http.host-port=localhost:16687

Then going a request:

curl -X GET "http://localhost:16686/api/traces?service=dummy" \
  -H "Authorization: Bearer abc" \
  -H "x-scope-orgid: 124"

Will only propagate the following metadata:

Stream Request Metadata: map[:authority:[127.0.0.1:7777] bearer.token:[abc] content-type:[application/grpc] grpc-accept-encoding:[snappy,zstd,gzip] user-agent:[grpc-go/1.65.0]]

Using this patch x.tenant:[124] will be part of it.

Copy link
Member Author

@frzifus frzifus Oct 1, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well.. That should already be the case. Since tenancy.NewClientUnaryInterceptor and tenancy.NewClientStreamInterceptor are registered.

Seems the problem was that this branch was never executed. Since the tenancy options from config_v1 have not been translated into config_v2.

if tenancyMgr.Enabled {
opts = append(opts, grpc.WithUnaryInterceptor(tenancy.NewClientUnaryInterceptor(tenancyMgr)))
opts = append(opts, grpc.WithStreamInterceptor(tenancy.NewClientStreamInterceptor(tenancyMgr)))

After adding the tenancy options in TranslateToConfigV2 the tenant infos x-scope-orgid:[124] are propagated.

Stream Request Metadata: map[:authority:[127.0.0.1:7777] bearer.token:[abc] content-type:[application/grpc] grpc-accept-encoding:[snappy,zstd,gzip] user-agent:[grpc-go/1.67.0] x-scope-orgid:[124]]

cc @yurishkuro @pavolloffay

@yurishkuro
Copy link
Member

  1. Is the header name x.tenant in any way standard in gRPC?
  2. I noticed that we don't have much in the way of documentation for these two headers, bearer.token and x.tenant - they both seem to be non-standard. Should we add a paragraph to https://www.jaegertracing.io/docs/1.61/deployment/#remote-storage explaining their use?
  3. Do we also need a symmetrical header extractor in grpc/shared/grpc_handler?

This reverts commit 090962d.

Signed-off-by: Benedikt Bongartz <[email protected]>
@frzifus frzifus requested a review from yurishkuro October 1, 2024 20:31
@frzifus
Copy link
Member Author

frzifus commented Oct 1, 2024

  1. Is the header name x.tenant in any way standard in gRPC?

No, the header name x.tenant is not a standard header in gRPC. I followed the bearer token implementation with the naming.

  1. I noticed that we don't have much in the way of documentation for these two headers, bearer.token and x.tenant - they both seem to be non-standard. Should we add a paragraph to https://www.jaegertracing.io/docs/1.61/deployment/#remote-storage explaining their use?

Sure, I will document the outcome of this PR afterwards.

  1. Do we also need a symmetrical header extractor in grpc/shared/grpc_handler?

I think no. I'm also not sure if we should keep the bearer.token key for the authorization token. From my point of view it makes sense to provide gRPC server & client interceptor for a list of headers. That should work with any header key you want to forward.

@yurishkuro
Copy link
Member

From my point of view it makes sense to provide gRPC server & client interceptor for a list of headers.

What would those interceptors do with the headers? In case of outbound ones they read the values from designated keys in the Context, so I am not sure what a "generic" interceptor for "headers" would do.

I do think we need an inbound interceptor for use in cmd/remote-storage, because it can be used to run storage backends natively supported by Jaeger which would be able to read those values from the Context.

@@ -38,6 +38,7 @@ func DefaultConfigV2() ConfigV2 {

func (c *Configuration) TranslateToConfigV2() *ConfigV2 {
return &ConfigV2{
Tenancy: c.TenancyOpts,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mahadzaryab1 I think this is another example of divergence between v1/v2 configs - this is why my preference is always NOT to have separate configs, so that omissions like this won't happen. Do you want to take a look at that as part of #5229?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@frzifus what kind of unit or e2e test could we have to catch this bug?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was going to add a unittest for this translate config function. But when I started to re-type this 4 line logic, I thought it doesnt make much sense.

An e2e test that runs jaeger-query with the SPAN_STORAGE_TYPE=grpc storage and the --multi-tenancy.header=x-scope-orgid flag and checks the metadata of the outgoing gRPC request would be perfect to avoid that issue in the future.
Similar to this: #6030 (comment)

@yurishkuro Would it be ok, creating an issue to keep track and submit something afterwards?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

will do @yurishkuro

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

An e2e test that runs jaeger-query with the SPAN_STORAGE_TYPE=grpc storage and the --multi-tenancy.header=x-scope-orgid flag and checks the metadata of the outgoing gRPC request would be perfect to avoid that issue in the future.

We have a grpc storage e2e test today. We can add a test there that will run the storage with tenancy enabled and store a span with tenant1 then try to retrieve it with tenant1 (found) and tenant2 (not found). The memory storage used in that e2e test already supports separation of data by tenant.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be even better to do this with jaeger-v2 e2e tests, but one thing that I don't know about is how we can thread the tenant ID through the OTEL Collector pipeline, from OTLP receiver to jaeger-storage-extension.

@frzifus
Copy link
Member Author

frzifus commented Oct 1, 2024

What would those interceptors do with the headers? In case of outbound ones they read the values from designated keys in the Context, so I am not sure what a "generic" interceptor for "headers" would do.

mh.. y, I think you are right. In that case, we would need to attach a map as value and build something custom. Probably not worth it.

I do think we need an inbound interceptor for use in cmd/remote-storage, because it can be used to run storage backends natively supported by Jaeger which would be able to read those values from the Context.

Its the case here, right?

if tm.Enabled {
grpcOpts = append(grpcOpts,
grpc.StreamInterceptor(tenancy.NewGuardingStreamInterceptor(tm)),
grpc.UnaryInterceptor(tenancy.NewGuardingUnaryInterceptor(tm)),
)
}

@yurishkuro
Copy link
Member

Its the case here, right?

ah, so it handles tenancy but not the bearer token? I actually don't see anything grpc-related in pkg/bearertoken

@yurishkuro
Copy link
Member

we should move upgradeContextWithBearerToken from plugin/storage/grpc/shared/grpc_client.go to be implemented as interceptor in pkg/bearertoken

@frzifus
Copy link
Member Author

frzifus commented Oct 1, 2024

ah, so it handles tenancy but not the bearer token? I actually don't see anything grpc-related in pkg/bearertoken

y, I did not go through the code yet. It might be handled somewhere. But having a pkg with the gRPC impl. (similar to the tenant thing) would simplify it.

@yurishkuro yurishkuro changed the title storage: propagate tenant to grpc backend [grpc storage]: Propagate tenant to grpc backend Oct 3, 2024
@yurishkuro yurishkuro merged commit d6631f5 into jaegertracing:main Oct 3, 2024
49 of 50 checks passed
@frzifus frzifus deleted the grpc-tenancy branch October 4, 2024 14:06
@frzifus frzifus mentioned this pull request Oct 4, 2024
6 tasks
yurishkuro pushed a commit that referenced this pull request Oct 4, 2024
)

## Which problem is this PR solving?
- Resolves #6041

## Description of the changes
- We were currently maintaining two entirely separate configurations for
v1 and v2 for the GRPC Storage Component and were initializing v1
configurations and then translating them to the v2 configurations which
were the ones actually being used. This caused an issue where one
configuration field was left out of the translation method (see
#6030 for more details).
- In this PR, we consolidate the v1 and v2 configurations into a single
config type that is directly initialized to avoid having configurations
that diverge or running into issues like the one described above.

## How was this change tested?
- Unit tests were updated
- Integration tests were not touched but should still pass since this
was just an internal change and the interface was not touched

## Checklist
- [x] I have read
https://github.com/jaegertracing/jaeger/blob/master/CONTRIBUTING_GUIDELINES.md
- [x] I have signed all commits
- [x] I have added unit tests for the new functionality
- [x] I have run lint and test steps successfully
  - for `jaeger`: `make lint test`
  - for `jaeger-ui`: `yarn lint` and `yarn test`

---------

Signed-off-by: Mahad Zaryab <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants