OPA: authorization based on request body #2518

mjungsbluth · 2023-08-16T10:01:02Z

This PR adds the ability to use the request's body in authorisation decisions with Open Policy Agent.

It supports the OPA Envoy plugin behaviour and only parses the body up to a configurable maximum size and otherwise tees the request body. This is to avoid parsing large bodies and creating memory and performance issues inside Skipper.

Disabling body parsing can be configured via the Open Policy Agent configuration file. It supports the same content types that the upstream OPA plugin supports: application/json and multipart/form-data.

To make any negative impact on latency or memory consumption measurable, two new filter names are introduced that capture the intent to use the body. The implementation is also smart in the sense that it checks whether the policy actually uses the body from the input object before inspecting the body stream.

AlexanderYastrebov · 2023-08-21T09:28:47Z

filters/openpolicyagent/openpolicyagent.go

@@ -37,6 +40,7 @@ const (
 	defaultReuseDuration       = 30 * time.Second
 	defaultShutdownGracePeriod = 30 * time.Second
 	DefaultCleanIdlePeriod     = 10 * time.Second
+	DefaultMaxBodySize         = 128 * 1024 * 1024


This feels too much. What is the rule of thumb to estimate this?

I suggest we have no default constant, have flag value zero and configure actual value in the deployment manifest.

We could also make it lower case, private, and change it later. I agree that right now 128MB seems to be a bit big for a body.

Would 10 MB be fine?

I have to admit I got carried away and did not see it when reviewing my changes. 128KB is what I intended to actually use. This is based on internal use cases that we looked at where 60-100KB of JSON are valid cases so far.
I think the max header size is currently 1MB which we could also default to and would already be generous. 10MB is already strange for an authz use case IMO.

szuecs · 2023-08-21T09:35:50Z

filters/openpolicyagent/openpolicyagent.go

+		}
+
+		var err error
+		rawBody, err = bufferedReader.Peek(opa.maxBodyBytes)


That's likely a blocking call to read(2) and can be used as a DoS vector.
Can you please make sure it's not blocking, but streaming in case you need to work on the request body?

Good point, it is very likely blocking. Any pointers on how to prove that or improve it? As we need to parse the body before feeding it into OPA, we kind of need to wait. We could time out on read though and rather fail for slow clients (or attackers), but I am not familiar enough with Skipper's architecture or Golang...

I would suggest to create another filter for the body inspection.
Then for the implementation it would be a streaming filter that works on the body and as soon as it is clear that the body should be denied stop by returning an error, similar to blockContent filter.
For details on how to do streaming filters: #2428, I also had some more direct pointers in chat.

@szuecs I have some questions
With streaming do you mean that

We read the body partially like the blockContent filter does?

With streaming is the expectation to still hit the backend before the request filter finishes the processing?

Thanks for the pointers @szuecs, seems doable after reading the linked PR, just a few clarifying questions (and slightly separate topics)

(a) Separate Filter
Is the motivation to not touch the body unless needed or is there some other thing that is missing?
It should be possible to be a bit smarter in the filter and parse the body only if it is really needed by the policy (needs some exploration with partial evaluation) and it fits from a size perspective. From a UX perspective I think ending up with different filters for each use case could be confusing, but it is also a bit hard to predict which other cases we'll see. This would also be a good motivator to include a benchmark...

(b) Streaming body
Given that the mentioned PR is not yet merged, this essentially would result in a special purpose wrapper, do you think this is ok and we try to unify later or should we wait for the PR to be merged?
Secondly, the policy can influence the returned body / headers. If the deny happens during the body processing, I am not yet sure where we would implement that. I assume that the error returned during Read() is somehow propagated to the Response() method of the filter or is there a better way for this?

(a) if it's possible to know at parse time that you need to touch the body, then it would be great to only touch the body in case you need.
(b) yes, it would already send the request headers to the backend and start streaming, but your filter will be able to stop the body streaming, because it gets the bytes before they are written to the wire. You can pass errors to the Response of the filter via the filter context's stateBag (map[string]interface{}).

Thanks, then I'll proceed with making the changes:
(a) will try to get this info, if that is not feasible or slow, we can default to the separate filter
(b) will switch to a streaming processing then with the information you provided...

When starting to implement (b), I encountered an issue that I missed: The decision can influence the headers that are sent to the downstream service and we have an internal use case that will rely on that.
I assume that the actual Read for passing the request is actually done by the http.Client that makes the outgoing request. This client actually copies the headers before starting to read the body which means that they can no longer be influenced.
I am now a bit unsure how to proceed. (a) seems possible after initial testing and the read buffer can be optimised so that parsing the body does not always consume the max body size but this would still block in the Request() method of the filter. I will reach out via internal chat to discuss options...

AlexanderYastrebov · 2023-08-21T09:41:49Z

filters/openpolicyagent/openpolicyagent.go

+	body := req.Body
+	var rawBody []byte
+	if body != nil && !opa.EnvoyPluginConfig().SkipRequestBodyParse && opa.maxBodyBytes > 0 {
+		bufferedReader := bufio.NewReaderSize(body, opa.maxBodyBytes)


This will always allocate maxBodyBytes while actual body could be much smaller

Correct, it saved re-implementing a smarter buffered reader, which I can also do. My assumption was that the buffer is typically in the range of hundreds of KBs, so it should not hurt too much. Would you recommend re-doing this with a fixed-size buffer (in the low two-digit KB range)?

mjungsbluth · 2023-09-21T07:44:00Z

The PR is updated with the following changes:

- There are two separate filter aliases that enable body parsing but otherwise re-use the existing filter code.
- Body parsing is only done if it is not disabled by the control plane and the policy actually depends on the body for decision making.
- The body is read in chunks (8KB by default) and reading stops at either the content length or the maximum configured buffer size (1MB = max header size).

Happy to change things esp. if there are concerns with the default sizes.

szuecs · 2023-09-21T10:13:07Z

config/config.go

@@ -497,6 +498,7 @@ func NewConfig() *Config {
 	flag.StringVar(&cfg.OpenPolicyAgentConfigTemplate, "open-policy-agent-config-template", "", "file containing a template for an Open Policy Agent configuration file that is interpolated for each OPA filter instance")
 	flag.StringVar(&cfg.OpenPolicyAgentEnvoyMetadata, "open-policy-agent-envoy-metadata", "", "JSON file containing meta-data passed as input for compatibility with Envoy policies in the format")
 	flag.DurationVar(&cfg.OpenPolicyAgentCleanerInterval, "open-policy-agent-cleaner-interval", openpolicyagent.DefaultCleanIdlePeriod, "JSON file containing meta-data passed as input for compatibility with Envoy policies in the format")
+	flag.Uint64Var(&cfg.OpenPolicyAgentMaxBodySize, "open-policy-agent-max-body-size", http.DefaultMaxHeaderBytes, "Maximum number of bytes from the body that are passed as input to the policy")


Is this reasonable to depend on stdlib max header bytes?
I guess we should have our own MaxBodySize in openpolicyagent package.

Will change

szuecs · 2023-09-21T10:13:45Z

config/config_test.go

@@ -161,6 +161,7 @@ func defaultConfig() *Config {
 		LuaModules:                              commaListFlag(),
 		LuaSources:                              commaListFlag(),
 		OpenPolicyAgentCleanerInterval:          10 * time.Second,
+		OpenPolicyAgentMaxBodySize:              1048576,


openpolicyagent.MaxBodySize ?

As above, will change

szuecs · 2023-09-21T10:14:24Z

docs/reference/filters.md

+
+This filter has the same parameters that the `opaAuthorizeRequest` filter has.
+
+The body is parsed up to a maximum size that can be configured via the `-open-policy-agent-max-body-size` command line argument.


Tell about the default size here, too

szuecs · 2023-09-21T10:14:40Z

docs/reference/filters.md

+
+This filter has the same parameters that the `opaServeResponse` filter has.
+
+The body is parsed up to a maximum size that can be configured via the `-open-policy-agent-max-body-size` command line argument.


tell about the default size, too

szuecs · 2023-09-21T10:27:59Z

filters/openpolicyagent/openpolicyagent.go

+	body := req.Body
+
+	if body != nil && !opa.EnvoyPluginConfig().SkipRequestBodyParse &&
+		opa.maxBodyBytes > 0 && req.ContentLength <= int64(opa.maxBodyBytes) {


opa.maxBodyBytes > 0 seems to be included in req.ContentLength <= int64(opa.maxBodyBytes) and 0 case should be handled by body != nil
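Whether the `> 0` check is fully covered may be worth verifying with a small truth table. The sketch below reduces the condition from the diff to plain values (the two function names are hypothetical). It relies on two documented `net/http` facts: a server request's `Body` is always non-nil, and `ContentLength` is `-1` when the length is unknown.

```go
package main

import "fmt"

// parseWithGuard mirrors the condition from the diff, with the body and
// config checks reduced to booleans.
func parseWithGuard(hasBody, skipParse bool, maxBodyBytes int, contentLength int64) bool {
	return hasBody && !skipParse && maxBodyBytes > 0 && contentLength <= int64(maxBodyBytes)
}

// parseWithoutGuard is the same condition with the "> 0" check dropped.
func parseWithoutGuard(hasBody, skipParse bool, maxBodyBytes int, contentLength int64) bool {
	return hasBody && !skipParse && contentLength <= int64(maxBodyBytes)
}

func main() {
	// maxBodyBytes = 0 models "body parsing switched off via the flag".
	// -1: unknown length (e.g. chunked), 0: empty body, 10: small body.
	for _, cl := range []int64{-1, 0, 10} {
		fmt.Println(cl, parseWithGuard(true, false, 0, cl), parseWithoutGuard(true, false, 0, cl))
	}
}
```

In this sketch the two variants differ for a zero limit combined with an unknown or zero content length, so dropping the check appears to change behaviour in those cases.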

szuecs

Can you please add a test case for chaining the request filter?
I expect that people will use chaining and 2 days ago we found #2605.

mjungsbluth · 2023-09-21T14:25:40Z

> Can you please add a test case for chaining the request filter?
> I expect that people will use chaining and 2 days ago we found #2605.

Interesting 🤔. Yes will do that…

mjungsbluth · 2023-09-25T07:24:53Z

The PR has been updated, the chaining luckily worked without any changes and the other comments have been addressed as well.

szuecs · 2023-09-25T18:42:14Z

👍

szuecs · 2023-09-25T18:42:25Z

@AlexanderYastrebov please review, thanks

mjungsbluth · 2023-09-28T12:47:21Z

I will add one more memory setting to keep the total size of all concurrent bodies in check and avoid an OOM.

mjungsbluth · 2023-10-02T11:23:52Z

Added a new memory setting that is global for all in-flight requests that depend on body authz which is enforced using a semaphore. Exceeding that limit will start failing requests with a 5xx status. Happy to chat on names and if we should block instead of failing ...
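A fail-fast memory budget of this kind could be sketched as follows. This is not the PR's implementation: `memoryLimiter` is a hypothetical mutex-guarded counter used to keep the example stdlib-only, whereas a weighted semaphore such as `golang.org/x/sync/semaphore` offers a comparable non-blocking `TryAcquire`.

```go
package main

import (
	"fmt"
	"sync"
)

// memoryLimiter is a hypothetical global budget for in-flight body bytes.
type memoryLimiter struct {
	mu   sync.Mutex
	free int64
}

func newMemoryLimiter(total int64) *memoryLimiter {
	return &memoryLimiter{free: total}
}

// TryAcquire reserves n bytes and reports false immediately when the
// budget is exhausted, instead of waiting for other requests to finish.
func (l *memoryLimiter) TryAcquire(n int64) bool {
	l.mu.Lock()
	defer l.mu.Unlock()
	if n > l.free {
		return false
	}
	l.free -= n
	return true
}

// Release returns n bytes to the budget once the request is done.
func (l *memoryLimiter) Release(n int64) {
	l.mu.Lock()
	defer l.mu.Unlock()
	l.free += n
}

func main() {
	limiter := newMemoryLimiter(100)
	fmt.Println(limiter.TryAcquire(60)) // first request fits
	fmt.Println(limiter.TryAcquire(60)) // second request would exceed the budget
	limiter.Release(60)
	fmt.Println(limiter.TryAcquire(60)) // fits again after the release
}
```

A caller would reserve the expected body size before reading and answer with a 5xx status when `TryAcquire` returns false.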

szuecs · 2023-10-02T12:13:43Z

filters/openpolicyagent/openpolicyagent.go

-func (opa *OpenPolicyAgentInstance) ExtractHttpBodyOptionally(req *http.Request) (io.ReadCloser, []byte, error) {
+func bodyUpperBound(contentLength, maxBodyBytes int64) int64 {
+	if contentLength <= 0 {
+		return maxBodyBytes


Do you have a test that has no body in the request?

Actually no. Will add one...

Added a few tests against unknown and nil bodies.
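For illustration, the cases mentioned here could be exercised against a function shaped like the one in the diff. Only the first branch below is taken from the diff; the rest of the body is an assumption about how such a bound might be completed, not the PR's code.

```go
package main

import "fmt"

// bodyUpperBound returns how many bytes of the body to read at most.
// The first branch is from the diff; the remainder is an assumed completion.
func bodyUpperBound(contentLength, maxBodyBytes int64) int64 {
	if contentLength <= 0 {
		// Unknown (-1) or zero content length: fall back to the configured maximum.
		return maxBodyBytes
	}
	if contentLength < maxBodyBytes {
		return contentLength
	}
	return maxBodyBytes
}

func main() {
	const maxBody = 1024
	for _, cl := range []int64{-1, 0, 100, 4096} {
		fmt.Println(cl, bodyUpperBound(cl, maxBody))
	}
}
```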

skipper.go

szuecs · 2023-10-02T12:21:23Z

docs/reference/filters.md

@@ -1804,7 +1804,7 @@ Requests can also be authorized based on the request body the same way that is s

 This filter has the same parameters that the `opaAuthorizeRequest` filter has.

-The body is parsed up to a maximum size with a default of 1MB that can be configured via the `-open-policy-agent-max-body-size` command line argument.
+The body is parsed up to a maximum size with a default of 1MB that can be configured via the `-open-policy-agent-max-request-body-size` command line argument. To avoid OOM errors due to too many authorized body requests, another flag `-open-policy-agent-max-total-body-size` controls how much memory can be used across all requests with a default of 100MB. If  in-flight requests that use body authorization exceed that limit, incoming requests that use the body will be rejected with an internal server error. 


Maybe also explain the equation that is used to limit max concurrent users.
Is it 100MB/1MB = 100 or something else?

Will add this. It effectively depends on the actual content length. So if you have an average of 80KB bodies, you can have 100MB/80KB=1280 concurrent requests before the filters would start to consume a lot more memory than granted.
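The arithmetic can be checked with a short sketch. `maxConcurrent` is a hypothetical helper; capping the per-request size at the maximum request body size is an assumption based on the per-request limit described earlier in this thread.

```go
package main

import "fmt"

const (
	kb = int64(1024)
	mb = 1024 * kb
)

// maxConcurrent estimates how many body-authorized requests fit into the
// global memory budget at a given average content length.
func maxConcurrent(maxMemory, avgContentLength, maxRequestBodySize int64) int64 {
	perRequest := avgContentLength
	if maxRequestBodySize < perRequest {
		// No request can reserve more than the per-request maximum.
		perRequest = maxRequestBodySize
	}
	if perRequest <= 0 {
		return 0
	}
	return maxMemory / perRequest
}

func main() {
	fmt.Println(maxConcurrent(100*mb, 80*kb, 1*mb)) // 1280
	fmt.Println(maxConcurrent(100*mb, 5*mb, 1*mb))  // 100
}
```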

Amended the docs.

filters/openpolicyagent/openpolicyagent.go

szuecs · 2023-10-04T19:27:16Z

> Added a new memory setting that is global for all in-flight requests that depend on body authz which is enforced using a semaphore. Exceeding that limit will start failing requests with a 5xx status. Happy to chat on names and if we should block instead of failing ...

Fail fast is preferable.

szuecs · 2023-10-07T18:45:34Z

docs/reference/filters.md

@@ -1857,7 +1857,7 @@ If you want to serve requests directly from an Open Policy Agent policy that use

 This filter has the same parameters that the `opaServeResponse` filter has.

-The body is parsed up to a maximum size with a default of 1MB that can be configured via the `-open-policy-agent-max-request-body-size` command line argument. To avoid OOM errors due to too many authorized body requests, another flag `-open-policy-agent-max-total-body-size` controls how much memory can be used across all requests with a default of 100MB. If  in-flight requests that use body authorization exceed that limit, incoming requests that use the body will be rejected with an internal server error. 
+A request's body is parsed up to a maximum size with a default of 1MB that can be configured via the `-open-policy-agent-max-request-body-size` command line argument. To avoid OOM errors due to too many concurrent authorized body requests, another flag `-open-policy-agent-max-memory-body-parsing` controls how much memory can be used across all requests with a default of 100MB. If  in-flight requests that use body authorization exceed that limit, incoming requests that use the body will be rejected with an internal server error. The number of concurrent requests is <max-memory-body-parsing> / min(avg(<request content-length>), <max-request-body-size>), so if requests on average have 100KB and the maximum memory is set to 100MB, on average 1024 authorized requests can be processed concurrently.


You can also use Math syntax like in https://github.com/zalando/skipper/blob/master/docs/reference/filters.md?plain=1#L2289
I think this would be better readable.

Hah, did not know that. Looks indeed nicer, will change
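In math notation, the formula from the doc change above might read roughly as follows. The symbol names are placeholders chosen for this sketch: $N$ is the number of concurrent requests, $M$ the `max-memory-body-parsing` value, $\bar{L}$ the average request content length, and $B$ the `max-request-body-size` value.

```latex
$$
N = \frac{M}{\min\left(\bar{L},\, B\right)}
\qquad\text{e.g.}\qquad
\frac{100\,\text{MB}}{\min(100\,\text{KB},\, 1\,\text{MB})} = \frac{102400\,\text{KB}}{100\,\text{KB}} = 1024
$$
```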

szuecs · 2023-10-07T18:46:59Z

lgtm, only the doc could be nicer :)

zalando-robot · 2023-10-09T07:30:54Z

Docker image "container-registry-test.zalando.net/teapot/skipper-test:sha256:559cd36314207aa50947cfa2a76e57d804e71e7d0b39af76e6b2529d440b3492" is not based on an approved base image. Any production deployment relying on this image will be blocked.