POST body processing #902
While #628 implements alphabet checking, for now we do not parse the POST body and just skip it, so this issue must use the alphabet checking from #628 and implement POST validation. Please update the Custom character sets wiki page for the new POST processing functionality.
There is an evasion attack based on character escaping in field parameters. According to RFC 7231, the boundary parameter may be a "quoted-string". Something like In the request below, PHP will see "Hello, PHP!", while an application with a conformant parser will see "Hello, stranger!":
Parts of the issue are implemented in pull requests #1139 and #1154. The short summary for TODO:
RFC 1867 is obsoleted by RFC 2854, but that one is about text/html and seems to use MIME only in an example. The real references for us must be RFC 2045, 2046 and maybe 2049. The following implementations can be used as references for a multipart parser:
The second one deals with nested multipart bodies. I don't think that HTTP ever uses this. It would be good to check some RFC for this, or the source code of some mature web project like Django, Node.js or PHP, to see whether they parse nested parts (I checked Nginx Unit and it seems it leaves POST multipart processing to the application logic). As noted above we should not
With this task we only need to parse the POST multipart body and (1) validate it and (2) build an internal data structure to represent all the parameters. See for example the AWS S3 POST structure. The data structure representing the multipart body is TBD, but consider the following options:
@krizhanovsky RFC 7578, sections 4.3 and 5.2, explicitly allows multiple form fields with identical field names.
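Since repeated field names are allowed, whatever structure represents the body should probably keep parts in order instead of keying them by name. A minimal sketch of that idea follows; `Part` and `parse_multipart` are hypothetical names used only for illustration, and the splitting is deliberately naive (it assumes CRLF line endings and does not check that the delimiter starts a line):

```python
import re
from typing import List, NamedTuple


class Part(NamedTuple):
    name: str    # value of the name parameter from Content-Disposition
    body: bytes  # raw part payload


def parse_multipart(body: bytes, boundary: bytes) -> List[Part]:
    """Split a multipart/form-data body into an ordered list of parts.

    Duplicated field names are preserved as separate list entries.
    """
    delim = b"--" + boundary
    parts: List[Part] = []
    # sections[0] is the preamble, which is ignored.
    for section in body.split(delim)[1:]:
        if section.startswith(b"--"):
            break  # closing delimiter "--boundary--": the rest is epilogue
        # Drop the CRLF after the delimiter and the CRLF before the next one.
        if section.startswith(b"\r\n"):
            section = section[2:]
        if section.endswith(b"\r\n"):
            section = section[:-2]
        head, _, data = section.partition(b"\r\n\r\n")
        headers = {}
        for line in head.split(b"\r\n"):
            key, _, value = line.partition(b":")
            headers[key.strip().lower().decode("latin-1")] = (
                value.strip().decode("latin-1"))
        # Require a ';' before "name=" so that "filename=" is not matched.
        m = re.search(r';\s*name="([^"]*)"',
                      headers.get("content-disposition", ""))
        parts.append(Part(m.group(1) if m else "", data))
    return parts
```

An ordered list keeps both occurrences of a repeated field, so the decision about what to do with duplicates can be made later by a separate policy step.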
@krizhanovsky Also
The POST body processing option must be configured per-location and per-vhost by a new configuration option
Where N is the maximum content length; `0`, the default, means process all POST requests without any upper limit on the length. If the option is missing from the configuration file, we should not perform any POST validation.

We have to process the POST body, at least `boundary`. Empty and duplicated boundaries must be handled correctly, e.g.:

See more cases in https://www.slideshare.net/ssusera0a306/zeronights-2016-a-blow-under-the-belt-how-to-avoid-wafipsdlp-wafipsdlp and https://blog.qualys.com/wp-content/uploads/2012/07/Protocol-Level%20Evasion%20of%20Web%20Application%20Firewalls%20v1.1%20(18%20July%202012).pdf
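As an illustration of the empty and duplicated boundary cases, here is a minimal sketch that extracts the `boundary` parameter from a `Content-Type` value, unescapes a quoted-string value, and rejects a missing, empty or repeated boundary. The function names are hypothetical, and this is not a complete media-type parser (for example, it assumes every parameter has the form `name=value`):

```python
from typing import List, Tuple


def media_type_params(header: str) -> List[Tuple[str, str]]:
    """Return (name, value) pairs for the parameters of a media type.

    Quoted-string values are unescaped: a backslash makes the next
    character literal (quoted-pair).
    """
    params: List[Tuple[str, str]] = []
    n = len(header)
    i = header.find(";")
    while i != -1 and i < n:
        i += 1  # skip ';'
        eq = header.find("=", i)
        if eq == -1:
            break
        name = header[i:eq].strip().lower()
        i = eq + 1
        if i < n and header[i] == '"':
            i += 1  # skip opening quote
            buf = []
            while i < n and header[i] != '"':
                if header[i] == "\\" and i + 1 < n:
                    i += 1  # quoted-pair: take the next character literally
                buf.append(header[i])
                i += 1
            i += 1  # skip closing quote
            value = "".join(buf)
            i = header.find(";", i)
        else:
            end = header.find(";", i)
            if end == -1:
                end = n
            value = header[i:end].strip()
            i = end if end < n else -1
        params.append((name, value))
    return params


def extract_boundary(content_type: str) -> str:
    """Return the single non-empty boundary, or raise ValueError."""
    values = [v for k, v in media_type_params(content_type) if k == "boundary"]
    if len(values) != 1:
        raise ValueError("expected exactly one boundary, got %d" % len(values))
    if not values[0]:
        raise ValueError("empty boundary")
    return values[0]
```

Raising on every ambiguous case corresponds to dropping the request; a more permissive mode could instead take the first value and log a warning.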
A new configuration option must be introduced
The configuration option influences all RFC-undefined HTTP content mutations with possible security issues.
- `strict` means just drop the request, i.e. if the RFC doesn't allow some parameter to be duplicated, then just drop the request and write a warning to the log.
- `transform` takes the first occurrence, ignores all the following ones, and writes a warning log message.
- `log` just writes a warning message, as both other modes do, and leaves everything as is.

It is very important to list all the cases affected by the option in the Wiki; this is crucial for debugging possible issues with web applications. Also add a description of the attacks to the Web security wiki with examples and use cases.

Tempesta must not allow multiple same-name (duplicated) parameters in a `Content-Disposition` part header.

POSTs can be pretty large and have many parameters, so we need a good string search algorithm: BM or an AVX2 matcher. We need to test the algorithm's performance on different part sizes (it must be fast for small parts as well). The matching strings can be chunked on the network or HTTP transfer encoding layers, e.g.
There are 2 chunks of size 70 and 161 and the chunks boundary is at the multipart boundary identifier. The search algorithm must store current state on a chunk boundary and continue on the next chunk.
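To illustrate keeping the search state across chunk boundaries, here is a minimal sketch of a streaming matcher. It uses Knuth-Morris-Pratt instead of the BM or AVX2 matchers mentioned above, only because KMP state is a single integer and is easy to carry between chunks; the class name `StreamMatcher` is hypothetical:

```python
from typing import List


class StreamMatcher:
    """Find all occurrences of a pattern in data fed chunk by chunk."""

    def __init__(self, pattern: bytes):
        if not pattern:
            raise ValueError("pattern must not be empty")
        self.pattern = pattern
        self.fail = self._failure(pattern)
        self.state = 0   # number of pattern bytes matched so far
        self.offset = 0  # total number of bytes consumed from the stream

    @staticmethod
    def _failure(p: bytes) -> List[int]:
        # fail[i] is the length of the longest proper prefix of p[:i+1]
        # that is also a suffix of it.
        fail = [0] * len(p)
        k = 0
        for i in range(1, len(p)):
            while k and p[i] != p[k]:
                k = fail[k - 1]
            if p[i] == p[k]:
                k += 1
            fail[i] = k
        return fail

    def feed(self, chunk: bytes) -> List[int]:
        """Consume a chunk; return stream offsets where matches start."""
        matches = []
        for b in chunk:
            while self.state and b != self.pattern[self.state]:
                self.state = self.fail[self.state - 1]
            if b == self.pattern[self.state]:
                self.state += 1
            self.offset += 1
            if self.state == len(self.pattern):
                matches.append(self.offset - len(self.pattern))
                self.state = self.fail[self.state - 1]
        return matches
```

Because `state` and `offset` live on the object, a delimiter split between two chunks is still found when the second chunk arrives, and the reported offset is relative to the start of the whole stream.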
Please check standards and implementations for `boundary` usage. Apparently it's only for `Content-Type: multipart/form-data` - web forms, which aren't so large and also have many attack vectors. So we should not get into a situation where we're scanning a large blob (e.g. a DVD image) for `boundary`, in the normal case of course. Passing wrong content is addressed by #1119.

Required for #2. Both issues care only about the `strict` (just drop bad requests) and `transform` (rewrite a request using the first occurrence to save resources) modes. We do not care about particular end-point application personalities, e.g. differences between parsing by ASP or PHP.

The functional test is described in #843 - please implement the appropriate checkbox running all the relevant POST attacks from both the links above.
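The `strict`, `transform` and `log` modes described above could behave roughly as in the sketch below when applied to duplicated parameters of a `Content-Disposition` part header. The function name, the logger name and the mode strings as Python values are assumptions made for illustration; the actual option syntax is not defined here:

```python
import logging
from typing import List, Optional, Tuple

log = logging.getLogger("post_validation")  # hypothetical logger name

Params = List[Tuple[str, str]]


def apply_duplicate_policy(params: Params, mode: str) -> Optional[Params]:
    """Handle same-name parameters according to the configured mode.

    Returns the parameters to keep, or None if the request must be dropped.
    """
    if mode not in ("strict", "transform", "log"):
        raise ValueError("unknown mode: %r" % mode)
    seen = set()
    result: Params = []
    for name, value in params:
        if name in seen:
            # All three modes write a warning.
            log.warning("duplicated parameter %r", name)
            if mode == "strict":
                return None  # drop the request
            if mode == "transform":
                continue     # keep only the first occurrence
        seen.add(name)
        result.append((name, value))
    return result
```

For example, with `[("name", "a"), ("name", "b")]` the `strict` mode returns `None`, `transform` keeps only `("name", "a")`, and `log` returns both entries unchanged.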
Testing
The appropriate test issue for the feature is tempesta-tech/tempesta-test#108