Canonicalization should take care of EOF as it would lead to XSS in IE8 or below. #42

adon-at-work · 2015-10-08T08:30:17Z

Here's a sample of an EOF in Attribute value (double-quoted) state:

hello <a href="<script>{{untrusted}}</script>

According to the spec, when the EOF is encountered. It says it's a parse error, and that compliant browser will switch to DATA state. If rendered in latest browsers like Chrome and Firefox, only hello will get rendered, and ended in DATA state. the incomplete tag is actually NOT emitted to the DOM/output.

But unfortunately older browsers like IE7-8 behaved differently, the incomplete tag gets rendered, and that the string <a href="<script>{{untrusted}}</script> is considered as begun in DATA state, and somehow transitioned into SCRIPT state.

Context parser now considers the placeholder {{untrusted}} as placed in attribute value (double-quoted) state. But it ignored the consequence of EOF. That leads the downstream project secure-handlebars to simply insert a filter equiv. to uriInDoubleQuotedAttr() for that placeholder. An attacker using alert(1) will be able to launch XSS.

The EOF problem was marked as TODO inside the source code.

@neraliu @yukinying @maditya

The text was updated successfully, but these errors were encountered:

yukinying · 2015-10-08T22:50:10Z

How did your canalization logic handle the case without EOF? i.e. Could you show me the code that you handle <a href="<script>{{untrusted}}</script>">bar</a>? (note the closing double quote)

adon-at-work · 2015-10-09T03:09:11Z

Short answer: No canonicalization needed.

tl;dr

Our current canonicalization categorizations
(a) parse error correction
(b) fixing comment precedence in RAWTEXT/RCDATA
(c) voiding IE conditional comments

The step-by-step process I check whether and how something is canonicalized:

Given the sample, it's obviously unrelated to the two categories (b) and (c). Most chars being used there looks innocuous to me except those in the attr value.
Check the spec, there're two parse errors in attr val double-quoted state
https://html.spec.whatwg.org/multipage/syntax.html#attribute-value-(double-quoted)-state
None of the chars in the sample actually could trigger parse errors. So, no canonicalizations.

In case really a char is found resulting in parse error (note, none in this sample), a further check is to see the corresponding 1:1 mapping of state-specific cases at https://github.com/yahoo/context-parser/blob/master/tests/unit/run-strict-context-parser.js#L338 and we can quickly tell how it's corrected from the example case. Further check the source code if details are needed.

also note that the sample ends in DATA state, where EOF at there is valid.

yukinying · 2015-10-09T05:25:11Z

Let me repeat my question in another form:

Have you check if IE7 and 8 exploitable with the following?

<a href="<script>{{untrusted}}</script>">bar</a>

yukinying · 2015-10-09T05:38:53Z

Our current canonicalization categorizations
(a) parse error correction
(b) fixing comment precedence in RAWTEXT/RCDATA
(c) voiding IE conditional comments

I think (a) should be "Null byte parse error correction" instead of all types of parse error, right?

adon-at-work · 2015-10-09T06:18:16Z

Have you check if IE7 and 8 exploitable with the following?
<a href="<script>alert(1)</script>">bar</a> ({{untrusted}} is swapped with alert(1))

Checked. Surely non-exploitable.

Our current canonicalization categorizations
(a) parse error correction
(b) fixing comment precedence in RAWTEXT/RCDATA
(c) voiding IE conditional comments

I think (a) should be "Null byte parse error correction" instead of all types of parse error, right?

No. Here're some non-NULL kind of parse error corrections:
https://html.spec.whatwg.org/multipage/syntax.html#attribute-value-(unquoted)-state
https://github.com/yahoo/context-parser/blob/master/tests/unit/run-strict-context-parser.js#L350

yukinying changed the title from Handling parse errors due to EOF to Canonicalization should take care of EOF as it would lead to XSS in IE8 or below

It may not be appropriate to immediately jump to a title like using canonicalization to handle EOF before any discussions on the mitigation side.

My two cents:

My first reaction is that in fact EOF is unlike others, it's NOT a character. This naturally cannot fit into the existing Canonicalize() model.
And a quick fix to this problem could be like just warn the user when the HTML does not end in DATA state.
... may be there's an intermediate fix...
A complete fix may call for the approach to emit valid tags only until > is encountered to transition back to DATA state. This is what I've perceived as fusion with the html-purify() approach. So basically incomplete tags <... at the end won't get into the output, just like how the latest Chrome/Firefox behave. This takes time though. c.c. @maditya

yukinying · 2015-10-09T18:18:26Z

It may not be appropriate to immediately jump to a title like using canonicalization to handle EOF before any discussions on the mitigation side.

The current implementation of canonicalization is unfit to handle EOF, and it does not necessarily mean it should not be in scope of canonicalization. Indeed this should give you some hints that the way Canonicalize() is designed is not sufficient and some rewrite of it is unavoidable.

adon-at-work added the bug label Oct 8, 2015

adon-at-work changed the title ~~Solving parse errors due to EOF~~ Handling parse errors due to EOF Oct 8, 2015

yukinying changed the title ~~Handling parse errors due to EOF~~ Canonicalization should take care of EOF as it would lead to XSS in IE8 or below. Oct 9, 2015

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Canonicalization should take care of EOF as it would lead to XSS in IE8 or below. #42

Canonicalization should take care of EOF as it would lead to XSS in IE8 or below. #42

adon-at-work commented Oct 8, 2015

yukinying commented Oct 8, 2015

adon-at-work commented Oct 9, 2015

yukinying commented Oct 9, 2015

yukinying commented Oct 9, 2015

adon-at-work commented Oct 9, 2015

yukinying commented Oct 9, 2015

Canonicalization should take care of EOF as it would lead to XSS in IE8 or below. #42

Canonicalization should take care of EOF as it would lead to XSS in IE8 or below. #42

Comments

adon-at-work commented Oct 8, 2015

yukinying commented Oct 8, 2015

adon-at-work commented Oct 9, 2015

yukinying commented Oct 9, 2015

yukinying commented Oct 9, 2015

adon-at-work commented Oct 9, 2015

yukinying commented Oct 9, 2015