Jonathan edited this page Jun 7, 2022 · 7 revisions
Interfaces: IHtmlSanitizer
Removes constructs that can lead to XSS attacks from HTML documents and fragments.
XSS attacks can occur at several levels within an HTML document or fragment:
- HTML tags (e.g. the <script> tag)
- HTML attributes (e.g. the "onload" attribute)
- CSS styles (e.g. url() property values)
- malformed HTML or HTML that exploits parser bugs in specific browsers
The HtmlSanitizer class addresses all of these possible attack vectors by using a sophisticated HTML parser (AngleSharp).
In order to facilitate different use cases, HtmlSanitizer can be customized at the levels mentioned above:
- You can specify the allowed HTML tags through the property AllowedTags. All other tags will be stripped.
- You can specify the allowed HTML attributes through the property AllowedAttributes. All other attributes will be stripped.
- You can specify the allowed CSS property names through the property AllowedCssProperties. All other styles will be stripped.
- You can specify the allowed URI schemes through the property AllowedSchemes. All other URIs will be stripped.
- You can specify the HTML attributes that contain URIs (such as "src", "href" etc.) through the property UriAttributes.
var sanitizer = new HtmlSanitizer();
var html = @"<script>alert('xss')</script><div onload=""alert('xss')"" style=""background-color: test"">Test<img src=""test.gif"" style=""background-image: url(javascript:alert('xss')); margin: 10px""></div>";
var sanitized = sanitizer.Sanitize(html, "http://www.example.com");
// -> "<div style="background-color: test">Test<img style="margin: 10px" src="http://www.example.com/test.gif"></div>"
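The customization points listed above can be sketched as follows. This is a minimal illustration, not the library's recommended configuration: the class and method names (`CustomizationSketch`, `SanitizeComment`) are hypothetical, the `Ganss.Xss` namespace is an assumption that depends on the package version (older releases use `Ganss.XSS`), and the collections are modified in place rather than reassigned because some releases expose them as get-only.

```csharp
using Ganss.Xss; // assumption: newer package namespace; older releases use Ganss.XSS

// Hypothetical helper, used only for illustration.
public static class CustomizationSketch
{
    public static string SanitizeComment(string html)
    {
        var sanitizer = new HtmlSanitizer();

        // Narrow the tag allow-list: <img> elements will now be stripped.
        sanitizer.AllowedTags.Remove("img");

        // Widen the attribute allow-list: keep "class" attributes.
        sanitizer.AllowedAttributes.Add("class");

        // Permit mailto: links in addition to the schemes already allowed.
        sanitizer.AllowedSchemes.Add("mailto");

        // Remove a CSS property from the allow-list (no effect if it is not present).
        sanitizer.AllowedCssProperties.Remove("background-image");

        return sanitizer.Sanitize(html);
    }
}
```

Calling `CustomizationSketch.SanitizeComment("<p class=\"note\">Hi <img src=\"x.gif\"></p>")` should keep the paragraph and its `class` attribute while dropping the image.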
- i = instance
- s = static
Name | Description |
---|---|
PostProcessNode(PostProcessNodeEventArgs) | Occurs for every node after sanitizing. |
RemovingTag(RemovingTagEventArgs) | Occurs before a tag is removed. |
RemovingAttribute(RemovingAttributeEventArgs) | Occurs before an attribute is removed. |
RemovingStyle(RemovingStyleEventArgs) | Occurs before a style is removed. |
RemovingAtRule(RemovingAtRuleEventArgs) | Occurs before an at-rule is removed. |
RemovingComment(RemovingCommentEventArgs) | Occurs before a comment is removed. |
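As a rough sketch of how the events in this table might be used, the example below logs removed tags and vetoes the removal of one attribute. The helper names (`EventSketch`, `SanitizeWithLog`) are hypothetical. It assumes the event args expose `Tag` and `Attribute` as AngleSharp DOM objects and inherit a settable `Cancel` flag, which holds for recent releases but may differ in older ones.

```csharp
using System.Collections.Generic;
using Ganss.Xss; // assumption: newer package namespace; older releases use Ganss.XSS

// Hypothetical helper, used only for illustration.
public static class EventSketch
{
    public static (string Html, List<string> Removed) SanitizeWithLog(string html)
    {
        var sanitizer = new HtmlSanitizer();
        var removed = new List<string>();

        // Record the name of each tag the sanitizer is about to remove.
        sanitizer.RemovingTag += (sender, e) => removed.Add("tag:" + e.Tag.LocalName);

        // Record removed attributes, but cancel the removal of "class".
        sanitizer.RemovingAttribute += (sender, e) =>
        {
            if (e.Attribute.Name == "class")
                e.Cancel = true; // keep the attribute
            else
                removed.Add("attr:" + e.Attribute.Name);
        };

        return (sanitizer.Sanitize(html), removed);
    }
}
```

Setting `Cancel` in a handler is a per-occurrence override, so it suits one-off exceptions; for a blanket rule, adding the name to the corresponding allow-list is simpler.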
Scope | Name | Description |
---|---|---|
i | Sanitize(String) | Sanitizes the specified HTML body fragment. If a document is given, only the body part will be returned. |
i | Sanitize(String, String) | Sanitizes the specified HTML body fragment. If a document is given, only the body part will be returned. Relative URLs will be resolved against the baseUrl parameter. |
i | Sanitize(String, String, IMarkupFormatter) | Sanitizes the specified HTML body fragment. If a document is given, only the body part will be returned. Relative URLs will be resolved against the baseUrl parameter. Sanitized output will be formatted using the outputFormatter parameter. |
i | SanitizeDocument(String) | Sanitizes the specified HTML document. Even if only a fragment is given, a whole document will be returned. |
i | SanitizeDocument(String, String) | Sanitizes the specified HTML document. Even if only a fragment is given, a whole document will be returned. Relative URLs will be resolved against the baseUrl parameter. |
i | SanitizeDocument(String, String, IMarkupFormatter) | Sanitizes the specified HTML document. Even if only a fragment is given, a whole document will be returned. Relative URLs will be resolved against the baseUrl parameter. Sanitized output will be formatted using the outputFormatter parameter. |
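The difference between the fragment and document overloads can be illustrated with a small sketch. The wrapper names (`MethodSketch`, `Fragment`, `Document`, `FragmentWithFormatter`) are hypothetical, the base URL is just the example value used earlier on this page, and `HtmlMarkupFormatter.Instance` is assumed to come from AngleSharp's `AngleSharp.Html` namespace.

```csharp
using AngleSharp.Html; // assumption: location of HtmlMarkupFormatter in AngleSharp
using Ganss.Xss;       // assumption: newer package namespace; older releases use Ganss.XSS

// Hypothetical wrappers, used only for illustration.
public static class MethodSketch
{
    private const string BaseUrl = "http://www.example.com";

    // Returns only the sanitized body content, even if a full document is passed in.
    public static string Fragment(string html) =>
        new HtmlSanitizer().Sanitize(html, BaseUrl);

    // Returns a whole document, even if only a fragment is passed in.
    public static string Document(string html) =>
        new HtmlSanitizer().SanitizeDocument(html, BaseUrl);

    // Same as Fragment, but with the output formatter passed explicitly.
    public static string FragmentWithFormatter(string html) =>
        new HtmlSanitizer().Sanitize(html, BaseUrl, HtmlMarkupFormatter.Instance);
}
```

`Fragment` is the usual choice for user-supplied snippets embedded in a page; `Document` fits cases where the result is stored or served as a standalone HTML file.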
Name | Description |
---|---|
AllowDataAttributes | Gets or sets a boolean value for allowing all HTML5 data attributes (attributes prefixed with data-). |
AllowedAtRules | Gets or sets the allowed CSS at-rules such as "@media" and "@font-face". |
AllowedAttributes | Gets or sets the allowed HTML attributes such as "href" and "alt". |
AllowedCssProperties | Gets or sets the allowed CSS properties such as "font" and "margin". |
AllowedSchemes | Gets or sets the allowed URI schemes such as "http" and "https". |
AllowedTags | Gets or sets the allowed HTML tag names such as "a" and "div". |
DefaultHtmlParserFactory | Gets or sets the default Func object that creates the parser used for parsing the input. |
DefaultKeepChildNodes | Gets or sets the default value indicating whether to keep child nodes of elements that are removed. Default is false. |
DefaultOutputFormatter | Gets or sets the default IMarkupFormatter object used for generating output. Default is HtmlMarkupFormatter.Instance. |
DisallowCssPropertyValue | Gets or sets a regex that must not match for legal CSS property values. |
HtmlParserFactory | Gets or sets the Func object that creates the parser used for parsing the input. |
KeepChildNodes | Gets or sets a value indicating whether to keep child nodes of elements that are removed. Default is DefaultKeepChildNodes. |
OutputFormatter | Gets or sets the IMarkupFormatter object used for generating output. Default is DefaultOutputFormatter. |
UriAttributes | Gets or sets the HTML attributes that can contain a URI such as "href". |
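To make two of the boolean properties above concrete, here is a minimal sketch that keeps the content of removed elements and allows `data-` attributes. The helper names (`PropertySketch`, `SanitizeKeepingContent`) are hypothetical, and the sketch assumes both properties are settable on the instance in the release you use.

```csharp
using Ganss.Xss; // assumption: newer package namespace; older releases use Ganss.XSS

// Hypothetical helper, used only for illustration.
public static class PropertySketch
{
    public static string SanitizeKeepingContent(string html)
    {
        var sanitizer = new HtmlSanitizer
        {
            // Keep the children of disallowed elements instead of dropping them with the tag.
            KeepChildNodes = true,
            // Let all data-* attributes through.
            AllowDataAttributes = true
        };

        return sanitizer.Sanitize(html);
    }
}
```

With an input such as `<span data-id="7"><foo>inner</foo></span>`, the unknown `foo` element should be removed while its text and the `data-id` attribute survive. Note that KeepChildNodes only preserves child nodes; the children are still sanitized themselves.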
Name | Description |
---|---|
DefaultAllowedAtRules | The default allowed CSS at-rules. |
DefaultAllowedSchemes | The default allowed URI schemes. |
DefaultAllowedTags | The default allowed HTML tag names. |
DefaultAllowedAttributes | The default allowed HTML attributes. |
DefaultUriAttributes | The default URI attributes. |
DefaultAllowedCssProperties | The default allowed CSS properties. |
DefaultDisallowedCssPropertyValue | The default regex for disallowed CSS property values. |