Add support for native xml parsing. #4252
Merged
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
This PR adds support for XML parsing and rendering.
You can parse XML strings using a decorator and type annotation. There are two different representations of XML you can use.
Record access representation
This is the easiest representation. It is intended for quick access to parsed value via
record and tuples.
Here's an example:
Things to note:
tag name -> tag content
xml_params
bar
tag in the example)string
,bool
,float
orinteger
), the mapping is from tag name to the corresponding value, with xml attributes attached as methodsThe parsing is driven by the type annotation and is intended to be permissive. For instance, this will work:
Formal representation
Because XML format can result in complex values, the parser can also use a generic representation.
Here's an example:
This representation is much less convenient to manipulate but allows an exact representation of all XML values.
Things to note:
(<tag name>, <tag properties>)
<tag properties>
contains the following:xml_params
, represented as a list of pairs(string * string)
xml_children
, containing an array of the XML node's children.xml_text
, present when the node is a text node. In this case,xml_params
orxm_children
are empty.xml_text
.Rendering XML values
XML values can be converted back to strings using
xml.stringify
.Both the formal and record-access form can be rendered back into XML strings however, with the record-access representations, if a node has multiple children with the same tag, the conversion to XML string will fail.
More generally, if the values you want to convert to XML strings are complex, for instance if they use several times the same tag as child node or if the order of child nodes matters, we recommend using the formal representation to make sure that children ordering is properly preserved.
This is because record methods are not ordered in the language so we make no guarantee that the child nodes they represent be rendered in a specific order.