Add support for native xml parsing. #4252

Merged on Dec 12, 2024 (14 commits)
1 change: 1 addition & 0 deletions .github/opam/liquidsoap-core-windows.opam
@@ -35,6 +35,7 @@ depends: [
"fileutils"
"fileutils-windows"
"curl-windows"
"xml-light-windows"
"mem_usage-windows" {>= "0.1.1"}
"metadata-windows" {>= "0.3.0"}
"dune-site-windows"
2 changes: 2 additions & 0 deletions .github/scripts/build-posix.sh
@@ -39,6 +39,8 @@ echo "::endgroup::"

echo "::group::Setting up specific dependencies"

opam install -y xml-light

cd /tmp/liquidsoap-full/liquidsoap

./.github/scripts/checkout-deps.sh
2 changes: 1 addition & 1 deletion .github/workflows/ci.yml
@@ -121,7 +121,7 @@ jobs:
cd /tmp/liquidsoap-full/liquidsoap
eval "$(opam config env)"
opam update
opam install -y saturn_lockfree.0.4.1
opam install -y xml-light
dune build --profile release ./src/js/interactive_js.bc.js

tree_sitter_parse:
2 changes: 1 addition & 1 deletion .pre-commit-config.yaml
@@ -26,7 +26,7 @@ repos:
exclude: dune.inc

- repo: https://github.com/savonet/pre-commit-liquidsoap
rev: c5eab8dceed09fa985b3cf0ba3fe7f398fc00c04
rev: 056cf2da9d985e1915a069679f126a461206504a
hooks:
- id: liquidsoap-prettier

1 change: 1 addition & 0 deletions CHANGES.md
@@ -2,6 +2,7 @@

New:

- Added support for parsing and rendering XML natively (#4252)
- Added support for `WAVE_FORMAT_EXTENSIBLE` to the internal
wav decoder.
- Added optional `buffer_size` parameter to `input.alsa` and
160 changes: 160 additions & 0 deletions doc/content/xml.md
@@ -0,0 +1,160 @@
## Importing/exporting XML values

Support for XML parsing and rendering was first added in liquidsoap `2.3.1`.

You can parse XML strings using a decorator and type annotation. There are two different representations of XML you can use.

### Record access representation

This is the easiest representation. It is intended for quick access to parsed values via
records and tuples.

Here's an example:

```liquidsoap
s =
'<bla param="1" bla="true">
<foo opt="12.3">gni</foo>
<bar />
<bar>bla</bar>
<blo>1.23</blo>
<blu>false</blu>
<ble>123</ble>
</bla>'

let xml.parse (x :
{
bla: {
foo: string.{ xml_params: {opt: float} },
bar: (unit * string),
blo: float,
blu: bool,
ble: int,
xml_params: { bla: bool }
}
}
) = s

print("The value for ble is: #{x.bla.ble}")
```

Things to note:

- The basic mapping is: `<tag name> -> <tag content>`
- Tag parameters are exposed through an `xml_params` method attached to the tag content.
- When a tag appears multiple times, its values are collected as a tuple (the `bar` tag in the example).
- When a tag contains a single ground value (`string`, `bool`, `float` or `int`), the tag name maps directly to that value, with the XML attributes attached as methods.
- Tag parameters can be converted to ground values, and any of them can be omitted from the type annotation.
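
As a small sketch of the notes above, the following parses a shorter document and reads an attribute and a repeated tag. The XML string and the attribute name `ok` are made up for illustration; only the `let xml.parse` decorator and the `xml_params` method come from this page.

```liquidsoap
s = '<bla ok="true"><foo opt="12.3">gni</foo><bar /><bar>bla</bar></bla>'

let xml.parse (x :
  {
    bla: {
      foo: string.{ xml_params: {opt: float} },
      bar: (unit * string),
      xml_params: { ok: bool }
    }
  }
) = s

# Attributes are read through the `xml_params` method.
print("opt is: #{x.bla.foo.xml_params.opt}")
print("ok is: #{x.bla.xml_params.ok}")

# Repeated tags come back as a tuple, one component per occurrence.
print("second bar is: #{snd(x.bla.bar)}")
```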

The parsing is driven by the type annotation and is intended to be permissive. For instance, this will work:

```liquidsoap
s = '<bla>foo</bla>'

# Here, `foo` is omitted.
let xml.parse (x: { bla: unit }) = s

# x contains: { bla = () }

# Here, `foo` is made optional
let xml.parse (x: { bla: string? }) = s

# x contains: { bla = "foo" }
```

### Formal representation

Because the XML format can produce complex values, the parser also supports a generic representation.

Here's an example:

```liquidsoap
s =
'<bla param="1" bla="true">
<foo opt="12.3">gni</foo>
<bar />
<bar>bla</bar>
<blo>1.23</blo>
<blu>false</blu>
<ble>123</ble>
</bla>'

let xml.parse (x :
(
string
*
{
xml_params: [(string * string)],
xml_children: [
(
string
*
{
xml_params: [(string * string)],
xml_children: [(string * {xml_text: string})]
}
)
]
}
)
) = s

# x contains:
(
"bla",
{
xml_children=
[
(
"foo",
{
xml_children=[("xml_text", {xml_text="gni"})],
xml_params=[("opt", "12.3")]
}
),
("bar", {xml_children=[], xml_params=[]}),
(
"bar",
{
xml_children=[("xml_text", {xml_text="bla"})],
xml_params=[]
}
),
(
"blo",
{xml_children=[("xml_text", {xml_text="1.23"})], xml_params=[]}
),
(
"blu",
{xml_children=[("xml_text", {xml_text="false"})], xml_params=[]}
),
(
"ble",
{xml_children=[("xml_text", {xml_text="123"})], xml_params=[]}
)
],
xml_params=[("param", "1"), ("bla", "true")]
}
)
```

This representation is much less convenient to manipulate but allows an exact representation of all XML values.

Things to note:

- XML nodes are represented by a pair of the form: `(<tag name>, <tag properties>)`
- `<tag properties>` is a record containing the following methods:
  - `xml_params`, represented as a list of pairs `(string * string)`
  - `xml_children`, containing a list of the XML node's children. Each entry in the list is a node in the formal XML representation.
  - `xml_text`, present when the node is a text node. In this case, `xml_params` and `xml_children` are empty.
- By convention, text nodes are labelled `xml_text` and are of the form: `{ xml_text: "node content" }`
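
To make the structure described above concrete, here is a minimal sketch that walks a value in the formal representation. The XML string and variable names are illustrative; `fst`, `snd` and `list.iter` are standard library functions, and the type annotation is the same as in the example earlier in this section.

```liquidsoap
s = '<bla param="1"><foo>gni</foo><bar>bla</bar></bla>'

let xml.parse (x :
  (
    string
    *
    {
      xml_params: [(string * string)],
      xml_children: [
        (
          string
          *
          {
            xml_params: [(string * string)],
            xml_children: [(string * {xml_text: string})]
          }
        )
      ]
    }
  )
) = s

let (name, props) = x
print("root tag: #{name}")

# Attributes are plain `(name, value)` string pairs.
list.iter(fun (p) -> print("attribute #{fst(p)} = #{snd(p)}"), props.xml_params)

# Each child is itself a `(tag name, tag properties)` pair.
list.iter(fun (child) -> print("child tag: #{fst(child)}"), props.xml_children)
```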

### Rendering XML values

XML values can be converted back to strings using `xml.stringify`.

Both the formal and record-access forms can be rendered back into XML strings. However, with the record-access representation, the conversion to an XML string will fail if a node has multiple children with the same tag.

More generally, if the values you want to convert to XML strings are complex, for instance if they use the same tag several times as a child node or if the order of child nodes matters, we recommend using the formal representation to make sure that the ordering of children is properly preserved.

This is because record methods are not ordered in the language, so we make no guarantee that the child nodes they represent will be rendered in a specific order.
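
As a hedged sketch of the recommendation above, the following builds a value in the formal representation by hand, with two `bar` children in a fixed order, and renders it. It assumes that `xml.stringify` takes the value as its single unlabelled argument; the tag names and contents are made up for illustration.

```liquidsoap
x =
  (
    "bla",
    {
      xml_params=[("param", "1")],
      xml_children=[
        ("bar", {xml_params=[], xml_children=[("xml_text", {xml_text="first"})]}),
        ("bar", {xml_params=[], xml_children=[("xml_text", {xml_text="second"})]})
      ]
    }
  )

# Children are stored in a list, so their order is part of the value.
print(xml.stringify(x))
```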
129 changes: 129 additions & 0 deletions doc/dune.inc
@@ -9281,6 +9281,134 @@
)
)

(rule
(alias doc)
(package liquidsoap)
(enabled_if (not %{bin-available:pandoc}))
(deps (:no_pandoc no-pandoc))
(target xml.html)
(action (run cp %{no_pandoc} %{target}))
)

(rule
(alias doc)
(package liquidsoap)
(enabled_if %{bin-available:pandoc})
(deps
liquidsoap.xml
language.dtd
template.html
content/liq/append-silence.liq
content/liq/archive-cleaner.liq
content/liq/basic-radio.liq
content/liq/beets-amplify.liq
content/liq/beets-protocol-short.liq
content/liq/beets-protocol.liq
content/liq/beets-source.liq
content/liq/blank-detect.liq
content/liq/blank-sorry.liq
content/liq/complete-case.liq
content/liq/cross.custom.liq
content/liq/crossfade.liq
content/liq/decoder-faad.liq
content/liq/decoder-flac.liq
content/liq/decoder-metaflac.liq
content/liq/dump-hourly.liq
content/liq/dump-hourly2.liq
content/liq/dynamic-source.liq
content/liq/external-output.file.liq
content/liq/fallback.liq
content/liq/ffmpeg-filter-dynamic-volume.liq
content/liq/ffmpeg-filter-flanger-highpass.liq
content/liq/ffmpeg-filter-hflip.liq
content/liq/ffmpeg-filter-hflip2.liq
content/liq/ffmpeg-filter-parallel-flanger-highpass.liq
content/liq/ffmpeg-live-switch.liq
content/liq/ffmpeg-relay-ondemand.liq
content/liq/ffmpeg-relay.liq
content/liq/ffmpeg-shared-encoding-rtmp.liq
content/liq/ffmpeg-shared-encoding.liq
content/liq/fixed-time1.liq
content/liq/fixed-time2.liq
content/liq/frame-size.liq
content/liq/harbor-auth.liq
content/liq/harbor-dynamic.liq
content/liq/harbor-insert-metadata.liq
content/liq/harbor-metadata.liq
content/liq/harbor-redirect.liq
content/liq/harbor-simple.liq
content/liq/harbor-usage.liq
content/liq/harbor.http.register.liq
content/liq/harbor.http.response.liq
content/liq/hls-metadata.liq
content/liq/hls-mp4.liq
content/liq/http-input.liq
content/liq/icy-update.liq
content/liq/input.mplayer.liq
content/liq/jingle-hour.liq
content/liq/json-ex.liq
content/liq/json-stringify.liq
content/liq/json1.liq
content/liq/live-switch.liq
content/liq/medialib-predicate.liq
content/liq/medialib.liq
content/liq/medialib.sqlite.liq
content/liq/multitrack-add-video-track.liq
content/liq/multitrack-add-video-track2.liq
content/liq/multitrack-default-video-track.liq
content/liq/multitrack.liq
content/liq/multitrack2.liq
content/liq/multitrack3.liq
content/liq/output.file.hls.liq
content/liq/playlists.liq
content/liq/prometheus-callback.liq
content/liq/prometheus-settings.liq
content/liq/radiopi.liq
content/liq/re-encode.liq
content/liq/regular.liq
content/liq/replaygain-metadata.liq
content/liq/replaygain-playlist.liq
content/liq/request.dynamic.liq
content/liq/rtmp.liq
content/liq/samplerate3.liq
content/liq/scheduling.liq
content/liq/seek-telnet.liq
content/liq/settings.liq
content/liq/shoutcast.liq
content/liq/single.liq
content/liq/source-cue.liq
content/liq/space_overhead.liq
content/liq/split-cue.liq
content/liq/sqlite.liq
content/liq/srt-receiver.liq
content/liq/srt-sender.liq
content/liq/switch-show.liq
content/liq/transcoding.liq
content/liq/video-anonymizer.liq
content/liq/video-bluescreen.liq
content/liq/video-canvas-example.liq
content/liq/video-default-canvas.liq
content/liq/video-in-video.liq
content/liq/video-logo.liq
content/liq/video-osc.liq
content/liq/video-simple.liq
content/liq/video-static.liq
content/liq/video-text.liq
content/liq/video-transition.liq
content/liq/video-weather.liq
content/liq/video-webcam.liq
(:md content/xml.md)
)
(target xml.html)
(action
(pipe-stdout
(run pandoc %{md} -t json)
(run pandoc-include --directory content/liq)
(run pandoc -f json --syntax-definition=liquidsoap.xml --highlight=pygments --metadata pagetitle=xml --template=template.html -o %{target})
)
)
)

(rule
(alias doc)
(package liquidsoap)
@@ -10496,6 +10624,7 @@
(strings_encoding.html as html/strings_encoding.html)
(video-static.html as html/video-static.html)
(video.html as html/video.html)
(xml.html as html/xml.html)
(yaml.html as html/yaml.html)
)
)
1 change: 1 addition & 0 deletions dune-project
@@ -156,6 +156,7 @@
(ppx_hash :build)
(sedlex (>= 3.2))
(menhir (>= 20240715))
xml-light
)
(sites (share libs) (share bin) (share cache) (lib_root lib_root))
(synopsis "Liquidsoap language library"))
1 change: 1 addition & 0 deletions liquidsoap-lang.opam
@@ -17,6 +17,7 @@ depends: [
"ppx_hash" {build}
"sedlex" {>= "3.2"}
"menhir" {>= "20240715"}
"xml-light"
"odoc" {with-doc}
]
build: [
10 changes: 10 additions & 0 deletions src/lang/builtins_string.ml
@@ -29,6 +29,16 @@ let _ =
let s2 = Lang.to_string (Lang.assoc "" 2 p) in
Lang.string (s1 ^ s2))

let _ =
Lang.add_builtin ~base:string "compare" ~category:`String
~descr:"Compare strings in lexicographical order."
[("", Lang.string_t, None, None); ("", Lang.string_t, None, None)]
Lang.int_t
(fun p ->
let s1 = Lang.to_string (Lang.assoc "" 1 p) in
let s2 = Lang.to_string (Lang.assoc "" 2 p) in
Lang.int (String.compare s1 s2))

let _ =
Lang.add_builtin ~base:string "digest" ~category:`String
~descr:"Return an MD5 digest for the given string."