malformed XML generated from malformed marc #100

Open
jakub-id opened this issue Nov 7, 2023 · 0 comments

We've received records that contain the subfield delimiter (0x1F) inside subfield contents. Although this violates the MARC spec, yaz-marcdump processes such a record without error. The generated line and JSON representations are valid, but the XML output is malformed:

yaz-marcdump -f marc8 -o marcxml malformed.mrc | xmllint --noout -
-:148: parser error : Unescaped '<' not allowed in attributes values
    <subfield code="树上又有什么让人惊讶的景象呢</subfield>
                                                              ^
-:148: parser error : attributes construct error
    <subfield code="树上又有什么让人惊讶的景象呢</subfield>
                                                              ^
-:148: parser error : Couldn't find end of Start Tag subfield line 148
    <subfield code="树上又有什么让人惊讶的景象呢</subfield>
                                                              ^
-:148: parser error : Opening and ending tag mismatch: datafield line 145 and subfield
    <subfield code="树上又有什么让人惊讶的景象呢</subfield>
                                                                         ^
-:150: parser error : Opening and ending tag mismatch: record line 2 and datafield
  </datafield>
              ^
-:200: parser error : Opening and ending tag mismatch: collection line 1 and record
</record>
         ^
-:201: parser error : Extra content at the end of the document
</collection>
^

Expected result: yaz-marcdump still processes such records, but emits a warning and generates well-formed XML.

malformed.mrc.gz
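
Until this is handled in yaz-marcdump, a pre-check along the following lines might help spot affected records before conversion. This is only a rough sketch and not part of YAZ: `find_suspect_subfields` is a hypothetical helper name, and it assumes that subfield codes are single ASCII lowercase letters or digits, that the directory offsets are intact, and that 0x1D only ever occurs as the record terminator. A stray 0x1F inside the contents then shows up as a "subfield" whose code is empty or not a valid code byte.

```python
import re
import sys

RECORD_TERMINATOR = b"\x1d"
FIELD_TERMINATOR = b"\x1e"
SUBFIELD_DELIMITER = b"\x1f"
# Assumption: valid subfield codes are single ASCII lowercase letters or digits.
VALID_CODE = re.compile(rb"[0-9a-z]")


def find_suspect_subfields(data):
    """Yield (record_index, tag, subfield_index) for each suspect subfield."""
    for rec_no, record in enumerate(data.split(RECORD_TERMINATOR)):
        if len(record) < 24:
            continue  # trailing bytes or a truncated record
        try:
            base = int(record[12:17])  # leader 12-16: base address of data
        except ValueError:
            continue  # unreadable leader; out of scope for this sketch
        directory = record[24:base - 1]  # 12-byte entries, minus terminator
        for i in range(0, len(directory) - 11, 12):
            entry = directory[i:i + 12]
            tag = entry[:3].decode("ascii", "replace")
            if tag.startswith("00"):
                continue  # control fields have no subfields
            try:
                length, start = int(entry[3:7]), int(entry[7:12])
            except ValueError:
                continue  # unreadable directory entry
            field = record[base + start:base + start + length]
            parts = field.rstrip(FIELD_TERMINATOR).split(SUBFIELD_DELIMITER)
            # parts[0] holds the indicators; every later part is one subfield.
            for sub_no, part in enumerate(parts[1:], start=1):
                if not VALID_CODE.match(part[:1]):
                    yield rec_no, tag, sub_no


if __name__ == "__main__" and len(sys.argv) > 1:
    with open(sys.argv[1], "rb") as handle:
        for rec_no, tag, sub_no in find_suspect_subfields(handle.read()):
            print(f"record {rec_no}, field {tag}, subfield {sub_no}: suspect code")
```

Run against the uncompressed file (for example, passing `malformed.mrc` as the only argument), it prints one line per suspect subfield, with records counted from 0 and subfields from 1. It only reports; it does not repair the record or touch the XML output.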
