<html> element issue in XHTML document generation from an ODD file #683

Open
daydin opened this issue May 10, 2024 · 10 comments
Labels
status: needsDiscussion Council has not yet been able to agree on how to proceed. type: bug A bug report.

Comments

@daydin

daydin commented May 10, 2024

Hi all!

I've been working with an ODD file to generate documentation alongside my Relax NG schema. While the schema generates just fine, the resulting documentation (HTML format) seems to have validation errors of the following sort:
value of attribute "href" is invalid; must be a string matching the regular expression "[ -]*" or must be a URI

because some of the URLs have not been replaced. A sample broken URL replacement looks like this:

                    <div class="refdoc" id="TEI.xh_body">
                        <h3><span class="headingNumber">Appendix A.1.2 </span><span class="head"
                                >&lt;body&gt;</span></h3>
                        <div class="table">
                            <table class="wovenodd">
                                <tr>
                                    ...
                                </tr>
                                <tr>
                                    ...
                                </tr>
                                <tr>
                                    ...
                                </tr>
                                <tr>
                                    <td class="wovenodd-col1"><span xml:lang="en" lang="en"
                                            class="label">Contained by</span></td>
                                    <td class="wovenodd-col2"><div class="parent">
                                            <div class="specChildren">
                                                <div class="specChild">
                                                  <span class="specChildModule">dniHtml: </span>
                                                  <span class="specChildElements"><a
                                                  class="link_odd_elementSpec"
                                                  href="[[undefined TEI.html]]">html</a></span>
                                                </div>
                                            </div>
                                        </div></td>
                                </tr>
                                <tr>
                                    ...
                                </tr>
                                <tr>
                                    ...
                                </tr>
                                <tr>
                                    ...
                                </tr>
                            </table>
                        </div>
                    </div>

and I am defining the <html> element as below:

              <elementSpec ident="html" ns="http://www.w3.org/1999/xhtml" module="dniHtml" >
                  <gloss>The html element, as required by XHTML specification.</gloss>
                  <desc>This is used as the root element.</desc>
                  <classes>
                     <memberOf key="att.identifiable"/>
                     <memberOf key="att.hasTitle"/>
                  </classes>
                  <content>
                     <sequence>
                        <alternate minOccurs="1" maxOccurs="unbounded">
                           <elementRef key="head"/>
                           <elementRef key="body"/>
                        </alternate>
                        <!-- We allow script here for staticSearch overrides. -->
                        <elementRef key="script" minOccurs="0" maxOccurs="unbounded"/>
                     </sequence>
                  </content>
               </elementSpec>

I've found a workaround solution, which was to add mode="change" in the <html> elementSpec. I don't really understand why this works though, unless there is a pre-existing <html> definition hiding somewhere outside of my ODD file. I confirmed that my ODD file only has the one <html> definition as I pasted above. Any insight is welcome!

@sydb
Member

sydb commented May 11, 2024

Hmmm … some questions:

  • Is this a stand-alone ODD file or a customization ODD file?
  • How do you go about generating the HTML from it?
  • Is <TEI> defined?

Years ago I defined a subset of HTML in ODD just to demonstrate generating a complete markup language in ODD. (I created a version of John Cowan’s cleverly named Itsy Bitsy Teeny Weeny Simple HyperText DTD, except, of course, I do not actually use the DTD, I use the RNG.) It does this by using an ODD that does have a <schemaSpec> but does not have any <moduleRef>s. (Well, I just looked, and it turns out it actually has a reference to the "tei" module, but I am not sure why — I do not think it makes any useful difference.) It has no problems with the @href attributes in the HTML. (But it may not do what yours does, and it does have problems.)

@martindholmes
Contributor

@sydb This is a customization ODD which @daydin will post shortly. It's derived from lots of other examples I've had in projects for years, and we're seeing the same problem in all these projects. This is something which has recently happened in the Stylesheets, possibly only in dev (I use the bleeding-edge Oxygen plugin and I think @daydin does too). So I think this is a new regression which results from recent work in the Stylesheets.

Adding @mode="add" does not solve the problem (which is expected, because that's the default).

@daydin
Author

daydin commented May 13, 2024

Here is the ODD file itself. I had to rename the extension to pass it through GitHub's extension checker.
dni.odd.txt
I'm still fairly new to this, so I don't know what "stand-alone" or "customization" ODD files mean, but @martindholmes chimed in already. This is an ODD file that is meant to define a Relax NG schema and also to auto-generate XHTML documentation using Oxygen's internal tooling. (See attached.)
Screenshot 2024-05-13 at 8 46 22 AM
Finally, <TEI> is defined in the file.

@joeytakeda
Contributor

This is interesting! Not sure what is going on here yet, but I can confirm that internal links to specification elements are broken in the processed.odd (at least when running ./bin/teitohtml --odd --debug dni.odd). However, only some of these links are broken (e.g. the ones in the list of elements, but not the RNC).
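A quick way to list the unresolved links in a generated page is to scan it for href values that still carry the placeholder text. The following is only a minimal sketch using Python's standard library; the function and class names are hypothetical, and the sample string is a shortened copy of the markup quoted earlier in this thread, not real tool output.

```python
from html.parser import HTMLParser

# Fallback text seen at the start of the broken href values in this thread.
PLACEHOLDER = "[[undefined "


class UnresolvedLinkFinder(HTMLParser):
    """Collect href values of <a> elements that still hold the placeholder."""

    def __init__(self):
        super().__init__()
        self.unresolved = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value and value.startswith(PLACEHOLDER):
                self.unresolved.append(value)


def find_unresolved_links(html_text):
    """Return every href in html_text that begins with the placeholder."""
    finder = UnresolvedLinkFinder()
    finder.feed(html_text)
    finder.close()
    return finder.unresolved


# Shortened sample based on the snippets quoted in this issue.
sample = '''
<span class="specChildElements"><a class="link_odd_elementSpec"
  href="[[undefined TEI.html]]">html</a>
<a class="link_odd_elementSpec" href="#TEI.meta">meta</a></span>
'''
print(find_unresolved_links(sample))
```

This only catches the first symptom (placeholder hrefs). Element names that are emitted with no link at all would need a separate check.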

@joeytakeda
Contributor

joeytakeda commented May 14, 2024

Curiouser and curiouser: I just tested the same command (./bin/teitohtml --odd --debug dni.odd) with the last three releases (7.54.0, 7.55.0, 7.56.0) and all three exhibit the same behaviour: a handful of broken links to the html element (e.g. the [ in the <a> as @daydin reported) and then no links at all to most other elements (e.g. <span class="specChildElements">meta title</span>).

I then tested the staticSearch.odd (since it's a similar kind of customization) and the produced HTML was valid in all cases. After some guessing and checking and comparing (e.g. adding @defaultExceptions, changing the @prefix), I was able to get the right results with dni.odd simply by removing the xhtml namespace declaration (e.g. the xmlns:xh) from the file. Removing that produces proper links — e.g.:

<div class="specChild">
    <span class="specChildModule">dniHtml: </span>
    <span class="specChildElements">
        <a class="link_odd_elementSpec" href="#TEI.meta" title="&lt;meta&gt;">meta</a>
        <a class="link_odd_elementSpec" href="#TEI.title" title="&lt;title&gt;">title</a>
    </span>
</div>

(Though, it seems strange to me that the ids are all prefixed by "TEI" — I would have assumed they would have been prefixed by something like the ident or the prefix?) In any case, declaring the staticSearch namespace in the staticSearch.odd similarly creates broken links, so I think the namespace declaration is the problem (though I'm not sure why).

@martindholmes
Contributor

@joeytakeda Good catch! Along with @daydin's discovery that @mode="change" also works around it, this gives us something to go on, and two ways to avoid the problem. This is definitely relatively recent, though, because I have other projects whose HTML documentation has been validated as part of the build process with vnu.jar for years, and suddenly a regeneration of the documentation causes this breakage. It doesn't happen with TEI customizations, only apparently with other XML languages declared through ODD.

I'm intrigued that this goes back as far as 7.54; it's possible I haven't regenerated documentation for some of the older projects for that long, but it seems unlikely. The other possibility is that this has been caused by a recent change to p5subset.xml.

@daydin
Author

daydin commented May 14, 2024

I have another find. I did the transformation with Saxon 10 as follows:

java -jar saxon-he-10.jar -xsl:/Applications/Oxygen\ XML\ Editor/frameworks/tei/xml/tei/stylesheet/odds/odd2odd.xsl -s:../schema/dni.odd > dni.html
and that produced a dni.html that was fully functional. I was suspecting Saxon because the missing URL links look like broken token replacements. I am able to reproduce the issue using the Saxon HE 12 bundled in Oxygen, but using Saxon 10 from the command line works just fine. @joeytakeda, can you confirm which version of Saxon you were using when you were doing your experiments?

joeytakeda added a commit that referenced this issue May 14, 2024
Links to customizations no longer respected prefixes, so ensure those are there
joeytakeda added a commit that referenced this issue May 14, 2024
Make sure all members of classSpecs also get their prefixes
@joeytakeda
Contributor

joeytakeda commented May 14, 2024

@martindholmes — out of curiosity, do all of those ODDs also define the xh namespace on the root TEI (as well as the xh_ @prefix?)

And thanks @daydin — The Saxon I used was the one bundled with the Stylesheets (which is also Saxon10HE), but I'm not sure how you were able to get the proper html file from odd2odd.xsl since that stylesheet just produces the "standalone" ODD and not an HTML document. And I believe the token replacement [[undefined TEI.html]] comes from html.xsl:

Stylesheets/html/html.xsl

Lines 793 to 805 in 69faae1

<xsl:template name="generateEndLink">
<xsl:param name="where"/>
<xsl:choose>
<xsl:when test="id($where)">
<xsl:apply-templates mode="generateLink" select="id($where)"/>
</xsl:when>
<xsl:otherwise>
<xsl:text>[[undefined </xsl:text>
<xsl:value-of select="$where"/>
<xsl:text>]]</xsl:text>
</xsl:otherwise>
</xsl:choose>
</xsl:template>

However, I think the main issue derives from how prefixes are retrieved and the links are generated in the step before (e.g. in the odd2lite transformation). I've started working on this in a branch, but the tests aren't passing, so there's definitely some more investigation that needs to happen here.

But what I've found so far is that, for the first issue (e.g. links that didn't resolve properly), the prefixes weren't being passed properly (see 2a58dd7); the HTML stylesheets in this case just bailed and didn't make the link. But for whatever reasons, references to elements as members of a class (e.g. the spec table for att.identified) produce that broken link, since they also didn't get the prefix (and weren't resolvable). Adding those prefixes (20acc5c) allowed the HTML to be produced without error, but fails the tests, so there's still something awry in these stylesheets.
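To make the prefix mismatch concrete, here is a rough Python model of the lookup-or-fallback behaviour of the generateEndLink template quoted above. It is a sketch under assumptions, not the Stylesheets' actual code path: the helper names are hypothetical, the "#" + id link form is simplified from the working output shown earlier, and the id "TEI.xh_html" is a guess extrapolated from the "TEI.xh_body" id in the original report.

```python
def generate_end_link(where, known_ids):
    """Return a link if the target id exists, else the placeholder text."""
    if where in known_ids:
        return "#" + where
    return "[[undefined " + where + "]]"


def spec_id(ident, prefix=""):
    # Ids in the reported output look like "TEI." + prefix + ident,
    # for example "TEI.xh_body".
    return "TEI." + prefix + ident


# Assumed ids in the generated document; "TEI.xh_html" is a guess.
known_ids = {spec_id("body", "xh_"), spec_id("html", "xh_")}

# The reference is built without the prefix, so the lookup fails.
print(generate_end_link(spec_id("html"), known_ids))

# The reference carries the prefix, so the lookup succeeds.
print(generate_end_link(spec_id("html", "xh_"), known_ids))
```

If this model is right, any reference built without the xh_ prefix can never match an id that was written with it, which would explain why only some links break.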

@daydin
Author

daydin commented May 14, 2024

oh lol I've just named the output dni.html as I was expecting Saxon to produce an HTML and not another ODD file. My bad!

@martindholmes
Contributor

@joeytakeda Yes, I always have the xh: prefix declared on the root TEI element. It's also declared using <sch:ns>.

@trishaoconnor trishaoconnor added the type: bug A bug report. label Nov 27, 2024
@trishaoconnor trishaoconnor added this to the Release 7.58.0 milestone Nov 27, 2024
@trishaoconnor trishaoconnor added the status: needsDiscussion Council has not yet been able to agree on how to proceed. label Nov 27, 2024