
c2sc OpenType feature typesets improper codepoints #273

Open
YellowJacketLinux opened this issue Jan 29, 2024 · 2 comments
Labels
enhancement: New feature or request
works with mode=harf: This problem will disappear when a harfbuzz enhanced luatex is available

Comments

@YellowJacketLinux

I first reported this bug to fontspec (latex3/fontspec#497) but was told it is an engine issue.

I hope this is the right place.

I personally don't consider this high priority because things at least visually work.

The c2sc OpenType feature is supposed to use the small-caps variant of the lower-case letter wherever an upper-case letter is requested. The underlying Unicode code point should still be the upper-case one, so that copy and paste produces an upper-case letter regardless of the font features in the document being pasted into.

See the MWE and copy/paste the strings into a text editor.

In the MWE overline example, I didn't use Greek for c2sc because TeX Gyre Termes doesn't have small-caps for Greek.

But the overline example shows how much better it is typographically to use c2sc with a nomen sacrum (especially if the small caps are actually a little taller than x-height, although that is not shown).

If using a font with small caps for Greek, one could even use the U+0305 combining character to make the overline, so that even the overline itself is copied and pasted. (TeX Gyre also doesn't have U+0305, but some Greek/Coptic fonts do, as both scripts historically use it frequently.) But since LuaLaTeX uses lower-case code points with c2sc, what gets pasted would be lower-case letters and not the upper-case letters that nomina sacra traditionally use.

Even though things work visually, the engine's use of lower-case code points could also cause problems for screen readers. In this use case (abbreviations) a text alt-tag should probably be used anyway, so it may not be an accessibility issue here, but in other use cases it might be.

The MWE:

\RequirePackage{fontspec}
\documentclass[letterpaper,fontsize=14pt]{scrarticle}

\setmainfont
  [ Ligatures   = TeX ,
    Extension   = .otf ,
    UprightFont = *-regular ,
    BoldFont = *-bold ,
    ItalicFont = *-italic ,
    BoldItalicFont = *-bolditalic ]
  {texgyretermes}
\setsansfont
  [ Ligatures   = TeX ,
    Extension   = .otf ,
    UprightFont = *-regular ,
    BoldFont = *-bold ,
    ItalicFont = *-italic ,
    BoldItalicFont = *-bolditalic ]
  {texgyreheros}
\setmonofont
  [ Ligatures   = NoCommon ,
    Extension   = .otf ,
    UprightFont = *-regular ,
    BoldFont = *-bold ,
    ItalicFont = *-italic ,
    BoldItalicFont = *-bolditalic ]
  {texgyrecursor}

\makeatletter
\newcommand*{\textoverline}[1]{$\overline{\hbox{#1}}\m@th$}
\makeatother
% \symbol{"0305}

\usepackage[colorlinks=true]{hyperref}

\begin{document}
\section{Stuff}
Herod the Great died at around
4~{\fontspec[Letters=UppercaseSmallCaps]{texgyretermes-regular.otf}B.C.E.}\
but Quirinius did not become governor of Syria until
6~{\fontspec[Letters=UppercaseSmallCaps]{texgyretermes-regular.otf}C.E.}\
which means the tax of Quirinius did not happen until at least ten years after Herod the Great died.

This paragraph shows how the bug impacts my intended purpose of using the c2sc feature, which has to
do with Byzantine-era Greek and \textit{nomina sacra}---the practice of abbreviating holy names. Compare
\textoverline{ΔΑΔ} with
\textoverline{\fontspec[Letters=UppercaseSmallCaps]{texgyretermes-regular.otf}DAD}
and notice the text overline on the first is much closer to the text in the line above it, creating a
visual typography issue hence the need for c2sc.

\end{document}
@zauguin added the labels "enhancement" and "works with mode=harf" on Jan 29, 2024
@zauguin
Member

zauguin commented Jan 30, 2024

This is basically what I recently commented on at https://tex.stackexchange.com/questions/707772/xelatex-fontspec-stylisticset-changes-underlying-unicode-characters-in-the-t#comment1759919_707772. It is rather hard to avoid unless we fundamentally change how we output mappings to Unicode, doing it the way we do in harf mode. We might have to consider doing that, though; in that case we might want to move it out of the mode-specific part and make parts of it generic. This will probably require rather heavy patching of the ConTeXt fontloader.
@u-fischer I'm guessing these things will become rather important from a tagpdf point of view?

@YellowJacketLinux For now you can avoid the issue by using HarfBuzz mode (by adding Renderer=HarfBuzz in fontspec).
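
For readers who want to try the workaround, here is a minimal sketch of the first paragraph of the MWE with Renderer=HarfBuzz added to the \fontspec call. It assumes compilation with lualatex on a HarfBuzz-enabled engine (LuaHBTeX) and that texgyretermes-regular.otf can be found by file name, as in the original MWE. The helper name \uppersc is hypothetical and not part of the report.

```latex
% Compile with lualatex (assumes a HarfBuzz-enabled LuaTeX, i.e. LuaHBTeX).
\documentclass{article}
\usepackage{fontspec}
\setmainfont{texgyretermes-regular.otf}

% Hypothetical helper (not in the original MWE): typeset upper-case input
% as small caps via c2sc, shaped by HarfBuzz instead of the default renderer.
\newcommand*{\uppersc}[1]{%
  {\fontspec[Renderer=HarfBuzz, Letters=UppercaseSmallCaps]%
     {texgyretermes-regular.otf}#1}}

\begin{document}
Herod the Great died at around 4~\uppersc{B.C.E.}\ but Quirinius did not
become governor of Syria until 6~\uppersc{C.E.}
\end{document}
```

To check the result, copy the small-caps strings from the PDF into a text editor and confirm whether they paste as upper-case letters.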

@YellowJacketLinux
Author

I can confirm the issue does not exist with Renderer=HarfBuzz

Thank you.
