Possible Bug: Unicode conversion. #1103
I have read the discussion - please try the sample script provided by @mojavelinux with the GNU FreeSerif font. This font has the requested character as well as a mapping table understood by ttfunk, the font library used by Prawn. It may be the case that other fonts only provide some kind of internal link from the requested character to the greek sigma symbol, and that ttfunk has not implemented support for this kind of redirection. |
I can confirm that GNU FreeSerif font works. What's curious is that VL-Gothic-Regular, which also has this character, does not work. So this is very likely a problem in ttfunk as @gettalong has suggested. I did not dive into the two fonts to try to see what the tables look like. |
To be clear, this is not a unicode conversion problem. It's a glyph resolution problem. |
Ok. What I did is the following: I installed the GNU FreeSerif font, added the font to my AsciiDoc document, and compiled it using Asciidoctor-pdf. I can see that the font is different, so this seems to work. However, my special characters stay unconverted; they are printed as entered in the AsciiDoc file. Example characters: 𝜎 or ţ. There is no error message regarding the conversion of the characters. |
@Clemi81 Can you please test the sample script below? It's probably not useful for this forum to be testing using Asciidoctor PDF.
require 'prawn'

Prawn::Document.generate 'missing-glyph.pdf' do
  def register_font data
    font_families.tap {|accum| data.each {|key, val| accum[key.to_s] = val } }
  end
  register_font Serif: {
    normal: '/usr/share/fonts/gnu-free/FreeSerif.ttf'
  }
  font :Serif do
    text '𝜎'
  end
end
|
Dear Dan, I use Windows 7. |
@Clemi81 You'd need to install ruby. I assume you already have it installed since you're experiencing the issue. You'd also need to install prawn gem. I assume you have that too for the same reason. You need to save the script into a file. Then on a command line execute |
I just ran this script with the font I was testing with in the asciidoctor-pdf discussion. The resulting output is a single square outline in the top left corner of the page. Marc. |
Some more pictures. When I create the same table as I did in the asciidoctor-pdf discussion, starting from character 120512, while using the font FreeSans, I get the following output: Now, when I change the font to FreeSerif, the output looks as follows: Below is a screenshot showing those two fonts next to each other in FontForge. FreeSerif is on the left, FreeSans is on the right. The highlighted position in the top left of each of the two windows is character 120512. I hope this helps. Marc. |
Hello, I get the same result as Marc for sigma: FreeSerif works, FreeSans shows a square with a ?. I played a bit more and tested the characters in the AsciiDoc. I did the following. Write unicode of sigma into document: So it seems it's not only a problem of the font. It seems that Asciidoctor-pdf does not recognize that it should convert the character. For me, one workaround could be to render a character in Firefox and copy it back into the AsciiDoc. Unfortunately, there are some symbols that are still not recognized even when I use this workaround, for example arrow up. |
@Clemi81 Thank you for testing and for the additional information. I'd like to encourage you again not to discuss behaviors directly related to Asciidoctor PDF here. In this thread, we should only be talking about Ruby code that uses the Prawn API directly. That keeps the discussion on point. (It's very likely that the raw When testing here, we should always be referring back to the sample Ruby application that was posted above. |
Dear Dan, I'm sorry. I tend to post all information I have, but I understand it may be misleading. Cheers, Clemens |
I made a slight modification to the script:
The above works for both FreeSerif and FreeSans. I also tried to use Marc. |
You can use Generally, please note that although U+03C3 and U+1D70E may have the same appearance, there must be support in the font file for both codepoints. So testing U+03C3 probably won't help for this problem. |
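To make the distinction between the two codepoints concrete, here is a minimal Ruby sketch using only the standard library. The helper name `describe_codepoints` and the two constants are made up for illustration; they are not part of Prawn or ttfunk.

```ruby
# Hypothetical helper (not part of Prawn or ttfunk): list the Unicode
# codepoint of every character in a string in U+XXXX notation.
def describe_codepoints(text)
  text.each_char.map { |ch| format('U+%04X', ch.ord) }
end

MATH_SIGMA  = "\u{1D70E}" # MATHEMATICAL ITALIC SMALL SIGMA (decimal 120590)
GREEK_SIGMA = "\u03C3"    # GREEK SMALL LETTER SIGMA (decimal 963)

# The two strings look alike when rendered but compare as different.
puts describe_codepoints(MATH_SIGMA + GREEK_SIGMA).inspect
puts MATH_SIGMA == GREEK_SIGMA
```

Running a helper like this over the text passed to `text` in the sample script shows which of the two codepoints the font actually has to provide a glyph for.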
Thank you for clarifying. |
I looked a bit into this and I'm not sure what the issue is. To me it looks like the wrong font is used in asciidoctor-pdf or something. The test script Dan provided seems to work as expected. Specifically, it shows a missing character for FreeSans because that font doesn't have a glyph for character 120590 (Mathematical Italic Small Sigma), and it shows a fine sigma with the FreeSerif font because it does have the glyph. I'm closing this now but feel free to reopen if you have more pointers to how Prawn is at fault here. |
The whole point is about the behavior when the glyph is missing from the
font (again, it has nothing to do with Asciidoctor PDF specifically). In
this case, the width of the missing glyph is being set to 0 instead of a
non-zero width. It was in the change you made that this behavior started happening.
From my perspective, you really aren't trying to understand the issue. I've
provided what I'm able to provide to help communicate what's going on. I
can't make you care about it, so I'll just have to accept that I need to
work around this on my end. It's frustrating, but so be it.
|
@mojavelinux I'm confused. It seems like you're addressing a different issue in your comment. Could you please confirm that we're still talking about the missing glyphs? Width of glyphs was never mentioned in this issue or the linked asciidoctor forum thread. My understanding is that a person wants to use some specific character. They have it in their source document specified as a direct Unicode codepoint. They're confused that the character is displayed in Firefox (I presume, HTML version of the generated document) but not in PDF. I took your script from your previous comment. I also took the latest FreeFont (freefont-ttf-20120503.zip). I get a Missing Character glyph with FreeSans font (in the latest version of FreeFont it's a rectangle with a question mark in it), and a sigma with FreeSerif. I also confirmed that both glyphs have non-zero width. Likewise, I confirm that FreeSerif does have a glyph for character code 120590, and FreeSans does not. Screenshot from 1marc1 is very much indicative (character code 120590 is in row 5, second from the right). From my point of view, Prawn displays correct glyphs for both fonts. I understand that it doesn't match HTML behaviour. My assumption is that the user didn't specify the correct font and Firefox is more persistent in finding a fallback font with a glyph for the character. I agree that it's great for the end user but this is not the promise Prawn ever gave. Prawn only uses specified fonts and doesn't go looking until it finds every single glyph. Could you please confirm, deny, or otherwise state the desired outcome? Could you please describe in what way the output of your test script is different from what you'd expect? Or if it's not representative, could you please provide another one that would demonstrate the issue? |
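The difference between a single-font lookup and a browser-style fallback search can be sketched with a toy model. This is not how Prawn, ttfunk, or Firefox are implemented: each "font" here is just a Hash from codepoint to glyph id, the glyph ids are invented, and the method names are hypothetical. Only the coverage (FreeSerif has U+1D70E, FreeSans does not) is taken from the discussion above.

```ruby
# Toy model only: a "font" is a Hash from codepoint to glyph id.
# Real fonts keep this mapping in their cmap table; glyph id 0 is the
# "missing glyph". The non-zero glyph ids below are invented.
FREE_SERIF = { 0x03C3 => 512, 0x1D70E => 4021 }.freeze
FREE_SANS  = { 0x03C3 => 498 }.freeze
FONTS = { 'FreeSans' => FREE_SANS, 'FreeSerif' => FREE_SERIF }.freeze

# Single-font lookup: no fallback, an unmapped codepoint yields glyph 0.
def glyph_id(font, char)
  font.fetch(char.ord, 0)
end

# Fallback lookup: return the name of the first font that maps the
# character, or nil if none of the fonts does.
def first_font_with_glyph(fonts, char)
  fonts.find { |_name, cmap| cmap.key?(char.ord) }&.first
end

sigma = "\u{1D70E}"
puts glyph_id(FREE_SANS, sigma)            # missing glyph in FreeSans
puts glyph_id(FREE_SERIF, sigma)           # real glyph in FreeSerif
puts first_font_with_glyph(FONTS, sigma)   # fallback search picks FreeSerif
```

In this model the first call corresponds to the behaviour described for the test script with FreeSans, and the last call corresponds to the more persistent search attributed to the browser.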
@pointlessone @mojavelinux If I may: I think that this issue and the one from #1322 get a bit mixed up. From what I can see in this issue, it is a problem with fonts missing some glyphs, as @mojavelinux said. So I think this issue is indeed solved. However, the one with the missing glyph (gid=0) having width 0 instead of the correct width of that glyph is still being debated over at #1322. |
@gettalong I agree. I just want to make sure we're on the same page with Dan. |
My mistake. My comment ended up on the wrong issue. I was indeed referring to #1322. And it seems there is now an update there. Please disregard my previous statement as it was misdirected. My apologies. Thanks @gettalong for playing moderator and getting us back on track. |
Back to the topic at hand, I think I know what the problem is. There are two characters which look visually identical, but are not actually the same Unicode character:
𝜎
σ
The first is U+1D70E (mathematical italic small sigma) whereas the second is U+03C3 (Greek small letter sigma). The font in question is missing one of them, so it correctly displays the missing glyph character. I assert that Prawn is doing the correct thing. |
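One possible workaround on the caller's side, which nobody in this thread proposed and which is only a sketch, is to fold the styled mathematical letter back to its plain counterpart before the text reaches the font. Ruby's standard `String#unicode_normalize` with the `:nfkc` form does this for U+1D70E. The helper name `plain_form` is hypothetical, and the approach assumes that losing the italic math styling is acceptable; NFKC also rewrites other compatibility characters such as ligatures and superscripts.

```ruby
# Hypothetical helper: apply Unicode NFKC normalization, which replaces
# compatibility characters (including styled mathematical letters) with
# their plain equivalents. Assumption: the styling may be discarded.
def plain_form(text)
  text.unicode_normalize(:nfkc)
end

math_sigma = "\u{1D70E}" # MATHEMATICAL ITALIC SMALL SIGMA
folded = plain_form(math_sigma)

# After normalization the string holds GREEK SMALL LETTER SIGMA instead.
puts format('U+%04X', folded.ord)
```

A font that only covers the Greek block, such as the FreeSans version discussed above, would then be asked for U+03C3 rather than U+1D70E.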
Hello,
there seems to be a bug regarding the conversion of Unicode characters with Prawn.
Please see this link for details:
http://discuss.asciidoctor.org/Unicode-characters-not-converted-in-pdf-td6703.html
Thank you for reading,
Clemens