Pages with Emoji fail to render on macOS (due to lxml bug) #3686
Comments
This is most likely a bug in lxml. Please report it to the lxml project.
If I use the program
I have no idea how lxml is used by Nikola.
Sure, here’s some sample code:
import lxml.html
html = """<!DOCTYPE html>
<head><meta charset="utf-8"></head>
<body>
<h1>Hello, world!</h1>
<div>
<p>\U0001f63a</p>
</div>
</body>
</html>"""
parser = lxml.html.HTMLParser(remove_blank_text=True)
doc = lxml.html.document_fromstring(html, parser)
data = lxml.html.tostring(doc, encoding='utf8', method='html', pretty_print=True, doctype='<!DOCTYPE html>')
print(data)
Can you reproduce the issue using this code on macOS? For reference, I get the following output on Windows and Linux:
Bingo!
I reported the lxml issue at https://bugs.launchpad.net/lxml/+bug/2019038
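For anyone re-running the sample code on another machine, a rough check like the one below may help decide whether the serialized output is affected. It is only a heuristic sketch: it assumes a healthy result still contains the emoji (literally or as a numeric character reference) and a closing `</html>` tag. The name `looks_intact` is hypothetical and is not part of lxml or Nikola.

```python
EMOJI = "\U0001f63a"  # the character used in the sample code


def looks_intact(data: bytes, char: str = EMOJI) -> bool:
    """Heuristic: True if the serialized HTML still holds `char` and </html>."""
    text = data.decode("utf-8", errors="replace").lower()
    # The character may appear literally or as a decimal/hex character reference.
    candidates = (char, f"&#{ord(char)};", f"&#x{ord(char):x};")
    return any(c in text for c in candidates) and "</html>" in text


# Example inputs: a plausible good result and the garbled text from the report.
good = "<!DOCTYPE html>\n<html><body><p>\U0001f63a</p></body></html>".encode("utf-8")
bad = b"h t m l > "
print(looks_intact(good), looks_intact(bad))
```

Passing the `data` value from the sample code to `looks_intact` gives a quick yes/no signal that can be compared across macOS, Linux, and Windows.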
Environment
Python Version:
Python 3.11.3 (main, Apr 7 2023, 19:29:16) [Clang 14.0.0 (clang-1400.0.29.202)] on darwin
Installed from Homebrew
Nikola Version:
Nikola 8.2.4
Operating System:
macOS Monterey 12.6.5
Description:
If I use a Unicode character like a smiley or 😺 in a .rst page, then the generated HTML is bogus.
Source unicode.rst page:
The generated HTML page contains:
And in the browser I see: "h t m l > " for the content of the post.
I have no problem with other Unicode characters, such as the accented letter "è".
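One difference between the two characters that may be relevant: "è" is inside the Basic Multilingual Plane, while 😺 (U+1F63A) is outside it and needs four bytes in UTF-8 and a surrogate pair in UTF-16. The sketch below only illustrates that difference; it does not show that this is the cause of the bug, and `describe` is a hypothetical helper name.

```python
def describe(ch: str) -> dict:
    """Summarize how a single character is encoded."""
    return {
        "code_point": f"U+{ord(ch):04X}",
        "in_bmp": ord(ch) <= 0xFFFF,  # Basic Multilingual Plane
        "utf8_bytes": len(ch.encode("utf-8")),
        "utf16_units": len(ch.encode("utf-16-le")) // 2,
    }


# The accented letter that works and the emoji that fails in the report.
for ch in ("\u00e8", "\U0001f63a"):
    print(repr(ch), describe(ch))
```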
Debian is OK
I then tried the same steps on Debian GNU/Linux 12 (the next Debian stable) and had no problem.
On Debian I use:
In both cases I use a venv.
Debug
I tried to debug but I am new to Nikola.
I tried nikola rst2html.
On macOS I get:
Here again, the result is correct when run on Debian.
Maybe the problem is in a dependency used by Nikola.
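To narrow down which dependency differs between the macOS and Debian venvs, something like the following could be run in both and the output compared. This is a sketch under assumptions: the package list is a guess at the relevant distributions (adjust as needed), and the helper names `installed_versions` and `libxml2_version` are hypothetical.

```python
from importlib import metadata

# Assumption: these are the distributions worth comparing between the two venvs.
PACKAGES = ("Nikola", "lxml", "docutils")


def installed_versions(names=PACKAGES) -> dict:
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


def libxml2_version():
    """Return the libxml2 version lxml is running against, or None without lxml."""
    try:
        from lxml import etree
    except ImportError:
        return None
    return ".".join(str(part) for part in etree.LIBXML_VERSION)


for name, version in installed_versions().items():
    print(f"{name}: {version or 'not installed'}")
print(f"libxml2 (as seen by lxml): {libxml2_version() or 'unknown'}")
```

If the Python package versions match but the libxml2 line differs, that would point toward the underlying C library rather than Nikola or lxml's Python code.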