utf8 parsing performance #4
Comments
Thanks for putting this together! I've been wanting to do some benchmark work. There were a few problems with your test setup. I opened a PR. That said, the results aren't much better, but at least they are correct!
Going to mark this as a bug because we should be able to beat std easily.
Hi, some years ago I implemented a utf8 decoder with the same table,
I've made a minimal optimization effort in #8. When I've got a bit more time, I plan to look into Björn Höhrmann's article mentioned by @carl-erwin to see if we can do better. As to why the std parser does so much better, this seems to be due to optimizations that become available when multiple bytes can be examined at once.
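To illustrate what "viewing multiple bytes at once" can look like, here is a minimal sketch of a word-at-a-time ASCII check. It is not std's actual code and not code from this crate; the function name `is_ascii_wordwise` and the fixed 8-byte chunk size are assumptions made for the example. A byte-at-a-time table-driven parser has to step through every byte, whereas a check like this can skip eight ASCII bytes with a single mask test.

```rust
use std::convert::TryInto;

/// Returns true if every byte in `bytes` is ASCII (high bit clear).
/// Hypothetical helper for illustration only.
fn is_ascii_wordwise(bytes: &[u8]) -> bool {
    // One 0x80 per byte lane: any set bit means a non-ASCII byte is present.
    const HIGH_BITS: u64 = 0x8080_8080_8080_8080;

    let mut chunks = bytes.chunks_exact(8);
    for chunk in &mut chunks {
        // Load 8 bytes as a single u64. Byte order does not matter here,
        // because the mask tests the same bit in every lane.
        let word = u64::from_ne_bytes(chunk.try_into().unwrap());
        if word & HIGH_BITS != 0 {
            return false;
        }
    }

    // Fewer than 8 bytes are left over; check them one at a time.
    chunks.remainder().iter().all(|&b| b < 0x80)
}
```

A parser could use a check like this as a fast path, falling back to the table for any chunk that contains a non-ASCII byte. Whether that closes the gap to std would have to be measured.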
You might also be interested in
Hi, I was eager to benchmark your table-based utf8 parsing approach against the standard library implementation, so I did:
https://github.com/ConnyOnny/utf8perf
If my testing setup is not wrong (see main.rs), it seems branching is not everything.