[BUG] Ignoring encoding error 'utf-8' codec can't decode byte 0xd4 in position 38: invalid continuation byte
when playing file with Russian file name
#283
Comments
Maybe related to #208 and #205? Try updating to the master branch. Install from branch:
I don't have enough time right now to try to reproduce this locally.
I was already on master, so no changes for me.
Same with Ukrainian:
Pinging @Sp3EdeR to see if they have any insights on this.
This is happening in mpv, though I'm guessing we have to do something similar to what we did for MPC-HC...
The error is raised because of the byte D4 (11010100). In UTF-8, a byte beginning with 110 starts a two-byte sequence, and every continuation byte must begin with 10... in binary; the byte that follows D4 here does not. So this is a clear hint that the path is not transmitted in UTF-8. The logs say that the system encoding is CP1251, the Cyrillic code page, which of course makes sense for this system. In this single-byte code page, the D4 byte represents the Ф character, which is what @soredake pasted, so they match. This clearly shows that somewhere in the script we try to decode with UTF-8.

The second case, where logs are attached, shows the UTF-8 and System string lines, though. In this log, both lines correctly represent the [...].

My first guess would be that somewhere in the code we use MPV.net's output, which is in the encoding set in Windows. (Windows is capable of Unicode in UTF-16, but it is up to each program to choose whether to use UTF-16 or the encoding set by the system.)

I think the easiest solution would be to reproduce the issue with MPV.net: set the Windows code page to some other code page and play a file from an aptly named folder. Then, through debugging, it should be easy to find where UTF-8 is assumed instead of the correct system encoding.

The difference from MPC-HC is that MPC-HC specifically uses UTF-8 on its web API, so it does not change from computer to computer. In this case, every computer can be expected to have a different setup, so the encoding should be read (from the system or the MPV.net interface) instead of assumed.
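For illustration, the mismatch described above can be reproduced in a few lines of Python. This is only a sketch under stated assumptions: `decode_path` is a hypothetical helper, not a function from this project, and trying UTF-8 first and then falling back to the locale's preferred encoding is just one possible approach, not what the code currently does. The file name is the one from the report; since there is no directory prefix here, the failing byte sits at position 0 rather than 38.

```python
import locale
from typing import Optional


def decode_path(raw: bytes, fallback: Optional[str] = None) -> str:
    """Hypothetical helper: try UTF-8 first, then the system encoding.

    `fallback` lets the caller name the encoding explicitly; otherwise the
    locale's preferred encoding is used (assumed to match the player output).
    """
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        encoding = fallback or locale.getpreferredencoding(False)
        # errors="replace" keeps the path usable even if the guess is wrong.
        return raw.decode(encoding, errors="replace")


# Simulate a player that reports the path in the Cyrillic code page.
name = "Физрук - S01E01 - Episode 1.mkv"
raw = name.encode("cp1251")

# Ф is the single byte 0xD4 in CP1251; in UTF-8 that bit pattern (110xxxxx)
# announces a two-byte sequence, so the next byte must look like 10xxxxxx.
first_two_bytes = [f"{b:08b}" for b in raw[:2]]

try:
    raw.decode("utf-8")
    utf8_error = None
except UnicodeDecodeError as exc:
    utf8_error = exc

print(first_two_bytes)
print(utf8_error)
print(decode_path(raw, fallback="cp1251") == name)
```

Passing `fallback="cp1251"` here only mirrors the reporter's system. On a real machine the fallback would come from the system (or from the player interface, if it exposes one), which is the point made above about not assuming an encoding.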
Describe the bug
A clear and concise description of what the bug/error is.
Desktop (please complete the following information):
To Reproduce
Steps to reproduce the behavior:
Физрук - S01E01 - Episode 1.mkv
Ignoring encoding error 'utf-8' codec can't decode byte 0xd4 in position 38: invalid continuation byte
Log file