Problem reading non-English M&B save files #9

Open
Nex0817 opened this issue Mar 21, 2020 · 6 comments

Nex0817 commented Mar 21, 2020

Hi int19h.

I was very excited when I found this neat save editor, but I soon got frustrated.
It only works on save files from the English version of M&B.
(I use the Korean version of M&B.)

When I searched Google about this issue, I thought it might be an encoding problem (UTF-8 BOM or something similar).

I also found someone's comment: "Yeah, it is EASY. To solve that, just use the constructor of the stream."
With the provided source code, a little programming skill, and a great amount of groundless confidence, I tried to fix it but failed.

After trying for 5 hours, I couldn't even understand how this program calls LineReader()
:(

Hoping you can solve this problem, I have attached my save file (Warband Native).
https://drive.google.com/file/d/14eioKtPZXcW0x52MYZDWR8BXxz2jS4hz/view?usp=sharing

Thanks for reading my issue.

P.S
if(you.state == busy || !you.is_interest(this_issue) ||
this_issue.difficulty == extreme || you.is_hate(asian))
you can ignore this issue, and I'm sorry for bothering you.
else
I hope 1.0.5 VERSION coming soon!

int19h (Owner) commented Mar 22, 2020

Which file are you getting an error on? Is it the save itself, or one of the .txt files from the module?

The save itself is loaded using the OS locale - this is what you have selected under Time & Language -> Language -> Administrative language settings -> Language for non-Unicode programs.

For .txt files, it uses UTF-8, which is probably wrong - it didn't even occur to me that those might contain anything non-ASCII, to be honest. LineReader is only used for those.

Nex0817 closed this as completed Mar 29, 2020

Nex0817 (Author) commented Mar 29, 2020

Thank you! I'll try it.

I got addicted to this game and forgot about my request.
I'm very sorry for seeing your kind reply so late.

int19h (Owner) commented Mar 29, 2020

It's still a valid bug, so let's keep it open, especially as other users might also run into it. I don't have much spare time to deal with this right now, but I'll get to it eventually.

int19h reopened this Mar 29, 2020

Matihood1 commented

Hello. I know I'm coming here about 1.5 years after the last comment, but the issue still hasn't really been fixed. I'd happily do it myself, but I have no idea where to even look in the source code or what to change.

Matihood1 commented Nov 3, 2021

Are you sure the problem lies in reading the .txt files? From what I've seen, it's only the names of characters and parties (or even just the main character's name) that get messed up, and those should be stored in the save file itself, judging by the fact that they remain the same after changing the game's language. The .txt files should, indeed, only contain UTF-8 characters (at least for the Native module, which is what both I and the creator of this issue are using).

Regarding the .txt files, would it really not be possible to just change the encoding by changing the way reader instances are created? For example, for PartyDefinitions:

using (var reader = File.OpenText(Path.Combine(BasePath, "parties.txt"))) {
    Parties = new PartyDefinitions(reader);
    _entityDefinitions.AddEntities(() => Parties);
}

Currently, you're using File.OpenText(String) which, according to the Microsoft docs, just calls the StreamReader(String) constructor. Wouldn't it be possible to use the StreamReader constructor that takes an encoding instead:

using (var reader = new StreamReader(Path.Combine(BasePath, "parties.txt"), Encoding.Unicode))

I still don't think this is the source of the problem since it's the names stored in the savegame itself that are displayed incorrectly.
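To make the reader suggestion concrete, here is a minimal sketch (not the editor's actual code) of opening a text source with an explicit encoding. `TextReaderSketch`, `Open`, and `ReadAll` are hypothetical names, and a `MemoryStream` stands in for parties.txt. Note that `Encoding.Unicode` in .NET means UTF-16 little-endian, so it would only be the right choice if the files were actually stored that way.

```csharp
using System;
using System.IO;
using System.Text;

// Hypothetical helper, for illustration only.
static class TextReaderSketch
{
    // Opens a reader with an explicit fallback encoding.
    // If the data starts with a byte order mark, the BOM takes precedence.
    public static StreamReader Open(Stream stream, Encoding fallback)
    {
        return new StreamReader(stream, fallback, detectEncodingFromByteOrderMarks: true);
    }

    // Decodes a whole byte buffer to text using the given fallback encoding.
    public static string ReadAll(byte[] data, Encoding fallback)
    {
        using (var reader = Open(new MemoryStream(data), fallback))
        {
            return reader.ReadToEnd();
        }
    }
}
```

Passing the encoding in from the caller keeps the choice in one place, so it could later come from a setting rather than being hard-coded.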

Edit: Or maybe the problem is not that the wrong encoding is used when reading the save file, but that the encoding in the GUI part of the program is wrong. After all, after making an edit to a save file, all special characters are saved correctly, even though they are not displayed correctly in the editor.

int19h (Owner) commented Nov 3, 2021

Sorry, I wasn't paying attention and got it mixed up with the other open issue! Yes, you're right, this one is strictly about the encoding. It's not a GUI issue - once the strings are read, they're stored as UTF-16 in memory (same as any other .NET string), and the GUI is also Unicode throughout.

Strings from the save itself use the OS locale / codepage ("Language for non-Unicode applications", which Warband is):

string IValueSerializer<string>.Read(BinaryReader reader) {
    var length = reader.ReadInt32();
    var bytes = reader.ReadBytes(length);
    if (bytes.Length != length) {
        throw new EndOfStreamException();
    }
    return Encoding.Default.GetString(bytes);
}
void IValueSerializer<string>.Write(BinaryWriter writer, string value) {
    writer.Write(value.Length);
    writer.Write(Encoding.Default.GetBytes(value));
}

So the reason it round-trips successfully is that the mapping between bytes and chars is 1:1 in this case (unlike UTF-8). It decodes incorrectly - but re-encoding that incorrect result gives you the original bytes back.
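The round-trip behavior can be sketched as follows. This is an illustration under assumptions, not code from the editor: `RoundTripSketch` and its methods are hypothetical names, and ISO-8859-1 is used as a stand-in for a single-byte code page, since the real `Encoding.Default` depends on the machine.

```csharp
using System;
using System.Linq;
using System.Text;

// Hypothetical helper, for illustration only.
static class RoundTripSketch
{
    // Decodes the bytes with the given encoding, then encodes the resulting string again.
    public static byte[] RoundTrip(byte[] original, Encoding encoding)
    {
        string decoded = encoding.GetString(original);
        return encoding.GetBytes(decoded);
    }

    // True when decoding and re-encoding reproduces the original bytes exactly.
    public static bool Survives(byte[] original, Encoding encoding)
    {
        return RoundTrip(original, encoding).SequenceEqual(original);
    }
}
```

For a byte sequence such as `C7 D1 B1 DB` (believed to be the EUC-KR bytes for "한글"), `Survives` returns true with `Encoding.GetEncoding("iso-8859-1")`, because every byte maps to exactly one char and back. With `Encoding.UTF8` it returns false, because invalid sequences are replaced with U+FFFD during decoding and the original bytes are lost.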

Anyway, the challenge here is to figure out the correct codepage, since the save itself doesn't contain this info (as far as I know). The non-Unicode locale is usually a decent proxy for this, but it could be made a setting in the app itself. But if you're dealing with saves from different game versions that might have different encodings, that's not ideal either. It might be best to add an encoding drop-down to the open file dialog when loading a save, like e.g. Notepad does for text files - but I haven't looked into how complicated that is.
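Whichever way the encoding ends up being chosen (setting or drop-down), the serializer would need to accept it as a parameter. A rough sketch under assumptions: `EncodedStringSerializer` is a hypothetical class, not part of the existing code, and it assumes the length prefix in the save is a byte count, as the Read method above implies.

```csharp
using System;
using System.IO;
using System.Text;

// Hypothetical variant of the string serializer that takes the encoding
// as a constructor argument instead of relying on Encoding.Default.
sealed class EncodedStringSerializer
{
    private readonly Encoding _encoding;

    public EncodedStringSerializer(Encoding encoding)
    {
        _encoding = encoding ?? throw new ArgumentNullException(nameof(encoding));
    }

    public string Read(BinaryReader reader)
    {
        int length = reader.ReadInt32();
        byte[] bytes = reader.ReadBytes(length);
        if (bytes.Length != length)
        {
            throw new EndOfStreamException();
        }
        return _encoding.GetString(bytes);
    }

    public void Write(BinaryWriter writer, string value)
    {
        // Write the byte count rather than the char count: the two differ
        // for multi-byte encodings, and Read consumes exactly that many bytes.
        byte[] bytes = _encoding.GetBytes(value);
        writer.Write(bytes.Length);
        writer.Write(bytes);
    }
}
```

A caller could then pass, for example, `Encoding.GetEncoding(949)` for a Korean save, assuming that is the code page the game actually wrote. On .NET Core and later, such code pages are only available after calling `Encoding.RegisterProvider(CodePagesEncodingProvider.Instance)`.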
