Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

XML Marker Parser and New Internal Format #56

Draft
wants to merge 767 commits into
base: master
Choose a base branch
from
Draft

Conversation

AsherGlick
Copy link
Owner

A new XML parser using the RapidXML library which converts to a new internal binary marker format that will be read by Burrito. Before this PR is merged the new format will be implemented and usable for all of the attributes that are currently supported. The new internal format will support marker categories, and will be divided by map and marker pack. Having this done will allow for Burrito to eventually support enabling and disabling specific marker categories on a given map instead of entire marker files at once. The completion of this work will also indicate the depreciation of the existing python and rust XML parsers.

The XML parser will attempt to parse all valid attributes across BlishHUD and TacO even if those attributes are not used within Burrito at this time. The result is that the parser will double as a linter to alert users of any marker errors. Depending on time a simple documentation generator may be able to take the various parser classes and create documentation similar to BlishHUD's docs automatically.

Right now all that is completed is a design for writing the XML Parser / Linter

@coderedart
Copy link
Contributor

what binary format btw? custom or a standard one like protobuf?

@AsherGlick
Copy link
Owner Author

AsherGlick commented Mar 30, 2022

I currently plan to use protobuf, specifically proto3, as it is a strictly typed and future proof standard that, in theory, could easily be used by other applications if desired. I do want to add the functionality to export these binary files back into XML for distribution to existing XML parsing applications. Several considerations have yet to be made about the schema. Some of note are: if images should be stored inside the tree or outside, if trails should be stored inside the tree or outside, how this might handle future yet-ephemeral features such as 2D areas or 3D meshes.

@AsherGlick
Copy link
Owner Author

Remaining tasks can be found in #64

@coderedart
Copy link
Contributor

I personally grouped up all the bools into a single byte.

in_game_visibility = 1;
map_visibility = 2;
mini_map_visibility = 4;
auto_trigger = 8;
can_fade = 16;
scale_on_map_with_zoom = 32; // wtf xD
has_countdown = 64;
??? = 128; // for future reserver

then again, i had bitflags crate which takes care of 99% of work with 2 lines. I also made Races, Professions, Festivals, Mounts etc.. like lists into bitflags.

not saying you gotta do this too, as the code already seems complex enough with macros and generators. but might still be relevant to this because using bitflags makes a lot of sense for binary formats.

@AsherGlick
Copy link
Owner Author

@coderedart Using bitflags makes a lot of sense but I am going to leave these as bools both in memory and in the serialized format.

Protobuf already has other aspects that make it less-than optimal for space saving. If I wanted a perfect binary format on disk I would serialize a struct with bitflags. However the convenience of automatically having the variables defined and usable by name inside an instantiated class when parsing a protobuf, and the utility of forwards and backwards compatibility, which would be lost if using bitflags, outweighed the space savings for me. For the filters/flags, like profession, protobuf wont save false values inside the serialized data structure because false is the default value for bool and default values can be elided, which means that 0 bytes are written to disk for icons/trails that do not use the field.

For the in-memory data structure, saving an extra few dozen bytes per icon is still only a few kB of memory all in all, and once I get over to GDScript it is going to be stored in memory in an even less optimal format. However if the need arises to restrict memory usage in this c++ module then I will absolutely take a look at switching to bitflags here.

And just theory crafting as to how they would work for fun. Each class here, such as Icon, Trail, Category, or the eventually generated classes such as profession_filter would keep a running tally of how many boolean flags they are using when generating the code. Allocate a uint of appropriate size with a static name such as flags or something. Then instead of defining variables of type bool, they would create #defines of each bool and its associated location in the list of possible bools. This would probably be fine because the API would be stable even if the in-memory location swapped around between library versions. If more usability and stability was desired then public get/set functions for the psuedo booleans could be created that handled the bitflag operations automatically. Which would also allow more booleans to be configurable then the widest integer would allow by allowing different psuedo boolean get/set functions to access different internal bitflags.

@AsherGlick
Copy link
Owner Author

AsherGlick commented Aug 4, 2022

Also just because I feel like ranting about macros. The crazy complex macro system I had created originally, before deciding to swap to code generators that work off the documentation, was a bad idea. It seemed really cool, but in the end due to the things I had to do to get everything to work properly without adding extra toil to modifying a class caused the code to end up being quite slow for a dumb reason. Instantiation of each parseable subclass took orders of magnitude longer then actually parsing the xml into it did. Once this task is done all that macro hackery should be totally gone and instead nice and speedy auto generated parser functions should remain.

Repository owner deleted a comment from github-actions bot Nov 18, 2023
Repository owner deleted a comment from github-actions bot Nov 18, 2023
Repository owner deleted a comment from github-actions bot Nov 18, 2023
Repository owner deleted a comment from github-actions bot Nov 18, 2023
Repository owner deleted a comment from github-actions bot Nov 18, 2023
Repository owner deleted a comment from github-actions bot Nov 18, 2023
Repository owner deleted a comment from github-actions bot Nov 18, 2023
Repository owner deleted a comment from github-actions bot Nov 18, 2023
Repository owner deleted a comment from github-actions bot Nov 18, 2023
Repository owner deleted a comment from github-actions bot Nov 18, 2023
Repository owner deleted a comment from github-actions bot Nov 18, 2023
Repository owner deleted a comment from github-actions bot Nov 18, 2023
Repository owner deleted a comment from github-actions bot Nov 18, 2023
klingbolt and others added 30 commits October 1, 2024 23:18
Changed workflow to also work on the new guildpoint extention
…file by suffix function to return relative paths
Changed all instances of .bin to .guildpoint
Added tracking for the top level categories in each pack.
Changed file extenstion for duplicates tests
… the proto if we do not know how it will be implemented in Burrito
Commented out the attributes that are not implemented in Burrito
Formatted XML output functionality
…ersion

Removing Individual File Conversion
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants