Simpler, character-state-machine based "parser" #4

Merged: 15 commits, Sep 18, 2024

dhdaines (Owner) commented Sep 18, 2024

Relies on the io module to do the buffering instead of some fragile custom logic. EOF handling is still difficult, though.
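For readers unfamiliar with the approach, here is a minimal sketch of what a character-state-machine tokenizer over a buffered stream can look like. It is not the code from this PR: the token kinds (`number`, `word`), the state names, and the `tokenize` function are all hypothetical, chosen only to show the shape of the loop and why EOF needs special care (the pending token has to be flushed when `read(1)` returns empty bytes).

```python
import io
from typing import BinaryIO, Iterator, Tuple


def tokenize(fp: BinaryIO) -> Iterator[Tuple[str, bytes]]:
    """Yield (kind, value) tokens, reading one byte at a time.

    Hypothetical toy grammar: runs of ASCII digits are "number" tokens,
    runs of ASCII letters are "word" tokens, everything else is skipped.
    """
    state = "start"
    buf = bytearray()
    while True:
        c = fp.read(1)  # b"" means EOF; the io layer handles buffering
        if state != "start":
            # Stay in the current token while the byte still belongs to it.
            if c and (c.isdigit() if state == "number" else c.isalpha()):
                buf += c
                continue
            # Token ended (delimiter, new token kind, or EOF): flush it.
            yield state, bytes(buf)
            buf.clear()
            state = "start"
        # From here on we are in the start state and decide what c begins.
        if not c:
            return  # EOF with nothing pending
        if c.isdigit():
            state = "number"
            buf += c
        elif c.isalpha():
            state = "word"
            buf += c
        # Any other byte is treated as a delimiter and skipped.


if __name__ == "__main__":
    for token in tokenize(io.BytesIO(b"12 abc 7x")):
        print(token)
```

Because the byte that ends one token is reprocessed as the possible start of the next, the loop never needs to seek backwards or push a byte back onto the stream.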

This is much faster on actual files (because letting CPython do the buffering is a winning strategy), but for fairly obvious reasons it is also much slower on BytesIO, and it turns out that most of the "parsing" happens over BytesIO objects. So a separate PR will add a regex-based "parser" for in-memory data.
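As a rough illustration of the regex-based alternative for in-memory data, the sketch below tokenizes a bytes object with a single compiled pattern. It is an assumption about the general technique, not the parser from the follow-up PR: the `TOKEN` pattern, the group names, and `tokenize_bytes` are hypothetical and use a toy grammar of digit runs and letter runs.

```python
import re
from typing import Iterator, Tuple

# Hypothetical toy grammar: digit runs are "number", letter runs are "word".
# The named groups let m.lastgroup report which alternative matched.
TOKEN = re.compile(rb"(?P<number>[0-9]+)|(?P<word>[A-Za-z]+)")


def tokenize_bytes(data: bytes) -> Iterator[Tuple[str, bytes]]:
    """Yield (kind, value) tokens from in-memory bytes using one regex scan."""
    for m in TOKEN.finditer(data):
        # finditer skips bytes that match neither alternative (delimiters).
        yield m.lastgroup, m.group()


if __name__ == "__main__":
    for token in tokenize_bytes(b"12 abc 7x"):
        print(token)
```

When the whole input is already in memory, the scanning loop runs inside the `re` engine rather than as one Python-level `read(1)` call per byte, which is the usual motivation for this kind of split.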

@dhdaines dhdaines merged commit 1394c50 into main Sep 18, 2024
1 check passed
@dhdaines dhdaines deleted the simple_parser branch September 18, 2024 16:41