Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Introduce Rust based Vocabulary type #30

Merged
merged 3 commits into from
Sep 24, 2024
Merged

Introduce Rust based Vocabulary type #30

merged 3 commits into from
Sep 24, 2024

Conversation

umut-sahin
Copy link
Contributor

(closes #26)

This type is not exposed in Python, which will be done in a subsequent PR, once this one is merged. Underlying type changed from a Vec to a HashMap to have efficient insertion (i.e., to avoid searching the token in the vec). Since HashMap<Key, Value> has the same iteration semantics as Vec<(Key, Value)> no changes were needed where the vocabulary is used.

This was required because vocabulary was no longer ordered. So returning a vector was causing ordering issues.
Copy link
Member

@brandonwillard brandonwillard left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like there's a small performance regression (e.g. possibly due to the HashMap change), but it's negligible (i.e. under a microsecond and less than 5MB).

@brandonwillard brandonwillard merged commit f3fb1fb into main Sep 24, 2024
8 of 9 checks passed
@brandonwillard brandonwillard deleted the vocabulary branch September 24, 2024 17:20
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Create Rust based Vocabulary type
2 participants