-
Notifications
You must be signed in to change notification settings - Fork 28
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Minor performance improvements #110
Conversation
id_from_canonical ideally takes owned strings instead of references to avoid a copy.
4dfa366
to
a687379
Compare
src/eval.rs
Outdated
let mut result = 0; | ||
for (i, env) in envs.iter().enumerate() { | ||
if let Some(v) = env.get_var(v.as_ref()) { | ||
result = v.calc_evaluated_length(&envs[i + 1..]); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is better as return v.calc_eval...
with a 0
return at the end of the loop
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point, done.
/// need to create a new id, but would also be possible to create a version | ||
/// of this function that accepts string references that is more optimized | ||
/// for the case where the entry already exists. But so far, all of our | ||
/// usages of this function have an owned string easily accessible anyways. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the case where this might come up is when loading .n2_db, where the string paths are stored canonicalized so we know they don't need any mutation as we parse them. But I think that's only worth really thinking about if/when it comes time to speed up that codepath. (This comment is just refreshing my memory on this.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
True, though currently read_str
in db.rs
reads owned strings. Using references would also be a tradeoff that requires keeping the db in memory, but we could do it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The way it works is the db is read fully at startup and all the paths are mapped to ids as they're read, so it doesn't really need an owned string in there. But I think it's also probably not the slow part, yet...
@@ -76,10 +65,19 @@ impl Loader { | |||
loader | |||
} | |||
|
|||
/// Convert a path string to a FileId. For performance reasons | |||
/// this requires an owned 'path' param. | |||
fn path(&mut self, mut path: String) -> FileId { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's too bad, I went to all the effort to make this area reuse a single String buffer, but I understand why it must be done. RIP
/// for the case where the entry already exists. But so far, all of our | ||
/// usages of this function have an owned string easily accessible anyways. | ||
pub fn id_from_canonical(&mut self, file: String) -> FileId { | ||
// TODO: so many string copies :< |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
BTW the comment here was a reminder to myself around my struggle to not have so many string copies here. The hashmap is mapping string -> File, and file has a .name field which is a string, so in theory you don't need a second copy of that string as the hashmap key. But I couldn't get all the lifetimes to work when I tried that. Fixing that would mean we no longer need two copies of every path string and would also save a lot of allocations on the load path...
So that we can do one memory allocation for them.
Use HashMap.entry() instead of a lookup + insert.
FxHashMap has a faster hashing algorithm, at the expense of not being resistent to DOS attacks.
a687379
to
a24eb9c
Compare
These changes give about a 20% performance improvement to bench_load_synthetic, and save a couple seconds off of loading the android ninja files. Most of this is from the switch to FxHashMap, the other changes are only around a 5% improvement.
If you have concerns about any of these commits let me know and I'll take those ones out of the PR.