-
Notifications
You must be signed in to change notification settings - Fork 1.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Include a fingerprint of the specific arguments in mangled names. #4771
Include a fingerprint of the specific arguments in mangled names. #4771
Conversation
Instead of including the raw index of the specific, which is unstable across files and across unrelated changes, use a fingerprint of the constant values of the specific arguments. This is a placeholder until we decide on how we want to mangle specific functions.
@@ -17,7 +17,7 @@ namespace Carbon::SemIR { | |||
namespace { | |||
struct Worklist { | |||
// The file containing the instruction we're currently processing. | |||
const File* sem_ir; | |||
const File* sem_ir = nullptr; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great :) I was wondering inside if this would be a good idea on the last PR
@@ -337,11 +344,25 @@ struct Worklist { | |||
|
|||
auto InstFingerprinter::GetOrCompute(const File* file, InstId inst_id) | |||
-> uint64_t { | |||
Worklist worklist = {.sem_ir = nullptr, | |||
.todo = {{file, inst_id}}, | |||
Worklist worklist = {.todo = {{file, inst_id}}, | |||
.fingerprints = &fingerprints_}; | |||
worklist.Run(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorta aside, but the new GetOrCompute avoids doing a second lookup, and it seems like Run() could also keep around the last fingerprint and return it, which would be the same as the lookup here and save us that?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point, done. (In this PR for convenience, let me know if you'd prefer that I split it out.)
worklist.Prepare(file); | ||
worklist.Add(inst_block_id); | ||
if (!worklist.todo.empty()) { | ||
worklist.Run(); | ||
worklist.Prepare(file); | ||
worklist.Add(inst_block_id); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't love how this is breaking the rather nice Worklist/Run abstraction. /me goes to stare at Worklist and see if this can be done internally
I see, it's because InstBlockId
is not an InstId
so it can't go in todo
. This is maybe a terrible idea but just in case - did you consider constructing some artificial typed inst here that has an InstBlockId inside it, and give that as the todo
?
Or a possibly heavier or different shaped hammer: todo
could become a list of variant<InstId, InstBlockId>
and Run could deal with that internally?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I find this a little unsatisfying too. We can't construct a real InstId
for a new artificial instruction without modifying the File
, which seems out-of-contract (and we're using this from lowering which really ought to not mutate the File
).
I think switching to storing a variant in the todo list could be a bit heavyweight. On the other hand, if we see the same InstBlockId
s repeatedly, caching their fingerprints might save us some work. I think we should probably make this call based on performance data, but I don't think we have representative examples we can use to measure this yet, so I've added a TODO to try this out later.
One other option I considered was storing an Inst
rather than an InstId
in the todo list. That'd allow us to use an artificial typed inst here without modifying the IR. Again it's a little heavy -- an Inst
is four times the size of an InstId
-- and I'm not sure whether it's worth it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok thanks. Yeah if we're concerned with performance here, having some kind of benchmark that we can look at would be great! I would be a little surprised if a variant of two ids is noticeably worse than one id, but we can't tell without measuring.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For the performance impact, it's mostly the extra levels in the Merkle tree that I'm concerned with, rather than the variant itself. I'd expect that it's a win to add separate levels for InstBlock
s if we often end up reusing the work, and not otherwise. [And I don't know what would constitute "often" either :)]
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, I was not imagining that changing actually, I think. Maybe I will send a lil followup to this and you can see if you like what I am picturing or not.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
#4776 is what I was thinking, see what you think. No hard feelings if it's not what you'd like.
inst block fingerprints.
Instead of including the raw index of the specific, which is unstable across files and across unrelated changes, use a fingerprint of the constant values of the specific arguments. This is a placeholder until we decide on how we want to mangle specific functions.