Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Include a fingerprint of the specific arguments in mangled names. #4771

Merged

Conversation

zygoloid
Copy link
Contributor

@zygoloid zygoloid commented Jan 8, 2025

Instead of including the raw index of the specific, which is unstable across files and across unrelated changes, use a fingerprint of the constant values of the specific arguments. This is a placeholder until we decide on how we want to mangle specific functions.

Instead of including the raw index of the specific, which is unstable
across files and across unrelated changes, use a fingerprint of the
constant values of the specific arguments. This is a placeholder until
we decide on how we want to mangle specific functions.
@github-actions github-actions bot requested a review from jonmeow January 8, 2025 01:09
@@ -17,7 +17,7 @@ namespace Carbon::SemIR {
namespace {
struct Worklist {
// The file containing the instruction we're currently processing.
const File* sem_ir;
const File* sem_ir = nullptr;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great :) I was wondering inside if this would be a good idea on the last PR

@@ -337,11 +344,25 @@ struct Worklist {

auto InstFingerprinter::GetOrCompute(const File* file, InstId inst_id)
-> uint64_t {
Worklist worklist = {.sem_ir = nullptr,
.todo = {{file, inst_id}},
Worklist worklist = {.todo = {{file, inst_id}},
.fingerprints = &fingerprints_};
worklist.Run();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorta aside, but the new GetOrCompute avoids doing a second lookup, and it seems like Run() could also keep around the last fingerprint and return it, which would be the same as the lookup here and save us that?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point, done. (In this PR for convenience, let me know if you'd prefer that I split it out.)

Comment on lines +356 to +362
worklist.Prepare(file);
worklist.Add(inst_block_id);
if (!worklist.todo.empty()) {
worklist.Run();
worklist.Prepare(file);
worklist.Add(inst_block_id);
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't love how this is breaking the rather nice Worklist/Run abstraction. /me goes to stare at Worklist and see if this can be done internally

I see, it's because InstBlockId is not an InstId so it can't go in todo. This is maybe a terrible idea but just in case - did you consider constructing some artificial typed inst here that has an InstBlockId inside it, and give that as the todo?

Or a possibly heavier or different shaped hammer: todo could become a list of variant<InstId, InstBlockId> and Run could deal with that internally?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I find this a little unsatisfying too. We can't construct a real InstId for a new artificial instruction without modifying the File, which seems out-of-contract (and we're using this from lowering which really ought to not mutate the File).

I think switching to storing a variant in the todo list could be a bit heavyweight. On the other hand, if we see the same InstBlockIds repeatedly, caching their fingerprints might save us some work. I think we should probably make this call based on performance data, but I don't think we have representative examples we can use to measure this yet, so I've added a TODO to try this out later.

One other option I considered was storing an Inst rather than an InstId in the todo list. That'd allow us to use an artificial typed inst here without modifying the IR. Again it's a little heavy -- an Inst is four times the size of an InstId -- and I'm not sure whether it's worth it.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok thanks. Yeah if we're concerned with performance here, having some kind of benchmark that we can look at would be great! I would be a little surprised if a variant of two ids is noticeably worse than one id, but we can't tell without measuring.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For the performance impact, it's mostly the extra levels in the Merkle tree that I'm concerned with, rather than the variant itself. I'd expect that it's a win to add separate levels for InstBlocks if we often end up reusing the work, and not otherwise. [And I don't know what would constitute "often" either :)]

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, I was not imagining that changing actually, I think. Maybe I will send a lil followup to this and you can see if you like what I am picturing or not.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

#4776 is what I was thinking, see what you think. No hard feelings if it's not what you'd like.

@zygoloid zygoloid requested a review from danakj January 8, 2025 19:28
@zygoloid zygoloid enabled auto-merge January 8, 2025 19:54
@zygoloid zygoloid added this pull request to the merge queue Jan 8, 2025
Merged via the queue into carbon-language:trunk with commit 9a5f2d7 Jan 8, 2025
8 checks passed
@zygoloid zygoloid deleted the toolchain-fingerprint-specifics branch January 8, 2025 21:17
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants