[WIP] Use the AlmostExact map when traversing across multiple TV ops #3317

Closed
wants to merge 1 commit into from

naoyam
Collaborator

@naoyam naoyam commented Oct 30, 2024

This was an attempt to fix #3299. I believe the error is caused by a particular combination of reshape and expanded broadcast domains: expanded broadcast domains become non-broadcast domains before reshape here, and that seems to cause some unexpected effects in the indexing traversal. The use of the Permissive graph looked suspicious to me, and replacing it with AlmostExact does fix the error in #3299. Unfortunately, it results in a different error, e.g.:

00:13:48 terminate called after throwing an instance of 'nvfuser::nvfError'
00:13:48   what():   INTERNAL ASSERT FAILED at "/opt/pytorch/nvfuser/csrc/device_lower/analysis/index_compute.cpp":727, please report a bug with repro script to NVFuser at https://github.com/NVIDIA/Fuser/issues. Could not find required iter domain in reference replay: iblockIdx.y172{108}

I think these are all due to the rather lax use of permissive mappings. I don't understand why this particular part needs to use the Permissive graph, but apparently switching it to AlmostExact results in the other error.

I thought fixing the legacy indexer might be a simple change, but apparently that's not the case. I'll think about a workaround using the new IdModel-based indexer, which should not have these problems since it's much stricter about iter-domain mappings (although still not perfect).
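
For readers unfamiliar with the expand-then-reshape pattern mentioned in the first paragraph of this comment, here is a rough sketch in plain Python. It is only a tensor-layout analogy and not nvFuser code: it models an expanded broadcast dimension as a stride-0 dimension and uses a simplified check to show that such a dimension cannot be flattened into its neighbor as a view, which is loosely why the expanded dimension has to be treated as a real (non-broadcast) one before a reshape. All function names and shapes below are hypothetical and are not taken from the failing models.

```python
def contiguous_strides(shape):
    """Row-major strides (in elements) for a densely packed tensor."""
    strides, step = [], 1
    for extent in reversed(shape):
        strides.append(step)
        step *= extent
    return tuple(reversed(strides))


def expand(shape, strides, new_shape):
    """Expand size-1 (broadcast) dimensions by giving them stride 0."""
    out = []
    for old, new, stride in zip(shape, new_shape, strides):
        if old == new:
            out.append(stride)
        elif old == 1:
            out.append(0)  # every index along this dim reads the same element
        else:
            raise ValueError("only size-1 dimensions can be expanded")
    return tuple(new_shape), tuple(out)


def can_merge_as_view(shape, strides, outer):
    """Simplified check: can dims `outer` and `outer + 1` be flattened
    into one dimension without copying any data?"""
    inner = outer + 1
    return strides[outer] == shape[inner] * strides[inner]


# Hypothetical shapes chosen for illustration only.
base_shape = (2, 1, 4)
base_strides = contiguous_strides(base_shape)
exp_shape, exp_strides = expand(base_shape, base_strides, (2, 3, 4))

# Merging dim 0 with the expanded dim 1 is not possible as a view,
# so the expanded dimension has to be materialized first.
needs_copy = not can_merge_as_view(exp_shape, exp_strides, outer=0)
print(exp_strides, needs_copy)
```

The same check succeeds on the original, unexpanded layout, so it is the expansion that forces the dimension to be materialized before the merge.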

@naoyam
Collaborator Author

naoyam commented Oct 30, 2024

!build --diff

Development

Successfully merging this pull request may close these issues.

Error found in Mistral-Nemo and Qwen2's Rope implementations
1 participant