Failed to split a symbolic allocation domain. #3480

Open
wujingyue opened this issue Nov 26, 2024 · 0 comments

This is a spin-off from #3458 (comment).

Repro:

// The test fails as is. The symbolic IterDomains in loop/allocation are not
// concretized. I tried to change DynamicTransformConcretizer::mutate to grab
// all expressions between root and allocation but still couldn't get it to
// work.
TEST_F(AllocationDomainTest, DISABLED_InputAllocationIsSplit_Symbolic) {
  auto fusion = std::make_unique<Fusion>();
  FusionGuard fg(fusion.get());

  TensorView* in = makeContigTensor(1);
  TensorView* out = set(in);
  fusion->addInput(in);
  fusion->addOutput(out);

  in->split(0, 2);
  in->setAllocationDomain(in->getLoopDomain(), true);

  FusionExecutorCache executor_cache(std::move(fusion));
  auto options = at::TensorOptions().dtype(at::kFloat).device(at::kCUDA);
  at::Tensor in_tensor = at::randn({6}, options);
  auto out_tensors = executor_cache.runFusionWithInputs({in_tensor});

  testValidate(
      executor_cache.fusion(), out_tensors, {in_tensor}, __LINE__, __FILE__);
}

$ bin/nvfuser_tests --gtest_filter=*InputAllocationIsSplit_Symbolic --gtest_also_run_disabled_tests
C++ exception with description " INTERNAL ASSERT FAILED at "/opt/pytorch/nvfuser/csrc/ir/utils.cpp":965, please report a bug with repro script to NVFuser at https://github.com/NVIDIA/Fuser/issues. dom0 has unreachable IDs. dom0: iS27{( (( (( getMetaData(T0) )).logical_size ))[0] )}. dom1: iS2{( ceilDiv(i0, 2) )}, iS3{2}

Apparently, replaceSymbolicSizes failed to replace the symbolic sizes in the allocation/loop domain, leading to a disconnected ID graph.
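
To illustrate what a "disconnected ID graph" means here, below is a minimal standalone sketch. It is a toy model and not nvFuser code: `ToySplit`, `allReachable`, `connectedCase`, and `disconnectedCase` are hypothetical names, and `iS0` is an assumed name for the original symbolic ID that the split consumed (the log only shows `iS27`, `iS2`, and `iS3`). The assumption is that the logical domain was rewritten to `iS27` while the split still takes the old ID as input, so the split outputs can no longer be reached from the logical domain.

```cpp
#include <set>
#include <string>
#include <vector>

// Toy stand-in for a split expression: one input ID produces two output IDs.
struct ToySplit {
  std::string in;
  std::string outer;
  std::string inner;
};

// Returns true if every ID in `to` can be reached from the IDs in `from`
// by following the given splits forward.
bool allReachable(
    const std::vector<std::string>& from,
    const std::vector<ToySplit>& splits,
    const std::vector<std::string>& to) {
  std::set<std::string> known(from.begin(), from.end());
  bool changed = true;
  while (changed) {
    changed = false;
    for (const ToySplit& s : splits) {
      // A split is traversable only once its input ID is known.
      if (known.count(s.in) != 0 && known.count(s.outer) == 0) {
        known.insert(s.outer);
        known.insert(s.inner);
        changed = true;
      }
    }
  }
  for (const std::string& id : to) {
    if (known.count(id) == 0) {
      return false;
    }
  }
  return true;
}

// The starting domain still holds iS0 (assumed name), so the split outputs
// iS2 and iS3 are reachable.
bool connectedCase() {
  std::vector<std::string> logical = {"iS0"};
  std::vector<ToySplit> splits = {ToySplit{"iS0", "iS2", "iS3"}};
  std::vector<std::string> allocation = {"iS2", "iS3"};
  return allReachable(logical, splits, allocation);
}

// The starting domain was replaced by iS27, but the split still consumes
// iS0, so iS2 and iS3 are unreachable.
bool disconnectedCase() {
  std::vector<std::string> logical = {"iS27"};
  std::vector<ToySplit> splits = {ToySplit{"iS0", "iS2", "iS3"}};
  std::vector<std::string> allocation = {"iS2", "iS3"};
  return allReachable(logical, splits, allocation);
}
```

In this toy model, `connectedCase()` returns true and `disconnectedCase()` returns false. The second case mirrors the shape of the reported failure, where dom0 is `iS27` and dom1 is `iS2`, `iS3`.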
