Minor cleanup in executor.cpp #2750
- Add NVF_THROW
- Add more information in bindInputs error
- Add IValue debug string function
- Cleanup some stale code
#define NVF_ERROR(cond, ...) \
  if ((!(cond))) { \
    nvfuser::nvfErrorFail( \
#define NVF_THROW(...) \
Wanted a throw to be able to get some better error messages in bindInputs.
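For readers unfamiliar with the pattern, here is a minimal sketch of how an unconditional throw macro can sit next to a condition-checking one. `SKETCH_ERROR`, `SKETCH_THROW`, `joinMessage`, and `parseDigit` are hypothetical names made up for illustration; they are not the nvFuser definitions, which are only partially visible in the diff above.

```cpp
#include <cassert>
#include <sstream>
#include <stdexcept>
#include <string>

// Hypothetical helper: streams every argument into a single message string.
template <typename... Args>
std::string joinMessage(const Args&... args) {
  std::ostringstream ss;
  // Pack expansion inside an array initializer streams each argument in order.
  int expand[] = {0, ((void)(ss << args), 0)...};
  (void)expand;
  return ss.str();
}

// Hypothetical stand-in for a conditional check: throws only if cond is false.
#define SKETCH_ERROR(cond, ...)                            \
  do {                                                     \
    if (!(cond)) {                                         \
      throw std::runtime_error(joinMessage(__VA_ARGS__));  \
    }                                                      \
  } while (0)

// Hypothetical stand-in for an unconditional throw: no dummy `false` condition.
#define SKETCH_THROW(...) throw std::runtime_error(joinMessage(__VA_ARGS__))

// Example use: the failure branch is unconditional, so a throw macro reads
// more naturally than SKETCH_ERROR(false, ...).
int parseDigit(char c) {
  if (c >= '0' && c <= '9') {
    return c - '0';
  }
  SKETCH_THROW("Expected a digit but got '", c, "'");
}
```

An unconditional form like this is handy inside a `catch` block, where the failure has already been detected and only a better message is needed.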
csrc/executor_utils.cpp
Outdated
ss << "When trying to run the provided host program,"
   << " there was an error with the provided input " << i
   << ". Provided input was:\n ";
ss << PolymorphicValue_functions::toString(*args[i]);
ss << "\n which does not match the expected input:\n ";
ss << inputs[i]->toString() << "\n";
Is e.what() worth keeping in the new exception message? It has the original source code location at least, which I think helps debugging.
In this instance I didn't think it was particularly helpful, as it's the caller that we would typically want to point at, not the inside of the expression evaluator, which is merely the first to find the problem. I think we'd actually want the error to be thrown higher up, where bindInputs is called, as I expect most failures are caused by bad inputs.
WDYT?
I trust your judgement -- I haven't run into enough errors to have a strong opinion.
AFAICT, ExpressionEvaluator::bind is a deep function that can fail at
Fuser/csrc/evaluator_common.cpp
Line 369 in 346e51c
"Could not evaluate metadata expression for ",
I agree, that's why I lifted the error to the common place where we bind a bunch of inputs to the expression evaluator that come from somewhere not generated by nvFuser (developers in tests and Thunder in integration).
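A rough sketch of the idea discussed in this thread: catch the deep failure at the one place where caller-provided inputs are bound, and rethrow with a message that names the offending input. Everything here (`bindOne`, `bindAll`, the string-based "types", the `keep_inner_message` flag) is a hypothetical simplification for illustration, not the actual nvFuser code.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical stand-in for a deep bind routine: it only knows what it sees
// locally, so its message cannot say which caller input was at fault.
void bindOne(const std::string& expected, const std::string& provided) {
  if (expected != provided) {
    throw std::runtime_error("type mismatch deep inside the evaluator");
  }
}

// Hypothetical caller-side wrapper: the common place where inputs are bound.
// keep_inner_message toggles whether the original e.what() is appended, which
// is the trade-off debated in the review comments.
void bindAll(
    const std::vector<std::string>& expected,
    const std::vector<std::string>& provided,
    bool keep_inner_message) {
  if (expected.size() != provided.size()) {
    throw std::invalid_argument("wrong number of inputs");
  }
  for (std::size_t i = 0; i < provided.size(); ++i) {
    try {
      bindOne(expected[i], provided[i]);
    } catch (const std::exception& e) {
      std::ostringstream ss;
      ss << "There was an error with the provided input " << i
         << ". Provided input was:\n  " << provided[i]
         << "\n  which does not match the expected input:\n  " << expected[i]
         << "\n";
      if (keep_inner_message) {
        ss << "Original error: " << e.what() << "\n";
      }
      throw std::runtime_error(ss.str());
    }
  }
}
```

Appending the inner message costs little and preserves the original location, while dropping it keeps the report focused on the caller's mistake; the flag just makes both options visible side by side.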
Thanks for removing the dead code!
std::vector<at::Tensor> outputs,
ExpressionEvaluator& expr_eval);

// TODO: args shouldn't come in a reference here because we will append the
@jjsjann123 see the TODO here. We never make use of args once they're updated with the outputs; we always pass the outputs back as an array of tensors. So the behavior differs from evaluateFusionOutputs, but that seems to be the behavior we want.
Cleanup printing. Co-authored-by: Jingyue Wu <[email protected]>
!build
…nt sizes through the exact graph.
!build
I went back over the error handling and could use another review, please, @jjsjann123 @wujingyue: @wujingyue for the changes in expr_evaluator.cpp, and @jjsjann123 because I touched the ops testing in Python.
csrc/expr_evaluator.cpp
Outdated
void handlePropagateError(
    Fusion* fusion,
    ExpressionEvaluator* expr_eval,
    std::shared_ptr<VectorOfUniqueEntries<const IterDomain*>> id_set) {
Suggested change:
- std::shared_ptr<VectorOfUniqueEntries<const IterDomain*>> id_set) {
+ const VectorOfUniqueEntries<const IterDomain*>& id_set) {
Passing a shared_ptr by value is slightly more expensive than passing a reference to the underlying object, since the copy has to bump the reference count. The reference seems to suffice here because id_set outlives this function.
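To illustrate the point with a small sketch (using `std::vector<int>` as a stand-in for the real container, and hypothetical function names): taking the `shared_ptr` by value creates a second owner for the duration of the call, whereas a const reference to the pointee involves no ownership bookkeeping at all.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// By value: the parameter is a copy of the shared_ptr, so the reference count
// is incremented on entry and decremented on exit.
long useCountByValue(std::shared_ptr<std::vector<int>> ids) {
  return ids.use_count();
}

// By const reference to the pointee: no shared_ptr copy and no reference-count
// traffic. This is safe as long as the caller keeps the object alive for the
// duration of the call.
std::size_t sizeByRef(const std::vector<int>& ids) {
  return ids.size();
}
```

A caller holding `auto ids = std::make_shared<std::vector<int>>(3, 0);` would call `sizeByRef(*ids)`; the dereference at the call site is the only change needed.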
// outputs to be able to send it to the kernel. For now none of the users are
// reconsuming the args, so it is okay. It isn't done now because changing it
// from a reference makes a call as runFusion({}) ambiguous, and that is used
// in some places in the codebase.
Would be really interesting to see how we can resolve this cleanly, but I don't know any way to do that without changing the call site to not use a braced-init-list. This gives me a headache: https://en.cppreference.com/w/cpp/language/overload_resolution#Implicit_conversion_sequence_in_list-initialization
I don't know of anything that can help overload resolution here. Wondering if @zasdfgbnm knows any dark magic?
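The ambiguity described in the TODO can be reproduced in miniature. In this sketch, `ArgHolder`, `InputList`, and `runSketch` are hypothetical stand-ins, not the real nvFuser types or signatures. With a non-const lvalue reference parameter, `{}` cannot bind to the first overload, so only the second is viable; switching the first overload to pass-by-value would make both viable through a user-defined conversion, and the call would no longer compile.

```cpp
#include <cassert>
#include <initializer_list>
#include <vector>

// Hypothetical stand-in for an argument holder that the callee may mutate.
struct ArgHolder {
  std::vector<int> values;
};

// Hypothetical stand-in for a read-only list of inputs.
struct InputList {
  InputList() = default;
  InputList(std::initializer_list<int> v) : values(v) {}
  std::vector<int> values;
};

// Overload 1: non-const lvalue reference. A temporary built from `{}` cannot
// bind to it, so this overload is not viable for runSketch({}).
int runSketch(ArgHolder&) {
  return 1;
}

// Overload 2: const reference. `{}` value-initializes a temporary InputList.
int runSketch(const InputList&) {
  return 2;
}

// If overload 1 were declared as `int runSketch(ArgHolder)`, then
// runSketch({}) would be ambiguous: both overloads would require a
// user-defined conversion from `{}`, and neither would be better.
// Naming the type at the call site, e.g. runSketch(InputList{}), avoids the
// ambiguity regardless of how overload 1 is declared.
```

Naming the type at the call site is the workaround mentioned in the comment above; it does change every `runFusion({})`-style call, which is exactly the cost being weighed.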
Typos Co-authored-by: Jingyue Wu <[email protected]>
…to executor_cleanup
!build
!build
Was just going through executor a bit to get oriented and tried to do a bit of cleanup.
Improves error handling when invalid static sizes are provided to nvFuser, specifically when incompatible sizes are found within ExpressionEvaluator::propagateBoundValuesThroughExactMaps(Fusion* fusion).
Before, this code would throw an error like:
Now it looks like: