
Remove truediv operation #2837

Closed
wants to merge 1 commit into from

Conversation

@rdspring1 (Collaborator) commented Aug 23, 2024

This PR removes the truediv operation, which mapped to torch.true_divide and torch.div(a, b, rounding_mode=None).

Why?

  • It is difficult to distinguish truediv from div in the Fusion IR.
  • The Thunder prim.div maps to the div operator; the other flavors of division are implemented through this prim.
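For reference, Python's own `/` operator implements the true-division behavior the removed op mapped to: it always yields a floating-point quotient, even for integer operands with an exact integer result. A minimal plain-Python illustration (not nvfuser code):

```python
# True division always promotes to float, even when the quotient is exact.
a, b = 7, 2
print(a / b)   # 3.5
print(6 / 3)   # 2.0 — a float, not the int 2
```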

csrc/ir/nodes.cpp (outdated review thread, resolved)
@jacobhinkle (Collaborator)

What exactly are the semantics of our div meant to be? For ints it is truncdiv (C++ int division). If either arg is float, then it is truediv. My point is that we do not actually implement python div (floordiv), so what is the value in separately supporting truediv?
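The distinction drawn here can be made concrete in plain Python (a sketch, not nvfuser code): C++ integer division truncates toward zero, Python's `//` floors toward negative infinity, and true division returns the exact float quotient. The three only diverge on negative operands:

```python
import math

a, b = -7, 2
trunc = math.trunc(a / b)   # C++-style int division: rounds toward zero -> -3
floor = a // b              # Python floordiv: rounds toward -inf -> -4
true_q = a / b              # true division: exact float quotient -> -3.5
print(trunc, floor, true_q)
```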

@rdspring1 (Collaborator, Author) commented Aug 23, 2024

Nvfuser's truediv is like torch.div with rounding_mode=None

NVFUSER_DEFINE_BINARY_FLOAT_OP(truediv, Div)
BinaryOpType::op_type, v1, v2, TypePromotion::float_op_config);

Are you proposing to just remove truediv?

@jacobhinkle (Collaborator)

Are you proposing to just remove truediv?

I'm not sure what I'm proposing. You're right that in the kernel our Div will cast int inputs to float, but that's not true in our evaluator. That's probably just a bug in the evaluator, but I think that behavior is also used in scalar arithmetic, e.g. in #2818.

I guess for ultimate clarity we should have all these separate nodes/ops: truediv, truncdiv, floordiv, and ceildiv and nothing just called div.
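The four flavors named above can be sketched side by side in plain Python (hypothetical helper names, for illustration only; the float-based rounding is fine for small integers):

```python
import math

def truediv(a, b):  return a / b              # exact float quotient
def truncdiv(a, b): return math.trunc(a / b)  # round toward zero (C++ int div)
def floordiv(a, b): return math.floor(a / b)  # round toward -inf (Python //)
def ceildiv(a, b):  return math.ceil(a / b)   # round toward +inf

# On -7 / 2 the four disagree: -3.5, -3, -4, -3
print(truediv(-7, 2), truncdiv(-7, 2), floordiv(-7, 2), ceildiv(-7, 2))
```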

@rdspring1 (Collaborator, Author)

I believe truediv isn't used by Thunder, so maybe we can remove it for clarity. For now, I removed truediv from expression evaluator.

@jjsjann123 (Collaborator) left a comment

This is used for translating CPP Fusion to python FusionDefinition

What's the usage here? Can you add an example to show that?

I'm also not sure we actually do the right thing here, i.e. we promote tensors to floating point before div, but do we do the same for scalars? I share the concern @jacobhinkle raised in his comment. But I don't think we have to resolve everything in this PR, hence the ask for an example of the problem we want to patch.

BTW, our python binding binds __truediv__ to div. We might want to change that as well, since I think that's also relevant here?

The division handling in nvfuser is a concerning question. We should open another issue to track it. Does that sound good to you @jacobhinkle?

csrc/type.cpp Outdated
@@ -606,6 +608,7 @@ static const char* binary_op_type_inline_op2string(BinaryOpType t) {
case BinaryOpType::Add:
return "+";
case BinaryOpType::Div:
case BinaryOpType::Truediv:
Collaborator:

but / isn't truediv though.

Collaborator:

I feel this is getting scarier here. A binary op with BinaryOpType::Truediv doesn't do truediv?!

@jacobhinkle (Collaborator)

Does that sound good to you @jacobhinkle ?

💯

@rdspring1 (Collaborator, Author)

What's the usage here? Can you add an example to show that?

It is this series of PRs.

@rdspring1 (Collaborator, Author) commented Aug 27, 2024

@jjsjann123 Why not get rid of truediv in the python-frontend? It isn't used in Thunder. Division semantics are a big headache.

@rdspring1 (Collaborator, Author)

Do we need to support truediv, truncdiv, floordiv? Thunder already handles this.

@jjsjann123 (Collaborator)

What exactly are the semantics of our div meant to be? For ints it is truncdiv (C++ int division). If either arg is float, then it is truediv. My point is that we do not actually implement python div (floordiv), so what is the value in separately supporting truediv?

Looks like the Thunder prims only have C++-style div and py_floordiv (which we are not claiming anyway).
I don't think it makes sense to have the others in the API, since we do not have correct support at this moment.

@rdspring1 changed the title from "Add truediv BinaryOpType" to "Remove truediv operation" on Sep 3, 2024
@rdspring1 requested a review from jjsjann123 on September 4, 2024
@rdspring1 (Collaborator, Author)

@jjsjann123 I changed this PR to remove the truediv operation.

@jjsjann123 (Collaborator) left a comment

I think I see why you want to have a BinaryOpType::Truediv. In #2841, this is just so we have a clean 1-to-1 translation for the truediv part?

I'm not sure how big of a deal it is to have that 1-to-1 translation. We don't have to keep a truediv binary op; having it implicitly handled by floating-point promotion followed by a C++-style div should be good enough.

Not having truediv is somewhat of an inconvenience IMHO. Thunder doesn't have to use it, but we still have our python API where I can define a fusion definition manually. So I tend to think maybe we want to keep it, but I don't have a strong opinion there either.
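The "floating-point promotion followed by a C++-style div" approach described here can be sketched in plain Python (hypothetical helpers, assuming only a raw truncating div primitive is available):

```python
import math

def raw_div(a, b):
    # Stand-in for a C++-style div: truncating for ints, exact for floats.
    if isinstance(a, int) and isinstance(b, int):
        return math.trunc(a / b)
    return a / b

def truediv_via_promotion(a, b):
    # Promote both operands to floating point first, then apply the plain div.
    return raw_div(float(a), float(b))

print(raw_div(7, 2))                # 3   — int inputs truncate
print(truediv_via_promotion(7, 2))  # 3.5 — promotion recovers true division
```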

csrc/ops/arith.h Outdated
NVF_API Val* truediv(Val* v1, Val* v2);
NVF_API TensorView* truediv(TensorView* v1, Val* v2);
NVF_API TensorView* truediv(Val* v1, TensorView* v2);
NVF_API TensorView* truediv(TensorView* v1, TensorView* v2);
Collaborator:

Naive question: I don't think we need to (or should) remove the C++ truediv API. Since these do float promotion, it is a truediv in nature, even though it's not a BinaryOpType::TrueDiv.


@@ -1443,7 +1443,6 @@ void initNvFuserPythonBindings(PyObject* module) {
NVFUSER_PYTHON_BINDING_BINARY_OP("add", add)
NVFUSER_PYTHON_BINDING_BINARY_OP("atan2", atan2)
NVFUSER_PYTHON_BINDING_BINARY_OP("div", div)
NVFUSER_PYTHON_BINDING_BINARY_OP("truediv", truediv)
Collaborator:

Note: there's also an entry for __truediv__ that is translated to div. We have a comment saying that our div is true div, which I think is a lie.

```python
import torch
from nvfuser import FusionDefinition, DataType

def nvfuser_fusion_id1(fd : FusionDefinition) -> None :
    T0 = fd.define_tensor(shape=[-1, -1], contiguity=[True, True], dtype=DataType.Int32, is_cpu=False, stride_order=[1, 0])
    T1 = fd.define_tensor(shape=[-1, -1, -1], contiguity=[True, True, True], dtype=DataType.Int32, is_cpu=False, stride_order=[2, 1, 0])
    T2 = T0 / T1
    #T2 = fd.ops.truediv(T0, T1)
    fd.add_output(T2)

with FusionDefinition() as fd:
    nvfuser_fusion_id1(fd)

t0 = torch.ones((20,), dtype=torch.int32, device='cuda:0').as_strided((10, 2), (2, 1))
t1 = torch.ones((40,), dtype=torch.int32, device='cuda:0').as_strided((2, 10, 2), (20, 2, 1))
t1 += 1

inputs = [t0.clone(), t1.clone()]

o = fd.execute(inputs)[0]

print(o)
```
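The surprise the repro above demonstrates, reduced to plain Python: t0 holds ones and t1 holds twos, so element-wise this is 1 divided by 2. C++-style integer division gives 0, while the `/` operator (the `__truediv__` the binding claims to implement) should give 0.5:

```python
import math

cpp_style = math.trunc(1 / 2)  # what an integer div kernel computes -> 0
py_truediv = 1 / 2             # what __truediv__ promises -> 0.5
print(cpp_style, py_truediv)
```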

@jjsjann123 (Collaborator) left a comment

BTW, this isn't necessarily caused by your PR. I'm glad you are exposing these issues. ❤️

Two things that I do have a stronger opinion on:

  1. In this PR, the BinaryOpType::Truediv isn't doing true div, so that doesn't feel right.
  2. Not in your code, but our python binding has
// In PyTorch, __div__ (//) and __truediv__ (/) are different.
// When applied to integer-dtype arguments, they do as expected, returning
// integer and float outputs, respectively. When applied to two floating-type
// arguments, they return the floor of division for // and plain division for
// /. When applied to mixed types, the types are promoted, so the
// floating-point behavior is returned.
// Our div operator matches the __truediv__ behavior, so we do not implement
// __div__.
NVFUSER_PYTHON_BINDING_BINARY_OP_SPECIAL("__truediv__", "div", div)

This is just totally wrong and we should fix it.
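For the record, the quoted comment's parenthetical is itself mistaken: in Python 3, `//` dispatches to `__floordiv__`, not `__div__` (which no longer exists), and on floats `//` still floors rather than returning the plain quotient. A quick plain-Python check:

```python
# Python 3 operator protocol: `/` -> __truediv__, `//` -> __floordiv__.
assert hasattr(int, "__truediv__") and hasattr(int, "__floordiv__")
assert not hasattr(int, "__div__")  # __div__ existed only in Python 2

# Even on floats, // floors; it does not return the plain quotient.
print(7.0 // 2.0)  # 3.0
print(7.0 / 2.0)   # 3.5
```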

@rdspring1 closed this Dec 13, 2024
rdspring1 added a commit that referenced this pull request Dec 14, 2024
This PR updates the comments for `__truediv__` operator defined in
python bindings. The current comment does not reflect what the code
actually does.

Reference:
#2837 (review)
jacobhinkle pushed a commit that referenced this pull request Dec 16, 2024
This PR updates the comments for `__truediv__` operator defined in
python bindings. The current comment does not reflect what the code
actually does.

Reference:
#2837 (review)