Is your feature request related to a problem? Please describe.
One use case for this feature is GPU simulation. For example, in the MLS-MPM algorithm you have a grid of values that is updated from every particle in the simulation. Because each grid node can receive contributions from many particles, the grid has to be updated with atomics. Another example is aggregating data over particles or grid nodes in some other way, such as computing the average velocity or the forces that apply to objects.
Another use case is computing bounding boxes of procedurally generated meshes, or of anything else that can't be processed offline for some reason. You can simply do an atomic min/max with each position.
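As a rough CPU-side illustration of the bounding-box pattern, here is a minimal C++ sketch. The names `Bounds1D`, `atomic_min_float`, `atomic_max_float` and `fold_positions` are hypothetical and exist only for this example. `std::atomic<float>` has no min/max member in C++20, so the sketch emulates them with a compare-exchange loop; it is an analog of the idea, not shader code.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <limits>

// Hypothetical helper: atomic min on a float, emulated with a CAS loop.
inline void atomic_min_float(std::atomic<float>& target, float value) {
    float current = target.load(std::memory_order_relaxed);
    // Stop as soon as the stored value is already <= value, or our store wins.
    // On a failed exchange, `current` is refreshed with the latest stored value.
    while (value < current &&
           !target.compare_exchange_weak(current, value,
                                         std::memory_order_relaxed)) {
    }
}

// Hypothetical helper: atomic max on a float, same scheme as above.
inline void atomic_max_float(std::atomic<float>& target, float value) {
    float current = target.load(std::memory_order_relaxed);
    while (value > current &&
           !target.compare_exchange_weak(current, value,
                                         std::memory_order_relaxed)) {
    }
}

// One axis of a bounding box; a full box would hold three of these.
struct Bounds1D {
    std::atomic<float> lo{std::numeric_limits<float>::infinity()};
    std::atomic<float> hi{-std::numeric_limits<float>::infinity()};
};

// Folds a range of positions into the shared bounds. Several threads may
// call this concurrently on the same Bounds1D with different ranges.
inline void fold_positions(const float* xs, std::size_t n, Bounds1D& b) {
    for (std::size_t i = 0; i < n; ++i) {
        atomic_min_float(b.lo, xs[i]);
        atomic_max_float(b.hi, xs[i]);
    }
}
```

Folding the positions `{3.0f, -2.5f, 7.25f, 0.0f}` leaves `lo` at `-2.5f` and `hi` at `7.25f`, regardless of the order in which the calls are made.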
Describe the solution you'd like
Intrinsics for float32 atomic add/subtract/min/max, similar to the ones int32 already has.
Describe alternatives you've considered
There are 2 alternatives:
Converting floats to fixed point
This works in many circumstances, but has a few caveats:
You first need to balance the trade-off between representable range and precision. Native floats are better at preserving relative precision.
Fixed-point values can overflow without any way to detect it. Native floats overflow to infinity, which can be detected.
Using CAS loop to emulate float atomics
This works in some circumstances, but has several caveats:
If the hardware has native float atomics, those should perform better.
If too many threads attempt an atomic operation on the same value, contention between them can get so high that the shader times out and causes DEVICE_HUNG.
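The two alternatives above can be sketched on the CPU in C++. This is only an analog under stated assumptions: the helper names and the 16-bit fractional scale `kScale` are hypothetical choices for illustration. In a shader, the second variant would typically be built on `InterlockedCompareExchange` over a `uint` view of the data, with `asuint`/`asfloat` for the bit casts.

```cpp
#include <atomic>
#include <cassert>
#include <cmath>
#include <cstdint>
#include <cstring>

// --- Alternative 1: fixed point -------------------------------------------
// Assumed scale: 16 fractional bits. More fractional bits means finer
// precision but a smaller representable range.
constexpr float kScale = 65536.0f;  // 2^16

inline void fixed_add(std::atomic<std::int32_t>& acc, float value) {
    // Round to the nearest fixed-point step. The caller must keep
    // value * kScale within int32 range; the accumulator itself wraps
    // around silently if the running sum overflows.
    const auto step = static_cast<std::int32_t>(std::lround(value * kScale));
    acc.fetch_add(step, std::memory_order_relaxed);
}

inline float fixed_read(const std::atomic<std::int32_t>& acc) {
    return static_cast<float>(acc.load(std::memory_order_relaxed)) / kScale;
}

// --- Alternative 2: CAS loop on the raw float bits -------------------------
inline std::uint32_t float_to_bits(float f) {
    std::uint32_t u;
    std::memcpy(&u, &f, sizeof u);
    return u;
}

inline float bits_to_float(std::uint32_t u) {
    float f;
    std::memcpy(&f, &u, sizeof f);
    return f;
}

inline void cas_add(std::atomic<std::uint32_t>& bits, float value) {
    std::uint32_t expected = bits.load(std::memory_order_relaxed);
    for (;;) {
        const std::uint32_t desired =
            float_to_bits(bits_to_float(expected) + value);
        // On failure, `expected` is refreshed with the current contents and
        // the sum is recomputed. Under heavy contention this loop can spin
        // for a long time, which is the timeout risk described above.
        if (bits.compare_exchange_weak(expected, desired,
                                       std::memory_order_relaxed)) {
            return;
        }
    }
}
```

Adding `1.5f` and `2.25f` to a zero-initialized accumulator yields `3.75f` with either helper, since both inputs are exactly representable at this scale. The difference shows up at the edges: the fixed-point version loses small contributions below `1 / kScale` and wraps on overflow, while the CAS version keeps float semantics at the cost of retries.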
Additional context
Since floating-point add/subtract is not associative, the results of atomic add/subtract will not be deterministic. This is fine in most circumstances. Additionally, vendors should be free to perform optimizations that affect the order of operations, for example doing a wave sum first and then a single atomic per wave. This would also change the order of operations, but since the order is not consistent either way, it should be fine.
Atomic min/max won't have any caveats like this, though.
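The order sensitivity of float addition, and the order independence of min, can be seen in a small CPU-side C++ sketch. The function names and the `wave` parameter are hypothetical; `sum_grouped` only imitates the "sum per wave, then add once per wave" ordering and is not a model of any particular hardware.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Sum strictly left to right. `volatile` forces every intermediate result
// to be rounded to float, so the compiler cannot keep extra precision.
inline float sum_sequential(const std::vector<float>& v) {
    volatile float acc = 0.0f;
    for (float x : v) acc = acc + x;
    return acc;
}

// Sum each group of `wave` elements first, then add the partial sums.
inline float sum_grouped(const std::vector<float>& v, std::size_t wave) {
    volatile float total = 0.0f;
    for (std::size_t i = 0; i < v.size(); i += wave) {
        volatile float partial = 0.0f;
        const std::size_t end = std::min(i + wave, v.size());
        for (std::size_t j = i; j < end; ++j) partial = partial + v[j];
        total = total + partial;
    }
    return total;
}

// Min of each group first, then min of the partial results.
// Assumes v is non-empty and contains no NaN.
inline float min_grouped(const std::vector<float>& v, std::size_t wave) {
    float total = v.front();
    for (std::size_t i = 0; i < v.size(); i += wave) {
        const std::size_t end = std::min(i + wave, v.size());
        float partial = v[i];
        for (std::size_t j = i + 1; j < end; ++j) partial = std::min(partial, v[j]);
        total = std::min(total, partial);
    }
    return total;
}
```

For the input `{1e8f, 1.0f, -1e8f, 1.0f}`, `sum_sequential` returns `1.0f` while `sum_grouped` with `wave = 2` returns `0.0f`, because `1.0f` is below the rounding step of `1e8f` and is lost whenever it is added to it directly. `min_grouped` returns `-1e8f` for any group size, the same as a plain left-to-right minimum.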