Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Set parallel to false if parent_job is not nothing #1348

Merged
merged 1 commit into from
Mar 16, 2024

Conversation

giordano
Copy link
Contributor

While doing some conference-driven development (🙂) I found that Enzyme was generating atomicrmw instructions in the LLVM IR of my gradient, but my backend doesn't support them, and so my program was crashing at runtime...but only in Pluto. @vchuravy pointed out the problem may be this setting of parallel, related to the fact Pluto starts by default julia with multiple threads. I worked around the problem by fixing the number of threads to 1 in my Pluto notebook, but Valentin suggested also this change as a slightly more robust fix.

Copy link
Member

@vchuravy vchuravy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The only thing we should think about is that now Julia has thread adoption and so the number of threads may change during the execution.

So I think we have to be conservative and for the CPU always assume parallel.

@vchuravy
Copy link
Member

@giordano How does the IPU have a way of handling updates from multiple threads to the same location?

I.e. is there ever a situation where you have a parallel read? Those need to become parallel accumulate ops

@codecov-commenter
Copy link

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 93.54%. Comparing base (2f6d564) to head (7e6a477).

❗ Your organization needs to install the Codecov GitHub app to enable full functionality.

Additional details and impacted files
@@             Coverage Diff             @@
##             main    #1348       +/-   ##
===========================================
+ Coverage   75.19%   93.54%   +18.34%     
===========================================
  Files          35        7       -28     
  Lines       10669      248    -10421     
===========================================
- Hits         8023      232     -7791     
+ Misses       2646       16     -2630     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@giordano
Copy link
Contributor Author

Tile-level multi-threading is still a to-do for me: JuliaIPU/IPUToolkit.jl#30 🥲

@giordano
Copy link
Contributor Author

I guess I'd need to write a multi-threaded codelet in C++ and see what's the resulting LLVM IR (this is what I've been doing so far to figure out the expected IR from Julia), but atomicrmw instructions like

  %12 = atomicrmw fadd float* %"'ipg.i", float %11 monotonic, align 1, !dbg !1259

in the Enzyme derivative were causing a runtime error

Unresolved relocation: symbol '__atomic_compare_exchange' on tile 741

I don't know if I need to do something to "activate" a multi-threaded environment on the IPU side 🤔

@vchuravy
Copy link
Member

Yeah it might mean that they use custom intrinsics or that you need to link in a library

@vchuravy
Copy link
Member

@wsmoses all the local libEnzyme versions seem broken due to a build failure.

@vchuravy vchuravy merged commit e5fbf09 into EnzymeAD:main Mar 16, 2024
8 of 43 checks passed
@giordano giordano deleted the patch-2 branch March 16, 2024 23:53
@giordano giordano restored the patch-2 branch March 16, 2024 23:53
@giordano giordano deleted the patch-2 branch March 16, 2024 23:53
@wsmoses
Copy link
Member

wsmoses commented Mar 17, 2024

@vchuravy i think this breaks cuda/amd/etc. basically because if there is a parallel read it won't get atomic added together

@vchuravy
Copy link
Member

This should not impact CUDA? I.e. 5 lines later we set it to true for CUDA

@wsmoses
Copy link
Member

wsmoses commented Mar 17, 2024

Ah missed that, yeah this is fine then.

Maybe later we should upgrade the interface for backends to signify if they add parallelism

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants