Sycl Half #1710
base: half_batch
3bbe66b to 09c777a
ba0dab2 to f1cc46e
dpcpp/base/math.hpp
Outdated
// TODO: check whether mac compiler always use complex version even when real
// half
#define COMPLEX_HALF_OPERATOR(_op, _opeq) \
Has this TODO been addressed?
dpcpp/components/atomic.dp.hpp
Outdated
// unsupported
auto old = *addr;
*addr += val;
return old;
If it's unsupported, we should not compile it
For a workaround, we could use an atomic_cas loop
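A minimal host-side sketch of such a compare-and-swap loop, assuming `std::atomic<float>` as a stand-in for the device atomic. The helper name `atomic_add_cas` is hypothetical and not part of this PR; a device version would have to go through whatever atomic primitive the backend actually provides for the value's bit pattern.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical helper: emulates an atomic fetch-add with a CAS loop for
// types that have no native atomic add. Returns the value seen before the add.
template <typename T>
T atomic_add_cas(std::atomic<T>& target, T val)
{
    T old = target.load(std::memory_order_relaxed);
    // On failure, compare_exchange_weak stores the current value into `old`,
    // so the next iteration recomputes `old + val` from fresh data.
    while (!target.compare_exchange_weak(old, old + val,
                                         std::memory_order_relaxed)) {
    }
    return old;
}

// Small usage example: add 2.5 to 1.0 and check both results.
inline void atomic_add_cas_example()
{
    std::atomic<float> sum{1.0f};
    const float previous = atomic_add_cas(sum, 2.5f);
    assert(previous == 1.0f);
    assert(sum.load() == 3.5f);
}
```

Unlike the plain read, add, and write shown in the diff, the loop retries whenever another thread changed the value in between, so no update is lost.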
0f9a564 to e497cea
Instead of this workaround, would creating a non-std complex implementation with sycl::half work, e.g. gko::complex<sycl::half>? Then you could do the usual device type mapping stuff to actually use this.
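To make the suggestion concrete, here is a rough sketch of what a non-std complex type could look like. Everything in it is an assumption for illustration: the namespace `sketch`, the class layout, and the operator set are not taken from this PR, and `float` stands in for `sycl::half` so that the example compiles without a SYCL toolchain.

```cpp
#include <cassert>

namespace sketch {

// Hypothetical stand-in for an internal complex type over a scalar T.
// It stores real and imaginary parts contiguously, like std::complex.
template <typename T>
class complex {
public:
    constexpr complex(T re = T{}, T im = T{}) : re_{re}, im_{im} {}

    constexpr T real() const { return re_; }
    constexpr T imag() const { return im_; }

    complex& operator+=(const complex& other)
    {
        re_ += other.re_;
        im_ += other.im_;
        return *this;
    }

    complex& operator*=(const complex& other)
    {
        // (a + bi)(c + di) = (ac - bd) + (ad + bc)i
        const T re = re_ * other.re_ - im_ * other.im_;
        im_ = re_ * other.im_ + im_ * other.re_;
        re_ = re;
        return *this;
    }

    friend complex operator+(complex a, const complex& b) { return a += b; }
    friend complex operator*(complex a, const complex& b) { return a *= b; }

private:
    T re_;
    T im_;
};

}  // namespace sketch

// Usage example with float in place of a half type.
inline void complex_sketch_example()
{
    const sketch::complex<float> a{1.0f, 2.0f};
    const sketch::complex<float> b{3.0f, 4.0f};
    const auto product = a * b;  // (1 + 2i)(3 + 4i) = -5 + 10i
    assert(product.real() == -5.0f);
    assert(product.imag() == 10.0f);
}
```

Because the type owns its operators, it would not depend on how the standard library specializes `std::complex` for a non-standard scalar.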
+1 to this idea, though we should keep the implementation internal to Ginkgo to allow future changes
Yes, of course, it should be a private type.
The complex type is internal; it will not be exposed publicly.
We only add the dependency through -fsycl, which is used only in the dpcpp backend.
I wonder whether it affects performance, especially since the Intel complex maps std::complex<sycl::half> to __spv::complex_half, which might be their internal type.
Comparing gko::half and sycl::half, I have seen a 1.5x to 2x slowdown when using gko::half.
So can you also use __spv::complex_half here? It sounds a bit like you could just map std::complex<gko::half> to that type and would not need to introduce a new type at all.
To adapt it, we need to do the same trick so that we can provide the implementation before the <complex> header is loaded.
// after providing std::complex<sycl::half>, we can load their <complex> to
// complete the header chain.

#if GINKGO_DPCPP_MAJOR_VERSION > 7 || \
    (GINKGO_DPCPP_MAJOR_VERSION == 7 && GINKGO_DPCPP_MINOR_VERSION >= 1)

#if defined(__has_include_next)
// GCC/clang support go through this path.
#include_next <complex>
#else
// MSVC doesn't support "#include_next", so we take the same workaround in
// stl_wrappers/complex.
#include <../stl_wrappers/complex>
#endif

#else

#include <complex>

#endif
You should be able to just remove this and always include <complex>.
Some of the comments can also be adjusted.
Yes, it will be a normal header again after everything works well
ca0767c to 454381a
This PR is necessary to support half in SYCL while we still use gko::half on the host.
It creates a type mapping in the SYCL part, similar to cuda_type or hip_type, and applies it to the variables that have a value type.
It will be merged/checked once we have half and Intel CI.
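The kind of mapping described above can be sketched with a small type trait. This is only an illustration under stated assumptions: `host_half`, `device_half`, `device_type`, and `as_device_type` are hypothetical names chosen here, with two plain structs standing in for gko::half and sycl::half; the actual names and specializations in the PR may differ.

```cpp
#include <cassert>
#include <type_traits>

namespace sketch {

// Hypothetical stand-ins for the host half type and the device half type.
struct host_half { unsigned short bits; };
struct device_half { unsigned short bits; };

// Default: types without a device counterpart map to themselves.
template <typename T>
struct device_type_impl { using type = T; };

// The host half maps to the device half.
template <>
struct device_type_impl<host_half> { using type = device_half; };

// Pointers and const qualifiers are mapped recursively.
template <typename T>
struct device_type_impl<T*> {
    using type = typename device_type_impl<T>::type*;
};

template <typename T>
struct device_type_impl<const T> {
    using type = const typename device_type_impl<T>::type;
};

template <typename T>
using device_type = typename device_type_impl<T>::type;

// The reinterpretation below is only meaningful if both types share a layout.
static_assert(sizeof(host_half) == sizeof(device_half),
              "host and device half must have the same size");

// Reinterprets a host pointer as the matching device-typed pointer.
template <typename T>
device_type<T*> as_device_type(T* ptr)
{
    return reinterpret_cast<device_type<T*>>(ptr);
}

}  // namespace sketch

// Usage example: the pointer value is unchanged, only its static type differs.
inline void device_type_example()
{
    sketch::host_half value{0x3c00};
    sketch::device_half* mapped = sketch::as_device_type(&value);
    assert(static_cast<void*>(mapped) == static_cast<void*>(&value));
}
```

With a trait like this, kernel launch code can convert its value-typed arguments at the call boundary while the rest of the host code keeps using the host type.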