-
Notifications
You must be signed in to change notification settings - Fork 92
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Premise Selection on GPUs #53
Comments
I believe they should work for CUDA if changed from: Lines 335 to 336 in 38a5a48
to: auto p_topk_indices = topk_indices.to_vector<int>();
auto p_topk_values = topk_values.to_vector<float>();
// Possibly the following line is also required; I'd have tested if I had a system that the CUDA support ran on; sadly it appears something accidentally assumed a newer GPU than mine, as mine is technically supported according to the documentation. Well, encoding and generating fail with the prebuild binary; I can't even seem to compile with CUDA support.
ctranslate2::synchronize_stream(device); . For better throughput retrieval should be batched; as-is, the bulk of it's work does a single multiply-add operation per scalar loaded from memory; for example, a Ryzen 9 5950X would, if I'm not mistaken, be DRAM read bandwidth bound up to a batch size of about 80; this would though require loop tiling to both register file and L1 cache, as with a 1472 embedding dimension and f32 values 80 queries would already barely fit in L2 cache. And yes, I do mean that you retrieving 100 queries with a small enough k in the TopK, if running sufficiently optimal code, would be expected to take approximately twice as long as retrieving a single query. It's that bad. |
We have recent reports about using Lean Copilot on certain Ubuntu machines with CUDA-enabled GPUs. We currently apply a suboptimal fix with CMake flags. Linking PR #109. |
We only select premises on CPUs but want to make it work also for GPUs.
LeanCopilot/cpp/ct2.cpp
Line 246 in 8c7b338
@Peiyang-Song . A recent discussion on Zulip may be relevant: https://leanprover.zulipchat.com/#narrow/stream/219941-Machine-Learning-for-Theorem-Proving/topic/LeanCopilot/near/418208166
The text was updated successfully, but these errors were encountered: