Cuda support
Jiří Novotný edited this page Sep 15, 2015 · 10 revisions
EACirc provides a version of the gate circuit that can run on CUDA devices. To enable CUDA support, EACirc must be compiled with special settings. The CUDA-enabled binary can run considerably faster. Running the accelerated computation requires a GPU that supports CUDA (basically any modern nVIDIA GPU); the minimum supported GPU architecture is Fermi. Note that CUDA runs only on x64 machines.
- Prerequisites: properly installed CUDA, version 7 or newer. To download the CUDA SDK, visit https://developer.nvidia.com/cuda-downloads
- Enable CUDA support: When CMake detects CUDA (this usually happens automatically when CUDA is properly installed), the CMake option BUILD_CUDA becomes available. If this option is enabled, the compiled binary will be able to run on CUDA devices.
- nVIDIA driver supporting CUDA 7 (note: driver version 350 or higher should probably suffice)
- EACirc binary compiled with CUDA support
- Enable CUDA (option EACIRC/CUDA/ENABLED is set to 1).
- Use Gate 2 circuit backend (option EACIRC/MAIN/CIRCUIT_REPRESENTATION is set to 3).
- Disable circuit memory (option EACIRC/GATE_CIRCUIT/USE_MEMORY is set to 0).
- Disable JVMSim functionality (option EACIRC/GA_CIRCUIT/ALLOWED_FUNCTIONS/FNC_JVM is set to 0).
- Parameter EACIRC/TEST_VECTORS/SET_SIZE should be at least 32000 (any other number will do, but a multiple of the CUDA block size is the most effective).
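The settings above can be checked mechanically before a run. The following is a minimal sketch, assuming the options have already been read into a flat Python dictionary keyed by option name; the dictionary representation and the helper `check_cuda_settings` are hypothetical and not part of EACirc.

```python
# Hypothetical helper: option names and values are taken from the list above,
# but the flat-dictionary representation is an assumption made for illustration.
REQUIRED = {
    "EACIRC/CUDA/ENABLED": 1,
    "EACIRC/MAIN/CIRCUIT_REPRESENTATION": 3,
    "EACIRC/GATE_CIRCUIT/USE_MEMORY": 0,
    "EACIRC/GA_CIRCUIT/ALLOWED_FUNCTIONS/FNC_JVM": 0,
}
SET_SIZE_KEY = "EACIRC/TEST_VECTORS/SET_SIZE"
RECOMMENDED_MIN_SET_SIZE = 32000


def check_cuda_settings(settings):
    """Return a list of problems; an empty list means the settings pass."""
    problems = []
    for key, expected in REQUIRED.items():
        actual = settings.get(key)
        if actual != expected:
            problems.append(f"{key} should be {expected}, found {actual}")
    set_size = settings.get(SET_SIZE_KEY)
    if set_size is None or set_size < RECOMMENDED_MIN_SET_SIZE:
        problems.append(
            f"{SET_SIZE_KEY} is recommended to be at least "
            f"{RECOMMENDED_MIN_SET_SIZE}, found {set_size}"
        )
    return problems


if __name__ == "__main__":
    settings = dict(REQUIRED)
    settings[SET_SIZE_KEY] = 32256
    print(check_cuda_settings(settings))  # []
```

The set size check is reported like the others here, although the text treats it as a recommendation rather than a hard requirement.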
Note: Changing these options can dramatically increase or decrease the effectiveness of the CUDA circuit. (If you don't know what you are doing, a decrease is most likely.)
- option EACIRC/CUDA/BLOCK_SIZE - size of the block in kernel (default is 512)
- option EACIRC/TEST_VECTORS/SET_SIZE - number of running threads in kernel
- If any of the following options is changed, it is recommended to experimentally find the most effective settings for the CUDA circuit (block size and test set size):
  - EACIRC/MAIN/CIRCUIT_SIZE_INPUT
  - EACIRC/MAIN/CIRCUIT_SIZE_OUTPUT
  - EACIRC/GATE_CIRCUIT/NUM_LAYERS
  - EACIRC/GATE_CIRCUIT/SIZE_LAYER
  - EACIRC/GATE_CIRCUIT/NUM_CONNECTORS
- EACIRC/MAIN/SAVE_STATE_FREQ - any number higher than 100 will do, preferably 1000
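To see why a test set size that is a multiple of the block size is the most effective, it helps to work out the arithmetic. The sketch below assumes the common CUDA convention that the number of blocks is the thread count divided by the block size, rounded up; how EACirc actually launches its kernel is not described on this page, and the function names are hypothetical.

```python
def launch_geometry(set_size, block_size=512):
    """Return (blocks, idle_threads) needed to cover set_size threads.

    Assumes a ceiling-division grid, the usual CUDA convention; this is an
    illustration, not EACirc's actual launch code.
    """
    if set_size <= 0 or block_size <= 0:
        raise ValueError("set_size and block_size must be positive")
    blocks = -(-set_size // block_size)  # ceiling division
    idle_threads = blocks * block_size - set_size
    return blocks, idle_threads


def round_up_to_block(set_size, block_size=512):
    """Smallest multiple of block_size that is >= set_size."""
    blocks, _ = launch_geometry(set_size, block_size)
    return blocks * block_size


if __name__ == "__main__":
    print(launch_geometry(32000))    # (63, 256)
    print(round_up_to_block(32000))  # 32256
```

Under that assumption, a set size of 32000 with the default block size of 512 leaves 256 threads of the last block without work, whereas 32256 fills every block. When searching experimentally for good settings, candidate set sizes produced by `round_up_to_block` are a reasonable starting point.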
- EACIRC/CUDA/BLOCK_SIZE = 512
- EACIRC/TEST_VECTORS/SET_SIZE = 3200
- EACIRC/MAIN/SAVE_STATE_FREQ = 1000
- EACIRC/MAIN/CIRCUIT_SIZE_INPUT = 16
- EACIRC/MAIN/CIRCUIT_SIZE_OUTPUT = 1
- EACIRC/GATE_CIRCUIT/NUM_LAYERS = 5
- EACIRC/GATE_CIRCUIT/SIZE_LAYER = 8
- EACIRC/GATE_CIRCUIT/NUM_CONNECTORS = 4
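The slash-separated option names read like paths through nested sections of a configuration file. Assuming that is the case (this page does not describe the file format), the example values above can be grouped by section as in the following sketch; the helper `nest` is hypothetical.

```python
def nest(flat):
    """Turn {"A/B/C": value} into {"A": {"B": {"C": value}}}."""
    tree = {}
    for path, value in flat.items():
        *sections, leaf = path.split("/")
        node = tree
        for section in sections:
            node = node.setdefault(section, {})
        node[leaf] = value
    return tree


# Example values copied from the list above.
EXAMPLE = {
    "EACIRC/CUDA/BLOCK_SIZE": 512,
    "EACIRC/TEST_VECTORS/SET_SIZE": 3200,
    "EACIRC/MAIN/SAVE_STATE_FREQ": 1000,
    "EACIRC/MAIN/CIRCUIT_SIZE_INPUT": 16,
    "EACIRC/MAIN/CIRCUIT_SIZE_OUTPUT": 1,
    "EACIRC/GATE_CIRCUIT/NUM_LAYERS": 5,
    "EACIRC/GATE_CIRCUIT/SIZE_LAYER": 8,
    "EACIRC/GATE_CIRCUIT/NUM_CONNECTORS": 4,
}

if __name__ == "__main__":
    tree = nest(EXAMPLE)
    print(tree["EACIRC"]["GATE_CIRCUIT"])
    # {'NUM_LAYERS': 5, 'SIZE_LAYER': 8, 'NUM_CONNECTORS': 4}
```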