In a nutshell, OCCA (pronounced like oca-rina) is an open-source library that aims to:
- Make it easy to program different types of devices (e.g. CPU, GPU, FPGA)
- Provide a unified API for interacting with backend device APIs (e.g. OpenMP, CUDA, HIP, OpenCL, Metal)
- JIT compile backend kernels and provide a kernel language (a minor extension to C) to abstract programming for each backend
The "Hello World" example of adding two vectors looks like:
@kernel void addVectors(const int entries,
                        const float *a,
                        const float *b,
                        float *ab) {
  for (int i = 0; i < entries; ++i; @tile(16, @outer, @inner)) {
    ab[i] = a[i] + b[i];
  }
}
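To make the `@tile(16, @outer, @inner)` annotation in the kernel above more concrete, here is a minimal plain C++ sketch of the loop structure it describes: an outer loop over tiles of 16 entries and an inner loop over the entries of each tile. It illustrates the idea only and is not the code OCCA generates; `addVectorsTiled` and `tileSize` are hypothetical names introduced for this example.

```cpp
#include <cassert>

// Illustration only: a hand-written version of the tiled loop structure.
// addVectorsTiled and tileSize are hypothetical names, not part of OCCA.
void addVectorsTiled(const int entries,
                     const float *a,
                     const float *b,
                     float *ab,
                     const int tileSize = 16) {
  assert(tileSize > 0);
  // Outer loop: one iteration per tile (the @outer level)
  for (int tileStart = 0; tileStart < entries; tileStart += tileSize) {
    // Inner loop: one iteration per entry inside the tile (the @inner level)
    for (int i = tileStart; i < tileStart + tileSize; ++i) {
      // Guard the last tile when entries is not a multiple of tileSize
      if (i < entries) {
        ab[i] = a[i] + b[i];
      }
    }
  }
}
```

On a GPU backend the outer level would typically correspond to work-groups or blocks and the inner level to individual threads, which is why the two levels are kept as separate loops here.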
Or we can inline it using C++ lambdas:
// Capture variables
occa::scope scope({
  {"a", a},
  {"b", b},
  {"ab", ab}
});
occa::forLoop()
  .tile({entries, 16})
  .run(OCCA_FUNCTION(scope, [=](const int i) -> void {
    ab[i] = a[i] + b[i];
  }));
Or we can take a more functional approach using occa::array:
// Capture variables
occa::scope scope({
  {"b", b}
});
occa::array<float> ab = (
  a.map(OCCA_FUNCTION(
    scope,
    [=](const float &value, const int index) -> float {
      return value + b[index];
    }
  ))
);
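For readers unfamiliar with the map-with-index pattern used in the occa::array example above, the following host-only C++ sketch shows the same semantics on a `std::vector`: each output entry is the result of calling the function with the input value and its index. `mapWithIndex` and `addVectorsFunctional` are hypothetical helpers written for this illustration; they are not part of OCCA and run serially on the CPU.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of OCCA): builds a new vector whose
// i-th entry is fn(input[i], i).
template <class T, class Fn>
std::vector<T> mapWithIndex(const std::vector<T> &input, Fn fn) {
  std::vector<T> output;
  output.reserve(input.size());
  for (std::size_t i = 0; i < input.size(); ++i) {
    output.push_back(fn(input[i], static_cast<int>(i)));
  }
  return output;
}

// Host-only counterpart of the occa::array example: ab[i] = a[i] + b[i]
std::vector<float> addVectorsFunctional(const std::vector<float> &a,
                                        const std::vector<float> &b) {
  assert(a.size() == b.size());
  // b is captured by reference, playing the role of the captured scope
  return mapWithIndex(a, [&](const float &value, const int index) -> float {
    return value + b[static_cast<std::size_t>(index)];
  });
}
```

The lambda has the same signature as the one passed to `OCCA_FUNCTION` above, so the sketch can serve as a serial reference when checking results.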
We maintain our documentation on the libocca.org site.
To build OCCA from source:
git clone --depth 1 https://github.com/libocca/occa.git
cd occa
make -j 4
Set up the environment variables from inside the occa directory:
# Linux
export PATH+=":${PWD}/bin"
export LD_LIBRARY_PATH+=":${PWD}/lib"
# macOS
export PATH+=":${PWD}/bin"
export DYLD_LIBRARY_PATH+=":${PWD}/lib"
The OCCA library is built around three objects, all covered in the 01_add_vectors example:
- occa::device
- occa::memory
- occa::kernel
cd examples/cpp/01_add_vectors
make
./main
See how to inline for loops using occa::forLoop in example 02_for_loops:
cd examples/cpp/02_for_loops
make
./main
Learn how to use occa::array in a functional way in example 03_arrays:
cd examples/cpp/03_arrays
make
./main
An occa executable is provided inside bin:
> occa
Usage: occa [OPTIONS] COMMAND [COMMAND...]
Helpful utilities related to OCCA workflows
Commands:
  autocomplete    Prints shell functions to autocomplete occa
                  commands and arguments
  clear           Clears cached files and cache locks
  compile         Compile kernels
  env             Print environment variables used in OCCA
  info            Prints information about available backend modes
  modes           Prints available backend modes
  translate       Translate kernels
  version         Prints OCCA version
Arguments:
  COMMAND    Command to run
Options:
  -h, --help    Print usage
To enable bash autocompletion for the occa command:
if which occa > /dev/null 2>&1; then
  eval "$(occa autocomplete bash)"
fi
OCCA is definitely not the only solution that aims to simplify programming on different hardware/accelerators. Here is a list of other libraries that have taken different approaches:
- The alpaka library is a header-only C++14 abstraction library for accelerator development. Its aim is to provide performance portability across accelerators through the abstraction (not hiding!) of the underlying levels of parallelism.
- RAJA is a library of C++ software abstractions, primarily developed at Lawrence Livermore National Laboratory (LLNL), that enables architecture and programming model portability for HPC applications.
- Kokkos Core implements a programming model in C++ for writing performance portable applications targeting all major HPC platforms. For that purpose it provides abstractions for both parallel execution of code and data management.