Yadriggy C OpenCL
Yadriggy-C can generate OpenCL code on macOS.
OpenCL assumes that a kernel function runs in a separate memory space. Hence, we first have to copy the data to the OpenCL memory space, which is usually GPU memory, and then copy the results back to the main memory after the kernel has run.
To express an array in the OpenCL memory space, Yadriggy-C provides the OclArray class. An instance of OclArray is an array of float (single-precision floating-point numbers). This instance can be created only in the initialize method of Yadriggy::C::Program. The created instance has to be referred to through an instance variable.
An OpenCL kernel is a block given to the ocl_times method on an integer. The kernel is executed in parallel by the OpenCL runtime. The number of threads is given by the integer that ocl_times is called on.
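
    require 'yadriggy/c/opencl'

    class Inc < Yadriggy::C::Program
      def initialize
        # An array of 16 floats in the OpenCL memory space.
        @data = OclArray.new(16)
      end

      def inc(arr, n) ! Void
        typedecl arr: Float32Array, n: Int
        # Copy n elements from main memory into the OpenCL memory space.
        @data.copyfrom(arr, n)
        # The kernel: 16 threads, each doubling one element.
        16.ocl_times {|i| @data[i] *= 2.0 }
        # Copy n elements back to main memory.
        @data.copyto(arr, n)
      end
    end
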
The OpenCL kernel is {|i| @data[i] *= 2.0 }. The block parameter i is the thread identifier. Since ocl_times is called on 16, 16 threads will run in parallel to execute the kernel, and i will range from 0 to 15.
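As a rough mental model only, the effect of the kernel can be emulated in plain Ruby. The sketch below does not use Yadriggy-C or OpenCL: it runs the block body sequentially with Integer#times on an ordinary Ruby array (the local variable data is a stand-in for @data), just to show which values i takes and what each thread computes.

```ruby
# Plain-Ruby stand-in for the OclArray held in @data (16 floats).
# This is not Yadriggy-C code; it only models the kernel's effect.
data = Array.new(16) { |i| i.to_f }

# Record which thread identifiers the block sees.
ids = []

# Sequential emulation of 16.ocl_times: one block call per thread id.
# The real kernel runs these calls in parallel on the OpenCL device.
16.times do |i|
  ids << i
  data[i] *= 2.0
end

puts ids.inspect   # the thread identifiers, 0 through 15
puts data.inspect  # every element doubled
```

Each call touches only element i, so the result does not depend on the order in which the threads run. That is why the sequential loop is a fair model of this particular kernel.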
Note that the kernel can access only @data, the instance variable of Inc. It cannot access the parameters arr or n since they are not in the OpenCL memory space.
All the instance variables, such as @data, have to be initialized in the initialize method. They have to be OclArray objects. For example, OclArray.new(16) creates an array with 16 elements of float type.
Before and after running a kernel function, data has to be copied into and back from the OclArray. This is done by the copyfrom and copyto methods on an OclArray object.
- copyfrom(arr, n)
  It copies data from arr into the OclArray object in the OpenCL memory space. arr is a Float32Array object. n is the number of elements (type Integer).
- copyto(arr, n)
  It copies data from the OclArray object to arr. arr is a Float32Array object. n is the number of elements (type Integer).
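The copy semantics can be illustrated with a small mock written in plain Ruby. MockOclArray is a hypothetical class invented for this sketch, not part of Yadriggy-C. It assumes that copyfrom and copyto transfer the first n elements. The text above does not say what happens when n exceeds the size of either array, so the mock simply raises in that case.

```ruby
# Hypothetical mock of OclArray's copy methods, for illustration only.
# Assumption: both methods transfer the first n elements.
class MockOclArray
  def initialize(size)
    @buf = Array.new(size, 0.0)   # stands in for the OpenCL memory space
  end

  # Main memory -> mock device: copy the first n elements of arr.
  def copyfrom(arr, n)
    check(arr, n)
    n.times { |i| @buf[i] = arr[i].to_f }
    self
  end

  # Mock device -> main memory: copy the first n elements into arr.
  def copyto(arr, n)
    check(arr, n)
    n.times { |i| arr[i] = @buf[i] }
    self
  end

  def [](i)
    @buf[i]
  end

  def []=(i, v)
    @buf[i] = v
  end

  def size
    @buf.size
  end

  private

  # The mock's own policy: reject counts that do not fit either side.
  def check(arr, n)
    raise ArgumentError, "n out of range" if n < 0 || n > @buf.size || n > arr.size
  end
end

host = Array.new(8) { |i| i.to_f }     # plays the role of a Float32Array
dev  = MockOclArray.new(16)

dev.copyfrom(host, host.size)          # copy in
dev.size.times { |i| dev[i] *= 2.0 }   # stand-in for the kernel
dev.copyto(host, host.size)            # copy back

puts host.inspect
```

Only the first 8 of the 16 elements carry data from host in this run; the remaining elements are doubled too but never copied back.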
To run the program, use ocl_compile.
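
    m = Inc.ocl_compile                      # returns a module that provides inc
    arr = Yadriggy::C::Float32Array.new(8)
    arr.set_values {|i| i }
    m.ocl_init(1)                            # device 1 is usually the first GPU
    m.inc(arr, arr.size)
    m.ocl_finish                             # release the OpenCL resources
    puts arr.to_a
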
The parameters to ocl_compile are the same as those of compile in Yadriggy::C::Program. The ocl_compile method returns a module that provides the functions defined in the DSL code, such as inc.
Before calling the kernel function, the ocl_init method has to be called on the module to initialize the OpenCL device. The argument to ocl_init specifies the OpenCL device. Usually, 0 is the CPU, 1 is the first GPU device, 2 is the second device, and so on.
After calling the kernel function, the ocl_finish method has to be called to release the resources for OpenCL. ocl_init and ocl_finish are automatically added to the module.
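Because ocl_finish has to be called after the kernel function, one option is to wrap the two calls in a helper that uses ensure. The helper name with_opencl is hypothetical and not part of Yadriggy-C. The sketch assumes only that the module responds to ocl_init(device) and ocl_finish as described above. A stub class stands in for the module returned by Inc.ocl_compile so that the sketch runs without OpenCL.

```ruby
# Hypothetical helper (not part of Yadriggy-C): initializes the device,
# yields the module, and always calls ocl_finish afterwards.
def with_opencl(mod, device)
  mod.ocl_init(device)
  begin
    yield mod
  ensure
    mod.ocl_finish   # runs even if the block raises
  end
end

# Stub standing in for the module returned by Inc.ocl_compile.
# It only records the order in which its methods are called.
class StubModule
  attr_reader :calls

  def initialize
    @calls = []
  end

  def ocl_init(device)
    @calls << [:ocl_init, device]
  end

  def inc(arr, n)
    @calls << [:inc, n]
  end

  def ocl_finish
    @calls << [:ocl_finish]
  end
end

m = StubModule.new
arr = Array.new(8) { |i| i.to_f }
with_opencl(m, 1) { |mod| mod.inc(arr, arr.size) }
puts m.calls.inspect
```

If the real module behaves as described, the same helper could presumably be called with the result of Inc.ocl_compile in place of the stub. This has not been checked against the library.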
See the image filter example under yadriggy/examples.