
Yadriggy C OpenCL

chibash edited this page Jun 20, 2018 · 1 revision

Yadriggy-C can generate OpenCL code on macOS.

OpenCL programming

OpenCL assumes that a kernel function runs in a separate memory space. Hence, we first have to copy data to the OpenCL memory space, which is usually GPU memory, run the kernel there, and then copy the results back to main memory.

To express an array in the OpenCL memory space, Yadriggy-C provides the OclArray class. An OclArray instance is an array of float (single-precision floating-point numbers). It can be created only in the initialize method of a Yadriggy::C::Program subclass, and it has to be referred to through an instance variable.

An OpenCL kernel is a block given to the ocl_times method on an integer. The kernel is executed in parallel by the OpenCL runtime. The number of threads is the integer that ocl_times is called on.

Example

require 'yadriggy/c/opencl'

class Inc < Yadriggy::C::Program
  def initialize
    @data = OclArray.new(16)
  end
  def inc(arr, n) ! Void
    typedecl arr: Float32Array, n: Int
    @data.copyfrom(arr, n)
    16.ocl_times {|i| @data[i] *= 2.0 }
    @data.copyto(arr, n)
  end
end

The OpenCL kernel is {|i| @data[i] *= 2.0 }. i is the thread identifier. Since ocl_times is called on 16, 16 threads run in parallel to execute the kernel, and i ranges from 0 to 15. Note that the kernel can access only @data, the instance variable of Inc. It cannot access the parameters arr or n since they are not in the OpenCL memory space.
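
To make the role of the thread identifier concrete, the kernel can be emulated in plain Ruby. This is only a sketch: it does not use Yadriggy or OpenCL, emulate_ocl_times is a hypothetical helper, and it runs the 16 "threads" one after another instead of in parallel.

```ruby
# Plain-Ruby emulation of the kernel above; it does not use Yadriggy or OpenCL.
# emulate_ocl_times is a hypothetical helper that runs the "threads" one by one.
def emulate_ocl_times(n)
  n.times {|i| yield i }   # i stands in for the OpenCL thread identifier
end

data = Array.new(16) {|i| i.to_f }   # stand-in for the contents of @data
seen_ids = []
emulate_ocl_times(16) do |i|
  seen_ids << i
  data[i] *= 2.0                     # same body as the kernel
end
```

After the loop, seen_ids holds the identifiers 0 to 15 and every element of data has been doubled. Because each thread touches only its own element, the order of execution does not matter, which is what lets the real runtime execute the block in parallel.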

All the instance variables such as @data have to be initialized in the initialize method. They have to be OclArray objects. For example, OclArray.new(16) creates an array with 16 elements of float type.

Before and after running a kernel function, data has to be copied into and back from the OclArray. This is done by the copyfrom and copyto methods on an OclArray object.

  • copyfrom(arr, n)

    It copies data from arr into the OclArray object in the OpenCL memory space.

    arr is a Float32Array object.

    n is the number of elements (type Integer).

  • copyto(arr, n)

    It copies data from the OclArray object to arr.

    arr is a Float32Array object.

    n is the number of elements (type Integer).
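
The copy-in, compute, copy-out round trip can be modelled in plain Ruby. The sketch below does not use Yadriggy: FakeOclArray is a hypothetical stand-in for OclArray, and it assumes that copyfrom and copyto transfer the first n elements and leave the rest of the buffer untouched, which the description above does not state explicitly.

```ruby
# FakeOclArray is a hypothetical plain-Ruby stand-in for OclArray.
# Assumption: copyfrom/copyto move the first n elements only.
class FakeOclArray
  def initialize(size)
    @buf = Array.new(size, 0.0)   # models the buffer in OpenCL memory
  end

  # Copy n elements from the host array into the buffer.
  def copyfrom(arr, n)
    n.times {|i| @buf[i] = arr[i].to_f }
  end

  # Copy n elements from the buffer back into the host array.
  def copyto(arr, n)
    n.times {|i| arr[i] = @buf[i] }
  end

  def [](i)
    @buf[i]
  end

  def []=(i, value)
    @buf[i] = value
  end
end

host = Array.new(8) {|i| i.to_f }   # host-side data, like arr in the example
device = FakeOclArray.new(16)
device.copyfrom(host, host.size)    # host -> "device"
16.times {|i| device[i] *= 2.0 }    # sequential stand-in for the kernel
device.copyto(host, host.size)      # "device" -> host
```

Here the host array has 8 elements while the buffer has 16, as in the example on this page, so only the first 8 results are copied back.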

To run the program, use ocl_compile.

m = Inc.ocl_compile
arr = Yadriggy::C::Float32Array.new(8)
arr.set_values {|i| i }
m.ocl_init(1)
m.inc(arr, arr.size)
m.ocl_finish
puts arr.to_a

The parameters to ocl_compile are the same as those of compile in Yadriggy::C::Program. The ocl_compile method returns a module that provides the functions defined in the DSL code, such as inc. Before calling the kernel function, the ocl_init method has to be called on the module to initialize the OpenCL device. The argument to ocl_init specifies the OpenCL device. Usually, 0 is the CPU, 1 is the first GPU device, 2 is the second device, and so on. After calling the kernel function, the ocl_finish method has to be called to release the OpenCL resources. ocl_init and ocl_finish are automatically added to the module.
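
Since ocl_init and ocl_finish have to bracket every use of the module, one possible way to keep them paired is a small wrapper with ensure. This is only a sketch: with_ocl is a hypothetical helper, not part of Yadriggy, and FakeModule is a stand-in that lets the code run without OpenCL. Whether ocl_finish is safe to call after a failed kernel call in the real library is an assumption here.

```ruby
# with_ocl is a hypothetical helper, not part of Yadriggy.
# It calls ocl_init, yields the module, and always calls ocl_finish.
def with_ocl(mod, device)
  mod.ocl_init(device)
  begin
    yield mod
  ensure
    mod.ocl_finish   # runs even if the block raises
  end
end

# FakeModule is a stand-in that records calls so the sketch runs
# without Yadriggy or OpenCL.
class FakeModule
  attr_reader :calls

  def initialize
    @calls = []
  end

  def ocl_init(device)
    @calls << [:ocl_init, device]
  end

  def ocl_finish
    @calls << [:ocl_finish]
  end

  def inc(arr, n)
    @calls << [:inc, n]
  end
end

fake = FakeModule.new
with_ocl(fake, 1) {|m| m.inc([0.0, 1.0], 2) }
```

With the real module returned by ocl_compile, the same helper would be called as with_ocl(m, 1) {|mod| mod.inc(arr, arr.size) }, assuming the module responds to ocl_init and ocl_finish as described above.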

Further reading

See the image filter example under yadriggy/examples.
