ctypes tutorial
The ocaml-ctypes library ("ctypes" for short) provides OCaml functions for describing C types and binding to C functions, making it possible to interface with C without writing or generating C code.
The easiest way to install ctypes is to use opam. Once you have opam installed, running the following command installs the library:
opam install ctypes
We'll see how to use ctypes to describe the types of some standard C and POSIX functions, then call the functions from OCaml. Let's start with the time function, which returns the current calendar time, and has the following signature:
time_t time(time_t *)
The first step is to open the Ctypes, PosixTypes and Foreign modules. The Ctypes module provides functions for describing C types in OCaml. The PosixTypes module includes some extra types, such as time_t. The Foreign module exposes the foreign function that makes it possible to bind C functions.
open Ctypes
open PosixTypes
open Foreign
The following code creates a binding for time:
let time = foreign "time" (ptr time_t @-> returning time_t)
The foreign function is the main link between OCaml and C. It takes two arguments: the name of the C function to bind, and a value describing the type of the bound function. Here the function type specifies one argument of type ptr time_t and a return type of time_t. The name bound by let in our example has the following type:
val time : time_t ptr -> time_t
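The same two-argument pattern applies to any C function whose type can be described. As a minimal sketch, assuming the ctypes and ctypes-foreign libraries are installed and linked, the standard C function abs can be bound and called in the same way (the name c_abs is chosen here only for illustration):

```ocaml
open Ctypes
open Foreign

(* C signature: int abs(int); *)
(* First argument: the C symbol name. Second: a description of its type. *)
let c_abs = foreign "abs" (int @-> returning int)

(* The bound value is an ordinary OCaml function of type int -> int. *)
let three = c_abs (-3)
```

Because the description uses int for both the argument and the result, the binding needs no pointer handling at all.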
We can call time immediately. The argument is of no interest, so we'll just pass a suitably-coerced null pointer:
time (from_voidp time_t null)
We're going to call time a few times, so let's create a wrapper function that passes the null pointer through:
(* val time' : unit -> time_t *)
let time' () = time (from_voidp time_t null)
Since time_t is an abstract type, we need a second function to do anything useful with the return values from time. We'll use the standard C function difftime, which has the following signature:
double difftime(time_t, time_t)
The following code creates a binding for difftime:
let difftime = foreign "difftime" (time_t @-> time_t @-> returning double)
This time the bound name difftime has the following OCaml type:
val difftime : time_t -> time_t -> float
Now we can create a timer function that calls time twice to measure the execution time of a function.
(* val measure_execution_time : (unit -> unit) -> float *)
let measure_execution_time timed_function =
  let start_time = time' () in
  let () = timed_function () in
  let end_time = time' () in
  (* difftime takes the later time first and returns end_time - start_time *)
  difftime end_time start_time
The measure_execution_time function has a problem: on many systems it uses a resolution of seconds, which may not be sufficiently precise. In a later section we'll look at how to refine the function to use a more precise timer.
Recall the descriptions of the types of time and difftime:
ptr time_t @-> returning time_t
time_t @-> time_t @-> returning double
The returning function may appear superfluous: why couldn't we simply give the types as follows?
ptr time_t @-> time_t
time_t @-> time_t @-> double
The reason involves higher-order types and two differences between the way that functions are treated in OCaml and C. First, functions are first-class values in OCaml, but not in C. For example, in C, it is possible to return a function pointer from a function, but not to return an actual function. Second, OCaml functions are typically defined in a curried style: the signature of a "two-argument function" is written as follows
val curried : int -> int -> int
but this really means
val curried : int -> (int -> int)
and the arguments can be supplied one at a time.
curried 3 4 (* supply both arguments *)
let f = curried 3 in f 4 (* supply one argument at a time *)
In contrast, C functions receive their arguments all at once; the equivalent C function type is the following:
int uncurried_C(int, int);
and the arguments must be supplied together:
uncurried_C(3, 4);
A C function written in curried style looks very different:
/* A function that accepts an int, and returns a function pointer that
accepts a second int and returns an int. */
typedef int (function_t)(int);
function_t *curried_C(int);
curried_C(3)(4); /* supply both arguments */
function_t *f = curried_C(3); f(4); /* supply one argument at a time */
The OCaml type of uncurried_C when bound by ctypes is int -> int -> int: a two-argument function. The OCaml type of curried_C when bound by ctypes is int -> (int -> int): a one-argument function that returns a one-argument function. In OCaml, of course, these types are absolutely equivalent. Since the OCaml types are the same, but the C semantics are quite different, we need some kind of marker to distinguish the cases; this is the purpose of returning.
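To make the role of the marker concrete, here is a toy model written in plain OCaml. It is only a sketch: the constructors Int, Funptr, Returns and Function and the arity function are illustrative names, not the ctypes API. The model shows that two descriptions with the same OCaml type can still record different C calling shapes.

```ocaml
(* Toy model of type descriptions; not the real ctypes definitions. *)
type _ typ =
  | Int : int typ
  | Funptr : 'a fn -> 'a typ          (* a pointer to a C function *)
and _ fn =
  | Returns : 'a typ -> 'a fn         (* marks where the C return type starts *)
  | Function : 'a typ * 'b fn -> ('a -> 'b) fn

let returning t = Returns t
let ( @-> ) arg rest = Function (arg, rest)

(* Number of arguments the C function receives in a single call. *)
let rec arity : type a. a fn -> int = function
  | Returns _ -> 0
  | Function (_, rest) -> 1 + arity rest

(* int uncurried_C(int, int); *)
let uncurried_c : (int -> int -> int) fn =
  Int @-> Int @-> returning Int

(* function_t *curried_C(int); *)
let curried_c : (int -> int -> int) fn =
  Int @-> returning (Funptr (Int @-> returning Int))

let uncurried_arity = arity uncurried_c   (* 2 *)
let curried_arity = arity curried_c       (* 1 *)
```

Both values have the type (int -> int -> int) fn, yet the position of returning records that the first takes two arguments at once while the second takes one argument and returns a function pointer.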
Pointers are at the heart of C, so they are necessarily part of ctypes, which provides support for pointer arithmetic, pointer conversions, reading and writing through pointers, and passing and returning pointers to and from functions. We've already seen a simple use of pointers in the argument of time. Let's look at a (very slightly) less trivial example where we pass a non-null pointer to a function. Continuing with the theme from earlier, we'll bind to the ctime function which converts a time_t value to a human-readable string. The C signature of ctime is as follows:
char *ctime(const time_t *timep);
The corresponding ctypes binding can be written
(* val ctime : time_t ptr -> string *)
let ctime = foreign "ctime" (ptr time_t @-> returning string)
Recall that we have a function that retrieves the current time as a time_t value:
val time' : unit -> time_t
In order to pass the result of time' to the ctime function we need to place it in addressable memory and retrieve its address. We can accomplish that by allocating space for it:
let t_ptr = allocate time_t (time' ())
The allocate function takes two arguments: the type of the memory to be allocated, and the initial value; it returns a suitably-typed pointer. We can now call ctime, passing the pointer as argument:
ctime t_ptr
(* => "Wed Jun 5 11:09:40 2013\n" *)
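The other pointer operations mentioned at the start of this section can be tried without binding any C function. The following sketch assumes only that the ctypes library is installed; the names cell and block are illustrative. It reads and writes through pointers with !@ and <-@, steps through memory with +@, and converts to and from void pointers.

```ocaml
open Ctypes

(* Allocate one int, then read and write through the pointer. *)
let cell = allocate int 41
let () = cell <-@ (!@ cell + 1)
let cell_value = !@ cell

(* Allocate room for three ints and fill them using pointer arithmetic. *)
let block = allocate_n int ~count:3
let () =
  for i = 0 to 2 do
    (block +@ i) <-@ (i * 10)
  done
let third = !@ (block +@ 2)

(* Convert to a void pointer and back again. *)
let as_void = to_voidp block
let first = !@ (from_voidp int as_void)
```

Pointer arithmetic with +@ counts in elements rather than bytes, as in C.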
The string type value in the specification of ctime is an example of a view. Views create new C type descriptions that have special behaviour when used to read or write C values. The string view wraps the C type char * (written as ptr char), and converts between the C and OCaml string representations each time the value is written or read.
The function used to create views is Ctypes.view; it has the following signature
val view : read:('a -> 'b) -> write:('b -> 'a) -> 'a typ -> 'b typ
The string view is created using a pair of functions that convert between the C and OCaml representations
val string_of_char_ptr : char ptr -> string
val char_ptr_of_string : string -> char ptr
(* val string : string typ *)
let string = view ~read:string_of_char_ptr ~write:char_ptr_of_string (ptr char)
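The same mechanism works for any pair of conversion functions. As a minimal sketch, assuming the ctypes library is installed, the following defines a hypothetical view named cbool that stores an OCaml bool as a C int (the name is chosen here to avoid clashing with anything the library provides):

```ocaml
open Ctypes

(* Hypothetical view: a C int that is read and written as an OCaml bool. *)
let cbool : bool typ =
  view
    ~read:(fun i -> i <> 0)               (* C int -> OCaml bool *)
    ~write:(fun b -> if b then 1 else 0)  (* OCaml bool -> C int *)
    int

(* The conversions run on every write and every read. *)
let flag = allocate cbool true
let before = !@ flag
let () = flag <-@ false
let after = !@ flag
```

The memory behind flag holds a plain C int; only the OCaml side sees a bool.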
Views can often make slightly awkward C types easier to use. The ctypes distribution includes type values ptr_opt and funptr_opt that map possibly-null pointers into option values.
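For instance, the time function from earlier accepts a null pointer, so its argument can be described with ptr_opt instead of ptr. The sketch below assumes ctypes and ctypes-foreign are installed; the names now, slot and gap are illustrative.

```ocaml
open Ctypes
open PosixTypes
open Foreign

(* None is passed to C as a null pointer, Some p as the pointer p. *)
let time = foreign "time" (ptr_opt time_t @-> returning time_t)
let difftime = foreign "difftime" (time_t @-> time_t @-> returning double)

(* No coerced null pointer is needed any more. *)
let now = time None

(* With a real pointer, time also stores its result at that address. *)
let slot = allocate_n time_t ~count:1
let stored = time (Some slot)
let gap = difftime (!@ slot) stored
```

Using an option type here moves the null case into the OCaml type, so callers cannot forget about it.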
The C constructs struct and union make it possible to build new types from existing types. In ctypes there are counterparts that work similarly.
Let's improve the timer function that we wrote earlier. The POSIX function gettimeofday makes it possible to retrieve the time with microsecond resolution. The signature of gettimeofday is as follows:
int gettimeofday(struct timeval *tv, struct timezone *tz);
The struct timeval type has the following definition:
struct timeval {
  long tv_sec;
  long tv_usec;
};
Using ctypes we can describe this type as follows:
type timeval
let timeval : timeval structure typ = structure "timeval"
let tv_sec = timeval *:* long
let tv_usec = timeval *:* long
let () = seal timeval
The first line defines a new OCaml type timeval that we'll use to instantiate the parameterised structure type. Creating a new OCaml type to reflect the underlying C type in this way means that the structure we define will be incompatible with other structures in the program, which helps to avoid errors.
The second line calls structure, which creates the new structure type. At this point the structure type is incomplete, so we can add fields, but cannot yet create structure values. Once we seal the structure the situation is reversed: we will be able to create values, but adding fields to a sealed structure is an error.
The names tv_sec and tv_usec are bound to structure fields. Structure fields are typed accessors, associated with a particular structure, that correspond to labels in C.
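The define, seal, then use sequence can be tried on a small structure without calling any C function. This sketch assumes a ctypes release in which fields are declared with the field function (which also takes the C label as a string) rather than with the operator shown above; the point structure and its field names are hypothetical.

```ocaml
open Ctypes

(* A hypothetical C structure: struct point { int x; int y; }; *)
type point
let point : point structure typ = structure "point"

(* Fields may only be added while the structure is still unsealed. *)
let x = field point "x" int
let y = field point "y" int
let () = seal point

(* After sealing, values can be created and accessed through the fields. *)
let p = make point
let () =
  setf p x 3;
  setf p y 4
let sum = getf p x + getf p y
```

Each field value carries both the structure it belongs to and its element type, so using x with a value of some other structure type is rejected at compile time.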
Since gettimeofday also accepts a struct timezone pointer, we need to define a second structure type:
type timezone
let timezone : timezone structure typ = structure "timezone"
We don't need to create struct timezone values, so we can leave this struct as incomplete.
Now we're ready to bind to gettimeofday:
(* val gettimeofday : timeval structure ptr -> timezone structure ptr -> int *)
let gettimeofday = foreign "gettimeofday"
(ptr timeval @-> ptr timezone @-> returning_checking_errno int)
There's one new feature here: the returning_checking_errno function behaves like returning, except that it checks whether the bound C function modifies the C error flag errno. Changes to errno are mapped into exceptions.
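The effect of errno checking is easiest to see with a call that is expected to fail. The sketch below rests on two assumptions: that the installed ctypes release exposes errno checking through the optional ~check_errno argument of foreign, and that failures are reported as Unix.Unix_error (so the unix library must be linked as well). The directory name is deliberately one that should not exist.

```ocaml
open Ctypes
open Foreign

(* C signature: int chdir(const char *path); *)
(* Assumption: errno checking is requested with ~check_errno:true. *)
let chdir = foreign ~check_errno:true "chdir" (string @-> returning int)

(* chdir sets errno when the directory is missing, so the call raises. *)
let failed =
  try
    ignore (chdir "/no/such/directory/for/this/example");
    false
  with Unix.Unix_error (_, _, _) -> true
```

Turning errno changes into exceptions means callers do not have to inspect the integer return value to detect failure.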
As before we can create a wrapper to make gettimeofday easier to use. The functions make, addr and getf respectively create a structure value, retrieve the address of a structure value, and retrieve the value of a field.
(* val gettimeofday' : unit -> float *)
let gettimeofday' () =
  let tv = make timeval in
  (* the int result is discarded; failures are reported through errno checking *)
  let _ = gettimeofday (addr tv) (from_voidp timezone null) in
  let secs = getf tv tv_sec
  and usecs = getf tv tv_usec in
  Signed.Long.(Pervasives.
    (float (to_int secs) +. float (to_int usecs) /. 1_000_000.))
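The last expression of the wrapper combines the two fields into a single number of seconds. The arithmetic can be checked separately in plain OCaml; the helper name seconds_of_timeval is hypothetical and takes ordinary ints in place of C long values.

```ocaml
(* seconds + microseconds / 1_000_000, as a float *)
let seconds_of_timeval ~secs ~usecs =
  float_of_int secs +. float_of_int usecs /. 1_000_000.

(* One second and 500_000 microseconds is one and a half seconds. *)
let t = seconds_of_timeval ~secs:1 ~usecs:500_000
```

Because /. binds more tightly than +., only the microsecond count is divided.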
Now we can rewrite measure_execution_time to measure more precisely:
(* val measure_execution_time : (unit -> unit) -> float *)
let measure_execution_time timed_function =
  let start_time = gettimeofday' () in
  let () = timed_function () in
  let end_time = gettimeofday' () in
  (* elapsed time in seconds *)
  end_time -. start_time
Using ctypes, it's straightforward to pass OCaml functions to C. The standard C function qsort has the following signature:
void qsort(void *base, size_t nmemb, size_t size,
int(*compar)(const void *, const void *));
C programmers often use typedef to make type definitions involving function pointers easier to read. Using a typedef the type of qsort looks like this:
typedef int(compare_t)(const void *, const void *);
void qsort(void *base, size_t nmemb, size_t size, compare_t *);
We can define the type similarly in ctypes. Since type descriptions are regular values, we can just use let in place of typedef. The type of qsort is defined as follows:
let compare_t = ptr void @-> ptr void @-> returning int
let qsort = foreign "qsort"
(ptr void @-> size_t @-> size_t @-> funptr compare_t @-> returning void)
The resulting value is a higher-order function, as shown by its type:
val qsort : void ptr -> size_t -> size_t ->
(void ptr -> void ptr -> int) -> unit
As before, let's define a wrapper function to make qsort easier to use. The second and third arguments to qsort specify the length (number of elements) of the array and the element size. Arrays created using ctypes have a richer runtime structure than C arrays, so we don't need to pass size information around. Furthermore, we can use OCaml polymorphism in place of the unsafe void ptr type.
let qsort' arr cmp =
  let open Unsigned.Size_t in
  let ty = Array.element_type arr in
  let len = of_int (Array.length arr) in
  let elsize = of_int (sizeof ty) in
  let start = to_voidp (Array.start arr) in
  let compare l r = cmp (!@ (from_voidp ty l)) (!@ (from_voidp ty r)) in
  qsort start len elsize compare
Our wrapper function has the following type:
val qsort' : 'a array -> ('a -> 'a -> int) -> unit
Using qsort' to sort arrays is straightforward. First, we'll use Array.of_list to create a C array:
let arr = Array.of_list int [5;3;1;2;4]
We can sort the array using Pervasives.compare, and inspect the result using Array.to_list:
qsort' arr Pervasives.compare
Array.to_list arr
(* => [1; 2; 3; 4; 5] *)
Let's reverse the ordering:
qsort' arr (fun l r -> - compare l r)
Array.to_list arr
(* => [5; 4; 3; 2; 1] *)
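Because the wrapper takes the element type from the array itself, it is not tied to int. The following self-contained sketch sorts C doubles. It assumes ctypes and ctypes-foreign are installed, and that the installed release names the C array module CArray instead of shadowing Array as in the code above.

```ocaml
open Ctypes
open Foreign

let compare_t = ptr void @-> ptr void @-> returning int
let qsort = foreign "qsort"
    (ptr void @-> size_t @-> size_t @-> funptr compare_t @-> returning void)

(* Same wrapper as above, written against the CArray module (assumption). *)
let qsort' arr cmp =
  let ty = CArray.element_type arr in
  let len = Unsigned.Size_t.of_int (CArray.length arr) in
  let elsize = Unsigned.Size_t.of_int (sizeof ty) in
  let start = to_voidp (CArray.start arr) in
  (* C hands the comparator two void pointers; cast and dereference them. *)
  let compare_elems l r = cmp (!@ (from_voidp ty l)) (!@ (from_voidp ty r)) in
  qsort start len elsize compare_elems

(* Sort an array of C doubles with OCaml's polymorphic compare. *)
let arr = CArray.of_list double [2.5; 0.5; 1.5]
let () = qsort' arr compare
let sorted = CArray.to_list arr
```

The comparison function is ordinary OCaml code; qsort calls back into it through the function pointer created by funptr.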
The ctypes distribution contains a number of larger scale examples, including bindings to the POSIX fts API and a ctypes variant of the ncurses C extension from the OCaml manual.