Creating and modifying Objective CAML values from C
A C function called from Objective CAML can modify its arguments in place,
or return a newly-created value. This value must match the
Objective CAML type for the function result. For base types, several C
macros are provided to convert a C datum to an Objective CAML value.
For structured types, the new value must be allocated in the Objective CAML
heap, with the correct size, and its fields initialized with values of
the correct types. Considerable care is required here: it is easy to
construct bad values from C, and these bad values may crash
the Objective CAML program.
Any allocation in the Objective CAML heap can trigger a garbage collection,
which will deallocate unused memory blocks and may move live blocks.
Therefore, any Objective CAML value manipulated from C must be registered
with the Objective CAML garbage collector if it is to survive the
allocation of a new block. These values must be treated as extra
memory roots by the garbage collector. To this end, several macros
are provided for registering extra roots with the garbage collector.
Finally, C code can allocate Objective CAML heap blocks that contain C data
instead of Objective CAML values. This C data will then benefit from
Objective CAML's automatic memory management. If the C data requires
explicit deallocation, a finalization function can be attached to the
heap block.
Modifying Objective CAML values
The following macros allow the creation of immediate Objective CAML values from the
corresponding C data, and the modification of structured values in place.
Val_long(l) | return the value representing the long integer l
Val_int(i) | return the value representing the integer i
Val_bool(x) | return false if x=0, true otherwise
Val_true | the representation of true
Val_false | the representation of false
Val_unit | the representation of ()

Store_field(b,n,v) | store the value v in the n-th field of block b
Store_double_field(b,n,d) | store the float d in the n-th field of the float array b
Figure 12.10: Creation of immediate values and modification of structured blocks.
Moreover, the macros Byte and Byte_u can be used on the
left-hand side of an assignment to modify the characters of a string.
The Field macro can also be used for assignment on blocks with
tag Abstract_tag or Final_tag; use Store_field
for blocks with tag between 0 and No_scan_tag-1.
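The conversion macros of figure 12.10 rely on the usual encoding of immediate values, in which the integer n is stored as the machine word 2n+1. The sketch below models that encoding in plain C so that it can be compiled without the Objective CAML headers. Every name starting with model_ or MODEL_ is hypothetical and introduced only for this illustration; the real macros are the ones from caml/mlvalues.h.

```c
#include <stdint.h>

/* Hypothetical model of an immediate value: one machine word. */
typedef intptr_t model_value;

/* Encode n as 2n+1: the lowest bit is the tag that marks an immediate. */
model_value model_val_long(intptr_t n)
{
    return (model_value)(((uintptr_t)n << 1) | 1u);
}

/* Decode: v is odd, so v - 1 is even and the division is exact. */
intptr_t model_long_val(model_value v)
{
    return (v - 1) / 2;
}

/* true is modeled as the integer 1; false and () as the integer 0. */
#define MODEL_VAL_TRUE  (model_val_long(1))
#define MODEL_VAL_FALSE (model_val_long(0))
#define MODEL_VAL_UNIT  (model_val_long(0))

/* Any non-zero C integer maps to true. */
model_value model_val_bool(int x)
{
    return x ? MODEL_VAL_TRUE : MODEL_VAL_FALSE;
}
```

In this model, model_val_long(3) is the word 7, and decoding the encoding of -5 gives back -5. The tag bit is what allows the garbage collector to tell an integer from a pointer to a heap block.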
The following function reverses a character string in place:
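#include <caml/mlvalues.h>

/* Exchange the characters at positions i and j of the string v. */
void swap_char (value v, int i, int j)
{
  char c = Byte(v,i) ; Byte(v,i) = Byte(v,j) ; Byte(v,j) = c ;
}

/* Reverse the string v in place and return it. */
value swap_string (value v)
{
  int i, j, t = string_length(v) ;
  for (i=0, j=t-1; i<t/2; i++, j--) swap_char(v,i,j) ;
  return v ;
}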
# external mirror : string -> string = "swap_string" ;;
external mirror : string -> string = "swap_string"
# mirror "abcdefg" ;;
- : string = "gfedcba"
Allocating new blocks
The functions listed in figure 12.11 allocate new blocks
in the Objective CAML heap.
alloc(n, t) | return a new block of size n words and tag t
alloc_tuple(n) | same, with tag 0
alloc_string(n) | return an uninitialized string of length n characters
copy_string(s) | return a string initialized with the C string s
copy_double(d) | return a block containing the double float d
alloc_array(f, a) | return a block representing an array, initialized by applying the conversion function f to each element of the C array of pointers a, null-terminated
copy_string_array(p) | return a block representing an array of strings, obtained from the C string array p (of type char **), null-terminated
Figure 12.11: Functions for allocating blocks.
The function alloc_array takes an array of pointers a,
terminated by a null pointer, and a conversion function f
taking a pointer and returning a value. The result of alloc_array is an Objective CAML array containing the results of applying
f in turn to each pointer in a. In the following example,
the function make_str_array uses alloc_array to convert
a C array of strings.
#include <caml/mlvalues.h>
#include <caml/alloc.h>

value make_str (char *s) { return copy_string(s) ; }
value make_str_array (char **p) { return alloc_array(make_str, p) ; }
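The traversal that alloc_array performs on its null-terminated argument can be pictured with the following plain C sketch. It is only a stand-in under stated assumptions: toy_count_ptrs and toy_convert_all are hypothetical helpers written for this illustration, the results go into an ordinary C array supplied by the caller rather than into an Objective CAML block, and the conversion function returns a size_t instead of a value.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: number of entries before the terminating NULL. */
size_t toy_count_ptrs(char **a)
{
    size_t n = 0;
    while (a[n] != NULL) n++;
    return n;
}

/* Hypothetical stand-in for alloc_array: apply f to each entry of a, in
   order, and write the results to out. The caller must provide room for
   toy_count_ptrs(a) results. Returns the number of entries converted. */
size_t toy_convert_all(size_t (*f)(const char *), char **a, size_t *out)
{
    size_t i, n = toy_count_ptrs(a);
    for (i = 0; i < n; i++) out[i] = f(a[i]);
    return n;
}
```

Calling toy_convert_all(strlen, words, lens) on an array words holding "ab", "cde", "" and a final NULL fills lens with 2, 3 and 0. The terminating null pointer is what gives the length of the resulting array, which is why it must not be omitted.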
It is sometimes necessary to allocate blocks of size 0, for instance
to represent an empty Objective CAML array. Such a block is called an
atom.
# inspect [| |] ;;
....memory block: size=0 - structured block (tag=0):
- : '_a array = [||]
Because atoms are allocated statically and do not reside in
the dynamic part of the Objective CAML heap,
the allocation functions in figure 12.11 must not be
used to allocate atoms. Instead, atoms are created in C by the macro
Atom(t), where t is the desired tag for the block of size
0.
Storing C data in the Objective CAML heap
It is sometimes convenient to use the Objective CAML heap to store arbitrary C data
that does not respect the constraints imposed by the garbage collector.
In this case, blocks with tag Abstract_tag must be used.
A natural example is the manipulation of native C integers
(of size 32 or 64 bits) in Objective CAML.
Since these integers are not tagged as the
Objective CAML garbage collector expects, they must be kept in one-word heap
blocks with tag Abstract_tag.
#include <caml/mlvalues.h>
#include <caml/alloc.h>
#include <stdio.h>

value Cint_of_OCAMLint (value v)
{
  value res = alloc(1,Abstract_tag) ;
  Field(res,0) = Long_val(v) ;
  return res ;
}

value OCAMLint_of_Cint (value v) { return Val_long(Field(v,0)) ; }

value Cplus (value v1, value v2)
{
  /* Read the operands before allocating: alloc may trigger a
     garbage collection that moves the blocks v1 and v2. */
  long sum = Field(v1,0) + Field(v2,0) ;
  value res = alloc(1,Abstract_tag) ;
  Field(res,0) = sum ;
  return res ;
}

value printCint (value v)
{
  printf ("%ld", (long) Field(v,0)) ; fflush(stdout) ;
  return Val_unit ;
}
# type cint
  external cint_of_int : int -> cint = "Cint_of_OCAMLint"
  external int_of_cint : cint -> int = "OCAMLint_of_Cint"
  external plus_cint : cint -> cint -> cint = "Cplus"
  external print_cint : cint -> unit = "printCint" ;;
We can now work on full-width native C integers, with no bit lost to the tag,
while remaining compatible with Objective CAML's garbage collector.
However, such integers are heap-allocated, instead of being immediate
values, which renders arithmetic operations less efficient.
# let a = 1000000000 ;;
val a : int = 1000000000
# a+a ;;
- : int = -147483648
# let c = let b = cint_of_int a in plus_cint b b ;;
val c : cint = <abstr>
# print_cint c ; print_newline () ;;
2000000000
- : unit = ()
# int_of_cint c ;;
- : int = -147483648
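The value -147483648 printed for a+a and for int_of_cint c can be reproduced by hand. The sketch below assumes a 32-bit machine, where an Objective CAML int keeps 31 bits once the tag bit is set aside. The function wrap31 is a hypothetical helper written for this illustration, not part of the runtime.

```c
#include <stdint.h>

/* Hypothetical helper: reduce x to the signed 31-bit range
   [-2^30, 2^30 - 1], as a 32-bit word with one tag bit would. */
int32_t wrap31(int64_t x)
{
    const int64_t modulus = INT64_C(2147483648);   /* 2^31 */
    const int64_t half    = INT64_C(1073741824);   /* 2^30 */
    int64_t m = ((x % modulus) + modulus) % modulus;   /* 0 <= m < 2^31 */
    if (m >= half) m -= modulus;   /* fold the upper half onto the negatives */
    return (int32_t)m;
}
```

Under that assumption, wrap31(1000000000) is unchanged because 1000000000 is below 2^30, while wrap31(2000000000) is 2000000000 - 2^31 = -147483648. The sum stored in the abstract block is not subject to this reduction, which is why print_cint displays 2000000000.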
Finalization functions
Abstract blocks can also contain pointers to memory blocks allocated
outside the Objective CAML heap. We know that Objective CAML blocks that are no
longer used by the program are deallocated by the garbage collector.
But what happens to a block allocated in the C heap and referenced by
an abstract block that was reclaimed by the GC? To avoid memory
leaks, we can associate a finalization function with the
abstract block; this function is called by the GC before reclaiming
the abstract block.
An abstract block with an attached finalization function is allocated
via the function alloc_final (n, f, used, max) .
- n is the size of the block, in words. The first word of
the block is used to store the finalization function; hence the size
occupied by the user data must be increased by one word.
- f is the finalization function itself, with type
void f (value). It receives the abstract block as argument,
just before this block is reclaimed by the GC.
- used represents the memory space (outside the Objective CAML heap)
occupied by the C data. used must be <= max.
- max is the maximum memory space outside the Objective CAML heap
that we tolerate not being reclaimed immediately.
For efficiency reasons, the Objective CAML garbage collector does not
reclaim heap blocks as soon as they become unused, but some time
later. The ratio used/max controls the proportion of finalized
abstract blocks that the garbage collector may leave allocated while
they are no longer used. A ratio of 0 (that is, used = 0)
lets the garbage collector work at its usual pace; higher ratios
(no greater than 1) cause it to work harder and spend more CPU time
finding unused finalized blocks and reclaiming them.
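The layout described for alloc_final (finalization function in the first word, user data in the following words) and the order of operations at reclamation time can be sketched in plain C. This is a toy model under stated assumptions: the names toy_word, toy_alloc_final, toy_reclaim, toy_free_payload and toy_finalizer_demo are hypothetical, the block lives in the C heap, and reclamation is triggered by hand instead of by a garbage collector. The used and max parameters are not modeled.

```c
#include <stdlib.h>

/* Hypothetical model of one word of a finalized block. */
typedef union toy_word {
    void (*fin)(union toy_word *block);   /* word 0: finalization function */
    void *ptr;                            /* other words: user data */
} toy_word;

static int toy_finalized = 0;   /* number of finalizers run so far */

/* Allocate n words and store f in word 0; the user data starts at word 1. */
toy_word *toy_alloc_final(size_t n, void (*f)(toy_word *))
{
    toy_word *block;
    if (n < 1) return NULL;
    block = calloc(n, sizeof(toy_word));
    if (block != NULL) block[0].fin = f;
    return block;
}

/* What happens when the block is found unused:
   run the finalizer first, then release the block itself. */
void toy_reclaim(toy_word *block)
{
    block[0].fin(block);
    free(block);
}

/* Example finalizer: free the C data referenced by word 1. */
void toy_free_payload(toy_word *block)
{
    free(block[1].ptr);
    toy_finalized++;
}

/* One pointer of user data needs a block of 1 + 1 = 2 words. */
int toy_finalizer_demo(void)
{
    toy_word *b = toy_alloc_final(2, toy_free_payload);
    if (b == NULL) return -1;
    b[1].ptr = malloc(10 * sizeof(long));
    toy_reclaim(b);
    return toy_finalized;
}
```

The demo allocates a two-word block for a single pointer, which mirrors the rule that the size requested must be one word larger than the user data. The finalizer runs while the block is still valid, so it can read word 1 to find the memory it has to free.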
The following program manipulates arrays of C integers allocated in
the C heap via malloc. To allow the Objective CAML garbage collector
to reclaim these arrays automatically, the create function wraps
them in a finalized abstract block, containing both a pointer to the
array and the finalization function finalize_it.
#include <stdlib.h>
#include <stdio.h>
#include <caml/mlvalues.h>
#include <caml/alloc.h>

typedef struct {
  int size ;
  long * tab ;
} IntTab ;

IntTab *alloc_it (int s)
{
  IntTab *res = malloc(sizeof(IntTab)) ;
  res->size = s ;
  res->tab = (long *) malloc(sizeof(long)*s) ;
  return res ;
}
void free_it (IntTab *p) { free(p->tab) ; free(p) ; }
void put_it (int n, long q, IntTab *p) { p->tab[n] = q ; }
long get_it (int n, IntTab *p) { return p->tab[n] ; }

/* Called by the GC just before the abstract block v is reclaimed. */
void finalize_it (value v)
{
  IntTab *p = (IntTab *) Field(v,1) ;
  int i ;
  printf("reclamation of an IntTab by finalization [") ;
  for (i=0; i<p->size; i++) printf("%ld ", p->tab[i]) ;
  printf("]\n") ; fflush(stdout) ;
  free_it (p) ;
}

/* Field 0 holds the finalization function, field 1 the pointer to the C data. */
value create (value s)
{
  value block ;
  block = alloc_final (2, finalize_it, Int_val(s)*sizeof(IntTab), 100000) ;
  Field(block,1) = (value) alloc_it(Int_val(s)) ;
  return block ;
}

value put (value n, value q, value t)
{
  put_it (Int_val(n), Long_val(q), (IntTab *) Field(t,1)) ;
  return Val_unit ;
}

value get (value n, value t)
{
  long res = get_it (Int_val(n), (IntTab *) Field(t,1)) ;
  return Val_long(res) ;
}
The C functions visible from Objective CAML are: create,
put and get.
# type c_int_array
  external cia_create : int -> c_int_array = "create"
  external cia_get : int -> c_int_array -> int = "get"
  external cia_put : int -> int -> c_int_array -> unit = "put" ;;
We can now manipulate our new data structure from Objective CAML:
# let tbl = cia_create 10
  and tbl2 = cia_create 10
  in for i = 0 to 9 do cia_put i (i*2) tbl done ;
     for i = 0 to 9 do print_int (cia_get i tbl) ; print_string " " done ;
     print_newline () ;
     for i = 0 to 9 do cia_put (9-i) (cia_get i tbl) tbl2 done ;
     for i = 0 to 9 do print_int (cia_get i tbl2) ; print_string " " done ;;
0 2 4 6 8 10 12 14 16 18
18 16 14 12 10 8 6 4 2 0 - : unit = ()
We now force a garbage collection to check that the finalization
function is called:
# Gc.full_major () ;;
reclamation of an IntTab by finalization [18 16 14 12 10 8 6 4 2 0 ]
reclamation of an IntTab by finalization [0 2 4 6 8 10 12 14 16 18 ]
- : unit = ()
In addition to freeing C heap blocks, finalization functions can also
be used to close files, terminate processes, etc.
Garbage collection and C parameters and local variables
A C function can trigger a garbage collection, either during an
allocation (if the heap is full), or voluntarily by calling
void Garbage_collection_function ().
Consider the following example. Can you spot the error?
#include <caml/mlvalues.h>
#include <caml/memory.h>

value identity (value x)
{
  Garbage_collection_function() ;
  return x ;
}
# external id : 'a -> 'a = "identity" ;;
external id : 'a -> 'a = "identity"
# id [1;2;3;4;5] ;;
- : int list = [538917758; 538917752; 538917746; 538917740; 538917734]
The list passed as parameter to id, hence to the C function
identity, can be moved or reclaimed by the garbage collector.
In the example, we forced a garbage collection, but any allocation in
the Objective CAML heap could have triggered a garbage collection as well.
The anonymous list passed to id was reclaimed by the
garbage collector, because it is not reachable from the set of known
roots. To avoid this,
any C function that allocates anything in the Objective CAML heap
must tell the garbage collector about the C function's
parameters and local variables of type value.
This is achieved by using the macros described next.
For parameters, these macros are used within
the body of the C function as if they were additional
declarations:
CAMLparam1(v) : for one parameter v of type value
CAMLparam2(v1,v2) : for two parameters
... : ...
CAMLparam5(v1,...,v5) : for five parameters
CAMLparam0() : required when there are no value parameters.
If the C function has more than five value parameters,
the first five are declared with the CAMLparam5 macro,
and the remaining parameters with the macros CAMLxparam1, ..., CAMLxparam5, used as many times as
necessary to list all value parameters.
For 12 parameters of type value:
CAMLparam5(v1,...,v5);
CAMLxparam5(v6,...,v10);
CAMLxparam2(v11,v12);
Local variables of type value must also be registered with the
garbage collector, using the macros CAMLlocal1, ..., CAMLlocal5.
These macros are used instead of normal C declarations of the
variables. An array of values is declared with
CAMLlocalN(tbl,n), where n is the number of elements
of the array tbl. Finally, to return from the C function,
we must use the macro CAMLreturn instead of C's return
construct.
Here is the corrected version of the previous example:
#include <caml/mlvalues.h>
#include <caml/memory.h>

value identity2 (value x)
{
  CAMLparam1(x) ;   /* register x as a root before any collection */
  Garbage_collection_function() ;
  CAMLreturn (x) ;
}
# external id : 'a -> 'a = "identity2" ;;
external id : 'a -> 'a = "identity2"
# let a = id [1;2;3;4;5] ;;
val a : int list = [1; 2; 3; 4; 5]
We now obtain the expected result.
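The difference between identity and identity2 can be illustrated with a toy model of a moving collector, written in plain C. It is a sketch under strong simplifying assumptions, not the Objective CAML runtime: the heap holds a single three-word block, toy_collect always moves it, and toy_register_root plays the role that CAMLparam1 and CAMLlocal1 play in real code. All names prefixed with toy_ or TOY_ are hypothetical.

```c
#include <stddef.h>

#define TOY_BLOCK_WORDS 3
#define TOY_MAX_ROOTS 4

/* Two spaces holding a single block, as in a copying collector. */
static long toy_from[TOY_BLOCK_WORDS];
static long toy_to[TOY_BLOCK_WORDS];

/* Addresses of the C variables registered as roots. */
static long **toy_roots[TOY_MAX_ROOTS];
static int toy_nroots = 0;

/* Register the C variable at address r as a root. */
void toy_register_root(long **r)
{
    if (toy_nroots < TOY_MAX_ROOTS) toy_roots[toy_nroots++] = r;
}

/* Move the block to the other space, overwrite the old copy,
   and update every registered root that pointed to it. */
void toy_collect(void)
{
    int i;
    for (i = 0; i < TOY_BLOCK_WORDS; i++) {
        toy_to[i] = toy_from[i];
        toy_from[i] = -1;   /* the old location now holds garbage */
    }
    for (i = 0; i < toy_nroots; i++)
        if (*toy_roots[i] == toy_from) *toy_roots[i] = toy_to;
}

/* Return the first word seen through a local pointer after a collection.
   The pointer is registered as a root only if registered is non-zero. */
long toy_first_word_after_gc(int registered)
{
    long *p = toy_from;
    int i;
    toy_nroots = 0;
    for (i = 0; i < TOY_BLOCK_WORDS; i++) toy_from[i] = 10 * (i + 1);
    if (registered) toy_register_root(&p);
    toy_collect();
    return p[0];
}
```

With the pointer registered, the function returns 10, the original content of the block. Without registration it returns -1, because the local pointer still designates the old location. The meaningless integers printed by the first version of id are the same kind of stale read.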
Calling an Objective CAML closure from C
To apply a closure (i.e. an Objective CAML function value) to one or
several arguments from C, we can use the functions declared in the
header file callback.h.
callback(f,v) : apply the closure f to the argument v,
callback2(f,v1,v2) : same, to two arguments,
callback3(f,v1,v2,v3) : same, to three arguments,
callbackN(f,n,tbl) : same, to n arguments stored in the array tbl.
All these functions return a value, which is the result of the
application.
Registering Objective CAML functions with C
The callback functions require the Objective CAML function to be
applied as a closure, that is, as a value that was passed as an argument
to the C function. We can also register a closure from Objective CAML,
giving it a name, then later refer to the closure by its name in a C
function.
The function register from module Callback
associates a name (of type string) with a closure or with any other
Objective CAML value (of any type, that is, 'a). This closure or
value can be recovered from C using the C function
caml_named_value, which takes a character string as argument
and returns a pointer to the closure or value associated with that
name, if it exists, or the null pointer otherwise.
An example is in order:
# let plus x y = x + y ;;
val plus : int -> int -> int = <fun>
# Callback.register "plus3_ocaml" (plus 3) ;;
- : unit = ()
#include <caml/mlvalues.h>
#include <caml/memory.h>
#include <caml/callback.h>

value plus3_C (value v)
{
  CAMLparam1(v) ;
  CAMLlocal1(f) ;
  f = *(caml_named_value("plus3_ocaml")) ;
  CAMLreturn (callback(f,v)) ;
}
# external plusC : int -> int = "plus3_C" ;;
external plusC : int -> int = "plus3_C"
# plusC 1 ;;
- : int = 4
# Callback.register "plus3_ocaml" (plus 5) ;;
- : unit = ()
# plusC 1 ;;
- : int = 6
Do not confuse the declaration of a C function with external
and the registration of an Objective CAML closure with the function
register. In the former case, the declaration is static: the
correspondence between the two names is established at link time.
In the latter case, the binding is dynamic: the correspondence between
the name and the closure is performed at run time. In particular,
the name--closure binding can be modified dynamically by registering a
different closure with the same name, thus modifying the behavior of
C functions using that name.
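The dynamic nature of the name-to-closure binding can be mimicked in plain C with a small table of function pointers. This is only a model of the behavior described above, not the way the runtime implements Callback.register or caml_named_value. The names prefixed with toy_ or TOY_ are hypothetical, and closures are reduced to C functions from long to long.

```c
#include <stddef.h>
#include <string.h>

typedef long (*toy_closure)(long);

#define TOY_MAX_NAMES 8
static const char *toy_names[TOY_MAX_NAMES];
static toy_closure toy_values[TOY_MAX_NAMES];
static int toy_count = 0;

/* Bind name to f, replacing any previous binding for the same name. */
void toy_register(const char *name, toy_closure f)
{
    int i;
    for (i = 0; i < toy_count; i++)
        if (strcmp(toy_names[i], name) == 0) { toy_values[i] = f; return; }
    if (toy_count < TOY_MAX_NAMES) {
        toy_names[toy_count] = name;
        toy_values[toy_count] = f;
        toy_count++;
    }
}

/* Return a pointer to the closure bound to name, or NULL if there is none. */
toy_closure *toy_named_value(const char *name)
{
    int i;
    for (i = 0; i < toy_count; i++)
        if (strcmp(toy_names[i], name) == 0) return &toy_values[i];
    return NULL;
}

long toy_plus3(long x) { return x + 3; }
long toy_plus5(long x) { return x + 5; }

/* Look the closure up at each call, so that a later registration under
   the same name changes the result. Returns v unchanged if the name is
   unbound (a choice made for this sketch only). */
long toy_plusC(long v)
{
    toy_closure *f = toy_named_value("plus3_ocaml");
    return f == NULL ? v : (*f)(v);
}
```

After toy_register("plus3_ocaml", toy_plus3), the call toy_plusC(1) returns 4; after registering toy_plus5 under the same name, it returns 6, as in the toplevel session. Unlike plus3_C, this sketch checks for the null pointer before dereferencing the result of the lookup.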