The Module Language

The Objective CAML language features a sub-language for modules, which comes in addition to the core language that we have seen so far. In this module language, the interface of a module is called a signature and its implementation is called a structure. When there is no ambiguity, we will often use the word "module" to refer to a structure.

The syntax for declaring signatures and structures is as follows:

Syntax

module type NAME =
  sig
    interface declarations
  end

Syntax

module Name =
  struct
    implementation definitions
  end

Warning

The name of a module must start with an uppercase letter. There are no such case restrictions on names of signatures, but by convention we will use names in uppercase for signatures.

Signatures and structures do not need to be bound to names: we can also use anonymous signature and structure expressions, writing simply

Syntax

sig declarations end

Syntax

struct definitions end

We write signature and structure to refer to either names of signatures and structures, or anonymous signature and structure expressions.

Every structure possesses a default signature, computed by the type inference system, which reveals all the definitions contained in the structure, with their most general types. When defining a structure, we can also indicate the desired signature by adding a signature constraint (similar to the type constraints from the core language), using one of the following two syntactic forms:

Syntax

module Name : signature = structure

Syntax

module Name = (structure : signature)

When an explicit signature is provided, the system checks that all the components declared in the signature are defined in the structure, and that the types are consistent. In other words, the system checks that the explicit signature provided is "included in", or implied by, the default signature. If so, Name is viewed in the remainder of the code with the given signature, and only the components declared in that signature are accessible to the clients of the module. (This is the same behavior we saw previously with interface files.)
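As a minimal sketch of such a constraint, consider the following definitions. The names COUNTER, Counter, step, start and next are hypothetical and serve only as an illustration. The structure defines three components, but the signature declares only two of them:

```ocaml
(* Hypothetical signature: it declares only [start] and [next]. *)
module type COUNTER =
  sig
    val start : int
    val next : int -> int
  end

(* The structure also defines [step], which the signature does not declare. *)
module Counter : COUNTER =
  struct
    let step = 2
    let start = 0
    let next n = n + step
  end

(* Clients can use only the declared components. *)
let two = Counter.next Counter.start

(* Writing Counter.step here would be rejected as an unbound value. *)
```

The check succeeds because every component declared in COUNTER is defined in the structure with a compatible type. The extra definition step is simply not exported.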

Access to the components of a module is via the dot notation:

Syntax

Name₁.name₂

We say that the name name₂ is qualified by the name Name₁ of its defining module.

The module name and the dot can be omitted using a directive to open the module:

Syntax

open Name

In the scope of this directive, we can use short names name₂ to refer to the components of the module Name. In case of name conflicts, opening a module hides previously defined entities with the same names, as in the case of identifier redefinitions.
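The hiding behavior can be sketched as follows. The module Geometry and its components are hypothetical names chosen for this illustration:

```ocaml
(* Hypothetical module, used only to illustrate [open]. *)
module Geometry =
  struct
    let pi = 3.14159
    let area r = pi *. r *. r
  end

(* A previously defined entity with the same name as a component. *)
let pi = 3.0

(* Qualified access reaches the module's own definition. *)
let qualified = Geometry.pi

open Geometry

(* From here on, the short name [pi] refers to Geometry.pi:
   the earlier top-level [pi] is hidden. *)
let unqualified = pi
let a = area 1.0
```

After the open directive, pi and area are the components of Geometry, exactly as if they had been redefined at this point.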

Two Stack Modules

We continue the example of stacks by recasting it in the module language. The signature for a stack module is obtained by wrapping the declarations from the stack.mli file in a signature declaration:


# module type STACK =
   sig
     type 'a t
     exception Empty
     val create: unit -> 'a t
     val push: 'a -> 'a t -> unit
     val pop: 'a t -> 'a
     val clear : 'a t -> unit
     val length: 'a t -> int
     val iter: ('a -> unit) -> 'a t -> unit
   end ;;
module type STACK =
  sig
    type 'a t
    exception Empty
    val create : unit -> 'a t
    val push : 'a -> 'a t -> unit
    val pop : 'a t -> 'a
    val clear : 'a t -> unit
    val length : 'a t -> int
    val iter : ('a -> unit) -> 'a t -> unit
  end

A first implementation of stacks is obtained by reusing the Stack module from the standard library:


# module StandardStack = Stack ;;
module StandardStack :
  sig
    type 'a t = 'a Stack.t
    exception Empty
    val create : unit -> 'a t
    val push : 'a -> 'a t -> unit
    val pop : 'a t -> 'a
    val clear : 'a t -> unit
    val length : 'a t -> int
    val iter : ('a -> unit) -> 'a t -> unit
  end

We then define an alternate implementation based on arrays:


# module MyStack =
   struct
     type 'a t = { mutable sp : int; mutable c : 'a array }
     exception Empty
     let create () = { sp=0 ; c = [||] }
     let clear s = s.sp <- 0; s.c <- [||]
     let increase s x =  s.c <- Array.append s.c (Array.create 5 x)
     let push x s = 
       if s.sp >= Array.length s.c then increase s x; 
       s.c.(s.sp) <- x; 
       s.sp <- succ s.sp
     let pop s =
       if s.sp = 0 then raise Empty
       else (s.sp <- pred s.sp ; s.c.(s.sp))
     let length s = s.sp
     let iter f s = for i = pred s.sp downto 0 do f s.c.(i) done
   end ;;
module MyStack :
  sig
    type 'a t = { mutable sp: int; mutable c: 'a array }
    exception Empty
    val create : unit -> 'a t
    val clear : 'a t -> unit
    val increase : 'a t -> 'a -> unit
    val push : 'a -> 'a t -> unit
    val pop : 'a t -> 'a
    val length : 'a t -> int
    val iter : ('a -> 'b) -> 'a t -> unit
  end

These two modules implement the type t of stacks by different data types.


# StandardStack.create () ;;
- : '_a StandardStack.t = <abstr>
# MyStack.create () ;;
- : '_a MyStack.t = {MyStack.sp=0; MyStack.c=[||]}

To abstract over the type representation in MyStack, we constrain the module by the STACK signature.


# module MyStack = (MyStack : STACK) ;;
module MyStack : STACK
# MyStack.create() ;;
- : '_a MyStack.t = <abstr>
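The following sketch shows how client code might use the constrained module through its exported operations only. It repeats the definitions above so as to be self-contained, with one assumption: it uses Array.make, the name under which recent OCaml releases provide the function that the session above calls Array.create. The values s, top, remaining and seen are hypothetical names.

```ocaml
module type STACK =
  sig
    type 'a t
    exception Empty
    val create : unit -> 'a t
    val push : 'a -> 'a t -> unit
    val pop : 'a t -> 'a
    val clear : 'a t -> unit
    val length : 'a t -> int
    val iter : ('a -> unit) -> 'a t -> unit
  end

module MyStack : STACK =
  struct
    type 'a t = { mutable sp : int; mutable c : 'a array }
    exception Empty
    let create () = { sp = 0; c = [||] }
    let clear s = s.sp <- 0; s.c <- [||]
    (* Assumption: Array.make stands in for the older Array.create. *)
    let increase s x = s.c <- Array.append s.c (Array.make 5 x)
    let push x s =
      if s.sp >= Array.length s.c then increase s x;
      s.c.(s.sp) <- x;
      s.sp <- succ s.sp
    let pop s =
      if s.sp = 0 then raise Empty
      else (s.sp <- pred s.sp; s.c.(s.sp))
    let length s = s.sp
    let iter f s = for i = pred s.sp downto 0 do f s.c.(i) done
  end

(* Client code: only the operations declared in STACK are available. *)
let s = MyStack.create ()
let () = MyStack.push 1 s; MyStack.push 2 s; MyStack.push 3 s

let top = MyStack.pop s            (* the last element pushed *)
let remaining = MyStack.length s   (* number of elements still stored *)

(* [iter] visits the elements from the top of the stack downwards. *)
let seen = ref []
let () = MyStack.iter (fun x -> seen := x :: !seen) s

(* Popping from an emptied stack raises the exported exception. *)
let () = MyStack.clear s
let empty_raises =
  try ignore (MyStack.pop s); false
  with MyStack.Empty -> true
```

The fields sp and c no longer appear in client code: an expression such as s.sp would be rejected, since the representation of MyStack.t is hidden.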

The two modules StandardStack and MyStack implement the same interface, that is, provide the same set of operations over stacks, but their t types are different. It is therefore impossible to apply operations from one module to values from the other module:


# let s = StandardStack.create() ;;
val s : '_a StandardStack.t = <abstr>
# MyStack.push 0 s ;;
Characters 15-16:
This expression has type 'a StandardStack.t = 'a Stack.t
but is here used with type int MyStack.t

Even if both modules implemented the t type by the same concrete type, constraining MyStack by the signature STACK suffices to abstract over the t type, rendering it incompatible with any other type in the system and preventing sharing of values and operations between the various stack modules.


# module S1 = ( MyStack : STACK ) ;;
module S1 : STACK
# module S2 = ( MyStack : STACK ) ;;
module S2 : STACK
# let s = S1.create () ;;
val s : '_a S1.t = <abstr>
# S2.push 0 s ;;
Characters 10-11:
This expression has type 'a S1.t but is here used with type int S2.t

The Objective CAML system compares abstract types by name. Here, the two types S1.t and S2.t are both abstract, and have different names, hence they are considered incompatible. It is precisely this restriction that makes type abstraction effective, by preventing any access to the definition of the type being abstracted.

Modules and Information Hiding

This section shows additional examples of signature constraints hiding or abstracting definitions of structure components.

Hiding Type Implementations

Abstracting over a type ensures that the only way to construct values of this type is via the functions exported from its definition module. This can be used to restrict the values that can belong to this type. In the following example, we implement an abstract type of integers which, by construction, can never take the value 0.

 
# module Int_Star = 
   ( struct
       type t = int 
       exception Isnul
       let of_int = function 0 -> raise Isnul | n -> n
       let mult = ( * ) 
     end
   : 
     sig
       type t
       exception Isnul
       val of_int : int -> t 
       val mult : t -> t -> t 
     end
   ) ;;
module Int_Star :
  sig type t exception Isnul val of_int : int -> t val mult : t -> t -> t end
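A possible use of this module is sketched below. The definition of Int_Star is repeated so that the fragment is self-contained. The names zero_rejected and product are hypothetical.

```ocaml
module Int_Star =
  ( struct
      type t = int
      exception Isnul
      let of_int = function 0 -> raise Isnul | n -> n
      let mult = ( * )
    end
  :
    sig
      type t
      exception Isnul
      val of_int : int -> t
      val mult : t -> t -> t
    end
  )

(* The only way to build a value of type Int_Star.t is [of_int],
   which refuses 0. *)
let zero_rejected =
  try ignore (Int_Star.of_int 0); false
  with Int_Star.Isnul -> true

(* Values obtained through [of_int] can be combined with [mult]. *)
let product = Int_Star.mult (Int_Star.of_int 3) (Int_Star.of_int 4)

(* An expression such as  Int_Star.mult 3 4  would be rejected by the
   type checker: outside the module, int and Int_Star.t are distinct. *)
```

Since the signature exports no function converting back to int, the value product stays opaque to clients. A real module would presumably export such a conversion as well.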

Hiding Values

We now define a symbol generator, similar to that of page ??, using a signature constraint to hide the state of the generator.

We first define the signature GENSYM exporting only two functions for generating symbols.


# module type GENSYM = 
   sig
     val reset : unit -> unit
     val next : string -> string 
   end ;;

We then implement this signature as follows:


# module Gensym : GENSYM = 
   struct 
     let c = ref 0
     let reset () = c:=0
     let next s = incr c ; s ^ (string_of_int !c)
   end ;;
module Gensym : GENSYM

The reference c holding the state of the generator Gensym is not accessible outside the two exported functions.


# Gensym.reset();;
- : unit = ()
# Gensym.next "T";;
- : string = "T1"
# Gensym.next "X";;
- : string = "X2"
# Gensym.reset();;
- : unit = ()
# Gensym.next "U";;
- : string = "U1"
# Gensym.c;;
Characters 0-8:
Unbound value Gensym.c

The definition of c is essentially local to the structure Gensym, since it is hidden by the associated signature. The signature constraint achieves more simply the same goal as the local definition of a reference in the definition of the two functions reset_s and new_s on page ??.

Multiple Views of a Module

The module language and its signature constraints support taking several views of a given structure. For instance, we can have a "super-user interface" for the module Gensym, allowing the symbol counter to be reset, and a "normal user interface" that permits only the generation of new symbols, but no other intervention on the counter. To implement the latter interface, it suffices to declare the signature:


# module type USER_GENSYM =
   sig
     val next : string -> string 
   end;;
module type USER_GENSYM = sig val next : string -> string end

We then implement it by a mere signature constraint.


# module UserGensym = (Gensym : USER_GENSYM) ;;
module UserGensym : USER_GENSYM
# UserGensym.next "U" ;;
- : string = "U2"
# UserGensym.reset() ;;
Characters 0-16:
Unbound value UserGensym.reset

The UserGensym module fully reuses the code of the Gensym module. In addition, both modules share the same counter:


# Gensym.next "U" ;;
- : string = "U3"
# Gensym.reset() ;;
- : unit = ()
# UserGensym.next "V" ;;
- : string = "V1"

Type Sharing between Modules

As we saw on page ??, abstract types with different names are incompatible. This can be problematic when we wish to share an abstract type between several modules. There are two ways to achieve this sharing: one is via a special sharing construct in the module language; the other one uses the lexical scoping of modules.

Sharing via Constraints

The following example illustrates the sharing issue. We define a module M providing an abstract type M.t. We then restrict M to two different signatures exporting different subsets of operations.


# module M = 
 ( 
   struct
     type t = int ref
     let create() = ref 0
     let add x = incr x
     let get x = if !x>0 then (decr x; 1) else failwith "Empty"
   end
   :
   sig
     type t
     val create : unit -> t
     val add : t -> unit
     val get : t -> int 
   end 
 ) ;;

# module type S1 = 
   sig
     type t
     val create : unit -> t
     val add : t -> unit
   end ;;

# module type S2 =
   sig
     type t
     val get : t -> int
   end ;;
# module M1 = (M:S1) ;;
module M1 : S1
# module M2 = (M:S2) ;;
module M2 : S2

As written above, the types M1.t and M2.t are incompatible. However, we would like to say that both types are abstract but identical. To do this, Objective CAML offers special syntax to declare a type equality over an abstract type in a signature.

Syntax

NAME with type t₁ = t₂ and ...

This type constraint forces the type t₁ declared in the signature NAME to be equal to the type t₂.

Type constraints over all types exported by a sub-module can be declared in one operation with the syntax

Syntax

NAME with module Name₁ = Name₂

Using these type sharing constraints, we can declare that the two modules M1 and M2 define identical abstract types.


# module M1 = (M:S1 with type t = M.t) ;;
module M1 : sig type t = M.t val create : unit -> t val add : t -> unit end
# module M2 = (M:S2 with type t = M.t) ;;
module M2 : sig type t = M.t val get : t -> int end
# let x = M1.create() in M1.add x ;  M2.get x ;;
- : int = 1
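The with module form given above can be sketched in the same spirit. All names in this fragment (ELT, CONTAINER, IntElt, IntBag) are hypothetical and do not come from the examples of this section. The constraint states that the sub-module Elt of the signature is the module IntElt, so that every type exported by Elt is shared in a single step.

```ocaml
(* Hypothetical signature of elements. *)
module type ELT =
  sig
    type t
    val show : t -> string
  end

(* Hypothetical signature with a sub-module [Elt]. *)
module type CONTAINER =
  sig
    module Elt : ELT
    type t
    val empty : t
    val add : Elt.t -> t -> t
    val dump : t -> string list
  end

module IntElt =
  struct
    type t = int
    let show = string_of_int
  end

(* The constraint identifies IntBag.Elt with IntElt, hence
   IntBag.Elt.t with int, while IntBag.t remains abstract. *)
module IntBag : CONTAINER with module Elt = IntElt =
  struct
    module Elt = IntElt
    type t = Elt.t list
    let empty = []
    let add x b = x :: b
    let dump b = List.map Elt.show b
  end

(* Integers can be passed directly where an Elt.t is expected. *)
let strings = IntBag.dump (IntBag.add 1 (IntBag.add 2 IntBag.empty))
```

Without the constraint, IntBag.Elt.t would be abstract and the call IntBag.add 1 would be rejected.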

Sharing and Nested Modules

Another possibility for ensuring type sharing is to use nested modules. We define two sub-modules (M1 and M2) sharing an abstract type defined in the enclosing module M.


# module M = 
   ( struct 
       type t = int ref 
       module M_hide = 
         struct 
           let create() = ref 0
           let add x = incr x
           let get x = if !x>0 then (decr x; 1) else failwith "Empty"
         end 
       module M1 = M_hide 
       module M2 = M_hide 
     end 
    :
     sig
       type t
       module M1 : sig  val create : unit -> t  val add : t -> unit end
       module M2 : sig  val get : t -> int  end 
     end ) ;;
module M :
  sig
    type t
    module M1 : sig val create : unit -> t val add : t -> unit end
    module M2 : sig val get : t -> int end
  end

As desired, values created by M1 can be operated upon by M2, while hiding the representation of these values.


# let x = M.M1.create() ;;
val x : M.t = <abstr>
# M.M1.add x ; M.M2.get x ;;
- : int = 1

This solution is heavier than the previous solution based on type sharing constraints: the functions from M1 and M2 can only be accessed via the enclosing module M.

Extending Simple Modules

Modules are closed entities, defined once and for all. In particular, once an abstract type is defined using the module language, it is impossible to add further operations on the abstract type that depend on the type representation without modifying the module definition itself. (Operations derived from existing operations can of course be added later, outside the module.) As an extreme example, if the module exports no creation function, clients of the module will never be able to create values of the abstract type!

Therefore, adding new operations that depend on the type representation requires editing the sources of the module and adding the desired operations in its signature and structure. Of course, we then get a different module, and clients need to be recompiled. However, if the modifications performed on the module signature did not affect the components of the original signature, the remainder of the program remains correct and does not need to be modified, just recompiled.
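The distinction between the two kinds of operations can be sketched as follows. The module IntStack and the functions push_all and drain are hypothetical names introduced for this illustration only.

```ocaml
(* Hypothetical module exporting an abstract type with three operations. *)
module IntStack :
  sig
    type t
    val create : unit -> t
    val push : int -> t -> unit
    val pop : t -> int option
  end =
  struct
    type t = int list ref
    let create () = ref []
    let push x s = s := x :: !s
    let pop s =
      match !s with
      | [] -> None
      | x :: rest -> s := rest; Some x
  end

(* Derived operations: they rely only on the exported functions,
   so they can be added outside the module, without editing it. *)
let push_all xs s = List.iter (fun x -> IntStack.push x s) xs

let rec drain s =
  match IntStack.pop s with
  | None -> []
  | Some x -> x :: drain s

(* By contrast, an operation reading the representation directly,
   such as  let length s = List.length !s , cannot be written here:
   outside the module, IntStack.t is not known to be a reference. *)

let s = IntStack.create ()
let () = push_all [1; 2; 3] s
let popped = drain s
```

Adding length to IntStack would mean editing both its signature and its structure. Since the existing components are unchanged, push_all and drain would remain correct after recompilation.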