Comparison of Modules and Objects

Comparison of Modules and Objects

The main difference between modular programming and object programming in Objective CAML comes from the type system. In effect, programming with modules remains within the ML type system (i.e. parametric polymorphism code is executed for different types of parameter), while programming with objects entails an ad hoc polymorphism (in which the sending of a message to an object triggers the application of different pieces of code). This is particularly clear with subtyping. This extension of the ML type system can not be simulated in pure ML. It will always be impossible to construct heterogeneous lists without breaking the type system.

Modular programming and object programming are two safe (thanks to typing) approaches to the logical organisation of a program, permitting the reusability and the modifiability of software components. Programming with objects in Objective CAML allows parametric polymorphism (parameterized classes) and inclusion/subtype polymorphism (sending of messages) thanks to late binding and subtyping, with restrictions due to equality, facilitating incremental programming. Modular programming allows one to restrict parametric polymorphism and use immediate binding, which can be useful for conserving efficiency of execution.

The modular programming model permits the easy extension of functions on non-extensible recursive data types. If one wishes to add a case in a variant type, it will be necessary to modify a large part of the sources.
The object model of programming defines a set of recursive data types using classes. One interprets a class as a case of the data type.

Efficiency of Execution

Late binding corresponds to an indirection in the method table (see page ??). Just as the access to an instance variable from outside the class goes through a message dispatch, this accumulation of indirections can prove to be costly.

To show this loss of efficiency, we construct the following class hierarchy:


# class virtual test () = 
   object
    method virtual sum : unit -> int
    method virtual sum2 : unit -> int
   end;;
# class a x  = 
   object(self)
    inherit test ()
    val a = x
    method a = a
    method sum () = a
    method sum2 () = self#a
   end;;
# class b x y  = 
   object(self)
   inherit a x as super
    val b = y
    method b = b
    method  sum () = b + a 
    method  sum2 () = self#b + super#sum2()
   end;;

Now, we compare the execution time, on one hand of the dispatch of messages sum and sum2 to an instance of class b, and on the other hand of a call to the following function f.


# let f a b = a + b ;;
# let iter g a n = for i = 1 to n do ignore(g a) done ; g a ;;
# let go i j = match i with 
                  1 -> iter (fun x -> x#sum()) (new b 1 2) j
                | 2 -> iter (fun x -> x#sum2()) (new b 1 2) j
                | 3 -> iter (fun x -> f 1 x) 2 j ;;

# go (int_of_string (Sys.argv.(1))) (int_of_string (Sys.argv.(2))) ;;

For 10 million iterations, we get the following results:

	bytecode	native
case 1	07,5 s	0,6 s
case 2	15,0 s	2,3 s
case 3	06,0 s	0,3 s

This example has been constructed in order to show that late binding has a cost relative to the standard static binding. This cost depends on the quantity of calculation relative to the number of message dispatches in a function. The use of the native compiler reduces the calculation component without changing the indirection component of the test. We can see in case 2 that the multiple indirections at the dispatch of message sum2 have an ``incompressible'' cost.

Example: Graphical Interface

The AWI graphical library (see page ??) was designed using the functional/imperative core of the language. It is very easy to adapt it into module form. Each component becomes an independent module, thus permitting a harmonization of function names. To add a component, it is necessary to know the concrete type of its components. It is up to the new module to modify the fields necessary to describe its appearance and its behaviors.

The library can also be rewritten as an object. For this we construct the hierarchy of classes shown in figure 16.1.

Figure 16.1: Class hierarchy for AWI.

It is easier to add new components, thanks to inheritance, than when using modules; however, the absence of overloading still requires options to be encoded as method parameters. The use of the subtyping relation makes it easy to construct a list of the constituents of a container. Deferred linking selects the methods appropriate to the component. The interest of the object model also comes from the possibility of extending or modifying the graphics context, and the other types that are used, again thanks to inheritance. This is why the principal graphics libraries are organised according to the object model.

Translation of Modules into Classes

A simple module which only declares one type and does not have any type-independent polymorphic functions can be translated into a class. According to the nature of the type used (record type or variant type) one translates the module into a class in a different way.

Type Declarations

Record type.

A record type can be written directly in the form of a class in which every field of the record type becomes an instance variable.

Variant type.

A variant type translates into many classes, using the conceptual model of a ``composite''. An abstract class describes the operations (functions) on this type. Every branch of the variant type thus becomes a subclass of the abstract class, and implements the abstract methods for its branch. We no longer have pattern matching but instead choose the method specific to the branch.

Parameterized types.

Parameterized types are implemented by parameterized classes.

Abstract types.

We can consider a class as an abstract type. At no time is the internal state of the class visible outside its hierarchy. Nevertheless, nothing prevents us from defining a subclass in order to access the variables of the instances of a class.

Mutually recursive types.

The declarations of mutually recursive types are translated into declarations of mutually recursive classes.

Function Declarations

Those functions with parameters dependent on the module type, t, are translatable into methods. Functions in which t does not appear may be declared private inasmuch as their membership of the module is not directly linked to the type t. This has the added advantage that there is no problem if type variables appear in the type of the parameters. We are left with the problem of functions in which one parameter is of type t and another is of type 'a. These functions are very rare in the modules of the standard library. We can identify ``peculiar'' modules like Marshal or Printf which have non-standard typing, and modules (that operate) on linear structures like List. For this last, the function fold_left, of type ('a -> 'b -> 'a) -> 'a -> 'b list -> 'a is difficult to translate, especially in a method of the class ['b] list because the type variable 'a is free and may not appear in the type of the method. Rather than adding a type parameter to the list class, it is preferable to break these functions out into new classes, parameterized by two type variables and having a list field.

Binary methods.

Binary methods do not pose any problem, outside subtyping.

Other declarations.

Declarations of non-functional values. We can accept the declaration of non-functional values outside classes. This is also true for exceptions.

Example: Lists with Iterator.

We are trying to translate a module with the following signature LIST into an object.


# module type LIST =  sig
   type 'a list = C0 | C1 of 'a * 'a list
   val add : 'a list -> 'a -> 'a list
   val length : 'a list -> int
   val hd : 'a list -> 'a
   val tl : 'a list -> 'a list
   val append : 'a list -> 'a list -> 'a list
   val fold_left : ('a -> 'b -> 'a) -> 'a -> 'b list -> 'a
 end ;;

First of all, we declare the abstract class 'a list corresponding to the definition of the type.


# class virtual ['a] list () = 
   object (self : 'b) 
     method virtual add : 'a -> 'a list
     method virtual empty : unit -> bool
     method virtual hd : 'a
     method virtual tl : 'a list
     method virtual length : unit -> int
     method virtual append : 'a list  -> 'a list
   end ;;

Then we define the two subclasses c1_list and c0_list for each constituent of the variant type. Each of these classes should define the methods of the ancestor abstract class


# class ['a] c1_list (t, q)  = 
   object (self )
     inherit ['a] list () as super
     val t = t
     val q = q
     method add x = new c1_list (x, (self : 'a #list :> 'a list))
     method empty () = false
     method length () = 1 + q#length()
     method hd = t
     method tl = q
     method append l  = new c1_list (t,q#append l)
   end ;;
# class ['a] c0_list () = 
   object (self)
     inherit ['a] list () as super
     method add x = new c1_list (x, (self : 'a #list :> 'a list))
     method empty () = true
     method length () = 0 
     method hd = failwith "c0_list : hd"
     method tl = failwith "c0_list : tl"
     method append l  = l 
   end ;;
# let l = new c1_list (4, new c1_list (7, new c0_list ())) ;;
val l : int list = <obj>

The function LIST.fold_left has not been incorporated into the list class to avoid introducing a new type parameter. We prefer to define the class fold_left to implement this method. For this, we use a functional instance variable (f).


# class virtual ['a,'b] fold_left () = 
   object(self)
     method virtual f : 'a -> 'b -> 'a
     method iter r (l : 'b list) = 
       if l#empty() then r else self#iter (self#f r (l#hd)) (l#tl)
   end ;;
# class ['a,'b] gen_fl f = 
   object 
     inherit ['a,'b] fold_left ()
     method f = f
   end ;;

Thus we construct an instance of the class gen_fl for addition:


# let afl = new gen_fl (+) ;;
val afl : (int, int) gen_fl = <obj>
# afl#iter 0 l ;;
- : int = 11

Simulation of Inheritance with Modules

Thanks to the relation of inheritance between classes, we can retrieve in a subclass the collection of variable declarations and methods of the ancestor class. We can simulate this relation by using modules. The subclass which inherits is transformed into a parameterized module, of which the parameter is the ancestor class. Multiple inheritance increases the number of parameters of the module. We revisit the classic example of points and colored points, described in chapter 15, to translate it into modules.

The class point becomes the module Point with the following signature POINT.


# module type POINT =
   sig
     type point
     val new_point : (int * int) -> point
     val get_x : point -> int
     val get_y : point -> int
     val moveto : point -> (int * int) -> unit
     val rmoveto : point -> (int * int) -> unit
     val display : point -> unit
     val distance : point -> float
   end ;;

The class colored_point is transformed into a parameterized module ColoredPoint which has the signature POINT as its parameter.


# module ColoredPoint = functor (P : POINT) ->
 struct
   type colored_point = {p:P.point;c:string}
   let new_colored_point p c = {p=P.new_point p;c=c}
   let get_c self = self.c
 (* begin *)
   let get_x self = let super = self.p in P.get_x super
   let get_y self = let super = self.p in P.get_y super
   let moveto self = let super = self.p in P.moveto super
   let rmoveto self = let super = self.p in P.rmoveto super
   let display self = let super = self.p in P.display super
   let distance self = let super = self.p in P.distance super
 (* end *)
   let display self = 
     let super = self.p in P.display super; print_string ("has color "^ self.c)
 end ;;

The burden of ``inherited'' declarations can be lightened by an automatic translation procedure, or an extension of the language. Recursive method declarations can be written with a single let rec ... and. Multiple inheritance leads to functors with many parameters. The cost of redefinition is not greater than that of late binding.

Late binding is not implemented in this simulation. To achieve it, it is necessary to define a record in which each field corresponds to the type of its functions/methods.

Limitations of each Model

The functional/modular module offers a reassuring but rigid framework for the modifiability of code. Objective CAML's object model suffers from ``double vision'' of classes: structuring and type, implying the absence of overloading and the impossibility of imposing type constraints from an ancestor type on a descendant type.

Modules

The principal limitations of the functional/modular model arise from the difficulty of extending types. Although abstract types allow us to get away from the concrete representation of a type, their use in parameterized modules requires that type equalities between modules be indicated by hand, complicating the writing of signatures.

Recursive dependencies.

The dependence graph of the modules in an application is a directed acyclic graph (DAG). This implies on the one hand that there are no types that are mutually recursive between two modules, and on the other prevents the declaration of mutually recursive values.

Difficulties in writing signatures.

One of the attractions of type inference is that it is not necessary to specify the types of function parameters. The specification of signatures sacrifices this convenience. It becomes necessary to specify the types of the declarations of the signature ``by hand.'' One can use the -i option of the compiler ocamlc to display the type of all the global declarations in a .ml file and use this information to construct the signature of a module. In this case, we lose the ``software engineering'' discipline which consists of specifying the module before implementing it. In addition, if the signature and module undergo large changes, we will have to go back to editing the signature. Parameterized modules need signatures for their parameters and those should also be written by hand. Finally if we associate a functional signature with a parameterized module, it is impossible to recove the signature resulting from the application of the functor. This obliges us to mostly write non-functional signatures, leaving it until later to assemble them to construct a functional signature.

Import and export of modules.

The importation of the declarations of a simple module is achieved either by dot notation (Module.name) or directory by the name of a declaration (name) if the model has been opened (open Module). The declaration of the interface of the imported module is not directly exportable at the level of the module in process of being defined. It has access to these declarations, but they are not considered as declarations of the module. In order to do this it is necessary to declare, in the same way as the simulation of inheritance, imported values. The same is true for parameterized modules. The declarations of the module parameters are not considered as declarations of the current module.

Objects

The principle limitations of the Objective CAML object model arise from typing.

no methods containing parameters of free type;
difficulty of escaping from the type of a class in one of its methods;
absence of type constraint from the ancestor type on its descendant;
no overloading;

The most disconcerting point when you start with the object extension of Objective CAML is the impossibility of constructing methods containing a parameterized type in which the type parameter is free. The declaration of a class can be seen as the definition of a new type, and hence arises the general rule forbidding the presence of variables with free type in the declaration of a type. For this reason, parameterized classes are indispensable in the Objective CAML object model because they permit the linking of their type variables.

Absence of overloading.

The Objective CAML object model does not allow method overloading. As the type of an object corresponds to types of its methods, the fact of possessing many methods with the same name but different types would result in numerous ambiguities, due to parametric polymorphism, which the system could only resolve dynamically. This would be contradictory to the vision of totally static typing. We take a class example which has two message methods, the first having an integer parameter, and the second a float parameter. Let e be an instance of this class and f be the following function:


# let f x a  = x#message a ;;

The calls f e 1 et f e 1.1 cannot be statically resolved because there is no information about the class example in the code of the function f.

An immediate consequence of this absence is the uniqueness of instance constructors. The declaration of a class indicates the parameters to supply to the creation function. This constructor is unique.

Initialization.

The initialization of instance variables declared in a class can be problematic when it should be calculated based on the values passed to the constructor.

Equality between instances.

The only equality which applies to objects is physical equality. Structural equality always returns false when it is applied to two physically different objects. This can be surprising inasmuch as two instances of the same class share the same method table. One can imagine a physical test on the method table and a structural test on the values (val) of objects. These are the implementation choices of the linear pattern-matching style.

Class hierarchy.

There is no class hierarchy in the language distribution. In fact the collection of libraries are supplied in the form of simple or parameterized modules. This demonstrates that the object extension of the language is still stabilizing, and makes little case for its extensive use.