Allocation and Deallocation of Memory

Allocation and Deallocation of Memory

Most languages permit dynamic memory allocation, among them C, Pascal, Lisp, ML, SmallTalk, C++, Java, ADA.

Explicit Allocation

We distinguish two types of allocation:

a simple allocation reserving a block of memory of a certain size without concern of its contents;
an allocation combining the reservation of space with its initialization.

The first case is illustrated by the function new in Pascal or malloc in C. These return a pointer to a memory block (i.e. its address), through which the value stored in memory can be read or modified. The second case corresponds to the construction of values in Objective CAML, Lisp, or in object-oriented languages. Class instances in object-oriented languages are constructed by combining new with the invocation of a constructor for the class, which usually expects a number of parameters. In functional languages, constructor functions are called in places where a structural value (tuple, list, record, vector, or closure) is defined.

Let's examine an example of value construction in Objective CAML. The representation of values in memory is illustrated in Figure 9.1.

Figure 9.1: Memory representation of values.


# let u = let l = ['c'; 'a'; 'm'] in List.tl l ;;
val u : char list = ['a'; 'm']
# let v = let r = ( ['z'] , u )
         in match r with p -> (fst p) @ (snd p) ;;
val v : char list = ['z'; 'a'; 'm']

A list element is represented by a tuple of two words, the first containing a character and the second containing a pointer to the next element of the list. The actual runtime representation differs slightly and is described in the chapter on interfacing with C (see page ??).

The first definition constructs a value named l by allocating a cell (constructor ::) for each element of the list ['c';'a';'m']. The global declaration u corresponds to the tail of l. This establishes a sharing relationship between l and u, i.e. between the argument and the result of the function call to List.tl.

Only the declaration u is known after the evaluation of this first statement.

The second statement constructs a list with only one element, then a pair called r containing this list and the list u. This pair is pattern matched and renamed p by the matching. Next, the first element of p is concatenated with its second element, which creates a value ['z';'a';'m'] tied to the global identifier v. Notice that the result of snd (the list ['a';'m']) is shared with its argument p whereas the result of fst (the character 'z') is copied.

In each case memory allocation is explicit, meaning that it is requested by the programmer (by a language command or instruction).

Note

Allocated memory stores information on the size of the object allocated in order to be able to free it later.

Explicit Reclamation

Languages with explicit memory reclamation possess a freeing operator (free in C or dispose in Pascal) that take the address (a pointer) of the region to deallocate. Using the information stored at allocation time, the program frees this region and may re-use it later.

Dynamic allocation is generally used to manipulate data structures that evolve, such as lists, trees etc.. Freeing the space occupied by such data is not done in one fell swoop, but instead requires a function to traverse the data. We call such functions destructors.

Although correctly defining destructors is not too difficult, their use is quite delicate. In fact, in order to free the space occupied by a structure, it is necessary to traverse the structure's pointers and apply the language's freeing operator. Leaving the responsibility of freeing memory to the programmer has the advantage that the latter is sure of the actions taken. However, incorrect use of these operators can cause an error during the execution of the program. The principal dangers of explicit memory reclamation are:

dangling pointers: a memory region has been freed while there are still pointers pointing at it. If the region is reused, access to the region by these pointers risks being incoherent.
Inaccessible memory regions (a memory ``leak''): a memory region is still allocated, but no longer referenced by any pointer. There is no longer any possibility of freeing the region. There is a clear loss of memory.

The entire difficulty with explicit memory reclamation is that of knowing the lifetime of the set of values of a program.

Implicit Reclamation

Languages with implicit memory reclamation do not possess memory-freeing operators. It is not possible for the programmer to free an allocated value. Instead, an automatic reclamation mechanism is engaged when a value is no longer referenced, or at the time of an allocation failure, that is to say, when the heap is full.

An automatic memory reclamation algorithm is in some ways a global destructor. This characteristic makes its design and implementation more difficult than that of a destructor dedicated to a particular data structure. But, once this difficulty is overcome, the memory reclamation function obtained greatly enhances the safety of memory management. In particular, the risk of dangling pointers disappears.

Furthermore, an automatic memory reclamation mechanism may bring good properties to the heap:

compaction: all the recovered memory belongs to a single block, thereby avoiding fragmentation of the heap, and allowing allocation of objects of the size of the free space on the heap;
localization: the different parts of the same value are close to one another from the point of view of memory address, permitting them to remain in the same memory pages during use, and thereby avoiding their erasure from cache memory.

Design choices for a garbage collector must take certain criteria and constraints into account:

reclamation factor: what percentage of unused memory is available?
memory fragmentation: can one allocate a block the size of the free memory?
the slowness of allocation and collection;
what freedom do we have regarding the representation of values?

In practice, the safety criterion remains primordial, and garbage collectors find a compromise among the other constraints.