Vicare Scheme: objects memory usage

13.3.5 Writing correct C language code

The garbage collector considers an object “in use” if at least one reference to it is reachable from the roots of the garbage collection, which are:

The heap’s dirty generational pages that are not collected in the current garbage collection run.
The Scheme stack.
The next continuation Scheme object.
The symbol table collecting interned symbols.
The root fields of the PCB structure.

Notice that the heap’s nursery is not a garbage collector root; so if we leave some machine words uninitialised on the nursery, outside of Scheme objects: nothing bad happens, because the garbage collector never sees them. Upon allocation, there is no need to initialise the memory segment used as nursery.

If an ikptr_t reference exists only in a CPU register, on the C language stack, or on the C language heap outside the segments allocated for Scheme: the garbage collector will not see it. This spares the collector from scanning the whole process stack for references to values, but it demands care when writing C language code.

Whenever we call ik_safe_alloc() or a function relying on it for memory allocation: a garbage collection may run and Scheme objects may be moved from their location in memory to another generational memory page; this invalidates all the pointers to them held in CPU registers, on the C stack and on the C heap. Notice that this includes the arguments to C functions called from Scheme through the macro foreign-call.

If an old Scheme object contains a reference to a new Scheme object: we have to inform the garbage collector about this. Whenever we allocate a new Scheme object and store a reference to it in a field of a previously allocated Scheme object: we have to register this event in the dirty vector.
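The idea of recording such stores can be sketched with a toy model. Everything below is hypothetical and is not Vicare's actual data layout: the names toy_heap_t and toy_signal_dirt(), the page size and the one-flag-per-page vector are all assumptions made for illustration. The sketch only shows that storing into a field of an old object is followed by marking the page that holds the mutated field.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical toy model of a dirty vector; not Vicare's actual layout. */
enum { TOY_PAGE_SIZE = 4096, TOY_PAGES = 8 };
#define TOY_WORDS_PER_PAGE  (TOY_PAGE_SIZE / sizeof(uintptr_t))

typedef struct {
  uintptr_t     memory[TOY_PAGES * TOY_WORDS_PER_PAGE]; /* the toy heap  */
  unsigned char dirty[TOY_PAGES];                       /* 1 flag / page */
} toy_heap_t;

/* Mark as dirty the page containing the mutated field. */
void
toy_signal_dirt (toy_heap_t * heap, uintptr_t * field)
{
  size_t page = (size_t)(field - heap->memory) / TOY_WORDS_PER_PAGE;
  assert(page < TOY_PAGES);
  heap->dirty[page] = 1;
}

/* Store a reference to a "new" object into a field of an "old" one, then
   signal the mutation.  Return 1 if exactly the page of the old object
   has been marked dirty. */
int
toy_dirty_demo (void)
{
  static toy_heap_t heap;
  uintptr_t * old_field = &heap.memory[2 * TOY_WORDS_PER_PAGE + 3]; /* page 2 */
  uintptr_t * new_obj   = &heap.memory[5 * TOY_WORDS_PER_PAGE];     /* page 5 */
  size_t      i, count = 0;

  memset(&heap, 0, sizeof(heap));
  *old_field = (uintptr_t)new_obj;    /* old object now references new one */
  toy_signal_dirt(&heap, old_field);  /* register the event                */

  for (i = 0; i < TOY_PAGES; ++i)
    count += heap.dirty[i];
  return (1 == count) && (1 == heap.dirty[2]);
}
```

Notice that the page being marked is the one holding the mutated field, not the one holding the new object.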

We must write C code with the following constraints:

Before a call to ik_safe_alloc(): we must make sure that all the Scheme objects we are using in C code are reachable by the garbage collector. This is done by registering an object as garbage collector root through the root fields of the PCB.
After a call to ik_safe_alloc(): we must retrieve again all the pointers to the internals of the objects we are using.
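The two constraints above can be modelled with a toy moving allocator. This is a minimal sketch and not Vicare code: toy_pcb_t, toy_safe_alloc() and the two fixed-size semispaces are hypothetical stand-ins, and the toy allocator deliberately "collects" at every call so that the hazard shows up every time. It shows that a reference registered through the root slot follows the object when it moves, while an unregistered copy of the old address does not.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical toy model: none of these names belong to Vicare. */
typedef uintptr_t toy_ptr_t;              /* plays the role of ikptr_t */

enum { TOY_OBJ_SIZE = 16, TOY_SPACE_SIZE = 64 };

typedef struct {
  unsigned char space[2][TOY_SPACE_SIZE]; /* two semispaces               */
  int           active;                   /* index of the space in use    */
  toy_ptr_t *   root0;                    /* plays the role of pcb->root0 */
} toy_pcb_t;

/* Allocate one fixed-size object.  Every call "collects" first: the
   object referenced through root0 (if any) is copied to the other
   space, everything else is discarded. */
toy_ptr_t
toy_safe_alloc (toy_pcb_t * pcb)
{
  int             to   = 1 - pcb->active;
  unsigned char * dst  = pcb->space[to];
  size_t          used = 0;

  assert(NULL != pcb);
  if (pcb->root0 && *pcb->root0) {
    memcpy(dst, (unsigned char *)*pcb->root0, TOY_OBJ_SIZE);
    *pcb->root0 = (toy_ptr_t)dst;   /* the registered reference is updated */
    used = TOY_OBJ_SIZE;
  }
  memset(pcb->space[pcb->active], 0xDD, TOY_SPACE_SIZE); /* poison old space */
  pcb->active = to;
  memset(dst + used, 0, TOY_OBJ_SIZE);
  return (toy_ptr_t)(dst + used);
}

/* Return 1 when the rooted reference followed the object and the stale
   copy of the old address now points to poisoned memory. */
int
toy_root_demo (void)
{
  toy_pcb_t       pcb;
  toy_ptr_t       s_one, s_two, stale;
  unsigned char * one;

  memset(&pcb, 0, sizeof(pcb));
  s_one  = toy_safe_alloc(&pcb);
  one    = (unsigned char *)s_one;
  one[0] = 42;
  stale  = s_one;                   /* unregistered copy: never updated */

  pcb.root0 = &s_one;
  {
    s_two = toy_safe_alloc(&pcb);
  }
  pcb.root0 = NULL;
  one = (unsigned char *)s_one;     /* retrieve the data pointer again */

  return (s_one != stale) && (s_two != s_one) && (42 == one[0])
    && (0xDD == *(unsigned char *)stale);
}
```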

To help identify the C functions and macros that allocate memory: the ones calling ik_safe_alloc() are prefixed with ika_ and IKA_; the ones calling ik_unsafe_alloc() are prefixed with iku_ and IKU_.

Example of correct code: s_one is protected while allocating s_two:

ikpcb_t * pcb = ik_the_pcb();
ikptr_t   s_one, s_two;

s_one = ika_bytevector_alloc(pcb, 10);
pcb->root0 = &s_one;
{
  s_two = ika_bytevector_alloc(pcb, 10); /* GOOD */
}
pcb->root0 = NULL;

Example of wrong code: after the second call to the allocation function the value in s_one may be invalid:

ikpcb_t * pcb = ik_the_pcb();
ikptr_t   s_one, s_two;

s_one = ika_bytevector_alloc(pcb, 10);
s_two = ika_bytevector_alloc(pcb, 10);
/* do something with "s_one" and "s_two" */ /* WRONG */

Example of correct code: s_one is protected while allocating s_two and after the second allocation the pointer to the data area of s_one is retrieved again:

ikpcb_t * pcb = ik_the_pcb();
ikptr_t   s_one;
ikptr_t   s_two;
char *    one;
char *    two;

s_one = ika_bytevector_alloc(pcb, 10);
one   = IK_BYTEVECTOR_DATA_CHARP(s_one);
/* do something with "one" */
pcb->root0 = &s_one;
{
  s_two = ika_bytevector_alloc(pcb, 10);
}
pcb->root0 = NULL;
one   = IK_BYTEVECTOR_DATA_CHARP(s_one); /* GOOD */
two   = IK_BYTEVECTOR_DATA_CHARP(s_two);
/* do something with "one" and "two" */

Example of wrong code: after the second call to the allocation function the pointer one to the data area of s_one may be invalid:

ikpcb_t * pcb = ik_the_pcb();
ikptr_t   s_one;
ikptr_t   s_two;
char *    one;
char *    two;

s_one = ika_bytevector_alloc(pcb, 10);
one   = IK_BYTEVECTOR_DATA_CHARP(s_one);
/* do something with "one" */
pcb->root0 = &s_one;
{
  s_two = ika_bytevector_alloc(pcb, 10);
}
pcb->root0 = NULL;
two   = IK_BYTEVECTOR_DATA_CHARP(s_two);
/* do something with "one" and "two" */ /* WRONG */

Notice that, according to Section 6.5.16 “Assignment operators” of the C standard: the order of evaluation of the operands is unspecified³. In the following code:

IK_CAR(s_pair) = ika_bytevector_alloc(pcb, 8); /* WRONG */

the left-side expression may be evaluated before the right-side one, so that the location computed from s_pair is already invalid when the memory assignment actually takes place; so we have to code:

ikpcb_t * pcb    = ...;
ikptr_t   s_pair = ...;
ikptr_t   s_tmp;

pcb->root0 = &s_pair;
{
  s_tmp          = ika_bytevector_alloc(pcb, 8); /* GOOD */
  IK_CAR(s_pair) = s_tmp;
  IK_SIGNAL_DIRT(pcb, IK_CAR_PTR(s_pair));
}
pcb->root0 = NULL;

or:

ikpcb_t * pcb    = ...;
ikptr_t   s_pair = ...;
ikptr_t   s_tmp;

pcb->root0 = &s_pair;
{
  IK_ASS(IK_CAR(s_pair), ika_bytevector_alloc(pcb, 8)); /* GOOD */
  IK_SIGNAL_DIRT(pcb, IK_CAR_PTR(s_pair));
}
pcb->root0 = NULL;

Yes, it is a hard life.
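The evaluation order problem can be reproduced in plain C, without any Scheme object. In the following sketch all the names are hypothetical: toy_left() stands for the computation of the destination location and toy_right() for the allocating call; a trace string records which one runs first. With a direct assignment the compiler is free to produce either order; with a temporary variable the right side is always evaluated first, which is the pattern used in the GOOD code above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

static char toy_trace[8];       /* records the order of the calls */

/* Stands for the computation of the destination location. */
int * toy_left  (int * cell) { strcat(toy_trace, "L"); return cell; }
/* Stands for the allocating call on the right side. */
int   toy_right (void)       { strcat(toy_trace, "R"); return 7; }

/* Direct assignment: the order of the two calls is unspecified, the
   trace may be "LR" or "RL" depending on the compiler. */
const char *
toy_assign_unsequenced (int * cell)
{
  assert(NULL != cell);
  toy_trace[0] = '\0';
  *toy_left(cell) = toy_right();
  return toy_trace;
}

/* Assignment through a temporary: the trace is always "RL". */
const char *
toy_assign_sequenced (int * cell)
{
  int tmp;

  assert(NULL != cell);
  toy_trace[0] = '\0';
  tmp = toy_right();            /* right side first       */
  *toy_left(cell) = tmp;        /* then the location      */
  return toy_trace;
}

/* Return 1 if the sequenced version evaluated the right side first and
   stored the value. */
int
toy_order_demo (void)
{
  int cell = 0;
  int ok   = (0 == strcmp(toy_assign_sequenced(&cell), "RL")) && (7 == cell);

  /* Whatever the order, both calls happen exactly once. */
  return ok && (2 == strlen(toy_assign_unsequenced(&cell)));
}
```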

Let’s consider the following snippet, which is wrong:

ikpcb_t * pcb = ik_the_pcb();
ikptr_t   s_one, s_two;

s_one = IKA_PAIR_ALLOC(pcb); /* WRONG */
pcb->root0 = &s_one;
{
  s_two = IKA_PAIR_ALLOC(pcb);
}
pcb->root0 = NULL;

when the second pair is allocated, the first pair has car and cdr still uninitialised (the macro IKA_PAIR_ALLOC() does not initialise the pair object): the content of these words is undefined; this may cause undefined behaviour while the second allocation takes place and the garbage collector tries to scan the first pair. The correct code is:

ikpcb_t * pcb = ik_the_pcb();
ikptr_t   s_one, s_two;

s_one = IKA_PAIR_ALLOC(pcb);
IK_CAR(s_one) = IK_FALSE; /* GOOD */
IK_CDR(s_one) = IK_FALSE; /* GOOD */
pcb->root0 = &s_one;
{
  s_two = IKA_PAIR_ALLOC(pcb);
}
pcb->root0 = NULL;

or:

ikpcb_t * pcb = ik_the_pcb();
ikptr_t   s_one, s_two;

s_one = ika_pair_alloc(pcb); /* GOOD */
pcb->root0 = &s_one;
{
  s_two = IKA_PAIR_ALLOC(pcb);
}
pcb->root0 = NULL;

because ika_pair_alloc() initialises the car and the cdr.
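Why the uninitialised fields are dangerous can be sketched with a toy scanner. The model below is hypothetical and does not reproduce the Vicare object layout or tagging scheme: we assume that a word with the low bit set is an immediate value, that any other word is followed as a pointer, and that the raw allocator leaves stale content (here a fixed poison value) in the fields.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical toy model; not the Vicare object layout. */
typedef uintptr_t toy_ptr_t;
#define TOY_FALSE   ((toy_ptr_t)0x01)  /* toy immediate: low bit set        */
#define TOY_POISON  ((toy_ptr_t)0xD0)  /* stale content: looks like pointer */

enum { TOY_NURSERY_PAIRS = 8 };
typedef struct { toy_ptr_t car, cdr; } toy_pair_t;

static toy_pair_t toy_nursery[TOY_NURSERY_PAIRS];
static int        toy_next = 0;

/* Like a raw allocation macro: the fields are left with stale content. */
toy_pair_t *
toy_raw_pair_alloc (void)
{
  toy_pair_t * p;

  assert(toy_next < TOY_NURSERY_PAIRS);
  p = &toy_nursery[toy_next++];
  p->car = p->cdr = TOY_POISON;
  return p;
}

/* Like an initialising allocation function. */
toy_pair_t *
toy_pair_alloc (void)
{
  toy_pair_t * p = toy_raw_pair_alloc();
  p->car = p->cdr = TOY_FALSE;
  return p;
}

/* A scanner follows every non-immediate word as a pointer; this is safe
   only if the word points into the toy nursery. */
static int
toy_word_is_scannable (toy_ptr_t w)
{
  uintptr_t lo = (uintptr_t)&toy_nursery[0];
  uintptr_t hi = lo + sizeof(toy_nursery);
  return (w & 1) || ((lo <= w) && (w < hi));
}

int
toy_pair_is_scannable (toy_pair_t * p)
{
  return toy_word_is_scannable(p->car) && toy_word_is_scannable(p->cdr);
}

/* Return 1 if the raw pair is unsafe to scan until it is initialised,
   and the pair from the initialising allocator is safe at once. */
int
toy_init_demo (void)
{
  toy_pair_t * raw;
  toy_pair_t * safe;
  int          raw_ok, fixed_ok;

  toy_next = 0;
  raw      = toy_raw_pair_alloc();
  raw_ok   = toy_pair_is_scannable(raw);   /* stale words: not safe */
  raw->car = TOY_FALSE;
  raw->cdr = TOY_FALSE;
  fixed_ok = toy_pair_is_scannable(raw);   /* now safe              */
  safe     = toy_pair_alloc();
  return (!raw_ok) && fixed_ok && toy_pair_is_scannable(safe);
}
```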

Footnotes

(3)

For an introduction to such problems see (URL last verified Jan 12, 2012):

http://en.wikipedia.org/wiki/Sequence_point