Values of type ikptr_t at the C language level are the ones we move around as arguments and return values at the Scheme level; they represent machine words. ikptr_t values have two major interpretations:
Immediate values: objects that fit in a single machine word: special constants (like #t and #f), fixnums, characters and input/output port transcoders.
Reference values: objects allocated on the heap and subject to garbage collection; they are represented by tagged pointers: symbols, pairs, vectors, bytevectors, structures, ports, bignums, ratnums, flonums, compnums, cflonums, strings, closures, continuations, code objects, pointers.
Immediate ikptr_t values have two minor interpretations:
Constants: these are #t, #f, nil, void, unbound, BWP.
Variable values: these are fixnums, characters and transcoders.
Reference ikptr_t values have two minor interpretations:
Memory pointer values whose 3 least significant bits are set to the vector tag. They reference multiword objects allocated on the heap: vectors, bignums, structures, flonums, ratnums, compnums, cflonums, continuations, code, ports, symbols, pointers.
Pointer values whose 3 least significant bits are set to a type–specific tag. They reference multiword objects allocated on the heap: pairs, bytevectors, closures, strings.
The type ikptr_t holds an immediate built in object or a reference to a built in object; it is defined as follows:
When void * is of size 32-bit: it is defined as alias for uint32_t.
When void * is of size 64-bit: it is defined as alias for uint64_t.
Otherwise: it is defined as alias for unsigned long int. This should never happen.
IK_TAGOF() returns an integer representing the 3 least significant bits of an ikptr_t value.
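As a rough illustration of the type selection and of the tag extraction described above, here is a minimal sketch. The names sketch_word_t, sketch_word_from_pointer() and sketch_tagof() are hypothetical and are not part of the Vicare API; testing UINTPTR_MAX to find the pointer width is an assumption of this sketch, not necessarily what Vicare does.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for ikptr_t: pick an unsigned integer type as
   wide as "void *".  Testing UINTPTR_MAX is an assumption made for
   this sketch. */
#if UINTPTR_MAX == 0xFFFFFFFFu
typedef uint32_t sketch_word_t;
#elif UINTPTR_MAX == 0xFFFFFFFFFFFFFFFFu
typedef uint64_t sketch_word_t;
#else
typedef unsigned long int sketch_word_t;   /* should never happen */
#endif

/* Store a C pointer in a machine word, unchanged. */
static sketch_word_t
sketch_word_from_pointer (const void *p)
{
  assert(sizeof(sketch_word_t) >= sizeof(void *));
  return (sketch_word_t)(uintptr_t)p;
}

/* Return the 3 least significant bits of a machine word as an int. */
static int
sketch_tagof (sketch_word_t word)
{
  return (int)(word & 7u);
}
```

With these definitions a word ending in #b111 yields the tag 7 and a word holding a NULL pointer yields the tag 0.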
IK_REF(value_ref, byte_offset) is the getter and setter for machine words. It interprets value_ref as a pointer to an array of ikptr_t values and locates the value at the zero-based byte_offset. A use of this macro can appear both as an operand and as the left side of an assignment.
    ikptr_t P, Q;

    Q = IK_REF(P, 2*wordsize);  /* retrieve the 3rd word */
    IK_REF(P, 0) = 123L;        /* store a value in the 1st word */
Both value_ref and byte_offset are first cast to long values, then added and the sum is cast to ikptr_t *.
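The cast-add-dereference behaviour described above can be sketched as follows. SKETCH_REF, sketch_word_t and sketch_copy_third_to_first() are hypothetical names, not the Vicare definitions; the sketch casts to intptr_t rather than long, as an assumption, so that it also works on platforms where long is narrower than a pointer.

```c
#include <assert.h>
#include <stdint.h>

typedef uintptr_t sketch_word_t;   /* hypothetical stand-in for ikptr_t */

/* Add the byte offset N to the word X, reinterpret the sum as a pointer
   to a machine word and dereference it.  The result is an lvalue, so
   the macro works on both sides of an assignment. */
#define SKETCH_REF(X, N) \
  (*(sketch_word_t *)((intptr_t)(X) + (intptr_t)(N)))

/* Copy the third word of a block into its first word; return the copy. */
static sketch_word_t
sketch_copy_third_to_first (sketch_word_t block)
{
  assert(0 == (block % sizeof(sketch_word_t)));   /* word aligned */
  SKETCH_REF(block, 0) = SKETCH_REF(block, 2 * sizeof(sketch_word_t));
  return SKETCH_REF(block, 0);
}
```

Because the macro expands to a dereference expression, no separate setter is needed: the same spelling reads and writes.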
There are two categories of values for byte_offset: displacements and offsets; both are usually precomputed at compile time and are predefined for the built in Scheme values.
Displacements are plain numbers of bytes to be added to an untagged pointer to obtain the memory address of a machine word.
Offsets are displacements from which a Scheme value’s tag has been subtracted: adding an offset to a tagged pointer removes the tag and computes the memory address of a machine word, in a single step.
Given an untagged pointer to a vector, the fixnum representing the length of the vector can be obtained with:
    ikptr_t p_vector = ...;
    ikptr_t s_length = IK_REF(p_vector, disp_vector_length);
Predefined displacements have names prefixed with disp_. Given a tagged pointer to a vector, the fixnum representing the length of the vector can be obtained with:
    ikptr_t s_vector = ...;
    ikptr_t s_length = IK_REF(s_vector, off_vector_length);
Predefined offsets have names prefixed with off_. An offset can be computed from a displacement simply by subtracting the tag:
    off_vector_length = disp_vector_length - vector_tag
This works because we can build a tagged pointer from an untagged and aligned one with:
    s_vector = p_vector | vector_tag = p_vector + vector_tag
and vice versa we can compute an untagged pointer from a tagged one with:
    p_vector = s_vector - vector_tag
and so:
    s_vector + off_vector_length = p_vector + disp_vector_length
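The equivalence between the two addressing styles can be checked with a small sketch. All the sketch_ names and SKETCH_REF are hypothetical; the vector tag value 5 comes from the table of reference tags, and a displacement of 0 for the length assumes that the length is the first word of a vector block.

```c
#include <assert.h>
#include <stdint.h>

typedef uintptr_t sketch_word_t;   /* hypothetical stand-in for ikptr_t */

#define SKETCH_REF(X, N) \
  (*(sketch_word_t *)((intptr_t)(X) + (intptr_t)(N)))

enum {
  sketch_vector_tag         = 5,   /* #b101 */
  sketch_disp_vector_length = 0,   /* assumed: the length is the first word */
  sketch_off_vector_length  = sketch_disp_vector_length - sketch_vector_tag
};

/* Build a tagged reference from an untagged pointer aligned to 8 bytes. */
static sketch_word_t
sketch_tag_vector (sketch_word_t p_vector)
{
  assert(0 == (p_vector & 7));
  return p_vector | sketch_vector_tag;
}

/* Read the length word through the untagged pointer and a displacement. */
static sketch_word_t
sketch_length_from_untagged (sketch_word_t p_vector)
{
  return SKETCH_REF(p_vector, sketch_disp_vector_length);
}

/* Read the length word through the tagged pointer and an offset. */
static sketch_word_t
sketch_length_from_tagged (sketch_word_t s_vector)
{
  return SKETCH_REF(s_vector, sketch_off_vector_length);
}
```

Both readers reach the same machine word; the offset version simply folds the untagging subtraction into the constant.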
Like IK_REF(), but rather than returning the machine word at offset byte_offset from value_ref, return a pointer to it. This is especially useful to build the second argument in a call to ik_signal_dirt_in_page_of_pointer().
All the immediate values but fixnums have the 3 least significant bits set to 1; to distinguish between immediate values and references we can do:
    ikptr_t X;

    if (IK_IS_FIXNUM(X) || (immediate_tag == IK_TAGOF(X)))
      it_is_immediate();
    else
      it_is_not();
where:
    immediate_tag = 7 = #b111
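A self-contained version of this test might look like the sketch below. The sketch_ names are hypothetical replacements for IK_IS_FIXNUM() and IK_TAGOF(); the fixnum masks are taken from the table of immediate tags (2 cleared bits on 32-bit platforms, 3 on 64-bit platforms).

```c
#include <assert.h>
#include <stdint.h>

typedef uintptr_t sketch_word_t;   /* hypothetical stand-in for ikptr_t */

enum { sketch_immediate_tag = 7 };   /* #b111 */

/* Fixnums have the 2 (32-bit) or 3 (64-bit) least significant bits
   cleared. */
static int
sketch_is_fixnum (sketch_word_t x)
{
  /* The sketch only knows about 32-bit and 64-bit machine words. */
  assert((4 == sizeof(sketch_word_t)) || (8 == sizeof(sketch_word_t)));
  const sketch_word_t fx_mask = (8 == sizeof(sketch_word_t)) ? 7 : 3;
  return 0 == (x & fx_mask);
}

/* Non-zero when X is an immediate value rather than a tagged pointer. */
static int
sketch_is_immediate (sketch_word_t x)
{
  return sketch_is_fixnum(x) || (sketch_immediate_tag == (int)(x & 7));
}
```

A word tagged as pair (#b001) or as vector (#b101) fails both tests and is therefore treated as a reference.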
Special machine words of type ikptr_t representing, respectively: #f; #t; nil, the empty list; EOF, the end of file; #!void, the return value of functions returning no value.
Special machine word value stored in the value and proc fields of Scheme symbol memory blocks to signal that these fields are unset.
Special machine word value stored in locations that used to hold weak references to values which have been already garbage collected. ‘BWP’ stands for “broken weak pointer”.
When a Scheme object’s memory block is moved by the garbage collector: the first word of the old memory block is overwritten with a special value, the “forward pointer”, which is the symbol IK_FORWARD_PTR.
Notice that when the garbage collector scans, word by word, memory that should contain the data area of a Scheme object: it interprets every machine word with all the bits set to 1 as IK_FORWARD_PTR.
Newly allocated memory is initialised by Vicare to a sequence of IK_FORWARD_PTR words, which, most likely, will trigger an assertion violation if the garbage collector scans a machine word we have not explicitly initialised to something valid. Whenever we reserve a portion of a memory page, with aligned size, for a Scheme object we must initialise all its words to something valid.
When we convert a requested size to an aligned size with IK_ALIGN(): either zero or one machine word is allocated beyond the requested size. When such additional machine word is allocated: we have to initialise it to something valid. Usually the safe value to which we should initialise memory is the fixnum zero: a machine word with all the bits set to 0.
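The padding rule above can be sketched as follows. The sketch assumes an alignment granularity of two machine words, which is consistent with "either zero or one machine word" of padding but is not taken from the definition of IK_ALIGN(); sketch_align() and sketch_init_padding() are hypothetical helpers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uintptr_t sketch_word_t;   /* hypothetical stand-in for ikptr_t */

/* Assumed granularity: two machine words.  This is an assumption of
   the sketch, not the definition of IK_ALIGN(). */
#define SKETCH_ALIGN_SIZE   (2 * sizeof(sketch_word_t))

/* Round a size in bytes up to the next multiple of SKETCH_ALIGN_SIZE. */
static size_t
sketch_align (size_t size)
{
  return (size + SKETCH_ALIGN_SIZE - 1) / SKETCH_ALIGN_SIZE * SKETCH_ALIGN_SIZE;
}

/* Given a block of sketch_align(requested_size) bytes whose first
   requested_size bytes are already valid, store the fixnum zero in the
   padding word, if any.  Return the number of padding words written. */
static size_t
sketch_init_padding (sketch_word_t *block, size_t requested_size)
{
  size_t used  = requested_size / sizeof(sketch_word_t);
  size_t total = sketch_align(requested_size) / sizeof(sketch_word_t);
  size_t i;
  assert(0 == (requested_size % sizeof(sketch_word_t)));
  for (i = used; i < total; ++i)
    block[i] = 0;   /* fixnum zero: all the bits set to 0 */
  return total - used;
}
```

A request for three words is rounded up to four and the fourth word is cleared; a request for four words needs no padding.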
The variable values that fit in a single machine word are fixnums, characters and port transcoders. The least significant byte of these machine words is tagged as follows:
    object         | tag bits   | tag hex | mask bits
    ---------------+------------+---------+------------
    fixnums 32-bit | #b??????00 | --      | #b00000011
    fixnums 64-bit | #b?????000 | --      | #b00000111
    characters     | #b00001111 | #x0F    | #b11111111
    transcoders    | #b01111111 | #x7F    | #b11111111
To identify a fixnum we can do:
    ikptr_t X;

    if (fx_tag == (X & fx_mask))
      it_is_a_fixnum();
    else
      it_is_not();
or just use the macro IK_IS_FIXNUM(); similarly for the other immediate variable values.
Notice that a NULL pointer stored in an ikptr_t with zero bits as tag represents the fixnum zero. Also, the number of zero tag bits in a fixnum is such that a tagged ikptr_t fixnum can be interpreted as the number of bytes needed to hold a number of machine words equal to the number represented by the fixnum itself; that is, the following holds true:
    long number_of_words = ...;
    number_of_words * wordsize == number_of_words << fx_shift;
where fx_shift is the number of bits in the fixnum’s tag.
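The relation between fixnums and byte counts can be sketched with a pair of conversion helpers. The sketch_ names are hypothetical, no range checking is done, and the conversion of a negative word back to intptr_t relies on the usual two's complement behaviour of common compilers.

```c
#include <assert.h>
#include <stdint.h>

typedef uintptr_t sketch_word_t;   /* hypothetical stand-in for ikptr_t */

enum {
  sketch_wordsize = sizeof(sketch_word_t),
  sketch_fx_shift = (8 == sizeof(sketch_word_t)) ? 3 : 2
};

/* Encode: shifting left leaves the tag bits cleared. */
static sketch_word_t
sketch_fixnum_encode (intptr_t n)
{
  return (sketch_word_t)n << sketch_fx_shift;
}

/* Decode: exact division by the word size drops the tag bits and
   preserves the sign. */
static intptr_t
sketch_fixnum_decode (sketch_word_t fx)
{
  assert(0 == (fx % sketch_wordsize));   /* must carry the fixnum tag */
  return (intptr_t)fx / sketch_wordsize;
}
```

On 32-bit and 64-bit platforms the encoded form of N equals N times the word size, which is why a vector length stored as a fixnum doubles as the size in bytes of the vector's data area.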
The values that do not fit into a single machine word are composed of a reference machine word and an array of machine words on the heap; they are: symbols, pairs, vectors, bytevectors, structures, ports, bignums, ratnums, flonums, compnums, cflonums, strings, closures, continuations, codes, pointers.
The machine words used as reference have the 3 least significant bits used as tag and the remaining most significant bits used to store a pointer in memory; on 32-bit platforms the layout of such machine words is:
     PPPPPPPP PPPPPPPP PPPPPPPP PPPPPTTT    P = bit of pointer
    |--------|--------|--------|--------|   T = bit of tag
      byte 3   byte 2   byte 1   byte 0
the following tags are used:
    object      | tag bits | tag hex | mask bits
    ------------+----------+---------+------------
    pairs       | #b001    | #x1     | #b00000111
    bytevectors | #b010    | #x2     | #b00000111
    closure     | #b011    | #x3     | #b00000111
    vectors     | #b101    | #x5     | #b00000111
    strings     | #b110    | #x6     | #b00000111
Notice how none of the tags for reference words is #b111, which is reserved for immediate values; also notice how #b100 must not be used as tag, because on 32-bit platforms, where the fixnum tag is only 2 bits wide, it would match the fixnums whose least significant value bit is set to one.
The vector tag is used to tag machine word references to multiple object types: vectors, bignums, structures, flonums, ratnums, compnums, cflonums, continuations, code, ports, symbols, pointers, system continuations. The first word in the memory block of these types has the least significant bits set to a secondary tag.
All the possible values for 3-bit tags in reference values are already allocated; new object types can be added only by defining a new secondary tag with references tagged as vector.
While the API defines predicates to recognise values, to identify a type–specific reference we can do:
    ikptr_t X;

    if (pair_tag == (X & pair_mask))
      it_is_a_pair();
    else
      it_is_not();
similarly for the other types. The vector tag acts as primary tag; a secondary tag is stored in the least significant bits of the referenced vector of words on the heap; to recognise such values we can do:
    ikptr_t X;

    if ((vector_tag == (X & vector_mask)) &&
        (secondary_tag == (secondary_mask & IK_REF(X, -vector_tag))))
      it_is();
    else
      it_is_not();
where secondary_tag and secondary_mask are type-specific.
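The two-level check can be sketched for one concrete case. The sketch_ names and SKETCH_REF are hypothetical; the pair and vector tags come from the table of reference tags, while #x17 is the flonum secondary tag, compared here against the whole first word because no mask is associated with it.

```c
#include <assert.h>
#include <stdint.h>

typedef uintptr_t sketch_word_t;   /* hypothetical stand-in for ikptr_t */

#define SKETCH_REF(X, N) \
  (*(sketch_word_t *)((intptr_t)(X) + (intptr_t)(N)))

enum {
  sketch_primary_mask = 7,      /* #b111 */
  sketch_pair_tag     = 1,      /* #b001 */
  sketch_vector_tag   = 5,      /* #b101 */
  sketch_flonum_tag   = 0x17    /* secondary tag, whole first word */
};

/* A pair is recognised from the primary tag alone. */
static int
sketch_is_pair (sketch_word_t x)
{
  return sketch_pair_tag == (int)(x & sketch_primary_mask);
}

/* A flonum reference carries the vector tag; the first word of the
   referenced block must be the flonum secondary tag. */
static int
sketch_is_flonum (sketch_word_t x)
{
  if (sketch_vector_tag != (int)(x & sketch_primary_mask))
    return 0;
  assert(x != (sketch_word_t)sketch_vector_tag);   /* never a tagged NULL */
  return (sketch_word_t)sketch_flonum_tag == SKETCH_REF(x, -sketch_vector_tag);
}
```

The primary tag is tested first so that the heap is only read when the word is known to be a vector-tagged reference.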
The secondary tags and the associated masks are:
    object              | tag bits    | tag hex | tag mask
                        |   76543210  |         |   76543210
    --------------------+-------------+---------+-------------
    vector              | #b??????00  | fixnum  | --
    bignum              | #b????s011  | #x03    | #b00000111
    structure           | #b?????101  | #x05    | #b00000111
    flonum              | #b00010111  | #x17    | --
    ratnum              | #b00100111  | #x27    | --
    compnum             | #b00110111  | #x37    | --
    cflonum             | #b01000111  | #x47    | --
    continuation        | #b00011111  | #x1F    | --
    code                | #b00101111  | #x2F    | --
    port                | #b??111111  | #x3F    | #b00111111
    symbol              | #b01011111  | #x5F    | --
    pointer             | #b100000111 | #x107   | --
    system continuation | #b100011111 | #x11F   | --
Notice how the port secondary tag has all the 6 least significant bits set to 1: no other tag must have all such bits set to 1. Secondary tags for new types can be allocated by setting the least significant byte to #x0F and reserving a specific bit pattern in the most significant bytes.
The only tags having an associated mask are the ones of objects storing additional information in the first word of the heap vector:
Vector: the first word of a vector is a fixnum representing the number of elements.
Bignum: the first word uses the 3 least significant bits as tag, the 4th bit representing the sign (0 for positive, 1 for negative) and the remaining bits representing the number of words in the bignum data area.
Structure: the first word is tagged as vector, because the first word of a structure is itself a reference to a structure: the type descriptor.
Port: the most significant bits of the first word are used for port attributes.
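As an example of a first word that stores information next to its secondary tag, the bignum layout described above can be sketched as follows. The sketch_ names are hypothetical; the field positions follow the description: 3 tag bits, then the sign bit, then the number of data words.

```c
#include <assert.h>
#include <stdint.h>

typedef uintptr_t sketch_word_t;   /* hypothetical stand-in for ikptr_t */

enum {
  sketch_bignum_tag        = 3,   /* #b011, secondary tag */
  sketch_bignum_mask       = 7,   /* #b111 */
  sketch_bignum_sign_bit   = 8,   /* 4th bit: 0 positive, 1 negative */
  sketch_bignum_word_shift = 4    /* word count sits above the sign bit */
};

/* Compose a first word from a sign flag and a number of data words. */
static sketch_word_t
sketch_bignum_first_word (int negative, sketch_word_t word_count)
{
  return (word_count << sketch_bignum_word_shift)
    | (sketch_word_t)(negative ? sketch_bignum_sign_bit : 0)
    | (sketch_word_t)sketch_bignum_tag;
}

/* Non-zero when the first word describes a negative bignum. */
static int
sketch_bignum_is_negative (sketch_word_t first_word)
{
  assert(sketch_bignum_tag == (int)(first_word & sketch_bignum_mask));
  return 0 != (first_word & sketch_bignum_sign_bit);
}

/* Number of machine words in the bignum data area. */
static sketch_word_t
sketch_bignum_word_count (sketch_word_t first_word)
{
  assert(sketch_bignum_tag == (int)(first_word & sketch_bignum_mask));
  return first_word >> sketch_bignum_word_shift;
}
```

For a negative bignum with 5 data words the first word is #x5B: #x50 for the count, #x08 for the sign and #x03 for the tag.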