UTF-32, also called UCS-4, is a multioctet character encoding for Unicode which can represent every character in the Unicode set, that is, every code point in the ranges ‘[0, #xD800)’ and ‘(#xDFFF, #x10FFFF]’. It uses exactly 32 bits per Unicode code point.
This makes UTF-32 a fixed-length encoding, in contrast to the other Unicode Transformation Formats, which are variable-length encodings. The UTF-32 form of a character is a direct representation of its code point.
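The fixed-length property can be illustrated with a minimal sketch. It is written in Python purely for illustration and is not part of the library; the helper name `utf32_be_bytes` is hypothetical, and big-endian octet order is an assumption of the sketch, since the text above does not fix a byte order.

```python
import struct

def utf32_be_bytes(text: str) -> bytes:
    """Pack every code point of TEXT as one big-endian 32-bit word.

    Hypothetical helper for illustration; big-endian order is assumed.
    """
    return b"".join(struct.pack(">I", ord(ch)) for ch in text)

# Four characters whose code points need 1, 2, 3 and 4 octets in UTF-8:
# U+0041, U+00E9, U+20AC, U+1F600.
sample = "A\u00e9\u20ac\U0001F600"

encoded = utf32_be_bytes(sample)

# Fixed length: always 4 octets per code point.
assert len(encoded) == 4 * len(sample)

# Direct representation: each 32-bit word is the code point itself.
words = struct.unpack(">%dI" % len(sample), encoded)
assert list(words) == [ord(ch) for ch in sample]

# For comparison, octets per character in the variable-length formats.
utf8_lengths = [len(ch.encode("utf-8")) for ch in sample]
utf16_lengths = [len(ch.encode("utf-16-be")) for ch in sample]
```

Here every character occupies four octets in UTF-32, while the same characters occupy between one and four octets in UTF-8 and either two or four octets in UTF-16.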
The following syntactic bindings are exported by the library (vicare unsafe unicode). The following macros assume the word arguments are fixnums representing 32-bit words: they must be in the range ‘[0, #xFFFFFFFF]’; while the code-point arguments are fixnums representing Unicode code points (they are in the range ‘[0, #x10FFFF]’, but outside the range ‘[#xD800, #xDFFF]’).
Evaluate to #t if code-point is a Unicode code point representable in UTF-32 encoding; otherwise evaluate to #f.
Encode a Unicode code point as a 32-bit UTF-32 word.
Evaluate to #t if word is a valid 32-bit UTF-32 encoding of a Unicode character; otherwise evaluate to #f.
Decode a valid UTF-32 word into the corresponding Unicode code point.
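The range checks and conversions described above can be sketched as follows. This is a Python illustration of the arithmetic only, not the library's macros: all function and constant names are hypothetical. Unlike the unsafe macros, which assume their arguments are already in range, the conversion functions in this sketch validate their argument and raise an error, which is a design choice of the sketch.

```python
# Hypothetical names; the bounds are the ones stated in the text above.
SURROGATE_FIRST = 0xD800
SURROGATE_LAST = 0xDFFF
MAX_CODE_POINT = 0x10FFFF
MAX_WORD = 0xFFFFFFFF

def is_utf32_code_point(code_point: int) -> bool:
    """True if CODE_POINT is in [0, #xD800) or (#xDFFF, #x10FFFF]."""
    return (0 <= code_point < SURROGATE_FIRST
            or SURROGATE_LAST < code_point <= MAX_CODE_POINT)

def is_utf32_word(word: int) -> bool:
    """True if WORD is a 32-bit word that encodes a Unicode character."""
    # A valid word is a 32-bit value whose numeric value is itself
    # a representable code point.
    return 0 <= word <= MAX_WORD and is_utf32_code_point(word)

def utf32_encode(code_point: int) -> int:
    """Return the UTF-32 word for CODE_POINT (the code point itself)."""
    if not is_utf32_code_point(code_point):
        raise ValueError("not a UTF-32 code point: %#x" % code_point)
    return code_point

def utf32_decode(word: int) -> int:
    """Return the code point encoded by the UTF-32 word WORD."""
    if not is_utf32_word(word):
        raise ValueError("not a valid UTF-32 word: %#x" % word)
    return word

# Round trip for U+1F600.
assert utf32_decode(utf32_encode(0x1F600)) == 0x1F600
```

Because the UTF-32 form of a character is its code point, encoding and decoding are identity operations on valid input; the only real work is the range check that excludes the surrogate range and values above ‘#x10FFFF’.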