Next: baselib strings, Previous: baselib symbols, Up: baselib [Index]
Characters are objects that represent Unicode scalar values.
Unicode defines a standard mapping between sequences of Unicode scalar values (integers in the range 0 to #x10FFFF, excluding the range #xD800 to #xDFFF) in the latest version of the standard and human-readable “characters”. More precisely, Unicode distinguishes between glyphs, which are printed for humans to read, and characters, which are abstract entities that map to glyphs (sometimes in a way that’s sensitive to surrounding characters). Furthermore, different sequences of scalar values sometimes correspond to the same character. The relationships among scalar values, characters, and glyphs are subtle and complex.
Despite this complexity, most things that a literate human would call a “character” can be represented by a single Unicode scalar value (although several sequences of Unicode scalar values may represent that same character). For example, Roman letters, Cyrillic letters, Hebrew consonants, and most Chinese characters fall into this category.
Unicode scalar values exclude the range #xD800 to #xDFFF, which is part of the range of Unicode code points. However, the Unicode code points in this range, the so-called surrogates, are an artifact of the UTF-16 encoding; they can appear only in specific Unicode encodings, and even then only in pairs that encode scalar values. Consequently, all characters represent code points, but the surrogate code points do not have representations as characters.
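The range rule above can be written down as a small predicate. This is only a sketch: the name unicode-scalar-value? is hypothetical and is not a binding provided by the library described here; it assumes an R6RS implementation that provides (rnrs).

```scheme
(import (rnrs))

;; Hypothetical helper (not part of the documented library):
;; true when OBJ is an exact integer in [0, #xD7FF] or [#xE000, #x10FFFF],
;; that is, a code point that is not a surrogate.
(define (unicode-scalar-value? obj)
  (and (integer? obj)
       (exact? obj)
       (or (<= 0 obj #xD7FF)
           (<= #xE000 obj #x10FFFF))))
```

With this definition, (unicode-scalar-value? #x41) and (unicode-scalar-value? #x10FFFF) are true, while (unicode-scalar-value? #xD800) and (unicode-scalar-value? #x110000) are false.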
char? returns #t if obj is a character, #f otherwise.
sv must be a Unicode scalar value, i.e., a non-negative exact integer object in [0, #xD7FF] union [#xE000, #x10FFFF].
Given a character, char->integer returns its Unicode scalar value as an exact integer object. For a Unicode scalar value sv, integer->char returns its associated character.
(integer->char 32) ⇒ #\space
(char->integer (integer->char 5000)) ⇒ 5000
(integer->char #xD800) ⇒ exception &assertion
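As an illustration of the conversions and of the failure case, the following sketch wraps integer->char so that an invalid argument yields #f instead of an exception. The name integer->char/safe is hypothetical, and the sketch assumes that the implementation signals a condition of type &assertion for invalid arguments, as stated above; an implementation that reports a different condition type would need a different guard clause.

```scheme
(import (rnrs))

;; Hypothetical helper (not part of the documented library):
;; return the character whose scalar value is N, or #f when
;; integer->char rejects N with an &assertion condition.
(define (integer->char/safe n)
  (guard (e ((assertion-violation? e) #f))
    (integer->char n)))

;; For any valid scalar value the two conversions are inverses.
(define (round-trips? n)
  (= n (char->integer (integer->char n))))
```

Here (integer->char/safe 32) evaluates to #\space and (round-trips? 5000) to #t; under the stated assumption, (integer->char/safe #xD800) evaluates to #f.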
The procedures char=?, char<?, char>?, char<=?, and char>=? impose a total ordering on the set of characters according to their Unicode scalar values.
(char<? #\z #\Z) ⇒ #f
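Because the ordering follows scalar values, comparing two characters gives the same answer as comparing the results of char->integer. A brief sketch, in which char-before? is a hypothetical name introduced only for this comparison and list-sort comes from the (rnrs sorting) library exported by (rnrs):

```scheme
(import (rnrs))

;; Hypothetical helper: compare two characters through their scalar values.
;; It is meant to agree with char<? on every pair of characters.
(define (char-before? a b)
  (< (char->integer a) (char->integer b)))

;; Sorting with char<? orders characters by scalar value, so every
;; uppercase Roman letter comes before every lowercase one:
;; #\Z is 90, #\a is 97, #\z is 122.
(define sorted-chars
  (list-sort char<? (list #\z #\a #\Z)))
```

After evaluation, sorted-chars is the list (#\Z #\a #\z), and (char-before? #\z #\Z) is #f, matching the char<? example above.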