Vicare Scheme: stdlib unicode characters

5.1.1 Characters

Procedure: char-upcase char

Procedure: char-downcase char

Procedure: char-titlecase char

Procedure: char-foldcase char

These procedures take a character argument and return a character result.

If the argument is an upper–case or title–case character, and if there is a single character that is its lower–case form, then char-downcase returns that character.

If the argument is a lower–case or title–case character, and there is a single character that is its upper–case form, then char-upcase returns that character.

If the argument is a lower–case or upper–case character, and there is a single character that is its title–case form, then char-titlecase returns that character.

If the argument is not a title–case character and there is no single character that is its title–case form, then char-titlecase returns the upper–case form of the argument.

Finally, if the character has a case–folded character, then char-foldcase returns that character. Otherwise the character returned is the same as the argument.

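For the Turkic characters #\x130 (capital I with dot above) and #\x131 (dotless small i), char-foldcase behaves as the identity function; for every other character, char-foldcase is the same as char-downcase composed with char-upcase, that is, the character is first upcased and the result is then downcased.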

(char-upcase #\i)               ⇒ #\I
(char-downcase #\i)             ⇒ #\i
(char-titlecase #\i)            ⇒ #\I
(char-foldcase #\i)             ⇒ #\i
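
The relationship between char-foldcase and the other mappings can be checked directly. The following is a minimal sketch, assuming an R6RS top-level program that imports (rnrs); the names fold-by-hand, samples, and agrees? are hypothetical and introduced only for illustration.

```scheme
(import (rnrs))

;; Hypothetical helper: case folding spelled out by hand.
;; The Turkic characters #\x130 and #\x131 are left unchanged;
;; every other character is upcased and then downcased.
(define (fold-by-hand ch)
  (if (memv ch '(#\x130 #\x131))
      ch
      (char-downcase (char-upcase ch))))

;; A few sample characters, including final sigma (#\x3C2)
;; and the two Turkic exceptions.
(define samples
  (list #\i #\I #\a #\Z #\1 #\x3C2 #\x130 #\x131))

;; True when the hand-written folding agrees with char-foldcase
;; on every sample.
(define agrees?
  (for-all (lambda (ch)
             (char=? (fold-by-hand ch) (char-foldcase ch)))
           samples))

(display agrees?)
(newline)
```

If the implementation follows the description above, agrees? should be true for these samples; the check is illustrative and not exhaustive.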

NOTE char-titlecase does not always return a title–case character.

NOTE These procedures are consistent with Unicode’s locale–independent mappings from scalar values to scalar values for upcase, downcase, titlecase, and case–folding operations. These mappings can be extracted from UnicodeData.txt and CaseFolding.txt from the Unicode Consortium, ignoring Turkic mappings in the latter.

Note that these character–based procedures are an incomplete approximation to case conversion, even ignoring the user’s locale. In general, case mappings require the context of a string, both in arguments and in result. The string-upcase, string-downcase, string-titlecase, and string-foldcase procedures (see stdlib unicode strings) perform more general case conversion.
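
As an illustration of why string context matters, consider the German sharp s (#\xDF), which has no single-character upper-case form. The sketch below assumes an R6RS program importing (rnrs); the names sharp-s and street are hypothetical. The R6RS report gives "STRASSE" as the result of upcasing this word with string-upcase, while char-upcase leaves the character unchanged.

```scheme
(import (rnrs))

;; U+00DF LATIN SMALL LETTER SHARP S.
(define sharp-s #\xDF)

;; The word "Strasse" spelled with sharp s, built character by character
;; to keep this source file ASCII-only.
(define street (string #\S #\t #\r #\a sharp-s #\e))

;; Character level: no single upper-case character exists,
;; so char-upcase returns its argument.
(display (char=? (char-upcase sharp-s) sharp-s))
(newline)

;; String level: the result may be longer than the input.
(display (string-length street))
(newline)
(display (string-length (string-upcase street)))
(newline)
```

Mapping char-upcase over the characters of a string therefore cannot be relied on to give the same result as string-upcase.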

Procedure: char-ci=? char1 char2 char3 …

Procedure: char-ci<? char1 char2 char3 …

Procedure: char-ci>? char1 char2 char3 …

Procedure: char-ci<=? char1 char2 char3 …

Procedure: char-ci>=? char1 char2 char3 …

These procedures are similar to char=?, char<?, and so on, but they compare the case–folded versions of their arguments, as computed by char-foldcase.

(char-ci<? #\z #\Z)             ⇒ #f
(char-ci=? #\z #\Z)             ⇒ #t
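
The case-insensitive comparisons can be understood in terms of char-foldcase. The following is a minimal sketch for the two-argument case only, assuming an R6RS program importing (rnrs); my-char-ci=? and my-char-ci<? are hypothetical names, not part of the library.

```scheme
(import (rnrs))

;; Hypothetical two-argument equivalents, for illustration only:
;; fold both characters, then compare with the case-sensitive predicate.
(define (my-char-ci=? a b)
  (char=? (char-foldcase a) (char-foldcase b)))

(define (my-char-ci<? a b)
  (char<? (char-foldcase a) (char-foldcase b)))

;; Each line compares the sketch against the built-in procedure.
(display (eq? (my-char-ci=? #\z #\Z) (char-ci=? #\z #\Z)))
(newline)
(display (eq? (my-char-ci<? #\a #\B) (char-ci<? #\a #\B)))
(newline)
```

The built-in procedures additionally accept more than two arguments, which this sketch does not attempt to reproduce.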

Procedure: char-alphabetic? char

Procedure: char-numeric? char

Procedure: char-whitespace? char

Procedure: char-upper-case? char

Procedure: char-lower-case? char

Procedure: char-title-case? char

These procedures return #t if their arguments are alphabetic, numeric, whitespace, upper–case, lower–case, or title–case characters, respectively; otherwise they return #f.

A character is alphabetic if it has the Unicode “Alphabetic” property. A character is numeric if it has the Unicode “Numeric” property. A character is whitespace if it has the Unicode “White_Space” property. A character is upper case if it has the Unicode “Uppercase” property, lower case if it has the “Lowercase” property, and title case if it is in the Lt general category.

(char-alphabetic? #\a)          ⇒ #t
(char-numeric? #\1)             ⇒ #t
(char-whitespace? #\space)      ⇒ #t
(char-whitespace? #\x00A0)      ⇒ #t
(char-lower-case? #\x00AA)      ⇒ #t
(char-title-case? #\I)          ⇒ #f
(char-title-case? #\x01C5)      ⇒ #t
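
The predicates can be combined into a simple classifier. The following is a sketch, assuming an R6RS program importing (rnrs); char-kind, kinds, and the symbols it returns are hypothetical choices made for illustration. Title case is tested first so that characters in the Lt category are reported as such.

```scheme
(import (rnrs))

;; Hypothetical classifier built from the predicates described above.
;; The order of the clauses decides which label wins when more than
;; one predicate holds for a character.
(define (char-kind ch)
  (cond ((char-title-case? ch) 'title)
        ((char-upper-case? ch) 'upper)
        ((char-lower-case? ch) 'lower)
        ((char-alphabetic? ch) 'other-alphabetic)
        ((char-numeric? ch)    'numeric)
        ((char-whitespace? ch) 'whitespace)
        (else                  'other)))

;; Classify the characters used in the examples above, plus #\!.
(define kinds
  (map char-kind
       (list #\a #\I #\x01C5 #\1 #\space #\x00A0 #\x00AA #\!)))

(write kinds)
(newline)
```

With the property definitions given above, #\x00AA is reported as lower and #\x00A0 as whitespace, even though neither is an ASCII character.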

Procedure: char-general-category char

Return a symbol representing the Unicode general category of char, one of Lu, Ll, Lt, Lm, Lo, Mn, Mc, Me, Nd, Nl, No, Ps, Pe, Pi, Pf, Pd, Pc, Po, Sc, Sm, Sk, So, Zs, Zp, Zl, Cc, Cf, Cs, Co, or Cn.

(char-general-category #\a)             ⇒ Ll
(char-general-category #\space)         ⇒ Zs
(char-general-category #\x10FFFF)       ⇒ Cn
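
Because the returned symbols follow the two-letter Unicode naming scheme, the first letter identifies the major class (L for letters, N for numbers, Z for separators, and so on). The following is a minimal sketch, assuming an R6RS program importing (rnrs); char-major-class and letter? are hypothetical helpers introduced for illustration.

```scheme
(import (rnrs))

;; Hypothetical helper: the major class is the first letter of the
;; general category name, returned as a character.
(define (char-major-class ch)
  (string-ref (symbol->string (char-general-category ch)) 0))

;; Hypothetical predicate: true for any character in a letter
;; category (Lu, Ll, Lt, Lm, or Lo).
(define (letter? ch)
  (char=? (char-major-class ch) #\L))

(display (char-major-class #\a))      ; category Ll
(newline)
(display (char-major-class #\space))  ; category Zs
(newline)
(display (letter? #\x10FFFF))         ; category Cn, so not a letter
(newline)
```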