Unicode assigns a single number to each code element defined by the
Standard. Each of these numbers is called a code point and, when
referred to in text, is listed in hexadecimal form following the prefix
U+. For example, the code point U+0041 is the hexadecimal number 0041
(equal to the decimal number 65); it represents the character A in the
Unicode Standard.
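
The notation described above can be sketched in a few lines. The example uses Python purely for illustration, and the helper names format_code_point and parse_code_point are hypothetical, not part of any library discussed here. It assumes the common convention of padding the hexadecimal number to at least four digits.

```python
def format_code_point(ch: str) -> str:
    """Return the U+XXXX notation for a single character."""
    # ord() yields the code point as an integer; pad to at least four hex digits.
    return f"U+{ord(ch):04X}"


def parse_code_point(text: str) -> str:
    """Return the character denoted by a string such as 'U+0041'."""
    if not text.upper().startswith("U+"):
        raise ValueError(f"not a code point notation: {text!r}")
    # The digits after the prefix are a hexadecimal number.
    return chr(int(text[2:], 16))


print(format_code_point("A"))      # U+0041
print(int("0041", 16))             # 65
print(parse_code_point("U+0041"))  # A
```

Code points above U+FFFF simply use more digits under this scheme, for example five or six.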
Each character is also assigned a unique name that specifies it and no
other. For example, U+0041 is assigned the character name
LATIN CAPITAL LETTER A. U+0A1B is assigned the character name
GURMUKHI LETTER CHA. These Unicode names are identical to the
ISO/IEC 10646 names for the same characters.
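
As a rough illustration of the mapping between code points and names, the following sketch queries the character database bundled with Python's standard unicodedata module. Python is used here only as a convenient way to look the names up, and the describe helper is a hypothetical name introduced for this example.

```python
import unicodedata


def describe(ch: str) -> str:
    """Return the code point notation and the character name of ch."""
    # unicodedata.name() raises ValueError for characters without a name,
    # such as control characters.
    return f"U+{ord(ch):04X} {unicodedata.name(ch)}"


print(describe("A"))
print(describe("\u0A1B"))

# The mapping also works in the other direction, from name to character.
print(unicodedata.lookup("LATIN CAPITAL LETTER A"))
```

Because each name identifies exactly one character, looking a name up and then asking for the name of the result returns the original name.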
For a general overview of the Unicode Standard, see (6):
For the complete reference of Unicode code points, consult the Unicode Character Database (7):
which is partly documented, at an introductory level, by (8):
The same directory on the unicode.org site offers other documents on the interpretation of the database.
For an explanation of ASCII coding, see (9):