

3.4.1 Introduction

The syntax of Scheme code is organized in three levels:

  1. The lexical syntax that describes how a program text is split into a sequence of lexemes.
  2. The datum syntax, formulated in terms of the lexical syntax, that structures the lexeme sequence as a sequence of syntactic data, where a syntactic datum is a recursively structured entity.
  3. The program syntax, formulated in terms of the datum syntax, that imposes further structure on syntactic data and assigns meaning to them.
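
As a rough illustration of how the datum level and the program level differ, the sketch below reads the text (if) as a syntactic datum, which succeeds, and then hands the resulting datum value to eval, which is expected to reject it as a program. It assumes an R6RS implementation that provides (rnrs eval (6)); the names source, datum and outcome are made up for this example.

```scheme
(import (rnrs (6))
        (rnrs eval (6)))

;; Levels 1 and 2: the reader splits the text into lexemes and
;; builds a syntactic datum, here a one-element list.
(define source "(if)")
(define datum (get-datum (open-string-input-port source)))

;; Level 3: as a program, (if) is malformed, so eval is expected
;; to raise an exception (a &syntax condition in R6RS terms).
(define outcome
  (guard (e ((syntax-violation? e) 'syntax-violation)
            (else 'other-error))
    (eval datum (environment '(rnrs (6))))))
```

The text is well formed at the lexical and datum levels but not at the program level, which is why the failure shows up only when the datum is evaluated.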

Syntactic data (also called external representations) double as a notation for objects, and Scheme’s (rnrs io ports (6)) library provides the get-datum and put-datum procedures for reading and writing syntactic data, converting between their textual representation and the corresponding objects (see Port input/output). Each syntactic datum represents a corresponding datum value. A syntactic datum can be used in a program to obtain the corresponding datum value using quote (see Quotation).
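
A minimal sketch of the two directions of conversion, and of quote, might look as follows. The variable names are invented for this example, and the exact text produced by put-datum (spacing, for instance) may differ between implementations, so the sketch only relies on the text being readable again.

```scheme
(import (rnrs (6)))

;; get-datum: text -> datum value.
(define in (open-string-input-port "(8 13)"))
(define value (get-datum in))

;; put-datum: datum value -> text, collected in a string.
(define text
  (call-with-string-output-port
    (lambda (out) (put-datum out value))))

;; quote: the same datum value, obtained from inside a program.
(define quoted '(8 13))
```

Reading text back with get-datum should yield an object that is equal? to both value and quoted.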

Scheme source code consists of syntactic data and (non-significant) comments. Syntactic data in Scheme source code are called forms. (A form nested inside another form is called a subform.) Consequently, Scheme’s syntax has the property that any sequence of characters that is a form is also a syntactic datum representing some object. This can lead to confusion, since it may not be obvious out of context whether a given sequence of characters is intended to be a representation of objects or the text of a program. It is also a source of power, since it facilitates writing programs such as interpreters or compilers that treat programs as objects (or vice versa).
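
To make the "programs as objects" point concrete, here is a small sketch that holds a form as a list, rewrites it with ordinary list operations, and then evaluates both versions. It assumes the (rnrs eval (6)) library is available; form, rewritten, env, sum and product are names chosen for this example only.

```scheme
(import (rnrs (6))
        (rnrs eval (6)))

;; A form, held as an ordinary list thanks to quote.
(define form '(+ 8 13))

;; Treat the program as an object: replace the operator.
(define rewritten (cons '* (cdr form)))

;; Treat the objects as programs again.
(define env (environment '(rnrs (6))))
(define sum (eval form env))          ; evaluates (+ 8 13)
(define product (eval rewritten env)) ; evaluates (* 8 13)
```

The same list (+ 8 13) is plain data when taken apart with cdr and a program when passed to eval, which is the ambiguity, and the power, described above.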

A datum value may have several different external representations. For example, both #e28.000 and #x1c are syntactic data representing the exact integer object 28, and the syntactic data (8 13), ( 08 13 ), and (8 . (13 . ())) all represent a list containing the exact integer objects 8 and 13. Syntactic data that represent equal objects (in the sense of equal?; see baselib predicates) are always equivalent as forms of a program.
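
The equivalences in this paragraph can be checked with a short sketch. The helper read-from-string is hypothetical (it is not part of R6RS) and simply reads the first syntactic datum found in a string.

```scheme
(import (rnrs (6)))

;; Hypothetical helper: read the first syntactic datum in a string.
(define (read-from-string s)
  (get-datum (open-string-input-port s)))

;; Two spellings of the exact integer 28.
(define same-number?
  (= 28 (read-from-string "#e28.000") (read-from-string "#x1c")))

;; Three spellings of the list containing 8 and 13.
(define same-list?
  (and (equal? (read-from-string "(8 13)")
               (read-from-string "( 08 13 )"))
       (equal? (read-from-string "(8 13)")
               (read-from-string "(8 . (13 . ()))"))))
```

Both same-number? and same-list? should be true on a conforming implementation, since the differences among the spellings exist only in the text.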

Because of the close correspondence between syntactic data and datum values, this report sometimes uses the term datum for either a syntactic datum or a datum value when the exact meaning is apparent from the context.

An implementation must not extend the lexical or datum syntax in any way, with one exception: it need not treat the syntax #!<identifier>, for any <identifier> (see scheme lex syntax identifiers) that is not r6rs, as a syntax violation, and it may use specific #!-prefixed identifiers as flags indicating that subsequent input contains extensions to the standard lexical or datum syntax. The syntax #!r6rs may be used to signify that the input afterward is written with the lexical syntax and datum syntax described by this report. #!r6rs is otherwise treated as a comment (see scheme lex syntax whitespace and comments).
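
The following sketch shows #!r6rs used both as a flag at the top of a source file and inside text handed to the reader, where it should behave like a comment. The names after-flag and inside-list are made up for this example; how an implementation reacts to other #!-prefixed identifiers is implementation specific and not shown.

```scheme
#!r6rs
(import (rnrs (6)))

;; The flag is skipped like any other comment, so the reader
;; returns only the list that follows it.
(define after-flag
  (get-datum (open-string-input-port "#!r6rs (8 13)")))

;; The same holds when the flag sits between the elements of a list.
(define inside-list
  (get-datum (open-string-input-port "(8 #!r6rs 13)")))
```

In both cases the datum value read is expected to be the two-element list of 8 and 13.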

