

2.9.3.3 Iterating over character sets

Function: char-set-cursor cs
Function: char-set-ref cs cursor
Function: char-set-cursor-next cs cursor
Function: end-of-char-set? cursor

Cursors are a low-level facility for iterating over the characters in a set; a cursor is a value that indexes a character in a char set.

char-set-cursor returns a new cursor object associated with the character set cs. Multiple cursors may be associated with the same character set.

char-set-ref returns a character object representing the set element currently indexed by a cursor.

char-set-cursor-next increments a cursor index and returns a new cursor indexing the next character in the set; in this way, code can step through every character in a char set.

Stepping a cursor “past the end” of a char set produces a cursor that answers true to end-of-char-set?. It is an error to pass such a cursor to char-set-ref or to char-set-cursor-next.

A cursor value may not be used in conjunction with a different character set; if it is passed to char-set-ref or char-set-cursor-next with a character set other than the one used to create it, the results and effects are undefined.

Cursor values are not necessarily distinct from other types: they may be integers, linked lists, records, procedures or other values.

Note that these primitives are necessary to export an iteration facility for char sets to loop macros.

Example:

(define cs (char-set #\G #\a #\T #\e #\c #\h))

;; Collect elts of CS into a list.
(let lp ((cur (char-set-cursor cs)) (ans '()))
  (if (end-of-char-set? cur) ans
      (lp (char-set-cursor-next cs cur)
          (cons (char-set-ref cs cur) ans))))
  ⇒ (#\G #\T #\a #\c #\e #\h)

;; Equivalently, using a list unfold (from SRFI 1):
(unfold-right end-of-char-set?
              (curry char-set-ref cs)
              (curry char-set-cursor-next cs)
              (char-set-cursor cs))
  ⇒ (#\G #\T #\a #\c #\e #\h)
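
Because a cursor loop is ordinary code, it can also stop before reaching the end of the set. The following is a minimal sketch, assuming an R6RS system that exposes this SRFI as the library (srfi :14) (adjust the import for other implementations); the helper name first-matching is hypothetical and not part of the SRFI.

```scheme
(import (rnrs) (srfi :14))

;; Hypothetical helper: return some character of CS satisfying PRED,
;; or #f if there is none.  The loop stops at the first match.
(define (first-matching pred cs)
  (let lp ((cur (char-set-cursor cs)))
    (if (end-of-char-set? cur)
        #f                              ; ran off the end: no match
        (let ((c (char-set-ref cs cur)))
          (if (pred c)
              c
              (lp (char-set-cursor-next cs cur)))))))

(define cs (char-set #\G #\a #\T #\e #\c #\h))

(first-matching char-upper-case? cs)    ; either #\G or #\T
(first-matching char-numeric? cs)       ; ⇒ #f
```

Which upper-case character is returned depends on the order in which the implementation's cursors visit the set.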
Function: char-set-fold kons knil cs -> object

This is the fundamental iterator for character sets. Apply the function kons across the character set cs using initial state value knil.

If cs is the empty set: the return value is knil.

Otherwise, some element c of cs is chosen; let cs1 be the remaining, unchosen characters. The procedure then returns:

(char-set-fold kons (kons c knil) cs1)

Examples:

;; CHAR-SET-MEMBERS
(lambda (cs) (char-set-fold cons '() cs))

;; CHAR-SET-SIZE
(lambda (cs) (char-set-fold (lambda (c i) (+ i 1)) 0 cs))

;; How many vowels in the char set?
(lambda (cs)
  (char-set-fold (lambda (c i) (if (vowel? c) (+ i 1) i))
                 0 cs))
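
Since the element chosen at each step is arbitrary, a fold whose result depends on visiting order is not portable. The sketch below shows two ways to stay order-independent: sorting the collected members, or folding with a commutative operation. It assumes an R6RS system providing the library (srfi :14); the helper names are hypothetical.

```scheme
(import (rnrs) (srfi :14))

;; Hypothetical helper: the members of CS as a list sorted by char<?.
(define (char-set->sorted-list cs)
  (list-sort char<? (char-set-fold cons '() cs)))

;; Hypothetical helper: the largest code point in CS,
;; or -1 when CS is the empty set.
(define (max-code-point cs)
  (char-set-fold (lambda (c best) (max best (char->integer c)))
                 -1 cs))

(char-set->sorted-list (char-set #\G #\a #\T #\e #\c #\h))
  ; ⇒ (#\G #\T #\a #\c #\e #\h)
(max-code-point (char-set #\a #\z))     ; ⇒ 122
(max-code-point (char-set))             ; ⇒ -1
```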
Function: char-set-unfold p f g seed
Function: char-set-unfold p f g seed base-cs
Function: char-set-unfold! p f g seed base-cs

This is a fundamental constructor for character sets.

g is used to generate a series of “seed” values from the initial seed:

seed
(g seed)
(g (g seed))
(g (g (g seed)))
...

p tells us when to stop: generation ends at the first seed value for which p returns true.

f maps each seed value to a character. These characters are added to the base character set base-cs to form the result; base-cs defaults to the empty set.

char-set-unfold! adds the characters to base-cs in a linear-update fashion: it is allowed, but not required, to side-effect base-cs and reuse its storage to construct the result.

More precisely, the following definitions hold, ignoring the optional-argument issues:

(define (char-set-unfold p f g seed base-cs)
  (char-set-unfold! p f g seed (char-set-copy base-cs)))

(define (char-set-unfold! p f g seed base-cs)
  (let lp ((seed seed)
           (cs   base-cs))
    (if (p seed)
        cs
      (lp (g seed)
          (char-set-adjoin! cs (f seed))))))

Note that an actual implementation may be more efficient.

Examples:

(port->char-set p)
≡ (char-set-unfold eof-object? values
                   (lambda (x) (read-char p))
                   (read-char p))

(list->char-set lis)
≡ (char-set-unfold null? car cdr lis)
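
As a further illustration, here is a sketch that builds a set from a range of code points, passing the arguments in the order used by the definitions above (stop predicate, mapper, successor, seed, base set). It assumes an R6RS system providing the library (srfi :14); the helper name code-range->char-set is hypothetical.

```scheme
(import (rnrs) (srfi :14))

;; Hypothetical helper: the characters whose code points lie in the
;; inclusive range LO..HI, added to a copy of the set BASE.
(define (code-range->char-set lo hi base)
  (char-set-unfold (lambda (i) (> i hi))   ; p: stop once past HI
                   integer->char           ; f: seed -> character
                   (lambda (i) (+ i 1))    ; g: next seed
                   lo                      ; initial seed
                   base))                  ; base-cs

(define lower (code-range->char-set 97 122 (char-set)))     ; a..z
(define lower+digits (code-range->char-set 48 57 lower))    ; plus 0..9

(char-set-size lower)           ; ⇒ 26
(char-set-size lower+digits)    ; ⇒ 36
```

Because the non-destructive char-set-unfold is used, lower is left unchanged by the second call.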
Function: char-set-for-each proc cs

Apply the procedure proc to each character in the character set cs; return unspecified values. The order in which proc is applied to the characters in the set is unspecified, and may even change from one procedure application to another.
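
Since the visiting order is unspecified, proc is best used for effects that do not depend on order. A minimal sketch, assuming an R6RS system providing the library (srfi :14); the helper name count-matching is hypothetical.

```scheme
(import (rnrs) (srfi :14))

;; Hypothetical helper: count the members of CS satisfying PRED by
;; updating a local counter from inside the visiting procedure.
(define (count-matching pred cs)
  (let ((n 0))
    (char-set-for-each (lambda (c)
                         (when (pred c)
                           (set! n (+ n 1))))
                       cs)
    n))

(count-matching char-upper-case?
                (char-set #\G #\a #\T #\e #\c #\h))   ; ⇒ 2
```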

Function: char-set-map proc cs

proc is a character-to-character procedure. Apply it to every character in cs, collect the results into a new character set, and return that set.

Example:

(char-set-map char-downcase cs)
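
Because the result is a set, characters that proc maps to the same value collapse into one member, so the result may be smaller than cs. A small sketch, assuming an R6RS system providing the library (srfi :14):

```scheme
(import (rnrs) (srfi :14))

(define cs (char-set #\a #\b #\B))

;; #\b and #\B both map to #\B, so the result has only two members.
(define up (char-set-map char-upcase cs))

(char-set-size cs)              ; ⇒ 3
(char-set-size up)              ; ⇒ 2
(char-set-contains? up #\A)     ; ⇒ #t
```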
