This page is part of the web mail archives of SRFI 13 from before July 7th, 2015. The new archives for SRFI 13 contain all messages, not just those from before July 7th, 2015.
I have finally completed the HTML markup of SRFIs 13 & 14. The relevant URLs
are at
    ftp://ftp.ai.mit.edu/people/shivers/srfi/13/
    ftp://ftp.ai.mit.edu/people/shivers/srfi/13/string-lib.txt
    ftp://ftp.ai.mit.edu/people/shivers/srfi/13/string-lib.html
    ftp://ftp.ai.mit.edu/people/shivers/srfi/13/string-lib.scm
    ftp://ftp.ai.mit.edu/people/shivers/srfi/13/string-package.scm

    ftp://ftp.ai.mit.edu/people/shivers/srfi/14/
    ftp://ftp.ai.mit.edu/people/shivers/srfi/14/cset-lib.txt
    ftp://ftp.ai.mit.edu/people/shivers/srfi/14/cset-lib.html
    ftp://ftp.ai.mit.edu/people/shivers/srfi/14/cset-lib.scm
    ftp://ftp.ai.mit.edu/people/shivers/srfi/14/cset-package.scm
    ftp://ftp.ai.mit.edu/people/shivers/srfi/14/cset-tests.scm

May I add that the HTML typesetting was an incredibly tedious and painful job.

They will be moved over to the SRFI sites at
    http://srfi.schemers.org/srfi-13/
    http://srfi.schemers.org/srfi-14/
imminently; Mike Sperber will announce when that has happened.

Like SRFI-1, the HTML for these two SRFIs has internal links from a
procedure/variable index that appears at the top of the document to every
definition inside the document, so they should work well as tools for quickly
looking up a procedure. (Unfortunately, there's a bug in Netscape that renders
the internal links found in the first line of text in a section or paragraph
inoperative. Bizarre. But typical of Netscape. This shuts down some of the
links. Perhaps it will be fixed in Netscape 6.)

I've checked the layout of the HTML on both Netscape & Internet Explorer. (It
looks better on IE, which does a better job of implementing style sheets.
Netscape is, again, pretty broken.) I've also run them through the W3's HTML
validator to check them against the HTML 4 spec.

I've folded in updates to the reference implementation due to Brad Lucier's
review.

Even at this late date, I have made two alterations to the content of the
SRFI: an addition and a small change.

The addition:
I've added one more facility to the character-set library (SRFI 14). I
realised that there is no primitive facility allowing other iteration systems
such as loop macros to loop over the elements of a character set. So I've
added a simple cursor interface to provide this. Here is the documentation:

------
char-set-cursor cset -> cursor
char-set-index cset cursor -> char
char-set-cursor-next cset cursor -> cursor
end-of-char-set? cursor -> boolean

    Cursors are a low-level facility for iterating over the characters in a
    set. A cursor is a value that indexes a character in a char set.
    CHAR-SET-CURSOR produces a new cursor for a given char set. The set
    element indexed by the cursor is fetched with CHAR-SET-INDEX. A cursor
    index is incremented with CHAR-SET-CURSOR-NEXT; in this way, code can
    step through every character in a char set. Stepping a cursor "past the
    end" of a char set produces a cursor that answers true to
    END-OF-CHAR-SET?. It is an error to pass such a cursor to CHAR-SET-INDEX
    or to CHAR-SET-CURSOR-NEXT.

    A cursor value may not be used in conjunction with a different character
    set; if it is passed to CHAR-SET-INDEX or CHAR-SET-CURSOR-NEXT with a
    character set other than the one used to create it, the results and
    effects are undefined.

    Cursor values are *not* necessarily distinct from other types. They may
    be integers, linked lists, records, procedures or other values. This
    license is granted to allow cursors to be very "lightweight" values
    suitable for tight iteration, even in fairly simple implementations.

    Note that these primitives are necessary to export an iteration facility
    for char sets to loop macros.

    Example:
        (define cs (char-set #\G #\a #\T #\e #\c #\h))

        ;; Collect elts of CS into a list.
        (let lp ((cur (char-set-cursor cs)) (ans '()))
          (if (end-of-char-set? cur) ans
              (lp (char-set-cursor-next cs cur)
                  (cons (char-set-index cs cur) ans))))
          => (#\G #\T #\a #\c #\e #\h)

        ;; Equivalently, using a list unfold (from SRFI 1):
        (unfold-right end-of-char-set?
                      (curry char-set-index cs)
                      (curry char-set-cursor-next cs)
                      (char-set-cursor cs))
          => (#\G #\T #\a #\c #\e #\h)

    Rationale: Note that the cursor API's four functions "fit" the functional
    protocol used by the unfolders provided by the list, string and char-set
    SRFIs (see the example above). By way of contrast, here is a simpler,
    two-function API that was rejected for failing this criterion. Besides
    CHAR-SET-CURSOR, it provided a single function that mapped a cursor and a
    character set to two values, the indexed character and the next cursor.
    If the cursor had exhausted the character set, then this function
    returned false instead of the character value, and another
    end-of-char-set cursor. In this way, the other three functions of the
    current API were combined together.
------

And here is the reference implementation; it's only about 14 lines of code.

------
;;; Cursors
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;;; Simple implementation. A cursor is an integer index into the
;;; mark vector, and -1 for the end-of-char-set cursor.

(define (char-set-cursor cset)
  (%char-set-cursor-next cset 256 char-set-cursor))

(define (end-of-char-set? cursor) (< cursor 0))

(define (char-set-index cset cursor) (%latin1->char cursor))

(define (char-set-cursor-next cset cursor)
  (check-arg (lambda (i) (and (integer? i) (exact? i) (<= 0 i 255))) cursor
             char-set-cursor-next)
  (%char-set-cursor-next cset cursor char-set-cursor-next))

(define (%char-set-cursor-next cset cursor proc)   ; Internal
  (let ((s (%char-set:s/check cset proc)))
    (let lp ((cur cursor))
      (let ((cur (- cur 1)))
        (if (or (< cur 0) (si=1? s cur))
            cur
            (lp cur))))))
------

I'm sorry for the late addition, but it is basic and important. Loop macros
are going to happen.
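
[Editorial illustration, not part of the original message.] To make the loop-macro point concrete, here is a minimal sketch of how a macro might be layered on the four cursor operations. Everything named with a `toy-` prefix, and the macro `do-toy-char-set` itself, is hypothetical and exists only for this example. The toy model assumes a char set is a duplicate-free list and a cursor is the unvisited tail of that list, which is one of the "lightweight" cursor representations the documentation above permits. It is not the reference implementation.

```scheme
;; Toy model (assumption for illustration only): a "char set" is a
;; duplicate-free list of characters; a cursor is the unvisited tail.
(define (toy-char-set . chars)
  (let lp ((in chars) (out '()))
    (cond ((null? in) (reverse out))
          ((memv (car in) out) (lp (cdr in) out))   ; drop duplicates
          (else (lp (cdr in) (cons (car in) out))))))

;; The four cursor operations, mirroring the signatures in the message.
(define (toy-char-set-cursor cset) cset)
(define (toy-end-of-char-set? cursor) (null? cursor))
(define (toy-char-set-index cset cursor) (car cursor))
(define (toy-char-set-cursor-next cset cursor) (cdr cursor))

;; Hypothetical loop macro that uses nothing but those four operations.
(define-syntax do-toy-char-set
  (syntax-rules ()
    ((_ (var cset-expr) body0 body ...)
     (let ((cs cset-expr))
       (let lp ((cur (toy-char-set-cursor cs)))
         (if (not (toy-end-of-char-set? cur))
             (let ((var (toy-char-set-index cs cur)))
               body0 body ...
               (lp (toy-char-set-cursor-next cs cur)))))))))

;; Usage: count the distinct characters in a set.
(define seen 0)
(do-toy-char-set (c (toy-char-set #\a #\b #\a))
  (set! seen (+ seen 1)))
;; SEEN is now 2, since the duplicate #\a was dropped.
```

The macro never looks inside the set representation, so swapping the list-tail cursor for an integer index would leave the macro itself unchanged.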
Some facility must be exported from all container and sequence types
to allow iteration over them.

The small change:
I originally spec'd the char-set and string hash functions as taking an
optional BOUND value, which is either a non-negative integer or false (meaning
the default). This was a small error -- I have just realised that BOUND's
valid numeric range isn't *non-negative* but *positive*. That is, zero is not
a valid bound. So I have co-opted zero to serve the purpose formerly filled by
the false value -- it means the default, implementation-specific "large"
modulus. Not only does this simplify the type of the hash functions, since
this parameter is now simply an exact integer, but using 0 as the "effectively
infinite" modulus makes a certain minor sort of mathematical sense -- it is
still true that
    modulus * quotient + remainder = dividend
(that is, remainder is defined when modulus is zero, even though the quotient
is not).

So I have changed the text of the spec and the implementation's arg checking
for hash functions simply to require BOUND to be a non-negative value. No more
#f. Now it is the case that
    (string-hash s 0) = (string-hash s)

It's important to clean up even a small detail like this *now*, because hash
functions are going to occur in many future libraries. We want a convention
that will hold across all of these future SRFIs.

The SRFIs are typeset and ready to go. I'm setting a 48-hour bound on
discussion of both the hash bound value and the character-set iteration cursor
facility. Then this thing is done.
    -Olin
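
[Editorial illustration, not part of the original message.] The BOUND convention described above can be sketched as follows. The procedure `sketch-string-hash`, the constant `default-hash-bound`, and the multiply-by-31 mixing step are all assumptions made for this example. They are not the SRFI 13 reference hash. The only point shown is that a bound of 0 selects the default modulus, and that any other bound must be a non-negative exact integer.

```scheme
;; Hypothetical default modulus; a real implementation picks its own.
(define default-hash-bound 4194304)

;; Sketch of a string hash following the "0 means default" convention.
(define (sketch-string-hash s . maybe-bound)
  (let* ((b (if (pair? maybe-bound) (car maybe-bound) 0))
         (bound (cond ((not (and (integer? b) (exact? b) (>= b 0)))
                       (error "bound must be a non-negative exact integer" b))
                      ((= b 0) default-hash-bound)  ; 0 selects the default
                      (else b))))
    ;; Simple polynomial hash, reduced modulo BOUND at every step.
    (let lp ((i 0) (h 0))
      (if (= i (string-length s))
          h
          (lp (+ i 1)
              (modulo (+ (* 31 h) (char->integer (string-ref s i)))
                      bound))))))

;; Omitting the bound and passing 0 give the same result.
(= (sketch-string-hash "abc" 0) (sketch-string-hash "abc"))
```

With this shape, passing #f as the bound is rejected by the argument check, which matches the "No more #f" rule above.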