This page is part of the web mail archives of SRFI 13 from before July 7th, 2015. The new archives for SRFI 13 contain all messages, not just those from before July 7th, 2015.
Hello!

If I may, I'd like to propose two more functions, string->integer and string-split.

string-split is similar to string-tokenize, but it supports a 'delimiting' rather than an 'inclusion' grammar. Whereas the token-set argument of string-tokenize names the characters that make up tokens, string-split's argument specifies the set of characters that delimit tokens. Some problems are more elegantly and efficiently expressed in terms of inclusion, others in terms of delimiting. I found, for example, that split() is a rather often-used function in Perl and Python. Furthermore, string-split ought to accept an optional LIMIT argument (called MAXSPLIT in Appendix A) to limit the number of splits performed. The specification in Appendix A below is what I implemented and wrote an extensive set of tests for. You will probably generalize the CHARSET argument (SRFI-14 didn't exist when Appendix A was written). Besides, I defer to you as to how to mold string-split into SRFI-13 should this proposal get accepted.

The R5RS procedure string->number is far more generic than the proposed string->integer -- and this may be a problem, IMHO. For example, string->number will try to read strings like "1/2", "1S2", "1.34" and even "1/0" (the latter causing a zero-divide error). Note that to Gambit's string->number, "1S2" is a valid representation of an _inexact_ integer (100, to be precise). Oftentimes we want to be more restrictive about what we consider a number; we merely want to read an integral label.

 -- procedure+: string->integer STR START END
     Makes sure that the substring of STR from START (inclusive) till END (exclusive) is a representation of a non-negative integer in decimal notation. If so, this integer is returned. Otherwise (when the substring contains non-decimal characters, or when the range from START till END is not within STR), the result is #f.

> [SRFI-13]
> string-concatenate string-list -> string
>     Append the elements of STRING-LIST together into a single _list_.
>     Guaranteed to return a freshly allocated _list_.
Did you mean to say a 'string' (instead of a _list_)?

SRFI-13 mentions that string-unfold is also called an "anamorphism". Do you want to point out that a foldr combinator (e.g., string-fold-right) is also called a "catamorphism"?

Thank you for the trouble you took putting together SRFI-13!

	Oleg

Appendix A. string-split (a very draft proposal)

 -- procedure+: string-split STRING
 -- procedure+: string-split STRING '()
 -- procedure+: string-split STRING '() MAXSPLIT
     Returns a list of whitespace-delimited words in STRING. If STRING is empty or contains only whitespace, then the empty list is returned. Leading and trailing whitespace is trimmed.
     If MAXSPLIT is specified and positive, the resulting list will contain at most MAXSPLIT elements, the last of which is the string remaining after (MAXSPLIT - 1) splits. If MAXSPLIT is specified and non-positive, the empty list is returned.
     "In time critical applications it behooves you not to split into more fields than you really need."

 -- procedure+: string-split STRING CHARSET
 -- procedure+: string-split STRING CHARSET MAXSPLIT
     Returns a list of words delimited by the characters in CHARSET in STRING. CHARSET is a list of characters that are treated as delimiters. Leading or trailing delimiters are NOT trimmed. That is, the resulting list will have as many initial empty-string elements as there are leading delimiters in STRING.
     If MAXSPLIT is specified and positive, the resulting list will contain at most MAXSPLIT elements, the last of which is the string remaining after (MAXSPLIT - 1) splits. If MAXSPLIT is specified and non-positive, the empty list is returned.
     "In time critical applications it behooves you not to split into more fields than you really need."
This is based on the split function in Python/Perl.

(string-split " abc d e f  ") ==> ("abc" "d" "e" "f")
(string-split " abc d e f  " '() 1) ==> ("abc d e f  ")
(string-split " abc d e f  " '() 0) ==> ()
(string-split ":abc:d:e::f:" '(#\:)) ==> ("" "abc" "d" "e" "" "f" "")
(string-split ":" '(#\:)) ==> ("" "")
(string-split "root:x:0:0:Lord" '(#\:) 2) ==> ("root" "x:0:0:Lord")
(string-split "/usr/local/bin:/usr/bin:/usr/ucb/bin" '(#\:))
	==> ("/usr/local/bin" "/usr/bin" "/usr/ucb/bin")
(string-split "/usr/local/bin" '(#\/)) ==> ("" "usr" "local" "bin")

Implementation:
	http://pobox.com/~oleg/ftp/Scheme/util.scm
A regression test suite:
	http://pobox.com/~oleg/ftp/Scheme/vinput-parse.scm
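
The string->integer specification given earlier in this message can be sketched as follows. This is only an illustration written against the prose description, not the code in util.scm. It makes two assumptions the proposal does not state: an empty range (START = END) yields #f, and the digit characters #\0 through #\9 have consecutive char->integer codes (true for ASCII and Unicode). START and END are assumed to be exact integers.

```scheme
;; A sketch of the proposed string->integer; NOT the author's implementation.
(define (string->integer str start end)
  ;; Value 0..9 for the characters #\0..#\9, #f for anything else.
  ;; Assumes the digit characters have consecutive character codes.
  (define (digit ch)
    (and (char>=? ch #\0)
         (char<=? ch #\9)
         (- (char->integer ch) (char->integer #\0))))
  ;; The range must lie within STR. Assumption: an empty range gives #f.
  (and (<= 0 start)
       (< start end)
       (<= end (string-length str))
       (let loop ((i start) (acc 0))
         (if (= i end)
             acc
             (let ((d (digit (string-ref str i))))
               ;; 0 is a true value in Scheme, so a digit of 0 continues.
               (and d (loop (+ i 1) (+ (* 10 acc) d))))))))

(string->integer "port 8080" 5 9)   ; ==> 8080
(string->integer "1S2" 0 3)         ; ==> #f
(string->integer "1/2" 0 3)         ; ==> #f
(string->integer "42" 0 3)          ; ==> #f  (range not within the string)
```

Unlike string->number, this sketch never calls the reader's number parser, so inputs such as "1/0" simply yield #f.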
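
Similarly, the two forms of string-split in Appendix A can be sketched in portable Scheme as below. Again, this is an illustration of the specification rather than the code in util.scm. The helper names (delim?, next-delim, skip-delims, collect) are made up for this sketch, and the result for an empty STRING with a non-empty CHARSET (a list holding one empty string) is an assumption, since the appendix does not cover that case. Some Scheme systems already provide a string-split with different argument conventions, which this definition would shadow.

```scheme
;; A sketch of string-split as described in Appendix A; NOT the author's code.
(define (string-split str . rest)
  (let* ((len      (string-length str))
         (delims   (if (pair? rest) (car rest) '()))
         ;; A string of length len has at most len+1 fields, so this
         ;; default never truncates the result.
         (maxsplit (if (and (pair? rest) (pair? (cdr rest)))
                       (cadr rest)
                       (+ len 1)))
         (ws-mode? (null? delims)))      ; '() selects whitespace splitting
    (define (delim? ch)
      (if ws-mode? (char-whitespace? ch) (memv ch delims)))
    ;; Index of the first delimiter at or after i, or len if there is none.
    (define (next-delim i)
      (cond ((= i len) len)
            ((delim? (string-ref str i)) i)
            (else (next-delim (+ i 1)))))
    ;; Index of the first non-delimiter at or after i, or len if none.
    (define (skip-delims i)
      (cond ((= i len) len)
            ((delim? (string-ref str i)) (skip-delims (+ i 1)))
            (else i)))
    ;; Fields starting at index from; n is the number of fields still allowed.
    (define (collect from n)
      (if (= n 1)
          (list (substring str from len))          ; remainder, left unsplit
          (let ((stop (next-delim from)))
            (cond ((= stop len) (list (substring str from len)))
                  (ws-mode?                        ; runs of whitespace collapse
                   (let ((next (skip-delims stop)))
                     (if (= next len)              ; only trailing whitespace left
                         (list (substring str from stop))
                         (cons (substring str from stop)
                               (collect next (- n 1))))))
                  (else                            ; each delimiter ends a field
                   (cons (substring str from stop)
                         (collect (+ stop 1) (- n 1))))))))
    (cond ((<= maxsplit 0) '())
          (ws-mode? (let ((start (skip-delims 0)))
                      (if (= start len) '() (collect start maxsplit))))
          (else (collect 0 maxsplit)))))

(string-split "root:x:0:0:Lord" '(#\:) 2)   ; ==> ("root" "x:0:0:Lord")
(string-split ":" '(#\:))                   ; ==> ("" "")
```

The whitespace form skips leading whitespace before the first field, whereas the CHARSET form starts collecting at index 0; that is what makes leading delimiters produce empty strings in the second form only.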