[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: Comments on SRFI-13 reference implementation

This page is part of the web mail archives of SRFI 13 from before July 7th, 2015. The new archives for SRFI 13 contain all messages, not just those from before July 7th, 2015.

>    string-index, string-index-right, string-skip, string-skip-right, string-count:
>    Add internal routines that do the work but no error checking.  Call the
>    string arg s for consistency.
> There are no module-internal calls to STRING-COUNT.
> Adding internal routines for string-index & string-index-right would only
> speed up string-titlecase & string-tokenize... but would slow down other
> code's use of string-index by the cost of the procedure call from string-index
> to the internal %string-index

Well, it's a tail call from string-index to %string-index with the same arguments,
so in Gambit it's a jump.   Doing all the the optional argument parsing and
checking that the arguments are reasonable takes a lot longer than this.
(Although in Gambit, optional arguments are quite fast.)  For example, my
checking code is

(##define-macro (macro-check-string-start-end s start end form expr)
  (define (flat x) (if (pair? x) (cons (car x) (flat (cdr x))) (list x)))
  `(if (##string? ,s)
       (let ((,end (if (absent-object? ,end) (string-length ,s) ,end)))
         (if (##fixnum? ,start)
             (if (##fixnum.>= ,start 0)
                 (if (##fixnum? ,end)
                     (if (and (##fixnum.<= ,start ,end)
                              (##fixnum.<= ,end (string-length ,s)))
                         ,(if (list? form)
                              `(##trap-check-range ',(car form) ,@(cdr form))
                              `(##trap-check-range* ',(car form) ,@(flat (cdr form)))))
                     (if (##bignum? ,end)
                         ,(if (list? form)
                              `(##trap-check-range ',(car form) ,@(cdr form))
                              `(##trap-check-range* ',(car form) ,@(flat (cdr form))))
                         ,(if (list? form)
                              `(##trap-check-exact-int ',(car form) ,@(cdr form))
                              `(##trap-check-exact-int* ',(car form) ,@(flat (cdr form))))))
                 ,(if (list? form)
                      `(##trap-check-range ',(car form) ,@(cdr form))
                      `(##trap-check-range* ',(car form) ,@(flat (cdr form)))))
             (if (##bignum? ,start)
                 ,(if (list? form)
                      `(##trap-check-range ',(car form) ,@(cdr form))
                      `(##trap-check-range* ',(car form) ,@(flat (cdr form))))
                 ,(if (list? form)
                      `(##trap-check-exact-int ',(car form) ,@(cdr form))
                      `(##trap-check-exact-int* ',(car form) ,@(flat (cdr form)))))))
       ,(if (list? form)
            `(##trap-check-string ',(car form) ,@(cdr form))
            `(##trap-check-string* ',(car form) ,@(flat (cdr form))))))


(define (string-index s criterion #!optional (start 0) (end (macro-absent-object)))
  (macro-check-string-start-end s start end (string-index criterion s start end)
   (##string-index s criterion start end)))

Each individual check is fast (a few hardware instructions at most), but there are
7 of them, 8 if you have to get end from (string-length s).  So I'd rather do the
extra jump for string-index than the extra checks for the routines that use it.

>    string-concatenate/shared, string-concatenate, string-join:
>    Need to check strings.
> There's no cheap way to check an alpha list to ensure that all elements
> are alphas. So I don't check these.

Well, my version reads

(##define-macro (##every? proc lst)
  `(let ((proc ,proc))
     (let loop ((lst ,lst))
       (or (null? lst)
           (and (proc (car lst))
                (loop (cdr lst)))))))

(define (string-concatenate/shared strings)
  (if (list? strings)
      (if (##every? string? strings)
          (let lp ((strings strings) (nchars 0) (first #f))
            (cond ((pair? strings)                      ; Scan the args, add up total
                   (let* ((string  (car strings))       ; length, remember 1st
                          (tail (cdr strings))          ; non-empty string.
                          (slen (string-length string)))
                     (if (zero? slen)
                         (lp tail nchars first)
                         (lp tail (+ nchars slen) (or first strings)))))

                  ((zero? nchars) "")

                  ;; Just one non-empty string! Return it.
                  ((= nchars (string-length (car first))) (car first))

                  (else (let ((ans (make-string nchars)))
                          (let lp ((strings first) (i 0))
                            (if (pair? strings)
                                (let* ((s (car strings))
                                       (slen (string-length s)))
                                  (##string-copy! ans i s 0 slen)
                                  (lp (cdr strings) (+ i slen)))))
          (error "Not every element of strings is a string: (string-concatenate/shared " strings ")"))
      (error "strings is not a list: (string-concatenate/shared " strings ")")))

Considering the amount of stuff you actually do with the contents of these strings,
and how the routine is going to blow up if it isn't a list or if all the
entries are not strings, I'd check it.  I suppose since you're walking the list
anyway in lp, you could check it inside the cond and get rid of the extra two
traversals of the list (for list? and ##every?).