Re: KMP

This page is part of the web mail archives of SRFI 13 from before July 7th, 2015. The new archives for SRFI 13 contain all messages, not just those from before July 7th, 2015.

To: srfi-13@xxxxxxxxxxxxxxxxx
Subject: Re: KMP
From: shivers@xxxxxxxxxx
Date: Sun, 7 May 2000 17:19:30 -0400
In-reply-to: <DLdF58XCD3IH092yn@xxxxxxxx> (d96-mst-ingen-reklam@xxxxxxxx)
References: <AcCE58XCD3BR092yn@xxxxxxxx> <200005051936.PAA12536@xxxxxxxxxxxxxxxxxx> <DLdF58XCD3IH092yn@xxxxxxxx>
Reply-to: shivers@xxxxxxxxxx

   >You can't use Boyer-Moore with large character types -- it requires you
   >to build a table with one entry for every possible character. Hence
   >not really useable for anything past Latin-1.
   >
   >In fact, I have an implementation of B-M. I just don't *export* it into
   >the library's API as such, because it isn't portable across character types,
   >which is one of the design criteria of the lib.

   But my opinion is that the SRFI should not mention ANY algorithm, it
   should leave it up to implementors. A SRFI is a specification, not an
   implementation, isn't it?

Good question. The specific features of an algorithm can certainly show up in
a specification. Tailoring a spec to a particular algorithm allows the user to
rely on specific performance guarantees of the algorithm. Let's take the case
of KMP. If you know it's KMP, for example, then you know that it won't
allocate temporary storage proportional to the size of the character type.
This is a good thing to know if you'd like your code to run in a Unicode
environment, eh?
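
To make the storage point concrete, here is a minimal KMP sketch in Python (not the SRFI-13 reference code; the names `failure_table` and `kmp_search` are made up for illustration). The only auxiliary storage is one integer per pattern character, so its size depends on the pattern length and not on how many characters the character type has.

```python
def failure_table(pattern):
    """fail[i] is the length of the longest proper prefix of
    pattern[:i + 1] that is also a suffix of it."""
    fail = [0] * len(pattern)  # one entry per pattern character
    k = 0
    for i in range(1, len(pattern)):
        # Fall back through shorter borders until one can be extended.
        while k > 0 and pattern[i] != pattern[k]:
            k = fail[k - 1]
        if pattern[i] == pattern[k]:
            k += 1
        fail[i] = k
    return fail


def kmp_search(text, pattern):
    """Return the index of the first occurrence of pattern in text, or -1."""
    if not pattern:
        return 0
    fail = failure_table(pattern)
    k = 0  # number of pattern characters currently matched
    for i, ch in enumerate(text):
        while k > 0 and ch != pattern[k]:
            k = fail[k - 1]
        if ch == pattern[k]:
            k += 1
        if k == len(pattern):
            return i - k + 1
    return -1
```

Characters are only ever compared for equality, never used as table indices, which is why the same code works unchanged on non-Latin-1 text.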

SRFI-13 specs string search at two levels. One level is the more abstract one
you seem to prefer -- STRING-CONTAINS? and STRING-CONTAINS-CI?. These
functions are free to use any algorithm deemed appropriate by the
library implementor.

The second level is the more algorithm-specific KMP routines. These give you
extra features that are KMP-specific, and so have a more detailed API. Note
that the thing that lets me export the KMP API is that it doesn't conflict
with the design criteria of the library, e.g., it's portable across different
character types (unlike Boyer-Moore or Sunday's algorithm).
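
The message does not list the KMP-specific features. One plausible example, offered here as an assumption, is searching text that arrives in pieces: KMP never backs up in the text, so the match state can be carried from one chunk to the next. The Python sketch below illustrates that idea; the class `KmpMatcher` and its `feed` method are hypothetical names, not part of the SRFI-13 API.

```python
class KmpMatcher:
    """Hypothetical incremental KMP matcher (illustration only)."""

    def __init__(self, pattern):
        if not pattern:
            raise ValueError("pattern must be non-empty")
        self.pattern = pattern
        # Failure table: one integer per pattern character.
        self.fail = [0] * len(pattern)
        k = 0
        for i in range(1, len(pattern)):
            while k > 0 and pattern[i] != pattern[k]:
                k = self.fail[k - 1]
            if pattern[i] == pattern[k]:
                k += 1
            self.fail[i] = k
        self.state = 0     # pattern characters matched so far
        self.consumed = 0  # text characters seen across all chunks

    def feed(self, chunk):
        """Consume a chunk of text. Return the offset of the first match
        within the whole stream, or None if no match has completed yet.
        This sketch stops at the first match and ignores the rest of
        the chunk."""
        for ch in chunk:
            self.consumed += 1
            while self.state > 0 and ch != self.pattern[self.state]:
                self.state = self.fail[self.state - 1]
            if ch == self.pattern[self.state]:
                self.state += 1
            if self.state == len(self.pattern):
                return self.consumed - len(self.pattern)
        return None
```

For instance, feeding `"abcab"` and then `"cabd"` to a matcher for `"abcabd"` finds the match that straddles the two chunks without buffering or rescanning the first one.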

You use the interface that's appropriate to the task you have.
    -Olin

P.S. Note that an *algorithm* is not an *implementation*.

References:
- KMP
  - From: Mikael Ståldal
- Re: KMP
  - From: shivers
- Re: KMP
  - From: Mikael Ståldal
