This page is part of the web mail archives of SRFI 13 from before July 7th, 2015. The new archives for SRFI 13 contain all messages, not just those from before July 7th, 2015.
>You can't use Boyer-Moore with large character types -- it requires you
>to build a table with one entry for every possible character. Hence
>not really useable for anything past Latin-1.
>
>In fact, I have an implementation of B-M. I just don't *export* it into
>the library's API as such, because it isn't portable across character types,
>which is one of the design criteria of the lib.

But my opinion is that the SRFI should not mention ANY algorithm, it should leave it up to implementors. A SRFI is a specification, not an implementation, isn't it?

Good question. The specific features of an algorithm can certainly show up in a specification. Tailoring a spec to a particular algorithm allows the user to rely on specific performance guarantees of the algorithm.

Let's take the case of KMP. If you know it's KMP, for example, then you know that it won't allocate temporary storage proportional to the size of the character type. This is a good thing to know if you'd like your code to run in a Unicode environment, eh?

SRFI-13 specs string search in two levels. One level is the more abstract one you seem to prefer -- STRING-CONTAINS? and STRING-CONTAINS-CI?. These functions are free to use any algorithm deemed appropriate by the library implementor.

The second level is the more algorithm-specific KMP routines. These give you extra features that are KMP-specific, and so have a more detailed API.

Note that the thing that lets me export the KMP API is that it doesn't conflict with the design criteria of the library, e.g., it's portable across different character types (unlike Boyer-Moore or Sunday's algorithm).

You use the interface that's appropriate to the task you have.

    -Olin

P.S. Note that an *algorithm* is not an *implementation*.
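
To illustrate the storage point made above about KMP, here is a minimal sketch in Python rather than Scheme. The names `build_failure_table` and `kmp_search` are hypothetical and are not the SRFI-13 KMP API; the sketch only shows that the one auxiliary table KMP builds has one entry per pattern character, so its size does not depend on how many characters the character type can represent.

```python
def build_failure_table(pattern):
    """Return the KMP failure table for `pattern` (hypothetical helper).

    table[i] is the length of the longest proper prefix of pattern[:i + 1]
    that is also a suffix of it. The table has len(pattern) entries,
    regardless of the size of the character set.
    """
    table = [0] * len(pattern)
    k = 0  # length of the currently matched prefix
    for i in range(1, len(pattern)):
        # On a mismatch, fall back to the next shorter candidate prefix.
        while k > 0 and pattern[i] != pattern[k]:
            k = table[k - 1]
        if pattern[i] == pattern[k]:
            k += 1
        table[i] = k
    return table


def kmp_search(text, pattern):
    """Return the index of the first occurrence of `pattern` in `text`, or -1."""
    if not pattern:
        return 0
    table = build_failure_table(pattern)
    k = 0  # number of pattern characters matched so far
    for i, ch in enumerate(text):
        # Reuse the failure table instead of re-scanning the text.
        while k > 0 and ch != pattern[k]:
            k = table[k - 1]
        if ch == pattern[k]:
            k += 1
        if k == len(pattern):
            return i - k + 1
    return -1
```

Characters are only ever compared for equality here, never used as table indices, which is why the same code works unchanged on non-Latin-1 text such as `kmp_search("αβγαβδ", "αβδ")`.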