[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: freshman-level Boyer-Moore fast string search



William D Clinger wrote:
 > Out of curiosity, what string representations does SRFI-75 penalize
 > which you consider to be poor?

Suppose each string s is represented by a vector of 2^21 elements,
where element i consists of a list of numbers, in IEEE double
precision, that represent the indexes within s at which the
character c appears, where c is the Unicode scalar value f(i),
where f is represented by a global association list that maps
scalar values to indexes (i.e. f-inverse).  SRFI-75 allows that
representation, yet penalizes it.  I also consider it to be a
poor representation.

You consider this a poor representation because it does not allow constant-time random access or because it does not allow linear-time traversal? Or simply because of the space cost?

Consider an implementation that internally represents strings as if they were doubly-linked lists and cached the position of the last index. This allows linear-time traversal but not constant-time random access. Would you consider this a poor implementation?

The thing is, SRFI-75 does not mention how one accesses the characters in a string, so I presume that R5RS character indices will remain the only portable way to do this.

Regards,

Alan
--
Dr Alan Watson
Centro de Radioastronomía y Astrofísica
Universidad Astronómico Nacional de México