Re: Surrogates and character representation

William D Clinger wrote:
> Per Bothner wrote:
> > Random accesses to a position in a string that has not
> > been previously accessed is not in itself useful.
>
> Untrue.  The Boyer-Moore algorithm for fast string
> searching uses random accesses to positions that have
> not been previously accessed [1].

Yes, but I think you can implement this for UTF-8 or UTF-16 strings using offsets into the underlying bytes or shorts (that is, the 8-bit or 16-bit code units). I don't think that you need character offsets.
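
To illustrate, here is a minimal sketch in Python of searching by byte offset over UTF-8. It uses the Horspool simplification of Boyer-Moore (bad-character shifts only), not the full algorithm, and the name horspool_find and the sample strings are made up for this example. The shift table is indexed by byte value, and every jump is a random access by byte offset. Because UTF-8 is self-synchronizing, a match of a well-formed pattern in a well-formed string should always begin on a character boundary.

```python
def horspool_find(haystack: bytes, needle: bytes) -> int:
    """Return the byte offset of the first match of needle, or -1.

    Hypothetical helper: Boyer-Moore-Horspool over raw code units.
    """
    m, n = len(needle), len(haystack)
    if m == 0:
        return 0
    # Shift table indexed by byte value (0..255), not by character.
    shift = [m] * 256
    for i in range(m - 1):
        shift[needle[i]] = m - 1 - i
    pos = 0
    while pos + m <= n:
        # Compare the window starting at byte offset pos.
        if haystack[pos:pos + m] == needle:
            return pos
        # Jump ahead using the last byte of the window; the bytes
        # skipped over are never examined.
        pos += shift[haystack[pos + m - 1]]
    return -1


# Sample data (an assumption for illustration): non-ASCII text, so
# byte offsets and character offsets differ.
text = "Centro de Radioastronomía y Astrofísica".encode("utf-8")
pattern = "Astrofísica".encode("utf-8")
offset = horspool_find(text, pattern)

# The result is a byte offset; slicing the bytes there and decoding
# recovers the matched text without any character index.
matched = text[offset:offset + len(pattern)].decode("utf-8")
```

The same approach should carry over to UTF-16 with a table indexed by 16-bit code unit, or with a hash of the code unit if a 65536-entry table is too large.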


Dr Alan Watson
Centro de Radioastronomía y Astrofísica
Universidad Astronómico Nacional de México