Tom Emerson wrote:
> Representing strings internally in UTF-8 is a loss though, since you lose random access to the string.
Random access to a previously accessed position works just fine - just use the byte offset.
Random access to a position in a string that has not been previously accessed is not in itself useful.
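To illustrate the byte-offset point, here is a minimal sketch in Python. The helper names (`char_len`, `find`, `char_at`) are hypothetical and not taken from any Scheme implementation. It assumes the buffer holds well-formed UTF-8 and that a saved offset always points at the start of a character.

```python
def char_len(lead: int) -> int:
    """Byte length of the UTF-8 sequence that starts with this lead byte."""
    if lead < 0x80:
        return 1
    if lead >> 5 == 0b110:
        return 2
    if lead >> 4 == 0b1110:
        return 3
    if lead >> 3 == 0b11110:
        return 4
    raise ValueError("not a UTF-8 lead byte")


def char_at(buf: bytes, pos: int) -> str:
    """Decode the single character starting at byte offset pos (no scanning)."""
    return buf[pos:pos + char_len(buf[pos])].decode("utf-8")


def find(buf: bytes, target: str) -> int:
    """Scan sequentially; return the byte offset of the first match, or -1."""
    pos = 0
    while pos < len(buf):
        if char_at(buf, pos) == target:
            return pos
        pos += char_len(buf[pos])
    return -1


text = "na\u00efve caf\u00e9".encode("utf-8")

# The first visit is sequential, and we remember where we ended up.
saved = find(text, "\u00e9")

# Coming back later is a direct jump to the saved byte offset.
again = char_at(text, saved)
```

The saved offset here is 10, although the character is the tenth one (character index 9). The application never needs the character index, only a way to get back to a place it has already visited.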
> For some applications this isn't a big deal, but in general using UTF-8
> as an internal representation is a bad idea.

It's the other way round. Using UTF-8 as an internal representation is just fine for *applications*. The problem is that certain *API*s have a concept of indexing into a string, and unfortunately R5RS is one of them. In itself indexing of strings is a useless feature, as it can be replaced by a sequential-access cursor/iterator API - but historically the Scheme cursor/iterator API uses integers for the "cursor". And existing code moves the "cursor" forwards by adding 1.
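A rough sketch of the difference, again in Python with hypothetical names (`cursor_next`, `cursor_ref`, `string_ref`, `chars_by_cursor`), none of which come from R5RS or any particular Scheme. It assumes well-formed UTF-8 and treats the cursor as a byte offset that only `cursor_next` may advance. `string_ref` emulates integer indexing on top of UTF-8 and has to walk from the start on every call.

```python
def cursor_next(buf: bytes, cur: int) -> int:
    """Advance past one character by skipping continuation bytes (10xxxxxx)."""
    cur += 1
    while cur < len(buf) and (buf[cur] & 0xC0) == 0x80:
        cur += 1
    return cur


def cursor_ref(buf: bytes, cur: int) -> str:
    """Decode the character that the cursor points at."""
    return buf[cur:cursor_next(buf, cur)].decode("utf-8")


def string_ref(buf: bytes, k: int) -> str:
    """Integer indexing emulated on UTF-8: walks k characters from the start."""
    cur = 0
    for _ in range(k):
        cur = cursor_next(buf, cur)
    return cursor_ref(buf, cur)


def chars_by_cursor(buf: bytes) -> list:
    """Visit every character once; each step costs one character's bytes."""
    out = []
    cur = 0
    while cur < len(buf):
        out.append(cursor_ref(buf, cur))
        cur = cursor_next(buf, cur)
    return out


text = "\u03bbx.x".encode("utf-8")  # lambda takes two bytes, the rest one each

chars = chars_by_cursor(text)

# Treating the cursor as an integer and adding 1 lands inside the lambda:
# byte 1 is a continuation byte, not the start of a character.
lands_mid_char = (text[0 + 1] & 0xC0) == 0x80
```

Looping with `cursor_next` is linear in the length of the string. Looping with `string_ref(text, k)` for each `k` is quadratic, and code that steps the cursor with `+ 1` breaks as soon as a character takes more than one byte.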
--
	--Per Bothner
per@xxxxxxxxxxx   http://per.bothner.com/