
Re: freshman-level Boyer-Moore fast string search

This page is part of the web mail archives of SRFI 75 from before July 7th, 2015. The new archives for SRFI 75 contain all messages, not just those from before July 7th, 2015.

To: William D Clinger <cesura@xxxxxxxxxxx>
Subject: Re: freshman-level Boyer-Moore fast string search
From: Alan Watson <a.watson@xxxxxxxxxxxxxxxx>
Date: Fri, 29 Jul 2005 10:20:30 -0500
Cc: srfi-75@xxxxxxxxxxxxxxxxx
Delivered-to: srfi-75@xxxxxxxxxxxxxxxxx
In-reply-to: <y9liryuvwne.fsf@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx>
Organization: Centro de Radioastronomía y Astrofísica UNAM
References: <42E991CE.5050806@xxxxxxxxxxx> <5fb7e0870507282012453c30dd@xxxxxxxxxxxxxx> <42E9B37A.80008@xxxxxxxxxxx> <5fb7e0870507282305772e2cc2@xxxxxxxxxxxxxx> <y9liryuvwne.fsf@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx>
User-agent: Mozilla Thunderbird 1.0 (X11/20050317)

William D Clinger wrote:

 > > Out of curiosity, what string representations does SRFI-75 penalize
 > > which you consider to be poor?
 >
 > Suppose each string s is represented by a vector of 2^21 elements,
 > where element i consists of a list of numbers, in IEEE double
 > precision, that represent the indexes within s at which the
 > character c appears, where c is the Unicode scalar value f(i),
 > where f is represented by a global association list that maps
 > scalar values to indexes (i.e. f-inverse).  SRFI-75 allows that
 > representation, yet penalizes it.  I also consider it to be a
 > poor representation.
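
For concreteness, the representation described above might be sketched roughly as follows. This is only an illustration under stated assumptions: a Python dict stands in for both the 2^21-element vector and the global association list, and the names encode, string_ref and string_length are hypothetical, not taken from SRFI-75 or R5RS.

```python
def encode(text):
    """Map each character to the list of indexes at which it occurs."""
    table = {}  # assumption: a dict replaces the vector plus association list
    for i, ch in enumerate(text):
        # Indexes are stored as doubles, as in the description above.
        table.setdefault(ch, []).append(float(i))
    return table


def string_ref(table, k):
    """Find the character at index k by searching the index lists."""
    for ch, positions in table.items():
        if float(k) in positions:
            return ch
    raise IndexError(k)


def string_length(table):
    """Recover the length by counting every stored index."""
    return sum(len(positions) for positions in table.values())
```

In this sketch a single string_ref may have to scan every stored index, so neither constant-time random access nor linear-time traversal by index is available.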

You consider this a poor representation because it does not allow constant-time random access or because it does not allow linear-time traversal? Or simply because of the space cost?

Consider an implementation that internally represents strings as if they were doubly-linked lists and caches the position of the last index accessed. This allows linear-time traversal but not constant-time random access. Would you consider this a poor implementation?
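
A minimal sketch of such an implementation, to make the cost model concrete. The class name CursorString, the method ref and the steps counter are hypothetical names introduced only for illustration; no actual Scheme implementation is being described.

```python
class _Node:
    __slots__ = ("ch", "prev", "next")

    def __init__(self, ch):
        self.ch = ch
        self.prev = None
        self.next = None


class CursorString:
    """Hypothetical string kept as a doubly-linked list with a cached cursor."""

    def __init__(self, text):
        self.length = len(text)
        self.head = None
        self.steps = 0  # link hops taken so far, kept only for illustration
        tail = None
        for ch in text:
            node = _Node(ch)
            if tail is None:
                self.head = node
            else:
                tail.next = node
                node.prev = tail
            tail = node
        # Cache: the node reached by the most recent access, and its index.
        self._node = self.head
        self._index = 0

    def ref(self, k):
        """Return the character at index k, walking from the cached position."""
        if not 0 <= k < self.length:
            raise IndexError(k)
        node, i = self._node, self._index
        while i < k:  # walk forward from the cache
            node, i = node.next, i + 1
            self.steps += 1
        while i > k:  # or backward
            node, i = node.prev, i - 1
            self.steps += 1
        self._node, self._index = node, i
        return node.ch
```

Calling ref(0), ref(1), ..., ref(n-1) in order costs one hop per call, so a traversal by index is linear overall, whereas a jump from one end of the string to the other costs a number of hops proportional to the distance.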

The thing is, SRFI-75 does not mention how one accesses the characters in a string, so I presume that R5RS character indices will remain the only portable way to do this.


Regards,

Alan
--
Dr Alan Watson
Centro de Radioastronomía y Astrofísica
Universidad Astronómico Nacional de México
