[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: Surrogates and character representation

This page is part of the web mail archives of SRFI 75 from before July 7th, 2015. The new archives for SRFI 75 contain all messages, not just those from before July 7th, 2015.



On 7/28/05, William D Clinger <cesura@xxxxxxxxxxx> wrote:
> 
> I acknowledged that the algorithm will still work.  My point
> is that its asymptotic complexity may be degraded.

I'm sorry, perhaps I'm just misunderstanding, but if the exact
same algorithm, in fact the exact same code, can be used,
how is the asymptotic complexity affected?

> > It is in fact UTF-32 that has additional overhead for Boyer-Moore
> > searches as mentioned in my previous mail.
> 
> Your claim was incorrect.  Your claim appears to depend upon
> ignorance of techniques for constructing, representing, and
> manipulating sparse automata efficiently, and upon your mistaken
> belief that c1 > c2 implies O(m+c1) > O(m+c2), where c1 and c2
> are constants.  Finally, you ignored the difference between O(n)
> and O(n/m), which was the point of the paragraph you quoted above.

Please note that I said "additional overhead."  Nowhere did I say
or suggest that UTF-32 incurred an asymptotic penalty.  However,
the c1 and c2 are not just constants - they are typically larger than
m, and therefore should not be ignored.

-- 
Alex