
Re: Why are byte ports "ports" as such?


Jonathan S. Shapiro scripsit:

> > Unless char is defined to be a code point. Which is IMHO the most
> > reasonable choice: code points are the natural atomic units of Unicode
> > text, and most Unicode algorithms are expressed in terms of code points.
> 
> In many respects I agree that this would be sensible from the
> programmer's perspective.
> 
> Unfortunately it is quite wrong, which is something that the UNICODE
> people go to great lengths to make clear (and, IMO, a serious failing of
> UNICODE).

It's not *wrong*.  It's not a matter of the Right Thing and the Wrong Thing.
For some purposes, code units (8 or 16 or 32 bits) are the Right Thing;
for some purposes, code points are; for some purposes, higher-level units
are.  It's about appropriate choices.
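[To make the distinction concrete, here is a small illustration in Python, whose strings are sequences of code points. The same piece of text yields different counts depending on which unit you ask about, which is why no single unit is the Right Thing for all purposes:]

```python
# "é" written as 'e' plus U+0301 COMBINING ACUTE ACCENT:
# one user-perceived character, but two code points.
s = "e\u0301"
assert len(s) == 2                          # 2 code points
assert len(s.encode("utf-8")) == 3          # 3 UTF-8 code units (bytes)
assert len(s.encode("utf-16-le")) // 2 == 2 # 2 UTF-16 code units

# U+1D11E MUSICAL SYMBOL G CLEF lies outside the BMP:
# one code point, but two UTF-16 code units (a surrogate pair).
t = "\U0001D11E"
assert len(t) == 1                          # 1 code point
assert len(t.encode("utf-16-le")) // 2 == 2 # 2 UTF-16 code units
assert len(t.encode("utf-8")) == 4          # 4 UTF-8 code units (bytes)
```

[The first example is also a single grapheme cluster, the sort of higher-level unit a text editor or display algorithm would want; counting those requires the segmentation rules of UAX #29, which no fixed-width unit gives you.]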

And if Unicode is complicated (and it is), it's because it's embedded in
a complicated world.

-- 
John Cowan   cowan@ccil.org  http://www.ccil.org/~cowan
Most languages are dramatically underdescribed, and at least one is
dramatically overdescribed.  Still other languages are simultaneously
overdescribed and underdescribed.  Welsh pertains to the third category.
        --Alan King