Re: Why are byte ports "ports" as such?



Jonathan S. Shapiro scripsit:

> > Unless char is defined to be a code point. Which is IMHO the most
> > reasonable choice: code points are the natural atomic units of Unicode
> > text, and most Unicode algorithms are expressed in terms of code points.
> 
> In many respects I agree that this would be sensible from the
> programmer's perspective.
> 
> Unfortunately it is quite wrong, which is something that the UNICODE
> people go to great lengths to make clear (and, IMO, a serious failing of
> UNICODE).

It's not *wrong*.  It's not a matter of the Right Thing and the Wrong Thing.
For some purposes, code units (8 or 16 or 32 bits) are the Right Thing;
for some purposes, code points are; for some purposes, higher-level units
are.  It's about appropriate choices.
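
To make the distinction concrete, here is a minimal Python sketch that counts
the same short string at each of those levels. The function name `unit_counts`
and the sample string are illustrative assumptions, not anything from this
thread, and the "cluster" count is only a rough stand-in for higher-level
units: it groups combining marks with the preceding base character and does
not implement full Unicode text segmentation.

```python
import unicodedata


def unit_counts(text):
    """Count one string at several levels of granularity (illustrative only)."""
    clusters = 0
    for ch in text:
        # Rough approximation: every non-combining code point starts a new
        # cluster. A leading combining mark is not counted as its own cluster.
        if unicodedata.combining(ch) == 0:
            clusters += 1
    return {
        # The "-le" codecs are used so that no byte order mark is emitted.
        "utf8_code_units": len(text.encode("utf-8")),
        "utf16_code_units": len(text.encode("utf-16-le")) // 2,
        "utf32_code_units": len(text.encode("utf-32-le")) // 4,
        "code_points": len(text),
        "approx_clusters": clusters,
    }


# "e" + COMBINING ACUTE ACCENT (U+0301) + MUSICAL SYMBOL G CLEF (U+1D11E)
sample = "e\u0301\U0001D11E"
counts = unit_counts(sample)
for name, value in counts.items():
    print(f"{name}: {value}")
```

For this sample the five counts differ (7, 4, 3, 3 and 2 respectively), which
is the point: which count is the "length" depends on what the caller is doing
with the text.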

And if Unicode is complicated (and it is), it's because it's embedded in
a complicated world.

-- 
John Cowan   cowan@ccil.org  http://www.ccil.org/~cowan
Most languages are dramatically underdescribed, and at least one is
dramatically overdescribed.  Still other languages are simultaneously
overdescribed and underdescribed.  Welsh pertains to the third category.
        --Alan King