
Re: Why are byte ports "ports" as such?

On Tue, 2006-05-23 at 14:15 -0400, John Cowan wrote:
> Jonathan S. Shapiro scripsit:
> > Unfortunately it is quite wrong, which is something that the UNICODE
> > people go to great lengths to make clear (and, IMO, a serious failing of
> > UNICODE).
> 
> It's not *wrong*.  It's not a matter of the Right Thing and the Wrong Thing.
> For some purposes, code units (8 or 16 or 32 bits) are the Right Thing;
> for some purposes, codepoints are; for some purposes, higher-level units
> are.  It's about appropriate choices.

I did not mean "wrong" in the sense of "immoral, unethical, or
fattening". I meant "wrong" in the sense of "incorrect or inaccurate".
For better or worse, the real world has decided that characters are not
code points. Given that this is true, I am simply suggesting that it is
a mistake to mislabel them by making poor choices about the names of
standard procedures.

READ-CHAR must conceptually be built on top of READ-CODEPOINT, which in
turn must conceptually be built on top of READ-BYTE. From our experience
with BitC, READ-CODEPOINT appears to be sufficient for implementing the
compiler/interpreter, so READ-CHAR can be implemented as a library
procedure.
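
For concreteness, here is a minimal sketch of that layering in Scheme.
It assumes UTF-8 input and a READ-BYTE that returns an exact integer in
0..255 or an eof object; the names read-codepoint and read-char* are
illustrative only (not anything the SRFI defines), and error handling
for malformed sequences is omitted:

  ;; Decode one UTF-8 codepoint from a byte port (illustrative sketch).
  (define (read-codepoint port)
    (let ((b0 (read-byte port)))
      (cond
        ((eof-object? b0) b0)
        ((< b0 #x80) b0)                          ; 1-byte sequence
        ((< b0 #xE0)                              ; 2-byte sequence
         (let ((b1 (read-byte port)))
           (+ (* (- b0 #xC0) 64) (- b1 #x80))))
        ((< b0 #xF0)                              ; 3-byte sequence
         (let* ((b1 (read-byte port))
                (b2 (read-byte port)))
           (+ (* (- b0 #xE0) 4096)
              (* (- b1 #x80) 64)
              (- b2 #x80))))
        (else                                     ; 4-byte sequence
         (let* ((b1 (read-byte port))
                (b2 (read-byte port))
                (b3 (read-byte port)))
           (+ (* (- b0 #xF0) 262144)
              (* (- b1 #x80) 4096)
              (* (- b2 #x80) 64)
              (- b3 #x80)))))))

  ;; A library-level "read char": here it merely exposes one codepoint
  ;; as a Scheme character; a real character layer would combine
  ;; codepoints (e.g. base + combining marks) at this point.
  (define (read-char* port)
    (let ((cp (read-codepoint port)))
      (if (eof-object? cp) cp (integer->char cp))))

The point of the sketch is only the layering: nothing above
read-codepoint needs to see raw bytes, and nothing below read-char*
needs to know what a "character" is.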

> And if Unicode is complicated (and it is), it's because it's embedded in
> a complicated world.

Indeed.