
Re: Why are byte ports "ports" as such?

This page is part of the web mail archives of SRFI 91 from before July 7th, 2015. The new archives for SRFI 91 contain all messages, not just those from before July 7th, 2015.

On Tue, 2006-05-23 at 11:57 -0700, Per Bothner wrote:
> Jonathan S. Shapiro wrote:
> > READ-CHAR must conceptually be built on top of READ-CODEPOINT, which in
> > turn must conceptually be built on top of READ-BYTE. From our experience
> > in BitC, it appears to be the case that READ-CODEPOINT is sufficient for
> > implementation of the compiler/interpreter, and READ-CHAR can therefore
> > be implemented as a library procedure.
> What is the use-case for read-char, as you define it?
> What is the use-case for a "character" data type that is
> *not* a codepoint data type?

We are getting to the jagged edge of what I know about UNICODE, but here
is the situation as I understand it.

The underlying issue within UNICODE is the existence of the so-called
"combining characters". There exist characters that have no single
defining codepoint. These exist primarily in Asian languages, for
example in the form of multiple code points that together form a single

The use case, then, seems self-evident: programs that must be aware of
these at the code-point level.
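
To make the layering concrete, here is a rough sketch in Python (not Scheme, and not anything proposed in the SRFI) of a READ-CHAR-like procedure built as a library routine on top of a stream of code points. The name read_chars is hypothetical, and the grouping rule (a base code point followed by any code points with a nonzero canonical combining class) is a simplifying assumption; real grapheme segmentation has more cases than this.

```python
import unicodedata

def read_chars(codepoints):
    """Group code points into 'characters': a base code point plus any
    combining marks that follow it. Simplified; illustration only."""
    cluster = ""
    for cp in codepoints:
        # A code point with combining class 0 starts a new character.
        if cluster and unicodedata.combining(cp) == 0:
            yield cluster
            cluster = ""
        cluster += cp
    if cluster:
        yield cluster

text = "e\u0301a"  # 'e', COMBINING ACUTE ACCENT, 'a'
chars = list(read_chars(text))
print(len(text), len(chars))  # code points vs. grouped characters
```

Here the input has three code points but only two characters in the sense above, which is the gap between READ-CODEPOINT and READ-CHAR being discussed.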

The codepoint==char presumption is simply untrue in some non-western