Re: Why are byte ports "ports" as such?



On Tue, 2006-05-23 at 11:57 -0700, Per Bothner wrote:
> Jonathan S. Shapiro wrote:
> > READ-CHAR must conceptually be built on top of READ-CODEPOINT, which in
> > turn must conceptually be built on top of READ-BYTE. From our experience
> > in BitC, it appears to be the case that READ-CODEPOINT is sufficient for
> > implementation of the compiler/interpreter, and READ-CHAR can therefore
> > be implemented as a library procedure.
> 
> What is the use-case for read-char, as you define it?
> What is the use-case for a "character" data type that is
> *not* a codepoint data type?

We are getting to the jagged edge of what I know about Unicode, but here
is the situation as I understand it.

The underlying issue within Unicode is the existence of the so-called
"combining characters". There exist characters that have no single
defining code point; they are instead written as a sequence of multiple
code points that together form a single "glyph". As I understand it,
these turn up most often in some Asian scripts, though accented letters
can be built the same way.

The use case, then, seems self-evident: programs that must treat such a
sequence as one character, rather than handling its code points one at
a time.

The codepoint==char presumption is simply untrue in some non-western
languages.
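
To make the distinction concrete, here is a minimal sketch in Python
(not BitC or Scheme, and not how any real READ-CHAR is specified). The
name read_chars is hypothetical, and the grouping rule is a deliberate
simplification: it attaches every code point with a nonzero canonical
combining class to the base code point before it. The full Unicode
segmentation rules are more involved than this.

```python
import unicodedata

def read_chars(text):
    """Group code points into a base code point plus trailing combining marks.

    Simplified assumption: a code point with a nonzero canonical
    combining class belongs to the preceding cluster.
    """
    clusters = []
    for cp in text:
        if clusters and unicodedata.combining(cp) != 0:
            clusters[-1] += cp      # attach the mark to the previous base
        else:
            clusters.append(cp)     # start a new "character"
    return clusters

# 'e' + COMBINING ACUTE ACCENT, 'q' + COMBINING TILDE, then a plain 'z'
sample = "e\u0301q\u0303z"
code_points = list(sample)
chars = read_chars(sample)

print(len(code_points), "code points,", len(chars), "characters")
```

The first cluster could also be written as the single precomposed code
point U+00E9, but as far as I know there is no precomposed form for q
with a tilde, so for that one a "character" really is two code points.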


shap