Re: Why are byte ports "ports" as such?

This page is part of the web mail archives of SRFI 91 from before July 7th, 2015. The new archives for SRFI 91 contain all messages, not just those from before July 7th, 2015.



On Tue, 2006-05-23 at 13:08 -0700, Per Bothner wrote:
> Code that works on compound characters *as a unit* can and should use a
> string type.  Code that needs to look *inside* a compound character
> needs to work with codepoints.

I see the argument. I don't know enough to agree or disagree, but it
does seem plausible, and it also seems to have worked for many other
languages.
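
To make the distinction concrete, here is a small sketch in Python (not Scheme, and not anything proposed in this SRFI). The helper name code_points is made up for illustration. It shows one user-visible "compound character" treated as a short string, and the same text taken apart into code points:

```python
import unicodedata

def code_points(s):
    """Return the Unicode code points of s as a list of integers.

    Hypothetical helper, named here only for illustration.
    """
    return [ord(ch) for ch in s]

# "e" followed by COMBINING ACUTE ACCENT: one compound character
# on screen, but two code points underneath.
compound = "e\u0301"

# Handled as a unit, it is simply a (short) string.
as_unit = compound

# Looking inside it means working with the individual code points.
inside = code_points(compound)

# Normalization to NFC happens to compose this pair into a single
# precomposed code point, U+00E9. Not every compound character has
# a precomposed form, so this is a property of this example only.
composed = unicodedata.normalize("NFC", compound)
```

Here inside is the two-element list [0x65, 0x301], while composed is the one-code-point string "\u00e9".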

> In Java, "character" is actually a Unicode code-point.  This is how it
> should be in Scheme, though we might want to replace the 16-bit size
> by a 20-bit size to avoid the complexities of surrogate characters.

Small nit: I seem to recall that the magic number is actually 21 bits,
but in either case, I agree completely. Java (and several other
languages, either for the same reason or because of compatibility) got
bitten by committing to a 16-bit character size too early.
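
The arithmetic behind the 21-bit figure can be checked with a short sketch, again in Python and with made-up helper names (bits_needed, utf16_units). It assumes the Unicode code space ends at U+10FFFF, and it shows the surrogate-pair split that a 16-bit character type forces for anything above U+FFFF:

```python
# Highest code point in the Unicode code space.
MAX_CODE_POINT = 0x10FFFF

def bits_needed(n):
    """Number of bits needed to represent the non-negative integer n."""
    return n.bit_length()

def utf16_units(cp):
    """Encode one code point as a list of 16-bit UTF-16 code units.

    Hypothetical helper for illustration: code points up to U+FFFF
    fit in one unit, anything above needs a surrogate pair.
    """
    if cp < 0 or cp > MAX_CODE_POINT:
        raise ValueError("not a Unicode code point")
    if 0xD800 <= cp <= 0xDFFF:
        raise ValueError("surrogate code points cannot be encoded")
    if cp < 0x10000:
        return [cp]
    v = cp - 0x10000                 # 20 bits remain after the offset
    high = 0xD800 | (v >> 10)        # top 10 bits
    low = 0xDC00 | (v & 0x3FF)       # bottom 10 bits
    return [high, low]

# U+1D11E MUSICAL SYMBOL G CLEF lies outside the 16-bit range.
g_clef_units = utf16_units(0x1D11E)
```

bits_needed(MAX_CODE_POINT) is 21, whereas 20 bits only reach U+FFFFF. The 20 that sometimes gets quoted is the payload of a surrogate pair (10 bits in each half), not the width of a code point. g_clef_units comes out as [0xD834, 0xDD1E].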

shap