Re: Why are byte ports "ports" as such?



On Tue, 2006-05-23 at 13:08 -0700, Per Bothner wrote:
> Code that works on compound characters *as a unit* can and should use a
> string type.  Code that needs to look *inside* a compound character
> needs to work with codepoints.

I see the argument. I don't know enough to agree or disagree, but it
does seem plausible, and it also seems to have worked for many other
languages.
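
To make the distinction concrete, here is a minimal sketch in Python (not Scheme or Java, purely for illustration). The variable names are hypothetical, and the choice of "e" plus a combining acute accent as the compound character is my own assumption, not something from Per's message.

```python
import unicodedata

# "e" followed by COMBINING ACUTE ACCENT: one compound character, two code points.
compound = "e\u0301"

# Treating it as a unit: ordinary string operations, no need to look inside.
word = "caf" + compound

# Looking inside: work with the individual code points.
code_points = [ord(c) for c in compound]
names = [unicodedata.name(c) for c in compound]

print([hex(cp) for cp in code_points])  # ['0x65', '0x301']
print(names)  # ['LATIN SMALL LETTER E', 'COMBINING ACUTE ACCENT']
```

The string operations never need to know how many code points the compound character contains; only the code that inspects it does.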

> In Java, "character" is actually a Unicode code-point.  This is how it
> should be in Scheme, though we might want to replace the 16-bit size
> by a 20-bit size to avoid the complexities of surrogate characters.

Small nit: I seem to recall that the magic number is actually 21 bits
(code points run up to U+10FFFF), but in either case, I agree
completely. Java (and several other languages, either for the same
reason or because of compatibility) got bitten by committing to a
16-bit character too early.
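
A quick way to check the arithmetic, sketched in Python. The helper name `utf16_code_units` and the sample code point U+1D11E are my own choices for illustration; the sketch relies only on the built-in UTF-16 encoder.

```python
# Highest code point defined by Unicode.
MAX_CODE_POINT = 0x10FFFF

def utf16_code_units(code_point):
    """Return the 16-bit UTF-16 code units used to encode one code point."""
    encoded = chr(code_point).encode("utf-16-be")
    return [int.from_bytes(encoded[i:i + 2], "big")
            for i in range(0, len(encoded), 2)]

# Number of bits needed to hold any code point in a single unit.
print(MAX_CODE_POINT.bit_length())  # 21

# A code point above U+FFFF does not fit in 16 bits and needs a surrogate pair.
print([hex(u) for u in utf16_code_units(0x1D11E)])  # ['0xd834', '0xdd1e']

# A code point at or below U+FFFF fits in one 16-bit unit.
print([hex(u) for u in utf16_code_units(0x41)])  # ['0x41']
```

With a character type at least 21 bits wide, every code point is a single character and the surrogate-pair case goes away.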

shap