Re: character strings versus byte strings
bear <bear@xxxxxxxxx> writes:
> Each character is a unicode codepoint plus a non-defective sequence of
> unicode combining codepoints. The unicode documentation refers to these
> entities as "graphemes."
I should revise what I said; there may well be a case for Scheme
characters being graphemes instead of codepoints. I lean toward
codepoints, but I recognize that graphemes are a good contender.
My post was intended to argue against UTF-8; but moving further up the
abstraction ladder than codepoints may well be right.
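
To make the three rungs of that ladder concrete, here is a rough sketch in Python (not Scheme, and not a proposal for any API). The helper name `graphemes` is made up for illustration, and it only implements the simple definition quoted above, a base codepoint plus the combining codepoints that follow it; the full Unicode grapheme cluster rules cover more cases than this.

```python
import unicodedata

def graphemes(text):
    """Group each base codepoint with the combining codepoints after it.

    Hypothetical helper: a simplification of Unicode grapheme clusters
    that only looks at the canonical combining class of each codepoint.
    """
    clusters = []
    for ch in text:
        # A nonzero combining class marks a combining codepoint; attach it
        # to the preceding cluster if there is one.
        if clusters and unicodedata.combining(ch) != 0:
            clusters[-1] += ch
        else:
            clusters.append(ch)
    return clusters

# 'e' followed by COMBINING ACUTE ACCENT (U+0301), then 'a'
s = "e\u0301a"

utf8_len = len(s.encode("utf-8"))   # length as a UTF-8 byte string
codepoint_len = len(s)              # length in codepoints
grapheme_len = len(graphemes(s))    # length in graphemes (simplified)
```

For this string the byte count, codepoint count, and grapheme count all differ (4, 3, and 2), which is the disagreement about what "string length" and "character" should mean.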