
Re: Surrogates and character representation

This page is part of the web mail archives of SRFI 75 from before July 7th, 2015. The new archives for SRFI 75 contain all messages, not just those from before July 7th, 2015.



Alan Watson scripsit:

> Hmm. That would seem to prevent an implementation representing strings 
> internally using UTF-8. This is convenient in some contexts as Scheme 
> strings can be trivially converted to UTF-8 C strings.

Not at all.  There is a well-defined UTF-8 encoding for every Unicode
code point (which is not the case for UTF-16).  See Table 3-6 in
the Unicode Standard 4.0.
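
To illustrate the point, here is a minimal Python sketch that applies the UTF-8 bit-distribution pattern to an arbitrary code point, surrogates included. The function name `utf8_bytes` is hypothetical, chosen only for this example. It shows that the byte pattern is mechanically defined for every value from U+0000 to U+10FFFF. It does not settle whether a given encoder accepts such sequences: Python's strict `utf-8` codec, for instance, rejects surrogate code points unless the `surrogatepass` error handler is used.

```python
def utf8_bytes(cp: int) -> bytes:
    """Apply the UTF-8 bit-distribution pattern to one code point.

    Illustrative only: no special treatment of surrogates (U+D800..U+DFFF),
    which a strict UTF-8 encoder would refuse.
    """
    if not 0 <= cp <= 0x10FFFF:
        raise ValueError("not a Unicode code point")
    if cp < 0x80:
        # 0xxxxxxx
        return bytes([cp])
    if cp < 0x800:
        # 110xxxxx 10xxxxxx
        return bytes([0xC0 | (cp >> 6), 0x80 | (cp & 0x3F)])
    if cp < 0x10000:
        # 1110xxxx 10xxxxxx 10xxxxxx
        return bytes([
            0xE0 | (cp >> 12),
            0x80 | ((cp >> 6) & 0x3F),
            0x80 | (cp & 0x3F),
        ])
    # 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
    return bytes([
        0xF0 | (cp >> 18),
        0x80 | ((cp >> 12) & 0x3F),
        0x80 | ((cp >> 6) & 0x3F),
        0x80 | (cp & 0x3F),
    ])


# A lone high surrogate still maps to a definite three-byte pattern.
print(utf8_bytes(0xD800).hex())
```

By contrast, UTF-16 has no way to express a lone surrogate as a code point in its own right, since those code unit values are reserved for forming surrogate pairs.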

-- 
Here lies the Christian,                        John Cowan
        judge, and poet Peter,                  http://www.reutershealth.com
Who broke the laws of God                       http://www.ccil.org/~cowan
        and man and metre.                      jcowan@xxxxxxxxxxxxxxxxx