
Re: Surrogates and character representation

This page is part of the web mail archives of SRFI 75 from before July 7th, 2015. The new archives for SRFI 75 contain all messages, not just those from before July 7th, 2015.



> FWIW, I now think (after some talk on a private Unicode list) that it's
> correct to allow surrogates as Scheme characters; that is, the range of
> char->integer should be 0 to #x10FFFF.

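Hmm. That would seem to prevent an implementation from representing strings internally in UTF-8, since well-formed UTF-8 cannot encode the surrogate code points #xD800 to #xDFFF. A UTF-8 representation is convenient in some contexts, as Scheme strings can then be trivially converted to UTF-8 C strings.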
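
As a rough illustration of the concern (in Python rather than Scheme, only because its strict UTF-8 codec makes the point easy to check), the sketch below asks whether a given code point has a well-formed UTF-8 encoding. The helper name utf8_encodable is hypothetical and introduced purely for this example; it says nothing about how any particular Scheme implementation stores strings.

```python
def utf8_encodable(code_point: int) -> bool:
    """Return True if the code point has a well-formed UTF-8 encoding.

    Hypothetical helper for illustration; it relies on Python's strict
    UTF-8 codec, which rejects surrogate code points.
    """
    try:
        chr(code_point).encode("utf-8")
        return True
    except UnicodeEncodeError:
        return False


# Surrogate code points occupy U+D800..U+DFFF inclusive.
SURROGATES = range(0xD800, 0xE000)

# Ordinary scalar values across the whole range encode without trouble.
ordinary = [0x41, 0xE9, 0x20AC, 0x10FFFF]
print([utf8_encodable(cp) for cp in ordinary])

# No surrogate code point can be encoded.
print(any(utf8_encodable(cp) for cp in SURROGATES))
```

So if char->integer may return any value from 0 to #x10FFFF, a string can contain characters that a strictly UTF-8 internal representation has no way to store, unless the implementation adopts some extended, non-standard encoding for them.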

Regards,

Alan
--
Dr Alan Watson
Centro de Radioastronomía y Astrofísica
Universidad Nacional Autónoma de México