Re: Surrogates and character representation
Okay, thanks for clearing up my misunderstanding.
> but in general using UTF-8 as an internal representation is
> a bad idea.
Using UTF-8 internally for a Scheme on a Plan 9 system is not obviously
a bad idea. Sure, you don't have direct indexing, but you avoid
conversion when you talk to the C library and OS.
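To make the indexing cost concrete, here is a minimal sketch in Python (not Scheme, and not taken from any particular implementation) of what fetching the n-th character from UTF-8 data involves. The helper name utf8_char_at is hypothetical, and the sketch assumes the input is well-formed UTF-8.

```python
def utf8_char_at(buf: bytes, n: int) -> str:
    """Return the n-th code point (0-based) of well-formed UTF-8 data.

    Hypothetical helper for illustration: it walks the buffer from the
    start, so the cost grows with n instead of being constant.
    """
    if n < 0:
        raise IndexError(n)
    seen = -1
    for start, byte in enumerate(buf):
        # Bytes of the form 10xxxxxx continue a character; any other
        # byte begins a new one.
        if (byte & 0xC0) != 0x80:
            seen += 1
            if seen == n:
                end = start + 1
                while end < len(buf) and (buf[end] & 0xC0) == 0x80:
                    end += 1
                return buf[start:end].decode("utf-8")
    raise IndexError(n)


text = "año €5".encode("utf-8")
print(utf8_char_at(text, 1))  # the second character, two bytes long
print(utf8_char_at(text, 4))  # the euro sign, three bytes long
```

The scan is the price of the representation; the benefit is that the same bytes can be handed to the C library and OS unchanged.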
Using UTF-16 internally doesn't give you direct indexing because of
characters outside the BMP, but it might make sense on Windows boxes for
precisely the same reason.
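As a rough illustration of why UTF-16 loses direct indexing, the following Python sketch computes the 16-bit code units for a single code point. The function name utf16_code_units is made up for this example; the arithmetic is the standard surrogate-pair construction.

```python
def utf16_code_units(code_point: int) -> list:
    """Return the UTF-16 code units (as integers) for one code point."""
    if not 0 <= code_point <= 0x10FFFF or 0xD800 <= code_point <= 0xDFFF:
        raise ValueError("not a Unicode scalar value")
    if code_point < 0x10000:
        # Inside the BMP: one 16-bit unit, equal to the code point.
        return [code_point]
    # Outside the BMP: split the 20-bit offset into a surrogate pair.
    offset = code_point - 0x10000
    high = 0xD800 + (offset >> 10)
    low = 0xDC00 + (offset & 0x3FF)
    return [high, low]


print([hex(u) for u in utf16_code_units(0x00E9)])   # one unit
print([hex(u) for u in utf16_code_units(0x1D11E)])  # two units
```

Because some characters take one unit and others take two, the k-th unit of a UTF-16 string is not in general the k-th character.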
Using UTF-32 internally in these cases would involve translation to talk
to the library and OS and would further make my emacs use about four
times as much memory as it does now (which brings us back to the
representation for infinity).
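The factor of four is easy to check for mostly-ASCII text. This small Python sketch compares encoded sizes under a fixed four-bytes-per-code-point encoding (UTF-32) and the variable-width ones; the helper encoded_sizes and the sample string are assumptions chosen for illustration, and real editor buffers carry other overhead as well.

```python
def encoded_sizes(text: str) -> dict:
    """Return the byte length of text in three Unicode encoding forms."""
    # The "-le" codecs are used so that no byte order mark is counted.
    return {
        name: len(text.encode(name))
        for name in ("utf-8", "utf-16-le", "utf-32-le")
    }


sizes = encoded_sizes("(define (square x) (* x x))")
for name, size in sizes.items():
    print(name, size)
```

For pure ASCII input the UTF-32 size is exactly four times the UTF-8 size; the ratio shrinks as the text contains more characters that need multi-byte UTF-8 sequences.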
In general, any single representation is a bad idea in some circumstances.
Regards,
Alan
--
Dr Alan Watson
Centro de Radioastronomía y Astrofísica
Universidad Astronómico Nacional de México