
Re: Encodings.

This page is part of the web mail archives of SRFI 52 from before July 7th, 2015. The new archives for SRFI 52 contain all messages, not just those from before July 7th, 2015.



On Thu, Feb 12, 2004 at 09:33:12PM -0800, bear wrote:
> This is actually a very good point.
> 
> If we agree on a convention for writing and displaying these extended
> characters into identifiers, character constants, and strings using
> the portable character set, then we are able to write portable code
> using them.  If a particular implementation displays them as literal
> glyphs instead, or allows users with a particular keyboard to type
> them as literal glyphs, or is able to accept program text where they
> are "directly present" in the code rather than displayed using
> alternate means, I'd say it's a win.

I've heard that argument before -- it came up during the C++
standardization process. Beware, because the committee ended up making
choices just like what's being proposed here, and I don't think anybody
was truly happy with the result. They just couldn't agree on anything
better.

> I'd suggest therefore a means of extending the "named" characters --
> R5RS gives us #\space and #\newline but we ought to be able to add
> characters to the named set using some kind of binding construct, and
> thereafter refer to them by name in strings, character constants, and
> identifiers.

C++ works like that. You can write identifiers directly in Unicode, or
you can spell them with escape sequences ("universal character names"
such as \u00E9). The idea was that editors could hide the escapes from
you, or convert to and from them, but I'm not aware of any editors that
actually support that. It's a kludge, and people would much rather just
use characters than use this kind of markup.
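
To make the conversion concrete, here is a rough sketch of the kind of
round trip an editor would have to do between literal characters and the
C++ \uXXXX / \UXXXXXXXX spelling. The function names to_ucn and from_ucn
are made up for illustration and are not part of any standard or editor;
from_ucn assumes well-formed input and does no real validation.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <string>

// Hypothetical helper: spell a sequence of code points using only ASCII,
// writing each non-ASCII code point as \uXXXX (or \UXXXXXXXX above U+FFFF).
std::string to_ucn(const std::u32string& text) {
    std::string out;
    for (char32_t cp : text) {
        assert(cp <= 0x10FFFF);  // must lie inside the Unicode code space
        if (cp < 0x80) {
            out += static_cast<char>(cp);
        } else {
            char buf[11];  // "\U" + 8 hex digits + terminator
            if (cp <= 0xFFFF)
                std::snprintf(buf, sizeof buf, "\\u%04X", static_cast<unsigned>(cp));
            else
                std::snprintf(buf, sizeof buf, "\\U%08X", static_cast<unsigned>(cp));
            out += buf;
        }
    }
    return out;
}

// Hypothetical inverse: turn \uXXXX and \UXXXXXXXX back into code points.
// Assumption: the escapes are well formed; malformed hex is not diagnosed.
std::u32string from_ucn(const std::string& text) {
    std::u32string out;
    std::size_t i = 0;
    while (i < text.size()) {
        std::size_t digits = 0;
        if (text[i] == '\\' && i + 1 < text.size()) {
            if (text[i + 1] == 'u') digits = 4;
            else if (text[i + 1] == 'U') digits = 8;
        }
        if (digits != 0 && i + 2 + digits <= text.size()) {
            unsigned long cp = std::stoul(text.substr(i + 2, digits), nullptr, 16);
            out += static_cast<char32_t>(cp);
            i += 2 + digits;
        } else {
            // Plain character: copy it through unchanged.
            out += static_cast<char32_t>(static_cast<unsigned char>(text[i]));
            ++i;
        }
    }
    return out;
}
```

With this sketch, to_ucn applied to the four code points of "café" gives
the ASCII text caf\u00E9, and from_ucn turns that back into the original
code points. An editor would have to do this on every load and save to
hide the escapes.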

(I could have a biased impression of the user experience with this,
because I personally dislike it, just so you know.)
-- 
Bradd W. Szonye
http://www.szonye.com/bradd