This page is part of the web mail archives of SRFI 75 from before July 7th, 2015. The new archives for SRFI 75 contain all messages, not just those from before July 7th, 2015.
At Thu, 21 Jul 2005 15:45:34 -0700, Thomas Lord wrote:
> If CHARs are codepoints, more basic Unicode algorithms translate
> into Scheme cleanly.

I don't see what you mean. Can you provide an example?

> What is gained by forcing surrogates to be unrepresentable as CHAR?

Every string is representable in UTF-8, UTF-16, etc.

> What kind of code will I wind up with if I want to iterate over
> a large range of CHAR values?

Two loops: one from 0 to #xD7FF, and one from #xE000 to #x10FFFF.

> It's not as if by excluding surrogates we arrive at a CHAR definition
> that is significantly more "linguistic" than if we don't.

True, but we arrive at a definition that is more standards-friendly,
and that's part of the overall compromise.

FWIW: MzScheme originally supported a larger set of characters, mainly
because extra bits are available in my implementation. The resulting bad
experience convinced me to define characters in terms of scalar values,
instead.

Matthew
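
The two points in the reply, iterating over scalar values with two ranges and every string being encodable, can be sketched roughly as follows. This is an illustration in Python rather than Scheme and is not taken from the message or from MzScheme. The helper names `scalar_values`, `is_scalar_value`, and `encodes_in_utf8` are hypothetical. The sketch assumes Python 3, where the strict UTF-8 codec refuses to encode surrogate code points.

```python
from itertools import chain

# Boundaries from the Unicode code space; the surrogate block is the only gap.
SURROGATE_FIRST = 0xD800
SURROGATE_LAST = 0xDFFF
MAX_CODE_POINT = 0x10FFFF


def scalar_values():
    """Iterate over all Unicode scalar values in ascending order.

    This is the "two loops" from the message: 0..#xD7FF, then
    #xE000..#x10FFFF, chained into a single iterator.
    """
    return chain(
        range(0, SURROGATE_FIRST),
        range(SURROGATE_LAST + 1, MAX_CODE_POINT + 1),
    )


def is_scalar_value(n):
    """Return True if n is a code point outside the surrogate range."""
    in_code_space = 0 <= n <= MAX_CODE_POINT
    is_surrogate = SURROGATE_FIRST <= n <= SURROGATE_LAST
    return in_code_space and not is_surrogate


def encodes_in_utf8(n):
    """Return True if the code point n can be encoded by the strict UTF-8 codec."""
    try:
        chr(n).encode("utf-8")
        return True
    except UnicodeEncodeError:
        # Python 3 rejects lone surrogates here ("surrogates not allowed").
        return False
```

If characters are restricted to what `scalar_values()` yields, every character, and so every string built from them, passes `encodes_in_utf8`. A code point such as `0xD800` does not, which is the kind of unencodable string the scalar-value definition rules out.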