This page is part of the web mail archives of SRFI 75 from before July 7th, 2015. The new archives for SRFI 75 contain all messages, not just those from before July 7th, 2015.
bear <bear@xxxxxxxxx> writes:

> Particularly, some characters, particularly accented characters,
> have uppercase and lowercase versions which are different numbers of
> codepoints.  Thus, in the "codepoint equals character" model, one
> case is a character and the other case -- isn't.

I don't quite understand what you're saying: the locale-independent
case mappings in UnicodeData.txt always map a single scalar value to a
single scalar value.  Sure, it doesn't always do what your locale
thinks (as you point out), but this case mapping doesn't require
"multi-codepoint characters."

> Sixth, is there any way for a scheme implementation to support
> characters and strings in additional encodings different from
> unicode and not necessarily subsets of it, and remain compliant?

I don't think so, at least not in the way you envision.  I don't think
that's necessary or even a good idea, either.  This SRFI effectively
hijacks the char and string datatypes and says that the abstractions
for accessing them deal in Unicode.  Any representation that allows
you to do that---i.e. implement STRING-REF, CHAR->INTEGER, and
INTEGER->CHAR and so on in a way compatible with the SRFI---is fine,
but I believe you're thinking about representations where that's not
the case.

-- 
Cheers =8-} Mike
Friede, Völkerverständigung und überhaupt blabla
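
To make the point about representations concrete, here is a minimal sketch in Python (not Scheme, and not taken from the SRFI). It assumes a hypothetical `Utf8String` class that stores its contents as UTF-8 bytes, yet still answers analogues of STRING-REF, CHAR->INTEGER, and INTEGER->CHAR in terms of Unicode scalar values. All names here are invented for illustration.

```python
# Hypothetical sketch: the internal representation is UTF-8 bytes, but the
# accessors deal only in Unicode scalar values.

def integer_to_char(n: int) -> str:
    """Analogue of INTEGER->CHAR: accept only Unicode scalar values."""
    if not (0 <= n <= 0x10FFFF) or 0xD800 <= n <= 0xDFFF:
        raise ValueError(f"not a Unicode scalar value: {n:#x}")
    return chr(n)


def char_to_integer(c: str) -> int:
    """Analogue of CHAR->INTEGER: one character in, one scalar value out."""
    if len(c) != 1:
        raise ValueError("expected exactly one character")
    return ord(c)


def is_lead_byte(b: int) -> bool:
    """True for any byte that is not a UTF-8 continuation byte (10xxxxxx)."""
    return (b & 0xC0) != 0x80


class Utf8String:
    """Hypothetical string type backed by UTF-8 bytes."""

    def __init__(self, text: str):
        self._bytes = text.encode("utf-8")

    def string_length(self) -> int:
        # Length is counted in scalar values, not in bytes.
        return sum(1 for b in self._bytes if is_lead_byte(b))

    def string_ref(self, k: int) -> str:
        """Analogue of STRING-REF: return the k-th scalar value (O(n) scan)."""
        if k < 0:
            raise IndexError(k)
        index = -1
        start = None
        for pos, b in enumerate(self._bytes):
            if is_lead_byte(b):
                index += 1
                if index == k:
                    start = pos          # first byte of the k-th character
                elif index == k + 1:
                    return self._bytes[start:pos].decode("utf-8")
        if start is None:
            raise IndexError(k)
        return self._bytes[start:].decode("utf-8")  # k-th character is last
```

With this sketch, `Utf8String("aé").string_ref(1)` yields `"é"`, and `char_to_integer` maps it to `0xE9`, even though the character occupies two bytes internally. The caller never sees the encoding, which is the sense in which the representation is free to vary.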