This page is part of the web mail archives of SRFI 50 from before July 7th, 2015. The new archives for SRFI 50 are here. Eventually, the entire history will be moved there, including any new messages.
> From: tb@xxxxxxxxxx (Thomas Bushnell, BSG)

> Matthew Flatt <mflatt@xxxxxxxxxxx> writes:

> > * For Scheme characters, pick a specific encoding, probably one of
> >   UTF-16, UTF-32, UCS-2, or UCS-4 (but I don't know which is the right
> >   choice).

> Wrong. A Scheme character should be a codepoint. The representation
> of code points as sequences of bytes should be under the hood.

Misleading.

It isn't obvious that Scheme characters should be _Unicode_
codepoints. For (much) more inclusive definitions of "codepoint",
that characters should be codepoints is tautologically true.

There's a serious problem regarding Scheme and Unicode in that, for
any sane definition of "character" in Unicode, the character type in
R5RS is not sanely isomorphic.

I think that the best way to handle that in an FFI is to try to remain
agnostic about the range of the scheme CHAR? type when mapped into C.
I _guess_ that the error-signalling-on-range-error property of
SCHEME_EXTRACT_CHARACTER satisfies this but it could certainly be
rounded out and made more useful.

-t
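
To make the "signal an error on range error" idea concrete, here is a rough sketch in C. It is not the SRFI 50 definition of SCHEME_EXTRACT_CHARACTER. The names `sketch_scheme_char`, `sketch_status`, and `sketch_extract_char` are hypothetical, and representing a Scheme character as a wide unsigned integer is an assumption made only so the example is self-contained.

```c
#include <limits.h>
#include <stdint.h>

/* Hypothetical stand-in for a Scheme character as seen from C.
   Assumption: the implementation hands C some unsigned scalar whose
   range the FFI does not promise anything about. */
typedef uint_least32_t sketch_scheme_char;

/* Hypothetical status codes; a real FFI might raise a Scheme error instead. */
typedef enum {
    SKETCH_OK = 0,
    SKETCH_RANGE_ERROR = 1
} sketch_status;

/* Narrow a Scheme character to a C unsigned char.
   If the value does not fit, report a range error and leave *out
   untouched, rather than silently truncating. */
sketch_status sketch_extract_char(sketch_scheme_char sc, unsigned char *out)
{
    if (sc > UCHAR_MAX) {
        return SKETCH_RANGE_ERROR;
    }
    *out = (unsigned char)sc;
    return SKETCH_OK;
}
```

The caller never learns how large the Scheme character range is. It learns only whether this particular value fits the C type it asked for, which is one way to stay agnostic about the range.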