This page is part of the web mail archives of SRFI 75 from before July 7th, 2015. The new archives for SRFI 75 contain all messages, not just those from before July 7th, 2015.
On 5/26/06, Matthew Flatt <mflatt@xxxxxxxxxxx> wrote:
> Straightforward additions
> -------------------------
>
>  * `char-general-category', which accepts a character and returns
>    one of 'lu, 'll, ...
>
>  * `string-normalize-nfd', `string-normalize-nfkd',
>    `string-normalize-nfc', and `string-normalize-nfkc', which each
>    accept a string and produce its normalization according to normal
>    form D, KD, C, or KC, respectively.

I wouldn't consider these straightforward, because they take away a Scheme implementation's option to keep all strings internally in the same normalization form. For the extra work of making all string primitives construct and retain a single normalization form, you relieve the user of the burden of ever having to worry about normalization issues (although you then introduce round-trip issues with external string sources).

What about a not-necessarily-Unicode-specific STRING-NORMALIZE that simply converts to the implementation's preferred normal form, possibly returning the original string?

Regardless, since all of these procedures require large tables, can we assume they are part of a library and not part of the core language?

-- 
Alex
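
A rough illustration of the STRING-NORMALIZE idea above, written in Python rather than Scheme only so that it runs against a standard library with the Unicode tables built in. The names `string_normalize`, `char_general_category`, and `PREFERRED_FORM` are hypothetical, and the choice of NFC as the "preferred" form is an assumption made for the sketch, not something proposed in this thread.

```python
import unicodedata

# Assumption: this hypothetical implementation prefers NFC internally.
PREFERRED_FORM = "NFC"


def string_normalize(s: str) -> str:
    """Convert s to the implementation's preferred normal form.

    If s is already in that form, the original string object is
    returned unchanged, mirroring "possibly returning the original
    string" in the proposal.
    """
    normalized = unicodedata.normalize(PREFERRED_FORM, s)
    return s if normalized == s else normalized


def char_general_category(c: str) -> str:
    """Return the Unicode general category of c as a lowercase name,
    in the spirit of the symbols 'lu, 'll, ... from the quoted text."""
    return unicodedata.category(c).lower()


if __name__ == "__main__":
    composed = "caf\u00e9"      # e-acute as one precomposed code point
    decomposed = "cafe\u0301"   # 'e' followed by a combining acute accent
    print(string_normalize(composed) is composed)
    print(string_normalize(decomposed) == composed)
    print(char_general_category("A"), char_general_category("a"))
```

The caller never names a normal form here; that decision stays with the implementation, which is the point of the single-procedure variant. The round-trip caveat still applies: a decomposed string read from an external source does not come back out in its original form.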