
character strings versus byte strings

This page is part of the web mail archives of SRFI 50 from before July 7th, 2015. The new archives for SRFI 50 contain all messages, not just those from before July 7th, 2015.

This looks like an excellent start!

Some suggestions toward addressing the character-encoding issue:

 * Change the API to distinguish between byte strings and character
   strings. (I think C code is as likely to need one as the other).

 * Where "char *" is used for strings (e.g., "expected_explanation" for
   a type error), define it to be an ASCII or Latin-1 encoding (I
   prefer the latter).

 * For Scheme characters, pick a specific encoding, probably one of
   UTF-16, UTF-32, UCS-2, or UCS-4 (but I don't know which is the right
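
To make the first three suggestions concrete, here is a rough sketch in C of what separate byte-string and character-string types might look like. Every name below (example_byte_string, example_char_string, example_latin1_to_chars) is hypothetical and not taken from the SRFI 50 draft, and the choice of UTF-32 for characters is only an assumption made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical types for illustration; not part of the SRFI 50 draft. */

/* A byte string: raw octets with no encoding implied. */
typedef struct {
    const uint8_t *bytes;
    size_t length;          /* number of bytes */
} example_byte_string;

/* A character string: one code point per element (UTF-32 assumed here). */
typedef struct {
    const uint32_t *chars;
    size_t length;          /* number of characters */
} example_char_string;

/*
 * Widen a Latin-1 byte string into code points. Latin-1 byte values
 * coincide with the first 256 Unicode code points, so the conversion is a
 * plain widening copy. Returns the number of characters written, which is
 * at most out_capacity.
 */
size_t example_latin1_to_chars(example_byte_string in,
                               uint32_t *out, size_t out_capacity)
{
    assert(in.bytes != NULL || in.length == 0);
    assert(out != NULL || out_capacity == 0);

    size_t n = in.length < out_capacity ? in.length : out_capacity;
    for (size_t i = 0; i < n; i++)
        out[i] = in.bytes[i];
    return n;
}
```

With distinct types, a C function that wants raw octets and one that wants characters cannot be handed the wrong kind of string by accident, and "char *" messages defined as Latin-1 can be converted without a lookup table.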

An additional request:

 * Distinguish between mutable and immutable strings, particularly in
   checking argument types. (C code that intends to mutate an argument,
   for example, should require a mutable one and reject an immutable
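
A minimal sketch of the mutability check, again in C. The type example_string, its is_mutable flag, and the status codes are all hypothetical names chosen for this example; the draft may represent mutability quite differently.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical string representation carrying a mutability flag. */
typedef struct {
    uint32_t *chars;        /* code points (UTF-32 assumed) */
    size_t length;          /* number of characters */
    bool is_mutable;        /* false for literals and other constants */
} example_string;

typedef enum {
    EXAMPLE_OK,
    EXAMPLE_ERR_IMMUTABLE,  /* argument was an immutable string */
    EXAMPLE_ERR_RANGE       /* index out of bounds */
} example_status;

/*
 * Store c at position index. The mutability check comes first, so an
 * immutable argument is rejected as a type error before anything is
 * written.
 */
example_status example_string_set(example_string *s, size_t index, uint32_t c)
{
    if (!s->is_mutable)
        return EXAMPLE_ERR_IMMUTABLE;
    if (index >= s->length)
        return EXAMPLE_ERR_RANGE;
    s->chars[index] = c;
    return EXAMPLE_OK;
}
```

Code that only reads a string would skip the is_mutable test and accept either kind.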