
character strings versus byte strings

This page is part of the web mail archives of SRFI 50 from before July 7th, 2015. The new archives for SRFI 50 contain all messages, not just those from before July 7th, 2015.

To: srfi-50@xxxxxxxxxxxxxxxxx
Subject: character strings versus byte strings
From: Matthew Flatt <mflatt@xxxxxxxxxxx>
Date: Mon, 22 Dec 2003 07:16:25 -0700
Delivered-to: srfi-50@xxxxxxxxxxxxxxxxx

This looks like an excellent start!

Some suggestions toward addressing the character-encoding issue:

 * Change the API to distinguish between byte strings and character
   strings. (I think C code is as likely to need one as the other.)

 * Where "char *" is used for strings (e.g., "expected_explanation" for
   a type error), define it to be an ASCII or Latin-1 encoding (I
   prefer the latter).

 * For Scheme characters, pick a specific encoding, probably one of
   UTF-16, UTF-32, UCS-2, or UCS-4 (but I don't know which is the right
   choice).
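
To make the first two suggestions concrete, here is a minimal sketch in C.
It assumes UTF-32 for Scheme characters purely for illustration (the choice
is left open above), and every name in it (byte_string, char_string,
latin1_to_utf32) is hypothetical rather than part of the SRFI 50 draft. It
shows two distinct C types for the two kinds of string, plus the decoding
of a Latin-1 "char *" into code points, which works byte for byte because
Latin-1 occupies the first 256 code points of Unicode.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical types, not taken from the SRFI 50 draft.
   A byte string holds raw octets with no encoding attached. */
typedef struct {
    const uint8_t *bytes;
    size_t len;
} byte_string;

/* A character string holds one code point per element
   (UTF-32 is assumed here only for the sake of the sketch). */
typedef struct {
    uint32_t *chars;
    size_t len;
} char_string;

/* Decode a NUL-terminated Latin-1 "char *" into code points.
   Each Latin-1 byte maps to the code point with the same value.
   Writes at most dst_cap code points and returns the number written. */
size_t latin1_to_utf32(const char *src, uint32_t *dst, size_t dst_cap)
{
    size_t n = 0;
    assert(src != NULL && dst != NULL);
    while (src[n] != '\0' && n < dst_cap) {
        /* Go through unsigned char so bytes >= 0x80 do not sign-extend. */
        dst[n] = (uint32_t)(unsigned char)src[n];
        n++;
    }
    return n;
}
```

Because byte_string and char_string are separate struct types, a C
compiler rejects passing one where the other is expected, which is the
kind of distinction the first suggestion asks the API to make.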

An additional request:

 * Distinguish between mutable and immutable strings, particularly in
   checking argument types. (C code that intends to mutate an argument,
   for example, should require a mutable one and reject an immutable
   one.)
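
A minimal sketch of what such a check might look like on the C side. The
scheme_string struct, its is_mutable flag, and string_set are hypothetical
names invented for this illustration; the SRFI 50 draft does not define
them, and a real implementation would more likely signal a Scheme type
error than return a boolean.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical string representation carrying a mutability flag. */
typedef struct {
    uint32_t *chars;
    size_t len;
    bool is_mutable;
} scheme_string;

/* Store code point c at index i.
   Returns false, leaving the string untouched, if the string is
   immutable or the index is out of range; returns true otherwise. */
bool string_set(scheme_string *s, size_t i, uint32_t c)
{
    if (s == NULL || !s->is_mutable || i >= s->len) {
        return false;
    }
    s->chars[i] = c;
    return true;
}
```

The point of the design is that the mutability test sits in the
argument check itself, so C code that writes to a string cannot
silently modify one that Scheme code assumed was constant.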

Matthew

Follow-Ups:
- Re: character strings versus byte strings
  - From: Per Bothner
- Re: character strings versus byte strings
  - From: Thomas Bushnell, BSG
- Re: character strings versus byte strings
  - From: Michael Sperber
