text processes vs. string procedures



I agree with almost all of Sergei's msg.

- Basic string procs should *not* require textual well-formedness in a Unicode
  world. A string full of accents and umlauts and cedillas with no preceding
  base or start character is still a legal string.

- Full Unicode support will certainly require other procedures not in the 
  SRFI-13 spec. Sergei's examples of canonical & compatibility decomposition
  and composition are good ones. These should go in a Unicode-specific
  library, which is not the goal of SRFI-13.

- We also certainly need to do a new char library. Or perhaps a pair of them:
  one generic one, and one for Unicode-specific things. 

- However, I think case-mapping and string-comparison are basic things, and
  they can be given a generic, portable definition independent of the
  underlying character encoding. Case-mapping does *not* require strings to be
  well-formed text. ASCII, Latin-1 and Unicode all provide clear,
  language-independent definitions of this operation.

  I don't want the string library to be minimal. I want it to be useful.
  People -- many of whom currently program with Latin-1 or ASCII Schemes --
  case-map and compare strings frequently. These operations can be provided
  with an API which is portable across ASCII, Latin-1 and Unicode. So there's
  no barrier here.
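
To make the two claims above concrete, here is a rough sketch. It is written
in Python purely as a stand-in, not in Scheme, and the names `upcase` and
`ci_equal` are hypothetical helpers invented for this illustration; they are
not SRFI-13 procedures. It assumes Python's built-in Unicode case mapping is
an acceptable model of a "generic" case-map: a string of bare combining marks
is accepted as a legal string, and case-mapping and case-insensitive
comparison work without any well-formedness check.

```python
import unicodedata

# Combining acute, diaeresis and cedilla with no preceding base character.
# Not well-formed text, but still a legal string of three code points.
marks = "\u0301\u0308\u0327"


def upcase(s: str) -> str:
    """Hypothetical helper: case-map a string with no well-formedness check."""
    return s.upper()


def ci_equal(a: str, b: str) -> bool:
    """Hypothetical helper: compare two strings, ignoring case."""
    return a.casefold() == b.casefold()


# Every character in `marks` is a combining mark (nonzero combining class).
all_combining = all(unicodedata.combining(c) != 0 for c in marks)

# Case-mapping leaves the bare marks alone and maps only cased characters.
mapped_marks = upcase(marks)
mapped_text = upcase("e" + marks)

# The same calls serve ASCII-only and Latin-1-range input.
ascii_same = ci_equal("scheme", "SCHEME")
latin1_same = ci_equal("caf\u00e9", "CAF\u00c9")
```

Nothing here asks which encoding sits underneath, which is the sense in which
the API can be portable across ASCII, Latin-1 and Unicode.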