
Re: Encodings.



Back to dumb questions.

I assume that it is useful to distinguish the two goals of
	extending programming language identifiers
and	processing Unicode data.

For identifiers, either we have EQ?-preserving literals or "literalization of 
bits" (i.e. string preservation).

So w.r.t. identifiers, why is normalization needed at all? To my mind, 
normalization is a library procedure (set of procedures) for dealing with 
Unicode data/codepoints.

Defining valid identifier syntax, together with case folding of (unnormalized) 
identifier literals, should be sufficient.

What am I missing?
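
One case that seems worth testing against this assumption is an identifier 
that has two canonically equivalent spellings. The sketch below is in Python 
rather than Scheme, only because its standard unicodedata module makes the 
comparison easy to run. The helper names fold_only and fold_and_normalize 
are hypothetical, and NFC is an assumed choice of normalization form, not 
something any proposal here prescribes.

```python
import unicodedata

# Two spellings of the same visible identifier.
precomposed = "caf\u00e9"   # ends in U+00E9 LATIN SMALL LETTER E WITH ACUTE
decomposed = "cafe\u0301"   # ends in "e" + U+0301 COMBINING ACUTE ACCENT


def fold_only(identifier):
    """Case-fold an identifier literal without normalizing it."""
    return identifier.casefold()


def fold_and_normalize(identifier):
    """Case-fold, then apply NFC (an assumed choice of normalization form)."""
    return unicodedata.normalize("NFC", identifier.casefold())


# Case folding alone leaves the two code point sequences different.
same_by_folding = fold_only(precomposed) == fold_only(decomposed)

# After NFC the two spellings become the same sequence.
same_by_nfc = fold_and_normalize(precomposed) == fold_and_normalize(decomposed)

print(same_by_folding, same_by_nfc)
```

If identifiers are compared as folded but unnormalized strings, these two 
spellings name different symbols even though they display identically.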

============
Another note.  Characters are currently dealt with in a fairly abstract 
manner.  It would seem that in dealing with Unicode data as binary data 
(codepoints), R6RS/SRFI/... must define a binary IO API.
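
As a rough illustration of the split between a binary layer and a character 
layer, here is a minimal Python sketch. The function read_code_points is a 
hypothetical name, io.BytesIO stands in for a binary input port, and UTF-8 is 
an assumed encoding. None of this is a proposed R6RS or SRFI interface.

```python
import io


def read_code_points(port, encoding="utf-8"):
    """Read all bytes from a binary port and decode them to code points."""
    raw = port.read()                       # binary layer: bytes only
    return [ord(ch) for ch in raw.decode(encoding)]  # decoding is explicit


# The bytes CE BB are the UTF-8 encoding of U+03BB GREEK SMALL LETTER LAMDA.
port = io.BytesIO(b"\xce\xbb x")
code_points = read_code_points(port)
print([hex(cp) for cp in code_points])
```

The point of the sketch is only that the encoding is named at the boundary 
between the two layers instead of being implied by the character type.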

============
$0.02,
-KenD