Normalization vs. grapheme clusters
The Unicode old farts^W^Wrespected elders have confirmed that if you segment
a Unicode string by grapheme clusters as officially defined by Unicode,
then normalization forms C and D will not change the segmentation; that is, any change
will be within grapheme clusters and not across their boundaries.
This does not hold for normalization forms KC and KD, which replace characters
that have compatibility decompositions with those decompositions, and so can
turn one cluster into several.
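
A rough way to see this from Python, offered as a sketch rather than a
conforming implementation: simple_clusters() below is a hypothetical helper
that only approximates grapheme clusters as "a character plus any combining
marks that follow it". The official definition (UAX #29) covers more cases,
such as Hangul jamo sequences, which this approximation gets wrong. Only the
standard unicodedata module is assumed.

```python
import unicodedata


def simple_clusters(text):
    """Approximate grapheme clusters (NOT the full UAX #29 rules):
    each cluster is one character plus the combining marks after it."""
    clusters = []
    for ch in text:
        # General categories Mn, Mc and Me all start with "M".
        if clusters and unicodedata.category(ch).startswith("M"):
            clusters[-1] += ch
        else:
            clusters.append(ch)
    return clusters


def boundaries_preserved(text, form):
    """True if normalizing the whole string gives the same clusters as
    normalizing each (approximate) cluster on its own."""
    per_cluster = [unicodedata.normalize(form, c) for c in simple_clusters(text)]
    whole = simple_clusters(unicodedata.normalize(form, text))
    return whole == per_cluster


# Mixed precomposed and decomposed accents, plus marks that get reordered.
sample = "re\u0301sume\u0301 \u00e5ngstro\u0308m q\u0307\u0323"

for form in ("NFC", "NFD"):
    print(form, boundaries_preserved(sample, form))

# U+FB01 LATIN SMALL LIGATURE FI is one cluster, but its compatibility
# decomposition is the two letters "f" and "i", hence two clusters.
for form in ("NFKC", "NFKD"):
    print(form, boundaries_preserved("\ufb01", form))
```

Under this approximation the C and D checks should come out True for the
sample and the KC and KD checks False for the ligature. For real text a proper
UAX #29 segmenter should be substituted for simple_clusters().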
John Cowan www.ccil.org/~cowan www.reutershealth.com jcowan@xxxxxxxxxxxxxxxxx
[T]here is a Darwinian explanation for the refusal to accept Darwin.
Given the very pessimistic conclusions about moral purpose to which his
theory drives us, and given the importance of a sense of moral purpose
in helping us cope with life, a refusal to believe Darwin's theory may
have important survival value. --Ian Johnston