This page is part of the web mail archives of SRFI 91 from before July 7th, 2015. The new archives for SRFI 91 contain all messages, not just those from before July 7th, 2015.
Per Bothner <firstname.lastname@example.org> writes:

> You need keyboard *events*, for input, but they're obviously quite
> different from characters. (They have modifier bits, plus maybe
> separate up/down events.)

Right. A text editor needs an input mechanism and a mapping from that mechanism to characters. It does not care about code points, except in the case that the input method happens to be similar to code points.

> You need to store and modify text data in *buffers*, but there is no
> need for characters as a separate data type. You do need functions to
> "move to the next character/word/line/paragraph", but again these work
> in characters in a buffer, not individual characters.

Except that text is an assemblage of characters, not of code points. The editor needs functions like "display this character", "move to next character", "tell the user what character this is", and even "convert this character to some standard interchange format".

> You need be able to display the text, which involves searching for
> glyphs in fonts, possibly performing kerning, line-breaking, etc.
> Again, a character data type will probably be either too high-level
> or too low-level. You certainly don't want to have it built-in to
> your programming language, but into your display software.

What I want is a *character* type for a text editor. What is *certainly* useless is a "code point" type. What is perfectly perverse is taking a code point type and *labelling* it character.

If you don't want a character data type, then you don't have to have one as far as I'm concerned. But those of us who do want one, can we please call it "character"? And you, with your preferred "code point" type (which *is* useful for some applications, namely, those which are unicode-specific) can have that, and please call it "code-point".

A text editor need not deal with encodings *at all*. Think of it: the keyboard driver provides characters to the text editor. Real, full-fledged, characters. And the text editor asks the display widget to display a character in a particular font and context (since some characters have different glyphs). But never does it really care about encodings.

It is only an editor like emacs that must spend all this energy on encodings, because it is really a byte editor being torqued into use as a text editor.

Think of it this way: an editor should not even *care* what the underlying encodings are for characters; it should be entirely irrelevant.

Thomas
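
To illustrate the distinction between a character and a code point discussed above, here is a minimal Python sketch. It is only an illustration under stated assumptions: the helper names `code_points`, `same_character`, and `next_character_index` are hypothetical, and treating "a base code point plus any combining marks that follow it" as one character is a simplification of full Unicode grapheme segmentation.

```python
import unicodedata

# Two spellings of the same user-perceived character "e with acute accent".
precomposed = "\u00e9"   # one code point: LATIN SMALL LETTER E WITH ACUTE
decomposed = "e\u0301"   # two code points: "e" + COMBINING ACUTE ACCENT


def code_points(text):
    """List the code points of text in U+XXXX notation."""
    return [f"U+{ord(c):04X}" for c in text]


def same_character(a, b):
    """Compare two strings after NFC normalization (canonical equivalence)."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)


def next_character_index(text, i):
    """Return the index just past the character starting at index i.

    Simplifying assumption: a character is one code point followed by
    zero or more combining marks (nonzero canonical combining class).
    """
    i += 1
    while i < len(text) and unicodedata.combining(text[i]) != 0:
        i += 1
    return i
```

With these definitions, `len(decomposed)` is 2 while `same_character(precomposed, decomposed)` is true, and `next_character_index("e\u0301x", 0)` returns 2, so a "move to next character" command steps over the accent together with its base letter rather than stopping between the two code points.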