
Re: Why are byte ports "ports" as such?

This page is part of the web mail archives of SRFI 91 from before July 7th, 2015. The new archives for SRFI 91 contain all messages, not just those from before July 7th, 2015.



Jonathan S. Shapiro wrote:
On Sun, 2006-05-21 at 17:54 +0200, Marcin 'Qrczak' Kowalczyk wrote:
"Jonathan S. Shapiro" <shap@eros-os.org> writes:

  1. The correct primitive is READ-CODEPOINT, not READ-CHAR.
     READ-CHAR is a library routine.
Unless char is defined to be a code point. Which is IMHO the most
reasonable choice: code points are the natural atomic units of Unicode
text, and most Unicode algorithms are expressed in terms of code points.

In many respects I agree that this would be sensible from the
programmer's perspective.

Unfortunately it is quite wrong, which is something the Unicode
people go to great lengths to make clear (and, IMO, a serious failing of
Unicode).

I don't think that is relevant.  The point is that the Scheme concept of
character is most sensibly mapped to the Unicode concept of codepoint.
We should stay far away from attempting any kind of data type that tries
to model character combinations - for that people should use strings.
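To illustrate the point, here is a small sketch (in Python, whose strings are likewise sequences of code points) of why character combinations are naturally handled at the string level, without any special combined-character data type:

```python
import unicodedata

# "é" as a single precomposed code point (U+00E9)
precomposed = "\u00e9"

# The same visible character as a base letter plus a combining
# accent (U+0301) - two code points, one user-perceived character.
combining = "e\u0301"

# Each code point is one atomic unit of the string.
print(len(precomposed))   # 1
print(len(combining))     # 2

# Questions about combined forms are answered by string-level
# operations such as normalization, not by a richer character type.
print(unicodedata.normalize("NFC", combining) == precomposed)  # True
```

Under this model a character (code point) read by `read-char` stays atomic, and combining behavior is an interpretation applied to whole strings.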

Hence, there is no need to use the name "read-codepoint" - "read-char"
is just fine, since it reads a Scheme character, which happens to be
the same as a Unicode codepoint.
--
	--Per Bothner
per@bothner.com   http://per.bothner.com/