
Re: Identifiers

This page is part of the web mail archives of SRFI 52 from before July 7th, 2015. The new archives for SRFI 52 contain all messages, not just those from before July 7th, 2015.

To: srfi-52@xxxxxxxxxxxxxxxxx
Subject: Re: Identifiers
From: "Bradd W. Szonye" <bradd+srfi@xxxxxxxxxx>
Date: Thu, 12 Feb 2004 22:39:24 -0800
Delivered-to: srfi-52@xxxxxxxxxxxxxxxxx
In-reply-to: <Pine.LNX.4.58.0402112211240.11073@xxxxxxxxxxxxxx>
Mail-followup-to: srfi-52@xxxxxxxxxxxxxxxxx
References: <200402102106.NAA13325@xxxxxxxxxxxxxxxxxxxxxxx> <87ekt1avo8.wl@xxxxxxxxxxxxxxxxxxxxx> <20040212024256.GA7434@xxxxxxxxxxxxxxx> <Pine.LNX.4.58.0402112211240.11073@xxxxxxxxxxxxxx>
User-agent: Mutt/1.4.1i

bear wrote:
> There are some appropriate restrictions [on codepoints in
> identifiers], I think; identifiers should not begin with:
> 
>  * a combining character
>  * a non-character codepoint
>  * a whitespace character
>  * a control character
>  * characters which can begin syntactically valid numbers
>       (digits, sign, point)
>  * a delimiter (parens, at least)

Agreed. (The fifth point, symbol/number ambiguity, isn't too hard to deal
with, and it's a popular extension to allow ids like "1+".)
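
A rough sketch of how the first-character rules above might be checked,
written in Python purely for illustration. Every name here
(may_begin_identifier, is_identifier_relaxed, and so on) is hypothetical,
the delimiter set is only the parentheses bear mentions, and Python's
float() is an assumed stand-in for a real Scheme number parser. The relaxed
check looks only at the start of the token; it does not validate the
remaining characters.

```python
import unicodedata

DELIMITERS = set("()")        # assumption: "parens, at least"
NUMBER_STARTERS = set("+-.")  # sign and point; digits are tested by category


def is_noncharacter(ch):
    """True for the Unicode noncharacter codepoints."""
    cp = ord(ch)
    return 0xFDD0 <= cp <= 0xFDEF or (cp & 0xFFFE) == 0xFFFE


def may_begin_identifier(ch):
    """Apply the six 'should not begin with' rules to one character."""
    cat = unicodedata.category(ch)
    if cat.startswith("M"):                  # combining marks (Mn, Mc, Me)
        return False
    if is_noncharacter(ch):
        return False
    if ch.isspace() or cat.startswith("Z"):  # whitespace and separators
        return False
    if cat == "Cc":                          # control characters
        return False
    if cat == "Nd" or ch in NUMBER_STARTERS: # could begin a number
        return False
    if ch in DELIMITERS:
        return False
    return True


def looks_like_number(token):
    """Assumed stand-in for a Scheme number parser."""
    try:
        float(token)
        return True
    except ValueError:
        return False


def is_identifier_relaxed(token):
    """The '1+' extension: a token that starts like a number but does not
    parse as one is treated as an identifier."""
    if not token:
        return False
    first = token[0]
    if may_begin_identifier(first):
        return True
    number_like = unicodedata.category(first) == "Nd" or first in NUMBER_STARTERS
    return number_like and not looks_like_number(token)
```

With this sketch, "1+" is accepted by the relaxed check while "12" and
"-5" are still read as numbers.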

> Identifiers should not contain:
>   * whitespace
>   * delimiters
>   * non-character codepoints
>   * control characters
>   * invalid sequences

Agreed.
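
As an illustration only, the "should not contain" list could be checked
along these lines. The sketch assumes the identifier arrives as UTF-8
bytes, so that "invalid sequences" can be read as "fails strict decoding";
other encodings would need their own notion of validity. The function
names and the delimiter set are hypothetical.

```python
import unicodedata

DELIMITERS = set("()")  # assumption: only parentheses count as delimiters


def forbidden_anywhere(ch):
    """True if the character may not appear anywhere in an identifier."""
    cp = ord(ch)
    cat = unicodedata.category(ch)
    return (
        ch.isspace() or cat.startswith("Z")                 # whitespace
        or ch in DELIMITERS                                 # delimiters
        or 0xFDD0 <= cp <= 0xFDEF or (cp & 0xFFFE) == 0xFFFE  # noncharacters
        or cat == "Cc"                                      # control characters
    )


def valid_identifier_body(raw):
    """Check raw UTF-8 bytes against the 'should not contain' rules."""
    try:
        text = raw.decode("utf-8")  # strict decoding rejects invalid sequences
    except UnicodeDecodeError:
        return False
    return bool(text) and not any(forbidden_anywhere(ch) for ch in text)
```

For example, valid_identifier_body(b"list->vector") passes, while b"a b",
b"a(b", and the malformed byte string b"\xff\xfe" are all rejected.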

> The minimum requirement for case insensitivity as defined by
> R5RS gives another rule:
> 
>   * no character in an identifier ought to be automatically
>     converted to the implementation's preferred case (and no
>     identifier differing only by that character versus another
>     ought to be considered the same identifier)  unless it is
>     part of a one-to-one reciprocal pair of upper and lower case
>     characters as identified by char-upcase, char-downcase, and
>     char-ci=?.   This finally is the property that is required
>     for the char-alphabetic? characters in the portable character
>     set: R5RS does not say so specifically but it is not possible
>     to comply with R5RS without meeting this requirement.

Hm. Makes sense.
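
To make the "one-to-one reciprocal pair" condition concrete, here is a
small sketch. It uses Python's str.upper() and str.lower() as assumed
stand-ins for char-upcase and char-downcase; they are not the same
functions (Python applies full case mappings, so one character can map to
several), and the helper names are hypothetical.

```python
def is_reciprocal_case_pair_member(ch):
    """True if ch belongs to an upper/lower pair whose two members map
    to each other and to nothing else."""
    up, low = ch.upper(), ch.lower()
    if len(up) != 1 or len(low) != 1:
        return False  # case mapping expands to more than one character
    if up == low:
        return False  # caseless character
    return up.lower() == low and low.upper() == up and ch in (up, low)


def fold_identifier(name):
    """Fold only characters in reciprocal pairs; leave the rest untouched."""
    return "".join(
        ch.lower() if is_reciprocal_case_pair_member(ch) else ch
        for ch in name
    )
```

Under these assumptions "a" and "A" fold together, but U+00DF (sharp s)
and U+0131 (dotless i) are left alone, so an identifier containing one of
them is never merged with a look-alike spelling.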

> Note that R5RS permits 'rules raping' in terms of this requirement;
> An implementation of R5RS is fairly easy if no characters other than
> a ... z and A ... Z are case-folded in case insensitive identifiers
> and char-alphabetic? returns #t for only those characters.

Heh. I don't think that would be desirable.
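
For reference, the minimal approach bear describes would amount to
something like the following sketch (Python for illustration; the names
mirror the R5RS procedures but are hypothetical):

```python
import string


def char_alphabetic(ch):
    """Minimal reading: only a..z and A..Z count as alphabetic."""
    return ch in string.ascii_letters


def fold_ascii_only(name):
    """Case-fold only the ASCII letters; every other character is kept."""
    return "".join(ch.lower() if char_alphabetic(ch) else ch for ch in name)
```

Here "Foo-Bar" folds to "foo-bar", but any non-ASCII letter stays
case-sensitive and is not considered alphabetic at all.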
-- 
Bradd W. Szonye
http://www.szonye.com/bradd
