
Re: #\a octothorpe syntax vs SRFI 10

This page is part of the web mail archives of SRFI 58 from before July 7th, 2015. The new archives for SRFI 58 contain all messages, not just those from before July 7th, 2015.



 | Date: Sun, 26 Dec 2004 23:14:00 -0800 (PST)
 | From: campbell@xxxxxxxxxxxxxxxxxxxxxxxxxxx
 | 
 | On Sun, 26 Dec 2004, Aubrey Jaffer wrote:
 | 
 | > Arrays are a fundamental data organizing paradigm from the origins of
 | > computing; FORTRAN has arrays; APL has arrays.  I hope arrays will
 | > become part of Scheme in R6RS.  For a construct which generalizes two
 | > of Scheme's three aggregate data types, a succinct read-syntax does
 | > not seem overly burdensome.
 | 
 | Need it be so succinct as to add eleven new octothorpe reader macros,
 | each dispatching further for the large number of different types of
 | arrays?  It would be much simpler, I think, and it would not lose much
 | brevity, to use SRFI 10; indeed, SRFI 10 was designed in response to
 | this issue as it arose in SRFI 4.
 | 
 | >  | In particular, I suggest that it be:
 | >  | 
 | >  |   #,(ARRAY [<rank>] <type> <elements> ...)
 | > 
 | > Rank cannot be deduced from <element> nesting for heterogeneous
 | > arrays.  I suggest that <rank> be required.
 | 
 | Sorry, I was not sufficiently clear there.  I meant to specify that the
 | rank defaults to 1, like #Axxx(...) in the current proposal.

In the updated srfi-58.html I sent to the editor I have eliminated the
#Axxx syntax.  The rank digit(s) will be required.

 | >  | So, for example, the two-by-two array of unsigned 16-bit integers from
 | >  | the document might be written as #,(ARRAY 2 u16 (0 1) (2 3)).
 | >  | General object arrays' types would be OBJECT (so #(FOO 1 #T ())
 | >  | could also be written #,(ARRAY OBJECT FOO 1 #T ())) and character
 | >  | arrays' types would be CHAR (so "foo" could alternatively be
 | >  | written #,(ARRAY CHAR #\f #\o #\o)).
 | > 
 | > This appears to introduce type symbols like U16 and CHAR which are not
 | > part of srfi-47.  The prototype functions in srfi-47 return arrays.
 | > 
 | >  | [...]
 | > 
 | > I am not opposed to also having SRFI-10 syntax for arrays.  This would
 | > seem to require reserving a set of symbols for type specification,
 | > which is an unschemely way of doing things.  Scheme goes to some
 | > lengths to avoid using symbols as cookies; witness NULL? and
 | > EOF-OBJECT?
 | 
 | Perhaps I'm confused, but I don't see much difference between my usage
 | of symbols -- which exist only at read-time, never at run-time, unlike
 | nil and the EOF object -- and your usage of the suffixes of the new #A
 | syntax.  Could you elaborate on how my proposal is any worse in that
 | respect than yours?

That symbols-as-cookies have been kept out of Scheme to this point
probably means that some RRRS-author(s) are severely allergic to them.

I want arrays in R6RS.  I don't want to jeopardize arrays' chances by
making a proposal which looks like symbols-as-cookies, even if that is
not exactly true in a technical sense.

SRFI-10 mandates parentheses (e.g., #,(infinity) instead of #,infinity).
This makes SRFI-10 objects look like expressions to be evaluated.
SRFI-58 objects will be used as prototype array objects in calls to
MAKE-ARRAY:

(make-array '#1Ar64(1.0) 2 3)                   ; Current SRFI-58 syntax

(make-array '#,(Array 1 ar64 [1.0]) 2 3)        ; SRFI-10 style

(make-array '#,(Ar64 [1.0]) 2 3)                ; compact-SRFI-10 style.
                                                ; [] nesting gives rank.

(make-array    (Ar64 1.0) 2 3)                  ; Current SRFI-47 functions

    ==> #2Ar64((1.0 1.0 1.0) (1.0 1.0 1.0))
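
The shape of that result can be sketched in portable Scheme.  This is
only an illustration, assuming that MAKE-ARRAY fills the new array with
the prototype's element; NESTED-FILL is a hypothetical helper, not part
of SRFI-47 or SRFI-58, and it builds a plain nested list rather than an
array:

```scheme
;; Hypothetical helper, not part of SRFI 47 or SRFI 58: build the nested
;; list of elements that an array with dimensions DIMS would print with,
;; every element being the prototype's fill value.
(define (nested-fill fill dims)
  (if (null? dims)
      fill
      (let loop ((n (car dims)) (acc '()))
        (if (= n 0)
            acc
            (loop (- n 1)
                  (cons (nested-fill fill (cdr dims)) acc))))))

(nested-fill 1.0 '(2 3))   ; => ((1.0 1.0 1.0) (1.0 1.0 1.0))
```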

The SRFI-10 style above looks like symbols-as-cookies.  The
compact-SRFI-10 style does not.  Do you like the compact-SRFI-10
style, or would it take too much of SRFI-10's namespace?
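
To make the namespace question concrete, here is a toy model of a
table of reader constructors.  It is a sketch only: REGISTER-CTOR!,
EXPAND-CTOR, MAKE-PROTOTYPE and NESTING-DEPTH are hypothetical names,
not the SRFI-10 or SRFI-58 machinery, prototypes are stand-in tagged
lists, and ordinary parentheses stand in for the [] nesting:

```scheme
;; Toy table of reader constructors: an alist from tag symbol to procedure.
(define reader-ctors '())

(define (register-ctor! tag proc)
  (set! reader-ctors (cons (cons tag proc) reader-ctors)))

;; Apply the constructor registered for the head of a (tag arg ...) form,
;; as a reader would on seeing #,(tag arg ...).
(define (expand-ctor form)
  (let ((entry (assq (car form) reader-ctors)))
    (if entry
        (apply (cdr entry) (cdr form))
        (error "no reader constructor for" (car form)))))

;; Stand-in prototype: a tagged list, not a real array.
(define (make-prototype type rank elements)
  (list 'prototype type rank elements))

;; In the compact style the nesting depth of the elements gives the rank.
(define (nesting-depth x)
  (if (pair? x) (+ 1 (nesting-depth (car x))) 0))

;; SRFI-10 style: a single tag; the element type arrives as a symbol.
(register-ctor! 'array
  (lambda (rank type elements) (make-prototype type rank elements)))

;; Compact style: one tag per element type, so one table entry each.
(register-ctor! 'ar64
  (lambda (elements)
    (make-prototype 'ar64 (nesting-depth elements) elements)))

(expand-ctor '(array 1 ar64 (1.0)))   ; => (prototype ar64 1 (1.0))
(expand-ctor '(ar64 (1.0)))           ; => (prototype ar64 1 (1.0))
```

In this model the compact style needs one table entry per element
type, where the single-tag style needs only ARRAY plus a set of type
symbols.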

Having the read prefix use the same coding as the prototype functions
halves the (human) memory load.  If we move to nomenclature like
REAL-64, then I want prototype functions to be available with those
names:

(make-array '#,(Array 1 real-64 [0.0]) 2 3)     ; longer SRFI-10 style

(make-array '#,(real-64 [0.0]) 2 3)             ; longer compact-SRFI-10

(make-array    (real-64 0.0) 2 3)               ; analogous SRFI-47 function

 | >  | (I'd also prefer that the names be longer & much more descriptive, like
 | >  | UNSIGNED16 or BOOLEAN, but I suppose that's a little too late, now that
 | >  | SRFI 47 has already been finalized & the incomprehensible abbreviations
 | >  | of array types have been set into stone...)
 | > 
 | > SRFI-47 defines procedures to return prototype arrays.  Additional
 | > procedures can be added to alias the abbreviated ones.
 | 
 | This works for SRFI 47, but not necessarily this SRFI: one cannot
 | define one's own aliases for existing array types in the reader
 | syntax.

Yes.  That is why we are discussing this now, before SRFI-58 is
finalized.
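
The asymmetry is easy to see in code.  In this sketch AR64 is a
stand-in that returns a tagged list (a real system would supply the
SRFI-47 procedure), and REAL-64 is just the longer name floated
earlier in this message:

```scheme
;; Stand-in for the SRFI-47 prototype procedure; illustration only.
(define (ar64 . fill)
  (list 'prototype 'ar64 (if (null? fill) 0.0 (car fill))))

;; Aliasing a prototype procedure takes one ordinary definition.
(define real-64 ar64)

(equal? (real-64 1.0) (ar64 1.0))   ; => #t
```

Nothing comparable is available to the user for a read-syntax prefix,
which is fixed by whatever the SRFI specifies.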

 | > But explicitly complete descriptions for numeric types are rather
 | > long:
 | > 
 | > [...long list...]
 | > 
 | > These long names present more of a burden for the memories of
 | > non-English-speakers than the short names, which are the same for
 | > everyone.
 | 
 | I'm not suggesting names so long that they induce tedium in typists,
 | but rather names somewhat longer than are excessively obscure, such as
 | INTEGER-U16, COMPLEX-64, BIT, et cetera.

This requires users to internalize the assumptions that integers are
exact, and that reals and complexes are not.  Scheme has a strong
propensity for calling things exactly what they are; witness
CALL-WITH-CURRENT-CONTINUATION, EOF-OBJECT?, LIST?, and PAIR?.

 | Furthermore, the single-character mnemonics are derived from
 | English, and there is certainly the possibility that their names
 | would begin with different initial letters in other languages;
 | however, everything in Scheme is from English anyway, so I see
 | nothing wrong with using English words for array element type
 | names.

English doesn't much help in remembering Scheme exponent markers:

  The letters `s', `f', `d', and `l' specify the use of SHORT, SINGLE,
  DOUBLE, and LONG precision, respectively.

I don't usually think of a DOUBLE as shorter than a LONG.  And where
did `f' for SINGLE come from?  Maybe it is a C-ism.  In any case, it
is one of five characters (with `e') rather than one of five longer
sequences to remember.

 | > There is Scheme precedent for abbreviated names in identifiers
 | > like CADR and CDADAR and in the radix and exactness prefixes #B,
 | > #O, #D, #X, #E, #I.
 | 
 | ... A better analogue would be ARRAY-REF, but I haven't seen any
 | objections to that as opposed to AREF, and I much prefer ARRAY-REF
 | rather than AREF.

I am not opposed to longer names, but they must work together and they
must integrate well with Scheme.

 | Let me also point out here that much of Scheme's naming conventions
 | and lexemes originated from T.  In T, there was no built-in
 | facility for multi-dimensional arrays, but there were still object
 | representation names used by Orbit's representation analyzer and
 | for the C & Pascal FFIs.  These were named semi-verbosely, as I
 | suggest above; e.g., the representation descriptor of unsigned,
 | sixteen-bit integers was named REP/INTEGER-16-U.  Many of the names
 | in T were intended to be long enough to be understandable and not
 | obscure, but not so long as to be excessive; this has tended to
 | hold in Scheme as well.  I think it would be good to preserve that
 | in the array element type names as well.

I found my T2.7 manual, but it doesn't have FFIs in it.

If I come up with longer names and they aren't better than the current
system (used by SCM for many years), then I would be making a
straw-man.  Please replace the first column of this table with a set
of better names, so we can discuss this change in more concrete terms.

    prototype
    procedure exactness  element-type
    ========= =========  ============
    vector               any (conventional vector)
    ac64      inexact    64-bit+64-bit complex
    ac32      inexact    32-bit+32-bit complex
    ar64      inexact    64-bit real
    ar32      inexact    32-bit real
    as64      exact      64-bit signed integer
    as32      exact      32-bit signed integer
    as16      exact      16-bit signed integer
    as8       exact      8-bit signed integer
    au64      exact      64-bit unsigned integer
    au32      exact      32-bit unsigned integer
    au16      exact      16-bit unsigned integer
    au8       exact      8-bit unsigned integer
    string               char (string)
    at1                  boolean (bit-vector)