
Re: Comments on SRFI 69



On Thu, Aug 11, 2005 at 10:23:47AM -0400, David Van Horn wrote:
> The SRFI document states the following in the Abstract:
>    This SRFI specifies an API for basic hash tables. Hash tables are data
>    structures that provide a mapping from some set of keys to some set of
>    values associated to those keys.

I can't help feeling a little bit perplexed.  (1) The sentence following
your citation was part of the definition of what hash tables are.  (2) I
thought item (1) was obvious.  (3) If I had to write every definition in
a single sentence, imagine what the definition of "appropriate hash
function" would look like.

> From your description of what hash tables are, the name "hash table" seems 
> too specific of a term.  This issue was raised earlier by Marc Feeley, and 

Rather, on the contrary, it's too broad.  As you point out, not all
hash table types allow destructive update, for instance.  However, my
claim of "broad applicability" still holds, because one can use SRFI-69
hash tables as if they were immutable.
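For illustration, here is a minimal sketch of that style, using only
procedures SRFI 69 specifies (hash-table-copy, hash-table-set!); the
helper name hash-table-with is hypothetical, not part of the SRFI:

```scheme
;; "Functional update" layered on SRFI-69 mutable hash tables:
;; never mutate a table you have handed out; copy, then update the copy.
(define (hash-table-with table key value)
  ;; Return a fresh table mapping KEY to VALUE, leaving TABLE untouched.
  (let ((new (hash-table-copy table)))
    (hash-table-set! new key value)
    new))

(define t0 (make-hash-table))
(define t1 (hash-table-with t0 'answer 42))
;; t0 still has no binding for 'answer; t1 does.
```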

> as far as I can tell was never addressed in the document or discussion 

It was addressed in
http://srfi.schemers.org/srfi-69/mail-archive/msg00021.html
(part ...naming)

> list. Perhaps, I'm just missing it.  However, the more important issue here 
> is that what follows in the document is far from a basic API for structures 
> that provide a mapping from some set of keys to some set of values 
> associated to those keys.

Yes, yes.  The sentence you quoted is just an elaboration of this claim:
"hash tables are mappings".  But that's not all hash tables are.  SRFI
69 does not attempt to define a basic API for mappings, but a basic API
for hash tables.  The constraints mentioned in the SRFI -- complexity
constraints, the requirement for enumeration and updating functions --
constrain any implementation to be pretty close to what I think users
expect when they think they are using "hash tables".  The third sentence
of the abstract summarises it quite cleanly IMO.

Examples of implementation strategies that are ruled out because of
API/complexity requirements include alists, search trees, copy-on-write
hash tables, chained functions, and the like.  So SRFI 69 definitely is
not a definition of an API for mappings.
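To make the API/complexity claim concrete, here is a small sketch using
only procedures the SRFI specifies: amortised-constant-time update and
lookup, plus enumeration via hash-table-walk (the data in it is, of
course, just illustrative):

```scheme
;; Basic SRFI-69 usage: creation, update, lookup, enumeration.
(define ages (make-hash-table))           ; defaults to equal? + matching hash
(hash-table-set! ages "alice" 30)         ; expected amortised O(1)
(hash-table-set! ages "bob" 25)

(hash-table-ref ages "alice")             ; => 30
(hash-table-ref/default ages "carol" #f)  ; => #f, no error on a missing key

;; Enumeration -- one of the requirements that rules out e.g. opaque
;; chained functions as an implementation strategy:
(hash-table-walk ages
  (lambda (key value)
    (display key) (display " => ") (display value) (newline)))
```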

> Luckily this SRFI does state its aims, however there are conflicts with 
> what is stated.  Further, design choices have been made that violate these 
> aims, and rationales rarely appeal to the aims in a consistent manner.

I can't understand what you're referring to, unless you really take
"hash table" in the SRFI to refer to "mapping", ignoring the third
sentence of the abstract, which I find hard to believe.

>    The primary aim of this SRFI is to provide a simple and generic
>    hash table API that will answer most of users' needs for basic
>    usage of hash tables.
> This conflates two disparate and competing aims into one; that of providing 

You know, a wording like this means that the API will try to be all of:
simple, generic, and broadly applicable.  It does not claim that these
goals are the same; rather, it means that the SRFI aims at fulfilling
all of them simultaneously (with compromises, if need be).

> patterns may be).  The abstract makes no mention of most common hash table 
> usage, so this second aim seems out of the scope of this SRFI.  But several 

If you want to give examples of typical hash table usage, I can add them
into the abstract, no problem.  But did somebody perhaps complain that
what lists are is not articulated clearly enough in SRFI 1 because the
abstract does not have examples of most common list usage?  I think most
common hash table usage is approximately as difficult to define as most
common list usage.

> choices are made contrary to the aim of simplicity and generality, such as 
> the ad-hoc collection of type-specialized hash table procedures and their 
> hash function counterparts.  Appeals for generality in the API have been 
> discounted by the author saying such things as, "I'd rather define these 
> routines to account for the most common situation(s) and be done with it."

And now I can't help feeling a little bit offended.  You ignore almost
half of the abstract, then proceed to claim that one clause in my
purpose statement is void because it does not have a supporting
definition in the abstract, _then_ use this as a rationale why the API
does not meet the purpose statement.  _And_, you ignore the fact that
simplicity and generality are still conflicting goals, and say that
appeals for generality have been "discounted" because I sometimes choose
simplicity over generality.  Moreover, you seem to be ignoring that I'm
trying to make a generic hash table API, not a generic mapping API.

Might you have a suggestion as to how the SRFI could be improved?  Add a
clause to the abstract that more precisely specifies what hash tables
are?  But it's there already...

> On the other hand, there is widespread use of immutable hash tables, tables 
> with weakly held keys, concurrency, and GC-sensitive tables, but this SRFI 
> addresses none of those common usages or the issues that arise in their 
> presence, and is therefore deficient on this second stated aim.

Now, this is the first piece of concrete criticism that I see in this
posting.  So, why were these not addressed?

immutable hash tables:
	because immutable hash tables are IMO adequately addressed with
	mutable hash tables.

weak-key (or weak-value) hash tables:
	I think the issues of weak references and hash tables are
	quite orthogonal.  Besides, there is no SRFI for weak references
	to standardise on.

concurrency:
	Because SRFI 18 says: "It is the responsibility of the
	application to avoid write/read and write/write races through
	appropriate use of synchronisation primitives."  IOW,
	concurrency in Scheme is not (and, IMO, should not be) connected
	to general-purpose data structures.

GC-sensitive hash tables:
	I don't know what they are.  Care to clarify?

In short, I think immutable hash tables are covered, while the other
issues are orthogonal to hash tables.  IOW, the latter are not "common
uses of hash tables", but common uses of separate language constructs
together with hash tables.

> As it exists now, the 
> SRFI neither provides a basic API for a key-value mapping datastructure, 
> nor provides an API covering most, or even common, users' hash table needs.

Summa summarum, I think your first statement is based on a bad
assumption (caused by ignoring the third sentence of the abstract),
while your second statement is false.

> Also, I think a key aim that this SRFI should have, but does not, is the 
> selection of names that represent consistent conventions with existing 
> Scheme practices.

What kind of Scheme practices are you referring to?  Standard Scheme
with its vector and string types?  Do you think they are not accounted
for?  SRFI conventions?  Which SRFIs am I particularly at odds with?
SRFI 44?  Is there _any_ mapping API that follows SRFI 44?  Existing
implementations?  Can you show me how to match the naming in existing
implementations more precisely?  Existing user code?  Which portion of
the world's Scheme code have you seen?

Your criticism is full of grave concerns which are, however, not stated
clearly enough for me to do anything about them.

> The SRFI document states the following in the Rationale:
>    Hash tables are widely recognized as a fundamental data structure
>    for many kinds of computational tasks. Almost every non-minimal
>    Scheme implementation provides some kind of hash table
>    functionality.
> This is certainly true.  The majority of Scheme's I'm familiar with include 
> a hash table datastructure and their common operations, however the names 
> vary highly.  This highlights what should be a primary concern in the 
> design of this SRFI, but which has been neglected; to identify a 
> consistent, and portable set of names and parameter conventions.

You see half of the point.  (BTW, what you say here is also written in
the SRFI, right after the paragraph you quote.)  The primary concern of
this SRFI is to unify different hash table concepts into a portable
whole.  But while it's mostly a naming issue, it's also an issue of
partially differing views on what a hash table really is.  As a result
of differing views on what a hash table is, different implementations
expose the implementation in the API to varying degrees.  For instance,
Guile does not have the equivalent of (make-hash-table); the
documentation defines a hash table to be a vector /tout court/.

My view of a hash table is external, governed by the API together with
its complexity constraints.  If you can show how this is not clear from
the SRFI, please tell me.  Or if you think this is a flawed picture,
tell me what hash tables should be in your opinion.  But it's no option
to _only_ define names for various operations in the SRFI, leaving e.g.
creation of hash tables to be an implementation-dependent detail.  One
could not write portable programs with such a hash table SRFI.
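This is why the SRFI specifies construction too: a portable program can
ask for a table keyed by a particular equivalence, without knowing
anything about the representation.  A sketch, using the optional
equivalence and hash arguments of SRFI 69's make-hash-table:

```scheme
;; Portable construction under SRFI 69: make-hash-table optionally takes
;; an equivalence predicate and a compatible hash function (string-hash
;; is one of the hash functions the SRFI itself specifies).
(define by-string (make-hash-table string=? string-hash))
(hash-table-set! by-string "key" 'value)

;; Lookup compares with string=?, so a freshly allocated, non-eq?
;; string with the same characters still finds the binding:
(hash-table-ref by-string (string-copy "key"))  ; => value
```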

> The author has chosen the name and parameter conventions that run counter 
> to several existing Scheme conventions, such as previous SRFIs, RnRS, and 
> numerous Scheme implementations.  Some choices have no precedent 

Now, please point these out and make suggestions for improvements.

> My preference for this SRFI is the following.  Drop the aim of covering 
> common hash table usage.  Writing such a SRFI is a very ambitious and 
> difficult thing to do, and it requires an extensive amount of surveying 
> common use.  I would expect such a SRFI to discuss the design choices taken 
> by most Scheme implementations, the several related SRFIs, as well as 
> similar libraries from related languages such as ML and Lisp.  SRFI 1 is a 
> good example of such a "common use" SRFI.  Shivers surveyed R4RS/R5RS 
> Scheme, MIT Scheme, Gambit, RScheme, MzScheme, slib, Common Lisp, Bigloo, 
> guile, T, APL and the SML standard basis in designing that library.  A good 
> common use hash table SRFI would need to do likewise.

But what else does Shivers say in SRFI 1 but that he surveyed these
systems?  I dug the API bits out of Shiro's cross reference; I've used
Gauche, Guile, PLT and Scheme 48 hash tables, as well as those of
Python, Tcl, Java, awk and OCaml; am I to be blamed because I didn't
boast about those?

> data structures that provide a mapping from some set of keys to some set of 
> values associated to those keys.  All parts of this SRFI that do not 
> contribute to that aim should be dropped.  All rationales that do not 

Do you really think that would make this SRFI _easier_?  _You_ try to
write a SRFI for generic mappings, and I'll show you how it is not
general enough in 20 ways.  Not only would it be harder, but also quite
useless.  I want standardised hash tables in Scheme implementations, I'm
not here to improve my charisma in scientific writing or
street-wiseness.

> Scheme.  Most notably, this SRFI should be consistent in its choice of 
> names and parameter conventions with SRFI 44 [1].  If the author chooses 
> against these conventions, this SRFI then conflicts and competes with SRFI 
> 44 (and others) and as such the "rationale should explain why the present 
> proposal is a substantial improvement" over these existing conventions, as 
> required by the process document.

Would you like this explanation to be added to the SRFI?
http://srfi.schemers.org/srfi-69/list-archive/msg00024.html
There is also the problem that I was not active in the Scheme community
when SRFI 44 was finalised.  Yes, I think parts of SRFI 44 are flawed
(if not too flawed); I can write a "competing SRFI" for SRFI 44 if I
find the time (I have more important SRFIs in mind).  If you want to
continue discussion on this subject, please respond to the points in the
message I give a pointer to above.

As for this specific issue:

> hash-table-equivalence-function, which a reader of SRFI 44 would expect to 
> return an equivalence over the items in the hash table collection, i.e. key 
> value pairs, whereas hash-table-key-equivalence-function would return what 
> SRFI 69 returns for hash-table-equivalence-function.

Hash tables don't have use for item equivalence functions, and SRFI 44
specifies the item equivalence function to default to the key
equivalence function for mappings, so hash-table-equivalence-function
returns the correct value and hash-table-key-equivalence-function can be
added by a compatibility layer.
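A sketch of such a compatibility layer, under the reading stated above
(SRFI 44 defaults a mapping's item equivalence to its key equivalence);
the name hash-table-key-equivalence-function is the suggested SRFI-44
reading, not part of either SRFI's specification:

```scheme
;; Hypothetical SRFI-44-style accessor, layered on SRFI 69.
(define (hash-table-key-equivalence-function table)
  ;; For a mapping, the key equivalence coincides with what SRFI 69
  ;; already exposes as the table's equivalence function.
  (hash-table-equivalence-function table))

(define t (make-hash-table string=? string-hash))
;; (hash-table-key-equivalence-function t) => string=?
```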

On a sidenote, I reread SRFI 44 and its rationales for naming of mapping
functions -- but actually, I could not find any.  Nor could I find any
in the discussion archives.  Would you like to point them out to me?
While SRFI 44 has probably been a lot of work and some serious thought
has gone into it, the part for mappings is IMNSHO not very well honed
and does not follow any existing practice.

Panu

-- 
personal contact: atehwa@xxxxxx, +35841 5323835, +3589 85619369
work contact: panu.kalliokoski@xxxxxxxxxxx, +35850 3678003
kotisivu (henkkoht):	http://www.iki.fi/atehwa/
homepage (technical):	http://sange.fi/~atehwa/