[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Comments on SRFI 69



The SRFI document states the following in the Abstract:

   This SRFI specifies an API for basic hash tables. Hash tables are data
   structures that provide a mapping from some set of keys to some set of
   values associated to those keys.

From your description of what hash tables are, the name "hash table" seems too specific of a term. This issue was raised earlier by Marc Feeley, and as far as I can tell was never addressed in the document or discussion list. Perhaps, I'm just missing it. However, the more important issue here is that what follows in the document is far from a basic API for structures that provide a mapping from some set of keys to some set of values associated to those keys.

One of the most common ways SRFIs go wrong is that their purpose is not clearly articulated. Without a clear thesis, it is impossible to evaluate design choices and rationales, or even provide helpful suggestions. Luckily this SRFI does state its aims, however there are conflicts with what is stated. Further, design choices have been made that violate these aims, and rationales rarely appeal to the aims in a consistent manner.

The SRFI document states the following in the Rationale:

   The primary aim of this SRFI is to provide a simple and generic hash table
   API that will answer most of users' needs for basic usage of hash tables.

This conflates two disparate and competing aims into one; that of providing a simple and generic API for a data structure which "provides a mapping from some set of keys to some set of values associated to those keys," and that of covering the general usage patterns of hash tables (whatever those patterns may be). The abstract makes no mention of most common hash table usage, so this second aim seems out of the scope of this SRFI. But several choices are made contrary to the aim of simplicity and generality, such as the ad-hoc collection of type-specialized hash table procedures and their hash function counterparts. Appeals for generality in the API have been discounted by the author saying such things as, "I'd rather define these routines to account for the most common situation(s) and be done with it."

On the other hand, there is widespread use of immutable hash tables, tables with weakly held keys, concurrency, and GC-sensitive tables, but this SRFI addresses none of those common usages or the issues that arise in their presence, and is therefore deficient on this second stated aim.

By conflating these aims, the design choices lack a clear purpose and often seem to reflect the authors personal preferences rather than a reasoned rationale. Indeed, it is difficult if not impossible to evaluate a design choice when there is not a clear and consistent aim for the SRFI. I think this document would greatly benefit a more explicit statement of its purpose, resolving the conflict of the current aims. As it exists now, the SRFI neither provides a basic API for a key-value mapping datastructure, nor provides an API covering most, or even common, users' hash table needs.

Also, I think a key aim that this SRFI should have, but does not, is the selection of names that represent consistent conventions with existing Scheme practices.

The SRFI document states the following in the Rationale:

   Hash tables are widely recognized as a fundamental data structure for many
   kinds of computational tasks. Almost every non-minimal Scheme
   implementation provides some kind of hash table functionality.

This is certainly true. The majority of Scheme's I'm familiar with include a hash table datastructure and their common operations, however the names vary highly. This highlights what should be a primary concern in the design of this SRFI, but which has been neglected; to identify a consistent, and portable set of names and parameter conventions.

The author has chosen the name and parameter conventions that run counter to several existing Scheme conventions, such as previous SRFIs, RnRS, and numerous Scheme implementations. Some choices have no precedent whatsoever. To choose such conventions is perfectly allowable, but the advantage of these new conventions must be thoroughly articulated and compelling. I don't think that is the case here. Further, if the aim of this goal is to cover common hash table usage, unprecedented names and conventions run counter to this aim; something which has never been used before is not common.


My preference for this SRFI is the following. Drop the aim of covering common hash table usage. Writing such a SRFI is a very ambitious and difficult thing to do, and it requires an extensive amount of surveying common use. I would expect such a SRFI to discuss the design choices taken by most Scheme implementations, the several related SRFIs, as well as similar libraries from related languages such as ML and Lisp. SRFI 1 is a good example of such a "common use" SRFI. Shivers surveyed R4RS/R5RS Scheme, MIT Scheme, Gambit, RScheme, MzScheme, slib, Common Lisp, Bigloo, guile, T, APL and the SML standard basis in designing that library. A good common use hash table SRFI would need to do likewise.

Instead, this SRFI should focus on providing a simple and generic API for data structures that provide a mapping from some set of keys to some set of values associated to those keys. All parts of this SRFI that do not contribute to that aim should be dropped. All rationales that do not appeal this aim, should be abandoned. I would like to see this API be consistent with the existing datastructure API conventions that exist in Scheme. Most notably, this SRFI should be consistent in its choice of names and parameter conventions with SRFI 44 [1]. If the author chooses against these conventions, this SRFI then conflicts and competes with SRFI 44 (and others) and as such the "rationale should explain why the present proposal is a substantial improvement" over these existing conventions, as required by the process document.

This SRFI will be an important one. People will turn to it regardless of how well or poorly constructed it is. Without a clear and consistent aim, which I believe is the case now, such a SRFI can do a great deal of harm.

David

[1] This issue of SRFI 44 names was raised as the first comment during the discussion period by Scott Miller, to which Bear voiced criticism over SRFI 44 on implementation and usability grounds. This is irrelevant. SRFI 44 included a great deal of work on identifying the proper names for precisely this kind of datastructure. The choices include rationales, some of which have been appealed to in deciding names in this SRFI. If this SRFI is not going to use the names of SRFI 44, it *must* include compelling rationales for these names over the names (and rationales) identified in SRFI 44. Many of the choices made thus far in the SRFI directly conflict with SRFI 44, sometimes in very confusing ways, such as hash-table-equivalence-function, which a reader of SRFI 44 would expect to return an equivalence over the items in the hash table collection, i.e. key value pairs, whereas hash-table-key-equivalence-function would return what SRFI 69 returns for hash-table-equivalence-function.