[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [oleg@xxxxxxxxx: Interface view of dictionaries]




On Mon, 27 Oct 2003 scgmille@xxxxxxxxxxxxxxxxxx wrote:

>On Mon, Oct 27, 2003 at 08:25:49PM -0800, Bradd W. Szonye wrote:

>> Thing is, there are a few functions like this which have superior
>> performance characteristics for *many* collections, if you limit the
>> interface somewhat. As you say, there are trade-offs between performance
>> and generality. However, there's a significant class of collections --
>> any tree-based dictionary, for example -- where you can improve from
>> O(N) performance to O(lg N + K) performance for sequential access to
>> keys. That's both efficient and generic.
>
>I agree, and I expect a tree implementation of dictionaries to specify
>exactly those operators.  But they don't necessarily extend to
>hashtables, disk file databases, or any number of other dictionary
>mappable datastructures... a set which is larger than the set of Tree
>dictionaries.

Okay, here's where our thought diverges.

Some collections and dictionaries, as you point out, are going to
have structures that imply that the implementation of these things
is no more efficient than it can be made with the SRFI-44
primitives.

My experience though with people providing "uniform APIs" is that
it creates a strong temptation to regard the underlying data
structures as interchangeable modules, without regard to the
efficiency of operations in those structures.  This becomes a
design requirement, and then people restrict their use of primitives
to just those primitives available in *all* of the potential modules.

So, even where fetching a range of mappings from a dictionary would
be more efficient using an operation that's defined on trees and olists,
people are forbidden to use that named operation and instead implement
it using the restricted set of common primitives because their code
has to be "generic".  This code then becomes fantastically stupid
and fails to realize the available benefits of the data structure
when the underlying dictionary actually *is* a tree or an olist.

If the "common" functions include any common pattern of use that can
be more efficiently implemented in *some* dictionary than it can be
implemented using the SRFI-44 proposed primitives, then generic code
realizes benefits.

These allow the programmer to say what s/he means, and the code will
realize the benefits of efficiency where it's available.  It won't
have to be written stupid unable to take advantage of available
efficiencies, it won't have to be rewritten stupid when someone
changes to a different dictionary structure that doesn't support
the exact same set of operations efficiently, and it won't then
stay stupid when they switch back.

If the SRFI-44 primitives are what's standardized, then people will
be forced to implement these common patterns of use in the worst
possible way in order to create "generic" code even where, on most
dictionaries, better ways are available.

				Bear










>
>>
>> > If I read the intent of SRFI-44 correctly, its goal is not to provide
>> > all things for all people. Rather, the goal is to define the framework
>> > and to describe a _minimal_ set of core functions plus a _limited_ set
>> > of very common extensions.
>>
>> And Bear is trying to establish that some of these functions are not
>> only common but important extensions. For most dictionary-type
>> collections, primitives aren't sufficient. There's a convenience
>> argument *and* a performance argument for extending the set.
>
>But only if those operators generalize to all dictionaries.  Otherwise
>they impair the elegance and utility of the API that is general to all
>dictionaries.  We're not outright disagreeing, these operators just need
>to be defined in the SRFI that specifies tree or even just "log n
>efficient indexed dictionaries".
>
>>
>> > The intent is not to make it unnecessary to add more API in future
>> > SRFIs. Rather, the intent is to make it unnecessary to remove SRFI-44
>> > features in future SRFIs.
>>
>> Why would you need to remove a function that is more convenient and more
>> efficient?
>Because they may not apply to a hypothetical collection.
>
>> Also, speaking of things which might need to be removed, I'm concerned
>> about the object-oriented nature of the containers. Polymorphism is
>> good, but I suspect that the SRFI's design may be tied too closely to
>> the specific object system used in the reference implementation. For
>> example, the hierarchy looks like it's based on a traditional "objects
>> as references" class-based system, when an "objects as values"
>> prototype-based system may fit better into Scheme. (Especially given the
>> way such a system incorporates primitive data structures according to
>> content rather than type).
>
>Hardly.  The SRFI was defined before there was even an object system
>selected for the reference implementation.  We were very careful not to
>evey specify polymorphism.  The SRFI only requires that the more general
>functions work for the 'subtypes' of the SRFI.  It never says that the
>collections actually be subtypes in some object system, its just a
>useful metaphor.
>
>> inconsistent with the core language they're supposed to extend.
>>
>> > About experience using SRFI-44. I can see how to change my
>> > implementation of treaps to fit SRFI-44.
>>
>> That's good. Present the implementations as support for the design. The
>> author hasn't done so, however -- indeed, he keeps making excuses for
>> why he shouldn't have to. That reduces my confidence in the proposal.
>> Furthermore, the SRFI Process Document recommends against exactly that
>> kind of thing.
>>
>You seem to portray this as if I'm hiding some dark secret about
>limitations to the API.  I'm quite confident in the API in fact, and
>just don't believe that I need 'prove' its good to you by writing a
>bunch of code that won't wind up in the SRFI.  I have limited time for
>this SRFI among many many projects, so this is strictly pragmatic.
>
>>
>> > About fold-right. I'd rather wish to see fold-right gone ....
>> > Reversible collections is another thing that I'm uneasy about. A
>> > concept of views has received quite a lot of attention recently.
>>
>> I was thinking the same thing -- container views and enumeration
>> adapters would be a much better way to provide some of this stuff. I
>> think I mentioned it earlier in the discussion, but I dropped it because
>> of the attitude that it's best not to change the SRFI at this point.
>
>Fortunately, collection-fold-right could (if I'm not mistaken) be
>implemented in terms of collection-fold-left and container views.  So
>we're not losing too much by leaving it in.
>
>> > Bradd W. Szonye wrote:
>> >> And, most importantly, can you show how to implement a
>> >> multiple-collection fold without cursors?
>> >>
>> >>     (nfold f seed c1 c2 c3 ...)
>>
>> > First of all, who said that we should ban cursors completely?
>>
>> Have you read the author's replies to our review comments? He dismisses
>> them immediately, with a handwave in the general direction of your
>> earlier article.
>
>Absolutely not.  I dismissed cursors because I knew they were derivable
>from enumeration.  I said this at least twice and then pointed you to
>the document Oleg initially cited.
>
>> enumerator," a coroutine designed to traverse the collection efficiently
>> (and without any special coding! just a straightforward traversal that
>> supplies every value with YIELD). The main FOLD function creates the
>> coroutines, grabs a value from each, and calls the folding-function.
>> I've already got this implemented. It was almost trivial, once I
>> understood how coroutines work.
>
>> Yes, a stable API would be good. My objection is that the SRFI is not a
>> stable API. Parts of it are unimplemented, and major sections have
>> changed recently. And what's there doesn't take advantage of prior art
>> like SRFI-1 and SRFI-13 very well; it reinvents the wheel too much IMO.
>> Some elements it did take from SRFI (like linear update functions) are a
>> bit controversial -- see the beginner's discussion on c.l.s.
>
>We've already established disagreement on the stability and
>implementation arguments.  Linear update procedures aren't
>controversial.  Whats confusing to new users (but not so much to
>experienced ones) is the naming collision between linear update and
>purely mutable update functions.  Sound familiar?  This is why we had a
>! and !! distinction in the first place.
>
>> So while I agree that there is value in a stable API, I don't think
>> there's any value in rushing an immature API to finalization just so
>> that we can call it "stable." It'd be much better to propose a concrete
>> collection or three, get some experience with their use, and it they're
>> successful, factor out the interface and publish that as a SRFI.
>
>This is arguably bad design.  Writing many collections and then culling
>the common operators is a great way to wind up with a poorly thought out
>system.  Especially when the stated goal here is to create a general,
>*interoperable* framework for collections.
>
>
>> Actually, if they're successful, there's less need to publish the
>> interface separately. Future developers will imitate it like they
>> already imitate SRFI-1. That's a much more desirable outcome: We get
>> useful collection types *and* a de facto standard interface that way.
>Name one language with a useful generic collections library that got it
>this way?  Java had to competely rewrite its collections library at 1.2
>in order to get a general framework, because the former approach didn't
>work.
>
>	Scott
>