[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: Interface view of dictionaries

This page is part of the web mail archives of SRFI 44 from before July 7th, 2015. The new archives for SRFI 44 contain all messages, not just those from before July 7th, 2015.



scgmille@xxxxxxxxxxxxxxxxxx wrote:
> Lets examine your proposed API for dictionaries, and contrast it with
> SRFI-44.

Bear wrote:
>> (*-insert! dict datum key),

> *-put dictionary key value

Not necessarily. PUT adds a key if it does not exist or changes it if it
does exist. That's not well-defined unless the dictionary has unique
keys. INSERT may be equivalent to ADD, which can add a duplicate key or
raises an exception if the dictionary doesn't permit them.

By the way, did I mention that PUT isn't well-defined for arbitrary
mappings? It only works for "function" (N to 1) mappings.

> > (*-fetch-deleting! dict key)

> [A]
> (define (*-fetch-deleting! dict key)
>   (let ([rv (*-get dict key)])
>     (*-remove! dict key)
>     rv))
> 
> Additionally, fetch-deleting! does not map to many dictionaries, for
> example disk file database, *without* doing the above.

The point is that it's an important for practical reasons. Also, your
implementation may not be the best way to do it for some dictionaries.
Without the interface, there's no way for a generic algorithm to deal
with that.

Now, there is a trade-off between perfect polymorphism and keeping the
"concept space" of the interface reasonable. But without a real
implementation of SRFI-44, there's no way to show that its trade-offs
are appropriate. Indeed, I've seen you repeatedly reject these sorts of
practical concerns.

>> (*-get-n dict num)
>> (*-get-n-deleting! dict num)

> Trivially defined with enumeration.

But is it practical? Another disadvantage of the "no practical
implementation" approach.

>> (get-iterator dict)   ;; returns an iterator function that
>>                       ;; takes a key arg and returns some "next" key,
>>                       ;; or takes '() and returns some (arbitrarily
>>                       ;; chosen for unordered dictionaries) "first" key.
>> 
>> (*-nextkey dict) ;; same as function returned from get-iterator
>> (*-nextkey dict key) ;; same as returned from the get-iterator func above
>> (*-lastkey dict)     ;; same as returned from the get-reversed-iterator func
>> (get-reversed-iterator dict)

> See Olegs post on the hazards of iterators.

Oleg makes a good point, but you should quit taking it as gospel. Can
you clearly explain the advantages and disadvantages yourself? Do you
understand the underlying theory? And, most importantly, can you show
how to implement a multiple-collection fold without cursors?

    (nfold f seed c1 c2 c3 ...)

Q: Can you implement it at all with the SRFI-44 interface?

A: Yes, you can, but it's tricky, and the implementation includes
something just like a cursor. Essentially, collection-fold-left and
nfold must be coroutines, and the Scheme implementation of coroutines
effectively creates a one-use cursor.

Enumerators may also create problems for algorithms which require
backtracking (bidirectionality). It may be possible, with something like
the "limited use cursors" you get from coroutines, but I haven't figured
it out yet.

> <snip>Efficiency procs
> Best left to a separate SRFI.

Yes, perhaps. But you should be more careful about your design to make
sure that it isn't incompatible with them. You seem to have a mental
filter that shuts out anything performance-related, even if it has
design implications.

>> (*-fetchfirst dict number) ;; returns number entries
> [B] Trivially implemented with collection-fold-left

Trivially and VERY SLOWLY. However, you could implement it more
efficiently with a "*-get-any" interface that returns all matches. Even
more efficiently with a "*-get-any" enumerator that uses a filtering
predicate that can stop the enumeration. You can do this for a tree ADT
in O(log N + K) time, where N is the dictionary size and K is the number
of elements returned.

But SRFI-44 doesn't include those interfaces, so you're stuck with the
O(N) enumeration algorithm. Sure, you could leave this for later, but
what's the point in having an inefficient dictionary interface at all?
There's no need for it.

> So, you see, your API is almost *exactly* like the current SRFI apart
> from minor naming variations.

No, it's much richer than the current SRFI, with procedures designed
around the way people actually use the collection. In some cases, it
solves problems that are poorly-defined or inefficient using the SRFI-44
interface.

That's more than just "minor naming variations." Sorry if this sounds
rude, but you're far too quick to dismiss ideas as being "minor" just
because they're about naming, efficiency, or practical concerns.

Which is ridiculous, because efficiency is the whole point of having
collections, and usability (including naming) is the whole point of
having a standard interface. I don't know what your design goals were,
but I'd be curious to hear them.

> Where it differs, there has been strong refutation for that approach.

Please don't overstate Oleg's claims unless you can explain them. He
makes some good points, enough to justify the enumerators, but I'm not
entirely convinced -- enumerators themselves seem to require at least a
limited kind of iterator, if you go beyond the basic one-collection
fold.

> I must ask, did you *read* the SRFI carefully?

I have, several times, and it's missing some important details, like
design goals, conflicts with prior art, a full implementation, several
of the performance- and usability-oriented interfaces that Bear's
dictionary provides. Most importantly, it doesn't explain this bizarre
attitude that performance and usability are "minor" issues in a proposal
for a standard interface for collections. I strongly suggest that you
step back and reconsider your design goals, both to (1) make sure that
you've met them, and to (2) make sure that they actually make sense
given the nature of the SRFI.
-- 
Bradd W. Szonye
http://www.szonye.com/bradd