[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: neighbor deletion; param ordering; < & <=; fresh lists; cell recycling



***I'm still wondering where the reference implementation is.***


On Sat, 20 Jul 2002 shivers@xxxxxxxxxxxxx wrote:

> Good question -- it ain't sorting. But it's something you typically do with
> sorted data. I considered punting these. But writing them well takes a little
> work (especially the iterative destructive list version), and part of the
> library game, for me, is that I make an effort and write a thing well, once,
> and then everyone can use it. Count it under "manipulation of sorted data,"
> just like vector binary search.

I think you should make it a separate SRFI.  It does seem a bit too small
for that, but it really doesn't make sense to me to have it in the sorting
library.

>     I think the ordering op should go first.
>
> Me, too. But the compatibility win outweighs my preference here.
>
>     Also, <= seems much more intuitive than <.
>
> Me, too. But it busts compatibility with just about *everything* out there,
> lisp and Scheme. I don't think it's worth it.

Qaere:  Do you provide functions with the same _names_ as existing
implementations, or just the same interfaces?  If the names are not the
same, compatibility seems to me to be not such a big deal, as the
functions will have to be redefined anyway.  It's not much harder to say
(define (cl-sort-list list <)
	(sort-list (lambda (x y) (not (< y x))) list))

than
(define (cl-sort-list sort-list)

Not quite as easy, but not a big deal if you're going to have
compatability definitions anyway.  Unless I'm missing some effect this has
on the rest of the SRFI...

Another little thought:  You could provide _both_, with the different
versions marked by asterisks or something.  That could be a bit confusing
though, and even lead to a bit of bloat.

>     I note that the non-mutating vector sorts are guaranteed to copy the
>     vector.  It would seem potentially very useful to give some indication of
>     whether the new vector is in fact different from the old vector, to allow
>     programmers to optimize for generational collectors.
>
> Heh? The new vector is *guaranteed* to be different from the old vector.
> You alter one, the other is unchanged. Guaranteed.

What I was thinking was that in functional-style code, if you sort an
array, and if the array was in fact already sorted, you get a little win
by using the old array rather than the new one (and letting the new one
get trashed almost immediately in the first generation) rather than having
2 identical arrays sitting around wasting memory, and missing the
opportunity for a first-generation collection.  But I'm no expert at GC,
so this may not actually be significant.  Of course you can always check
if the array is sorted first, but that would take more time.

> If you really, really need to know that the returned list is fresh, that it
> contains no cons cells in common with the input list, you are in a
> side-effecting world. Otherwise you would not care about cons-cell identity,
> right? So copy the list and then use SORT-LIST!. It would probably run faster
> and certainly allocate less "trash" memory than a purely functional sort.
> So we might as well let this one slide.

I was thinking of this in the "side-effecting world".  I was guessing
(without any serious thought on the topic) that it could be faster (at
least on some implementations) to do a total-copy sort than a
copy-then-sort.  Don't have a clue if this is really the case.

> so there are not going to be any new cells in the answer. Let me add a little
> bit of language to the spec to spell this out more (which is mostly taken
> from the later appearance of LIST-MERGE!):

Alright.  You may be able to get away with something really simple like:
"The maximum amount of live memory between the call to list-merge! and its
completion may be no more than a constant amount more than the amount of
live memory immediately before the call to list-merge!".

>     These procedures use SET-CDR!s to rearrange the cells of the lists into
>     the final result. As such, they do not allocate any extra cons cells --
>     they work "in place." Hence, any cons cell appearing in the result must
>     have originally appeared in an input.

I think that's OK.

David