[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: Storage Efficiency of Vectors

This page is part of the web mail archives of SRFI 43 from before July 7th, 2015. The new archives for SRFI 43 contain all messages, not just those from before July 7th, 2015.




On Wed, 16 Apr 2003, Per Bothner wrote:


>The tradeoff here depends a lot on your memory allocation and gc
>overheads.  If you have a memory allocator that is tuned for handling
>allocating and reclaiming cons cells really fast, then I suspect the
>best strategy is to use a temporary list.  On the other hand if
>object allocation or collection are relatively slow or don't
>special-case cons cells, then re-allocating will probably be faster.

I think there's a far more important tradeoff that is being ignored
here.  My problem with association lists is not storage efficiency,
it's access time.

If you are considering a vector solution, you have at least the
opportunity to make hash tables, which actually solves the main
problem of alists, instead of just reimplementing an inefficient
algorithm using vectors instead of lists.

I have not understood yet why you're not doing this.  Linear access
time is a very serious limitation on how big a system it is practical
to make with the association mechanism; storage space is less so.  I
don't want to waste fifty thousand CPU cycles accessing an association
in a natural language application, for example, just because the word
whose parse-procedure I'm looking up happens to have been the ten
thousandth word discovered.  If I wanted to waste that much CPU, I'd
be using alists to store my procedures.

I can stand doubling the memory size of the overhead for the sake of
cutting CPU time for an access from linear to constant.  That is a
worthwhile tradeoff.  I can't believe I'm hearing you argue about a
savings of 30 percent in storage space when the big bleeding problem
of alist access time isn't addressed.

Since association vectors, association lists, and association
hashtables will use overhead space on the same order (from 20-200% of
each other's size), why are we even talking about doing something that
doesn't offer the substantial advantage in access time that's
available?  What was the motivation, if we're not after that
advantage, in using vectors in the first place?

				Bear