[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: Wording of the rationale

This page is part of the web mail archives of SRFI 95 from before July 7th, 2015. The new archives for SRFI 95 contain all messages, not just those from before July 7th, 2015.



Jens Axel Søgaard skrev:

Quoting the rational of srfi-95:
       * Four varieties of algorithms were provided (quick, heap,
         insert, merge) even though quick, heap, and insert sorts have
         no significant advantage over merge-sort.

Commenting:
Second: What does "no significant advantage" mean? I were of the
impression, that the "hidden constants" of O(n log (n)) of
vector-quick-sort were smaller than that of vector-merge-sort.

...

Also: Is merge-sort the fastest algorithm, if the elements are "almost
sorted"?

I tried a few experiments, and discovered that heap-sort beats the other
algorithms when the elements are in almost reverse order. (The most
surprising about this, is that I didn't figured it out before running
the test).

That is: If you know something about the data to be sorted, you are
in a position, where you *want* to choose between the various
algorithms.

/Jens Axel Søgaard


  A sorted, long vector:  (vector 1 2 3 ... 1000000) with <
  ----------------------------------------------------------
  vector-merge-sort! and vector-insert-sort! roughly the same time
  vector-heap-sort! *much* slower

  vector-merge-sort!
  cpu time: 62 real time: 63 gc time: 0
  vector-insert-sort!
  cpu time: 62 real time: 63 gc time: 0
  vector-heap-sort!
  cpu time: 2343 real time: 2359 gc time: 0



  A reverse-sorted, long vector:  (vector 10000 ... 3 2 1) with <
  -------------------------------------------------------------
  vector-heap-sort! *very* fast
  vector-merge-sort! and vector-insert-sort! roughly the same time

  vector-merge-sort!
  cpu time: 2203 real time: 2219 gc time: 0
  vector-insert-sort!
  cpu time: 2047 real time: 2046 gc time: 0
  vector-heap-sort!
  cpu time: 16 real time: 16 gc time: 0



My test program and results follow:


(require (lib "32.ss" "srfi"))

; interval : integer integer -> (list integer)
;   (interval n m) => (list n n+1 ... m)
(define (interval n m)
  (do ((i m (- i 1))
       (xs '() (cons i xs)))
    ((< i n) xs)))

; copy the input vector, and time the sorting only
(define (test orig-v sorter)
  (let (; copy v so we can sort the same vector several times
        (v (list->vector
            (vector->list orig-v))))
    ; make sure the garbage of one test doesn't affect the next
    (collect-garbage)
    (collect-garbage)
    ; time the sorting only
    (time (sorter < v))))

'SORTED-VECTOR-TEST
(define n 1000000)
(define v (list->vector (interval 1 n)))
'vector-merge-sort!
(test v vector-merge-sort!)
'vector-insert-sort!
(test v vector-insert-sort!)
'vector-heap-sort!
(test v vector-heap-sort!)


'REVERSE-SORTED-VECTOR-TEST
(define n 10000)
(define v (list->vector (reverse! (interval 1 n))))
'vector-merge-sort!
(test v vector-merge-sort!)
'vector-insert-sort!
(test v vector-insert-sort!)
'vector-heap-sort!
(test v vector-heap-sort!)


'RANDOM-VECTOR-TEST
(define n 10000)
(define v (list->vector (map (lambda (n) (random 1000000))
                             (interval 1 n))))
'vector-merge-sort!
(test v vector-merge-sort!)
'vector-insert-sort!
(test v vector-insert-sort!)
'vector-heap-sort!
(test v vector-heap-sort!)


The results were (a different run than the above):

Welcome to DrScheme, version 359.100-svn6nov2006.
Language: Pretty Big (includes MrEd and Advanced Student).

SORTED-VECTOR-TEST
vector-merge-sort!
cpu time: 63 real time: 62 gc time: 0
vector-insert-sort!
cpu time: 62 real time: 62 gc time: 0
vector-heap-sort!
cpu time: 2359 real time: 2359 gc time: 0

REVERSE-SORTED-VECTOR-TEST
vector-merge-sort!
cpu time: 2110 real time: 2110 gc time: 0
vector-insert-sort!
cpu time: 2000 real time: 2000 gc time: 0
vector-heap-sort!
cpu time: 15 real time: 15 gc time: 0

RANDOM-VECTOR-TEST
vector-merge-sort!
cpu time: 16 real time: 16 gc time: 0
vector-insert-sort!
cpu time: 1031 real time: 1062 gc time: 0
vector-heap-sort!
cpu time: 31 real time: 31 gc time: 0