This page is part of the web mail archives of SRFI 58 from before July 7th, 2015. The new archives for SRFI 58 contain all messages, not just those from before July 7th, 2015.
Aubrey Jaffer wrote:
>>> Modern CPUs are almost always (I/O-bound or) limited by their memory
>>> bandwidth through the cache. If you double or quadruple the data
>>> movement necessary, you will execute at half or quarter the speed.

Bradd wrote:
>> That depends on your data access patterns and cache sizes. If your
>> working set still fits in L1 cache after aligning the data, you get
>> better performance, and some of the big servers on the market now
>> have huge amounts of L1 cache.

> The largest I found was 32.kiB inst + 64.kiB data on Suns
> <http://www.sun.com/servers/family-comp.html>.

> A stripped down SCM interpreter is 40.kB, but even if the interpreter
> fit, it would be getting swapped out to bring in SUBRs. 64.kiB would
> be a much better fit.

I misremembered slightly; I was actually thinking of the 256KB L2 cache
on the Itanium 2, which Intel describes as similar in performance to
most L1 caches. Here are the stats for its 3 cache levels:

    Cache   Size       Load latency
    L1      32KB       1 cycle direct, 2 cycles indirect
    L2      256KB      ~6 cycles
    L3      1.5-9MB    12+ cycles

While I wouldn't say that ~6 cycles is comparable to most L1 caches,
it's still very fast. Put that cache in a system where byte access
requires a word load, mask, and shift (about 3-5 cycles), and you might
just break even. You'd trade about 2-3 cycles per load for a greater
number of 6-cycle L2 loads, just enough where it starts looking viable.

Almost forgot -- eliminating the mask & shift operations also reduces
pressure on the instruction cache, for more savings.

>> In some architectures, byte-aligned access may even be more
>> expensive than L2 cache.

> That sounds like a poorly designed CPU.

Or a big, fast L2 cache like the Itanium's.

> Disk-based b-trees, used extensively for database indexes, are an
> important example of byte-intensive algorithms not tied to text
> strings. Other examples are cryptography and data-compression.

Thanks; I didn't know that.
-- 
Bradd W. Szonye
http://www.szonye.com/bradd
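
For readers unfamiliar with the "word load, mask, and shift" sequence mentioned above, here is a minimal sketch in C of what byte access looks like when only whole-word loads are available. The function name load_byte_via_word, the 32-bit word size, and the little-endian byte numbering within a word are all assumptions made for illustration; they are not taken from the message, and real instruction counts and cycle costs depend on the CPU.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not from the original message): fetch byte `i`
 * from a buffer that is only ever read one 32-bit word at a time,
 * as on a CPU without a byte-load instruction.
 * Assumption: byte 0 is the least significant byte of each word. */
static uint8_t load_byte_via_word(const uint32_t *words, size_t i)
{
    uint32_t w = words[i / 4];               /* 1. load the containing word */
    unsigned shift = (unsigned)(i % 4) * 8u; /* bit offset of the byte */
    return (uint8_t)((w >> shift) & 0xFFu);  /* 2. shift down, 3. mask */
}
```

Storing each value in its own aligned word removes the shift and mask steps, at the cost of moving four times as much data through the cache, which is the trade-off discussed in the message.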