Re: #\a octothorpe syntax vs SRFI 10

Aubrey Jaffer wrote:
>>> Modern CPUs are almost always (I/O-bound or) limited by their memory
>>> bandwidth through the cache.  If you double or quadruple the data
>>> movement necessary, you will execute at half or quarter the speed.

Bradd wrote:
>> That depends on your data access patterns and cache sizes.  If your
>> working set still fits in L1 cache after aligning the data, you get
>> better performance, and some of the big servers on the market now
>> have huge amounts of L1 cache.

> The largest I found was 32.kiB inst + 64.kiB data on Suns
> <http://www.sun.com/servers/family-comp.html>.
> A stripped down SCM interpreter is 40.kB, but even if the interpreter
> fit, it would be getting swapped out to bring in SUBRs.  64.kiB would
> be a much better fit.

I misremembered slightly; I was actually thinking of the 256KB L2 cache
on the Itanium 2, which Intel describes as similar in performance to
most L1 caches. Here are the stats for its 3 cache levels:

    Cache   Size      Load latency
    L1      32KB      1 cycle direct, 2 cycles indirect
    L2      256KB     ~6 cycles
    L3      1.5-9MB   12+ cycles

While I wouldn't say that ~6 cycles is comparable to most L1 caches,
it's still very fast. Put that cache in a system where byte access
requires a word load, mask, and shift (about 3-5 cycles), and you might
just break even. Aligning the data saves about 2-3 cycles per load at
the cost of more 6-cycle L2 loads, which is just enough for it to start
looking viable. Eliminating the mask and shift operations also reduces
pressure on the instruction cache, for further savings.
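
For anyone who hasn't had to do it by hand, here is roughly what "word
load, mask, and shift" means. This is only a sketch in C, not code from
any real interpreter: the helper name load_byte_via_word is made up, and
I'm assuming 32-bit words with byte 0 in the low-order bits. On a machine
without byte loads the compiler has to emit something like this for every
byte access.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper (name invented for illustration): read byte i
 * from a buffer that can only be accessed as aligned 32-bit words.
 * Assumption: byte 0 of each word lives in the low-order 8 bits.
 */
static uint8_t load_byte_via_word(const uint32_t *words, size_t i)
{
    uint32_t w = words[i / 4];               /* 1. load the containing word */
    unsigned shift = (unsigned)(i % 4) * 8;  /* bit offset of the byte      */
    return (uint8_t)((w >> shift) & 0xFFu);  /* 2. shift, 3. mask           */
}
```

With the data stored one byte per word instead, the access collapses to a
single load, which is where the per-load savings would come from; the
price is a working set four times larger.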

>> In some architectures, byte-aligned access may even be more
>> expensive than L2 cache.

> That sounds like a poorly designed CPU.

Or a big, fast L2 cache like the Itanium's.

> Disk-based b-trees, used extensively for database indexes, are an
> important example of byte-intensive algorithms not tied to text
> strings.  Other examples are cryptography and data-compression.

Thanks; I didn't know that.

Bradd W. Szonye