mutual exclusion.



Mutual exclusion happens when two or more memory lines are needed but cannot all be in cache together, because they map to the same cache line and each successive access evicts the memory line loaded previously. This is a real problem for direct-mapped caches, where a memory line can reside in only one cache line. N-way set-associative caches solve this problem by allowing a memory line to reside in any of N different cache lines; these N locations are called a set. This is the case for the P6 processors, where the L1 cache is 2-way set associative, which avoids mutual exclusion for all Level 1 BLAS routines with fewer than 3 vector operands (like ddot and daxpy).

Mutual exclusion can also happen in matrix operations like dgemm: when using a block method, the leading dimension can be such that several memory lines of the block map to the same cache set, which prevents more than N of them from being in cache at the same time. In the figure, the memory lines l1, l2 and l3 are spaced 1024 doubles apart (remember that the PII L1 cache has 256 sets, each containing 2 lines of 32 bytes). Since 1024 doubles are 8192 bytes, exactly 256 sets times 32 bytes, the three lines map to the same L1 cache set, which can hold only 2 different memory lines. Thus l1, l2, l3 and the subsequent lines mutually exclude one another. The solution for keeping the block in cache is to copy it into contiguous memory, and this is the solution adopted in ATLAS.
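The set mapping behind this argument can be checked with a few lines of C. The sketch below is only an illustration under simplifying assumptions: it uses the cache geometry quoted in the text (256 sets, 2 lines per set, 32-byte lines), treats the address as a plain byte offset and ignores virtual-to-physical translation. The names `set_index` and `lines_in_first_set` are hypothetical helpers, not part of any library.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Cache geometry quoted in the text: 256 sets of 2 lines, 32 bytes per line. */
#define LINE_BYTES 32u
#define NUM_SETS   256u
#define NUM_WAYS   2u

/* Set selected by a byte address: drop the offset inside the line,
   then wrap around the number of sets. */
static unsigned set_index(uintptr_t addr)
{
    return (unsigned)((addr / LINE_BYTES) % NUM_SETS);
}

/* Count how many of `count` lines, spaced `stride_doubles` doubles apart
   starting at `base`, fall into the same set as the first line. */
static unsigned lines_in_first_set(uintptr_t base, size_t stride_doubles,
                                   unsigned count)
{
    assert(count > 0);
    unsigned hits = 0;
    for (unsigned i = 0; i < count; ++i) {
        uintptr_t addr = base + (uintptr_t)(i * stride_doubles * sizeof(double));
        if (set_index(addr) == set_index(base))
            ++hits;
    }
    return hits;
}
```

With an arbitrary base address such as `0x10000` and a stride of 1024 doubles, `lines_in_first_set(base, 1024, 3)` counts 3 lines in one set, which is more than `NUM_WAYS`. With a stride of 1028 doubles (8224 bytes, i.e. 257 lines) each successive line lands in the next set and the count drops to 1.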

**Figure:** Mutual exclusion in block access.
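The copy into contiguous memory mentioned above can be sketched as follows. This is a minimal illustration, not the actual ATLAS code: it assumes column-major (Fortran/BLAS) storage, and `copy_block` with its parameter names is a hypothetical helper introduced here.

```c
#include <assert.h>
#include <stddef.h>

/* Copy an m-by-n block of a column-major matrix with leading dimension lda
   into the contiguous buffer dst, whose leading dimension is m.
   dst must have room for m * n doubles. */
static void copy_block(size_t m, size_t n, const double *a, size_t lda,
                       double *dst)
{
    assert(lda >= m);
    for (size_t j = 0; j < n; ++j)          /* one column of the block */
        for (size_t i = 0; i < m; ++i)      /* contiguous run inside it */
            dst[j * m + i] = a[j * lda + i];
}
```

In the source matrix, consecutive columns of the block are `lda` doubles apart, so with `lda` equal to 1024 they all compete for the same cache set. In `dst` they are only `m` doubles apart, so the block occupies consecutive memory lines, which spread over consecutive sets as long as the block fits in the cache.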


Thomas Guignon
2000-08-24