Cache coherence vs consistency

12/2/2023

In the tutorial, all solutions will be presented by students. Please be prepared for all questions, as the exercise will focus on discussion, not on understanding the questions and gathering the knowledge.

In a system with sequential consistency, each processor always executes memory operations in the order specified by its program (program order). The order in which the individual memory operations of each processor become visible to the other processors on the shared interconnect (e.g., the bus) is called the visibility order. Three processors (P1, P2 and P3) in a shared-memory system execute the following code (initially A = B = 0). Here a1 denotes the first operation of processor P1, a2 denotes the first operation of P2, b2 denotes the second operation of P2, etc. The outcome of the execution, denoted by the tuple (u,v,w), may vary depending on the order in which the individual operations of each processor become globally visible.

Some outcomes may not be possible on a sequentially consistent system. Complete the following table with the possible results for (u,v,w). For each row, describe whether the result is sequentially consistent and, if so, specify a visibility order that produces the result.

Relaxed Consistency: Peterson-Algorithm

A well-known algorithm for mutual exclusion is Peterson's algorithm (shown in pseudo-code below). Explain why Peterson's algorithm does not break on machines with a store buffer where reads are not permitted to bypass writes to the same memory location, and why it does break if reads are permitted to bypass writes to the same memory location on systems with store forwarding (e.g., SPARC TSO). Machines with relaxed memory consistency typically provide programmers with fence instructions to tighten the ordering of memory instructions. Insert MFENCE (memory fence) instructions in Dekker's and Peterson's algorithms to ensure their correct behavior on a multi-processor system that implements a store buffer with store forwarding. Use as few fence instructions as necessary.

Multiprocessor systems with caches use a coherency protocol, which ensures that writes by one processor eventually become visible to all other processors and that no two processors write to the same memory location simultaneously. One invalidation-based protocol discussed in the lecture is the MESI protocol. Two processors (P1 and P2) and uniform memory are connected to a shared bus, which implements the MESI cache coherency protocol. In memory there exists a data structure with the following layout: A (8 bytes), B (24 bytes), C (8 bytes). The cacheline size is 32 bytes, so that A and B reside in cacheline X, whereas C resides in cacheline Y. Processor P1 executes the following code:

The following table lists the memory operations of the individual processors as they appear on the shared bus. The first column lists the processor and the second column specifies the memory operation being carried out. The next two columns list the MESI state of the cachelines X and Y in each of the processors. The last two columns describe whether the operation caused a cacheline transfer to/from memory or to/from another cache. Initially all cachelines are invalid (I).

Because B is contained in cacheline X, false sharing occurs with A. Discuss how to remedy this problem and explain the impact on memory and cache transfers.

The cache-coherency protocol is sometimes crucial for the scaling of a particular lock implementation. The reason is that some lock implementations produce too many bus messages and thereby slow down the execution of the processor.
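For the three-processor exercise, the set of sequentially consistent outcomes can be found mechanically by enumerating every visibility order that respects program order. The sketch below does this for an assumed program, since the sheet's own code listing did not survive on this page: P1 stores A, P2 loads A into u and then stores B, P3 loads B into v and then loads A into w. The labels a1, a2, b2 follow the sheet's naming; everything else (`PROGRAM`, `interleavings`, `run`, `sc_outcomes`) is a hypothetical helper.

```python
# Assumed stand-in program (the sheet's own listing is missing); A = B = 0 initially.
#   P1: a1: A = 1
#   P2: a2: u = A    b2: B = 1
#   P3: a3: v = B    b3: w = A
PROGRAM = [
    [("a1", "store", "A", None)],
    [("a2", "load", "A", "u"), ("b2", "store", "B", None)],
    [("a3", "load", "B", "v"), ("b3", "load", "A", "w")],
]


def interleavings(threads):
    """Yield every merge of the per-processor lists that keeps program order."""
    if all(not t for t in threads):
        yield []
        return
    for i, t in enumerate(threads):
        if t:
            rest = threads[:i] + [t[1:]] + threads[i + 1:]
            for tail in interleavings(rest):
                yield [t[0]] + tail


def run(order):
    """Execute one visibility order against a single shared memory."""
    mem = {"A": 0, "B": 0}
    regs = {}
    for _label, kind, loc, reg in order:
        if kind == "store":
            mem[loc] = 1          # every store in the assumed program writes 1
        else:
            regs[reg] = mem[loc]
    return regs["u"], regs["v"], regs["w"]


def sc_outcomes():
    """Map each reachable (u, v, w) to one visibility order that produces it."""
    witnesses = {}
    for order in interleavings(PROGRAM):
        witnesses.setdefault(run(order), [op[0] for op in order])
    return witnesses


if __name__ == "__main__":
    for outcome, order in sorted(sc_outcomes().items()):
        print(outcome, "via", " ".join(order))
```

Under this assumed program, seven of the eight tuples are reachable. The missing one is (1, 1, 0): u = 1 places a1 before a2, v = 1 places b2 before a3, and program order then puts a1 before b3, so w must be 1.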
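For the Peterson exercise, a small exhaustive state search can illustrate the effect of store forwarding and of a fence. This is a toy model under stated assumptions, not the sheet's pseudo-code (which is missing here): each thread sets its flag, sets `turn` to the other thread, and then spins while `turn == other and flag[other]`, testing `turn` first. Each thread has a FIFO store buffer that drains at arbitrary times, and the `fence` step stands in for MFENCE by waiting until the buffer is empty. All names in the code are hypothetical.

```python
from collections import deque

TURN = 2        # memory is the tuple (flag[0], flag[1], turn)
CS = "cs"       # pseudo program counter: thread is in the critical section


def put(t, i, x):
    """Return a copy of tuple t with element i replaced by x."""
    return t[:i] + (x,) + t[i + 1:]


def program(i, fence):
    """Entry protocol of thread i (assumed loop order: test turn, then flag)."""
    j = 1 - i
    steps = [("store", i, 1), ("store", TURN, j)]   # flag[i] = 1; turn = j
    if fence:
        steps.append(("fence", None, None))         # stand-in for MFENCE
    steps += [("load_turn", TURN, j), ("load_flag", j, None)]
    return steps


def successors(state, progs, forwarding):
    pcs, bufs, mem = state
    for i in (0, 1):
        buf = bufs[i]
        if buf:  # the oldest buffered store may become globally visible
            loc, val = buf[0]
            yield pcs, put(bufs, i, buf[1:]), put(mem, loc, val)
        pc = pcs[i]
        if pc == CS:
            continue
        op, loc, arg = progs[i][pc]
        if op == "store":
            yield put(pcs, i, pc + 1), put(bufs, i, buf + ((loc, arg),)), mem
        elif op == "fence":
            if not buf:  # may only pass once the buffer has drained
                yield put(pcs, i, pc + 1), bufs, mem
        else:
            pending = [v for l, v in buf if l == loc]
            if pending and not forwarding:
                continue  # load waits for the same-address store to drain
            value = pending[-1] if pending else mem[loc]
            if op == "load_turn":
                nxt = pc + 1 if value == arg else CS   # turn != j: enter
            else:
                nxt = CS if value == 0 else pc - 1     # flag[j] == 0: enter
            yield put(pcs, i, nxt), bufs, mem


def mutual_exclusion_holds(forwarding, fence):
    """Search all reachable states; False if both threads can be in CS."""
    progs = [program(0, fence), program(1, fence)]
    start = ((0, 0), ((), ()), (0, 0, 0))
    seen, todo = {start}, deque([start])
    while todo:
        state = todo.popleft()
        if state[0] == (CS, CS):
            return False
        for nxt in successors(state, progs, forwarding):
            if nxt not in seen:
                seen.add(nxt)
                todo.append(nxt)
    return True


if __name__ == "__main__":
    for fwd, fence in [(False, False), (True, False), (True, True)]:
        print(f"forwarding={fwd} fence={fence}:",
              mutual_exclusion_holds(fwd, fence))
```

In this model the load of `turn` is what differs between the two machines. Without forwarding it stalls until the thread's own store to `turn` has drained, and because the buffer is FIFO the earlier store to the flag has drained too. With forwarding it is satisfied from the buffer, so the following load of the other flag can read memory before the thread's own flag is visible. A single fence between the two stores and the loop restores the first behavior in this model.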
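For the MESI exercise, the effect of false sharing on transfers can be sketched with a minimal two-cache simulator. The layout (A 8 bytes, B 24 bytes, C 8 bytes, 32-byte lines) comes from the sheet; the access trace, the padded layout, and all function names are assumptions, because the sheet's code and table are missing here. The model also assumes that a line is supplied by another cache only when that cache holds it Modified, and otherwise comes from memory; the write-back that accompanies such a transfer is not counted separately.

```python
LINE_SIZE = 32

PACKED = [("A", 8), ("B", 24), ("C", 8)]                      # layout from the sheet
PADDED = [("A", 8), ("pad0", 24), ("B", 24), ("pad1", 8), ("C", 8)]  # assumed remedy


def line_of(layout, var):
    """Index of the cacheline that holds the first byte of var."""
    offset = 0
    for name, size in layout:
        if name == var:
            return offset // LINE_SIZE
        offset += size
    raise KeyError(var)


def simulate(layout, trace, n_cpus=2):
    """Replay (cpu, op, var) accesses; return MESI states and transfer counts."""
    state = [dict() for _ in range(n_cpus)]   # per CPU: line -> letter, absent = I
    transfers = {"memory": 0, "cache": 0}
    for cpu, op, var in trace:
        line = line_of(layout, var)
        mine = state[cpu].get(line, "I")
        others = [c for c in range(n_cpus) if c != cpu]
        if mine == "I":
            # Miss: data comes from a Modified copy if one exists, else memory.
            dirty = any(state[c].get(line, "I") == "M" for c in others)
            transfers["cache" if dirty else "memory"] += 1
        if op == "read":
            if mine == "I":
                shared = any(state[c].get(line, "I") != "I" for c in others)
                for c in others:
                    if state[c].get(line, "I") != "I":
                        state[c][line] = "S"      # remote copies downgrade
                state[cpu][line] = "S" if shared else "E"
        else:
            for c in others:
                state[c][line] = "I"              # write invalidates remote copies
            state[cpu][line] = "M"
    return state, transfers


# Assumed trace: P1 (cpu 0) keeps writing A while P2 (cpu 1) keeps writing B.
TRACE = [(0, "write", "A"), (1, "write", "B")] * 4

if __name__ == "__main__":
    print("packed:", simulate(PACKED, TRACE)[1])
    print("padded:", simulate(PADDED, TRACE)[1])
```

With the packed layout, every write after the first misses because the other processor has just invalidated cacheline X, so the line moves between the caches on each access. With the padded layout, A and B sit in different lines and each processor takes one initial miss from memory and then hits. The cost of the remedy is a larger data structure.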
