Understanding Cache Coherence Protocols: MOESI vs. MESI
A deep dive into how modern multi-core processors keep their caches consistent, exploring the tradeoffs between MOESI and MESI protocols and when adaptive approaches shine.
The Problem: Shared Memory in Multi-Core Systems
When multiple processor cores share the same memory, each core typically has its own private cache for performance. But what happens when Core 0 writes to a memory address that Core 1 has cached? Without a coherence protocol, Core 1 would read stale data — a correctness nightmare.
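To make the stale-read problem concrete, here is a minimal toy model with two private caches and no coherence protocol. The `PrivateCache` class, the address `0x40`, and the write-through policy are all assumptions made for illustration; real caches are far more involved.

```python
# Toy model: two private caches over one shared memory, with NO coherence.
memory = {0x40: 1}

class PrivateCache:
    def __init__(self, backing):
        self.backing = backing
        self.lines = {}  # address -> cached value

    def read(self, addr):
        if addr not in self.lines:            # miss: fill from memory
            self.lines[addr] = self.backing[addr]
        return self.lines[addr]               # hit: never re-checks memory

    def write(self, addr, value):
        # Write-through keeps memory current, yet other caches are not told.
        self.lines[addr] = value
        self.backing[addr] = value

core0 = PrivateCache(memory)
core1 = PrivateCache(memory)

core1.read(0x40)          # Core 1 caches the old value (1)
core0.write(0x40, 2)      # Core 0 updates its own copy and memory
stale = core1.read(0x40)  # Core 1 hits in its cache and sees the old value

print(stale, memory[0x40])  # prints "1 2"
```

Even with memory kept up to date, Core 1 keeps returning its old copy because nothing invalidates or updates it.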
Cache coherence protocols solve this by defining a set of states that each cache line can be in, plus rules for how a line changes state in response to the local core's reads and writes and to bus messages snooped from other cores.
MESI: The Classic Four-State Protocol
MESI defines four states:
- Modified (M): The line is dirty and exclusive — only this cache has it, and it differs from main memory.
- Exclusive (E): The line is clean and exclusive — only this cache has it, matching main memory.
- Shared (S): The line is clean and potentially held by other caches too.
- Invalid (I): The line is not valid in this cache.
The key insight is the Exclusive state: if a core has exclusive access to a clean line, it can transition to Modified on a write *without any bus traffic*. This silent upgrade is a huge performance win for data that's only accessed by one core.
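The local-side transitions can be sketched as a small lookup function. This is a simplified model, not a full controller: the message names `BusRd`, `BusRdX`, and `BusUpgr` follow common textbook convention and are assumptions here, and the `others_have_copy` flag stands in for the shared signal a real bus would provide.

```python
from enum import Enum

class State(Enum):
    M = "Modified"
    E = "Exclusive"
    S = "Shared"
    I = "Invalid"

def local_write(state):
    """Return (next_state, bus_message) for a write by the local core.

    A bus_message of None means the transition is silent.
    """
    if state is State.M:
        return State.M, None        # already dirty and exclusive
    if state is State.E:
        return State.M, None        # silent upgrade: no other copies exist
    if state is State.S:
        return State.M, "BusUpgr"   # must invalidate the other sharers
    return State.M, "BusRdX"        # Invalid: fetch the line with ownership

def local_read(state, others_have_copy):
    """Return (next_state, bus_message) for a read by the local core."""
    if state is State.I:
        next_state = State.S if others_have_copy else State.E
        return next_state, "BusRd"
    return state, None              # M, E, and S all hit without bus traffic

print(local_write(State.E))  # (<State.M: 'Modified'>, None)
print(local_write(State.S))  # (<State.M: 'Modified'>, 'BusUpgr')
```

The only difference between writing to an Exclusive line and a Shared line is the bus message, which is exactly the traffic the Exclusive state saves.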
MOESI: Adding the Owned State
MOESI extends MESI with a fifth state:
- Owned (O): The line is dirty and potentially shared with other caches. *This* cache, not main memory, is responsible for supplying the data on requests and for eventually writing it back.
The Owned state enables dirty sharing. When a core holding a Modified line receives a read request from another core, it does not have to write the line back to memory and drop to Shared; it can move to Owned while the requesting core gets a Shared copy. This takes an expensive memory write-back off the critical path and defers it until the owner eventually evicts the line.
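A rough way to see the difference is to model how a cache reacts to another core's read under each protocol and count the write-backs in a producer-consumer loop. The function names and the tuple layout below are hypothetical, the Exclusive case is simplified (memory supplies the data), and the eventual write-back when an Owned line is evicted is deliberately not counted.

```python
def snoop_read(state, protocol):
    """Reaction of a cache holding a line in `state` to another core's read.

    Returns (next_state, supplies_data, writes_back_to_memory).
    """
    if state == "M":
        if protocol == "MOESI":
            return "O", True, False   # dirty sharing: memory stays stale
        return "S", True, True        # MESI: memory is updated on the flush
    if state == "O":
        return "O", True, False       # the owner keeps supplying the data
    if state == "E":
        return "S", False, False      # simplification: memory supplies it
    return state, False, False        # S and I: nothing to do

def writebacks(protocol, rounds):
    """Count write-backs when a producer writes and a consumer reads, repeatedly."""
    count = 0
    for _ in range(rounds):
        state = "M"                   # producer writes (invalidating the consumer)
        state, _, wrote_back = snoop_read(state, protocol)
        count += wrote_back           # True counts as 1
    return count

print(writebacks("MESI", 100), writebacks("MOESI", 100))  # prints "100 0"
```

In this toy model every consumer read costs MESI a write-back, while MOESI pays none until the line leaves the owner's cache.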
When Does Adaptive Make Sense?
In my implementation of an adaptive MOESI variant, I found that the optimal strategy depends heavily on the workload:
- Producer-consumer patterns: MOESI wins because dirty sharing avoids write-backs on the critical path.
- Migratory data: Both protocols perform similarly, because each line moves from one core to the next and is rarely shared for long.
- Widely-shared read-mostly data: The overhead of tracking the Owned state can hurt — simpler MESI-like behavior is better.
The adaptive protocol monitors sharing patterns and switches between invalidation-based and update-based coherence actions at runtime. On a 16-core SMP simulation, this reduced bus traffic by up to 18% on mixed workloads compared to static MOESI.
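One plausible way to drive such a switch is a small per-line saturating counter. The sketch below is not the implementation measured above: the `LinePolicy` class, the 2-bit counter range, and the threshold are all hypothetical choices made to illustrate the idea.

```python
COUNTER_MAX = 3        # 2-bit saturating counter (illustrative choice)
UPDATE_THRESHOLD = 2   # at or above this, prefer update-based actions

class LinePolicy:
    """Hypothetical per-line policy: invalidate or update remote copies."""

    def __init__(self):
        self.counter = 1  # start biased toward invalidation

    def record_write(self, remote_read_since_last_write):
        # If a remote core read the previous value, pushing updates to it
        # would have been useful; otherwise updates are wasted bus traffic.
        if remote_read_since_last_write:
            self.counter = min(self.counter + 1, COUNTER_MAX)
        else:
            self.counter = max(self.counter - 1, 0)

    def action(self):
        return "update" if self.counter >= UPDATE_THRESHOLD else "invalidate"

policy = LinePolicy()
history = []
for consumed in [True, True, False, False, False]:
    policy.record_write(consumed)
    history.append(policy.action())

print(history)
# ['update', 'update', 'update', 'invalidate', 'invalidate']
```

The saturating counter adds hysteresis, so a single write that nobody reads does not immediately flip a line that is otherwise being actively consumed.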
Key Takeaways
- Coherence protocols are fundamentally about trading off bus traffic, memory bandwidth, and cache utilization.
- The Owned state in MOESI is most valuable when dirty sharing is common.
- Adaptive approaches can outperform static protocols, but add complexity to the controller state machine.
- When evaluating protocols, sweep across cache sizes, associativity, and block sizes — the optimal choice changes with the memory hierarchy configuration.
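A sweep like this can be set up as a simple parameter grid. The specific sizes below are illustrative values, not the ones used in the experiments described here, and the step that would feed each configuration to a simulator is left out because it depends on the tool.

```python
from itertools import product

# Illustrative parameter ranges (assumptions, not the author's settings).
cache_sizes = [16 * 1024, 32 * 1024, 64 * 1024]  # bytes
associativities = [1, 2, 4, 8]                    # ways
block_sizes = [32, 64, 128]                       # bytes

def configs():
    """Yield every geometrically valid cache configuration in the grid."""
    for size, ways, block in product(cache_sizes, associativities, block_sizes):
        sets, remainder = divmod(size, ways * block)
        if remainder == 0 and sets >= 1:          # skip impossible geometries
            yield {"size": size, "ways": ways, "block": block, "sets": sets}

grid = list(configs())
print(len(grid))   # prints 36
print(grid[0])     # {'size': 16384, 'ways': 1, 'block': 32, 'sets': 512}
```

Each dictionary can then be passed to whatever simulator is in use, once per protocol, so the results can be compared configuration by configuration.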