Understanding Memory Hierarchy in SoCs: Caches, TLBs, and Coherence

Introduction

Modern System-on-Chips (SoCs) are designed to deliver high performance, low latency, and energy-efficient computation. One of the most important contributors to SoC performance is the memory hierarchy: a multi-level organisation that lets the processor access data quickly while keeping overall system cost and power low.

This blog explains key components of the memory hierarchy in SoCs: caches, Translation Lookaside Buffers (TLBs), and cache coherence mechanisms, along with simple example snippets.

Why the Memory Hierarchy Matters in SoCs

CPUs operate much faster than memory. A typical core may run at GHz frequencies, while accessing DRAM takes hundreds of cycles. If every memory access went to DRAM, performance would collapse.

To solve this, SoCs rely on a hierarchical memory structure:

  • Registers (fastest, accessed within a cycle)
  • L1/L2/L3 caches (a few to tens of cycles)
  • On-chip SRAM / scratchpads
  • Off-chip DRAM (hundreds of cycles)

Each level acts as a filter, holding recently accessed data close to the processor. A deeper hierarchy improves performance, provided data is managed efficiently across the levels.

Cache Architecture in SoCs

Caches store small chunks of data called cache lines (typically 32–128 bytes) fetched from the main memory. Most SoCs use multi-level caches:

L1 Cache

  • Split into I-cache and D-cache
  • Very small (16–64 KB)
  • Fastest and placed close to the core

L2 / L3 Cache

  • Larger and shared among cores
  • Acts as a buffer between L1 and DRAM

Cache Mapping Techniques

  • Direct-mapped: simple but prone to conflicts
  • Set-associative: balance of speed and conflict reduction
  • Fully-associative: best flexibility but expensive
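
To make the mapping concrete, here is a minimal sketch of how an address splits into tag, set index, and byte offset, assuming a hypothetical 32 KB, 4-way set-associative cache with 64-byte lines (the geometry is chosen purely for illustration, not taken from any particular SoC):

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical cache geometry: 32 KB, 4-way set-associative, 64-byte lines */
#define LINE_SIZE   64                                     /* bytes per line */
#define NUM_WAYS    4
#define CACHE_SIZE  (32 * 1024)
#define NUM_SETS    (CACHE_SIZE / (LINE_SIZE * NUM_WAYS))  /* 128 sets */

int main(void)
{
    uint32_t addr = 0x80014A6C;                      /* example address       */

    uint32_t offset = addr % LINE_SIZE;              /* byte within the line  */
    uint32_t index  = (addr / LINE_SIZE) % NUM_SETS; /* which set to search   */
    uint32_t tag    = addr / (LINE_SIZE * NUM_SETS); /* identifies the line   */

    printf("tag=0x%x index=%u offset=%u\n",
           (unsigned)tag, (unsigned)index, (unsigned)offset);
    return 0;
}
```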

Example: Simple Cache Access Flow (Pseudocode)

The sketch below illustrates how the hierarchy reduces average memory access latency.
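
This is a minimal C sketch of the flow; the hit checks and cycle counts are illustrative stand-ins, not a model of any specific core:

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Illustrative latencies in cycles (typical orders of magnitude only). */
#define L1_LATENCY    4
#define L2_LATENCY   12
#define DRAM_LATENCY 200

/* Stubs standing in for real tag-compare logic. */
static bool l1_hit(uint64_t addr) { return (addr & 0xFF) < 0xC0; }
static bool l2_hit(uint64_t addr) { return (addr & 0xFF) < 0xF0; }

/* Walk the hierarchy and return the access latency in cycles. */
static int memory_access(uint64_t addr)
{
    if (l1_hit(addr))
        return L1_LATENCY;                 /* served from L1            */
    if (l2_hit(addr))
        return L1_LATENCY + L2_LATENCY;    /* L1 miss, L2 hit           */
    /* Both caches miss: fetch from DRAM, filling the caches on return. */
    return L1_LATENCY + L2_LATENCY + DRAM_LATENCY;
}

int main(void)
{
    printf("latency = %d cycles\n", memory_access(0x1000));
    return 0;
}
```

As a rule of thumb, average access time = hit time + miss rate × miss penalty, applied level by level; the code simply walks that chain.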

Understanding TLBs (Translation Lookaside Buffers)

Virtual memory systems translate virtual addresses to physical addresses. A naive page table lookup takes several memory accesses, so SoCs introduce the Translation Lookaside Buffer (TLB): a small cache dedicated to storing recent address translations.

Key properties of TLBs

  • Similar to caches, but store page table entries (PTEs)
  • Typically 32–128 entries in L1 TLBs
  • Multi-level: L1 TLB + shared L2 TLB
  • Miss penalties are very high (walk page tables → multiple memory accesses)

TLB Hit vs Miss

  • Hit: the translation is returned in about a cycle
  • Miss: a hardware or software page-table walk is triggered

TLB Miss Example (Pseudocode)
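
Here is a minimal C sketch of a direct-mapped TLB with a page-table-walk fallback; page_table_walk is a hypothetical stub (a real walk takes several dependent memory accesses):

```c
#include <stdint.h>
#include <stdio.h>

#define PAGE_SHIFT  12           /* 4 KB pages */
#define TLB_ENTRIES 64

/* One TLB entry: a cached virtual-to-physical page mapping. */
struct tlb_entry {
    uint64_t vpn;                /* virtual page number   */
    uint64_t pfn;                /* physical frame number */
    int      valid;
};

static struct tlb_entry tlb[TLB_ENTRIES];

/* Stub for the (slow) multi-level page-table walk. For illustration
 * we simply pretend the mapping is identity. */
static uint64_t page_table_walk(uint64_t vpn)
{
    return vpn;
}

/* Translate a virtual address, refilling the TLB on a miss. */
static uint64_t translate(uint64_t vaddr)
{
    uint64_t vpn    = vaddr >> PAGE_SHIFT;
    uint64_t offset = vaddr & ((1 << PAGE_SHIFT) - 1);
    unsigned slot   = vpn % TLB_ENTRIES;       /* direct-mapped TLB */

    if (tlb[slot].valid && tlb[slot].vpn == vpn)   /* TLB hit: fast path */
        return (tlb[slot].pfn << PAGE_SHIFT) | offset;

    /* TLB miss: walk the page tables, then refill the entry. */
    uint64_t pfn = page_table_walk(vpn);
    tlb[slot] = (struct tlb_entry){ .vpn = vpn, .pfn = pfn, .valid = 1 };
    return (pfn << PAGE_SHIFT) | offset;
}

int main(void)
{
    printf("PA = 0x%llx\n", (unsigned long long)translate(0x7fff1234));
    return 0;
}
```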

Efficient TLB design is critical in SoCs running Linux or real-time OSes, where context switching happens frequently.

Cache Coherence in Multi-Core SoCs

Modern SoCs house multiple CPU cores sharing memory. When each core has its own cache, a problem arises:
if multiple caches hold copies of the same memory location, they can become inconsistent unless the system maintains cache coherence.

Common Coherence Protocols

  • MESI – Modified, Exclusive, Shared, Invalid
  • MOESI – adds Owned state
  • MESIF – Intel variant with Forward state
  • Directory-based coherence – scalable for many cores

Why Coherence Matters

Without coherence:

  • One core may read stale data
  • Writes may not propagate
  • Parallel programs behave unpredictably

MESI Protocol Example

Core0 writes X → its cache line moves to the M (Modified) state

Core1 reads X → Core0 supplies the updated data; Core1's copy enters the S (Shared) state

Core0 reads X again → a cache hit; the line stays in S under MESI, or in O (Owned) under MOESI, depending on the protocol
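
The same transitions can be written as a tiny state machine. This sketch models a single line in one core's cache; the event names are invented for the example, and the "no other sharers" simplification on a read miss is an assumption:

```c
#include <stdio.h>

/* MESI states for one cache line in one core's cache. */
typedef enum { INVALID, SHARED, EXCLUSIVE, MODIFIED } mesi_t;

/* Events: the local core's accesses, plus requests snooped from other
 * cores on the interconnect. */
typedef enum { LOCAL_READ, LOCAL_WRITE, REMOTE_READ, REMOTE_WRITE } event_t;

static mesi_t next_state(mesi_t s, event_t e)
{
    switch (e) {
    case LOCAL_READ:
        /* A read miss in I fetches the line; with no other sharers it
         * arrives Exclusive (simplifying assumption). */
        return (s == INVALID) ? EXCLUSIVE : s;
    case LOCAL_WRITE:
        /* A write always ends in M; other copies get invalidated. */
        return MODIFIED;
    case REMOTE_READ:
        /* Another core reads: we supply dirty data and demote to S. */
        return (s == INVALID) ? INVALID : SHARED;
    case REMOTE_WRITE:
        /* Another core writes: our copy is stale, so invalidate it. */
        return INVALID;
    }
    return s;
}

int main(void)
{
    mesi_t core0 = INVALID;
    core0 = next_state(core0, LOCAL_WRITE);  /* Core0 writes X -> M          */
    core0 = next_state(core0, REMOTE_READ);  /* Core1 reads X  -> demote to S */
    printf("Core0 line state: %d (1 == SHARED)\n", core0);
    return 0;
}
```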

Putting It All Together: End-to-End Memory Access

Below is a simplified view of what happens when a core reads memory in a modern SoC:

  1. The TLB checks whether a translation for the virtual address is cached
  2. If miss → page-table walk
  3. Physical address sent to L1 cache
  4. If miss → check L2 cache
  5. If other cores may hold the line → coherence protocol ensures correct version
  6. If all miss → fetch from DRAM
  7. Return data up the hierarchy and update caches

Each layer either accelerates access or ensures correctness.
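
In code form, the whole path is just the composition of the earlier sketches. All helper functions below are hypothetical one-line stubs standing in for hardware blocks, shown only to make the ordering explicit:

```c
#include <stdint.h>
#include <stdbool.h>
#include <stdio.h>

/* Invented stubs: TLB translation, cache lookups, coherence snoop, DRAM. */
static uint64_t translate(uint64_t va)           { return va; }
static bool     l1_hit(uint64_t pa)              { (void)pa; return false; }
static bool     l2_hit(uint64_t pa)              { (void)pa; return false; }
static bool     other_core_has_line(uint64_t pa) { (void)pa; return false; }
static uint64_t dram_fetch(uint64_t pa)          { (void)pa; return 42; }

static uint64_t soc_read(uint64_t vaddr)
{
    uint64_t paddr = translate(vaddr);           /* steps 1-2: TLB / walk   */
    if (l1_hit(paddr)) return 0;                 /* step 3: L1 lookup       */
    if (l2_hit(paddr)) return 0;                 /* step 4: L2 lookup       */
    if (other_core_has_line(paddr)) return 0;    /* step 5: coherence snoop */
    return dram_fetch(paddr);                    /* steps 6-7: DRAM + fill  */
}

int main(void)
{
    printf("data = %llu\n", (unsigned long long)soc_read(0x1000));
    return 0;
}
```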

Practical Example: RISC-V SoC Memory Access Test

A small C snippet along these lines demonstrates how memory locality impacts cache/TLB behavior; the buffer size and stride below are arbitrary, and absolute timings will vary by platform:
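
```c
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

#define N      (64 * 1024 * 1024) /* 64 MB: large enough to spill the caches  */
#define STRIDE 4096               /* one page: defeats line and TLB locality  */

/* Touch every byte either sequentially or with a page-sized stride,
 * and time the difference. Both variants perform exactly N updates. */
static double walk(volatile char *buf, size_t stride)
{
    clock_t t0 = clock();
    for (size_t start = 0; start < stride; start++)
        for (size_t i = start; i < N; i += stride)
            buf[i]++;
    return (double)(clock() - t0) / CLOCKS_PER_SEC;
}

int main(void)
{
    char *buf = malloc(N);
    if (!buf) return 1;
    printf("sequential: %.3f s\n", walk(buf, 1));
    printf("strided:    %.3f s\n", walk(buf, STRIDE));
    free(buf);
    return 0;
}
```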

Sequential access runs significantly faster due to cache-line and TLB locality.

Conclusion

The memory hierarchy in SoCs plays a pivotal role in enabling modern processors to achieve high performance while keeping power consumption reasonable. Caches reduce the time to fetch data, TLBs accelerate address translation, and coherence protocols maintain consistency across multiple cores.

Together, they form the backbone of efficient system design. Engineers building SoCs, firmware, or device drivers must deeply understand these concepts to optimize performance and ensure correctness.

About the Author: Raghavendra H

Raghavendra Havaldar focuses on delivering high-quality training in VLSI design and RTL development at Maven Silicon. He has over 18 years of combined industry and academic experience and strong expertise in Verilog, RISC-V architecture, FPGA, GPIO, and AHB-APB protocols. He has played a key role in developing RTL for RISC-V cores and building self-checking testbenches, while also training hundreds of engineering graduates and professionals in frontend VLSI technologies.
