AMD has filed a patent application for a novel cache architecture that stacks L2 memory layers vertically, aiming to deliver lower latency than conventional planar designs. The approach builds on its existing 3D V-Cache technology but targets the more critical L2 layer, which sits closer to the CPU cores and handles high-frequency data access.

The proposed design uses through-silicon vias (TSVs) to connect stacked cache dies directly above or below a base compute die, eliminating the long wire paths that typically add latency in planar layouts. Initial figures suggest a 1 MB stacked L2 configuration could achieve 12-cycle latency, compared to the 14 cycles observed in planar equivalents, while also reducing power consumption by shortening signal distances and decreasing capacitance.
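To put the two-cycle improvement in context, the standard average memory access time (AMAT) formula can sketch its system-level effect. Only the 14- and 12-cycle L2 latencies come from the filing; the hit rates and other latencies below are assumed, typical values for illustration.

```python
# Illustrative AMAT comparison for planar vs. stacked L2.
# Only the L2 latencies (14 vs. 12 cycles) come from the patent figures;
# every other number here is an assumed, typical value.

def amat(l1_hit, l1_lat, l2_hit, l2_lat, miss_lat):
    """Average memory access time in cycles for a two-level cache hierarchy."""
    l1_miss = 1.0 - l1_hit
    l2_miss = 1.0 - l2_hit
    return l1_lat + l1_miss * (l2_lat + l2_miss * miss_lat)

# Assumed: 95% L1 hit rate at 4 cycles, 80% L2 hit rate,
# and a flat 40-cycle penalty for anything that misses L2.
planar  = amat(0.95, 4, 0.80, 14, 40)
stacked = amat(0.95, 4, 0.80, 12, 40)

print(f"planar  AMAT: {planar:.2f} cycles")
print(f"stacked AMAT: {stacked:.2f} cycles")
print(f"saving:       {planar - stacked:.2f} cycles per access")
```

Under these assumptions the gain is modest per access, but every L1 miss benefits, so L2-bound workloads see it compound.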

This isn't AMD's first foray into vertical cache stacking. Its 3D V-Cache technology, which stacks L3 memory layers, has already been implemented in products ranging from Ryzen mobile processors to EPYC datacenter chips. However, the new research indicates a shift toward addressing the L2 layer—a move that could have broader implications for performance and thermal efficiency.

[Image: AMD headquarters, 2024-03-18]

Key details from the filing:
  • Cache Architecture: Stacked L2 cache (1 MB–4 MB configurations)
  • Latency Improvement: 12 cycles (stacked) vs. 14 cycles (planar)
  • Connection Method: Through-silicon vias (TSVs) routed through the center of stacked dies
  • Power Benefits: Reduced capacitance, shorter signal paths, faster active-to-idle transitions
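The power claim follows from the first-order model of dynamic switching power, P = α·C·V²·f: if shorter vertical paths cut wire capacitance, dynamic power falls proportionally. The sketch below uses entirely assumed numbers, including the 30% capacitance reduction; the filing only makes the qualitative claim.

```python
# Toy model of the power claim: dynamic switching power scales linearly
# with load capacitance (P = alpha * C * V^2 * f), so shorter vertical
# signal paths that cut wire capacitance cut dynamic power proportionally.
# All values below are assumed for illustration.

def dynamic_power(activity, cap_farads, volts, freq_hz):
    """Dynamic switching power in watts: alpha * C * V^2 * f."""
    return activity * cap_farads * volts**2 * freq_hz

V, F, ALPHA = 1.0, 4e9, 0.1          # assumed: 1.0 V, 4 GHz, 10% activity factor
planar_cap  = 1.0e-12                # assumed 1.0 pF effective wire load (planar)
stacked_cap = 0.7e-12                # assumed 30% lower load via TSVs

p_planar  = dynamic_power(ALPHA, planar_cap, V, F)
p_stacked = dynamic_power(ALPHA, stacked_cap, V, F)
print(f"relative dynamic power: {p_stacked / p_planar:.0%}")
```

Because the relationship is linear, any capacitance saving translates one-for-one into dynamic power saved, which is why the same change helps both latency and thermals.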

The implications are twofold: better performance for workloads where L2 cache access is a bottleneck, and improved power efficiency in mobile and embedded systems where thermal output must be tightly controlled. If implemented, the design could let AMD offer larger cache capacities without the traditional latency penalties, potentially influencing future CPU and GPU designs across its product line.

While no timeline has been provided, the patent suggests this is an active area of development. Whether the technology first appears in consumer chips or high-performance computing remains unclear, but the focus on adding cache capacity without sacrificing latency positions it as a strong candidate for next-generation architectures.