Each new CPU generation has increased the number of cores. Each core requires fast access to main memory, and the bandwidth required by a CPU is also increasing. The solution is stacking layers of DRAM chips atop each other and combining them with control logic into an HMC – Hybrid Memory Cube.
Traditional memory channels have wide interfaces that run multiple devices in lockstep, are extremely sensitive to signal loading, and require coordinated accesses to prevent bus contention. HMC instead uses high-speed serializer/deserializer (SERDES) channels to interface to the outside world. These channels run directly between devices and operate independently of each other, yet each can reach any memory location within the cube. Each HMC lane runs at 10 Gb/s, 12.5 Gb/s, or 15 Gb/s, with multiple channel configurations to support almost any bandwidth requirement up to 160 GB/s per cube. Not only does the cube provide bandwidth scalability, it also supports cube-to-cube chaining to increase memory density without increasing the pin count of the host.
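To see how lane speed and channel width combine, here is a back-of-the-envelope sketch. The configuration it assumes (four full-width links of 16 lanes each at 10 Gb/s per lane, counting both directions) is one plausible way to arrive at the 160 GB/s figure, not a statement of the exact product configuration:

```python
def link_bandwidth_gbs(lane_rate_gbps: float, lanes: int) -> float:
    """Bandwidth of one SERDES link in GB/s, per direction (8 bits per byte)."""
    return lane_rate_gbps * lanes / 8

# Assumed configuration: four 16-lane links at 10 Gb/s per lane,
# counting both directions of the full-duplex links.
per_link = link_bandwidth_gbs(10, 16)   # 20.0 GB/s each way
cube_total = 4 * per_link * 2           # 4 links, both directions
print(cube_total)                       # 160.0
```

Raising the lane rate to 12.5 or 15 Gb/s, or changing the number of lanes per link, scales the result accordingly, which is how the same protocol covers a range of bandwidth requirements.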
Before HMC, it was impossible to sustain 160 GB/s (that's 1.28 Tb/s!) of memory bandwidth all day long. It took the efforts of many accomplished engineers, working for several years, to resolve the well-documented "memory wall" problem with HMC. The HMC stack consists of four or eight memory layers and one logic layer. The memory is based on our high-volume process node but designed just for HMC. Each memory layer has millions of memory cells in defined groups (vaults) with complex support logic (a vault controller) that controls all aspects of the memory cells and provides an interface to the internal crossbar switch. The logic die is manufactured at a world-class logic foundry, so we can maximize the density of the memory layers and take advantage of the predictability of the logic process to ensure robust vault controllers. HMC has 16 vaults that operate independently of each other and are designed to sustain 10 GB/s (80 Gb/s) of true memory bandwidth each. The logic layer also supports the external interfaces, crossbar switch, memory schedulers, built-in self-test (BIST), sideband channels, and numerous reliability, availability, and serviceability (RAS) features.
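The per-vault numbers line up with the cube's external peak; a quick check of the arithmetic stated above:

```python
# Numbers from the text: 16 independent vaults, 10 GB/s sustained per vault.
VAULTS = 16
VAULT_BW_GBS = 10  # GB/s per vault (80 Gb/s)

aggregate_gbs = VAULTS * VAULT_BW_GBS  # internal bandwidth across all vaults
print(aggregate_gbs)                   # 160 -- matches the 160 GB/s peak per cube
```

Because each vault has its own controller and scheduler, this aggregate is available even when requests from different links target different vaults concurrently.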
Micron is sampling 2GB and 4GB HMC modules with maximum bandwidths of 120 GB/s and 160 GB/s, respectively.
The HMC data sheet has no memory timing requirements; no RAS, CAS, WE, and CS signals; no refresh requirements; no tFAW, tRP, tWR, and tRC restrictions; and no slew rate adjustments for voltage or bus loading variations. HMC is agnostic to memory; it can support any memory type without changes to the protocol. The standardized communication protocol defined by the Hybrid Memory Cube Consortium (HMCC) supports multiple READ, WRITE, and ATOMIC commands within a reliable packet with a header and tail that provide data routing between the host and all cubes within the channel. The protocol is simple and adaptive to future HMC devices.
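The packet-based protocol can be pictured as a small data structure: a header carrying the command, address, tag, and routing information, and a tail carrying an integrity check. The sketch below is a toy model under those assumptions; the field names, the command subset, and the use of `zlib.crc32` are illustrative, not the HMCC specification's actual bit layout:

```python
import zlib
from dataclasses import dataclass
from enum import Enum

class Cmd(Enum):
    # Illustrative subset of the command classes the protocol defines
    READ = 0
    WRITE = 1
    ATOMIC_ADD = 2

@dataclass
class RequestPacket:
    cmd: Cmd
    addr: int          # target memory location within the cube
    tag: int           # lets the host match out-of-order responses
    cube_id: int = 0   # routes the packet when cubes are chained
    payload: bytes = b""

    def tail_crc(self) -> int:
        """Toy integrity check standing in for the CRC carried in the tail."""
        header = f"{self.cmd.value}:{self.addr}:{self.tag}:{self.cube_id}".encode()
        return zlib.crc32(header + self.payload)

pkt = RequestPacket(Cmd.WRITE, addr=0x2000, tag=7, payload=b"\xff" * 16)
print(pkt.cube_id, pkt.tail_crc() != 0)
```

Because the address, routing, and integrity information travel inside the packet, the host needs no per-signal timing knowledge of the memory behind the link, which is what makes the protocol memory-agnostic.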
HMC is developed by a consortium with eight lead developers: Altera, ARM, IBM, SK Hynix, Micron, Open-Silicon, Samsung, and Xilinx. An HMC 1.0 specification has been drawn up and released, and there are more than 100 HMC adopters listed by the consortium.
Adopters can use HMC as "near memory" mounted close to the processors using it, or as "far memory" featuring scale-out HMC modules and better power efficiency.