Opportunities and Challenges for the
Nanometric Design of Post-CMOS Memories
Fabrizio Lombardi, ITC Endowed Chair Professor
Dept. of ECE, Northeastern University, Boston
CMOS: currently at 28/22nm, soon to move further down in scaling (ITRS)
New commercial markets: GPU, tablet, massive external storage (mostly portable)
Emerging paradigms: multi-value operation, non-volatile RAM, processing-in-memory
Challenges: New designs abound, but not yet a clear winner
Memory: today is already the past
CMOS is not going away any time soon: More Moore, More-Than-Moore, Beyond CMOS
[Figure: Evolution of extended CMOS (ITRS); elements vs. year, with More Than Moore and Beyond CMOS regions]
Logic Technologies (ITRS 2011)
Extending MOSFETs to the End of the Roadmap
• CNTFETs
• Graphene nanoribbons
• III-V channel MOSFETs
• Ge channel MOSFETs
• Nanowire FETs
• Tunnel FET
• Non-conventional geometry devices
Unconventional FETs, Charge-based Extended CMOS Devices
• Spin FET & Spin MOSFET
• Negative Cg MOSFET
• NEMS switch
• Excitonic FET, Mott FET
• Tunnel FET
• I-MOS
• SET
Non-FET, Non-Charge-based 'Beyond CMOS' Devices
• Spin Transfer Torque Logic
• Moving domain wall devices
• Pseudo-spintronic devices
• Nanomagnetic (M:QCA)
• Negative Cg MOSFET
• All Spin Logic
• Molecular Switch
• Atomic Switch
• BiSFET
Memory Technologies (ITRS 2011)
Resistive Memories
• Redox Memory: nanoionic memory, electrochemical memory, fuse/antifuse memory
• Molecular Memory
• Electronic Effects Memory: charge trapping, metal-insulator transition, FE barrier effects
• Spin Transfer Torque MRAM
• Nanoelectromechanical
• Nanowire PCM
• Macromolecular (Polymer)
Capacitive Memory
• FeFET Memory
Memory technology (ITRS2011)
ITRS+IBM: Memories
NVM cost/gigabyte ~ $1 (Intel)
CMOS vs. Post-CMOS Memories
CMOS:
• PVT variations
• Stability (SNM) concern
• Power dissipation
• Charge diffusion and collection in the layout
• Basic binary operation (supply voltage requirements)
• Inability to meet large storage needs
• Likely soft errors
Post-CMOS:
• Avoid large capital investment; selectively use new/compatible technologies
• Preferably, hybrid circuits
• Multi-level (multi-bit) operation
• Processing in memory (PIM)
• Problematic endurance
Move to higher radix bases than binary: ternary, quaternary or eventually octal
Bases:
1. Ternary: used for CAM processing, mostly in routers but also in GPUs (cache)
2. Quaternary/Octal: increase capacity for massive storage (to replace flash memories)
Not efficiently done in CMOS (additional voltage rails and high area/power penalty)
Use radically new technologies
Multi-Level (Multi-bit) Operation
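The capacity gain from a higher radix can be checked with a short calculation. The sketch below assumes an ideal cell whose levels are all perfectly distinguishable; the function names and the 1024-bit block size are illustrative choices, not values from the slides.

```python
import math

def bits_per_cell(levels):
    """Information held by one ideal cell with `levels` distinguishable states."""
    if levels < 2:
        raise ValueError("a memory cell needs at least two levels")
    return math.log2(levels)

def cells_needed(total_bits, levels):
    """Minimum number of cells needed to store `total_bits` bits."""
    return math.ceil(total_bits / bits_per_cell(levels))

for name, levels in [("binary", 2), ("ternary", 3), ("quaternary", 4), ("octal", 8)]:
    print(f"{name:10s} {bits_per_cell(levels):.3f} bits/cell, "
          f"{cells_needed(1024, levels)} cells for 1024 bits")
```

Going from binary to octal triples the bits per cell, which is the motivation for quaternary/octal operation in massive storage.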
ITRS: memory has always met stated objectives in the past
Late 2014 as a crucial initial milestone with respect to performance (power dissipation and density) and design fundamentals.
Discuss new (emerging) directions:
• Unorthodox technologies (briefly)
• Material-based technologies
• Focus on non-volatile memories
Emerging Technology Trends
Innovative operational paradigms for memory using new physics storage phenomena:
1. QCA (memory in motion); challenge is room temperature operation and CMOS compatibility for manufacturing
2. SET (controlled transfer of electrons for memory operation purposes)
Long-term opportunities abound, but grand challenges too
Currently applicable mostly to academic investigation
Unorthodox Technologies
Exploit new materials and fabrication methods (CMOS compatible) to meet challenges
Additional criteria:
1. Hybrid operation is usually sought
2. Robustness to PVT variations/endurance
3. New design realms:
• Multi-level (resistance) for increased capacity
• Ambipolar operation for control
APPLICATION: non-volatile storage
Material-based technologies
Emerging Research Memory Technology Stand-Alone Embedded
Ferroelectric-gate FET X
Nanoelectromechanical RAM X X
Spin Transfer Torque MRAM X
Nanoionic or Redox Memory X X
Nanowire Phase Change Memory (PCM) X X
Electronic Effects (Charge trapping, Mott) X
Macromolecular memory X X
Molecular memory X X
2011 Memory Application (ITRS)
Also known as resistive RAMs: add (programmable) resistive element(s) to active device(s) (usually 1T1R for the simplest non-volatile cell design)
Issues:
1. Resistance range (Rmax-Rmin)
2. Power dissipation and leakage
3. Programmability and universal memory feature
4. Error/defect models (soft and drift)
5. Endurance (related to read/write operation)
6. Testing
Non-Volatile Memories
FEATURE              NOR      NAND         PCM     MRAM    FRAM
Capacity             256MB    16GB         32MB    2MB     1MB
Random Read          Yes      No           Yes     Yes     Yes
Random Write         No       No           Yes     Yes     Yes
Endurance            10^5     10^5-10^3    10^6    10^15   10^14
Management           High     High         Mod     No      No
Error Correction     No       1-72 bits    *       No      No
Retention (years)    10       1-10         15      20      5-20
Read Access (ns)     60       60           10      35      60
Prog Access (us)     200      200          20      35      60
Erase Access (ms)    1-100    1-100        50      35      60
Power                Mid      Mid          Mid     Low     Low
Cell size (F^2)      10       4            4       6-20    4-15
Universal Memory     No       No           Yes     Yes     Yes
Flash vs NV-RRAMs (late 2012)
Roadmap (IBM 2012)
Competition
Flash memory seen as a mature technology, unable to capitalize on scaling and not meeting high density storage for mobile application
Low lifetime due to high-voltage based process
Apple and Anobit (2012)
Additional players: Samsung, Micron, IBM
Moving on…..
Why resistive-based memories?
Enables true crossbar structures at system level
• Does not require many transistors or other access devices
Remove silicon requirements:
• Improve density
• Reduce power consumption
• Integrate with processors
• Reduce total area
• Crossbar Inc (August 2013): 3D stacking, 1 TByte on-chip prototype (using FeRRAM)
Cell size = 4F^2
Pitch = 2F for crossbars
Feature size = litho node F
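The 4F^2 cell size translates directly into an areal density. The sketch below is a back-of-the-envelope estimate only: it ignores peripheral circuitry, and the function name, the `layers` and `bits_per_cell` parameters and the example feature sizes are assumptions chosen for illustration.

```python
def crossbar_density_bits_per_cm2(feature_nm, layers=1, bits_per_cell=1.0):
    """Ideal crossbar density for a 4F^2 cell (array only, no periphery)."""
    cell_area_nm2 = 4.0 * feature_nm ** 2      # pitch of 2F in each direction
    cells_per_cm2 = 1e14 / cell_area_nm2       # 1 cm^2 = 1e14 nm^2
    return cells_per_cm2 * layers * bits_per_cell

for feature_nm in (28, 22):
    density = crossbar_density_bits_per_cm2(feature_nm)
    print(f"F = {feature_nm} nm: {density:.2e} bits/cm^2 per layer")
```

Stacking layers (3D) or storing more than one bit per cell scales the result linearly in this idealized model.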
The Memristor: Prediction (Leon Chua, U.C. Berkeley, 1971)
RESISTOR (Ohm, 1827): dv = R di
CAPACITOR (Von Kleist, 1745): dq = C dv
INDUCTOR (Faraday, 1831): dφ = L di
MEMRISTOR (Chua, 1971): dφ = M dq
Relations between the four circuit variables (v, i, q, φ): dφ/dt = v, dq/dt = i
Fourth Fundamental, Two-Terminal Circuit Element
Memristor
• Resistance depends on direction of voltage or current across it (dϕ = M*dq)
• Titanium dioxide film sandwiched between two platinum electrodes; doped operation (HP Labs), 5-10 nm in length
• Resistance range: between Ron and Roff
  Roff: highest resistance
  Ron: lowest resistance
• Excellent linearity in switching
• Resistive range is good
• I-V characteristics are also very good
• Nanometric dimension (10 nm in 2011, 5 nm in 2013): very high density potential at extremely low power consumption
• Manufacturing compatibility with CMOS
• Problem: endurance and leakage (on read)
Memristor vs. xResistive
• Ambipolar control of single memristor
• No standby power: no direct path from VDD to GND, only dynamic power dissipation
• Fewer transistors than a 6T RAM cell
Memristor-Based Memory Cell
Performance of Binary Cell
• Memristor changes its value when reading the Roff state
• Refresh operation is required
• Write time significantly higher than read

Technology         32 nm             45 nm             65 nm
VDD                0.9 V    1 V      0.9 V    1 V      0.9 V    1 V
Write time (ns)    160      150      195      180      235      200
Read time (ns)     0.8      0.75     0.975    0.9      1.175    1
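To quantify "write time significantly higher than read", the ratio can be computed from the tabulated values. The sketch assumes the six columns pair each technology node (32, 45, 65 nm) with VDD = 0.9 V and 1 V, in that order; the dictionary layout and function name are illustrative.

```python
# (node, VDD in V) -> (write time in ns, read time in ns), copied from the table
TIMINGS_NS = {
    ("32 nm", 0.9): (160.0, 0.8),
    ("32 nm", 1.0): (150.0, 0.75),
    ("45 nm", 0.9): (195.0, 0.975),
    ("45 nm", 1.0): (180.0, 0.9),
    ("65 nm", 0.9): (235.0, 1.175),
    ("65 nm", 1.0): (200.0, 1.0),
}

def write_read_ratios(timings):
    """Return write/read time ratio for every (node, VDD) entry."""
    return {key: write / read for key, (write, read) in timings.items()}

for (node, vdd), ratio in write_read_ratios(TIMINGS_NS).items():
    print(f"{node} @ {vdd} V: write is {ratio:.0f}x slower than read")
```

Under this reading of the table, the write is about 200 times slower than the read at every node and supply voltage listed.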
Endurance: stuck-at-1 (HP data)
[Figure: Ron and Roff resistance (ohm, 10^2 to 10^4) vs. switching cycles (10^0 to 10^6); device: Ti 1nm / Pt 100nm / TiOx 29nm / Ti4O7 100nm]
Phase Change Memory
• Use phases of GST (chalcogenide alloy)
• High current-based process for two phases: amorphous (high R) and crystalline (low R)
• No erase-write cycle as for NAND flash (at most 100,000 cycles for enterprise product)
• Ron, programming (write) region: the intersection of the Ron curve with the voltage axis is Vh (holding voltage)
• Roff, read region: this can be changed by an I or V pulse; Roff = Ron*exp(toff/t), where t = effective recombination time (constant) and toff = non-programming time
• Vx is the intersection point of the Ron curve and the Rset curve: Vx = Vh*Rset/(Rset-Ron)
• Typical values: Rset = 7 kΩ, Rreset = 200 kΩ, Ron = 1 kΩ, Vh = 0.45 V, t = 5 ns, with Rset < Roff < Rreset
Resistive Features
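The resistive relations for the PCM can be evaluated with the typical values quoted on the slide. This is a minimal sketch that takes the two formulas at face value; the function names are illustrative, and `time_to_reach` is simply the algebraic inverse of the Roff expression, not something stated in the slides.

```python
import math

R_SET = 7e3       # ohm
R_RESET = 200e3   # ohm
R_ON = 1e3        # ohm
V_H = 0.45        # V, holding voltage
TAU = 5e-9        # s, effective recombination time t

def crossover_voltage(v_h, r_set, r_on):
    """Vx = Vh * Rset / (Rset - Ron)."""
    return v_h * r_set / (r_set - r_on)

def r_off(t_off, r_on=R_ON, tau=TAU):
    """Roff = Ron * exp(toff / t)."""
    return r_on * math.exp(t_off / tau)

def time_to_reach(r_target, r_on=R_ON, tau=TAU):
    """Inverse of r_off: non-programming time at which Roff equals r_target."""
    return tau * math.log(r_target / r_on)

print(f"Vx = {crossover_voltage(V_H, R_SET, R_ON):.3f} V")
print(f"Roff reaches Rset after {time_to_reach(R_SET) * 1e9:.2f} ns")
print(f"Roff reaches Rreset after {time_to_reach(R_RESET) * 1e9:.2f} ns")
```

With these values Vx comes out at about 0.525 V, and the condition Rset < Roff < Rreset corresponds to toff between roughly 9.7 ns and 26.5 ns under the exponential expression.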
Mobile devices (Samsung)
PCM likely to be a depository (for less frequently accessed data) next to DRAM for processor design (IBM)
Networking/Communication systems: CAM/TCAM designs
Massive storage for data acquisition systems
Applications
ISSCC11: Samsung (1-Gbit, 58-nm manufacturing process, low-power double-data-rate nonvolatile memory interface)
ISSCC12: Samsung (8-Gbit, 20-nm device)
IEDM11: Macronix/IBM (39-nm device with 30-microamp reset current and 10^9 cycling endurance, 128-Mbit)
July 2012: Micron/Numonyx (45 nm PCM for mobile devices in 1 Gb and 512 Mb multichip packages); commercially available
Commercial news
PCM Features
• Low voltage and moderate current as operational characteristics
• Multiple bit operation (at least 2): higher resistance range (MΩ) than other RRAMs
• Read time: 12 ns; write time: 85 ns (@ 45 nm)
• Soft errors highly unlikely to occur for GST
• Good endurance (IBM: 1 million cycles) and density
• Use 1T1P core for both CAM/TCAM
• Functionality is in the support circuitry
• Voltage-based sensing for comparison outcome in search
• Use of circuit with ambipolar properties for comparison and control
New Cell Design
IBM (1/2 PCMs per core), current based operation
New cell (1 PCM per core), voltage based operation
Quantitative Comparison
Circuit                         CAM [20]    CAM Proposed    TCAM [20]    TCAM Proposed
Write Time (ns)                 199.34      199.34          209.53       199.34
Search Time (ns)                1.326       1.092           1.346        2.447
Number of Transistors/Core      1           1               2            1
Number of PCMs/Core             1           1               2            1
PDP of Search Operation (fJ)    46.6886     36.4296         48.41        43.4518
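The comparison is easier to read as relative changes. The sketch below assumes the four data columns are, in order, CAM [20], CAM Proposed, TCAM [20] and TCAM Proposed; the dictionary layout and function name are illustrative.

```python
# (reference design [20], proposed design), values taken from the comparison table
COMPARISON = {
    "CAM":  {"search_ns": (1.326, 1.092), "search_pdp_fJ": (46.6886, 36.4296)},
    "TCAM": {"search_ns": (1.346, 2.447), "search_pdp_fJ": (48.41, 43.4518)},
}

def percent_change(reference, proposed):
    """Signed change of the proposed design relative to the reference, in percent."""
    return 100.0 * (proposed - reference) / reference

for circuit, metrics in COMPARISON.items():
    for metric, (reference, proposed) in metrics.items():
        change = percent_change(reference, proposed)
        print(f"{circuit:4s} {metric:14s} {change:+.1f} %")
```

Under this column reading, the proposed cell lowers the search PDP by roughly 22% for CAM and 10% for TCAM, while the TCAM search time is longer than in the reference design.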
Stored        Search            IML (A)
0 (200kΩ)     0 (VSL = 0)       -1.38*10^-9
0 (200kΩ)     1 (VSL = 0.4)     -1.97*10^-6
1 (7kΩ)       0 (VSL = 0)       -1.38*10^-9
1 (7kΩ)       1 (VSL = 0.4)     -4.15*10^-5
Practical problem: drift of resistance and threshold voltage (when not read or programmed)
Related to crystalline fraction (Cx) in GST:
Rpcm = (1-Cx)*Ra + Cx*Rc (Ra >> Rc), with Ra = Rreset and Rc = Rset
But …….
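The dependence of the PCM resistance on the crystalline fraction can be sketched numerically. The example assumes Ra = Rreset = 200 kΩ and Rc = Rset = 7 kΩ, as in the typical values quoted earlier; the function names and the choice of four equally spaced Cx values are illustrative.

```python
R_A = 200e3   # ohm, amorphous resistance (Ra = Rreset)
R_C = 7e3     # ohm, crystalline resistance (Rc = Rset)

def r_pcm(cx, r_a=R_A, r_c=R_C):
    """Rpcm = (1 - Cx) * Ra + Cx * Rc for a crystalline fraction 0 <= Cx <= 1."""
    if not 0.0 <= cx <= 1.0:
        raise ValueError("crystalline fraction must lie in [0, 1]")
    return (1.0 - cx) * r_a + cx * r_c

def crystalline_fraction(resistance, r_a=R_A, r_c=R_C):
    """Inverse relation: the Cx that gives the requested resistance."""
    return (r_a - resistance) / (r_a - r_c)

for cx in (0.0, 1 / 3, 2 / 3, 1.0):
    print(f"Cx = {cx:.2f}: Rpcm = {r_pcm(cx) / 1e3:.1f} kohm")
```

Any change in Cx over time moves Rpcm along this line, which is why drift matters most when several resistance levels must stay separated.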
Resistance Drift
• Level drift is more pronounced for high-resistance states and is non-linear with respect to time
• Problematic for MVL storage (i.e. more than one bit per cell)
• Order of resistivity for states remains the same (short term), so avoid overlap in the long term
Use advanced modulation coding technique for solving short-term drift (analogous to NAND flash, where electrons leak through the thin walls of cells and create data read errors).
Apply a voltage pulse based on the deviation from the desired level and measure the resistance. If the desired level of resistance is not achieved, apply another voltage pulse and measure again, until the exact level is achieved.
Only suitable for binary cell storage
It may reduce endurance (multiple writes)
IBM Drift Solution (short-term)
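The pulse-and-measure procedure described above is an iterative program-and-verify loop, which can be sketched as follows. The cell model is purely hypothetical (each pulse achieves only a fixed fraction of the requested resistance change), and `ToyCell`, `efficiency`, the tolerance and the pulse limit are assumptions made for illustration, not IBM's implementation.

```python
from dataclasses import dataclass

@dataclass
class ToyCell:
    """Hypothetical cell: a pulse achieves only part of the requested change."""
    resistance: float          # ohm
    efficiency: float = 0.6    # assumed fraction of the requested change achieved

    def apply_pulse(self, delta_ohm):
        self.resistance += self.efficiency * delta_ohm

    def read(self):
        return self.resistance

def program_and_verify(cell, target_ohm, tolerance_ohm, max_pulses=50):
    """Pulse, measure, repeat until within tolerance; return the pulse count."""
    pulses = 0
    while abs(target_ohm - cell.read()) > tolerance_ohm:
        if pulses == max_pulses:
            raise RuntimeError("cell did not reach the target level")
        # Pulse strength is based on the measured deviation from the target.
        cell.apply_pulse(target_ohm - cell.read())
        pulses += 1
    return pulses

cell = ToyCell(resistance=200e3)
used = program_and_verify(cell, target_ohm=7e3, tolerance_ohm=100.0)
print(f"reached {cell.read():.0f} ohm after {used} pulses")
```

Every iteration is an extra write to the same cell, which illustrates why the approach may reduce endurance.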
Assume cell independence in drift errors (?)
Data to be encoded not in the programmed state but in the relative order of the states in a small group of cells
Error in encoding scheme only seen when resistivity levels of states cross each other
Software-based error correction methodologies are then applied (slow)
Reduction in capacity: from 2 bits/cell to 1.57 bits/cell
Error Codes (mid-ware)
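The idea of storing data in the relative order of states can be illustrated with a toy permutation code. This is not the actual coding scheme behind the 1.57 bits/cell figure: the group of four cells, the nominal levels and the power-law drift model (stronger for higher resistances) are all assumptions made for the sketch.

```python
import itertools
import math

LEVELS_OHM = [7e3, 20e3, 60e3, 200e3]   # hypothetical nominal levels
PERMS = list(itertools.permutations(range(len(LEVELS_OHM))))

def encode(value):
    """Store `value` as the order in which the levels are assigned to the cells."""
    return [LEVELS_OHM[rank] for rank in PERMS[value]]

def drift(resistances, elapsed_s, t0_s=1.0, nu_max=0.1):
    """Assumed drift R * (t/t0)^nu, with nu growing with log-resistance."""
    lo, hi = math.log(LEVELS_OHM[0]), math.log(LEVELS_OHM[-1])
    drifted = []
    for r in resistances:
        nu = nu_max * (math.log(r) - lo) / (hi - lo)
        drifted.append(r * (elapsed_s / t0_s) ** nu)
    return drifted

def decode(resistances):
    """Recover the value from the ranking of the cells, not their absolute values."""
    order = sorted(range(len(resistances)), key=lambda i: resistances[i])
    ranks = [0] * len(resistances)
    for rank, cell in enumerate(order):
        ranks[cell] = rank
    return PERMS.index(tuple(ranks))

stored = encode(17)
print(decode(drift(stored, elapsed_s=1e6)))
print(f"{math.log2(len(PERMS)) / len(LEVELS_OHM):.3f} bits/cell in this toy code")
```

Decoding only fails when drifted levels cross each other, and the price is capacity: this toy group holds 24 values in four cells, fewer bits per cell than the 2 bits/cell of direct four-level storage.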
Octal base for MVL (noise, crosstalk) and/or single vs multiple storage elements
MVL implications on error detection/correction
Dynamic models of RRAM operation in HSPICE (as related to drift evaluation and mitigation)
At system-level, improve endurance by reducing maximum number of writes to a cell
System-level application modeling (for example “normally-off instantly-on” operation: combining SRAM with PCM)
On-going PCM Investigation
Emergence of new paradigms: resistive RAMs, non-volatile operation, multi-bit storage
Nearly all future memories will utilize new phenomena away from 6T configuration
TECHNOLOGY TIME SCALE:
• Hybrid implementations will be dominant in the next 5-10 years
• 4Q-2014/1Q-2015 as crucial time frame for PCM
Conclusion