MRAM DTCO and Compact Models - University of Minnesota
Transcript of MRAM DTCO and Compact Models - University of Minnesota
MRAM DTCO and Compact Models
Jeehwan Song*, Jian-Ping Wang, and Chris H. Kim
Dept. of ECE, University of Minnesota*Now with TSMC, San Jose, CA
This work was supported in part by C-SPIN, one of the six SRC STARnet Centers, through MARCO and DARPA and in part by the NSF/SRC E2CDA Program.
Spin Transfer Torque MRAM Basics
• Magnetic Tunnel Junction (MTJ) is the storage element of STT-MRAM• Thermal stability factor (Δ) determines the retention time while anisotropy field (HK) determines
the energetic preference of the magnetization vector• Key features: Low operating voltage, good CMOS compatibility, high speed, high density (<20F2),
zero static power, and high endurance. • Applications: non-volatile memory (eflash replacement in 22nm), cache (SRAM replacement)
2
EbP APThermal
fluctuation
current
WL
BL
MTJfree
fixedoxide
DTCO for STT-MRAM
• Initial set of compact models and design rules for a preliminary standard cell library and entire optimization loop is repeated to obtain a satisfactory set of the device and design rule parameters
• MTJ compact model is a critical component of the overall MRAM DTCO flow.3
Device/material parameters, design rules
MTJ, CMOS models
MRAM circuit design
(layout, read/write)
Monte Carlo, Evaluate PPA
Technology(CMOS/MTJ)
choice
Development
Yieldimprovement
Technology Release
...
EDA
Fab
IP
EDA
Fab
IP
EDA
Fab
IP
process node influence ↓
Outline
Introduction: DTCO and STT-MRAMMTJ SPICE ModelsSTT-MRAM Array Level EvaluationState-of-the-art STT-MRAM CircuitsConclusions
4
MTJ SPICE Model
• Landau-Lifshitz-Gilbert (LLG) equation solved using SPICE
• The MTJ SPICE models are available at mtj.umn.edu.
5
J. Kim, et. al., CICC, 2015.
J. Sun, Nature 2003 (IBM)
M
sF MeWLtPR
2
=
ySTTI ,yDMPI ,
MTJI
)( yMV
γα 21+
=C
)( K e fxHV
)( sttAV
yPRCI , )( 0yMV
C_My My 0 '(1+α2)/γ'G_dMy_prc 0 My cur='-(v(Mz)∙v(HKefx)-v(HKefz)∙v(Mx))'G_dMy_dmp 0 My cur='-α∙(v(Mz)∙(v(My0)∙v(HKefz)-v(HKefy)∙v(Mz))-(v(Mx)∙v(HKefy)-v(HKefx)∙v(My0))∙v(Mx))'G_dMy_stt 0 My cur='v(Astt)∙(v(Mz)∙(v(My0)∙Mpz-Mpy∙v(Mz))-(v(Mx)∙Mpy-Mpx∙v(My0))∙v(Mx))'E_My0 My0 0 vol='v(My)' max='cos(v(θc))' min='-cos(v(θc))'
)( K e fyHV )( K e fzHV
sFsttpsttKeffKeff
MetPJAMMMAHMMHM
dtMd
2),()(1 2
=××⋅+××⋅−×−=⋅+ αγα
Numerical form:
Circuit implementation (y-coordinate):
HSPICE script (y-coordinate):
Precession Damping Spin torque
HK
thresholding
MTJ SPICE Model Verification
• Model parameters can be tuned to match experimental data. • Variability effects of both the MTJ (W, L, tF, RA) and CMOS (W, L, Vth, Tox) can be incorporated
6
(a)
(b)
VSA
RAP RP
IRef
IAP IP
Even I/O Odd I/O
BL0
Ref. cells
. . .. .
.
. . .
VSA
. . . . . .
. . .
WDR
IRef
SL0 BL15 SL15 BL0SL0BL15SL15
COL0
WDR
CLAMP
READ
COL15
WL0
WL255
. . .
..
Readpath
Write path
J. Kim, et. al., CICC, 2015.
Simulated Write and Read Delay Distributions
• Includes realistic variation for both MTJ (i.e. W, L, tF, RA) and CMOS (i.e. transistor W, L, Vth, tox)
• Top: Write delay distribution is narrower for a higher write voltage due to faster switching.
• Bottom: TMR ratio (=(RAP−RP)/RP) impact on read sensing margin.
7
0
10
20
30
00.5
11.5
2
0
10
20
30
3.5 4.5 5.5 6.5 7.5
STT-MRAM 6σ write delay VDD (1.2V): 7.49ns VDD+0.1V: 6.49ns VDD+0.2V: 5.80ns VDD+0.3V: 5.29ns
4.5 7.5Write delay (ns)
5.5 6.5
30
10
0
20
Perc
entil
e (%
)
3.5
STT-MRAM 6σ sensing delay TMR 100%: 1.32ns TMR 200%: 0.82ns TMR 300%: 0.67ns
0.5 < 2.0Sensing delay (ns)
1.0 1.5
30
10
20
Perc
entil
e (%
)
0
Read failures
0
65nm CMOS, i-PMTJ (Δ=70), 85ºC
J. Kim, et. al., CICC 2015
Incorporating Stochastic Switching Behavior
• Thermal fluctuation causes switching time to vary each write operation
• Can be incorporated into the model using the initial angle parameter (e.g. Monte Carlo) and time-varying thermal field
8
I. Ahmed, et al., JxCDC 2017
J. Sun, Nature 2003 (IBM)
M
HK
0.0 0.5 1.0 1.5 2.0 2.5 3.010-10
10-07
10-04
10-01
1002
Initial angle, θini (radian)
ρ ini
tial (θ
)
𝜌𝜌𝑖𝑖𝑖𝑖𝑖𝑖𝑖𝑖𝑖𝑖𝑖𝑖𝑖𝑖 𝜃𝜃 =𝑒𝑒𝑒𝑒𝑒𝑒 −𝛥𝛥𝑠𝑠𝑠𝑠𝑠𝑠2 𝜃𝜃
∫0𝜋𝜋2 𝑠𝑠𝑠𝑠𝑠𝑠 𝜃𝜃 𝑒𝑒𝑒𝑒𝑒𝑒 −𝛥𝛥𝑠𝑠𝑠𝑠𝑠𝑠2 𝜃𝜃 𝑑𝑑𝜃𝜃
Δ = Thermal stability factor,θ = Initial angle between magnetization and Z-axis
Outline
Introduction: DTCO and STT-MRAMMTJ SPICE ModelsSTT-MRAM Array Level EvaluationState-of-the-art STT-MRAM CircuitsConclusions
9
“Dennard” Style Scaling Study for STT-MRAM Caches
• First order scaling analysis based on MTJ macromodels• Δ for a target retention time is set by adjusting free layer thickness and MTJ anisotropy. • STT-MRAM’s scalability based on a constant JC0•RA/VDD scaling scenario
10
Technology scaling
Δ∝*Hkctw2
Diameter: w → αwThickness: t → t/α2
Thermal stability (Δ)
Δ and JC0 adjustment with technology
scalingΔ → Δ
JC0 → JC0/α2
JC0•RA scaling RA → α2•βRAJC0•RA/VDD: constant
W
αW
t/α2
t
K. Chun, et al., JSSC 2013 *based on MTJ macromodels, before MTJ SPICE models became available
α: dimension scaling factorβ: voltage scaling factorJc0: Critical switching current densityRA: Resistance area product
Device/material parameters, design rules
MTJ, CMOS models
MRAM circuit design
(layout, read/write)
Monte Carlo, Evaluate PPA
“Dennard” Style Scaling Study for STT-MRAM Caches
• Left: Sub-array architecture of STT-MRAM cache (~3 times smaller than 6T-SRAM cache)• Right: simulation setup includes process variation in the memory cells and SA circuit as well
as variation of wire resistances, capacitances, reference biases, and supply levels 11
Column & BL-S/A
64kb STT-MRAM256cells/WL256cells/BL
WL
Dec
oder
Cha
rge
pum
p &
Dec
ap
WL
Dec
oder
C
TRL
Localized write driver at every 64 bits [13]
Dum
my cell averaging and
disturb-free reference [16]
349μ
m
189μm* Mismatches are based on inverse square root relationship of devices' areas.* Based on historic data, we assume σVt/F is constant with tech. scaling* µ(CBL) is assumed to be scaled proportional to scaling factor.
-10% to account for supply noisePower supply noise
Device mismatchesBit-cell
σ/µ=5%: each µ are calculated based on sub-array sizeParasitic capacitance (CBL)
Resistance area product σ/µ=5%
Sense Amplifier (S/A)I-applying and V-sensing method (AP direction read) +
Voltage S/A: IREF σ/µ=2.5%, S/A pair mismatches
Reference cell Reference cell averaging scheme with MTJ replica cells
Write threshold current σ/µ=5%
K. Chun, et al., JSSC 2013
Device/material parameters, design rules
MTJ, CMOS models
MRAM circuit design
(layout, read/write)
Monte Carlo, Evaluate PPA
6σ BL Sensing and Write Delay Trends
• Sensing delay decreases with scaling, and with a higher TMR ratio • Write delay becomes worse due to the lower drive current (in planar CMOS).• The trends follow basic circuit intuition (i.e. read and write always have a conflict)
12
6-σ
BL w
rite
dela
y (n
orm
aliz
ed)
0.1
1
10
100
65 45 32 22 15 11 8Technology node (nm)
6T SRAMSTT-MRAM @VDDSTT-MRAM @VDD+0.1vSTT-MRAM @VDD+0.2vSTT-MRAM @VDD+0.3v
RA≥1.0Ω•μm2
STT-MRAM: WTX=12F
Iso latency criterion
(b)0.1
1
10
100
65 45 32 22 15 11 8
6T SRAM TMR 100%TMR 150% TMR 200%TMR 250% TMR 300%
Technology node (nm)
6-σ
BL s
ensi
ng d
elay
(nor
mal
ized
) STT-MRAM: JRD/JC0=1.5 @6-sigma
Iso latency criterion
(a)
K. Chun, et al., JSSC, 2013.
Device/material parameters, design rules
MTJ, CMOS models
MRAM circuit design
(layout, read/write)
Monte Carlo, Evaluate PPA
Outline
Introduction: DTCO and STT-MRAMMTJ SPICE ModelsSTT-MRAM Array Level EvaluationState-of-the-art STT-MRAM CircuitsConclusions
13
Write Verify Scheme
• MRAM cell is repetitively written until correct value is verified significant write error rate reduction at the expense of high write energy and long write time
• More suitable for NVM applications than cache memory• Write driver has programmable write strength
14
(a)
(b)
Read(verify)
Expected data?
Done
WriteNo
Yes
time
Cur
rent
leve
l
. . .read
Write(1st trial)
read
Write(2nd trial)
read
Write(3rd trial)
Weak write
settings
Strong write
settings
SEL2
SEL1
SEL0
BL
SL
EN
EN=1, CTRL=1: SLBL current flowEN=1, CTRL=0: BLSL current flow
CTRL
L. Wei, et al., ISSCC 2019 (Intel 22nm) H. Noguchi, et al., ISSCC 2016 (Toshiba 65nm)
Array with Shared SL Voltage
• Common source line for compared area and reduced parasitic resistance‒ Improves read and write margins with a modest layout area overhead. ‒ Inhibit voltage is applied on the unselected BL‒ Negative voltage applied to unselected WLs to suppress the BL leakage current
15
CSL0
. . .
. . .
BL0 BL1
. . .
WL0
WL511
WL1
CSL3
. . .
. . .
BL6 BL7
. . .
Selected cell Unselected cell
Y.-D. Chih, et al., ISSCC 2020 (TSMC 22nm), C. Kim, et al., ISSCC 2015 (Samsung)
Readout Circuit• Clamp transistor trimming circuit
‒ Clamps BL voltage to prevent read disturbance‒ Trimming circuit to remove offset
• 1T4MTJ reference cell provides a stable (RP+RAP)/2 reference value without causing read disturbance
• Half-VDD detection circuit improves the sensing margin by extending the signal development time
16
RP
Clamptrimming
WL
PRE
VSA
PRE QQREF
... ...
Q QREF
RAP
RP
RAP
WLREF1T4MTJref. cell
Half VDD detection
BL BLref
Y.-D. Chih, et al., ISSCC 2020 (TSMC 22nm)
Conclusions• This invited paper covers various aspects of MRAM DTCO including
device, circuit, and architecture considerations. ‒ SPICE MTJ device model‒ Array level STT-MRAM PPA evaluation‒ Scalability and variability studies‒ Novel MRAM read/write circuit, array, and bit-cell layout design
techniques can improve the yield of the large arrays and must be accounted for in DTCO flow
17
This work was supported in part by C-SPIN, one of the six SRC STARnet Centers, through MARCO and DARPA and in part by the NSF/SRC E2CDA Program.