Improving Endurance in 3D-NAND Flash€¦ · Optimal B1+D1 D2 D3, reset to B1 after reprogram B1+D1...

12
Improving Endurance in 3D-NAND Flash Roman Pletka, Ioannis Koltsidas, Nikolas Ioannou, Sasa Tomic, Nikolaos Papandreou, Thomas Parnell, Haralampos Pozidis IBM Research Zurich Research Laboratory Aaron Fry, Tim Fisher IBM Systems Flash Memory Summit 2017 Santa Clara, CA 1

Transcript of Improving Endurance in 3D-NAND Flash€¦ · Optimal B1+D1 D2 D3, reset to B1 after reprogram B1+D1...

Improving Endurance in 3D-NAND Flash

Roman Pletka, Ioannis Koltsidas, Nikolas Ioannou, Sasa Tomic,

Nikolaos Papandreou, Thomas Parnell, Haralampos PozidisIBM Research – Zurich Research Laboratory

Aaron Fry, Tim FisherIBM Systems

Flash Memory Summit 2017

Santa Clara, CA 1

Agenda

▪ Background on Flash Management

▪ How can different Flash management components benefit from

each other ?

▪ 3D NAND characteristics

▪ What are the similarities and differences between 2D and 3D

NAND Flash ?

▪ Consequences on Flash Management

▪ Data placement and garbage collection strategies

▪ Threshold voltage shifting

▪ Wear leveling techniques

Flash Memory Summit 2017

Santa Clara, CA 2Disclaimer: Results in this presentation are not specific to a particular

product or a Flash memory vendor

Enterprise all-Flash arrays

Consistent sub-ms latency

5 – 7 years of endurance under

heavy writes

Enterprise reliability

Enterprise Flash Commitment

Flash Memory Summit 2017

Santa Clara, CA 3

Purpose-built hardware (FPGAs)

Hardware-only data path

Purpose-built ECC schemes

Advanced Signal Processing

State-of-the-art Garbage Collection and Data Placement

Fine-grained Health Management

How?

Flash Management

Flash Controller

Flash Management ComponentsDependencies and Synergies

Flash Memory Summit 2017

Santa Clara, CA 4

DataPlacement

Read Voltage

Shifting

Health Binning(Wear Leveling)

GarbageCollection

Separationof host andrelocation

writes

HeatSegregation

Classify blocks according to

their health

Reduce write amplification

Isolatewrites with different

update frequencies

Isolateuncorrelated writes

Introduce variations in block health

ECC

Accurate RBER

measure-

ments

BlockCalibration

Determine optimal

shift values

Data Placement Strategies

Significant write amplification reductions

can be achieved for skewed workloads

through adequate data placement

strategies:

• Separation of host and relocation writes:• Typical workloads show no correlation between

the update frequencies of host and relocation writes

• Heat Segregation:• Heat: Tracking update frequency at LBA

granularity

0%

10%

20%

30%

40%

50%

60%

70%

Host andRelocation Write

Separation

Hot-Cold HeatSegregation

BothRed

uct

ion

in W

rite

Am

plif

icat

ion

Random Zipfian80/20 Zipfian95/20

Endurance Characteristics

Flash Memory Summit 2017

Santa Clara, CA 6

▪ In 2D NAND Flash, significant differences in the RBER over time have been observed among blocks [1]

• Some blocks have almost twice the endurance of others

• Low RBER at early life does not indicate a good block, and an early high RBER not a weak one!

0

0.5

1

1.5

2

1x nm MLC 1y nm MLC 3D-NAND TLC

No

rma

lize

d E

nd

ura

nce

Average block endurance and standard deviation(normalized to 1x nm MLC Flash)

RBER of different

consumer-level

flash blocks in the

same device as a

function of P/E

cycles

[1] Health Binning: Maximizing the Perf ormance and Endurance of Consumer-lev el NAND Flash

R. Pletka, S. Tomic, SYSTOR 2016

▪ The same effects exist in 3D NAND:

• Comparison of the average block endurance using the same ECC capable of correcting in the order of 10-2 errors

From Traditional Wear Leveling to Health Binning

•Introduce data placement with stream segregation

•Use better blocks for hotter data

Dynamic WL

Balance P/E cycles across blocks upon overwrites and relocations.

Typically uses the least worn available block to place new data.

•Reduce Static WL to address retention and read disturb limitations of Flash

•Perform relocations instead of block swapping

Static WL

Identifies the least worn blocks holding static data in the

background. Still valid data is relocated to another block causing an increase in write amplification.

•Background grading of blocks based on RBER

•RBER estimation based on ECC feedback

P/E Cycle-based WL

Balances wear of blocks based on their program-erase cycle count

only.Endurance gains of up to 60%

with 3D TLC NAND !

Read Voltage Shifting

Read Voltage Shifting has been proposed in the past [2]:

• Dynamic Read Level Shifting requires special access modes to Flash.

• Extensive characterization is required to determine behavior of read levels

under different conditions.

• Read level shift values depend on:

• Number of P/E cycles of the block

• Number of reads a page has seen since programmed

• Retention time

• Individual block/page characteristics

• Block Calibration: Optimal read levels must be continuously updated in the

background which takes a non-negligible amount of time.

Benefits:

• Dynamic read level shifting significantly contributes to maximize flash

endurance: Gains of 3x in endurance achievable!

• Calibration in the background does not impact host reads and writes.

• Use special techniques to reduce meta-data overhead.

Flash Memory Summit 2017

Santa Clara, CA 8

[2] Using Adaptive Read Voltage Thresholds to Enhance the Reliability of MLC NAND Flash Memory Systems,

N. Papandreou, T. Parnell, H. Pozidis, T. Mittelholzer, E. Eleftheriou, C. Camp, T. Griff in, G. Tressler, A. Walls,

GLSVLSI 2014

Vth11 01 00 10

Erased

State

Programmed States

Thr1 Thr2 Thr3

Vth11 01 00 10

Thr1 Thr2 Thr3

Vth11 01 00 10

Thr1 Thr2 Thr3

t3

t2

t1

RB

ER

Predominantly

P/E Cy cling

Data Retention

Phase

t2 t3

Nominal

Optimal

1.0x 2.0xt1

Typical 2D NAND behavior:

Block Calibration Challenges

• Results from a characterization experiment showing the evolution

of the RBER at specific points in time using optimal shift values.

• RBER increases during cycling and further deteriorates during the

retention phase

• If read voltage levels are not adapted periodically as well as after

erasing and reprogramming the block, the RBER continues to increase!

Flash Memory Summit 2017

Santa Clara, CA 9

Vth11 01 00 10

Erased

State

Programmed States

Nominal

Thr1

Nominal

Thr2

Nominal

Thr3

Vth11 01 00 10

Optimal

Thr1

Optimal

Thr2

Optimal

Thr3

Vth11 01 00 10

Optimal

Thr1

Optimal

Thr2

Optimal

Thr3

Base

Shift

Delta

Shift

How can this be addressed ?

• Maintain separate shift values to track contributions to

the threshold voltage distribution from:

• permanent changes (e.g., P/E cycling) Base shift

• temporary changes w.r.t. the block erase count (e.g., retention, read disturb effects, …) Delta Shift

• Reset Delta shift values after a block erase

Nominal

Optimal

B1+D1D2 D3,

reset to B1 af ter reprogram

B1+D1D2 D3,

no reset af ter reprogram

Data Retention

Phase

RB

ER Predominantly

P/E Cy cling Phase

Data Retention

Phase

Re

loca

te +

Era

se

+ P

rog

ram

ECC limit

Time

Typical 3D NAND behavior:

Transient effects on the RBER

• Relative contributions to the RBER

from retention and read disturbs

• Observations:• 2D MLC Flash @ EOL behaves similarly

as 3D TLC @ beginning of life

• 3D TLC has much higher relative

contributions to the RBER from transient effects

• Consequences on Flash 3D Flash

Management:• Equalize block health with Health Binning

to keep RBER differences between blocks

due to permanent effects as small as

possible

• Additional data relocations are needed which may affect overall endurance

Flash Memory Summit 2017

Santa Clara, CA 10

Read Disturb

characteristics:

Retention

Characteristics:

2D MLC NAND: 3D TLC NAND:

Conclusion

• 2D vs. 3D NAND Flash characteristics

• Similar block variability observed in 3D NAND compared to 2D NAND 1x

and 1x nm generations

• Transient effects on the RBER dominate even early in life and require

careful Flash management in 3D NAND

• Efficient Flash management techniques to address these challenges have been outlined:

• Data placement with separation of host and relocation writes combined with

heat segregation

• Health Binning using ECC feedback to grade blocks for data placement

• Read voltage shifting with background block calibration

Flash Memory Summit 2017

Santa Clara, CA 11

Flash Memory Summit 2017

Santa Clara, CA 12

Questions ?

www.research.ibm.com/labs/zurich/cci/

Thanks !