
Power-Efficient Design of an Embedded Flash Memory Management System

Master thesis

JONAS BRUNLÖF

December 2009

Supervisors: Magnus Persson, KTH; Barbro Claesson, ENEA; Detlef Scholle, ENEA

Examiner: Martin Törngren, KTH


Master thesis MMK2009:99 MDA356

Power-Efficient Design of an Embedded Flash Memory Management System

Jonas Brunlöf

Approved: 2009-12-17
Examiner: Martin Törngren
Supervisor: Magnus Persson

Employer: ENEA AB
Contact person: Detlef Scholle

Abstract. This report is the result of a master thesis at ENEA AB during the fall of 2009. It aims to create a specification of a flash memory management system for embedded systems that focuses on power efficiency and low RAM usage, and to design and implement a prototype of such a system to facilitate further development toward the created specification. The system used by ENEA today is a Flash Translation Layer (FTL). It has a complex structure which prohibits modifications and customization; therefore a new flash memory management system needs to be developed.

The suggested solution uses a translation layer called Metadata FTL (MFTL), where file system metadata and user data are separated from each other in order to improve performance. The partition holding user data uses a block-level mapped translation layer called Fully Associative Sector Translation FTL. The other partition, holding metadata, instead uses a page-level mapped translation layer which also separates frequently modified data from rarely modified data. The separation of data with different update frequencies is performed by a page allocation scheme called Modification Aware (MODA).

The result of this report is a specification of the system described above and an implemented prototype which has all the basic features of an FTL. The implemented design can successfully be used instead of the old FTL with a few restrictions. It can handle normal file system commands and can manage reboots without loss of information. However, the main purpose of the implemented design is still to act as a prototype that facilitates further development toward the design explained in the specification.


Master thesis (Examensarbete) MMK2009:99 MDA356

Energy-efficient design of an embedded flash memory management system

Jonas Brunlöf

Approved: 2009-12-17
Examiner: Martin Törngren
Supervisor: Magnus Persson

Employer: ENEA AB
Contact person: Detlef Scholle

Summary. This report is the result of a master thesis at ENEA AB during the fall of 2009. The goal of the work is to create a specification of a flash memory management system that focuses on energy efficiency and low RAM usage for embedded systems, and to design and implement a prototype that can serve as a foundation for further development of the system toward the produced specification. The system used by ENEA today is a translation layer (FTL). It has a complex structure which prevents modifications and adaptations; therefore a new flash memory management system is to be developed.

The suggested solution uses a translation layer called Metadata FTL (MFTL), where metadata and user data are separated from each other to achieve better performance. The partition holding user data uses a block-level mapped translation layer called Fully Associative Sector Translation FTL, which is designed to minimize energy consumption by limiting costly write and erase operations on the flash memory while consuming little RAM. The other partition, containing metadata, instead uses a page-level mapped translation layer which also separates frequently modified data from rarely modified data in order to save even more operations. The separation of data with different update frequencies is performed by an allocation scheme called MODA.

The result of this report is a specification of the system described above and an implementation of a prototype that has all the basic functions of an FTL. The implemented design can successfully be used instead of the old FTL with some restrictions. It handles normal file system commands and can manage reboots without losing information. Still, the implemented design should first and foremost be seen as a prototype that can be used for further development of the system.

Contents

Contents
List of Figures
List of Tables
Abbreviations

1 Introduction
    1.1 Background
    1.2 Problem statement
        1.2.1 Flash memory management system
    1.3 Method
    1.4 Delimitations

2 Challenge: Merge energy efficiency and low RAM usage
    2.1 Power management
        2.1.1 Flash memory characteristics
        2.1.2 RAM usage in embedded systems
    2.2 Requirements

3 Flash memory management
    3.1 An introduction to flash memory
        3.1.1 Functionality of a flash memory
        3.1.2 Flash memory types
        3.1.3 Wear leveling
        3.1.4 Garbage collection
        3.1.5 Requirements
    3.2 Overview of flash memory management systems
    3.3 Current setup
        3.3.1 JEFF - Journaling Extensible File system Format
        3.3.2 Current FTL
    3.4 Flash file systems
        3.4.1 JFFS - Journaling Flash File System
        3.4.2 YAFFS - Yet Another Flash File System
        3.4.3 CFFS - Core Flash File System
        3.4.4 MODA - MODification Aware
        3.4.5 Summary and discussion
    3.5 Flash Translation Layers
        3.5.1 BAST FTL - Block-Associative Sector Translation FTL
        3.5.2 AFTL - Adaptive FTL
        3.5.3 FAST FTL - Fully-Associative Sector Translation FTL
        3.5.4 FTL/FC - FTL/Fast Cleaning
        3.5.5 MFTL - Metadata FTL
        3.5.6 Summary and discussion
    3.6 Other flash memory related functionalities
        3.6.1 Bad block management
        3.6.2 ECC - Error correction code
        3.6.3 Cleaning policies
        3.6.4 Buffering
    3.7 Discussion
        3.7.1 Optimal flash management system
        3.7.2 Related questions and thoughts
        3.7.3 Requirements

4 OSE 5.4 and the Soft Kernel Board Support Package (SFK-BSP)
    4.1 OSE 5.4
        4.1.1 Processes
        4.1.2 Signals
        4.1.3 Flash Access Manager
    4.2 Soft Kernel Board Support Package (SFK-BSP)
        4.2.1 Modules in the soft kernel

5 Design and implementation
    5.1 Translation layer design
        5.1.1 Initiation of translation layer
        5.1.2 Storage on flash volume
        5.1.3 Translation layer list
        5.1.4 Translation layer metadata
        5.1.5 Scan function
        5.1.6 Garbage collection
    5.2 Implementation in OSE
        5.2.1 Supported signals
        5.2.2 Supported I/O commands
        5.2.3 Mount recommendations

6 Test suite
    6.1 Introduction to test
    6.2 Test case 1
        6.2.1 Results from test case 1
    6.3 Test case 2
        6.3.1 Results from test case 2
    6.4 Test case 3
        6.4.1 Results from test case 3
    6.5 Test case 4
        6.5.1 Results from test case 4
    6.6 Test case 5
        6.6.1 Results from test case 5
    6.7 Summary of test results

7 Discussion
    7.1 Problem statement
        7.1.1 Flash memory management design
        7.1.2 QoS and power awareness
        7.1.3 File system and FTL interface
        7.1.4 Flash memory management implementation
        7.1.5 Evaluation of implementation
    7.2 Conclusions
        7.2.1 Evaluation of requirements
        7.2.2 Summary of requirement evaluation

8 Future work
    8.1 Build according to specification
    8.2 Dynamic support for different flash sizes
    8.3 Flash translation layer metadata
    8.4 Test system on hardware

Bibliography

A Complete test cases
    A.1 Test case 1
    A.2 Test case 2
    A.3 Test case 3
    A.4 Test case 4a
    A.5 Test case 4b
    A.6 Test case 5

B Requirements

List of Figures

3.1 NAND flash architecture
3.2 Flash memory management system architecture
3.3 JFFS garbage collection
3.4 The MODA scheme classification
3.5 Page- and block-address translation
3.6 Merge and switch operations
3.7 DAC regions translation

4.1 Overview of process states
4.2 Structure of signal

5.1 Object in linked list
5.2 The structure of a metadata chunk
5.3 Structure of delete-metadata chunk
5.4 Overview of implementation in OSE

6.1 Example of a result printout
6.2 Result from test case 1:a
6.3 Result from test case 1:b
6.4 Result from test case 1:c
6.5 Result from test case 2:a
6.6 Result from test case 2:b
6.7 Result from test case 3:a
6.8 Result from test case 3:b
6.9 Result from test case 3:c
6.10 Result from test case 3:d
6.11 Result from test case 3:e
6.12 Result from test case 3:f
6.13 Result from test case 3:g
6.14 Result from test case 4:a
6.15 Result from test case 4:b
6.16 Result from test case 4:c
6.17 Result from test case 4:d
6.18 Result from test case 5:a
6.19 Result from test case 5:b

A.1 Complete result from test case 1
A.2 Complete result from test case 2
A.3 Complete result from test case 3
A.4 Complete result from test case 4a
A.5 Complete result from test case 4b
A.6 Complete result from test case 5

List of Tables

2.1 NAND flash characteristics

3.1 Overview of flash file system properties
3.2 Optimal flash file system
3.3 Evaluation summary of discussed translation layers

7.1 Evaluation of the requirements

B.1 Descriptions of the requirements

Abbreviations

JEFF    Journaling Extensible File system Format
JFFS    Journaling Flash File System
YAFFS   Yet Another Flash File System
CFFS    Core Flash File System
MODA    MODification-Aware
FTL     Flash Translation Layer
AFTL    Adaptive Flash Translation Layer
BAST    Block-Associative Sector Translation
FAST    Fully-Associative Sector Translation
FTL/FC  FTL/Fast Cleaning
MDT     Memory Technology Device
DAC     Dynamic dAta Clustering
XIP     eXecute In Place
EEPROM  Electrically Erasable Programmable Read-Only Memory
LRU     Least Recently Used
FAM     Flash Access Manager
SFK-BSP SoFt Kernel Board Support Package

Chapter 1

Introduction

1.1 Background

The master thesis was done at ENEA AB in collaboration with the Royal Institute of Technology, KTH. It is part of GEODES, a project with the aim to provide power awareness and innovative management capabilities for operating systems, protocols and applications, and also to apply the notion of quality of service (QoS) [10]. Power awareness can be considered a concept within QoS, and to implement it in QoS, feedback from the affected devices is required. This master thesis will focus on creating a specification of an optimized flash memory management system where QoS can be implemented on a flash medium, and on designing and implementing a prototype of such a system.

1.2 Problem statement

In an embedded system it is important to minimize power consumption at every level of the design. However, this needs to be achieved without reducing the performance of the system in an unsatisfactory manner or disrupting any crash safety features. Minimizing power consumption while still maintaining performance is an important part of QoS, and this can be helped by making modules power-aware.

1.2.1 Flash memory management system

A flash memory management system involves many functions besides the obvious read and write. Due to the construction and properties of flash media, functions such as erase, garbage collection, error correction and wear leveling are also needed. ENEA's own operating system OSE¹ uses the JEFF² file system and a flash translation layer (FTL) to handle flash media. This complete flash management system needs to be optimized, and previous studies at ENEA show that the FTL

¹ Operating System Embedded
² Journaling Extensible File system Format


is the bottleneck. The FTL is a complete flash media manager involving all the functions stated above. However, it is a general-purpose system, and unnecessary features are included in the software.

Flash memory management system design

Instead of analyzing and modifying the old FTL, a new flash media manager is requested. Hence, other, preferably open source, solutions for flash memory management require evaluation. What flash management systems are available, and is there a solution fitting the requirements? As a result, a specification of how an optimized and power-aware flash memory management system should be constructed will be acquired. How can power awareness and QoS be implemented in the flash memory management system?

If an interface between a novel file system and the flash memory management system is needed, optimization could be required. How can a potential interface between a file system and an optimized flash memory management system be improved?

Flash memory management system implementation

A prototype of a flash memory management system is to be implemented in OSE, but due to time constraints it is not necessary that the implementation and the optimized design are identical. What will a suitable implementation design look like, such that it can be constructed within the limited time frame?

When the design is implemented, verification and validation need to be performed to confirm that the system can manage the basic requirements of a flash memory management system. Can the implemented design manage the basic requirements of a flash memory management system?

1.3 Method

The master thesis starts with an academic study involving research papers, manuals, technical reports and books in the area of flash memory management systems. It also involves an evaluation of the different flash media management systems found during the study. It ends in a design specification describing the requirements obtained during the academic study.

The master thesis also includes an implementation, which is supposed to illustrate the ideas suggested in the design phase. A demonstration of the implementation and a report covering all the stages of this master thesis are also required.


1.4 Delimitations

The time limit for this master thesis is 20 weeks. The academic study and the evaluation of the results need to be completed within the first ten weeks. The design and implementation phases need to be completed within the remaining ten weeks, together with the report and the presentation.

The foundation for the specification is only the data gathered during the aca-demic study.

The implementation needs to result in a prototype with the basic features of a flash memory management system, but it does not have to provide all the features covered by the specification.

During development, the OSE 5.4 operating system will be used together with a soft kernel from ENEA.

Chapter 2

Challenge: Merge energy efficiency andlow RAM usage

In an embedded system, resources are limited and it is important to be restrictive when using them. The limited resources are even more important to consider during the design of a system. For example, a goal to minimize power consumption usually comes at the expense of something else, e.g. RAM usage. Therefore, a system needs to find a good compromise between the declared criteria in order to be considered optimized.

2.1 Power management

Power management involves monitoring energy consumption and changing it to meet performance and energy demands. Power management is usually split into two separate areas: static and dynamic power management.

Static methods of power management are predictions of how a system will function, followed by design modifications to find the best compromise between performance and energy consumption. However, static power management is not always a sufficient solution. In a system with a very dynamic workload, the static power management method sometimes needs to be adjusted to the worst-case scenario, and is therefore only optimal when the system is fully utilized.

Dynamic methods analyze the system during run-time and adjust the performance to the present demand for resources. A branch of QoS focusing solely on the energy aspect and on dynamic power management was developed by Pillai et al. [17] and is called Energy-aware QoS (EQoS). The EQoS method includes making the best use of limited stored energy and varying the service level of each process to ensure that system runtime goals are met.

First of all, it is preferable that a system saves as much energy as possible without affecting performance; this applies to both static and dynamic power management, and to both hardware and software implementations. Dynamic voltage scaling (DVS) of the CPU or GPU is, for example, a well-known hardware approach to dynamic power management which can save energy without affecting performance [17]. In software, effective algorithms can be optimized to minimize energy consumption.

At a certain level it is no longer possible to save energy without affecting the performance of the system. The question then is which is most important: performance or energy consumption. Sometimes it is necessary to degrade performance because the energy source is running low. However, the opposite is just as plausible: if a very important task needs to be executed, maximum performance can be requested by the system.

2.1.1 Flash memory characteristics

Because of more energy-efficient processors, CPUs are no longer alone at the top of the list of energy consumers in embedded systems. Storage devices are becoming bigger and faster, and thus require more energy. Flash memories in particular have undergone a huge expansion in the last decade. Power-aware flash memories with dynamic power management properties are therefore requested.

Flash memories have, however, special characteristics. Read, write and erase operations have different latencies and require different amounts of energy [14].

Table 2.1 shows the latency and energy consumption of a typical NAND flash memory. It shows that write operations cost far more than read operations, and erase operations cost even more in both latency and energy. This affects how power management on flash memories is carried out and prioritized. Little energy can be saved by minimizing the number of read operations, but reducing write and erase operations has a great impact on energy consumption.

It is important to note that minimizing the number of read, write and ultimately erase operations in this report does not mean that the number of read and write requests to the flash volume is to be reduced. The decrease instead targets the internal read, write and erase operations caused by copying overhead during flash memory management operations; a more extensive explanation can be found in Chapter 3.

Operation     Latency   Energy consumption
Page read     47.2 µs   679 nJ
Page write    533 µs    7.66 µJ
Block erase   3 ms      43.2 µJ

Table 2.1. NAND flash characteristics [14]
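The cost asymmetry in Table 2.1 can be made concrete with a short calculation. The sketch below uses the table's per-operation figures directly; the workload mix passed to the function is invented purely for illustration.

```python
# Per-operation energy for a typical NAND flash, taken from Table 2.1.
PAGE_READ_NJ = 679       # 679 nJ per page read
PAGE_WRITE_NJ = 7_660    # 7.66 uJ per page write
BLOCK_ERASE_NJ = 43_200  # 43.2 uJ per block erase

def workload_energy_uj(reads, writes, erases):
    """Total energy in microjoules for a mix of flash operations."""
    total_nj = (reads * PAGE_READ_NJ
                + writes * PAGE_WRITE_NJ
                + erases * BLOCK_ERASE_NJ)
    return total_nj / 1000.0

# A thousand reads versus a thousand writes: roughly an 11x difference.
print(workload_energy_uj(reads=1000, writes=0, erases=0))  # 679.0 (uJ)
print(workload_energy_uj(reads=0, writes=1000, erases=0))  # 7660.0 (uJ)
```

With these numbers, one page write costs about as much energy as eleven page reads, and one block erase about as much as 64 reads, which is why reducing write and erase operations dominates the savings.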


2.1.2 RAM usage in embedded systems

The amount of RAM used as primary storage in devices has been increasing steadily since the day it was introduced and will probably continue to increase, but there is still a need to be restrictive with RAM usage. In embedded systems, low RAM usage is especially important due to low-cost demands and space restrictions. The flash memory management system will therefore need to consider the amount of RAM used.

2.2 Requirements

The following requirements for the flash memory management system can be derived from this chapter.

REQ1: The design of the system shall be power-efficient.

REQ2: The system shall strive to be power-aware.

REQ3: The number of erase and write operations shall be minimized.

REQ4: The system shall use as little RAM as possible.

Power efficiency has the highest priority of the four requirements stated above. However, the other requirements still need consideration, to make sure that none of them grows out of proportion.

Chapter 3

Flash memory management

The first of two main goals of the master thesis is to form a specification of how an optimized flash memory management scheme should be constructed. This chapter includes a brief introduction to flash memory for readers with no prior knowledge of the area, followed by an explanation of the different flash memory management schemes available, and ends with a discussion of the presented schemes and a design description of the best solution.

3.1 An introduction to flash memory

The flash memory industry has exploded over the last decade. Flash memory is now one of the top choices for storage media in embedded systems. The technology continues to improve and expand into new areas. The capacity nearly doubles every year, and solid state disks (SSDs) are becoming a serious competitor to regular hard drives [20].

3.1.1 Functionality of a flash memory

A flash memory is a non-volatile memory and a specific group within the EEPROM family. Non-volatile memory has the advantage over volatile memory, such as DRAM and SRAM, of being able to hold stored data without a supply of power. Because of its low power consumption, shock resistance and small size, flash memory also has many advantages compared to other non-volatile storage such as regular hard drives. Flash memory can access data randomly, unlike hard drives, which suffer from seek time because of their sequential data access properties [19, 4, 16].

Handling a flash drive requires three main operations, read, write and erase, in contrast to hard drives, which only require read and write. The erase operation is needed because overwrites are not possible on flash. Instead of overwriting, updated data is written to a new location in the flash memory and the old data is marked as invalid, or dead. This is also known as out-of-place updates. Over time the amount of invalid data in memory increases and will, if not handled correctly, fill up the whole flash memory. To reclaim an area occupied by invalid data, the erase operation is used. Once a memory area has been erased, it is free to be used again by the system [4].
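The out-of-place update scheme described above can be sketched as a small logical-to-physical page mapping. This is an illustrative model, not an actual FTL implementation; the class and field names are hypothetical.

```python
# Minimal sketch of out-of-place updates: each write of a logical page goes
# to a fresh physical page, and the previous copy is marked dead.
FREE, VALID, DEAD = "free", "valid", "dead"

class FlashTranslation:
    def __init__(self, num_pages):
        self.state = [FREE] * num_pages  # physical page states
        self.mapping = {}                # logical page -> physical page
        self.next_free = 0               # next free physical page

    def write(self, logical_page):
        if logical_page in self.mapping:         # update: invalidate old copy
            self.state[self.mapping[logical_page]] = DEAD
        phys = self.next_free                    # take the next free page
        self.state[phys] = VALID
        self.mapping[logical_page] = phys
        self.next_free += 1

ftl = FlashTranslation(num_pages=8)
ftl.write(0)
ftl.write(0)                  # out-of-place update: old page becomes dead
print(ftl.state[:3])          # ['dead', 'valid', 'free']
```

The dead pages accumulate exactly as the text describes, which is what eventually forces garbage collection.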

Flash memory also suffers from another disadvantage: a flash cell can only sustain a limited number of erase cycles before it becomes unreliable or faulty. This hardware problem can be addressed in software. The solution is to use the whole memory area as evenly as possible and make sure that the different sections of the flash memory are erased approximately the same number of times. This avoids a situation where one flash area is worn out before all the others. The whole process of evening out the load on the flash memory is called wear leveling [19].

3.1.2 Flash memory types

There are two main types of flash memory: NAND and NOR flash. Both are built with floating-gate transistors, but the two types differ in the layout of these transistors. A brief explanation of both follows, but the rest of the report focuses on NAND flash, because it is the most widely used in embedded systems today [14, 4]. Some of the content can, however, be applied to both.

NOR flash

The NOR flash memory gets its name from its resemblance to a logical NOR gate. The floating gates are connected in parallel, just as in a NOR gate, which makes it possible to access them individually. This gives NOR flash fast random access speed, but it is more costly and less dense in its architecture compared to NAND. The possibility to access each cell individually makes NOR flash ideal for eXecution In Place (XIP), where programs can be executed directly from the flash without first being copied to RAM.

Devices today usually have a small NOR flash to boot from, because of its short read latency and its XIP abilities, but use a NAND flash memory for storing data [4].

NAND flash in general

NAND flash has the advantages of being smaller and cheaper to manufacture than NOR; it also has faster write and erase cycles. The drawback of NAND flash is that it can only be accessed in units of a page, where a page typically contains 512 bytes. This makes NAND flash unsuitable for XIP and better suited for storage [4].

Figure 3.1 shows the architecture of a NAND flash memory. The memory is arranged in blocks with a typical size of 16 KB, where each block consists of 32 pages. The read and write operations act on a page basis, but an erase can only be done on a whole block [4].


This creates another difficulty when working with NAND flash. If the block that is to be erased still contains pages with valid data, these have to be copied to another block before the erase can take place. The copying of valid data followed by the erase procedure is called garbage collection.

Also seen in Figure 3.1 is the layout of a page and its dedicated spare area. The spare area is reserved for flash management metadata such as the logical block address and erase count, although it is up to the designer of the flash management system to use it as they see fit. For pages with a size of 512 bytes, the spare area is 16 bytes. The page and the spare area can be written and read independently, but as they exist in the same block they are erased together [14].
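The geometry above (512-byte pages, 16-byte spare areas, 32 pages per 16 KB block) can be captured in a few constants. The spare-area fields shown are only an example of what a designer might store there, since the text notes that its use is designer-defined.

```python
# Typical small-page NAND geometry from the text.
PAGE_SIZE = 512        # bytes of user data per page
SPARE_SIZE = 16        # bytes of spare area per page
PAGES_PER_BLOCK = 32
BLOCK_SIZE = PAGE_SIZE * PAGES_PER_BLOCK  # 16 KB of user data per block

# Hypothetical spare-area metadata layout; the actual contents are up to
# the designer of the flash management system.
spare_area_example = {
    "logical_block_address": 0,  # which logical block this page belongs to
    "erase_count": 0,            # counter used for wear leveling
    "valid": True,               # cleared on an out-of-place update
}

print(BLOCK_SIZE)  # 16384
```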

[Figure: flash blocks in memory, each block holding pages 1–32; every page consists of a user data area and a spare data area]

Figure 3.1. NAND flash architecture

NAND flash development

NAND flash is the type of flash memory which has been subject to the most research and development in recent years. It is also the type most used in embedded systems. A few years ago a new type of NAND flash memory was developed which changed the conditions for flash management systems completely. In older NAND flash memories it was possible to write to a page two or three times before an erase needed to take place. It was also possible to write to arbitrary pages within a block [15].

With the new NAND flash hardware only one write is allowed before an erase needs to take place, and arbitrary writes are no longer possible. Instead, pages in a block need to be written sequentially from the first page to the last. Around the same time as the release of the new NAND flash hardware, a bigger page size was also introduced [15]. 2048 bytes per page became the regular size for the new NAND flash, while older devices usually have a page size of 512 bytes, although both sizes can appear in either NAND flash hardware version.

3.1.3 Wear leveling

Because flash blocks can only sustain a limited number of erase cycles, typically 10^5 for NOR and 10^6 for NAND [20], the problem of potentially worn-out blocks needs to be addressed. This is where wear leveling comes in. Wear leveling is the technique used to distribute block erases evenly across the whole flash volume [4].

First of all the flash memory utilization needs to be even, but that is not enough. Some data might be static and never updated or deleted. The blocks containing this type of information will always hold valid data and will therefore never be erased. The wear leveling mechanism could instead move the static data to a block which has had many erase cycles in order to even out the wear. The wear leveling mechanism can be included in the software of the flash memory management system, and there are many different ways to deal with the problem [20].

3.1.4 Garbage collection

Garbage collection is the process of reclaiming invalidated memory. When data is modified and updated, the file system will allocate new pages for the updated file and the old pages will be invalidated, or marked as dead. Without garbage collection the flash memory would fill up with invalid data and no more free space would be available.

The garbage collector reclaims the invalidated pages and makes them free again by erasing blocks containing dead pages. After an erase the block is free to be used for new data. However, it is not certain that all pages in a block are invalid when an erase needs to take place. The valid data then first needs to be copied to another block with free space before the erase can be executed [19].

Garbage collection can be triggered either when the amount of free space has reached a certain threshold value, or it can run in the background when the system is idle. It is important to note that garbage collection needs to be requested before the flash memory is fully utilized. A full memory has no free blocks for copies of potentially valid data, which means that the system would deadlock itself. Therefore, a few free blocks are always left reserved for garbage collection purposes.

The effectiveness of a garbage collector depends very much on how data is allocated in the flash memory. Different methods of data allocation are discussed further in Sections 3.4 and 3.5.

3.1.5 Requirements

Because NAND flash memory is mostly used in embedded systems today, the design of the flash memory management system shall focus on NAND flash. It is also important that the flash memory management system can handle the requirements that the new type of flash memory introduces. These requirements are therefore introduced here.

REQ5: The design of the system shall focus on the NAND flash memory type.

REQ6: The system shall write pages in a block sequentially, starting from the first page.

REQ7: A page shall only be written once before it is erased.

3.2 Overview of flash memory management systems

Basically, there are two different ways to access a flash memory, see Figure 3.2. The first is to use a traditional file system like ext3 or FAT on top of a flash translation layer. The translation layer maps the logical addresses used by the file system to the actual physical addresses on the flash. Besides supplying the translation from logical to physical addresses, the flash translation layer also provides garbage collection and wear leveling. The translation layer can also emulate the flash memory as a block device, so that traditional file systems such as FAT and ext3 can work against flash just as if it were a normal hard drive [4].

The other option is to have a file system specifically developed for flash; two such examples are JFFS and YAFFS. With a flash-dedicated file system there is no need for a logical-to-physical address translation table; instead the file system keeps track of the locations of the pages belonging to each file. Garbage collection and wear leveling are in this case part of the flash file system [4].

To be able to control a flash memory, a Memory Technology Device (MTD) driver is needed. This layer supports the primitive flash memory operations, such as read, write and erase [20]. The MTD is located on top of the flash memory and interfaces with either the flash translation layer or the flash file system.

Figure 3.2. Flash memory management system architecture (a Virtual File System on top; beneath it either a traditional file system such as ext3 or FAT over a Flash Translation Layer (FTL), or a flash file system such as JFFS or YAFFS; then the Memory Technology Device (MTD) drivers and, at the bottom, the flash memory)


3.3 Current setup

As of today, ENEA uses the JEFF file system on top of a flash translation layer when working with flash memory.

3.3.1 JEFF - Journaling Extensible File system Format

JEFF is developed by ENEA and is the file system currently used in ENEA's operating system OSE. JEFF is a journaling file system, i.e. changes to the file system are first written to a journal before changes are made to the actual data. This is done to ensure the consistency of the file system in the event of a crash. If a crash occurs during a transaction and data is corrupted, the system can restore itself to the previous state by reading the journal. This means that transactions are either performed completely or not at all, i.e. transactions are atomic in JEFF [7].

JEFF is designed to run on block devices but has the ability to adjust the block size to suit the layer it is operating above; common block sizes are 512, 1024 and 2048 bytes.

3.3.2 Current FTL

The current FTL emulates a block device and allows JEFF to access it as if it were a hard drive. The FTL includes all the required features of a flash management system, such as garbage collection and wear leveling [7]. However, it is a general-purpose system and unnecessary features are included in the software.

3.4 Flash file systems

This section takes a closer look at a few available flash file systems and tries to pinpoint the important elements in them. It seems reasonable to start from the very beginning.

3.4.1 JFFS - Journaling Flash File System

JFFS was the first file system designed specifically for flash media. JFFS was designed for NOR flash and is a log-structured file system, which means that it writes data sequentially on the flash chip [18]. It was developed in 1999 by the Swedish company Axis Communications and can be seen as the predecessor of all dedicated flash file systems of today.

The basic idea of JFFS is that it uses a circular log. It starts to write data at the head of the log, which at the first entry will be at the beginning of the flash area. JFFS then continues to write data sequentially at the tail of the log and invalidates old data on the way. This works fine until the flash runs out of free space and the tail of the log is about to reach the end of the flash.

Figure 3.3 shows what happens when free space in JFFS is running low and the garbage collector starts to act. When the amount of free space reaches a certain level, Figure 3.3(a), JFFS garbage collection is initiated. It simply starts at the head of the log and checks whether the data is valid or invalid. Valid data is copied to the tail and invalid data is ignored, Figure 3.3(b). This process continues until all valid data in an erase block has been copied to the tail. The erase block is then erased and is clean and available for new data, Figure 3.3(c).

Figure 3.3. JFFS garbage collection ((a) memory nearly full; (b) copying valid data; (c) after garbage collection; pages marked valid, invalid or free)

The problem with this method is that the garbage collector works sequentially and will clean blocks even if they contain only valid data. This method does provide perfect wear leveling, because every block is cleaned exactly the same number of times, but on the other hand blocks are cleaned even when cleaning is unnecessary.

At mount time, the whole flash memory is scanned and the locations of all nodes1 in the memory are stored in RAM. File reads can then be performed immediately, without extra computation, by looking at the data structures held in RAM and then reading the corresponding location on the medium. The drawback of this method is that it is very memory consuming.

The problem with garbage collection, together with some other issues, made it obvious that JFFS needed to be improved. The second version of the flash file system is called JFFS2. It was developed by David Woodhouse and is in many ways very similar to JFFS. Two of the differences are that JFFS2 has limited support for NAND flash alongside the NOR support, and an improved garbage collector.

JFFS2 separates blocks into three different lists. The clean list holds blocks with only valid data, the dirty list contains blocks with at least one obsolete page, and the free list contains erased blocks. Instead of selecting all blocks sequentially as in JFFS, the garbage collector in JFFS2 takes a block from the dirty list when garbage collection is requested. This method leads to a wear leveling problem. Blocks containing static data are never updated and will never be erased, whilst other blocks will be erased constantly. This problem is addressed by letting the garbage collector select a block from the clean list once every hundredth time.

1 Building blocks of the file system; both metadata and data are stored in nodes

JFFS2 still scans the whole flash memory at mount to index all the valid nodes. Thus the RAM footprint and mount time increase linearly with the amount of data stored on the flash memory [16]. JFFS2 was first designed for small flash devices, and this issue becomes obvious with flash sizes over 128 Mbytes [3].

Both JFFS and JFFS2 are released under the GNU General Public License (GPL).

3.4.2 YAFFS - Yet Another Flash File System

YAFFS was written by Aleph One specifically for NAND flash. It was developed because it was concluded that JFFS and its successor were not suitable for NAND devices [9].

YAFFS was the first flash file system to fully utilize the spare area in each page. Just like JFFS it is a log-structured file system. At the time of its development, the flash devices available supported writes to arbitrary pages within a block. It was also possible to write to a page two or three times before an erase was needed. This was used when a file had been updated and old pages needed to be invalidated: a bit in the spare area of the affected pages was rewritten and set to 0 to show that they were invalid.

YAFFS uses a tree structure which provides the mechanism to find all the pages belonging to a particular file [15]. The tree holds nodes containing 2-byte pointers to physical addresses. The 2-byte data is, however, not enough to individually map each page on a larger flash memory. For that reason, YAFFS uses approximate pointers which instead point to a group of pages. The pages themselves are self-describing, which makes it possible to search each page in the group individually to find the right one. In this way the RAM footprint is smaller compared to JFFS2 [9].

At mount, only the spare area of each page, containing the file ID and page number, needs to be scanned. The result is a faster mount compared to JFFS2, but the mount time will still increase linearly with the size of the flash memory [16].

No particular wear leveling function is used in YAFFS. When blocks are allocated for storage they are chosen sequentially, so no block will repeatedly be left unused. On the other hand, no consideration is given to blocks which hold static data [1]. The author argues that wear leveling is not as important for NAND flash because the system needs to handle bad blocks2 anyway. So even if uneven wear leads to a few more bad blocks, the file system will still continue to work properly [9].

Garbage collection in YAFFS has two modes: passive and aggressive. Passive garbage collection only cleans blocks in which a large majority of the data is invalidated, and is active when there are plenty of free blocks available. Aggressive garbage collection is activated when the amount of free space starts to run out. It will then clean more blocks, even if there are many valid pages in them [15].

2 Bad blocks are described in Subsection 3.6.1

A few years after YAFFS was introduced, the flash memory hardware changed; it was no longer possible to write to a page more than once, and writes to pages within a block had to be made sequentially. This meant that the method of invalidating pages was no longer allowed, which is why YAFFS2 was developed. Instead of invalidating pages, a sequence number was added in the spare area to make it possible to determine which page is still valid when a file has been updated. Each time a new block is allocated the sequence number is incremented, and each page in that block is marked with that number. The sequence shows the chronological order of events, thus making it possible to restore the file system [15].

Just as with JFFS, both YAFFS and YAFFS2 are released under the GPL and have their source code available on the internet. YAFFS has been a very popular file system among computer scientists and many new file systems are based on it. One of them is the Core Flash File System (CFFS), explained in the next subsection.

3.4.3 CFFS - Core Flash File System

CFFS is based on YAFFS and the fundamental structure is the same, but some improvements have been made.

The blocks in CFFS have three classifications: inode-stored blocks, data blocks and free blocks. The inode-stored blocks contain the locations of all data in the memory. This means that only the inode-stored blocks need to be scanned at mount time. To be able to locate the inode-stored blocks at mount, their locations are written to an Inodemapblock at unmount, and the Inodemapblock is always the first physical block in the flash memory. The method of saving a snapshot of the data structure on the flash memory at unmount is usually called checkpointing [16].

The separation of inode-stored blocks and data blocks in CFFS has another advantage besides the faster mount; it also improves the effectiveness of the garbage collector. Metadata is updated more often than regular data, e.g., renaming, moving and changing attributes of a file will only change the metadata and not the regular data. By separating metadata and data into different blocks, the probability that all the pages in a block will be invalidated around the same time increases. This decreases the copying overhead in the garbage collector and in that way saves both energy and time. The separation of data according to update frequency is called hot-cold separation or data clustering [16].

The separation between metadata blocks and data blocks, however, creates a wear leveling issue. Because metadata is updated more often, the metadata blocks will be erased more frequently than the data blocks. This is solved by using a weight value in each block: if the block was an inode-stored block last time, it will be allocated as a data block the next time it is erased, thus solving that wear leveling problem.


3.4.4 MODA - MODification Aware

The MODA scheme is not a complete file system; it is only a modification of the page allocation scheme in YAFFS. The MODA page allocator is a further development of the one used in CFFS. It separates not only metadata and userdata, it also distinguishes between different update frequencies of userdata [2].

The MODA scheme uses a queue to classify how often a file is modified. The file stays in the queue for a specific amount of time, and its classification depends on how many times the file is modified during this period. Figure 3.4 shows an overview of the separation.

When a page is allocated to a specific area it stays there during its lifetime, even if its update frequency changes. The garbage collector used in MODA operates in each area independently to avoid mixing pages between blocks with different classifications [2].

Figure 3.4. The MODA scheme classification (Level 0: data; Level 1: metadata vs. user data; Level 2: user data divided into hot-modified, cold-modified and unclassified)

3.4.5 Summary and discussion

This subsection serves as a summary of this section and also includes a discussion of the six flash management schemes presented above. Table 3.1 shows an overview of the most significant differences between the presented flash management schemes.

As part of the discussion, this subsection will also reference the requirements established earlier in the report. A complete table of these requirements can be found in Appendix B.

Individual flash management scheme evaluation

According to REQ5, JFFS can be ruled out as an optimized solution because it is developed for NOR flash and not NAND flash. JFFS2 has, on the other hand, limited support for NAND flash, but its functionality is surpassed by YAFFS. However, JFFS2 is the only file system which has a wear leveling function that handles uneven wear caused by static data.

Although YAFFS/YAFFS2 have no wear leveling function, they are preferred over JFFS2 because they have a better garbage collector, also used in CFFS and MODA, which considers the amount of valid data left in a block and not only that the block has at least one obsolete page. Nevertheless, YAFFS needs to be ruled out in favor of YAFFS2 because the older version fails to meet REQ6 and REQ7.

JFFS — MT: slower. GC policy: collects each block sequentially, regardless of the contents of the block. Wear leveling: not needed because of the garbage collector's behavior. Data clustering: none.

JFFS2 — MT: slower. GC policy: selects a random block with at least one obsolete page. Wear leveling: once every hundredth time a clean block is chosen for garbage collection. Data clustering: none.

YAFFS/YAFFS2 — MT: slow. GC policy: random selection within the boundaries of the passive and aggressive modes. Wear leveling: none. Data clustering: none.

CFFS — MT: fast. GC policy: same as YAFFS. Wear leveling: weight value; metadata block last time -> data block next time. Data clustering: separates metadata and userdata.

MODA — MT: slow. GC policy: same as YAFFS. Wear leveling: none. Data clustering: separates metadata and userdata and uses hot-cold separation of userdata.

Table 3.1. Overview of the different flash file system properties.
FFS = Flash File System, MT = Mount time, GC = Garbage Collector

When the data clustering policy is considered there are only two options available: separation of metadata and userdata as in CFFS, or as in MODA, where the userdata is also separated into hot and cold areas.

When considering REQ1 and REQ3, the MODA allocation scheme seems to be the better solution because it is an improvement of the one used in CFFS. However, CFFS has the benefit of using checkpointing and thus has the best mount time of all the schemes presented.

REQ2, the requirement concerning the introduction of power awareness, has not been applied by any of the flash management schemes and can therefore not be used to facilitate the choice of the best scheme.


Combination of the best flash management scheme features

Although an optimized solution cannot be found by looking at these schemes individually, a combination of them can lead to a good result.

The foundation of the optimized solution will therefore be the CFFS scheme, because of its fast mount properties. The data clustering scheme is, however, changed to the MODA variant. The wear leveling scheme can also be combined with the one used in JFFS2 to be able to handle wear caused by static data.

An optimized solution will, according to the arguments stated above, look like Table 3.2.

Optimized — MT: fast. GC policy: random selection within the boundaries of the passive and aggressive modes. Wear leveling: metadata block last time -> data block next time, and once every hundredth time a clean block is chosen for garbage collection. Data clustering: separates metadata and userdata and uses hot-cold separation of userdata.

Table 3.2. Optimal flash file system.
FFS = Flash File System, MT = Mount time, GC = Garbage Collector

3.5 Flash Translation Layers

Just as there are many different flash file systems available, there are also many different flash translation layers (FTL). This report will explain and discuss a few of the most significant ones. As explained in Section 3.2, the main function of the FTL is to translate logical block addresses to physical block addresses.

There are two major alternatives for the translation table: page-level mapping, seen in Figure 3.5(a), and block-level mapping, depicted in Figure 3.5(b) [11, 19]. The page-level mapping scheme maps each logical sector number to a physical page number. The mapping table will therefore have one entry for each page on the flash memory. The block-level mapping scheme instead splits the logical sector number into a logical block number and a page offset. The data stored in the mapping table for the block-mapping technique is only the logical-to-physical block numbers. This means that the block-level mapping approach needs extra operations to translate the logical block number and page offset to a physical address, but consequently it consumes far less RAM [4, 11].

Being restrictive with RAM usage is very important, especially in products developed for mass production, and choosing the right mapping table can make a huge difference. For example, a 4 Gbyte NAND flash with the large page size of 2048 bytes requires 8 Mbytes of RAM for maintaining the page-level mapping table, while the block-level mapping table only requires 128 Kbyte [11]. For this reason, varieties of the block-level mapping schemes are mostly used today.

Figure 3.5. Page- and block-address translation ((a) page-mapped FTL: the logical sector number indexes a page-level mapping table yielding the physical page number; (b) block-mapped FTL: the logical sector number is split into a logical block number and a logical page number, and the block-level mapping table yields the physical block number)

Most of the recent flash translation layers are variations of a scheme using log blocks. They usually have an overall block-mapping scheme but introduce page-level management for a few blocks [12]. A few schemes using log blocks, and one using another method, are explained further in the following subsections.

3.5.1 BAST FTL - Block-Associative Sector Translation FTL

The Block-Associative Sector Translation (BAST) scheme is a translation layer developed by Kim et al. It manages the majority of the blocks at block level, but a number of blocks are managed at the finer page level. The former blocks are referred to as data blocks and hold ordinary data; the latter are called log blocks and are used as temporary storage for small writes to data blocks [12].

When a page in a data block is updated, a log block is allocated from a pool of available free blocks. Because of the out-of-place update characteristics of a flash medium, the update is written to the log block instead, where the writes are performed incrementally from the first page and onward. A log block is dedicated to a specific data block, and if a page in another block needs to be updated a new log block is allocated. Updates can be carried out until the log block is full; when this happens, a merge operation takes place.

In the merge operation, see Figure 3.6(a), a new free block is allocated and the valid data is copied from the data block and the log block to the free block. Note that the internal page locations are kept intact, so that the page offset in the block-level mapping does not need to change. The free block becomes the new data block and the other two blocks can be erased [12].

Under special circumstances the merge operation can be replaced with a switch operation, see Figure 3.6(b). This happens when all the pages are updated sequentially. No new free block is then required; the log block can instead be turned directly into a data block and the old data block can be erased. This is an ideal situation and saves a lot of energy, because no copying overhead is needed and one erase operation is saved [12].

Figure 3.6. Merge and switch operations ((a) merge: the valid pages of the data block and the log block are copied to a free block, which becomes the new data block; (b) switch: a sequentially written log block directly becomes the new data block and the old data block is erased)

During both the merge and the switch operation the data is moved to another block; thus, the mapping information needs to be updated. According to Kim et al., previous schemes have had reverse physical-to-logical mapping information stored in the spare area of each page. This requires scanning the whole flash at mount time to be able to locate all mapping information. Instead, Kim et al. propose a mapping table stored in dedicated blocks called map blocks to enable faster mount.

At mount, only the map blocks are scanned, and a map over the map blocks, the map directory, is stored in RAM. When a page is updated, both the corresponding map block and the map directory need to be updated. However, this ensures the consistency of the mapping table even at an unexpected power failure, thus simplifying recovery.

3.5.2 AFTL - Adaptive FTL

Another variant of the log block scheme, called Adaptive FTL, is proposed by Wu and Kuo. AFTL uses a combination of a block-level mapping scheme and a page-level mapping scheme. Two hash tables are held in RAM, one for each mapping table, but the page-level mapped table has a limited number of slots available. Wu and Kuo use a log block when pages are updated, but instead of applying the merge operation described in Figure 3.6(a) when the log block is full, AFTL leaves the log block intact and stores its valid data in the page-level hash table. The argument is that this data can be considered hot data and is more likely to be accessed frequently, thus warranting page-level mapping [19].

There is only a finite number of page-level mapping slots available, and which pages stay in the hash table is handled by a linked list using the Least Recently Used (LRU) policy [19]. When the list is full, a new entry will force old pages to be put back in the block-level mapped hash table. The pages are then returned to a block and inserted at their original positions to make sure that the page offset still points to the right data.


3.5.3 FAST FTL - Fully-Associative Sector Translation FTL

The Fully-Associative Sector Translation (FAST) scheme is developed by Lee et al. and is built on the BAST scheme, but it introduces two important differences. The first is that FAST adopts fully-associative address mapping. The second is that FAST uses two different kinds of log blocks: one kind for sequential writes and another for random writes [13].

Fully-associative address mapping means that a log block is no longer associated with a particular data block. Instead, updates from many data blocks can be stored in the same log block. This method actually introduces the need for the second difference. As shown in Figure 3.6, a switch operation is superior to a merge operation because it needs no copying overhead and only one erase operation. But with the fully-associative method it is highly unlikely that a switch operation ever occurs. This is where the sequential log block makes its contribution.

When a write takes place, the system first checks whether the page being updated is the first page in a data block, i.e. logical sector number mod (number of pages in a block) = 0. If it is the first page, it is put first in the sequential writes (SW) log block. The data (if any) already in the SW log block is merged with its data block before the insertion. If the following updates come sequentially they will continue to fill the SW log block, and when it is full a switch operation can take place. However, if the data is not written sequentially, the data is added to the SW log block anyway, but only a merge operation can be applied when it is full or when another first page needs to be inserted.

The random write (RW) log blocks are used when a sequence of writes does not start with a page from the first position in a block. A switch operation can never occur for a RW log block, only merge operations. The merge for RW log blocks is, however, a bit different from the merge shown in Figure 3.6(a). It still involves one log block but can involve many data blocks, due to the fully-associative address mapping. A comparison between BAST and FAST has been done by Lee et al., and it shows that FAST can reduce the erase count by 10-50% depending on the test case, and is in the worst case at the same level as BAST [13].

3.5.4 FTL/FC - FTL/Fast Cleaning

FTL/Fast Cleaning (FTL/FC) is a translation layer developed to speed up cleaning time for larger flash memories. FTL/FC is not a log block based translation layer; instead it uses a data placement policy called Dynamic dAta Clustering (DAC).

DAC is a flash memory management scheme for logical partitioning of the storage space. The idea is to cluster data with similar update frequencies together. Figure 3.7 shows how DAC operates. Data which is updated frequently will be moved upwards towards the top region and be considered hot data, while less frequently updated data will instead end up in the bottom regions as cold data [4].

A new page of data is first written to the bottom region. A promotion to an upper region can only happen if the page is updated and it is not older than a predefined “young-time”. If the page is updated after the “young-time” deadline it will stay in the same region. Demotions to a lower level happen when blocks are selected for cleaning. Pages in the selected block that are still valid and older than a predefined “old-time” will be demoted to the previous level; younger pages are written back to the current region [5].

Figure 3.7. DAC region transitions (pages updated while still young are promoted towards the top region; pages that have grown too old are demoted towards the bottom region)

Each region includes multiple LRU lists; there is one list for every number of invalid pages a block can have, i.e., if the block layout is 32 pages there will be 32 LRU lists in each region. There is also a separate cleaning list, shared by all regions, holding the blocks with no valid data [4].

In FTL/FC, DAC is set to partition the memory into three different regions: hot, neutral and cold.

The cleaning policy used in FTL/FC makes use of the multiple LRU lists in each region. First of all, the blocks in the cleaning list are selected for cleaning. If the cleaning list is empty, the cost-benefit policy, see Subsection 3.6.3 for details, is used. Instead of having to search through all the blocks to find the optimal one for cleaning, only the first block in each LRU list needs to be examined. The other blocks can be ignored because the cost-benefit policy wants the oldest block, and that will always be the first block in the list [4].

3.5.5 MFTL - Metadata FTL

Wu et al. propose a file system aware FTL, named MFTL, with the ability to separate metadata and userdata. Wu et al. argue that metadata is accessed more often than userdata and that metadata usually consists of very small files. Small files do not use a considerable amount of memory, and a page-level mapping scheme can therefore be used for the metadata partition without too much RAM overhead. The userdata, however, can still be handled at block level [20].

Writes in the page-level mapped area are performed sequentially in a logging fashion.

3.5.6 Summary and discussion

This subsection will summarize the current section and also discuss the benefits and drawbacks of the five translation layers presented, with the goal of choosing an optimized solution. References to the requirements will be made during the discussion; a table with all requirements can be found in Appendix B.


Individual evaluation

In the AFTL scheme the switching operations between the hash tables create copying overhead. Performance gains, like faster access to more frequently accessed data, however, make up for this overhead. The problem with AFTL is that when pages are switched from the page-level to the block-level mapped hash table, writes are performed to a specific page in the designated block. This is required to make sure that the page offset still points to the right data, but it also means that AFTL does not meet the sequential write demand of REQ6.

Another consideration that needs to be taken into account is what is most important: energy saving or memory consumption. The only FTL of the ones discussed here able to handle hot and cold separation is FTL/FC using the DAC technique. However, the DAC technique does not keep inter-block locations of pages intact. Therefore, FTL/FC cannot use a block-level mapping scheme and has to use the much more memory-consuming page-level mapping scheme instead, thus going against REQ4. This might be acceptable in systems with a lot of physical memory, but for embedded systems where RAM usage is crucial it is not a suitable solution.

When considering the two translation schemes BAST and FAST, the FAST translation layer is, as mentioned above, an improvement of the BAST scheme and is therefore a more suitable choice.

The last translation layer, MFTL, is not really a competitor to the others; it is more a complement. Separation between metadata and userdata can be implemented in any of the other FTLs. Here ENEA has an advantage compared to the developers of MFTL, since ENEA also controls the file system JEFF. The filtering technique used in MFTL is therefore not even necessary: it is already possible for JEFF to inform the translation layer whether the sent data is metadata or userdata.

When REQ2, with its strive towards power-awareness, is considered, it can be concluded that none of the described translation layers apply any functionality to support power-awareness.

Optimized solution

The optimized solution would, because of the above stated arguments, be to use the FAST FTL for the block-level mapped userdata in MFTL and to make use of JEFF so that the translation layer knows whether incoming data is userdata or metadata.

This solution will meet requirements REQ1, REQ3, REQ4, REQ5, REQ6 and REQ7. REQ2 can also be considered to be met because a strive towards power-awareness has taken place, although it could not be applied in any of the presented translation layers.

The discussion in Subsection 3.5.6 is summarized in Table 3.3.


FTL      Opinion derived from discussion
AFTL     Cannot handle new flash memories with sequential write requirement
FTL/FC   Uses too much RAM for use in embedded systems
BAST     Surpassed by FAST
FAST     Optimal flash translation layer for block-level mapping
MFTL     Preferable solution in collaboration with JEFF and FAST

Table 3.3. Evaluation summary of discussed translation layers

3.6 Other flash memory related functionalities

This section describes functionalities which are embedded in a flash memory management scheme but are not fully elaborated in this study.

3.6.1 Bad block management

NAND flash is designed for high density at a low cost, and a perfect flash memory is not guaranteed in production. Usually a new flash memory has a few bad blocks which are unusable, and a few more are expected to go bad during its lifetime [15]. Bad block management is a feature which makes bad blocks invisible to the system. It can be done either in hardware or software, but every good flash memory management system needs a bad block manager.

3.6.2 ECC - Error correction code

Just as NAND flash requires bad block management, it also needs error correction code to handle frequent bit errors. ECC does not need to ask a sender whether the received message was correct in order to detect an error; it is capable of detecting a specific amount of errors within a certain quantity of data by itself. The ECC feature can be implemented in hardware, in a separate software application, or in the file systems and translation layers themselves.

3.6.3 Cleaning policies

This subsection describes different cleaning policies used by a garbage collector in either a flash file system or a flash translation layer.

One of the simplest cleaning policies is the greedy policy; it always selects the block with the largest amount of invalid data. The goal is to reclaim as much free space as possible in each garbage collection [4]. The greedy policy has been proven to be efficient when data is accessed uniformly. However, if some data is updated more frequently than other data, also known as hot data, it would be preferable that this data is not copied because it will soon be invalidated anyway. The greedy policy does not consider this and can therefore not avoid copying of hot data [14].
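As a minimal illustration, greedy selection reduces to picking the block with the maximum invalid-page count; the data layout below is hypothetical:

```python
def greedy_select(invalid_pages):
    """Greedy policy: return the block number with the most invalid pages.
    `invalid_pages` maps a block number to its count of invalid pages."""
    return max(invalid_pages, key=invalid_pages.get)
```

Note that a block full of soon-to-be-invalidated hot data scores just as well as one full of cold garbage, which is exactly the weakness described above.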


Another cleaning policy, called the cost-benefit policy, does however consider hot data [4, 14]. It calculates a value for each block on the flash memory according to the formula

    benefit / cost = age * (1 - u) / (2u)

and the block which gets the highest value is selected for cleaning. Here u is the percentage of valid data in the selected block, hence (1 - u) is the percentage of free space that could be reclaimed. 2u simulates the cleaning cost: one u for reading the valid data and one u for writing it to a free block. The age stands for the amount of time elapsed since the latest update of the block. The age parameter increases if a block has not been reclaimed for a long time, and the chances for that block to be chosen for cleaning increase. In this way blocks which contain invalid data can be cleaned more evenly than with the greedy policy [4].
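A sketch of the cost-benefit calculation; the score is computed per block and the block with the highest value is chosen (the tuple layout is illustrative):

```python
def cost_benefit(age, u):
    """benefit/cost = age * (1 - u) / (2u), where u is the fraction of valid
    data in the block: (1 - u) is the reclaimable space and 2u the cost of
    reading and rewriting the valid pages. Assumes 0 < u <= 1."""
    return age * (1.0 - u) / (2.0 * u)

def select_block(blocks):
    """Return the index of the block with the highest benefit/cost score.
    `blocks` is a list of (age, u) tuples."""
    scores = [cost_benefit(age, u) for age, u in blocks]
    return scores.index(max(scores))
```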

Another policy which is very similar to the cost-benefit policy is the CAT cleaning policy. It selects the block for cleaning according to the minimal value of the formula

    (u / (1 - u)) * (1 / age) * numcl

where u and age are the same as for the cost-benefit policy and numcl is the number of times the block has been cleaned. u / (1 - u) stands for the cleaning cost. The only difference from the cost-benefit policy is that CAT also considers the number of times the block has been cleaned before.
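The CAT score can be sketched the same way; here the block with the minimal value is selected, so frequently cleaned blocks are penalized:

```python
def cat_score(age, u, num_cleanings):
    """CAT: (u / (1 - u)) * (1 / age) * num_cleanings, minimized over blocks.
    Assumes 0 < u < 1 and age > 0."""
    return (u / (1.0 - u)) / age * num_cleanings
```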

3.6.4 Buffering

A write buffer in a flash memory management system can reduce write requests to the memory and thus save energy. Fewer write operations will also mean fewer erase operations. Li et al. suggest that putting a buffer between the logical block address layer and the physical block address layer can save energy by reducing writes to flash memory [14]. Buffer management is, however, out of the scope of this study.

3.7 Discussion

This section will discuss the benefits and drawbacks of flash file systems versus flash translation layers and present an optimal flash memory management system.

3.7.1 Optimal flash management system

Woodhouse [18] argues that in order to have a crash safe system, journaling needs to be integrated in both the file system and the translation layer. That is not an effective solution, and therefore Woodhouse suggests that a dedicated file system is used instead.


ENEA has, on the other hand, the well-functioning file system JEFF and wants to continue to use it. Another aspect is that ENEA's operating system does not always operate with flash memories; it might as well work with hard drives, and a dedicated flash file system is therefore not always useful. The translation layer option is, because of the above stated arguments, a more suitable solution for ENEA, although it might not be an optimal solution if only performance is considered.

The optimized flash memory management system proposed by this study is the translation layer selected in Subsection 3.5.6: MFTL combined with FAST FTL, supported by the JEFF interface which already has the ability to distinguish metadata from userdata.

3.7.2 Related questions and thoughts

It might be possible to combine the suggested optimal solution with the knowledge drawn from Section 3.4. The parts concerning only file systems are not of interest here, but some solutions can be applied to both flash dedicated file systems and translation layers.

Data clustering with the DAC technique was dismissed because it required page-mapping and thus too much RAM. The MODA scheme presented in Subsection 3.4.4, however, only separates data in the first stage and will therefore not disrupt the internal page locations during operation. It would therefore be possible to combine the MODA scheme with the selected optimal translation layer. Metadata and userdata separation is already included in MFTL, so the difference with MODA would be the allocation of userdata in hot and cold regions.

On the other hand, it is not certain that any benefits can be drawn from the MODA scheme when the management system is using the block-level mapping table for userdata and thus locking pages in their positions within a block. The real question is whether any copying and erase operations are saved during merges and switches by separating hot and cold data. This question is not answered in this study, but it could be interesting to look closer at.

Another question, concerning the ideal situation of a switch operation, is also worth some thought. Is it possible to increase the chances of a switch operation taking place? One solution could be to introduce a buffer between the logical block address layer and the physical block address layer and then buffer pages until a complete block can be written sequentially. In a case where the sequential log block is nearly full but the last pages fail to arrive, another solution could be to copy the last pages from the data block directly to the sequential log block and then perform the switch operation.

Wear leveling is an aspect not taken into account in the translation layer schemes. It could be worthwhile to use some knowledge of wear leveling drawn from Section 3.4 and apply it to the selected solution.

Finally, some thought should be put into what garbage collection policy to use in the page-level mapped section in MFTL. Of the three policies mentioned in Subsection 3.6.3, the CAT cleaning policy is probably the best choice. Using this policy will also automatically introduce some wear leveling to the page-level section because of its consideration of both the age parameter and the erase count.

3.7.3 Requirements

When looking at the discussion above, at least one thing is clear: the selected choice of an optimal flash memory management system will use a translation layer together with ENEA's own file system JEFF.

REQ8: The flash memory management system shall use a translation layer together with JEFF.

Chapter 4

OSE 5.4 and the Soft Kernel Board Support Package (SFK-BSP)

4.1 OSE 5.4

OSE is an embedded real time operating system based on a microkernel architecture. It is designed to manage both hard and soft real-time constraints. The operating system's fundamental building blocks are processes; a process corresponds to threads and tasks in other operating systems.

There are many ways for OSE processes to communicate, but the simplest and most preferred way is by signals. Signals are typed messages sent from one process to another.

This section will discuss the parts of OSE relevant to this master thesis.

4.1.1 Processes

The building blocks in OSE are the processes. This is because the system uses processes to allocate CPU time for different jobs. A process can be compared to tasks or threads in other programming environments [6].

Only one process at a time can be executed on the CPU. To make sure critical processes are executed first, processes in OSE are assigned different priority levels depending on how imperative they are for the system performance. If a process with a high priority is ready to run while a process with lower priority is currently executing on the processor, a context switch takes place. The lower prioritized process is preempted and moved to the ready state, while the higher prioritized process is moved to the running state and executed on the processor.

A process will always be in one of these three states:

Running The process is currently being executed on the processor.

Ready The process is ready to execute, but a process with higher priority is currently executing on the processor.


Waiting The process is waiting for an event to occur and is currently not in the need of the CPU.

A process can move between these three states during system operation, as partially explained above. A more extensive explanation of these movements is given in Figure 4.1.

Figure 4.1. Overview of the three process states (Ready, Running, Waiting) and the transitions between them (start, stop, dispatch, preemption, receive). * Start of process with lower or the same priority as the one running. ** Start of process with higher priority than the one running.

4.1.2 Signals

The recommended tool for communication between processes in OSE is signals [6]. A signal is a message buffer that is sent from one process to another.

A signal contains a signal number, hidden signal attributes, data if necessary, and an end mark, see Figure 4.2.

Figure 4.2. A signal is sent from process A to B containing three or four blocks: an administrative block (Adm.block), a signal number (Sig.no), a data field (optional) and an end mark (0xEE).

Adm. block The hidden attributes can be found in the administrative block. The administrative data is accessed by system calls and contains information about which process sent the signal, which process it was sent to, signal size, etc.


Sig. no The signal number is an identification number telling the receiving process what kind of information to expect in the signal. Each signal number must be assigned to a data structure before it can be used.

Data field The data field is optional and its existence and size are defined in the assigned data structure.

End mark The end mark marks the end of the signal with the value 0xEE.

4.1.3 Flash Access Manager

The Flash Access Manager (FAM) is a manager designed to give applications access to a flash memory. It is designed for NOR-flash and can therefore not handle the tougher requirements of NAND flash memory. However, it is possible to manually apply the restrictions of NAND flash when using the FAM interface and, by this, simulate that the FAM is controlling a NAND flash device.

An application using the FAM communicates with it through the flash_api.h interface. The following functions are available to the application:

flash_read: Function to read data from flash.

flash_write: Function to write data to flash.

flash_erase: Function to erase data from flash.

get_flash_characteristics: Gets start address, address range and erase-block size of the flash volume.

Applications using the FAM need, in addition to the regular start address and address range, to be aware of the erase-block size to be able to perform erases. This information can be extracted from the get_flash_characteristics function, together with information about the address range of the volume and the start address [8].

The use of the flash_read, flash_write and flash_erase functions is simple. A start address of the operation needs to be specified together with the number of bytes the operation is to be performed on. The read and write functions also need a pointer to a read and write buffer respectively [8].

Some restrictions on the addressing do, however, apply. The FAM itself can read and write any number of bytes at an arbitrary address, but if the FAM is used to simulate control of a NAND flash, reads and writes have to be page aligned and have the size of a page. The same issue applies to the erase operation, but this time the block size is the common denominator. The block alignment and erase size restrictions apply to both NOR and NAND flash [8].
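The NAND restrictions that have to be applied manually on top of the FAM amount to simple alignment checks, sketched below. The page and block sizes are assumptions (the real interface is the C header flash_api.h, not this Python sketch):

```python
PAGE_SIZE = 512                 # assumed NAND page size (512 or 2048 bytes)
BLOCK_SIZE = 32 * PAGE_SIZE     # hypothetical erase-block size

def nand_rw_allowed(address, nbytes):
    """Simulated NAND rule: reads and writes must be page aligned and
    exactly one page in size."""
    return address % PAGE_SIZE == 0 and nbytes == PAGE_SIZE

def erase_allowed(address, nbytes):
    """Erase rule (applies to both NOR and NAND): block aligned and
    a whole number of blocks."""
    return address % BLOCK_SIZE == 0 and nbytes > 0 and nbytes % BLOCK_SIZE == 0
```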

The FAM is not an OSE process. It operates as a BIOS module and inherits its process priority from the calling client process. Calls to the FAM will, because of the priority inheritance, block all operations at lower priorities. This is especially important to consider for the flash_write and flash_erase operations, which take a very long time to execute. As a result, processes with lower priority are likely to be starved during execution of these operations. Therefore it is important to ensure that the process priority is low enough so that the system's real-time performance is not destroyed [8].

4.2 Soft Kernel Board Support Package (SFK-BSP)

The development platform used in this master thesis has been the OSE soft kernel. The soft kernel allows OSE processes to be run on a host computer, in this case a PC using the Windows operating system. The soft kernel also sets up a simulated RAM and NOR-flash memory on the host, which are interpreted as real hardware by the system [8].

The size of both the simulated RAM and the NOR-flash can be set manually. The simulated memories are built at the first startup and stored as .mem files in the root directory on the host hard drive; this is called a cold start. At reboot, only a warm start is needed, which means that the .mem files are just read into the host RAM.

4.2.1 Modules in the soft kernel

Auxiliary functions can be added to OSE by including modules in the compilation. The included modules will in this way be contained in the created binary file.

The translation layer is constructed as such a module and is included in the OSE build during compilation.

Chapter 5

Design and implementation

The second goal of this thesis is to design and implement a flash memory management system for the OSE operating system. The goal is not to build a complete system but rather to build a foundation to facilitate future work. The ultimate goal, in a longer perspective, is to design and implement the flash memory management system described in Section 3.7.

5.1 Translation layer design

This report focuses on the NAND flash type for the reasons mentioned in Section 3.1.1; however, suitable NAND flash developing tools were not available. The soft kernel version of OSE described in Section 4.1 includes a simulated environment for NOR-flash. It was decided that the NOR-flash simulator should be used due to the absence of a fitting NAND flash simulator and the advantage of using a simulator instead of working with actual hardware from the start. However, the design should still consider the tougher requirements of NAND flash. The following requirements needed to be taken under consideration:

• Access to flash memory can only be done in the size of a page. (REQ5)

• Writes need to be made sequentially within a block. (REQ6)

• A page can only be written once before it is erased. (REQ7)

The translation layer is designed to be as simple as possible but still include all the basic features required by a translation layer. Because of the demand for simplicity a page-level mapped translation layer approach was chosen. This means that the developed translation layer will consume more RAM than the proposed optimal solution but will, on the other hand, be possible to implement within the time constraints.


5.1.1 Initiation of translation layer

At startup, before JEFF tries to mount the flash volume, the translation layer is registered in the kernel. Then the volume is scanned and the translation list containing the active files is built; details about the translation list and the scan function can be found in Section 5.1.3 and Section 5.1.5 respectively.

After the scan the translation layer module waits for request signals from JEFF. For instance, JEFF needs to know the size of the volume, which block size to use and which signals the translation layer module can receive and process. When the initiation process is finalized the flash memory is mounted by the file system and is then ready to be used for file storage.

5.1.2 Storage on flash volume

The first two blocks on the flash memory are dedicated to translation layer metadata, for more information about this type of metadata see Section 5.1.4; the rest is available for file system data. The data on the flash memory is stored in a logging fashion from the beginning of the address range to the end. When the end is reached the translation layer starts to write at the beginning of the memory again. Garbage collection makes sure that the area selected for writing is erased and free, for more information see Section 5.1.6.

5.1.3 Translation layer list

The map storing the logical to physical address information is built as a linked list and is stored in the system RAM. The linked list holds all data that is valid on the memory at a given moment. The objects in the linked list hold the logical address, the physical address and pointers to the previous and next object in the list. Figure 5.1 shows an example of an object in the translation list.

Figure 5.1. Object in linked list

The translation list can handle four operations: add, search, remove and print. The add operation is used when new data is stored on the flash memory. A new entry is then added at the end of the linked list holding the logical address and the actual physical address. The search operation is used when the system needs to know if a specific data object can be found on the memory or if a specific flash address is occupied by valid data. The remove operation is used when data is invalidated on the flash memory; note that the remove function will not do anything to the physical memory, it will only remove the entry from the list if found. The last operation, print, is used if the user wants a printout of the logical to physical mapping of the current valid data.
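The list and its operations can be sketched as follows. This is an illustrative Python model (the actual implementation is a C linked list inside the OSE module), with field names assumed; the print operation is omitted for brevity:

```python
class MapEntry:
    """One object in the translation list (cf. Figure 5.1)."""
    def __init__(self, logical, physical):
        self.logical = logical
        self.physical = physical
        self.prev = None
        self.next = None

class TranslationList:
    """Doubly linked logical-to-physical map kept in RAM."""
    def __init__(self):
        self.head = None
        self.tail = None

    def add(self, logical, physical):
        """New data stored on flash: append an entry at the end of the list."""
        entry = MapEntry(logical, physical)
        if self.tail is None:
            self.head = self.tail = entry
        else:
            entry.prev = self.tail
            self.tail.next = entry
            self.tail = entry

    def search(self, logical):
        """Return the entry for a logical address, or None if not on flash."""
        entry = self.head
        while entry is not None:
            if entry.logical == logical:
                return entry
            entry = entry.next
        return None

    def remove(self, logical):
        """Data invalidated on flash: unlink the entry (flash is untouched)."""
        entry = self.search(logical)
        if entry is None:
            return
        if entry.prev is not None:
            entry.prev.next = entry.next
        else:
            self.head = entry.next
        if entry.next is not None:
            entry.next.prev = entry.prev
        else:
            self.tail = entry.prev
```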

5.1.4 Translation layer metadata

The first two blocks on the flash memory are dedicated to translation layer metadata. Note that the translation layer metadata is not the same as the file system metadata mentioned earlier. The file system, JEFF in this case, will send packages containing either userdata or file system metadata; the translation layer metadata, on the other hand, is created by the translation layer itself and is not received from the file system.

The translation layer metadata holds the same information as the translation list and is needed because the translation list cannot be maintained during a restart. The translation list is stored in RAM and is therefore lost when the power is turned off. It can only be recreated at restart if the data is also stored on the flash memory.

The logical to physical data mapping is stored in a string of characters on the memory. Again, this is far from an optimal solution because it requires a lot more memory than necessary, but it is simple and works sufficiently well. The metadata string structure can be seen in Figure 5.2.

Figure 5.2. The structure of a metadata chunk

PA stands for physical address and the following ten characters hold the actual physical address; LA stands for logical address and the characters following hold the logical address. The NULL character is added after the logical address to mark the end of the string.
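The chunk layout can be sketched as follows. The 32-byte chunk size, the "PA"/"LA" markers, the ten address characters and the NULL terminator come from the text; the exact hex formatting of the addresses and the zero padding are assumptions:

```python
CHUNK_SIZE = 32

def pack_chunk(physical, logical):
    """Build one metadata chunk: 'PA' + 10-character physical address +
    'LA' + logical address + NULL, padded to 32 bytes."""
    s = "PA{:#010x}LA{}\0".format(physical, logical)  # e.g. PA0x30022c00LA1123
    assert len(s) <= CHUNK_SIZE
    return s.ljust(CHUNK_SIZE, "\0").encode("ascii")
```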

This data is written to one of the two blocks dedicated to metadata whenever a change in the file system occurs. As with the file system data, the metadata is also written in a logging fashion, starting from the beginning of the block and continuing to the end. When the end is reached the writes do not just continue to the next block. Instead the information in the translation list is first written to the second block. The translation list holds all the live pages on the memory and is therefore the only information that needs to be saved. When the translation list has been written, the first block can be erased because all valid information is now stored in the second block. New updates are thereafter written to the second block until it is full and the procedure starts over again.

If new data is added, a new metadata write is performed and a string with the format shown in Figure 5.2 is written in the active metadata block. However, updates and erases also need to be written to the metadata block because they too make changes to the file system. Updates are simple; they are written in the exact same way as a newly added page. Erases need to be handled separately; in this system erases are written in the format seen in Figure 5.3. The fact that metadata writes are written sequentially makes it possible to determine which metadata entry is the most recent.

Figure 5.3. The structure of a metadata chunk which is to be deleted

Because only one metadata block can be used at the same time, the number of pages possible to store on the flash memory is limited. With the current setup the live page limit is 2048.

The method used for writing the metadata is not portable to a NAND flash environment. This is because the current system writes the metadata in sizes of a metadata chunk, which is 32 bytes. NAND flash can only write in sizes of a flash page, with a size of 512 bytes or 2048 bytes.

From the beginning the idea was to fully simulate a NAND flash memory and also make use of the spare area described in Section 3.1.2. However, the method of storing metadata with the format depicted in Figure 5.2 and Figure 5.3 was already established and unfortunately not compatible with the spare area size. The metadata chunks, with their size of 32 bytes, were too big to fit into the 16 bytes of available space in the spare area. The metadata would then have had to be redesigned to fit into the spare area, but because the focus of the implementation was a solution as simple as possible it was not found necessary to change the metadata write implementation. However, it needs to be changed in the future to accommodate the requirements of NAND flash memory.

5.1.5 Scan function

The scan function is used when the translation layer is started and before the flash volume is mounted by JEFF. The scan function searches through the metadata stored in one of the two metadata blocks at the beginning of the flash memory. As mentioned in Section 5.1.4 the metadata chunks are formatted according to Figure 5.2 and have a size of 32 bytes to be aligned with the block size. The scan function looks at the start of a metadata chunk and confirms that the first two characters are "PA". Then the function stores the following characters, which are the physical address; when "LA" is reached it breaks and instead stores the logical address in another variable until the NULL character is found.

The scan starts at the beginning of the active metadata block and adds new pages to the translation list as they occur. If a page that is already in the list is encountered, the physical address can just be updated. This is due to the certainty that the encountered object is a more recent update, because metadata updates are written sequentially in the metadata block. If a chunk with the deleted format occurs, its object in the translation list is deleted on the same basis.
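The per-chunk parsing step of the scan can be sketched as follows. The 32-byte chunk size, the "PA"/"LA" markers and the NULL terminator come from the text; the exact address formatting is an assumption:

```python
CHUNK_SIZE = 32

def parse_chunk(chunk):
    """Parse one 32-byte chunk of the form 'PA<10 chars>LA<logical>\\0'.
    Returns (physical, logical) as strings, or None when the chunk holds
    no metadata, which is the signal for the scan to stop."""
    text = chunk[:CHUNK_SIZE].decode("ascii", errors="replace")
    if not text.startswith("PA") or text[12:14] != "LA":
        return None
    physical = text[2:12]                       # the ten address characters
    logical = text[14:text.index("\0", 14)]     # up to the NULL terminator
    return physical, logical
```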


The scan continues until the beginning of a metadata chunk no longer holds any data. All valid data is then back in the translation list, and the file system can mount the volume and start to operate on it again.

5.1.6 Garbage collection

The garbage collection algorithm used in the translation layer is constructed in the same way as the one used in JFFS, described in Section 3.4.1. The data is kept in an ongoing log with a beginning and an end, and the free space is located either before or after the data. When data needs to be erased, the first block in the log of data is chosen. The garbage collection algorithm will then go through the translation list, checking if any of the pages belong to the block selected for erasure. When a page is found to belong to the selected block, the data is read from the flash memory and then written at the end of the log. At the same time the physical location of the page is updated in the translation list and a new metadata chunk holding the new location is written to the metadata block.

When the whole translation list is scanned and all valid data is copied from the erase block to the end of the log, the erase can take place.

The system is set to perform garbage collection when the number of free blocks goes under a threshold value. It is, however, possible to perform garbage collection manually if necessary.
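The copy-out step of the collection can be sketched as follows. The mapping is shown as a plain dictionary, the geometry constants are assumptions, and the two callbacks stand in for the flash read and log-append operations:

```python
PAGE_SIZE = 512
PAGES_PER_BLOCK = 32
BLOCK_SIZE = PAGE_SIZE * PAGES_PER_BLOCK   # hypothetical geometry

def block_of(physical):
    """Block number containing a physical address."""
    return physical // BLOCK_SIZE

def collect_block(victim, mapping, read_page, append_page):
    """Copy every valid page in the victim block to the end of the log and
    update its mapping; afterwards the victim block can be erased."""
    for logical, physical in list(mapping.items()):
        if block_of(physical) == victim:
            mapping[logical] = append_page(read_page(physical))
```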

5.2 Implementation in OSE

The new translation layer is implemented as a module between JEFF and the FAM, replacing the current FTL. Figure 5.4 shows the hierarchy of the processes handling the file system and the interfaces through which the translation layer communicates. The file system uses the ddb.sig interface when sending and receiving signals to and from the translation layer, which in turn communicates with the FAM module through flash_api.h.

5.2.1 Supported signals

The ddb.sig interface includes all necessary signals for a file system communicating with a secondary storage volume. The translation layer can, however, not handle all of them. The signals supported by the developed translation layer are the following:

DDB_EXAMINE_DISK_REQUEST Part of the initiation process. Requests information about disk size, block size, cache size and whether the device is readable, writable and/or able to handle random access.

DDB_INTERFACE_REQUEST Part of the initiation process. Requests information about which signals the module can handle; the reply includes all signals in this list.


JEFF
  |  (ddb.sig)
Implemented FTL
  |  (flash_api.h)
FAM
  |
Simulated flash

Figure 5.4. Overview of the implementation of the new translation layer in the OSE file system

DDB_SUPPORTED_IO_REQUEST Part of the initiation process. Requests information about which commands in the DDB_IO_REQUEST are supported by the module.

DDB_MOUNT_REQUEST Part of the initiation process. Requests to mount an address range on the flash volume.

DDB_UNMOUNT_REQUEST A request to unmount an address range on the flash volume.

DDB_IO_REQUEST The signal used to request I/O operations.

5.2.2 Supported I/O commands

When the module is up and running, all communication between JEFF and the translation layer is managed through the DDB_IO_REQUEST signal. During initiation JEFF asks, with the DDB_SUPPORTED_IO_REQUEST, which commands in this signal are supported. The translation layer supports the following DDB_IO_REQUEST commands:

DDB_OPC_READ: Read request to a sequence of blocks

DDB_OPC_WRITE: Write request to a sequence of blocks

DDB_OPC_JUNK: Delete request to a sequence of blocks

DDB_OPC_WRITE_BARRIER: Requests a write barrier, i.e. ensures that all previous read, write and junk requests are made persistent on the media before any following such requests are made persistent.

The implementation of these commands is fairly straightforward. The read/write/junk requests hold a logical address, the number of pages in sequence to be affected by the command and a pointer to a read/write buffer. When the logical address is translated to a physical address by the translation list, the operations can be carried out via the FAM interface. The last command, write barrier, is ignored by the translation layer although it is supported. This is because the translation layer does not use a cache and will therefore always make all read, write and junk requests persistent on the media before any other requests are carried out.

5.2.3 Mount recommendations

The translation layer only supports access for one application with one partition on the flash volume. It is therefore unnecessary to mount anything but the whole volume, because mounting less would only result in unused space.

If more than one application needs to access the flash, modifications need to be made to the translation layer.

Chapter 6

Test suite

6.1 Introduction to test

This test suite will focus on the functionality of the translation layer, and the tests will clarify whether the implemented design meets the basic requirements of a translation layer.

The results will be printouts from the terminal, and pieces of these printouts are added as figures in the report to facilitate the understanding of the tests. The complete printouts from all test cases can be found in Appendix A.

Figure 6.1 shows a short example of such a printout. The first line shows that a write operation containing userdata with the logical address 1123 has been written to the physical address 0x30022C00. The second line shows that 29 junk operations will take place on logical addresses 1124 through 1152.

Other operations appearing in the tests are read operations, where data from a physical address is read into RAM, and update operations, where an already existing logical address is written to a new physical location.

Figure 6.1. Example of a result printout

6.2 Test case 1

Description: Test case 1 will test write, read, append and overwrite operations on a file with a size smaller than a page (<512 bytes).

Preparation: No preparation needed, but if the exact same results depicted in the test are required, the flash needs to be restarted and formatted before the start of the test.

Steps: The following steps are manually executed once in chronological order.


1. Open test file tf1 (Parameter: w+)¹
2. Write data to file
3. Read data from file
4. Close test file
5. Print content in file
6. Open test file (Parameter: a+)²
7. Append data to file
8. Read data from file
9. Close test file
10. Print content in file
11. Open test file (Parameter: w+)
12. Write data to file
13. Read data from file
14. Close test file
15. Print content in file

6.2.1 Results from test case 1

A complete printout of test case 1 can be found in Appendix A.1. The first part of the test is depicted in Figure 6.2. The test is initiated on line 1. Lines 3-4 contain two metadata read requests from JEFF irrelevant to this test. The creation of the file in step 1 does not produce any flash operations.


Figure 6.2. Result from test case 1:a

Line 5 is a result of JEFF's journaling process, where upcoming changes to the file system are written to a journal before any changes are made persistent on the volume. On line 6 the write operation in step 2 is performed and the logical address 1120 is written to the physical address 0x30021600. In the next operation 31 pages are deleted starting from the logical address 1121. This is not part of the test case but an automatic junk request from JEFF. The result of step 3 is not visible; this is because the last used page is kept in RAM by JEFF and a read from flash is therefore not necessary. Step 4 does not generate any requests to flash, but when step 5 is executed a printout occurs which can be seen on lines 8 and 9.

¹Open empty file for reading and writing; if the file already exists its contents are erased and the file is treated as a new empty file

²Open file for reading and appending

The test continues in Figure 6.3. Step 7, consisting of an append operation, starts with a read at 0x30021600 on line 11, where the userdata of the file tf1 is stored. The read operation is followed by an update to the new physical address 0x30021800 on line 12. Steps 8 and 9 are again invisible, but the result of step 10 can be read on lines 13 through 15. Note that new data is appended to the old data; the append operation has thereby been carried out correctly.


Figure 6.3. Result from test case 1:b

The last part of the test can be seen in Figure 6.4. At step 11 the junk operation on line 16 occurs due to the parameter in the open request, where the old file is deleted before the creation of a new one with the same name. The metadata write on line 17 refers to the journaling operation prior to the userdata write in the next step. Step 12 is done at line 18, where new data is written to the new physical address 0x30021C00. Steps 13 and 14 are not visible, but the result of the last step can be read on lines 20 to 22. The previous data is now erased and the file only consists of new data; the overwrite operation can therefore also be declared successful.


Figure 6.4. Result from test case 1:c


6.3 Test case 2

Description: Test case 2 will test write and read operations with a size bigger than a page (>512 bytes) and simultaneously make a stress test with many consecutive read and write operations.

Preparation: No preparation needed; however, if the exact same results as those depicted in the test are required, test case 1 needs to be executed prior to this test.

Steps: Steps 1 to 4 are contained in a loop which iterates 512 times; the last step is executed once.

1. Open test file tf2 (Parameter: w+)
2. Write 1242 bytes of data to file
3. Read 1242 bytes of data from file
4. Close test file
5. Print content in file

6.3.1 Results from test case 2

The complete printout of test case 2 can be found in Appendix A.2. The first part of the test is displayed in Figure 6.5. The test is initiated on line 1. The creation of tf2 in step 1 does not result in any flash operations. The metadata write illustrated on line 4 is part of the journaling process. The write operation in step 2 requires three pages to fit all 1242 bytes and will therefore need three flash writes, all seen on lines 5, 6 and 7. Step 3 is executed on lines 9 and 10; the last page is held in RAM by JEFF and does not need to be read from flash. Step 4 does not cause any effect on the flash memory.

Figure 6.5. Result from test case 2:a

These four steps are repeated 512 times, but only iterations 1, 2, 511 and 512 are visible in the figure in Appendix A.2. The junk operations on lines 12, 22 and 31 are the result of step 1, where the previous file is erased before it is recreated.


The 512 iterations are executed without error, and the last iteration together with the first three lines of the result of step 5, where the content in tf2 is printed, can be seen in Figure 6.6.

Figure 6.6. Result from test case 2:b

The printout on lines 39 to 60 in Appendix A.2 is a correct reproduction of the written data, and the stress test and the write and read operations with a size bigger than a page can be considered successful.

6.4 Test case 3

Description: Test case 3 will test various file system commands and their effect on the flash memory.

Requirements: Test cases 1 and 2 need to be executed before this test case in order to retrieve any useful information from it.

Steps: The following steps are manually executed once in chronological order.

1. Change to flash directory (command: cd)

2. List files in directory (command: ls)

3. Copy content of test file tf1 to standard out (command: cat)

4. Copy test file tf1 to tf1_copy (command: cp)

5. Copy content of new file tf1_copy to standard out (command: cat)

6. Create test3 directory (command: mkdir)

7. Move test file tf2 to test3 directory (command: mv)

8. List files in flash directory (command: ls)

9. Change to test3 directory (command: cd)

10. List files in test3 directory (command: ls)


6.4.1 Results from test case 3

This test is not executed automatically; instead all steps are done manually from the terminal. The complete printout of test case 3 can be found in Appendix A.3.

The results of steps 1 and 2 are depicted in Figure 6.7. The directory change in step 1 is made on line 1, followed by the listing of the files in the current directory in step 2. Lines 3 to 5 show the result of the list command, where the two files created in test cases 1 and 2 are visible.

Figure 6.7. Result from test case 3:a

Step 3 is initiated on line 6, seen in Figure 6.8, and results in a read at the physical location of tf1 on line 8 and a printout of its content on lines 8 and 9. The printout shows the most up to date content of tf1 and is therefore regarded as a success.


Figure 6.8. Result from test case 3:b

Step 4 is commenced at the end of line 9 and results in another read at the physical location of tf1 on line 10, see Figure 6.9. This operation is followed by a metadata update, which is a journaling operation prior to the userdata write on line 12, where the copied data with logical address 1124 is written to the new physical address 0x30124200. The write is followed by three junk operations and four updates of metadata initiated by JEFF on lines 13 through 19.

Figure 6.9. Result from test case 3:c

Step 5, executed on line 20, shows that the copied data in tf1_copy holds the same information as the original file, see Figure 6.10. The copy operation is thereby considered a success.


Figure 6.10. Result from test case 3:d

The creation of a directory, test3, in step 6 is commenced at the end of line 22 in Figure 6.11. It involves metadata writes to the journal and updates of the file system metadata, which can be seen on lines 23 through 30.

Figure 6.11. Result from test case 3:e

In step 7 tf2 is moved to the test3 directory, which also only requires metadata updates for the journal and the file system. See lines 32 through 36 in Figure 6.12.

Figure 6.12. Result from test case 3:f

Step 8 lists the current files in the flash directory on lines 38 to 41 in Figure 6.13, and it shows that the creation of the test3 directory was successful and that the new file tf1_copy exists.

Step 9 followed by step 10, on lines 42 and 43 respectively, shows that the move command also was accomplished without error.

Figure 6.13. Result from test case 3:g


6.5 Test case 4

Description: This test case will show the translation layer list and the garbage collection mechanism.

Preparation: Test cases 1, 2 and 3 need to be executed prior to this test.

Steps: The following steps are executed once in chronological order.

1. Print translation layer list (command: trans_print)
2. Print flash usage (command: fusage)
3. Perform garbage collect on 14 blocks (command: garbage_collect)
4. Print translation layer list (command: trans_print)
5. Print flash usage (command: fusage)

6.5.1 Results from test case 4

A complete figure of test case 4 is depicted in Appendix A.4 and Appendix A.5.

Figure 6.14 shows a compressed view of the translation layer list after test cases 1, 2 and 3. The first step, printing the translation layer, is executed on line 1. The list contains 22 entries with mostly metadata. Userdata can only be found in five entries: list entry 6, containing the logical address 1120 which represents the data in file tf1; list entries 17-19, holding the logical addresses 1121-1123 containing the three page long tf2; and list entry 20, which holds the copy of tf1, tf1_copy.


Figure 6.14. Result from test case 4:a

The translation layer list also supplies some additional information. On line 24 the value of the metadata counter is displayed. It shows how many of the 2048 available metadata slots are currently in use in the current metadata block. Line 25 also prints information about the number of free blocks available and the start and end addresses of the data on the flash medium.

Step 2 is initiated on line 27, visible in Figure 6.15. The fusage command is an existing function communicating with the FAM interface. Line 28 shows the address range of the flash memory, in this case 0x30000000-0x303FFFFF. The following data visualizes the memory usage of the flash volume. The 16 lines with 4 columns (only six are visible in Figure 6.15, the rest can be seen in Appendix A.5), from line 29 to 44, represent the 64 flash blocks building the memory. The first number is the start address of the block in hex, followed by the size of the block, in this case 64K³. The next letter tells whether the block is used (U) or free (F).


Figure 6.15. Result from test case 4:b

The first two blocks on line 29 are, as mentioned in Section 5.1.4, dedicated to translation layer metadata. The first block is used at the moment and the other is free, but a change will occur when the metadata counter reaches 2048. The rest of the flash memory is available for storage. So far blocks 3 through 19 are used by the file system, which correlates to the data_start and data_end information on line 25 in Figure 6.14.

The following steps are visible in Figure 6.16, which also is the first part of the figure in Appendix A.5. Normally the memory usage would continue until the free block threshold is reached before garbage collection is commenced. However, garbage collection can be forced, and that is done in step 3 on line 1.


Figure 6.16. Result from test case 4:c

Garbage collection on 14 blocks is initiated and all valid data in those 14 blocks is copied to the end of the log. As seen on lines 3 to 6, only four pages are copied. If the figure in Appendix A.4 is studied more closely it becomes clear that those four pages are the only entries in the translation layer list with a physical address within the affected 14 blocks.

Additionally, line 7 shows that the metadata counter has increased because of the four copy operations and now shows 1842.

As a result of step 4, the translation layer list after the garbage collect operation can be seen in Appendix A.5, where the physical locations of list entries 6 and 17-20 have been updated.

³JEDEC memory standard, K = 1024


Figure 6.17 shows the relevant result from step 5, which is the flash usage printout, where the number of used blocks has decreased to three. Line 33 also shows that the number of free blocks has increased, together with the updated new start and end addresses of the data.


Figure 6.17. Result from test case 4:d

6.6 Test case 5

Description: This test case will test the flash scan function used at reboot.

Preparation: No preparation needed, but if the exact same results as those depicted in the test are requested, test cases 1, 2, 3 and 4 need to be executed prior to this test.

Steps: The following three steps are executed once.

1. Exit soft kernel (command: sfkexit)

2. Restart soft kernel (command: source run)

3. Print translation layer list (command: trans_print)

6.6.1 Results from test case 5

The full printout of test case 5 can be found in Appendix A.6. The first part of the printout can be seen in Figure 6.18. Step 1 is executed and the soft kernel is turned off on line 1. It is restarted on line 4 when step 2 is performed. A warm start is used because the memory files are preserved from the previous run.


Figure 6.18. Result from test case 5:a


Figure 6.19. Result from test case 5:b

The initiation of OSE is visible in Figure 6.19. During the initiation phase the translation layer is started, see line 14, and the flash scan function is initiated and completed, see lines 15 and 16 respectively.

When OSE is fully started, the print of the translation layer list in step 3 can be carried out. The result can be seen in the figure of Appendix A.6. When compared to the translation layer printout in Appendix A.5, it can be concluded that the scan function is functional because the two printouts are identical.

6.7 Summary of test results

All tests have been carried out without error and the implemented translation layer can therefore be considered a success.

Chapter 7

Discussion

The aim of this report was to design and optimize a flash memory management system for NAND-flash and to build and implement a simpler design of a flash system in OSE.

7.1 Problem statement

This section will present the answers to the questions stated in Section 1.2.

7.1.1 Flash memory management design

What flash management systems are available and is there a solution fitting therequirements?

There are many flash management systems available and they can be split in two groups: flash-dedicated file systems and flash translation layers. This report takes a closer look at the six flash file systems JFFS, JFFS2, YAFFS, YAFFS2, CFFS and MODA; it also examines the five translation layers BAST, AFTL, FAST, FTL/FC and MFTL.

The report concludes that the optimal file system is the CFFS scheme because of its fast mount properties. The data clustering scheme in CFFS should, however, be changed to the MODA variant for better performance. The wear leveling scheme should also be combined with the one used in JFFS2 to be able to handle wear caused by static data.

This report also concludes that an optimal flash translation layer consists of FAST FTL on the block-level mapped userdata in MFTL, together with the JEFF interface, so that the translation layer can be aware of whether incoming data is userdata or metadata.

When choosing the overall optimal flash management solution it is concluded that a flash translation layer together with JEFF is the best choice. This is concluded although it requires journaling in both the file system and the translation layer. The facts that JEFF is already a well working file system in OSE and that OSE does not always use flash memory as a storage device are considered stronger arguments.

The optimized translation layer selected in Section 3.5 was the MFTL combined with FAST FTL. This knowledge can, however, be combined with the knowledge drawn from Section 3.4. The MODA scheme presented in Subsection 3.4.4 only separates data in the first stage and will therefore not disrupt the internal page locations during operation. It could therefore be a possibility to combine the MODA scheme with the selected optimal translation layer. Metadata and userdata separation is already included in MFTL, so the difference with MODA would be the allocation of userdata in hot and cold regions.

7.1.2 QoS and power awareness

How can power awareness and QoS be implemented in the flash memory management system?

The part of QoS included in the flash memory management system is static power management. The system is designed to be power efficient and the algorithms are chosen with the aim to reduce writes and erases, as suggested in Section 2.1.1. The MODA clustering scheme will reduce copying overhead and thus reduce both flash reads and writes. That will also lead to fewer erases because the flash memory is better utilized. The FAST FTL is designed to minimize the number of erase operations and is thus also a part of the static power management.

However, there are no obvious parts where dynamic power management can be applied in this flash management system. If no dynamic power management exists there is no need for the system to be power-aware.

7.1.3 File system and FTL interface

How can a potential interface between a file system and an optimal flash memory management system be improved?

The implementation of the MFTL scheme will work better if the interface between JEFF and the FTL uses the data classification where metadata and userdata can be distinguished. However, no modification to the interface is necessary because the separation between metadata and userdata is already included in the interface.

7.1.4 Flash memory management implementation

What will a suitable implementation design look like in order to be constructed within the limited time frame?

In order to build a functional FTL within the limited time frame a simple design was established. The design supports all necessary features of an FTL, but no optimization toward low power consumption and low RAM usage is applied.

The translation layer is run on a simulated NOR-flash even though the design is adapted for NAND-flash. The reason for this is that there were no suitable NAND-flash development tools available; the NOR-flash interface has, however, been altered to meet the more restrictive NAND-flash requirements.

The designed FTL uses a logging approach when writing to the flash volume, which means that all data is written sequentially. The garbage collection function is adapted to this method of writing data and will therefore also clean blocks sequentially. During cleaning, which always takes place at the beginning of the log, valid data is moved to the end of the log before the first block in the log is erased.
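The sequential cleaning described above can be sketched as follows. This is a simplified model with hypothetical names and sizes, and a RAM-simulated flash: the oldest (tail) block of the log is cleaned by copying its still-valid pages to the head of the log, remapping them, and then erasing the block.

```c
#include <stdint.h>
#include <string.h>

#define PAGE_SIZE       512u
#define PAGES_PER_BLOCK 16u        /* hypothetical small geometry */
#define NUM_BLOCKS      8u
#define NUM_PAGES       (PAGES_PER_BLOCK * NUM_BLOCKS)
#define UNMAPPED        0xFFFFFFFFu

static uint8_t  flash_mem[NUM_PAGES * PAGE_SIZE];
static uint32_t trans[NUM_PAGES];  /* logical page -> physical page number */
static uint32_t log_tail;          /* oldest block in the log              */
static uint32_t log_head;          /* next free physical page number       */

/* Clean the block at the tail of the log: copy still-valid pages to the
 * head of the log, update their mappings, then "erase" the freed block. */
static void clean_tail_block(void)
{
    uint32_t first = log_tail * PAGES_PER_BLOCK;
    for (uint32_t lp = 0; lp < NUM_PAGES; lp++) {
        uint32_t pp = trans[lp];
        if (pp == UNMAPPED || pp < first || pp >= first + PAGES_PER_BLOCK)
            continue;                          /* not in the tail block */
        memcpy(flash_mem + log_head * PAGE_SIZE,   /* copy valid page   */
               flash_mem + pp * PAGE_SIZE, PAGE_SIZE);
        trans[lp] = log_head++;                    /* ... and remap it  */
    }
    memset(flash_mem + first * PAGE_SIZE, 0xFF,    /* simulate erase    */
           PAGES_PER_BLOCK * PAGE_SIZE);
    log_tail = (log_tail + 1) % NUM_BLOCKS;
}
```

Junked pages are simply skipped, so only live data is rewritten; this matches the observed behaviour in test case 4, where cleaning 14 blocks copied just four pages.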

The page mapping technique has been chosen for the mapping table due to its simplicity.

Translation layer metadata is stored in two dedicated blocks at the beginning of the memory. At reboot a scan function scans the metadata blocks and rebuilds the mapping table.
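A possible shape of such a scan is sketched below. The record layout is an assumption for illustration (the on-flash metadata format is not specified here); the idea is simply that replaying mapping records in write order reproduces the RAM mapping table, since later records override earlier ones.

```c
#include <stdint.h>
#include <string.h>

#define NUM_PAGES 8192u
#define UNMAPPED  0xFFFFFFFFu

/* One hypothetical metadata record, appended for every write, junk or
 * garbage-collection copy. A junk is recorded with physical == UNMAPPED. */
struct map_record {
    uint32_t logical;
    uint32_t physical;
};

static uint32_t trans[NUM_PAGES];

/* Rebuild the RAM mapping table from the 'used' records found in the
 * dedicated metadata blocks. Replaying the log in write order yields the
 * latest mapping because the last record for a logical page wins. */
static void rebuild_mapping(const struct map_record *log, size_t used)
{
    memset(trans, 0xFF, sizeof trans);  /* start with everything unmapped */
    for (size_t i = 0; i < used; i++)
        if (log[i].logical < NUM_PAGES)
            trans[log[i].logical] = log[i].physical;
}
```

This is why the printouts before shutdown and after the rescan in test case 5 are identical: the table is a pure function of the metadata log.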

The FTL uses the ddb.sig interface when communicating with JEFF, and the flash_api.h interface is used when communicating with the FAM, which stores the data on the actual flash memory.

7.1.5 Evaluation of implementation

Can the implemented design manage the basic requirements of a flash memory management system?

The tests in the test suite are designed to determine whether the system can manage the basic requirements of a translation layer.

All tests were carried out successfully and they showed that the implemented system can handle read, write and erase operations together with normal file system commands. It can also manage garbage collection and is able to recover data after a reboot.

7.2 Conclusions

This section will compare the results from this master thesis with the requirements stated throughout the report, also collected in Appendix B, and conclude whether they are met.

Due to the fact that the optimized design from Section 3.7 differs from the implemented design in Chapter 5, the two will be compared separately.

7.2.1 Evaluation of requirements

REQ1: The design of the system shall be power efficient.

The optimized design uses MFTL combined with FAST FTL and MODA and will therefore be a power efficient solution. The prototype does, on the other hand, use a design developed for simplicity and will therefore not reach the requirement of being power efficient.

REQ2: The system shall strive to be power-aware.

It is a bit difficult to determine whether the results meet this requirement for the optimized system. Efforts have been made to find ways to implement dynamic power management in the flash management system and thereby introduce the need for power-awareness. Nevertheless, no dynamic power management situations have been found. Still, it is reasonable to conclude that a strive toward power-awareness has taken place.

The implemented design has, however, no such aims and can therefore not be concluded to have fulfilled the requirement.

REQ3: The number of erase and write operations shall be minimized.

In the optimized design, the number of write and erase operations will be minimized by using the MFTL with the FAST FTL scheme for the block-level mapped partition and MODA on the page-level mapped partition.

The implemented design, however, writes and erases data sequentially, and thus no minimizing of erase and write operations can ever occur.

REQ4: The system shall minimize RAM usage.

Selecting a block-level approach for the userdata in MFTL in the optimized design will have a major impact on the RAM footprint. Power efficiency is still prioritized, but the block-level approach makes sure that the RAM usage does not get out of hand.

The implemented design uses a page-level mapping table and will therefore consume a lot of RAM. It does therefore not reach the requirement of minimizing RAM usage.

REQ5: The design shall focus on the NAND-flash type.

Here both the optimized design and the implemented design meet the requirement. The optimized design is fully prepared for NAND-flash and the implemented design will work on NAND-flash as soon as the function for storing metadata has been altered.

REQ6: The system shall write pages in a block sequentially starting from the first page.

Met by both designs.

REQ7: A page shall only be written once before it is erased.

Met by both designs.

REQ8: The flash memory management system shall use a translation layer together with JEFF.

Both designs use a translation layer together with JEFF.

7.2.2 Summary of requirement evaluation

The evaluation of the requirements is summarized in Table 7.1. The optimized design has met all requirements, while the simpler implemented design failed to meet the requirements concerning performance and power efficiency.

7.2. CONCLUSIONS 59

REQ no.  Optimized design  Implemented design  Description
REQ1     Pass              Fail                Power-efficiency
REQ2     Pass              Fail                Power-awareness
REQ3     Pass              Fail                Minimize writes and erases
REQ4     Pass              Fail                Minimize RAM usage
REQ5     Pass              Pass                NAND flash requirements
REQ6     Pass              Pass                Write sequentially
REQ7     Pass              Pass                Write once requirement
REQ8     Pass              Pass                Translation layer + JEFF

Table 7.1. Evaluation of the requirements for the optimized and implemented designs.

Chapter 8

Future work

This chapter will explain the work with the translation layer that still needs to be done but could not be completed within the time restraints of this master thesis.

8.1 Build according to specification

The most important part of the work that needs to be done in the future is to upgrade the implemented translation layer to the system described in the specification. The implemented design completed in this master thesis is only groundwork for further development.

First of all, the mapping table needs to be changed from page-mapped to block-mapped. Consequently, the linked list and the garbage collection mechanism need to be altered: the linked list holding the logical to physical page addresses needs to hold blocks instead, and the sequential garbage collection needs to be altered to meet the new demands of a block-mapped translation table, i.e. be able to handle merge and shift operations.
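To illustrate the RAM saving that motivates this change, a block-mapped lookup could look like the sketch below. Sizes are taken from the simulated 4 MB volume with 64 KB blocks and 512 byte pages; all names are hypothetical.

```c
#include <stdint.h>

#define PAGES_PER_BLOCK 128u   /* 64 KiB block / 512 B page */
#define NUM_LBLOCKS     62u    /* assumed: 64 blocks minus 2 metadata blocks */

/* One entry per logical *block* instead of per page, so the table shrinks
 * by a factor of PAGES_PER_BLOCK compared with page-level mapping. */
static uint32_t block_map[NUM_LBLOCKS];  /* logical block -> physical block */

/* With block mapping the page offset inside the block must stay the same,
 * which is why out-of-place page updates force the merge and shift
 * operations that a block-mapped garbage collector has to handle. */
static uint32_t page_to_physical(uint32_t logical_page)
{
    uint32_t lblock = logical_page / PAGES_PER_BLOCK;
    uint32_t offset = logical_page % PAGES_PER_BLOCK;
    return block_map[lblock] * PAGES_PER_BLOCK + offset;
}
```

At four bytes per entry this table needs 62 entries instead of 8192, which is the RAM reduction REQ4 is after.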

When the system is constructed a performance evaluation needs to take place. The questions that need answering are: How much energy can be saved with the new system? How big is the memory footprint, and what is the throughput of the system compared with the previous FTL?

8.2 Dynamic support for different flash sizes

The implemented system is not dynamic in the sense that flash memories with different address ranges and block sizes can be used without manual modification of the code. As of now, the implemented system includes many hard-coded values instead of making use of information extracted from the FAM.

Information such as block size and address range is extracted from the FAM with the function get_flash_characteristics during the initiation phase, but that information is not used in the main program. With a few modifications the translation layer could, however, be made more dynamic and able to function with different flash sizes without manual modifications.

8.3 Flash translation layer metadata

The way of storing metadata in the implemented system is not sufficient and it is not portable to a NAND flash environment. The metadata is now stored in two dedicated blocks at the beginning of the memory, but in the future it would be better to use the spare area available on NAND flash.

The spare area in a flash memory with a page size of 512 bytes has a size of 16 bytes, but not all bytes in the spare area can be used for metadata. Space also needs to be allocated for ECC and bad block management. In the previous system used by ENEA, involving the old FTL, four bytes were used for metadata, so it should be safe to assume that as much space could be allocated for metadata in the translation layer as well.

The physical address range goes from 0 to 0x3FFFFF, which needs three bytes; the logical address range is 0 to 0x2000, which requires two bytes. This means that at least five bytes are necessary to represent all addresses. However, since the system is supposed to be designed for NAND flash it is not necessary to be able to address every byte on the flash memory. It is sufficient to address the start byte of each page, which means that 0x3FFFFF can be divided by 512, or 0x200. This leaves 0x1FFF pages, which fit into two bytes. All necessary addresses can therefore be represented by the 4 bytes available.
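The arithmetic above can be made concrete with a small packing sketch. The field layout (logical page number in the high two bytes, physical page number in the low two) is an assumption for illustration, not the format of the old FTL's spare-area bytes:

```c
#include <stdint.h>

#define PAGE_SIZE 0x200u  /* 512-byte pages: byte offset / 0x200 = page no. */

/* Pack a logical page number and a physical byte offset into the four
 * assumed spare-area bytes. Both page numbers fit in two bytes each, since
 * 0x400000 / 0x200 = 0x2000 pages needs only 13 bits. */
static uint32_t pack_spare(uint16_t logical_page, uint32_t physical_byte)
{
    uint16_t physical_page = (uint16_t)(physical_byte / PAGE_SIZE);
    return ((uint32_t)logical_page << 16) | physical_page;
}

static void unpack_spare(uint32_t packed,
                         uint16_t *logical_page, uint32_t *physical_byte)
{
    *logical_page  = (uint16_t)(packed >> 16);
    *physical_byte = (packed & 0xFFFFu) * PAGE_SIZE;  /* page no. -> bytes */
}
```

Because only page-aligned byte offsets are representable, the round trip is lossless exactly under the assumption made above: that only the start byte of each page ever needs to be addressed.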

If a bigger flash memory needs to be used, a different solution might be needed, or perhaps there actually are more bytes available in the spare area. It is important to remember that NAND flash memories with 512 byte pages are the old model. Newer, bigger flash memories have a page size of 2048 bytes and a spare area of 64 bytes, which should be enough to fit the translation layer addresses.

8.4 Test system on hardware

So far the system has only been tested on a host computer with the soft kernel. To make sure that the system works properly it needs to be tested on proper hardware. ENEA works with a Freescale board with an i.MX31 processor which can be used for hardware testing. It has both NOR and NAND flash available.

Bibliography

[1] Unknown author. View of /yaffs2/README-linux, Accessed: October16 2009. URL http://www.aleph1.co.uk/cgi-bin/viewcvs.cgi/yaffs2/README-linux?view=markup.

[2] Seungjae Baek, Seongjun Ahn, Jongmoo Choi, Donghee Lee, and Sam H. Noh.Uniformity Improving Page Allocation for Flash Memory File Systems. Pro-ceedings of the 7th ACM and IEEE international conference on Embedded soft-ware, pages 154 – 163, 2007.

[3] Artem B. Bityutskiy. JFFS3 design issues. URL http://www.linux-mtd.infradead.org/tech/JFFS3design.pdf. Version 0.32 (draft), 2005.

[4] Mei-Ling Chiang, Chen-Lon Cheng, and Chun-Hung Wu. A New FTL-basedFlash Memory Management Scheme with Fast Cleaning Mechanism. The 2008International Conference on Embedded Software and Systems, pages 205–214,2008.

[5] Mei-Ling Chiang, Paul C. H. Lee, and Ruei-Chuan Chang. Using Data Clus-tering to Improve Cleaning Performance for Flash Memory. Software PracticeExperience, 29(3):267–290, March 1999. ISSN 0038-0644.

[6] ENEA AB. Enea OSE Architecture User’s Guide, 2009.

[7] ENEA AB. Enea OSE Core Extensions User Guide, November 2009.

[8] ENEA AB. Enea OSE Core User’s Guide, October 2009.

[9] Eran Gal and Sivan Toledo. Algorithms and Data Structures for Flash Mem-ories. ACM Computing Surveys, 37(2):138–163, June 2005. ISSN 0360-0300.

[10] GEODES. GEODES (Global Energy Optimization for Distributed EmbeddedSystems), Project number: ITEA2 <07013>, 2009. URL http://www.itea2.org/public/project_leaflets/GEODES_profile_oct-08.pdf.

[11] Jeong-Uk Kang, Heeseung Jo, Jin-Soo Kim, and Joonwon Lee. A Superblock-based Flash Translation Layer for NAND Flash Memory. Proceedings of the6th ACM and IEEE International conference on Embedded software, pages 161– 170, 2006.

63

64 BIBLIOGRAPHY

[12] Jesung Kim, Jong Min Kim, Sam H. Noh, Sang Lyul Min, and Yookun Cho.A Space-Efficient Flash Translation Layer for Compactflash Systems. IEEETransactions on Comsumer Electronics, 48(2):366–375, May 2002. ISSN 0098-3063.

[13] San-Won Lee, Dong-Joo Park, Tae-Sun Chung, Dong-Ho Lee, Sangwon Park,and Ha-Joo Song. A Log Buffer-Based Flash Translation LayerUsing Fully-Associative Sector Translation. ACM Transactions on Embedded ComputingSystems, 6(3), July 2007.

[14] Han-Lin Li, Chia-Lin Yang, and Hung-Wei Tseng. Energy-Aware Flash Memory Management in Virtual Memory System. IEEE Transactions on Very Large Scale Integration (VLSI) Systems, 16(8):952–964, August 2008. ISSN 1063-8210.

[15] Charles Manning. How YAFFS Works. URL http://users.actrix.co.nz/manningc/yaffs/HowYaffsWorks.pdf.

[16] Kyo-Ho Park and Seung-Ho Lim. An Efficient NAND Flash File System for Flash Memory Storage. IEEE Transactions on Computers, 55(7):906–912, July 2006. ISSN 0018-9340.

[17] Padmanabhan Pillai, Hai Huang, and Kang G. Shin. Energy-Aware Quality ofService Adaptation. The University of Michigan, 2003.

[18] David Woodhouse. JFFS: The Journalling Flash File System. Ottawa Linux Symposium, 2001. URL http://sources.redhat.com/jffs2/jffs2.pdf.

[19] Chin-Hsien Wu and Tei-Wei Kuo. An Adaptive Two-Level Management for the Flash Translation Layer in Embedded Systems. Proceedings of the 2006 IEEE/ACM International Conference on Computer-Aided Design, pages 601–606, 2006. ISSN 1092-3152.

[20] Po-Liang Wu, Yuan-Hao Chang, and Tei-Wei Kuo. A File-System-Aware FTL Design for Flash-Memory Storage Systems. Design, Automation and Test in Europe (DATE '09), 2009.

Appendix A

Complete test cases

A.1 Test case 1


Figure A.1. Print screen of the terminal showing the result of test case 1


A.2 Test case 2


Figure A.2. Print screen of the terminal showing the result of test case 2


A.3 Test case 3


Figure A.3. Print screen of the terminal showing the result of test case 3


A.4 Test case 4a


Figure A.4. Print screen of the terminal showing the result of test case 4a


A.5 Test case 4b


Figure A.5. Print screen of the terminal showing the result of test case 4b


A.6 Test case 5


Figure A.6. Print screen of the terminal showing the result of test case 5

Appendix B

Requirements

ID    Description
REQ1  The design of the system shall be power efficient.
REQ2  The system shall strive to be power aware.
REQ3  The number of erase and write operations shall be minimized.
REQ4  The system shall minimize the RAM usage.
REQ5  The design of the system shall focus on the NAND flash memory type.
REQ6  The system shall write pages in a block sequentially, starting from the first page.
REQ7  A page shall only be written once before it is erased.
REQ8  The flash memory management system shall use a translation layer together with JEFF.

Table B.1. Descriptions of the requirements.
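The NAND programming constraints behind REQ6 and REQ7 (pages are programmed sequentially within a block, and a page is never reprogrammed before the block is erased) can be illustrated with a small C sketch. This is not code from the thesis implementation; the types and names (`block_state`, `program_page`, `PAGES_PER_BLOCK`) are illustrative assumptions.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define PAGES_PER_BLOCK 64  /* assumed block geometry, for illustration */

/* Per-block state: next_page enforces sequential programming (REQ6),
 * the written bitmap enforces write-once-before-erase (REQ7). */
typedef struct {
    uint32_t next_page;                 /* next page eligible for programming */
    bool     written[PAGES_PER_BLOCK];  /* true once a page has been programmed */
} block_state;

/* A page may be programmed only if it is in range, is the next
 * sequential page, and has not been written since the last erase. */
static bool can_program(const block_state *b, uint32_t page)
{
    return page < PAGES_PER_BLOCK &&
           page == b->next_page &&
           !b->written[page];
}

static bool program_page(block_state *b, uint32_t page)
{
    if (!can_program(b, page))
        return false;            /* would violate REQ6 or REQ7 */
    b->written[page] = true;
    b->next_page++;
    return true;
}

/* Erasing a block makes every page writable again, from page 0. */
static void erase_block(block_state *b)
{
    memset(b, 0, sizeof *b);
}
```

In a real translation layer these checks are implicit in the page allocator rather than validated per write, but the invariant is the same: the only legal program target in a block is the page after the last one programmed.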
