
MStore: Enabling Storage-Centric Sensornet Research

Kresimir Mihic, Ajay Mani, Manjunath Rajashekhar, and Philip Levis
{kmihic,ajaym,manj,pal}@stanford.edu

Computer Systems Laboratory
Stanford University
Stanford, CA 94305

Abstract

We present MStore, an expansion board for telos and mica family nodes that provides a non-volatile memory hierarchy. MStore has four memory chips: a 32KB FRAM, an 8MB NOR flash, a 16MB NOR flash, and a 256MB NAND flash, which can be expanded to 8GB if needed. All chips provide an SPI bus interface to the node processor. MStore also includes a Complex Programmable Logic Device (CPLD), whose primary purpose is to serve as an SPI-to-parallel interface for the NAND chip. The CPLD can also be used to offload complex data processing.

Using TinyOS TEP-compliant drivers, we measure the current draw and latencies of read, write, and erase operations of different sizes on each of the storage chips. Through this quantitative evaluation, we show that MStore's many-level hierarchy and simple design provide an open and flexible platform for sensor network storage research and experimentation.

1. INTRODUCTION

The primary purpose of sensor networks is to sense and process readings from the environment. Local storage can benefit many application scenarios, such as archival storage [16], temporary data storage [12], storage of sensor calibration tables [17], in-network indexing [21], in-network querying [22], and code storage for network reprogramming [15], among others. Recent work [20] has also shown that local storage on flash chips is two orders of magnitude cheaper than transmitting over the radio, and is comparable in cost to computation.

Such gains in cost, performance, and energy consumption have strengthened the case for in-network storage and data-centric applications. While flash memory has been a cheap, viable storage alternative for low-power, energy-constrained sensor nodes, the storage subsystems on existing sensor platforms have not kept pace with recent technological advancements in non-volatile memory chip design.

Existing storage capabilities on sensor nodes are restricted to a single flash memory.

Figure 1: The MStore mounted on a Telosb Sensor

Current designs of storage-centric applications thus focus their storage strategies around a single storage chip. But existing storage solutions like Matchbox [10], ELF [9], and Capsule [19], and indexing systems like MicroHash [23] and Tinx [18], could greatly benefit from a multi-level storage subsystem. Such a multi-level storage hierarchy will significantly improve the flexibility of design choices, enhance overall sensor performance, and influence new designs and system architectures.

MStore, our new extension storage board for the telos and mica sensor nodes, introduces such a non-volatile memory hierarchy with different memory chips, providing the much-needed module to enable research and experimentation in storage-centric sensornet applications.

Non-Volatile Memory Characteristics: There are many different types of non-volatile storage memories on the market that target low-power scenarios. Among them, flash memory is the most prominent, while newer advanced types like Magnetic RAM (MRAM) and FRAM are gaining popularity. These chips vary widely in their characteristics.

NOR Flash: NOR flash memory has traditionally been used to store relatively small amounts of executable code for embedded computing devices such as PDAs and cell phones. NOR is well suited to code storage because of its reliability, fast read operations, and random access capabilities. Because code can be executed directly in place, NOR is ideal for storing firmware, boot code, operating systems, and other data that changes infrequently. Apart from being used as a ROM, NOR memories can, of course, also be partitioned with a file system and used as any other storage device. NOR flashes have capacities up to 512 MB.

NAND Flash: NAND flash memory has become the preferred format for storing larger quantities of data. Higher density, lower cost, faster write and erase times, and a longer re-write life expectancy make NAND especially well


Figure 2: The MStore

suited for applications in which large amounts of sequential data need to be loaded into memory quickly and replaced repeatedly. Unlike NOR flash chips, NAND chips are accessed like a block device with block sizes from 512 bytes to 2048 bytes. Associated with each block are a few spare bytes (typically 12–16 bytes) that can be used to store an error detection and correction checksum. NAND flashes have capacities of up to 8GB.

FRAM: Ferroelectric RAM (FeRAM or FRAM) is an advanced non-volatile computer memory. It is similar in construction to DRAM, but uses a ferroelectric layer to achieve non-volatility. Although the market for non-volatile memory is currently dominated by flash chips, FeRAM offers a number of advantages, notably lower power usage, faster write speed, and a much greater maximum number of write-erase cycles (exceeding 10^16 for 3.3V devices). FeRAMs have smaller capacities; the largest manufactured FRAM is 1MB.

Given these different characteristics, we have introduced four different NVRAM chips on MStore: a 32KB FRAM, an 8MB NOR flash, a 16MB NOR flash, and a 256MB NAND flash. The NAND chip family also includes chips with sizes from 256MB to 8GB, and these share identical current and timing performance. Thus the NAND chip on MStore can be expanded to 8GB if needed. Individual chip details are highlighted in section 3.

Drivers and Evaluation: To support reading and writing of data from these memory chips, we have written a set of drivers in TinyOS 2.0 [14], which follow the guidelines of TinyOS Extension Proposal (TEP) 103 [6]. Details of this implementation are in section 4. We conducted our evaluation of the read, write, and erase characteristics using these drivers. The results are compared with the values specified in the datasheets for each of the chips. These evaluations are in section 5.

2. BOARD DESIGN

The MStore memory extension board consists of five functional elements, four memory chips and a complex programmable logic device (CPLD), plus supporting elements such as current resistors, bypass capacitors, and a 1.8 V voltage regulator for the CPLD. The board has been designed to support both the telosb and mica2 platforms by providing interfaces to the platforms through extension ports. The board fits both platforms without any alteration.

Figure 3: Schematic diagram of MStore components

Figure 4: MStore Board (Front)

The backbone of the MStore board is the SPI bus, which connects the extension ports with the functional elements (Figure 3). The FRAM and NOR memory devices are connected directly to the bus, while the NAND uses the CPLD as an SPI-to-parallel interface. The interface has been designed as a finite state machine that handles the NAND chip control inputs (command, address, and data latch modes, input and output data) and performs serial-to-parallel and parallel-to-serial data conversion. The state machine is chip specific, as it follows the specifics of the K9 flash memory family, but it can easily be updated to support similar memory chips from other manufacturers. The interface has been implemented in VHDL and synthesized using the Xilinx ISE 8.1i design software suite.

MStore provides seven extension ports (Figures 4 and 5). Four connect MStore to a sensor platform: two for telosb and two for mica2. The next two provide a direct connection to the CPLD pins; one accesses the JTAG ports and the other the I/O pins (CPLD EX). The seventh extension port (RP) gives access to the current resistors R placed on the power supply inputs of each chip, and can thus be used for various measurements on the power lines (for example, operating current).

3. CHIPS

The MStore storage board consists of four different storage chips with varying characteristics and behaviors. It also includes a Complex Programmable Logic Device with which one can program logic directly into the storage board, offloading processing from the sensor mote. The board can act as an extension to both the telosb [2] and the micaz [13] sensor nodes.


Table 1: Overview of storage and data organization as per datasheets

Chip Name    Total Size  Page Size  Block Size  Sector Size  Read Unit  Write Unit  Erase Unit
                         (Bytes)    (KB)        (KB)         (Bytes)    (Bytes)     (KB)
M25P64       8MB         256        -           64           1 to ∞¹    1 to 256    64
AT26DF161    2MB         256        -           128          1 to ∞¹    1 to 256    4/32/64
FM25L256     32KB        -          -           -            1          1           NA
K9F2G08UXA   256MB       2048       128         -            2048       2048        128

Figure 5: MStore Board (Back)

3.1 M25P64 NOR Flash

The M25P64 [5] is an 8MB serial NOR flash memory that can be accessed over a high-speed SPI-compatible bus. The memory can be programmed 1 to 256 bytes at a time using the Page Program instruction. An enhanced Fast Program/Erase mode is available to speed up operations. The memory is organized as 128 sectors, each containing 256 pages. Each page is 256 bytes wide. Thus, the whole memory can be viewed as consisting of 32768 pages, or 8388608 bytes. Even though there is no way to erase a single page, the entire memory can be erased using the Bulk Erase instruction, or one sector at a time using the Sector Erase instruction.
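As an illustration of this organization, a linear byte address can be decomposed into its sector, page, and byte offset. The sketch below is ours (the function name is not part of the driver); the geometry figures are those stated above:

```python
# Sketch: decompose a linear M25P64 byte address into (sector, page, offset).
# Geometry from the text: 128 sectors x 256 pages x 256 bytes = 8,388,608 bytes.
PAGE_SIZE = 256                              # bytes per page
PAGES_PER_SECTOR = 256                       # pages per sector
SECTOR_SIZE = PAGE_SIZE * PAGES_PER_SECTOR   # 65,536 bytes (64 KB)
NUM_SECTORS = 128

def decompose(addr):
    assert 0 <= addr < NUM_SECTORS * SECTOR_SIZE, "address beyond the 8 MB chip"
    sector = addr // SECTOR_SIZE
    page = (addr % SECTOR_SIZE) // PAGE_SIZE
    offset = addr % PAGE_SIZE
    return sector, page, offset

# The last byte of the chip lives in the last page of the last sector.
assert decompose(8_388_607) == (127, 255, 255)
```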

Table 2: M25P64 Program/Erase characteristics

Parameter                            Typ          Max  Units
Page Program Cycle Time (256 Bytes)  1.4          5    ms
Page Program Cycle Time (n Bytes)    0.4 + n/256  5    ms
Sector Erase Cycle Time              1            3    sec
Bulk Erase Cycle Time                68           160  sec

Protection: The M25P64 protects at sector granularity; three Block Protect bits in the status register select which sectors are protected. The protected sectors are always a consecutive run at the top of the address space: depending on the bit values, the top 2, 4, 8, 16, 32, or 64 sectors are protected.

Energy and latency: The chip consumes 50µA in standby mode and draws about 20 mA in program or erase mode. Reads are much less expensive than writes; our measurements put the read current at about 2 mA.

Table 3: M25P64: Current draw characteristics

Operation      Max  Measured  Units
Standby Mode   50   –         µA
Read at 20MHz  4    1.7       mA
Page Program   15   4.5       mA
Fast Program   20   –         mA
Sector Erase   20   4.8       mA
Bulk Erase     20   4.8       mA

3.2 AT26DF161 NOR Chip

The AT26DF161 [1] is a 2MB serial-interface flash memory device. It has sixteen 128-Kbyte physical sectors, and each sector can be individually protected from program and erase operations. The chip has a flexible erase architecture supporting four erase granularities: 4KB, 32KB, 64KB, and full chip. The datasheet claims that the chip is designed for use in a wide variety of high-volume consumer applications in which program code is shadowed from flash memory into embedded or external RAM for execution. The erase granularity also makes it ideal for data storage.

Protection: The AT26DF161 also offers a sophisticated method for protecting individual sectors against erroneous or malicious program and erase operations. By providing the ability to individually protect and unprotect sectors, a system can unprotect a specific sector to modify its contents while keeping the remaining sectors of the memory array securely protected. This is useful in applications where program code is patched or updated on a subroutine or module basis, or in applications where data storage segments need to be modified without running the risk of errant modifications to the program code segments. In addition to individual sector protection, the AT26DF161 incorporates Global Protect and Global Unprotect features that allow the entire memory array to be protected or unprotected all at once. This reduces overhead during the manufacturing process, since sectors do not have to be unprotected one by one prior to initial programming.

Specifically designed for use in 3-volt systems, the AT26DF161 supports read, program, and erase operations with a supply voltage range of 2.7V to 3.6V. No separate voltage is required for programming and erasing. The chip-specific energy properties are laid out in Table 5.

¹Address values greater than the chip's maximum address are ignored by our driver implementation.

Table 4: AT26DF161: Program/Erase characteristics

Parameter                      Typ   Max  Units
Page Program Time (256 Bytes)  1.5   5.0  ms
Block Erase Time (4-Kbyte)     0.05  0.2  sec
Block Erase Time (32-Kbyte)    0.35  0.6  sec
Block Erase Time (64-Kbyte)    1.0   0.7  sec
Chip Erase Time                18    28   sec

Deep Power-Down: During normal operation of the AT26DF161, the device is placed in standby mode to consume less power as long as the CS pin remains deasserted and no internal operation is in progress. The chip also accepts a Deep Power-Down command that places the device into an even lower power consumption state, called the Deep Power-Down mode. The datasheet claims that the chip consumes about 4 µA in this mode. This property is useful for reducing energy consumption when duty cycling the flash chip.

Table 5: AT26DF161: Current draw characteristics

Parameter                 Typ(Max)  Measured  Units
Standby Current           25(35)    –         µA
Deep Power-Down Current   4(8)      –         µA
Read Operation at 20 MHz  7(10)     10.7      mA
Page Program              12(18)    12.8      mA
Sector Erase              14(20)    11.4      mA
Bulk Erase                14(20)    11.3      mA

3.3 FM25L256 FRAM

Ferroelectric RAM (FeRAM or FRAM) is a new type of non-volatile computer memory which uses a ferroelectric layer to achieve non-volatility. FRAM is competitive in applications where its low write voltage, fast write speed, and much greater write-erase endurance, despite its low storage volume, give it a compelling advantage over flash memory. We thus decided to include this chip in our storage board so that we can investigate the possibilities of using chips with these properties.

The FM25L256 [3] is a 32KB nonvolatile memory built on this advanced ferroelectric process. The FM25L256 performs write operations at bus speed; the datasheet claims that no write delays are incurred. The next bus cycle may commence immediately, without the need for data polling. In addition, the product offers virtually unlimited write endurance. FRAM also exhibits much lower power consumption than EEPROM. These capabilities make the FM25L256 ideal for nonvolatile memory applications requiring frequent or rapid writes or low-power operation. Example applications range from data collection, where the number of write cycles may be critical, to demanding industrial controls, where the long write time of EEPROM can cause data loss. The FM25L256 provides substantial benefits to users of serial EEPROM as a hardware drop-in replacement, and uses the high-speed SPI bus, which enhances the high-speed write capability of FRAM technology.

Protection: The FM25L256 does not have the same powerful protection system as the AT26DF161. The programmer can protect only fixed regions of the chip: the upper 1/4, the upper 1/2, or all of the chip's memory.

The datasheet states that the current drawn in standby mode is in the range of 1µA, which is quite small compared to the 50µA of the M25P64 and the AT26DF161 chips. There is no difference between an erase and a write, and the latency values for both reads and writes are said to be at bus speed.

Table 6: FM25L256: Current draw characteristics

Operation     Typ(Max)  Measured  Units
Standby Mode  -(1)      –         µA
Byte Read     15(30)    0.70      mA
Byte Write    15(30)    0.71      mA

3.4 K9F2G08UXA NAND Flash

The K9F2G08X0A [4] is a 2,112 Mbit memory (256MB of data plus spare area) organized as 131,072 rows (pages) by 2,112x8 columns, with 64x8 spare columns located at column addresses 2,048 to 2,111. A 2,112-byte data register is connected to the memory cell arrays, accommodating data transfer between the I/O buffers and memory during page read and page program operations. The memory array is made up of NAND structures of 32 serially connected cells, with each of the 32 cells residing in a different page. A block consists of two NAND-structured strings; in total, 1,081,344 NAND cells reside in a block.

Program and read operations are executed on a page basis, while the erase operation is executed on a block basis. The memory array consists of 2,048 separately erasable 128K-byte blocks; bit-by-bit erase is therefore not possible on the K9F2G08X0A.
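The stated geometry is internally consistent, as a quick check shows (variable names are ours; all figures are from the text above):

```python
# Consistency check of the stated K9F2G08X0A geometry (figures from the text).
page_data = 2048           # data bytes per page, spare area excluded
pages = 131_072            # total pages ("rows")
block_bytes = 128 * 1024   # erase unit: one 128 KB block
blocks = 2048              # separately erasable blocks

# Both decompositions give the same 256 MB of data storage.
assert pages * page_data == blocks * block_bytes == 256 * 1024 * 1024
# Each block therefore holds 64 pages.
assert block_bytes // page_data == pages // blocks == 64
```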

Some commands, such as Reset and Status Read, require one bus cycle. Others, such as Page Read, Block Erase, and Page Program, require two cycles: one for setup and the other for execution.

In addition to the enhanced architecture and interface, the device incorporates a copy-back program feature that copies data from one page to another without transporting it to and from an external buffer memory. Since the time-consuming serial access and data-input cycles are removed, system performance for solid-state disk applications is significantly increased. This feature is not used in our driver but is a potential optimization.

The chip does not have any programmable protection scheme.

Table 7: K9F2G08UXA: Program/Erase characteristics

Parameter          Typ  Max  Units
Page Program Time  0.2  0.7  ms
Block Erase Time   1.5  2    ms

Table 8: K9F2G08UXA: Current draw characteristics

Operation     Typ(Max)  Measured  Units
Standby Mode  10(50)    NI        µA
Page Read     15(30)    –²        mA
Page Program  15(30)    7.0³      mA
Block Erase   15(30)    8.6³      mA

3.5 XC2C32A CPLD Chip

The XC2C32A [8] is a Complex Programmable Logic Device (CPLD) manufactured by Xilinx as part of its CoolRunner-II CPLD family. It provides a 100% digital core with up to 323 MHz performance and ultra-low power consumption, using about 28.8µW in standby mode. It is one of the smallest CPLD packages available and features 32 macrocells. In-System Programming (ISP) supports both IEEE 1532 In-System Programming and IEEE 1149.1 JTAG Boundary Scan testing.

During our experiments, we found that the chip adds an overhead of 1.4 mA to operations while performing the serial-to-parallel and parallel-to-serial data conversion. The latency the CPLD adds to operations is negligible. According to the datasheet, it draws a standby current of about 90µA.

4. DRIVER DESIGN

To support evaluation of the characteristics of the board and its various memory chips, we needed to write drivers to talk to the hardware. We implemented these drivers in TinyOS 2.0 [14], following the guidelines of TinyOS Extension Proposal (TEP) 103 [6].

TEP 103 documents a set of hardware-independent interfaces to non-volatile storage for TinyOS and describes some design principles for the Hardware Presentation Layer (HPL) and Hardware Adaptation Layer (HAL) of various flash chips. We follow the three-layer Hardware Abstraction Architecture (HAA), with each chip providing a presentation layer (HPL), an adaptation layer (HAL), and a platform-independent interface layer (HIL) [11].

The TEP also describes three high-level storage abstractions: large objects written in a single session (Block interface), small objects with arbitrary reads and writes (Config interface), and logs (Log interface). We used our implementation of the Block interface for our tests.

TinyOS 2.x divides flash chips into separate volumes (with sizes fixed at compile time), with each volume providing a single storage abstraction (the abstraction defines the format). We used each chip as a single volume, and the drivers operated over this volume.

Bad Blocks and CRCs: The Block interface of TEP 103 contains no direct support for verifying the integrity of the data. We support this by storing a CRC, and we allow checking this CRC when desired.

²The readings were not complete at the time of submission.
³These values include the fixed 1.4 mA overhead of the CPLD.
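As an illustration of such an integrity check, the sketch below appends a CRC to each page on write and verifies it on read. The polynomial (CRC-16-CCITT) and the 2-byte trailing tag are illustrative assumptions, not MStore's actual on-flash format:

```python
# Illustrative page-integrity scheme: append a CRC on write, verify on read.
# Assumption: CRC-16-CCITT with a 2-byte tag stored after the page data;
# the real driver's CRC format is not specified in the text.
def crc16_ccitt(data, crc=0xFFFF):
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            crc = ((crc << 1) ^ 0x1021) if crc & 0x8000 else (crc << 1)
            crc &= 0xFFFF
    return crc

def write_page(page_data):
    # Store the CRC alongside the data in the same page.
    return page_data + crc16_ccitt(page_data).to_bytes(2, "big")

def check_page(stored):
    # Recompute the CRC over the data and compare with the stored tag.
    data, tag = stored[:-2], stored[-2:]
    return crc16_ccitt(data).to_bytes(2, "big") == tag
```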

5. EVALUATION

The datasheets for the various chips give the current usage and latency values for read, write, and erase operations. However, the actual values in practice tend to be very different due to the driver overhead associated with each individual operation. Hence, we decided to experimentally measure the latency and current usage characteristics of various operations by exercising the driver code corresponding to each individual chip.

Experimental Setup: The experiments were carried out on an MStore board connected to a Telosb sensor node. Each of the chips had a single volume implementing the Block Storage abstraction as defined by TEP 103. Read and write characteristics were measured by reading and writing a byte, a page, two pages, a sector, and the full flash worth of data. The erase characteristics were measured by exercising the bulk and sector erase commands⁴. The sensor node had no other application running on top of TinyOS while the measurements were taken.

5.1 Latency Characteristics

We define the latency of a particular operation as the time required to complete it at the Hardware Interface Layer (HIL). For a split-phase operation, it is the average time for the callback phase to be signaled after the operation was invoked.

The latency value of an operation includes the SPI bus arbitration time, the time needed to prepare a chip for an operation (opcode, address, data, etc.), and the actual time needed for the operation to complete. The latter information (min, max, and typical values) is usually given in the AC Characteristics section of the chip's datasheet. For some operations the latency also includes the timing of instructions that must precede the actual command; for example, before each write, a command to enable writing must be sent to the flash. Also, the SPI bus may not send a regular stream of clock cycles, but may have bytes separated by some ∆T1 and groups of bytes separated by some other ∆T2. The actual latency of an operation is thus the sum of the latencies of its constituent operations and the timing and operational constraints of the SPI bus.

Using the Alarm interface of TinyOS for measuring these latencies would have limited the precision and accuracy of the readings to that supported by the timer system. To obtain more accurate readings, we decided to calculate latency in terms of the number of clock cycles required to execute each split-phase operation. We did this by starting a counter before a call to an operation and stopping the counter once the operation's completion event was signaled. The value in this counter then represents the number of clock cycles consumed in carrying out the operation.

Our testing suite includes code to check the correctness of the read and write operations. This introduces a slight overestimate in the measured latency of chip operations.

⁴The erase operation occurs at a minimum granularity of the block size. Hence, to erase any region of the flash smaller than the block size, we must at least pay the cost of erasing the smallest number of blocks encompassing the erased region.
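The cost implied by this footnote can be computed directly: the number of blocks that must be erased to clear a byte range is determined by the blocks the range touches (a sketch; the helper name is ours):

```python
# Sketch: number of blocks that must be erased to clear [start, start+length)
# when the minimum erase granularity is block_size (footnote 4).
def blocks_to_erase(start, length, block_size):
    if length == 0:
        return 0
    first = start // block_size                 # block holding the first byte
    last = (start + length - 1) // block_size   # block holding the last byte
    return last - first + 1

# Erasing even a single byte costs one whole block.
assert blocks_to_erase(0, 1, 4096) == 1
```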


The clock on the Telosb runs at 8 MHz, so the final latency is calculated using the formula:

latency (in ns) = 125 * #clock-cycles

The latency measurement for each flash operation was repeated 10 times. The values in Table 9 represent the average over these 10 experiments. There were slight variations in the running time of flash operations, but in all cases the standard deviation was below 1% of the average value.
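The conversion from cycle counts to latency, and the averaging over repeated runs, can be sketched as follows (helper names are ours):

```python
# Cycle-count to latency conversion at the Telosb's 8 MHz clock:
# one clock cycle = 1 / 8,000,000 s = 125 ns.
CLOCK_HZ = 8_000_000

def latency_ns(cycles):
    return cycles * 1_000_000_000 // CLOCK_HZ   # 125 ns per cycle

def mean_latency_ns(cycle_counts):
    # The paper repeats each measurement 10 times and averages.
    return sum(latency_ns(c) for c in cycle_counts) / len(cycle_counts)
```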

5.2 Energy Characteristics

The energy requirements of a memory chip are a primary concern for adoption and a motivating factor in design choices. The energy required to read, store, or erase data is a function of latency and current draw. This value is much higher than what can be calculated using only the information provided in the datasheet, which includes only the time for an operation to execute on the chip. For proper energy calculations, the latency of an operation as provided by the driver must be used. We use the operation latencies from the latency experiments to calculate the energy costs.

To measure the current drawn by each operation, we executed the different read, write, and erase operations using the drivers, just as in the latency measurements. We used a Velleman oscilloscope [7] to measure these values. Current measurements were made by measuring the average voltage drop across the input resistors during memory operations⁵. For the flash chips and the CPLD the resistor value is 1 ohm, while for the FRAM a 10 ohm resistor is used. The tolerance of all resistors is 1%.

The energy value depends on the operating voltage. We assume a constant running voltage of 2.8 volts. The consumed energy is calculated using the formula:

energy (µJ) = 2.8 (V) * measured current (mA) * latency (ms)

The measured readings are tabulated in the current usage tables of each of the chips (Tables 3, 5, and 6). We notice that these energy values also vary substantially from the readings provided in the datasheets.
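The energy formula above translates directly into a helper (the function name is ours); the numbers in the usage example are the FM25L256 byte-write figures from Tables 6 and 9 (0.71 mA, 0.717 ms):

```python
# Energy from the paper's formula: energy (uJ) = 2.8 V * current (mA) * latency (ms).
def energy_uJ(current_mA, latency_ms, voltage_V=2.8):
    return voltage_V * current_mA * latency_ms

# FM25L256 byte write: 0.71 mA (Table 6) over 0.717 ms (Table 9),
# giving about 1.43 uJ.
byte_write_energy = energy_uJ(0.71, 0.717)
```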

Table 10: Energy consumed for Erase operation (mJ)

                    FM25L256  M25P64  AT26DF161
Sector Erase        NA        3.89    18.47
Block Erase (4KB)   NA        NA      9.24
Block Erase (32KB)  NA        NA      9.26
Block Erase (64KB)  NA        NA      18.48
Bulk Erase          NA        241.47  245.52

⁵Standby current could not be measured due to the limited resolution of the oscilloscope.

Figure 6: Energy consumed during Read operation

Figure 7: Energy consumed during Write operation


Table 9: Operation latency over driver code

Operation  Size (in bytes)  FM25L256  M25P64  AT26DF161  Units
Erase      Sector Erase     NA        0.289   0.578      sec
           Bulk Erase       NA        17.967  13.449     sec
Read       1                0.576     0.571   0.572      ms
           256              6.222     6.218   6.219      ms
           512              12.441    12.430  12.432     ms
           Sector Size      NA        1.656   3.274      sec
           Flash Size       0.829     19.117  52.587     sec
Write      1                0.717     34.198  34.301     ms
           256               5.946    39.309  72.690     ms
           512              11.985    78.760  145.382    ms
           Sector Size      NA        8.985   25.066     sec
           Flash Size       0.772     51.806  56.502     sec
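One way to read Table 9 is to separate the fixed per-operation overhead from the per-byte cost. The back-of-the-envelope sketch below (our own arithmetic, using the FM25L256 read rows from Table 9) estimates the incremental per-byte read latency:

```python
# Read latencies from Table 9 for the FM25L256 FRAM (ms)
lat_1b = 0.576    # 1-byte read
lat_256b = 6.222  # 256-byte read

# Incremental cost of the 255 additional bytes, per byte (ms)
per_byte_ms = (lat_256b - lat_1b) / 255

# The 1-byte latency is almost entirely fixed driver/bus overhead
fixed_overhead_ms = lat_1b - per_byte_ms
```

The near-identical per-byte figures across all three chips suggest the SPI bus and driver code, not the memory cells, dominate small-read latency.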

6. DISCUSSION

The chips that we have chosen for MStore have differing capabilities and properties. FRAM has very desirable properties but a small capacity. The NOR chips are byte addressable and can operate on small chunks of data, but the energy consumption of operations on the NAND may be much better than the values we measured for the NOR flashes. NAND flash, however, has a larger read and write granularity, which might subvert its energy advantage over NOR. The designer is thus faced with weighing these tradeoffs to decide where to place the data.

Abusing operating system terminology, we can divide data into three categories based on usage characteristics: 'hot', 'warm', and 'cold'. Hot data tends to be updated and accessed very often, while cold data is seldom accessed or updated. Warm data falls between these two categories. From the energy consumption values in the evaluation section, we conclude that the FRAM is a perfect candidate for hot data, while the NOR and NAND flashes are optimal for warm and cold data, respectively.
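A storage layer on MStore could encode this placement policy directly. The sketch below is hypothetical: the function names and the update-rate thresholds are ours, chosen only to illustrate the temperature-to-chip mapping, not taken from the paper:

```python
# Map a data item's 'temperature' to MStore's memory hierarchy
PLACEMENT = {"hot": "FRAM", "warm": "NOR", "cold": "NAND"}

def classify(updates_per_hour):
    """Classify data by update rate. Thresholds are illustrative only."""
    if updates_per_hour >= 60:
        return "hot"    # e.g. neighbor tables, packet queues
    if updates_per_hour >= 1:
        return "warm"   # e.g. frequently modified index pages
    return "cold"       # e.g. archived sensor readings

def place(updates_per_hour):
    """Return the chip on which to store the data item."""
    return PLACEMENT[classify(updates_per_hour)]
```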

An important observation from the measured values is the superior performance of the FRAM chip. Apart from its limited size, its read and write times and energy costs are very low. The FRAM is ideal for storing small, hot data. It could also serve as an extension of the node's main memory, offloading data from RAM when memory constraints kick in. Neighbor tables, other short-lived but live data such as packet queues, and active metadata are primary candidates for the FRAM.

Unlike the M25P64, the AT26DF161 NOR flash offers three erase granularities: 4, 32, and 64KB. While the datasheets suggest the AT chip may be a good candidate for warm data that is updated in chunks smaller than 64KB, the energy usage in Figure 6 and Table 10 shows the payoff may not be worth it: erasing a 4KB block on the AT26DF161 costs more energy than erasing a 64KB block on the M25P64. This result could be due to a bug in our driver code or a manifestation of unoptimized code.
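Normalizing the Table 10 figures per kilobyte makes this tradeoff explicit (this is our own arithmetic on the measured values):

```python
# Erase energy per KB (mJ/KB), computed from the Table 10 measurements
at26_4kb_per_kb = 9.24 / 4      # AT26DF161, 4KB block erase
at26_64kb_per_kb = 18.48 / 64   # AT26DF161, 64KB block erase
m25_sector_per_kb = 3.89 / 64   # M25P64, 64KB sector erase

# Small-block erases on the AT26DF161 are far costlier per KB than a
# full 64KB sector erase on the M25P64, despite the finer granularity.
```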

The NAND chip is ideal for collections of large data that don't change often and can be accessed and read in large chunks. While we could not measure its energy consumption values by the time of submission, based on the datasheets and our readings in Tables 7 and 8, we expect these values to be better than those of the NOR flashes.

6.1 Example Applications for MStore

Many existing applications can benefit from the storage hierarchy that MStore provides. More specifically, they can assign each of their data structures to the appropriate memory chip based on its 'temperature'. We discuss two applications that could improve their energy savings and performance using MStore.

TINX: TINX [18] is an indexing scheme for fast retrieval of archived sensor data. It maintains an index over the actual sensor data, plus an in-memory second-level index that speeds up locating the index pages. The first-level index is updated often, and the second-level index is referenced very frequently; the latter is an example of hot data. If the actual sensor data is never edited, it behaves like cold data and can be stored in the NAND memory. The first-level index, which requires byte addressability and finer-grained modifications, behaves like warm data and can be stored on either of the NOR chips. The second-level index is a perfect candidate for the FRAM. Instead of manipulating these index structures at page granularity, which makes TINX waste energy on minor index updates, moving them to the NOR and FRAM would absorb a large part of that wasted energy.

Capsule: Capsule [19] exposes storage abstractions for general use in sensor network applications. Its stack and index objects could be moved to the NOR flashes, while the stream object could be stored on the NAND chip. Stack compaction could utilize the FRAM to store pointers. Capsule also allows applications to tolerate software faults and device failures through checkpointing and rollback of storage objects. Whenever a new checkpoint is created, a new entry pointing to it is made in the root directory, so that entry is overwritten on every checkpoint. This root directory could be maintained in the FRAM. Finally, Capsule implements a memory reclamation scheme via a cleaner task that runs periodically. We believe this cleaner task could be offloaded to the CPLD on the MStore.

7. CONCLUSION

We have presented the design and implementation of MStore, an extension storage board for the mica and telos sensor nodes. MStore includes four non volatile memory chips with varying characteristics, laid out as a hierarchy that will enable storage-centric research in wireless sensor networks. The capabilities and availability of newer non volatile chips complement present thinking and designs in distributed sensor networks. We have presented the latency and energy values measured for various operations on these chips and compared them against the manufacturers' datasheet values. With these measurements, the TEP 103 compliant TinyOS drivers, and the flexibility of the design, researchers can develop new designs, inform their design choices, and tune existing systems for sensor network research.

Acknowledgements

We would like to thank Kevin Klues for providing the code that helped us calibrate the number of clock cycles required to execute each split-phase operation. We would also like to thank Prabal Dutta, Gaurav Mathur, and Deepak Ganesan for discussions, feedback, and general advice.

8. REFERENCES

[1] Atmel Corporation, atmelchips.com.

[2] Private communication, Joe Polastre, moteiv.com.

[3] Ramtron International Corporation, http://www.ramtron.com.

[4] Samsung Corporation, http://www.samsung.com/.

[5] ST Microelectronics, st.com.

[6] TinyOS Extension Proposal 103, http://www.tinyos.net/tinyos-2.x/doc/html/tep103.html.

[7] Velleman Inc., http://www.vellemanusa.com/us/enu/product/view/?id=522377.

[8] Xilinx Inc., http://direct.xilinx.com/bvdocs/publications/ds310.pdf.

[9] H. Dai, M. Neufeld, and R. Han. ELF: An efficient log-structured flash file system for micro sensor nodes. In SenSys '04: Proceedings of the 2nd International Conference on Embedded Networked Sensor Systems, pages 176–187, New York, NY, USA, 2004. ACM Press.

[10] D. Gay, P. Levis, R. von Behren, M. Welsh, E. Brewer, and D. Culler. The nesC language: A holistic approach to networked embedded systems. In SIGPLAN Conference on Programming Language Design and Implementation (PLDI '03), June 2003.

[11] V. Handziski, J. Polastre, J.-H. Hauer, C. Sharp, A. Wolisz, and D. Culler. Flexible hardware abstraction for wireless sensor networks. In Proceedings of the Second European Workshop on Wireless Sensor Networks (EWSN), Feb 2005.

[12] J. Hellerstein, W. Hong, S. Madden, and K. Stanek. Beyond average: Towards sophisticated sensing with queries, 2003.

[13] J. Hill and D. E. Culler. Mica: A wireless platform for deeply embedded networks. IEEE Micro, 22(6):12–24, Nov/Dec 2002.

[14] J. Hill, R. Szewczyk, A. Woo, P. Levis, K. Whitehouse, J. Polastre, D. Gay, S. Madden, M. Welsh, D. Culler, and E. Brewer. TinyOS: An operating system for sensor networks, 2003. Submitted for publication.

[15] J. W. Hui and D. Culler. The dynamic behavior of a data dissemination protocol for network programming at scale. In SenSys '04: Proceedings of the 2nd International Conference on Embedded Networked Sensor Systems, pages 81–94, New York, NY, USA, 2004. ACM Press.

[16] M. Li, D. Ganesan, and P. Shenoy. PRESTO: Feedback-driven data management in sensor networks. In Third USENIX/ACM Symposium on Networked Systems Design and Implementation (NSDI), May 2006.

[17] S. Madden, M. J. Franklin, J. M. Hellerstein, and W. Hong. TinyDB: An acquisitional query processing system for sensor networks. Transactions on Database Systems (TODS), 2005.

[18] A. Mani, M. B. Rajashekhar, and P. Levis. TINX: A tiny index design for flash memory on wireless sensor devices. In Proceedings of the Fourth ACM Conference on Embedded Networked Sensor Systems (SenSys), 2006.

[19] G. Mathur, P. Desnoyers, D. Ganesan, and P. Shenoy. Capsule: An energy-optimized object storage system for memory-constrained sensor devices. In Proceedings of the Fourth ACM Conference on Embedded Networked Sensor Systems (SenSys), November 2006.

[20] G. Mathur, P. Desnoyers, D. Ganesan, and P. Shenoy. Ultra-low power data storage for sensor networks. In IPSN '06: Proceedings of the Fifth International Conference on Information Processing in Sensor Networks, pages 374–381, New York, NY, USA, 2006. ACM Press.

[21] S. Ratnasamy, B. Karp, L. Yin, F. Yu, D. Estrin, R. Govindan, and S. Shenker. GHT: A geographic hash table for data-centric storage. In Proceedings of the First ACM International Workshop on Wireless Sensor Networks and Applications, pages 78–87. ACM Press, 2002.

[22] S. Shenker, S. Ratnasamy, B. Karp, R. Govindan, and D. Estrin. Data-centric storage in sensornets. SIGCOMM Comput. Commun. Rev., 33(1):137–142, 2003.

[23] D. Zeinalipour-Yazti, S. Lin, V. Kalogeraki, D. Gunopulos, and W. A. Najjar. MicroHash: An efficient index structure for flash-based sensor devices. In 4th USENIX Conference on File and Storage Technologies (FAST '05), pages 31–44, December 2005.