Big in memory file system

Page 1: Big in memory file system

Big Memory File System

Mahesh Gupta, DOS Lab, IIT Madras

Page 2: Big in memory file system

Contents

Introduction: Traditional System Architecture

Survey on Big Memory Systems: Survey, Violin memory system

Existing Systems: Redis, tmpfs

In-Memory File System: High-level architecture, Functional requirements, Various experiments that can be done

References

Page 3: Big in memory file system

Introduction

Page 4: Big in memory file system

Traditional System Architecture

[Figure: memory hierarchy, top to bottom: Processing Unit; L1 Cache, L2 Cache, L3 Cache; Main Memory; Disks]

Page 5: Big in memory file system

Issues

Disk access is costly: roughly 1000 times the cost of a main memory access [source: OS, Galvin].

Disk access is handled quite separately from main memory access: the process is blocked and placed in a device queue.

Data must be moved constantly to and from disk so that a large process can execute in a small memory.

As the number of processes increases, some processes must be swapped out to accommodate new ones.
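The effect of the cost gap can be illustrated with the standard effective-access-time formula. The sketch below is only illustrative: it takes the 1000x ratio quoted above, while the 100 ns memory access time and the miss rates are assumed values, not measurements.

```python
def effective_access_time(mem_ns: float, disk_ratio: float, miss_rate: float) -> float:
    """Average access time when a fraction `miss_rate` of accesses goes to disk."""
    disk_ns = mem_ns * disk_ratio  # disk cost expressed relative to memory cost
    return (1 - miss_rate) * mem_ns + miss_rate * disk_ns


MEM_NS = 100.0       # assumed main memory access time (hypothetical)
DISK_RATIO = 1000.0  # ratio quoted on the slide

for miss_rate in (0.0, 0.001, 0.01, 0.1):
    eat = effective_access_time(MEM_NS, DISK_RATIO, miss_rate)
    print(f"miss rate {miss_rate:>5}: {eat:8.1f} ns ({eat / MEM_NS:.2f}x memory)")
```

Under these assumptions, sending even 1% of accesses to disk makes the average access about eleven times slower than a pure memory access.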

Page 6: Big in memory file system

Impact of Big Memory Systems

Page 7: Big in memory file system

Survey

According to a study by Aberdeen Group, “In-memory Computing: Lifting the burden of Big Data”, keeping the data in flash memory while it is being processed has a huge impact on performance.

Many businesses need real-world analytics on very large amounts of data, which is not feasible with traditional disk-based storage.

Page 8: Big in memory file system

Statistics

Data held in memory can be processed at 1200 TB/hour, whereas on-disk data can be processed at 3.2 TB/hour: roughly 375 times faster.

Average query response time was 42 seconds (in-memory data) vs 75 minutes (disk data): roughly 100 times faster.

Impact: Results obtained in real time. Quality of result is high. Real world value is very high.
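The speedups implied by the quoted figures can be checked with a few lines of arithmetic. This sketch uses only the numbers stated on the slide; the variable names are illustrative.

```python
# Throughput figures quoted on the slide (TB per hour)
in_memory_tb_per_hour = 1200.0
on_disk_tb_per_hour = 3.2
throughput_speedup = in_memory_tb_per_hour / on_disk_tb_per_hour

# Query response times quoted on the slide, converted to seconds
in_memory_response_s = 42.0
on_disk_response_s = 75 * 60.0
response_speedup = on_disk_response_s / in_memory_response_s

print(f"throughput speedup: {throughput_speedup:.0f}x")
print(f"response-time speedup: {response_speedup:.0f}x")
```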

Page 9: Big in memory file system

Violin Memory

Violin Memory manufactures large-scale flash memory arrays that can be attached directly to the board, giving a very high in-memory capacity.

With very high in-memory capacity, main memory (MM) can be as large as disk, so the entire data set can be brought into MM when it is first read and need not be fetched from disk again and again.

Targeted Applications: Faster Big Data analytics, Accelerating enterprise applications

Page 10: Big in memory file system

The trend

Desktop systems are moving from disk-based to diskless designs, where all data is stored in the cloud and systems simply access it over the network.

With very high processing speeds available, disk access becomes the only bottleneck for larger tasks.

Very large memory can help to optimize system performance.

Page 11: Big in memory file system

Existing Systems

Page 12: Big in memory file system

REmote DIctionary Server (Redis)

In-memory key-value store with a text-based protocol.

Idea: To provide efficient data storage in memory, with access that does not touch the disk.

Designed as a DSL to work on simple to complex data structures.

Uses its own virtual memory (VM) layer to handle data larger than main memory. (Based on the idea of giving similar performance with or without virtual memory.)

Asynchronous in nature (memory-to-disk transfer).

Used by Stack Overflow and GitHub.
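As a rough illustration of the text-based protocol, the sketch below encodes a command in the Redis serialization protocol (RESP) request format, an array of bulk strings. The function name `encode_command` is hypothetical, and the code only builds the bytes; it does not connect to a server.

```python
def encode_command(*args: str) -> bytes:
    """Encode a command as a RESP array of bulk strings."""
    parts = [f"*{len(args)}\r\n".encode("ascii")]  # array header: element count
    for arg in args:
        data = arg.encode("utf-8")
        # Each bulk string is "$<byte length>\r\n<bytes>\r\n"
        parts.append(b"$" + str(len(data)).encode("ascii") + b"\r\n" + data + b"\r\n")
    return b"".join(parts)


print(encode_command("SET", "key", "value"))
```

The length prefixes are byte counts, so values containing spaces or newlines need no escaping.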

Page 13: Big in memory file system

tmpfs : virtual memory file system

Virtual memory file system by Sun Microsystems.

Idea: to design a file system for short-lived, small files that do not reside on disk, and to give them RAM-level access performance.

Memory-based file system that stores file data in virtual memory pages (the page cache, backed by swap) rather than in a dedicated block of RAM.
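A short-lived file of this kind might be used as sketched below. The sketch assumes a Linux-style host where `/dev/shm` is commonly a tmpfs mount; that is an assumption about the machine, so the code falls back to the default temporary directory (which may be disk-backed) when `/dev/shm` is not available. The helper names are hypothetical.

```python
import os
import tempfile


def pick_memory_dir() -> str:
    """Prefer /dev/shm (often tmpfs on Linux); otherwise use the default temp dir."""
    shm = "/dev/shm"  # assumption: tmpfs mount point on this host
    if os.path.isdir(shm) and os.access(shm, os.W_OK):
        return shm
    return tempfile.gettempdir()


def roundtrip(payload: bytes) -> bytes:
    """Write a short-lived file, read it back, and let it vanish on close."""
    with tempfile.NamedTemporaryFile(dir=pick_memory_dir()) as f:
        f.write(payload)
        f.flush()
        f.seek(0)
        return f.read()


print(roundtrip(b"scratch data"))
```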

Page 14: Big in memory file system

In-Memory File System for OS

“To modify the Memory Management Unit (MMU) so that it can take advantage of big in-memory capacity and improve system performance.”

Page 15: Big in memory file system

[Figure: three architecture diagrams titled Existing Architecture, Modified Architecture 1, and Modified Architecture 2. Labeled components: Processing Unit, Memory Management Unit, Main Memory, Disk, Storage, Memory File System.]

Page 16: Big in memory file system

Functional Requirements

The newly designed system should not replace the existing file system.

It should be flexible enough to support changes on the storage side, so that the disk can later be replaced by network storage.

It should minimize access to the storage unit.

Access to small, large, and very large files should take roughly the same amount of time (data should be transferred to and from memory efficiently).
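Two of these requirements, a replaceable storage side and minimal storage access, can be sketched as a memory-resident layer in front of a pluggable backend. This is a user-level illustration under stated assumptions, not the proposed design: `Storage`, `DictStorage`, and `MemoryFileSystem` are hypothetical names, and the backend here is a plain dictionary standing in for a disk or network store.

```python
from typing import Dict, Protocol, Set


class Storage(Protocol):
    """Anything that can load and store whole files (disk, network, ...)."""
    def load(self, path: str) -> bytes: ...
    def store(self, path: str, data: bytes) -> None: ...


class DictStorage:
    """Stand-in backend that counts how often it is accessed."""
    def __init__(self) -> None:
        self.blobs: Dict[str, bytes] = {}
        self.loads = 0
        self.stores = 0

    def load(self, path: str) -> bytes:
        self.loads += 1
        return self.blobs[path]

    def store(self, path: str, data: bytes) -> None:
        self.stores += 1
        self.blobs[path] = data


class MemoryFileSystem:
    """Keeps files in memory; touches the backend only on first read and on flush."""
    def __init__(self, backend: Storage) -> None:
        self._backend = backend
        self._files: Dict[str, bytes] = {}
        self._dirty: Set[str] = set()

    def read(self, path: str) -> bytes:
        if path not in self._files:  # first access: fetch once from storage
            self._files[path] = self._backend.load(path)
        return self._files[path]

    def write(self, path: str, data: bytes) -> None:
        self._files[path] = data  # update memory only
        self._dirty.add(path)

    def flush(self) -> None:
        for path in sorted(self._dirty):  # write back modified files
            self._backend.store(path, self._files[path])
        self._dirty.clear()


backend = DictStorage()
backend.blobs["/a"] = b"x"
fs = MemoryFileSystem(backend)
fs.read("/a")
fs.read("/a")
print(backend.loads)
```

Because `MemoryFileSystem` depends only on the `load` and `store` methods, a network-backed class with the same two methods could replace `DictStorage` without changes to the in-memory layer.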

Page 17: Big in memory file system

Experiments

To see whether the storage unit can be replaced by storage over the network, and what architectural modifications that requires (to support thin clients).

To see whether the memory unit can support database semantics for querying, or whether version control semantics can be implemented.

To see how the system can be designed to support concurrent access. A network server can act as the master, while the systems sharing the resource act as slaves.

Page 18: Big in memory file system

References

In-Memory Analytics: http://spotfire.tibco.com/~/media/content-center/articles/aberdeen-in-memory-analytics-for-big-data.pdf

Violin Memory: http://www.violin-memory.com/

Redis slides: http://nosqlberlin.de/slides/NoSQLBerlin-Redis.pdf

Redis internal VM: http://redis.io/topics/internals-vm

tmpfs virtual memory file system: http://www.cs.rit.edu/~vcss544/tmpfs.pdf