Big in memory file system
1
Big Memory File System
Mahesh Gupta
DOS Lab, IIT Madras
2
Contents
Introduction
  Traditional System Architecture
Survey on Big Memory System
  Survey
  Violin memory system
Existing Systems
  Redis
  tmpfs
In Memory File System
  High level architecture
  Functional Requirements
  Various Experiments that can be done
References
3
Introduction
4
Traditional System Architecture
Processing Unit
L1 Cache, L2 Cache, L3 Cache
Main Memory
Disk Disk
5
Issues
Disk access is costly: almost 1000 times the cost of a main memory access [source: OS, Galvin].
Disk access is handled quite separately from main memory access: the process is blocked and pushed onto the device queue.
Data must be moved constantly to and from disk so that a large process can execute in a small memory.
As the number of processes increases, some processes must be swapped out to accommodate new ones.
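The cost of going to disk can be made concrete with the standard effective-access-time formula from OS textbooks. The following is a minimal sketch: the 100 ns memory latency is an assumed figure, the 1000x disk penalty is simply the ratio quoted above, and the function name is hypothetical.

```python
def effective_access_time(fault_rate, memory_ns=100.0, disk_penalty=1000.0):
    """Average access time when a fraction `fault_rate` of accesses go to disk.

    memory_ns and disk_penalty are illustrative assumptions, not measurements.
    """
    if not 0.0 <= fault_rate <= 1.0:
        raise ValueError("fault_rate must be between 0 and 1")
    disk_ns = memory_ns * disk_penalty
    # Weighted average of the fast path (memory) and the slow path (disk).
    return (1.0 - fault_rate) * memory_ns + fault_rate * disk_ns


if __name__ == "__main__":
    for rate in (0.0, 0.001, 0.01, 0.1):
        print(f"fault rate {rate:>5}: {effective_access_time(rate):10.1f} ns")
```

Under these assumptions, sending even 1% of accesses to disk makes the average access roughly 11 times slower than pure memory access.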
6
Impact of Big Memory Systems
7
Survey
According to a study by the Aberdeen Group, “In-memory Computing: Lifting the Burden of Big Data”, keeping data in flash memory while it is being processed has a huge impact.
Many businesses require real-world analytics on very large amounts of data, which is not feasible with traditional disk-based storage.
8
Statistics
Data stored in memory can be processed at a rate of 1200 TB/hour, whereas on-disk data can be processed at a rate of 3.2 TB/hour: ~375 times faster.
Average query response time was 42 seconds (in-memory data) vs 75 minutes (disk data): ~100 times faster.
Impact: Results obtained in real time. Quality of result is high. Real world value is very high.
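The two speedup factors follow directly from the quoted figures. A small sketch of the arithmetic (the variable names are illustrative):

```python
# Throughput figures quoted on the slide, in TB/hour.
in_memory_tb_per_hour = 1200.0
on_disk_tb_per_hour = 3.2

# Query response times quoted on the slide, converted to seconds.
in_memory_query_s = 42.0
on_disk_query_s = 75.0 * 60.0

throughput_speedup = in_memory_tb_per_hour / on_disk_tb_per_hour
latency_speedup = on_disk_query_s / in_memory_query_s

if __name__ == "__main__":
    print(f"throughput speedup: {throughput_speedup:.0f}x")
    print(f"latency speedup:    {latency_speedup:.0f}x")
```

The throughput ratio works out to 375 and the response-time ratio to about 107, in line with the rounded factors given on the slide.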
9
Violin Memory
Violin Memory manufactures large-scale flash memory arrays that can be added directly to the board, giving very high in-memory capacity.
With such a large memory, main memory (MM) capacity can be as good as disk capacity, so the entire data set can be brought into MM when it is first read and need not be fetched again and again.
Targeted applications: faster big data analytics; accelerating enterprise applications.
10
The trend
Desktop systems are moving from disk-based to diskless designs, in which all data is stored in the cloud and systems simply access it over the network.
With very high processing speed available, disk access becomes the only bottleneck for larger tasks.
Very large memory can help optimize system performance.
11
Existing Systems
12
REmote DIctionary Server (Redis)
In-memory key-value store with a text-based protocol.
Idea: to provide efficient storage of data in memory, with access that does not touch the disk.
Designed as a DSL for working on simple to complex data structures.
Uses virtual memory to handle data larger than main memory (based on the idea of giving similar performance with or without virtual memory).
Asynchronous in nature (memory-to-disk transfer).
Used by Stack Overflow and GitHub.
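To illustrate what "text based protocol" means in practice, here is a minimal sketch that encodes a command in the Redis serialization protocol (RESP) as described in the Redis documentation: an array header followed by length-prefixed bulk strings. The slides do not show the wire format; the function name `encode_command` is hypothetical, and no server is contacted.

```python
def encode_command(*args):
    """Encode a command as a RESP array of bulk strings.

    Each argument becomes "$<length>\r\n<bytes>\r\n", preceded by a
    "*<count>\r\n" header giving the number of arguments.
    """
    if not args:
        raise ValueError("a command needs at least one argument")
    parts = [b"*%d\r\n" % len(args)]
    for arg in args:
        data = arg if isinstance(arg, bytes) else str(arg).encode("utf-8")
        parts.append(b"$%d\r\n%s\r\n" % (len(data), data))
    return b"".join(parts)


if __name__ == "__main__":
    # Shows the raw bytes a client would send for SET key value.
    print(encode_command("SET", "key", "value"))
```

Because every field is length-prefixed and line-delimited, the protocol stays readable as text while still being cheap to parse.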
13
tmpfs: virtual memory file system
Virtual memory file system by Sun Microsystems.
Idea: to design a file system for short-lived, small files that never reside on disk, and to give access performance close to that of RAM.
Memory-based file system; it stores data in the page cache (virtual memory) rather than in a dedicated block of RAM.
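From an application's point of view, a tmpfs mount behaves like any other directory. The sketch below writes a short-lived file and reads it back. It assumes a Linux-style system where `/dev/shm` is commonly mounted as tmpfs; that path is an assumption, and the code falls back to the ordinary temporary directory (which may be on disk) when it is not available. The helper names are hypothetical.

```python
import os
import tempfile


def pick_memory_dir(candidate="/dev/shm"):
    """Return `candidate` if it is a writable directory, else the default temp dir."""
    if os.path.isdir(candidate) and os.access(candidate, os.W_OK):
        return candidate
    return tempfile.gettempdir()


def roundtrip(payload, directory=None):
    """Write `payload` to a short-lived file, read it back, and let it vanish."""
    directory = directory or pick_memory_dir()
    # The file is deleted automatically when the `with` block exits.
    with tempfile.NamedTemporaryFile(dir=directory) as handle:
        handle.write(payload)
        handle.flush()
        handle.seek(0)
        return handle.read()


if __name__ == "__main__":
    print("using directory:", pick_memory_dir())
    print(roundtrip(b"short-lived data"))
```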
14
In-Memory File System for OS
“To modify the Memory Management Unit (MMU) so that it can take advantage of big in-memory capacity and improve system performance.”
15
[Figure: Existing Architecture, Modified Architecture 1 and Modified Architecture 2. Components shown: Processing Unit, Memory Management Unit, Main Memory, Disk, Storage, Memory File System]
16
Functional Requirements
The newly designed system should not replace the existing file system.
It should be flexible enough to accommodate changes on the storage side, so that the disk can later be replaced by network storage.
It should minimize accesses to the storage unit.
Access to small, large and very large files should take roughly the same amount of time (data transfer to and from memory should be done efficiently).
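One way to picture these requirements is a memory layer that sits in front of a pluggable storage backend: reads reach storage only once, writes stay in memory until flushed, and the backend can be swapped without touching the layer. This is a hypothetical sketch, not the proposed design; all class and method names are assumptions made for illustration.

```python
class DictStorage:
    """Stand-in for the storage unit (disk today, network storage later)."""

    def __init__(self):
        self.blobs = {}
        self.reads = 0
        self.writes = 0

    def load(self, path):
        self.reads += 1
        return self.blobs[path]

    def store(self, path, data):
        self.writes += 1
        self.blobs[path] = data


class MemoryFileSystem:
    """Keeps whole files in memory in front of any object with load/store."""

    def __init__(self, storage):
        self.storage = storage
        self.files = {}
        self.dirty = set()

    def read(self, path):
        if path not in self.files:      # only the first read reaches storage
            self.files[path] = self.storage.load(path)
        return self.files[path]

    def write(self, path, data):
        self.files[path] = data         # writes stay in memory until flush
        self.dirty.add(path)

    def flush(self):
        # Push every modified file to storage in one pass.
        for path in sorted(self.dirty):
            self.storage.store(path, self.files[path])
        self.dirty.clear()
```

Replacing `DictStorage` with a class that talks to a network server would leave `MemoryFileSystem` unchanged, which is the flexibility the second requirement asks for.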
17
Experiments
To see whether the storage unit can be replaced by storage over the network, and what architectural modifications that requires (to support thin clients).
To see whether the memory unit can support database semantics for querying, or whether version control semantics can be implemented.
To see how the system can be designed to support concurrent access. A network server can act as the master while the systems sharing the resource act as slaves.
18
References
In-Memory Analytics: http://spotfire.tibco.com/~/media/content-center/articles/aberdeen-in-memory-analytics-for-big-data.pdf
Violin Memory: http://www.violin-memory.com/
Redis slides: http://nosqlberlin.de/slides/NoSQLBerlin-Redis.pdf
Redis internal VM: http://redis.io/topics/internals-vm
tmpfs virtual memory file system: http://www.cs.rit.edu/~vcss544/tmpfs.pdf