AMIR RACHUM, CHAI RONEN
FINAL PRESENTATION
INDUSTRIAL SUPERVISOR: DR. ROEE ENGELBERG, LSI
Optimized Caching Policies for Storage Systems
Introduction – Storage Tiering
System data is stored over different types of storage devices.
Generally speaking, in data storage, for a given price, the higher the speed, the lower the volume.
The idea is to enable the use of larger, low-cost disk space while retaining the benefits of high-speed hardware, optimizing data storage for the fastest overall disk access.
This requires a dynamic algorithm for managing (migrating) the data across the tiers.
SSD: High Cost, High Performance, Low Volume
SATA Drive: Low Cost, Low Performance, High Volume
Goals
Creating a platform that allows us to test different algorithms in system-specific scenarios.
Testing several algorithms and finding the optimal one among them for storage tiering in different scenarios.
Methodology
We coded a simulator that represents the platform running the tiered storage system.
We created several data structures that represent the data on the system, track its location at all times, record read/write operations, and capture several other unique features.
We used a recording of real I/O calls from such a system to simulate an actual scenario.
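The replay idea described above can be sketched as follows. This is a minimal, hypothetical illustration, not the project's actual code: the names (`Simulator`, `IoCall`, `tierOf`) are assumptions. Each recorded call is mapped to a chunk by its offset, the chunk's current tier is looked up (chunks never migrated live on a default tier, here SATA), and per-tier counters are updated.

```cpp
#include <cstdint>
#include <unordered_map>
#include <vector>

enum class Op { Read, Write };
enum class Tier { SSD, SATA };

struct IoCall {
    Op op;
    std::uint64_t offset;  // byte offset of the recorded I/O call
};

struct TierStats {
    std::uint64_t reads = 0;
    std::uint64_t writes = 0;
};

class Simulator {
public:
    explicit Simulator(std::uint64_t chunk_size) : chunk_size_(chunk_size) {}

    // Replay a recorded trace, counting reads/writes against the tier
    // each chunk currently resides on.
    void replay(const std::vector<IoCall>& trace) {
        for (const IoCall& call : trace) {
            std::uint64_t chunk = call.offset / chunk_size_;
            TierStats& s = (tierOf(chunk) == Tier::SSD) ? ssd_ : sata_;
            if (call.op == Op::Read) ++s.reads; else ++s.writes;
        }
    }

    // Only non-default locations are stored; anything absent from the map
    // is assumed to be on the default (SATA) tier, keeping the map small.
    Tier tierOf(std::uint64_t chunk) const {
        auto it = location_.find(chunk);
        return it == location_.end() ? Tier::SATA : it->second;
    }

    const TierStats& stats(Tier t) const {
        return t == Tier::SSD ? ssd_ : sata_;
    }

private:
    std::uint64_t chunk_size_;
    std::unordered_map<std::uint64_t, Tier> location_;
    TierStats ssd_, sata_;
};
```

Storing only non-default locations is one way to realize the low disk-space usage mentioned later under Accomplishments.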
Accomplishments
Created an Algorithm interface that supports any algorithm, multiple tiers and multiple platform data structures.
Our design is generic enough to enable very easy addition of usage statistics and platform data.
The CLI enabled quick specification of the input file, chunk size, and tier information.
Varying chunk size let us research the effect of the size on run time and algorithm effectiveness.
We implemented 2 caching algorithms: a "naïve" algorithm that transfers every chunk to the top tier upon I/O, and a more efficient algorithm that minimizes migrations.
Smart implementation resulted in low disk space usage for the various data structures (we used a default tier).
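The Algorithm interface and the naïve policy above can be sketched like this. The names (`Algorithm`, `onAccess`, `CacheLruNaive`) are illustrative assumptions, not the project's actual API: a policy sees every I/O on a chunk and returns the tier the chunk should reside on, and the naïve policy always promotes to the top tier.

```cpp
#include <cstdint>

enum class Tier { SSD, SATA };

// Generic policy interface: any algorithm, any decision logic.
class Algorithm {
public:
    virtual ~Algorithm() = default;
    // Called for every I/O operation on a chunk; returns the tier the
    // chunk should reside on after this access.
    virtual Tier onAccess(std::uint64_t chunk, Tier current) = 0;
};

// Naive policy: promote every accessed chunk to the top tier,
// so every access to a chunk not already on SSD costs a migration.
class CacheLruNaive : public Algorithm {
public:
    Tier onAccess(std::uint64_t /*chunk*/, Tier current) override {
        if (current != Tier::SSD) ++migrations_;
        return Tier::SSD;
    }
    std::uint64_t migrations() const { return migrations_; }

private:
    std::uint64_t migrations_ = 0;
};
```

A smarter policy would return `Tier::SSD` only when a chunk's access history justifies the migration cost, which is how it can cut the migration count while serving a similar number of reads from the top tier.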
Algorithm conclusions
We ran 3 different scenarios:
Small chunk size (16B), small SSD size (64B, 4× chunk size)
Large chunk size (2048B), (relatively) small SSD size (8192B, 4× chunk size)
Small chunk size (16B), relatively large SSD size (8192B, 512× chunk size)
Algorithm conclusions
When using an extremely small SSD (4× chunk size), both caching algorithms are ineffective: the naïve one showed a high number of reads from the higher tier, yet had twice as many migrations between tiers.
The smart algorithm, despite having half the migrations of the naïve algorithm, showed very little reading from the higher tier.
In this case, the dummy algorithm proved very efficient, as it saved all the time needed for relatively useless migrations.
Algorithm Conclusions (16/64)
[Bar chart: counts (0–160,000) of SATA/R, SATA/W, SSD/R, SSD/W, SATA → SSD, and SSD → SATA operations for the Dummy, CacheLruNaive, and CacheLruSmart algorithms]
Algorithm conclusions
When running with a large chunk size and a 4× SSD size, the caching algorithms achieved much better results than the dummy algorithm. However, the 2 caching algorithms did not differ between themselves.
Algorithm Conclusions (2048/8192)
[Bar chart: counts (0–12,000) of SATA/R, SATA/W, SSD/R, SSD/W, SATA → SSD, and SSD → SATA operations for the Dummy, CacheLruNaive, and CacheLruSmart algorithms]
Algorithm conclusions
Running with a small chunk size and a large SSD size, the 2 caching algorithms also gave similar results. However, they were far inferior to the results of the previous run.
Algorithm Conclusions (16/8192)
[Bar chart: counts (0–160,000) of SATA/R, SATA/W, SSD/R, SSD/W, SATA → SSD, and SSD → SATA operations for the Dummy, CacheLruNaive, and CacheLruSmart algorithms]
General Conclusions
Chunk size greatly affects the runtime of the platform, but a "standard" size does not take long to run.
Smart usage of Boost greatly decreased the work required and was very effective.
Good implementation can result in huge disk space saving.
Despite the data structures already provided by the platform, most non-naïve algorithms also need their own data structure of some sort.
Working with Git source control proved to be very helpful:
Retrieving old code that was once thought to be obsolete.
Collaboration.
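The point above about non-naïve algorithms needing their own bookkeeping can be illustrated with the kind of recency structure an LRU policy keeps on top of the platform's data: a list ordered by recency plus an index into it, so both touching a chunk and evicting the least-recently-used one are O(1). This is a generic LRU sketch, not the project's implementation; all names are illustrative.

```cpp
#include <cstdint>
#include <list>
#include <unordered_map>

class LruTracker {
public:
    // Mark a chunk as most recently used, inserting it if new.
    void touch(std::uint64_t chunk) {
        auto it = index_.find(chunk);
        if (it != index_.end()) order_.erase(it->second);
        order_.push_front(chunk);
        index_[chunk] = order_.begin();
    }

    // Return and remove the least-recently-used chunk (back of the list).
    std::uint64_t evict() {
        std::uint64_t victim = order_.back();
        order_.pop_back();
        index_.erase(victim);
        return victim;
    }

    std::size_t size() const { return order_.size(); }

private:
    std::list<std::uint64_t> order_;  // front = most recently used
    std::unordered_map<std::uint64_t,
                       std::list<std::uint64_t>::iterator> index_;
};
```

`std::list` is used because erasing an element via a stored iterator does not invalidate the other iterators held in the index map.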