Post on 15-Aug-2015
Memory is King
• RAM throughput increasing exponentially
• Disk throughput increasing slowly
Memory-locality key to interactive response time
Memory as cache
• Improves reads
• Cannot help much with writes
  • Replication for fault tolerance: network bandwidth and latency are much worse than memory's
  • Write throughput is limited by disk I/O: at least one copy is required on disk
• Inter-job data sharing cost dominates pipeline end-to-end latency
  • 34% of jobs produce output as large as their input (Cloudera survey)
Different jobs share data
Slow writes to disk
[Diagram: two Spark tasks, each with its own in-memory block manager holding blocks 1 and 3, share data through HDFS / Amazon S3 (blocks 1–4); with the storage engine and execution engine in the same process, sharing goes through slow disk writes]
Different frameworks share data
[Diagram: a Spark task and a Hadoop MapReduce job (under YARN) share data through HDFS / Amazon S3 (blocks 1–4); again, slow disk writes between the in-process storage and execution engines]
Slow writes to disk
Tachyon: reliable data sharing at memory speed within and across frameworks/jobs
[Diagram: Tachyon sits as a layer between computation frameworks (Spark, MapReduce, Spark SQL, H2O, GraphX, Impala, …) and storage systems (HDFS, S3, GlusterFS, OrangeFS, NFS, Ceph, …)]
Target workload properties
• Immutable data
• Deterministic jobs
• Locality-based scheduling
• All data vs. working set
• Program size vs. data size
System architecture
Consists of two layers
• Lineage
  • Delivers high-throughput I/O
  • Captures the sequence of jobs/tasks that create an output
• Persistence
  • Asynchronous checkpoints
Facts
• One data copy in memory
• Recomputation for fault-tolerance
Master Node
• Similar to HDFS and GFS
  • Passive standby model
• BUT also contains a workflow manager
  • Tracks lineage information
  • Computes the checkpoint order
  • Interacts with the cluster resource manager to allocate resources for recomputations
Lineage metadata
• Binary program
• Configuration
• Input Files List
• Output Files List
• Dependency Type
• Narrow (filter, map)
• Wide (shuffle, join)
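The lineage metadata above can be sketched as a simple record. This is a minimal illustration, not Tachyon's actual API; all names are made up:

```python
from dataclasses import dataclass
from enum import Enum
from typing import Dict, List

class DependencyType(Enum):
    NARROW = "narrow"  # e.g. filter, map: each output partition depends on one input partition
    WIDE = "wide"      # e.g. shuffle, join: each output partition depends on many input partitions

@dataclass
class Lineage:
    """One lineage entry: enough information to deterministically re-run
    the job that produced its output files."""
    binary_program: str        # the program that ran the job
    configuration: Dict        # configuration needed for a deterministic re-run
    input_files: List[str]
    output_files: List[str]
    dependency: DependencyType
```

Because jobs are assumed deterministic and inputs immutable, this record alone is sufficient to regenerate the outputs on failure.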
Fault-recovery by recomputations
• Challenges
  • Bounding the recomputation cost for a long-running storage system
    • Asynchronous checkpointing
  • Allocating resources for recomputations
    • Make sure recomputation tasks get enough resources
    • Do not impact system performance (task priorities)
• Assumptions
  • Input files are immutable
  • Job executions are deterministic
• Client-side caching to mitigate read hotspots
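Recovery by recomputation amounts to replaying lineage: find the job that produced the lost file, recursively regenerate any of its inputs that are also missing (cascading recomputation), then re-run the job. A minimal sketch, assuming lineage entries are plain dicts with `inputs`/`outputs` lists (illustrative, not Tachyon's implementation):

```python
def recovery_plan(lost_file, producer, available, plan=None):
    """Compute the ordered list of jobs to re-run to regenerate `lost_file`.

    producer:  maps an output file to the lineage entry (job) that created it
    available: set of files currently present (in memory or checkpointed)
    """
    if plan is None:
        plan = []
    if lost_file in available:
        return plan                    # file still exists, nothing to re-run
    job = producer[lost_file]
    for inp in job["inputs"]:          # regenerate missing inputs first (cascading)
        recovery_plan(inp, producer, available, plan)
    if job not in plan:
        plan.append(job)               # re-run the producer once its inputs exist
    return plan
```

Determinism and immutability (the assumptions above) guarantee that replaying this plan reproduces exactly the lost data.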
Asynchronous checkpointing
• Goals
  • Bounded recomputation time
  • Checkpointing hot files
  • Avoiding checkpointing temporary files
• Edge algorithm
  • Models relationships of files with a DAG
    • Vertices are files
    • Edge from A to B if B is generated by a job that read A
Edge algorithm
• Checkpoint leaves
• Checkpoint hot files
  • Most files are accessed fewer than 3 times (Yahoo survey of big-data workloads)
  • Thus, files accessed more than twice get checkpointed
• Dealing with large datasets
  • 96% of active jobs' data fits in cluster memory
  • Synchronously write datasets above a defined threshold to disk
  • Most checkpointed files can be evicted from memory to make room
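The selection step of the Edge algorithm can be sketched in a few lines: checkpoint the leaves of the lineage DAG (files no job has read yet) plus the hot files. This is a simplified illustration of the selection rule only, not Tachyon's code; the threshold of 2 follows the access-count observation above:

```python
def checkpoint_candidates(edges, access_counts, hot_threshold=2):
    """Select files to checkpoint under the Edge algorithm.

    edges: list of (src, dst) pairs; dst was produced by a job that read src.
    access_counts: maps a file to how many times it has been read.
    """
    files = {f for edge in edges for f in edge}
    non_leaves = {src for src, _ in edges}       # files some job has already consumed
    leaves = files - non_leaves                  # current frontier of the DAG
    hot = {f for f, n in access_counts.items() if n > hot_threshold}
    return leaves | hot
```

Checkpointing the frontier bounds recomputation depth, while the hot-file rule keeps frequently re-read files on disk without wasting bandwidth on short-lived temporaries.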
Resource allocation
• Depends on the scheduling policy of the running cluster
• Requirements
  • Priority compatibility
  • Resource sharing
  • Avoiding cascading recomputation
  • Best ordering of recomputations
• Most common policies
  • Priority-based
  • Weighted fair sharing
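Under a priority-based policy, priority compatibility means a recomputation task should inherit the priority of the jobs blocked on its output, so a low-priority file recomputed for a high-priority reader does not cause priority inversion. A minimal sketch of that inheritance rule (illustrative only, not Tachyon's scheduler):

```python
def recomputation_order(requests):
    """requests: (task, priority_of_waiting_job) pairs.

    Each recomputation task inherits the highest priority among the jobs
    blocked on it; tasks then run in descending priority order."""
    inherited = {}
    for task, priority in requests:
        inherited[task] = max(priority, inherited.get(task, priority))
    return sorted(inherited, key=inherited.get, reverse=True)
```

Under weighted fair sharing, the analogous rule would charge recomputation work to the waiting job's share instead of its priority.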