Peer-to-Peer Supported Cache System for File Transfer

2003.8.28, Joonbok Lee, KAIST, [email protected]


Page 1: Peer-to-Peer Supported Cache System for File Transfer

Peer-to-Peer Supported Cache System for File Transfer

2003.8.28

Joonbok Lee

KAIST

[email protected]@cosmos.kaist.ac.kr

Page 2: Peer-to-Peer Supported Cache System for File Transfer

Contents

1. Motivation
2. Problem Statement
3. Related Work
4. Approach
5. Simulation
6. Conclusion
7. Reference

Page 3: Peer-to-Peer Supported Cache System for File Transfer

1. Motivation

► KAIST Netflow Measurement (2002.10.4)

Analyze the flow data of the KAIST border router.

Fig 1. The byte ratio in terms of protocols: http 17%, ftp-data 17%, nntp 6%, telnet 0.4%, Microsoft-ds 0.5%, NETBIOS-ss 0.4%, unknown 60%.

Fig 2. Cumulative Distribution Function of the files transferred by FTP and HTTP. (x-axis: File Size (Byte), 10^2 to 10^8, with 10MB marked; y-axis: Ratio, 0 to 1; curves: HTTP, FTP.)

Some findings:
1) The bandwidth consumed by FTP is similar to that consumed by HTTP.
2) 78% of the FTP traffic is due to large files, that is, files larger than 10MB.

Page 4: Peer-to-Peer Supported Cache System for File Transfer

2. Problem Statement

► Non-negligible access to large multimedia data. [Jung00]

► FTP traffic:
17% of total traffic.
78% of it comes from files larger than 10MB.
11% of the transfers failed before completing.

► The large files transferred by FTP generate much traffic, and many of them take a long time to transfer.

► To solve this problem, we propose an HTTP/FTP proxy cache that is scalable in terms of bandwidth and storage.

Page 5: Peer-to-Peer Supported Cache System for File Transfer

3. Related Work

► Research on the transfer of large files.
RepliCache: A New Approach to Scalable Networking Storage System for Large Objects [Jung97]
Proactive Web caching with cumulative prefetching [Jung00]

► Research on scalable architectures.
Squirrel: A decentralized peer-to-peer web cache [Iyer02]
Peer-to-Peer Caching Schemes to Address Flash Crowds [Stading02]

Page 6: Peer-to-Peer Supported Cache System for File Transfer

4. Approach

4.1 Motivation

4.2 Cache with Peer-to-Peer Storage

4.3 Model

4.4 Detail Design

Page 7: Peer-to-Peer Supported Cache System for File Transfer

4.1 Motivation

► Peer-to-Peer Architecture as a Cache
Scalability (bandwidth, computing power and storage)
Cost
Overhead (to find objects and to maintain the system)

► The Latency
One of the important metrics of cache performance.
Latency = the lookup time + the delivery time
Delivery time depends on the file size.
Small files: the lookup time dominates.
Large files: the delivery time dominates.
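The latency decomposition on this slide can be illustrated with a few lines of arithmetic. This is only a sketch: the 0.05 s lookup time is an assumed value chosen for illustration, the 100Mbps link matches the local-network assumption used later in the simulation, and the function names are hypothetical.

```python
def latency_seconds(size_bytes, lookup_s=0.05, bandwidth_bps=100e6):
    """Latency = lookup time + delivery time, with delivery = size / bandwidth."""
    delivery_s = size_bytes * 8 / bandwidth_bps
    return lookup_s + delivery_s


def lookup_share(size_bytes, lookup_s=0.05, bandwidth_bps=100e6):
    """Fraction of the total latency that is spent on lookup."""
    return lookup_s / latency_seconds(size_bytes, lookup_s, bandwidth_bps)


if __name__ == "__main__":
    # lookup_s = 0.05 is an assumption, not a measured value from the slides.
    for label, size in [("100KB", 100_000), ("100MB", 100_000_000)]:
        print(label, latency_seconds(size), lookup_share(size))
```

Under these assumed numbers, lookup accounts for most of the latency of a 100KB file and for less than 1% of the latency of a 100MB file, which is the small-file and large-file contrast the slide describes.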

Page 8: Peer-to-Peer Supported Cache System for File Transfer

4.2 Cache with Peer-to-Peer Storage

► Hybrid Approach
Scalability: peer-to-peer storage.
Lookup and control: central cache.

► Peer-to-Peer two-layer storage
The storage in the central cache
► Expected to be always available, with low latency.
► Stores small files.
The second-tier storage
► Can be unavailable.
► Stores large files.
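A minimal sketch of the two-layer placement rule follows. The slides do not state the size cut-off between small and large files, so the 10MB threshold here is an assumption borrowed from the measurement slides, and `choose_tier` and `can_serve` are hypothetical names.

```python
# Assumed cut-off between "small" and "large"; the slides do not give one.
LARGE_FILE_BYTES = 10 * 1024 * 1024


def choose_tier(size_bytes, threshold=LARGE_FILE_BYTES):
    """Small files go to the central cache, large files to peer storage."""
    return "peer" if size_bytes >= threshold else "central"


def can_serve(tier, peer_online):
    """The central tier is expected to be always available;
    the second tier can only serve while the holding peer is online."""
    return tier == "central" or peer_online


if __name__ == "__main__":
    print(choose_tier(100 * 1024))          # a small file
    print(choose_tier(100 * 1024 * 1024))   # a large file
    print(can_serve("peer", peer_online=False))
```

Keeping small files in the central tier means the always-available storage handles the requests where lookup time dominates, while the bulk of the bytes sits on peers.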

Page 9: Peer-to-Peer Supported Cache System for File Transfer

4.3 Model

Fig 3. Cache with two-layer storage. (Diagram labels: HTTP/FTP Server A, HTTP/FTP Server B, Connectivity Cloud, Local Area Network, Web Proxy Cache with FTP supporting module, Peer-to-Peer Storage with Peer 1, Peer 2, ..., Peer n. OS1, OS2: small objects; OL1, OL2: large objects.)

Page 10: Peer-to-Peer Supported Cache System for File Transfer

4.4 Detail Design

► 2 new components to support FTP and large files.
Preserve transparency of file location.

► FTP Cache Daemon
Stores the state of the FTP connection.
Makes the URL of files transferred by FTP.
Checks consistency.

► P2P Storage Manager
Controls its own storage.
Managed by the object table in the central cache.

Fig 4. Control and Data connection between components. (Diagram components: HTTP Cache Daemon, FTP Cache Daemon, Object Table, Storage Manager, FTP/HTTP Server, and two FTP/HTTP Clients, each with a P2P Storage Manager; connections are numbered 1 to 4; legend: Control, Data.)
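One way the FTP Cache Daemon could "make the URL" of a transferred file is to track the working directory from control-channel commands and join it with the file name on retrieval. The sketch below is an assumption about how that might look, not the design from the slides; `FtpSessionState` and `on_command` are hypothetical names, and only the RFC 959 commands CWD, CDUP and RETR are handled.

```python
import posixpath
from dataclasses import dataclass
from urllib.parse import quote


@dataclass
class FtpSessionState:
    """Per-connection state kept by a (hypothetical) FTP cache daemon."""
    host: str
    cwd: str = "/"

    def on_command(self, line):
        """Update state from one control-channel command.

        Returns an ftp:// URL when the command retrieves a file, else None.
        """
        verb, _, arg = line.strip().partition(" ")
        verb = verb.upper()
        if verb == "CWD":
            # A relative argument is resolved against the current directory.
            self.cwd = posixpath.normpath(posixpath.join(self.cwd, arg))
        elif verb == "CDUP":
            self.cwd = posixpath.dirname(self.cwd) or "/"
        elif verb == "RETR":
            path = posixpath.normpath(posixpath.join(self.cwd, arg))
            return f"ftp://{self.host}{quote(path)}"
        return None


if __name__ == "__main__":
    session = FtpSessionState("ftp.example.org")  # placeholder host
    for cmd in ["CWD pub", "CWD linux", "RETR kernel.tar.gz"]:
        url = session.on_command(cmd)
    print(url)
```

The resulting URL can then serve as the key under which the object table records where the file is cached.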

Page 11: Peer-to-Peer Supported Cache System for File Transfer

5. Simulation

5.1 Simulation Environment

5.2 Simulation Result

Page 12: Peer-to-Peer Supported Cache System for File Transfer

5.1 Simulation Environment

► Trace
Requested FTP file list.
Gathered the FTP control (port 21) packets and produced the trace.
► 2002.10.23 ~ 2002.11.5 (two weeks)
76,880 file requests (783GB).
417 clients.

► Assumption
Local network: 100Mbps

► Simulated Caches
Cache A: 100GB storage, 100Mbps
Cache B: infinite storage, 100Mbps
Cache C: infinite storage, infinite bandwidth
Cache D: cache with peer-to-peer storage
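A trace-driven simulation of this kind can be sketched as a loop that replays the request list against a cache model and counts hits. The slides do not name the replacement policy, so LRU is assumed here purely for illustration; `simulate` is a hypothetical name and the trace in the usage example is made-up toy data, not the KAIST trace.

```python
from collections import OrderedDict


def simulate(trace, capacity_bytes=None):
    """Replay (name, size) requests and return (count hit ratio, byte hit ratio).

    capacity_bytes=None models infinite storage (as in Cache B and Cache C).
    Eviction is LRU, which is an assumption; the slides do not state a policy.
    """
    if not trace:
        return 0.0, 0.0
    cache = OrderedDict()  # name -> size, least recently used first
    used = hits = hit_bytes = total_bytes = 0
    for name, size in trace:
        total_bytes += size
        if name in cache:
            hits += 1
            hit_bytes += size
            cache.move_to_end(name)  # mark as most recently used
            continue
        if capacity_bytes is not None and size > capacity_bytes:
            continue  # the object can never fit, so it is not cached
        cache[name] = size
        used += size
        while capacity_bytes is not None and used > capacity_bytes:
            _, evicted_size = cache.popitem(last=False)
            used -= evicted_size
    return hits / len(trace), hit_bytes / total_bytes


if __name__ == "__main__":
    toy_trace = [("a", 100), ("b", 50), ("a", 100), ("c", 100), ("b", 50)]
    print(simulate(toy_trace, capacity_bytes=200))  # finite storage
    print(simulate(toy_trace))                      # infinite storage
```

The count hit ratio weights every request equally, while the byte hit ratio weights requests by file size, so the two can diverge when a few large files carry most of the bytes.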

Page 13: Peer-to-Peer Supported Cache System for File Transfer

5.2 Simulation Result: Hit Ratio

Fig 5. Cache Hit Ratio. (Bar chart: Count Hit Ratio and Byte Hit Ratio, axis Hit Ratio (%) from 0% to 60%, for Cache A, Cache B, Cache C and Cache D.)

Fig 6. Outbound traffic. (Bar chart: axis Traffic (GB) from 0 to 900, for No Cache, Cache A, Cache B, Cache C and Cache D.)

No strict storage control

• Some peers may have the same files in their storage.

• Even though some peers have available storage, other peers can remove a file from their cache as a victim.

• This degrades the performance of the storage, but not much.

Page 14: Peer-to-Peer Supported Cache System for File Transfer

5.2 Simulation Result: Latency

Fig 7. Average latency of 95~105MB files. (Bar chart: axis Time (Second) from 0 to 1400, for No Cache, Cache A, Cache B, Cache C and Cache D.)

Fig 8. Average latency of 95~105KB files. (Bar chart: axis Time (Second) from 0 to 2, for No Cache, Cache A, Cache B, Cache C and Cache D.)

We can reduce the latency of large files without increasing the latency of small files.

Page 15: Peer-to-Peer Supported Cache System for File Transfer

5.2 Simulation Result: Cache Hit Ratio Degradation by Peer Failure

Fig 9. Cache hit ratio degradation by the peer failure. (Line chart: axis Hit Ratio (%) from 0% to 60% against Peer Failure Ratio (%) from 0% to 100%; curves: Byte Hit Ratio, Count Hit Ratio; one point on the chart is annotated 30%.)

Page 16: Peer-to-Peer Supported Cache System for File Transfer

6. Conclusion

1) Our measurement shows that a large amount of traffic is produced by FTP; 78% of it is caused by files larger than 10MB.

2) We propose a cache system with two-layer storage built on a peer-to-peer architecture. It is transparent to the location of files.

3) Trace-driven simulation shows that the two-layer storage performs well for large files as well as small files.

4) Caching with our system can reduce outbound traffic and latency.

► Other issues
Collaboration between proposed systems.
Load balancing between peers.
Security problems.

Page 17: Peer-to-Peer Supported Cache System for File Transfer

7. Reference

► Jaeyeon Jung, "RepliCache: Enhancing Web Caching Architecture with Replication of Large Objects"

► Jaeyeon Jung, Dongman Lee and Kilnam Chon, "Proactive Web Caching with Cumulative Prefetching for Large Multimedia Data", Computer Networks 33 (2000) pp. 645-655

► Sitaram Iyer, Ant Rowstron and Peter Druschel, "Squirrel: A decentralized peer-to-peer web cache", In Proceedings of the PODC '02, Monterey, CA

► Tyron Stading, Petros Maniatis, Mary Baker, "Peer-to-Peer Caching Schemes to Address Flash Crowds", In Proceedings of the IPTPS '02, MA, USA

► Hyun-chul Kim, Joonbock Lee, Jungwon Suh, and Kilnam Chon, "Measurements of File-Systems Deployed on High-Performance Research and Education Networks", Technical Report

► I. Stoica, R. Morris, D. Karger, F. Kaashoek, and H. Balakrishnan, "Chord: A scalable peer-to-peer lookup service for Internet applications", In Proceedings of the ACM SIGCOMM 2001 Technical Conference, San Diego, CA, USA, August 2001

► S. Ratnasamy, P. Francis, M. Handley, R. Karp, and S. Shenker, "A scalable content-addressable network", In Proceedings of the ACM SIGCOMM 2001 Technical Conference, San Diego, CA, USA, August 2001

Page 18: Peer-to-Peer Supported Cache System for File Transfer

7. Reference

► A. Rowstron and P. Druschel, "Pastry: Scalable, distributed object location and routing for large-scale peer-to-peer systems", IFIP/ACM International Conference on Distributed Systems Platforms (Middleware), Heidelberg, Germany, pages 329-350, November 2001

► Ian Clarke, Theodore W. Hong, Scott G. Miller, Oskar Sandberg, and Brandon Wiley, "Protecting Free Expression Online with Freenet", IEEE Internet Computing 6(1), January/February 2002

► William J. Bolosky, John R. Douceur, David Ely, and Marvin Theimer, "Feasibility of a Serverless Distributed File System Deployed on an Existing Set of Desktop PCs", In Proceedings of SIGMETRICS 2000

► Internet RFC 959, File Transfer Protocol

Page 19: Peer-to-Peer Supported Cache System for File Transfer

Appendix A

Flowchart of request handling. Decision steps and their branches:

• Request File, then Check Protocol (branches: HTTP, FTP).
• HTTP: handle the request like a web proxy cache.
• FTP: Lookup Object Table (branches: not cached, cached).
• Check Consistency (branches: inconsistent, consistent).
• Check Cached Location (branches: peer, central server).
• Check File Size (branches: small, large).

Actions shown in the flowchart:

• Open FTP control connections to both the peer which has the file and the peer which requests it. Make FTP data connections between the two peers. Transfer file.
• Central cache opens data connection to client. Transfer file. Update Object Table.
• Server opens data connection to central cache. Transfer file. Update Object Table.
• Opens data connection between server and client. Transfer file. Update Object Table.
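The decision steps of the flowchart can be sketched as a single dispatch function. This is one plausible reading of the flowchart labels, not a confirmed reconstruction: the branch wiring, the 10MB size threshold, the version string used for the consistency check, and the names `Entry` and `plan_request` are all assumptions made for illustration.

```python
from dataclasses import dataclass

# Assumed cut-off between small and large files; not given in the slides.
LARGE_FILE_BYTES = 10 * 1024 * 1024


@dataclass
class Entry:
    """One object-table row (hypothetical layout)."""
    location: str  # "central" or the id of the peer holding the file
    version: str   # whatever the consistency check compares, e.g. size+mtime


def plan_request(protocol, url, size, origin_version, requester, table):
    """Return a description of the data path chosen for one request."""
    if protocol == "HTTP":
        return "handle like a web proxy cache"
    entry = table.get(url)  # Lookup Object Table
    if entry is not None and entry.version == origin_version:  # Check Consistency
        if entry.location == "central":  # Check Cached Location
            return "central cache -> client"
        return f"peer {entry.location} -> client"
    # Not cached, or cached but inconsistent: fetch from the origin server.
    if size < LARGE_FILE_BYTES:  # Check File Size
        table[url] = Entry("central", origin_version)  # Update Object Table
        return "server -> central cache -> client"
    # Assumed: the requesting peer keeps the large file in its own storage.
    table[url] = Entry(requester, origin_version)
    return "server -> client"


if __name__ == "__main__":
    table = {}
    url = "ftp://ftp.example.org/pub/big.iso"  # placeholder URL
    print(plan_request("FTP", url, 700 * 1024 * 1024, "v1", "peer-1", table))
    print(plan_request("FTP", url, 700 * 1024 * 1024, "v1", "peer-2", table))
```

In this reading the central cache only decides and records; for large files the bytes move directly between the server and a peer, or between two peers.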