Design Issues of Prefetching Strategies for Heterogeneous Software DSM
Author : Ssu-Hsuan Lu, Chien-Lung Chou, Kuang-Jui Wang, Hsiao-Hsi Wang, and Kuan-Ching Li
Speaker : Chien-Lung Chou
Date : 2006/05/18
2/26
Outline
• Introduction
• Motivation
• Related Work
• Proposed Method
• Performance Evaluation
• Conclusions and Future Work
3/26
Introduction
Distributed Shared Memory (DSM) systems incur:
• a large number of page faults.
• a large amount of communication time.
4/26
Introduction (cont.)
Page faults and communication time are major overheads in DSM systems.
Additional strategies are needed to reduce page faults and communication time:
• Home migration
• Write vector
• Prefetching
5/26
Introduction (cont.)
Most traditional prefetching strategies can provide good performance in homogeneous cluster platforms.
However, such strategies may perform worse in heterogeneous environments.
6/26
Motivation
[Figure: Homogeneous cluster platform. Label: Program finish.]
7/26
Motivation (cont.)
[Figure: Heterogeneous cluster platform. Labels: Program finish, Waiting Time, Better Resources, Worse Resources.]
8/26
Motivation (cont.)
We need to consider heterogeneous environments.
• More and more personal computers will perform computations collectively.
• A large number of advanced techniques will be developed for this environment.
• We will commonly encounter this environment in the future.
9/26
Related Work
History Prefetching Strategy
• It permits home nodes to send data to remote nodes in advance.
It has some disadvantages:
• Accumulated Waiting Phenomenon.
• Waiting Synchronization Phenomenon.
• Misprefetch.
• Home nodes have too much work.
10/26
Related Work (cont.)
Effective Prefetch Strategy
• Filtering Unnecessary Prefetches.
• Distributing Prefetch Overhead.
• Load Balancing with Barrier Synchronization.
Agent Home Prefetching Strategy
• It finds a node that helps home nodes transfer prefetching data.
• Thus, it reduces the overhead of home nodes.
11/26
Proposed Method
[Figure: Prefetching Strategy in Heterogeneous Environments. Timeline of Host 1 to Host 4, each marked as a home node. Labels: Enter Barrier, Prefetching, Idle Time, Leave Barrier, Program Finish.]
12/26
Proposed Method (cont.)
Given the above disadvantages, we propose a method that allows
• home nodes to be moved to suitable hosts.
• high-speed processors to execute prefetching in advance.
• low-speed processors to leave the barrier early.
13/26
Proposed Method (cont.)
First, we distribute home pages to nodes that have better resources.
These nodes are suitable to be home nodes because they have better performance.
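As a rough illustration of this first step, the sketch below deals pages round-robin over the fastest hosts. The function name `assign_home_pages`, the use of CPU clock speed as the resource metric, and the choice of two home nodes are all assumptions made for the example; the slides do not say how "better resources" is measured or how pages are divided.

```python
def assign_home_pages(num_pages, node_speeds, num_home_nodes):
    """Map each shared page to a home node chosen among the fastest nodes.

    node_speeds: dict of node id -> relative speed (assumed metric: CPU MHz).
    Pages are dealt round-robin over the `num_home_nodes` fastest nodes.
    """
    if not 1 <= num_home_nodes <= len(node_speeds):
        raise ValueError("num_home_nodes must be between 1 and the node count")
    # Rank nodes from fastest to slowest and keep the top ones as home nodes.
    fastest = sorted(node_speeds, key=node_speeds.get, reverse=True)[:num_home_nodes]
    return {page: fastest[page % num_home_nodes] for page in range(num_pages)}


# Clock speeds taken from the experimental platform (in MHz).
speeds = {1: 2400, 2: 1700, 3: 500, 4: 350}
homes = assign_home_pages(8, speeds, num_home_nodes=2)
print(homes)
```

With these inputs only nodes 1 and 2 receive home pages, so the slower nodes 3 and 4 never serve page requests.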
14/26
Proposed Method (cont.)
Second, we observe that hosts with worse resources finish their work later, so we adjust the policy of the prefetching strategy.
Originally, all hosts leave the barrier at the same time.
15/26
Proposed Method (cont.)
In our method, hosts with worse resources leave the barrier after requesting prefetching data.
Hosts with better resources leave the barrier after sending prefetching pages to the hosts with worse resources.
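The exit policy can be pictured with a toy timing model. This is only a sketch: the function `leave_times`, the fixed `request_cost` and `send_cost` values, and the assumption that the barrier completes when the last host arrives are illustrative and not taken from the slides.

```python
def leave_times(arrivals, fast_hosts, request_cost, send_cost):
    """Toy model of when each host leaves the barrier.

    arrivals: dict of host -> time at which the host enters the barrier.
    fast_hosts: set of hosts assumed to have better resources.
    """
    # Assumption: the barrier completes when the slowest host arrives.
    barrier_done = max(arrivals.values())
    times = {}
    for host in arrivals:
        if host in fast_hosts:
            # Better hosts stay to send the requested prefetching pages.
            times[host] = barrier_done + request_cost + send_cost
        else:
            # Worse hosts leave as soon as their request has been issued.
            times[host] = barrier_done + request_cost
    return times


arrivals = {1: 10.0, 2: 12.0, 3: 40.0, 4: 60.0}  # hypothetical arrival times
result = leave_times(arrivals, fast_hosts={1, 2}, request_cost=1.0, send_cost=3.0)
print(result)
```

In this model the slower hosts 3 and 4 resume computation before hosts 1 and 2, which is the head start the policy is meant to give them.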
16/26
Proposed Method (cont.)
Third, we also observe that hosts with better resources spend a large amount of idle time in the barrier in a heterogeneous environment.
This increases execution time and barrier time.
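A simple way to see where this idle time comes from is to treat it as the gap between a host's own arrival at the barrier and the arrival of the last host. The sketch below uses hypothetical arrival times and a helper name, `barrier_idle_times`, chosen for illustration; it ignores message latency.

```python
def barrier_idle_times(arrivals):
    """Idle time per host, assuming the barrier opens when the last host arrives."""
    last = max(arrivals.values())
    return {host: last - t for host, t in arrivals.items()}


# Hypothetical arrival times: faster hosts reach the barrier earlier.
idle = barrier_idle_times({1: 10, 2: 12, 3: 40, 4: 60})
print(idle)  # {1: 50, 2: 48, 3: 20, 4: 0}
```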
17/26
Proposed Method (cont.)
We use the idle time that hosts with better resources spend in the barrier to let them prefetch pages among themselves.
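How much prefetching fits into that idle window depends on the transfer cost per page. The following sketch assumes a constant per-page cost and a known list size of wanted pages; both the helper `pages_in_idle_window` and these parameters are hypothetical and only illustrate the budgeting idea.

```python
def pages_in_idle_window(idle_time, per_page_cost, wanted_pages):
    """Number of pages a host can prefetch without extending its barrier stay.

    Assumes every page costs the same `per_page_cost` to transfer.
    """
    if per_page_cost <= 0:
        raise ValueError("per_page_cost must be positive")
    affordable = int(idle_time // per_page_cost)
    return min(affordable, wanted_pages)


# A host idle for 50 time units, with a transfer cost of 0.5 per page.
print(pages_in_idle_window(50.0, 0.5, wanted_pages=200))  # 100
print(pages_in_idle_window(50.0, 0.5, wanted_pages=30))   # 30
```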
18/26
Proposed Method (cont.)
Our Proposed Method
[Figure: Timeline of Host 1 to Host 4, with two home nodes and two non-home nodes. Labels: Enter Barrier, Request Prefetch, Serve Prefetch, Part of Prefetch, Idle Time, Leave Barrier, Program Finish.]
19/26
Performance Evaluation
Experimental Platform - Hardware
Node ID  CPU              Memory  Network
1        Intel P4 2.4GHz  256MB   Fast Ethernet
2        Intel P4 1.7GHz  256MB   Fast Ethernet
3        Intel P3 500MHz  640MB   Fast Ethernet
4        Intel P2 350MHz  128MB   Fast Ethernet
20/26
Performance Evaluation (cont.)
Experimental Platform - Software
• Linux Fedora Core 3.
• Kernel 2.6.9.
• JIAJIA DSM software.
21/26
Performance Evaluation (cont.)
The Idle Time in Barrier for IS Application
Method              Host 1  Host 2  Host 3  Host 4
JIAJIA              70.71   67.79   27.00   0.36
History Prefetch    72.72   71.05   30.45   0.78
Effective Prefetch  68.31   67.42   26.60   0.63
Agent Home          70.72   68.61   28.11   0.63
Our Method          62.14   62.16   26.77   0.56
22/26
Performance Evaluation (cont.)
The Idle Time in Barrier for Merge Application
Method              Host 1  Host 2  Host 3  Host 4
JIAJIA              3.75    7.35    8.67    8.21
History Prefetch    3.83    6.25    7.49    7.02
Effective Prefetch  3.92    6.29    7.43    6.97
Agent Home          4.34    6.03    7.03    6.57
Our Method          2.37    2.38    4.91    4.35
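To read the Merge results as relative improvements, the idle time of each host under "Our Method" can be compared with the JIAJIA baseline. The sketch below uses the values reported above; the helper name `relative_reduction` is introduced here for illustration only.

```python
# Idle time in barrier for the Merge application, as reported above.
JIAJIA = {"Host 1": 3.75, "Host 2": 7.35, "Host 3": 8.67, "Host 4": 8.21}
OUR_METHOD = {"Host 1": 2.37, "Host 2": 2.38, "Host 3": 4.91, "Host 4": 4.35}


def relative_reduction(baseline, improved):
    """Fraction of baseline idle time removed, per host."""
    return {h: (baseline[h] - improved[h]) / baseline[h] for h in baseline}


reductions = relative_reduction(JIAJIA, OUR_METHOD)
for host, r in reductions.items():
    print(f"{host}: {r:.1%}")
```

By this measure the largest reduction relative to JIAJIA is on Host 2 and the smallest on Host 1.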
23/26
Performance Evaluation (cont.)
Performance Benefits
[Figure: Bar chart of speedup (y-axis from 50% to 130%) for the EP, IS, LU, and MERGE applications under JIAJIA, History Prefetch, Effective Prefetch, Agent Home, and Our Method.]
24/26
Conclusions and Future Work
In heterogeneous environments, the benefits of the original prefetching strategies are limited.
In this paper, we used idle time to improve overall performance.
In the best case, our proposed method reduced idle time in the barrier by about 60%.
25/26
Conclusions and Future Work (cont.)
In the future, we will work on a method that optimizes the use of idle time in the barrier.
In addition, in our next development stage we will investigate parallel program execution under dynamic CPU loads.
26/26
Thank You!!