PERFORMANCE EVALUATION OF ROUTE-BASED DISTRIBUTED PACKET
FILTERING FOR DDOS PREVENTION IN LARGE-SCALE NETWORKS
A Thesis
Submitted to the Faculty
of
Purdue University
by
HyoJeong Kim
In Partial Fulfillment of the
Requirements for the Degree
of
Master of Science
December 2003
ACKNOWLEDGMENTS
I would like to thank my advisor, Professor Kihong Park, for his persistent guidance,
from technical details to mentoring. His keen criticism of my research has improved
my attitude toward scientific exploration, and his earnest devotion to science has
been a source of my motivation.
I present my gratitude to my friends and colleagues at the Network Systems Lab.
In particular, I am grateful to my friend Bhagya for her valuable feedback on my
research; I remember the enjoyable nights we spent together facing deadlines. I would
like to thank my friend Humayun for his help during my study, which included
numerous discussions on the subject, implementation of the design, and proofreading
of my thesis. I would also like to thank Ali for his participation in protocol design; I
thank Yan for his participation in implementation. I am grateful to Sunwoong, who
carefully examined my thesis and provided me with valuable comments.
Finally, I thank my parents and brothers for their life-long love and support.
I also thank my friends in Korea, especially Eun-Ju and Seung-Hyub, who have
given me warm encouragement throughout my study.
TABLE OF CONTENTS
Page
LIST OF TABLES . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . vi
LIST OF FIGURES . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . vii
ABSTRACT . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . x
1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
1.1 Motivation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
1.2 Objective . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2
1.3 Technical Challenges . . . . . . . . . . . . . . . . . . . . . . . . . . . 2
1.4 Contribution . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3
2 Related Work . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
2.1 DoS Attacks and Prevention Mechanisms . . . . . . . . . . . . . . . . 5
2.2 Methods for Computing Source Reachability . . . . . . . . . . . . . . 6
2.3 Scalable Network Simulation . . . . . . . . . . . . . . . . . . . . . . . 7
2.4 Power-Law Network Topology . . . . . . . . . . . . . . . . . . . . . . 8
3 Route-based Distributed Packet Filtering Protocol . . . . . . . . . . . . . . 9
3.1 Overview of Route-based Distributed Packet Filtering . . . . . . . . . 9
3.2 Protocol Design Issues . . . . . . . . . . . . . . . . . . . . . . . . . . 11
3.2.1 Filter Look-up . . . . . . . . . . . . . . . . . . . . . . . . . . . 12
3.2.2 Filter Update . . . . . . . . . . . . . . . . . . . . . . . . . . . 13
3.2.3 BGP and Its Extension . . . . . . . . . . . . . . . . . . . . . . 16
3.3 Route-based DPF Protocol . . . . . . . . . . . . . . . . . . . . . . . . 18
3.3.1 Architecture . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18
3.3.2 DPF-lookup Protocol . . . . . . . . . . . . . . . . . . . . . . . 19
3.3.3 Semi-maximal Filter Table . . . . . . . . . . . . . . . . . . . . 19
3.3.4 BGP Extension . . . . . . . . . . . . . . . . . . . . . . . . . . 19
3.3.5 DPF-update Protocol . . . . . . . . . . . . . . . . . . . . . . . 25
3.4 Improvement of DPF-update Protocol for Fault-tolerance . . . . . . . 26
4 Performance Evaluation of Route-based DPF Protocol . . . . . . . . . . . 29
4.1 Overall Objective . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29
4.2 Performance Measures . . . . . . . . . . . . . . . . . . . . . . . . . . 30
4.2.1 Stability . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31
4.2.2 Safety Violation . . . . . . . . . . . . . . . . . . . . . . . . . . 33
4.2.3 Staleness . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34
4.2.4 Containment . . . . . . . . . . . . . . . . . . . . . . . . . . . 35
4.2.5 Traceback . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35
4.3 Experimental Setup . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35
4.4 Stability . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36
4.5 Safety Violation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 39
4.6 Staleness . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 40
4.7 Containment . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 41
4.8 Traceback . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 42
5 Dynamic DPF Simulator . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45
5.1 Overview of DaSSFNet . . . . . . . . . . . . . . . . . . . . . . . . . . 45
5.2 DaSSFNet-based Parallel Network Simulation Environment . . . . . . 52
5.2.1 Description . . . . . . . . . . . . . . . . . . . . . . . . . . . . 52
5.2.2 Automatic Model Configuration and Partitioning . . . . . . . 56
5.2.3 Measurement Framework . . . . . . . . . . . . . . . . . . . . . 63
5.2.4 Protocol Modeling . . . . . . . . . . . . . . . . . . . . . . . . 70
6 Large-scale Network Simulation . . . . . . . . . . . . . . . . . . . . . . . . 83
6.1 Experimental Setup . . . . . . . . . . . . . . . . . . . . . . . . . . . . 83
6.1.1 System Configuration . . . . . . . . . . . . . . . . . . . . . . . 83
6.1.2 Benchmark Topologies . . . . . . . . . . . . . . . . . . . . . . 84
6.1.3 Simulation Setup . . . . . . . . . . . . . . . . . . . . . . . . . 85
6.2 Performance and Utility of Comprehensive Measurement Subsystem . 86
6.2.1 Methodology . . . . . . . . . . . . . . . . . . . . . . . . . . . 86
6.2.2 Memory Requirement Monitoring . . . . . . . . . . . . . . . . 87
6.2.3 Memory Consumption by Tables . . . . . . . . . . . . . . . . 89
6.2.4 Memory Consumption by Messages . . . . . . . . . . . . . . . 90
6.2.5 Memory Consumption and Counting of Major Kernel Events . 91
6.2.6 CPU Load Monitoring . . . . . . . . . . . . . . . . . . . . . . 93
6.2.7 Communication Cost . . . . . . . . . . . . . . . . . . . . . . . 94
6.3 Scalability of Partitioning . . . . . . . . . . . . . . . . . . . . . . . . 94
6.3.1 Completion Time . . . . . . . . . . . . . . . . . . . . . . . . . 96
6.3.2 Balanced Memory Offloading . . . . . . . . . . . . . . . . . . 98
7 Conclusion and Future Work . . . . . . . . . . . . . . . . . . . . . . . . . . 101
LIST OF REFERENCES . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 103
LIST OF TABLES
Table Page
6.1 Statistics of the benchmark topologies. . . . . . . . . . . . . . . . . . 85
6.2 Parameter settings for TCP. . . . . . . . . . . . . . . . . . . . . . . . 86
LIST OF FIGURES
Figure Page
3.1 Illustration of route asymmetry. . . . . . . . . . . . . . . . . . . . . . 15
3.2 Protocol stack of route-based DPF protocol. . . . . . . . . . . . . . . 18
3.3 Illustration of BGP Extension mechanism. . . . . . . . . . . . . . . . 20
3.4 BGP-REFLECT message format. . . . . . . . . . . . . . . . . . . . . 21
4.1 BGP routing stability as a function of simulation time. . . . . . . . . 37
4.2 Consistency of filter tables as a function of simulation time. . . . . . 38
4.3 Safety violation as a function of simulation time. . . . . . . . . . . . . 39
4.4 Staleness as a function of simulation time. . . . . . . . . . . . . . . . 40
4.5 Containment as a function of simulation time. . . . . . . . . . . . . . 41
4.6 Traceback as a function of resolution. . . . . . . . . . . . . . . . . . . 42
4.7 Traceback as a function of simulation time. . . . . . . . . . . . . . . . 43
5.1 A simple network specification in DML, which consists of one router. 50
5.2 Mapping of partition groups onto distributed machines. . . . . . . . . 50
5.3 A DML snippet of a point-to-point link. . . . . . . . . . . . . . . . . 51
5.4 Network protocol models in the Dynamic DPF Simulator. . . . . . . . 53
5.5 System architecture of the Dynamic DPF Simulator. . . . . . . . . . 55
5.6 A sample Meta-DML input file. . . . . . . . . . . . . . . . . . . . . . 58
5.7 Growth of AS-level Internet graph. . . . . . . . . . . . . . . . . . . . 58
5.8 300-node AS-level Internet graph. . . . . . . . . . . . . . . . . . . . . 60
5.9 Pseudo code of phase 0. . . . . . . . . . . . . . . . . . . . . . . . . . 62
5.10 Pseudo code of phase 1. . . . . . . . . . . . . . . . . . . . . . . . . . 62
5.11 Local and remote message passing procedures. . . . . . . . . . . . . . 65
5.12 A DML snippet for global measurement . . . . . . . . . . . . . . . . . 68
5.13 A DML snippet for local measurement at IP. . . . . . . . . . . . . . . 69
5.14 An illustration of a peering relationship between one border router in AS 1 and another in AS 2. . . . . . . . . . . . . . . . . . . . . . . . . 71
5.15 The architecture of BGP protocol model with its tables. . . . . . . . 72
5.16 The class hierarchy of BGP message types. . . . . . . . . . . . . . . . 72
5.17 The architecture of BGP protocol model with its timers. . . . . . . . 73
5.18 A DML snippet of BGP-4 model. . . . . . . . . . . . . . . . . . . . . 74
5.19 Interaction of BGP and DPF-update. . . . . . . . . . . . . . . . . . . 75
5.20 A DML snippet of DPF-update model. . . . . . . . . . . . . . . . . . 76
5.21 A DML snippet of DPF-lookup model. . . . . . . . . . . . . . . . . . 77
5.22 A DML snippet of UDP model. . . . . . . . . . . . . . . . . . . . . . 78
5.23 A DML snippet of CBR traffic generator. . . . . . . . . . . . . . . . . 79
5.24 A DML snippet of CBR massive attacker. . . . . . . . . . . . . . . . 80
5.25 Mechanism of ShutDown model. . . . . . . . . . . . . . . . . . . . . . 81
5.26 A DML snippet of ShutDown model. . . . . . . . . . . . . . . . . . . 81
6.1 Hardware configuration of a Linux cluster used for AS-level benchmarking. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 84
6.2 Power-law connectivity of the benchmark topologies. . . . . . . . . . 85
6.3 Memory consumption as a function of simulation time. (a) Memory consumption is classified into three categories—tables, messages, and kernel events. (b) The categories are further subdivided into fine-granular components. . . . . . . . . . . . . . . . . . . . . . . . . . . . 88
6.4 Memory consumption by protocol tables: BGP Adj-RIB-In, BGP Loc-RIB, and IP. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 89
6.5 Memory consumption by protocol messages: BGP, TCP send buffer, TCP receive buffer, and IP. . . . . . . . . . . . . . . . . . . . . . . . 91
6.6 Major types of KernelEvent objects. (a) shows cumulative counts of KernelEvent object creation as a function of simulation time for the major types. (b) shows total memory consumption by KernelEvent objects and memory consumption by major types of KernelEvent objects as a function of simulation time. . . . . . . . . . . . . . . . . . . . . . 92
6.7 CPU load distribution over 16 Linux workstations. . . . . . . . . . . . 94
6.8 (a) Completion time as a function of parallelism for different benchmark graphs. (b) Completion time as a function of problem size for 16, 20, and 24 machines. . . . . . . . . . . . . . . . . . . . . . . . . . . . 96
6.9 (a) Total memory watermark as a function of parallelism for different benchmark topologies. (b) Average and maximum memory watermark as a function of parallelism for different benchmark graphs. . . . . . . 98
6.10 (a) Total memory watermark as a function of problem size for 16, 20, and 24 machines. (b) Average and maximum memory watermark as a function of problem size for 16, 20, and 24 machines. . . . . . . . . . 99
ABSTRACT
Kim, HyoJeong. M.S., Purdue University, December, 2003. Performance Evaluation of Route-based Distributed Packet Filtering for DDoS Prevention in Large-scale Networks. Major Professor: Kihong Park.
This thesis studies performance evaluation of route-based distributed packet fil-
tering (DPF) for spoofed distributed denial of service (DDoS) attack prevention in
large-scale networks under dynamic network conditions. Our contribution is three-
fold.
We design and implement a route-based DPF protocol which computes route-
based filter tables dynamically in the presence of IP (Internet Protocol) routing table
updates governed by BGP (Border Gateway Protocol), the Internet's inter-domain
routing protocol. By introducing an additional signalling message type to BGP, our
solution discovers source reachability information despite the destination-based and
policy-based characteristics of BGP that make it prone to generating asymmetric routes.
We evaluate proactive protection performance of route-based DPF under dynamic
network conditions including node failures and resulting transient system states.
Benchmarking is carried out in large-scale Internet measurement topologies where
we show that route-based DPF is robust and effective with respect to both proactive
(containment) and reactive (traceback) performance.
To facilitate large-scale simulation-based DDoS performance evaluation, we built
the Dynamic DPF Simulator as an extension of DaSSFNet. By incorporating auto-
mated network configuration, partitioning, and run-time measurement and monitor-
ing, we show that scalable network simulation is achieved by enabling efficient memory,
CPU, and communication load balancing in workstation clusters.
1 INTRODUCTION
1.1 Motivation
By clogging the Internet, denial of service (DoS) attacks impede the availability
of resources and services. The severity and prevalence of DoS attacks are increasing,
and their consequent malfunctions are experienced by the Internet population at large.
Moore et al. [1] observed that more than 12,000 denial of service attacks occurred
against 5,000 distinct targets during a three-week period. Whereas past incidents
mostly targeted commercial web sites, universities, and organizations, more recent
attacks have targeted the network infrastructure such as major root DNS servers [2–5].
In addition, Internet worms—self-replicating malicious code—have been used as an
agent for launching massive distributed DoS (DDoS) attacks [6].
Route-based distributed packet filtering (DPF) [7] is a recently advanced solution
that provides proactive and reactive protection against spoofed DDoS attacks, in
which the source address of attack packets is forged. Given the routing information of the
Internet, route-based DPF at strategic border routers inspects the source address of
an incoming IP packet. If its source address turns out to be valid, i.e., unspoofed,
with respect to the routing information, it forwards the packet to IP for routing.
Otherwise, it discards the packet proactively. As many DDoS attacks use IP spoofing
to hinder source identification, proactive prevention guards the core and end points
of the network from attack. When a spoofed DDoS attack succeeds in
penetrating the proactive shield, route-based DPF’s reactive protection localizes its
physical source within a few sites. Route-based DPF has been evaluated in large but
static network environments assuming availability of relevant routing information,
and shown to be effective. However, a protocol for computing route-based filter
tables has been missing, as has an assessment of its effectiveness and robustness in
the presence of dynamic route changes.
1.2 Objective
The objective of this thesis is threefold. First, we design and implement a route-
based DPF protocol for computing valid source address information. Route-based
DPF filters at distributed filter sites are updated in the presence of dynamic route
changes caused by BGP, the Internet's inter-domain routing protocol. Second, we carry
out performance evaluation of the route-based DPF protocol under dynamic network
conditions, including system failures. We evaluate fault-tolerance of route-based DPF
in large-scale autonomous system (AS)-level Internet measurement topologies. Third,
in order to perform large-scale AS-level Internet benchmarking, we built a scalable
simulation environment extending DaSSFNet, a distributed simulation platform for
workstation clusters. We perform comprehensive simulation benchmarks to determine
scalability in large-scale networks.
1.3 Technical Challenges
Route-based DPF has to infer source reachability from inter-domain routing infor-
mation in order to compute filter tables at distributed filter sites. However, IP routing
is based on destination reachability. When a packet arrives at a router, the router
is interested in only the destination address of the packet, not its source address.
Hence, existing routing protocols, in particular, BGP (Border Gateway Protocol),
do not provide source reachability information. Moreover, due to asymmetry of IP
routing, we cannot infer source reachability directly from the destination reachability
information. That is, we cannot ascertain that the path from a source node to a des-
tination node is the same as from the destination to the source in reverse order. In
addition, BGP uses administrative policies for determining routes that are not neces-
sarily of technical nature. As a result, routing information received at a border router
may be biased by its upstream routers. Thus, a major challenge in deploying
route-based DPF in the global IP Internet is inferring source reachability from BGP.
For large-scale performance evaluation involving dynamic network simulation, a
scalable simulation environment that is capable of providing necessary system sup-
port including monitoring and measurement, memory management, and partitioning
is crucial. Since we are aiming to carry out performance evaluation of route-based
DPF on large-scale AS-level Internet measurement topologies, our environment must
support up to 12,000-node/26,000-edge networks that may contain 144,000,000 rout-
ing entries. A critical problem in scalable network simulation is memory consumption
stemming from both static requirements such as routing tables and dynamic require-
ments such as protocol messages. In AS-level Internet simulation, each node represents
an AS and is modelled as a single border router. Thus, in addition
to IP routing tables there exists the memory requirement of BGP’s internal tables.
For our performance evaluation, route-based DPF filter tables need to be maintained
alongside IP and BGP tables. Hence, we need to build a scalable simulation environment
for AS-level Internet simulations that supports large-scale memory requirements
while achieving parallel speed-up.
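As a back-of-the-envelope check of the figures above (the per-entry byte size is an illustrative assumption, not a number from the thesis), the quadratic routing state implied by such topologies can be sketched as:

```python
# 12,000 ASes, each modelled as a single border router; destination-based
# routing keeps one entry per (node, destination) pair, i.e. n^2 in total.
n_nodes = 12_000
routing_entries = n_nodes ** 2
assert routing_entries == 144_000_000

# At a hypothetical 16 bytes per entry, the tables alone exceed 2 GB,
# well beyond a single workstation's RAM of the era -- hence the need
# for memory-balanced partitioning across a cluster.
bytes_per_entry = 16  # assumption for illustration only
assert routing_entries * bytes_per_entry > 2 * 1024**3
```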
1.4 Contribution
We have designed a route-based DPF protocol, which updates filter tables dy-
namically as BGP changes IP routing tables. The route-based DPF
protocol relies on a BGP Extension that allows source reachability information to be
deduced from routing related signalling messages. Based on observable information
obtained from the BGP Extension, the route-based DPF protocol at filter-deployed
border routers infers validity of source addresses for each interface and updates filter
tables accordingly. In addition, a counter-based table design facilitates incremental
filter update in tandem with incremental BGP routing update.
We have implemented the route-based DPF protocol in a process-oriented simula-
tion environment, DaSSFNet [8]. The process-oriented abstraction allows simulation
models to be almost as comprehensive as actual system-level protocol implementa-
tions. Since the filter update component of the route-based DPF protocol resides
above the transport layer as in BGP, it interacts with modules in the lower layers of
the protocol stack via a BSD-like socket interface. Thus, the simulation model is built
independently from its underlying simulation environment. This working protocol
model, together with the design decisions made in the implementation phase, is useful for future
system building work in network processor platforms.
We have carried out performance evaluation in large-scale AS-level Internet mea-
surement topologies [9]. Our performance evaluation results on fault-tolerant protec-
tion of route-based DPF in Internet measurement topologies are useful for assessing its
effectiveness and robustness in dynamic environments. Moreover, the scalable simu-
lation environment serves as a base for researching filter placement issues as well as
more thorough performance evaluation with respect to infrastructure attack against
route-based DPF.
In the context of distributed simulation, partitioning of a given simulation topol-
ogy affects the simulation’s completion time. Memory requirement imbalance may
cause swapping in the virtual memory management system, resulting in increased
execution time. We have devised a new partitioning algorithm for power-law net-
work topologies, characteristic of Internet AS measurement graphs, which achieves
balanced distribution of memory requirement as well as utilization of CPU resources.
2 RELATED WORK
In the following, we review related work across key areas relevant to the thesis.
2.1 DoS Attacks and Prevention Mechanisms
Denial of Service (DoS) attacks overburden a target system or network by de-
manding more resources than it can provide. As seen in classical DoS attacks
[2,10,11], the exhausted resources can be network bandwidth, processes, or network
connections. A network-based distributed DoS (DDoS) attack typically forges the source
address of attack packets [12], a practice called spoofing. Although some recent attacks used
agents to generate DoS attack traffic with unspoofed source addresses, initial attack
packets for launching remote agents employ IP source address spoofing [13]. Recent
incidents reveal that Internet infrastructure such as core routers and Domain Name
System (DNS) servers has become a target of DoS attacks [5].
Research on source identification—also called IP traceback [14]—has looked at
ways to localize the physical source of attack traffic. Manual recursive link test-
ing [15], audit-trail approaches that use traffic logs at routers or gateways [16], and
packet-based traceback mechanisms [14, 17–20] all belong to this class of mech-
anisms. In contrast to route-based DPF, these approaches are inherently reactive—
damage must occur before traceback can be initiated—and cannot provide proactive
protection where attack packets are discarded before they reach a victim.
Packet filtering at ingress or egress points of a domain prevents DDoS attack traffic
proactively [21–23]. For example, a firewall at an egress point can check the source
address of exiting IP packets. If the source address of a packet is not
from the address space of its domain, the firewall can discard it, concluding that
the packet is spoofed. A limitation of egress filtering is that, with partial deployment,
there remain too many domains from which spoofed DDoS attacks can be initiated
by compromised hosts. Ingress filtering only works at transit providers vis-à-vis stub
ASes which limits its effectiveness. In this sense, route-based DPF can be viewed as
a generalization of ingress filtering.
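The egress check described above can be sketched as a membership test against the domain's own address space; the prefix and addresses below are illustrative, not from the thesis:

```python
import ipaddress

# Hypothetical address space of the local domain (illustrative prefix).
DOMAIN_PREFIX = ipaddress.ip_network("128.10.0.0/16")

def egress_filter(src_addr: str) -> bool:
    """Return True if an exiting packet should be forwarded.

    A packet whose source address lies outside the domain's own address
    space cannot legitimately originate here, so it is deemed spoofed
    and discarded.
    """
    return ipaddress.ip_address(src_addr) in DOMAIN_PREFIX

# A packet sourced inside the domain passes; a spoofed one is dropped.
assert egress_filter("128.10.3.25") is True
assert egress_filter("192.0.2.1") is False
```

The same test inverted (checking packets entering rather than leaving the domain) gives the ingress-filtering variant discussed above.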
Recently, Mayday [24] has demonstrated a key aspect of route-based DPF, where
it functions as a non-cryptographic authentication mechanism. Mayday applied source
address (or port number) based network-layer packet filtering as a lightweight au-
thentication mechanism for protecting servers of Secure Overlay Services (SOS) [25].
In this framework, router-based packet filters are deployed around the server, and, by
inspecting source address (or port number) of every incoming packet, they forward
only valid packets to SOS servers.
2.2 Methods for Computing Source Reachability
A major difficulty in route-based DPF table calculation lies in inference of source
reachability from destination-based IP routing, and in the presence of routing asym-
metry. Whereas our approach extends BGP-4 introducing an additional signalling
message type, SAVE [26] and OPCA [27] address the same problem in different do-
mains with different design principles. The objective of SAVE [26] is to enforce IP
packets to carry valid source addresses in a router-level network. Since SAVE is
designed to be routing-protocol independent, it uses additional data structures to
maintain source reachability information at each SAVE router; it is generic in nature,
with its main contribution lying in efficient incremental update. The
key technical challenge with respect to protocol implementation is to realize route-
based DPF—semi-maximal or maximal filtering—as a minimal footprint companion
protocol of an underlying routing system. In the case of the route-based DPF proto-
col advanced in this thesis, this is done in the context of BGP utilizing its AS-PATH
message information. OPCA [27] proposes an overlay network on top of BGP as a
policy control architecture, which is applied for faster route fail-over and inbound
traffic load balancing. In the case of OPCA, a central repository maintains inter-AS
relationships and the Internet hierarchy, which are used to enhance routing performance.
2.3 Scalable Network Simulation
Simulating large-scale networks such as the Internet has been a challenging prob-
lem due to the size and complexity of the global IP Internet [28, 29]. ns-2 [30], a
packet-level discrete-event network simulator, has been used widely for research in-
cluding TCP congestion control, multicasting, and wireless networks. Although ns-2
is well suited for small-scale simulation, the memory requirements of routing tables,
messages, and timer events in large-scale networks make it ill suited for scalable network
simulation. Moreover, its lack of a process-oriented abstraction hinders flexible and
accurate evaluation of dynamic protocols, including those pertaining to dynamic rout-
ing.
To tackle these challenges and limitations, several studies have looked into a
fluid-based simulation approach [31–33], which represents network traffic as a fluid
flow so that the simulator only keeps track of changes in flow rates
without maintaining individual packet events. A number of projects have studied
parallel/distributed simulation techniques in order to utilize parallel/distributed re-
sources in multi-processor and distributed-memory environments for large-scale sim-
ulation [8, 34–36]. A principal focus of these studies has been on synchronization
and parallel speed-up issues. Recent work proposed techniques aimed at enabling
large-scale network simulation by using memory resources thriftily, emphasizing that
routing-related information inherently requires a large amount of memory that can be-
come a bottleneck [37, 38].
Our environment, the Dynamic DPF Simulator, facilitates large-scale network
simulation by extending DaSSFNet [8], a C++ based realization of SSF for worksta-
tion clusters and multi-processor systems. SSF is a process-oriented discrete event
simulation framework aimed at flexible, accurate, and efficient simulation. Adopt-
ing DaSSF’s scalable simulation environment together with a process-oriented world
view [39], DaSSFNet provides a network simulation infrastructure which is amenable
to full-fledged network protocol implementation on commodity workstation/PC clus-
ters. Existing tools, including DaSSFNet, provide a parallel simulation kernel
capable of efficient synchronization and export standardized APIs (e.g., SSFNet's
object classes and a BSD-like socket API); however, they lack tools for automated net-
work configuration, partitioning, and efficient dynamic monitoring. A key problem is
large-scale topology partitioning which has a dominant influence on performance with
respect to memory, CPU, and communication load balancing. Our system building
work addresses these issues.
2.4 Power-Law Network Topology
Recent measurements of various information networks, including Internet domain
networks [40], the World Wide Web [41, 42], metabolic networks [43], and a variety
of social networks [44–46] have shown that connectivity in these networks follows a
distinct pattern: most are connected to a few, but a few are connected to many.
These networks are sometimes collectively referred to as power-law networks as there
is a power-law relation between the degree and frequency of nodes of that degree:
Pr{deg(u) = k} ∝ k^(−β)
In [7] the impact of power-law network connectivity on route-based DPF's pro-
tection performance has been studied. It is shown that the power-law connectivity of
the Internet AS-level topology plays a crucial role in route-based DPF's effectiveness
at achieving proactive and reactive protection with sparse filter placement. Theoretical
studies that extend classical random graph theory to power-law graphs based on degree
sequences include [47].
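A minimal sketch of the power-law relation above: the exponent β can be estimated by a log-log least-squares regression of degree frequency against degree. The synthetic degree sequence and constants below are illustrative, not measurement data:

```python
import math
from collections import Counter

def fit_power_law_exponent(degrees):
    """Estimate beta in Pr{deg(u) = k} ∝ k^(-beta) by least-squares
    regression of log-frequency against log-degree."""
    counts = Counter(degrees)
    xs = [math.log(k) for k in counts]
    ys = [math.log(c / len(degrees)) for c in counts.values()]
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    return -slope  # beta is the negated log-log slope

# Synthetic sequence with frequency ∝ k^-2: degree k appears ~10000/k^2 times.
degrees = [k for k in range(1, 50) for _ in range(round(10000 / k**2))]
beta = fit_power_law_exponent(degrees)
assert abs(beta - 2.0) < 0.1  # recovers the planted exponent
```

This is the naive log-log fit; it suffices to illustrate the "most connect to few, few connect to many" pattern, though more careful estimators exist.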
3 ROUTE-BASED DISTRIBUTED PACKET FILTERING
PROTOCOL
Route-based Distributed Packet Filtering (DPF) [7] has been proposed as a proactive
and reactive solution for distributed denial of service (DDoS) attacks which use source
IP address spoofing. Aiming at Internet autonomous system (AS) level protection
against DDoS attack, we have designed a protocol for route-based DPF.
This chapter is organized as follows. First, we introduce the concept of route-
based DPF for technical background. Next, we discuss several protocol design issues
surrounding route-based DPF. This is followed by the protocol specification. Finally,
we describe an improvement of the protocol for enhancing fault-tolerance.
3.1 Overview of Route-based Distributed Packet Filtering
This section presents the idea of route-based distributed packet filtering (DPF)
for technical background, summarizing [7]. First, the concept of route-based packet
filtering is described, followed by two filter types—maximal and semi-maximal filters.
Next, the concept of distribution of route-based packet filters and their synergistic
effect are described. Finally, issues regarding filter placement are discussed.
Route-based DPF assumes that each node1 has complete knowledge of routing
over the entire network. With this assumption, each node verifies if an incoming IP
packet is valid, i.e., non-spoofed, when it arrives through a specific link. If the packet
is deemed spoofed from the routing information, the packet is discarded. On the
other hand, if the routing information cannot conclusively determine the validity of
the source address, the node forwards the packet following IP.
1A node can be an AS in an AS-level network or a router in a router-level network. We will focus on AS-level networks in this section.
Route-based DPF includes two types of filters—maximal and semi-maximal. A
filter is a mechanism for determining if a packet is valid or not over a link on which the
packet arrives. Given a graph G = (V, E) which represents the Internet AS topology
and routing information R, a maximal filter Fe at a link e = (u, w) ∈ E is defined as
Fe(s, t) = 0, if e ∈ R(s, t);
           1, otherwise.
Here, R(s, t) represents a set of routes from a source IP address s to a destination
IP address t. The output 0 means that a given packet is valid—i.e., non-spoofed,
and the output 1 means that it is invalid—i.e., spoofed. A maximal filter evaluates
the validity of an incoming packet M(s, t) by checking, against R(s, t), whether a
route from s to t passes through the link e. Each node maintains a separate table per
link for storing the validity flag of incoming packets for all source and destination
pairs. This requires in general O(n2) space, where n is the number of nodes in the
network.
A semi-maximal filter over a link e = (u, w) ∈ E is defined as follows.
Fe(s, t) = 0, if e ∈ R(s, v) for some v ∈ V ;
           1, otherwise.
A semi-maximal filter uses only the source IP address of a packet to determine its
validity. It checks whether the link on which a packet M(s, t) arrives belongs to a
routing path from the source IP address s to some destination node, irrespective of the destination
IP address t inscribed in the IP header. In this setting, a semi-maximal filter over a
link maintains a table which keeps validity information based on source address only.
This requires O(n) space.
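The two filter types can be contrasted in a small sketch. This is an illustrative model only (the route set, node names, and link below are invented, not from the thesis): given routes R as lists of traversed links, a maximal filter for a link keeps one flag per (source, destination) pair, while a semi-maximal filter keeps one flag per source.

```python
# Illustrative route set: (source, destination) -> list of links traversed.
R = {
    ("s1", "t1"): [("s1", "u"), ("u", "w"), ("w", "t1")],
    ("s1", "t2"): [("s1", "u"), ("u", "x"), ("x", "t2")],
    ("s2", "t1"): [("s2", "u"), ("u", "w"), ("w", "t1")],
}
e = ("u", "w")  # the link whose filter we build

# Maximal filter: one entry per (source, destination) pair -> O(n^2) space.
maximal = {(s, t) for (s, t), links in R.items() if e in links}

# Semi-maximal filter: one entry per source only -> O(n) space.
semi_maximal = {s for (s, t), links in R.items() if e in links}

def maximal_valid(s, t):
    # Output 0 (valid) iff e lies on some route in R(s, t).
    return (s, t) in maximal

def semi_maximal_valid(s):
    # Valid iff e lies on a route from s to *some* destination v.
    return s in semi_maximal
```

Note that a packet with source s1 and destination t2 arriving on e is invalid under the maximal filter but valid under the semi-maximal one, since s1 legitimately uses e to reach t1.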
When more than one node in a network enables its filtering functionality, filtering
is distributed. The collaborative effect of route-based distributed packet filtering
(DPF) is two-fold—proactive and reactive. Proactive protection (a.k.a., containment)
means that route-based DPF discards spoofed IP packets proactively before they can
reach their target. Reactive protection (a.k.a., traceback) is in effect when route-
based DPF cannot proactively filter out spoofed attack traffic. Reactive protection
means that, upon receiving an IP packet—spoofed or non-spoofed—route-based DPF
can localize its physical source.
Distributed filter placement involves two issues—coverage ratio and selection of
nodes for a given coverage ratio. Coverage ratio is defined as the fraction of nodes
where filtering is enabled. For a given coverage ratio, the strategy for selecting the
filter nodes affects proactive and reactive protection performance. AS-level Internet
topology has been shown to exhibit power-law connectivity [40]. This implies that
there are a few high degree nodes which are connected to many low degree nodes.
Thus, exploiting the power-law nature of the AS-level Internet topology, we can reduce
the required coverage ratio by placing filters at high degree nodes. In other words,
power-law connectivity information provides a strategy for selecting a set of filter
nodes while minimizing the coverage ratio.
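As a toy illustration of this placement strategy (the graph and coverage ratio below are invented for the example), one can rank nodes by degree and enable filtering at the top fraction:

```python
# Degree-based filter placement sketch on a made-up topology.
from collections import defaultdict

edges = [(1, 2), (1, 3), (1, 4), (1, 5), (2, 6), (3, 6), (4, 7), (5, 7)]
degree = defaultdict(int)
for u, v in edges:
    degree[u] += 1
    degree[v] += 1

def select_filter_sites(coverage_ratio):
    """Pick the highest-degree nodes for a given coverage ratio."""
    n = len(degree)
    k = max(1, int(n * coverage_ratio))
    ranked = sorted(degree, key=lambda x: degree[x], reverse=True)
    return set(ranked[:k])

sites = select_filter_sites(0.3)  # ~30% coverage: the hub (node 1) is chosen
```

In a power-law graph the few hub nodes touch most links, which is why a small coverage ratio placed by degree can already intercept much of the spoofed traffic.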
3.2 Protocol Design Issues
Sharing a common protocol architecture with IP, the route-based DPF protocol
is composed of two major parts—filter look-up and filter update. The filter look-up
component does line rate packet processing to determine source address validity. As
with IP, the filter look-up component functions on the data plane at line rate. On
the other hand, the filter update component updates route-based DPF filter tables
as routes in the network change dynamically. Similarly to routing protocols such as
BGP, the filter update component operates in the control plane, at slower time scales.
In the next section, we discuss issues in designing filter look-up and filter update
components, and present our approach for solving these issues.
3.2.1 Filter Look-up
Filter look-up resides between the network interface layer and internet layer in the
Internet Reference Model. Filter look-up implements packet forwarding/discarding
depending on the validity of a packet’s source address, i.e., it is a semi-maximal filter.
Semi-maximal filtering provides protection comparable to that of maximal filtering,
while requiring O(n) space instead of O(n²) [7].
Note that, in the context of AS-level route-based DPF protocol, we interpret an
IP address of an AS node as an IP prefix within the administrative domain of the AS
node2. In other words, we assume that every AS node has a unique non-overlapping
IP prefix and we can check the originating AS node from a source IP address by
inspecting the prefix of the IP address.
We have designed a counter-based semi-maximal filter, which is suitable for incre-
mental filter update. Each entry within the counter-based semi-maximal filter table
includes a separate counter, and the counter represents the number of destinations
where the link e is traversed from each source IP address. The counter is interpreted
as false (valid) if its value is positive; otherwise, it is interpreted as true (invalid).
A straightforward semi-maximal filter design is to maintain a table which consists
of entries for all source IP prefixes, where each entry includes a Boolean flag. If it is
true—1 in the definition—we consider it as invalid (spoofed). Otherwise, we consider
it as valid (non-spoofed). If a link e belongs to a set of routes from a source IP
address s to some destination IP address, we set the entry of the source’s IP prefix as
false. Otherwise, the entry has true as its value. Thus, when a packet, whose source
IP address is s, arrives, filter look-up will check its validity by performing longest
prefix matching.
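The look-up step can be sketched with Python's standard ipaddress module; the table contents here are illustrative only, and the linear scan stands in for whatever longest-prefix-match data structure a real implementation would use:

```python
# Longest-prefix-match look-up over a per-link filter table.
import ipaddress

# prefix -> validity flag (False = valid/non-spoofed, True = invalid/spoofed).
table = {
    ipaddress.ip_network("10.0.0.0/8"): True,
    ipaddress.ip_network("10.1.0.0/16"): False,
}

def look_up(src):
    """Return the flag of the longest matching prefix (True if no match)."""
    addr = ipaddress.ip_address(src)
    best = None
    for prefix, flag in table.items():
        if addr in prefix and (best is None or prefix.prefixlen > best.prefixlen):
            best, best_flag = prefix, flag
    return best_flag if best is not None else True

# 10.1.2.3 matches both prefixes; the more specific /16 entry wins.
```

Treating "no matching entry" as invalid mirrors the table-compression convention described in Section 3.3.3, where only valid sources are stored.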
BGP handles dynamic state changes incrementally, whereas OSPF and RIP do
so periodically. The route-based DPF protocol, in the AS context, follows BGP and
2In reality, some AS node may include more than one IP prefix within its domain. In some cases, more than one AS node may have a common IP prefix within their domains.
hence its updates are carried out incrementally. The following example shows that
route-based DPF must be extended to handle incremental changes.
When network state is dynamic, a source IP address, which was valid earlier, may
become invalid later. Formally, let R0(s, t) be the set of routes from s to t calculated
at time 0 and let R1(s, t) be the set of routes calculated at time 1. When the state of
the network is transient, it is possible that e ∈ R0(s, t) but e ∉ R1(s, t). When a filter
entry has false as its flag value, it is difficult to tell whether the link in question is
used only to reach the destination t from source s, or whether the link is also used
to reach some other destination address. In the first case, the entry should be updated
to true; however, in the presence of continual changes and uncertainty, this may
violate safety if the link is in fact used to reach some other destination from source s.
As mentioned in [7], route-based DPF is safe in the sense that it never discards valid,
i.e., non-spoofed packets. In the second case, the entry should remain false. However,
if the link is in fact not used by any other source-destination pair, the entry becomes
stale. An entry for a source IP address s of a filter table at a link e is stale when a
semi-maximal filter cannot filter out spoofed DDoS packets, whose source IP address
is forged as s.
When the above transient change happens, our counter-based semi-maximal filter
decrements the corresponding entry’s counter value. If the link e is used to reach
from the source IP address s to some other destination, the counter value remains
positive. Thus, safety is not violated. If the link is used only for a single (s, t) pair,
the counter will reach 0 after decrement. Hence, the entry is prevented from being
stale.
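The counter mechanics described above can be sketched as follows. This is an illustrative model, not the thesis implementation; the class and method names are invented:

```python
# Counter-based semi-maximal filter entry management for one link e.
from collections import defaultdict

class CounterFilter:
    def __init__(self):
        self.count = defaultdict(int)  # source -> # of (s, t) routes using e

    def add_route(self, s):
        # e now lies on a route from s to one more destination.
        self.count[s] += 1

    def del_route(self, s):
        # e no longer lies on the route from s to some destination.
        self.count[s] -= 1
        if self.count[s] <= 0:
            del self.count[s]          # entry removed -> no stale state

    def valid(self, s):
        return self.count.get(s, 0) > 0

f = CounterFilter()
f.add_route("AS1")   # e used on route AS1 -> t1
f.add_route("AS1")   # ... and on route AS1 -> t2
f.del_route("AS1")   # route to t1 moves off e: AS1 stays valid (safety kept)
f.del_route("AS1")   # route to t2 moves off e too: entry expires (no staleness)
```

The counter thus resolves exactly the ambiguity described above: a withdrawal only invalidates the source once no (s, t) pair uses the link.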
3.2.2 Filter Update
As mentioned in [7], the most challenging problem in designing a filter update
protocol lies in the fact that IP focuses on destination reachability, but not
necessarily on source reachability. This can induce asymmetry of routing, which makes
it difficult to infer source reachability. Asymmetry of IP routing is common in inter-
domain routing where non-technical, administrative policies are applied due to BGP
implementing policy routing. In this section, we focus on these challenges. Our
approach, a BGP Extension, for solving these challenges is introduced in the next
section.
For background, let us describe BGP’s route selection and advertisement pro-
cedure. According to RFC 1771 of BGP-4 [48], Decision Process selects routes for
subsequent advertisement by applying policies to route updates stored in Adj-RIB-In
(a table containing received route updates per peer). The output of Decision Process
is a set of routes to be advertised, and they are stored in Adj-RIB-Out. Decision Pro-
cess includes three distinct phases—Phase 1, Phase 2, and Phase 3. Phase 1 is a step
for calculating the degree of preference for each received route update by applying
policies. Let us call them update policies. During Phase 2, a best route is selected
out of the received route updates for each distinct destination. When more than one
candidate route update for a destination has the same preference, a tie-breaking
procedure is applied. Each BGP speaker has a unique BGP Identifier,
which is set to an IP address assigned to it. When a BGP peering relationship is es-
tablished, participating BGP speakers exchange their BGP Identifier values. Among
routes selected during Phase 2, some routes are chosen for advertisement to peer BGP
speakers per policy during Phase 3. Let us call them advertisement policies.
As with other routing protocols, BGP focuses on destination reachability, which
is sufficient for calculating IP routing tables. Since IP routing relies on destination
IP address only, destination reachability is a sufficient condition. A BGP speaker at
a destination AS creates and forwards reachability information (i.e., route update)
to its neighbors. Once reachability information is sent out, the BGP speaker at the
destination AS does not know which route was selected by BGP speakers at source
ASes. Each source AS selects a route to the destination AS based on destination
reachability information only, without taking into account the routing path from the
destination AS back to itself. Moreover, source ASes need not send routing information
back to the destination AS.
Destination-based IP routing may induce route asymmetry, and routes calculated
by BGP can exhibit the same problem. Given a route from a source IP address s to a
destination IP address t, routes between s and t are asymmetric if a route from t to
s is not the same as one from s to t. Route asymmetry is a characteristic feature of
destination-based IP routing, and it has been observed since the mid 1990s both at
the AS-level and at the router-level [49]. Figure 3.1 illustrates route asymmetry in a
simple network. Let us use hop count as the metric in this example. When a routing
protocol lets node 1 know that there are two route candidates for reaching node 6
with the same metric 3, node 1 will choose one of them. Similarly, node 6 will select
one of two route candidates for reaching node 1. Depending on the routing protocol
and local information available, route asymmetry can arise as depicted in Figure 3.1.
Figure 3.1. Illustration of route asymmetry.
Due to route asymmetry, source reachability may not be inferable directly from
destination reachability information. In Figure 3.1, given a route from node 1 to
node 6, we cannot construe that the route from 6 to 1 is the same route in reverse.
On the other hand, link-state routing protocols such as OSPF can detect this asym-
metry and infer source reachability using its global knowledge of the entire network.
However, BGP does not provide each node with global knowledge for calculating source
reachability. Phase 3 of BGP’s Decision Process chooses route updates (1) among
routes selected during Phase 2, and (2) according to its advertisement policy. In
other words, route update information received from one’s peer may be biased and
restricted by the update policy (in case of 1) and advertisement policy (in case of 2)
of the peer. Hence, the current BGP is not suited for inferring source reachability for
use in route-based DPF filter table update.
To calculate correct filter tables at distributed filter sites, augmentation of BGP
or introduction of a new protocol for propagating source reachability information is
required. In reality, BGP may not compute correct routing tables. Sometimes, it
leads packets to black holes where packets cannot be forwarded any further. Even if
this is the case, route-based DPF protocol should infer source reachability from the
consistent image of routing tables which the BGP protocol calculates. Hence, both
methods are required to interact with BGP for synchronization. The latter requires
additional overhead for coordinating with a separate protocol, BGP. We extend BGP
for disseminating source reachability information.
3.2.3 BGP and Its Extension
We extend BGP by defining a new message type—BGP-REFLECT. A BGP-
REFLECT message contains source reachability information, and it is disseminated
back to destination ASes, where its corresponding route update message is initiated.
BGP-REFLECT includes two internal types—ADD and DELETE. In the case when
a BGP-REFLECT message carries potential source AS information which becomes
reachable, its internal type is set as ADD. In the case when it carries source AS
information which becomes unreachable, its type is set as DELETE.
As BGP presents destination reachability using AS-PATH in BGP route update
messages, source reachability information in BGP-REFLECT messages is represented
as AS-PATH. A BGP route update message includes AS-PATH as an attribute for a
destination IP prefix. AS-PATH, which contains destination reachability information,
is a sequence of AS numbers starting with that of the originating AS. As it is propa-
gated from the originating AS, AS numbers are prepended to the AS-PATH attribute
in BGP route update messages. Similarly, a BGP-REFLECT message includes source
reachability information as part of AS-PATH. The AS number of the source AS is
prepended to AS-PATH of selected BGP route updates from a destination AS.
Generation of BGP-REFLECT message is triggered during BGP’s Decision Pro-
cess. First, a BGP-REFLECT ADD message is initiated when a route update is
selected for a destination IP prefix. This may be caused by a newly-received BGP
route update or by expiration of the BGP Hold Timer, which checks liveness of its
peer BGP speaker. In the first case, when the newly-received BGP route update is
selected as the best route for the destination IP prefix, the BGP-REFLECT ADD
message for the new route update is instantiated. In the latter case, expiration of the BGP
Hold Timer initiates selection of another best route for the destination IP prefix. For
the newly-chosen route update, BGP-REFLECT ADD message is initiated. Next,
BGP-REFLECT DELETE message is created when a route update, which was se-
lected for a destination IP prefix, becomes invalidated. This can be caused by a
newly-received route update or by expiration of the BGP Hold Timer.
Triggered BGP-REFLECT messages are forwarded back to the destination AS,
which originated the corresponding BGP route update message. BGP consults the AS-
PATH within the BGP-REFLECT message to determine where to forward it. Once the mes-
sage reaches the destination AS, it is destroyed. From the received BGP-REFLECT
message, the destination AS comes to know source reachability.
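The forwarding decision can be sketched in a few lines, under the assumption (consistent with the Figure 3.3 example) that the AS-PATH in a BGP-REFLECT message lists the source AS first and the originating destination AS last:

```python
# Next-hop selection for forwarding a BGP-REFLECT message back toward
# the destination AS along the AS-PATH it carries.
def reflect_next_hop(as_path, my_as):
    """Return the next AS toward the destination, or None if we are it."""
    i = as_path.index(my_as)
    if i == len(as_path) - 1:
        return None            # destination AS: message is consumed here
    return as_path[i + 1]

# With ASes 1, 2, 3 as in Figure 3.3: AS 3 initiates with AS-PATH [3, 2, 1].
hop = reflect_next_hop([3, 2, 1], 2)  # AS 2 forwards on toward AS 1
```

Each intermediate node forwards without modifying the AS-PATH, so the message retraces the advertised route in reverse until the destination AS removes it.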
3.3 Route-based DPF Protocol
3.3.1 Architecture
Based on the discussion in Section 3.2, we define route-based DPF protocol—
DPF-lookup and DPF-update—together with BGP Extension. The DPF-lookup protocol
checks every incoming packet against the semi-maximal filter table to
determine its validity. Filter update is composed of BGP Extension and DPF-update
protocol. As mentioned earlier, BGP Extension disseminates source reachability in-
formation to the entire network. The DPF-update protocol interprets source reacha-
bility from BGP-REFLECT messages received by the BGP Extension, and it updates
semi-maximal filter tables according to the information obtained.
Figure 3.2 shows the overall architecture of the route-based DPF protocol. As
mentioned in Section 3.2, DPF-lookup exists between the network interface layer
and internet layer in the Internet Reference Model, and DPF-update operates at the
application layer on top of TCP. The DPF-update protocol updates semi-maximal
filter tables which are used by the DPF-lookup protocol for filtering at smaller time
scales.
Figure 3.2. Protocol stack of route-based DPF protocol.
3.3.2 DPF-lookup Protocol
The DPF-lookup protocol performs filtering, inspecting every incoming IP packet.
When a packet arrives through a network interface, DPF-lookup fetches its source IP
address. Then, DPF-lookup finds the best matching IP prefix3 in the semi-maximal
filter table corresponding to the network interface. If an entry for a source IP address
stores a positive counter value, which means that the packet with the source IP
address is valid, it is passed to IP for routing. Otherwise, the packet is discarded.
3.3.3 Semi-maximal Filter Table
The DPF-lookup protocol defines semi-maximal filter tables as protocol compo-
nents. A semi-maximal filter table contains an entry for each source IP address. For
reduction of size (the number of entries) of a semi-maximal filter table, it is recom-
mended to maintain entries only for valid source IP addresses. In this case, if there
is no matching entry for a source IP address, it implicitly means that the source IP
address is invalid. Filtering functionality can be enabled selectively. For each network
interface where filtering is enabled, DPF-lookup maintains a separate semi-maximal
filter table.
3.3.4 BGP Extension
BGP Extension is an augmentation of BGP-4 that assists the DPF-update protocol
in calculating semi-maximal filter tables. The main reason for this extension is to overcome
route asymmetry by sending source reachability information back to the originating
destination AS. For this, a new type of message, BGP-REFLECT, is employed.
Figure 3.3 illustrates the BGP Extension mechanism. Let us consider three border
routers: 1, 2, and 3; their AS numbers are 1, 2, and 3, respectively. In this example,
3In this thesis, we assume that each AS has a unique, non-overlapping IP prefix. As mentioned in Section 3.2, some ASes have more than one IP prefix, and some IP prefixes are shared by more than one AS. In that case, this table searching method might cause a safety violation or staleness problem.
Figure 3.3. Illustration of BGP Extension mechanism.
every border router installs BGP-4 for routing, and the BGP Extension functionality
is enabled. First, BGP route updates are propagated to the entire network according
to BGP-4. After establishing a TCP connection, BGP at 1 sends a route update for
itself to BGP at 2. Here, a route update corresponds to a BGP UPDATE message
which includes Network Layer Reachability Information (NLRI) and AS-PATH. NLRI
contains a list of IP prefixes of the triggering AS. On receiving the UPDATE message,
BGP at 2 passes it to Decision Process, and it is selected as the best route to reach 1.
Then NLRI and AS-PATH information are inserted into Loc-RIB (storage for selected
route updates) of BGP at 2, and a new BGP UPDATE message is sent to BGP at 3.
When a new route update is selected and inserted into Loc-RIB, a BGP-REFLECT
message for the route update is created and sent back to the destination. As shown
in Figure 3.4, BGP-REFLECT includes IP prefix, AS-PATH, and TYPE as its major
fields. In this illustration, BGP-REFLECT ADD messages are created for the newly
received route updates. The IP prefix field stores the IP prefix of the source AS. To
complete the AS-PATH representing source reachability, the initiating AS prepends
its AS number to AS-PATH of the received BGP route update. In Figure 3.3, a BGP-
REFLECT message from 2 to 1 and another one from 3 to 2 correspond to instances
of this case.
On receiving a BGP-REFLECT message from its peer, the BGP router for-
wards the message to its upstream router based on the AS-PATH field of the BGP-
REFLECT message. In Figure 3.3, BGP at 2 receives the BGP-REFLECT message
initiated by 3, and it forwards the BGP-REFLECT message without any modifica-
tion. Again, it refers to the AS-PATH field to find out the corresponding upstream
router. When BGP-REFLECT messages arrive at destination ASes, they are removed
and not forwarded any further. In Figure 3.3, BGP at 1 receives two BGP-REFLECT
messages, and they disappear from the network.
BGP-REFLECT Message Format
As a BGP message type, BGP-REFLECT contains a 19-byte BGP message header.
The BGP message header includes a 1-byte Type field. We assign 5 as the type code
of the BGP-REFLECT message.
Figure 3.4. BGP-REFLECT message format.
Figure 3.4 shows the BGP-REFLECT message format. Following is a description
of each field:
• IP prefix length
This 1-byte unsigned integer field specifies the length of a source IP prefix. The
length is represented in bits.
• IP prefix
The source IP prefix is stored in this field. The length of the IP prefix is variable.
For this reason, this field is padded with 0s in order to round its length up to a
whole number of bytes.
• AS-PATH length
This 1-byte unsigned integer field specifies the AS-PATH length. The length is
represented as a count of ASes in the AS-PATH.
• AS-PATH
This field contains a sequence of AS numbers. Each AS number is represented
as a 2-byte unsigned integer. The total length of this field becomes two times
the AS-PATH length.
When a BGP-REFLECT message is created, an initiating source AS’s AS num-
ber is prepended to the AS-PATH of a given route update, starting with the
destination AS. The AS-PATH field must not be modified by forwarding BGP
nodes.
• TYPE
This 1-byte unsigned integer field indicates the type of a BGP-REFLECT
message—ADD or DELETE. The following codes have been defined:
Code Symbolic Name
1 ADD
2 DELETE
A BGP-REFLECT ADD message indicates a source AS from which the destina-
tion AS has become reachable via the given AS-PATH. A BGP-REFLECT DELETE
message indicates a source AS from which the destination AS has become unreachable
via the given AS-PATH. Details of each message type are described in the following
section.
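The field layout above can be exercised with a short encode/decode sketch (excluding the 19-byte BGP header). This is an illustrative serialization based only on the field descriptions in the text; function names are invented:

```python
# Encode/decode the BGP-REFLECT message body: prefix length in bits,
# prefix padded to a byte boundary, AS-PATH length as a count of ASes,
# 2-byte AS numbers, and a 1-byte TYPE (1 = ADD, 2 = DELETE).
import struct

def encode_reflect(prefix_bytes, prefix_len, as_path, msg_type):
    nbytes = (prefix_len + 7) // 8            # prefix padded to byte unit
    body = struct.pack("!B", prefix_len)
    body += prefix_bytes[:nbytes].ljust(nbytes, b"\x00")
    body += struct.pack("!B", len(as_path))
    body += b"".join(struct.pack("!H", asn) for asn in as_path)
    body += struct.pack("!B", msg_type)
    return body

def decode_reflect(body):
    prefix_len = body[0]
    nbytes = (prefix_len + 7) // 8
    prefix = body[1:1 + nbytes]
    n_as = body[1 + nbytes]
    off = 2 + nbytes
    as_path = [struct.unpack_from("!H", body, off + 2 * i)[0]
               for i in range(n_as)]
    return prefix_len, prefix, as_path, body[off + 2 * n_as]

# Example: /16 source prefix 10.1.0.0, AS-PATH [3, 2, 1], type ADD.
msg = encode_reflect(b"\x0a\x01", 16, [3, 2, 1], 1)
```

Note the 2-byte AS numbers reflect the pre-4-byte-ASN era of the text; the total AS-PATH field length is exactly twice the AS-PATH length, as stated above.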
Message Types: ADD/DELETE
Whenever BGP chooses a new route update for a destination, a BGP-REFLECT
ADD message for the route update is generated. When BGP receives a new route
update from its peer, Decision Process starts. If it decides to choose the route, a
BGP-REFLECT ADD message is initiated. In another instance, when the BGP Hold
Timer for a BGP session expires, BGP withdraws all route updates received from the
corresponding peer. In this case, BGP tries to find an alternate route update for each
destination. For the newly-selected route updates, BGP-REFLECT ADD messages
are generated. The IP prefix and IP prefix length fields are filled with the IP prefix of
the source AS. The AS number of this initiating AS is prepended to the AS-PATH of the
route update. At the destination AS, the received BGP-REFLECT ADD message
conveys that the destination AS becomes reachable from the source AS through the
AS-PATH.
A BGP-REFLECT DELETE message is generated when BGP invalidates a route
update for a destination. BGP-REFLECT DELETE can be triggered in the afore-
mentioned situations. First, when BGP selects a newly-received route update for
a destination AS, withdrawing a previously chosen AS-PATH, a BGP-REFLECT
DELETE message is generated for the old route update. On the other hand, when
the BGP Hold Timer for a BGP session expires, irrespective of the existence of an
alternate route, BGP-REFLECT DELETE messages for all route updates are gener-
ated. The IP prefix, IP prefix length, AS-PATH length, and AS-PATH fields are filled
in the same manner as for the BGP-REFLECT ADD message. At the destination AS,
the received BGP-REFLECT DELETE message is interpreted as the destination AS
becoming unreachable from the source AS via the specified AS-PATH.
Reflect Timer
In principle, whenever a new route update is selected and inserted into Loc-RIB,
a BGP-REFLECT message for the route update should be initiated and sent back to
the destination AS. However, to reduce message complexity, it is recommended to
employ a Reflect Timer. In this case, BGP-REFLECT messages are transmitted when
the Reflect Timer expires. BGP checks its Loc-RIB to find route updates for which
BGP-REFLECT messages have not yet been triggered, and it generates BGP-
REFLECT messages for them. When more than one route update for the
same destination is selected within the Reflect Timer interval, a BGP-REFLECT
ADD message is initiated only for the last route update. BGP-REFLECT DELETE mes-
sages are sent out only for invalidated route updates for which BGP-REFLECT
ADD messages were initiated. Hence, we can reduce additional BGP-REFLECT
ADD/DELETE messages for other invalidated route updates. Depending on the Re-
flect Timer interval, the number of BGP-REFLECT messages generated is reduced
as intended. In addition, it affects timeliness of semi-maximal filter updates across
the entire network. Hence, the Reflect Timer has to be tuned carefully.
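The batching rules above can be modeled in a small sketch (class and method names invented; event delivery and the actual timer are abstracted away): within one interval only the last ADD per destination survives, and a DELETE is emitted only if a corresponding ADD was already sent.

```python
# Reflect Timer batching of BGP-REFLECT ADD/DELETE messages.
class ReflectBatcher:
    def __init__(self):
        self.pending_add = {}    # destination -> latest selected route update
        self.advertised = set()  # (dest, route) pairs for which ADD was sent

    def on_route_selected(self, dest, route):
        self.pending_add[dest] = route   # later ADDs overwrite earlier ones

    def on_route_invalidated(self, dest, route):
        msgs = []
        if self.pending_add.get(dest) == route:
            del self.pending_add[dest]   # ADD never sent: suppress both
        elif (dest, route) in self.advertised:
            msgs.append(("DELETE", dest, route))
            self.advertised.discard((dest, route))
        return msgs

    def on_timer_expiry(self):
        msgs = []
        for dest, route in self.pending_add.items():
            msgs.append(("ADD", dest, route))
            self.advertised.add((dest, route))
        self.pending_add = {}
        return msgs
```

A route that is selected and then superseded within one interval generates no traffic at all, which is the message-complexity saving the timer is meant to provide.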
Interface to DPF-update
In the context of defining an interface between BGP Extension and DPF-update,
we assume that BGP Extension includes a flag for identifying a site where route-based
DPF filtering is deployed (a.k.a., filter site). The route-based DPF protocol—DPF-
update and DPF-lookup protocols—is deployed at filter sites, where the flag is set to
true. Hence, BGP can determine whether a router is a filter site or not.
BGP Extension provides a reflect buffer as an interface to the DPF-update pro-
tocol. In case of a filter site, all received BGP-REFLECT messages are stored in
the reflect buffer. Specifically, when a BGP speaker receives a BGP-REFLECT mes-
sage, it inserts a copy of the message into its reflect buffer before forwarding to the
upstream border router.
3.3.5 DPF-update Protocol
We assume that the DPF-update protocol at each filter site ascertains the set of
filter sites deployed. Note that DPF-update can access semi-maximal filter tables by
the AS number of peering ASes, since DPF-lookup maintains filter tables indexed by
the AS number of neighboring ASes.
The DPF-update protocol decodes received BGP-REFLECT messages for calcu-
lating semi-maximal filter tables. As mentioned earlier, BGP-REFLECT messages
are handed over by BGP Extension via a reflect buffer.
Given a BGP-REFLECT message, the DPF-update protocol operates as follows
based on its type code:
• ADD
From the AS-PATH of the BGP-REFLECT message, DPF-update fetches its
immediate downstream AS number. A corresponding semi-maximal filter table
is accessed. IP prefix and IP prefix length information are used to find an entry
for the source AS. Then, the counter value for the entry is incremented. Since
the filter table keeps only valid source IP addresses for reduction of table size
(a property of power-law networks), an entry for a source IP address may not
exist. In that case, DPF-update creates an entry and increments its counter
value.
• DELETE
DPF-update accesses a corresponding semi-maximal filter table based on the
AS-PATH of the BGP-REFLECT message. Using IP prefix and its length, an
entry for the source AS is fetched. Since the semantics of the BGP-REFLECT
DELETE conveys that the source AS does not use the given AS-PATH to reach
this node, DPF-update invalidates its source IP prefix from the filter table
by decrementing the counter. In case when the filter table stores only valid
source IP addresses, we need one more check if the counter becomes zero after
decrement. If this is the case, the entry is deleted.
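The two handlers can be sketched together, under the assumptions that the AS-PATH lists the source AS first and this (destination) node last, and that filter tables are keyed by the AS number of the downstream peer as described in Section 3.3.5; the prefix look-up is simplified to an exact match:

```python
# DPF-update handling of BGP-REFLECT ADD/DELETE messages.
from collections import defaultdict

filter_tables = defaultdict(dict)  # peer AS number -> {source prefix: counter}

def handle_reflect(msg_type, as_path, my_as, src_prefix):
    i = as_path.index(my_as)
    downstream = as_path[i - 1]    # AS the message (and traffic) arrives from
    table = filter_tables[downstream]
    if msg_type == "ADD":
        table[src_prefix] = table.get(src_prefix, 0) + 1
    elif msg_type == "DELETE":
        table[src_prefix] = table.get(src_prefix, 0) - 1
        if table[src_prefix] <= 0:
            del table[src_prefix]  # keep only valid sources in the table
```

Deleting the entry at zero implements the "one more check" described above for tables that store only valid source IP addresses.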
3.4 Improvement of DPF-update Protocol for Fault-tolerance
The BGP-REFLECT message forwarding scheme relies on BGP peering relation-
ships. BGP forwards received BGP-REFLECT messages as dictated by their AS-
PATH field values. This generates a major issue to be considered: fault-tolerance.
When an intermediate border router, which belongs to a path between a certain
source AS and a destination AS, goes down, BGP at the source AS runs Decision
Process and selects another path (if any). As a result of Decision Process, the original
path is invalidated and a BGP-REFLECT DELETE message is initiated. BGP run-
ning at a border router that had a connection with the failed border router loses
that connection. As a result, in the course of BGP-REFLECT message forwarding by
BGP, the BGP-REFLECT DELETE message cannot be forwarded any further. In
this case, upstream filter sites do not receive this message, and it causes filter tables
at the upstream filter sites to contain stale entries.
The new forwarding mechanism offloads forwarding responsibility from BGP onto
DPF-update. DPF-update maintains connections with other filter sites on demand
and forwards BGP-REFLECT messages to its upstream filter sites. However, since
only BGP can make a decision for generation of BGP-REFLECT messages, it relies
in part on BGP. When a BGP-REFLECT message is generated at a non-filter site, it
is forwarded to its upstream router in the same manner as the old mechanism. BGP
follows the old mechanism until the message reaches the first filter site. Then BGP at
the filter site passes the message to DPF-update so that it may start its forwarding
mechanism.
We need to consider the case when an upstream filter site may not be reachable
as well. DPF-update detects failure of its upstream filter site when TCP connection
set-up procedure fails. In this case, it forwards the received BGP-REFLECT message
to the next available upstream filter site. The connection failure, however, might be
caused by temporary network state. In addition, keeping unreachable upstream filter
site information is not suitable for scalability of DPF-update. For these reasons,
DPF-update does not maintain unreachable upstream filter site information for future
use. Whenever a BGP-REFLECT message is received, DPF-update tries to make a
connection with its next upstream filter site. This enhances fault-tolerance without
adversely affecting scalability of the DPF-update protocol.
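The fallback rule can be sketched as follows; the `connect` callback stands in for a TCP connection attempt, and nothing is remembered between calls, matching the stateless design above:

```python
# Fault-tolerant BGP-REFLECT forwarding: try the nearest upstream filter
# site first and fall back to the next one when connection set-up fails.
def forward_reflect(upstream_filter_sites, connect):
    """upstream_filter_sites: upstream filter sites, nearest first."""
    for site in upstream_filter_sites:
        if connect(site):          # TCP set-up succeeded
            return site            # message handed to this filter site
    return None                    # no upstream filter site reachable

# Example: the nearest site "A" is down, so the message skips to "B".
alive = {"B", "C"}
chosen = forward_reflect(["A", "B", "C"], lambda s: s in alive)
```

Because failures are rediscovered per message rather than cached, a site that was only transiently unreachable is retried automatically on the next message.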
4 PERFORMANCE EVALUATION OF ROUTE-BASED DPF
PROTOCOL
As shown in [7], route-based DPF provides a significant degree of proactive and
reactive protection against spoofed DDoS attacks in a static network environment,
i.e., one in which there is no failure or variation of network infrastructure.
Here, network infrastructure ranges from
hardware—such as host, router, or link—to software—such as routing protocol and
name server.
In this chapter, we carry out performance evaluation of route-based DPF protocol
in a dynamic network environment. Contrary to the static network environment, the
dynamic network environment encompasses failure or variation of network infrastruc-
ture. Route-based DPF is active during transient periods of the network while BGP
updates IP routing tables. Time lags in synchronization between the route-based
DPF protocol and BGP may lead to performance degradation. Thus, we measure
and analyze the effectiveness of the route-based DPF protocol during transient peri-
ods.
This chapter is organized as follows. First, we introduce performance measures—
stability, safety violation, staleness, containment, and traceback—for route-based
DPF. Next, the experimental setup is presented. Finally, results for the performance
measures are shown and analyzed in separate sections.
4.1 Overall Objective
During transient periods, route-based DPF may contain incorrect information
which may cause safety violation or staleness. That is, route-based DPF may discard
non-spoofed packets (safety violation) or may not be able to discard spoofed attack
packets (staleness) when it is safe to do so.
The route-based DPF protocol calculates semi-maximal filter tables based on
global knowledge of routing. Thus route-based DPF requires consistent routing infor-
mation with respect to BGP. During transient periods, BGP Extension generates and
handles BGP-REFLECT messages as well as BGP UPDATE messages. Accordingly,
the route-based DPF protocol interprets messages from the BGP Extension, in order
to be consistent with fluctuations of BGP routing itself.
In order to capture potential performance degradation during the transient peri-
ods, we measure effectiveness of the route-based DPF protocol with respect to safety
violation and staleness. We show route-based DPF’s protection performance using
two major performance measures—containment and traceback, as defined in [7]. In
addition, we measure stability of BGP’s routing table calculation as well as that of
the route-based DPF protocol’s filter table calculation. Since the route-based DPF
protocol relies on the underlying BGP for its update events, stability of BGP routing
affects stability of route-based DPF.
4.2 Performance Measures
Let G = (V, E) be an undirected graph representing an AS-level Internet topol-
ogy. BGP computes routes for all pairs of source and destination; let R be the
set of computed routes. A route r—an element of R—is represented as a 3-tuple
<node, destination, nexthop>, where node indicates the routing table to which the
entry belongs; the others denote the destination and next hop. Similarly, route-based
DPF calculates filter tables based on R, and it generates a set of computed filter
entries F. A filter table entry f is represented as a 3-tuple <node, link, source>, where
node and link identify the filter table to which the entry belongs; source denotes the
IP source address in semi-maximal filtering. The existence of an entry for a source
address represents validity. Let E be the set of events which change network topology
configuration. In reality, E ranges from the addition, deletion, or change of BGP routing
policies to the addition, failure, or configuration change of hardware infrastructure (hosts,
routers, or links). In this section, we focus on single AS node failures, which
may result from the failure of a node's border router(s). We define a node failure event e as
a pair < time, node >, where time denotes the time of failure and node represents the
failed node.
Let R0 and F0 be the initial sets of routes and filter entries before an event e in E
occurs. An event e triggers BGP's route update procedure. We assume that BGP
route calculation reaches a steady state, in which no further BGP route updates are
triggered by the event.1 With this assumption, let Rs and Fs be
the set of routes and the set of filter entries in the steady state.
4.2.1 Stability
We define the distance between two sets of routes, Ri and Rj, at two granularities:
with respect to entry and with respect to node. The distance between Ri and Rj with
respect to entry is a scalar between 0 and 1 denoting the fraction of entries
that include inconsistent information; here, inconsistent means that either a route entry
for a destination does not exist in one of the sets or the nexthop information differs.
Similarly, the distance between Ri and Rj with respect to node is a scalar between 0 and 1
denoting the fraction of nodes whose routing tables include at least one inconsistent
entry. When two sets of routes contain exactly the same information, both the distance
with respect to entry and the distance with respect to node are 0.
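As a concrete sketch, the two route distances can be computed as follows. The Python rendering and the set-of-3-tuples encoding are illustrative, not the simulator's actual data structures:

```python
def route_distance(Ri, Rj):
    """Distance between two sets of routes, each a set of
    (node, destination, nexthop) 3-tuples. Returns the
    (entry, node) distances, both scalars in [0, 1]."""
    # Index routes by (node, destination) -> nexthop.
    ti = {(n, d): h for (n, d, h) in Ri}
    tj = {(n, d): h for (n, d, h) in Rj}
    keys = set(ti) | set(tj)
    # An entry is inconsistent if it is missing from one set
    # or its nexthop differs between the two sets.
    bad = {k for k in keys if ti.get(k) != tj.get(k)}
    nodes = {n for (n, d) in keys}
    bad_nodes = {n for (n, d) in bad}
    d_entry = len(bad) / len(keys) if keys else 0.0
    d_node = len(bad_nodes) / len(nodes) if nodes else 0.0
    return d_entry, d_node
```

For example, two identical route sets yield distance (0, 0) at both granularities, while changing one nexthop makes the affected entry and its node inconsistent.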
Assuming E is a finite set, we can deduce the resulting network topology G∗ and
the corresponding set of routes R∗, which is calculated by Dijkstra’s shortest-path
algorithm. Let R∗ be the ideal set of routes. Since BGP may converge to a different
set in its steady state, Rs and R∗ may not be the same. BGP generates a set of routes
Rt at each time instance t as a result of BGP route update exchange. After BGP
1However, this may not be true in reality; the BGP stability problem and its effects on the route-based DPF protocol can be examined separately.
convergence, Rt is the same as Rs. By plotting distances of each set of routes Rt and
R∗ with time, we can observe the evolution of BGP routing calculation and stability.
Given two sets of filter entries, Fi and Fj, we define three distances for
filter table comparison. First, the distance with respect to entry granularity is a
scalar between 0 and 1 denoting the fraction of entries that include inconsistent
filter table information; inconsistent means that either a filter entry for a pair
of a link and a source does not exist in one of the sets or the validity information
differs. Next, the distance with respect to filter granularity is a scalar between
0 and 1 denoting the fraction of filter tables that include at least one inconsistent
entry. Finally, the distance with respect to node granularity is a scalar between
0 and 1 denoting the fraction of nodes whose filter tables include at least one
inconsistent entry.
BGP routing table calculation fluctuates during transient periods, generating new
sets of routes as it handles BGP route update messages. Each BGP route calculation,
in turn, triggers DPF-update's filter calculation. Since route-based DPF's protection
performance is based on the underlying routing state, it is important for DPF-update
to calculate filter tables that are consistent with routing at each time instance.
Hence, we define the consistency of the route-based DPF protocol's filter calculation
at each time instance t as the distance between two sets of filter entries, Ft and F∗t,
where Ft denotes the calculated filter information and F∗t represents the filter
information theoretically determined from the routing Rt. In general, for a given
routing R, we define F∗ as the ideal set of filter entries calculated from
the given R.
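The three filter-table distances can be sketched as follows. Note that this sketch normalizes over the entries present in either set; the indices defined in the next subsections normalize over all possible entries instead. The encoding is illustrative:

```python
def filter_distance(Fi, Fj):
    """Distance between two sets of filter entries, each a set of
    (node, link, source) 3-tuples encoding valid sources. Returns
    the (entry, filter, node) distances, scalars in [0, 1]."""
    diff = (Fi - Fj) | (Fj - Fi)   # entries whose validity differs
    universe = Fi | Fj
    filters = {(n, l) for (n, l, s) in universe}
    bad_filters = {(n, l) for (n, l, s) in diff}
    nodes = {n for (n, l, s) in universe}
    bad_nodes = {n for (n, l, s) in diff}
    d_entry = len(diff) / len(universe) if universe else 0.0
    d_filter = len(bad_filters) / len(filters) if filters else 0.0
    d_node = len(bad_nodes) / len(nodes) if nodes else 0.0
    return d_entry, d_filter, d_node
```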
4.2.2 Safety Violation
Given a set of routes Ri and a set of filter entries Fi at a time instance, we can
deduce the ideal set of filter entries F∗i for the given Ri. Here, safety violation of Fi
for the given Ri is defined as follows:

SV(Ri, Fi) = 0, if F∗i − Fi = ∅; 1, otherwise.
The value 0 represents that Fi for the given Ri is safe. The value 1 represents that
safety of Fi for the given Ri is violated. The intuitive meaning of the safety condition
is that filter tables should contain validity of all unspoofed source addresses, so that
unspoofed packets are not discarded by route-based DPF.
To quantify the degree of safety violation, we examine safety violation at entry,
filter, and node granularity. Let nftr be the number of filter tables, each identified
by a pair of node and link. For given Fi and Ri, the safety violation index at entry
granularity, SVentry, is defined as follows:

SVentry(Ri, Fi) = |F∗i − Fi| / (nftr · |V|)

SVentry is a scalar between 0 and 1 denoting the fraction of entries that cause
safety violation.
The safety violation index at filter granularity, SVfilter, is defined as follows:

SVfilter(Ri, Fi) = |{(u, e) : ∃f = < u, e, s > ∈ (F∗i − Fi)}| / nftr

In this definition, (u, e) represents the filter deployed at node u over the link e.
SVfilter is a scalar between 0 and 1 representing the fraction of filters that include
at least one safety-violating entry.
The safety violation index at node granularity, SVnode, is defined as follows:

SVnode(Ri, Fi) = |{u : ∃f = < u, e, s > ∈ (F∗i − Fi)}| / |V|

SVnode is a scalar between 0 and 1 denoting the fraction of nodes at least one
of whose filter tables contains at least one safety-violating entry.
SVentry provides the finest-grained metric for safety violation. SVfilter and
SVnode show the distribution of safety-violating entries over all filter tables and nodes,
respectively.
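The three indices follow directly from the set difference F∗i − Fi. The following Python sketch assumes the same set-of-3-tuples encoding as before and is illustrative, not the simulator's code:

```python
def safety_violation_indices(F_ideal, F_actual, n_ftr, n_nodes):
    """Safety violation indices at entry, filter, and node granularity.
    F_ideal (F*_i) and F_actual (F_i) are sets of (node, link, source)
    3-tuples; n_ftr is the number of filter tables and n_nodes is |V|."""
    # F*_i - F_i: entries that should exist but are missing.
    missing = F_ideal - F_actual
    sv_entry = len(missing) / (n_ftr * n_nodes)
    sv_filter = len({(u, e) for (u, e, s) in missing}) / n_ftr
    sv_node = len({u for (u, e, s) in missing}) / n_nodes
    return sv_entry, sv_filter, sv_node
```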
4.2.3 Staleness
Given a set of routes Ri and a set of filter entries Fi at a time instance, we can
deduce the ideal set of filter entries F∗i for the given Ri. Staleness of Fi for the given
Ri is defined as follows:

ST(Ri, Fi) = 0, if Fi − F∗i = ∅; 1, otherwise.
The value 0 represents that Fi for the given Ri is not stale. The value 1 represents
that Fi for the given Ri contains stale information; in other words, at least one entry
for an invalid source address is kept in Fi. This implies that DDoS attack packets
whose inscribed source address is that invalid one cannot be discarded.
As in the safety violation case, the degree of staleness is measured with staleness
indices at entry, filter, and node granularity. For given Fi and Ri, the staleness index
at entry granularity, STentry, is defined as follows:

STentry(Ri, Fi) = |Fi − F∗i| / (nftr · |V|)

STentry is a scalar between 0 and 1 denoting the fraction of entries that cause
staleness.
The staleness index at filter granularity, STfilter, is defined as follows:

STfilter(Ri, Fi) = |{(u, e) : ∃f = < u, e, s > ∈ (Fi − F∗i)}| / nftr

STfilter is a scalar between 0 and 1 representing the fraction of filters that
include at least one stale entry.
The staleness index at node granularity, STnode, is defined as follows:

STnode(Ri, Fi) = |{u : ∃f = < u, e, s > ∈ (Fi − F∗i)}| / |V|

STnode is a scalar between 0 and 1 denoting the fraction of nodes at least one
of whose filter tables contains at least one stale entry.
As with the safety violation measures, STentry provides the finest-grained metric
for staleness. STfilter and STnode indicate the distribution of stale entries over all filter
tables and nodes, respectively.
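Since staleness mirrors safety violation, with the set difference reversed (Fi − F∗i instead of F∗i − Fi), the indices admit the same sketch; as before, the encoding is illustrative:

```python
def staleness_indices(F_ideal, F_actual, n_ftr, n_nodes):
    """Staleness indices at entry, filter, and node granularity,
    driven by F_i - F*_i: stale entries kept in the actual tables."""
    stale = F_actual - F_ideal
    st_entry = len(stale) / (n_ftr * n_nodes)
    st_filter = len({(u, e) for (u, e, s) in stale}) / n_ftr
    st_node = len({u for (u, e, s) in stale}) / n_nodes
    return st_entry, st_filter, st_node
```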
4.2.4 Containment
To observe proactive protection (a.k.a. containment) of route-based DPF during
transient periods, we use Φ2(τ) as defined in [7]. Φ2(1) is a scalar between 0 and 1
denoting the fraction of ASes from which no attacker can mount a successful spoofed
DDoS attack targeted at a victim in another AS.
4.2.5 Traceback
We use Ψ1(τ) to measure reactive protection performance (a.k.a. traceback) of
route-based DPF during transient periods. As defined in [7], Ψ1(τ) is a scalar between
0 and 1 for a given parameter τ, denoting the fraction of ASes which, on
receiving a spoofed IP packet, can localize its physical source to within τ sites.
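The formal definition of Ψ1(τ) is given in [7]. As a rough sketch only: if each AS is associated with the size of the candidate-source set to which it can narrow a spoofed packet's origin, the index is the fraction of ASes whose set fits within τ. The mapping name below is hypothetical:

```python
def traceback_index(candidate_set_sizes, tau):
    """Rough sketch of Psi_1(tau): the fraction of ASes able to
    localize a spoofed packet's physical source to within tau sites.
    candidate_set_sizes maps each AS to the size of its candidate
    set; see [7] for the exact definition."""
    n = len(candidate_set_sizes)
    ok = sum(1 for size in candidate_set_sizes.values() if size <= tau)
    return ok / n if n else 0.0
```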
4.3 Experimental Setup
We performed experiments on the 3023-node NLANR [9] measurement topology
dated 11/08/1997. According to [?]’s stub/transit classification on the 3023-node
topology, around 80% of nodes are classified as stub nodes. We place route-based
DPF filters on 580 selected nodes, which form a vertex cover of the given topology.
We considered two single-node failure benchmark scenarios. First, we considered
a degree-one stub AS node connected to the highest-degree node. Due to the
power-law nature of the AS-level Internet topology [40], most AS nodes fall into
this category. The faulty node (AS3) is assigned to the Massachusetts Institute of
Technology, and the connected node, one of the highest-degree AS nodes (AS1),
corresponds to Genuity. Next, we selected a degree-9 transit AS node as the faulty
node. The faulty node (AS3407) is assigned to Interpath, and the nine connected
high-degree AS nodes are AS81, AS286, AS701, AS1239, AS2548, AS2551, AS2914,
AS3561, and AS5413.
One item to note here is that we do not consider the case when a high degree node is
faulty. In that case, basic routing itself will not perform its functionality. Thus, the
route-based DPF protocol, which relies on the underlying routing, cannot function
correctly either.
For both experimental scenarios, the Reflect Timer interval is set to 5 seconds; the
BGP ConnectRetry interval, 120 seconds; the Hold Time interval, 90 seconds; the
KeepAlive interval, 30 seconds; and the MinRouteAdvertisementInterval, 30 seconds.
We executed benchmark simulations from 0 to 5000 seconds. BGP Loc-RIB
tables (storage for the selected route updates, consistent with the local IP routing
table) and DPF filter tables are dumped every 150 seconds.2 When a simulation starts,
BGP and route-based DPF calculate the IP routing tables and DPF filter tables,
respectively. These tables converge at around 300 seconds. The faulty node goes down
at 350 seconds; thus, transient state transitions start at that time.
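For reference, the benchmark parameters above can be collected in one place. This is a restatement of the setup in Python constants, not a fragment of the simulator's DML configuration:

```python
# Protocol timer settings used in both benchmark scenarios (seconds).
TIMERS = {
    "ReflectTimer": 5,
    "BGP_ConnectRetry": 120,
    "HoldTime": 90,
    "KeepAlive": 30,
    "MinRouteAdvertisementInterval": 30,
}

# Simulation schedule (seconds).
SIM_END = 5000           # benchmark simulations run from 0 to 5000 s
DUMP_INTERVAL = 150      # Loc-RIB and DPF filter tables dumped every 150 s
CONVERGENCE_TIME = 300   # initial tables converge at around 300 s
FAILURE_TIME = 350       # the faulty node goes down here

# Times at which table snapshots are taken.
DUMP_TIMES = list(range(DUMP_INTERVAL, SIM_END + 1, DUMP_INTERVAL))
```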
4.4 Stability
Figure 4.1 shows BGP stability as a function of simulation time. In each plot,
“entry count” and “node count” represent distance with respect to entry and distance
2We do not dump routing and filter tables whenever BGP and the route-based DPF protocol effect changes. Instead, we log the intermediate state of routing tables and filter tables periodically.
with respect to node, respectively. From the trajectory of “node count” in Figure
4.1(a), we observe that more than 40% of nodes converge by time 1500 seconds, around
1200 seconds after the start of the transient period. The curve then remains stagnant
for around 300 seconds, and the rest of the nodes converge over the next roughly
700-second period. In the end, BGP routing stabilizes at around 2500 seconds. The
distance with respect to entry is close to 0 during the whole transient period, although
the distance with respect to node granularity is significant. This implies that most
nodes have only a few inconsistent entries throughout the transient period. In the
“node count” trajectory of Figure 4.1(b), we observe that the pattern of BGP
stabilization is similar to that of Figure 4.1(a). The speed of stabilization is slower,
and it stabilizes at around 2800 seconds. Comparatively, the transit node failure case
takes longer for BGP stabilization than the stub node failure case. As with the “entry
count” plot of Figure 4.1(a), the entry-granularity distance is close to 0 during the
whole transient period, which indicates that each node has few inconsistent entries.
[Figure: two panels, (a) a stub node failure and (b) a transit node failure, plotting distance of routing against time with “entry count” and “node count” curves.]
Figure 4.1. BGP routing stability as a function of simulation time.
Figure 4.2 shows consistency of filter tables as a function of simulation time. We
measure the consistency of filter table information at a time instance by measuring
the distance from the filter information theoretically determined for the given routing
at the time instance. Each plot in Figure 4.2 presents the distance with respect to
entry as “entry count”; the distance with respect to table as “filter count”; and the
distance with respect to node as “node count”. In Figure 4.2, we do not observe any
transition in “node count” after the faulty node goes down. The distance with respect
to filter table increases during the initial transient period and stabilizes at around
2500 seconds. In the same way, the distance with respect to entry increases during
the initial period and then stabilizes. The “node count” plot shows that most filter
sites have at least one inconsistent entry. The “filter count” plot, however, shows that
the actual fraction of filter tables holding those invalid entries is less than 20%.
Moreover, the “entry count” plot shows that the fraction of invalid entries is smaller
still. In conclusion, we find that there exist few invalid entries in a few filter tables;
nevertheless, they are dispersed across all the filter sites.
[Figure: two panels, (a) a stub node failure and (b) a transit node failure, plotting distance against time with “entry count”, “filter count”, and “node count” curves.]
Figure 4.2. Consistency of filter tables as a function of simulation time.
As explained in Section 4.2, these differences fall into either the safety violation
category or the staleness category. When there is no entry for a valid source address,
the difference belongs to the safety violation category; when there exists an entry for
an invalid source address, it is classified into the staleness category.
4.5 Safety Violation
Figure 4.3 shows the safety violation indices, SVentry, SVfilter, and SVnode, as a
function of simulation time for the two benchmark scenarios. The plots “entry count”,
“filter count”, and “node count” denote SVentry, SVfilter, and SVnode, respectively. In
both Figure 4.3(a) and Figure 4.3(b), safety violation happens during the transient
period. After BGP's convergence (at around 2500 seconds and 2800 seconds, respectively),
safety violation no longer occurs in either figure. From the “node count”
plots in both figures, we find that around 10% of filter sites violate safety during
the transient periods. However, the distance values of “entry count” and “filter count”
indicate that only a few safety-violating entries are scattered over those filter sites.
[Figure: two panels, (a) a stub node failure and (b) a transit node failure, plotting safety violation against time with “entry count”, “filter count”, and “node count” curves.]
Figure 4.3. Safety violation as a function of simulation time.
4.6 Staleness
Figure 4.4 shows the staleness indices, STentry, STfilter, and STnode, as a function
of simulation time for the two benchmark scenarios. The plots “entry count”, “filter
count”, and “node count” denote STentry, STfilter, and STnode, respectively. In Figure
4.4, “entry count”, “filter count”, and “node count” increase during the transient
period and converge after BGP's stabilization. In both cases, staleness remains after
the transient period. From comparison with Figure 4.2, we find that the inconsistencies
most nodes experienced throughout the simulation period are due to staleness.
As shown in Figure 4.3, safety violation happens only during the transient period,
whereas staleness persists even after it. The “node count” plots in Figure 4.2,
Figure 4.3(a), and Figure 4.4(a) imply that some nodes exhibit both safety violation
and staleness during the transient period. The same observation holds in the transit
node failure case.
[Figure: two panels, (a) a stub node failure and (b) a transit node failure, plotting staleness against time with “entry count”, “filter count”, and “node count” curves.]
Figure 4.4. Staleness as a function of simulation time.
4.7 Containment
Route-based DPF provides proactive protection (a.k.a. containment) against
spoofed DDoS attacks by discarding spoofed IP packets in the first place. In this
section, we analyze containment of route-based DPF during the transient periods in
the presence of safety violation and staleness.
As examined earlier, a node failure causes BGP instability for 2500 to
3000 seconds. During the interim, the route-based DPF filter contains incorrect entries,
which cause safety violation and staleness. However, the measures for safety violation
and staleness do not capture the degree of performance degradation actually
experienced by each source and destination pair.
Figure 4.5 shows the containment index Φ2(1) as a function of simulation time. In
both experimental scenarios, route-based DPF provides around 99% containment.
This means that around 99% of nodes are contained by route-based DPF: a spoofed
DDoS attack launched from those nodes cannot succeed against any victim in other
AS nodes. In particular, in contrast to the wide spread of staleness during and after the
transient periods, we do not observe performance degradation.
[Figure: two panels, (a) a stub node failure and (b) a transit node failure, plotting Φ2(1) against time.]
Figure 4.5. Containment as a function of simulation time.
4.8 Traceback
Although containment by route-based DPF prevents around 99% of nodes from
initiating spoofed DDoS attacks, around 1% of nodes can still mount them.
Reactive protection by route-based DPF (a.k.a. traceback) localizes the
physical sources to within a few sites upon reception of spoofed IP packets.
Figure 4.6 shows trajectories of the traceback index Ψ1(τ) against the resolution
parameter τ. Each trajectory is calculated from the routing and filter information
dumped at the corresponding time instance. Although we do not observe any distinct
pattern where the resolution parameter τ is less than 5, we find that Ψ1(4) approaches
1 in both node failure cases. Over the whole simulation period, upon receiving spoofed
IP packets, route-based DPF can localize their physical source to within 4 nodes.
This property persists even during and after the transient periods.
[Figure: two panels, (a) a stub node failure and (b) a transit node failure, plotting Ψ1(τ) against τ for snapshots taken every 150 seconds from 150 to 3000 seconds.]
Figure 4.6. Traceback as a function of resolution.
Figure 4.7 shows the traceback index Ψ1(τ), for τ from 1 to 5, as a function
of simulation time. Since staleness increases during the transient period, reactive
performance with resolution parameters 2 and 3 diminishes in the stub node failure case;
reactive performance with resolution 2 decreases in the transit node failure case. As
shown in Figure 4.6, nonetheless, Ψ1(4) remains close to 1 for the entire simulation
period. This means that, upon reception of a spoofed IP packet, virtually any victim
can localize the actual attacker to within 4 sites at any time instance.
[Figure: two panels, (a) a stub node failure and (b) a transit node failure, plotting Ψ1(τ) against time for τ = 1 through 5.]
Figure 4.7. Traceback as a function of simulation time.
5 DYNAMIC DPF SIMULATOR
The Dynamic DPF Simulator is a tool for evaluating performance of the route-based
DPF protocol in a dynamic network environment where states of network elements,
such as routers and links, and network protocols may change. Aiming at a realistic
and scalable evaluation of dynamic performance of the route-based DPF protocol, the
Dynamic DPF Simulator is designed to work with large-scale Internet Autonomous
System (AS) measurement graphs. The tools contained therein are applicable to
general network simulation environments, including router graphs and Internet traffic
controls.
The Dynamic DPF Simulator is built on top of DaSSFNet, which provides a net-
work simulation environment for workstation clusters and parallel computers. The
Dynamic DPF Simulator provides additional functionalities, encompassing automatic
partitioning and simulation configuration, various network protocols, and a compre-
hensive measurement framework.
The rest of the chapter is organized as follows. First, we give a brief overview
of DaSSFNet, including a description of its major features. A description of the
Dynamic DPF Simulator as a complete system follows. Each major component is
discussed in its dedicated section.
5.1 Overview of DaSSFNet
DaSSFNet [8] is a DaSSF-based implementation of SSFNet. As a general-purpose
simulation environment, the Dartmouth Scalable Simulation Framework (DaSSF) [39]
provides a C++ implementation of the Scalable Simulation Framework (SSF). In
addition, DaSSF supports shared-memory multiprocessors and distributed-memory
machines as its platform, incorporating advanced parallel simulation techniques. SSF
[50] defines a unified, object-oriented application programming interface (SSF API)
as a standard user interface for discrete-event simulation, considering usability and
performance as its primary design goal. Supporting a process-oriented world-view
of discrete-event simulation, SSF helps make detailed design and implementation
of network models including protocols possible. SSFNet [35] provides simulation
models of various network elements and network protocols on top of a Java-based
implementation of SSF.
In the rest of the section, we describe three key features of DaSSFNet: its
scalable simulation kernel, its process-oriented world-view, and its network simulation
models. Then, a brief introduction to the Domain Modelling Language (DML) for
network specification is presented.
DaSSF, as a base system of DaSSFNet, provides a scalable simulation kernel
along with a network specification language DML. DaSSF can be configured either
as a stand-alone single process application or as a networked distributed application.
As a distributed application, it runs on distributed-memory machines, such as Linux
workstation clusters, the specific environment we use for benchmarking. In the latter
configuration, DaSSF uses the Message Passing Interface (MPI) [51] for synchronization
and communication between system components that implement advanced parallel
simulation techniques. The parallel simulation techniques employed by DaSSF include
distributed event synchronization, thrifty memory usage, and efficient multi-threading.
The threading mechanism reduces both process context-switching overhead and memory
consumption [39]. DaSSF exports an application programming
interface that is compliant with SSF.
A process-oriented world-view is supported by SSF, and hence, DaSSF. The SSF
API includes five primary class interfaces—Entity, inChannel, outChannel, Event,
and Process. Entity is a simulation subject which stores the state of a simulation.
inChannel and outChannel define a way of message passing between Entity objects.
Event represents a form of message exchanged via inChannel and outChannel objects.
Finally, Process is a thread of control, which is scheduled by the simulation framework
nonpreemptively. As an action of a thread, a Process object, which belongs to an
Entity object, can generate and send an Event object to other Process objects. By
providing a similar abstraction of processes as that used in multi-tasking operating
systems, the SSF API supports a familiar process-oriented world-view for modelers.
Thus, we can design, implement, test, and analyze simulation models as we do in
real systems, modulo idiosyncrasies imposed by hardware characteristics.
DaSSFNet comes with a collection of simulation models of network elements and
network protocols, specifically IP and TCP. Adopting the process-oriented world-
view of SSF and its C++ realization for workstation clusters by DaSSF, DaSSFNet
provides a network simulation environment, which is amenable to full-fledged network
protocol implementations. Following is a description of the main components:
• Machine is a logical subject of network simulation components and is modelled
as an Entity object of the SSF API. It consists of zero or more network interfaces
and a network protocol graph which contains installed network protocols.
• NIC, representing a network interface, is modelled as a Process containing an
inChannel object and an outChannel object. The pair of inChannel and outChan-
nel objects reflects the bi-directional nature of communication media and MAC
protocols. A packet is received and sent through these inChannel and outChan-
nel objects. By modelling an interface as a Process, it can detect and handle
incoming packets as soon as the simulation kernel notifies their arrival to the
inChannel of the interface.
• Link models a mapping of inChannel objects and outChannel objects. Accord-
ing to the configuration of a link model, it provides one-to-one or one-to-many
mappings between participating inChannel objects and outChannel objects. For
example, the point-to-point connection between two machines is represented as
two one-to-one mappings between inChannel objects and outChannel objects.
On the other hand, in the case of modelling a Local Area Network (LAN), each
outChannel object is mapped to all inChannel objects within LAN.
• Hardware is modelled as a logical border between network elements and net-
work protocols. All incoming packets received by a network interface, or outgo-
ing packets sent from higher layer network protocols, are passed to a hardware
object. The hardware object passes the incoming packets to the network proto-
col model installed at the lowest layer. Conversely, outgoing packets sent from
the lowest layer protocol are passed to the network interface to be sent via its
outChannel object.
• Network protocols, including IP, TCP, UDP, HTTP, and a BSD-like socket
interface, are implemented as separate C++ classes, borrowing the design phi-
losophy of the x -kernel [52]. First, network protocols, installed on a machine,
compose a graph of network protocols following protocol layering. Next, every
incoming packet is passed to protocols by invoking generic member functions—
send() and receive()—of each protocol according to the network protocol
graph structure. Under these design rules, a full-fledged protocol implementa-
tion is provided.
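The layered send()/receive() dispatch described above can be sketched as follows. This is an illustrative Python rendering of the x-kernel-style call chain, not DaSSFNet's actual C++ classes:

```python
class ProtocolSession:
    """One layer in a machine's protocol graph. Outgoing packets
    move down the stack via send(); incoming packets move up via
    receive(), as in the x-kernel design."""

    def __init__(self, name):
        self.name = name
        self.upper = None  # protocol installed above this one
        self.lower = None  # protocol installed below this one

    def send(self, packet, trace):
        # A real protocol would push its header here, then hand
        # the packet down to the next lower layer.
        trace.append(self.name)
        if self.lower is not None:
            self.lower.send(packet, trace)

    def receive(self, packet, trace):
        # A real protocol would pop its header here, then hand
        # the packet up to the next higher layer.
        trace.append(self.name)
        if self.upper is not None:
            self.upper.receive(packet, trace)


def build_graph(names):
    """Link protocols, listed from highest to lowest layer, into a stack."""
    sessions = [ProtocolSession(n) for n in names]
    for upper, lower in zip(sessions, sessions[1:]):
        upper.lower, lower.upper = lower, upper
    return sessions


# A BGP-over-TCP/IP stack, mirroring the router model's protocol graph.
stack = build_graph(["bgp", "socket", "tcp", "ip"])
down, up = [], []
stack[0].send("update", down)     # outgoing: bgp -> socket -> tcp -> ip
stack[-1].receive("update", up)   # incoming: ip -> tcp -> socket -> bgp
```

Because the whole traversal is a chain of ordinary method calls, it completes in zero simulation time unless a layer explicitly models a processing delay, matching the behavior described below.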
One thing to note here is that message passing via network interfaces is managed
by DaSSF. An instance of class KernelEvents is created and scheduled by DaSSF
when an Event object is sent through an outChannel object. While processing the
scheduled event, DaSSF creates and schedules another instance of class KernelEvents
for its arrival. However, packet handling by network protocol is realized as a chain
of procedure calls between related objects, which model network protocols. In other
words, the scheme does not require DaSSF to be involved for these packet processing
tasks. As a result, packet processing might be finished in zero simulation time un-
less a protocol model implements a form of processing delay. Although this provides
efficiency by reducing the cost for additional event handling related to packet pro-
cessing, it can also be considered a drawback since processing delay in real systems is
ignored. Nonetheless, it provides an abstraction similar to that of network protocol
implementations in typical operating systems, such as Linux and Windows.
DaSSFNet requires a network model—the target of a simulation—to be described
in the Domain Modelling Language (DML) [53]. Figure 5.1 illustrates a simple net-
work specification in DML which consists of one router. Net encloses a configuration
of a network. Similarly, Router encloses a configuration of a router. Within the
router specification in Figure 5.1, a network interface and a network protocol graph
are configured using interface and graph, respectively. The network protocol graph
includes IP, TCP, a BSD-like socket interface, and BGP protocol specifications which
use ProtocolSession. Ordering of protocol specifications within graph implies pro-
tocol layering starting from the highest layer. When more than one protocol exists
above a protocol in the protocol stack (e.g., TCP/UDP on top of IP), a special key-
word child1 is used within the upper layer protocol specifications to indicate the
lower layer protocol as their common lower layer protocol.
DML includes a special keyword, alignment, within Net for describing partition-
ing information. The keyword takes a string value which identifies a partition group
uniquely. More than one network specification can belong to the same partition
group. Figure 5.2 presents how a mapping between partition groups and participating
distributed machines is specified using MapInfo. The keyword nnodes specifies
the number of distributed machines participating in a simulation run. A partition
group, "group1", is mapped to the machine whose identifier is 1. Machine identifiers
are assigned by MPI, ranging from 0 to the number of machines minus 1. By default,
network specifications that do not have any alignment information belong to the
same partition group, which is implicitly mapped to the machine whose identifier is 0.
Figure 5.3 shows an example DML snippet of a point-to-point link. The link
represents a physical connection between two interfaces, whose Network Host Interface
(NHI) [53] addresses are 1:1(0) and 2:2(0). The keyword attach is used to specify
an interface. Here, 1:1(0) means the interface 0 of the host 1, within the network 1.
In the same way, 2:2(0) means the interface 0 of host 2 within network 2.
1Although this keyword was supported in a past version of DaSSFNet, the latest version does not provide it. The Dynamic DPF Simulator has a modified version of DaSSFNet that supports the keyword.
Net[
id 1
alignment "group1"
Router[
id 1
interface[id 0]
graph[
ProtocolSession[
name bgp
use SSF.OS.BGP ]
ProtocolSession[
name socketMaster
use SSF.OS.Socket.socketMaster ]
ProtocolSession[
name TCP
use SSF.OS.TCP.tcpSessionMaster ]
ProtocolSession[
name IP
use SSF.OS.IP ]
]
]
]
Figure 5.1. A simple network specification in DML, which consists of one router.
MapInfo[
nnodes 2
map [ alignment "group1" machid 1 ]
]
Figure 5.2. Mapping of partition groups onto distributed machines.
link[
attach 1:1(0)
attach 2:2(0)
]
Figure 5.3. A DML snippet of a point-to-point link.
5.2 DaSSFNet-based Parallel Network Simulation Environment
5.2.1 Description
As a comprehensive DaSSFNet-based distributed network simulation environment
that takes AS-level Internet graphs as its main input, the Dynamic DPF
Simulator supports the following features essential for scalable simulation:
• Automatic model configuration and topology partitioning
Since the Dynamic DPF Simulator targets AS-level Internet graphs, which are
large and have their own graph representation format, a subsystem, Meta-DML,
is provided as an automatic DML configuration tool. At the same time, Meta-
DML partitions an AS-level Internet graph into separate partition groups, which
are mapped to participating distributed machines. Our own algorithm for partitioning AS-level Internet graphs is employed, which aims to achieve scalable simulation with respect to the size of the input graph and efficient use of distributed memory, CPU, and communication resources.
• Measurement framework
Although the distributed simulation platform supporting automatic network
configuration and power-law topology partitioning enables large-scale network
simulation that efficiently utilizes distributed system resources, a measurement
framework for resource monitoring is needed for analyzing run-time perfor-
mance of the distributed simulation, especially as it pertains to dynamic mem-
ory consumption—a key bottleneck—and event monitoring for diagnosis and
performance analysis. The Dynamic DPF Simulator incorporates a comprehen-
sive measurement framework for large-scale distributed simulation monitoring,
which includes memory consumption, CPU load distribution, and communica-
tion cost. The Dynamic DPF Simulator provides a collection of measurement
routines with a standardized way of specifying measurement configurations in
DML.
• Network protocol models
First, DPF-update and DPF-lookup protocol models are designed and imple-
mented for route-based DPF’s dynamic performance evaluation. Second, BGP
is implemented by porting the Java-based implementation of SSFNet [35]. Since
DaSSFNet does not provide dynamic routing protocol models, such as BGP
or OSPF, a BGP implementation was required for performance benchmarking
on AS-level Internet measurement graphs. Next, a range of application models
are supported. They include traffic generators, attackers, and system fault gen-
erators. Figure 5.4 presents network protocol models supported by the Dynamic
DPF Simulator2.
CBR, Poisson, file trace, MMPP, LRD
{attackers, traffic generators, fault models, ...}
Applications
BGP
TCP UDP
IP
Link Layer
Figure 5.4. Network protocol models in the Dynamic DPF Simulator.
• Modelling AS-level Internet graph
An AS-level Internet graph represents an AS as a node and peering relation-
ships of ASes as edges between nodes. An AS in the Dynamic DPF Simulator
is modelled as a network with one border router. Peering relationships between
ASes are modelled as physical and logical connections between border routers.
2The latest release of DaSSFNet includes the UDP protocol. It was not provided at the time of design and implementation of our prototype system.
A physical connection stands for a physical link between border routers, and a logical connection stands for a peering relationship established by BGP at run-time. Due to the AS-level viewpoint, an AS to which an attacker's machine belongs is modelled as a network whose border router runs the attack application. Similarly, an
AS where a route-based DPF filter is deployed is modelled as a network whose
border router is configured with DPF-update and DPF-lookup protocols. The
specification of the DPF-lookup protocol includes a list of network interfaces
where route-based DPF filters are deployed. Thus, we can selectively deploy
route-based DPF filters at physical links of a border router.
For simplicity, the identifiers of both a network and the border router within the network are the same as that of the corresponding AS node of the graph. Network interfaces at a border router are assigned serial numbers starting from 0.
• IP address assignment
DaSSFNet provides an IP addressing scheme which uses subnet addressing. Subnet addressing is assumed within link and Net specifications. Thus, DaSSFNet
automatically calculates a subnet mask depending on the number of network
interfaces within link or Net. Each network interface is assigned an IP address
by adding a unique identifier within its subnet to the subnet mask calculated.
However, this addressing scheme is not suited for modelling peering relationships
between ASes. As an administrative domain, each AS manages and assigns IP addresses that were allocated to it by InterNIC. Thus, modelling a link between
border routers at different ASes as a subnet is not appropriate. For this reason,
the Dynamic DPF Simulator has its own addressing scheme. Assuming “subnet
addressing” is used within an AS, the most significant 16 bits are used for the
subnet mask, in other words, a common IP prefix. The least significant 16 bits
are filled with a unique identifier denoting an interface.
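As a sketch of this scheme (the function names are our own; the simulator's actual code differs), an interface address carries the AS identifier in the upper 16 bits and the interface identifier in the lower 16 bits:

```cpp
#include <cassert>
#include <cstdint>

// Compose a 32-bit address: high 16 bits = the AS's common prefix (its
// identifier here), low 16 bits = a per-interface identifier.
uint32_t makeAddress(uint16_t asId, uint16_t ifaceId) {
    return (static_cast<uint32_t>(asId) << 16) | ifaceId;
}
uint16_t asOf(uint32_t addr)    { return static_cast<uint16_t>(addr >> 16); }
uint16_t ifaceOf(uint32_t addr) { return static_cast<uint16_t>(addr & 0xFFFFu); }
```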
Figure 5.5 shows the system architecture of the Dynamic DPF Simulator. As men-
tioned earlier, Meta-DML generates a DML configuration file from a given network
partitioning information (i.e., the number of distributed machines), network topology
(i.e., AS-level Internet graph), and additional user-specified protocol configuration
information.
Figure 5.5. System architecture of the Dynamic DPF Simulator.
DaSSFNet takes a DML configuration file as input and creates C++ objects of
network elements and network protocols as specified in the file. Since the DML config-
uration file includes partitioning information, C++ objects of a network specification
are instantiated at a corresponding machine where their partition group is assigned.
Before a simulation starts, there are two phases of preparation for a simulation—
self-configuration and self-initialization. Self-configuration precedes self-initialization.
First, DaSSFNet triggers self-configuration by calling Configure() of the C++ object
of the outermost Net specification, which encloses the whole simulation model. In
turn, it triggers Configure() of all C++ objects of specified models. Next, they
trigger Configure() of all C++ objects of models specified within. In this fashion,
C++ objects of all simulation models are recursively configured. After configuration,
DaSSFNet triggers self-initialization by calling Initialize() of the C++ object of
the outermost Net specification.
Once all C++ objects are created, configured, and initialized by DaSSFNet, sim-
ulation starts. As a discrete-event simulator, DaSSF processes events in the simu-
lation kernel. These events, called kernel events, are generated by applications and
network protocol models. After processing events scheduled at a particular time,
DaSSF schedules Processes nonpreemptively. After scheduling all eligible Processes,
DaSSF advances the current simulation time to that of the nearest future event(s)3.
In case processing of an event requires communication or synchronization with simulation state at other participating machines, DaSSF communicates via MPI to effect coordination.
5.2.2 Automatic Model Configuration and Partitioning
Since the Dynamic DPF Simulator is used with AS-level Internet graphs which
are large and have their own graph representation format, an automatic DML con-
figuration tool is provided as a subsystem. At the same time, Meta-DML partitions
a given AS-level Internet graph into separate partition groups, which are mapped
to participating distributed machines. We devised an algorithm for partitioning AS-level Internet graphs possessing power-law connectivity properties, aimed at achieving
scalable simulation with respect to the size of the input graph through efficient use
of distributed memory, CPU, and communication resources.
Meta-DML accepts network partitioning information (i.e., the number of dis-
tributed machines), network topology (i.e., AS-level Internet graph and transit AS
node information), and additional protocol configuration. For example, additional
3Depending on how the initial triggering of events in a simulation is set up, more than one event may occur simultaneously and is scheduled accordingly.
protocol configuration includes distributed filter configuration, such as a list of DPF filter-deployed ASes, and attack configuration, such as a list of ASes which install attacker models. Parsing a given AS-level Internet graph input, Meta-DML builds an
initial graph representation. Meta-DML partitions the given Internet graph over the
participating distributed machines and stores the information. Next, distributed fil-
ter and attack configuration are processed. Finally, Meta-DML generates a DML file
from the information collected.
Figure 5.6 shows a sample Meta-DML input file. The variable size denotes the
number of nodes in the input graph, machines represents the total number of ma-
chines, and dynamic_1_static_0 takes on the values 0 or 1. When dynamic_1_static_0
is set to 0, Meta-DML generates an additional variable routing_tbl_file within the
IP DML specification, so that IP routing table entries are loaded from a file. Other-
wise, BGP is configured at every node, in order to calculate routing table information
dynamically. The variable filtering_1_nofiltering_0 centrally configures the filtering functionality of each DPF-lookup protocol via the filtering variable of the DPF-lookup DML model. An AS-level Internet graph, represented as an adjacency list,
is given as input specified by graph_input. transit_node_input specifies a list of
transit AS nodes. Nodes that are not listed in the file are configured as stub AS nodes.
The variable filter_node_input takes a list of filter-deployed nodes as argument.
attack_input represents a list of attacker nodes.
Partitioning
Partitioning a given graph for efficiently utilizing distributed memory, CPU, and
communication resources is a critical component of large-scale simulation. As pre-
sented in Figure 5.7, AS-level Internet graphs based on Oregon RouteViews/NLANR
measurements have grown super-linearly during 1998-2003. The measurement topol-
ogy dated 01/01/2002 has 12,514 nodes. With the growth of the number of AS nodes
n, O(n²) memory for maintaining O(n²) routing entries is required in
size 3023
machines 10
dynamic_1_static_0 1
filtering_1_nofiltering_0 1
graph_input ASgraph.3023
transit_node_input AStrans.all
filter_node_input VC.3023
attack_input attack.3023
Figure 5.6. A sample Meta-DML input file.
case each AS has at least one unique IP prefix. Furthermore, simulation scenarios
involving varying traffic patterns impose additional memory requirements as well as
CPU and communication resource requirements.
Figure 5.7. Growth of AS-level Internet graph (number of ASes per year, 1998-2003).
One important feature of distributed simulation with respect to partitioning is
that the slowest participating process determines the execution time of the whole
simulation. This implies that effective load balancing is crucial for parallel speed-up. Furthermore, excessive memory requirements may trigger the virtual memory system, causing swapping between main memory and disk, which has a debilitating effect on distributed simulation. A key requirement of partitioning,
in addition to CPU and communication balancing, is that static and dynamic memory requirements are balanced so that the virtual memory system is kept at bay.
In this thesis, we assume a homogeneous workstation cluster environment, where
each machine has uniform memory, CPU, and communication resources. Since a
major challenge for performing large-scale network simulation comes from significant
memory requirement, we focus on balanced distribution of static and dynamic memory requirements. To do so, we identify key factors involved in memory consumption from experiments and balance their load across participating machines. By harnessing
locality of message exchange within each distributed machine, overhead of synchro-
nization and message exchange between machines is reduced, and utilization of given
CPU and communication resources is increased.
From performance results of memory requirement monitoring in Chapter 6, it
turns out that routing tables, especially Adj-RIB-In tables of BGP, are the most
dominant factor in terms of static memory consumption. The results also show that
messages occupy only a small portion of dynamic memory compared to that of ta-
bles. Thus, we first focus on evenly distributing BGP Adj-RIB-In table’s memory
requirement into participating machines for static memory requirement balancing.
Since every edge in an AS-level Internet graph represents a BGP peering relationship, an AS node u has Adj-RIB-In tables with O(|V| deg(u)) space requirement
where |V | represents the total number of nodes. Denoting the number of edges of
an AS-level Internet graph and the number of participating machines as |E| and
k, respectively, the total number of Adj-RIB-In tables comes to 2|E|, and its space
complexity is O(|V ||E|). Thus, our heuristic algorithm limits the total number of
Adj-RIB-In tables in each partition group to around 2|E|/k.
As presented in [40], AS-level Internet graphs possess power-law connectivity properties. One of the implications is that there exist a few high degree ASes, which possess peering relationships with many low degree ASes. Figure 5.8 shows a 300-node
subgraph of the 3023-node AS-level Internet graph of Oregon RouteViews/NLANR
measurements from 11/08/1997. We observe a few locally “star-like” AS clusters that
are connected by a more complicated “backbone”.
Figure 5.8. 300-node AS-level Internet graph.
Exploiting power-law connectivity in partitioning helps to reduce the frequency of
message exchange between distributed machines, which reduces completion time due
to reduced message exchange overhead. Considering locally star-like AS clusters, all
network traffic generated by, or heading to, many low degree ASes should go through
central high degree AS nodes. Hence, one locally star-like AS cluster is put into one
partition group. This prevents message exchanges within locally star-like AS clusters from being sent across distributed machines, which would incur communication overhead.
The partitioning routine takes a graph G = (V, E) and the number of machines k
as input. It returns k partition groups as output. The entire procedure of partitioning
consists of 4 steps—sorting, phase 0, phase 1, and phase 2.
In the sorting step, V , the set of nodes of the graph G, is sorted by the degree of
each node. Phase 0 is a step that uses the power-law property for partitioning. During
phase 0, the k largest locally star-like AS clusters are assigned to k separate partition
groups. This is shown in Figure 5.9. Checking the adjacency list of each of the k highest degree nodes, degree-1 nodes are put into the same partition group as their adjacent
highest degree node. This enables messages between degree-1 nodes connected by a
central high-degree node to be localized within a partition group.
Phase 1 and phase 2 are steps for balancing the memory requirement with respect
to the total number of Adj-RIB-In tables within one partition group. During phase 1,
connectivity information of the graph is additionally considered to reduce the number
of messages across distributed machines. As seen in Figure 5.10, partition groups are
filled from a group of the kth highest degree node to a group of the first highest
degree node. To fill a partition group, nodes are visited from the (k + 1)th node to
the last node according to the sorted order. When a node has not yet been assigned
into any group, the algorithm checks if the node has connections with any node of the
group. If this is the case, the algorithm considers if the degree of the node makes the
total number of Adj-RIB-In tables of the group exceed a limit. If the resulting total number of Adj-RIB-In tables is within the limit, the node is assigned into the partition group. In this greedy fashion, each partition group is filled in order. Consequently, there may be residual nodes due to the algorithm's greedy nature.
The last step, phase 2, is for partitioning the residual nodes which have not been assigned by the end of phase 1. Nodes are assigned into partition groups, where the total
number of Adj-RIB-In tables have not yet reached the limit, 2|E|/k, without regard
to their connectivity.
phase0
. Assign the k highest degree nodes into k different partitions
for i ← 1 to k
do partition[i].lists← partition[i].lists ∪ node[i];
partition[i].nedges← deg[i];
. Select degree-1 nodes from adjacency list of the k highest degree nodes
for i ← 1 to k
do for all j ∈ adj[i]
do if deg[j] = 1
then partition[i].lists← partition[i].lists ∪ node[j];
partition[i].nedges← partition[i].nedges + deg[j];
Figure 5.9. Pseudo code of phase 0.
phase1
. Use connectivity information and check for memory threshold
for i ← k to 1
do for j ← k + 1 to |V|
do if (node[j] is not assigned yet)
then for all l ∈ partition[i].lists
do if (node[j] ∈ adj[l]) and
((partition[i].nedges + deg[j]) < 2*|E|/k)
then partition[i].lists←
partition[i].lists ∪ node[j];
partition[i].nedges←
partition[i].nedges + deg[j];
break;
Figure 5.10. Pseudo code of phase 1.
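Putting the four steps together, the heuristic can be sketched as the following self-contained C++ routine. This is our own illustrative reimplementation, not the simulator's code; in particular, the fall-back to the last group in phase 2 is an assumption we make for the case where every group has already reached the 2|E|/k budget:

```cpp
#include <algorithm>
#include <numeric>
#include <vector>

// Illustrative sketch of the partitioning heuristic: adj[u] lists the
// neighbors of node u; k is the number of machines. Returns a group id
// (0..k-1) per node.
std::vector<int> partitionGraph(const std::vector<std::vector<int>>& adj, int k) {
    int n = static_cast<int>(adj.size());
    long long totalEdges = 0;
    for (const auto& a : adj) totalEdges += static_cast<long long>(a.size());
    totalEdges /= 2;                        // each edge counted from both ends
    long long limit = 2 * totalEdges / k;   // Adj-RIB-In table budget per group

    // Sorting step: node indices in decreasing degree order.
    std::vector<int> order(n);
    std::iota(order.begin(), order.end(), 0);
    std::sort(order.begin(), order.end(),
              [&](int a, int b) { return adj[a].size() > adj[b].size(); });

    std::vector<int> group(n, -1);          // -1: unassigned
    std::vector<long long> load(k, 0);      // #tables (= sum of degrees) per group

    // Phase 0: seed each group with one of the k highest-degree nodes and
    // pull in its degree-1 neighbors (a locally star-like cluster).
    for (int i = 0; i < k; ++i) {
        int hub = order[i];
        group[hub] = i;
        load[i] += static_cast<long long>(adj[hub].size());
        for (int v : adj[hub])
            if (adj[v].size() == 1 && group[v] == -1) {
                group[v] = i;
                load[i] += 1;
            }
    }

    // Phase 1: greedily add nodes adjacent to a group, within the budget,
    // filling groups from the kth seed down to the first.
    for (int i = k - 1; i >= 0; --i)
        for (int j = k; j < n; ++j) {
            int u = order[j];
            if (group[u] != -1) continue;
            bool adjacent = false;
            for (int v : adj[u]) if (group[v] == i) { adjacent = true; break; }
            if (adjacent && load[i] + (long long)adj[u].size() < limit) {
                group[u] = i;
                load[i] += static_cast<long long>(adj[u].size());
            }
        }

    // Phase 2: place residual nodes into any group still under the budget
    // (falling back to the last group if all are full -- our assumption).
    for (int u = 0; u < n; ++u)
        if (group[u] == -1)
            for (int i = 0; i < k; ++i)
                if (load[i] < limit || i == k - 1) {
                    group[u] = i;
                    load[i] += static_cast<long long>(adj[u].size());
                    break;
                }
    return group;
}
```

On a graph of two degree-1 stars whose hubs are joined by an edge, phase 0 alone places each star in its own group, localizing all star-internal message exchange.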
5.2.3 Measurement Framework
Memory Requirement Analysis
In general, a network protocol model needs to maintain two categories of state
information—tables and messages. For example, routing related tables are generally
required in any network simulation, and we can roughly estimate necessary amount
of memory resource statically. On the other hand, protocol messages are created
and deleted depending on each protocol’s own semantics or protocols’ interaction
under current state of the network being simulated. Hence, the amount of memory
requirement is determined dynamically, and it is difficult to approximate in advance.
We refer to protocol message related information as message complexity and table related information as table complexity. The Dynamic DPF Simulator logs message
complexity of IP packet, TCP segment, and BGP message; table complexity of IP
routing table and BGP’s internal tables—Adj-RIB-In and Loc-RIB tables. The Dy-
namic DPF Simulator leaves a log at the time when a message or table entry is created
and the object’s size within its constructor; the time when a message or table entry is
deleted and the object’s size within its destructor. The trace file is used for deducing
the total number of objects created, the total number of objects existing at a time
instance, and the total memory consumption by them at the time instance.
The total memory consumption by protocol messages and tables at each time
instance indicates the amount of memory resources required for completion of the
given simulation. The memory requirement corresponds to the maximum value of
the total memory usage during simulation. Logging memory usage at each protocol
layer, we can identify memory requirement of messages and tables for each protocol
layer. Besides, it helps to understand dynamic state transition of network protocols.
For example, table complexity of IP tables and BGP tables indicates BGP’s routing
table calculation state—its transition and stability.
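The constructor/destructor instrumentation described above can be sketched as follows. The class names are ours, and we keep running totals in memory rather than writing a trace file as the simulator does:

```cpp
#include <cassert>
#include <cstddef>

// Illustrative per-category log of object creation and deletion.
struct MemLog {
    long liveObjects = 0;            // objects existing at this instant
    std::size_t liveBytes = 0;       // their total size
    std::size_t peakBytes = 0;       // maximum over the run = memory requirement
    void created(std::size_t sz) {
        ++liveObjects;
        liveBytes += sz;
        if (liveBytes > peakBytes) peakBytes = liveBytes;
    }
    void deleted(std::size_t sz) { --liveObjects; liveBytes -= sz; }
};

MemLog ipPacketLog;  // one log per message/table category, e.g. IP packets

struct IpPacket {    // hypothetical message class instrumented as described
    char payload[64];
    IpPacket()  { ipPacketLog.created(sizeof(IpPacket)); }
    ~IpPacket() { ipPacketLog.deleted(sizeof(IpPacket)); }
};
```

The peak of the running total over a simulation run then corresponds to the memory requirement discussed above.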
CPU Load Monitoring
The Dynamic DPF Simulator monitors simulation kernel-level event objects’ cre-
ation and deletion, and it logs their creation/deletion time, their size, and their type
information. Discrete-event simulation frameworks, including DaSSF, generate and handle internal events to advance simulation time. Although there exist differences in processing events depending on their type, the total number of events generally reflects CPU load. Hence, we can infer CPU load balance by comparing the total number of events at each participating machine.
One item to note here is that there exists additional CPU load, which is introduced
by the process-oriented world view of DaSSF. DaSSF provides an abstraction similar to processes in an operating systems context, so that users can design, implement, test,
and analyze simulation models as they do in real systems. Internally, DaSSF handles
event(s) which is scheduled to occur at a simulation time instance. Before advancing
the current simulation time to that of the next nearest future event(s), it schedules
all the eligible processes nonpreemptively. It causes nondeterministic computing load
per process depending on its execution flow. By monitoring CPU time over the course
of simulation, we can take this nondeterministic computing load into account.
Communication Cost Analysis
The Dynamic DPF Simulator’s detailed measurement of DaSSF’s internal events
also provides accurate and efficient run-time monitoring of communication cost. Fig-
ure 5.11 illustrates local and remote message passing procedures. An IP packet sent
through a network interface card (NIC) causes an outChannel type event to be sched-
uled. In case when the recipient side is partitioned within the same partition group,
another event object of type inChannel is scheduled while processing the outChan-
nel object. Processing the inChannel event, simulation kernel notifies arrival of the
IP packet to the recipient side NIC. During local message passing procedure, one
inChannel object and one outChannel object are instantiated.
On the other hand, when the recipient is present at a remote node, the simulation kernel schedules a Channel type event while processing the outChannel event. Handling the Channel event, the simulation kernel sends a message to the remote node through MPI. When the remote node receives the message, it schedules an inChannel event. Finally, the simulation kernel at the recipient side notifies the IP packet's arrival to the receiving side NIC while processing the inChannel event. Hence, from the fraction of the number of Channel events processed to the number of outChannel (or inChannel) events processed, we can observe the degree of remote message passing.
Figure 5.11. Local and remote message passing procedures.
Note that the communication cost incurred by synchronization is not taken into
account in this analysis. Before simulation starts, DaSSF statically calculates the synchronization interval from the given network configuration, including link latency. Hence, irrespective of the degree of remote message passing, simulations which have the same network
configuration require the same amount of fixed synchronization overhead.
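The degree of remote message passing can thus be computed directly from the event counts; a sketch with hypothetical names:

```cpp
#include <cassert>

// Every send schedules one outChannel event, and only sends that cross
// machines additionally schedule a Channel event, so the ratio below
// measures the degree of remote message passing.
double remoteFraction(long channelEvents, long outChannelEvents) {
    if (outChannelEvents == 0) return 0.0;  // no sends observed
    return static_cast<double>(channelEvents)
         / static_cast<double>(outChannelEvents);
}
```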
Measurement Methodology
We characterize measurement methodology by grouping it into canonical, sampling-based, time-based, or hybrid (sampling-based and time-based) measurement. In canonical measurement, a variable of interest is logged4 whenever an event that changes the state of the variable occurs. However, as the level of detail and scale of simulation increase, the number of event occurrences increases drastically.
Hence, giving up minute details of information, techniques such as sampling-based or time-based logging are applied. Sampling-based measurement logs the variable of interest once out of a specified number of event occurrences. On the other hand, time-based measurement logs the variable at a specified time interval. Sometimes,
event occurrences are not evenly distributed along simulation time. In other words,
most events might occur in a short period of simulation time, so that the sampling-based method might lose information which is rare. However, this information can
be important in terms of temporal trajectory of the variable of interest. In this
case, sampling and time-based method can be used together. That is, the variable of
interest can be logged when either a specified amount of events have occurred or a
specified time interval has passed.
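The hybrid method can be sketched as a small class. HybridLogger and its interface are our own construction; its two thresholds play the roles of the sampling count and time interval just described:

```cpp
#include <cassert>

// Log when either sampleRate events have occurred since the last log, or
// timeConst simulated seconds have elapsed since the last log.
class HybridLogger {
    long sampleRate_;
    double timeConst_;
    long eventsSinceLog_ = 0;
    double lastLogTime_ = 0.0;
public:
    HybridLogger(long sampleRate, double timeConst)
        : sampleRate_(sampleRate), timeConst_(timeConst) {}
    // Returns true when this event occurrence should be written to the log.
    bool onEvent(double simTime) {
        ++eventsSinceLog_;
        if (eventsSinceLog_ >= sampleRate_ ||
            simTime - lastLogTime_ >= timeConst_) {
            eventsSinceLog_ = 0;
            lastLogTime_ = simTime;
            return true;
        }
        return false;
    }
};
```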
DML Specification for Measurement Routines
The Dynamic DPF Simulator supports measurement routines for the aforementioned use. In addition, it proposes a generic way of specifying control variables in the input DML file.
The Dynamic DPF Simulator supports two types of measurement scope—global
and local. Global scope measurement measures the state of the entire network from the perspective of a specific network protocol or of the simulation kernel. On the
other hand, in local scope measurement, events related with incoming or outgoing
4Here, log means to write into a specified file. These categories are mainly focused on the action of logging.
packets are measured from a local Machine's view. Hence, in order to obtain global
information from local measurement data, results logged at all Machine objects should
be collected and processed additionally.
In case of global measurement, the Dynamic DPF Simulator requires a sepa-
rate description enclosed with global_measurement. Figure 5.12 shows a DML
snippet for global measurement. The simulator classifies global measurement into
DaSSF_kernel_level and DaSSFNET_non_kernel_level. Note that measurement
from the viewpoint of simulation kernel belongs to global measurement, because
there is no concept of a local Machine at the kernel’s point of view. DML speci-
fications of measurement routines are listed within these classification boundaries.
For example, a measurement routine, kernel_event_count, monitors the number of
KernelEvent objects created cumulatively. This routine takes four control variables—
ON_OFF, SAMPLE_RATE, TIME_CONST, FILE. ON_OFF enables or disables the measure-
ment routine by taking ON or OFF as its value. Its default value is OFF. SAMPLE_RATE
provides a way of logging variables using the sampling-based method. It takes an integer value, and the routine logs these variables once out of a specified number of event occurrences. By default, the value is set to 1. In other words, these variables are
logged whenever any of those variables is changed. TIME_CONST provides a way of
logging variables using time-based method. It takes a real number, whose unit is in
simulation second. It corresponds to time interval of logging, and its default value
is 1 second. FILE takes name of a file as its value, and it is set to kevents_msr by
default. The same convention of control variables holds to other DML descriptions.
Since local measurement provides a way of measuring events related with incoming
or outgoing packets at a local Machine’s view, routines in this category are specified
as a part of each Machine model's DML description. Figure 5.13 shows a DML snippet for local measurement at IP. Depending on the viewpoint of measurement (per network protocol), a set of control variables is provided by default. Hence, one can
use them to customize measurement routines. The meaning and usage of the four control variables (MSR_ON_OFF, MSR_SAMPLE_RATE, MSR_TIME_CONST, MSR_FILE) are the
global_measurement[
DaSSF_kernel_level[
kernel_event_count[
ON_OFF ON
SAMPLE_RATE 1
TIME_CONST 1
FILE kernelevt
]
]
DaSSFNET_non_kernel_level[
# other measurement routines at non-kernel level come here.
]
]
Figure 5.12. A DML snippet for global measurement
same as those of the global measurement (ON_OFF, SAMPLE_RATE, TIME_CONST, FILE),
respectively. Note that there are additional control variables—NI_MSR_ON_OFF and NI_MSR_FILE—for measurement at the network interface layer of the Internet Reference Model. It monitors a packet queue in the network interface layer, and leaves a log whenever a packet is enqueued, dequeued, or dropped.
ProtocolSession [
name IP
use SSF.OS.IP
MSR_ON_OFF true # supported by all protocol models
MSR_SAMPLE_RATE 1 # supported by all protocol models
MSR_TIME_CONST 1 # supported by all protocol models
MSR_FILE ip_msr.txt # supported by all protocol models
NI_MSR_ON_OFF true # only for measurement at network interface
NI_MSR_FILE layer2_msr.txt # only for measurement at network interface
]
Figure 5.13. A DML snippet for local measurement at IP.
5.2.4 Protocol Modeling
The Dynamic DPF Simulator supports additional protocol models beyond the ones provided by DaSSFNet. Some of them, such as BGP, UDP, and various application
models, are useful for general purpose network simulations. Others, such as DPF-
update and DPF-lookup models, are designed for evaluating dynamic performance of
the route-based DPF protocol. In this section, we describe these protocol models.
BGP-4 and its extension
DaSSFNet does not provide BGP, the Internet inter-domain routing protocol. Although it provides an indirect way of specifying routing table information as part of an input DML file, simulations which require dynamic alteration of routes according to the time-varying state of the network are infeasible. We have implemented the E-BGP part of BGP-4, based on BGP-4's RFC [48] and its protocol model in SSFNet [35].
One item to mention here is that the BGP-4 extension is designed together with
BGP-4 as a single BGP-4 model. To enable or disable the functionality of the BGP-4
extension, an additional parameter, BGP_reflect, is employed. An example of DML
for BGP-4 is described at the end of this section.
The BGP-4 protocol model is implemented as class bgp. As shown in Figure 5.14,
BGP-4 sits on top of TCP for reliable transport of its messages. BGP maintains one
or more peering relationships to other BGP speakers, and each peering relationship is
maintained as an instance of class neighbor. Thus each BGP protocol instance has a separate neighbor object per peer. Each neighbor object creates two separate processes to handle sending and receiving BGP messages to and from its corresponding peer. The two processes communicate according to the states and events of BGP-4, as defined in RFC 1771 [48], and buffers between a bgp instance and its neighbor instances store BGP UPDATE and BGP REFLECT messages to send.
Figure 5.14. An illustration of a peering relationship between one border router inAS 1 and another in AS 2.
Three types of tables are defined in BGP-4 RFC—Adj RIB In, Loc RIB and
Adj RIB Out. In this model, there exist Adj RIB In and Loc RIB only. Without
having separate tables for advertisement, Loc RIB is used as Adj RIB Out simul-
taneously. This is because our advertisement policy allows all newly received BGP
UPDATE messages to be advertised to peers. Both Loc RIB and Adj RIB In use a trie as their internal data structure, so that IP prefix searching can be done quickly.
Figure 5.15 depicts the architecture of BGP protocol model with its tables. A BGP
protocol instance may have more than one peering relationship. A bgp instance has a central Loc RIB table, which stores route information in the form of Loc_RIB_data. On the other hand, each neighbor object has its own Adj RIB In table to keep route information from incoming BGP UPDATE messages.
There are 4 types of messages in BGP-4: OPEN, KEEPALIVE, UPDATE and
NOTIFICATION.5 As shown in Figure 5.16, all these messages are derived from a base
class bgp_message, which contains type and length fields of message header. Please
refer to RFC1771 [48] for a detailed description of each message. To support the extension of BGP-4, the BGP REFLECT type message is defined as class reflectmessage. It is also derived from class bgp_message.
5NOTIFICATION type message is not supported in this model.
Figure 5.15. The architecture of BGP protocol model with its tables.
Figure 5.16. The class hierarchy of BGP message types.
The BGP-4 protocol model employs four timers for BGP-4 and one additional timer for its extension. The ConnectRetry, Hold, KeepAlive, and MinRouteAdvertisementInterval (MRAI) timers are used for BGP-4, and the ReflectSend timer is used for its extension, which is defined as the Reflect timer in Section 3.3.4. Figure 5.17 illustrates the architecture of the BGP protocol model with its timers. First, Keepalive_timer is
used to send BGP KEEPALIVE messages periodically. Whenever Keepalive_timer expires, the Handle_Send instance sends a BGP KEEPALIVE message to its peer. Next,
the Handle_Listen instance uses Hold_timer for two purposes. During the BGP Connect and Active states, it is used as the ConnectRetry timer. On the other hand, it is used as the Hold timer during the BGP OpenConfirm and Established states. When the ConnectRetry timer expires, the timer is reset and the transport connection is initiated. When the Hold timer expires, the transport connection is closed and resources are released. Both
Keepalive_timer and Hold_timer exist per BGP peer, managing separate peering
relationship with each peer. MRAI timer is responsible for initiating BGP UPDATE
messages periodically. Finally, ReflectSend timer is used to trigger BGP RELECT
messages. Both MRAI_send_timer and Reflect_send_timer are managed centrally
by BGP protocol instance.
Figure 5.17. The architecture of BGP protocol model with its timers.
Figure 5.18 shows a DML snippet of the BGP-4 model. The string identifier of the
BGP-4 protocol model, SSF.OS.BGP, is given as the value of the keyword use. The
keyword autoconfig takes either true or false as a value. By setting it to true, BGP
automatically initiates peering relationships with all BGP instances on directly
connected machines. Otherwise, we can set peering relationships manually, using
additional keywords for manual configuration. The keywords connretry_time,
keepalive_time, hold_time, and mrai_time set the ConnectRetry, KeepAlive, Hold,
and MRAI timer intervals, respectively. As mentioned in Section 3.3.4, our BGP-4
model supports configuration of an AS either as a stub AS or as a transit AS. The
keyword stub is used for this configuration. Our BGP-4 model also supports dumping
the entries of the Loc-RIB table into a file; the keyword table_dump enables or
disables this functionality. The keywords BGP_reflect, reflect_timer,
and ref_start_time control the functionalities of the BGP extension. By
setting the value of BGP_reflect to true or false, we enable or disable
the functionalities, respectively. We can also schedule the start of the function-
alities by setting a value for ref_start_time. The keyword reflect_timer
sets the ReflectSend timer interval.
ProtocolSession [
name bgp # name of protocol
use SSF.OS.BGP # identifier of protocol
autoconfig true
connretry_time 120 # ConnectRetry timer interval
keepalive_time 30 # KeepAlive timer interval
hold_time 90 # Hold timer interval
mrai_time 30 # MRAI timer Interval
stub false # stub or Transit
table_dump true # dumps table at the end
BGP_reflect false # BGP REFLECT ON/OFF
ref_start_time 1 # BGP REFLECT start time
reflect_timer 1 # BGP REFLECT interval
]
Figure 5.18. A DML snippet of BGP-4 model.
DPF-update
The DPF-update model provides two major functionalities—local filter table up-
date and BGP REFLECT message forwarding. DPF-update interacts with BGP
Extension as shown in Figure 5.19.
Figure 5.19. Interaction of BGP and DPF-update.
BGP Extension passes received BGP REFLECT messages to DPF-update through the
Reflect buffer. First, every BGP REFLECT message in the Reflect buffer is interpreted by
DPF-update in order to increment or decrement the counter value of the corresponding
entry. Next, DPF-update finds the next upstream filter site from the AS-PATH of the
received message. For this, DPF-update takes a list of route-based DPF filter sites as
input. Subsequently, DPF-update attempts to make a connection to the upstream
filter site. If the connection is established, the received BGP REFLECT message
is forwarded. Otherwise, regarding that upstream filter site as an unreachable node,
DPF-update tries to find and connect to the next available upstream filter
site. If no further upstream filter site exists, the received BGP REFLECT message is
discarded.
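The forwarding rule above can be sketched as a small helper that walks the AS-PATH upstream and picks the first reachable filter site. This is an illustrative sketch under assumed names (nextUpstreamFilterSite, a canConnect callback), not the simulator's actual API:

```cpp
#include <cassert>
#include <functional>
#include <set>
#include <vector>

// Walk the AS-PATH of a received BGP REFLECT message upstream and return
// the first AS that is both a configured DPF filter site and reachable.
// asPath[0] is the local AS; higher indices are further upstream.
// Returns -1 if the message should be discarded. Illustrative names only.
int nextUpstreamFilterSite(const std::vector<int>& asPath,
                           const std::set<int>& filterSites,
                           const std::function<bool(int)>& canConnect) {
    for (std::size_t i = 1; i < asPath.size(); ++i) {
        int as = asPath[i];
        if (filterSites.count(as) == 0) continue;  // not a DPF filter site
        if (canConnect(as)) return as;             // forward the REFLECT here
        // Unreachable: treat as a failed node and try the next upstream site.
    }
    return -1;                                     // no site left: discard
}
```

The fallback behavior, skipping an unreachable site in favor of the next one further upstream, is what keeps filter updates propagating around failed nodes.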
Figure 5.20 shows a DML snippet of DPF-update model. It includes the keyword
filter_site which contains a list of route-based DPF filter sites, where all filter
sites’ AS numbers are specified using the keyword as.
ProtocolSession [
name DPF_update # name of protocol
use SSF.OS.DPF_update # identifier of protocol
filter_site[ # route-based DPF filter sites
as 1
as 2
]
]
Figure 5.20. A DML snippet of DPF-update model.
DPF-lookup
DPF-lookup provides route-based Distributed Packet Filtering (DPF) function-
ality per network interface, referring to the filter table of each network interface.
DPF-lookup is modelled as a class DPF_lookup, which includes a set of filter tables.
To verify the validity of the source address of an incoming IP packet, DPF-lookup needs
to search the filter table of the specific link on which the packet arrives, using the
packet's source address as a key. Thus, each filter table uses a trie as its data structure
for fast searching.
When a packet is passed up from the lower layer protocol, DPF-lookup checks its
source address as described above. If the packet turns out to be valid, it is
passed up to the protocol layer above, IP. Otherwise, the packet is discarded. When
a packet is passed down from the higher layer protocol, DPF-lookup simply forwards
the packet to the lower layer protocol.
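The per-interface decision can be summarized in a few lines. In this sketch, plain sets stand in for the per-link tries, and the names (DpfLookup, acceptFromBelow) are illustrative rather than the simulator's actual interface:

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <set>

// Sketch of DPF-lookup's decision: a packet arriving from the link layer is
// passed up to IP only if its source address is valid according to that
// interface's filter table; outgoing packets always pass through unchanged.
struct DpfLookup {
    // Filter table per interface: source addresses valid on that link.
    std::map<int, std::set<std::uint32_t>> filterTable;

    // Called when a packet comes up from the link layer on interface `nic`.
    bool acceptFromBelow(int nic, std::uint32_t srcAddr) const {
        auto it = filterTable.find(nic);
        if (it == filterTable.end()) return true;  // no filter deployed here
        return it->second.count(srcAddr) != 0;     // invalid source: discard
    }
    // Packets from the layer above (IP) are always forwarded down.
    bool acceptFromAbove() const { return true; }
};
```

Interfaces without a deployed filter pass everything, which matches the selective per-interface deployment described below.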
Figure 5.21 provides an example DML snippet of the DPF-lookup model. The key-
word nic specifies a network interface where the route-based DPF filter is deployed;
in other words, we can selectively deploy the route-based DPF filter on particular
network interfaces of a router. The value of the keyword is an identifier of a network
interface. The keywords filtering and discard_time control the functionality for
convenience. One can turn off the filtering functionality by setting the value
of filtering to false; otherwise, the filtering functionality is turned on by default.
In addition, one can defer the start of the filtering functionality to a future time by
using filtering and discard_time together: as in Figure 5.21, one sets filtering
to true and sets the intended time as the value of discard_time.
ProtocolSession [
name DPF_lookup
use SSF.OS.DPF_lookup
nic 0
nic 1
filtering true
discard_time 100
]
Figure 5.21. A DML snippet of DPF-lookup model.
UDP
A UDP model6 is incorporated in order to support a connectionless transport
protocol. The given Socket API model is modified to support this UDP model;
an application model needs the Socket API model to interact with the UDP model. A
number of the application models supported by the dynamic DPF simulator use the UDP
model as their transport layer protocol. Figure 5.22 shows its DML specification.
The keyword use takes its string type identifier, UDP.
Applications
The supported application models range over traffic generator, attacker, and system fault
generator models. First, five different traffic generator models are provided with
6The latest release of DaSSFNet includes a UDP protocol. However, it had not been provided by the time we finished designing and implementing our prototype system.
ProtocolSession [
name UDP
use UDP
]
Figure 5.22. A DML snippet of UDP model.
respect to their traffic distribution: Poisson, Constant Bit Rate (CBR),
file trace, Markov modulated Poisson process (MMPP), and Long-Range Dependent
(LRD). The Poisson, CBR, MMPP, and LRD generators use probability models for traffic
generation, as their names indicate; the file trace generator takes its traffic distribution
from a file. Traffic generators use UDP as their transport layer protocol. Figure 5.23
provides an example DML snippet of the CBR traffic generator. The keyword name takes
a string argument as its name, which can be any string. The keyword use takes a string
identifier of a protocol model, trafficgen_cbr, which is unique per protocol model.
The keywords receiver and receiverport take the address and port number of the
receiver. Since the IP address of an interface is assigned during the initialization phase
of a simulation, the IP address of the receiver is represented in Network Host Interface (NHI)
address format, which is defined in DML [53]. Here, 1:1(0) represents interface 0
of machine 1 in network 1. The start time and end time of traffic generation
are specified in seconds. The keyword traffic_rate takes a real number in Mbps.
The packet length is specified using packet_length, in bytes. The keyword
measurement_file takes a file name for recording traffic generation measurements. By
setting the keyword show_report to true, we can observe the status of traffic generation
on the console. Other types of traffic generators follow the same pattern as Figure
5.23, while taking additional parameters depending on their models.
We support flooding-based attacker models, which support IP source address
spoofing. The flooding-based attacker models are derived from the above traffic gen-
erators. In the same fashion as the traffic generators, there are five types of flooding-
ProtocolSession [
name trafficgen_cbr
use trafficgen_cbr
receiver 1:1(0)
receiverport 100
start_time 0.0 # in seconds
end_time 10.0 # in seconds
traffic_rate 1 # in Mbps
packet_length 200 # in B
measurement_file attacker.txt
show_report true
]
Figure 5.23. A DML snippet of CBR traffic generator.
based attacker models: Poisson, CBR, file trace, MMPP, and LRD. All of them
support IP source address spoofing. All the attackers can be configured as UDP ap-
plications or as raw-IP applications, which sit on top of and interact with the IP protocol directly.
Figure 5.24 illustrates a DML snippet of the CBR massive attacker model. The model
takes two additional parameters, type and spoofing_addr. The keyword type takes
a string argument, either UDP or IP, to specify whether the attacker is configured as a
UDP application or a raw-IP application. The keyword spoofing_addr takes an
address in NHI format, which is used for forging the source IP address of the attack traffic.
To emulate a system crash at a given time or during a particular time period, we
devise an application model called ShutDown. ShutDown is modelled as an
application that is independent of its lower layer protocol. As shown in Figure
5.25, it may reside on top of the Socket interface; however, it interacts with the Hardware
object at the bottom of the machine, which functions as the border between network
protocols and network elements. We modified class Hardware to define two additional
interfaces, shut_down() and booting(). When shut_down() is called, the Hardware
ProtocolSession [
name attack_cbr
use attack_cbr
type UDP # UDP or IP
spoofing_addr 1:1(0)
receiver 1:1(0)
receiverport 100
start_time 0.0 # in seconds
end_time 10.0 # in seconds
traffic_rate 1 # in Mbps
packet_length 200 # in B
measurement_file attacker.txt
show_report true
]
Figure 5.24. A DML snippet of CBR massive attacker.
object blocks every incoming packet from reaching any of the protocols above it,
by discarding the packet. When booting() is invoked after shut_down(), the Hardware
object recovers to its normal state; in other words, it passes incoming and
outgoing packets between network elements and network protocols, as described in
Section 5.1.
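The gating behavior of the modified Hardware object can be sketched as a simple on/off switch around packet delivery. The interface names below mirror those in the text, but the body is an illustrative sketch, not the modified DaSSFNet class:

```cpp
#include <cassert>

// Sketch of the modified Hardware object: between shut_down() and
// booting(), every packet crossing the hardware boundary is silently
// discarded, emulating a crashed machine.
class Hardware {
    bool down_ = false;
public:
    void shut_down() { down_ = true; }   // enter the crashed state
    void booting()   { down_ = false; }  // recover to the normal state
    // Returns true if a packet is passed between network elements and
    // protocols; false if it is dropped.
    bool deliver() const { return !down_; }
};
```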
Figure 5.26 presents an example DML snippet of the ShutDown model. The start
time and end time are specified in seconds. By setting end_time beyond the
termination time of the simulation, we can emulate a persistent system crash; otherwise,
we can emulate system startup or rebooting during the simulation period.
Figure 5.25. Mechanism of ShutDown model.
ProtocolSession [
name ShutDown
use SSF.OS.app.ShutDown
start_time 100.0 # in seconds
end_time 150.0 # in seconds
]
Figure 5.26. A DML snippet of ShutDown model.
6 LARGE-SCALE NETWORK SIMULATION
The Dynamic DPF Simulator is designed to be a scalable network simulation environ-
ment with respect to the size of the input network topology. We focus on provisioning
of scalable partitioning and dynamic measurement and monitoring to facilitate large-
scale network simulation over workstation clusters.
In the first half of this chapter, we demonstrate performance and utility of the
measurement subsystem. In the second half, we carry out performance evaluation
of the partitioning algorithm. We analyze the Dynamic DPF Simulator’s time and
memory complexity with respect to network size and degree of parallelism for speed-
up.
6.1 Experimental Setup
6.1.1 System Configuration
The experiments described in the following sections were conducted on a dedicated
FastEthernet LAN cluster in the Network Systems Lab (NSL) consisting of twenty-
four i686 machines running Linux 2.4.17: six with Pentium 4 2.53GHz CPUs and
1GB main memory, ten with Pentium 4 2.00GHz CPUs and 1GB main memory, and
eight with Pentium 3 996MHz CPUs and 512MB main memory. As shown in Figure
6.1, these machines are connected via a 100Mbps FastEthernet switch. As part of the
Dynamic DPF Simulator environment, DaSSFNet and MPI are installed in all the
participating machines.
For performance evaluation of parallelism speed-up, each benchmark test case is
carried out with several machine configurations—5, 8, 12, 16, 20, 24 machines. Since
machines in the cluster do not have a uniform system configuration, participating
Figure 6.1. Hardware configuration of a Linux cluster used for AS-level benchmarking.
machines for each benchmark test are chosen with a preference for higher CPU clock rates
and larger main memory.
6.1.2 Benchmark Topologies
1,020-node, 2,020-node, 3,023-node, and 4,512-node AS-level Internet graphs are
selected for the benchmarks. 3,023-node and 4,512-node graphs are obtained from
Nov. 8, 1997 and Jan. 1, 1999 NLANR [9] measurement data, respectively. 1,020-
node and 2,020-node graphs are obtained from the 3,023-node graph by pruning nodes
uniformly at random. Table 6.1 shows the statistics of the benchmark graphs.
To confirm our assumption that all benchmark topologies exhibit power-law con-
nectivity, each graph's node degree is plotted as a function of degree rank on a
log-log scale. In Figure 6.2, we observe a linear relationship, which is consistent
with a power law.
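The degree-rank data behind such a plot is straightforward to compute from an edge list: derive each node's degree, sort the degrees in decreasing order, and pair each degree with its rank. The helper below is illustrative, not the plotting script used for the thesis:

```cpp
#include <algorithm>
#include <cassert>
#include <functional>
#include <utility>
#include <vector>

// Produce (rank, degree) pairs for a degree-rank plot of an undirected
// graph given as an edge list over nodes 0..n-1. On log-log axes, a
// power-law topology shows up as an approximately straight line.
std::vector<std::pair<int,int>> degreeByRank(
        int n, const std::vector<std::pair<int,int>>& edges) {
    std::vector<int> deg(n, 0);
    for (const auto& e : edges) { ++deg[e.first]; ++deg[e.second]; }
    std::sort(deg.begin(), deg.end(), std::greater<int>());  // rank 1 = max degree
    std::vector<std::pair<int,int>> out;
    for (int r = 0; r < n; ++r) out.push_back({r + 1, deg[r]});
    return out;
}
```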
Table 6.1. Statistics of the benchmark topologies.
Nodes Edges Description
1020 1465 subgraph of Nov. 8, 1997 NLANR
2020 3305 subgraph of Nov. 8, 1997 NLANR
3023 5257 Nov. 8, 1997 NLANR
4512 8383 Jan. 1, 1999 NLANR
Figure 6.2. Power-law connectivity of the benchmark topologies.
6.1.3 Simulation Setup
The simulation period runs from 0 to 300 simulation seconds. Each node
creates one or more BGP sessions to maintain separate peering relationships with its
directly connected neighbors, depending on its degree. When a simulation starts, each
BGP session attempts to make a TCP connection to one of the directly connected neigh-
bors. Once a TCP connection is established, it sends a BGP OPEN message to set
up a BGP peering relationship. When its peer replies with a BGP OPEN message,
it sends a BGP KEEPALIVE message back in order to complete the setup proce-
dure. When a BGP KEEPALIVE message is received from a peer, the session initiates route
calculation by sending a BGP UPDATE message that advertises a route by which other
nodes in the system can reach it. BGP employs four timers: ConnectRetry, Hold,
KeepAlive, and MinRouteAdvertisementInterval (MRAI). Following RFC 1771 [48],
the values of these timers are set to 120 seconds, 90 seconds, 30 seconds, and 30 seconds,
respectively.
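The setup sequence above can be sketched as a toy state machine. This is a deliberately simplified model of the handshake described in the text (Idle, OpenSent, OpenConfirm, Established), with hypothetical names, not the simulator's BGP session code:

```cpp
#include <cassert>
#include <string>
#include <vector>

// Toy sketch of BGP session setup: after the TCP connection comes up, the
// session sends OPEN; a received OPEN is answered with KEEPALIVE; a
// received KEEPALIVE completes setup and triggers the first UPDATE, which
// advertises a route back to this node.
struct BgpSession {
    std::string state = "Idle";
    std::vector<std::string> sent;

    void tcpEstablished() { sent.push_back("OPEN"); state = "OpenSent"; }

    void received(const std::string& msg) {
        if (msg == "OPEN" && state == "OpenSent") {
            sent.push_back("KEEPALIVE");  // completes the setup procedure
            state = "OpenConfirm";
        } else if (msg == "KEEPALIVE" && state == "OpenConfirm") {
            state = "Established";
            sent.push_back("UPDATE");     // initiate route calculation
        }
    }
};
```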
Table 6.2. Parameter settings for TCP.
Parameter Value
Maximum segment size 1024B
Receive window size 512
Send window size 512
Maximum retransmission count 12
TCP slow timer interval 0.5 sec
TCP fast timer interval 0.2 sec
Maximum segment lifetime 60 sec
Maximum idle time 600 sec
Table 6.2 lists parameter settings for TCP. In addition, we assumed network in-
terfaces with sufficient buffers. Link bandwidth is set to 100Mbps, and propagation
delay is set to 0.001 second.
6.2 Performance and Utility of Comprehensive Measurement Subsystem
6.2.1 Methodology
To demonstrate the performance and utility of the measurement subsystem, we
simulated BGP's dynamic route update and convergence behavior on the 3023-node
AS-level Internet graph over a 16-machine Linux cluster configuration.
We monitored memory consumption by each protocol's messages and tables, and
the total memory consumption by protocol messages and tables is then aggregated. In
addition, for each kernel event type, we measure both memory consumption and the
count of objects processed; the total memory consumption by kernel event objects
is then obtained. We collect separate measurement results, indexed by simulation time,
from each of the 16 participating machines, and the results are merged and sorted by
simulation time so that the merged trajectory provides consistent information over the
whole distributed memory resources.
6.2.2 Memory Requirement Monitoring
We demonstrate memory requirement monitoring at run-time. Unlike the static mem-
ory requirement introduced by protocol tables, the dynamic memory requirement of pro-
tocol messages and kernel events is difficult to predict and estimate. In particular,
since large-scale network simulation requires a large amount of memory, identifying
the memory consumption of each component is necessary for memory-oriented partition-
ing that keeps the virtual memory system at bay.
Figure 6.3 shows memory consumption as a function of simulation time. Here,
trajectories of memory consumption by each category are stacked to help comparison.
Figure 6.3(a) classifies memory consumption into three categories—Kernel Events,
Messages, and Tables. Kernel Events represents memory consumption by DaSSF’s
kernel level events. Messages represents memory consumption by protocol messages.
Tables represents memory consumed by protocol tables. Figure 6.3(b) subdivides
these categories further into finer-grained components.
From Figure 6.3(a), we observe that Tables and Kernel Events consume most of
the memory space. Although memory consumption by Messages increases to some
extent as the simulation proceeds, it takes a small portion compared to that of Tables
and Kernel Events. As the simulation proceeds, memory consumed by tables increases
until a certain point (around 180 simulation seconds) and stabilizes thereafter. Memory
Figure 6.3. Memory consumption as a function of simulation time. (a) Memory
consumption is classified into three categories: tables, messages, and kernel events.
(b) The categories are further subdivided into finer-grained components.
consumed by kernel events increases initially as the simulation proceeds, and decreases
after a certain point (around 150 simulation seconds). At its peak, it consumes a
significant portion of memory, on par with that of tables.
As Figure 6.3(b) shows, Kernel Events is further classified into the major event types
(TIMER, CHANNEL, OUTCHANNEL, and INCHANNEL) with respect to their mem-
ory consumption. Messages is classified into BGP Messages, TCP Receive, TCP
Send, and IP Messages. Tables is classified into Adj-RIB-In, Loc-RIB, and IP table.
Comparing Figures 6.3(a) and 6.3(b), we can see that TIMER consumes the
most memory among the major event types. Among the different protocol tables, Adj-RIB-In is
dominant with respect to memory consumption. We look into each category in the
following sections.
6.2.3 Memory Consumption by Tables
Figure 6.4 shows the trajectory of memory consumption by protocol tables (BGP
Adj-RIB-In, BGP Loc-RIB, and IP tables) as a function of simulation time. The tra-
jectories of memory consumption by each category are stacked to help compare
their shares. At around 30, 60, 90, 120, 150, and 180 seconds, memory consumption
by protocol tables increases, with significant increases at around 60, 90, and 120
simulation seconds. BGP Adj-RIB-In consumes more space than the others.
Figure 6.4. Memory consumption by protocol tables: BGP Adj-RIB-In, BGP Loc-RIB, and IP.
In Section 6.1, we mentioned that the BGP MinRouteAdvertisementInterval (MRAI)
timer value is set to 30 seconds. Each BGP session maintains a separate MRAI
timer. Whenever the MRAI timer expires, the session checks whether any newly received
route advertisements exist; if there is new route information to be advertised, it sends a
BGP UPDATE message to its peer. Since BGP peering relationships are established
around 0 seconds with a random offset in the range of 0 to 1000 ms, the MRAI
timers of all BGP sessions expire almost synchronously at 30-second intervals.
When BGP UPDATE messages are received, each BGP session updates its own Adj-
RIB-In and Loc-RIB tables depending on the result of the decision process. From
the increases in memory consumption at around 30, 60, 90, 120, 150, and 180 seconds, we
can see what portion of each table is filled at each 30-second interval. After
180 simulation seconds, there is no noticeable increase in memory consumption, which
indicates that BGP route advertisement and calculation are in their final phase.
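The MRAI pacing described above amounts to batching: route changes are queued as they arrive, and a single UPDATE is emitted only when the timer fires. The sketch below uses illustrative names (MraiSession, routeChanged), not the simulator's BGP session state:

```cpp
#include <cassert>
#include <string>
#include <vector>

// Sketch of per-peer MRAI pacing: route changes accumulate between timer
// expirations, and one batched UPDATE is sent when the MRAI timer fires
// (every 30 s in the benchmark configuration). If nothing new arrived,
// the expiration is silent.
struct MraiSession {
    std::vector<std::string> pending;  // routes learned since the last UPDATE
    int updatesSent = 0;

    void routeChanged(const std::string& route) { pending.push_back(route); }

    void onMraiExpiry() {
        if (pending.empty()) return;   // nothing new: stay quiet
        ++updatesSent;                 // one UPDATE carries all queued routes
        pending.clear();
    }
};
```

Because all sessions start within the same one-second window, their timers fire nearly in lockstep, which is why the table growth in Figure 6.4 appears in 30-second steps.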
According to the description of the BGP modelling in Section 5.2.4, BGP maintains
a separate Adj-RIB-In table per peering relationship to store the BGP route updates
received from each peer, and there is one BGP Loc-RIB table as a central storage for
selected route updates. In addition, due to the AS-level Internet modelling described
in Section 5.2.1, an AS has a number of Adj-RIB-In tables equal to its degree. Thus,
denoting an AS-level Internet graph as G = (V, E), the total number of Adj-RIB-In
tables is equal to 2 · |E|, where |E| represents the total number of edges. From the
statistics of the 3023-node AS-level Internet graph in Table 6.1, |E| is approximately
1.7 · |V |, where |V | represents the total number of nodes, so the total number of
Adj-RIB-In tables is roughly 3.5 · |V |. Similarly, the total number of Loc-RIB tables is
equal to the total number of nodes, |V |, because an AS includes one border router
according to the AS-level Internet modelling. By the same argument, the total number
of IP routing tables is also equal to |V |. Hence, there are roughly 3.5 times as many
Adj-RIB-In tables as Loc-RIB tables, whose count equals that of the IP routing
tables. However, from Figure 6.4, we observe that the IP tables consume more memory
than the Loc-RIB tables. This is due to the difference in the size of each entry's data
structure.
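The counting argument can be checked directly against the numbers in Table 6.1. The constants below are taken from that table; the variable names are illustrative:

```cpp
#include <cassert>

// Table counts implied by the modelling: one Adj-RIB-In per peering
// endpoint (2|E| in total), and one Loc-RIB and one IP routing table per
// AS (|V| each). Figures from Table 6.1 for the 3023-node NLANR graph.
const int V = 3023;                  // nodes
const int E = 5257;                  // edges
const int adjRibInTables = 2 * E;    // one table per endpoint of each edge
const int locRibTables   = V;        // one Loc-RIB per border router
const int ipTables       = V;        // one IP routing table per border router
```

With these figures the exact Adj-RIB-In to Loc-RIB ratio is 10514 / 3023, about 3.5.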
6.2.4 Memory Consumption by Messages
Figure 6.5 plots memory consumption by protocol messages (BGP, TCP
send buffer, TCP receive buffer, and IP) as a function of simulation time. The trajectories
of memory consumption by each category are stacked to help compare their shares.
Every 30 seconds, we observe a surge of messages due to BGP's MRAI timer
expiration. In total, the amount of memory required by protocol messages reaches
around 1GB.
Figure 6.5. Memory consumption by protocol messages: BGP, TCP send buffer, TCP receive buffer, and IP.
On the other hand, it appears that IP messages remain in the IP output buffer, con-
suming up to 400MB of memory over the course of the simulation period. This result,
however, is due to DaSSFNet's unique IP output buffer management, where dequeued
messages remain in the IP output buffer until another packet is enqueued. Hence,
the current timer-based IP message measurement cannot recognize that messages have been
dequeued until a packet is enqueued to the IP output buffer. We can correct this problem by
modifying the timer-based IP measurement routine to account for possible message dequeueing
before measurement. In this thesis, we focus on the fact that memory consumption
by messages is secondary compared to that by tables and kernel events.
6.2.5 Memory Consumption and Counting of Major Kernel Events
DaSSF schedules and processes several types of kernel level events. This section
starts with an introduction to the major kernel event types; then, the measurement results
for kernel event objects are presented and discussed.
As mentioned in Section 5.1, most packet handling is realized as a chain of proce-
dure calls between protocol models. The simulation kernel mainly processes events related
to message passing, timers, and semaphores. As illustrated in Figure 5.11, the message
passing related events consist of inChannel, outChannel, and Channel. In addition,
DaSSF provides extended utilities such as timers and semaphores. The timer model
internally creates and schedules a kernel event object of type EVTYPE_TIMER. DaSSF
allows a scheduled timer event to be cancelled: when cancel() of the timer model is called,
the type of the originally scheduled event is changed internally to EVTYPE_CANCEL, and the
kernel event object is processed and released at the originally scheduled time. Next,
the semaphore model is used as a mechanism for inter-process communication. When
semWait() of the semaphore model is called, the calling process is put into the
semaphore's waiting process list without creating any kernel event objects. How-
ever, when semSignal() of the semaphore model is called, it creates a kernel event
object of type EVTYPE_SEMSIGNAL.
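The cancellation semantics above explain the TIMER memory behavior discussed below: cancel() retypes the queued event rather than removing it, so the object stays resident until its originally scheduled time. A toy model of this policy, with illustrative names rather than the DaSSF kernel code:

```cpp
#include <cassert>
#include <memory>
#include <vector>

// Toy model of DaSSF timer handling: cancellation only retypes a queued
// event; the object is released when its originally scheduled time passes.
enum EvType { EVTYPE_TIMER, EVTYPE_CANCEL };

struct KernelEvent {
    double when;
    EvType type = EVTYPE_TIMER;
};

struct Kernel {
    std::vector<std::shared_ptr<KernelEvent>> queue;  // time-ordered in reality

    std::shared_ptr<KernelEvent> schedule(double when) {
        auto ev = std::make_shared<KernelEvent>();
        ev->when = when;
        queue.push_back(ev);
        return ev;
    }
    // Cancelling retypes the event in place; no memory is reclaimed here.
    static void cancel(KernelEvent& ev) { ev.type = EVTYPE_CANCEL; }

    // Process and release everything scheduled up to `now`.
    void runUntil(double now) {
        std::vector<std::shared_ptr<KernelEvent>> rest;
        for (auto& ev : queue)
            if (ev->when > now) rest.push_back(ev);  // cancelled or not, it waits
        queue.swap(rest);
    }
};
```

A Hold timer rescheduled on every received message thus leaves a trail of cancelled-but-resident events, which is the effect measured in Figure 6.6(b).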
Figure 6.6. Major types of KernelEvent objects. (a) shows cumulative counts of
KernelEvent object creation as a function of simulation time for the major types.
(b) shows total memory consumption by KernelEvent objects and memory
consumption by major types of KernelEvent objects as a function of simulation time.
Figure 6.6(a) provides cumulative counts of kernel level objects for each ma-
jor type: INCHANNEL, OUTCHANNEL, CHANNEL, TIMER, and SEMSIGNAL.1
Figure 6.6(b) shows total memory consumption by all kernel event objects as well as
memory consumption by major types of kernel event objects as a function of time.
1They are declared as EVTYPE_INCHANNEL, EVTYPE_OUTCHANNEL, EVTYPE_CHANNEL, EVTYPE_TIMER, and EVTYPE_SEMSIGNAL within the class definition of KernelEvent, respectively.
From Figure 6.6(a), the cumulative counts of INCHANNEL, OUTCHANNEL, and
TIMER exhibit similar trajectories throughout the simulation period, and they are
the most dominant. From the plots of CHANNEL and SEMSIGNAL, we find that
the count of CHANNEL type events is more than half of those of the most dominant
event types during the simulation period; SEMSIGNAL type events are around half.
Figure 6.6(b) shows that the total memory consumption by all kernel event objects
is essentially equivalent to the memory consumption by TIMER type objects; by
comparison, all other major types of objects consume a minor portion of memory.
Thus, although comparable numbers of objects are created for all major event types
(INCHANNEL, OUTCHANNEL, CHANNEL, TIMER, and SEMSIGNAL), only the TIMER
event type is dominant in terms of memory consumption. This is because objects of the other
event types are created, processed, and deleted in a short time, whereas a TIMER
event object exists until its originally scheduled time. Even if a scheduled timer
event is cancelled, the originally scheduled event object remains until its scheduled
time. For example, the Hold timer interval of the BGP model is set to 90 seconds, and
the timer is frequently rescheduled whenever a message is received from the peer. In
other words, the event is cancelled, and a new event of TIMER type is created and
scheduled, whenever a message arrives. Since the memory consumption of TIMER events is
significant, we can reduce the memory requirement by modifying the way DaSSF handles
TIMER events.
6.2.6 CPU Load Monitoring
Figure 6.7 presents the CPU load distribution over 16 Linux workstations. The plot
"total kernel event" shows the total number of kernel events processed at each machine;
"cpu total" represents the CPU time occupied by the corresponding process at each machine.
Both plots exhibit similar trajectories during the simulation period. Although the kernel
events to be processed are evenly distributed over the 16 machines, we observe a CPU
load imbalance in terms of CPU time. This imbalance may arise from uneven
computation load for scheduling processes or from communication overhead. As observed
in this experiment, CPU time measurement is a relevant metric for evaluating CPU
load balance.
Figure 6.7. CPU load distribution over 16 Linux workstations.
6.2.7 Communication Cost
As shown in Figure 5.11, one inChannel and one outChannel object are instan-
tiated in both the local and remote message passing cases; in the case of remote message
passing, an additional Channel object is created. The measurement results in Figure
6.6(a) show that the cumulative counts of INCHANNEL and OUTCHANNEL objects
are identical, and we find that more than half of the messages are sent over MPI. This
measurement is a valuable tool for analyzing the communication cost of a given distributed
simulation. Moreover, it is a useful metric for evaluating different partitioning algorithms' per-
formance in terms of communication cost.
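The inference about remote traffic follows directly from the event counts: since every message creates one inChannel event and only remote messages create a Channel event, the remote fraction is the ratio of the two counts. A minimal sketch, with hypothetical counts read off Figure 6.6(a):

```cpp
#include <cassert>

// Each message creates one inChannel and one outChannel event; a message
// that crosses machine boundaries additionally creates one Channel event.
// The fraction of messages sent over MPI is therefore channel / inChannel.
double remoteFraction(double inChannelCount, double channelCount) {
    return channelCount / inChannelCount;
}
```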
6.3 Scalability of Partitioning
In this section, we present benchmark test results examining the scalability of
partitioning. As stated in Section 5.2.2, the partitioning subsystem focuses on efficient
use of distributed memory and utilization of CPU resources to achieve scalable
simulation as the size of the graph increases.
We evaluate the scalability of the proposed system in terms of time complexity and
memory complexity. First, the completion time of the benchmark simulation is used
as a metric of time complexity. In a distributed setting, the completion time of a
distributed program is determined by the partition that completes last. As a result,
the completion time of a distributed program reflects the temporal balance between
the distributed partitions of the program and the overhead due to synchronization and
message passing. Next, the memory watermark, the maximum memory usage during the
simulation period, is used as the performance metric for memory complexity. Although
memory usage fluctuates as a simulation proceeds, the amount of memory required for a
simulation is determined by its memory watermark. Unlike completion time, the result
of a benchmark test consists of measurement results at the individual participating
machines. Thus, memory complexity is represented by the total, average, and maximum
values of the memory watermark measurements.
For this experiment, the completion time of a simulation is defined as the elapsed real
time between the start and end of the simulation. The time command [54] of Linux is
used to measure completion time; it returns the elapsed real time when a program
terminates. Similarly, the memory watermark is defined as the maximum memory con-
sumption of a simulation process2, which includes its code, data, and stack space.
The memory watermark is measured using the top command [55] of Linux. During the
simulation, the top command is executed in batch mode with a 30-second delay;
it fetches information from the Linux process (/proc) file system periodically with
the specified delay. The SIZE field provides the size of a process's code, data, and
stack space in KB, and it is logged throughout the simulation. The top command is
executed on all participating machines, and the maximum value in each log file is taken
as the memory watermark on that machine.
2During the experiment, we could observe additional forked processes in the case of a parallel setting via MPI. Since their memory usage is negligible (less than 1MB), they are ignored in this measurement.
Figure 6.8. (a) Completion time as a function of parallelism for different benchmark
graphs. (b) Completion time as a function of problem size for 16, 20, and 24 machines.
6.3.1 Completion Time
Figure 6.8 shows completion time both as a function of parallelism for the different
benchmark graphs and as a function of problem size for 16, 20, and 24 machines.
Here, completion time is given in seconds. Figure 6.8(a) shows the results for the
four benchmark graphs, which we call the 4512-node, 3023-node, 2020-node, and
1020-node graphs. The trajectory of the 4512-node graph includes only the 16-, 20-,
and 24-machine cases, because a 4512-node graph simulation cannot be launched on
12 or fewer machines. For both the 4512-node and 3023-node graphs, completion time
decreases as parallelism increases. The completion time of the 2020-node graph, on
the other hand, reaches its optimum at 20 machines: although completion time decreases
as parallelism increases from 5 to 20 machines, it increases past the optimal point. For
the 1020-node graph, completion time increases as parallelism increases.
The results show an effect of the problem size on a relation between parallelism
and completion time. When the size of a problem is sufficiently large, parallelism
can be useful for reducing completion time of a simulation. The results of 4512-node
graph and the 3023-node graph support this case. However, increased parallelism
usually induces synchronization and message-passing overhead whenever communication
between parallel partitions is required. In particular, when the problem size is
sufficiently small, parallelism may be of no use; the result of the 1020-node graph
falls into this category. Due to our network model, a certain amount of traffic is
expected on each link, which can cause more overhead as parallelism increases. The
2020-node graph initially follows the first trend, but switches to the latter after
the optimal point. In this case, we may conclude that, with the given traffic
pattern, a simulation at the scale of the 2020-node graph does not benefit from
more than 20 machines.
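The parallelism tradeoff discussed above can be made concrete with a standard speedup and parallel-efficiency calculation. The numbers below are illustrative only, not values read from Figure 6.8.

```python
def speedup(t_base, t_n):
    """Speedup of an n-machine run relative to a baseline run of the same graph."""
    return t_base / t_n

def efficiency(t_base, m_base, t_n, m_n):
    """Parallel efficiency: achieved speedup divided by the machine ratio.

    Values near 1 mean the extra machines are fully used (large graphs);
    much smaller values mean synchronization and message-passing
    overhead is consuming the added parallelism (small graphs).
    """
    return (t_base / t_n) / (m_n / m_base)

# A large graph that scales well when machines grow from 16 to 24,
# versus a small graph that gets slower when machines grow from 5 to 10:
large = efficiency(6000, 16, 4200, 24)   # close to 1: parallelism pays off
small = efficiency(800, 5, 900, 10)      # well below 1: overhead dominates
```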
Figure 6.8(b) shows the results of the 16-, 20-, and 24-machine simulations. For
the 1020-node graph, the 16-machine simulation performs best, although the
differences are minor. For the 2020-node graph, the 20-machine simulation completes
first. For the 3023-node and 4512-node graphs, the 24-machine simulation performs
best, again with minor differences. From the 16-machine and 20-machine curves, we
observe the general trend that completion time increases as a function of problem
size. On the 24-machine curve, the completion time of the 3023-node graph is shorter
than that of the 2020-node graph; nevertheless, the 24-machine curve also follows
the general trend. As noted in the discussion of Figure 6.8(a), the 2020-node graph
performs best with 20 machines, whereas the 3023-node and 4512-node graphs perform
best with 24 machines. Here, we can observe that the results of the 4512-node graph
in the 20-machine and 24-machine cases are quite close. Although not sufficient as
evidence, one thing to note is that the system configuration is not the same for
all 24 machines, as mentioned in Section 6.1; the resources of the additional 4
machines do not perform as well as those of the other 16 machines.
6.3.2 Balanced Memory Offloading
Figure 6.9 shows total, average, and maximum memory watermark as a function of
parallelism for different benchmark graphs. Here, memory watermark is represented
in MB. In Figure 6.9(a), we observe the trajectories of total memory watermark for
the different benchmark graphs: the 4512-node, 3023-node, 2020-node, and 1020-node
graphs. As we increase parallelism, total memory watermark increases for all four
benchmark graphs. Figure 6.9(b), on the other hand, shows the average and maximum
memory watermark for the same graphs. As the number of participating machines
increases from 5 to 24, the average memory requirement decreases proportionally.
For the 4512-node and 3023-node graphs, the maximum memory watermark decreases
along a trajectory similar to that of the average. For the 2020-node graph, the
maximum memory watermark decreases up to a certain point (the 16-machine
simulation) but then increases slightly as parallelism grows. For the 1020-node
graph, the maximum memory watermark remains the same once the number of
participating machines exceeds 8.
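The aggregate statistics plotted in Figure 6.9 reduce to a simple computation over per-machine watermark values; the sample values here are made up for illustration.

```python
def watermark_stats(per_machine_mb):
    """Total, average, and maximum memory watermark across machines.

    `per_machine_mb` holds one watermark value (in MB) per participating
    machine, i.e. the peak SIZE observed on that machine during the run.
    """
    total = sum(per_machine_mb)
    return total, total / len(per_machine_mb), max(per_machine_mb)

# With 5 machines, one hub-heavy partition can dominate the maximum
# even though the average is modest:
total, avg, peak = watermark_stats([900, 240, 230, 220, 210])
```

This mirrors why the average falls with parallelism while the maximum can stall: adding machines shrinks the average, but the maximum is pinned to the largest indivisible partition.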
Figure 6.9. (a) Total memory watermark as a function of parallelism for different
benchmark topologies. (b) Average and maximum memory watermark as a function
of parallelism for different benchmark graphs.
Figure 6.9(a) shows that additional memory space is needed as parallelism
increases. This likely stems from the cost of building and maintaining the global
context of a simulation at each local machine. However, the results for the
4512-node and 3023-node graphs in Figure 6.9(b) show that the total memory
requirement is effectively offloaded across the participating machines. At the same
time, the results for the 2020-node and 1020-node graphs indicate cases where the
maximum memory watermark does not decrease as expected. This is because phase 0 of
our partitioning algorithm shapes the trend of the maximum memory watermark in
these cases. To reduce the traffic load crossing distributed machines, every node
whose only edge is to a central high-degree node is placed in the same partition
group as that high-degree node. The partition groups determined during phase 0 are
not divided any further. Hence, we can guarantee that traffic between such nodes is
localized within their partition group. However, this also implies that parallelism
cannot reduce the maximum memory requirement caused by these partition groups.
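A minimal sketch of this phase-0 grouping rule, under the assumption that the topology is given as an undirected edge list (the actual algorithm and its data structures are described earlier in the thesis):

```python
from collections import defaultdict

def phase0_groups(edges):
    """Group each degree-one node with its sole (hub) neighbor.

    `edges` is a list of undirected (u, v) pairs. Returns a mapping from
    hub node to the set of leaf nodes that must stay in its partition,
    mirroring the rule that a node with a single edge to a high-degree
    node is never separated from that node.
    """
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    groups = defaultdict(set)
    for node, nbrs in adj.items():
        if len(nbrs) == 1:              # degree-one leaf
            hub = next(iter(nbrs))
            groups[hub].add(node)
    return groups
```

In a power-law graph most nodes are such leaves of a few hubs, which is why these indivisible groups set a floor on the maximum per-machine memory regardless of parallelism.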
Figure 6.10. (a) Total memory watermark as a function of problem size for 16, 20,
and 24 machines. (b) Average and maximum memory watermark as a function of
problem size for 16, 20, and 24 machines.
Figure 6.10 shows total, average, and maximum memory watermark as a function
of problem size for 16, 20, and 24 machines. Memory watermark is reported in MB.
In Figure 6.10(a), we observe the trajectories of total memory watermark for 16,
20, and 24 machines. As we increase the problem size, the total memory watermark
increases in all three cases. In Figure 6.10(b), the trajectories of average memory
watermark increase as a function of problem size in all three cases, and the same
trend is observed for maximum memory watermark. In the 1020-node graph case, the
maximum memory watermark is the same for 16, 20, and 24 machines. In the 2020-node
graph case, the results show minor differences. However, the differences in maximum
memory watermark grow as the problem size increases. Moreover, the 24-machine
simulation performs best in both the 3023-node and 4512-node graph cases.
As shown in Figure 6.10(a), when the problem size is 2020 nodes or more, the
total memory watermark exceeds 1 GB. This supports the claim that a simulation of
this scale is difficult to run on a single PC with 1 GB of memory. In addition, the
total memory watermark grows non-linearly as a function of problem size, which
confirms that distributed simulation is required for scalability with respect to
problem size. At the same time, Figure 6.10(b) indicates the effectiveness of
parallelism as the problem size becomes larger.
7 CONCLUSION AND FUTURE WORK
This thesis has studied the performance of route-based distributed packet filtering
(DPF) for distributed denial of service (DDoS) attack prevention in large-scale
networks under dynamic network conditions.
We have designed and implemented a route-based DPF protocol that updates
route-based DPF tables dynamically in the presence of IP routing table updates
handled by BGP, the Internet's inter-domain routing protocol. We have carried out
a performance evaluation of the route-based DPF protocol's fault-tolerant protection
on Internet autonomous system (AS) level measurement topologies. We have also
designed a new partitioning algorithm for power-law network topologies,
characteristic of Internet AS measurement graphs, which achieves balanced
distribution of memory requirements as well as efficient utilization of CPU and
communication resources.
For future work, the first item is to release a public domain version of the Dy-
namic DPF Simulator, called DaSSFNet-Turbo, that is applicable to general network
simulation spanning traffic control and network security.
Performance evaluation of route-based DPF in the presence of DDoS attacks aimed
at the network infrastructure is next on the list. As shown in this thesis, during
transient periods before BGP's routing tables have converged, route-based DPF
filters may contain inconsistent (i.e., stale or safety-violating) entries. Our
results under single node failures show that route-based DPF continues to provide
significant protection; however, we would need to extend the performance evaluation
to a range of full-fledged infrastructure attacks.
Power-law topology partitioning is not limited to the Dynamic DPF simulation
environment and merits study as a general problem in its own right, extending the
results reported in this thesis with more recent advances outside its scope.
Lastly, we would like to build a prototype system utilizing the 7-node Intel
IXP1200-based network processor (NP) testbed in the Network Systems Lab. A
prototype NP implementation allows evaluation of system-level overhead and
performance issues that are important when considering an interim migration path
compatible with legacy routers.
Building on the simulation model implemented in this thesis, such a prototype
system must support high-speed processing for filter look-up; at the same time, the
control-plane overhead due to filter table updates needs to be analyzed.
LIST OF REFERENCES
[1] David Moore, Geoffrey Voelker, and Stefan Savage. Inferring Internet denial-of-service activity. In Proc. of the 10th USENIX Security Symposium, 2001.
[2] Lee Garber. Denial-of-service attacks rip the Internet. Computer, pages 12–17, April 2000.
[3] Ryan Naraine. Massive DDoS attack hit DNS root servers, October 2002. http://www.internetnews.com/dev-news/article.php/1486981.
[4] ComputerWire. DDoS attack 'really, really tested' UltraDNS, November 2002. http://www.theregister.co.uk/content/55/28291.html.
[5] CERT. CERT Advisory CA-2002-15: Denial of service vulnerability in ISC BIND 9, September 2002. http://www.cert.org/advisories/CA-2002-15.html.
[6] Patrick Gray. Worm could be clearing path for DDoS attack, March 2003. http://news.zdnet.co.uk/business/0,39020645,2131631,00.htm.
[7] K. Park and H. Lee. On the effectiveness of route-based packet filtering for distributed DoS attack prevention in power-law internets. In Proc. of ACM SIGCOMM '01, pages 15–26, 2001.
[8] DaSSFNet: a C++ implementation of SSFNet. http://www.cs.dartmouth.edu/~ghyan/dassfnet/overview.htm.
[9] National Laboratory for Applied Network Research. Routing data, 2000. Supported by NSF, http://moat.nlanr.net/Routing/rawdata/.
[10] NightAxis and Rain Forest Puppy. Purgatory 101: Learning to cope with the SYNs of the Internet, 2000. Some practical approaches to introducing accountability and responsibility on the public Internet, http://packetstorm.securify.com/papers/contest/RFP.doc.
[11] Computer Emergency Response Team. Denial of service, February 1999. Tech Tips, http://www.cert.org/tech_tips/denial_of_service.html, 2nd revision.
[12] Computer Emergency Response Team (CERT). CERT Advisory CA-2000-01: Denial-of-service developments, January 2000. http://www.cert.org/advisories/CA-2000-01.html.
[13] R. K. C. Chang. Defending against flooding-based distributed denial-of-service attacks: A tutorial. IEEE Communications Magazine, pages 42–51, October 2002.
[14] S. Savage, D. Wetherall, A. Karlin, and T. Anderson. Practical network support for IP traceback. In Proc. of ACM SIGCOMM, pages 295–306, August 2000.
[15] Cisco Systems. Characterizing and tracing packet floods using Cisco routers, August 1999. http://www.cisco.com/warp/public/707/22.html.
[16] Glenn Sager. Security fun with OCxmon and cflowd, November 1998. Presentation at the Internet 2 Working Group.
[17] Jon Postel. Internet Protocol, September 1981. RFC 791.
[18] S. Bellovin. ICMP traceback messages, March 2000. Internet Draft: draft-bellovin-itrace-00.txt (expires September 2000).
[19] H. Burch and B. Cheswick. Tracing anonymous packets to their approximate source. In Proc. of the 14th Systems Administration Conference (LISA 2000), pages 319–327, 2000.
[20] K. Park and H. Lee. On the effectiveness of probabilistic packet marking for IP traceback under denial of service attack. Technical Report CSD-TR 00-013, Department of Computer Sciences, Purdue University, June 2000.
[21] CERT/CC, SANS Institute, and CERIAS. Consensus roadmap for defeating distributed denial of service attacks, February 2000. A project of the Partnership for Critical Infrastructure Security, http://www.sans.org/ddos_roadmap.htm.
[22] P. Ferguson and D. Senie. Network ingress filtering: Defeating denial of service attacks which employ IP source address spoofing, May 2000. RFC 2827.
[23] Daniel Senie. Changing the default for directed broadcasts in routers, August 1999. RFC 2644.
[24] David G. Andersen. Mayday: Distributed filtering for Internet services. In Proc. of the 4th USENIX Symposium on Internet Technologies and Systems, March 2003.
[25] A. Keromytis, V. Misra, and D. Rubenstein. SOS: Secure Overlay Services. In Proc. of ACM SIGCOMM '02, 2002.
[26] J. Li, J. Mirkovic, M. Wang, P. Reiher, and L. Zhang. SAVE: Source Address Validity Enforcement protocol. In Proc. of IEEE INFOCOM, June 2002.
[27] S. Agarwal, C. Chuah, and R. Katz. OPCA: Robust interdomain policy routing and traffic control. In Proc. of IEEE OPENARCH, April 2003.
[28] George Riley and Mostafa Ammar. Simulating large networks: How big is big enough? In Proc. of the First International Conference on Grand Challenges for Modeling and Simulation, January 2002.
[29] Vern Paxson and Sally Floyd. Why we don't know how to simulate the Internet. In Proc. of the Winter Simulation Conference, pages 1037–1044, 1997.
[30] The network simulator ns-2. http://www.isi.edu/nsnam/ns/.
[31] D. Nicol, M. Goldsby, and M. Johnson. Fluid-based simulation of communication networks using SSF, 1999.
[32] B. Liu, Y. Guo, J. Kurose, D. Towsley, and W. Gong. Fluid simulation of large scale networks: Issues and tradeoffs. In Proc. of the 1999 International Conference on Parallel and Distributed Processing Techniques and Applications, June 1999.
[33] C. Kiddle, R. Simmonds, C. Williamson, and B. Unger. Hybrid packet/fluid flow network simulation. In Proc. of the Seventeenth Workshop on Parallel and Distributed Simulation, page 143, 2003.
[34] PDNS: Parallel/Distributed NS. http://www.cc.gatech.edu/computing/compass/pdns/index.html.
[35] The SSF Research Network. Scalable Simulation Framework. http://www.ssfnet.org/homePage.html.
[36] Boleslaw K. Szymanski, Yu Liu, and Rashim Gupta. Parallel network simulation under distributed Genesis. In Proc. of the Seventeenth Workshop on Parallel and Distributed Simulation, page 61, 2003.
[37] Donghua Xu, George F. Riley, Mostafa H. Ammar, and Richard Fujimoto. Enabling large-scale multicast simulation by reducing memory requirements. In Proc. of the Seventeenth Workshop on Parallel and Distributed Simulation, page 69, 2003.
[38] Akihito Hiromori, Hirozumi Yamaguchi, Keiichi Yasumoto, Teruo Higashino, and Kenichi Taniguchi. Reducing the size of routing tables for large-scale network simulation. In Proc. of the Seventeenth Workshop on Parallel and Distributed Simulation, page 115, 2003.
[39] Dartmouth SSF. http://www.cs.dartmouth.edu/~jason-liu/projects/ssf/index.html.
[40] M. Faloutsos, P. Faloutsos, and C. Faloutsos. On power-law relationships of the Internet topology. In Proc. of ACM SIGCOMM, pages 251–262, 1999.
[41] R. Albert, H. Jeong, and A. Barabasi. Diameter of the world wide web. Nature, pages 130–131, 1999.
[42] A. Broder, R. Kumar, F. Maghoul, and P. Raghavan. Graph structure in the web. In Proc. of the WWW9 Conference, May 2000.
[43] H. Jeong, B. Tombor, R. Albert, Z. Oltvai, and A.-L. Barabasi. The large-scale organization of metabolic networks. Nature, pages 378–382, 2000.
[44] A. J. Lotka. The frequency distribution of scientific productivity. The Journal of the Washington Academy of the Sciences, page 317, 1926.
[45] M. Newman. The structure of scientific collaboration networks. Proc. Natl. Acad. Sci. USA 98, 4:404–409, 2001.
[46] S. Redner. How popular is your paper? Euro. Phys. J. B, 4:131–134, 1998.
[47] Fan R. K. Chung. Spectral Graph Theory. American Mathematical Society, 1997.
[48] Y. Rekhter and T. Li. A Border Gateway Protocol 4 (BGP-4), March 1995. RFC 1771.
[49] V. Paxson. End-to-end routing behavior in the Internet. In Proc. of ACM SIGCOMM '96, pages 25–38, 1996.
[50] James H. Cowie, editor. Scalable Simulation Framework API Reference Manual, version 1.0. http://www.ssfnet.org/SSFdocs/ssfapiManual.pdf, March 1999.
[51] MPI: the Message Passing Interface standard. http://www-unix.mcs.anl.gov/mpi/.
[52] The x-kernel protocol framework. http://www.cs.arizona.edu/xkernel/.
[53] The SSF Research Network. Domain Modeling Language (DML) Reference Manual. http://www.ssfnet.org/SSFdocs/dmlReference.html.
[54] Linux reference manual, section 1, time.
[55] Linux reference manual, section 1, top.