PERFORMANCE EVALUATION OF ROUTE-BASED DISTRIBUTED PACKET
FILTERING FOR DDOS PREVENTION IN LARGE-SCALE NETWORKS
A Thesis
Submitted to the Faculty
of
Purdue University
by
HyoJeong Kim
In Partial Fulfillment of the
Requirements for the Degree
of
Master of Science
December 2003
ACKNOWLEDGMENTS
I would like to thank my advisor, Professor Kihong Park, for his persistent guidance,
from technical details to mentoring. His keen criticism of my research has improved
my attitude toward scientific exploration, and his earnest devotion to science has
been a source of my motivation.
I present my gratitude to my friends and colleagues at the Network Systems Lab.
In particular, I am grateful to my friend Bhagya for her valuable feedback on my
research; I remember the enjoyable nights we spent together facing deadlines. I would
like to thank my friend Humayun for his help during my study, which included
numerous discussions on the subject, implementation of the design, and proofreading
of my thesis. I would also like to thank Ali for his participation in protocol design; I
thank Yan for his participation in implementation. I am grateful to Sunwoong, who
carefully examined my thesis and provided me with valuable comments.
Finally, I thank my parents and brothers for their life-long love and support.
I also thank my friends in Korea, especially Eun-Ju and Seung-Hyub, who have
given me warm encouragement throughout my study.
TABLE OF CONTENTS
Page
LIST OF TABLES . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . vi
LIST OF FIGURES . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . vii
ABSTRACT . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . x
1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
1.1 Motivation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
1.2 Objective . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2
1.3 Technical Challenges . . . . . . . . . . . . . . . . . . . . . . . . . . . 2
1.4 Contribution . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3
2 Related Work . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
2.1 DoS Attacks and Prevention Mechanisms . . . . . . . . . . . . . . . . 5
2.2 Methods for Computing Source Reachability . . . . . . . . . . . . . . 6
2.3 Scalable Network Simulation . . . . . . . . . . . . . . . . . . . . . . . 7
2.4 Power-Law Network Topology . . . . . . . . . . . . . . . . . . . . . . 8
3 Route-based Distributed Packet Filtering Protocol . . . . . . . . . . . . . . 9
3.1 Overview of Route-based Distributed Packet Filtering . . . . . . . . . 9
3.2 Protocol Design Issues . . . . . . . . . . . . . . . . . . . . . . . . . . 11
3.2.1 Filter Look-up . . . . . . . . . . . . . . . . . . . . . . . . . . . 12
3.2.2 Filter Update . . . . . . . . . . . . . . . . . . . . . . . . . . . 13
3.2.3 BGP and Its Extension . . . . . . . . . . . . . . . . . . . . . . 16
3.3 Route-based DPF Protocol . . . . . . . . . . . . . . . . . . . . . . . . 18
3.3.1 Architecture . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18
3.3.2 DPF-lookup Protocol . . . . . . . . . . . . . . . . . . . . . . . 19
3.3.3 Semi-maximal Filter Table . . . . . . . . . . . . . . . . . . . . 19
3.3.4 BGP Extension . . . . . . . . . . . . . . . . . . . . . . . . . . 19
3.3.5 DPF-update Protocol . . . . . . . . . . . . . . . . . . . . . . . 25
3.4 Improvement of DPF-update Protocol for Fault-tolerance . . . . . . . 26
4 Performance Evaluation of Route-based DPF Protocol . . . . . . . . . . . 29
4.1 Overall Objective . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29
4.2 Performance Measures . . . . . . . . . . . . . . . . . . . . . . . . . . 30
4.2.1 Stability . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31
4.2.2 Safety Violation . . . . . . . . . . . . . . . . . . . . . . . . . . 33
4.2.3 Staleness . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34
4.2.4 Containment . . . . . . . . . . . . . . . . . . . . . . . . . . . 35
4.2.5 Traceback . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35
4.3 Experimental Setup . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35
4.4 Stability . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36
4.5 Safety Violation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 39
4.6 Staleness . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 40
4.7 Containment . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 41
4.8 Traceback . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 42
5 Dynamic DPF Simulator . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45
5.1 Overview of DaSSFNet . . . . . . . . . . . . . . . . . . . . . . . . . . 45
5.2 DaSSFNet-based Parallel Network Simulation Environment . . . . . . 52
5.2.1 Description . . . . . . . . . . . . . . . . . . . . . . . . . . . . 52
5.2.2 Automatic Model Configuration and Partitioning . . . . . . . 56
5.2.3 Measurement Framework . . . . . . . . . . . . . . . . . . . . . 63
5.2.4 Protocol Modeling . . . . . . . . . . . . . . . . . . . . . . . . 70
6 Large-scale Network Simulation . . . . . . . . . . . . . . . . . . . . . . . . 83
6.1 Experimental Setup . . . . . . . . . . . . . . . . . . . . . . . . . . . . 83
6.1.1 System Configuration . . . . . . . . . . . . . . . . . . . . . . . 83
6.1.2 Benchmark Topologies . . . . . . . . . . . . . . . . . . . . . . 84
6.1.3 Simulation Setup . . . . . . . . . . . . . . . . . . . . . . . . . 85
6.2 Performance and Utility of Comprehensive Measurement Subsystem . 86
6.2.1 Methodology . . . . . . . . . . . . . . . . . . . . . . . . . . . 86
6.2.2 Memory Requirement Monitoring . . . . . . . . . . . . . . . . 87
6.2.3 Memory Consumption by Tables . . . . . . . . . . . . . . . . 89
6.2.4 Memory Consumption by Messages . . . . . . . . . . . . . . . 90
6.2.5 Memory Consumption and Counting of Major Kernel Events . 91
6.2.6 CPU Load Monitoring . . . . . . . . . . . . . . . . . . . . . . 93
6.2.7 Communication Cost . . . . . . . . . . . . . . . . . . . . . . . 94
6.3 Scalability of Partitioning . . . . . . . . . . . . . . . . . . . . . . . . 94
6.3.1 Completion Time . . . . . . . . . . . . . . . . . . . . . . . . . 96
6.3.2 Balanced Memory Offloading . . . . . . . . . . . . . . . . . . 98
7 Conclusion and Future Work . . . . . . . . . . . . . . . . . . . . . . . . . . 101
LIST OF REFERENCES . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 103
LIST OF TABLES
Table Page
6.1 Statistics of the benchmark topologies. . . . . . . . . . . . . . . . . . 85
6.2 Parameter settings for TCP. . . . . . . . . . . . . . . . . . . . . . . . 86
LIST OF FIGURES
Figure Page
3.1 Illustration of route asymmetry. . . . . . . . . . . . . . . . . . . . . . 15
3.2 Protocol stack of route-based DPF protocol. . . . . . . . . . . . . . . 18
3.3 Illustration of BGP Extension mechanism. . . . . . . . . . . . . . . . 20
3.4 BGP-REFLECT message format. . . . . . . . . . . . . . . . . . . . . 21
4.1 BGP routing stability as a function of simulation time. . . . . . . . . 37
4.2 Consistency of filter tables as a function of simulation time. . . . . . 38
4.3 Safety violation as a function of simulation time. . . . . . . . . . . . . 39
4.4 Staleness as a function of simulation time. . . . . . . . . . . . . . . . 40
4.5 Containment as a function of simulation time. . . . . . . . . . . . . . 41
4.6 Traceback as a function of resolution. . . . . . . . . . . . . . . . . . . 42
4.7 Traceback as a function of simulation time. . . . . . . . . . . . . . . . 43
5.1 A simple network specification in DML, which consists of one router. 50
5.2 Mapping of partition groups onto distributed machines. . . . . . . . . 50
5.3 A DML snippet of a point-to-point link. . . . . . . . . . . . . . . . . 51
5.4 Network protocol models in the Dynamic DPF Simulator. . . . . . . . 53
5.5 System architecture of the Dynamic DPF Simulator. . . . . . . . . . 55
5.6 A sample Meta-DML input file. . . . . . . . . . . . . . . . . . . . . . 58
5.7 Growth of AS-level Internet graph. . . . . . . . . . . . . . . . . . . . 58
5.8 300-node AS-level Internet graph. . . . . . . . . . . . . . . . . . . . . 60
5.9 Pseudo code of phase 0. . . . . . . . . . . . . . . . . . . . . . . . . . 62
5.10 Pseudo code of phase 1. . . . . . . . . . . . . . . . . . . . . . . . . . 62
5.11 Local and remote message passing procedures. . . . . . . . . . . . . . 65
5.12 A DML snippet for global measurement . . . . . . . . . . . . . . . . . 68
5.13 A DML snippet for local measurement at IP. . . . . . . . . . . . . . . 69
5.14 An illustration of a peering relationship between one border router in AS 1 and another in AS 2. . . . . . . . . . . . . . . . . . . . . . . . . 71
5.15 The architecture of BGP protocol model with its tables. . . . . . . . 72
5.16 The class hierarchy of BGP message types. . . . . . . . . . . . . . . . 72
5.17 The architecture of BGP protocol model with its timers. . . . . . . . 73
5.18 A DML snippet of BGP-4 model. . . . . . . . . . . . . . . . . . . . . 74
5.19 Interaction of BGP and DPF-update. . . . . . . . . . . . . . . . . . . 75
5.20 A DML snippet of DPF-update model. . . . . . . . . . . . . . . . . . 76
5.21 A DML snippet of DPF-lookup model. . . . . . . . . . . . . . . . . . 77
5.22 A DML snippet of UDP model. . . . . . . . . . . . . . . . . . . . . . 78
5.23 A DML snippet of CBR traffic generator. . . . . . . . . . . . . . . . . 79
5.24 A DML snippet of CBR massive attacker. . . . . . . . . . . . . . . . 80
5.25 Mechanism of ShutDown model. . . . . . . . . . . . . . . . . . . . . . 81
5.26 A DML snippet of ShutDown model. . . . . . . . . . . . . . . . . . . 81
6.1 Hardware configuration of a Linux cluster used for AS-level benchmarking. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 84
6.2 Power-law connectivity of the benchmark topologies. . . . . . . . . . 85
6.3 Memory consumption as a function of simulation time. (a) Memory consumption is classified into three categories—tables, messages, and kernel events. (b) The categories are further subdivided into fine-granular components. . . . . . . . . . . . . . . . . . . . . . . . . . . . 88
6.4 Memory consumption by protocol tables: BGP Adj-RIB-In, BGP Loc-RIB, and IP. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 89
6.5 Memory consumption by protocol messages: BGP, TCP send buffer, TCP receive buffer, and IP. . . . . . . . . . . . . . . . . . . . . . . . 91
6.6 Major types of KernelEvent objects. (a) shows cumulative counts of KernelEvent object creation as a function of simulation time for the major types. (b) shows total memory consumption by KernelEvent objects and memory consumption by major types of KernelEvent objects as a function of simulation time. . . . . . . . . . . . . . . . . . . . . . 92
6.7 CPU load distribution over 16 Linux workstations. . . . . . . . . . . . 94
6.8 (a) Completion time as a function of parallelism for different benchmark graphs. (b) Completion time as a function of problem size for 16, 20, and 24 machines. . . . . . . . . . . . . . . . . . . . . . . . . . . . 96
6.9 (a) Total memory watermark as a function of parallelism for different benchmark topologies. (b) Average and maximum memory watermark as a function of parallelism for different benchmark graphs. . . . . . . 98
6.10 (a) Total memory watermark as a function of problem size for 16, 20, and 24 machines. (b) Average and maximum memory watermark as a function of problem size for 16, 20, and 24 machines. . . . . . . . . . 99
ABSTRACT
Kim, HyoJeong. M.S., Purdue University, December, 2003. Performance Evaluation of Route-based Distributed Packet Filtering for DDoS Prevention in Large-scale Networks. Major Professor: Kihong Park.
This thesis studies performance evaluation of route-based distributed packet fil-
tering (DPF) for spoofed distributed denial of service (DDoS) attack prevention in
large-scale networks under dynamic network conditions. Our contribution is three-
fold.
We design and implement a route-based DPF protocol which computes route-
based filter tables dynamically in the presence of IP (Internet Protocol) routing table
updates governed by BGP (Border Gateway Protocol), the Internet's inter-domain
routing protocol. By introducing an additional signalling message type to BGP, our
solution discovers source reachability information despite the destination-based and
policy-based characteristics of BGP that make it prone to generating asymmetric routes.
We evaluate proactive protection performance of route-based DPF under dynamic
network conditions including node failures and resulting transient system states.
Benchmarking is carried out in large-scale Internet measurement topologies where
we show that route-based DPF is robust and effective with respect to both proactive
(containment) and reactive (traceback) performance.
To facilitate large-scale simulation-based DDoS performance evaluation, we built
the Dynamic DPF Simulator as an extension of DaSSFNet. By incorporating auto-
mated network configuration, partitioning, and run-time measurement and monitor-
ing, we show that scalable network simulation is achieved by enabling efficient memory,
CPU, and communication load balancing in workstation clusters.
1 INTRODUCTION
1.1 Motivation
By clogging the Internet, denial of service (DoS) attacks impede the availability
of resources and services. The severity and prevalence of DoS attacks are increasing,
and their consequent malfunctions are experienced by the Internet population at large.
Moore et al. [1] observed that more than 12,000 denial of service attacks occurred
against 5,000 distinct targets during a three-week period. Whereas past incidents
mostly targeted commercial web sites, universities, and organizations, more recent
attacks have targeted the network infrastructure such as major root DNS servers [2–5].
In addition, Internet worms—self-replicating malicious code—have been used as an
agent for launching massive distributed DoS (DDoS) attacks [6].
Route-based distributed packet filtering (DPF) [7] is a recently advanced solution
that provides proactive and reactive protection against spoofed DDoS attacks, in
which the source address of attack packets is forged. Given the routing information of the
Internet, route-based DPF at strategic border routers inspects the source address of
an incoming IP packet. If its source address turns out to be valid, i.e., unspoofed,
with respect to the routing information, it forwards the packet to IP for routing.
Otherwise, it discards the packet proactively. As many DDoS attacks use IP spoofing
to hinder source identification, proactive prevention guards the core and end points
of the network from attack. When a spoofed DDoS attack succeeds in
penetrating the proactive shield, route-based DPF’s reactive protection localizes its
physical source within a few sites. Route-based DPF has been evaluated in large but
static network environments assuming availability of relevant routing information,
and shown to be effective. However, a protocol for computing route-based filter
tables has been missing, as has an assessment of its effectiveness and robustness in
the presence of dynamic route changes.
1.2 Objective
The objective of this thesis is threefold. First, we design and implement a route-
based DPF protocol for computing valid source address information. Route-based
DPF filters at distributed filter sites are updated in the presence of dynamic route
changes caused by BGP, the Internet's inter-domain routing protocol. Second, we carry
out performance evaluation of the route-based DPF protocol under dynamic network
conditions, including system failures. We evaluate fault-tolerance of route-based DPF
in large-scale autonomous system (AS)-level Internet measurement topologies. Third,
in order to perform large-scale AS-level Internet benchmarking, we built a scalable
simulation environment extending DaSSFNet, a distributed simulation platform for
workstation clusters. We perform comprehensive simulation benchmarks to determine
scalability in large-scale networks.
1.3 Technical Challenges
Route-based DPF has to infer source reachability from inter-domain routing infor-
mation in order to compute filter tables at distributed filter sites. However, IP routing
is based on destination reachability. When a packet arrives at a router, the router
is interested in only the destination address of the packet, not its source address.
Hence, existing routing protocols, in particular, BGP (Border Gateway Protocol),
do not provide source reachability information. Moreover, due to asymmetry of IP
routing, we cannot infer source reachability directly from the destination reachability
information. That is, we cannot ascertain that the path from a source node to a des-
tination node is the same as from the destination to the source in reverse order. In
addition, BGP uses administrative policies for determining routes that are not neces-
sarily of technical nature. As a result, routing information received at a border router
may be biased by its upstream routers. Thus, a major challenge in deploying
route-based DPF in the global IP Internet is inferring source reachability from BGP.
For large-scale performance evaluation involving dynamic network simulation, a
scalable simulation environment that is capable of providing necessary system sup-
port including monitoring and measurement, memory management, and partitioning
is crucial. Since we are aiming to carry out performance evaluation of route-based
DPF on large-scale AS-level Internet measurement topologies, our environment must
support up to 12,000-node/26,000-edge networks that may contain 144,000,000 rout-
ing entries. A critical problem in scalable network simulation is memory consumption
stemming from both static requirements such as routing tables and dynamic require-
ments such as protocol messages. In AS-level Internet simulation, each node represents
an AS and is modelled as a single border router. Thus, in addition
to IP routing tables there exists the memory requirement of BGP’s internal tables.
For our performance evaluation, route-based DPF filter tables need to be maintained
alongside IP and BGP tables. Hence, we need to build a scalable simulation environment
for AS-level Internet simulations that supports large-scale memory requirements
while achieving parallel speed-up.
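As a back-of-the-envelope check of the figures above (the per-entry byte size is an illustrative assumption, not a number from the thesis), the quadratic routing state implied by such topologies can be sketched as:

```python
# 12,000 ASes, each modelled as a single border router; destination-based
# routing keeps one entry per (node, destination) pair, i.e. n^2 in total.
n_nodes = 12_000
routing_entries = n_nodes ** 2
assert routing_entries == 144_000_000

# At a hypothetical 16 bytes per entry, the tables alone exceed 2 GB,
# well beyond a single workstation's RAM of the era -- hence the need
# for memory-balanced partitioning across a cluster.
bytes_per_entry = 16  # assumption for illustration only
assert routing_entries * bytes_per_entry > 2 * 1024**3
```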
1.4 Contribution
We have designed a route-based DPF protocol, which updates filter tables dy-
namically as BGP changes IP routing tables. The route-based DPF
protocol relies on a BGP Extension that allows source reachability information to be
deduced from routing related signalling messages. Based on observable information
obtained from the BGP Extension, the route-based DPF protocol at filter-deployed
border routers infers validity of source addresses for each interface and updates filter
tables accordingly. In addition, a counter-based table design facilitates incremental
filter update in tandem with incremental BGP routing update.
We have implemented the route-based DPF protocol in a process-oriented simula-
tion environment, DaSSFNet [8]. The process-oriented abstraction allows simulation
models to be almost as comprehensive as actual system-level protocol implementa-
tions. Since the filter update component of the route-based DPF protocol resides
above the transport layer as in BGP, it interacts with modules in the lower layers of
the protocol stack via a BSD-like socket interface. Thus, the simulation model is built
independently from its underlying simulation environment. This working protocol
model, together with the design decisions made in the implementation phase, is useful for future
system building work in network processor platforms.
We have carried out performance evaluation in large-scale AS-level Internet mea-
surement topologies [9]. Our performance evaluation results on fault-tolerant protec-
tion of route-based DPF in Internet measurement topologies are useful for assessing its
effectiveness and robustness in dynamic environments. Moreover, the scalable simu-
lation environment serves as a base for researching filter placement issues as well as
more thorough performance evaluation with respect to infrastructure attack against
route-based DPF.
In the context of distributed simulation, partitioning of a given simulation topol-
ogy affects the simulation’s completion time. Memory requirement imbalance may
cause swapping in the virtual memory management system, resulting in increased
execution time. We have devised a new partitioning algorithm for power-law net-
work topologies, characteristic of Internet AS measurement graphs, which achieves
balanced distribution of memory requirement as well as utilization of CPU resources.
2 RELATED WORK
In the following, we review related work across key areas relevant to the thesis.
2.1 DoS Attacks and Prevention Mechanisms
Denial of Service (DoS) attacks overburden a target system or network by de-
manding more resources than it can provide. As seen in classical DoS attacks
[2,10,11], the exhausted resources can be network bandwidth, processes, or network
connections. A network-based distributed DoS (DDoS) attack typically forges the source
address of attack packets [12], a practice called spoofing. Although some recent attacks used
agents to generate DoS attack traffic with unspoofed source addresses, initial attack
packets for launching remote agents employ IP source address spoofing [13]. Recent
incidents reveal that Internet infrastructure such as core routers and Domain Name
System (DNS) servers has become a target of DoS attacks [5].
Research on source identification—also called IP traceback [14]—has looked at
ways to localize the physical source of attack traffic. Manual recursive link test-
ing [15], audit-trail approaches that use traffic logs at routers or gateways [16], and
packet-based traceback mechanisms [14, 17–20] all belong to this class of mech-
anisms. In contrast to route-based DPF, these approaches are inherently reactive—
damage must occur before traceback can be initiated—and cannot provide proactive
protection where attack packets are discarded before they reach a victim.
Packet filtering at ingress or egress points of a domain prevents DDoS attack traffic
proactively [21–23]. For example, a firewall at an egress point can check the source
address of exiting IP packets. If the source address of a packet is not
from the address space of its domain, the firewall can discard it, concluding that
the packet is spoofed. A limitation of egress filtering is that, with partial deployment,
there remain too many domains from which spoofed DDoS attacks can be initiated
by compromised hosts. Ingress filtering only works at transit providers vis-à-vis stub
ASes which limits its effectiveness. In this sense, route-based DPF can be viewed as
a generalization of ingress filtering.
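The egress check described above can be sketched as a membership test against the domain's own address space; the prefix and addresses below are illustrative, not from the thesis:

```python
import ipaddress

# Hypothetical address space of the local domain (illustrative prefix).
DOMAIN_PREFIX = ipaddress.ip_network("128.10.0.0/16")

def egress_filter(src_addr: str) -> bool:
    """Return True if an exiting packet should be forwarded.

    A packet whose source address lies outside the domain's own address
    space cannot legitimately originate here, so it is deemed spoofed
    and discarded.
    """
    return ipaddress.ip_address(src_addr) in DOMAIN_PREFIX

# A packet sourced inside the domain passes; a spoofed one is dropped.
assert egress_filter("128.10.3.25") is True
assert egress_filter("192.0.2.1") is False
```

The same test inverted (checking packets entering rather than leaving the domain) gives the ingress-filtering variant discussed above.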
Recently, Mayday [24] has demonstrated a key aspect of route-based DPF, where
it functions as a non-cryptographic authentication mechanism. Mayday applied source
address (or port number) based network-layer packet filtering as a lightweight au-
thentication mechanism for protecting servers of Secure Overlay Services (SOS) [25].
In this framework, router-based packet filters are deployed around the server, and, by
inspecting source address (or port number) of every incoming packet, they forward
only valid packets to SOS servers.
2.2 Methods for Computing Source Reachability
A major difficulty in route-based DPF table calculation lies in inference of source
reachability from destination-based IP routing, and in the presence of routing asym-
metry. Whereas our approach extends BGP-4 introducing an additional signalling
message type, SAVE [26] and OPCA [27] address the same problem in different do-
mains with different design principles. The objective of SAVE [26] is to enforce IP
packets to carry valid source addresses in a router-level network. Since SAVE is
designed to be routing-protocol independent, it uses additional data structures to
maintain source reachability information at each SAVE router; it is generic in nature,
with its main contribution lying in efficient incremental update. The
key technical challenge with respect to protocol implementation is to realize route-
based DPF—semi-maximal or maximal filtering—as a minimal footprint companion
protocol of an underlying routing system. In the case of the route-based DPF proto-
col advanced in this thesis, this is done in the context of BGP utilizing its AS-PATH
message information. OPCA [27] proposes an overlay network on top of BGP as a
policy control architecture, which is applied for faster route fail-over and inbound
traffic load balancing. In the case of OPCA, a central repository maintains inter-AS
relationships and the Internet hierarchy, which are used to enhance routing performance.
2.3 Scalable Network Simulation
Simulating large-scale networks such as the Internet has been a challenging prob-
lem due to the size and complexity of the global IP Internet [28, 29]. ns-2 [30], a
packet-level discrete-event network simulator, has been used widely for research in-
cluding TCP congestion control, multicasting, and wireless networks. Although ns-2
is well suited for small-scale simulation, the memory requirements of routing tables,
messages, and timer events in large-scale networks make it ill suited for scalable network
simulation. Moreover, its lack of a process-oriented abstraction hinders flexible and
accurate evaluation of dynamic protocols, including those pertaining to dynamic rout-
ing.
To tackle these challenges and limitations, several studies have looked into a
fluid-based simulation approach [31–33], which represents network traffic as a fluid
flow so that the simulator only keeps track of changes in flow rates
without maintaining individual packet events. A number of projects have studied
parallel/distributed simulation techniques in order to utilize parallel/distributed re-
sources in multi-processor and distributed-memory environments for large-scale sim-
ulation [8, 34–36]. A principal focus of these studies has been on synchronization
and parallel speed-up issues. Recent work proposed techniques aimed at enabling
large-scale network simulation by using memory resources thriftily, emphasizing that
routing-related information inherently requires a large amount of memory that can be-
come a bottleneck [37, 38].
Our environment, the Dynamic DPF Simulator, facilitates large-scale network
simulation by extending DaSSFNet [8], a C++ based realization of SSF for worksta-
tion clusters and multi-processor systems. SSF is a process-oriented discrete event
simulation framework aimed at flexible, accurate, and efficient simulation. Adopt-
ing DaSSF’s scalable simulation environment together with a process-oriented world
view [39], DaSSFNet provides a network simulation infrastructure which is amenable
to full-fledged network protocol implementation on commodity workstation/PC clus-
ters. Existing tools, including DaSSFNet, provide a parallel simulation kernel
capable of efficient synchronization and export standardized APIs (e.g., SSFNet's
object classes and a BSD-like socket API); however, they lack tools for automated net-
work configuration, partitioning, and efficient dynamic monitoring. A key problem is
large-scale topology partitioning which has a dominant influence on performance with
respect to memory, CPU, and communication load balancing. Our system building
work addresses these issues.
2.4 Power-Law Network Topology
Recent measurements of various information networks, including Internet domain
networks [40], the World Wide Web [41, 42], metabolic networks [43], and a variety
of social networks [44–46] have shown that connectivity in these networks follows a
distinct pattern: most are connected to a few, but a few are connected to many.
These networks are sometimes collectively referred to as power-law networks as there
is a power-law relation between the degree and frequency of nodes of that degree:
Pr{deg(u) = k} ∝ k^(−β)
In [7] the impact of power-law network connectivity on route-based DPF's pro-
tection performance has been studied. It is shown that the power-law connectivity of
the Internet AS-level topology plays a crucial role in route-based DPF's effectiveness
at achieving proactive and reactive protection with sparse filter placement. Theoretical
studies that extend classical random graph theory to power-law graphs based on degree
sequences include [47].
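A minimal sketch of the power-law relation above: the exponent β can be estimated by a log-log least-squares regression of degree frequency against degree. The synthetic degree sequence and constants below are illustrative, not measurement data:

```python
import math
from collections import Counter

def fit_power_law_exponent(degrees):
    """Estimate beta in Pr{deg(u) = k} ∝ k^(-beta) by least-squares
    regression of log-frequency against log-degree."""
    counts = Counter(degrees)
    xs = [math.log(k) for k in counts]
    ys = [math.log(c / len(degrees)) for c in counts.values()]
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    return -slope  # beta is the negated log-log slope

# Synthetic sequence with frequency ∝ k^-2: degree k appears ~10000/k^2 times.
degrees = [k for k in range(1, 50) for _ in range(round(10000 / k**2))]
beta = fit_power_law_exponent(degrees)
assert abs(beta - 2.0) < 0.1  # recovers the planted exponent
```

This is the naive log-log fit; it suffices to illustrate the "most connect to few, few connect to many" pattern, though more careful estimators exist.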
3 ROUTE-BASED DISTRIBUTED PACKET FILTERING
PROTOCOL
Route-based Distributed Packet Filtering (DPF) [7] has been proposed as a proactive
and reactive solution for distributed denial of service (DDoS) attacks which use source
IP address spoofing. Aiming at Internet autonomous system (AS) level protection
against DDoS attack, we have designed a protocol for route-based DPF.
This chapter is organized as follows. First, we introduce the concept of route-
based DPF for technical background. Next, we discuss several protocol design issues
surrounding route-based DPF. This is followed by the protocol specification. Finally,
we describe an improvement of the protocol for enhancing fault-tolerance.
3.1 Overview of Route-based Distributed Packet Filtering
This section presents the idea of route-based distributed packet filtering (DPF)
for technical background, summarizing [7]. First, the concept of route-based packet
filtering is described, followed by two filter types—maximal and semi-maximal filters.
Next, the concept of distribution of route-based packet filters and their synergistic
effect are described. Finally, issues regarding filter placement are discussed.
Route-based DPF assumes that each node1 has complete knowledge of routing
over the entire network. With this assumption, each node verifies if an incoming IP
packet is valid, i.e., non-spoofed, when it arrives through a specific link. If the packet
is deemed spoofed from the routing information, the packet is discarded. On the
other hand, if the routing information cannot conclusively determine the validity of
the source address, the node forwards the packet following IP.
1A node can be an AS in an AS-level network or a router in a router-level network. We will focus on AS-level networks in this section.
Route-based DPF includes two types of filters—maximal and semi-maximal. A
filter is a mechanism for determining if a packet is valid or not over a link on which the
packet arrives. Given a graph G = (V, E) which represents the Internet AS topology
and routing information R, a maximal filter Fe at a link e = (u, w) ∈ E is defined as
Fe(s, t) = 0, if e ∈ R(s, t);
           1, otherwise.
Here, R(s, t) represents a set of routes from a source IP address s to a destination
IP address t. The output 0 means that a given packet is valid—i.e., non-spoofed,
and the output 1 means that it is invalid—i.e., spoofed. A maximal filter evaluates
the validity of an incoming packet M(s, t) by checking, against R(s, t), whether a
route from s to t passes through the link e. Each node maintains a separate table per
link for storing the validity flag of incoming packets for all source and destination
pairs. This requires in general O(n2) space, where n is the number of nodes in the
network.
A semi-maximal filter over a link e = (u, w) ∈ E is defined as follows.
Fe(s, t) = 0, if e ∈ R(s, v) for some v ∈ V ;
           1, otherwise.
A semi-maximal filter uses only the source IP address of a packet to determine its
validity. It checks whether the link on which a packet M(s, t) arrives belongs to a
routing path from the source IP address s to some destination node, irrespective of the destination
IP address t inscribed in the IP header. In this setting, a semi-maximal filter over a
link maintains a table which keeps validity information based on source address only.
This requires O(n) space.
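The two filter types can be contrasted in a small sketch. This is an illustrative model only (the route set, node names, and link below are invented, not from the thesis): given routes R as lists of traversed links, a maximal filter for a link keeps one flag per (source, destination) pair, while a semi-maximal filter keeps one flag per source.

```python
# Illustrative route set: (source, destination) -> list of links traversed.
R = {
    ("s1", "t1"): [("s1", "u"), ("u", "w"), ("w", "t1")],
    ("s1", "t2"): [("s1", "u"), ("u", "x"), ("x", "t2")],
    ("s2", "t1"): [("s2", "u"), ("u", "w"), ("w", "t1")],
}
e = ("u", "w")  # the link whose filter we build

# Maximal filter: one entry per (source, destination) pair -> O(n^2) space.
maximal = {(s, t) for (s, t), links in R.items() if e in links}

# Semi-maximal filter: one entry per source only -> O(n) space.
semi_maximal = {s for (s, t), links in R.items() if e in links}

def maximal_valid(s, t):
    # Output 0 (valid) iff e lies on some route in R(s, t).
    return (s, t) in maximal

def semi_maximal_valid(s):
    # Valid iff e lies on a route from s to *some* destination v.
    return s in semi_maximal
```

Note that a packet with source s1 and destination t2 arriving on e is invalid under the maximal filter but valid under the semi-maximal one, since s1 legitimately uses e to reach t1.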
When more than one node in a network enables its filtering functionality, filtering
is distributed. The collaborative effect of route-based distributed packet filtering
(DPF) is two-fold—proactive and reactive. Proactive protection (a.k.a., containment)
means that route-based DPF discards spoofed IP packets proactively before they can
reach their target. Reactive protection (a.k.a., traceback) is in effect when route-
based DPF cannot proactively filter out spoofed attack traffic. Reactive protection
means that, upon receiving an IP packet—spoofed or non-spoofed—route-based DPF
can localize its physical source.
Distributed filter placement involves two issues—coverage ratio and selection of
nodes for a given coverage ratio. Coverage ratio is defined as the fraction of nodes
where filtering is enabled. For a given coverage ratio, the strategy for selecting the
filter nodes affects proactive and reactive protection performance. AS-level Internet
topology has been shown to exhibit power-law connectivity [40]. This implies that
there are a few high degree nodes which are connected to many low degree nodes.
Thus, exploiting the power-law nature of the AS-level Internet topology, we can reduce
the required coverage ratio by placing filters at high degree nodes. In other words,
power-law connectivity information provides a strategy for selecting a set of filter
nodes while minimizing the coverage ratio.
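As a toy illustration of this placement strategy (the graph and coverage ratio below are invented for the example), one can rank nodes by degree and enable filtering at the top fraction:

```python
# Degree-based filter placement sketch on a made-up topology.
from collections import defaultdict

edges = [(1, 2), (1, 3), (1, 4), (1, 5), (2, 6), (3, 6), (4, 7), (5, 7)]
degree = defaultdict(int)
for u, v in edges:
    degree[u] += 1
    degree[v] += 1

def select_filter_sites(coverage_ratio):
    """Pick the highest-degree nodes for a given coverage ratio."""
    n = len(degree)
    k = max(1, int(n * coverage_ratio))
    ranked = sorted(degree, key=lambda x: degree[x], reverse=True)
    return set(ranked[:k])

sites = select_filter_sites(0.3)  # ~30% coverage: the hub (node 1) is chosen
```

In a power-law graph the few hub nodes touch most links, which is why a small coverage ratio placed by degree can already intercept much of the spoofed traffic.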
3.2 Protocol Design Issues
Sharing a common protocol architecture with IP, the route-based DPF protocol
is composed of two major parts—filter look-up and filter update. The filter look-up
component does line rate packet processing to determine source address validity. As
with IP, the filter look-up component functions on the data plane at line rate. On
the other hand, the filter update component updates route-based DPF filter tables
as routes in the network change dynamically. Similarly to routing protocols such as
BGP, the filter update component operates in the control plane, at slower time scales.
In the next section, we discuss issues in designing filter look-up and filter update
components, and present our approach for solving these issues.
3.2.1 Filter Look-up
Filter look-up resides between the network interface layer and internet layer in the
Internet Reference Model. Filter look-up implements packet forwarding/discarding
depending on the validity of a packet’s source address, i.e., it is a semi-maximal filter.
Semi-maximal filtering provides protection comparable to that of maximal filtering,
while requiring O(n) space instead of O(n²) [7].
Note that, in the context of AS-level route-based DPF protocol, we interpret an
IP address of an AS node as an IP prefix within the administrative domain of the AS
node2. In other words, we assume that every AS node has a unique non-overlapping
IP prefix and we can check the originating AS node from a source IP address by
inspecting the prefix of the IP address.
We have designed a counter-based semi-maximal filter, which is suitable for incre-
mental filter update. Each entry within the counter-based semi-maximal filter table
includes a separate counter, and the counter represents the number of destinations
where the link e is traversed from each source IP address. The counter is interpreted
as false (valid) if its value is positive; otherwise, it is interpreted as true (invalid).
A straightforward semi-maximal filter design is to maintain a table which consists
of entries for all source IP prefixes, where each entry includes a Boolean flag. If it is
true—1 in the definition—we consider it as invalid (spoofed). Otherwise, we consider
it as valid (non-spoofed). If a link e belongs to a set of routes from a source IP
address s to some destination IP address, we set the entry of the source’s IP prefix as
false. Otherwise, the entry has true as its value. Thus, when a packet, whose source
IP address is s, arrives, filter look-up will check its validity by performing longest
prefix matching.
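The look-up step can be sketched with Python's standard ipaddress module; the table contents here are illustrative only, and the linear scan stands in for whatever longest-prefix-match data structure a real implementation would use:

```python
# Longest-prefix-match look-up over a per-link filter table.
import ipaddress

# prefix -> validity flag (False = valid/non-spoofed, True = invalid/spoofed).
table = {
    ipaddress.ip_network("10.0.0.0/8"): True,
    ipaddress.ip_network("10.1.0.0/16"): False,
}

def look_up(src):
    """Return the flag of the longest matching prefix (True if no match)."""
    addr = ipaddress.ip_address(src)
    best = None
    for prefix, flag in table.items():
        if addr in prefix and (best is None or prefix.prefixlen > best.prefixlen):
            best, best_flag = prefix, flag
    return best_flag if best is not None else True

# 10.1.2.3 matches both prefixes; the more specific /16 entry wins.
```

Treating "no matching entry" as invalid mirrors the table-compression convention described in Section 3.3.3, where only valid sources are stored.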
BGP handles dynamic state changes incrementally, whereas OSPF and RIP do
so periodically. The route-based DPF protocol, in the AS context, follows BGP and
2In reality, some AS node may include more than one IP prefix within its domain. In some cases, more than one AS node may have a common IP prefix within their domains.
hence its updates are carried out incrementally. The following example shows that
route-based DPF must be extended to handle incremental changes.
When network state is dynamic, a source IP address, which was valid earlier, may
become invalid later. Formally, let R0(s, t) be the set of routes from s to t calculated
at time 0 and let R1(s, t) be the set of routes calculated at time 1. When the state of
the network is transient, it is possible that e ∈ R0(s, t) but e ∉ R1(s, t). When a filter
entry has false as its flag value, it is difficult to tell whether the link in question is
used only to reach the destination t from source s, or whether the link is also used
to reach some other destination address. In the first case, the entry should be updated
to true; however, in the presence of continual changes and uncertainty, this may
violate safety if the link is in fact used to reach some other destination from source s.
As mentioned in [7], route-based DPF is safe in the sense that it never discards valid,
i.e., non-spoofed packets. In the second case, the entry should remain false. However,
if the link is in fact not used by any other source-destination pair, the entry becomes
stale. An entry for a source IP address s of a filter table at a link e is stale when a
semi-maximal filter cannot filter out spoofed DDoS packets, whose source IP address
is forged as s.
When the above transient change happens, our counter-based semi-maximal filter
decrements the corresponding entry’s counter value. If the link e is used to reach
from the source IP address s to some other destination, the counter value remains
positive. Thus, safety is not violated. If the link is used only for a single (s, t) pair,
the counter will reach 0 after decrement. Hence, the entry is prevented from being
stale.
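The counter mechanics described above can be sketched as follows. This is an illustrative model, not the thesis implementation; the class and method names are invented:

```python
# Counter-based semi-maximal filter entry management for one link e.
from collections import defaultdict

class CounterFilter:
    def __init__(self):
        self.count = defaultdict(int)  # source -> # of (s, t) routes using e

    def add_route(self, s):
        # e now lies on a route from s to one more destination.
        self.count[s] += 1

    def del_route(self, s):
        # e no longer lies on the route from s to some destination.
        self.count[s] -= 1
        if self.count[s] <= 0:
            del self.count[s]          # entry removed -> no stale state

    def valid(self, s):
        return self.count.get(s, 0) > 0

f = CounterFilter()
f.add_route("AS1")   # e used on route AS1 -> t1
f.add_route("AS1")   # ... and on route AS1 -> t2
f.del_route("AS1")   # route to t1 moves off e: AS1 stays valid (safety kept)
f.del_route("AS1")   # route to t2 moves off e too: entry expires (no staleness)
```

The counter thus resolves exactly the ambiguity described above: a withdrawal only invalidates the source once no (s, t) pair uses the link.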
3.2.2 Filter Update
As mentioned in [7], the most challenging problem in designing a filter update
protocol lies in the fact that IP focuses on destination reachability, but not
necessarily on source reachability. This can induce asymmetry of routing, which makes
it difficult to infer source reachability. Asymmetry of IP routing is common in inter-
domain routing where non-technical, administrative policies are applied due to BGP
implementing policy routing. In this section, we focus on these challenges. Our
approach, a BGP Extension, for solving these challenges is introduced in the next
section.
For background, let us describe BGP’s route selection and advertisement pro-
cedure. According to RFC 1771 of BGP-4 [48], Decision Process selects routes for
subsequent advertisement by applying policies to route updates stored in Adj-RIB-In
(a table containing received route updates per peer). The output of Decision Process
is a set of routes to be advertised, and they are stored in Adj-RIB-Out. Decision Pro-
cess includes three distinct phases—Phase 1, Phase 2, and Phase 3. Phase 1 is a step
for calculating the degree of preference for each received route update by applying
policies. Let us call them update policies. During Phase 2, a best route is selected
out of the received route updates for each distinct destination. When more than one
candidate route update for a destination has the same preference, a tie-breaking
procedure is applied. Each BGP speaker has a unique BGP Identifier,
which is set to an IP address assigned to it. When a BGP peering relationship is es-
tablished, participating BGP speakers exchange their BGP Identifier values. Among
routes selected during Phase 2, some routes are chosen for advertisement to peer BGP
speakers per policy during Phase 3. Let us call them advertisement policies.
As with other routing protocols, BGP focuses on destination reachability, which
is sufficient for calculating IP routing tables. Since IP routing relies on destination
IP address only, destination reachability is a sufficient condition. A BGP speaker at
a destination AS creates and forwards reachability information (i.e., route update)
to its neighbors. Once reachability information is sent out, the BGP speaker at the
destination AS does not know which route was selected by BGP speakers at source
ASes. Each source AS selects a route to the destination AS based on destination
reachability information only, without taking into account the routing path from the
destination AS back to itself. Moreover, source ASes need not send routing information
back to the destination AS.
Destination-based IP routing may induce route asymmetry, and routes calculated
by BGP can exhibit the same problem. Given a route from a source IP address s to a
destination IP address t, routes between s and t are asymmetric if a route from t to
s is not the same as one from s to t. Route asymmetry is a characteristic feature of
destination-based IP routing, and it has been observed since the mid 1990s both at
the AS-level and at the router-level [49]. Figure 3.1 illustrates route asymmetry in a
simple network. Let us use hop count as the metric in this example. When a routing
protocol lets node 1 know that there are two route candidates for reaching node 6
with the same metric 3, node 1 will choose one of them. Similarly, node 6 will select
one of two route candidates for reaching node 1. Depending on the routing protocol
and local information available, route asymmetry can arise as depicted in Figure 3.1.
Figure 3.1. Illustration of route asymmetry.
Due to route asymmetry, source reachability may not be inferable directly from
destination reachability information. In Figure 3.1, given a route from node 1 to
node 6, we cannot construe that the route from 6 to 1 is the same route in reverse.
On the other hand, link-state routing protocols such as OSPF can detect this asym-
metry and infer source reachability using its global knowledge of the entire network.
However, BGP does not provide each node with global knowledge for calculating source
reachability. Phase 3 of BGP’s Decision Process chooses route updates (1) among
routes selected during Phase 2, and (2) according to its advertisement policy. In
other words, route update information received from one’s peer may be biased and
restricted by the update policy (in case of 1) and advertisement policy (in case of 2)
of the peer. Hence, the current BGP is not suited for inferring source reachability for
use in route-based DPF filter table update.
To calculate correct filter tables at distributed filter sites, augmentation of BGP
or introduction of a new protocol for propagating source reachability information is
required. In reality, BGP may not compute correct routing tables. Sometimes, it
leads packets to black holes where packets cannot be forwarded any further. Even if
this is the case, route-based DPF protocol should infer source reachability from the
consistent image of routing tables which the BGP protocol calculates. Hence, both
methods are required to interact with BGP for synchronization. The latter requires
additional overhead for coordinating with a separate protocol, BGP. We extend BGP
for disseminating source reachability information.
3.2.3 BGP and Its Extension
We extend BGP by defining a new message type—BGP-REFLECT. A BGP-
REFLECT message contains source reachability information, and it is disseminated
back to destination ASes, where its corresponding route update message is initiated.
BGP-REFLECT includes two internal types—ADD and DELETE. In the case when
a BGP-REFLECT message carries potential source AS information which becomes
reachable, its internal type is set as ADD. In the case when it carries source AS
information which becomes unreachable, its type is set as DELETE.
As BGP presents destination reachability using AS-PATH in BGP route update
messages, source reachability information in BGP-REFLECT messages is represented
as AS-PATH. A BGP route update message includes AS-PATH as an attribute for a
destination IP prefix. AS-PATH, which contains destination reachability information,
is a sequence of AS numbers starting with that of the originating AS. As it is propa-
gated from the originating AS, AS numbers are prepended to the AS-PATH attribute
in BGP route update messages. Similarly, a BGP-REFLECT message includes source
reachability information as part of AS-PATH. The AS number of the source AS is
prepended to AS-PATH of selected BGP route updates from a destination AS.
Generation of BGP-REFLECT message is triggered during BGP’s Decision Pro-
cess. First, a BGP-REFLECT ADD message is initiated when a route update is
selected for a destination IP prefix. This may be caused by a newly-received BGP
route update or by expiration of the BGP Hold Timer, which checks liveness of its
peer BGP speaker. In the first case, when the newly-received BGP route update is
selected as the best route for the destination IP prefix, the BGP-REFLECT ADD
message for the new route update is instantiated. In the latter case, expiration of the BGP
Hold Timer initiates selection of another best route for the destination IP prefix. For
the newly-chosen route update, BGP-REFLECT ADD message is initiated. Next,
BGP-REFLECT DELETE message is created when a route update, which was se-
lected for a destination IP prefix, becomes invalidated. This can be caused by a
newly-received route update or by expiration of the BGP Hold Timer.
Triggered BGP-REFLECT messages are forwarded back to the destination AS,
which originated the corresponding BGP route update message. BGP consults the AS-
PATH within the BGP-REFLECT message to determine where to forward it. Once the mes-
sage reaches the destination AS, it is destroyed. From the received BGP-REFLECT
message, the destination AS comes to know source reachability.
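The forwarding decision can be sketched in a few lines, under the assumption (consistent with the Figure 3.3 example) that the AS-PATH in a BGP-REFLECT message lists the source AS first and the originating destination AS last:

```python
# Next-hop selection for forwarding a BGP-REFLECT message back toward
# the destination AS along the AS-PATH it carries.
def reflect_next_hop(as_path, my_as):
    """Return the next AS toward the destination, or None if we are it."""
    i = as_path.index(my_as)
    if i == len(as_path) - 1:
        return None            # destination AS: message is consumed here
    return as_path[i + 1]

# With ASes 1, 2, 3 as in Figure 3.3: AS 3 initiates with AS-PATH [3, 2, 1].
hop = reflect_next_hop([3, 2, 1], 2)  # AS 2 forwards on toward AS 1
```

Each intermediate node forwards without modifying the AS-PATH, so the message retraces the advertised route in reverse until the destination AS removes it.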
3.3 Route-based DPF Protocol
3.3.1 Architecture
Based on the discussion in Section 3.2, we define route-based DPF protocol—
DPF-lookup and DPF-update—together with BGP Extension. The DPF-lookup protocol
checks every incoming packet against the semi-maximal filter table to
determine its validity. Filter update is composed of BGP Extension and DPF-update
protocol. As mentioned earlier, BGP Extension disseminates source reachability in-
formation to the entire network. The DPF-update protocol interprets source reacha-
bility from BGP-REFLECT messages received by the BGP Extension, and it updates
semi-maximal filter tables according to the information obtained.
Figure 3.2 shows the overall architecture of the route-based DPF protocol. As
mentioned in Section 3.2, DPF-lookup exists between the network interface layer
and internet layer in the Internet Reference Model, and DPF-update operates at the
application layer on top of TCP. The DPF-update protocol updates semi-maximal
filter tables which are used by the DPF-lookup protocol for filtering at smaller time
scales.
Figure 3.2. Protocol stack of route-based DPF protocol.
3.3.2 DPF-lookup Protocol
The DPF-lookup protocol performs filtering, inspecting every incoming IP packet.
When a packet arrives through a network interface, DPF-lookup fetches its source IP
address. Then, DPF-lookup finds the best matching IP prefix3 in the semi-maximal
filter table corresponding to the network interface. If an entry for a source IP address
stores a positive counter value, which means that the packet with the source IP
address is valid, it is passed to IP for routing. Otherwise, the packet is discarded.
3.3.3 Semi-maximal Filter Table
The DPF-lookup protocol defines semi-maximal filter tables as protocol compo-
nents. A semi-maximal filter table contains an entry for each source IP address. For
reduction of size (the number of entries) of a semi-maximal filter table, it is recom-
mended to maintain entries only for valid source IP addresses. In this case, if there
is no matching entry for a source IP address, it implicitly means that the source IP
address is invalid. Filtering functionality can be enabled selectively. For each network
interface where filtering is enabled, DPF-lookup maintains a separate semi-maximal
filter table.
3.3.4 BGP Extension
BGP Extension is an augmentation of BGP-4 that assists the DPF-update protocol
in calculating semi-maximal filter tables. The main reason for this extension is to overcome
route asymmetry by sending source reachability information back to the originating
destination AS. For this, a new type of message, BGP-REFLECT, is employed.
Figure 3.3 illustrates the BGP Extension mechanism. Let us consider three border
routers: 1, 2, and 3; their AS numbers are 1, 2, and 3, respectively. In this example,
3In this thesis, we assume that each AS has a unique, non-overlapping IP prefix. As mentioned in Section 3.2, some ASes have more than one IP prefix, and some IP prefixes are shared by more than one AS. In that case, this table searching method might cause a safety violation or staleness problem.
Figure 3.3. Illustration of BGP Extension mechanism.
every border router installs BGP-4 for routing, and the BGP Extension functionality
is enabled. First, BGP route updates are propagated to the entire network according
to BGP-4. After establishing a TCP connection, BGP at 1 sends a route update for
itself to BGP at 2. Here, a route update corresponds to a BGP UPDATE message
which includes Network Layer Reachability Information (NLRI) and AS-PATH. NLRI
contains a list of IP prefixes of the triggering AS. On receiving the UPDATE message,
BGP at 2 passes it to Decision Process, and it is selected as the best route to reach 1.
Then NLRI and AS-PATH information are inserted into Loc-RIB (storage for selected
route updates) of BGP at 2, and a new BGP UPDATE message is sent to BGP at 3.
When a new route update is selected and inserted into Loc-RIB, a BGP-REFLECT
message for the route update is created and sent back to the destination. As shown
in Figure 3.4, BGP-REFLECT includes IP prefix, AS-PATH, and TYPE as its major
fields. In this illustration, BGP-REFLECT ADD messages are created for the newly
received route updates. The IP prefix field stores the IP prefix of the source AS. To
complete the AS-PATH representing source reachability, the initiating AS prepends
its AS number to AS-PATH of the received BGP route update. In Figure 3.3, a BGP-
REFLECT message from 2 to 1 and another one from 3 to 2 correspond to instances
of this case.
On receiving a BGP-REFLECT message from its peer, the BGP router for-
wards the message to its upstream router based on the AS-PATH field of the BGP-
REFLECT message. In Figure 3.3, BGP at 2 receives the BGP-REFLECT message
initiated by 3, and it forwards the BGP-REFLECT message without any modifica-
tion. Again, it refers to the AS-PATH field to find out the corresponding upstream
router. When BGP-REFLECT messages arrive at destination ASes, they are removed
and not forwarded any further. In Figure 3.3, BGP at 1 receives two BGP-REFLECT
messages, and they disappear from the network.
BGP-REFLECT Message Format
As a BGP message type, BGP-REFLECT contains a 19-byte BGP message header.
The BGP message header includes a 1-byte Type field. We assign 5 as the type code
of the BGP-REFLECT message.
Figure 3.4. BGP-REFLECT message format.
Figure 3.4 shows the BGP-REFLECT message format. Following is a description
of each field:
• IP prefix length
This 1-byte unsigned integer field specifies the length of a source IP prefix. The
length is represented in bits.
• IP prefix
The source IP prefix is stored in this field. The length of the IP prefix is variable.
For this reason, this field is padded with 0s in order to round its length up to a
whole number of bytes.
• AS-PATH length
This 1-byte unsigned integer field specifies the AS-PATH length. The length is
represented as a count of ASes in the AS-PATH.
• AS-PATH
This field contains a sequence of AS numbers. Each AS number is represented
as a 2-byte unsigned integer. The total length of this field becomes two times
the AS-PATH length.
When a BGP-REFLECT message is created, an initiating source AS’s AS num-
ber is prepended to the AS-PATH of a given route update, starting with the
destination AS. The AS-PATH field must not be modified by forwarding BGP
nodes.
• TYPE
This 1-byte unsigned integer field indicates the type of a BGP-REFLECT
message—ADD or DELETE. The following codes have been defined:
Code Symbolic Name
1 ADD
2 DELETE
A BGP-REFLECT ADD message indicates a source AS from which the destina-
tion AS has become reachable via the given AS-PATH. A BGP-REFLECT DELETE
message indicates a source AS from which the destination AS has become unreachable
via the given AS-PATH. Details of each message type are described in the following
section.
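The field layout above can be exercised with a short encode/decode sketch (excluding the 19-byte BGP header). This is an illustrative serialization based only on the field descriptions in the text; function names are invented:

```python
# Encode/decode the BGP-REFLECT message body: prefix length in bits,
# prefix padded to a byte boundary, AS-PATH length as a count of ASes,
# 2-byte AS numbers, and a 1-byte TYPE (1 = ADD, 2 = DELETE).
import struct

def encode_reflect(prefix_bytes, prefix_len, as_path, msg_type):
    nbytes = (prefix_len + 7) // 8            # prefix padded to byte unit
    body = struct.pack("!B", prefix_len)
    body += prefix_bytes[:nbytes].ljust(nbytes, b"\x00")
    body += struct.pack("!B", len(as_path))
    body += b"".join(struct.pack("!H", asn) for asn in as_path)
    body += struct.pack("!B", msg_type)
    return body

def decode_reflect(body):
    prefix_len = body[0]
    nbytes = (prefix_len + 7) // 8
    prefix = body[1:1 + nbytes]
    n_as = body[1 + nbytes]
    off = 2 + nbytes
    as_path = [struct.unpack_from("!H", body, off + 2 * i)[0]
               for i in range(n_as)]
    return prefix_len, prefix, as_path, body[off + 2 * n_as]

# Example: /16 source prefix 10.1.0.0, AS-PATH [3, 2, 1], type ADD.
msg = encode_reflect(b"\x0a\x01", 16, [3, 2, 1], 1)
```

Note the 2-byte AS numbers reflect the pre-4-byte-ASN era of the text; the total AS-PATH field length is exactly twice the AS-PATH length, as stated above.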
Message Types: ADD/DELETE
Whenever BGP chooses a new route update for a destination, a BGP-REFLECT
ADD message for the route update is generated. When BGP receives a new route
update from its peer, Decision Process starts. If it decides to choose the route, a
BGP-REFLECT ADD message is initiated. In another instance, when the BGP Hold
Timer for a BGP session expires, BGP withdraws all route updates received from the
corresponding peer. In this case, BGP tries to find an alternate route update for each
destination. For the newly-selected route updates, BGP-REFLECT ADD messages
are generated. The IP prefix and IP prefix length fields are filled with the IP prefix of
the source AS. The AS number of this initiating AS is prepended to the AS-PATH of the
route update. At the destination AS, the received BGP-REFLECT ADD message
conveys that the destination AS becomes reachable from the source AS through the
AS-PATH.
A BGP-REFLECT DELETE message is generated when BGP invalidates a route
update for a destination. BGP-REFLECT DELETE can be triggered in the afore-
mentioned situations. First, when BGP selects a newly-received route update for
a destination AS, withdrawing a previously chosen AS-PATH, a BGP-REFLECT
DELETE message is generated for the old route update. On the other hand, when
the BGP Hold Timer for a BGP session expires, irrespective of the existence of an
alternate route, BGP-REFLECT DELETE messages for all route updates are gener-
ated. The IP prefix, IP prefix length, AS-PATH length, and AS-PATH fields are filled
in the same manner as for the BGP-REFLECT ADD message. At the destination AS,
the received BGP-REFLECT DELETE message is interpreted as the destination AS
becoming unreachable from the source AS via the specified AS-PATH.
Reflect Timer
In principle, whenever a new route update is selected and inserted into Loc-RIB,
a BGP-REFLECT message for the route update should be initiated and sent back to
the destination AS. However, to reduce message complexity, it is recommended to
employ a Reflect Timer. In this case, BGP-REFLECT messages are transmitted when
the Reflect Timer expires. BGP checks its Loc-RIB to find route updates for which
BGP-REFLECT messages have not yet been triggered, and it generates BGP-
REFLECT messages for them. When more than one route update for the
same destination is selected within the Reflect Timer interval, a BGP-REFLECT
ADD message is initiated only for the last route update. BGP-REFLECT DELETE mes-
sages are sent out only for invalidated route updates for which BGP-REFLECT
ADD messages were initiated. Hence, we can reduce additional BGP-REFLECT
ADD/DELETE messages for other invalidated route updates. Depending on the Re-
flect Timer interval, the number of BGP-REFLECT messages generated is reduced
as intended. In addition, it affects timeliness of semi-maximal filter updates across
the entire network. Hence, the Reflect Timer has to be tuned carefully.
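The batching rules above can be modeled in a small sketch (class and method names invented; event delivery and the actual timer are abstracted away): within one interval only the last ADD per destination survives, and a DELETE is emitted only if a corresponding ADD was already sent.

```python
# Reflect Timer batching of BGP-REFLECT ADD/DELETE messages.
class ReflectBatcher:
    def __init__(self):
        self.pending_add = {}    # destination -> latest selected route update
        self.advertised = set()  # (dest, route) pairs for which ADD was sent

    def on_route_selected(self, dest, route):
        self.pending_add[dest] = route   # later ADDs overwrite earlier ones

    def on_route_invalidated(self, dest, route):
        msgs = []
        if self.pending_add.get(dest) == route:
            del self.pending_add[dest]   # ADD never sent: suppress both
        elif (dest, route) in self.advertised:
            msgs.append(("DELETE", dest, route))
            self.advertised.discard((dest, route))
        return msgs

    def on_timer_expiry(self):
        msgs = []
        for dest, route in self.pending_add.items():
            msgs.append(("ADD", dest, route))
            self.advertised.add((dest, route))
        self.pending_add = {}
        return msgs
```

A route that is selected and then superseded within one interval generates no traffic at all, which is the message-complexity saving the timer is meant to provide.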
Interface to DPF-update
In the context of defining an interface between BGP Extension and DPF-update,
we assume that BGP Extension includes a flag for identifying a site where route-based
DPF filtering is deployed (a.k.a., filter site). The route-based DPF protocol—DPF-
update and DPF-lookup protocols—is deployed at filter sites, where the flag is set to
true. Hence, BGP can determine whether a router is a filter site or not.
BGP Extension provides a reflect buffer as an interface to the DPF-update pro-
tocol. In case of a filter site, all received BGP-REFLECT messages are stored in
the reflect buffer. Specifically, when a BGP speaker receives a BGP-REFLECT mes-
sage, it inserts a copy of the message into its reflect buffer before forwarding to the
upstream border router.
3.3.5 DPF-update Protocol
We assume that the DPF-update protocol at each filter site ascertains the set of
filter sites deployed. Note that DPF-update can access semi-maximal filter tables by
the AS number of peering ASes, since DPF-lookup maintains filter tables indexed by
the AS number of neighboring ASes.
The DPF-update protocol decodes received BGP-REFLECT messages for calcu-
lating semi-maximal filter tables. As mentioned earlier, BGP-REFLECT messages
are handed over by BGP Extension via a reflect buffer.
Given a BGP-REFLECT message, the DPF-update protocol operates as follows
based on its type code:
• ADD
From the AS-PATH of the BGP-REFLECT message, DPF-update fetches its
immediate downstream AS number. A corresponding semi-maximal filter table
is accessed. IP prefix and IP prefix length information are used to find an entry
for the source AS. Then, the counter value for the entry is incremented. Since
the filter table keeps only valid source IP addresses for reduction of table size
(a property of power-law networks), an entry for a source IP address may not
exist. In that case, DPF-update creates an entry and increments its counter
value.
• DELETE
DPF-update accesses a corresponding semi-maximal filter table based on the
AS-PATH of the BGP-REFLECT message. Using IP prefix and its length, an
entry for the source AS is fetched. Since the semantics of the BGP-REFLECT
DELETE conveys that the source AS does not use the given AS-PATH to reach
this node, DPF-update invalidates its source IP prefix from the filter table
by decrementing the counter. In case when the filter table stores only valid
source IP addresses, we need one more check if the counter becomes zero after
decrement. If this is the case, the entry is deleted.
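The two handlers can be sketched together, under the assumptions that the AS-PATH lists the source AS first and this (destination) node last, and that filter tables are keyed by the AS number of the downstream peer as described in Section 3.3.5; the prefix look-up is simplified to an exact match:

```python
# DPF-update handling of BGP-REFLECT ADD/DELETE messages.
from collections import defaultdict

filter_tables = defaultdict(dict)  # peer AS number -> {source prefix: counter}

def handle_reflect(msg_type, as_path, my_as, src_prefix):
    i = as_path.index(my_as)
    downstream = as_path[i - 1]    # AS the message (and traffic) arrives from
    table = filter_tables[downstream]
    if msg_type == "ADD":
        table[src_prefix] = table.get(src_prefix, 0) + 1
    elif msg_type == "DELETE":
        table[src_prefix] = table.get(src_prefix, 0) - 1
        if table[src_prefix] <= 0:
            del table[src_prefix]  # keep only valid sources in the table
```

Deleting the entry at zero implements the "one more check" described above for tables that store only valid source IP addresses.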
3.4 Improvement of DPF-update Protocol for Fault-tolerance
The BGP-REFLECT message forwarding scheme relies on BGP peering relation-
ships. BGP forwards received BGP-REFLECT messages as dictated by their AS-
PATH field values. This generates a major issue to be considered: fault-tolerance.
When an intermediate border router, which belongs to a path between a certain
source AS and a destination AS, goes down, BGP at the source AS runs Decision
Process and selects another path (if any). As a result of Decision Process, the original
path is invalidated and a BGP-REFLECT DELETE message is initiated. BGP run-
ning at a border router that had a connection with the failed border router loses
that connection. As a result, in the course of BGP-REFLECT message forwarding by
BGP, the BGP-REFLECT DELETE message cannot be forwarded any further. In
this case, upstream filter sites do not receive this message, and it causes filter tables
at the upstream filter sites to contain stale entries.
The new forwarding mechanism offloads forwarding responsibility from BGP onto
DPF-update. DPF-update maintains connections with other filter sites on demand
and forwards BGP-REFLECT messages to its upstream filter sites. However, since
only BGP can make a decision for generation of BGP-REFLECT messages, it relies
in part on BGP. When a BGP-REFLECT message is generated at a non-filter site, it
is forwarded to its upstream router in the same manner as the old mechanism. BGP
follows the old mechanism until the message reaches the first filter site. Then BGP at
the filter site passes the message to DPF-update so that it may start its forwarding
mechanism.
We need to consider the case when an upstream filter site may not be reachable
as well. DPF-update detects failure of its upstream filter site when TCP connection
set-up procedure fails. In this case, it forwards the received BGP-REFLECT message
to the next available upstream filter site. The connection failure, however, might be
caused by temporary network state. In addition, keeping unreachable upstream filter
site information is not suitable for scalability of DPF-update. For these reasons,
DPF-update does not maintain unreachable upstream filter site information for future
use. Whenever a BGP-REFLECT message is received, DPF-update tries to make a
connection with its next upstream filter site. This enhances fault-tolerance without
adversely affecting scalability of the DPF-update protocol.
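The fallback rule can be sketched as follows; the `connect` callback stands in for a TCP connection attempt, and nothing is remembered between calls, matching the stateless design above:

```python
# Fault-tolerant BGP-REFLECT forwarding: try the nearest upstream filter
# site first and fall back to the next one when connection set-up fails.
def forward_reflect(upstream_filter_sites, connect):
    """upstream_filter_sites: upstream filter sites, nearest first."""
    for site in upstream_filter_sites:
        if connect(site):          # TCP set-up succeeded
            return site            # message handed to this filter site
    return None                    # no upstream filter site reachable

# Example: the nearest site "A" is down, so the message skips to "B".
alive = {"B", "C"}
chosen = forward_reflect(["A", "B", "C"], lambda s: s in alive)
```

Because failures are rediscovered per message rather than cached, a site that was only transiently unreachable is retried automatically on the next message.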
4 PERFORMANCE EVALUATION OF ROUTE-BASED DPF
PROTOCOL
As shown in [7], route-based DPF provides a significant degree of proactive and
reactive protection against spoofed DDoS attacks in a static network environment,
i.e., one in which there is no failure or variation of network infrastructure.
Here, network infrastructure ranges from
hardware—such as host, router, or link—to software—such as routing protocol and
name server.
In this chapter, we carry out performance evaluation of route-based DPF protocol
in a dynamic network environment. Contrary to the static network environment, the
dynamic network environment encompasses failure or variation of network infrastruc-
ture. Route-based DPF is active during transient periods of the network while BGP
updates IP routing tables. Time lags in synchronization between the route-based
DPF protocol and BGP may lead to performance degradation. Thus, we measure
and analyze the effectiveness of the route-based DPF protocol during transient peri-
ods.
This chapter is organized as follows. First, we introduce performance measures—
stability, safety violation, staleness, containment, and traceback—for route-based
DPF. Next, the experimental setup is presented. Finally, results for the performance
measures are shown and analyzed in separate sections.
4.1 Overall Objective
During transient periods, route-based DPF may contain incorrect information
which may cause safety violation or staleness. That is, route-based DPF may discard
non-spoofed packets (safety violation) or may not be able to discard spoofed attack
packets (staleness) when it is safe to do so.
The route-based DPF protocol calculates semi-maximal filter tables based on
global knowledge of routing. Thus route-based DPF requires consistent routing infor-
mation with respect to BGP. During transient periods, BGP Extension generates and
handles BGP-REFLECT messages as well as BGP UPDATE messages. Accordingly,
the route-based DPF protocol interprets messages from the BGP Extension, in order
to be consistent with fluctuations of BGP routing itself.
In order to capture potential performance degradation during the transient peri-
ods, we measure effectiveness of the route-based DPF protocol with respect to safety
violation and staleness. We show route-based DPF’s protection performance using
two major performance measures—containment and traceback, as defined in [7]. In
addition, we measure stability of BGP’s routing table calculation as well as that of
the route-based DPF protocol’s filter table calculation. Since the route-based DPF
protocol relies on the underlying BGP for its update events, stability of BGP routing
affects stability of route-based DPF.
4.2 Performance Measures
Let G = (V, E) be an undirected graph representing an AS-level Internet topol-
ogy. BGP computes routes for all pairs of source and destination; let R be the
set of computed routes. A route r—an element of R—is represented as a 3-tuple
<node, destination, nexthop>, where node indicates the routing table to which the
entry belongs; the others denote the destination and next hop. Similarly, route-based
DPF calculates filter tables based on R, and it generates a set of computed filter
entries F. A filter table entry f is represented as a 3-tuple <node, link, source>, where
node and link identify the filter table to which the entry belongs; source denotes the
IP source address in semi-maximal filtering. The existence of an entry for a source
address represents validity. Let E be the set of events which change network topology
configuration. In reality, E ranges from the addition, deletion, or change of BGP routing
policies to the addition, failure, or configuration change of hardware infrastructure (hosts,
routers, or links). In this section, we focus on single AS node failures, which
may result from the failure of a node's border router(s). We define a node failure event e as
a pair < time, node >, where time denotes the time of failure and node represents the
failed node.
Let R0 and F0 be the initial sets of routes and filter entries before an event e in E
occurs. An event e triggers BGP's route update procedure. We assume that BGP
route calculation reaches a steady state, in which no further BGP route updates are
triggered by the event.1 With this assumption, let Rs and Fs be
the set of routes and the set of filter entries in the steady state.
4.2.1 Stability
We define the distance between two sets of routes, Ri and Rj, at two granularities:
with respect to entry and with respect to node. The distance between Ri and Rj with
respect to entry is a scalar between 0 and 1 denoting the fraction of entries
that include inconsistent information; here, inconsistent means that either a route entry
for a destination does not exist in one of the sets or the nexthop information differs.
Similarly, the distance between Ri and Rj with respect to node is a scalar between 0 and 1
denoting the fraction of nodes whose routing tables include at least one inconsistent
entry. When two sets of routes contain exactly the same information, both the distance
with respect to entry and the distance with respect to node are 0.
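As a concrete sketch, the two route distances can be computed as follows. The Python rendering and the set-of-3-tuples encoding are illustrative, not the simulator's actual data structures:

```python
def route_distance(Ri, Rj):
    """Distance between two sets of routes, each a set of
    (node, destination, nexthop) 3-tuples. Returns the
    (entry, node) distances, both scalars in [0, 1]."""
    # Index routes by (node, destination) -> nexthop.
    ti = {(n, d): h for (n, d, h) in Ri}
    tj = {(n, d): h for (n, d, h) in Rj}
    keys = set(ti) | set(tj)
    # An entry is inconsistent if it is missing from one set
    # or its nexthop differs between the two sets.
    bad = {k for k in keys if ti.get(k) != tj.get(k)}
    nodes = {n for (n, d) in keys}
    bad_nodes = {n for (n, d) in bad}
    d_entry = len(bad) / len(keys) if keys else 0.0
    d_node = len(bad_nodes) / len(nodes) if nodes else 0.0
    return d_entry, d_node
```

For example, two identical route sets yield distance (0, 0) at both granularities, while changing one nexthop makes the affected entry and its node inconsistent.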
Assuming E is a finite set, we can deduce the resulting network topology G∗ and
the corresponding set of routes R∗, which is calculated by Dijkstra’s shortest-path
algorithm. Let R∗ be the ideal set of routes. Since BGP may converge to a different
set in its steady state, Rs and R∗ may not be the same. BGP generates a set of routes
Rt at each time instance t as a result of BGP route update exchange. After BGP
1However, this may not be true in reality; the BGP stability problem and its effects on the route-based DPF protocol can be examined separately.
convergence, Rt is the same as Rs. By plotting distances of each set of routes Rt and
R∗ with time, we can observe the evolution of BGP routing calculation and stability.
Given two sets of filter entries, Fi and Fj, we define three distances for
filter table comparison. First, the distance with respect to entry granularity is a
scalar between 0 and 1 denoting the fraction of entries that include inconsistent
filter table information; inconsistent means that either a filter entry for a pair
of a link and a source does not exist in one of the sets or the validity information
differs. Next, the distance with respect to filter granularity is a scalar between
0 and 1 denoting the fraction of filter tables that include at least one inconsistent
entry. Finally, the distance with respect to node granularity is a scalar between
0 and 1 denoting the fraction of nodes whose filter tables include at least one
inconsistent entry.
BGP routing table calculation fluctuates during transient periods, generating new
sets of routes as it handles BGP route update messages. Each BGP route calculation,
in turn, triggers DPF-update's filter calculation. Since route-based DPF's protection
performance is based on the underlying routing state, it is important for DPF-update
to calculate filter tables that are consistent with routing at each time instance.
Hence, we define the consistency of the route-based DPF protocol's filter calculation
at each time instance t as the distance between two sets of filter entries, Ft and F∗t,
where Ft denotes the calculated filter information and F∗t represents the filter
information theoretically determined from the routing Rt. In general, for a given
routing R, we define F∗ as the ideal set of filter entries calculated from
the given R.
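The three filter-table distances can be sketched as follows. Note that this sketch normalizes over the entries present in either set; the indices defined in the next subsections normalize over all possible entries instead. The encoding is illustrative:

```python
def filter_distance(Fi, Fj):
    """Distance between two sets of filter entries, each a set of
    (node, link, source) 3-tuples encoding valid sources. Returns
    the (entry, filter, node) distances, scalars in [0, 1]."""
    diff = (Fi - Fj) | (Fj - Fi)   # entries whose validity differs
    universe = Fi | Fj
    filters = {(n, l) for (n, l, s) in universe}
    bad_filters = {(n, l) for (n, l, s) in diff}
    nodes = {n for (n, l, s) in universe}
    bad_nodes = {n for (n, l, s) in diff}
    d_entry = len(diff) / len(universe) if universe else 0.0
    d_filter = len(bad_filters) / len(filters) if filters else 0.0
    d_node = len(bad_nodes) / len(nodes) if nodes else 0.0
    return d_entry, d_filter, d_node
```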
4.2.2 Safety Violation
Given a set of routes Ri and a set of filter entries Fi at a time instance, we can
deduce the ideal set of filter entries F∗i for the given Ri. Here, safety violation of Fi
for the given Ri is defined as follows:

SV(Ri, Fi) = 0, if F∗i − Fi = ∅; 1, otherwise.
The value 0 represents that Fi for the given Ri is safe. The value 1 represents that
safety of Fi for the given Ri is violated. The intuitive meaning of the safety condition
is that filter tables should contain validity of all unspoofed source addresses, so that
unspoofed packets are not discarded by route-based DPF.
To quantify the degree of safety violation, we examine safety violation at entry,
filter, and node granularity. Let nftr be the number of filter tables, each identified
by a pair of node and link. For given Fi and Ri, the safety violation index at entry
granularity, SVentry, is defined as follows:

SVentry(Ri, Fi) = |F∗i − Fi| / (nftr · |V|)

SVentry is a scalar between 0 and 1 denoting the fraction of entries that cause
safety violation.
The safety violation index at filter granularity, SVfilter, is defined as follows:

SVfilter(Ri, Fi) = |{(u, e) : ∃f = < u, e, s > ∈ (F∗i − Fi)}| / nftr

In this definition, (u, e) represents the filter deployed at node u over the link e.
SVfilter is a scalar between 0 and 1 representing the fraction of filters that include
at least one safety-violating entry.
The safety violation index at node granularity, SVnode, is defined as follows:

SVnode(Ri, Fi) = |{u : ∃f = < u, e, s > ∈ (F∗i − Fi)}| / |V|

SVnode is a scalar between 0 and 1 denoting the fraction of nodes at least one
of whose filter tables contains at least one safety-violating entry.
SVentry provides the finest-grained metric for safety violation. SVfilter and
SVnode show the distribution of safety-violating entries over all filter tables and nodes,
respectively.
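The three indices follow directly from the set difference F∗i − Fi. The following Python sketch assumes the same set-of-3-tuples encoding as before and is illustrative, not the simulator's code:

```python
def safety_violation_indices(F_ideal, F_actual, n_ftr, n_nodes):
    """Safety violation indices at entry, filter, and node granularity.
    F_ideal (F*_i) and F_actual (F_i) are sets of (node, link, source)
    3-tuples; n_ftr is the number of filter tables and n_nodes is |V|."""
    # F*_i - F_i: entries that should exist but are missing.
    missing = F_ideal - F_actual
    sv_entry = len(missing) / (n_ftr * n_nodes)
    sv_filter = len({(u, e) for (u, e, s) in missing}) / n_ftr
    sv_node = len({u for (u, e, s) in missing}) / n_nodes
    return sv_entry, sv_filter, sv_node
```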
4.2.3 Staleness
Given a set of routes Ri and a set of filter entries Fi at a time instance, we can
deduce the ideal set of filter entries F∗i for the given Ri. Staleness of Fi for the given
Ri is defined as follows:

ST(Ri, Fi) = 0, if Fi − F∗i = ∅; 1, otherwise.
The value 0 represents that Fi for the given Ri is not stale. The value 1 represents
that Fi for the given Ri contains stale information; in other words, at least one entry
for an invalid source address is kept in Fi. This implies that DDoS attack packets
whose inscribed source address is that invalid one cannot be discarded.
As in the safety violation case, the degree of staleness is measured with staleness
indices at entry, filter, and node granularity. For given Fi and Ri, the staleness index
at entry granularity, STentry, is defined as follows:

STentry(Ri, Fi) = |Fi − F∗i| / (nftr · |V|)

STentry is a scalar between 0 and 1 denoting the fraction of entries that cause
staleness.
The staleness index at filter granularity, STfilter, is defined as follows:

STfilter(Ri, Fi) = |{(u, e) : ∃f = < u, e, s > ∈ (Fi − F∗i)}| / nftr

STfilter is a scalar between 0 and 1 representing the fraction of filters that
include at least one stale entry.
The staleness index at node granularity, STnode, is defined as follows:

STnode(Ri, Fi) = |{u : ∃f = < u, e, s > ∈ (Fi − F∗i)}| / |V|

STnode is a scalar between 0 and 1 denoting the fraction of nodes at least one
of whose filter tables contains at least one stale entry.
As with the safety violation measures, STentry provides the finest-grained metric
for staleness. STfilter and STnode indicate the distribution of stale entries over all filter
tables and nodes, respectively.
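Since staleness mirrors safety violation, with the set difference reversed (Fi − F∗i instead of F∗i − Fi), the indices admit the same sketch; as before, the encoding is illustrative:

```python
def staleness_indices(F_ideal, F_actual, n_ftr, n_nodes):
    """Staleness indices at entry, filter, and node granularity,
    driven by F_i - F*_i: stale entries kept in the actual tables."""
    stale = F_actual - F_ideal
    st_entry = len(stale) / (n_ftr * n_nodes)
    st_filter = len({(u, e) for (u, e, s) in stale}) / n_ftr
    st_node = len({u for (u, e, s) in stale}) / n_nodes
    return st_entry, st_filter, st_node
```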
4.2.4 Containment
To observe proactive protection (a.k.a. containment) of route-based DPF during
transient periods, we use Φ2(τ) as defined in [7]. Φ2(1) is a scalar between 0 and 1
denoting the fraction of ASes from which no attacker can mount a successful spoofed
DDoS attack targeted at a victim in another AS.
4.2.5 Traceback
We use Ψ1(τ) to measure reactive protection performance (a.k.a. traceback) of
route-based DPF during transient periods. As defined in [7], Ψ1(τ) is a scalar between
0 and 1 for a given parameter τ, denoting the fraction of ASes which, on
receiving a spoofed IP packet, can localize its physical source to within τ sites.
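The formal definition of Ψ1(τ) is given in [7]. As a rough sketch only: if each AS is associated with the size of the candidate-source set to which it can narrow a spoofed packet's origin, the index is the fraction of ASes whose set fits within τ. The mapping name below is hypothetical:

```python
def traceback_index(candidate_set_sizes, tau):
    """Rough sketch of Psi_1(tau): the fraction of ASes able to
    localize a spoofed packet's physical source to within tau sites.
    candidate_set_sizes maps each AS to the size of its candidate
    set; see [7] for the exact definition."""
    n = len(candidate_set_sizes)
    ok = sum(1 for size in candidate_set_sizes.values() if size <= tau)
    return ok / n if n else 0.0
```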
4.3 Experimental Setup
We performed experiments on the 3023-node NLANR [9] measurement topology
dated 11/08/1997. According to [?]’s stub/transit classification on the 3023-node
topology, around 80% of nodes are classified as stub nodes. We place route-based
DPF filters on 580 selected nodes, which form a vertex cover of the given topology.
We considered two single-node failure benchmark scenarios. First, we considered
a degree-one stub AS node connected to the highest-degree node. Due to the
power-law nature of the AS-level Internet topology [40], most AS nodes fall into
this category. The faulty node (AS3) is assigned to the Massachusetts Institute of
Technology, and the connected node, one of the highest-degree AS nodes (AS1),
corresponds to Genuity. Next, we selected a degree-9 transit AS node as the faulty
node. The faulty node (AS3407) is assigned to Interpath, and the nine connected
high-degree AS nodes are AS81, AS286, AS701, AS1239, AS2548, AS2551, AS2914,
AS3561, and AS5413.
One item to note here is that we do not consider the case when a high degree node is
faulty. In that case, basic routing itself will not perform its functionality. Thus, the
route-based DPF protocol, which relies on the underlying routing, cannot function
correctly either.
For both experimental scenarios, the Reflect Timer interval is set to 5 seconds; the
BGP ConnectRetry interval, 120 seconds; the Hold Time interval, 90 seconds; the
KeepAlive interval, 30 seconds; and the MinRouteAdvertisementInterval, 30 seconds.
We executed benchmark simulations from 0 to 5000 seconds. BGP Loc-RIB
tables (storage for the selected route updates, consistent with the local IP routing
table) and DPF filter tables are dumped every 150 seconds.2 When a simulation starts,
BGP and route-based DPF calculate the IP routing tables and DPF filter tables,
respectively. These tables converge at around 300 seconds. The faulty node goes down
at 350 seconds; thus, transient state transitions start at that time.
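For reference, the benchmark parameters above can be collected in one place. This is a restatement of the setup in Python constants, not a fragment of the simulator's DML configuration:

```python
# Protocol timer settings used in both benchmark scenarios (seconds).
TIMERS = {
    "ReflectTimer": 5,
    "BGP_ConnectRetry": 120,
    "HoldTime": 90,
    "KeepAlive": 30,
    "MinRouteAdvertisementInterval": 30,
}

# Simulation schedule (seconds).
SIM_END = 5000           # benchmark simulations run from 0 to 5000 s
DUMP_INTERVAL = 150      # Loc-RIB and DPF filter tables dumped every 150 s
CONVERGENCE_TIME = 300   # initial tables converge at around 300 s
FAILURE_TIME = 350       # the faulty node goes down here

# Times at which table snapshots are taken.
DUMP_TIMES = list(range(DUMP_INTERVAL, SIM_END + 1, DUMP_INTERVAL))
```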
4.4 Stability
Figure 4.1 shows BGP stability as a function of simulation time. In each plot,
“entry count” and “node count” represent distance with respect to entry and distance
2We do not dump routing and filter tables whenever BGP and the route-based DPF protocol effect changes. Instead, we log the intermediate state of routing tables and filter tables periodically.
with respect to node, respectively. From the trajectory of “node count” in Figure
4.1(a), we observe that more than 40% of nodes converge by time 1500 seconds, around
1200 seconds after the start of the transient period. The curve then remains stagnant
for around 300 seconds, and the rest of the nodes converge over the next roughly
700-second period. In the end, BGP routing stabilizes at around 2500 seconds. The
distance with respect to entry is close to 0 during the whole transient period, although
the distance with respect to node granularity is significant. This implies that most
nodes have only a few inconsistent entries throughout the transient period. In the
“node count” trajectory of Figure 4.1(b), we observe that the pattern of BGP
stabilization is similar to that of Figure 4.1(a). The speed of stabilization is slower,
and it stabilizes at around 2800 seconds. Comparatively, the transit node failure case
takes longer for BGP stabilization than the stub node failure case. As with the “entry
count” plot of Figure 4.1(a), the entry-granularity distance is close to 0 during the
whole transient period, which indicates that each node has few inconsistent entries.
[Figure: two panels, (a) a stub node failure and (b) a transit node failure, plotting distance of routing against time with “entry count” and “node count” curves.]
Figure 4.1. BGP routing stability as a function of simulation time.
Figure 4.2 shows consistency of filter tables as a function of simulation time. We
measure the consistency of filter table information at a time instance by measuring
the distance from the filter information theoretically determined for the given routing
at the time instance. Each plot in Figure 4.2 presents the distance with respect to
entry as “entry count”; the distance with respect to table as “filter count”; and the
distance with respect to node as “node count”. In Figure 4.2, we do not observe any
transition in “node count” after the faulty node goes down. The distance with respect
to filter table increases during the initial transient period and stabilizes at around
2500 seconds. In the same way, the distance with respect to entry increases during
the initial period and then stabilizes. The “node count” plot shows that most filter
sites have at least one inconsistent entry. The “filter count” plot, however, shows that
the actual fraction of filter tables holding those invalid entries is less than 20%.
Moreover, the “entry count” plot shows that the fraction of invalid entries is smaller
still. In conclusion, we find that there exist few invalid entries in a few filter tables;
nevertheless, they are dispersed across all the filter sites.
[Figure: two panels, (a) a stub node failure and (b) a transit node failure, plotting distance against time with “entry count”, “filter count”, and “node count” curves.]
Figure 4.2. Consistency of filter tables as a function of simulation time.
As explained in Section 4.2, these differences fall into either the safety violation
category or the staleness category. When there is no entry for a valid source address,
the difference belongs to the safety violation category; when there exists an entry for
an invalid source address, it is classified into the staleness category.
4.5 Safety Violation
Figure 4.3 shows the safety violation indices, SVentry, SVfilter, and SVnode, as a
function of simulation time for the two benchmark scenarios. The plots “entry count”,
“filter count”, and “node count” denote SVentry, SVfilter, and SVnode, respectively. In
both Figure 4.3(a) and Figure 4.3(b), safety violation happens during the transient
period. After BGP's convergence (at around 2500 seconds and 2800 seconds, respectively),
safety violation no longer occurs in either figure. From the “node count”
plots in both figures, we find that around 10% of filter sites violate safety during
the transient periods. However, the distance values of “entry count” and “filter count”
indicate that only a few safety-violating entries are scattered over those filter sites.
[Figure: two panels, (a) a stub node failure and (b) a transit node failure, plotting safety violation against time with “entry count”, “filter count”, and “node count” curves.]
Figure 4.3. Safety violation as a function of simulation time.
4.6 Staleness
Figure 4.4 shows the staleness indices, STentry, STfilter, and STnode, as a function
of simulation time for the two benchmark scenarios. The plots “entry count”, “filter
count”, and “node count” denote STentry, STfilter, and STnode, respectively. In Figure
4.4, “entry count”, “filter count”, and “node count” increase during the transient
period and converge after BGP's stabilization. In both cases, staleness remains after
the transient period. From comparison with Figure 4.2, we find that the inconsistencies
most nodes experienced throughout the simulation period are due to staleness.
As shown in Figure 4.3, safety violation happens only during the transient period,
whereas staleness persists even after it. The “node count” plots in Figure 4.2,
Figure 4.3(a), and Figure 4.4(a) imply that some nodes exhibit both safety violation
and staleness during the transient period. The same observation holds in the transit
node failure case.
[Figure: two panels, (a) a stub node failure and (b) a transit node failure, plotting staleness against time with “entry count”, “filter count”, and “node count” curves.]
Figure 4.4. Staleness as a function of simulation time.
4.7 Containment
Route-based DPF provides proactive protection (a.k.a. containment) against
spoofed DDoS attacks by discarding spoofed IP packets in the first place. In this
section, we analyze containment of route-based DPF during the transient periods in
the presence of safety violation and staleness.
As examined earlier, a node failure causes BGP instability for 2500 to
3000 seconds. During the interim, the route-based DPF filter contains incorrect entries,
which cause safety violation and staleness. However, the measures for safety violation
and staleness do not capture the degree of performance degradation actually
experienced by each source and destination pair.
Figure 4.5 shows the containment index Φ2(1) as a function of simulation time. In
both experimental scenarios, route-based DPF provides around 99% containment.
This means that around 99% of nodes are contained by route-based DPF: a spoofed
DDoS attack launched from those nodes cannot succeed against any victim in other
AS nodes. In particular, in contrast to the wide spread of staleness during and after the
transient periods, we do not observe performance degradation.
[Figure: two panels, (a) a stub node failure and (b) a transit node failure, plotting Φ2(1) against time.]
Figure 4.5. Containment as a function of simulation time.
4.8 Traceback
Although containment by route-based DPF prevents around 99% of nodes from
initiating spoofed DDoS attacks, around 1% of nodes can still mount them.
Reactive protection by route-based DPF (a.k.a. traceback) localizes the
physical sources to within a few sites upon reception of spoofed IP packets.
Figure 4.6 shows trajectories of the traceback index Ψ1(τ) against the resolution
parameter τ. Each trajectory is calculated from the routing and filter information
dumped at the corresponding time instance. Although we do not observe any distinct
pattern where the resolution parameter τ is less than 5, we find that Ψ1(4) approaches
1 in both node failure cases. Over the whole simulation period, upon receiving spoofed
IP packets, route-based DPF can localize their physical source to within 4 nodes.
This property persists even during and after the transient periods.
[Figure: two panels, (a) a stub node failure and (b) a transit node failure, plotting Ψ1(τ) against τ for snapshots taken every 150 seconds from 150 to 3000 seconds.]
Figure 4.6. Traceback as a function of resolution.
Figure 4.7 shows the traceback index Ψ1(τ), for τ from 1 to 5, as a function
of simulation time. Since staleness increases during the transient period, reactive
performance with resolution parameters 2 and 3 diminishes in the stub node failure case;
reactive performance with resolution 2 decreases in the transit node failure case. As
shown in Figure 4.6, nonetheless, Ψ1(4) remains close to 1 for the entire simulation
period. This means that, upon reception of a spoofed IP packet, virtually any victim
can localize the actual attacker to within 4 sites at any time instance.
[Figure: two panels, (a) a stub node failure and (b) a transit node failure, plotting Ψ1(τ) against time for τ = 1 through 5.]
Figure 4.7. Traceback as a function of simulation time.
5 DYNAMIC DPF SIMULATOR
The Dynamic DPF Simulator is a tool for evaluating performance of the route-based
DPF protocol in a dynamic network environment where states of network elements,
such as routers and links, and network protocols may change. Aiming at a realistic
and scalable evaluation of dynamic performance of the route-based DPF protocol, the
Dynamic DPF Simulator is designed to work with large-scale Internet Autonomous
System (AS) measurement graphs. The tools contained therein are applicable to
general network simulation environments, including router graphs and Internet traffic
controls.
The Dynamic DPF Simulator is built on top of DaSSFNet, which provides a net-
work simulation environment for workstation clusters and parallel computers. The
Dynamic DPF Simulator provides additional functionalities, encompassing automatic
partitioning and simulation configuration, various network protocols, and a compre-
hensive measurement framework.
The rest of the chapter is organized as follows. First, we give a brief overview
of DaSSFNet, including a description of its major features. A description of the
Dynamic DPF Simulator as a complete system follows. Each major component is
discussed in its dedicated section.
5.1 Overview of DaSSFNet
DaSSFNet [8] is a DaSSF-based implementation of SSFNet. As a general-purpose
simulation environment, the Dartmouth Scalable Simulation Framework (DaSSF) [39]
provides a C++ implementation of the Scalable Simulation Framework (SSF). In
addition, DaSSF supports shared-memory multiprocessors and distributed-memory
machines as its platform, incorporating advanced parallel simulation techniques. SSF
[50] defines a unified, object-oriented application programming interface (SSF API)
as a standard user interface for discrete-event simulation, considering usability and
performance as its primary design goal. Supporting a process-oriented world-view
of discrete-event simulation, SSF helps make detailed design and implementation
of network models including protocols possible. SSFNet [35] provides simulation
models of various network elements and network protocols on top of a Java-based
implementation of SSF.
In the rest of the section, we describe three key features of DaSSFNet: its
scalable simulation kernel, its process-oriented world-view, and its network simulation
models. Then, a brief introduction to the Domain Modelling Language (DML) for
network specification is presented.
DaSSF, as a base system of DaSSFNet, provides a scalable simulation kernel
along with a network specification language DML. DaSSF can be configured either
as a stand-alone single process application or as a networked distributed application.
As a distributed application, it runs on distributed-memory machines, such as Linux
workstation clusters, the specific environment we use for benchmarking. In the latter
configuration, DaSSF uses the Message Passing Interface (MPI) [51] for synchronization
and communication between system components that implement advanced parallel
simulation techniques. The parallel simulation techniques employed by DaSSF include
distributed event synchronization, thrifty memory usage, and efficient multi-threading.
The threading mechanism reduces both process context-switching overhead and memory
consumption [39]. DaSSF exports an application programming
interface that is compliant with SSF.
A process-oriented world-view is supported by SSF, and hence, DaSSF. The SSF
API includes five primary class interfaces—Entity, inChannel, outChannel, Event,
and Process. Entity is a simulation subject which stores the state of a simulation.
inChannel and outChannel define a way of message passing between Entity objects.
Event represents a form of message exchanged via inChannel and outChannel objects.
Finally, Process is a thread of control, which is scheduled by the simulation framework
nonpreemptively. As an action of a thread, a Process object, which belongs to an
Entity object, can generate and send an Event object to other Process objects. By
providing a similar abstraction of processes as that used in multi-tasking operating
systems, the SSF API supports a familiar process-oriented world-view for modelers.
Thus, we can design, implement, test, and analyze simulation models as we do in
real systems, modulo idiosyncrasies imposed by hardware characteristics.
DaSSFNet comes with a collection of simulation models of network elements and
network protocols, specifically IP and TCP. Adopting the process-oriented world-
view of SSF and its C++ realization for workstation clusters by DaSSF, DaSSFNet
provides a network simulation environment, which is amenable to full-fledged network
protocol implementations. Following is a description of the main components:
• Machine is a logical subject of network simulation components and is modelled
as an Entity object of the SSF API. It consists of zero or more network interfaces
and a network protocol graph which contains installed network protocols.
• NIC, representing a network interface, is modelled as a Process containing an
inChannel object and an outChannel object. The pair of inChannel and outChan-
nel objects reflects the bi-directional nature of communication media and MAC
protocols. A packet is received and sent through these inChannel and outChan-
nel objects. By modelling an interface as a Process, it can detect and handle
incoming packets as soon as the simulation kernel notifies their arrival to the
inChannel of the interface.
• Link models a mapping of inChannel objects and outChannel objects. Accord-
ing to the configuration of a link model, it provides one-to-one or one-to-many
mappings between participating inChannel objects and outChannel objects. For
example, the point-to-point connection between two machines is represented as
two one-to-one mappings between inChannel objects and outChannel objects.
On the other hand, in the case of modelling a Local Area Network (LAN), each
outChannel object is mapped to all inChannel objects within LAN.
• Hardware is modelled as a logical border between network elements and net-
work protocols. All incoming packets received by a network interface, or outgo-
ing packets sent from higher layer network protocols, are passed to a hardware
object. The hardware object passes the incoming packets to the network proto-
col model installed at the lowest layer. Conversely, outgoing packets sent from
the lowest layer protocol are passed to the network interface to be sent via its
outChannel object.
• Network protocols, including IP, TCP, UDP, HTTP, and a BSD-like socket
interface, are implemented as separate C++ classes, borrowing the design phi-
losophy of the x -kernel [52]. First, network protocols, installed on a machine,
compose a graph of network protocols following protocol layering. Next, every
incoming packet is passed to protocols by invoking generic member functions—
send() and receive()—of each protocol according to the network protocol
graph structure. Under these design rules, a full-fledged protocol implementa-
tion is provided.
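The layered send()/receive() dispatch described above can be sketched as follows. This is an illustrative Python rendering of the x-kernel-style call chain, not DaSSFNet's actual C++ classes:

```python
class ProtocolSession:
    """One layer in a machine's protocol graph. Outgoing packets
    move down the stack via send(); incoming packets move up via
    receive(), as in the x-kernel design."""

    def __init__(self, name):
        self.name = name
        self.upper = None  # protocol installed above this one
        self.lower = None  # protocol installed below this one

    def send(self, packet, trace):
        # A real protocol would push its header here, then hand
        # the packet down to the next lower layer.
        trace.append(self.name)
        if self.lower is not None:
            self.lower.send(packet, trace)

    def receive(self, packet, trace):
        # A real protocol would pop its header here, then hand
        # the packet up to the next higher layer.
        trace.append(self.name)
        if self.upper is not None:
            self.upper.receive(packet, trace)


def build_graph(names):
    """Link protocols, listed from highest to lowest layer, into a stack."""
    sessions = [ProtocolSession(n) for n in names]
    for upper, lower in zip(sessions, sessions[1:]):
        upper.lower, lower.upper = lower, upper
    return sessions


# A BGP-over-TCP/IP stack, mirroring the router model's protocol graph.
stack = build_graph(["bgp", "socket", "tcp", "ip"])
down, up = [], []
stack[0].send("update", down)     # outgoing: bgp -> socket -> tcp -> ip
stack[-1].receive("update", up)   # incoming: ip -> tcp -> socket -> bgp
```

Because the whole traversal is a chain of ordinary method calls, it completes in zero simulation time unless a layer explicitly models a processing delay, matching the behavior described below.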
One thing to note here is that message passing via network interfaces is managed
by DaSSF. An instance of class KernelEvents is created and scheduled by DaSSF
when an Event object is sent through an outChannel object. While processing the
scheduled event, DaSSF creates and schedules another instance of class KernelEvents
for its arrival. However, packet handling by network protocol is realized as a chain
of procedure calls between related objects, which model network protocols. In other
words, the scheme does not require DaSSF to be involved for these packet processing
tasks. As a result, packet processing might be finished in zero simulation time un-
less a protocol model implements a form of processing delay. Although this provides
efficiency by reducing the cost for additional event handling related to packet pro-
cessing, it can also be considered a drawback since processing delay in real systems is
ignored. Nonetheless, it provides an abstraction similar to that of network protocol
implementations in typical operating systems, such as Linux and Windows.
DaSSFNet requires a network model—the target of a simulation—to be described
in the Domain Modelling Language (DML) [53]. Figure 5.1 illustrates a simple net-
work specification in DML which consists of one router. Net encloses a configuration
of a network. Similarly, Router encloses a configuration of a router. Within the
router specification in Figure 5.1, a network interface and a network protocol graph
are configured using interface and graph, respectively. The network protocol graph
includes IP, TCP, a BSD-like socket interface, and BGP protocol specifications which
use ProtocolSession. Ordering of protocol specifications within graph implies pro-
tocol layering starting from the highest layer. When more than one protocol exists
above a protocol in the protocol stack (e.g., TCP/UDP on top of IP), a special key-
word child1 is used within the upper layer protocol specifications to indicate the
lower layer protocol as their common lower layer protocol.
DML includes a special keyword, alignment, within Net for describing partition-
ing information. The keyword takes a string value which identifies a partition group
uniquely. More than one network specification can belong to the same partition
group. Figure 5.2 presents how a mapping between partition groups and participating
distributed machines is specified using MapInfo. The keyword nnodes specifies
the number of distributed machines participating in a simulation run. A partition
group, "group1", is mapped to the machine whose identifier is 1. Machine identifiers
are assigned by MPI, ranging from 0 to the number of machines minus 1. By default,
network specifications that do not have any alignment information belong to the
same partition group, which is implicitly mapped to the machine whose identifier is 0.
Figure 5.3 shows an example DML snippet of a point-to-point link. The link
represents a physical connection between two interfaces, whose Network Host Interface
(NHI) [53] addresses are 1:1(0) and 2:2(0). The keyword attach is used to specify
an interface. Here, 1:1(0) means the interface 0 of the host 1, within the network 1.
In the same way, 2:2(0) means the interface 0 of host 2 within network 2.
1Although this keyword was supported in a past version of DaSSFNet, the latest version does not provide it. The Dynamic DPF Simulator has a modified version of DaSSFNet that supports the keyword.
Net[
id 1
alignment "group1"
Router[
id 1
interface[id 0]
graph[
ProtocolSession[
name bgp
use SSF.OS.BGP ]
ProtocolSession[
name socketMaster
use SSF.OS.Socket.socketMaster ]
ProtocolSession[
name TCP
use SSF.OS.TCP.tcpSessionMaster ]
ProtocolSession[
name IP
use SSF.OS.IP ]
]
]
]
Figure 5.1. A simple network specification in DML, which consists of one router.
MapInfo[
nnodes 2
map [ alignment "group1" machid 1 ]
]
Figure 5.2. Mapping of partition groups onto distributed machines.
link[
attach 1:1(0)
attach 2:2(0)
]
Figure 5.3. A DML snippet of a point-to-point link.
5.2 DaSSFNet-based Parallel Network Simulation Environment
5.2.1 Description
As a comprehensive DaSSFNet-based distributed network simulation environment
that takes AS-level Internet graphs as its main input, the Dynamic DPF
Simulator supports the following features essential for scalable simulation:
• Automatic model configuration and topology partitioning
Since the Dynamic DPF Simulator targets AS-level Internet graphs, which are
large and have their own graph representation format, a subsystem, Meta-DML,
is provided as an automatic DML configuration tool. At the same time, Meta-
DML partitions an AS-level Internet graph into separate partition groups, which
are mapped to participating distributed machines. Our own algorithm for partitioning AS-level Internet graphs is employed, which aims to achieve scalable simulation with respect to the size of the input graph and efficient use of distributed memory, CPU, and communication resources.
• Measurement framework
Although the distributed simulation platform supporting automatic network
configuration and power-law topology partitioning enables large-scale network
simulation that efficiently utilizes distributed system resources, a measurement
framework for resource monitoring is needed for analyzing run-time perfor-
mance of the distributed simulation, especially as it pertains to dynamic mem-
ory consumption—a key bottleneck—and event monitoring for diagnosis and
performance analysis. The Dynamic DPF Simulator incorporates a comprehen-
sive measurement framework for large-scale distributed simulation monitoring,
which includes memory consumption, CPU load distribution, and communica-
tion cost. The Dynamic DPF Simulator provides a collection of measurement
routines with a standardized way of specifying measurement configurations in
DML.
• Network protocol models
First, DPF-update and DPF-lookup protocol models are designed and imple-
mented for route-based DPF’s dynamic performance evaluation. Second, BGP
is implemented by porting the Java-based implementation of SSFNet [35]. Since
DaSSFNet does not provide dynamic routing protocol models, such as BGP
or OSPF, a BGP implementation was required for performance benchmarking
on AS-level Internet measurement graphs. Next, a range of application models
are supported. They include traffic generators, attackers, and system fault gen-
erators. Figure 5.4 presents network protocol models supported by the Dynamic
DPF Simulator2.
CBR, Poisson, file trace, MMPP, LRD
{attackers, traffic generators, fault models, ...}
Applications
BGP
TCP UDP
IP
Link Layer
Figure 5.4. Network protocol models in the Dynamic DPF Simulator.
• Modelling AS-level Internet graph
An AS-level Internet graph represents an AS as a node and peering relation-
ships of ASes as edges between nodes. An AS in the Dynamic DPF Simulator
is modelled as a network with one border router. Peering relationships between
ASes are modelled as physical and logical connections between border routers.
2The latest release of DaSSFNet includes the UDP protocol. It was not provided at the time of design and implementation of our prototype system.
A physical connection stands for a physical link between border routers, and a logical connection stands for a peering relationship established by BGP at run-time. Due to the AS-level viewpoint, an AS to which an attacker's machine belongs is modelled as a network whose border router runs the attack application. Similarly, an
AS where a route-based DPF filter is deployed is modelled as a network whose
border router is configured with DPF-update and DPF-lookup protocols. The
specification of the DPF-lookup protocol includes a list of network interfaces
where route-based DPF filters are deployed. Thus, we can selectively deploy
route-based DPF filters at physical links of a border router.
For simplicity, the identifiers of both a network and the border router within the network are the same as that of the corresponding AS node of the graph. Network interfaces at a border router are assigned serial numbers starting from 0.
• IP address assignment
DaSSFNet provides an IP addressing scheme which uses subnet addressing. Subnet addressing is assumed within link and Net specifications. Thus, DaSSFNet
automatically calculates a subnet mask depending on the number of network
interfaces within link or Net. Each network interface is assigned an IP address
by adding a unique identifier within its subnet to the subnet mask calculated.
However, this addressing scheme is not suited for modelling peering relationships
between ASes. As an administrative domain, each AS manages and assigns IP addresses that were allocated to it by InterNIC. Thus, modelling a link between
border routers at different ASes as a subnet is not appropriate. For this reason,
the Dynamic DPF Simulator has its own addressing scheme. Assuming “subnet
addressing” is used within an AS, the most significant 16 bits are used for the
subnet mask, in other words, a common IP prefix. The least significant 16 bits
are filled with a unique identifier denoting an interface.
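As a sketch of this scheme (the function names are our own; the simulator's actual code differs), an interface address carries the AS identifier in the upper 16 bits and the interface identifier in the lower 16 bits:

```cpp
#include <cassert>
#include <cstdint>

// Compose a 32-bit address: high 16 bits = the AS's common prefix (its
// identifier here), low 16 bits = a per-interface identifier.
uint32_t makeAddress(uint16_t asId, uint16_t ifaceId) {
    return (static_cast<uint32_t>(asId) << 16) | ifaceId;
}
uint16_t asOf(uint32_t addr)    { return static_cast<uint16_t>(addr >> 16); }
uint16_t ifaceOf(uint32_t addr) { return static_cast<uint16_t>(addr & 0xFFFFu); }
```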
Figure 5.5 shows the system architecture of the Dynamic DPF Simulator. As men-
tioned earlier, Meta-DML generates a DML configuration file from a given network
partitioning information (i.e., the number of distributed machines), network topology
(i.e., AS-level Internet graph), and additional user-specified protocol configuration
information.
Figure 5.5. System architecture of the Dynamic DPF Simulator.
DaSSFNet takes a DML configuration file as input and creates C++ objects of
network elements and network protocols as specified in the file. Since the DML config-
uration file includes partitioning information, C++ objects of a network specification
are instantiated at a corresponding machine where their partition group is assigned.
Before a simulation starts, there are two phases of preparation for a simulation—
self-configuration and self-initialization. Self-configuration precedes self-initialization.
First, DaSSFNet triggers self-configuration by calling Configure() of the C++ object
of the outermost Net specification, which encloses the whole simulation model. In
turn, it triggers Configure() of all C++ objects of specified models. Next, they
trigger Configure() of all C++ objects of models specified within. In this fashion,
C++ objects of all simulation models are recursively configured. After configuration,
DaSSFNet triggers self-initialization by calling Initialize() of the C++ object of
the outermost Net specification.
Once all C++ objects are created, configured, and initialized by DaSSFNet, sim-
ulation starts. As a discrete-event simulator, DaSSF processes events in the simu-
lation kernel. These events, called kernel events, are generated by applications and
network protocol models. After processing events scheduled at a particular time,
DaSSF schedules Processes nonpreemptively. After scheduling all eligible Processes,
DaSSF advances the current simulation time to that of the nearest future event(s)3.
In case processing of an event requires communication or synchronization with simulation state at other participating machines, DaSSF communicates via MPI to effect coordination.
5.2.2 Automatic Model Configuration and Partitioning
Since the Dynamic DPF Simulator is used with AS-level Internet graphs which
are large and have their own graph representation format, an automatic DML con-
figuration tool is provided as a subsystem. At the same time, Meta-DML partitions
a given AS-level Internet graph into separate partition groups, which are mapped
to participating distributed machines. We devised an algorithm for partitioning AS-level Internet graphs possessing power-law connectivity properties, aimed at achieving
scalable simulation with respect to the size of the input graph through efficient use
of distributed memory, CPU, and communication resources.
Meta-DML accepts network partitioning information (i.e., the number of dis-
tributed machines), network topology (i.e., AS-level Internet graph and transit AS
node information), and additional protocol configuration. For example, additional
3Depending on how the initial triggering of events in a simulation is set up, more than one event may occur simultaneously and is scheduled accordingly.
protocol configuration includes distributed filter configuration, such as a list of DPF filter-deployed ASes, and attack configuration, such as a list of ASes which install attacker models. Parsing a given AS-level Internet graph input, Meta-DML builds an
initial graph representation. Meta-DML partitions the given Internet graph over the
participating distributed machines and stores the information. Next, distributed fil-
ter and attack configuration are processed. Finally, Meta-DML generates a DML file
from the information collected.
Figure 5.6 shows a sample Meta-DML input file. The variable size denotes the
number of nodes in the input graph, machines represents the total number of ma-
chines, and dynamic_1_static_0 takes on the values 0 or 1. When dynamic_1_static_0
is set to 0, Meta-DML generates an additional variable routing_tbl_file within the
IP DML specification, so that IP routing table entries are loaded from a file. Other-
wise, BGP is configured at every node, in order to calculate routing table information
dynamically. The variable filtering_1_nofiltering_0 centrally configures the filtering functionality of each DPF-lookup protocol via the filtering variable of the DPF-lookup DML model. An AS-level Internet graph, represented as an adjacency list,
is given as input specified by graph_input. transit_node_input specifies a list of
transit AS nodes. Nodes that are not listed in the file are configured as stub AS nodes.
The variable filter_node_input takes a list of filter-deployed nodes as argument.
attack_input represents a list of attacker nodes.
Partitioning
Partitioning a given graph for efficiently utilizing distributed memory, CPU, and
communication resources is a critical component of large-scale simulation. As pre-
sented in Figure 5.7, AS-level Internet graphs based on Oregon RouteViews/NLANR
measurements have grown super-linearly during 1998-2003. The measurement topol-
ogy dated 01/01/2002 has 12,514 nodes. With the growth of the number of AS nodes
n, O(n²) memory for maintaining O(n²) routing entries is required in
size 3023
machines 10
dynamic_1_static_0 1
filtering_1_nofiltering_0 1
graph_input ASgraph.3023
transit_node_input AStrans.all
filter_node_input VC.3023
attack_input attack.3023
Figure 5.6. A sample Meta-DML input file.
case each AS has at least one unique IP prefix. Furthermore, simulation scenarios
involving varying traffic patterns impose additional memory requirements as well as
CPU and communication resource requirements.
Figure 5.7. Growth of AS-level Internet graph (number of ASes per year, 1998-2003).
One important feature of distributed simulation with respect to partitioning is
that the slowest participating process determines the execution time of the whole
simulation. This implies that effective load balancing is crucial for parallel speed-up. Furthermore, excessive memory requirements may trigger the virtual memory system, causing swapping between main memory and disk, which has a debilitating effect on distributed simulation. A key requirement of partitioning,
in addition to CPU and communication balancing, is that static and dynamic memory requirements are balanced so that the virtual memory system is kept at bay.
In this thesis, we assume a homogeneous workstation cluster environment, where
each machine has uniform memory, CPU, and communication resources. Since a
major challenge for performing large-scale network simulation comes from significant
memory requirement, we focus on balanced distribution of static and dynamic memory requirements. To do so, we identify key factors involved in memory consumption from experiments and balance their load across participating machines. By harnessing
locality of message exchange within each distributed machine, overhead of synchro-
nization and message exchange between machines is reduced, and utilization of given
CPU and communication resources is increased.
From performance results of memory requirement monitoring in Chapter 6, it
turns out that routing tables, especially Adj-RIB-In tables of BGP, are the most
dominant factor in terms of static memory consumption. The results also show that
messages occupy only a small portion of dynamic memory compared to that of ta-
bles. Thus, we first focus on evenly distributing BGP Adj-RIB-In table’s memory
requirement into participating machines for static memory requirement balancing.
Since every edge in an AS-level Internet graph represents a BGP peering relationship, an AS node u has Adj-RIB-In tables with O(|V| deg(u)) space requirement
where |V | represents the total number of nodes. Denoting the number of edges of
an AS-level Internet graph and the number of participating machines as |E| and
k, respectively, the total number of Adj-RIB-In tables comes to 2|E|, and its space
complexity is O(|V ||E|). Thus, our heuristic algorithm limits the total number of
Adj-RIB-In tables in each partition group to around 2|E|/k.
As presented in [40], AS-level Internet graphs possess power-law connectivity properties. One of the implications is that there exist a few high degree ASes, which possess peering relationships with many low degree ASes. Figure 5.8 shows a 300-node
subgraph of the 3023-node AS-level Internet graph of Oregon RouteViews/NLANR
measurements from 11/08/1997. We observe a few locally “star-like” AS clusters that
are connected by a more complicated “backbone”.
Figure 5.8. 300-node AS-level Internet graph.
Exploiting power-law connectivity in partitioning helps to reduce the frequency of
message exchange between distributed machines, which reduces completion time due
to reduced message exchange overhead. Considering locally star-like AS clusters, all
network traffic generated by, or heading to, many low degree ASes should go through
central high degree AS nodes. Hence, one locally star-like AS cluster is put into one
partition group. This prevents message exchanges within locally star-like AS clusters from being sent across distributed machines, which would incur communication overhead.
The partitioning routine takes a graph G = (V, E) and the number of machines k
as input. It returns k partition groups as output. The entire procedure of partitioning
consists of 4 steps—sorting, phase 0, phase 1, and phase 2.
In the sorting step, V , the set of nodes of the graph G, is sorted by the degree of
each node. Phase 0 is a step that uses the power-law property for partitioning. During
phase 0, the k largest locally star-like AS clusters are assigned to k separate partition
groups. This is shown in Figure 5.9. Checking the adjacency list of each of the k highest degree nodes, degree-1 nodes are put into the same partition group as their adjacent
highest degree node. This enables messages between degree-1 nodes connected by a
central high-degree node to be localized within a partition group.
Phase 1 and phase 2 are steps for balancing the memory requirement with respect
to the total number of Adj-RIB-In tables within one partition group. During phase 1,
connectivity information of the graph is additionally considered to reduce the number
of messages across distributed machines. As seen in Figure 5.10, partition groups are
filled from a group of the kth highest degree node to a group of the first highest
degree node. To fill a partition group, nodes are visited from the (k + 1)th node to
the last node according to the sorted order. When a node has not yet been assigned
into any group, the algorithm checks if the node has connections with any node of the
group. If this is the case, the algorithm considers if the degree of the node makes the
total number of Adj-RIB-In tables of the group exceed a limit. If the resulting total number of Adj-RIB-In tables is within the limit, the node is assigned into the partition group. In this greedy fashion, each partition group is filled in order. Consequently, there may be residual nodes due to the algorithm's greedy nature.
The last step, phase 2, is for partitioning the residual nodes which have not been assigned by the end of phase 1. Nodes are assigned into partition groups, where the total
number of Adj-RIB-In tables have not yet reached the limit, 2|E|/k, without regard
to their connectivity.
phase0
. Assign the k highest degree nodes into k different partitions
for i ← 1 to k
do partition[i].lists← partition[i].lists ∪ node[i];
partition[i].nedges← deg[i];
. Select degree-1 nodes from adjacency list of the k highest degree nodes
for i ← 1 to k
do for all j ∈ adj[i]
do if deg[j] = 1
then partition[i].lists← partition[i].lists ∪ node[j];
partition[i].nedges← partition[i].nedges + deg[j];
Figure 5.9. Pseudo code of phase 0.
phase1
. Use connectivity information and check for memory threshold
for i ← k to 1
do for j ← k + 1 to |V|
do if (node[j] is not assigned yet)
then for all l ∈ partition[i].lists
do if (node[j] ∈ adj[l]) and
((partition[i].nedges + deg[j]) < 2*|E|/k)
then partition[i].lists←
partition[i].lists ∪ node[j];
partition[i].nedges←
partition[i].nedges + deg[j];
break;
Figure 5.10. Pseudo code of phase 1.
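Putting the four steps together, the heuristic can be sketched as the following self-contained C++ routine. This is our own illustrative reimplementation, not the simulator's code; in particular, the fall-back to the last group in phase 2 is an assumption we make for the case where every group has already reached the 2|E|/k budget:

```cpp
#include <algorithm>
#include <numeric>
#include <vector>

// Illustrative sketch of the partitioning heuristic: adj[u] lists the
// neighbors of node u; k is the number of machines. Returns a group id
// (0..k-1) per node.
std::vector<int> partitionGraph(const std::vector<std::vector<int>>& adj, int k) {
    int n = static_cast<int>(adj.size());
    long long totalEdges = 0;
    for (const auto& a : adj) totalEdges += static_cast<long long>(a.size());
    totalEdges /= 2;                        // each edge counted from both ends
    long long limit = 2 * totalEdges / k;   // Adj-RIB-In table budget per group

    // Sorting step: node indices in decreasing degree order.
    std::vector<int> order(n);
    std::iota(order.begin(), order.end(), 0);
    std::sort(order.begin(), order.end(),
              [&](int a, int b) { return adj[a].size() > adj[b].size(); });

    std::vector<int> group(n, -1);          // -1: unassigned
    std::vector<long long> load(k, 0);      // #tables (= sum of degrees) per group

    // Phase 0: seed each group with one of the k highest-degree nodes and
    // pull in its degree-1 neighbors (a locally star-like cluster).
    for (int i = 0; i < k; ++i) {
        int hub = order[i];
        group[hub] = i;
        load[i] += static_cast<long long>(adj[hub].size());
        for (int v : adj[hub])
            if (adj[v].size() == 1 && group[v] == -1) {
                group[v] = i;
                load[i] += 1;
            }
    }

    // Phase 1: greedily add nodes adjacent to a group, within the budget,
    // filling groups from the kth seed down to the first.
    for (int i = k - 1; i >= 0; --i)
        for (int j = k; j < n; ++j) {
            int u = order[j];
            if (group[u] != -1) continue;
            bool adjacent = false;
            for (int v : adj[u]) if (group[v] == i) { adjacent = true; break; }
            if (adjacent && load[i] + (long long)adj[u].size() < limit) {
                group[u] = i;
                load[i] += static_cast<long long>(adj[u].size());
            }
        }

    // Phase 2: place residual nodes into any group still under the budget
    // (falling back to the last group if all are full -- our assumption).
    for (int u = 0; u < n; ++u)
        if (group[u] == -1)
            for (int i = 0; i < k; ++i)
                if (load[i] < limit || i == k - 1) {
                    group[u] = i;
                    load[i] += static_cast<long long>(adj[u].size());
                    break;
                }
    return group;
}
```

On a graph of two degree-1 stars whose hubs are joined by an edge, phase 0 alone places each star in its own group, localizing all star-internal message exchange.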
5.2.3 Measurement Framework
Memory Requirement Analysis
In general, a network protocol model needs to maintain two categories of state
information—tables and messages. For example, routing related tables are generally
required in any network simulation, and we can roughly estimate necessary amount
of memory resource statically. On the other hand, protocol messages are created
and deleted depending on each protocol’s own semantics or protocols’ interaction
under current state of the network being simulated. Hence, the amount of memory
requirement is determined dynamically, and it is difficult to approximate in advance.
We refer to protocol message related information as message complexity and table related information as table complexity. The Dynamic DPF Simulator logs message
complexity of IP packet, TCP segment, and BGP message; table complexity of IP
routing table and BGP’s internal tables—Adj-RIB-In and Loc-RIB tables. The Dy-
namic DPF Simulator leaves a log at the time when a message or table entry is created
and the object’s size within its constructor; the time when a message or table entry is
deleted and the object’s size within its destructor. The trace file is used for deducing
the total number of objects created, the total number of objects existing at a time
instance, and the total memory consumption by them at the time instance.
The total memory consumption by protocol messages and tables at each time
instance indicates the amount of memory resources required for completion of the
given simulation. The memory requirement corresponds to the maximum value of
the total memory usage during simulation. Logging memory usage at each protocol
layer, we can identify memory requirement of messages and tables for each protocol
layer. Besides, it helps to understand dynamic state transition of network protocols.
For example, table complexity of IP tables and BGP tables indicates BGP’s routing
table calculation state—its transition and stability.
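The constructor/destructor instrumentation described above can be sketched as follows. The class names are ours, and we keep running totals in memory rather than writing a trace file as the simulator does:

```cpp
#include <cassert>
#include <cstddef>

// Illustrative per-category log of object creation and deletion.
struct MemLog {
    long liveObjects = 0;            // objects existing at this instant
    std::size_t liveBytes = 0;       // their total size
    std::size_t peakBytes = 0;       // maximum over the run = memory requirement
    void created(std::size_t sz) {
        ++liveObjects;
        liveBytes += sz;
        if (liveBytes > peakBytes) peakBytes = liveBytes;
    }
    void deleted(std::size_t sz) { --liveObjects; liveBytes -= sz; }
};

MemLog ipPacketLog;  // one log per message/table category, e.g. IP packets

struct IpPacket {    // hypothetical message class instrumented as described
    char payload[64];
    IpPacket()  { ipPacketLog.created(sizeof(IpPacket)); }
    ~IpPacket() { ipPacketLog.deleted(sizeof(IpPacket)); }
};
```

The peak of the running total over a simulation run then corresponds to the memory requirement discussed above.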
CPU Load Monitoring
The Dynamic DPF Simulator monitors simulation kernel-level event objects’ cre-
ation and deletion, and it logs their creation/deletion time, their size, and their type
information. Discrete-event simulation frameworks, including DaSSF, generate and handle internal events to advance simulation time. Although there exist differences in processing events depending on their type, the total number of events generally reflects CPU load. Hence, we can infer CPU load balance by comparing the total number of events at each participating machine.
One item to note here is that there exists additional CPU load, which is introduced
by the process-oriented world view of DaSSF. DaSSF provides an abstraction similar to processes in an operating systems context, so that users can design, implement, test,
and analyze simulation models as they do in real systems. Internally, DaSSF handles
event(s) which is scheduled to occur at a simulation time instance. Before advancing
the current simulation time to that of the next nearest future event(s), it schedules
all the eligible processes nonpreemptively. It causes nondeterministic computing load
per process depending on its execution flow. By monitoring CPU time over the course
of simulation, we can take this nondeterministic computing load into account.
Communication Cost Analysis
The Dynamic DPF Simulator’s detailed measurement of DaSSF’s internal events
also provides accurate and efficient run-time monitoring of communication cost. Fig-
ure 5.11 illustrates local and remote message passing procedures. An IP packet sent
through a network interface card (NIC) causes an outChannel type event to be sched-
uled. In case when the recipient side is partitioned within the same partition group,
another event object of type inChannel is scheduled while processing the outChan-
nel object. Processing the inChannel event, simulation kernel notifies arrival of the
IP packet to the recipient side NIC. During local message passing procedure, one
inChannel object and one outChannel object are instantiated.
On the other hand, when the recipient is present at a remote node, the simulation kernel schedules a Channel type event while processing the outChannel event. Handling the Channel event, the simulation kernel sends a message to the remote node through MPI. When the remote node receives the message, it schedules an inChannel event. Finally, the simulation kernel at the recipient side notifies the IP packet's arrival to the receiving side NIC while processing the inChannel event. Hence, from the fraction of the number of Channel events processed to the number of outChannel (or inChannel) events processed, we can observe the degree of remote message passing.
Figure 5.11. Local and remote message passing procedures.
Note that the communication cost incurred by synchronization is not taken into
account in this analysis. Before simulation starts, DaSSF statically calculates the synchronization interval from the given network configuration, including link latency. Hence, irrespective of the degree of remote message passing, simulations which have the same network
configuration require the same amount of fixed synchronization overhead.
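The degree of remote message passing can thus be computed directly from the event counts; a sketch with hypothetical names:

```cpp
#include <cassert>

// Every send schedules one outChannel event, and only sends that cross
// machines additionally schedule a Channel event, so the ratio below
// measures the degree of remote message passing.
double remoteFraction(long channelEvents, long outChannelEvents) {
    if (outChannelEvents == 0) return 0.0;  // no sends observed
    return static_cast<double>(channelEvents)
         / static_cast<double>(outChannelEvents);
}
```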
Measurement Methodology
We characterize measurement methodology by grouping it into canonical, sampling-based, time-based, or hybrid (sampling-based and time-based) measurement. In canonical measurement, a variable of interest is logged4 whenever an event that changes the state of the variable occurs. However, as the level of detail and scale of simulation increase, the number of event occurrences increases drastically.
Hence, giving up minute details of information, techniques such as sampling-based or time-based logging are applied. Sampling-based measurement logs the variable of interest once out of a specified number of event occurrences. On the other hand, time-based measurement logs the variable at a specified time interval. Sometimes,
event occurrences are not evenly distributed along simulation time. In other words,
most events might occur in a short period of simulation time, so that the sampling-based method might lose information which is rare. However, this information can
be important in terms of temporal trajectory of the variable of interest. In this
case, sampling and time-based method can be used together. That is, the variable of
interest can be logged when either a specified amount of events have occurred or a
specified time interval has passed.
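The hybrid method can be sketched as a small class. HybridLogger and its interface are our own construction; its two thresholds play the roles of the sampling count and time interval just described:

```cpp
#include <cassert>

// Log when either sampleRate events have occurred since the last log, or
// timeConst simulated seconds have elapsed since the last log.
class HybridLogger {
    long sampleRate_;
    double timeConst_;
    long eventsSinceLog_ = 0;
    double lastLogTime_ = 0.0;
public:
    HybridLogger(long sampleRate, double timeConst)
        : sampleRate_(sampleRate), timeConst_(timeConst) {}
    // Returns true when this event occurrence should be written to the log.
    bool onEvent(double simTime) {
        ++eventsSinceLog_;
        if (eventsSinceLog_ >= sampleRate_ ||
            simTime - lastLogTime_ >= timeConst_) {
            eventsSinceLog_ = 0;
            lastLogTime_ = simTime;
            return true;
        }
        return false;
    }
};
```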
DML Specification for Measurement Routines
The Dynamic DPF Simulator supports measurement routines for the aforementioned use. In addition, it proposes a generic way of specifying control variables in the input DML file.
The Dynamic DPF Simulator supports two types of measurement scope—global
and local. Global scope measurement measures the state of the entire network from the perspective of a specific network protocol or of the simulation kernel. On the
other hand, in local scope measurement, events related with incoming or outgoing
4Here, log means to write into a specified file. These categories are mainly focused on the action of logging.
packets are measured from a local Machine's view. Hence, in order to obtain global
information from local measurement data, results logged at all Machine objects should
be collected and processed additionally.
In case of global measurement, the Dynamic DPF Simulator requires a sepa-
rate description enclosed with global_measurement. Figure 5.12 shows a DML
snippet for global measurement. The simulator classifies global measurement into
DaSSF_kernel_level and DaSSFNET_non_kernel_level. Note that measurement
from the viewpoint of simulation kernel belongs to global measurement, because
there is no concept of a local Machine at the kernel’s point of view. DML speci-
fications of measurement routines are listed within these classification boundaries.
For example, a measurement routine, kernel_event_count, monitors the number of
KernelEvent objects created cumulatively. This routine takes four control variables—
ON_OFF, SAMPLE_RATE, TIME_CONST, FILE. ON_OFF enables or disables the measure-
ment routine by taking ON or OFF as its value. Its default value is OFF. SAMPLE_RATE
provides a way of logging variables using the sampling-based method. It takes an integer value, and the routine logs these variables once out of a specified number of event occurrences. By default, the value is set to 1. In other words, these variables are
logged whenever any of those variables is changed. TIME_CONST provides a way of
logging variables using time-based method. It takes a real number, whose unit is in
simulation second. It corresponds to time interval of logging, and its default value
is 1 second. FILE takes name of a file as its value, and it is set to kevents_msr by
default. The same convention of control variables holds to other DML descriptions.
Since local measurement provides a way of measuring events related with incoming
or outgoing packets at a local Machine’s view, routines in this category are specified
as a part of each Machine model's DML description. Figure 5.13 shows a DML snippet for local measurement at IP. Depending on the viewpoint of measurement (per network protocol), a set of control variables is provided by default. Hence, one can
use them to customize measurement routines. The meaning and usage of the four control variables (MSR_ON_OFF, MSR_SAMPLE_RATE, MSR_TIME_CONST, MSR_FILE) are the
global_measurement[
DaSSF_kernel_level[
kernel_event_count[
ON_OFF ON
SAMPLE_RATE 1
TIME_CONST 1
FILE kernelevt
]
]
DaSSFNET_non_kernel_level[
# other measurement routines at non-kernel level come here.
]
]
Figure 5.12. A DML snippet for global measurement
same as those of the global measurement (ON_OFF, SAMPLE_RATE, TIME_CONST, FILE),
respectively. Note that there are additional control variables—NI_MSR_ON_OFF and NI_MSR_FILE—for measurement at the network interface layer of the Internet Reference Model. It monitors a packet queue in the network interface layer, and leaves a log whenever a packet is enqueued, dequeued, or dropped.
ProtocolSession [
name IP
use SSF.OS.IP
MSR_ON_OFF true # supported by all protocol models
MSR_SAMPLE_RATE 1 # supported by all protocol models
MSR_TIME_CONST 1 # supported by all protocol models
MSR_FILE ip_msr.txt # supported by all protocol models
NI_MSR_ON_OFF true # only for measurement at network interface
NI_MSR_FILE layer2_msr.txt # only for measurement at network interface
]
Figure 5.13. A DML snippet for local measurement at IP.
5.2.4 Protocol Modeling
The Dynamic DPF Simulator supports additional protocol models beyond the ones provided by DaSSFNet. Some of them, such as BGP, UDP, and various application
models, are useful for general purpose network simulations. Others, such as DPF-
update and DPF-lookup models, are designed for evaluating dynamic performance of
the route-based DPF protocol. In this section, we describe these protocol models.
BGP-4 and its extension
DaSSFNet does not provide BGP, the Internet inter-domain routing protocol. Although it provides an indirect way of specifying routing table information as part of an input DML file, simulations which require dynamic alteration of routes according to the time-varying state of the network are infeasible. We have implemented the E-BGP part of BGP-4, based on BGP-4's RFC [48] and its protocol model in SSFNet [35].
One item to mention here is that the BGP-4 extension is designed together with
BGP-4 as a single BGP-4 model. To enable or disable the functionality of the BGP-4
extension, an additional parameter, BGP_reflect, is employed. An example of DML
for BGP-4 is described at the end of this section.
The BGP-4 protocol model is implemented as class bgp. As shown in Figure 5.14,
BGP-4 sits on top of TCP for reliable transport of its messages. BGP maintains one
or more peering relationships to other BGP speakers, and each peering relationship is
maintained as an instance of class neighbor. Thus each BGP protocol instance has a separate neighbor object per peer. Each neighbor object creates two separate processes to handle sending and receiving BGP messages to and from its corresponding peer. The two processes communicate according to the states and events of BGP-4, as defined in RFC 1771 [48], and buffers between a bgp instance and its neighbor instances store BGP UPDATE and BGP REFLECT messages to send.
Figure 5.14. An illustration of a peering relationship between one border router inAS 1 and another in AS 2.
Three types of tables are defined in BGP-4 RFC—Adj RIB In, Loc RIB and
Adj RIB Out. In this model, there exist Adj RIB In and Loc RIB only. Without
having separate tables for advertisement, Loc RIB is used as Adj RIB Out simul-
taneously. This is because our advertisement policy allows all newly received BGP
UPDATE messages to be advertised to peers. Both Loc RIB and Adj RIB In use a trie as their internal data structure, so that IP prefix searching can be done quickly.
Figure 5.15 depicts the architecture of BGP protocol model with its tables. A BGP
protocol instance may have more than one peering relationship. A bgp instance has a central Loc RIB table, which stores route information in the form of Loc_RIB_data. On the other hand, each neighbor object has its own Adj RIB In table to keep route information from incoming BGP UPDATE messages.
There are 4 types of messages in BGP-4: OPEN, KEEPALIVE, UPDATE and
NOTIFICATION.5 As shown in Figure 5.16, all these messages are derived from a base
class bgp_message, which contains type and length fields of message header. Please
refer to RFC1771 [48] for a detailed description of each message. To support the extension of BGP-4, the BGP REFLECT type message is defined as class reflectmessage. It is also derived from class bgp_message.
5NOTIFICATION type message is not supported in this model.
Figure 5.15. The architecture of BGP protocol model with its tables.
Figure 5.16. The class hierarchy of BGP message types.
The BGP-4 protocol model employs four timers for BGP-4 and one additional timer for its extension. The ConnectRetry, Hold, KeepAlive, and MinRouteAdvertisementInterval (MRAI) timers are used for BGP-4, and the ReflectSend timer is used for its extension, which is defined as the Reflect timer in Section 3.3.4. Figure 5.17 illustrates the architecture of the BGP protocol model with its timers. First, Keepalive_timer is
used to send BGP KEEPALIVE messages periodically. Whenever Keepalive_timer expires, the Handle_Send instance sends a BGP KEEPALIVE message to its peer. Next,
the Handle_Listen instance uses Hold_timer for two purposes. During the BGP Connect and Active states, it is used as the ConnectRetry timer. On the other hand, it is used as the Hold timer during the BGP OpenConfirm and Established states. When the ConnectRetry timer expires, the timer is reset and the transport connection is initiated. When the Hold timer expires, the transport connection is closed and resources are released. Both
Keepalive_timer and Hold_timer exist per BGP peer, managing separate peering
relationship with each peer. MRAI timer is responsible for initiating BGP UPDATE
messages periodically. Finally, ReflectSend timer is used to trigger BGP RELECT
messages. Both MRAI_send_timer and Reflect_send_timer are managed centrally
by BGP protocol instance.
Figure 5.17. The architecture of BGP protocol model with its timers.
Figure 5.18 shows a DML snippet of the BGP-4 model. The string identifier of the
BGP-4 protocol model, SSF.OS.BGP, is given as the value of the keyword use. The
keyword autoconfig takes either true or false as a value. By setting it to true, BGP
automatically initiates peering relationships with all BGP instances on directly
connected machines. Otherwise, we can set peering relationships manually, using
additional keywords for manual configuration. The keywords connretry_time,
keepalive_time, hold_time, and mrai_time set the ConnectRetry, KeepAlive, Hold,
and MRAI timer intervals, respectively. As mentioned in Section 3.3.4, our BGP-4
model supports configuration of an AS either as a stub AS or as a transit AS. The
keyword stub is used for this configuration. Our BGP-4 model also supports dumping
the entries of the Loc-RIB table into a file; the keyword table_dump enables or
disables this functionality. The keywords BGP_reflect, reflect_timer,
and ref_start_time control the functionalities of the BGP extension. By
setting the value of BGP_reflect to true or false, we enable or disable
the functionalities, respectively. We can also schedule the start of the function-
alities by setting a value for ref_start_time. The keyword reflect_timer
sets the ReflectSend timer interval.
ProtocolSession [
name bgp # name of protocol
use SSF.OS.BGP # identifier of protocol
autoconfig true
connretry_time 120 # ConnectRetry timer interval
keepalive_time 30 # KeepAlive timer interval
hold_time 90 # Hold timer interval
mrai_time 30 # MRAI timer Interval
stub false # stub or Transit
table_dump true # dumps table at the end
BGP_reflect false # BGP REFLECT ON/OFF
ref_start_time 1 # BGP REFLECT start time
reflect_timer 1 # BGP REFLECT interval
]
Figure 5.18. A DML snippet of BGP-4 model.
DPF-update
The DPF-update model provides two major functionalities—local filter table up-
date and BGP REFLECT message forwarding. DPF-update interacts with BGP
Extension as shown in Figure 5.19.
Figure 5.19. Interaction of BGP and DPF-update.
BGP Extension passes received BGP REFLECT messages to DPF-update through the
Reflect buffer. First, every BGP REFLECT message in the Reflect buffer is interpreted by
DPF-update in order to increment or decrement the counter value of the corresponding
entry. Next, DPF-update finds the next upstream filter site from the AS-PATH of the
received message. For this, DPF-update takes a list of route-based DPF filter sites as
input. Subsequently, DPF-update attempts to make a connection to the upstream
filter site. If the connection is established, the received BGP REFLECT message
is forwarded. Otherwise, regarding that upstream filter site as an unreachable node,
DPF-update tries to find and connect to the next available upstream filter
site. If no further upstream filter site exists, the received BGP REFLECT message is
discarded.
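The forwarding rule above can be sketched as a small helper that walks the AS-PATH upstream and picks the first reachable filter site. This is an illustrative sketch under assumed names (nextUpstreamFilterSite, a canConnect callback), not the simulator's actual API:

```cpp
#include <cassert>
#include <functional>
#include <set>
#include <vector>

// Walk the AS-PATH of a received BGP REFLECT message upstream and return
// the first AS that is both a configured DPF filter site and reachable.
// asPath[0] is the local AS; higher indices are further upstream.
// Returns -1 if the message should be discarded. Illustrative names only.
int nextUpstreamFilterSite(const std::vector<int>& asPath,
                           const std::set<int>& filterSites,
                           const std::function<bool(int)>& canConnect) {
    for (std::size_t i = 1; i < asPath.size(); ++i) {
        int as = asPath[i];
        if (filterSites.count(as) == 0) continue;  // not a DPF filter site
        if (canConnect(as)) return as;             // forward the REFLECT here
        // Unreachable: treat as a failed node and try the next upstream site.
    }
    return -1;                                     // no site left: discard
}
```

The fallback behavior, skipping an unreachable site in favor of the next one further upstream, is what keeps filter updates propagating around failed nodes.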
Figure 5.20 shows a DML snippet of DPF-update model. It includes the keyword
filter_site which contains a list of route-based DPF filter sites, where all filter
sites’ AS numbers are specified using the keyword as.
ProtocolSession [
name DPF_update # name of protocol
use SSF.OS.DPF_update # identifier of protocol
filter_site[ # route-based DPF filter sites
as 1
as 2
]
]
Figure 5.20. A DML snippet of DPF-update model.
DPF-lookup
DPF-lookup provides route-based Distributed Packet Filtering (DPF) function-
ality per network interface, referring to the filter table of each network interface.
DPF-lookup is modelled as a class DPF_lookup, which includes a set of filter tables.
To verify the validity of the source address of an incoming IP packet, DPF-lookup needs
to search the filter table of the specific link on which the packet arrives, using the
packet's source address as a key. Thus, each filter table uses a trie as its data structure
for fast searching.
When a packet is passed up from the lower layer protocol, DPF-lookup checks its
source address as described above. If the packet turns out to be valid, it is
passed up to the protocol layer above, IP. Otherwise, the packet is discarded. When
a packet is passed down from the higher layer protocol, DPF-lookup simply forwards
the packet to the lower layer protocol.
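The per-interface decision can be summarized in a few lines. In this sketch, plain sets stand in for the per-link tries, and the names (DpfLookup, acceptFromBelow) are illustrative rather than the simulator's actual interface:

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <set>

// Sketch of DPF-lookup's decision: a packet arriving from the link layer is
// passed up to IP only if its source address is valid according to that
// interface's filter table; outgoing packets always pass through unchanged.
struct DpfLookup {
    // Filter table per interface: source addresses valid on that link.
    std::map<int, std::set<std::uint32_t>> filterTable;

    // Called when a packet comes up from the link layer on interface `nic`.
    bool acceptFromBelow(int nic, std::uint32_t srcAddr) const {
        auto it = filterTable.find(nic);
        if (it == filterTable.end()) return true;  // no filter deployed here
        return it->second.count(srcAddr) != 0;     // invalid source: discard
    }
    // Packets from the layer above (IP) are always forwarded down.
    bool acceptFromAbove() const { return true; }
};
```

Interfaces without a deployed filter pass everything, which matches the selective per-interface deployment described below.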
Figure 5.21 provides an example DML snippet of the DPF-lookup model. The key-
word nic specifies a network interface where the route-based DPF filter is deployed;
in other words, we can selectively deploy the route-based DPF filter on particular
network interfaces of a router. The value of the keyword is an identifier of a network
interface. The keywords filtering and discard_time control the functionality for
convenience. One can turn off the filtering functionality by setting the value
of filtering to false; otherwise, the filtering functionality is turned on by default.
In addition, one can defer the start of the filtering functionality to a future time by
using filtering and discard_time together: as in Figure 5.21, one sets filtering
to true and sets the intended time as the value of discard_time.
ProtocolSession [
name DPF_lookup
use SSF.OS.DPF_lookup
nic 0
nic 1
filtering true
discard_time 100
]
Figure 5.21. A DML snippet of DPF-lookup model.
UDP
A UDP model6 is incorporated in order to support a connectionless transport
protocol. The given Socket API model is modified to support this UDP model;
an application model needs the Socket API model to interact with the UDP model. A
number of the application models supported by the dynamic DPF simulator use the UDP
model as their transport layer protocol. Figure 5.22 shows its DML specification.
The keyword use takes its string type identifier, UDP.
Applications
The supported application models range over traffic generator, attacker, and system fault
generator models. First, five different traffic generator models are provided with
6The latest release of DaSSFNet includes a UDP protocol. However, it had not been provided by the time we finished designing and implementing our prototype system.
ProtocolSession [
name UDP
use UDP
]
Figure 5.22. A DML snippet of UDP model.
respect to their traffic distribution: Poisson, Constant Bit Rate (CBR),
file trace, Markov modulated Poisson process (MMPP), and Long-Range Dependent
(LRD). The Poisson, CBR, MMPP, and LRD generators use probability models for traffic
generation, as their names indicate; the file trace generator takes its traffic distribution
from a file. Traffic generators use UDP as their transport layer protocol. Figure 5.23
provides an example DML snippet of the CBR traffic generator. The keyword name takes
a string argument as its name, which can be any string. The keyword use takes a string
identifier of a protocol model, trafficgen_cbr, which is unique per protocol model.
The keywords receiver and receiverport take the address and port number of the
receiver. Since the IP address of an interface is assigned during the initialization phase
of a simulation, the IP address of the receiver is represented in Network Host Interface (NHI)
address format, which is defined in DML [53]. Here, 1:1(0) represents interface 0
of machine 1 in network 1. The start time and end time of traffic generation
are specified in seconds. The keyword traffic_rate takes a real number in Mbps.
The packet length is specified using packet_length, in bytes. The keyword
measurement_file takes a file name for recording traffic generation measurements. By
setting the keyword show_report to true, we can observe the status of traffic generation
on the console. Other types of traffic generators follow the same pattern as Figure
5.23, while taking additional parameters depending on their models.
We support flooding-based attacker models, which support IP source address
spoofing. The flooding-based attacker models are derived from the above traffic gen-
erators. In the same fashion as the traffic generators, there are five types of flooding-
ProtocolSession [
name trafficgen_cbr
use trafficgen_cbr
receiver 1:1(0)
receiverport 100
start_time 0.0 # in seconds
end_time 10.0 # in seconds
traffic_rate 1 # in Mbps
packet_length 200 # in B
measurement_file attacker.txt
show_report true
]
Figure 5.23. A DML snippet of CBR traffic generator.
based attacker models: Poisson, CBR, file trace, MMPP, and LRD. All of them
support IP source address spoofing. All the attackers can be configured as UDP ap-
plications or as raw-IP applications, which sit on top of and interact with the IP protocol directly.
Figure 5.24 illustrates a DML snippet of the CBR massive attacker model. The model
takes two additional parameters, type and spoofing_addr. The keyword type takes
a string argument, either UDP or IP, to specify whether the attacker is configured as a
UDP application or a raw-IP application. The keyword spoofing_addr takes an
address in NHI format, which is used for forging the source IP address of the attack traffic.
To emulate a system crash at a given time or during a particular time period, we
devise an application model called ShutDown. ShutDown is modelled as an
application that is independent of its lower layer protocol. As shown in Figure
5.25, it may reside on top of the Socket interface; however, it interacts with the Hardware
object at the bottom of the machine, which functions as the border between network
protocols and network elements. We modified class Hardware to define two additional
interfaces, shut_down() and booting(). When shut_down() is called, the Hardware
ProtocolSession [
name attack_cbr
use attack_cbr
type UDP # UDP or IP
spoofing_addr 1:1(0)
receiver 1:1(0)
receiverport 100
start_time 0.0 # in seconds
end_time 10.0 # in seconds
traffic_rate 1 # in Mbps
packet_length 200 # in B
measurement_file attacker.txt
show_report true
]
Figure 5.24. A DML snippet of CBR massive attacker.
object blocks every incoming packet from reaching any of the protocols above it,
by discarding the packet. When booting() is invoked after shut_down(), the Hardware
object recovers to its normal state; in other words, it passes incoming and
outgoing packets between network elements and network protocols, as described in
Section 5.1.
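The gating behavior of the modified Hardware object can be sketched as a simple on/off switch around packet delivery. The interface names below mirror those in the text, but the body is an illustrative sketch, not the modified DaSSFNet class:

```cpp
#include <cassert>

// Sketch of the modified Hardware object: between shut_down() and
// booting(), every packet crossing the hardware boundary is silently
// discarded, emulating a crashed machine.
class Hardware {
    bool down_ = false;
public:
    void shut_down() { down_ = true; }   // enter the crashed state
    void booting()   { down_ = false; }  // recover to the normal state
    // Returns true if a packet is passed between network elements and
    // protocols; false if it is dropped.
    bool deliver() const { return !down_; }
};
```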
Figure 5.26 presents an example DML snippet of the ShutDown model. The start
time and end time are specified in seconds. By setting end_time beyond the
termination time of the simulation, we can emulate a persistent system crash; otherwise,
we can emulate system startup or rebooting during the simulation period.
Figure 5.25. Mechanism of ShutDown model.
ProtocolSession [
name ShutDown
use SSF.OS.app.ShutDown
start_time 100.0 # in seconds
end_time 150.0 # in seconds
]
Figure 5.26. A DML snippet of ShutDown model.
6 LARGE-SCALE NETWORK SIMULATION
The Dynamic DPF Simulator is designed to be a scalable network simulation environ-
ment with respect to the size of the input network topology. We focus on provisioning
of scalable partitioning and dynamic measurement and monitoring to facilitate large-
scale network simulation over workstation clusters.
In the first half of this chapter, we demonstrate performance and utility of the
measurement subsystem. In the second half, we carry out performance evaluation
of the partitioning algorithm. We analyze the Dynamic DPF Simulator’s time and
memory complexity with respect to network size and degree of parallelism for speed-
up.
6.1 Experimental Setup
6.1.1 System Configuration
The experiments described in the following sections were conducted on a dedicated
FastEthernet LAN cluster in the Network Systems Lab (NSL) consisting of twenty-
four i686 machines running Linux 2.4.17: six with Pentium 4 2.53GHz CPUs and
1GB main memory, ten with Pentium 4 2.00GHz CPUs and 1GB main memory, and
eight with Pentium 3 996MHz CPUs and 512MB main memory. As shown in Figure
6.1, these machines are connected via a 100Mbps FastEthernet switch. As part of the
Dynamic DPF Simulator environment, DaSSFNet and MPI are installed in all the
participating machines.
For performance evaluation of parallelism speed-up, each benchmark test case is
carried out with several machine configurations—5, 8, 12, 16, 20, 24 machines. Since
machines in the cluster do not have a uniform system configuration, participating
Figure 6.1. Hardware configuration of a Linux cluster used for AS-level benchmarking.
machines for each benchmark test are chosen with a preference for higher CPU clock rates
and larger main memory.
6.1.2 Benchmark Topologies
1,020-node, 2,020-node, 3,023-node, and 4,512-node AS-level Internet graphs are
selected for the benchmarks. 3,023-node and 4,512-node graphs are obtained from
Nov. 8, 1997 and Jan. 1, 1999 NLANR [9] measurement data, respectively. 1,020-
node and 2,020-node graphs are obtained from the 3,023-node graph by pruning nodes
uniformly at random. Table 6.1 shows the statistics of the benchmark graphs.
To confirm our assumption that all benchmark topologies exhibit power-law con-
nectivity, each graph's node degree is plotted as a function of degree rank on a
log-log scale. In Figure 6.2, we observe a linear relationship, which is consistent
with a power law.
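The degree-rank data behind such a plot is straightforward to compute from an edge list: derive each node's degree, sort the degrees in decreasing order, and pair each degree with its rank. The helper below is illustrative, not the plotting script used for the thesis:

```cpp
#include <algorithm>
#include <cassert>
#include <functional>
#include <utility>
#include <vector>

// Produce (rank, degree) pairs for a degree-rank plot of an undirected
// graph given as an edge list over nodes 0..n-1. On log-log axes, a
// power-law topology shows up as an approximately straight line.
std::vector<std::pair<int,int>> degreeByRank(
        int n, const std::vector<std::pair<int,int>>& edges) {
    std::vector<int> deg(n, 0);
    for (const auto& e : edges) { ++deg[e.first]; ++deg[e.second]; }
    std::sort(deg.begin(), deg.end(), std::greater<int>());  // rank 1 = max degree
    std::vector<std::pair<int,int>> out;
    for (int r = 0; r < n; ++r) out.push_back({r + 1, deg[r]});
    return out;
}
```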
Table 6.1. Statistics of the benchmark topologies.
Nodes Edges Description
1020 1465 subgraph of Nov. 8, 1997 NLANR
2020 3305 subgraph of Nov. 8, 1997 NLANR
3023 5257 Nov. 8, 1997 NLANR
4512 8383 Jan. 1, 1999 NLANR
Figure 6.2. Power-law connectivity of the benchmark topologies.
6.1.3 Simulation Setup
The simulation period runs from 0 to 300 simulation seconds. Each node
creates one or more BGP sessions to maintain separate peering relationships with its
directly connected neighbors, depending on its degree. When a simulation starts, each
BGP session attempts to make a TCP connection to one of the directly connected neigh-
bors. Once a TCP connection is established, it sends a BGP OPEN message to set
up a BGP peering relationship. When its peer replies with a BGP OPEN message,
it sends a BGP KEEPALIVE message back in order to complete the setup proce-
dure. When a BGP KEEPALIVE message is received from a peer, the session initiates route
calculation by sending a BGP UPDATE message that advertises a route by which other
nodes in the system can reach it. BGP employs four timers: ConnectRetry, Hold,
KeepAlive, and MinRouteAdvertisementInterval (MRAI). Following RFC 1771 [48],
the values of these timers are set to 120 seconds, 90 seconds, 30 seconds, and 30 seconds,
respectively.
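The setup sequence above can be sketched as a toy state machine. This is a deliberately simplified model of the handshake described in the text (Idle, OpenSent, OpenConfirm, Established), with hypothetical names, not the simulator's BGP session code:

```cpp
#include <cassert>
#include <string>
#include <vector>

// Toy sketch of BGP session setup: after the TCP connection comes up, the
// session sends OPEN; a received OPEN is answered with KEEPALIVE; a
// received KEEPALIVE completes setup and triggers the first UPDATE, which
// advertises a route back to this node.
struct BgpSession {
    std::string state = "Idle";
    std::vector<std::string> sent;

    void tcpEstablished() { sent.push_back("OPEN"); state = "OpenSent"; }

    void received(const std::string& msg) {
        if (msg == "OPEN" && state == "OpenSent") {
            sent.push_back("KEEPALIVE");  // completes the setup procedure
            state = "OpenConfirm";
        } else if (msg == "KEEPALIVE" && state == "OpenConfirm") {
            state = "Established";
            sent.push_back("UPDATE");     // initiate route calculation
        }
    }
};
```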
Table 6.2. Parameter settings for TCP.
Parameter Value
Maximum segment size 1024B
Receive window size 512
Send window size 512
Maximum retransmission count 12
TCP slow timer interval 0.5 sec
TCP fast timer interval 0.2 sec
Maximum segment lifetime 60 sec
Maximum idle time 600 sec
Table 6.2 lists parameter settings for TCP. In addition, we assumed network in-
terfaces with sufficient buffers. Link bandwidth is set to 100Mbps, and propagation
delay is set to 0.001 second.
6.2 Performance and Utility of Comprehensive Measurement Subsystem
6.2.1 Methodology
To demonstrate the performance and utility of the measurement subsystem, we
simulated BGP's dynamic route update and convergence behavior on the 3023-node
AS-level Internet graph over a 16-machine Linux cluster configuration.
We monitored memory consumption by each protocol's messages and tables, and
the total memory consumption by protocol messages and tables is then aggregated. In
addition, for each kernel event type, we measure both memory consumption and the
count of objects processed; the total memory consumption by kernel event objects
is then obtained. We collect separate measurement results, indexed by simulation time,
from each of the 16 participating machines, and the results are merged and sorted by
simulation time so that the merged trajectory provides consistent information over the
whole distributed memory resources.
6.2.2 Memory Requirement Monitoring
We demonstrate memory requirement monitoring at run-time. Unlike the static mem-
ory requirement introduced by protocol tables, the dynamic memory requirement of pro-
tocol messages and kernel events is difficult to predict and estimate. In particular,
since large-scale network simulation requires a large amount of memory, identifying
the memory consumption of each component is necessary for memory-oriented partition-
ing that keeps the virtual memory system at bay.
Figure 6.3 shows memory consumption as a function of simulation time. Here,
trajectories of memory consumption by each category are stacked to help comparison.
Figure 6.3(a) classifies memory consumption into three categories—Kernel Events,
Messages, and Tables. Kernel Events represents memory consumption by DaSSF’s
kernel level events. Messages represents memory consumption by protocol messages.
Tables represents memory consumed by protocol tables. Figure 6.3(b) subdivides
these categories further into finer-grained components.
From Figure 6.3(a), we observe that Tables and Kernel Events consume most of
the memory space. Although memory consumption by Messages increases to some
extent as the simulation proceeds, it takes a small portion compared to that of Tables
and Kernel Events. As the simulation proceeds, memory consumed by tables increases
until a certain point (around 180 simulation seconds) and stabilizes thereafter. Memory
Figure 6.3. Memory consumption as a function of simulation time. (a) Memory
consumption is classified into three categories: tables, messages, and kernel events.
(b) The categories are further subdivided into finer-grained components.
consumed by kernel events increases initially as the simulation proceeds, and decreases
after a certain point (around 150 simulation seconds). At its peak, it consumes a
significant portion of memory, on par with that of tables.
As Figure 6.3(b) shows, Kernel Events is further classified into the major event types
(TIMER, CHANNEL, OUTCHANNEL, and INCHANNEL) with respect to their mem-
ory consumption. Messages is classified into BGP Messages, TCP Receive, TCP
Send, and IP Messages. Tables is classified into Adj-RIB-In, Loc-RIB, and IP table.
Comparing Figures 6.3(a) and 6.3(b), we can see that TIMER consumes the
most memory among the major event types. Among the different protocol tables, Adj-RIB-In is
dominant with respect to memory consumption. We look into each category in the
following sections.
6.2.3 Memory Consumption by Tables
Figure 6.4 shows the trajectory of memory consumption by protocol tables (BGP
Adj-RIB-In, BGP Loc-RIB, and IP tables) as a function of simulation time. The tra-
jectories of memory consumption by each category are stacked to help compare
their shares. At around 30, 60, 90, 120, 150, and 180 seconds, memory consumption
by protocol tables increases, with significant increases at around 60, 90, and 120
simulation seconds. BGP Adj-RIB-In consumes more space than the others.
Figure 6.4. Memory consumption by protocol tables: BGP Adj-RIB-In, BGP Loc-RIB, and IP.
In Section 6.1, we mentioned that the BGP MinRouteAdvertisementInterval (MRAI)
timer value is set to 30 seconds. Each BGP session maintains a separate MRAI
timer. Whenever the MRAI timer expires, the session checks whether any newly received
route advertisements exist; if there is new route information to be advertised, it sends a
BGP UPDATE message to its peer. Since BGP peering relationships are established
around 0 seconds with a random offset in the range of 0 to 1000 ms, the MRAI
timers of all BGP sessions expire almost synchronously at 30-second intervals.
When BGP UPDATE messages are received, each BGP session updates its own Adj-
RIB-In and Loc-RIB tables depending on the result of the decision process. From
the increases in memory consumption at around 30, 60, 90, 120, 150, and 180 seconds, we
can see what portion of each table is filled at each 30-second interval. After
180 simulation seconds, there is no noticeable increase in memory consumption, which
indicates that BGP route advertisement and calculation are in their final phase.
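The MRAI pacing described above amounts to batching: route changes are queued as they arrive, and a single UPDATE is emitted only when the timer fires. The sketch below uses illustrative names (MraiSession, routeChanged), not the simulator's BGP session state:

```cpp
#include <cassert>
#include <string>
#include <vector>

// Sketch of per-peer MRAI pacing: route changes accumulate between timer
// expirations, and one batched UPDATE is sent when the MRAI timer fires
// (every 30 s in the benchmark configuration). If nothing new arrived,
// the expiration is silent.
struct MraiSession {
    std::vector<std::string> pending;  // routes learned since the last UPDATE
    int updatesSent = 0;

    void routeChanged(const std::string& route) { pending.push_back(route); }

    void onMraiExpiry() {
        if (pending.empty()) return;   // nothing new: stay quiet
        ++updatesSent;                 // one UPDATE carries all queued routes
        pending.clear();
    }
};
```

Because all sessions start within the same one-second window, their timers fire nearly in lockstep, which is why the table growth in Figure 6.4 appears in 30-second steps.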
According to the description of the BGP modelling in Section 5.2.4, BGP maintains
a separate Adj-RIB-In table per peering relationship to store the BGP route updates
received from each peer, and there is one BGP Loc-RIB table as a central storage for
selected route updates. In addition, due to the AS-level Internet modelling described
in Section 5.2.1, an AS has a number of Adj-RIB-In tables equal to its degree. Thus,
denoting an AS-level Internet graph as G = (V, E), the total number of Adj-RIB-In
tables is equal to 2 · |E|, where |E| represents the total number of edges. From the
statistics of the 3023-node AS-level Internet graph in Table 6.1, |E| is approximately
1.7 · |V |, where |V | represents the total number of nodes, so the total number of
Adj-RIB-In tables is roughly 3.5 · |V |. Similarly, the total number of Loc-RIB tables is
equal to the total number of nodes, |V |, because an AS includes one border router
according to the AS-level Internet modelling. By the same argument, the total number
of IP routing tables is also equal to |V |. Hence, there are roughly 3.5 times as many
Adj-RIB-In tables as Loc-RIB tables, whose count equals that of the IP routing
tables. However, from Figure 6.4, we observe that the IP tables consume more memory
than the Loc-RIB tables. This is due to the difference in the size of each entry's data
structure.
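The counting argument can be checked directly against the numbers in Table 6.1. The constants below are taken from that table; the variable names are illustrative:

```cpp
#include <cassert>

// Table counts implied by the modelling: one Adj-RIB-In per peering
// endpoint (2|E| in total), and one Loc-RIB and one IP routing table per
// AS (|V| each). Figures from Table 6.1 for the 3023-node NLANR graph.
const int V = 3023;                  // nodes
const int E = 5257;                  // edges
const int adjRibInTables = 2 * E;    // one table per endpoint of each edge
const int locRibTables   = V;        // one Loc-RIB per border router
const int ipTables       = V;        // one IP routing table per border router
```

With these figures the exact Adj-RIB-In to Loc-RIB ratio is 10514 / 3023, about 3.5.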
6.2.4 Memory Consumption by Messages
Figure 6.5 plots memory consumption by protocol messages (BGP, TCP
send buffer, TCP receive buffer, and IP) as a function of simulation time. The trajectories
of memory consumption by each category are stacked to help compare their shares.
Every 30 seconds, we observe a surge of messages due to BGP's MRAI timer
expiration. In total, the amount of memory required by protocol messages reaches
around 1GB.
Figure 6.5. Memory consumption by protocol messages: BGP, TCP send buffer, TCP receive buffer, and IP.
On the other hand, it appears that IP messages remain in the IP output buffer, con-
suming up to 400MB of memory over the course of the simulation period. This result,
however, is due to DaSSFNet's unique IP output buffer management, where dequeued
messages remain in the IP output buffer until another packet is enqueued. Hence,
the current timer-based IP message measurement cannot recognize that messages have been
dequeued until a packet is enqueued to the IP output buffer. We can correct this problem by
modifying the timer-based IP measurement routine to account for possible message dequeueing
before measurement. In this thesis, we focus on the fact that memory consumption
by messages is secondary compared to that by tables and kernel events.
6.2.5 Memory Consumption and Counting of Major Kernel Events
DaSSF schedules and processes several types of kernel level events. This section
starts with an introduction to the major kernel event types; then, the measurement results
for kernel event objects are presented and discussed.
As mentioned in Section 5.1, most packet handling is realized as a chain of proce-
dure calls between protocol models. The simulation kernel mainly processes events related
to message passing, timers, and semaphores. As illustrated in Figure 5.11, the message
passing related events consist of inChannel, outChannel, and Channel. In addition,
DaSSF provides extended utilities such as timers and semaphores. The timer model
internally creates and schedules a kernel event object of type EVTYPE_TIMER. DaSSF
allows a scheduled timer event to be cancelled: when cancel() of the timer model is called,
the type of the originally scheduled event is changed internally to EVTYPE_CANCEL, and the
kernel event object is processed and released at the originally scheduled time. Next,
the semaphore model is used as a mechanism for inter-process communication. When
semWait() of the semaphore model is called, the calling process is put into the
semaphore's waiting process list without creating any kernel event objects. How-
ever, when semSignal() of the semaphore model is called, it creates a kernel event
object of type EVTYPE_SEMSIGNAL.
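The cancellation semantics above explain the TIMER memory behavior discussed below: cancel() retypes the queued event rather than removing it, so the object stays resident until its originally scheduled time. A toy model of this policy, with illustrative names rather than the DaSSF kernel code:

```cpp
#include <cassert>
#include <memory>
#include <vector>

// Toy model of DaSSF timer handling: cancellation only retypes a queued
// event; the object is released when its originally scheduled time passes.
enum EvType { EVTYPE_TIMER, EVTYPE_CANCEL };

struct KernelEvent {
    double when;
    EvType type = EVTYPE_TIMER;
};

struct Kernel {
    std::vector<std::shared_ptr<KernelEvent>> queue;  // time-ordered in reality

    std::shared_ptr<KernelEvent> schedule(double when) {
        auto ev = std::make_shared<KernelEvent>();
        ev->when = when;
        queue.push_back(ev);
        return ev;
    }
    // Cancelling retypes the event in place; no memory is reclaimed here.
    static void cancel(KernelEvent& ev) { ev.type = EVTYPE_CANCEL; }

    // Process and release everything scheduled up to `now`.
    void runUntil(double now) {
        std::vector<std::shared_ptr<KernelEvent>> rest;
        for (auto& ev : queue)
            if (ev->when > now) rest.push_back(ev);  // cancelled or not, it waits
        queue.swap(rest);
    }
};
```

A Hold timer rescheduled on every received message thus leaves a trail of cancelled-but-resident events, which is the effect measured in Figure 6.6(b).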
Figure 6.6. Major types of KernelEvent objects. (a) shows cumulative counts of
KernelEvent object creation as a function of simulation time for the major types.
(b) shows total memory consumption by KernelEvent objects and memory
consumption by major types of KernelEvent objects as a function of simulation time.
Figure 6.6(a) provides cumulative counts of kernel level objects for each ma-
jor type: INCHANNEL, OUTCHANNEL, CHANNEL, TIMER, and SEMSIGNAL.1
Figure 6.6(b) shows total memory consumption by all kernel event objects as well as
memory consumption by major types of kernel event objects as a function of time.
1They are declared as EVTYPE_INCHANNEL, EVTYPE_OUTCHANNEL, EVTYPE_CHANNEL, EVTYPE_TIMER, and EVTYPE_SEMSIGNAL within the class definition of KernelEvent, respectively.
From Figure 6.6(a), the cumulative counts of INCHANNEL, OUTCHANNEL, and
TIMER exhibit similar trajectories throughout the simulation period, and they are
the most dominant. From the plots of CHANNEL and SEMSIGNAL, we find that
the count of CHANNEL type events is more than half of those of the most dominant
event types during the simulation period; SEMSIGNAL type events are around half.
Figure 6.6(b) shows that the total memory consumption by all kernel event objects
is essentially equivalent to the memory consumption by TIMER type objects; by
comparison, all other major types of objects consume a minor portion of memory.
Thus, although comparable numbers of objects are created for all major event types
(INCHANNEL, OUTCHANNEL, CHANNEL, TIMER, and SEMSIGNAL), only the TIMER
event type is dominant in terms of memory consumption. This is because objects of the other
event types are created, processed, and deleted in a short time, whereas a TIMER
event object exists until its originally scheduled time. Even if a scheduled timer
event is cancelled, the originally scheduled event object remains until its scheduled
time. For example, the Hold timer interval of the BGP model is set to 90 seconds, and
the timer is frequently rescheduled whenever a message is received from the peer. In
other words, the event is cancelled, and a new event of TIMER type is created and
scheduled, whenever a message arrives. Since the memory consumption of TIMER events is
significant, we can reduce the memory requirement by modifying the way DaSSF handles
TIMER events.
6.2.6 CPU Load Monitoring
Figure 6.7 presents the CPU load distribution over 16 Linux workstations. The plot
"total kernel event" shows the total number of kernel events processed at each machine;
"cpu total" represents the CPU time occupied by the corresponding process at each machine.
Both plots exhibit similar trajectories during the simulation period. Although the kernel
events to be processed are evenly distributed over the 16 machines, we observe a CPU
load imbalance in terms of CPU time. This imbalance may arise from uneven
computation load for scheduling processes or from communication overhead. As observed
in this experiment, CPU time measurement is a relevant metric for evaluating CPU
load balance.
Figure 6.7. CPU load distribution over 16 Linux workstations.
6.2.7 Communication Cost
As shown in Figure 5.11, one inChannel and one outChannel object are instan-
tiated in both the local and remote message passing cases; in the case of remote message
passing, an additional Channel object is created. The measurement results in Figure
6.6(a) show that the cumulative counts of INCHANNEL and OUTCHANNEL objects
are identical, and we find that more than half of the messages are sent over MPI. This
measurement is a valuable tool for analyzing the communication cost of a given distributed
simulation. Moreover, it is a useful metric for evaluating different partitioning algorithms' per-
formance in terms of communication cost.
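The inference about remote traffic follows directly from the event counts: since every message creates one inChannel event and only remote messages create a Channel event, the remote fraction is the ratio of the two counts. A minimal sketch, with hypothetical counts read off Figure 6.6(a):

```cpp
#include <cassert>

// Each message creates one inChannel and one outChannel event; a message
// that crosses machine boundaries additionally creates one Channel event.
// The fraction of messages sent over MPI is therefore channel / inChannel.
double remoteFraction(double inChannelCount, double channelCount) {
    return channelCount / inChannelCount;
}
```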
6.3 Scalability of Partitioning
In this section, we present benchmark test results examining the scalability of
partitioning. As stated in Section 5.2.2, the partitioning subsystem focuses on efficient
use of distributed memory and utilization of CPU resources to achieve scalable
simulation as the size of the graph increases.
We evaluate the scalability of the proposed system in terms of time complexity and
memory complexity. First, the completion time of the benchmark simulation is used
as a metric of time complexity. In a distributed setting, the completion time of a
distributed program is determined by the partition that completes last. As a result,
the completion time of a distributed program reflects the temporal balance between
the distributed partitions of the program and the overhead due to synchronization and
message passing. Next, the memory watermark, the maximum memory usage during the
simulation period, is used as the performance metric for memory complexity. Although
memory usage fluctuates as a simulation proceeds, the amount of memory required for a
simulation is determined by its memory watermark. Unlike completion time, the result
of a benchmark test consists of measurement results at the individual participating
machines. Thus, memory complexity is represented by the total, average, and maximum
values of the memory watermark measurements.
For this experiment, the completion time of a simulation is defined as the elapsed real
time between the start and end of the simulation. The time command [54] of Linux is
used to measure completion time; it returns the elapsed real time when a program
terminates. Similarly, the memory watermark is defined as the maximum memory con-
sumption of a simulation process2, which includes its code, data, and stack space.
The memory watermark is measured using the top command [55] of Linux. During the
simulation, the top command is executed in batch mode with a 30-second delay;
it fetches information from the Linux process (/proc) file system periodically with
the specified delay. The SIZE field provides the size of a process's code, data, and
stack space in KB, and it is logged throughout the simulation. The top command is
executed on all participating machines, and the maximum value in each log file is taken
as the memory watermark on that machine.
2During the experiment, we could observe additional forked processes in the case of a parallel setting via MPI. Since their memory usage is negligible (less than 1MB), they are ignored in this measurement.
Figure 6.8. (a) Completion time as a function of parallelism for different benchmark
graphs. (b) Completion time as a function of problem size for 16, 20, and 24 machines.
6.3.1 Completion Time
Figure 6.8 shows completion time both as a function of parallelism for the different
benchmark graphs and as a function of problem size for 16, 20, and 24 machines.
Here, completion time is given in seconds. Figure 6.8(a) shows the results for the
four benchmark graphs, which we call the 4512-node, 3023-node, 2020-node, and
1020-node graphs. The trajectory of the 4512-node graph includes only the 16-, 20-,
and 24-machine cases, because a 4512-node graph simulation cannot be launched on
12 or fewer machines. For both the 4512-node and 3023-node graphs, completion time
decreases as parallelism increases. The completion time of the 2020-node graph, on
the other hand, reaches its optimum at 20 machines: although completion time decreases
as parallelism increases from 5 to 20 machines, it increases past the optimal point. For
the 1020-node graph, completion time increases as parallelism increases.
The results show an effect of the problem size on a relation between parallelism
and completion time. When the size of a problem is sufficiently large, parallelism
can be useful for reducing completion time of a simulation. The results of 4512-node
graph and the 3023-node graph support this case. However, increased parallelism
usually induces synchronization and message-passing overhead whenever communication
between parallel partitions is required. In particular, when the problem size is
sufficiently small, parallelism may be of no use; the result of the 1020-node graph
falls into this category. Due to our network model, a certain amount of traffic is
expected on each link, which can cause more overhead as parallelism increases. The
2020-node graph initially follows the first trend, but switches to the latter after
the optimal point. In this case, we may conclude that, with the given traffic
pattern, a simulation at the scale of the 2020-node graph does not benefit from
more than 20 machines.
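The parallelism tradeoff discussed above can be made concrete with a standard speedup and parallel-efficiency calculation. The numbers below are illustrative only, not values read from Figure 6.8.

```python
def speedup(t_base, t_n):
    """Speedup of an n-machine run relative to a baseline run of the same graph."""
    return t_base / t_n

def efficiency(t_base, m_base, t_n, m_n):
    """Parallel efficiency: achieved speedup divided by the machine ratio.

    Values near 1 mean the extra machines are fully used (large graphs);
    much smaller values mean synchronization and message-passing
    overhead is consuming the added parallelism (small graphs).
    """
    return (t_base / t_n) / (m_n / m_base)

# A large graph that scales well when machines grow from 16 to 24,
# versus a small graph that gets slower when machines grow from 5 to 10:
large = efficiency(6000, 16, 4200, 24)   # close to 1: parallelism pays off
small = efficiency(800, 5, 900, 10)      # well below 1: overhead dominates
```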
Figure 6.8(b) shows the results of the 16-, 20-, and 24-machine simulations. For
the 1020-node graph, the 16-machine simulation performs best, although the
differences are minor. For the 2020-node graph, the 20-machine simulation completes
first. For the 3023-node and 4512-node graphs, the 24-machine simulation performs
best, again with minor differences. From the 16-machine and 20-machine curves, we
observe the general trend that completion time increases as a function of problem
size. On the 24-machine curve, the completion time of the 3023-node graph is shorter
than that of the 2020-node graph; nevertheless, the 24-machine curve also follows
the general trend. As noted in the discussion of Figure 6.8(a), the 2020-node graph
performs best with 20 machines, whereas the 3023-node and 4512-node graphs perform
best with 24 machines. Here, we can observe that the results of the 4512-node graph
in the 20-machine and 24-machine cases are quite close. Although not sufficient as
evidence, one thing to note is that the system configuration is not the same for
all 24 machines, as mentioned in Section 6.1; the resources of the additional 4
machines do not perform as well as those of the other 16 machines.
6.3.2 Balanced Memory Offloading
Figure 6.9 shows total, average, and maximum memory watermark as a function of
parallelism for different benchmark graphs. Here, memory watermark is represented
in MB. In Figure 6.9(a), we observe the trajectories of total memory watermark for
the different benchmark graphs: the 4512-node, 3023-node, 2020-node, and 1020-node
graphs. As we increase parallelism, total memory watermark increases for all four
benchmark graphs. Figure 6.9(b), on the other hand, shows the average and maximum
memory watermark for the same graphs. As the number of participating machines
increases from 5 to 24, the average memory requirement decreases proportionally.
For the 4512-node and 3023-node graphs, the maximum memory watermark decreases
along a trajectory similar to that of the average. For the 2020-node graph, the
maximum memory watermark decreases up to a certain point (the 16-machine
simulation) but then increases slightly as parallelism grows. For the 1020-node
graph, the maximum memory watermark remains the same once the number of
participating machines exceeds 8.
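The aggregate statistics plotted in Figure 6.9 reduce to a simple computation over per-machine watermark values; the sample values here are made up for illustration.

```python
def watermark_stats(per_machine_mb):
    """Total, average, and maximum memory watermark across machines.

    `per_machine_mb` holds one watermark value (in MB) per participating
    machine, i.e. the peak SIZE observed on that machine during the run.
    """
    total = sum(per_machine_mb)
    return total, total / len(per_machine_mb), max(per_machine_mb)

# With 5 machines, one hub-heavy partition can dominate the maximum
# even though the average is modest:
total, avg, peak = watermark_stats([900, 240, 230, 220, 210])
```

This mirrors why the average falls with parallelism while the maximum can stall: adding machines shrinks the average, but the maximum is pinned to the largest indivisible partition.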
Figure 6.9. (a) Total memory watermark as a function of parallelism for different
benchmark topologies. (b) Average and maximum memory watermark as a function
of parallelism for different benchmark graphs.
Figure 6.9(a) shows that additional memory space is needed as parallelism
increases. This likely stems from the cost of building and maintaining the global
context of a simulation at each local machine. However, the results for the
4512-node and 3023-node graphs in Figure 6.9(b) show that the total memory
requirement is effectively offloaded across the participating machines. At the same
time, the results for the 2020-node and 1020-node graphs indicate cases where the
maximum memory watermark does not decrease as expected. This is because phase 0 of
our partitioning algorithm shapes the trend of the maximum memory watermark in
these cases. To reduce the traffic load crossing distributed machines, every node
whose only edge is to a central high-degree node is placed in the same partition
group as that high-degree node. The partition groups determined during phase 0 are
not divided any further. Hence, we can guarantee that traffic between such nodes is
localized within their partition group. However, this also implies that parallelism
cannot reduce the maximum memory requirement caused by these partition groups.
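A minimal sketch of this phase-0 grouping rule, under the assumption that the topology is given as an undirected edge list (the actual algorithm and its data structures are described earlier in the thesis):

```python
from collections import defaultdict

def phase0_groups(edges):
    """Group each degree-one node with its sole (hub) neighbor.

    `edges` is a list of undirected (u, v) pairs. Returns a mapping from
    hub node to the set of leaf nodes that must stay in its partition,
    mirroring the rule that a node with a single edge to a high-degree
    node is never separated from that node.
    """
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    groups = defaultdict(set)
    for node, nbrs in adj.items():
        if len(nbrs) == 1:              # degree-one leaf
            hub = next(iter(nbrs))
            groups[hub].add(node)
    return groups
```

In a power-law graph most nodes are such leaves of a few hubs, which is why these indivisible groups set a floor on the maximum per-machine memory regardless of parallelism.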
Figure 6.10. (a) Total memory watermark as a function of problem size for 16, 20,
and 24 machines. (b) Average and maximum memory watermark as a function of
problem size for 16, 20, and 24 machines.
Figure 6.10 shows total, average, and maximum memory watermark as a function
of problem size for 16, 20, and 24 machines. Memory watermark is reported in MB.
In Figure 6.10(a), we observe the trajectories of total memory watermark for 16,
20, and 24 machines. As we increase the problem size, the total memory watermark
increases in all three cases. In Figure 6.10(b), the trajectories of average memory
watermark increase as a function of problem size in all three cases, and the same
trend is observed for maximum memory watermark. In the 1020-node graph case, the
maximum memory watermark is the same for 16, 20, and 24 machines. In the 2020-node
graph case, the results show minor differences. However, the differences in maximum
memory watermark grow as the problem size increases. Moreover, the 24-machine
simulation performs best in both the 3023-node and 4512-node graph cases.
As shown in Figure 6.10(a), when the problem size is 2020 nodes or more, the
total memory watermark exceeds 1 GB. This supports the claim that a simulation of
this scale is difficult to run on a single PC with 1 GB of memory. In addition, the
total memory watermark grows non-linearly as a function of problem size, which
confirms that distributed simulation is required for scalability with respect to
problem size. At the same time, Figure 6.10(b) indicates the effectiveness of
parallelism as the problem size becomes larger.
7 CONCLUSION AND FUTURE WORK
This thesis has studied the performance of route-based distributed packet filtering
(DPF) for distributed denial of service (DDoS) attack prevention in large-scale
networks under dynamic network conditions.
We have designed and implemented a route-based DPF protocol that updates
route-based DPF tables dynamically in the presence of IP routing table updates
handled by BGP, the Internet's inter-domain routing protocol. We have carried out
a performance evaluation of the route-based DPF protocol's fault-tolerant protection
on Internet autonomous system (AS) level measurement topologies. We have also
designed a new partitioning algorithm for power-law network topologies,
characteristic of Internet AS measurement graphs, which achieves balanced
distribution of memory requirements as well as efficient utilization of CPU and
communication resources.
For future work, the first item is to release a public domain version of the Dy-
namic DPF Simulator, called DaSSFNet-Turbo, that is applicable to general network
simulation spanning traffic control and network security.
Performance evaluation of route-based DPF in the presence of DDoS attacks aimed
at the network infrastructure is next on the list. As shown in this thesis, during
transient periods before BGP's routing tables have converged, route-based DPF
filters may contain inconsistent (i.e., stale or safety-violating) entries. Our
results under single node failures show that route-based DPF continues to provide
significant protection; however, we would need to extend the performance evaluation
to a range of full-fledged infrastructure attacks.
Power-law topology partitioning is not limited to the Dynamic DPF simulation
environment and merits study as a general problem in its own right, extending the
results reported in this thesis with more recent advances outside its scope.
Lastly, we would like to build a prototype system utilizing the 7-node Intel
IXP1200-based network processor (NP) testbed in the Network Systems Lab. A
prototype NP implementation allows evaluation of system-level overhead and
performance issues that are important when considering an interim migration path
compatible with legacy routers.
Building on the simulation model implemented in this thesis, such a prototype
system must support high-speed processing for filter look-up; at the same time, the
control-plane overhead due to filter table updates needs to be analyzed.
LIST OF REFERENCES
[1] David Moore, Geoffrey Voelker, and Stefan Savage. Inferring Internet denial-of-service activity. In Proc. of the 10th USENIX Security Symposium, 2001.
[2] Lee Garber. Denial-of-service attacks rip the Internet. Computer, pages 12–17, April 2000.
[3] Ryan Naraine. Massive DDoS attack hit DNS root servers, October 2002. http://www.internetnews.com/dev-news/article.php/1486981.
[4] ComputerWire. DDoS attack 'really, really tested' UltraDNS, November 2002. http://www.theregister.co.uk/content/55/28291.html.
[5] CERT. CERT Advisory CA-2002-15: Denial of service vulnerability in ISC BIND 9, September 2002. http://www.cert.org/advisories/CA-2002-15.html.
[6] Patrick Gray. Worm could be clearing path for DDoS attack, March 2003. http://news.zdnet.co.uk/business/0,39020645,2131631,00.htm.
[7] K. Park and H. Lee. On the effectiveness of route-based packet filtering for distributed DoS attack prevention in power-law internets. In Proc. of ACM SIGCOMM '01, pages 15–26, 2001.
[8] DaSSFNet: a C++ implementation of SSFNet. http://www.cs.dartmouth.edu/~ghyan/dassfnet/overview.htm.
[9] National Laboratory for Applied Network Research. Routing data, 2000. Supported by NSF, http://moat.nlanr.net/Routing/rawdata/.
[10] NightAxis and Rain Forest Puppy. Purgatory 101: Learning to cope with the SYNs of the Internet, 2000. Some practical approaches to introducing accountability and responsibility on the public Internet, http://packetstorm.securify.com/papers/contest/RFP.doc.
[11] Computer Emergency Response Team. Denial of service, February 1999. Tech Tips, http://www.cert.org/tech_tips/denial_of_service.html, 2nd revision.
[12] Computer Emergency Response Team (CERT). CERT Advisory CA-2000-01: Denial-of-service developments, January 2000. http://www.cert.org/advisories/CA-2000-01.html.
[13] R. K. C. Chang. Defending against flooding-based distributed denial-of-service attacks: A tutorial. IEEE Communications Magazine, pages 42–51, October 2002.
[14] S. Savage, D. Wetherall, A. Karlin, and T. Anderson. Practical network support for IP traceback. In Proc. of ACM SIGCOMM, pages 295–306, August 2000.
[15] Cisco Systems. Characterizing and tracing packet floods using Cisco routers, August 1999. http://www.cisco.com/warp/public/707/22.html.
[16] Glenn Sager. Security fun with OCxmon and cflowd, November 1998. Presentation at the Internet 2 Working Group.
[17] Jon Postel. Internet Protocol, September 1981. RFC 791.
[18] S. Bellovin. ICMP traceback messages, March 2000. Internet Draft: draft-bellovin-itrace-00.txt (expires September 2000).
[19] H. Burch and B. Cheswick. Tracing anonymous packets to their approximate source. In Proc. of the 14th Systems Administration Conference (LISA 2000), pages 319–327, 2000.
[20] K. Park and H. Lee. On the effectiveness of probabilistic packet marking for IP traceback under denial of service attack. Technical Report CSD-TR 00-013, Department of Computer Sciences, Purdue University, June 2000.
[21] CERT/CC, SANS Institute, and CERIAS. Consensus roadmap for defeating distributed denial of service attacks, February 2000. A project of the Partnership for Critical Infrastructure Security, http://www.sans.org/ddos_roadmap.htm.
[22] P. Ferguson and D. Senie. Network ingress filtering: Defeating denial of service attacks which employ IP source address spoofing, May 2000. RFC 2827.
[23] Daniel Senie. Changing the default for directed broadcasts in routers, August 1999. RFC 2644.
[24] David G. Andersen. Mayday: Distributed filtering for Internet services. In Proc. of the 4th USENIX Symposium on Internet Technologies and Systems, March 2003.
[25] A. Keromytis, V. Misra, and D. Rubenstein. SOS: Secure Overlay Services. In Proc. of ACM SIGCOMM '02, 2002.
[26] J. Li, J. Mirkovic, M. Wang, P. Reiher, and L. Zhang. SAVE: Source Address Validity Enforcement protocol. In Proc. of IEEE INFOCOM, June 2002.
[27] S. Agarwal, C. Chuah, and R. Katz. OPCA: Robust interdomain policy routing and traffic control. In Proc. of IEEE OPENARCH, April 2003.
[28] George Riley and Mostafa Ammar. Simulating large networks: How big is big enough? In Proc. of the First International Conference on Grand Challenges for Modeling and Simulation, January 2002.
[29] Vern Paxson and Sally Floyd. Why we don't know how to simulate the Internet. In Proc. of the Winter Simulation Conference, pages 1037–1044, 1997.
[30] The network simulator ns-2. http://www.isi.edu/nsnam/ns/.
[31] D. Nicol, M. Goldsby, and M. Johnson. Fluid-based simulation of communication networks using SSF, 1999.
[32] B. Liu, Y. Guo, J. Kurose, D. Towsley, and W. Gong. Fluid simulation of large scale networks: Issues and tradeoffs. In Proc. of the 1999 International Conference on Parallel and Distributed Processing Techniques and Applications, June 1999.
[33] C. Kiddle, R. Simmonds, C. Williamson, and B. Unger. Hybrid packet/fluid flow network simulation. In Proc. of the Seventeenth Workshop on Parallel and Distributed Simulation, page 143, 2003.
[34] PDNS: Parallel/Distributed NS. http://www.cc.gatech.edu/computing/compass/pdns/index.html.
[35] The SSF Research Network. Scalable Simulation Framework. http://www.ssfnet.org/homePage.html.
[36] Boleslaw K. Szymanski, Yu Liu, and Rashim Gupta. Parallel network simulation under distributed Genesis. In Proc. of the Seventeenth Workshop on Parallel and Distributed Simulation, page 61, 2003.
[37] Donghua Xu, George F. Riley, Mostafa H. Ammar, and Richard Fujimoto. Enabling large-scale multicast simulation by reducing memory requirements. In Proc. of the Seventeenth Workshop on Parallel and Distributed Simulation, page 69, 2003.
[38] Akihito Hiromori, Hirozumi Yamaguchi, Keiichi Yasumoto, Teruo Higashino, and Kenichi Taniguchi. Reducing the size of routing tables for large-scale network simulation. In Proc. of the Seventeenth Workshop on Parallel and Distributed Simulation, page 115, 2003.
[39] Dartmouth SSF. http://www.cs.dartmouth.edu/~jason-liu/projects/ssf/index.html.
[40] M. Faloutsos, P. Faloutsos, and C. Faloutsos. On power-law relationships of the Internet topology. In Proc. of ACM SIGCOMM, pages 251–262, 1999.
[41] R. Albert, H. Jeong, and A. Barabasi. Diameter of the world wide web. Nature, pages 130–131, 1999.
[42] A. Broder, R. Kumar, F. Maghoul, and P. Raghavan. Graph structure in the web. In Proc. of the WWW9 Conference, May 2000.
[43] H. Jeong, B. Tombor, R. Albert, Z. Oltvai, and A.-L. Barabasi. The large-scale organization of metabolic networks. Nature, pages 378–382, 2000.
[44] A. J. Lotka. The frequency distribution of scientific productivity. The Journal of the Washington Academy of the Sciences, page 317, 1926.
[45] M. Newman. The structure of scientific collaboration networks. Proc. Natl. Acad. Sci. USA 98, 4:404–409, 2001.
[46] S. Redner. How popular is your paper? Euro. Phys. J. B, 4:131–134, 1998.
[47] Fan R. K. Chung. Spectral Graph Theory. American Mathematical Society, 1997.
[48] Y. Rekhter and T. Li. A Border Gateway Protocol 4 (BGP-4), March 1995. RFC 1771.
[49] V. Paxson. End-to-end routing behavior in the Internet. In Proc. of ACM SIGCOMM '96, pages 25–38, 1996.
[50] James H. Cowie, editor. Scalable Simulation Framework API Reference Manual, version 1.0. http://www.ssfnet.org/SSFdocs/ssfapiManual.pdf, March 1999.
[51] MPI: the Message Passing Interface standard. http://www-unix.mcs.anl.gov/mpi/.
[52] The x-kernel protocol framework. http://www.cs.arizona.edu/xkernel/.
[53] The SSF Research Network. Domain Modeling Language (DML) Reference Manual. http://www.ssfnet.org/SSFdocs/dmlReference.html.
[54] Linux reference manual, section 1, time.
[55] Linux reference manual, section 1, top.