2: Application Layer 1
ECE5650
FTP, Email, DNS, and P2P
Recap: HTTP and Web
HTTP request msg format and method types: GET, POST, HEAD, PUT, DELETE
HTTP response msg format and status codes
Cookies and their usage: persistent vs non-persistent cookies
Web cache or proxy server: conditional GET (If-Modified-Since:) in the HTTP header
Recap: FTP
FTP/SFTP is used to transfer files between hosts.
FTP is an out-of-band protocol: control is sent over server port 21 while data is sent over server port 20.
The control connection is persistent, and the FTP server must maintain the state of the user.
The data connection is non-persistent and is initiated by the FTP server.
Recap: Email
SMTP and POP3 use persistent connections
SMTP requires the message (header & body) to be in 7-bit ASCII
SMTP server uses CRLF.CRLF to determine the end of a message
download-and-delete vs download-and-keep in POP3
All data communications are insecure by default
Comparison with HTTP:
HTTP: pull data from web server
SMTP: push data to mail server
both have command/response interaction, status codes
HTTP: each object encapsulated in its own response msg
SMTP: multiple objects sent in one multipart msg
SMTP msg must be in 7-bit ASCII while HTTP has no restriction
Examples of Internet Services
2.1 Principles of network applications
2.2 Web and HTTP
2.3 FTP
2.4 Electronic Mail: SMTP, POP3, IMAP
2.5 DNS
2.6 P2P file sharing
2.7 Socket programming with TCP
2.8 Socket programming with UDP
2.9 Building a Web server
DNS: Domain Name System
People: many identifiers: SSN, name, passport #
Internet hosts, routers:
IP address (32 or 128 bit) - used for addressing datagrams
“canonical name”, e.g., www.yahoo.com - used by humans
Q: how to map between IP addresses and names?
Domain Name System (DNS) is:
1- a distributed database implemented in a hierarchy of many name servers
2- an application-layer protocol: hosts, routers and name servers communicate to resolve names (address/name translation). The DNS protocol uses the UDP transport protocol and port 53.
3- employed by other application-layer protocols (HTTP, SMTP, FTP) to resolve host names.
DNS
Why not centralize DNS?
single point of failure
traffic volume
distant centralized database
maintenance
doesn’t scale!
DNS services
Hostname to IP address translation
Host aliasing: canonical (actual) and alias (user-friendly) names, e.g. canonical name cwis-1.wayne.edu for alias www.wayne.edu
Mail server aliasing: a mail server and a web server can share the same alias name. E.g. [email protected], wayne.edu
Load distribution: replicated Web servers have a set of IP addresses for one canonical name. DNS returns the list of IPs for a name, rotated by one each time, so clients that use the first listed IP are spread across the replicas.
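The rotation used for load distribution can be illustrated with a short sketch. This is only a toy model, not how any particular DNS server is implemented; the function name `make_rotating_resolver` and the replica addresses (taken from a documentation-only range) are assumptions made for the example.

```python
from collections import deque


def make_rotating_resolver(addresses):
    """Return a function that answers with the address list rotated by one per call."""
    ring = deque(addresses)

    def answer():
        current = list(ring)  # snapshot returned for this query
        ring.rotate(-1)       # the next query sees the list shifted left by one
        return current

    return answer


# Hypothetical replica addresses for one canonical name.
resolve = make_rotating_resolver(["192.0.2.1", "192.0.2.2", "192.0.2.3"])
first = resolve()
second = resolve()
```

Clients that always pick the first entry would contact a different replica on each successive query.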
Distributed, Hierarchical Database
Root DNS servers (13 servers labeled A-M)
Top-level domain (TLD) servers: com DNS servers, org DNS servers, edu DNS servers
Authoritative DNS servers: yahoo.com and amazon.com DNS servers (under com), pbs.org DNS servers (under org), poly.edu and umass.edu DNS servers (under edu)
Each client uses a local DNS server that does not belong to the hierarchy:
The local DNS server is usually assigned by the DHCP server as part of the temporary IP assignment (run the command “ipconfig /all” to find your local DNS server).
DNS: Root name servers
There are 13 root DNS servers worldwide, labeled A-M (map of root DNS servers as of Oct 2006):
a Verisign, Dulles, VA
b USC-ISI, Marina del Rey, CA
c Cogent, Herndon, VA (also Los Angeles)
d U Maryland, College Park, MD
e NASA, Mt View, CA
f Internet Software C., Palo Alto, CA (and 17 other locations)
g US DoD, Vienna, VA
h ARL, Aberdeen, MD
i Autonomica, Stockholm (plus 3 other locations)
j Verisign (11 locations)
k RIPE, London (also Amsterdam, Frankfurt)
l ICANN, Los Angeles, CA
m WIDE, Tokyo
TLD and Authoritative Servers
Top-level domain (TLD) servers: responsible for com, org, net, edu, etc., and all country code top-level domains (ccTLD): us, ca, in, cn, jp.
Network Solutions maintains servers for the com TLD
Educause maintains servers for the edu TLD
Authoritative DNS servers: organizations with public names have DNS servers that provide authoritative hostname-to-IP mappings for the organization’s servers (e.g., Web and mail).
Can be maintained by the organization or a service provider
Local Name Server
Does not strictly belong to hierarchy
Each ISP (residential ISP, company, university) has one. Also called “default name server”
When a host makes a DNS query, the query is sent to its local DNS server
Acts as a proxy, forwards query into hierarchy.
Example of Typical DNS request
Example of recursive + iterative DNS query (typically used)
[Figure: requesting host X, local DNS server, root DNS server, TLD DNS server, authoritative DNS server and host Y, with messages numbered 1-8]
Client X wants the IP address of Y. Steps performed:
1- Client sends DNS request to the local DNS server to search on its behalf (recursive query)
2- local DNS contacts one of the root DNSs to resolve hostname Y
3- root DNS returns the TLD DNS IP to local DNS
4- local DNS contacts one of the TLDs to get an authoritative DNS name
5- TLD returns IP of authoritative DNS to local DNS
6- local DNS contacts authoritative DNS to resolve Y
7- authoritative DNS returns IP of Y
8- local DNS returns IP of Y to X
Query 1 is recursive. Queries 2, 4 and 6 are iterative.
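The referral chain in steps 2-7 can be mimicked with plain dictionaries. This is a toy simulation with no real network traffic; the table contents, server labels and the name `local_dns_resolve` are all made up for illustration.

```python
# Toy referral tables; every name and address here is hypothetical.
ROOT = {"edu": "tld-edu"}                                      # root knows TLD servers
TLD = {"tld-edu": {"example.edu": "ns.example.edu"}}           # TLD knows authoritative servers
AUTH = {"ns.example.edu": {"www.example.edu": "192.0.2.10"}}   # authoritative server knows hosts


def local_dns_resolve(hostname):
    """Resolve hostname the way the local DNS server does, logging each referral."""
    log = []
    labels = hostname.split(".")

    tld_server = ROOT[labels[-1]]             # steps 2-3: ask root, get TLD referral
    log.append(f"root referred us to {tld_server}")

    domain = ".".join(labels[-2:])
    auth_server = TLD[tld_server][domain]     # steps 4-5: ask TLD, get authoritative server
    log.append(f"{tld_server} referred us to {auth_server}")

    address = AUTH[auth_server][hostname]     # steps 6-7: ask authoritative server
    log.append(f"{auth_server} answered {address}")
    return address, log


address, log = local_dns_resolve("www.example.edu")
```

Each entry in `log` corresponds to one iterative query and its reply; the final address is what the local server would hand back to the client in step 8.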
Recursive and Iterative DNS queries
recursive query: puts the burden of name resolution on the contacted name server; heavy load?
iterative query: the reply is returned directly to the requesting server: “I don’t know this name, but ask this server”
Example of pure recursive DNS query (not typically used)
[Figure: requesting host, local DNS server, root DNS server, TLD DNS server, authoritative DNS server and requested host, with messages numbered 1-8]
DNS: caching and updating records
once (any) name server learns a mapping, it caches the mapping
cache entries time out (disappear) after some time
TLD servers are typically cached in local name servers, thus root name servers are not often visited
Client may also cache DNS names
update/notify mechanisms under design by IETF: RFC 2136, http://www.ietf.org/html.charters/dnsind-charter.html
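Cache timeout can be sketched as a small dictionary keyed by name, where each entry remembers when it expires. The class name `DnsCache` and the injectable `clock` parameter are assumptions chosen to make the sketch easy to test, not part of any real resolver.

```python
import time


class DnsCache:
    """Minimal name -> address cache whose entries disappear after their TTL."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._entries = {}  # name -> (address, expiry time)

    def put(self, name, address, ttl):
        self._entries[name] = (address, self._clock() + ttl)

    def get(self, name):
        entry = self._entries.get(name)
        if entry is None:
            return None
        address, expires_at = entry
        if self._clock() >= expires_at:
            del self._entries[name]  # entry timed out: forget it
            return None
        return address
```

A lookup that returns `None` means the server has to query the hierarchy again.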
hosts file
A local file that is checked by the OS's DNS client before sending a DNS request. It can speed up web access.
If the requested name is found in the hosts file, then its corresponding IP is used.
Can be used to create custom (name-IP) entries.
File Location:
Windows XP: C:\WINDOWS\system32\drivers\etc
most UNIX and Linux: /etc
File Structure: <IP address><space><name><space><# comment>
Example of an entry: 127.0.0.1 localhost #default entry
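The file structure above is simple enough to parse in a few lines. The sketch below is illustrative only: the function name `parse_hosts`, the second sample entry, and the choice that the first entry for a name wins are assumptions.

```python
def parse_hosts(text):
    """Map each name found in hosts-file text to its IP address."""
    table = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop the comment and surrounding spaces
        if not line:
            continue
        ip, *names = line.split()
        for name in names:
            table.setdefault(name, ip)  # assumption: the first entry for a name wins
    return table


sample = (
    "127.0.0.1 localhost #default entry\n"
    "192.0.2.10 intranet.example dev  # hypothetical custom entry\n"
)
hosts = parse_hosts(sample)
```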
DNS records
DNS: distributed db storing resource records (RR)
RR format: (name, value, type, ttl)
Type=A: name is hostname; value is IP address; always in authoritative DNS, may be cached in non-authoritative DNSs
Type=NS: name is domain (e.g. foo.com); value is hostname of authoritative name server for this domain; always in non-authoritative DNSs to point to authoritative DNSs
Type=CNAME: name is alias name for some “canonical” (the real) name, e.g. www.ibm.com is really servereast.backup2.ibm.com; value is canonical name used by all hosts
Type=MX: value is name of the mail server associated with name, which is usually an alias name; a company can have a web server and a mail server with the same alias name, e.g. (wayne.edu, mail.wayne.edu, MX)
TTL is the time to live of the RR and determines when an RR should be removed from cache.
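As a rough illustration of the (name, value, type, ttl) format, the records can be held as tuples and searched by name and type. All record contents below are hypothetical, and `lookup` and `resolve_address` are names invented for this sketch.

```python
from collections import namedtuple

RR = namedtuple("RR", "name value type ttl")

# Hypothetical records for an imaginary organization.
RECORDS = [
    RR("www.example.com", "server1.example.com", "CNAME", 3600),
    RR("server1.example.com", "192.0.2.20", "A", 300),
    RR("example.com", "mail.example.com", "MX", 3600),
    RR("mail.example.com", "192.0.2.25", "A", 300),
]


def lookup(name, rtype):
    """Return the values of all records matching name and type."""
    return [rr.value for rr in RECORDS if rr.name == name and rr.type == rtype]


def resolve_address(name, max_hops=5):
    """Follow CNAME records until a Type A record is found."""
    for _ in range(max_hops):
        addresses = lookup(name, "A")
        if addresses:
            return addresses[0]
        aliases = lookup(name, "CNAME")
        if not aliases:
            return None
        name = aliases[0]  # continue with the canonical name
    return None
```

For mail, a client would first look up the MX record for the domain and then resolve the returned mail server name to an address.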
DNS records with DNS servers
Authoritative DNSs for an institution:
must contain Type A RRs for the institution’s public names and IPs.
may contain Type MX RRs for the institution’s public mail server names and IPs.
may contain Type CNAME RRs if the institution has canonical names for its alias names.
TLD DNSs:
contain Type NS RRs that map each organization’s public name to its authoritative DNS server names. There are usually a primary and a secondary authoritative DNS server.
contain Type A RRs with the authoritative DNS server name and IP address.
DNS protocol, messages
DNS protocol: query and reply messages, both with the same message format
msg header
identification: 16-bit #; query and reply msgs use the same #
flags:
query or reply (1-bit flag)
recursion desired or available (1 bit each)
reply is authoritative
DNS protocol, messages
Name, type fields for a query
RRs in response to query
records for authoritative servers
additional “helpful” info that may be used
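To make the header and question fields concrete, here is a minimal sketch that assembles a query message by hand with `struct`. It follows the standard DNS wire layout (12-byte header, then the question section); the function name `build_dns_query`, the default identification value and the example hostname are assumptions, and nothing is actually sent on the network.

```python
import struct


def build_dns_query(hostname, query_id=0x1234, recursion_desired=True):
    """Build a DNS query message asking for the Type A record of hostname."""
    # Flags: the query/reply bit stays 0 (query); 0x0100 sets "recursion desired".
    flags = 0x0100 if recursion_desired else 0x0000
    # Header: identification, flags, then four 16-bit counts
    # (questions, answer RRs, authority RRs, additional RRs).
    header = struct.pack("!HHHHHH", query_id, flags, 1, 0, 0, 0)
    # Name field: each label is prefixed by its length; a zero byte ends the name.
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in hostname.rstrip(".").split(".")
    ) + b"\x00"
    # Type field 1 = A, class 1 = Internet.
    question = qname + struct.pack("!HH", 1, 1)
    return header + question


message = build_dns_query("www.wayne.edu")
```

A reply would carry the same identification number, which is how the sender matches it to this query.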
Inserting records into DNS
Example: just created startup “Network Utopia”
Register the name networkutopia.com at a registrar (e.g., Network Solutions)
Need to provide the registrar with names and IP addresses of your authoritative name servers (primary and secondary)
Registrar inserts two RRs into the com TLD server:
(networkutopia.com, dns1.networkutopia.com, NS)
(dns1.networkutopia.com, 212.212.212.1, A)
Put in the authoritative server a Type A record for www.networkutopia.com and a Type MX record for networkutopia.com
How do people get the IP address of your Web site?
nslookup command and whois DB
nslookup is used to display information that you can use to diagnose Domain Name System (DNS) infrastructure. It contacts the specified DNS server to retrieve the requested records.
nslookup <domain or IP to find> <DNS server name>
Example: nslookup wayne.com
The whois database can be used to locate the corresponding registrar, DNS servers and IPs for a particular domain.
Only registrars accredited by the Internet Corporation for Assigned Names and Numbers (ICANN - non-profit org) are authorized to register .aero, .biz, .com, .coop, .info, .museum, .name, .net, .org, or .pro names.
.com whois database: http://www.internic.net/whois.html
.edu whois database: http://whois.educause.net/index.asp
wayne.edu DNS name servers:
NS.WAYNE.EDU 141.217.1.15
NS2.WAYNE.EDU 141.217.1.13
DNS.MERIT.NET
NS2.CS.WAYNE.EDU 141.217.16.10
DNS Vulnerabilities
DDoS bandwidth-flooding attack against DNS servers
A large-scale attack on the 13 DNS root servers on Oct 21, 2002 used ICMP ping messages
Mitigation: block ICMP ping packets with packet filtering
DNS queries attack: hard to filter; mitigated by caching in local DNS servers
Man-in-the-middle attack: trick a server into accepting bogus records into its cache; hard to implement, because it needs to intercept packets
Reflection attack on other hosts: send queries with the spoofed source addr of a target server
DNS Summary
DNS services: hostname to IP address translation, host aliasing, mail server aliasing, load distribution
DNS is hierarchical and distributed
root DNS vs TLD vs authoritative DNS vs local DNS
recursive vs iterative DNS query
DNS cache: local server caches TLDs so that root servers are rarely visited
DNS record types: A, NS, CNAME, MX
DNS query and reply msg format is the same
nslookup command and the whois database
DNS vulnerabilities
P2P file sharing
Example
Alice runs a P2P client application on her notebook computer
Intermittently connects to the Internet; gets a new IP address for each connection
Asks for “Hey Jude”
Application displays other peers that have a copy of “Hey Jude”.
Alice chooses one of the peers, Bob.
File is copied from Bob’s PC to Alice’s notebook: HTTP
While Alice downloads, other users are uploading from Alice.
Alice’s peer is both a Web client and a transient Web server.
All peers are servers = highly scalable!
How Did it Start?
A killer application: Napster
Free music over the Internet
Key idea: share the storage and bandwidth of individual (home) users
Main Challenges
Find where a particular file is stored
Note: problem similar to finding a particular page in web caching
Nodes join and leave dynamically
[Figure: nodes storing items A-F; one node asks where E is (“E?”)]
P2P file sharing Architectures
Centralized Directory: a central directory keeps track of peer IPs and their shared content. Example: Napster and Instant Messaging
Distributed Query Flooding: peers keep their own shared directory, and content is located in nearby peers. Example: Gnutella protocol
Distributed Heterogeneous Peers: proprietary protocol; group leaders with high bandwidth act as central directories searched by connected peers. Example: KaZaA
P2P: centralized directory
Original “Napster” design
1) when a peer connects, it informs the central server of its IP address and content
2) Alice queries for “Hey Jude”
3) Alice requests file from Bob
[Figure: peers, including Alice and Bob, around a centralized directory server; arrows labeled 1 (inform), 2 (query) and 3 (file request)]
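Napster: Example
[Figure: machines m1-m6 store files A-F respectively; the central directory holds the mapping m1 A, m2 B, m3 C, m4 D, m5 E, m6 F. A peer asks the directory “E?” and is told m5; it then asks m5 for E and receives E.]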
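The directory lookup in the example above can be sketched as a map from file name to the peers that hold it. This is a toy model under stated assumptions: the class name `CentralDirectory` and its methods are invented here and say nothing about how Napster's server was actually built.

```python
from collections import defaultdict


class CentralDirectory:
    """Toy central directory: tracks which peers share which files."""

    def __init__(self):
        self._holders = defaultdict(set)  # file name -> set of peer addresses

    def register(self, peer, files):
        """Step 1: a connecting peer reports its address and content."""
        for file_name in files:
            self._holders[file_name].add(peer)

    def unregister(self, peer):
        """A departing peer is removed from every file's holder set."""
        for holders in self._holders.values():
            holders.discard(peer)

    def query(self, file_name):
        """Step 2: return the peers that currently share file_name."""
        return sorted(self._holders.get(file_name, ()))


directory = CentralDirectory()
directory.register("m1", ["A"])
directory.register("m5", ["E"])
```

The file transfer itself (step 3) happens directly between peers and never touches the directory.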
P2P: problems with centralized directory
Single point of failure: if the central directory crashes the whole application goes down.
Performance bottleneck: the central server maintains a large database
Copyright infringement
file transfer is decentralized, but locating content is highly centralized
Napster: History
5/99: Shawn Fanning (freshman, Northeastern U.) founds Napster Online music service
12/99: first lawsuit
3/00: 25% of UWisc traffic is Napster
2000: est. 60M users
2/01: US Circuit Court of Appeals: Napster knew users were violating copyright laws
7/01: # simultaneous online users: Napster 160K, Gnutella 40K
Query flooding: Gnutella
fully distributed: no central server
public domain protocol
many Gnutella clients implementing protocol
Peers discover other peers through Gnutella hosts that maintain and cache a list of available peers. Discovery is not part of the Gnutella protocol.
overlay network: graph
edge between peer X and Y if there’s a TCP connection
all active peers and edges form the overlay network
An edge is not a physical link but a logical link
A given peer will typically be connected with < 10 overlay neighbors
Gnutella: Example
Assume: m1’s neighbors are m2 and m3; m3’s neighbors are m4 and m5;…
[Figure: machines m1-m6 store files A-F; the query “E?” is flooded from neighbor to neighbor until it reaches the machine holding E, which returns E]
Gnutella: Peer joining or leaving
1. Joining peer X must find some other peer in the Gnutella network: use list of candidate peers
2. X sequentially attempts to make TCP connections with peers on the list until a connection is set up with Y
3. X sends Ping message to Y; Y forwards the Ping message. The frequency of Ping messages is not part of the protocol, but it should be minimized.
4. All peers receiving Ping message respond with Pong message containing the number of files shared and their size in kbytes.
5. X receives many Pong messages. It can then set up additional TCP connections
6. When a peer leaves the network, other peers try to connect sequentially to others
Gnutella protocol
[Figure: Query messages flooding outward over the overlay and QueryHit messages returning along the reverse path; file transfer: HTTP]
A Query message (each with a MessageID) is sent over existing TCP connections. Peers forward the Query message, record the socket on which the message with that MessageID arrived, and decrement the peer-count field. A QueryHit message is sent over the reverse path using the MessageID, so that peers can remove the QueryHit messages from the network.
Limited-scope query flooding has been implemented: the peer-count field of the query is decremented when it reaches a peer, and the query is returned to the sender when the field reaches 0.
The number of edges of a Gnutella overlay network with N nodes is at most N(N-1)/2, reached only if every pair of peers is connected.
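Limited-scope flooding can be simulated on a small overlay graph. The sketch below is a simplified model, not the Gnutella wire protocol: the function name `flood_query`, the overlay layout and the rule that a peer never sends the query back to the neighbor it came from are assumptions for illustration.

```python
from collections import deque


def flood_query(overlay, files, origin, wanted, peer_count):
    """Flood a query from origin; return (peers holding wanted, Query messages sent)."""
    seen = {origin}  # peers that already handled this MessageID
    frontier = deque([(origin, None, peer_count)])
    hits, messages = [], 0
    while frontier:
        peer, sender, remaining = frontier.popleft()
        if remaining == 0:
            continue  # peer-count exhausted: do not forward any further
        for neighbor in overlay[peer]:
            if neighbor == sender:
                continue  # assumption: never send the query back where it came from
            messages += 1  # one Query sent over this TCP connection
            if neighbor in seen:
                continue  # duplicate MessageID is dropped by the receiver
            seen.add(neighbor)
            if wanted in files.get(neighbor, ()):
                hits.append(neighbor)  # a QueryHit would travel back along the reverse path
            frontier.append((neighbor, peer, remaining - 1))
    return hits, messages


# Hypothetical overlay: m1's neighbors are m2 and m3; m3's neighbors are m4 and m5.
overlay = {
    "m1": ["m2", "m3"],
    "m2": ["m1"],
    "m3": ["m1", "m4", "m5"],
    "m4": ["m3"],
    "m5": ["m3"],
}
files = {"m5": {"E"}}
```

With a peer-count of 1 the query never reaches m5, which shows how a small scope limits traffic at the cost of missing content.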
Gnutella vs Napster
Distribute file location and decentralize lookup.
Idea: multicast the request
How to find a file:
Send request to all neighbors
Neighbors recursively multicast the request
Eventually a machine that has the file receives the request, and it sends back the answer
Advantages: totally decentralized, highly robust
Disadvantages: not scalable; the entire network can be swamped with requests (to alleviate this problem, each request has a TTL)
Recap: P2P file sharing Arch
Centralized Directory: a central directory keeps track of peer IPs and their shared content. Example: Napster and Instant Messaging
Distributed Query Flooding: peers keep their own shared directory, and content is located in nearby peers. Example: Gnutella protocol
Distributed Heterogeneous Peers: proprietary protocol; group leaders with high bandwidth act as central directories searched by connected peers. Example: KaZaA
Exploiting heterogeneity: KaZaA
Proprietary protocol, encrypts the control traffic but not the data files
Each peer is either a group leader or assigned to a group leader.
TCP connection between a peer and its group leader.
TCP connections between some pairs of group leaders.
Group leader tracks the content in all its children.
[Figure: ordinary peers and group-leader peers; lines show neighboring relationships in the overlay network]
KaZaA: Querying
Each file has a hash and a descriptor
Client sends keyword query to its group leader
Group leader responds with matches: for each match, metadata, hash, IP address
If the group leader forwards the query to other group leaders, they respond with matches. Limited-scope query flooding is also implemented by KaZaA.
Client then selects files for downloading
HTTP requests using the hash as identifier are sent to peers holding the desired file
KaZaA tricks to improve performance
Request queuing: each peer can limit the # of simultaneous uploads (~3-7) to avoid long delays
Incentive priorities: the more a peer uploads, the higher its priority to download
Parallel downloading of a file across peers: a peer can download different portions of the same file from different peers using the byte-range header of HTTP.
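Parallel downloading amounts to splitting the file into byte ranges and requesting each range from a different peer. The sketch below only computes the ranges and the matching HTTP `Range` header lines; the function names `split_byte_ranges` and `range_header` and the even-split policy are assumptions, not KaZaA's actual scheme.

```python
def split_byte_ranges(file_size, num_peers):
    """Return inclusive (start, end) byte ranges, at most one per peer."""
    base, extra = divmod(file_size, num_peers)
    ranges, start = [], 0
    for i in range(num_peers):
        length = base + (1 if i < extra else 0)  # spread the remainder over the first peers
        if length == 0:
            break  # more peers than bytes: remaining peers get nothing
        ranges.append((start, start + length - 1))
        start += length
    return ranges


def range_header(start, end):
    """Format the HTTP request header asking for bytes start..end inclusive."""
    return f"Range: bytes={start}-{end}"


headers = [range_header(s, e) for s, e in split_byte_ranges(10, 3)]
```

Each header would go into a separate HTTP request to a different peer, and the downloader reassembles the pieces in order.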
Comparing Client-server, P2P architectures
Question: How much time does it take to distribute a file, initially at one server, to N other computers?
[Figure: a server and N peers attached to a network with abundant bandwidth; the file, of size F, starts at the server]
$F$: file size
$u_s$: server upload bandwidth
$u_i$: client/peer i upload bandwidth
$d_i$: client/peer i download bandwidth
Client-server: file distribution time
[Figure: a server and N clients attached to a network with abundant bandwidth; the file, of size F, starts at the server]
server sequentially sends N copies: $NF/u_s$ time
client i takes $F/d_i$ time to download
Time to distribute F to N clients using the client/server approach:
$$d_{cs} = \max\left\{ \frac{NF}{u_s},\ \frac{F}{\min_i d_i} \right\}$$
increases linearly in N (for large N)
P2P: file distribution time
[Figure: a server and N peers attached to a network with abundant bandwidth; the file, of size F, starts at the server]
server must send one copy: $F/u_s$ time
client i takes $F/d_i$ time to download
NF bits must be uploaded (aggregate)
Total upload rate: $u_s + \sum_{i=1}^{N} u_i$
$$d_{P2P} = \max\left\{ \frac{F}{u_s},\ \frac{F}{\min_i d_i},\ \frac{NF}{u_s + \sum_{i=1}^{N} u_i} \right\}$$
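The two distribution-time bounds are easy to evaluate numerically. The functions below are a direct transcription of the formulas for the client/server and P2P cases; the function names and the sample bandwidth values are assumptions picked only to give round numbers.

```python
def client_server_time(F, u_s, d):
    """Client/server bound: max of N copies through the server and the slowest download."""
    return max(len(d) * F / u_s, F / min(d))


def p2p_time(F, u_s, u, d):
    """P2P bound: one copy from the server, slowest download, or aggregate upload limit."""
    return max(F / u_s, F / min(d), len(d) * F / (u_s + sum(u)))


# Hypothetical scenario: N = 30 peers, file size 1, server upload 10,
# every peer uploads at 1 and downloads at 20 (consistent units assumed).
N = 30
cs = client_server_time(1.0, 10.0, [20.0] * N)
p2p = p2p_time(1.0, 10.0, [1.0] * N, [20.0] * N)
```

In this scenario the client/server time is governed by the server sending 30 copies, while the P2P time is governed by the aggregate upload rate of 40, so P2P is four times faster.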
Comparing Client-server, P2P architectures
[Figure: minimum distribution time (0 to 3.5) versus N (0 to 35), with one curve for P2P and one for client-server]
P2P Case Study: BitTorrent
P2P file distribution
tracker: each torrent has an infrastructure node, which keeps a record of the peers participating in the torrent
torrent: group of peers exchanging chunks of a file
[Figure: a joining peer obtains the list of peers from the tracker, then trades chunks with other peers]
BitTorrent (1)
file divided into 256KB chunks.
peer joining torrent:
has no chunks, but will accumulate them over time
registers with tracker to get list of peers, connects to a subset of peers (“neighbors”) concurrently over TCP
Alice’s neighboring peers may fluctuate over time
Alice periodically asks each of her neighbors for the list of chunks they have (pull chunk)
while downloading, peer uploads chunks to other peers.
peers may come and go
once a peer has the entire file, it may (selfishly) leave or (altruistically) remain
BitTorrent (2)
Which chunk to pull first?
at any given time, different peers have different subsets of file chunks
periodically, a peer (Alice) asks each neighbor for the list of chunks that they have.
Alice issues requests for her missing chunks: rarest first
Which request to respond to first: tit-for-tat trading
Alice sends chunks to the four neighbors currently sending her chunks at the highest rate
re-evaluate top 4 every 10 secs
every 30 secs: randomly select another peer, start sending it chunks
the new peer may join the top 4
Random selection allows new peers to get chunks, so they can start to trade
Trading algorithm helps eliminate the free-riding problem.
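Both selection rules can be written down compactly. This is an illustrative sketch, not a BitTorrent client: the function names `rarest_first` and `top_uploaders`, the tie-breaking by chunk number and the sample peers and rates are assumptions.

```python
from collections import Counter


def rarest_first(my_chunks, neighbor_chunks):
    """Order the chunks Alice is missing from rarest to most common among neighbors."""
    counts = Counter(c for chunks in neighbor_chunks.values() for c in chunks)
    missing = [c for c in counts if c not in my_chunks]
    # Assumption: ties are broken by chunk number to keep the order deterministic.
    return sorted(missing, key=lambda c: (counts[c], c))


def top_uploaders(rates, k=4):
    """Pick the k neighbors currently sending chunks to Alice at the highest rates."""
    return sorted(rates, key=rates.get, reverse=True)[:k]


# Hypothetical neighbors, chunk sets and upload rates.
neighbor_chunks = {"bob": {0, 1, 2}, "carol": {1, 2}, "dave": {2, 3}}
request_order = rarest_first({2}, neighbor_chunks)
rates = {"bob": 50, "carol": 10, "dave": 30, "erin": 40, "frank": 20}
unchoked = top_uploaders(rates)
```

A fuller model would re-run `top_uploaders` every 10 seconds and add one randomly chosen peer every 30 seconds, as described in the slide.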
P2P Case study: Skype
P2P (pc-to-pc, pc-to-phone, phone-to-pc) Voice-over-IP (VoIP) application, also IM
proprietary application-layer protocol (inferred via reverse engineering)
hierarchical overlay
Founded by the same people as KaZaA
Acquired by eBay in 2005 for $2.6B
[Figure: Skype clients (SC), supernodes (SN) and the Skype login server]
Skype: making a call
User starts Skype
SC registers with SN (from a list of bootstrap SNs)
SC logs in (authenticates) with the Skype login server
Call: SC contacts SN with callee ID
SN contacts other SNs (unknown protocol, maybe flooding) to find addr of callee; returns addr to SC
SC directly contacts callee, over TCP
P2P summary
P2P 3 Architectures: Central Directory, Distributed Query Flooding, Distributed Heterogeneous Peers
Examples of P2P applications: Napster, Gnutella, KaZaA, BitTorrent and Skype