1 CMPT 471 Networking II Transport Layer Network Programming © Janice Regan, 2013.


CMPT 471 Networking II

Transport Layer

Network Programming

© Janice Regan, 2013

Reliable underlying network (1)

Allow each end to know the other exists.

Negotiation of optional parameters (segment size, window size, QoS).

Allocation of transport entity resources (buffer space, entry in connection table).

Need ESTABLISHED (open), CLOSED, and ‘half open’ states.

A process must be ‘listening’ to accept a connection.

Need two types of messages (socket subroutine calls):

  SYN: specifies initial sequence numbers for synchronization

  FIN: indicates no more data to send, requesting termination

Reliable underlying network (2)

‘Half open’ states occur when opening or closing a connection:

  SYN SENT: connection requested and not yet established (waiting for the response SYN)

  LISTEN: ready to accept connections, no request received yet (waiting for the initiating SYN, will immediately respond with a response SYN)

  FIN WAIT: close requested, data flow from the requesting endpoint is complete (waiting for the response to the sent FIN)

  CLOSE WAIT: close request received, data flow from the endpoint receiving the close request continues until completion (waiting to respond to the close request with a FIN)


Connection State Diagram

Stallings 2003: Figure 20.3


Connection Establishment

Stallings 2003: Figure 20.4

Unreliable Network Service

PROBLEMS

Solved by using the sliding window and credit scheme already discussed:

  Segments may get lost, so a retransmission strategy is needed

  Duplication detection

  Flow control

Segments may arrive out of order; the transport layer must deliver them in order to the application (solution: unique sequence numbers).

Connection establishment / termination (replace the two way handshake with a three way handshake to deal with losses and duplications).

Unreliable underlying network

Must add additional states to the diagram.

When opening, a SYN RECEIVED state is needed while waiting for the ACK of the SYN sent in response to the SYN from the initiating station.

When closing in passive mode, a LAST ACK state is needed while waiting for the ACK of the FIN sent in response to the received FIN.

Connection Establishment

Lost or delayed data segments can cause connection problems if the previously discussed two way handshake is used:

  Cannot assume a sent SYN or FIN reaches its destination

  An arriving SYN or FIN could be from an old connection attempt

Solve by using a three way handshake (must ACK receipt of a FIN or SYN) and by starting segment numbers (SYNi, ACKi, FINi) far removed from those used by previous connections.


Unreliable underlying network

When closing in active mode, two more states are needed:

  FIN WAIT2: waiting for the station receiving the close request (the first FIN) to complete transmission and send its response FIN. The ACK of the first FIN is received before entering this state.

  CLOSING: waiting for the ACK to the response FIN. The response FIN has already been received before entering this state.
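The states described so far can be collected into a small transition function. This is only a sketch of the slides' simplified model, not a full TCP state machine: the enum names, the event names, and next_state are hypothetical, and the direct move from FIN WAIT2 to CLOSED is a simplification assumed here.

```c
/* Connection states named in the slides (ST_ prefix is an assumption). */
enum state { ST_CLOSED, ST_LISTEN, ST_SYN_SENT, ST_SYN_RECEIVED, ST_ESTABLISHED,
             ST_FIN_WAIT, ST_FIN_WAIT2, ST_CLOSING, ST_CLOSE_WAIT, ST_LAST_ACK };

/* Events: requests from the application and segments from the peer. */
enum event { EV_APP_ACTIVE_OPEN, EV_APP_PASSIVE_OPEN, EV_APP_CLOSE,
             EV_RECV_SYN, EV_RECV_ACK, EV_RECV_FIN };

/* Return the next state, or the current state if the event does not apply. */
enum state next_state(enum state s, enum event e)
{
    switch (s) {
    case ST_CLOSED:
        if (e == EV_APP_ACTIVE_OPEN)  return ST_SYN_SENT;     /* sends SYN */
        if (e == EV_APP_PASSIVE_OPEN) return ST_LISTEN;
        break;
    case ST_LISTEN:
        if (e == EV_RECV_SYN) return ST_SYN_RECEIVED;         /* replies with its own SYN */
        break;
    case ST_SYN_SENT:
        if (e == EV_RECV_SYN) return ST_ESTABLISHED;          /* response SYN arrived */
        break;
    case ST_SYN_RECEIVED:
        if (e == EV_RECV_ACK) return ST_ESTABLISHED;          /* our SYN was acknowledged */
        break;
    case ST_ESTABLISHED:
        if (e == EV_APP_CLOSE) return ST_FIN_WAIT;            /* active close: sends FIN */
        if (e == EV_RECV_FIN)  return ST_CLOSE_WAIT;          /* passive close */
        break;
    case ST_FIN_WAIT:
        if (e == EV_RECV_ACK) return ST_FIN_WAIT2;            /* first FIN acknowledged */
        if (e == EV_RECV_FIN) return ST_CLOSING;              /* FIN arrived before the ACK */
        break;
    case ST_FIN_WAIT2:
        if (e == EV_RECV_FIN) return ST_CLOSED;               /* simplified: no further wait */
        break;
    case ST_CLOSING:
        if (e == EV_RECV_ACK) return ST_CLOSED;
        break;
    case ST_CLOSE_WAIT:
        if (e == EV_APP_CLOSE) return ST_LAST_ACK;            /* sends its own FIN */
        break;
    case ST_LAST_ACK:
        if (e == EV_RECV_ACK) return ST_CLOSED;
        break;
    }
    return s;
}
```

Feeding the events of an active open followed by an active close (open, SYN received, close, ACK received, FIN received) walks the function through SYN SENT, ESTABLISHED, FIN WAIT, FIN WAIT2 and back to CLOSED.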


Three Way Handshake

Stallings 2003: Figure 20.3


Three Way Handshake:

Stallings 2003: Figure 20.3

The socket interface

The socket interface is used in application programs to specify the interface between the application program and the protocol software.

The form of the interface (API) is not specified in the protocol; it may vary between operating systems.

The socket interface discussed here is a BSD UNIX version that in practice forms the basis of the API for many operating systems.

NETWORK IO

When sending or receiving data across a network connection, a socket descriptor is used to specify the connection.

A socket is a communication endpoint; the socket descriptor is an integer identifying that communication endpoint.

A socket descriptor is not automatically bound to a particular connection, only to a particular socket (one endpoint of the connection).

The application programmer can choose when to bind the socket to a connection, or let the OS bind it to a connection at runtime.

You can read from or write to a socket (send and receive).

IO: FILE vs. NETWORK

When reading or writing a file, a file descriptor is used to specify the file referred to.

A file is opened using a particular file descriptor, binding the file or device to that file descriptor.

Data is read from and/or written to the file. To read or write, at least three pieces of information are needed:

1. File descriptor

2. Buffer (holds data read from the file or data being written to the file)

3. Number of bytes to read/write (or size of buffer)

The file is closed.

IO: FILE vs. NETWORK

When reading or writing through a socket, a socket descriptor is used to specify the particular socket.

A socket is created and associated with a particular socket descriptor.

  This creates a socket (communication endpoint)

  This does not bind the socket to a particular data stream or device

A socket is bound to one particular communication endpoint (IP address, port) within the present process.

IO: FILE vs. NETWORK

When reading or writing through a socket, a socket descriptor is used to specify the particular socket.

A socket is connected to another communication endpoint.

Data can then be transferred from one endpoint to the other. Once the sockets are connected, the socket descriptor can be used like a file descriptor.

When the data transfer is complete, the connection is closed.

The socket function

    #include <sys/socket.h>
    int socket(int family, int type, int protocol);

The family indicates the family of protocols that will use the socket (PF_INET for IPv4, PF_INET6 for IPv6, PF_ROUTE for routing sockets).

The type indicates the particular type of transfer that will be used (SOCK_STREAM for TCP, SOCK_DGRAM for UDP, SOCK_RAW for raw data, SOCK_SEQPACKET).

The protocol indicates the particular protocol to use (IPPROTO_TCP, IPPROTO_UDP, IPPROTO_SCTP).

Creating a socket

    socketfd = socket(PF_INET, SOCK_STREAM, IPPROTO_TCP);

socketfd is the socket descriptor for the newly created IPv4 TCP socket (-1 indicates an error in the socket creation process).

This socket is not yet associated with any communication endpoint (IP address, port pair).

The socket is an active socket (for use in active connection mode).

You can see all the possible values for protocol family, protocol type and protocol in /usr/include/sys/socket.h and /usr/include/netinet/in.h
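The call above is usually wrapped with an error check. A minimal sketch, assuming a POSIX system; the helper name make_tcp_socket is hypothetical:

```c
#include <stdio.h>
#include <unistd.h>
#include <sys/socket.h>
#include <netinet/in.h>

/* Create an IPv4 TCP socket; returns the socket descriptor, or -1 on error. */
int make_tcp_socket(void)
{
    int socketfd = socket(PF_INET, SOCK_STREAM, IPPROTO_TCP);
    if (socketfd == -1)
        perror("socket");   /* errno records why creation failed */
    return socketfd;
}
```

The descriptor returned is an ordinary small integer; it should be passed to close when the socket is no longer needed.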

The bind function

The bind function associates the socket descriptor with a local communication endpoint (local means on the same host as the process doing the bind).

    #include <sys/socket.h>
    int bind(int socketfd, const struct sockaddr *myaddr, socklen_t addrlen);

The socket length, addrlen, specifies the number of bytes in the socket address (see the following discussion of the sockaddr structure).

The bind function

    int bind(int socketfd, const struct sockaddr *myaddr, socklen_t addrlen);

The socket address, myaddr, specifies the local communication endpoint as a structure containing the IP address and the port number.

  The IP address must belong to an interface on the local host

  A port number of 0 indicates that the port should be assigned by the OS when bind is called (an ephemeral port)

The argument is a pointer to a generic address structure (cast from the actual address structure used).

Local communication endpoints

The local port and IP address of the host’s interface can be determined automatically within the sockets software at run time.

  An interface to the network will be specified for you (local IP address)

  An available (ephemeral) port on the host will be chosen to associate with the endpoint

Normally servers bind to a specified port; clients allow the OS kernel to choose an interface (the one that can connect to the server) and an ephemeral port.

Local communication endpoints

The local port and/or IP address can be specified using the bind function.

If the port number is specified as 0, an available (ephemeral) port will be assigned. The specified local IP address will be used (it must belong to one of the interfaces on the local host).

If a wildcard (INADDR_ANY) is used for the IP address, the OS kernel will select the local IP address.
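As an illustration of these choices, the sketch below fills an IPv4 socket address with the wildcard address and a caller-supplied port, then binds it. The helper name bind_any is hypothetical; passing 0 as the port asks the kernel for an ephemeral port.

```c
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>

/* Bind socketfd to the wildcard address and the given port (0 = ephemeral).
   Returns 0 on success, -1 on failure. */
int bind_any(int socketfd, uint16_t port)
{
    struct sockaddr_in myaddr;

    memset(&myaddr, 0, sizeof(myaddr));           /* clear unused fields */
    myaddr.sin_family = AF_INET;
    myaddr.sin_addr.s_addr = htonl(INADDR_ANY);   /* kernel selects the local IP */
    myaddr.sin_port = htons(port);                /* network byte order */

    /* The specific structure is cast to the generic struct sockaddr. */
    return bind(socketfd, (struct sockaddr *)&myaddr, sizeof(myaddr));
}
```

A server would typically pass its well known port here, while a test program can pass 0 and read the assigned port back afterwards.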

sockaddr_in structure IPv4

    struct sockaddr_in {
        uint8_t         sin_len;      /* length of structure (not present on all systems) */
        sa_family_t     sin_family;   /* address family */
        in_port_t       sin_port;     /* port number */
        struct in_addr  sin_addr;     /* IP address */
        unsigned char   sin_zero[8];  /* unused space in generic structure */
    };

Found in /usr/include/netinet/in.h

[Figure: sockaddr_in field layout, beginning with the length and address family fields]

sockaddr_in6 structure for IPv6

[Figure: socket address field layout: length, address family, port (2 bytes), 32 bit flow label, 128 bit IP address, 32 bit scope ID; unused space is set to 0]

sockaddr_in6 structure for IPv6

    struct sockaddr_in6 {
        uint8_t          sin6_len;       /* length of structure (not present on all systems) */
        sa_family_t      sin6_family;    /* address family */
        in_port_t        sin6_port;      /* port number */
        uint32_t         sin6_flowinfo;  /* flow information, undefined */
        struct in6_addr  sin6_addr;      /* IP address */
        uint32_t         sin6_scope_id;  /* set of interfaces */
    };

Found in /usr/include/netinet/in.h

Generic sockaddr_storage

    struct sockaddr_storage {
        uint8_t      ss_len;     /* length of structure (not present on all systems) */
        sa_family_t  ss_family;  /* address family */
        /* enough storage to hold any supported type of socket address */
    };

Found in /usr/include/netinet/in.h for each particular OS

The remaining fields are opaque storage (no direct access), with enough room for the longest address type on the system.

[Figure: sockaddr_storage layout: length, protocol family, port number, opaque storage]

Specifying Destination Address

The connect function associates the socket descriptor with a destination address [(IP address, port) pair specifying the destination connection endpoint].

  For TCP this function initiates the three way handshake

  A client normally requests a connection using the connect function

  A server normally waits for that connect request

A socket can be connected regardless of whether it has been bound to a local address. If no call has been made to bind, the OS kernel will assign the local ephemeral port and IP address.

The connect function

    #include <sys/socket.h>
    int connect(int socketfd, const struct sockaddr *servaddr, socklen_t addrlen);

The socket address, servaddr, specifies the IP address and port number of the destination connection endpoint.

  The IP address must belong to an interface on the host running the server to be connected to

  The argument is a pointer to a generic address structure (cast from the actual address structure used)

The socket length, addrlen, specifies the number of bytes in the socket address.
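A client side sketch of these steps, assuming IPv4 and a dotted decimal server address. The helper name connect_ipv4 is hypothetical, and no bind call is made, so the kernel chooses the local address and ephemeral port.

```c
#include <stdint.h>
#include <string.h>
#include <unistd.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>

/* Connect to ip:port over IPv4 TCP.
   Returns a connected socket descriptor, or -1 on failure. */
int connect_ipv4(const char *ip, uint16_t port)
{
    struct sockaddr_in servaddr;
    int socketfd;

    memset(&servaddr, 0, sizeof(servaddr));
    servaddr.sin_family = AF_INET;
    servaddr.sin_port = htons(port);
    if (inet_pton(AF_INET, ip, &servaddr.sin_addr) != 1)
        return -1;                        /* not a valid dotted decimal address */

    socketfd = socket(AF_INET, SOCK_STREAM, 0);
    if (socketfd == -1)
        return -1;

    /* For TCP this call performs the three way handshake. */
    if (connect(socketfd, (struct sockaddr *)&servaddr, sizeof(servaddr)) == -1) {
        close(socketfd);                  /* a failed socket should not be reused */
        return -1;
    }
    return socketfd;
}
```

Closing the descriptor after a failed connect is a deliberate choice in this sketch: a new socket is created for every attempt rather than retrying on the same one.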

Errors during connection

It is possible that errors will occur during the three way handshake process.

The host running the server will respond to a connect with a RST (reset) if no server process is listening on the requested port. This is a hard error; no further attempts at connection will be made.

If the request does not reach the host running the server, a soft error (destination unreachable) may occur. The client will continue to attempt to connect by sending additional SYNs until a timeout occurs (usually 75 seconds).

Waiting for a connection

The listen function converts an unconnected socket into a passive socket.

  When created using socket, the socket is an active socket

  listen is usually used by a server process

The OS kernel queues requests for connection to the passive socket.

Waiting for a connection

The queued connections fall into two classes (held in two queues):

  Incomplete connections, which have not completed the three way handshake (the client's SYN has arrived)

  Completed connections, for which the connection has been established (three way handshake complete)

The listen function

    #include <sys/socket.h>
    int listen(int socketfd, int backlog);

The backlog indicates the maximum number of connections the kernel (OS) should queue for this socket.

The listen function returns an integer indicating the success (0) or failure (-1) of the conversion of the socket to a passive socket.

The listen function

The backlog indicates the maximum number of connections the kernel (OS) should queue for this socket.

  When counting connections, the backlog is compared with the sum of the connections waiting in both queues

  The historic default value of 5 may not be adequate in some modern systems (for example, a server handling web browsing may need more)

  Do not use 0; the response is implementation dependent
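The change from active to passive socket can be observed with the standard SO_ACCEPTCONN socket option. The sketch below is illustrative; the helper names make_passive and is_passive are hypothetical.

```c
#include <stdio.h>
#include <sys/socket.h>

/* Convert socketfd into a passive socket; returns 0 on success, -1 on failure. */
int make_passive(int socketfd, int backlog)
{
    if (listen(socketfd, backlog) == -1) {
        perror("listen");
        return -1;
    }
    return 0;
}

/* Report whether socketfd is a passive (listening) socket:
   1 if it is, 0 if it is not, -1 on error. */
int is_passive(int socketfd)
{
    int val = 0;
    socklen_t len = sizeof(val);

    if (getsockopt(socketfd, SOL_SOCKET, SO_ACCEPTCONN, &val, &len) == -1)
        return -1;
    return val != 0;
}
```

A server would normally call bind before make_passive so that clients know which port to contact.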

Accepting a connection

The accept function returns the connection descriptor, connfd.

  If accept is successful it returns a connection descriptor for the next connection in the completed queue

  If accept is not successful it returns -1 to indicate an error condition

  If the queue is empty it puts the process to sleep until a connection is requested (blocking)

Usually used in a server process.

The accept function

    #include <sys/socket.h>
    int accept(int socketfd, struct sockaddr *cliaddr, socklen_t *addrlen);

The socket descriptor of the destination (server) connection endpoint, socketfd, is the integer returned by the socket function for the listening socket.

The socket address, cliaddr, returns the IP address and port number of the connection endpoint whose request is being accepted (an interface on the host running the client).

The socket length, addrlen, is a pointer: on the call it holds the size of the cliaddr buffer, and on return it holds the number of bytes stored in cliaddr.
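A sketch of a server side call that also formats the client endpoint for display. The helper name accept_client and the "address:port" output format are assumptions made for this example; IPv4 is assumed.

```c
#include <stdio.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>

/* Accept the next completed connection on listenfd and write the client's
   "address:port" into peer. Returns the connection descriptor, or -1 on error. */
int accept_client(int listenfd, char *peer, size_t peerlen)
{
    struct sockaddr_in cliaddr;
    socklen_t addrlen = sizeof(cliaddr);   /* in: buffer size, out: bytes stored */
    char ip[INET_ADDRSTRLEN];
    int connfd;

    connfd = accept(listenfd, (struct sockaddr *)&cliaddr, &addrlen);
    if (connfd == -1)
        return -1;

    if (inet_ntop(AF_INET, &cliaddr.sin_addr, ip, sizeof(ip)) == NULL)
        ip[0] = '\0';                      /* leave the address part empty */
    snprintf(peer, peerlen, "%s:%u", ip, (unsigned)ntohs(cliaddr.sin_port));
    return connfd;
}
```

The call blocks when the completed queue is empty, so it is normally placed inside the server's main loop.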

Closing a connection

By default the close function:

  Marks the socket with socket descriptor socketfd as closed

  For a TCP connection, starts the three way handshake to terminate the connection by sending a FIN

  Prevents further use of the socket by the application

  Allows any previously queued data to be sent before closing the connection

    #include <unistd.h>
    int close(int socketfd);

The socket interface

We have discussed the basics of the socket interface when dealing with a simple iterative server and clients.

The server will queue connect requests from different clients and deal with them one by one.

  For short requests, like sending a single packet, this can work well

  For longer requests that take significant processing time this is not efficient; we would like the server to be able to deal with the requests simultaneously

The solution is to use a concurrent server, which makes a copy of itself (a child) to deal with each client.

[Figure: concurrent server. The server on 12.106.32.254 listens on *:21 for connections from *:*. client1 and client2 on 206.168.112.219 connect from ports 1500 and 1501. Each child server handles one connection: {12.106.32.254:21, 206.168.112.219:1500} and {12.106.32.254:21, 206.168.112.219:1501}.]

Ephemeral port 1500 or 1501 is assigned by the protocol’s software.

Creating a child server

When accept returns and the connection has been established, the server should call fork:

    #include <sys/types.h>
    #include <unistd.h>
    pid_t fork(void);

fork() creates a copy of the server which inherits access to all open descriptors in the parent server.

fork() returns the process id of the new child server to the parent, and 0 to the new child server (-1 to the parent on failure).

On successful creation of the child server, the parent server closes its copy of the connection descriptor and continues listening for the next connection.

Sample

    listen(listenfd, LISTENQ);
    for ( ; ; ) {
        connfd = accept(listenfd, …);
        if ((pid = fork()) == 0) {     /* child server */
            close(listenfd);           /* child does not use the listening socket */
            processit(connfd);         /* uses connection to do something */
            close(connfd);
            exit(0);
        }
        close(connfd);                 /* parent closes its copy of the connection */
    }

When is a client connected?

When running a concurrent server, the child server needs to remain connected to the client; the parent server does not.

When the parent server closes the connection to the client, why does the child server's connection remain open?

So far we have said that close sends a FIN to initiate the handshake that closes the connection.

This is a simplification that needs to be clarified to understand the operation of the concurrent server.

Reference counts

The OS maintains a reference count for every socket and connection.

  When a socket is created, the reference count for the socket descriptor is set to one

  When a connection is accepted, the reference count for the connection descriptor is set to one

  When a server calls fork() and successfully creates a child, both reference counts are incremented

  When the parent server (or child) closes a connection, the connection’s reference count is decremented

  The FIN to initiate closing of the connection is sent when the connection’s reference count reaches zero
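The effect of the reference count can be observed without a network. The sketch below uses a local socketpair in place of a TCP connection (an assumption made to keep the example self-contained); the function name refcount_demo is hypothetical. The parent closes its copy of one end, yet the child can still send on it, and the other end sees end-of-file only after the child's copy is closed as well.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <sys/wait.h>
#include <unistd.h>

/* Returns 0 if the peer received the child's byte and then end-of-file. */
int refcount_demo(void)
{
    int sv[2];
    pid_t pid;
    char c = 0;
    ssize_t first, second;

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) == -1)
        return -1;

    pid = fork();                   /* both descriptors now have two references */
    if (pid == -1)
        return -1;
    if (pid == 0) {                 /* child: plays the role of the child server */
        close(sv[0]);
        ssize_t n = write(sv[1], "x", 1);
        _exit(n == 1 ? 0 : 1);      /* exiting drops the last reference to sv[1] */
    }

    close(sv[1]);                   /* parent's close only decrements the count */
    first = read(sv[0], &c, 1);     /* the child's byte still arrives */
    second = read(sv[0], &c, 1);    /* 0 (end-of-file) once the child has exited */
    close(sv[0]);
    waitpid(pid, NULL, 0);

    return (first == 1 && c == 'x' && second == 0) ? 0 : -1;
}
```

In the concurrent server the same mechanism lets the parent close connfd immediately after fork without disturbing the child's conversation with the client.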

Network byte order

The order in which bytes in multi byte words are stored is system dependent. Host byte ordering is either:

  Little endian (low order byte first)

  Big endian (high order byte first)

Network byte ordering must be consistent even between different types of hosts using different byte ordering.

  Network byte ordering is big endian

Conversion functions between host byte ordering and network byte ordering are needed to convert the data used in some fields of the socket address structures.

These conversion functions make servers and clients usable on hosts with either type of byte ordering.

Byte Order conversion

htons, htonl, ntohs, and ntohl can be used to convert socket data arguments between host byte order and network byte order.

    #include <netinet/in.h>
    uint16_t htons(uint16_t host16bitvalue);
    uint32_t htonl(uint32_t host32bitvalue);
    uint16_t ntohs(uint16_t net16bitvalue);
    uint32_t ntohl(uint32_t net32bitvalue);

ntohs and ntohl convert values from network byte order to host byte order.

htons and htonl convert values from host byte order to network byte order.
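To make the byte ordering visible, the sketch below copies a converted port number into a byte array. The helper name port_to_wire is hypothetical; whatever the host's byte order, the high order byte ends up first.

```c
#include <stdint.h>
#include <string.h>
#include <arpa/inet.h>

/* Store the network byte order (big endian) form of hostport in out[0..1]. */
void port_to_wire(uint16_t hostport, unsigned char out[2])
{
    uint16_t net = htons(hostport);   /* host order to network order */
    memcpy(out, &net, sizeof(net));   /* copy the bytes exactly as stored */
}
```

For the port value 0x1234, out[0] holds 0x12 and out[1] holds 0x34 on both little endian and big endian hosts.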

Address Translation

In many applications we will want to express addresses both in binary form and in other forms for presentation:

  For IPv4, dotted decimal notation

  For IPv6, hexadecimal form

The socket interface provides translation routines between these forms of address representation:

  Functions appropriate for IPv4 only

  Functions appropriate for both IPv4 and IPv6

IPv4 only address conversion

inet_aton, inet_addr, and inet_ntoa can be used to convert IP addresses between network byte ordered binary and dotted decimal.

    #include <arpa/inet.h>
    int inet_aton(const char *strptr, struct in_addr *addrptr);
    in_addr_t inet_addr(const char *strptr);
    char *inet_ntoa(struct in_addr inaddr);

inet_aton converts a dotted decimal string value to 32 bit network byte ordered binary form and inserts it into an in_addr structure (returns 1 for success, 0 for failure).

inet_addr converts a dotted decimal string value to 32 bit network byte ordered binary form and returns it as an in_addr_t value (INADDR_NONE in case of error).

inet_ntoa converts a 32 bit network byte ordered binary address to a dotted decimal string.

Additional functions are mentioned in your text.

Address conversion

inet_pton and inet_ntop can be used to convert IP addresses between network byte ordered binary and dotted decimal (IPv4) or hexadecimal (IPv6). These functions should supersede inet_aton and inet_ntoa.

    #include <arpa/inet.h>
    int inet_pton(int family, const char *strptr, void *addrptr);
    const char *inet_ntop(int family, const void *addrptr, char *strptr, socklen_t len);

inet_pton converts a presentation string (dotted decimal for IPv4, hexadecimal for IPv6) to network byte ordered binary form and stores it in the in_addr or in6_addr structure that addrptr points to (returns 1 for success, 0 if the string is not a valid address, -1 on error).

inet_ntop converts a network byte ordered binary address in an in_addr or in6_addr structure to its presentation string (returns NULL on error).

Additional functions that do not require you to know the address family of the IP address (knowing it makes code protocol dependent) are also available.

Obtaining socket addresses

getsockname can be used to return the IP address and port of the local communication endpoint in a socket address structure. Usually used by a client.

    #include <sys/socket.h>
    int getsockname(int sockfd, struct sockaddr *localaddr, socklen_t *addrlen);

Returns 0 on success, -1 on failure.

Uses of getsockname

To get the address of the local communication endpoint after connect returns in a client that does not use bind.

To get the port number of the local communication endpoint after a bind using a port number of 0 (to request an ephemeral port).

To get the address family of a socket.

To get the local IP address when a wildcard (INADDR_ANY) is used to specify the IP address in bind.
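The second use listed above, reading back an ephemeral port, might look like the following sketch. The helper name local_port is hypothetical and an IPv4 socket is assumed.

```c
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>

/* Return the local port of sockfd in host byte order, or -1 on error. */
int local_port(int sockfd)
{
    struct sockaddr_in localaddr;
    socklen_t addrlen = sizeof(localaddr);   /* in: buffer size, out: bytes stored */

    if (getsockname(sockfd, (struct sockaddr *)&localaddr, &addrlen) == -1)
        return -1;
    if (localaddr.sin_family != AF_INET)
        return -1;                           /* this sketch handles IPv4 only */
    return ntohs(localaddr.sin_port);
}
```

After a bind with port 0 the value returned is the ephemeral port chosen by the kernel.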

Obtaining socket addresses

getpeername can be used to return the IP address and port of the remote communication endpoint in a socket address structure. Usually used by a server.

    #include <sys/socket.h>
    int getpeername(int sockfd, struct sockaddr *peeraddr, socklen_t *addrlen);

Returns 0 on success, -1 on failure.

When a server uses accept, the server can obtain the information on the client's communication endpoint using getpeername.

Obtaining host information

gethostbyname and gethostbyaddr are keyed lookup functions that can be used to look up a host's information from its hostname or from its IPv4 address. (For IPv6 see getaddrinfo.)

    #include <netdb.h>
    struct hostent *gethostbyname(const char *hostname);
    struct hostent *gethostbyaddr(const char *addr, socklen_t len, int family);

Each returns a pointer to a hostent structure.

    struct hostent {
        char  *h_name;       /* official name */
        char **h_aliases;    /* pointer to array of pointers to alias names */
        int    h_addrtype;   /* host address type, AF_INET */
        int    h_length;     /* length of address */
        char **h_addr_list;  /* pointer to array of pointers to IPv4 addresses */
    };

Obtaining service information

getservbyname and getservbyport are keyed lookup functions that can be used to look up a service by its name or by the port to which that service is assigned.

    #include <netdb.h>
    struct servent *getservbyname(const char *servname, const char *protoname);
    struct servent *getservbyport(int port, const char *protoname);

Each returns a pointer to a servent structure.

    struct servent {
        char  *s_name;     /* official service name */
        char **s_aliases;  /* alias list */
        int    s_port;     /* port # */
        char  *s_proto;    /* protocol to use */
    };

protoname is the name of a protocol: “tcp”, “udp” …

servname is the name of a service: “ftp”, “domain” …

Obtain protocol information

getprotobyname and getprotobynumber are keyed lookup functions that can be used to look up a protocol by its name or by its protocol number.

    #include <netdb.h>
    struct protoent *getprotobyname(const char *name);
    struct protoent *getprotobynumber(int proto);

Each returns a pointer to a protoent structure.

    struct protoent {
        char  *p_name;     /* official protocol name */
        char **p_aliases;  /* alias list */
        int    p_proto;    /* protocol # */
    };

name is the name of a protocol: “tcp”, “udp”

Obtain network information

getnetbyname and getnetbyaddr are keyed lookup functions that can be used to look up a network by its name or by its network IP address.

    #include <netdb.h>
    struct netent *getnetbyname(const char *name);
    struct netent *getnetbyaddr(uint32_t net, int type);

Each returns a pointer to a netent structure.

    struct netent {
        char     *n_name;      /* official name of net */
        char    **n_aliases;   /* alias list */
        int       n_addrtype;  /* net address type */
        uint32_t  n_net;       /* network # */
    };

name is the name of a network.

Socket Options

getsockopt and setsockopt can be used to request or set the values of socket options.

    #include <sys/socket.h>
    int getsockopt(int sockfd, int level, int optname, void *optval, socklen_t *optlen);
    int setsockopt(int sockfd, int level, int optname, const void *optval, socklen_t optlen);

Each returns 0 on success, -1 on failure.

Socket Options

Socket options can apply on different levels:

  Executed in the general socket code (SOL_SOCKET)

  Executed in the protocol specific code (IPPROTO_IP, IPPROTO_IPV6, IPPROTO_TCP…)

The option name specifies the particular option being looked at or set.

There are two basic types of options:

  Binary options: set on or off

  Options that set or return values of various types (through the void pointer)

Actual options are discussed in the man pages for the functions.

Examples Socket Options

Set buffer size (size of sliding window).

Allow broadcast of packets.

Route/don’t route outgoing packets.

Turn on/off keepalive messages (2 hours).

Change the default operation of the close function (SO_LINGER option):

  Default is 0: normal three way handshake, close returns immediately

  Value >0: close waits up to value seconds, or returns as soon as all outstanding data have been sent and acknowledged. If the timeout expires, the normal close is aborted

  Used to catch problems where a server or client crashes before completing the handshake, causing lost data
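Two of these options can be sketched as follows: a binary option (SO_KEEPALIVE) and an option that carries a structured value (SO_LINGER). The helper names enable_keepalive and set_linger are hypothetical.

```c
#include <sys/socket.h>

/* Turn keepalive messages on and read the option back.
   Returns 1 if the option is now on, 0 if it is off, -1 on error. */
int enable_keepalive(int sockfd)
{
    int on = 1;
    int val = 0;
    socklen_t len = sizeof(val);

    if (setsockopt(sockfd, SOL_SOCKET, SO_KEEPALIVE, &on, sizeof(on)) == -1)
        return -1;
    if (getsockopt(sockfd, SOL_SOCKET, SO_KEEPALIVE, &val, &len) == -1)
        return -1;
    return val != 0;
}

/* Make close wait up to the given number of seconds for queued data
   to be sent and acknowledged. Returns 0 on success, -1 on failure. */
int set_linger(int sockfd, int seconds)
{
    struct linger lg;

    lg.l_onoff = 1;           /* turn lingering on */
    lg.l_linger = seconds;    /* how long close may wait */
    return setsockopt(sockfd, SOL_SOCKET, SO_LINGER, &lg, sizeof(lg));
}
```

Both options are set at the SOL_SOCKET level, so they are handled by the general socket code rather than by the protocol specific code.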