Lecture1 HTTP

11/06/22 Developing Web Applications (C) 2007 John Wiley & Sons Ltd. www.wileyeurope.com/college/moseley 1 Developing Web Applications Lecture 1: Web Basics and HTML Dr. Ralph Moseley


• WWW

– The World Wide Web (WWW) was developed by Tim Berners-Lee and other research scientists at CERN, the European center for nuclear research, in the late 1980s and early 1990s.

– WWW is a client-server model and uses TCP connections to transfer information or web pages from server to client.

– WWW uses a hypertext model. Hypertext allows interactive access to a collection of documents.

– Documents can hold

• Text (hypertext), Graphics, Sound, Animations, Video

– Documents are linked together

• Non-distributed – all documents stored locally (e.g. on CD-ROM).

• Distributed – documents stored at remote servers on the Internet.


• WWW - Hyperlinks (or links)

– Each document contains links (pointers) to other documents.

– The link is represented by an "active area" on screen

• Graphic - button

• Text - highlighted

– By selecting a particular link, the client fetches the referenced document from a server for display.

– Links may become invalid.

– Link is simply a text name for a remote document.

– Remote document may be moved to a new location while name in link remains in place.


• WWW – Document Representation

– Each WWW document is called a page.

– Initial page for individual or organization is called a home page.

– Page can contain many different types of information; page must specify:

• Content – The actual information

• Type of content – The type of information, e.g. text, pictures etc.

• Links to other documents

– Rather than having a fixed representation for every browser, pages are formatted with a markup language.

– This allows browser to format page to fit display.

– Different browsers can display pages in different ways.

– This also allows text-only browser to discard graphics, for example.

– Standard is called HyperText Markup Language (HTML).


• WWW – HTML

– HTML specifies

• Major structure of document

• Formatting instructions for browsers to execute.

• Hypertext links – Links to other documents

• Additional information about document contents

– Two parts to document:

• Head contains details about the document.

• Body contains the information/content of the document.

– Each web page is represented in ASCII text with embedded HTML tags that give formatting instructions to the browser.

• Formatted section begins with tag, <TAGNAME>

• End of formatted section is indicated by </TAGNAME>


• WWW – HTML Example

<HTML>

<HEAD>

<TITLE> Example Page for lecture</TITLE>

</HEAD>

<BODY>

Lecture notes for today go here!

<CENTER>

<TABLE BORDER=3>

<TR>

<TD><A HREF="./lecture10.html">Previous Lecture</A>

<TD><A HREF="./lecture12.html">Next Lecture</A>

<TD><A HREF="./Contents.html">Table of contents</A>

<TD><A HREF="./solutions.html">Solutions to Assignments</A>

<TD><A HREF="./index.html">Index of terms</A>

</TABLE>

</CENTER>

</BODY>

</HTML>


• WWW – Other HTML Tags

– Headings - <H1>, <H2>

– Lists

• <OL> - Ordered (numbered) list

• <UL> - Unordered (bulleted) list

• <LI> - List item

– Tables

• <TABLE>, </TABLE> - Define table

• <TR> - Begin row

• <TD> - Begin item in row

– Parameters

• Keyword-value pairs in HTML tags

• <TABLE BORDER=3>


• WWW – Embedding Graphics

– IMG tag specifies insertion of graphic

• Parameters:

• SRC="filename"

• ALIGN= - alignment relative to text

– <IMG SRC="GCD.gif" HEIGHT=35 WIDTH=30>

– The above line would insert the image in the file GCD.gif into any web page.

– Image must be in format known to browser, e.g., Graphics Interchange Format (GIF), Joint Photographic Experts Group (JPEG), Bitmap etc


• WWW – Style

<html>
<head>
<style type="text/css">
body {background-color: yellow}
h1 {background-color: #00ff00}
h2 {background-color: transparent}
p {background-color: rgb(250,0,255)}
</style>
</head>

<body>

<h1>This is header 1</h1>
<h2>This is header 2</h2>
<p>This is a paragraph</p>

</body>
</html>

The layout and format of an HTML document can be simplified by using CSS (Cascading Style Sheets)


• WWW – Identifying a web page

– A web page is identified by:

• The protocol used to access the web page.

• The computer on which the web page is stored.

• The TCP port that the server is listening on to allow a client to access the web page.

• Directory pathname of web page on server.

– Specific syntax for Uniform Resource Locator (URL):

protocol://computer_name:port/document_name

– Protocol can be http, ftp, file, mailto.

– Computer name can be DNS name or IP address.

– TCP port is optional (http uses port 80 as its default port).

– document_name is path on server to web page (file).


• WWW – Identifying a web page

– E.g. http://www.yahoo.com/Recreation/Sports/Soccer/index.html

– Protocol is http

– Computer name or DNS name is www.yahoo.com

– Port number is the default port for http, i.e. port 80.

– Document name is /Recreation/Sports/Soccer/index.html
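
The breakdown above can be reproduced with Python's standard `urllib.parse` module. This is a minimal sketch: the `identify` helper and the `DEFAULT_PORTS` table are names made up for this illustration, not part of the lecture.

```python
from urllib.parse import urlsplit

# Default ports used when the URL does not name one.
# Port 80 for http is stated on the slide; 21 for ftp is the FTP default.
DEFAULT_PORTS = {"http": 80, "ftp": 21}


def identify(url):
    """Split a URL into protocol, computer name, port and document name."""
    parts = urlsplit(url)
    port = parts.port if parts.port is not None else DEFAULT_PORTS.get(parts.scheme)
    return {
        "protocol": parts.scheme,
        "computer_name": parts.hostname,
        "port": port,
        "document_name": parts.path,
    }


info = identify("http://www.yahoo.com/Recreation/Sports/Soccer/index.html")
print(info)
```

An explicit port in the URL, as in `http://example.com:8080/a.html`, takes precedence over the default.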


• WWW – Hyperlinks between web pages

– Each hyperlink is specified in HTML by using a special tag.

– An item on a page is associated with another HTML document.

– Each link is passive, no action is taken until link is selected.

– HTML tags for a hyperlink are <A> and </A>

– The linked document is specified by parameter to the tag: HREF="document URL"

– <A HREF="http://www.gcd.ie">Click here to go to GCD web site.</A>

– Whatever is between the HTML tags, <A> and </A> is the highlighted hyperlink.
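
To see how a program might find the hyperlinks in a page, here is a small sketch using Python's standard `html.parser` module. The `LinkCollector` class is a name invented for this example; it records the HREF parameter and the text between `<A>` and `</A>`.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect (target URL, link text) pairs from <A HREF=...> elements."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._href = None   # HREF of the <A> element currently open, if any
        self._text = []

    def handle_starttag(self, tag, attrs):
        # html.parser reports tag and attribute names in lower case.
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text).strip()))
            self._href = None


page = '<A HREF="http://www.gcd.ie">Click here to go to GCD web site.</A>'
collector = LinkCollector()
collector.feed(page)
print(collector.links)
```

Nothing is fetched here: the link stays passive until some other code decides to request the collected URL.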


• WWW – Client Server Model

– The browser is the client, WWW (or web) server is the server.

– Browser:

• The browser makes TCP connection to the web server.

• The browser sends request for the particular web page that it wishes to display.

• The browser reads the contents of the web page from the TCP connection and displays it in the browser's window.

• The browser closes the TCP connection used to transfer the web page.

– Each separate item in a web page (e.g., pictures, audio) requires a separate TCP connection in HTTP/1.0; HTTP/1.1 can reuse one persistent connection for several items.

– HyperText Transfer Protocol (HTTP) specifies commands that the client (browser) issues to the server (web server) and the responses that the server sends back to the client.
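
The four browser steps (connect, send request, read reply, close) can be sketched with Python's standard `socket` module. The function names `fetch` and `fetch_over` are invented for this illustration, and the request is a bare HTTP/1.0 GET with no headers, which is an assumption made to keep the sketch short.

```python
import socket


def fetch_over(conn, path):
    """Send a GET request on a connected socket and read until the server closes."""
    request = f"GET {path} HTTP/1.0\r\n\r\n".encode("ascii")
    conn.sendall(request)
    chunks = []
    while True:
        data = conn.recv(4096)
        if not data:          # empty read: the server closed its side
            break
        chunks.append(data)
    return b"".join(chunks)


def fetch(host, path, port=80):
    """Open a TCP connection, fetch one document, then close the connection."""
    with socket.create_connection((host, port)) as conn:
        return fetch_over(conn, path)
```

A call such as `fetch("www.smallco.com", "/index.html")` would return the raw response bytes, status line and headers included, provided that host is reachable.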


• WWW – Client Server Model

Figure 1-1: Web client/server architecture


Web Server Basics

• Duties

– Listen to a port

– When a client is connected, read the HTTP request

– Perform some lookup function

– Send HTTP response and the requested data
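
The four duties map onto a few lines of socket code. The sketch below is illustrative only: `DOCUMENTS` is a made-up in-memory table standing in for files on disk, port 8080 is an arbitrary choice, and error handling is reduced to a single 404 case.

```python
import socket

# Hypothetical document store standing in for files on disk.
DOCUMENTS = {"/index.html": b"<HTML><BODY>Hello</BODY></HTML>"}


def handle_connection(conn):
    """Read one HTTP request, look the path up, and send a response."""
    request = conn.recv(4096).decode("ascii", errors="replace")
    request_line = request.split("\r\n", 1)[0]      # e.g. "GET /index.html HTTP/1.0"
    parts = request_line.split(" ")
    path = parts[1] if len(parts) >= 2 else ""
    body = DOCUMENTS.get(path)                      # the "lookup function"
    if body is None:
        status, body = "404 Not Found", b"Not found"
    else:
        status = "200 OK"
    header = f"HTTP/1.0 {status}\r\nContent-type: text/html\r\n\r\n"
    conn.sendall(header.encode("ascii") + body)
    conn.close()


def serve(port=8080):
    """Listen on a port and handle clients one at a time."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as listener:
        listener.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        listener.bind(("127.0.0.1", port))
        listener.listen()
        while True:
            conn, _addr = listener.accept()
            handle_connection(conn)
```

Calling `serve()` blocks and answers one client at a time; a production server such as Apache handles many clients concurrently.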


Serving a Page

• User of client machine types in a URL

Figure: client (Netscape) and server (Apache); URL http://www.smallco.com/index.html


Serving a Page

• Server name is translated to an IP address via DNS

Figure: client (Netscape) and server (Apache); the server name www.smallco.com in http://www.smallco.com/index.html is translated to 192.22.107.5


Serving a Page

• Client connects to server using IP address and port number

Figure: client (Netscape) connects to server (Apache) at 192.22.107.5, port 80; URL http://www.smallco.com/index.html


Serving a Page

• Client determines path and file to request

Figure: client (Netscape) and server (Apache); URL http://www.smallco.com/index.html


Serving a Page

• Client sends HTTP request to server

Figure: client (Netscape) and server (Apache); URL http://www.smallco.com/index.html

GET /index.html HTTP/1.1


Serving a Page

• Server determines which file to send

Figure: client (Netscape) and server (Apache); URL http://www.smallco.com/index.html

"index.html" is really /etc/httpd/htdocs/index.html


Serving a Page

• Server sends response code and the document

Figure: client (Netscape) and server (Apache); URL http://www.smallco.com/index.html

HTTP/1.1 200 OK

Content-type: text/html

[contents of index.html]


Serving a Page

• Connection is broken

Figure: client (Netscape) and server (Apache)


HTTP

• HTTP is…

– Designed for document transfer

– Generic

• not tied to web browsers exclusively

• can serve any data type

– Stateless

• no persistent client/server connection


HTTP Protocol Definitions

• MIME

– Multipurpose Internet Mail Extensions

– Standards for encoding different media types in a message

– Originally developed for emailing files and messages in different languages


HTTP Protocol Definitions

• "MIME types" are used to identify the type of information that a file contains. While the file extension .html is informally understood to mean that the file is an HTML page, there is no requirement that it mean this, and many HTML pages have different file extensions.

• In the HTTP protocol used by web browsers to talk to web servers, the "file extension" of the URL is not used to determine the type of information that the server will return. Indeed, there may be no file extension at all at the end of the URL.

• Instead, the web server specifies the correct MIME type using a Content-type: header when it responds to the web browser's HTTP request.


HTTP Protocol Definitions

Examples of common MIME types:

Type                      Common File Extension  Purpose

text/html                 .html                  Web page

image/png                 .png                   PNG-format image

image/jpeg                .jpeg                  JPEG-format image

audio/mpeg                .mp3                   MPEG audio file

application/octet-stream  .exe                   Best for downloads that should just be saved to disk
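
A server commonly picks the Content-type header from its own extension table. The sketch below hard-codes the five entries from the slide; the function name `content_type_header` and the fallback to application/octet-stream for unknown extensions are assumptions of this example.

```python
import os

# Extension-to-type table copied from the slide; a real server uses a far larger one.
MIME_TYPES = {
    ".html": "text/html",
    ".png": "image/png",
    ".jpeg": "image/jpeg",
    ".mp3": "audio/mpeg",
    ".exe": "application/octet-stream",
}


def content_type_header(path):
    """Build the Content-type header line a server might send for a file path."""
    extension = os.path.splitext(path)[1].lower()
    # Assumption of this sketch: unknown extensions are sent as a plain download.
    mime_type = MIME_TYPES.get(extension, "application/octet-stream")
    return f"Content-type: {mime_type}"


print(content_type_header("/Recreation/Sports/Soccer/index.html"))
print(content_type_header("logo.PNG"))
```

The browser acts on the header it receives, not on the extension, so the server is free to map any URL to any type.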


HTTP Protocol Definitions

• In addition to e-mail applications, Web browsers also support various MIME types. This enables the browser to display or output files that are not in HTML format.

• A new version, called S/MIME, supports encrypted messages.


• WWW – HTTP Protocol

– When a user types in http://www.yahoo.com/Recreation/Sports/Soccer/index.html, the browser creates an HTTP GET Request message and sends it over a TCP connection to the web server.

– In the above case, the HTTP GET Request message would be

GET /Recreation/Sports/Soccer/index.html HTTP/1.0

User-Agent: InternetExplorer/5.0

Accept: text/html, text/plain, image/gif, audio/au

"\r\n"
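
The request message can be assembled programmatically. In this sketch, `build_get_request` is a made-up helper; it ends every line with carriage return and line feed and finishes with the empty line that marks the end of the headers.

```python
def build_get_request(path, headers):
    """Assemble a GET request: request line, header lines, then a blank line."""
    lines = [f"GET {path} HTTP/1.0"]
    lines += [f"{name}: {value}" for name, value in headers.items()]
    # Joining with CRLF and two trailing empty strings yields the closing "\r\n\r\n".
    return "\r\n".join(lines + ["", ""]).encode("ascii")


request = build_get_request(
    "/Recreation/Sports/Soccer/index.html",
    {
        "User-Agent": "InternetExplorer/5.0",
        "Accept": "text/html, text/plain, image/gif, audio/au",
    },
)
print(request.decode("ascii"))
```

The resulting bytes are what a client would write to the TCP connection.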


• WWW – HTTP Request messages

– HTTP Request messages are sent from client to server.

Request Line | Optional HTTP Header | "\r\n" | Optional Data

Request Line: type of request (e.g. GET)

Optional HTTP Header: additional information such as browser being used, media types accepted

"\r\n": delimiter (carriage return, line feed)

Optional Data: user data, e.g. contents of completed form

– There are a number of valid HTTP Request messages

• GET – Used to request a web page from a web server

• HEAD – Return the header of a web page, used by search engines to test the validity of hyperlinks

• POST – Used to send data (e.g. results of registration form) to a web server

• PUT / DELETE – Not typically implemented by browsers.


• WWW – HTTP Response messages

– HTTP Response messages are sent from server to client.

Status Line | Optional HTTP Header | "\r\n" | Optional Data

Status Line: success/failure indication, a number between 200 and 599

Optional HTTP Header: type of content returned, e.g. text/html or image/gif

"\r\n": delimiter

Optional Data: requested data, e.g. web page

– The Status Line gives information about the success of the previous HTTP Request

• 200 – 299 Success

• 300 – 399 Redirection – Document has been moved

• 400 – 499 Client Error – Bad Request, Unauthorised, Not found

• 500 – 599 Server Error – Internal Error, Service Overloaded
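
A client can read the status line and sort the code into one of the ranges above. The helper names `parse_status_line` and `status_class` are invented for this sketch; codes outside the ranges on the slide are reported as "Other" (HTTP/1.1 also defines 1xx informational codes, which the slide does not list).

```python
def parse_status_line(line):
    """Split a status line such as 'HTTP/1.1 200 OK' into version, code and reason."""
    version, code, reason = line.split(" ", 2)
    return version, int(code), reason


def status_class(code):
    """Map a numeric status code to the ranges listed on the slide."""
    if 200 <= code <= 299:
        return "Success"
    if 300 <= code <= 399:
        return "Redirection"
    if 400 <= code <= 499:
        return "Client Error"
    if 500 <= code <= 599:
        return "Server Error"
    return "Other"  # not covered by the slide, e.g. 1xx informational codes


version, code, reason = parse_status_line("HTTP/1.1 200 OK")
print(code, status_class(code))
```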


• WWW – Caching Web pages

– Downloading HTML documents from servers can be slow due to a number of conditions:

• Parts of the Internet can be congested

• Dialup connection is typically very slow, 33 Kbps or 56 Kbps

• Web server can have a lot of clients connecting to it at the same time, causing it to be overloaded.

– If a user returns to a previous HTML document, then this could require downloading the document from the server again.

– A browser can hold copies of recently visited pages. This avoids having to download pages again.

– An organisation can use an HTTP proxy that caches documents for multiple users, thus improving the speed at which pages can be displayed on each user's computer.
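
The idea behind both a browser cache and a caching proxy can be sketched as a lookup table in front of the download step. `PageCache` and `fake_download` are names made up for this example, and the sketch never expires or revalidates entries, which a real cache would need to do.

```python
class PageCache:
    """Keep copies of fetched pages so repeat visits skip the download."""

    def __init__(self, download):
        self._download = download   # function that takes a URL and returns the page
        self._pages = {}
        self.downloads = 0          # how many times the network was actually used

    def get(self, url):
        if url not in self._pages:
            self._pages[url] = self._download(url)
            self.downloads += 1
        return self._pages[url]


def fake_download(url):
    # Stand-in for a real network fetch, so the sketch runs offline.
    return f"<HTML>contents of {url}</HTML>"


cache = PageCache(fake_download)
first = cache.get("http://www.smallco.com/index.html")
second = cache.get("http://www.smallco.com/index.html")
print(cache.downloads)
```

A proxy shares one such table between many users, so a page downloaded for one user can be served to the next without contacting the web server.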


Proxy server:


• WWW – Browser Architecture

Figure: browser architecture, showing the controller (input from keyboard and mouse), HTTP client and other clients, network interface (communication with remote server), HTML interpreter and optional plugins, and display driver (output sent to display).


• WWW – Browser Architecture

– Browser has more components than a server:

• Display driver for painting screen.

• HTML interpreter for formatting HTML documents.

• Plugins to display different content (e.g., Shockwave or Real Audio content)

• HTTP client to fetch HTML documents from WWW server.

• Other clients for other protocols (e.g., ftp, mail)

• Controller also must accept input from the computer user through the mouse or keyboard.


Other Protocols

• FTP - File Transfer Protocol

– The Internet began development in the 1960s.

– Moving a file from one computer to another computer required some form of removable medium (floppy disk or tape).

– People required a protocol to reliably transfer files between any two computers connected to the Internet.

– Why not use HTTP?

• The HTTP protocol was developed in the late 1980s and the early 1990s, well after FTP, which was developed in 1971.

• HTTP provides a poor authentication mechanism for users of the protocol.

• HTTP doesn't easily allow files to be sent in both directions.

• HTTP doesn't allow files to be downloaded in separate stages.


Two major differences between FTP and HTTP:

1) When connecting to an FTP server you are using a file server (that is, you see only files), whereas connecting to an HTTP server gives you access to a web server, which means you can load web pages into a browser.

2) Using an FTP connection you can both download and upload files to the server, whereas with an HTTP connection you typically only download content from the Internet for viewing, which makes it largely a "read only" method.


• FTP - Functions

– The main function of FTP was to allow the sharing of files across the Internet.

– It supports CHMOD permissions for read, write and execute.

– Other functions included

• Allowing computer users to use computers remotely.

• Hiding file storage differences from the user. The format in which files are stored on a Macintosh is different from that on a PC, which in turn is different from that on a Unix workstation. Different length filenames also have to be accommodated.

• Transfer of file data between computers has to be done reliably and efficiently. FTP should also allow transfer of very large files to be done in stages.


• FTP

– FTP is a client/server program

– An FTP client program enables the user to interact with an FTP server in order to access files on the FTP server computer.

– Client programs can be:

• Simple command line interfaces, e.g. at the MS-DOS prompt: C:\> ftp ftp.maths.tcd.ie

• Integrated with Web browsers, e.g. Netscape Navigator, Internet Explorer.

– FTP provides similar services to those available on most filesystems: list directories, create new files, download files, delete files.

– FTP uses TCP connections and the default server port for FTP is 21.


• FTP - Transfer modes

– Batch transfer

• User creates list of files to be transferred by ftp program.

• User's request is dropped into a queue of similar requests.

• FTP program reads requests and performs transfers of files.

• Transfer program can retry until successful.

• Good for slow or unreliable transfers.

– Interactive transfer

• User starts ftp program

• User can interactively list contents of directories, transfer files, delete files etc.

• User can find and transfer files immediately

• Quick feedback in case of mistakes, e.g., spelling errors


• FTP - Sample Commands

Command  Description

open     Open connection to computer

ls       List directory contents

cd       Change to another directory

bin      Change to binary transfer, used for downloading executables

get      Download a file from remote computer

put      Upload a file to the remote computer

mget     Start download of multiple files

mput     Start upload of multiple files


• FTP - Checkpointing

– A data transfer may be aborted after only transferring part of a file.

• This could be due to the client or the server crashing, the TCP connection being broken due to congestion, or the phone hanging up during a dial-up connection.

– FTP allows the file transfer to resume from where it was stopped, so there is no need to re-transfer the part of the file already received.

– FTP achieves this by sending restart markers between the server and the client.

– Restart markers are saved in a restart file by the client. Client sends restart marker when it wants to continue the transfer of a previously stopped transfer.
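
The restart idea can be illustrated without a network by copying between two in-memory files. This is a simulation, not the FTP wire protocol: `transfer`, the byte-offset marker, the 4-byte chunk size and the `fail_after` switch are all assumptions made for the sketch.

```python
import io


def transfer(source, dest, restart_marker=0, chunk_size=4, fail_after=None):
    """Copy source to dest starting at restart_marker; return the new marker.

    fail_after simulates a broken connection after that many chunks.
    """
    source.seek(restart_marker)
    dest.seek(restart_marker)
    sent = 0
    while True:
        if fail_after is not None and sent >= fail_after:
            break                       # simulated failure: stop mid-transfer
        chunk = source.read(chunk_size)
        if not chunk:
            break                       # end of file: transfer complete
        dest.write(chunk)
        restart_marker += len(chunk)    # the client would save this marker
        sent += 1
    return restart_marker


remote_file = io.BytesIO(b"0123456789ABCDEF")
local_file = io.BytesIO()

marker = transfer(remote_file, local_file, fail_after=2)           # stops after 8 bytes
marker = transfer(remote_file, local_file, restart_marker=marker)  # resumes at byte 8
print(marker, local_file.getvalue())
```

In Python's standard `ftplib`, the `rest` argument of `FTP.retrbinary` plays a comparable role by asking the server to start sending from a given offset.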