Web Technologies - Alexandru Ioan Cuza Universitybusaco/teach/courses/web/presentations/...HTTP/1.1...

Post on 23-May-2018


Dr. Sabin Buraga, profs.info.uaic.ro/~busaco/

Web Technologies

Web programming (I)

HTTP protocol

cookies & sessions


“There are 2 ways to write error-free programs; only the third one works.”

Alan Perlis


What does the Web mean?


World Wide Web

an information space containing elements (things) of interest, called resources,

denoted by global identifiers – URI/IRI

details at www.w3.org/TR/webarch/ (W3C Recommendation, 2004)


Web resources

Aspects of interest

identification: URI/IRI

interaction: protocol (HTTP)

representation by using data formats: markup language(s)


How about the interaction between client(s) and Web server(s)?


HTTP

HyperText Transfer Protocol

based on TCP/IP


HTTP

situated on the application layer

access control to the data transmission medium (MAC – Medium Access Control)

network interconnection + data routing (IP – Internet Protocol)

reliable transport via sockets (TCP – Transmission Control Protocol)

hypertext/hypermedia transfer (HTTP – HyperText Transfer Protocol)


HTTP

HyperText Transfer Protocol

a reliable request/response protocol

standard access port: 80


HTTP

HTTP/1.1

Internet standard: RFC 2616 (1999)

from 2014, defined by RFC 7230 to RFC 7235

www.w3.org/Protocols/

http://devdocs.io/http/


HTTP

HTTP/2

RFC 7540 (2015)

focused on performance

http://royal.pingdom.com/2015/06/11/http2-new-protocol/

http://http2.github.io/


HTTP: architecture

Web Server

daemon – “attendant spirit”

Web Client

browser, Web bot (crawler), multimedia player,…


HTTP: architecture

Web Server: Apache, Internet Information Services, Lighttpd, Nginx,…

Web Client: Mosaic, Netscape, Mozilla, Firefox, Internet Explorer, Chromium, wget, iTunes, Echofon, etc.

details in “Web browser architecture” presentation: http://profs.info.uaic.ro/~busaco/teach/courses/cliw/web-film.html#week2


HTTP

Request and response: accessing – possibly, changing – a resource representation by using its URI

Diagram: the Web Client sends a request to the Web Server, which returns a response


HTTP: concepts

Message

base unit of the HTTP communication (request or response)


HTTP: concepts

Intermediary

proxy, gateway, tunnel


HTTP: concepts

Proxy: located in the client/server proximity, having the role of both server and client

Diagram: Web Client, proxy, Web Server (the proxy sits between the two)


HTTP: concepts

Proxy

forward proxy: intermediary for a group of clients; acts on behalf of clients

reverse proxy: intermediary for a group of servers

advanced


HTTP: concepts

Gateway: intermediary hiding the target (origin) server; the client has no knowledge about it

Diagram: the Web Client talks to the Web Gateway, which stands in front of several Web Servers


HTTP: concepts

Gateway

can provide:

traffic distribution across servers – load balancing

short-term data storage – caching

message or request translation (e.g., HTTPS ↔ HTTP)

other negotiation operations – role of mediator/broker

open source solutions: HAProxy, Squid, Varnish

cloud-based: Amazon ELB (Elastic Load Balancing)

advanced


HTTP: concepts

Tunnel

retransmits – usually, encrypted – HTTP messages


context: HTTPS protocol – to assure a “secure” HTTP communication via TLS (Transport Layer Security)

authentication based on digital certificates + bidirectional data encryption


HTTP: concepts

Figure: details about an HTTPS connection

advanced


HTTP: concepts

Cache

local storage area – in memory, on a disk – for the messages (data)

server- and/or client-side


future requests for that data can be served faster

context: Web application performance


HTTP: messages

HTTP message = header + body


HTTP: messages

Header

includes a set of fields

field-name ":" [ field-value ] CRLF

CR = Carriage Return \r – code 13

LF = Line Feed \n – code 10
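The field syntax above can be illustrated with a minimal Python sketch. The function name parse_header_fields is hypothetical, and the sketch simplifies: a repeated field overwrites the earlier one, which a real parser would not do.

```python
def parse_header_fields(raw: bytes) -> dict[str, str]:
    """Split a raw header block into a {field-name: field-value} mapping."""
    fields = {}
    for line in raw.split(b"\r\n"):  # each field ends with CRLF (codes 13 and 10)
        if not line:
            break  # an empty line marks the end of the header
        name, _, value = line.decode("iso-8859-1").partition(":")
        # field names are case-insensitive, so they are stored in lower case
        fields[name.strip().lower()] = value.strip()
    return fields


raw = b"Host: profs.info.uaic.ro\r\nAccept-Language: en-us, en;q=0.5\r\n\r\n"
headers = parse_header_fields(raw)
print(headers["host"])  # profs.info.uaic.ro
```

Only the first ":" separates name from value, so values that themselves contain colons (URLs, times) stay intact.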


HTTP: messages

HTTP request

Method Request-URI ProtocolVersion CRLF

[ Message-header ] [ CRLF MIME-data ]

GET /~busaco/teach/courses/web/ HTTP/1.1 CRLF

Host: profs.info.uaic.ro
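As a rough sketch of the request syntax above, the following Python function assembles the bytes a client would send. The helper name build_request is an assumption made for illustration; nothing is sent over the network.

```python
def build_request(method: str, request_uri: str,
                  headers: dict[str, str], body: bytes = b"") -> bytes:
    """Assemble: request line, header fields, empty line, optional body."""
    lines = [f"{method} {request_uri} HTTP/1.1"]
    lines += [f"{name}: {value}" for name, value in headers.items()]
    head = "\r\n".join(lines) + "\r\n\r\n"  # the extra CRLF ends the header
    return head.encode("ascii") + body


message = build_request("GET", "/~busaco/teach/courses/web/",
                        {"Host": "profs.info.uaic.ro"})
print(message)
# b'GET /~busaco/teach/courses/web/ HTTP/1.1\r\nHost: profs.info.uaic.ro\r\n\r\n'
```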


HTTP: messages

HTTP response

HTTP-version Digit Digit Digit Reason CRLF Content

HTTP/1.1 200 OK CRLF …
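The first line of a response can be taken apart as follows. This is a minimal sketch with a hypothetical helper name, parse_status_line; it assumes the line carries all three parts, including a reason phrase.

```python
def parse_status_line(line: str) -> tuple[str, int, str]:
    """Split 'HTTP-version Digit Digit Digit Reason' into its three parts."""
    # the reason phrase may contain spaces, so split at most twice
    version, code, reason = line.rstrip("\r\n").split(" ", 2)
    return version, int(code), reason


print(parse_status_line("HTTP/1.1 200 OK\r\n"))  # ('HTTP/1.1', 200, 'OK')
```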


HTTP: methods

GET

request – performed by a client – to access a resource representation

e.g., HTML document, CSS stylesheet, image in PNG format, vector illustration as SVG, JavaScript program, Atom or RSS (XML) news feed, PDF presentation, JSON data,…


HTTP: methods

HEAD

similar to GET, but the response offers only meta-data (header fields, no body)

e.g., MIME type of a resource, last update,…


HTTP: methods

PUT

updates a resource representation or, possibly, creates a resource on the Web server


HTTP: methods

POST

creates a resource, usually sending entities (data, actions) to the server

e.g., data entered into a Web form’s fields


HTTP: methods

DELETE

erases a resource – its representation – from the server


HTTP: methods

Remark

traditionally, the Web browser only permits the use of GET and POST methods


HTTP: methods

A method is considered safe if it does not modify the server state

i.e. no side-effect actions are performed on the server

GET and HEAD are safe

POST, PUT and DELETE are not safe

advanced


HTTP: methods

A method is considered idempotent when it can be called many times without different outcomes: repeating the same request leaves the server in the same state as making it once

GET, HEAD, PUT and DELETE are idempotent

POST is not idempotent

advanced
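One practical use of these two properties, sketched in Python: a client may decide whether a request can be resent after a lost connection. The helper name may_retry_automatically and the retry policy itself are assumptions for illustration, not something prescribed by the slides.

```python
# Classification taken from the lists above.
SAFE = {"GET", "HEAD"}
IDEMPOTENT = SAFE | {"PUT", "DELETE"}  # every safe method is also idempotent


def may_retry_automatically(method: str) -> bool:
    """Repeating an idempotent request has the same effect as sending it once."""
    return method.upper() in IDEMPOTENT


print(may_retry_automatically("PUT"))   # True
print(may_retry_automatically("POST"))  # False
```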


HTTP: resource representations

Character set encodings

ISO-8859-1, ISO-8859-2

KOI8-R, ISO-2022-JP

UTF-8, UTF-16 Little Endian


HTTP: resource representations

Message (content) encodings

compression, identity and/or integrity

in most cases: gzip


HTTP: resource representations

Representation formats

text: HTML, CSS, plain text, JavaScript code, XML document

or

binary: image (JPEG, PNG), PDF document, multimedia resource


HTTP: resource representations

Types of the resource content

media types

HTTP: header fields (attributes)

Content-Type

permits the transfer of any kind of data

Content-Type: type/subtype

HTTP: header fields (attributes)

Content-Type

specified by Media Types – MIME (Multipurpose Internet Mail Extensions)

denotes a set of primary content types + additional sub-types

initially, used in the e-mail context


HTTP: header fields (attributes)

Primary types

text indicates textual formats

text/plain – unformatted text

text/html – HTML document

text/css – CSS (Cascading Style Sheets) resource


HTTP: header fields (attributes)

Primary types

image specifies graphical formats

image/gif – GIF (Graphics Interchange Format) images

image/jpeg – JPEG (Joint Photographic Experts Group) photos

image/png – PNG (Portable Network Graphics) pictures


HTTP: header fields (attributes)

Primary types

audio denotes audio content

audio/mpeg – resource encoded in MP3 format: specification for audio data according to the MPEG (Moving Picture Experts Group) standard

audio/ac3 – compressed audio resource conforming to the AC-3 standard


HTTP: header fields (attributes)

Primary types

video defines video content: animations, films

video/h264 – resource in H.264 format

video/ogg – content encoded in OGG open format


HTTP: header fields (attributes)

Primary types

application signifies formats that can be processed by applications on the client-side

application/javascript – JavaScript program

application/json – JSON (JavaScript Object Notation) data

application/octet-stream – stream of arbitrary bytes


HTTP: header fields (attributes)

Primary types

multipart used to transfer composed data

multipart/mixed – mixed content

multipart/alternative – alternative contents

e.g., different qualities of multimedia streams

N. Freed et al., Media Types (February 2017)

http://www.iana.org/assignments/media-types/media-types.xhtml

Name            Media type                  Description
calendar+json   application/calendar+json   Calendar in JSON format
csv             text/csv                    CSV data
opus            audio/opus                  Opus audio resource
msword          application/msword          Word (MS Office) document
tiff            image/tiff                  Image in TIFF format
vnd.rar         application/vnd.rar         RAR archive
zip             application/zip             ZIP archive
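A Content-Type value follows the type/subtype pattern and may carry parameters such as charset. The Python sketch below splits such a value; the function name split_media_type is hypothetical and the parsing is deliberately simplified (for instance, it does not handle quoted parameter values that contain a semicolon).

```python
def split_media_type(content_type: str) -> tuple[str, str, dict[str, str]]:
    """Return (primary type, subtype, parameters) of a Content-Type value."""
    media_type, *params = [part.strip() for part in content_type.split(";")]
    primary, _, subtype = media_type.lower().partition("/")
    parameters = {}
    for param in params:
        if not param:
            continue  # tolerate a trailing ';'
        key, _, value = param.partition("=")
        parameters[key.strip().lower()] = value.strip().strip('"')
    return primary, subtype, parameters


print(split_media_type("text/html; charset=UTF-8"))
# ('text', 'html', {'charset': 'UTF-8'})
```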

HTTP: header fields (attributes)

Location

Location ":" "http(s)://" authority [ ":" port ] [ abs_path ]

redirects the client to another resource representation (HTTP redirect)

Location: http://somewhere.info:8080/moved.html
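A client following the redirect has to take the Location value apart. A small sketch with Python's standard urllib.parse module; the base URL used with urljoin is a made-up example showing how a relative target would be resolved.

```python
from urllib.parse import urljoin, urlsplit

# Components of the Location value from the example above.
target = urlsplit("http://somewhere.info:8080/moved.html")
print(target.scheme, target.hostname, target.port, target.path)
# http somewhere.info 8080 /moved.html

# A relative target is resolved against the URL that was requested
# (hypothetical base URL).
resolved = urljoin("http://somewhere.info:8080/old/page.html", "/moved.html")
print(resolved)  # http://somewhere.info:8080/moved.html
```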

HTTP: header fields (attributes)

Referer

denotes the URI of a Web resource that refers to the current resource

used to know the URI source of the requests to a given document (i.e. back-links)

for analytics, logging, caching,…

HTTP: header fields (attributes)

Host

specifies the target address – IP or symbolic domain – of the machine supposed to provide

a requested resource

HTTP: header fields (attributes)

Other existing fields concern the following:

accepted content (content negotiation) – e.g., Accept

authentication & authorization – WWW-Authenticate, Authorization

conditional access to resources – If-Match, If-Modified-Since,…

caching policies – Cache-Control, Expires, ETag, etc.

proxy – Proxy-Authenticate, Proxy-Authorization, Via

…and others

www.iana.org/assignments/message-headers/message-headers.xhtml

advanced


HTTP: status

Informational (1xx)

100 Continue, 101 Switching Protocols

switching protocols: from HTTP to WebSocket (RFC 6455)


HTTP: status

Success (2xx)

200 OK, 201 Created, 202 Accepted, 204 No Content, 206 Partial Content

OPTIONS – method to determine server capabilities or requirements for a resource


HTTP: status

Redirection (3xx)

300 Multiple Choices, 301 Moved Permanently, 302 Found, 303 See Other, 304 Not Modified, 305 Use Proxy


HTTP: status

Client Error (4xx)

400 Bad Request, 401 Unauthorized, 403 Forbidden,

405 Method Not Allowed, 408 Request Timeout,

414 Request-URI Too Long


HTTP: status

Server Error (5xx)

500 Internal Server Error, 502 Bad Gateway,

503 Service Unavailable, 504 Gateway Timeout
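The five status classes are determined by the first digit of the code. A minimal Python sketch (status_class is a hypothetical helper; http.HTTPStatus is part of the standard library and supplies the reason phrases):

```python
from http import HTTPStatus

STATUS_CLASSES = {
    1: "Informational",
    2: "Success",
    3: "Redirection",
    4: "Client Error",
    5: "Server Error",
}


def status_class(code: int) -> str:
    """Map a three-digit status code to its class via the first digit."""
    return STATUS_CLASSES.get(code // 100, "Unknown")


print(status_class(304), "/", HTTPStatus(304).phrase)  # Redirection / Not Modified
print(status_class(503), "/", HTTPStatus(503).phrase)  # Server Error / Service Unavailable
```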


HTTP: logging

Requests sent to a Web server are logged

Common Log Format

standardized text file format

for Apache HTTP Server: mod_log_config module


c12.uaic.ro - msi2013 [13/Feb/2014:14:53:14 +0200] "GET /~vidrascu/MasterSI2/note/Restanta.pdf HTTP/1.1" 206 25227 "http://profs.info.uaic.ro/~vidrascu/MasterSI2/index.html" "...Firefox/27.0"

82-137-8-231.rdsnet.ro - - [13/Feb/2014:15:38:23 +0200] "POST /~computernetworks/login.php HTTP/1.1" 302 1115 "http://profs.info.uaic.ro/~computernetworks/login.php" "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:26.0) Gecko/20100101 Firefox/26.0"

ec2-23-21-0-202.compute-1.amazonaws.com - - [13/Feb/2014:15:48:29 +0200] "GET /~busaco/teach/courses/web/presentations/web01ArhitecturaWeb.pdf HTTP/1.1" 200 2081804 "-" "HTTP_Request2/2.2.0 (http://pear.php.net/package/http_request2)..."

199.16.156.126 - - [13/Feb/2014:15:58:58 +0200] "GET /robots.txt HTTP/1.1" 404 182 "-" "Twitterbot/1.0"

psihologie-c-113.psih.uaic.ro - - [13/Feb/2014:16:03:04 +0200] "GET /~busaco/ HTTP/1.1" 200 1942 "-" "Mozilla/5.0 (X11; Linux x86_64; ...) Firefox/27.0"

psihologie-c-113.psih.uaic.ro - - [13/Feb/2014:16:03:04 +0200] "GET /~busaco/csb.css HTTP/1.1" 200 852 "http://profs.info.uaic.ro/~busaco/" "Mozilla/5.0 (X11; Linux x86_64; rv:27.0) Gecko/20100101 Firefox/27.0"

proxy-220-255-2-224.singnet.com.sg - - [13/Feb/2014:16:23:23 +0200] "GET /favicon.ico HTTP/1.1" 200 1406 "-" "Dalvik/1.6.0 (Linux; U; Android 4.0.4; ...)"

c2.uaic.ro - - [13/Feb/2014:16:33:43 +0200] "GET /~busaco/teach/courses/web/ HTTP/1.1" 304 - "-" "... Chrome/32.0.1700.107..."

220.181.51.219 - - [13/Feb/2014:19:20:20 +0200] "HEAD /%7Ebusaco/music/09.Sabin%20Buraga%20-...mp3 HTTP/1.0" 200 - "-" "NSPlayer/10.0.0.4072 WMFSDK/10.0"
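Log entries like the ones above can be processed with a regular expression. The Python sketch below assumes the layout visible in these lines (host, identity, user, timestamp, request line, status, size, then the optional quoted referer and user agent); the names LOG_LINE and parse_log_line are illustrative only.

```python
import re
from datetime import datetime

LOG_LINE = re.compile(
    r'(?P<host>\S+) (?P<ident>\S+) (?P<user>\S+) '
    r'\[(?P<time>[^\]]+)\] "(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) (?P<size>\S+)'
    r'(?: "(?P<referer>[^"]*)" "(?P<agent>[^"]*)")?'
)


def parse_log_line(line: str):
    """Return a dict describing one log entry, or None if it does not match."""
    match = LOG_LINE.match(line)
    if match is None:
        return None
    entry = match.groupdict()
    entry["status"] = int(entry["status"])
    # "-" means that no body was sent (e.g., for 304 or HEAD)
    entry["size"] = 0 if entry["size"] == "-" else int(entry["size"])
    entry["time"] = datetime.strptime(entry["time"], "%d/%b/%Y:%H:%M:%S %z")
    return entry


sample = ('199.16.156.126 - - [13/Feb/2014:15:58:58 +0200] '
          '"GET /robots.txt HTTP/1.1" 404 182 "-" "Twitterbot/1.0"')
entry = parse_log_line(sample)
print(entry["host"], entry["status"], entry["size"], entry["agent"])
# 199.16.156.126 404 182 Twitterbot/1.0
```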


GET /~busaco/teach/courses/web/web-film.html HTTP/1.1

Host: profs.info.uaic.ro

User-Agent: Mozilla/5.0 (iPhone; CPU iPhone OS 10_1_1 like Mac OS X) AppleWebKit/602.2.14 (KHTML, like Gecko) Version/10.0 Mobile/14B100 Safari/602.1

Accept: text/html,application/xhtml+xml;q=0.9,*/*;q=0.8

Accept-Language: en-us, en;q=0.5

Accept-Encoding: gzip, deflate

Connection: keep-alive

Referer: http://profs.info.uaic.ro/~busaco/teach/courses/web/

HTTP: request – example


HTTP/1.1 200 OK

Date: Mon, 27 Feb 2017 15:18:01 GMT

Server: Apache

Last-Modified: Mon, 27 Feb 2017 07:46:02 GMT

Content-Encoding: gzip

Content-Length: 11064

Keep-Alive: timeout=15, max=100

Connection: Keep-Alive

Content-Type: text/html

<!DOCTYPE html>

<html xmlns="http://www.w3.org/1999/xhtml"

lang="ro" xml:lang="ro">

</html>

annotations: header fields (meta-data), followed by the content

HTTP: response – example


advanced

online inspection of HTTP messages with www.hurl.it


advanced

X- fields are not standardized

expires in the past (not stored in cache)

actual content (Atom feed)

processed by client


HTTP: APIs (libraries)

cURL + libcurl (C, Java, Haskell, .NET, PHP, Ruby,…) – http://curl.haxx.se/

Apache HttpComponents (Java) – http://hc.apache.org/

httplib (Python 2) + http.client (Python 3)

neon (C library) – http://www.webdav.org/neon/

WinHTTP (Windows specific: C/C++) – http://tinyurl.com/6eemqqc

advanced


HTTP: client-side tools

Google Chrome Developer Tools – https://developers.google.com/web/tools/chrome-devtools/

Firefox Developer Tools – https://developer.mozilla.org/docs/Tools

Fiddler – a free Web debugging proxy – www.telerik.com/fiddler

advanced


(instead of) break


How about the Web server architecture?


HTTP: Web server

Fulfills multiple requests from the clients respecting the HTTP protocol


each request is considered independent of the others, even when it was issued by the same Web client

connection state is not kept – stateless


HTTP: Web server

Traditionally, the Web server implementation is either pre-forked or pre-threaded

on initialization, a number of child processes or threads are created, each process/thread interacting with a distinct client

advanced


HTTP: Web server

advanced

http://strongloop.com/strongblog/node-js-is-faster-than-java/


HTTP: Web server

Server behavior can be controlled by various configuration parameters (directives)

advanced


HTTP: Web server

Case study: Apache HTTP Server configuration (from April 1996, the most popular Web server)

http://httpd.apache.org/

global configuration: httpd.conf file

6 httpd instances are created by default

a user specific configuration (per directory/URI) is defined via .htaccess – see also https://github.com/phanan/htaccess

advanced


HTTP: Web server

Case study: Apache HTTP Server configuration

possibility to define virtual hosts – virtual hosting: the same server can host (run) multiple Web sites, with different symbolic domain names

advanced


Apache server: request processing

HTTP request → post-read-request → IRI translation → header parsing → access control → authentication → authorization → media type checker → response (data to the client) → log → cleanup → loop back for the next request

advanced


HTTP: Web server

Usually, the Web server architecture is modular

kernel (core) +

modules implementing specific functionalities

advanced


provides a C language-based API (Application Programming Interface) to create modules

advanced


examples (Apache): mod_auth_basic, mod_cache, mod_deflate, mod_include, mod_proxy, mod_session, mod_ssl

advanced


HTTP: Web server

Another approach: asynchronous (non-blocking), single-threaded strategies

reference examples: nginx, Node.js

advanced


How can we develop the back-end of Web applications?


necessity

Dynamic generation – on the server – of representations of resources requested by clients



solutions

CGI – Common Gateway Interface

Web application servers

Web frameworks


solution: cgi

Language-independent programming interface facilitating the interaction between clients and programs invoked on the Web server

de facto standard

RFC 3875, http://www.w3.org/CGI/


cgi

A CGI program (script) is invoked on the server

directly

e.g., retrieving data from a Web form after the submit button is pressed


cgi

A CGI program (script) is invoked on the server

indirectly

example: at each visit a new ad (e.g., banner) is generated


cgi

CGI scripts can be written in any language available on the server

interpreted languages: bash, Perl (e.g., the CGI module, CGI.pm), Python, Ruby,...

compiled languages: C, C++, etc.


cgi: programming

Each CGI program will write data – the representation of a Web resource – to standard output (stdout)


cgi: programming

To denote the type of generated representation, HTTP headers are used – MIME (Media Types)

example: Content-type: text/html


cgi: programming

Interaction between the client and Web server

Diagram: the Web Client sends a request to the Web Server, which invokes the script; the script output is returned to the client as the response


cgi: variables

A CGI script has access to environment variables associated with the request sent to the CGI program:

REQUEST_METHOD – HTTP method (GET, POST,…)

QUERY_STRING – data transmitted by the client in the URL, after the ? character

REMOTE_HOST, REMOTE_ADDR – client address

CONTENT_TYPE – content type as MIME (Media Type)

CONTENT_LENGTH – content length in bytes


cgi: variables

Additional variables, usually generated by the Web server from the request header fields:

HTTP_ACCEPT – MIME types accepted by client (browser)

HTTP_COOKIE – data about cookies

HTTP_HOST – the host named in the Host field of the request (the server being addressed)

HTTP_USER_AGENT – information about the client

…and others
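A CGI script could inspect these variables as in the following Python sketch. The function name render and the "(not set)" placeholder are assumptions made for illustration; a server sets only the variables that apply to the current request.

```python
#!/usr/bin/env python3
import os

VARIABLES = ("REQUEST_METHOD", "QUERY_STRING", "REMOTE_ADDR",
             "CONTENT_TYPE", "CONTENT_LENGTH",
             "HTTP_ACCEPT", "HTTP_COOKIE", "HTTP_HOST", "HTTP_USER_AGENT")


def render(environ) -> str:
    """Build the CGI output: a header field, an empty line, then the body."""
    lines = [f"{name}={environ.get(name, '(not set)')}" for name in VARIABLES]
    return "Content-Type: text/plain\n\n" + "\n".join(lines) + "\n"


if __name__ == "__main__":
    # When run by the Web server, os.environ carries the CGI variables.
    print(render(os.environ), end="")
```

Taking the environment as a parameter keeps the function usable outside a Web server, for instance with a plain dictionary.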


the result received by a Web client after invoking, via GET, the variabile.cgi script on the Web server (the script needs read & execute rights)

#!/bin/bash
# Setting the content type
echo "Content-type: text/plain"; echo
# Executing the 'set' command in Linux
# to show environment variables
set


/* hello.c
   (compile with gcc hello.c -o hello.cgi) */
#include <stdio.h>

int main(void) {
  int msgs; /* number of messages */
  /* header field, then the empty line that ends the header */
  printf("Content-type: text/html\n\n");
  for (msgs = 0; msgs < 10; msgs++) {
    printf("<p>Hello, world!</p>");
  }
  return 0;
}

#!/usr/bin/env python3
# hello.py.cgi (Python 3 syntax; the original slide used the Python 2 print statement)
# the "\n" plus print's own newline produce the empty line that ends the header
print("Content-type: text/html\n")
for messages in range(0, 10):
    print("<p>Hello, world!</p>")

#!/bin/bash
# hello.sh.cgi
echo "Content-type: text/html"
echo   # empty line that ends the header
MESSAGES=0
while [ "$MESSAGES" -lt 10 ]
do
  echo "<p>Hello, world!</p>"
  let MESSAGES=MESSAGES+1
done

CGI programs written in C, Python, bash generating the same HTML content

advanced


cgi: invocation

experimenting with other MIME types, the browser displays the following:

Screenshots: output served as Content-type: text/plain and as Content-type: text/xml

advanced


cgi: invocation

<form action="http://profs.info.uaic.ro/~.../get-max.cgi" method="GET">
  <p>Enter two numbers:
    <input type="text" name="no1" />
    <input type="text" name="no2" /></p>
  <input type="submit" value="Compute maximum" />
</form>

invocation from an interactive Web form; in this case, using the GET method

cgi: invocation

special URL in GET case


cgi: invocation

For each form field, a field_name=value pair – delimited by & – is generated and added to the URL of the CGI script to be invoked on the server

http://profs.info.uaic.ro/~busaco/cgi/get-max.cgi?no1=7&no2=4
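On the server, the script has to split these pairs again. The source of get-max.cgi is not shown in the slides, so the Python sketch below is only a guess at what such a script might compute; the function name compute_max is hypothetical.

```python
from urllib.parse import parse_qs


def compute_max(query_string: str) -> int:
    """Extract the no1 and no2 fields and return the larger value."""
    fields = parse_qs(query_string)  # {'no1': ['7'], 'no2': ['4']}
    return max(int(fields["no1"][0]), int(fields["no2"][0]))


print(compute_max("no1=7&no2=4"))  # 7
```

parse_qs returns a list for every field, because the same name may appear several times in a query string.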


cgi: invocation

Real-life examples:

http://usabilitygeek.com/?s=web+design

https://www.youtube.com/watch?v=hEzmy93zr0Y#t=540

https://twitter.com/search?q=web%20development&src=typd

https://developer.mozilla.org/search?q=ajax&topic=apps

this URL is encoded – URL encoding (see first lecture)
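The encodings visible in these URLs (%20 or + for a space) can be reproduced with Python's standard urllib.parse module, as a quick illustration:

```python
from urllib.parse import quote, quote_plus, unquote, urlencode

print(quote("web development"))       # web%20development
print(quote_plus("web design"))       # web+design
print(unquote("web%20development"))   # web development

# Building a whole query string from field names and values.
print(urlencode({"q": "web programming", "ia": "videos"}))
# q=web+programming&ia=videos
```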


cgi: invocation

The server will invoke a CGI script, passing the data at standard input (stdin) or via environment variables


cgi: invocation

Data processing when GET method is used

data available in QUERY_STRING variable


cgi: invocation

Data processing when POST method is used

data read from stdin, the length in bytes being specified by CONTENT_LENGTH variable
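Both cases can be handled by one routine, sketched here in Python. The name read_form_data is hypothetical, and the sketch assumes the POST body is URL-encoded form data in UTF-8 (other content types, such as file uploads, need different parsing).

```python
import io
from urllib.parse import parse_qs


def read_form_data(environ, stdin) -> dict[str, list[str]]:
    """Return the form fields for a GET or a POST invocation."""
    if environ.get("REQUEST_METHOD", "GET").upper() == "POST":
        # read exactly CONTENT_LENGTH bytes from standard input
        length = int(environ.get("CONTENT_LENGTH") or 0)
        raw = stdin.read(length).decode("utf-8")
    else:
        raw = environ.get("QUERY_STRING", "")
    return parse_qs(raw)


# Simulated invocations; a real script would pass os.environ and sys.stdin.buffer.
print(read_form_data({"REQUEST_METHOD": "GET", "QUERY_STRING": "no1=7&no2=4"},
                     io.BytesIO(b"")))
print(read_form_data({"REQUEST_METHOD": "POST", "CONTENT_LENGTH": "11"},
                     io.BytesIO(b"no1=7&no2=4")))
# both print {'no1': ['7'], 'no2': ['4']}
```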


cgi: invocation

Data processing – GET and/or POST

in case of application servers or frameworks, data is encapsulated into specific structures/types

ASP.NET (C#) – HttpRequest class

PHP – associative arrays: $_GET[] $_POST[] $_REQUEST[]

Play (Java, Scala) – play.api.mvc.Request

Node.js (JavaScript) – http.IncomingMessage (the request object received by an http.Server handler)

advanced


GET vs. POST

GET method is used to generate the representations of the requested resources

e.g., HTML documents, JPEG images, Atom/RSS news feeds, ZIP archives, etc.

the server state should not be modified


obtaining data with GET, the user can set a bookmark for further accesses to the Web resource

(by using the URL of the generated representation)

e.g., https://duckduckgo.com/?q=web+programming&ia=videos


GET vs. POST

POST method is used when the data transmitted to the server is large (e.g., upload of file content)

or sensitive – typically, passwords


plus, when the script invocation can produce a state change on the server:

adding a record, altering a file,...


cgi: support

Web server should support CGI script invocation

example: Apache HTTP Server provides the mod_cgi module

advanced


cgi: ssi

CGI scripts can be invoked directly from an HTML document via SSI (Server Side Includes)

http://www.ssi-developer.net/ssi/

Apache: http://httpd.apache.org/docs/trunk/howto/ssi.html

Nginx: http://nginx.org/en/docs/http/ngx_http_ssi_module.html

advanced


cgi: fastcgi

FastCGI: an alternative to CGI focused on performance

implementations:

Apache – https://httpd.apache.org/mod_fcgid/

Nginx – nginx.org/en/docs/http/ngx_http_fastcgi_module.html

advanced


How can the data sent by the back-end of a Web application be stored – temporarily – on the front-end (browser)?


cookies

A script running on a Web server can place data on the client's computer via the user’s Web browser

subsequently, the browser will return that data to the same script available on the same server


cookies

A (quasi-)persistent way to store data on the machine of a Web client in order to be

further accessed by a program running on a server


cookies: usages

Storing user preferences

typical examples: options regarding interaction – visual theme (e.g., color scheme), language preferences, etc.

geographical location, shopping interests


cookies: usages

Automatic form completion

using already entered values for certain fields


cookies: usages

Monitoring the access to a Web resource

aspect of interest: Web analytics

collecting information about clients (hardware platform, browser, screen resolution, etc.)


cookies: usages

Monitoring the access to a Web resource

aspect of interest: user tracking

monitoring the user behavior

Do Not Track initiative – http://donottrack.us/


cookies: usages

Storing the authentication info

e.g., keeping data about the user account in the e-commerce context


cookies: usages

Transaction status

e.g., current state of the virtual shopping cart provided by an e-shop application


cookies: usages

Web session management


cookies: types

Persistent cookies

not destroyed when Web browser closes

kept in a file – client-side

time-to-live set by the cookie creator


cookies: types

Non-persistent (volatile) cookies

disappear when the browser is closed


cookies

A cookie can be considered as a variable

its value is transferred via HTTP between the Web server (back-end application)

and the client (browser)


name=value

the value is a URL-encoded string


cookies

Data about a cookie is received by the browser

a list of cookies for each server (domain)


cookies

A cookie is sent to a client by using the Set-Cookie header field of an HTTP response message


cookies

Set-Cookie: name=value; expires=date; path=path;

domain=Internet-domain; secure


expires – indicates date and time when cookie will expire (Web client should destroy expired cookies)


domain – specifies the hosts to which the cookie will be sent back (by default, only the Web server that generated the cookie)

path – specifies a subset of URLs from the cookie’s domain

distinguishes multiple applications existing on the same server

secure – indicates that the cookie will be sent back to the server only if the communication channel is “secure” (via HTTPS)
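
The attributes above can be assembled into a header with the Python standard library. This is only a sketch: the function build_set_cookie and the sample values (example.org, /shop, the date) are assumptions made for illustration, not part of the course material.

```python
from http.cookies import SimpleCookie


def build_set_cookie(name, value, expires=None, path=None, domain=None, secure=False):
    """Return one 'Set-Cookie: ...' header line for a single cookie."""
    cookie = SimpleCookie()
    cookie[name] = value
    morsel = cookie[name]
    if expires is not None:
        morsel["expires"] = expires   # date string, e.g. an HTTP date
    if path is not None:
        morsel["path"] = path         # subset of URLs the cookie applies to
    if domain is not None:
        morsel["domain"] = domain     # hosts the cookie is sent back to
    if secure:
        morsel["secure"] = True       # send back only over HTTPS
    return cookie.output()            # includes the 'Set-Cookie:' prefix


header = build_set_cookie(
    "color", "green",
    expires="Wed, 01 Jan 2031 00:00:00 GMT",  # hypothetical expiration date
    path="/shop",
    domain="example.org",
    secure=True,
)
print(header)
```

Omitting expires gives a non-persistent cookie, which the browser discards when it closes.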

also, consult Cookiepedia: https://cookiepedia.co.uk/

A cookie is transmitted back from the client to the Web server only if it satisfies all validity conditions

the domain, path, expiration date & time, and communication channel security must all match
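
The validity conditions can be sketched as a single check on the client side. This is a simplified illustration, loosely following the matching rules of RFC 6265; the StoredCookie class and the function names are hypothetical, and real browsers apply further rules.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass
class StoredCookie:
    name: str
    value: str
    domain: str                          # e.g. "example.org"
    path: str = "/"
    expires: Optional[datetime] = None   # None means a non-persistent cookie
    secure: bool = False


def domain_matches(host: str, cookie_domain: str) -> bool:
    """The host equals the cookie domain or is one of its subdomains."""
    host = host.lower()
    cookie_domain = cookie_domain.lower().lstrip(".")
    return host == cookie_domain or host.endswith("." + cookie_domain)


def path_matches(request_path: str, cookie_path: str) -> bool:
    """The cookie path is a prefix of the request path, cut at a '/' boundary."""
    if request_path == cookie_path:
        return True
    if request_path.startswith(cookie_path):
        return cookie_path.endswith("/") or request_path[len(cookie_path)] == "/"
    return False


def should_send(cookie: StoredCookie, scheme: str, host: str,
                path: str, now: datetime) -> bool:
    """True only if every validity condition holds for this request."""
    if cookie.expires is not None and cookie.expires <= now:
        return False                     # expired
    if cookie.secure and scheme != "https":
        return False                     # channel is not secure
    return domain_matches(host, cookie.domain) and path_matches(path, cookie.path)


now = datetime(2030, 1, 1, tzinfo=timezone.utc)
cart = StoredCookie("color", "green", "example.org", "/shop",
                    datetime(2031, 1, 1, tzinfo=timezone.utc), secure=True)
print(should_send(cart, "https", "www.example.org", "/shop/cart", now))
print(should_send(cart, "http", "www.example.org", "/shop/cart", now))
```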

The server will receive, in the header of an HTTP request message, the following:

Cookie: name1=value1; name2=value2...

the list of cookies which satisfy the validity conditions
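
On the server side, the Cookie header value has to be split back into name/value pairs. A small sketch with the Python standard library; parse_cookie_header is a hypothetical helper name and the sample header is invented for the example:

```python
from http.cookies import SimpleCookie


def parse_cookie_header(header_value: str) -> dict:
    """Turn 'name1=value1; name2=value2' into a plain dictionary."""
    jar = SimpleCookie()
    jar.load(header_value)
    return {name: morsel.value for name, morsel in jar.items()}


cookies = parse_cookie_header("color=green; language=en_US")
print(cookies["color"])
print(cookies["language"])
```

Only names and values arrive with the request; attributes such as expires or path stay in the browser.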

Invoking a script consists of returning a representation + placing various cookies

[Diagram: the Web client sends an HTTP request (script invocation) to the Web server; the script answers with an HTTP response carrying Set-Cookie: color=green]

Cookies – persistent or not – are processed and stored by the browser

persistent cookies are stored in files or databases (SQLite)

[Diagram: the Web client now stores the cookie color=green received from the script running on the Web server]

The next access to the script is made by transmitting the cookies to the server, according to the validity conditions

[Diagram: the Web client, which stores color=green, sends an HTTP request carrying Cookie: color=green to the script on the Web server, which answers with an HTTP response]

cookies: creating

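An example for PHP – function setcookie ()

<?php
setcookie ("other_color", "blue"); // non-persistent – why?
// $_COOKIE holds only the cookies received with the current request,
// so the new cookie becomes visible starting with the next request
if (isset ($_COOKIE["other_color"])) {
  echo "A cookie of color " . $_COOKIE["other_color"];
}
?>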

cookies: expiring

Nullifying the value and expiration date; optionally, the other cookie attributes

example – PHP:

<?php
// an empty value asks the browser to delete the cookie named $cookie_name
setcookie ($cookie_name, "", 0, "/", "", 0);
?>
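
At the HTTP level, expiring a cookie amounts to sending it again with an expiration date in the past. A sketch with the Python standard library; build_expiring_cookie and the placeholder value "deleted" are assumptions made for this illustration:

```python
from http.cookies import SimpleCookie


def build_expiring_cookie(name: str, path: str = "/") -> str:
    """Return a Set-Cookie header that tells the browser to drop the cookie."""
    cookie = SimpleCookie()
    cookie[name] = "deleted"                                   # placeholder value
    cookie[name]["path"] = path                                # must match the original path
    cookie[name]["expires"] = "Thu, 01 Jan 1970 00:00:00 GMT"  # date in the past
    cookie[name]["max-age"] = 0                                # lifetime of zero seconds
    return cookie.output()


expire_header = build_expiring_cookie("other_color")
print(expire_header)
```

The path (and domain, if one was set) should be the same as when the cookie was created, otherwise the browser treats it as a different cookie.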

cookies: consulting

Cookies reside in a header field of an HTTP message

HTTP_COOKIE (the variable through which the server exposes the Cookie request header to CGI scripts)

PHP – a cookie is accessed like a variable, via the $_COOKIE associative array

$_COOKIE['cookie_name']

cookies

Other information of interest is available in RFC 6265, HTTP State Management Mechanism

http://tools.ietf.org/html/rfc6265

How can we identify successive requests expressed by the same client instance?

HTTP is a stateless protocol

it cannot tell whether successive requests are received from the same client (from the same instance of a Web browser)

necessity

Preserving certain data for a sequence of related HTTP messages (requests/responses)

examples: shopping cart status, multi-step Web forms, content pagination, user authentication state, etc.

sessions

Each visitor of a Website is assigned a unique identifier – the session ID (SID)

stored in a cookie (e.g., ASP.NET_SessionId, PHPSESSID, session-id, _wp_session)

or propagated via a URL
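
When the SID is propagated via a URL instead of a cookie, every link has to carry it as a query parameter. A minimal sketch in Python; add_session_id is a hypothetical helper, and the URL and the SID value are invented for the example:

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def add_session_id(url: str, sid: str, param: str = "PHPSESSID") -> str:
    """Return the URL with the session ID set as a query parameter."""
    parts = urlsplit(url)
    # keep the existing parameters, dropping any stale session ID
    query = [(key, value)
             for key, value in parse_qsl(parts.query, keep_blank_values=True)
             if key != param]
    query.append((param, sid))
    return urlunsplit(parts._replace(query=urlencode(query)))


print(add_session_id("http://example.org/cart.php?step=2", "abc123"))
```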

in this way, consecutive visits (requests) made by the same user can be identified

Various variables can be attached to a session

their values will be kept (stored) between consecutive, i.e., related, requests from the same instance of a Web client (browser)
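
Conceptually, a session is a server-side dictionary of variables looked up by SID. The sketch below keeps everything in memory; the SessionStore class is hypothetical and ignores expiration and persistent storage.

```python
import secrets


class SessionStore:
    """Maps each session ID to the variables attached to that session."""

    def __init__(self):
        self._sessions = {}

    def create(self) -> str:
        """Start a new session and return its hard-to-guess identifier."""
        sid = secrets.token_hex(16)   # 32 hexadecimal characters
        self._sessions[sid] = {}
        return sid

    def get(self, sid: str):
        """Return the variables of a session, or None for an unknown SID."""
        return self._sessions.get(sid)


store = SessionStore()
sid = store.create()
store.get(sid)["cart"] = ["book"]     # first request: attach a variable
print(store.get(sid)["cart"])         # later request with the same SID
```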

A session can be registered implicitly (automatically) or explicitly (manually, by the programmer), depending on the Web application server or its default configuration

Web session info is persistently stored on the server by using non-relational database systems – e.g., DynamoDB, Memcached, Redis,… – or, in most cases, files

POST / HTTP/1.1
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Encoding: gzip, deflate
Accept-Language: en,en-GB;q=0.5
Connection: keep-alive
Cookie: language=en_US
Host: mail.info.uaic.ro
Referer: http://mail.info.uaic.ro/?_task=login
Upgrade-Insecure-Requests: 1
User-Agent: Mozilla/5.0 … Gecko/20100101 Firefox/51.0

user authentication by using the POST method (already existing cookies are transmitted)

advanced

sessions: example

HTTP/1.1 302 Found
Cache-Control: private, no-cache, no-store, must-revalidate…
Connection: Keep-Alive
Content-Length: 0
Content-Type: text/html; charset=UTF-8
Date: Thu, 23 Feb 2017 10:25:44 GMT
Keep-Alive: timeout=5, max=100
Last-Modified: Thu, 23 Feb 2017 10:25:44 GMT
Location: ./?_task=mail&_token=cb1924…c9c97819
Server: Apache/2.4.6 (CentOS) mod_fcgid/2.3.9 PHP/5.4.16
Set-Cookie: roundcube_sessid=vnqrt4…2uv2; path=/; HttpOnly
Set-Cookie: roundcube_sessauth=S92ee64…2c71; path=/; HttpOnly

<!DOCTYPE html>

HTTP response: redirection after authentication; Web session-related cookies are set

advanced

sessions: programming

In the case of CGI, session management must be entirely implemented by the programmer

there is no standard way for Web session processing
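
A minimal sketch, in Python, of what hand-written session management in a CGI-style script might involve: read HTTP_COOKIE, look up or create the session, and emit Set-Cookie when needed. The names SESSIONS, COOKIE_NAME and handle_request are hypothetical, and the in-memory dictionary stands in for files or a database.

```python
import secrets
from http.cookies import SimpleCookie

SESSIONS = {}          # SID -> session variables (in memory, for illustration only)
COOKIE_NAME = "sid"    # hypothetical name of the session cookie


def handle_request(environ: dict):
    """Process one request; return (extra response headers, body text)."""
    jar = SimpleCookie()
    jar.load(environ.get("HTTP_COOKIE", ""))
    sid = jar[COOKIE_NAME].value if COOKIE_NAME in jar else None

    headers = []
    if sid not in SESSIONS:
        # unknown or missing SID: start a new session and tell the browser
        sid = secrets.token_hex(16)
        SESSIONS[sid] = {"accesses": 0}
        headers.append(("Set-Cookie", f"{COOKIE_NAME}={sid}; Path=/; HttpOnly"))
    else:
        # known SID: the variables attached to the session are still there
        SESSIONS[sid]["accesses"] += 1
    return headers, f"accesses: {SESSIONS[sid]['accesses']}"


first_headers, first_body = handle_request({})
issued_sid = first_headers[0][1].split(";")[0].split("=")[1]
second_headers, second_body = handle_request({"HTTP_COOKIE": f"{COOKIE_NAME}={issued_sid}"})
print(first_body)
print(second_body)
```

The second call carries the cookie issued by the first one, so the same session is found again and no new Set-Cookie header is needed.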

advanced

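PHP – functions: session_start(), session_register() (deprecated in PHP 5.3 and removed in PHP 5.4; use the $_SESSION array instead), session_id(), session_unset(), session_destroy()

<?php
session_start (); // creates a session or resumes the current one
if (!isset ($_SESSION['accesses'])) {
  $_SESSION['accesses'] = 0;   // first request of this session
} else {
  $_SESSION['accesses']++;     // one more access within the same session
}
?>

the accesses variable is attached to the session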

details at http://php.net/manual/en/book.session.php

By using an application server or framework, cookie and session management is simpler

various examples: HttpSessionState class (ASP.NET), HttpSession interface (Java servlets), HTTP::Session (Perl), session (Flask – Python framework), web.session (web.py), HttpFoundation (component of Symfony – PHP framework), SessionComponent class (CakePHP), session array (Ruby on Rails), play.mvc.Http.Cookie (Play! for Java/Scala), sessions (Gorilla – Go), cookie-parser and express-session (Node.js modules for Express)

advanced

alternatives

HTML5 provides Web Storage

W3C recommendation (2015)

browser-level storage for lists of key-value pairs via the sessionStorage and localStorage attributes

for details, study profs.info.uaic.ro/~busaco/teach/courses/cliw/web-film.html#week11

advanced

“conclusion”

⥁from HTTP to cookies and Web sessions

many thanks to Ciprian Amariei, MSc.

next episode: Web programming (Web application servers, Web application architecture)

[Diagram: Web application layers (browser, presentation, processing, data access); <Web/> pages in HTML, CSS,…; fat server, dumb client; frontend, backend]