Topic 9: The World Wide Web

CSE2395/CSE3395 Perl Programming

Camel3 page 878

LWP, lwpcook, CGI manpages

Original Slides by Debbie Pickett, Modified by David Abramson, 2006, Copyright Monash University

In this topic

The World Wide Web

Writing a Perl web client
► LWP module

Dynamic web pages
► Common Gateway Interface (CGI)

The World Wide Web

Developed in 1991 as a mechanism for linking hypertext across the Internet
► documents contain links to other documents

Documents were considered static and stateless
► requesting the same document twice always returned identical copies

Documents were primarily text
► focus was on content, not presentation
► HTML contained some rudimentary markup for formatting

Much of this has now changed

Terminology

Documents are identified with a Uniform Resource Locator/Identifier (URL/URI)
► unique string identifying a document’s location
► http://www.google.com/

Documents are requested and sent using Hypertext Transfer Protocol (HTTP)
► simple text-based file-transfer protocol understood by both ends of a transfer
    – web browser (user agent) (client)
    – web site (server)
► form of responses strongly resembles email messages

Documents are often written in Hypertext Markup Language (HTML)
► text-based, like Rich Text Format (RTF); later reformulated in Extensible Markup Language (XML) as XHTML

Fetching a document by HTTP

[Diagram: sequence over time between the user agent (browser) running on the client and the web server program running on the server, connected across the Internet]

request (client to server):
GET /path/to/document.html

response (server to client):
Content-Type: text/html
(blank line)
contents of document.html
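
The exchange in the diagram can be sketched in Perl by taking a response apart at its first blank line. This is a minimal sketch, not part of the original slides: the sub name parse_response and the sample response text are made up for illustration. The sample also includes a status line, which a real HTTP/1.0 or later response carries before its headers.

```perl
use strict;
use warnings;

# Split a raw HTTP response into status line, headers, and body.
# The header block ends at the first blank line.
sub parse_response {
    my ($raw) = @_;
    my ($head, $body) = split /\r?\n\r?\n/, $raw, 2;
    $body = '' unless defined $body;

    my ($status, @lines) = split /\r?\n/, $head;
    my %header;
    for my $line (@lines) {
        my ($name, $value) = split /:\s*/, $line, 2;
        $header{lc $name} = $value;      # header names are case-insensitive
    }
    return ($status, \%header, $body);
}

# Hypothetical response text, shaped like what a server might send.
my $raw = "HTTP/1.0 200 OK\r\n"
        . "Content-Type: text/html\r\n"
        . "\r\n"
        . "<html><body><p>Hello</p></body></html>\n";

my ($status, $header, $body) = parse_response($raw);
print "Status: $status\n";
print "Type:   $header->{'content-type'}\n";
print "Body:   $body";
```

Lower-casing the header names is a design choice of this sketch so that lookups do not depend on how the server capitalised them.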

User agent

Web browser is a kind of user agent
► initiates HTTP connection to server
► requests document using GET request
► receives response (header and document) from server
► disconnects from server
► decodes headers
► renders document on screen

Any program can be a user agent
► Library for the Web with Perl (LWP) provides helper functions
► use LWP::UserAgent;
► use LWP::Simple;

Timeout

# Fetch a web page with LWP::Simple.
use LWP::Simple;

my $doc = get("http://www.google.com/");
die "Couldn't access document" unless defined $doc;

# Process the document. The /s flag lets a title span several lines.
if ($doc =~ /<title>(.*?)<\/title>/is)
{
    print "Title is $1\n";
}
else
{
    print "Document has no <title> tag\n";
}
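
LWP::Simple's get reports only success or failure. As a hedged sketch of the same task with LWP::UserAgent, the version below also reports the status line and sets a timeout. The sub names title_of and fetch_title and the 10-second timeout are assumptions made for this example. LWP::UserAgent is loaded only when a URL is given on the command line, so the title-extraction part can be tried without network access.

```perl
use strict;
use warnings;

# Pull the <title> text out of an HTML string; /s lets the title span lines.
sub title_of {
    my ($html) = @_;
    return $html =~ m{<title[^>]*>\s*(.*?)\s*</title>}is ? $1 : undef;
}

# Fetch a URL with LWP::UserAgent and return (title, status line).
sub fetch_title {
    my ($url) = @_;
    require LWP::UserAgent;              # loaded only when a fetch is attempted
    my $ua       = LWP::UserAgent->new(timeout => 10);
    my $response = $ua->get($url);
    return (undef, $response->status_line) unless $response->is_success;

    my $html = $response->decoded_content;
    $html = '' unless defined $html;
    return (title_of($html), $response->status_line);
}

# Usage: perl title.pl http://www.google.com/
if (@ARGV) {
    my ($title, $status) = fetch_title($ARGV[0]);
    print "Status: $status\n";
    print defined $title ? "Title is $title\n" : "No title found\n";
}
```

A regular expression is enough for a single tag like this; for anything more structured, a real HTML parser is the safer tool.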

Common Gateway Interface (CGI)

Document served by server is usually a file on disk

Server may instead run a program (“CGI program”) that produces the document
► part of the URL designates the program’s name

Program produces the entire response
► including HTTP header and blank line
► response is sent as-is by server to user agent

Server needs to distinguish between serving a static file and running a program
► two common approaches
    – run anything whose name ends in .cgi
    – run anything in the /cgi-bin directory

Fetching a document by HTTP

[Diagram: sequence over time between the user agent (browser) running on the client, the web server program running on the server, and the program (instance of application) that the server runs]

request (client to server):
POST /cgi-bin/program
form data

server invokes program and passes form data to it

response (program to server):
Content-Type: text/html
(blank line)
result of processing form

server verifies format of response and passes it to client

Writing a CGI program

Read form data
► contents of all form elements on originating web page, if any
► form data found either at end of URL or on standard input
    – depending on whether GET or POST method used
► Perl CGI module facilitates this

Process data

Produce response
► send to standard output
► produce HTTP header
    – Content-Type header mandatory
► produce blank line
► produce body of response
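
The three steps above can be sketched without the CGI module, to show what it does on the program's behalf. This is a minimal sketch under stated assumptions: the sub names raw_form_data and response are made up for this example, and the server is assumed to set the standard CGI environment variables REQUEST_METHOD, QUERY_STRING and CONTENT_LENGTH.

```perl
use strict;
use warnings;

# Step 1: return the raw, still-encoded form data for this request.
# $env is a hash of CGI environment variables; $in is the input handle.
sub raw_form_data {
    my ($env, $in) = @_;
    my $method = uc($env->{REQUEST_METHOD} || 'GET');

    if ($method eq 'POST') {
        # POST: form data arrives on standard input.
        my $length = $env->{CONTENT_LENGTH} || 0;
        my $data   = '';
        read($in, $data, $length) if $length > 0;
        return $data;
    }

    # GET: form data is the part of the URL after the question mark.
    my $query = $env->{QUERY_STRING};
    return defined $query ? $query : '';
}

# Step 3: build the complete response: header, blank line, body.
sub response {
    my ($body) = @_;
    return "Content-Type: text/plain\n\n$body";
}

# Step 2 is trivial here: report how much form data arrived.
my $data = raw_form_data(\%ENV, \*STDIN);
print response("Received " . length($data) . " bytes of form data\n");
```

Passing the environment and input handle as arguments, rather than reading %ENV and STDIN directly inside the sub, makes the reading step easy to try from the command line.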

Installing a CGI program at Monash

Install program in
► $HOME/WWW/cgi-bin/myprogram

Permissions must be set correctly
► cgi-bin and parent directories must be searchable by all
    – home.page.setup
    – chmod a+x ~ ~/WWW ~/WWW/cgi-bin
► program must be readable and executable by you
    – chmod u+rx myprogram

Program is accessible at URL
► http://users.monash.edu.au/~you/cgi-bin/myprogram

Timeout

#!/usr/bin/perl -w

# Generate a static CGI page.

# << notation is a fancy kind of string quoting
# reminiscent of shell here-documents. All text
# between the FLAGs is in the string.
print <<"FLAG";
Content-Type: text/html

<html>
<head><title>Hello</title></head>
<body>
<p>Hello, world!</p>
</body>
</html>
FLAG

Timeout

#!/usr/bin/perl -w

# Generate a CGI page with varying text.

print <<"EOT";
Content-Type: text/html

<html>
<head><title>Date</title></head>
<body>
EOT

# Get date.
chomp($date = `/bin/date`);
print "<p>The date is <b>$date</b></p>\n";

print "</body></html>\n";

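Forms

[Screenshot of a web form: a text field labelled "What is your species?", a text field labelled "What is your preferred language?" pre-filled with "Thai", and a "Go" button]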

Form data

Form data is text entered into web page in HTML <INPUT>, <SELECT> and <TEXTAREA> tags

<FORM>
<INPUT type="text" name="species">
<INPUT type="text" name="language" value="Thai">
<INPUT type="submit" name="x" value="Go">
</FORM>

Form data is submitted by browser in HTTP request
► each parameter and its value
► species=human&language=English&x=Go

Perl CGI module includes param function which extracts parameters’ values
► use CGI ("param");
► param("species") # "human"
► param("language") # "English"
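
To show roughly what param has to do with the submitted string, here is a hand-rolled decoder. It is an illustrative sketch, not the CGI module's implementation: the sub names url_decode and parse_form are made up, and repeated parameter names are not handled beyond keeping the last value.

```perl
use strict;
use warnings;

# Decode one URL-encoded component: '+' means space, %XX is a byte in hex.
sub url_decode {
    my ($text) = @_;
    $text =~ tr/+/ /;
    $text =~ s/%([0-9A-Fa-f]{2})/chr(hex($1))/ge;
    return $text;
}

# Split "name=value&name=value" into a hash of decoded names and values.
# If a name repeats, this sketch keeps only the last value.
sub parse_form {
    my ($data) = @_;
    my %param;
    for my $pair (split /[&;]/, $data) {
        next if $pair eq '';
        my ($name, $value) = split /=/, $pair, 2;
        $value = '' unless defined $value;
        $param{ url_decode($name) } = url_decode($value);
    }
    return %param;
}

my %param = parse_form("species=human&language=English&x=Go");
print "species:  $param{species}\n";     # species:  human
print "language: $param{language}\n";    # language: English
```

In a real program the CGI module's param is the better choice, since it also covers multi-valued fields and file uploads.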

Timeout

# Process form data and produce a response.

use CGI qw(param);

# Get parameters.
$kind = param("species");
$tongue = param("language");

print <<"EOT";
Content-Type: text/html

<html>
<head><title>Greetings</title></head>
<body>
<h1>Greetings, $kind!</h1>
<p>Do you speak $tongue?</p>
</body>
</html>
EOT

HTML shortcuts

Printing raw HTML can make source code difficult to read

CGI module provides helper functions for generating HTML tags
► markup and form generation
► without shortcut: print "<h1>Heading</h1>";
► with shortcut: print h1("Heading");

Need to import helper functions
► use CGI qw(h1 h2 p b em table ...);
► use CGI qw(:standard);

Timeout

# Using HTML shortcuts.

use CGI qw(:standard);

# Get parameters.
$kind = param("species");
$tongue = param("language");

print header(),
    start_html("Greetings"),
    h1("Greetings, $kind!"),
    p("Do you speak $tongue?"),
    end_html();

Keeping state

HTTP is a stateless protocol
► each connection is independent

Often want to present several pages to user in sequence
► e.g., shopping cart

Several solutions
► use a hidden parameter
    – <INPUT type="hidden">
► use cookies
    – CGI module’s cookie function
► put state information in URL
    – requires support from web server
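
The cookie approach can be sketched by hand to show the headers involved. This is not the CGI module's cookie function: the sub names set_cookie_header and parse_cookies are made up for this example, cookie values are assumed to be simple tokens that need no encoding, and the server is assumed to pass the request's Cookie header in the HTTP_COOKIE environment variable.

```perl
use strict;
use warnings;

# Build the header line that asks the user agent to remember a value.
sub set_cookie_header {
    my ($name, $value) = @_;
    return "Set-Cookie: $name=$value; Path=/";
}

# Parse a Cookie request header of the form "a=1; b=2".
sub parse_cookies {
    my ($text) = @_;
    my %cookie;
    return %cookie unless defined $text;
    for my $pair (split /;\s*/, $text) {
        my ($name, $value) = split /=/, $pair, 2;
        next unless defined $name && length $name;
        $cookie{$name} = defined $value ? $value : '';
    }
    return %cookie;
}

# Decide which page to show from the cookie sent with this request.
my %cookie = parse_cookies($ENV{HTTP_COOKIE});
my $page   = exists $cookie{state} ? $cookie{state} : 'question';

# The Set-Cookie line is part of the header, so it precedes the blank line.
print set_cookie_header(state => 'result'), "\n";
print "Content-Type: text/plain\n\n";
print "Showing page: $page\n";
```

Anything stored in a cookie comes back from the user agent and can be altered there, so it needs the same checking as any other form data.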

Timeout

use CGI qw(:standard);

$page = param("state");
print header();

if (!defined $page)
{
    print start_html("Question"),
        start_form(),
        p("What is your species?", textfield("species")),
        p("Use what language?", textfield("language", "Thai")),
        p(submit("x", "Go")),
        hidden("state", "result"),
        end_form(),
        end_html();
}
elsif ($page eq "result")
{
    # Get parameters submitted from the question page.
    $kind = param("species");
    $tongue = param("language");

    print start_html("Greetings"),
        h1("Greetings, $kind!"),
        p("Do you speak $tongue?"),
        end_html();
}

CGI security

CGI security is very important
► CGI programs are run on local host
    – as your user ID
    – in your directories
► connections initiated from user agents worldwide
    – strangers can’t be trusted!
► HTTP requests can be hand-crafted to exploit security holes

Always check form data for correctness
► correct values
► correct combination of parameters

Never let error conditions provide hints about implementation
► error messages that are helpful during debugging are also helpful to crackers
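
As a hedged illustration of these rules, the sketch below checks the species and language parameters from the earlier form, escapes anything it echoes into HTML, and gives the same uninformative message whatever went wrong. The sub names escape_html and valid_form and the specific rules (letters, spaces and hyphens, at most 40 characters) are assumptions made for this example.

```perl
use strict;
use warnings;

# Replace the characters that are special in HTML with entities.
sub escape_html {
    my ($text) = @_;
    $text =~ s/&/&amp;/g;
    $text =~ s/</&lt;/g;
    $text =~ s/>/&gt;/g;
    $text =~ s/"/&quot;/g;
    return $text;
}

# Accept only the expected combination of parameters, each with an
# allowed value. The pattern below is an assumption for this example.
sub valid_form {
    my (%param) = @_;
    for my $name (qw(species language)) {
        my $value = $param{$name};
        return 0 unless defined $value;
        return 0 unless $value =~ /\A[A-Za-z][A-Za-z -]{0,39}\z/;
    }
    return 1;
}

# Hypothetical hand-crafted request: the language field carries markup.
my %param = (species => 'human', language => '<script>alert(1)</script>');

if (valid_form(%param)) {
    print "Greetings, ", escape_html($param{species}), "!\n";
}
else {
    # Say nothing about which check failed or how it is implemented.
    print "Sorry, that form could not be processed.\n";
}
```

The detailed reason for a rejection can still go to a log file that only the program's owner can read.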

Further reading

LWP, lwpcook manpages

CGI manpage

Learning Perl 2nd edition, chapter 19
► not in 3rd edition

CGI Programming with Perl
► Scott Guelich, Shishir Gundavaram, Gunther Birznieks, O’Reilly 2000

Perl Cookbook
► Tom Christiansen & Nathan Torkington, O’Reilly 1st edition 1998, 2nd edition 2003

Covered in this topic

Writing a Perl web client
► LWP::Simple module

Dynamic web pages
► Common Gateway Interface (CGI)
► forms
► keeping state

Going further

LWP::UserAgent
► full object-oriented interface to Perl web user agent

HTML::Parser and XML::Parser
► tools for processing HTML and XML

GD
► module to create images on the fly

Tainting
► dealing with insecure data
► Camel3 pages 558-568
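
As a small taste of tainting: when Perl runs in taint mode (perl -T), data from outside the program, such as command-line arguments, is marked tainted and cannot be used in risky operations until it has been passed through a regular-expression capture. The sketch below shows that pattern. The sub name untaint_name and the allowed pattern are assumptions for this example, and the script runs with or without -T.

```perl
use strict;
use warnings;
use Scalar::Util qw(tainted);

# Return a clean copy of a name, or undef if it looks unsafe.
# Under taint mode, a value captured by a regex match is untainted.
sub untaint_name {
    my ($value) = @_;
    return undef unless defined $value;
    return $value =~ /\A([A-Za-z0-9_]{1,32})\z/ ? $1 : undef;
}

# Try: perl -T untaint.pl myprogram
my $input = defined $ARGV[0] ? $ARGV[0] : 'myprogram';
print "input is ", (tainted($input) ? "tainted" : "not tainted"), "\n";

my $name = untaint_name($input);
print defined $name ? "accepted: $name\n" : "rejected\n";
```

The pattern should describe what is allowed rather than what is forbidden; a capture such as (.*) would untaint everything and defeat the purpose.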

Next topic

References
► Perl’s answer to pointers

Nested data structures
► multi-dimensional arrays
► emulating C structs

perlref, perlreftut, perllol, perldsc manpages

Copyright

Perl Programming lecture notes Copyright © 2000-2004 Deborah Pickett. Reproduction of this presentation for nonprofit study use is permitted. All other reproduction must be authorized in writing by the author.