Topic 9: The World Wide Web
CSE2395/CSE3395 Perl Programming
Camel3 page 878
LWP, lwpcook, CGI manpages
Original Slides by Debbie Pickett, Modified by David Abramson, 2006, Copyright Monash University
In this topic
The World Wide Web
Writing a Perl web client
► LWP module
Dynamic web pages
► Common Gateway Interface (CGI)
The World Wide Web
Developed in 1991 as a mechanism for linking hypertext across the Internet
► documents contain links to other documents
Documents were considered static and stateless
► requesting the same document twice always returned identical copies
Documents were primarily text
► focus was on content, not presentation
► HTML contained some rudimentary markup for formatting
Much of this has now changed
Terminology
Documents are identified with a Uniform Resource Locator/Identifier (URL/URI)
► unique string identifying a document’s location
► http://www.google.com/
Documents are requested and sent using Hypertext Transfer Protocol (HTTP)
► simple text-based file-transfer protocol understood by both ends of a transfer
– web browser (user agent) (client)
– web site (server)
► form of responses strongly resembles email messages
Documents are often written in Hypertext Markup Language (HTML)
► text-based, like Rich Text Format (RTF); later reformulated in Extensible Markup Language (XML) as XHTML
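To make the parts of a URL concrete, here is a minimal sketch that splits an http or https URL with a regular expression. The helper name split_url is invented for this example, and the pattern is deliberately simple: it does not separate out userinfo, query strings or fragments, which a full URI parser would.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Split an http/https URL into scheme, host, optional port, and path.
# split_url is a hypothetical helper; it does not separate out userinfo,
# query strings or fragments.
sub split_url {
    my ($url) = @_;
    my ($scheme, $host, $port, $path) =
        $url =~ m{^(https?)://([^/:]+)(?::(\d+))?(/.*)?$}i
        or return;
    # Scheme and host are case-insensitive; an absent path means "/".
    return (lc $scheme, lc $host, $port, defined $path ? $path : "/");
}

my ($scheme, $host, $port, $path) = split_url("http://www.google.com/");
print "scheme=$scheme host=$host path=$path\n";
```

The host tells the user agent which server to connect to; the path is what it asks that server for.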
Fetching a document by HTTP
[Diagram: timeline of an HTTP exchange across the Internet between the user agent (browser) running on the client and the web server program running on the server]
request: GET /path/to/document.html
response: Content-Type: text/html, a blank line, then the contents of document.html
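The exchange in the diagram can be sketched as plain text handling. The diagram abbreviates the messages; in this sketch the request also carries a protocol version and a Host header, and the response starts with a status line. The function names build_request and parse_response and the canned response are assumptions made for illustration; nothing is sent over the network.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Build the text of a minimal HTTP/1.0 GET request.
sub build_request {
    my ($host, $path) = @_;
    return "GET $path HTTP/1.0\r\n"
         . "Host: $host\r\n"
         . "\r\n";
}

# Split a raw response into status line, header hash, and body.
# The first blank line separates the headers from the document.
sub parse_response {
    my ($raw) = @_;
    my ($head, $body) = split /\r?\n\r?\n/, $raw, 2;
    my ($status, @lines) = split /\r?\n/, $head;
    my %header;
    for my $line (@lines) {
        my ($name, $value) = $line =~ /^([^:]+):\s*(.*)$/ or next;
        $header{lc $name} = $value;    # header names are case-insensitive
    }
    return ($status, \%header, defined $body ? $body : "");
}

# A canned response standing in for what a server might send.
my $raw = "HTTP/1.0 200 OK\r\n"
        . "Content-Type: text/html\r\n"
        . "\r\n"
        . "<html><body>contents of document.html</body></html>\n";

my ($status, $header, $body) = parse_response($raw);
print build_request("www.example.com", "/path/to/document.html");
print "Status: $status\n";
print "Type:   $header->{'content-type'}\n";
print "Body:   $body";
```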
User agent
Web browser is a kind of user agent
► initiates HTTP connection to server
► requests document using GET request
► receives response (header and document) from server
► disconnects from server
► decodes headers
► renders document on screen
Any program can be a user agent
► LWP (libwww-perl, the World-Wide Web library for Perl) provides helper functions
► use LWP::UserAgent;
► use LWP::Simple;
Timeout
# Fetch a web page with LWP::Simple.
use LWP::Simple;

my $doc = get("http://www.google.com/");
die "Couldn't access document" unless defined $doc;

# Process the document.
# /s lets . match newlines, in case the title spans several lines.
if ($doc =~ /<title>(.*?)<\/title>/is)
{
    print "Title is $1\n";
}
else
{
    print "Document has no <title> tag\n";
}
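One possible refinement of the title check is to move it into a function, so it can be tried on a literal string without a network connection. The helper name extract_title is invented for this sketch. LWP::Simple is loaded only when a URL is given on the command line, so the first part runs even where the module is not installed.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Return the text of the first <title> element, or undef if there is none.
# extract_title is a hypothetical helper name.
sub extract_title {
    my ($html) = @_;
    return unless defined $html;
    my ($title) = $html =~ m{<title[^>]*>(.*?)</title>}is or return;
    $title =~ s/\s+/ /g;     # collapse newlines and runs of spaces
    $title =~ s/^ | $//g;    # trim a leading or trailing space
    return $title;
}

# Try it on a literal document first, so no network is needed.
my $sample = "<html><head><TITLE>\n  Hello,\n  world\n</TITLE></head></html>";
print "Title is ", extract_title($sample), "\n";

# With a URL on the command line, fetch the page with LWP::Simple.
# The module is loaded at run time so the demo above works without it.
if (@ARGV) {
    require LWP::Simple;
    my $doc = LWP::Simple::get($ARGV[0]);
    die "Couldn't access $ARGV[0]\n" unless defined $doc;
    my $title = extract_title($doc);
    print defined $title ? "Title is $title\n" : "Document has no <title> tag\n";
}
```

A regular expression is adequate for a quick check like this; for anything more involved, a real HTML parser is the safer choice.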
Common Gateway Interface (CGI)
Document served by server is usually a file on disk
Server may instead run a program (“CGI program”) that produces the document
► part of the URL designates the program’s name
Program produces the entire response
► including HTTP header and blank line
► response is sent as-is by server to user agent
Server needs to distinguish between serving a static file and running a program
► two common approaches
– run anything ending in .cgi
– run anything in the /cgi-bin directory
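The two approaches above can be sketched as a single test on the URL path. This is only an illustration of the idea: is_cgi_path is a hypothetical name, and real web servers make these rules part of their configuration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Decide whether a URL path names a CGI program or a static file,
# using the two conventions listed above. is_cgi_path is a
# hypothetical name; actual servers make this configurable.
sub is_cgi_path {
    my ($path) = @_;
    $path =~ s/\?.*//s;                         # drop any query string
    return 1 if $path =~ m{\.cgi$};             # anything ending in .cgi
    return 1 if $path =~ m{(?:^|/)cgi-bin/};    # anything under a cgi-bin directory
    return 0;
}

for my $path ("/index.html", "/~you/cgi-bin/myprogram", "/tools/search.cgi?q=perl") {
    printf "%-28s %s\n", $path, is_cgi_path($path) ? "run program" : "serve file";
}
```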
Fetching a document by HTTP
[Diagram: timeline of an HTTP exchange between the user agent (browser) running on the client, the web server program running on the server, and the program (an instance of the application)]
request: POST /cgi-bin/program, followed by the form data
► server invokes program and passes form data to it
response: Content-Type: text/html, a blank line, then the result of processing the form
► server verifies format of response and passes it to client
Writing a CGI program
Read form data
► contents of all form elements on originating web page, if any
► form data found either at end of URL or on standard input
– depending on whether GET or POST method used
► Perl CGI module facilitates this
Process data
Produce response
► send to standard output
► produce HTTP header
– Content-Type header mandatory
► produce blank line
► produce body of response
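To show what the CGI module takes care of in the "read form data" step, here is a hand-written sketch using only core Perl. It relies on the standard CGI environment variables REQUEST_METHOD, QUERY_STRING and CONTENT_LENGTH; the function names are invented for this example, and the request is simulated so that the script runs outside a web server. In a real program the CGI module remains the better choice.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Decode one URL-encoded component: + means space, %XX is a byte in hex.
sub url_decode {
    my ($text) = @_;
    $text =~ tr/+/ /;
    $text =~ s/%([0-9A-Fa-f]{2})/chr(hex($1))/ge;
    return $text;
}

# Turn "a=1&b=2" into a hash. Repeated names keep only the last value,
# a simplification that the CGI module does not share.
sub parse_form {
    my ($encoded) = @_;
    my %form;
    for my $pair (split /[&;]/, $encoded) {
        my ($name, $value) = split /=/, $pair, 2;
        next unless defined $name && length $name;
        $form{url_decode($name)} = url_decode(defined $value ? $value : "");
    }
    return %form;
}

# Fetch the raw form data the way the server hands it to a CGI program:
# in QUERY_STRING for GET, on standard input for POST.
sub read_form_data {
    my $method = $ENV{REQUEST_METHOD} || "GET";
    if ($method eq "POST") {
        my $length = $ENV{CONTENT_LENGTH} || 0;
        my $data = "";
        read(STDIN, $data, $length) if $length > 0;
        return $data;
    }
    return defined $ENV{QUERY_STRING} ? $ENV{QUERY_STRING} : "";
}

# Simulate a GET request so the sketch runs outside a web server.
$ENV{REQUEST_METHOD} = "GET";
$ENV{QUERY_STRING}   = "species=human&language=Middle+English&x=Go";

my %form = parse_form(read_form_data());
print "$_ = $form{$_}\n" for sort keys %form;
```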
Installing a CGI program at Monash
Install program in
► $HOME/WWW/cgi-bin/myprogram
Permissions must be set correctly
► cgi-bin and parent directories must be searchable by all
– home.page.setup
– chmod a+x ~ ~/WWW ~/WWW/cgi-bin
► program must be readable and executable by you
– chmod u+rx myprogram
Program is accessible at URL
► http://users.monash.edu.au/~you/cgi-bin/myprogram
Timeout
#!/usr/bin/perl -w

# Generate a static CGI page.

# << notation is a fancy kind of string quoting
# reminiscent of shell here-documents. All text
# between the FLAGs is in the string.
print <<"FLAG";
Content-Type: text/html

<html>
<head><title>Hello</title></head>
<body>
<p>Hello, world!</p>
</body>
</html>
FLAG
Timeout
#!/usr/bin/perl -w

# Generate a CGI page with varying text.

print <<"EOT";
Content-Type: text/html

<html>
<head><title>Date</title></head>
<body>
EOT

# Get date.
chomp(my $date = `/bin/date`);
print "<p>The date is <b>$date</b></p>\n";

print "</body></html>\n";
Forms
[Rendered form: a text field labelled "What is your species?", a text field labelled "What is your preferred language?" pre-filled with "Thai", and a "Go" button]
Form data
Form data is text entered into web page in HTML <INPUT>, <SELECT> and <TEXTAREA> tags
<FORM>
<INPUT type="text" name="species">
<INPUT type="text" name="language" value="Thai">
<INPUT type="submit" name="x" value="Go">
</FORM>
Form data is submitted by browser in HTTP request
► each parameter and its value
► species=human&language=English&x=Go
Perl CGI module includes param function which extracts parameters’ values
► use CGI ("param");
► param("species") # "human"
► param("language") # "English"
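The string the browser submits can be reproduced with a few lines of core Perl. This is a sketch of the usual form encoding, not the code a browser runs: url_encode and encode_form are names invented here, and values are assumed to be byte strings (for example, text already encoded as UTF-8).

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Percent-encode one name or value for URL-encoded form data.
# Letters, digits and a few punctuation marks pass through; space becomes +.
# Assumes the input is a byte string.
sub url_encode {
    my ($text) = @_;
    $text =~ s/([^A-Za-z0-9_.~ -])/sprintf("%%%02X", ord($1))/ge;
    $text =~ tr/ /+/;
    return $text;
}

# Join name/value pairs in the order given, as a browser would on submit.
# Takes a flat list (name, value, name, value, ...) to preserve field order.
sub encode_form {
    my @fields = @_;
    my @pairs;
    while (my ($name, $value) = splice(@fields, 0, 2)) {
        push @pairs, url_encode($name) . "=" . url_encode($value);
    }
    return join "&", @pairs;
}

print encode_form(species => "human", language => "English", x => "Go"), "\n";
print encode_form(species => "cat & dog", language => "Thai"), "\n";
```

The second call shows why encoding matters: a literal & inside a value would otherwise be read as the start of a new parameter.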
Timeout
#!/usr/bin/perl -w

# Process form data and produce a response.

use CGI qw(param);

# Get parameters.
my $kind = param("species");
my $tongue = param("language");

# Form values are inserted unescaped here; see the CGI security slide.
print <<"EOT";
Content-Type: text/html

<html>
<head><title>Greetings</title></head>
<body>
<h1>Greetings, $kind!</h1>
<p>Do you speak $tongue?</p>
</body>
</html>
EOT
HTML shortcuts
Printing raw HTML can make source code difficult to read
CGI module provides helper functions for generating HTML tags
► markup and form generation
► without shortcut: print "<h1>Heading</h1>";
► with shortcut: print h1("Heading");
Need to import helper functions
► use CGI qw(h1 h2 p b em table ...);
► use CGI qw(:standard);
Timeout
# Using HTML shortcuts.

use CGI qw(:standard);

# Get parameters.
my $kind = param("species");
my $tongue = param("language");

print header(),
      start_html("Greetings"),
      h1("Greetings, $kind!"),
      p("Do you speak $tongue?"),
      end_html();
Keeping state
HTTP is a stateless protocol
► each connection is independent
Often want to present several pages to user in sequence
► e.g., shopping cart
Several solutions
► use a hidden parameter
– <INPUT type="hidden">
► use cookies
– CGI module’s cookie function
► put state information in URL
– requires support from web server
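As an illustration of the cookie approach, the sketch below keeps a visit counter in the browser. It handles the headers by hand with core Perl to show the mechanism that the CGI module's cookie function wraps; the cookie name "visits" and both function names are invented for this example, and the incoming cookie string is simulated. A server passes the browser's Cookie header to a CGI program in the HTTP_COOKIE environment variable.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Parse a Cookie header value of the form "a=1; b=2" into a hash.
sub parse_cookies {
    my ($raw) = @_;
    my %cookie;
    for my $pair (split /;\s*/, defined $raw ? $raw : "") {
        my ($name, $value) = split /=/, $pair, 2;
        next unless defined $name && length $name;
        $cookie{$name} = defined $value ? $value : "";
    }
    return %cookie;
}

# A visit counter kept entirely in the browser's cookie.
# "visits" is a cookie name invented for this sketch. The value is
# checked to be a number before use, since the client controls it.
sub next_visit_header {
    my ($raw_cookie) = @_;
    my %cookie = parse_cookies($raw_cookie);
    my $visits = ($cookie{visits} || "") =~ /^(\d+)$/ ? $1 + 1 : 1;
    return ("Set-Cookie: visits=$visits; Path=/", $visits);
}

# Simulate the second request of a session.
my ($set_cookie, $visits) = next_visit_header("theme=dark; visits=1");
print "$set_cookie\n";
print "Content-Type: text/html\n\n";
print "<html><body><p>Visit number $visits</p></body></html>\n";
```

The Set-Cookie line is part of the HTTP header, so it must be printed before the blank line that ends the header.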
Timeout
use CGI qw(:standard);

my $page = param("state");
print header();

if (!defined $page)
{
    # First visit: show the question form.
    print start_html("Question"),
          start_form(),
          p("What is your species?", textfield("species")),
          p("Use what language?", textfield("language", "Thai")),
          p(submit("x", "Go")),
          hidden("state", "result"),
          end_form(),
          end_html();
}
elsif ($page eq "result")
{
    # Form was submitted: fetch the answers and greet the user.
    my $kind = param("species");
    my $tongue = param("language");
    print start_html("Greetings"),
          h1("Greetings, $kind!"),
          p("Do you speak $tongue?"),
          end_html();
}
CGI security
CGI security is very important
► CGI programs are run on local host
– as your user ID
– in your directories
► connections initiated from user agents worldwide
– strangers can’t be trusted!
► HTTP requests can be hand-crafted to exploit security holes
Always check form data for correctness
► correct values
► correct combination of parameters
Never let error conditions provide hints about implementation
► error messages that are helpful during debugging are also helpful to crackers
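A minimal sketch of these rules, applied to the greeting example: check each value against a pattern of what is allowed, escape it before placing it in HTML, and keep the error message uninformative. The function names and the validation rule (1 to 40 letters, spaces or hyphens) are assumptions chosen for this example, not something the slides specify.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Escape the characters that are special in HTML so that form data
# is displayed as text instead of being interpreted as markup.
sub escape_html {
    my ($text) = @_;
    my %entity = ('&' => '&amp;', '<' => '&lt;', '>' => '&gt;',
                  '"' => '&quot;', "'" => '&#39;');
    $text =~ s/([&<>"'])/$entity{$1}/g;
    return $text;
}

# Accept a value only if it is present and matches an allowed pattern.
# The rule used here (1-40 letters, spaces or hyphens) is an assumption
# chosen for this example.
sub valid_word {
    my ($value) = @_;
    return defined $value && $value =~ /\A[A-Za-z][A-Za-z -]{0,39}\z/;
}

for my $species ("human", "<script>alert(1)</script>", undef) {
    if (valid_word($species)) {
        print "<h1>Greetings, ", escape_html($species), "!</h1>\n";
    }
    else {
        # Deliberately vague: no hint about which check failed or why.
        print "<p>Sorry, that request could not be processed.</p>\n";
    }
}
```

Escaping is still applied after validation, so the output stays safe even if the validation rule is later relaxed.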
Further reading
LWP, lwpcook manpages
CGI manpage
Learning Perl 2nd edition, chapter 19
► not in 3rd edition
CGI Programming with Perl
► Scott Guelich, Shishir Gundavaram, Gunther Birznieks, O’Reilly 2000
Perl Cookbook
► Tom Christiansen & Nathan Torkington, O’Reilly 1st edition 1998, 2nd edition 2003
Covered in this topic
Writing a Perl web client
► LWP::Simple module
Dynamic web pages
► Common Gateway Interface (CGI)
► forms
► keeping state
Going further
LWP::UserAgent
► full object-oriented interface to Perl web user agent
HTML::Parser and XML::Parser
► tools for processing HTML and XML
GD
► module to create images on the fly
Tainting
► dealing with insecure data
► Camel3 pages 558-568
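As a small taste of tainting: when Perl runs in taint mode (perl -T), data from outside the program cannot be used in commands or file operations until it has been checked, and capturing part of a string with a regular expression is how a value is marked as checked. The sketch below shows only that checking step, so it runs with or without -T. The helper name untaint_filename and the rule for what counts as a safe file name are assumptions made for this example.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Return a checked copy of a file name, or undef if it looks unsafe.
# Under perl -T, the captured substring is considered untainted.
# untaint_filename is a hypothetical helper for this sketch.
sub untaint_filename {
    my ($name) = @_;
    return unless defined $name;
    # Allow only a plain file name: no slashes, no leading dot.
    my ($clean) = $name =~ /\A([A-Za-z0-9_][A-Za-z0-9_.-]{0,63})\z/ or return;
    return $clean;
}

for my $input ("report.txt", "../../etc/passwd", "notes;rm -rf") {
    my $file = untaint_filename($input);
    print defined $file ? "ok:       $file\n" : "rejected: $input\n";
}
```

The pattern lists what is allowed instead of trying to list what is dangerous, which is the safer way round.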
Next topic
References
► Perl’s answer to pointers
Nested data structures
► multi-dimensional arrays
► emulating C structs
perlref, perlreftut, perllol, perldsc manpages
Copyright
Perl Programming lecture notes Copyright © 2000-2004 Deborah Pickett. Reproduction of this presentation for nonprofit study use is permitted. All other reproduction must be authorized in writing by the author.