Perl: Lecture 2 Advanced RE & CGI. Regular Expressions 2.
Perl: Lecture 2
Advanced RE & CGI
Regular Expressions 2
Advanced Pattern Matching
Anchor Metacharacters
– ^ Beginning of String
– $ End of String
"housekeeper" =~ /keeper/; # matches
"housekeeper" =~ /^keeper/; # no match
"housekeeper" =~ /keeper$/; # matches
"housekeeper\n" =~ /keeper$/; # matches
"housekeeper" =~ /^housekeeper$/; # matches
Character classes
"[ ]" notation
"-" range of characters
"^" negated
/[bc]at/; # matches 'bat', 'cat'
"abc" =~ /[cab]/; # matches 'a'
/[\]c]def/; # matches ']def' or 'cdef'
/item[0-9]/; # matches 'item0' or ... or 'item9'
/[0-9a-fA-F]/; # matches a hexadecimal digit
/[^0-9]/; # matches a non-numeric character
Common Character Classes
\d  digit           [0-9]
\s  whitespace      [\ \t\r\n\f]
\w  word character  [0-9a-zA-Z_]
\D  negated \d      [^0-9]
\S  negated \s      [^\s]
\W  negated \w      [^\w]
.   any character but "\n"
\b  word boundary   (between \w\W or \W\w)
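The classes above can be tried out with a short script. This is only a sketch; the sample string and the variable names are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical sample text for illustration
my $line = "Order 42 shipped to room B12 on 2003-05-17";

# \d+ : the first run of digits in the string
my ($first_number) = $line =~ /(\d+)/;

# \b(\d+)\b : digit runs bounded by word boundaries;
# the "12" in "B12" is skipped because there is no boundary between "B" and "1"
my @whole_numbers = $line =~ /\b(\d+)\b/g;

# [^\s\w] : characters that are neither whitespace nor word characters
my @punctuation = $line =~ /([^\s\w])/g;

print "first: $first_number\n";
print "whole: @whole_numbers\n";
print "punct: @punctuation\n";
```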
Alternation & Grouping
"|" alternation operator
"( )" grouping
m//i modifier
– Case-insensitive match
"cats and dogs" =~ /dog|cat|bird/; # matches "cat"
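/house(cat|keeper)/; # matches "housecat" or "housekeeper"
"20" =~ /(19|20|)\d\d/; # matches via the empty alternative, because "20\d\d" cannot match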
Extracting Matches
Grouped patterns can be extracted
$time =~ /(\d\d):(\d\d):(\d\d)/;
$hours = $1;
$minutes = $2;
$seconds = $3;
# or, in list context, in a single statement:
($hours, $minutes, $seconds) = ($time =~ /(\d\d):(\d\d):(\d\d)/);
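A minimal sketch of the list-context form, wrapped in a helper so that a failed match can be detected. The name parse_time and the sample strings are assumptions made for this example.

```perl
use strict;
use warnings;

# Hypothetical helper: returns (hours, minutes, seconds), or an empty list if no time is found
sub parse_time {
    my ($text) = @_;
    my @parts = $text =~ /(\d\d):(\d\d):(\d\d)/;
    return @parts;
}

my @found = parse_time("Backup finished at 23:05:59");
if (@found) {
    my ($hours, $minutes, $seconds) = @found;
    print "hours=$hours minutes=$minutes seconds=$seconds\n";
}

my @missing = parse_time("no time here");
print "no match\n" unless @missing;
```

Checking the returned list before using it avoids working with stale $1, $2, $3 values left over from an earlier successful match.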
Matching repetitions
Quantifier Metacharacters
– ? 0 or 1 times
– * 0 or more times
– + 1 or more times
– {n,m} at least n, at most m times
– {n,} at least n times
– {n} exactly n times
/\w+/; # matches a word
$year =~ /^(\d{4}|\d{2})$/; # 2 or 4 digit years (the anchors reject 3 or 5 digits)
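The quantifiers can be compared side by side in a small script. The sample strings below are invented for illustration.

```perl
use strict;
use warnings;

# + : one or more word characters; /g collects every match
my @words = "Time to feed the cat!" =~ /(\w+)/g;

# ? : the "u" may appear zero or one time, so "colouur" does not match
my @spellings = "color colour colouur" =~ /\b(colou?r)\b/g;

# {n} with anchors: accept exactly 2 or exactly 4 digits, nothing else
my @years = grep { /^(\d{4}|\d{2})$/ } ('99', '1999', '199', '19999');

print scalar(@words), " words: @words\n";
print "spellings: @spellings\n";
print "valid years: @years\n";
```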
Search and Replace
s/regex/replacement/ syntax
– replaces only the first occurrence
– s///g modifier replaces all occurrences
$x = "Time to feed the cat!";
$x =~ s/cat/dog/; # $x is now "Time to feed the dog!"
$y =~ s/^'(.*)'$/$1/; # strips surrounding single quotes
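A short sketch of the difference between s/// and s///g. The sample sentence is made up; the copy-then-substitute idiom is used so the original string stays available for comparison.

```perl
use strict;
use warnings;

my $x = "the cat sat on the cat mat";

# Copy $x, then substitute in the copy: only the first "cat" changes
(my $first_only = $x) =~ s/cat/dog/;

# With /g every occurrence changes
(my $all = $x) =~ s/cat/dog/g;

# s///g returns the number of substitutions it made
my $count = ($x =~ s/cat/dog/g);

# Strip surrounding single quotes, keeping the captured content
my $y = "'quoted'";
$y =~ s/^'(.*)'$/$1/;

print "$first_only\n$all\n$count replacements\n$y\n";
```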
RE functions
quotemeta EXPR
– escapes all RE metacharacters
split /PATTERN/, EXPR
– splits a string at a delimiter specified by PATTERN
pos SCALAR
– returns the offset at which the last m//g match on SCALAR ended
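The three functions can be sketched together as follows. All strings are hypothetical sample data.

```perl
use strict;
use warnings;

# quotemeta: make a literal string safe to interpolate into a pattern
my $price = '$9.99 (sale)';
my $safe  = quotemeta $price;
my $found = ("Now only \$9.99 (sale)!" =~ /$safe/) ? 1 : 0;

# split: the delimiter is itself a pattern (a comma with optional whitespace)
my @fields = split /\s*,\s*/, "apple , banana,cherry";

# pos: offset just past the end of each m//g match
my $dna = "ATGCATGA";
my @ends;
while ($dna =~ /AT/g) {
    push @ends, pos($dna);
}

print "escaped: $safe\n";
print "found: $found\n";
print "fields: @fields\n";
print "match ends: @ends\n";
```

Without quotemeta, the "$", "." and parentheses in the price would be treated as pattern syntax rather than literal characters.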
Time functions
time
– Returns the number of seconds since January 1, 1970
localtime
– Converts a time to human-readable format
($sec,$min,$hour,$mday,$mon,$year,$wday,$yday,$isdst) = localtime(time);
gmtime
– Does the same on the basis of GMT
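A small sketch of turning the returned list into a timestamp. Note that $year counts years since 1900 and $mon runs from 0 to 11, so both need adjusting. The fixed value 0 is used here only so that the result is predictable; a real script would pass time.

```perl
use strict;
use warnings;

# 0 seconds since the epoch, chosen so the output is the same on every machine
my $epoch = 0;

my ($sec, $min, $hour, $mday, $mon, $year, $wday, $yday, $isdst) = gmtime($epoch);

# $year is years since 1900, $mon is zero-based
my $stamp = sprintf "%04d-%02d-%02d %02d:%02d:%02d",
    $year + 1900, $mon + 1, $mday, $hour, $min, $sec;

print "epoch 0 in GMT: $stamp\n";

# In scalar context localtime returns a ready-made string for the local time zone
my $now = scalar localtime(time);
print "now: $now\n";
```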
CGI
What is CGI?
Common Gateway Interface
Not a programming language
Method by which the web server invokes a program on a user's request
Method of exchanging data between a user, a web server and the program
CGI Functionality
CGI Program invoked when its URL is requested
Parameters / Data Included in the request
Setting up the web server for Perl CGI
Example: Microsoft Windows, Apache HTTPD
– HTTPD is preconfigured for scripts
– http://host/cgi-bin
– #! line required (e.g. #!C:\perl\bin\perl)
– Possibly delete the Windows file type association
CGI Data Exchange overview
[Diagram: User, Web Server and Perl Script; a Request (GET / POST) goes in, a Response comes back]
Simple CGI program
#!/usr/bin/perl
print "Content-type: text/html\n\n";
print "<html><head></head><body>";
print "<h1>This is CGI</h1>";
print "</body></html>";
CGI Environment Variables
HTTP_ACCEPT="*/*"
HTTP_ACCEPT_ENCODING="gzip, deflate"
HTTP_ACCEPT_LANGUAGE="en-us"
HTTP_CONNECTION="Keep-Alive"
HTTP_HOST="localhost"
HTTP_USER_AGENT="Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.0)"
QUERY_STRING=""
REMOTE_ADDR="127.0.0.1"
REMOTE_PORT="3039"
REQUEST_METHOD="GET"
REQUEST_URI="/cgi-bin/printenv.pl"
SCRIPT_NAME="/cgi-bin/printenv.pl"
SERVER_ADDR="127.0.0.1"
SERVER_ADMIN="[email protected]"
SERVER_NAME="ba35002"
SERVER_PORT="80"
SERVER_PROTOCOL="HTTP/1.1"
SERVER_SIGNATURE="<ADDRESS>Apache/1.3.27 Server at ba35002 Port 80</ADDRESS>\n"
SERVER_SOFTWARE="Apache/1.3.27 (Win32) PHP/4.2.3"
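A listing like the one above can be produced by a script that walks %ENV. This is a sketch of such a script, not the printenv.pl that ships with Apache; the helper name env_lines is an assumption.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: format name/value pairs as NAME="value", sorted by name
sub env_lines {
    my (%env) = @_;
    return map { "$_=\"$env{$_}\"" } sort keys %env;
}

# The header and blank line must come before any other output
print "Content-type: text/plain\n\n";
print "$_\n" for env_lines(%ENV);
```

Run from the command line, it prints the shell environment; run through the web server, it also shows the CGI variables set for the request.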
Generating the response
Own result
Content-type: text/html
print "Content-type: $mime_type\n\n";
Redirection
Location: http://www.domain.com/newpage
print "Location: $url\n\n";
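The two kinds of response can be sketched in one script. The helper names and the $moved flag are hypothetical; a real script would decide based on its own logic.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helpers that build the header block; the blank line ends the headers
sub content_header  { my ($mime_type) = @_; return "Content-type: $mime_type\n\n"; }
sub redirect_header { my ($url)       = @_; return "Location: $url\n\n"; }

# Assumed flag for illustration: set to 1 to send the browser elsewhere
my $moved = 0;

if ($moved) {
    # Redirection: no body is needed
    print redirect_header("http://www.domain.com/newpage");
}
else {
    # Own result: header first, then the document
    print content_header("text/plain");
    print "Hello from CGI\n";
}
```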
Getting client data to the server
Usually from an HTML form
Two methods defined in HTTP
– GET
  Data in request URL
  Accessible by the CGI script using the QUERY_STRING env var
– POST
  Data after the request headers
  Accessible via STDIN, length given by the CONTENT_LENGTH env var
Both methods use URL-encoding
– Fields separated by &, values after = sign
– Space -> +
– Special characters -> %xx
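The encoding rules can be sketched with two small functions. The names url_encode and url_decode are assumptions, and the sketch handles single-byte characters only. It relies on the /e modifier, which evaluates the replacement part of s/// as Perl code.

```perl
use strict;
use warnings;

# Hypothetical helper: "+" becomes a space, %xx becomes the character with that hex code
sub url_decode {
    my ($s) = @_;
    $s =~ tr/+/ /;
    $s =~ s/%([0-9a-fA-F]{2})/chr(hex($1))/ge;
    return $s;
}

# Hypothetical helper: everything outside a small safe set becomes %XX, spaces become "+"
sub url_encode {
    my ($s) = @_;
    $s =~ s/([^-A-Za-z0-9_.~ ])/sprintf("%%%02X", ord($1))/ge;
    $s =~ tr/ /+/;
    return $s;
}

my $encoded = url_encode("hello there & more");
my $decoded = url_decode("hello+there%21");

print "$encoded\n";
print "$decoded\n";
```

Note the order in url_decode: "+" is translated before the %xx step, so an encoded plus sign (%2B) survives as a literal "+".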
Where is the data? HTML Forms (1)
<form action="URL" method="get/post">
</form>
<input type="submit" value="label" />
<input type="reset" value="label" />
<input name="name" [type="password"] />
Where is the data? HTML Forms (2)
<textarea name="name"></textarea>
<select name="top5" [multiple]>
<option value="value">Label</option>
</select>
<input type="radio" name="name" value="value" /> Label
<input type="checkbox" name="name" value="value" /> Label
<input type="hidden" name="name" value="value" />
GET / POST examples
GET example
GET /cgi-bin/script.pl?in=hello+there&button=Send HTTP/1.0

POST example
POST /cgi-bin/script.pl HTTP/1.0
Content-type: application/x-www-form-urlencoded
Content-length: 26

in=hello+there&button=Send
Tasks to work with data
Find out if GET or POST is used
Split into single fields
URL-decode
Send HTTP header w/ MIME-type & data
Used for nearly every CGI script, so already built into Perl!
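To show what these tasks involve when done by hand, here is a sketch without any module. The function names are hypothetical, and it omits error handling and multi-byte character support.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Task 1: find out whether GET or POST is used and fetch the raw data
sub read_form_data {
    my $method = $ENV{REQUEST_METHOD} || 'GET';
    my $raw = '';
    if ($method eq 'POST') {
        read(STDIN, $raw, $ENV{CONTENT_LENGTH} || 0);
    }
    elsif (defined $ENV{QUERY_STRING}) {
        $raw = $ENV{QUERY_STRING};
    }
    return $raw;
}

# Tasks 2 and 3: split into fields and URL-decode names and values
sub parse_form_data {
    my ($raw) = @_;
    my %field;
    for my $pair (split /&/, $raw) {
        next unless length $pair;
        my ($name, $value) = split /=/, $pair, 2;
        $value = '' unless defined $value;
        for ($name, $value) {
            tr/+/ /;
            s/%([0-9a-fA-F]{2})/chr(hex($1))/ge;
        }
        # A field may occur several times (e.g. a multiple select), so keep a list
        push @{ $field{$name} }, $value;
    }
    return %field;
}

# Task 4: send the header, then the data
my %form = parse_form_data(read_form_data());
print "Content-type: text/plain\n\n";
for my $name (sort keys %form) {
    print "$name -> @{ $form{$name} }\n";
}
```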
CGI.pm example
use CGI qw(:standard);
print header();
@names = param();
foreach $name (@names) {
    my @value = param($name);
    print "$name -> @value<br>\n";
}
Error 500?
Internal Server Error
Script behaved incorrectly
See Web Server Log for more details
– apache\logs\error.log
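One possible way to make such failures easier to diagnose is to catch errors inside the script, write them to STDERR (which Apache records in its error log) and still send a valid response. This is a sketch under that assumption; run_safely and the sample handlers are hypothetical names.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical wrapper: runs a code reference that returns an HTML body
sub run_safely {
    my ($handler) = @_;
    my $body;
    my $ok = eval { $body = $handler->(); 1 };
    if (!$ok) {
        # warn writes to STDERR, which the web server puts into its error log
        warn "CGI script failed: $@";
        return "Content-type: text/plain\n\nSorry, something went wrong.\n";
    }
    return "Content-type: text/html\n\n$body";
}

# A handler that works and one that dies, for illustration
my $good = run_safely(sub { return "<h1>This is CGI</h1>" });
my $bad  = run_safely(sub { die "could not open data file\n" });

print $good, "\n";
```

A missing or malformed Content-type header is itself a common cause of error 500, so the wrapper always emits one.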