Post on 22-Oct-2020
Web Application Development CS 228
Web Development CS 303
Fall 2015
numangift.wordpress.com/web-development-spring-2015
May 14th, 2015
May 16th, 2015
Lecture 11 & 12
PHP - III
Regular Expressions
What is form validation?
• validation: ensuring that a form's values are correct
• some types of validation:
• preventing blank values (email address)
• ensuring the type of values
• integer, real number, currency, phone number, Social Security number, postal address, email address, date, credit card number, ...
• ensuring the format and range of values (ZIP code must be a 5-digit integer)
• ensuring that values fit together (user types email twice, and the two must match)
A real form that uses validation
Client vs. server-side validation
Validation can be performed:
• client-side (before the form is submitted)
  • can lead to a better user experience, but not secure (why not?)
• server-side (in PHP code, after the form is submitted)
  • needed for truly secure validation, but slower
• both
  • best mix of convenience and security, but requires most effort to program
An example form to be validated
City:
State:
ZIP:
HTML
• Let's validate this form's data on the server...
output
Recall: Basic server-side validation
$city = $_POST["city"];
$state = $_POST["state"];
$zip = $_POST["zip"];
if (!$city || strlen($state) != 2 || strlen($zip) != 5) {
print "Error, invalid city/state/zip submitted.";
} PHP
• basic idea: examine parameter values, and if they are bad, show an error message and abort. But:
• How do you test for integers vs. real numbers vs. strings?
• How do you test for a valid credit card number?
• How do you test that a person's name has a middle initial?
• (How do you test whether a given string matches a particular complex format?)
Regular expressions
/^[a-zA-Z_\-]+@(([a-zA-Z_\-])+\.)+[a-zA-Z]{2,4}$/
• regular expression ("regex"): a description of a pattern of text
• can test whether a string matches the expression's pattern
• can use a regex to search/replace characters in a string
• regular expressions are extremely powerful but tough to read
(the above regular expression matches many simple email addresses, but not all valid ones; for example, it rejects digits and dots before the @)
• regular expressions occur in many places:
• Java: Scanner, String's split method (CSE 143 sentence generator)
• supported by PHP, JavaScript, and other languages
• many text editors (TextPad) allow regexes in search/replace
• The site Rubular is useful for testing a regex.
http://rubular.com/
Regular expressions
This picture best describes regex. [picture not included in this text version]
Basic regular expressions
/abc/
• in PHP, regexes are strings that begin and end with /
• the simplest regexes simply match a particular substring
• the above regular expression matches any string containing "abc":
• YES: "abc", "abcdef", "defabc", ".=.abc.=.", ...
• NO: "fedcba", "ab c", "PHP", ...
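As a minimal sketch of the rule above, the following PHP runs the slide's YES and NO strings through preg_match with /abc/. The helper name contains_abc is made up for this illustration.

```php
<?php
# Hypothetical helper: true when $text contains the substring "abc".
function contains_abc(string $text): bool {
    # preg_match returns 1 on a match and 0 when there is no match
    return preg_match("/abc/", $text) === 1;
}

foreach (["abc", "abcdef", "defabc", "fedcba", "ab c", "PHP"] as $text) {
    print $text . ": " . (contains_abc($text) ? "YES" : "NO") . "\n";
}
```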
Wildcards: .
• A dot . matches any character except a \n line break
• /.oo.y/ matches "Doocy", "goofy", "LooNy", ...
• A trailing i at the end of a regex (after the closing /) signifies a case-insensitive match
• /all/i matches "Allison Obourn", "small", "JANE GOODALL", ...
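The wildcard and the i flag can be tried out with a short sketch like the one below. The matches helper is a hypothetical wrapper, not something from the slides.

```php
<?php
# Hypothetical helper: wraps preg_match so it returns a plain boolean.
function matches(string $regex, string $text): bool {
    return preg_match($regex, $text) === 1;
}

var_dump(matches('/.oo.y/', "goofy"));        # true: each dot matches one character
var_dump(matches('/.oo.y/', "oozy"));         # false: no character before "oo"
var_dump(matches('/all/', "JANE GOODALL"));   # false: the match is case-sensitive
var_dump(matches('/all/i', "JANE GOODALL"));  # true: i makes it case-insensitive
```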
Special characters: |, (), \
• | means OR
• /abc|def|g/ matches "abc", "def", or "g"
• There's no AND symbol. Why not?
• () are for grouping
• /(Homer|Marge) Simpson/ matches "Homer Simpson" or "Marge Simpson"
• \ starts an escape sequence
• many characters must be escaped to match them literally: / \ $ . [ ] ( ) ^ * + ?
• /<br \/>/ matches lines containing <br /> tags
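A small sketch of alternation, grouping, and escaping in PHP follows. The matches helper is hypothetical; preg_quote is PHP's built-in function for escaping special characters in a literal string.

```php
<?php
# Hypothetical helper: wraps preg_match so it returns a plain boolean.
function matches(string $regex, string $text): bool {
    return preg_match($regex, $text) === 1;
}

# Grouping limits the OR to the first name.
var_dump(matches('/(Homer|Marge) Simpson/', "Marge Simpson"));  # true
var_dump(matches('/(Homer|Marge) Simpson/', "Bart Simpson"));   # false

# Without parentheses the pattern means "Homer" OR "Marge Simpson".
var_dump(matches('/Homer|Marge Simpson/', "Homer Flanders"));   # true

# An escaped dot matches only a literal dot.
var_dump(matches('/3\.14/', "3.14"));   # true
var_dump(matches('/3\.14/', "3514"));   # false

# preg_quote escapes the special characters of a literal string for us.
$regex = "/" . preg_quote("a+b?", "/") . "/";
var_dump(matches($regex, "a+b?"));      # true
```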
Quantifiers: *, +, ?
• * means 0 or more occurrences
• /abc*/ matches "ab", "abc", "abcc", "abccc", ...
• /a(bc)*/ matches "a", "abc", "abcbc", "abcbcbc", ...
• /a.*a/ matches "aa", "aba", "a8qa", "a!?xyz__9a", ...
• + means 1 or more occurrences
• /Hi!+ there/ matches "Hi! there", "Hi!!! there", ...
• /a(bc)+/ matches "abc", "abcbc", "abcbcbc", ...
• ? means 0 or 1 occurrences
• /a(bc)?/ matches "a" or "abc"
More quantifiers: {min,max}
• {min,max} means between min and max occurrences (inclusive)
• /a(bc){2,4}/ matches "abcbc", "abcbcbc", or "abcbcbcbc"
• min or max may be omitted to specify any number
• {2,} means 2 or more
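• {0,6} means up to 6 (write the 0 explicitly; PHP has traditionally treated {,6} as four literal characters rather than a quantifier)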
• {3} means exactly 3
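To see how much text each quantifier consumes, the sketch below returns the part of the string that matched, using the optional third argument of preg_match. The helper name matched_text is an assumption made for this example.

```php
<?php
# Hypothetical helper: the matched portion of $text, or null if nothing matched.
function matched_text(string $regex, string $text): ?string {
    if (preg_match($regex, $text, $found) === 1) {
        return $found[0];   # element 0 holds the text matched by the whole pattern
    }
    return null;
}

var_dump(matched_text('/abc*/', "xabcccd"));             # "abccc"
var_dump(matched_text('/a(bc)*/', "abcbcb"));            # "abcbc"
var_dump(matched_text('/a(bc)+/', "a"));                 # NULL: needs at least one "bc"
var_dump(matched_text('/a(bc)?/', "abcbc"));             # "abc": at most one "bc"
var_dump(matched_text('/a(bc){2,4}/', "abcbcbcbcbcbc")); # "abcbcbcbc": stops at 4 repeats
var_dump(matched_text('/x{3}/', "xxxxx"));               # "xxx"
```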
Practice exercise
• When you search Google, it shows the number of pages of results as "o"s in the word "Google". What regex matches strings like "Google", "Gooogle", "Goooogle", ...? (try it) (data)
• Answer: /Goo+gle/ (or /Go{2,}gle/)
http://rubular.com/r/4cf1WVnL4A
http://courses.cs.washington.edu/courses/cse154/14sp/lectures/slides/notes/regex-google.txt
Anchors: ^ and $
• ^ represents the beginning of the string or line; $ represents the end
• /Jess/ matches all strings that contain Jess; /^Jess/ matches all strings that start with Jess; /Jess$/ matches all strings that end with Jess; /^Jess$/ matches the exact string "Jess" only
• /^Alli.*Obourn$/ matches "AlliObourn", "Allie Obourn", "Allison E Obourn", ... but NOT "Allison Obourn stinks" or "I H8 Allison Obourn"
• (on the other slides, when we say, /PATTERN/ matches "text", we really mean that it matches any string that contains that text)
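The four Jess patterns can be compared side by side with a sketch like this one. It uses PHP's built-in preg_grep to keep only the matching strings; the helper name filter_matching and the sample strings are invented for illustration.

```php
<?php
# Hypothetical helper: the strings from $candidates that match $regex.
function filter_matching(string $regex, array $candidates): array {
    # preg_grep keeps the original array keys, so reindex the result
    return array_values(preg_grep($regex, $candidates));
}

$candidates = ["Jess", "Jessica", "Hi Jess", "Hi Jessica"];
foreach (['/Jess/', '/^Jess/', '/Jess$/', '/^Jess$/'] as $regex) {
    print $regex . " => " . implode(", ", filter_matching($regex, $candidates)) . "\n";
}
```

Only the fully anchored pattern rejects every string except "Jess" itself, which is usually what form validation needs.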
Character sets: []
• [] group characters into a character set; will match any single character from the set
• /[bcd]art/ matches strings containing "bart", "cart", and "dart"
• equivalent to /(b|c|d)art/ but shorter
• inside [], many of the special characters (such as * and ?) act as normal characters
• /what[!*?]*/ matches "what", "what!", "what?**!", "what??!", ...
• What regular expression matches DNA (strings of A, C, G, or T)?
• /[ACGT]+/
Character ranges: [start-end]
• inside a character set, specify a range of characters with -
• /[a-z]/ matches any lowercase letter
• /[a-zA-Z0-9]/ matches any lower- or uppercase letter or digit
• an initial ^ inside a character set negates it
• /[^abcd]/ matches any character other than a, b, c, or d
• inside a character set, - must be escaped to be matched
• /[+\-]?[0-9]+/ matches an optional + or -, followed by at least one digit
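A brief sketch of character sets, ranges, and negation in PHP follows. The function names are hypothetical, and the patterns add ^ and $ anchors so that the whole string must fit the format.

```php
<?php
# Hypothetical helper: true when $text consists only of the letters A, C, G, T.
function is_dna(string $text): bool {
    return preg_match('/^[ACGT]+$/', $text) === 1;
}

# Hypothetical helper: optional sign followed by at least one digit.
function is_signed_integer(string $text): bool {
    return preg_match('/^[+\-]?[0-9]+$/', $text) === 1;
}

var_dump(is_dna("GATTACA"));        # true
var_dump(is_dna("GATTACA!"));       # false
var_dump(is_signed_integer("-42")); # true
var_dump(is_signed_integer("4-2")); # false

# A negated set removes everything that is not a digit.
print preg_replace('/[^0-9]/', "", "(206) 555-0100") . "\n";   # 2065550100
```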
Practice Exercises
• What regular expression matches letter grades such as A, B+, or D- ? (try it) (data)
http://rubular.com/r/pI1rvXzxz9
http://courses.cs.washington.edu/courses/cse154/14sp/lectures/slides/notes/regex-lettergrades.txt
• What regular expression would match UW Student ID numbers? (try it) (data)
http://rubular.com/r/ORmAOalILn
http://courses.cs.washington.edu/courses/cse154/14sp/lectures/slides/notes/regex-uwstudentid.txt
• What regular expression would match a sequence of only consonants, assuming that the string consists only of lowercase letters? (try it) (data)
http://rubular.com/r/cvL9smgzmH
http://courses.cs.washington.edu/courses/cse154/14sp/lectures/slides/notes/regex-consonants.txt
Escape sequences
• special escape sequence character sets:
• \d matches any digit (same as [0-9]); \D any non-digit ([^0-9])
• \w matches any word character (same as [a-zA-Z_0-9]); \W any non-word char
• \s matches any whitespace character (space, \t, \n, etc.); \S any non-whitespace
• What regular expression matches names in a "Last, First M." format with any number of spaces?
• /\w+,\s+\w+\s+\w\./
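The name pattern above can be checked with a sketch like the following. The function name is made up, and the ^ and $ anchors are an addition so that the whole string has to fit the format. The pattern is written in single quotes so that PHP passes the backslashes through to the regex engine unchanged.

```php
<?php
# Hypothetical helper: true for names written as "Last, First M."
function is_last_first_middle(string $name): bool {
    return preg_match('/^\w+,\s+\w+\s+\w\.$/', $name) === 1;
}

var_dump(is_last_first_middle("Obourn, Allison E."));       # true
var_dump(is_last_first_middle("Obourn,   Allison    E."));  # true: \s+ allows extra spaces
var_dump(is_last_first_middle("Obourn,Allison E."));        # false: \s+ needs at least one space
var_dump(is_last_first_middle("Allison Obourn"));           # false
```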
Regular expressions in PHP (PDF)
• regex syntax: strings that begin and end with /, such as "/[AEIOU]+/"
function | description
preg_match(regex, string) | returns 1 if string matches regex, 0 if it does not (FALSE on error)
preg_replace(regex, replacement, string) | returns a new string with all substrings that match regex replaced by replacement
preg_split(regex, string) | returns an array of strings from given string broken apart using given regex as delimiter (like explode but more powerful)
http://www.php.net/pcre
http://www.phpguru.org/downloads/PCRE Cheat Sheet/PHP PCRE Cheat Sheet.pdf
http://www.php.net/manual/en/reference.pcre.pattern.syntax.php
http://www.php.net/preg-match
http://www.php.net/preg-replace
http://www.php.net/preg-split
PHP form validation w/ regexes
$state = $_POST["state"];
if (!preg_match("/^[A-Z]{2}$/", $state)) {
print "Error, invalid state submitted.";
} PHP
• preg_match and regexes help you to validate parameters
• sites often don't want to give a descriptive error message here (why?)
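Applied to the city/state/ZIP form from earlier, the same idea might look like the sketch below. The function names and the exact rules (two uppercase letters, five digits) are assumptions for illustration; a real page would pass $_POST instead of a literal array.

```php
<?php
# Hypothetical helper: read one form parameter as a string ("" if missing).
function text_param(array $params, string $name): string {
    $value = $params[$name] ?? "";
    return is_string($value) ? $value : "";
}

# Returns the names of the fields that failed validation (empty array = all valid).
function validate_address(array $params): array {
    $errors = [];
    if (trim(text_param($params, "city")) === "") {
        $errors[] = "city";
    }
    if (!preg_match('/^[A-Z]{2}$/', text_param($params, "state"))) {
        $errors[] = "state";
    }
    if (!preg_match('/^[0-9]{5}$/', text_param($params, "zip"))) {
        $errors[] = "zip";
    }
    return $errors;
}

# A literal array stands in for $_POST here.
print_r(validate_address(["city" => "Seattle", "state" => "wa", "zip" => "98195"]));
```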
Regular expression PHP example
# replace vowels with stars
$str = "the quick brown fox";
$str = preg_replace("/[aeiou]/", "*", $str);
# "th* q**ck br*wn f*x"
# break apart into words
$words = preg_split("/[ ]+/", $str);
# ("th*", "q**ck", "br*wn", "f*x")
# capitalize words that had 2+ consecutive vowels
for ($i = 0; $i < count($words); $i++) {
if (preg_match("/\\*{2,}/", $words[$i])) {
$words[$i] = strtoupper($words[$i]);
}
} # ("th*", "Q**CK", "br*wn", "f*x") PHP
Practice exercise
Use regular expressions to add validation to the turnin form shown in previous lectures.
• The student name must not be blank and must contain a first and last name (two words).
• The student ID must be a seven-digit integer.
• The assignment must be a string such as "hw1" or "hw6".
• The section must be a two-letter uppercase string representing a valid section such as AF or BK.
• The email address must follow a valid general format such as user@example.com.
• The course must be one of "142", "143", or "154" exactly.
Handling invalid data
function check_valid($regex, $param) {
  # treat a missing parameter as an empty string so preg_match never receives null
  $value = $_POST[$param] ?? "";
  if (preg_match($regex, $value)) {
    return $value;
  } else {
    # code to run if the parameter is invalid
    die("Bad $param");
  }
}
...
$sid = check_valid("/^[0-9]{7}$/", "studentid");
$section = check_valid("/^[AB][A-C]$/i", "section"); PHP
• Having a common helper function to check parameters is useful.
• If your page needs to show a particular HTML output on errors, the die function may not be appropriate.
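One possible alternative to die, sketched here under the assumption that the page wants to list every problem in its own HTML, is to collect error messages in an array. The name check_param and the sample input are hypothetical.

```php
<?php
# Hypothetical variant of a checking helper: records an error instead of stopping the script.
function check_param(array $params, string $name, string $regex, array &$errors): ?string {
    $value = $params[$name] ?? "";
    if (is_string($value) && preg_match($regex, $value) === 1) {
        return $value;
    }
    $errors[] = "Bad $name";   # remember the problem and keep going
    return null;
}

$errors = [];
$input = ["studentid" => "1234567", "section" => "ZZ"];   # stand-in for $_POST
$sid = check_param($input, "studentid", '/^[0-9]{7}$/', $errors);
$section = check_param($input, "section", '/^[AB][A-C]$/i', $errors);

# The page can now decide how to present the problems.
foreach ($errors as $message) {
    print "<p>" . htmlspecialchars($message) . "</p>\n";
}
```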
Regular expressions in HTML forms
How old are you?
HTML
output
• HTML5 adds a new pattern attribute to input elements
• the browser will refuse to submit the form unless the value matches the regex
http://www.w3schools.com/html/html5_form_attributes.asp
Cookies
Stateful client/server interaction
Sites like amazon.com seem to "know who I am." How do they do this? How does a client uniquely identify itself to a server, and how does the server provide specific content to each client?
• HTTP is a stateless protocol; it simply allows a browser to request a single document from a web server
• today we'll learn about pieces of data called cookies used to work around this problem, which are used as the basis of higher-level sessions between clients and servers
What is a cookie?
• cookie: a small amount of information sent by a server to a browser, and then sent back by the browser on future page requests
• cookies have many uses:
• authentication
• user tracking
• maintaining user preferences, shopping carts, etc.
• a cookie's data consists of a single name/value pair, sent in the header of the client's HTTP GET or POST request
http://en.wikipedia.org/wiki/HTTP_cookie
How cookies are sent
• when the browser requests a page, the server may send back a cookie(s) with it
• if your server has previously sent any cookies to the browser, the browser will send them back on subsequent requests
• alternate model: client-side JavaScript code can set/get cookies
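To make the exchange concrete, here is a rough sketch of the two HTTP header lines involved: the server's Set-Cookie response header and the Cookie request header that the browser sends back. Both functions are hypothetical simplifications written for this example; PHP itself does this work through setcookie and $_COOKIE.

```php
<?php
# Hypothetical: the response header line a server sends to set one cookie.
function set_cookie_header(string $name, string $value): string {
    return "Set-Cookie: " . $name . "=" . rawurlencode($value);
}

# Hypothetical: turn a request's Cookie header value ("a=1; b=2") into an array.
function parse_cookie_header(string $header): array {
    $cookies = [];
    foreach (explode(";", $header) as $pair) {
        $parts = explode("=", trim($pair), 2);
        if (count($parts) === 2 && $parts[0] !== "") {
            $cookies[$parts[0]] = rawurldecode($parts[1]);
        }
    }
    return $cookies;
}

print set_cookie_header("username", "allllison") . "\n";
print_r(parse_cookie_header("username=allllison; age=19"));
```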
Myths about cookies
• Myths:
  • Cookies are like worms/viruses and can erase data from the user's hard disk.
  • Cookies are a form of spyware and can steal your personal information.
  • Cookies generate popups and spam.
  • Cookies are only used for advertising.
• Facts:
  • Cookies are only data, not program code.
  • Cookies cannot erase or read information from the user's computer.
  • Cookies are usually anonymous (do not contain personal information).
  • Cookies CAN be used to track your viewing habits on a particular site.
A "tracking cookie"
• an advertising company can put a cookie on your machine when you visit one site, and see it when you visit another site that also uses that advertising company
• therefore they can tell that the same person (you) visited both sites
• can be thwarted by telling your browser not to accept "third-party cookies"
Where are the cookies on my computer?
• IE: HomeDirectory\Cookies
  • e.g. C:\Documents and Settings\jsmith\Cookies
  • each is stored as a .txt file similar to the site's domain name
• Chrome: C:\Users\username\AppData\Local\Google\Chrome\User Data\Default
• Firefox: HomeDirectory\.mozilla\firefox\???.default\cookies.txt
  • view cookies in Firefox preferences: Privacy, Show Cookies...
How long does a cookie exist?
• session cookie: the default type; a temporary cookie that is stored only in the browser's memory
  • when the browser is closed, temporary cookies will be erased
  • cannot be used for tracking long-term information
  • safer, because no programs other than the browser can access them
• persistent cookie: one that is stored in a file on the browser's computer
  • can track long-term information
  • potentially less secure, because users (or programs they run) can open cookie files, see/change the cookie values, etc.
Setting a cookie in PHP
setcookie("name", "value"); PHP
setcookie("username", "allllison");
setcookie("age", 19); PHP
• setcookie causes your script to send a cookie to the user's browser
• setcookie must be called before any output statements (HTML blocks, print, or echo)
• you can set multiple cookies (20-50) per user, each up to 3-4K bytes
• by default, the cookie expires when browser is closed (a "session cookie")
http://php.net/setcookie
Retrieving information from a cookie
$variable = $_COOKIE["name"]; # retrieve value of the cookie
if (isset($_COOKIE["username"])) {
$username = $_COOKIE["username"];
print("Welcome back, $username.\n");
} else {
print("Never heard of you.\n");
}
print("All cookies received:\n");
print_r($_COOKIE); PHP
• any cookies sent by client are stored in the $_COOKIE associative array
• use isset function to see whether a given cookie name exists
http://php.net/isset
What cookies have been set?
• Chrome: F12 → Resources → Cookies; Firefox: F12 → Cookies
Expiration / persistent cookies
setcookie("name", "value", expiration); PHP
$expireTime = time() + 60*60*24*7; # 1 week from now
setcookie("CouponNumber", "389752", $expireTime);
setcookie("CouponValue", "100.00", $expireTime); PHP
• to set a persistent cookie, pass a third parameter for when it should expire
• indicated as an integer Unix timestamp (seconds since January 1, 1970 UTC), usually computed relative to the current time
• if no expiration passed, cookie is a session cookie; expires when browser is closed
• time function returns the current time in seconds
• date function can convert a time in seconds to a readable date
http://php.net/time
http://php.net/manual/en/function.date.php
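The expiration arithmetic can be sketched as follows. The helper expires_in_days is hypothetical; it takes the current time as a parameter so the calculation is easy to check, and date is used to print the result in readable form.

```php
<?php
# Hypothetical helper: the timestamp $days days after $now (both in seconds).
function expires_in_days(int $days, int $now): int {
    return $now + $days * 24 * 60 * 60;
}

$expireTime = expires_in_days(7, time());   # 1 week from now
print "Cookie would expire at: " . date("Y-m-d H:i:s", $expireTime) . "\n";

# In a real page, before any output:
# setcookie("CouponNumber", "389752", $expireTime);
```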
Deleting a cookie
setcookie("name", FALSE); PHP
setcookie("CouponNumber", FALSE); PHP
• setting the cookie to FALSE erases it
• you can also set the cookie but with an expiration that is before the present time:
setcookie("count", 42, time() - 1); PHP
• remember that the cookie will also be deleted automatically when it expires, or can be deleted manually by the user by clearing their browser cookies
Clearing cookies in your browser
• Chrome: Wrench → History → Clear all browsing data...
• Firefox: Firefox menu → Options → Privacy → Show Cookies... → Remove (All) Cookies
Cookie scope and attributes
setcookie("name", "value", expire, "path", "domain", secure, httponly);
• a given cookie is associated only with one particular domain (e.g. www.example.com)
• you can also specify a path URL to indicate that the cookie should only be sent on certain subsets of pages within that site (e.g. /users/accounts/ will bind to www.example.com/users/accounts)
• a cookie can be specified as Secure to indicate that it should only be sent when using HTTPS secure requests
• a cookie can be specified as HTTP Only to indicate that it should be accessible only through HTTP/HTTPS requests and hidden from client-side JavaScript (seen later); this is to help avoid JavaScript security attacks
Common cookie bugs
• When you call setcookie, the cookie will be available in $_COOKIE on the next page load, but not the current one. If you need the value during the current page request, also store it in a variable:
setcookie("name", "joe");
print $_COOKIE["name"]; # undefined PHP
$name = "joe";
setcookie("name", $name);
print $name; # joe PHP
• setcookie must be called before your code prints any output or HTML content; otherwise the HTTP headers have already been sent and the cookie cannot be set.
Session
What is a session?
• session: an abstract concept to represent a series of HTTP requests and responses between a specific Web browser and server
  • HTTP doesn't support the notion of a session, but PHP does
• sessions vs. cookies:
  • a cookie is data stored on the client
  • a session's data is stored on the server (only 1 session per client)
• sessions are often built on top of cookies:
  • the only data the client stores is a cookie holding a unique session ID
  • on each page request, the client sends its session ID cookie, and the server uses this to find and retrieve the client's session data
How sessions are established
• client's browser makes an initial request to the server
• server notes client's IP address/browser, stores some local session data, and sends a session ID back to client (as a cookie)
• client sends that same session ID (cookie) back to server on future requests
• server uses session ID cookie to retrieve its data for the client's session later (like a ticket given at a coat-check room)
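The steps above can be imitated with a plain array standing in for the server's session storage. This is only a sketch of the idea, not how PHP implements sessions internally; open_session and the $store array are hypothetical names.

```php
<?php
# In-memory stand-in for the server's session storage: session ID => session data.
$store = [];

# Hypothetical: returns the session ID to use for a request.
# $idFromCookie is the ID the browser sent, or null on a first visit.
function open_session(array &$store, ?string $idFromCookie): string {
    if ($idFromCookie !== null && isset($store[$idFromCookie])) {
        return $idFromCookie;              # returning client: reuse its session
    }
    $id = bin2hex(random_bytes(16));       # new client: create a hard-to-guess ID
    $store[$id] = [];
    return $id;                            # the server would send this back as a cookie
}

# First request: the browser has no session cookie yet.
$id = open_session($store, null);
$store[$id]["name"] = "joe";

# Later request: the browser sends the same ID back (the "coat-check ticket").
$sameId = open_session($store, $id);
print $store[$sameId]["name"] . "\n";      # joe
```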
Cookies vs. sessions
• duration: sessions live on until the user logs out or closes the browser; cookies can live that long, or until a given fixed timeout (persistent)
• data storage location: sessions store data on the server (other than a session ID cookie); cookies store data on the user's browser
• security: sessions are hard for malicious users to tamper with or remove; cookies are easy
• privacy: sessions protect private information from being seen by other users of your computer; cookies do not
Sessions in PHP: session_start
session_start(); PHP
• session_start signifies your script wants a session with the user
  • must be called at the top of your script, before any HTML output is produced
• when you call session_start:
  • if the server hasn't seen this user before, a new session is created
  • otherwise, existing session data is loaded into $_SESSION associative array
• you can store data in $_SESSION and retrieve it on future pages
• complete list of PHP session functions
http://us.php.net/manual/en/ref.session.php
Accessing session data
$_SESSION["name"] = value; # store session data
$variable = $_SESSION["name"]; # read session data
if (isset($_SESSION["name"])) { # check for session data PHP
if (isset($_SESSION["points"])) {
$points = $_SESSION["points"];
print("You've earned $points points.\n");
} else {
$_SESSION["points"] = 0; # default
} PHP
• the $_SESSION associative array reads/stores all session data
• use isset function to see whether a given value is in the session
http://php.net/isset
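As a small illustration of reading and updating session data, the sketch below counts page visits. The function takes the session array as a parameter so it can be tried without a web server; in a real page you would call session_start() first and pass $_SESSION. The key name "visits" is an assumption.

```php
<?php
# Hypothetical helper: increments and returns the visit count stored in the session array.
function record_visit(array &$session): int {
    $session["visits"] = ($session["visits"] ?? 0) + 1;   # default to 0 on the first visit
    return $session["visits"];
}

# In a real page:
#   session_start();
#   $count = record_visit($_SESSION);
$fakeSession = [];                 # stand-in for $_SESSION
record_visit($fakeSession);
print "Visits so far: " . record_visit($fakeSession) . "\n";   # Visits so far: 2
```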
Where is session data stored?
• on the client, the session ID is stored as a cookie with the name PHPSESSID
• on the server, session data are stored as temporary files such as /tmp/sess_fcc17f071...
• you can find out (or change) the folder where session data is saved using the session_save_path function
• for very large applications, session data can be stored into a SQL database (or other destination) instead using the session_set_save_handler function
http://us.php.net/manual/en/function.session-save-path.php
http://www.php.net/manual/en/function.session-set-save-handler.php
Session timeout
• because HTTP is stateless, it is hard for the server to know when a user has finished a session
• ideally, user explicitly logs out, but many users don't
• client deletes session cookies when browser closes
• server automatically cleans up old sessions after a period of time
• old session data consumes resources and may present a security risk
• adjustable in PHP server settings (the session.gc_maxlifetime setting); note that the session_cache_expire function only controls how long cached session pages stay fresh, not how long session data is kept
• you can explicitly delete a session by calling session_destroy
http://php.net/manual/en/function.session-cache-expire.php
http://us.php.net/manual/en/function.session-destroy.php
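Besides the server's own cleanup, a page can enforce an idle timeout itself by remembering when the user was last active. The sketch below shows one way this might be done; the function name, the "last_activity" key, and the 30-minute limit are all assumptions.

```php
<?php
# Hypothetical helper: true when the session has been idle longer than $maxIdleSeconds.
function is_idle_too_long(array $session, int $now, int $maxIdleSeconds): bool {
    $last = $session["last_activity"] ?? $now;   # no record yet: treat as active now
    return ($now - $last) > $maxIdleSeconds;
}

# In a real page, after session_start():
#   if (is_idle_too_long($_SESSION, time(), 30 * 60)) {
#       $_SESSION = array();
#       session_destroy();
#   } else {
#       $_SESSION["last_activity"] = time();
#   }
var_dump(is_idle_too_long(["last_activity" => 1000], 1000 + 60, 30 * 60));     # false
var_dump(is_idle_too_long(["last_activity" => 1000], 1000 + 3600, 30 * 60));   # true
```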
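Ending a session
session_destroy(); PHP
• session_destroy ends your current session
• potential problem: if you call session_start again later, it sometimes reuses the same session ID/data you used before
• if you may want to start a completely new empty session later, it is best to flush out the old one while the session is still active (session_regenerate_id fails with a warning when no session is active):
session_start();               # make sure the session is active
$_SESSION = array();           # discard the old session data
session_regenerate_id(TRUE);   # switch to a new session ID and delete the old session's stored data PHP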
Common session bugs
• session_start doesn't just begin a session; it also reloads any existing session for this user. So it must be called in every page that uses your session data:
# the user has a session from a previous page
print $_SESSION["name"]; # undefined
session_start();
print $_SESSION["name"]; # joe PHP
• previous sessions will linger unless you clear their data and regenerate the user's session ID (while the session is still active):
session_start();
$_SESSION = array();
session_regenerate_id(TRUE); PHP
Implementing user logins
• many sites have the ability to create accounts and log in users
• most apps have a database of user accounts
• when you try to log in, your name/pw are compared to those in the database
"Remember Me" feature
• How might an app implement a "Remember Me" feature, where the user's login info is remembered and reused when the user comes back later?
• Is this stored as session data? Why or why not?
• What concerns come up when trying to remember data about the user who has logged in?
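Returning to the login check described earlier (comparing the submitted name/pw with the database), one common approach is to store only a hash of each password. The slides do not say how passwords are stored, so this is an assumption. The sketch uses PHP's built-in password_hash and password_verify, with an array standing in for the accounts table; check_login and the sample account are hypothetical.

```php
<?php
# Stand-in for the user accounts table: username => password hash.
$users = ["allison" => password_hash("s3cret!", PASSWORD_DEFAULT)];

# Hypothetical helper: true when the name exists and the password matches its stored hash.
function check_login(array $users, string $name, string $password): bool {
    return isset($users[$name]) && password_verify($password, $users[$name]);
}

var_dump(check_login($users, "allison", "s3cret!"));   # true
var_dump(check_login($users, "allison", "guess"));     # false
var_dump(check_login($users, "nobody", "s3cret!"));    # false
```

On a successful check, a page could call session_start() and record the user's name in $_SESSION, so that later requests recognize the login.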
Practice problem: Power Animal
• Write a page poweranimal.php that chooses a random "power animal" for the user.
• The page should remember what animal was chosen for the user and show it again each time they visit the page.
• It should also count the number of times that user has visited the page.
• If the user selects to "start over," the animal and number of page visits should be forgotten.
https://courses.cs.washington.edu/courses/cse154/
Credits