Chapter 16 Web Pages And CGI Scripts

Post on 02-Jan-2016


Department of Biomedical Informatics, University of Pittsburgh School of Medicine

http://www.dbmi.pitt.edu

The Content of this Lecture

1. Python web programming basics:

1) urllib2 module

2) cgi module

3) information for building your own web page at Pitt

2. Book tasks:

1) Grabbing web pages

2) A simple CGI script for searching the neoplasm classification

urllib2 module

The urllib2 module defines functions and classes which help in opening URLs.

urllib2.Request(url)

This class represents a URL request. url should be a string containing a valid URL.

urllib2.urlopen()

This function opens a URL and returns a file-like object. Its argument can be either a URL string or a Request object.

Examples

>>> import urllib2

>>> req=urllib2.Request(url='http://www.python.org/')

>>> f=urllib2.urlopen(req)

>>> print f.read(100)

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtm

>>>

Examples

>>> f=urllib2.urlopen('http://faculty.dbmi.pitt.edu/day/Bioinf2012')

>>> print f.read(50)

<html>

<head>

<title>Index of /day/Bioinf2012/<

>>>

urllib2 module exceptions

urllib2.URLError

The handlers raise this exception (or a subclass of it) when they run into a problem. It is a subclass of IOError.

urllib2.HTTPError

It is a subclass of URLError.
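Because HTTPError is a subclass of URLError, code that wants to tell the two apart has to test for HTTPError first. The sketch below illustrates this; describe_error is a hypothetical helper, not part of urllib2, and the import fallback is an assumption so that the same lines also run on Python 3, where these classes live in urllib.error.

```python
try:                                   # Python 2, as used in this lecture
    from urllib2 import URLError, HTTPError
except ImportError:                    # Python 3 moved the classes
    from urllib.error import URLError, HTTPError


def describe_error(err):
    """Return a short message for an exception raised by urlopen()."""
    # Test the more specific class first: every HTTPError is also a URLError.
    if isinstance(err, HTTPError):
        return "server returned HTTP status %d" % err.code
    if isinstance(err, URLError):
        return "could not reach server: %s" % err.reason
    return "unexpected error: %s" % err
```

In a try/except block the same ordering applies: an "except HTTPError" clause must come before "except URLError", otherwise it is never reached.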

Algorithm for Grabbing Web Pages

1. Import the module that makes the HTTP requests (urllib2).

2. Make the HTTP request.

3. If the request returns the Web page, print the page. Otherwise, print an error message.
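The three steps above can be sketched as a small function. This is a minimal illustration, not the book's script: grab_page and its opener parameter are hypothetical names (opener exists only so the function can be tried without a network connection), and the import fallback is an assumption that lets the code run on Python 3 as well as Python 2.

```python
# Step 1: import the module that makes the HTTP request.
try:                                   # Python 2, as used in this lecture
    from urllib2 import urlopen, URLError
except ImportError:                    # Python 3 equivalents
    from urllib.request import urlopen
    from urllib.error import URLError


def grab_page(url, opener=urlopen):
    """Return the page body, or an error message if the request fails."""
    try:
        # Step 2: make the HTTP request.
        response = opener(url)
    except URLError as err:            # also catches HTTPError, a subclass
        # Step 3 (failure): report the problem instead of the page.
        return "Could not retrieve %s: %s" % (url, err)
    # Step 3 (success): hand back the page contents.
    return response.read()
```

Calling print(grab_page('http://www.python.org/')) prints either the page or the error message.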

The cgi Module

A CGI (Common Gateway Interface) script is invoked by an HTTP server, usually to process user input submitted through an HTML <FORM>.

The cgi module is used to implement CGI scripts.

The contents of an HTML form are passed to a CGI program as a string, which is accessed through the FieldStorage class of the cgi module.

The FieldStorage Class

getvalue() is a member function of the FieldStorage class.

It takes a field name as input and returns the value of that field. An optional second argument gives the default that is returned when the field is missing.

For example:

form = cgi.FieldStorage()

message = form.getvalue("tx", "(no message)")
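FieldStorage only works when the script is run by a web server, so the default-value behaviour is easier to try out on a plain query string. The sketch below is a stand-in, not the cgi module itself: get_value is a hypothetical helper built on the standard parse_qs function, and the query strings are made-up examples. It is also one way to get the same effect on recent Python 3 releases, where the cgi module was deprecated (3.11) and then removed (3.13).

```python
try:
    from urlparse import parse_qs          # Python 2
except ImportError:
    from urllib.parse import parse_qs      # Python 3


def get_value(query_string, name, default=None):
    """Mimic FieldStorage.getvalue() for a single-valued form field."""
    # parse_qs maps each field name to a list of values and, like
    # FieldStorage, ignores fields that were submitted empty.
    values = parse_qs(query_string).get(name)
    if not values:
        return default
    return values[0]


# "tx" is the text box name used in this lecture's form.
message = get_value("tx=lung+carcinoma&bx=SUBMIT", "tx", "(no message)")
```

When the text box is left empty, get_value returns the default, "(no message)".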

The cgitb module

This module provides an exception handler that displays a detailed report. It is good practice to include the following statements when you develop a new CGI script:

import cgitb

cgitb.enable()

Information for Building Your Own Web Page at Pitt

You can build your own Web site at Pitt. The instructions for doing this are contained in the following two files:

afs_web.pdf

Html_inst.pdf

These two files were uploaded to our website.

A simple CGI script for searching the neoplasm classification

1. Create a very simple Web page consisting of a simple form (see the figure below).

This form contains a text box and a “submit” button for taking user input.

The source code of this form should contain the URL (Uniform Resource Locator, or Web address) of your CGI script, which is stored in the cgi-bin folder.

<form name="sender" method="GET" action="http://gweel.dbmi.pitt.edu/cgi-bin/neopull.py">
<br><center><input type="text" name="tx" size=38 maxlength=48 value="">
<input type="submit" name="bx" value="SUBMIT"></center></form>
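Because the form uses method="GET", the browser appends the field names and values to the action URL as a query string. The sketch below rebuilds that URL by hand to show what the CGI script receives; the query "lung carcinoma" is an assumed example input, and the import fallback is an assumption so the lines run on both Python 2 and Python 3.

```python
try:
    from urllib import urlencode            # Python 2
except ImportError:
    from urllib.parse import urlencode      # Python 3

# The action URL and the field names tx and bx come from the form above.
action = "http://gweel.dbmi.pitt.edu/cgi-bin/neopull.py"
fields = [("tx", "lung carcinoma"),   # text box (assumed example input)
          ("bx", "SUBMIT")]           # submit button

# urlencode joins the pairs with "&" and encodes the space as "+".
request_url = action + "?" + urlencode(fields)
print(request_url)
# prints http://gweel.dbmi.pitt.edu/cgi-bin/neopull.py?tx=lung+carcinoma&bx=SUBMIT
```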


2. Upload your Web page to a folder on the web server designated for this class. This folder will be publicly accessible through the Web.

3. Create a script and upload it to another folder, called cgi-bin, on the web server. The cgi-bin folder is also designated for the class. The web address of the script must match the address in the action attribute of the form in your Web page's HTML source. In my example, it should be:

"http://gweel.dbmi.pitt.edu/cgi-bin/neopull.py"


4. The script you uploaded in step 3 begins here. Capture the string that the user sent through your Web page and place it in a string object (message). Then check that it contains only letters and spaces.

import cgi
import re
import sys

form = cgi.FieldStorage()
message = form.getvalue("tx", "(no message)")
# re.match anchors at the start and $ at the end, so the whole string is checked
term_check = re.match(r'[A-Za-z ]+$', message)
if not term_check:
    print "<br>Only alphabetic letters and spaces are permitted in the query box"
    print "</body></html>"
    sys.exit()
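It helps to wrap the validation pattern in a function and try it on a few inputs. In this sketch, is_valid_term is a hypothetical helper, not part of the lecture's script. It uses re.match together with an end anchor, so the entire string must consist of letters and spaces. By contrast, re.search with only a trailing $ accepts any input that merely ends in letters, such as 123abc.

```python
import re


def is_valid_term(term):
    """Return True if term consists only of ASCII letters and spaces."""
    # re.match anchors the pattern at the start of the string and \Z at
    # the very end, so every character has to be a letter or a space.
    return re.match(r'[A-Za-z ]+\Z', term) is not None


# The unanchored check accepts a string that only ends in letters.
loose_check = re.search(r'[A-Za-z ]+$', "123abc") is not None   # True
strict_check = is_valid_term("123abc")                          # False
```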


5. Print out the HTML header of the web page that will be returned to the user, then echo the query term.

print "<br>Your query term is " + message + "<br>"
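A CGI script has to send a Content-Type line followed by a blank line before any HTML, otherwise the server cannot return the page. The sketch below shows one way the start of the response could be assembled. page_start is a hypothetical helper, and the sketch assumes message has already passed the letters-and-spaces check, so it needs no HTML escaping.

```python
def page_start(message):
    """Return the HTTP header and the opening part of the HTML page."""
    # The empty line after Content-Type marks the end of the HTTP headers.
    header = "Content-Type: text/html\n\n"
    # Assumption: message was validated, so it is echoed without escaping.
    body = "<html><body>\n<br>Your query term is " + message + "<br>\n"
    return header + body


# In the CGI script the result would be written with print.
output = page_start("lung carcinoma")
```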


6. Open up a file called “neoself” which contains the neoplasm classification information.

7. Go through each line of this file and look for the term that matches the “message” entered by the user. When a match is found, print the line that contains the term.

8. Print a line to tell the user that the search is done.

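in_text = open("../data/neoself", "r")
for line in in_text:
    # look for the query term anywhere in the line
    query_match = re.search(message, line)
    if query_match:
        # replace each "|" in the line with an HTML line break
        line = re.sub(r'\|', "<br>", line)
        print "<br>" + line + "<br>"
in_text.close()
sys.exit()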
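The search loop can be tried without the web server or the neoself file by moving it into a function that takes any sequence of lines. This is a sketch under assumptions: find_matches is a hypothetical helper, and the sample lines are invented placeholders, since the lecture does not show the real layout of neoself beyond its use of "|" as a separator. re.escape is added so that the query is always treated as literal text.

```python
import re


def find_matches(term, lines):
    """Return an HTML fragment for every line that contains term."""
    pattern = re.compile(re.escape(term))   # treat the query as plain text
    hits = []
    for line in lines:
        if pattern.search(line):
            # Show each "|"-separated part of the line on its own row.
            hits.append("<br>" + line.strip().replace("|", "<br>") + "<br>")
    return hits


# Invented placeholder lines, NOT real neoself content.
sample = ["alpha tumor|class one\n", "beta tumor|class two\n"]
results = find_matches("beta", sample)
```

In the CGI script, the lines argument would be the open file object, and each fragment would be printed as it is found.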