Chapter 1 CGI


  • 7/30/2019 Chapter 1 CGI


    CHAPTER 1

    SERVER SIDE PROGRAMMING


    Static vs Dynamic Pages & Client-Side vs Server-Side Scripting

    The most basic type of Web page is a completely static, text-based one, written entirely in HTML.

    The contents of the HTML file on the server are exactly the same as the source code of the page on the client.

        <html>
        <head><title>An Average Website</title></head>
        <body>
        <h1>An Average Website</h1>
        <p>This is an average website.</p>
        </body>
        </html>

    The above HTML code is static.


    Static vs Dynamic Pages

    If the user reloads a static website, they see exactly the same content every time.

    Its content was written directly by an author, and when the user goes to the site, that code is downloaded into a browser and interpreted.

    Client-side technologies cannot do anything that requires connecting to a back-end server.

    JavaScript running in the browser cannot, on its own, assemble a customized drop-down list on the fly from user preferences stored in a database.

    If a change is needed in the list, the Web developer must go and edit the page by hand.

    This gap is filled by server-side programming.


    Static vs Dynamic Pages

    Client-Side Technology      Main Use                              Example Effects
    ----------------------      --------                              ---------------
    Cascading Style Sheets,     Formatting pages: controlling size,   Overlapping, different colored/sized fonts
    Dynamic HTML                color, placement, layout, timing      Layers, exact positioning
                                of elements

    Client-side scripting       Event handling: controlling           Link that changes color on mouseover
    (JavaScript, VBScript)      consequences of defined events        Mortgage calculator

    Java applets                Delivering small standalone           Moving logo
                                applications                          Crossword puzzle

    Flash animations            Animation                             Short cartoon film


    Static vs Dynamic Pages

    What does "dynamic Web page" mean?

    A basic distinction exists between static and dynamic Web pages, but "dynamic" can mean almost anything beyond plain HTML. Web developers use the term to describe both client- and server-side functions.

    On the client, it can mean multimedia presentations, scrolling headlines, pages that update themselves automatically, or elements that appear and disappear.

    On the server, the term generally denotes content assembled on the fly, at the time the page is requested.

    If you display the current date and time on a page, for example, the content changes from one request to the next and is therefore dynamic.


    Static vs Dynamic Pages

    In contrast to a static website, a dynamic website is one whose content is regenerated every time a user visits or reloads the site.

    Server-side web scripting is mostly about connecting Web sites to back-end servers, such as databases.

    This enables the following types of two-way communication:

      - Server to client: Web pages can be assembled from back-end server output.

      - Client to server: Customer-entered information can be acted upon.

    Common examples of client-to-server interaction are online forms with drop-down lists that the script assembles dynamically on the server.


    What is CGI?


    The Hello world Test

        #include <iostream>
        int main(){
            cout


    The Hello world Test

    Other HTTP header lines are:

    Header                 Description
    ------                 -----------
    Content-type: type     A MIME string defining the format of the file being returned.
                           Example: Content-type: text/html

    Expires: Date          The date the information becomes invalid. The browser should use this
                           to decide when a page needs to be refreshed. A valid date string
                           has the format Thu, 01 Jan 1998 12:00:00 GMT.

    Location: URL          The URL that should be returned instead of the URL requested.
                           You can use this field to redirect a request to any file.

    Last-modified: Date    The date of last modification of the resource.

    Content-length: N      The length, in bytes, of the data being returned. The browser uses
                           this value to report the estimated download time for a file.

    Set-Cookie: String     Sets the cookie passed through the string.
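    As a sketch of how header lines like these could be emitted, the helper below joins name/value pairs into a header block and appends the empty line that separates headers from the body. The names Header and headerBlock are hypothetical; the only assumption about CGI itself is the "Name: value" line format followed by a blank line.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// One header line, stored as (name, value). Hypothetical helper type.
typedef std::pair<std::string, std::string> Header;

// Joins the headers as "Name: value" lines and ends the block with an
// empty line, which tells the server that the body starts next.
std::string headerBlock(const std::vector<Header>& headers) {
    std::string out;
    for (std::size_t i = 0; i < headers.size(); ++i) {
        out += headers[i].first + ": " + headers[i].second + "\n";
    }
    out += "\n";
    return out;
}
```

    A program would write the returned string to standard output before any HTML. For a redirect, a single Location header is normally the whole response.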


    Getting Data and Other Information

    Much of the most crucial information needed by CGI applications is made available via environment variables.

    Programs can access this information as they would any environment variable (e.g. getenv(name) in C++).
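    Because getenv() returns a null pointer when a variable is not set, it helps to wrap the call. The sketch below shows one way to do that; envOr is a hypothetical helper name, not part of any CGI library.

```cpp
#include <cstdlib>
#include <string>

// Returns the value of environment variable `name`, or `fallback`
// when the server did not set it (std::getenv returns NULL in that case).
std::string envOr(const char* name, const std::string& fallback) {
    const char* value = std::getenv(name);
    return value ? std::string(value) : fallback;
}
```

    For example, envOr("REQUEST_METHOD", "") yields an empty string instead of a null pointer when the program is run outside a Web server.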


    Processing a Simple Form

    The output that the script or program writes to its primary output stream (such as cout in C++) is handled in a special way.

    Effectively, it is redirected so that it gets sent back to the browser.

    Thus, by writing a C++ program that writes an HTML document to its standard output, you make that document appear on the user's screen as the response to the form submission.


    GET vs POST

    Your CGI program should inspect the REQUEST_METHOD environment variable to determine whether the form was submitted with the GET or the POST method.

    It can then take the appropriate action to retrieve the form data. The CGI program can get the request method, POST or GET, by calling getenv() on the environment variable REQUEST_METHOD.

    Here is how this can be done in C/C++:

        char *method;
        method = getenv("REQUEST_METHOD");              /* getenv() is declared in <stdlib.h> */
        if (method == NULL) { /* error! */ }
        else if (strcmp(method, "GET") == 0) { }        /* strcmp() is declared in <string.h> */
        else if (strcmp(method, "POST") == 0) { }


    GET vs POST

    Example GET handler:

        int main(){
            char *method, *query = NULL;
            method = getenv("REQUEST_METHOD");
            if (method == NULL) { /* error! */ }
            else if (strcmp(method, "GET") == 0)
                query = getenv("QUERY_STRING");         /* still URL-encoded at this point */
            cout
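    A self-contained sketch of the GET case, assuming the handler only needs the raw query string. The function name readGetQuery is hypothetical; REQUEST_METHOD and QUERY_STRING are the environment variables named above.

```cpp
#include <cstdlib>
#include <cstring>
#include <string>

// Returns the raw (still URL-encoded) query string of a GET request.
// Returns an empty string when the request is not a GET or when the
// server did not provide QUERY_STRING.
std::string readGetQuery() {
    const char* method = std::getenv("REQUEST_METHOD");
    if (method == NULL || std::strcmp(method, "GET") != 0) {
        return "";
    }
    const char* query = std::getenv("QUERY_STRING");
    return query ? query : "";
}
```

    Checking both pointers before use matters because either variable may be missing, for instance when the program is started from a shell rather than by the Web server.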


    GET vs POST

    A POST provides the user's input to the CGI program as if it were typed at the keyboard, through the standard input device, stdin.

    If POST is used, an environment variable called CONTENT_LENGTH indicates how much data is being sent.

    You can read this data into a buffer by doing something like:

        char *method, *query = NULL;
        int len = 0;                                    /* number of bytes to read */
        method = getenv("REQUEST_METHOD");
        if (method != NULL && strcmp(method, "POST") == 0) {
            const char *cl = getenv("CONTENT_LENGTH");
            len = (cl != NULL) ? atoi(cl) : 0;
            query = new char[len + 1];
            len = (int)fread(query, 1, len, stdin);     /* fread() is declared in <stdio.h> */
            query[len] = '\0';                          /* terminate the string */
        }
        cout


    Data Parsing

    Now we have the data passed from a form stored in a string variable, and we want to use it.

    However, the data is not yet in a usable form, because it is URL-encoded.

    Suppose you have a form with two input fields, called name and email, declared as follows:

        <INPUT TYPE=text MAXLENGTH=30 NAME="name">
        <INPUT TYPE=text MAXLENGTH=20 NAME="email">

    Suppose the user types John David into name and [email protected] into email.

    What will then be read in by your program is


    Character    URL-encoding
    ---------    ------------
    €            %80
    £            %A3
    ©            %A9
    ®            %AE
    À            %C0
    Á            %C1
    Â            %C2
    Ã            %C3
    Ä            %C4
    Å            %C5
    !            %21
    "            %22
    #            %23
    $            %24
    %            %25
    &            %26
    '            %27
    (            %28
    )            %29
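    The encoding in the table can be sketched as a small function. This is an illustration, not the encoder a browser actually uses: urlEncode is a hypothetical name, the set of characters left unchanged is a conservative choice, and the function works byte by byte, so it reproduces the table only for text stored in a single-byte encoding.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Form-style URL-encoding of one value:
//   letters, digits, '-', '_' and '.' are copied unchanged,
//   a space becomes '+',
//   every other byte becomes '%' followed by two uppercase hex digits.
std::string urlEncode(const std::string& text) {
    static const char hex[] = "0123456789ABCDEF";
    std::string out;
    for (std::size_t i = 0; i < text.size(); ++i) {
        const unsigned char c = static_cast<unsigned char>(text[i]);
        if (std::isalnum(c) || c == '-' || c == '_' || c == '.') {
            out += static_cast<char>(c);
        } else if (c == ' ') {
            out += '+';
        } else {
            out += '%';
            out += hex[c >> 4];
            out += hex[c & 0x0F];
        }
    }
    return out;
}
```

    With this sketch, the value John David is sent as John+David, and a literal % is sent as %25.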


    Understanding the Decoding Process

    In order to access the information contained within the form, the data must be decoded.

    The algorithm for decoding form data is as follows:

    1. Determine the request method (either GET or POST) by checking the REQUEST_METHOD environment variable.

    2. If the method is GET, read the query string from QUERY_STRING and/or the extra path information from PATH_INFO.

    3. If the method is POST, determine the size of the request using CONTENT_LENGTH and read that amount of data from the standard input.

    4. Split the query string on the "&" character, which separates key-value pairs (the format is key=value&key=value...).

    5. Decode the hexadecimal escapes (%XX) and the "+" characters in each key-value pair.

    6. Create a key-value table with the key as the index.
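    The splitting, decoding, and table-building steps above can be sketched as follows. The names urlDecode and parseFormData are hypothetical, and two behaviours are choices made for this sketch rather than anything the slides prescribe: a malformed escape such as a trailing "%2" is copied through unchanged, and a repeated key keeps only its last value.

```cpp
#include <cctype>
#include <cstddef>
#include <cstdlib>
#include <map>
#include <string>

// Decodes one component: '+' becomes a space, %HH becomes the byte 0xHH.
std::string urlDecode(const std::string& text) {
    std::string out;
    for (std::size_t i = 0; i < text.size(); ++i) {
        if (text[i] == '+') {
            out += ' ';
        } else if (text[i] == '%' && i + 2 < text.size() &&
                   std::isxdigit(static_cast<unsigned char>(text[i + 1])) &&
                   std::isxdigit(static_cast<unsigned char>(text[i + 2]))) {
            const std::string hex = text.substr(i + 1, 2);
            out += static_cast<char>(std::strtol(hex.c_str(), NULL, 16));
            i += 2;  // skip the two hex digits just consumed
        } else {
            out += text[i];  // ordinary character or malformed escape
        }
    }
    return out;
}

// Splits "key=value&key=value..." on '&', decodes each half, and returns
// a table indexed by key.
std::map<std::string, std::string> parseFormData(const std::string& query) {
    std::map<std::string, std::string> table;
    std::size_t start = 0;
    while (start <= query.size()) {
        std::size_t end = query.find('&', start);
        if (end == std::string::npos) end = query.size();
        const std::string pair = query.substr(start, end - start);
        if (!pair.empty()) {
            const std::size_t eq = pair.find('=');
            const std::string key = urlDecode(pair.substr(0, eq));
            const std::string value = (eq == std::string::npos)
                ? std::string()
                : urlDecode(pair.substr(eq + 1));
            table[key] = value;
        }
        start = end + 1;
    }
    return table;
}
```

    Splitting happens before decoding on purpose: an encoded "&" (%26) inside a value must not be mistaken for a pair separator.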


    Security

    The other concern is that, when dealing with forms, it is critical to check the data.

    A malicious user can embed shell metacharacters (characters that have special meaning to the shell) in the form data. This could cause serious problems for your system.

    For example, here is a form that asks for a user name:


    Security

        #include <cstdlib>
        int main(){
            system(mkdir
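    One common defence is to accept only input that matches a strict pattern before it gets anywhere near a shell command. The check below is a sketch under assumptions: isSafeName is a hypothetical name, and the allowed character set and the 32-character limit are arbitrary choices for illustration.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Accepts a name only if it is 1 to 32 characters long, starts with a
// letter or digit, and otherwise contains only letters, digits, '_' or '-'.
// Everything else, including all shell metacharacters, is rejected.
bool isSafeName(const std::string& name) {
    if (name.empty() || name.size() > 32) return false;
    if (!std::isalnum(static_cast<unsigned char>(name[0]))) return false;
    for (std::size_t i = 1; i < name.size(); ++i) {
        const unsigned char c = static_cast<unsigned char>(name[i]);
        if (!std::isalnum(c) && c != '_' && c != '-') return false;
    }
    return true;
}
```

    Listing what is allowed is safer than trying to list every dangerous character. Even with such a check in place, calling a library function to create the directory avoids the shell entirely and is usually preferable to system().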


    Security

    The false security of HTML forms: hidden input, limited options, and the POST method

    One way to input constant data from a form, or to allow several sequential inputs from the same user, is to use the <INPUT TYPE=hidden> tag.

    You should be aware that anyone can see this information using "View Source". So, don't hide your secrets there.

    Related to this is the issue of limiting user choices to the options in a SELECT box.

    This will stop random data from being entered, but unfortunately it is quite easy to construct a URL that contains a query string with whatever the bad guy wants.

    For example, say you have a select box that limits the user to a "male" or "female" parameter:

        http://biolinx.bios.niu.edu/cgi-bin/z012345/your_program.cgi?sex=male

    A modestly clever user could change this to:

        http://biolinx.bios.niu.edu/cgi-bin/z012345/your_program.cgi?sex=monday

    Your carefully chosen options would be subverted.
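    Since the SELECT box cannot be trusted, the program has to repeat the check on the server. A minimal sketch, assuming the two option values from the example above; isAllowedOption is a hypothetical name.

```cpp
#include <cstddef>
#include <string>

// Returns true only if `value` is exactly one of the options that the
// form offered. The comparison is case-sensitive.
bool isAllowedOption(const std::string& value) {
    static const char* const allowed[] = {"male", "female"};
    const std::size_t count = sizeof(allowed) / sizeof(allowed[0]);
    for (std::size_t i = 0; i < count; ++i) {
        if (value == allowed[i]) return true;
    }
    return false;
}
```

    A request carrying sex=monday would fail this check, and the program can then return an error page instead of processing the value.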


    Security

    Scripts that read or write files

    A script that writes a file can be a problem.

      - In the simplest case, the contents of that file are completely trashed by a malicious user.

      - A more dangerous case is that a bad guy might write a file that contains executable code that would cause you problems if you inadvertently executed it. For example, if "rm -rf *" gets executed by the shell, all files in that directory and below will be deleted.

      - Be sure that permissions for files to be written are set to 666 (read and write, but never execute)!

    Files that anyone can read (files whose last permission digit is 4, e.g. 744) might give away information to the bad guys.

      - Don't keep important information there.

      - One particular source of problems here can be "encrypted" passwords. Encryption is a great thing, but we can't prevent an expert from cracking the encryption.