Programming Part 3 · 2019-09-19 · Data types in Perl •A scalar variable in Perl can store two...
Programming Part 3
Introduction to Perl
Perl
• Like so many names in computer science, Perl is usually read as an acronym: Practical Extraction and Report Language (you may now forget this)
• Perl has several advantages for us:
– can handle large amounts of data
– includes a rich set of functions for analysis of string data, and in particular pattern detection
– syntax (language rules) is relatively flexible; more forgiving of variation than many other programming languages
A simple Perl program
#!/usr/bin/perl -w
# Chapter 1 - Exercise 1
print "Enter single DNA strand: ";
my $dnaseq = <STDIN>;
chomp $dnaseq;
print "\nOpposite strand: ";
for (my $i=0;$i<length($dnaseq);$i++) {
    my $nucleo = substr($dnaseq, $i, 1);
    if ($nucleo eq "A") {print "T";}
    elsif ($nucleo eq "C") {print "G";}
    elsif ($nucleo eq "G") {print "C";}
    else {print "A";}
}
Running the program
What’s going on here?
• Lines that begin with ‘#’ character are comments:
– they provide opportunity to explain something
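– they are not code – the Perl interpreter ignores them
– examples:
#!/usr/bin/perl -w
# Chapter 1 - Exercise 1
– the first line is a special case: on Unix-like systems it tells the operating system which program should run the script, and Perl reads the -w switch from it to turn on warnings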
Output statements
• The print command sends output to the screen
• The data to be printed appears in quotes after the name of the command; examples:
print "Enter single DNA strand: ";
print "\nOpposite strand: ";
print "T";
Input statements and assignment
• Input statements read data from external sources; our program reads a line typed at the keyboard:
my $dnaseq = <STDIN>;
• Assignment statements assign values to variables; the statement above includes an assignment operation, as do the statements below:
my $i=0;
my $nucleo = substr($dnaseq, $i, 1);
Declaring variables
• There are three kinds of variables in Perl; they include:
– Scalar variables (declared with $)
– Arrays (declared with @)
– Hashes (declared with %)
• Variables may be declared and assigned values in the same statement; this is the case with all of the examples so far
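As a small illustration of the three kinds of variables, the sketch below declares one of each. The variable names and values are made up for this example and do not come from the program in the slides.

```perl
use strict;
use warnings;

# Illustrative data only; these names are not part of the slides' program.
my $base    = "A";                                        # scalar: a single value
my @bases   = ("A", "C", "G", "T");                       # array: an ordered list
my %partner = (A => "T", C => "G", G => "C", T => "A");   # hash: key/value pairs

# One element of an array or hash is itself a scalar, so it is written with $.
print "First base: $bases[0]\n";
print "Partner of $base: $partner{$base}\n";
print "Number of bases: ", scalar(@bases), "\n";
```

Note that the symbol in front of the name changes to $ when a single element is fetched from an array or a hash.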
Declaring variables
• The line of code below declares a local scalar variable named dnaseq, then assigns it the line typed in at the keyboard (read by the input operator <STDIN>):
my $dnaseq = <STDIN>;
– “my” makes the variable local to the block (or file) in which it is declared
– “$” makes the variable a scalar – that is, a variable that holds a single value (like a Scratch variable)
– “=” is the assignment operator – we read the symbol as “gets”
– This instruction ends with the “;” character
Data types in Perl
• A scalar variable in Perl can store two kinds of data:
– strings
– numbers (integers and real numbers)
• We can assign either kind of data to any scalar variable, but the code is less confusing if each variable holds only one kind of data and has a name that reflects what it holds
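A minimal sketch of this point follows; all names and values are invented for illustration. It shows scalars holding an integer, a real number and a string, and one scalar that is legally given a different kind of value later.

```perl
use strict;
use warnings;

my $count = 42;           # an integer
my $ratio = 0.25;         # a real number
my $label = "GC ratio";   # a string

# Perl allows the same scalar to hold a number now and a string later,
# which is why a descriptive name and a single purpose per variable help.
my $value = 7;
$value = "seven";

print "$label: $ratio (from $count bases)\n";
print "value is now: $value\n";
```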
The chomp command
• When a Perl program reads data from the keyboard, every character entered by the user is read, including the newline character created by pressing the Enter key
• The “chomp” command removes the extraneous newline character from the end of the data
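The effect of chomp can be seen without typing anything. In the sketch below a hard-coded string stands in for what <STDIN> would return (an assumption made so the example runs on its own), and the length is measured before and after.

```perl
use strict;
use warnings;

# Stand-in for a line read from the keyboard: the text plus the Enter key's newline.
my $typed = "ATTAGCAG\n";

my $before = length($typed);   # counts the newline as a character
chomp $typed;                  # removes the trailing newline, if there is one
my $after  = length($typed);

print "Length before chomp: $before\n";
print "Length after chomp:  $after\n";
```

Without chomp, the loop in our program would treat the newline as one more "nucleotide".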
Perl control structures
• Perl supports both loops and selection structures
• Our example program contains both; in this case, a multiway selection structure contained within a loop:
for (my $i=0;$i<length($dnaseq);$i++) {
    my $nucleo = substr($dnaseq, $i, 1);
    if ($nucleo eq "A") {print "T";}
    elsif ($nucleo eq "C") {print "G";}
    elsif ($nucleo eq "G") {print "C";}
    else {print "A";}
}
Operations on strings
• Perl is ideally suited for bioinformatics programming because of its rich set of built-in operations on string data; two of these operations are used in the loop:
length($dnaseq)
and
substr($dnaseq, $i, 1)
• The length operation tells the program the number of characters in the string; we use this to tell when the loop should end
• The substr operation tells the program the content of a segment of the original string
Substrings
• A substring is a section of a string
• Substrings can be any length, from 1 character to the entire length of the original string
• The substr operation (or function) takes in 3 data items (the original string, the starting position of the substring, and the length of the substring) and gives back one: the actual substring found at the given position, of the given length
Examples
• Suppose we have the following variable:
my $name = "Cathleen Mary Ruth Sheller";
# it really is!
• Positions are counted from 0, so:
these expressions:        represent these substrings:
substr($name, 1, 3)       "ath"
substr($name, 9, 4)       "Mary"
substr($name, 12, 5)      "y Rut"
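The expressions above can be checked with a short script. The last call, with the length left out, is an addition not shown on the slide: substr then returns everything from the starting position to the end of the string.

```perl
use strict;
use warnings;

my $name = "Cathleen Mary Ruth Sheller";

my $first  = substr($name, 1, 3);    # "ath"
my $second = substr($name, 9, 4);    # "Mary"
my $third  = substr($name, 12, 5);   # "y Rut"
my $rest   = substr($name, 19);      # length omitted: "Sheller"

print "$first|$second|$third|$rest\n";
```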
The for loop
• A for loop is an example of a count-controlled loop; that is, one that repeats a certain number of times
• The structure of the loop is as follows:
for (my $i=0;$i<length($dnaseq);$i++) {
    # body of loop here
}
– we start by declaring and initializing the counter, or control variable: my $i=0;
– we then test the loop condition; in this case, we want to know if the counter is still less than the number of characters in the dnaseq string: $i<length($dnaseq);
– if the test succeeds, we perform the code in the body of the loop – that is, the statements between the two braces: { … }
– finally, we increment the counter: $i++, and go back to the test; the loop ends the first time the test fails
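One way to see the three parts of the for statement is to write the same loop as a while loop, with each part on its own line. This is only a sketch for illustration: the array @visited is a made-up name used to record the counter values, and the sample sequence is hard-coded.

```perl
use strict;
use warnings;

my $dnaseq = "ATTAGCAG";   # sample input, hard-coded for this sketch
my @visited;               # hypothetical helper: records each counter value

my $i = 0;                       # 1. declare and initialize the counter
while ($i < length($dnaseq)) {   # 2. test the condition before every pass
    push @visited, $i;           #    body of the loop
    $i++;                        # 3. increment, then return to the test
}

print "Counter values: @visited\n";
print "Counter after the loop: $i\n";
```

The counter takes the values 0 through 7, one per character, and the loop stops when it reaches 8, the length of the string. In the for version the counter exists only inside the loop, whereas here it is still available afterwards.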
The selection structure
• The code below:
if ($nucleo eq "A") {print "T";}
elsif ($nucleo eq "C") {print "G";}
elsif ($nucleo eq "G") {print "C";}
else {print "A";}
represents a selection structure; the first part of the expression: if ($nucleo eq "A") tests to see if the value in variable nucleo is equal to the string value “A” (eq is Perl's string comparison operator)
if the expression tests true, a T is output to the screen; otherwise, the next expression: elsif ($nucleo eq "C") is tested and, if true, a G is printed
and so on – until the last: else {print "A";} prints out an A if none of the previous expressions tested true
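To try each branch separately, the same selection structure can be wrapped in a subroutine that returns the letter instead of printing it. The subroutine name complement_base is hypothetical and does not appear in the original program.

```perl
use strict;
use warnings;

# Hypothetical wrapper around the slide's if/elsif/else chain.
sub complement_base {
    my ($nucleo) = @_;
    if    ($nucleo eq "A") { return "T"; }
    elsif ($nucleo eq "C") { return "G"; }
    elsif ($nucleo eq "G") { return "C"; }
    else                   { return "A"; }   # reached for "T" and for anything else
}

foreach my $letter ("A", "C", "G", "T", "X") {
    print "$letter -> ", complement_base($letter), "\n";
}
```

The final else is not a test for "T"; it catches every value that the earlier tests rejected.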
The logic
• As the loop runs, different statements within the selection structure execute
• The next slide shows the loop and selection structure as it executes on the following input string: ATTAGCAG
The logic
dnaseq: ATTAGCAG
for (my $i=0;$i<length($dnaseq);$i++) {
    my $nucleo = substr($dnaseq, $i, 1);
    if ($nucleo eq "A") {print "T";}
    elsif ($nucleo eq "C") {print "G";}
    elsif ($nucleo eq "G") {print "C";}
    else {print "A";}
}
Variable values:     Output:
i    nucleo
0    A               T
1    T               A
2    T               A
3    A               T
4    G               C
5    C               G
6    A               T
7    G               C
Making improvements
• The program correctly produces a DNA string’s complement, provided it is given good data
• What happens if the user types a letter that isn’t A, C, G or T?
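One possible improvement, offered only as a sketch, is to test for T explicitly and let the final else flag anything unexpected. The subroutine name complement_checked and the choice of "?" as the marker are assumptions made for this example, not part of the original exercise.

```perl
use strict;
use warnings;

# Hypothetical helper: builds the complement as a string instead of printing it.
sub complement_checked {
    my ($dnaseq) = @_;
    my $result = "";
    for (my $i = 0; $i < length($dnaseq); $i++) {
        my $nucleo = substr($dnaseq, $i, 1);
        if    ($nucleo eq "A") { $result .= "T"; }
        elsif ($nucleo eq "C") { $result .= "G"; }
        elsif ($nucleo eq "G") { $result .= "C"; }
        elsif ($nucleo eq "T") { $result .= "A"; }
        else                   { $result .= "?"; }   # not A, C, G or T
    }
    return $result;
}

print complement_checked("ATTAGCAG"), "\n";   # TAATCGTC
print complement_checked("ATXG"), "\n";       # TA?C
```

With the original program, the X in the second input would silently have produced an A.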
Another example
#!/usr/bin/perl -w
# Source: Gibbs, Cynthia and Per Jambeck, Developing Bioinformatics Computer Skills,
# O'Reilly, 2001, page 334
my $target = "ACCCTG";
my $search_string =
    'CCACACCACACCCACACACCCACACACCACACCACACACCACACCACACCCACACACA'.
    'CATCTAACACTACCCTAACACAGCCCTAATCTAACCCTGGCCACCTGTCTCTCAACTT'.
    'ACCCTCCATTACCCTGCCTCCACTCGTTACCCTGTCCCATTCAACCATACCATCCGAAC';
my @matches;
foreach my $i (0..length $search_string) {
    if ($target eq substr($search_string, $i, length $target)) {
        push @matches, $i;
    }
}
print "My matches occurred at the following offsets: @matches.\n";
print "done\n";
exit;
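The search loop is easier to follow on a string short enough to check by hand. The sketch below wraps the same substr comparison in a subroutine; the name find_offsets and the short test strings are assumptions for illustration. It also stops at the last offset where a full copy of the target could still fit, a small change from the book's loop.

```perl
use strict;
use warnings;

# Hypothetical wrapper around the slide's loop so it can be tried on any strings.
sub find_offsets {
    my ($target, $search_string) = @_;
    my @matches;
    # The last useful offset leaves exactly length($target) characters.
    foreach my $i (0 .. length($search_string) - length($target)) {
        if ($target eq substr($search_string, $i, length $target)) {
            push @matches, $i;
        }
    }
    return @matches;
}

my @found = find_offsets("ACCCTG", "ACCCTGTTACCCTG");
print "Offsets: @found\n";   # Offsets: 0 8
```

Because the loop moves forward one position at a time, overlapping matches are found as well.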
Output from example
Extending strings
• This example introduces some new aspects of Perl programming; consider this line:
my $search_string =
'CCACACCACACCCACACACCCACACACCACACCACACACCACACCACACCCACACACA'.
'CATCTAACACTACCCTAACACAGCCCTAATCTAACCCTGGCCACCTGTCTCTCAACTT'.
'ACCCTCCATTACCCTGCCTCCACTCGTTACCCTGTCCCATTCAACCATACCATCCGAAC';
• A string that is too long to fit on a single line is created by concatenation – three lines (in this case) are glued together (with the ‘.’ character) to make a single string
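A shorter sketch of the same idea, using made-up six-letter data: the '.' operator joins two strings, a statement may continue onto the next line after the '.', and '.=' appends to a string that already exists.

```perl
use strict;
use warnings;

my $joined = 'ACC' . 'CTG';   # two pieces glued into one string

my $spanned = 'ACC' .         # the statement continues on the next line
              'CTG';

my $growing = 'ACC';
$growing .= 'CTG';            # append to an existing string

print "$joined $spanned $growing\n";
print "Length: ", length($joined), "\n";
```

All three variables end up holding the same six-character string.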
Arrays
• You may remember the term “array” from our brief discussion of their use in Scratch
– an array is a variable that holds a collection of data
– each individual data element can be accessed using a subscript, or index number (although that isn’t done here)
– index values start at 0, so an array with n elements has indexes 0 .. n-1
• The array variable in this program is declared in the following line of code: my @matches;
Arrays
• Two array operations are illustrated in this program:
– The push operation adds a value to the array:
push @matches, $i;
– The print operation, when given an entire array inside a double-quoted string, prints the array contents as a list of values separated by spaces:
print "My matches occurred at the following offsets: @matches.\n";
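Both operations can be tried on a small array. The offsets below are invented values for illustration, not the output of the program. When an array is placed inside a double-quoted string its elements are joined by a single space by default; the built-in join function is one way to choose a different separator, such as a comma.

```perl
use strict;
use warnings;

my @matches;           # starts out empty
push @matches, 0;      # each push adds one value at the end
push @matches, 8;
push @matches, 15;

my $spaced = "offsets: @matches.";                    # elements joined by spaces
my $commas = "offsets: " . join(", ", @matches) . ".";

print "$spaced\n";     # offsets: 0 8 15.
print "$commas\n";     # offsets: 0, 8, 15.
print "Second element: $matches[1]\n";
```

The last line shows access by index, which the program itself does not use: the first element is $matches[0], so $matches[1] is the second.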