Pearl 3

PERL: Variables and data structures. Andrew Emerson, High Performance Systems, CINECA

Transcript of Pearl 3

Page 1: Pearl 3

PERL

Variables and data structures

Andrew Emerson, High Performance Systems, CINECA

Page 2: Pearl 3

The “Hello World” program

Consider the following:

#
# Hello World
#
$message = "Ciao, Mondo";
print "$message \n";
exit;

Page 3: Pearl 3

Perl Variables

$message is called a variable, something with a name used to hold one or more pieces of information.

All computer languages have the ability to create variables to store and manipulate data.

Perl differs from many other languages in that you do not declare the “type” of a variable (e.g. integer, real, character), only the “complexity” of the data it holds.

Page 4: Pearl 3

Perl Variables

Perl has 3 ways of storing data:

1. Scalars: for single data items, such as numbers or strings.

2. Arrays: for ordered lists of scalars, indexed by numbers.

3. Associative arrays or “hashes”: like arrays, but they use “keys” to identify the scalars.
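The three kinds of storage listed above can be sketched side by side. This is only an illustration: the variable names and values are made up for the example, and `use strict`, `use warnings` and `my` are additions that the slides themselves do not use.

```perl
use strict;
use warnings;

# Scalar: a single item (number or string)
my $organism = 'human';

# Array: an ordered list of scalars, indexed by number (starting at 0)
my @bases = ('A', 'C', 'G', 'T');

# Hash: scalars identified by keys instead of numeric indices
my %complement = ('A' => 'T', 'C' => 'G', 'G' => 'C', 'T' => 'A');

print "organism: $organism\n";
print "first base: $bases[0]\n";
print "complement of G: $complement{'G'}\n";
```

The prefix character ($, @ or %) tells you which kind of variable you are looking at.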

Page 5: Pearl 3

Scalar Variables

Examples

#

$no_of_chrs=24; # integer

$per_cent_identity=0; # also integer

$per_cent_identity=99.50; # redefined as real

$pi = 3.1415926535; # floating point (real)

$e_value=1e-40; # using scientific notation

$dna="GCCTACCGTTCCACCAAAAAAAA"; # string - double quotes

$dna='GCCTACCGTTCCACCAAAAAAAA'; # string - single quotes

Page 6: Pearl 3

Scalar Variables

Case is significant: $DNA and $dna are different variables (true for all variables).

Scalars must be prefixed with a $ whenever they are used (is there a $? Yes → it is a scalar). The name after the $ should start with a letter or an underscore, not a digit (true for all variables you define).

Scalars can be happily redefined at any time (e.g. integer → real → string):

# unlikely example

$dna = 0; # integer

$dna = "GGCCTCGAACGTCCAGAAA"; # now it's a string

Page 7: Pearl 3

Doing things with scalars..

#
$a = 1.5;
$b = 2.0;
$c = 3;
$sum = $a + $b * $c;          # multiply $b by $c, then add $a
#
while ($j < 100) {
    $j++;                     # means $j = $j + 1, i.e. add 1 to $j
    print "$j\n";
}
#
$dna1 = "GCCTAAACGTC";
$polyA = "AAAAAAAAAAAAAAAA";
$dna1 .= $polyA;              # append one string to another
                              # (equiv. $dna1 = $dna1 . $polyA)
$no_of_bases = length($dna1); # length of a scalar
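As a rough, self-contained check of the operations on this slide, the sketch below runs the same arithmetic, increment, concatenation and length steps under `use strict`. The names $x, $y and $z, the shorter loop bound and the `x` repetition operator are choices made for this example, not part of the original slide.

```perl
use strict;
use warnings;

my ($x, $y, $z) = (1.5, 2.0, 3);
my $sum = $x + $y * $z;          # * binds tighter than +, so this is 1.5 + 6
print "sum = $sum\n";

my $j = 0;                       # initialise the counter explicitly
while ($j < 3) {
    $j++;                        # same as $j = $j + 1
    print "$j\n";
}

my $dna1  = 'GCCTAAACGTC';
my $polyA = 'A' x 16;            # x repeats a string 16 times
$dna1 .= $polyA;                 # append the tail to the sequence
my $no_of_bases = length($dna1); # number of characters in the string
print "$no_of_bases bases\n";
```

Declaring the counter with an initial value avoids relying on Perl treating an undefined scalar as 0.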

Page 8: Pearl 3

More about strings..

There is a difference between strings in single quotes (') and strings in double quotes ("):

#
$nchr = 24;
$message = "chromosomes in human cell =$nchr";   # double quotes
print "$message\n";
$message = 'chromosomes in human cell =$nchr';   # single quotes
print "$message\n";
exit;

OUTPUT

chromosomes in human cell =24
chromosomes in human cell =$nchr

Page 9: Pearl 3

More about strings

Double quotes (") interpolate variables; single quotes (') do not:

$dna = 'GTTTCGGA';
print "sequence=$dna\n";
print 'sequence=$dna', "\n";

OUTPUT

sequence=GTTTCGGA

sequence=$dna

Normally you would want double quotes when using print.
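The quoting rule applies to escape sequences such as \n as well as to variables. A minimal sketch, assuming nothing beyond core Perl (the variable names $double, $single and $tagged are illustrative):

```perl
use strict;
use warnings;

my $dna = 'GTTTCGGA';

my $double = "sequence=$dna\n";   # $dna and \n are both interpreted
my $single = 'sequence=$dna\n';   # kept literally, character for character

print $double;
print $single, "\n";

# Braces mark where the variable name ends when text follows it directly
my $tagged = "${dna}_rev";
print "$tagged\n";
```

Without the braces, Perl would look for a variable called $dna_rev, because an underscore is a legal character in a name.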

Page 10: Pearl 3

Arrays

Collections of numbers, strings etc. can be stored in arrays. In Perl, arrays are defined as ordered lists of scalars and are denoted by the @ character.

Initializing arrays with lists

@days_in_month = (31,28,31,30,31,30,31,31,30,31,30,31);
@days_of_the_week = ('mon','tue','wed','thu','fri','sat','sun');
@bases = ('adenine','guanine','thymine','cytosine','uracil');
@GenBank_fields = ('LOCUS',
                   'DEFINITION',
                   'ACCESSION',
                   ...);

Page 11: Pearl 3

Arrays - elements

To access the individual array elements you use [ and ] :

@poly_peptide = ('gly','ser','gly','pro','pro','lys','ser','phe');
# now mutate the peptide
$poly_peptide[0] = 'val';
$i = 0;
# print out what we have
while ($i < 8) {
    print "$poly_peptide[$i] ";   # $i is the array index
    $i++;
}

The numbers used to identify the elements are called indices.

Page 12: Pearl 3

Arrays - elements

When accessing array elements you use $. Why? Because array elements are scalars, and scalars must have a $:

@poly_peptide=(..);

$poly_peptide[0] = 'val';

This means that you can have a separate variable called $poly_peptide because $poly_peptide[0] is part of @poly_peptide, NOT $poly_peptide.

“This may seem a bit weird, but that's okay, because it is weird.” (Unix Perl Manual)

Page 13: Pearl 3

Array elements

Array indices start from 0, not 1:

$poly_peptide[0] = 'val';
$poly_peptide[1] = 'ser';
$poly_peptide[7] = 'phe';

The last index of the array can be found from $#name_of_array, e.g. $#poly_peptide. You can also use negative indices, which count back from the end of the array. For the eight-element peptide, therefore, $poly_peptide[-1], $poly_peptide[$#poly_peptide] and $poly_peptide[7] all refer to the same element.
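The indexing rules can be checked with a short script. This is a sketch only: the scalar $poly_peptide and the helper variable $last_index are introduced here purely to show that the scalar and the array are separate variables.

```perl
use strict;
use warnings;

my @poly_peptide = ('gly','ser','gly','pro','pro','lys','ser','phe');
my $poly_peptide = 'a separate scalar';   # unrelated to @poly_peptide

$poly_peptide[0] = 'val';                 # changes the array, not the scalar

my $last_index = $#poly_peptide;          # index of the final element
print "first: $poly_peptide[0]\n";
print "last:  $poly_peptide[-1]\n";
print "same element\n" if $poly_peptide[-1] eq $poly_peptide[$last_index];
print "scalar: $poly_peptide\n";
```

Reusing one name for a scalar and an array is legal, but it is usually clearer to give them different names.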

Page 14: Pearl 3

Array properties

Length of an array:

$len = $#poly_peptide+1;

The size of the array does not need to be defined – it can grow dynamically:

# begin program
$i = 0;
while ($i < 100) {
    $polyA[$i] = 'A';
    $i++;
}
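To see the growth happen, the following sketch prints the length before and after the loop. It assumes the same @polyA array as the slide, declared with `my` so that it runs under `use strict`.

```perl
use strict;
use warnings;

my @polyA;                         # starts empty, no size declared
print "before: ", $#polyA + 1, "\n";

my $i = 0;
while ($i < 100) {
    $polyA[$i] = 'A';              # assigning past the end extends the array
    $i++;
}

my $len = $#polyA + 1;             # last index plus one
print "after: $len\n";
```

For an empty array $#polyA is -1, which is why the length formula still gives 0.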

Page 15: Pearl 3

Useful Array functions

PUSH and POP

Functions commonly used for manipulating a stack:


F.I.L.O = First In Last Out

Very common in computer programs

Page 16: Pearl 3

Array functions – PUSH and POP

# part of a program that reads a database into an array
# open database etc first..
@dblines = ();                 # resets @dblines
while ($line = <DB>) {
    push @dblines, $line;      # push $line onto array
}
...
while (@dblines) {
    $record = pop @dblines;    # pop line off and use it
    .... do something
}
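The slide's fragment needs an open database file to run. The sketch below substitutes a small in-memory list (the @input records are hypothetical stand-ins for database lines) so that the last-in, first-out order is visible.

```perl
use strict;
use warnings;

# Stand-in for lines read from a database file (hypothetical records)
my @input = ("rec1\n", "rec2\n", "rec3\n");

my @dblines = ();
foreach my $line (@input) {
    push @dblines, $line;          # add to the end of the array
}

my @order;
while (@dblines) {                 # true while the array is non-empty
    my $record = pop @dblines;     # remove from the end: last in, first out
    chomp $record;                 # strip the trailing newline
    push @order, $record;
}
print "@order\n";
```

The records come back in reverse order, which is what distinguishes a stack from a queue.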

Page 17: Pearl 3

Scalar Contexts

If you provide an expression (e.g. an array) where Perl expects a scalar, Perl evaluates the expression in a scalar context. For an array this gives the length of the array:

$length = @poly_peptide;

This is equivalent to

$length = $#poly_peptide + 1;

Hence:

while (@dblines) {    # array in scalar context = length of array
    ..
}
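A small sketch of how the same array behaves in different contexts. The parenthesised assignment and the `scalar` function are not mentioned on the slide; they are included here as assumptions about what a reader is likely to meet next.

```perl
use strict;
use warnings;

my @poly_peptide = ('gly', 'ser', 'gly', 'pro');

my $length   = @poly_peptide;          # scalar context: number of elements
my ($first)  = @poly_peptide;          # list context: first element
my $explicit = scalar(@poly_peptide);  # force scalar context explicitly

print "length=$length first=$first explicit=$explicit\n";
print "non-empty\n" if @poly_peptide;  # a condition is also a scalar context
```

The parentheses around $first are what switch the assignment to list context, so it is easy to get the count when you wanted an element, or the reverse.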

Page 18: Pearl 3

Special variables

Perl defines some variables for special purposes, including:

$_      Set in many situations, such as reading from a file or in a foreach loop.
$0      Name of the file currently being executed.
$]      Version of Perl being used.
@_      Contains the parameters passed to a subroutine.
@ARGV   Contains the command line arguments passed to the program.

Some are read-only and cannot be changed: see man perlvar for more details.
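The following sketch touches each of the variables in the list. The subroutine name `describe` and the sample sequences are invented for the example; what $0, $] and @ARGV print depends on how and where the script is run.

```perl
use strict;
use warnings;

# @_ holds the arguments passed to a subroutine
sub describe {
    my ($name, $seq) = @_;
    return "$name has " . length($seq) . " bases";
}
print describe('seq1', 'GGCC'), "\n";

# foreach sets $_ to each element in turn when no loop variable is named
my @report;
foreach ('GGCC', 'ATATAT') {
    push @report, length($_);
}
print "@report\n";

print "script: $0\n";                        # name of the running file
print "perl version: $]\n";                  # numeric version of the interpreter
print "arguments: ", scalar(@ARGV), "\n";    # count of command line arguments
```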

Page 19: Pearl 3

Associative Arrays (Hashes)

Similar to normal arrays but the elements are identified by keys and not indices. The keys can be more complicated, such as strings of characters.

Hashes are indicated by % and can be initialized with lists like arrays:

%hash = (key1,val1,key2,val2,key3,val3..);

Page 20: Pearl 3

Associative Arrays (Hashes)

Examples

%months = ('jan',31,'feb',28,'mar',31,'apr',30);

Alternatively,

%months = ('jan' => 31,    # key => value
           'feb' => 28,
           'mar' => 31,
           'apr' => 30);

=> is a synonym for the comma (,).

Page 21: Pearl 3

Associative Arrays (Hashes)

Further examples

#
%classification = ('dog' => 'mammal', 'robin' => 'bird', 'snake' => 'reptile');

%genetic_code = (
    'TCA' => 'ser',
    'TTC' => 'phe',
    'TTA' => 'leu',
    'TAA' => 'STOP',
    'CCC' => 'pro',
    ...
);
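Each key in a hash is unique. If the same key appears twice in the initializing list, the later value silently replaces the earlier one, which is an easy mistake to make in a long table. A sketch, with a deliberately duplicated key that is for illustration only and is not a correct genetic code:

```perl
use strict;
use warnings;

my %genetic_code = (
    'TTA' => 'leu',
    'TTC' => 'phe',
    'TTA' => 'STOP',    # deliberate duplicate key, for illustration only
);

my $n = keys %genetic_code;     # number of distinct keys
print "$n keys; TTA => $genetic_code{'TTA'}\n";
```

Three pairs were listed but only two keys survive, and 'leu' has been lost.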

Page 22: Pearl 3

Associative Arrays (Hashes) - elements

The elements of a hash are accessed using curly brackets, { and }:

$genetic_code{TCA} = 'ser';
$genetic_code{CCC} = 'pro';
$genetic_code{TGA} = 'STOP';

Note the $ sign: the elements are scalars and so must be preceded by $, even though they belong to a % (just as for arrays).
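A short sketch of reading and writing hash elements, using a partial table like the one on the slides. The capitalised 'Ser' and the $codon variable are only there to show overwriting and lookup by variable.

```perl
use strict;
use warnings;

my %genetic_code = ('TCA' => 'ser', 'CCC' => 'pro');

$genetic_code{TGA} = 'STOP';        # add a new key/value pair
$genetic_code{'TCA'} = 'Ser';       # overwrite; quoting a simple key is optional

my $codon = 'CCC';
print "$codon codes for $genetic_code{$codon}\n";   # key taken from a variable

my %months = ('jan', 31, 'feb', 28);
print "feb has $months{feb} days\n";
```

Square brackets would address an array of the same name instead, so the bracket type matters as much as the $.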

Page 23: Pearl 3

Associative Arrays (Hashes) – useful functions

exists   indicates whether a key exists in the hash

if (exists $genetic_code{$codon}) {
    ...
} else {
    print "Bad codon $codon\n";
    exit;
}
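As a runnable variation on the check above, this sketch sorts a few candidate codons into known and unknown instead of exiting at the first bad one. The partial table and the test codon 'XYZ' are made up for the example.

```perl
use strict;
use warnings;

my %genetic_code = ('TCA' => 'ser', 'CCC' => 'pro', 'TGA' => 'STOP');

my (@good, @bad);
foreach my $codon ('TCA', 'XYZ', 'CCC') {
    if (exists $genetic_code{$codon}) {
        push @good, $codon;         # key is present in the table
    } else {
        push @bad, $codon;          # collect it instead of exiting
    }
}
print "known: @good\n";
print "bad:   @bad\n";
```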

Page 24: Pearl 3

Associative Arrays (Hashes) – useful functions

keys and values   return lists of the keys and of the values of a hash, which can be stored in arrays.

@codons = keys %genetic_code;
@amino_acids = values %genetic_code;

Often you will see code like the following:

foreach $codon (keys %genetic_code) {
    if ($genetic_code{$codon} eq 'STOP') {
        last;     # i.e. stop translating
    } else {
        $protein .= $genetic_code{$codon};
    }
}
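Note that keys returns the codons in no guaranteed order, so a loop over keys %genetic_code visits the table rather than a sequence. One way the lookup might be applied to an actual DNA string is sketched below. The sequence, the partial table, and the use of `substr` with a C-style `for` loop are assumptions made for this example and do not come from the slides.

```perl
use strict;
use warnings;

# Partial table, for illustration only
my %genetic_code = ('TCA' => 'ser', 'TTC' => 'phe', 'CCC' => 'pro', 'TGA' => 'STOP');

my $dna     = 'TCATTCCCCTGATCA';
my $protein = '';

# Step through the sequence three bases at a time
for (my $pos = 0; $pos + 3 <= length($dna); $pos += 3) {
    my $codon = substr($dna, $pos, 3);
    last if $genetic_code{$codon} eq 'STOP';   # stop translating
    $protein .= $genetic_code{$codon};
}
print "$protein\n";

# Sort the keys when a stable, repeatable order is needed
my @codons = sort keys %genetic_code;
print "@codons\n";
```

A fuller version would check each codon with exists before looking it up, since this sketch assumes every codon in the sequence is in the table.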