Shell scripts: Week 2, Lecture 4
Transcript of Shell scripts: Week 2, Lecture 4
9/6/12
2012 - BMMB 597D: Analyzing Next Generation Sequencing Data
Week 2, Lecture 4
István Albert
Biochemistry and Molecular Biology and Bioinformatics Consulting Center
Penn State
Shell scripts
Collect multiple commands into a single program
• Run the same commands again or on other data
• Document the steps and describe the thought process
Creating and refining a shell script
Add another step
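A minimal sketch of what such a script might look like. The file name and its contents are hypothetical; the script creates them itself so that it can run anywhere. Each step is a command that could have been typed at the prompt, and the comments record the reasoning.

```shell
#!/bin/bash
# Step 0: create a small example file to work on.
# (Hypothetical data; in practice this would be your own file.)
workdir=$(mktemp -d)
cat > "$workdir/example.fa" <<'EOF'
>seq1
ACGTACGT
>seq2
GGCCTTAA
>seq3
TTTTAAAA
EOF

# Step 1: count the sequences by counting the FASTA header lines.
count=$(grep -c '^>' "$workdir/example.fa")
echo "Number of sequences: $count"

# Step 2 (added later, while refining the script): list the sequence names.
names=$(grep '^>' "$workdir/example.fa" | cut -c 2-)
echo "$names"
```

Saved under a name such as count.sh (an arbitrary choice), it can be rerun at any time with: bash count.sh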
Using Shell Variables
Single and double quoted strings
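A short sketch of how bash treats a variable inside the two kinds of quotes. The variable name and its value are made up for illustration.

```shell
# Assign a variable: no spaces around the equals sign.
name="efetch"

# Double quotes: the shell substitutes the value of the variable.
double="Tool: $name"

# Single quotes: the text is kept literally, nothing is substituted.
single='Tool: $name'

echo "$double"
echo "$single"
```

The first echo prints the value of the variable; the second prints the dollar sign and the name unchanged.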
Essential detail
• Remember that you may be using distinct languages: shell, awk, perl, and python variables may look the same but are interpreted differently!
• Avoid mixing contexts (i.e. an awk program modified via bash variables). Instead, write the awk program separately and pass the variables into it.
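One way to follow this advice is awk's -v option, which hands a shell value to awk as an awk variable. The sketch below uses made-up input and a hypothetical threshold; the awk program stays in single quotes, so the shell never rewrites it.

```shell
# A shell variable holding a threshold (hypothetical value).
min_len=5

# Pass the shell variable into awk with -v. Inside the awk program,
# "min" is an awk variable and $2 is the second column of the input,
# not a shell variable.
long_names=$(printf 'geneA 3\ngeneB 7\ngeneC 9\n' |
    awk -v min="$min_len" '$2 >= min { print $1 }')

echo "$long_names"
```

Here `$2` means "second column" to awk, while `$min_len` means "value of min_len" to bash. Keeping the two apart avoids the confusion the slide warns about.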
Error management
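A minimal sketch of the mechanism that error handling in bash builds on: every command sets an exit status, which is available in `$?`. The directory path below is a hypothetical one that is assumed not to exist.

```shell
# A successful command has exit status 0.
printf 'ACGT\n' | grep -q 'ACGT'
found_status=$?

# A failing command has a non-zero status; "||" runs its right side only on failure.
missing_status=0
printf 'ACGT\n' | grep -q 'TTTT' || missing_status=$?

echo "found: $found_status, missing: $missing_status"

# React to a failure explicitly instead of silently carrying on.
message="no error"
if ! ls /no/such/directory-for-this-example > /dev/null 2>&1; then
    message="listing failed"
    echo "$message" >&2
fi
```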
Strict error checking
Looping over multiple files
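A sketch combining both ideas, assuming that "strict error checking" refers to the bash options `set -e` and `set -u` (the slide content itself is not preserved here). The example files are hypothetical and are created by the script.

```shell
#!/bin/bash
# Strict mode: stop at the first failing command (-e) and
# treat the use of an unset variable as an error (-u).
set -eu

# Create three small example files (hypothetical data) in a temporary directory.
workdir=$(mktemp -d)
printf '>a\nACGT\n' > "$workdir/one.fa"
printf '>b\nGG\n>c\nTT\n' > "$workdir/two.fa"
printf '>d\nAA\n>e\nCC\n>f\nGG\n' > "$workdir/three.fa"

# Loop over every file matching the pattern and run the same command on each.
total=0
for file in "$workdir"/*.fa; do
    n=$(grep -c '^>' "$file")
    echo "$(basename "$file"): $n sequences"
    total=$((total + n))
done

echo "Total: $total"
```

With `set -u` in effect, a mistyped variable name stops the script instead of silently expanding to an empty string.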
Bash has lots of features
We will slowly introduce some features along the way
Entrez Programming Utilities: EUtils
• Query and download Entrez (GenBank and other databases) via URLs
• Combined with UNIX tools, they allow you to download data automatically
• Named efetch, esearch, etc.
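A sketch of how an efetch URL can be assembled in bash. The base address and the `db`, `id`, `rettype` and `retmode` parameters are those of the public EUtils service. The accession AF086833 is only an illustrative choice, not taken from the lecture, and the function name `efetch_url` is made up. Nothing is downloaded when this runs; the download command is left as a comment.

```shell
# Base address of the EUtils services.
base="https://eutils.ncbi.nlm.nih.gov/entrez/eutils"

# Build an efetch URL from a database name, a record id, and a return type.
# (efetch_url is a hypothetical helper name.)
efetch_url() {
    echo "$base/efetch.fcgi?db=$1&id=$2&rettype=$3&retmode=text"
}

url=$(efetch_url nucleotide AF086833 fasta)
echo "$url"

# To actually download the record (needs network access), something like:
# curl -s "$url" > AF086833.fa
```

The URL must stay inside double quotes when passed to a downloader; otherwise the shell interprets each & as "run in the background".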
Download via EUtils
Refine our script
Valid databases
Valid return types and modes
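The tables from these slides are not preserved here. As a sketch only, a script can check its parameters against a short list before building a URL. The pairs below are a deliberately incomplete subset of database and return type combinations; the full list is in the NCBI EUtils documentation. The function names are hypothetical.

```shell
# Return 0 if the database/return-type pair is in our short list.
# This is an incomplete subset, not the full EUtils table.
valid_combo() {
    case "$1:$2" in
        nucleotide:fasta|nucleotide:gb) return 0 ;;
        protein:fasta|protein:gp)       return 0 ;;
        pubmed:abstract)                return 0 ;;
        *)                              return 1 ;;
    esac
}

# Report the result for one pair.
check() {
    if valid_combo "$1" "$2"; then
        echo "$1/$2: ok"
    else
        echo "$1/$2: not in list"
    fi
}

check nucleotide fasta
check protein gp
check protein xyz
```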
Adding more interactivity
Always investigate files
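A sketch of both points, assuming "interactivity" means passing parameters on the command line. The script takes a file name as its first argument and falls back to a small made-up example file. It then looks at the file before doing anything else with it. The function name `inspect` is hypothetical.

```shell
# Print a short summary of a file, or fail if it is missing or empty.
inspect() {
    if [ ! -s "$1" ]; then
        echo "File $1 is missing or empty" >&2
        return 1
    fi
    echo "Lines: $(wc -l < "$1" | tr -d ' ')"
    echo "First line: $(head -n 1 "$1")"
}

# A small example file (hypothetical data) used when no argument is given.
example=$(mktemp)
printf '>seq1 example record\nACGTACGT\nGGCC\n' > "$example"

# Use the first command-line argument if present, otherwise the example.
input=${1:-$example}
inspect "$input"
```

A download that failed often leaves an empty file or an error page behind, so a quick look at the size and the first line catches many problems early.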
XML markup
XML is a file markup format but it is not a data format.
Advanced topic: XSLT transformations
• An XML document can be transformed by rules described in another XML document
• XSL Transformations
• You don't need to know these, but they are very handy if someone can make one for you
• Programmers love to make these; post it on http://www.stackoverflow.com
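A small sketch of the idea, not the example from the lecture. The script writes a made-up XML document and a stylesheet that turns each record into one tab-separated line, then applies the rules with xsltproc if that tool is installed. The element names (`records`, `record`, `id`, `title`) are hypothetical and do not correspond to real EUtils output.

```shell
workdir=$(mktemp -d)

# A tiny XML document (hypothetical data).
cat > "$workdir/records.xml" <<'EOF'
<?xml version="1.0"?>
<records>
  <record><id>A1</id><title>First record</title></record>
  <record><id>B2</id><title>Second record</title></record>
</records>
EOF

# The rules: for every record, print its id and title separated by a tab.
cat > "$workdir/titles.xsl" <<'EOF'
<?xml version="1.0"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>
  <xsl:template match="/">
    <xsl:for-each select="records/record">
      <xsl:value-of select="id"/>
      <xsl:text>&#9;</xsl:text>
      <xsl:value-of select="title"/>
      <xsl:text>&#10;</xsl:text>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
EOF

# Apply the stylesheet if xsltproc is available (it is not on every system).
if command -v xsltproc > /dev/null 2>&1; then
    xsltproc "$workdir/titles.xsl" "$workdir/records.xml"
else
    echo "xsltproc not found; stylesheet written to $workdir/titles.xsl"
fi
```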
Example XSLT
Homework 4
• Create a shell script that can download three proteins of interest to you from the protein database.
• Write a shell script that lists the journal papers that GenBank references for your data.