BIOC0023 – A (shortened) introduction to computing...BIOC0023 – A (shortened) introduction to...

45
BIOC0023 – A (shortened) introduction to computing Andrew C.R. Martin University College London

Transcript of BIOC0023 – A (shortened) introduction to computing...BIOC0023 – A (shortened) introduction to...

Page 1: BIOC0023 – A (shortened) introduction to computing...BIOC0023 – A (shortened) introduction to computing Andrew C.R. Martin University College London Aims and Objectives To introduce

BIOC0023 – A (shortened) introduction to computing

Andrew C.R. Martin

University College London

Page 2: BIOC0023 – A (shortened) introduction to computing...BIOC0023 – A (shortened) introduction to computing Andrew C.R. Martin University College London Aims and Objectives To introduce

Aims and Objectives

To introduce some fundamentals of:– Computing concepts– Operating systems– Databases– Algorithms– Programming– Using the command line

Page 3: BIOC0023 – A (shortened) introduction to computing...BIOC0023 – A (shortened) introduction to computing Andrew C.R. Martin University College London Aims and Objectives To introduce

Computing concepts

Page 4: BIOC0023 – A (shortened) introduction to computing...BIOC0023 – A (shortened) introduction to computing Andrew C.R. Martin University College London Aims and Objectives To introduce

Computing concepts

VisualOutput

Data

Program

Database

Page 5: BIOC0023 – A (shortened) introduction to computing...BIOC0023 – A (shortened) introduction to computing Andrew C.R. Martin University College London Aims and Objectives To introduce

Computing concepts

VisualOutput

Data

Program

Database

Page 6: BIOC0023 – A (shortened) introduction to computing...BIOC0023 – A (shortened) introduction to computing Andrew C.R. Martin University College London Aims and Objectives To introduce

Computing concepts

VisualOutput

Data

Program

Database

Page 7: BIOC0023 – A (shortened) introduction to computing...BIOC0023 – A (shortened) introduction to computing Andrew C.R. Martin University College London Aims and Objectives To introduce

Computing concepts

VisualOutput

Data

Program

Database

Page 8: BIOC0023 – A (shortened) introduction to computing...BIOC0023 – A (shortened) introduction to computing Andrew C.R. Martin University College London Aims and Objectives To introduce

Computing concepts

VisualOutputData Program

RASMOL

pdb2pir

Page 9: BIOC0023 – A (shortened) introduction to computing...BIOC0023 – A (shortened) introduction to computing Andrew C.R. Martin University College London Aims and Objectives To introduce

Operating systems

Page 10: BIOC0023 – A (shortened) introduction to computing...BIOC0023 – A (shortened) introduction to computing Andrew C.R. Martin University College London Aims and Objectives To introduce

The software interface between users’ software and the computer’s hardware

Provides low-level networking Provides a set of tools for:

file handling user handling and security (e.g. passwords)

May provide a graphical user interface (GUI) May include other non-essential bundled tools

Operating Systems

Page 11: BIOC0023 – A (shortened) introduction to computing...BIOC0023 – A (shortened) introduction to computing Andrew C.R. Martin University College London Aims and Objectives To introduce

VMS (Dec) VME (ICL) CP/M (PCs) PRIMOS (Prime) MS-DOS (PCs) OS/2 (PCs) AmigaDOS (CBM) Unix (Various) MacOS (Apple)

Windows (PCs) Unix/Linux (Various) MacOS (Apple)

Operating Systems

Page 12: BIOC0023 – A (shortened) introduction to computing...BIOC0023 – A (shortened) introduction to computing Andrew C.R. Martin University College London Aims and Objectives To introduce

LinuxMacOS Windows

ComputerScientists

ZoologyDNA

Phylogeny

Mac PC

CrystallographyProtein

sequence &structure

Mini/Mainframe Operating Systems

Page 13: BIOC0023 – A (shortened) introduction to computing...BIOC0023 – A (shortened) introduction to computing Andrew C.R. Martin University College London Aims and Objectives To introduce

Databases

Page 14: BIOC0023 – A (shortened) introduction to computing...BIOC0023 – A (shortened) introduction to computing Andrew C.R. Martin University College London Aims and Objectives To introduce

Databases and databanks

DatabankA collection of data (normally in simple text files)

without an associated query toolQuery tools may be written as separate

applications

DatabaseA structured collection of data with some tool

enabling it to be ‘queried’

Page 15: BIOC0023 – A (shortened) introduction to computing...BIOC0023 – A (shortened) introduction to computing Andrew C.R. Martin University College London Aims and Objectives To introduce

Relational databases

Microsoft AccessSQL ServerOracleSybaseMySQLPostgreSQL

Page 16: BIOC0023 – A (shortened) introduction to computing...BIOC0023 – A (shortened) introduction to computing Andrew C.R. Martin University College London Aims and Objectives To introduce

Data are separated into tables or relations

Good database design requirescareful thought and planningnormalization

Maintains data integrity

Relational databases

Page 17: BIOC0023 – A (shortened) introduction to computing...BIOC0023 – A (shortened) introduction to computing Andrew C.R. Martin University College London Aims and Objectives To introduce

SQL - Structured Query Language

‘Standard’ database query languageUnfortunately most databases extend or deviate

from the standard

Provides 4 types of command:Schema creationData insertionData extractionDatabase management

Page 18: BIOC0023 – A (shortened) introduction to computing...BIOC0023 – A (shortened) introduction to computing Andrew C.R. Martin University College London Aims and Objectives To introduce

Algorithms

Page 19: BIOC0023 – A (shortened) introduction to computing...BIOC0023 – A (shortened) introduction to computing Andrew C.R. Martin University College London Aims and Objectives To introduce

Algorithms

“A process or set of rules to be followed in calculations or other problem-solving operations, especially by a computer.”

A complete and precise set of steps that will solve a problem and achieve an identical result whenever given the same set of data to a defined level of accuracy.

• Ordered steps• Repeatable• Known/defned accuracy

Page 20: BIOC0023 – A (shortened) introduction to computing...BIOC0023 – A (shortened) introduction to computing Andrew C.R. Martin University College London Aims and Objectives To introduce

AlgorithmsSuppose we wish to count the amino acids in a PDB file...

Count the C-alpha (CA) atoms

Image from:ww2.chemistry.gatech.edu/~lw26/structure/protein/peptide_bond/

Page 21: BIOC0023 – A (shortened) introduction to computing...BIOC0023 – A (shortened) introduction to computing Andrew C.R. Martin University College London Aims and Objectives To introduce

AlgorithmsCount the CA atoms

Page 22: BIOC0023 – A (shortened) introduction to computing...BIOC0023 – A (shortened) introduction to computing Andrew C.R. Martin University College London Aims and Objectives To introduce

PDB File

EOF?

Read Line

DisplayResult

Start

Set ResCount = 0

Atom =CA?

IncrementResCount

YesYes

No

No

Page 23: BIOC0023 – A (shortened) introduction to computing...BIOC0023 – A (shortened) introduction to computing Andrew C.R. Martin University College London Aims and Objectives To introduce

Programming

Page 24: BIOC0023 – A (shortened) introduction to computing...BIOC0023 – A (shortened) introduction to computing Andrew C.R. Martin University College London Aims and Objectives To introduce

Programming languages

Rarely write directly in instructions understood by a computer

Use a high-level languageMany such languages:

CPerlC++JavaSmalltalkPython

ForthBASICFORTRANModula-IISimulaProlog

BCPLJavaScriptPascalAWKLISPCobol

Page 25: BIOC0023 – A (shortened) introduction to computing...BIOC0023 – A (shortened) introduction to computing Andrew C.R. Martin University College London Aims and Objectives To introduce

Language Categories

SourceFile

SourceFile

Processor

Executable

Compiler Interpreter

Data

Data

Page 26: BIOC0023 – A (shortened) introduction to computing...BIOC0023 – A (shortened) introduction to computing Andrew C.R. Martin University College London Aims and Objectives To introduce

General Purpose Languages - Python

General Purpose / JIT/Part-Compiled / OO• Design emphasizes code readability; concise

syntax.• Comprehensive standard library, plus maths and

scientific and BioPython libraries.

Page 27: BIOC0023 – A (shortened) introduction to computing...BIOC0023 – A (shortened) introduction to computing Andrew C.R. Martin University College London Aims and Objectives To introduce

Variables

Scalar variables

a = 5a = a + 1print (a)b = 'Hello world'

Page 28: BIOC0023 – A (shortened) introduction to computing...BIOC0023 – A (shortened) introduction to computing Andrew C.R. Martin University College London Aims and Objectives To introduce

Variables

Lists / Arrays (vectors)

position = [5.4, 2.7, 9.5]print (position[1])position[2] = 3.6

Page 29: BIOC0023 – A (shortened) introduction to computing...BIOC0023 – A (shortened) introduction to computing Andrew C.R. Martin University College London Aims and Objectives To introduce

Variables

Dictionaries / Hashes

position = {}position['x'] = 5.4position['y'] = 2.7position['z'] = 9.5

Page 30: BIOC0023 – A (shortened) introduction to computing...BIOC0023 – A (shortened) introduction to computing Andrew C.R. Martin University College London Aims and Objectives To introduce

Control statements

if:x = -6

if x > 0: print “Positive” elif x == 0: print “zero” else: print “negative”

Comparisons: < <= == >= > !=

Page 31: BIOC0023 – A (shortened) introduction to computing...BIOC0023 – A (shortened) introduction to computing Andrew C.R. Martin University College London Aims and Objectives To introduce

Control statements

while:

x = 5 while x > 0: print x x = x - 1

Page 32: BIOC0023 – A (shortened) introduction to computing...BIOC0023 – A (shortened) introduction to computing Andrew C.R. Martin University College London Aims and Objectives To introduce

Control statements

for:x = [100, 200, 300]

for i in x: print i ----------------------- for i in range(0, 100): print i ----------------------- x = [100, 200, 300] for i in range(3): print x[i]

Up to but not including this

number

Page 33: BIOC0023 – A (shortened) introduction to computing...BIOC0023 – A (shortened) introduction to computing Andrew C.R. Martin University College London Aims and Objectives To introduce

Folders, trees and directories

Page 34: BIOC0023 – A (shortened) introduction to computing...BIOC0023 – A (shortened) introduction to computing Andrew C.R. Martin University College London Aims and Objectives To introduce

OS GUIs

Page 35: BIOC0023 – A (shortened) introduction to computing...BIOC0023 – A (shortened) introduction to computing Andrew C.R. Martin University College London Aims and Objectives To introduce

Trees and directories

amartin Reading TEACHING

bioc2001

bioc1007

bioc2004

bioc3010

bioc3101 LECTURES

genbank.png

sprot.png

pdb.png

word2.png

bioc3101.odp

Code.png

db.png

dbgui.png

exe.png

...

...

... ...

......

Page 36: BIOC0023 – A (shortened) introduction to computing...BIOC0023 – A (shortened) introduction to computing Andrew C.R. Martin University College London Aims and Objectives To introduce

Trees and directories

...

Windows

Linux/Mac

Each disk (or network share) is the root of the tree:

C:\system D:\data

Everything lives under the same root:

//home/home/amartin/data/data/pdb

Page 37: BIOC0023 – A (shortened) introduction to computing...BIOC0023 – A (shortened) introduction to computing Andrew C.R. Martin University College London Aims and Objectives To introduce

Using the BASH shell command line

Linux / Mac :- the standard command lineWindows :- git-bash

Linux / Mac :- the standard command lineWindows :- git-bash

Page 38: BIOC0023 – A (shortened) introduction to computing...BIOC0023 – A (shortened) introduction to computing Andrew C.R. Martin University College London Aims and Objectives To introduce

Navigating the tree

...

Where am I?

What's in here?

pwd /c/Documents and Settings/amartin

ls Application Data/ Cookies/Desktop/ Favorites/…

-l Long format-t Sort by time-r Reverse the sort-h Human format for file sizes

ls -ltrh

Page 39: BIOC0023 – A (shortened) introduction to computing...BIOC0023 – A (shortened) introduction to computing Andrew C.R. Martin University College London Aims and Objectives To introduce

Navigating the tree

...

Moving down the tree

Moving up the tree

Moving across the tree

Going home

cd Desktoppwd /c/Documents and Settings/amartin/Desktop

cd ..

cd ../Start\ Menupwd /c/Documents and Settings/amartin/Start Menu

cd --or-- cd ~ --or-- cd $HOMEpwd /c/Documents and Settings/amartin/

Page 40: BIOC0023 – A (shortened) introduction to computing...BIOC0023 – A (shortened) introduction to computing Andrew C.R. Martin University College London Aims and Objectives To introduce

Handling files

...

View a whole file

View a page at a time

Copying a file

cat /etc/bash.bashrc

less /etc/bash.bashrc Press spacebar for next page'b' for previous page'>' for last page'<' for first page'/string' to search for 'string''q' to quit

cp /etc/bash.bashrc ~/mybashrc.txt

Page 41: BIOC0023 – A (shortened) introduction to computing...BIOC0023 – A (shortened) introduction to computing Andrew C.R. Martin University College London Aims and Objectives To introduce

Default input and output

...

Default input (stdin) is the keyboard

Default output (stdout) is the screen

cat (no command prompt displayed)HelloWorldCTRL-d (i.e. press and hold CTRL while pressing d)

cat ~/mybashrc.txt

Page 42: BIOC0023 – A (shortened) introduction to computing...BIOC0023 – A (shortened) introduction to computing Andrew C.R. Martin University College London Aims and Objectives To introduce

Redirection and pipes

...

Send the output of a command to a file

Receive input from a file

Sending the output of one program to the input of another

cat > test.txt (no command prompt displayed)Hello WorldCTRL-d

cat test.txt Hello World

cat < test.txt Hello World

cat /etc/bash.bashrc | less

Page 43: BIOC0023 – A (shortened) introduction to computing...BIOC0023 – A (shortened) introduction to computing Andrew C.R. Martin University College London Aims and Objectives To introduce

Other

...

Searching: Find lines in a file that contain a string

Creating directories

Removing an empty directory

Removing a file

Removing a directory and all its content

grep return /etc/bash.bashrc (finds all lines containing 'return')

mkdir newdir

rmdir newdir

rm myfile

rm -rf newdir

Page 44: BIOC0023 – A (shortened) introduction to computing...BIOC0023 – A (shortened) introduction to computing Andrew C.R. Martin University College London Aims and Objectives To introduce

Other

...

Sorting

Making scripts executable

sort /etc/bash.bashrc (alphabetical sort of lines in the file)

chmod +x scriptfile

Page 45: BIOC0023 – A (shortened) introduction to computing...BIOC0023 – A (shortened) introduction to computing Andrew C.R. Martin University College London Aims and Objectives To introduce

Programming in BASH

...

Renaming a set of files with extension .text to .txtfor file in *.textdo mv $file `basename $file .text`.txtdone