Chapman bosc2010 biopython

Post on 11-May-2015

Community Integration Democratization

Biopython: challenges

Brad Chapman

Peter Cock

Biopython contributors

http://biopython.org

10 July 2010

Three challenges for successful open source projects

Community

Integration

Democratization

Distributed code access

Recruiting and training

Google Summer of Code

2009: Eric Talevich (phyloXML; Bio.Phylo)

2009: Nick Matzke (Biogeographical Phylogenetics)

2010: Joao Rodrigues (Structural biology; Bio.PDB)

Answering questions better

Recognizing contributions

Diversity of Python bioinformatics

Interoperability

Avoid re-implementation

Convert core objects

Document workflows with multiple libraries

Communicate better

Wrapping external tools

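import shlex
import subprocess
from Bio.Blast.Applications import NcbiblastxCommandline

# Build the blastx command line; outfmt=5 requests XML output.
cl = NcbiblastxCommandline(query="opuntia.fasta", db="nr",
                           evalue=0.001, outfmt=5,
                           out="opuntia.xml")
# Split the command string into an argument list so it runs without a shell.
subprocess.call(shlex.split(str(cl)))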
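As a rough illustration of what such a wrapper does, the sketch below builds an argument list from keyword options and hands it to subprocess without a shell. The helper name build_command is hypothetical and is not part of Biopython; the single-dash option style is assumed from BLAST+ conventions. The demonstration runs the Python interpreter itself, so it works without BLAST installed.

```python
import subprocess
import sys


def build_command(program, **options):
    """Turn keyword options into an argument list: name=value -> -name value.

    Hypothetical helper for illustration only; not a Biopython function.
    """
    args = [program]
    for name, value in options.items():
        args.extend(["-" + name, str(value)])
    return args


# Same options as the blastx call above, expressed as an argument list.
blast_args = build_command("blastx", query="opuntia.fasta", db="nr",
                           evalue=0.001, outfmt=5, out="opuntia.xml")

# Demonstrate execution with a program that is always available: Python itself.
result = subprocess.run([sys.executable, "-c", "print('ok')"],
                        capture_output=True, text=True)
exit_code = result.returncode
```

Passing a list instead of a single string avoids shell quoting problems when file names contain spaces.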

Documenting standards

Making code easier to use

>>> from Bio import SeqIO
>>> memory_dict = SeqIO.index("in.gb", "genbank")
>>> memory_dict.keys()
['Z78484.1', ... 'Z78471.1']
>>> seq_record = memory_dict["Z78475.1"]
>>> print seq_record.description
P.supardii 5.8S rRNA gene and ITS1 and ITS2 DNA
>>> seq_record.seq
Seq('CGTAACAAGGTTTCCGTAGGTGAACCTGCGGAAGG...GGT', IUPACAmbiguousDNA())
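SeqIO.index gives dictionary-style access without holding every record in memory. One way to get that behavior is to record where each record starts in the file and parse only on request. The sketch below shows the idea on a FASTA file using the standard library only; the function names index_fasta and fetch_sequence and the sample data are made up for illustration, and this is not Biopython's implementation.

```python
import os
import tempfile


def index_fasta(path):
    """Map each record ID to the byte offset of its '>' header line."""
    offsets = {}
    with open(path, "rb") as handle:
        while True:
            position = handle.tell()
            line = handle.readline()
            if not line:
                break
            if line.startswith(b">"):
                record_id = line[1:].split()[0].decode()
                offsets[record_id] = position
    return offsets


def fetch_sequence(path, offsets, record_id):
    """Seek to one record and read only its sequence lines."""
    chunks = []
    with open(path, "rb") as handle:
        handle.seek(offsets[record_id])
        handle.readline()  # skip the header line
        for line in handle:
            if line.startswith(b">"):
                break  # reached the next record
            chunks.append(line.strip().decode())
    return "".join(chunks)


# Small made-up FASTA file for demonstration.
with tempfile.NamedTemporaryFile("w", suffix=".fasta", delete=False) as tmp:
    tmp.write(">rec1 first example\nACGT\nACGT\n>rec2 second example\nGGCC\n")
    fasta_path = tmp.name

offsets = index_fasta(fasta_path)
sequence = fetch_sequence(fasta_path, offsets, "rec2")
first = fetch_sequence(fasta_path, offsets, "rec1")
os.remove(fasta_path)
```

Only the small offsets dictionary stays in memory, which is why this style of indexing scales to files much larger than available RAM.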

Challenges of big data

Cloud: easier to distribute

On-demand computational resources like Amazon EC2

Provide ready-to-go images

Biopython and many associated bioinformatics libraries

Biological data

http://github.com/chapmanb/bcbb/tree/master/ec2/biolinux/

Following up

Home: http://biopython.org

Code: http://github.com/biopython

BOSC: Talk to Eric, Tiago or myself