AllBio and EU CodeFest 2014

Phd Student @Bioinformaticsand Population GenomicsSupervisor:Yannick Wurm | Before:

AllBio EU CodeFestbmpvieira.com/allbio14

Bruno Vieira @bmpvieira

@yannick__

Some problems I facedduring my research:

Difficulty getting relevant descriptionsand datasets from NCBI API using bio* libsFor web projects, needed to implementthe same functionality on browser andserverDifficulty writing scalable, reproducibleand complex bioinformatic pipelines

- Modular and universal bioinformatics

Pipeable UNIX command line tools and

JavaScript / Node.js APIs for bioinformatic

analysis workflows on the server and browser.

Collaborates with - Represent biological data on the web

- Build data pipelinesProvides a streaming interface between every file

format and data storage backend. "git for data" | |

Bionode.io

dat-data.com @maxogden @mafintosh

ExamplesBASH

JavaScript

bionode.io (online shell)

bionode-ncbi urls assembly Solenopsis invicta | grep genomic.fna

http://ftp.ncbi.nlm.nih.gov/genomes/all/GCA_000188075.1_Si_gnG/ GCA_000188075.1_Si_gnG_genomic.fna.gz

bionode-ncbi download sra arthropoda | bionode-sra

bionode-ncbi download gff bacteria

var ncbi = require('bionode-ncbi') ncbi.urls('assembly', 'Solenopsis invicta'), gotData) function gotData(urls) { var genome = urls[0].genomic.fna download(genome) })

Difficulty getting relevant description anddatasets from NCBI API using bio* libsPython example

Solution:

import xml.etree.ElementTree as ET from Bio import Entrez Entrez.email = "mail@bmpvieira.com" esearch_handle = Entrez.esearch(db="assembly", term="Achromyrmex") esearch_record = Entrez.read(esearch_handle) for id in esearch_record['IdList']: esummary_handle = Entrez.esummary(db="assembly", id=id) esummary_record = Entrez.read(esummary_handle) documentSummarySet = esummary_record['DocumentSummarySet'] document = documentSummarySet['DocumentSummary'][0] metadata_XML = document['Meta'].encode('utf-8') metadata = ET.fromstring('<root>' + metadata_XML + '</root>') for entry in Metadata[1]: print entry.text

ftp://ftp.ncbi.nlm.nih.gov/genomes/all/GCA_000188075.1_Si_gnG

bionode-ncbi

Need to reimplement the same code onbrowser and server.Solution: JavaScript everywhere

is converting parsers toBionode

AfraSequenceServerGeneValidatorBioJS

Biodalliance

Difficulty writing scalable, reproducible andcomplex bioinformatic pipelines.Solution: Node.js everywhereStreams var ncbi = require('bionode-ncbi') var tool = require('tool-stream') var through = require('through2') var fork1 = through.obj() var fork2 = through.obj()

ncbi .search('sra', 'Solenopsis invicta') .pipe(fork1) .pipe(dat.reads)

fork1 .pipe(tool.extractProperty('expxml.Biosample.id')) .pipe(ncbi.search('biosample')) .pipe(dat.samples)

fork1 .pipe(tool.extractProperty('uid')) .pipe(ncbi.link('sra', 'pubmed'))

Benefit from other JSprojects

Dat BioJS NoFlo

Reusable, small and testedmodules

Some users and Contributors:

(UC San Diego)Michael LovciOlga Botvinnik

DatBiodallianceBioJSYeo Lab

AfraGeneValidator

DNADigest

Thanks!Acknowledgements:

@yannick__@maxogden@mafintosh@alanmrice@dasmoth@biodevops

Why Node.js / JavaScript applies well to Bioinformatics

Easy to write for Streams

Same language everywhere (JavaScript)Package Manager that works ( )Huge number modules ( )Use other JS projects ( , , )Possible to write

StreamsCLI wrappers

Reusable, small and tested modules

NPM93327, 199/day

Dat BioJS NoFloDesktop GUI apps in JS

Module counts

Package Manager that works

Not only for JavaScript, C/C++ too:

npm install bionodenpm install bionode -gnpm testnpm startnpm run test-browsernpm run build-docsnpm initnpm publish

Node.js style C/C++ modulesNative C/C++ running in Google V8

AllBio and EU CodeFest 2014

Science

Transcript of AllBio and EU CodeFest 2014

CodeFest 2014 - Pentesting client/server API

CodeFest 2014. Christopher Bennage — CQRS Journey: scalable, available, and maintainable systems

CodeFest 2014. Ruotger Deecke — iOS Theming Engine

CodeFest 2010. Билык В. — CouchDB: от теории к практике

2013 Steel City Codefest Final Presentations

CodeFest 2012. Петунин Д. — Идеальные инструменты для разработки на HTML5

CodeFest 2014. Brett Martin — Postmortem of a venture-backed Startup

CodeFest 2011. Merzli - Kanban

Kudo Codefest: Faster data retrival with SQL query optimization

CodeFest, июль 2012. Бугаев Л. — Запуск продукта в мобильных технологиях

CodeFest 2012. Ткачук М. — Hard Rock Design

Sponsors - Open Bioinformatics Foundation€¦ · OpenBio Codefest The annual pre-BOSC Codefest is an opportunity for anyone interested in open science, biology and programming to

CodeFest dirty facts about AngularJS

Kudo codefest : Delivering High Quality Software Through Better Release Process

CodeFest 2014. Макаров Н. — Selenium Grid. OK Version

CodeFest 2012. Анкудинов Д. — О специфике мультиплатформенного тестирования игр

CodeFest 2013. Родионов А. — От Selenium к Watir — путь к просветлению

CodeFest 2014. Eduardo Bravo — Google+, A Deep Dive into Mobile Testing

CodeFest 2010. Погребняк А. — Проблемы оценки труда программистов

CodeFest 2014. Макишвили В., Дулин М. — BEViS & bt