The Evolution of e-Research: Machines, Methods and Music

The Evolution of e-Research: Machines, Methods and Music, by David De Roure


David De Roure's Inaugural Lecture on 28th October at the Oxford e-Research Centre, University of Oxford, UK. Ten years ago we saw a few early adopters of e-Science technology; now we see acceleration of research through broader adoption and sharing of tools, techniques and artefacts, both for 'big science' and the 'long tail scientist'. Will this incremental trend continue, or are we seeing glimpses of a phase change ahead, where researchers harness these emerging digital capabilities to address research questions in ways that simply were not possible before? This talk will describe three generations of e-Research, using the myExperiment social website as a lens to glimpse future research practice, and focusing on a web-scale computational musicology project as an illustration of 3rd generation thinking. Also available from http://wiki.myexperiment.org/index.php/Presentations

Transcript of The Evolution of e-Research: Machines, Methods and Music

Page 1: The Evolution of e-Research: Machines, Methods and Music

The Evolution of e-Research: Machines, Methods and Music

David De Roure

Page 2: The Evolution of e-Research: Machines, Methods and Music

1981

2010

Maths
Physics
Medical electronics
PhD in distributed declarative programming language design
Hypermedia
Large scale Distributed Systems
Semantic Sensor Networks
Web Science
Devices
Amorphous Computing
Digital Social Research
Equator
e-Science
Music
Electronics
Programming
Transputers
Temporal Media
Computational Musicology
Advanced Knowledge Technologies
Semantic Web
Process Networks
myExperiment
Web 2
Statistics
Grid
Linked Data
Environmental sensing
Networks
VREs
MITAJGH PH WH
PEOPLE
Agents
Semantic Grid
e-Laboratories
Workflows
QBH

Page 3: The Evolution of e-Research: Machines, Methods and Music

Overview

Generation 1: Early adopters

Generation 2: Embedding

Generation 3: Radical sharing

SALAMI

A case study in 3rd generation e-Research

Page 4: The Evolution of e-Research: Machines, Methods and Music

e-Science

• e-Science was defined by John Taylor (Director General of the UK Research Councils) as global collaboration in key areas of science and the next generation of infrastructure that will enable it

• e-Science was the name of the destination
• It became the name of the journey
• When we arrive, the destination is just called science

Page 5: The Evolution of e-Research: Machines, Methods and Music

“e-research extends e-Science and cyberinfrastructure to other disciplines, including the humanities and social sciences.”

e-Research

http://mitpress.mit.edu/catalog/item/default.asp?tid=12185&ttype=2

Page 6: The Evolution of e-Research: Machines, Methods and Music

2000 – 2005

Generation 1

Page 7: The Evolution of e-Research: Machines, Methods and Music

...the imminent flood of scientific data expected from the next generation of experiments, simulations, sensors and satellites

Tony Hey and Anne Trefethen

Source: CERN, CERN-EX-0712023, http://cdsweb.cern.ch/record/1203203

Page 8: The Evolution of e-Research: Machines, Methods and Music


Jeremy Frey

Page 9: The Evolution of e-Research: Machines, Methods and Music

• Workflows are the new rock and roll

• Machinery for coordinating the execution of (scientific) services and linking together (scientific) resources

• The era of Service Oriented Applications

• Repetitive and mundane boring stuff made easier

Carole Goble

E. Science laboris

Page 10: The Evolution of e-Research: Machines, Methods and Music

Kepler

Triana

BPEL

Taverna

Trident

Meandre

Galaxy

Page 11: The Evolution of e-Research: Machines, Methods and Music

co-shaping

co-design

co-creation

co-constitution

co-evolution

co-construction

co-

co-realisation

Page 12: The Evolution of e-Research: Machines, Methods and Music

http://webscience.org

Page 13: The Evolution of e-Research: Machines, Methods and Music

Box of Chemists

My Chemistry Experiment

CombeChem

Page 14: The Evolution of e-Research: Machines, Methods and Music

CombeChem

Page 15: The Evolution of e-Research: Machines, Methods and Music

empower: to equip or supply with an ability; enable

service: the performance of duties or the duties performed as or by a waiter or servant

Page 16: The Evolution of e-Research: Machines, Methods and Music

Early adopters of tools.

Characterised by researchers using tools within their particular problem area, with some re-use of tools, data and methods within the discipline.

Traditional publishing is supplemented by publication of some digital artefacts like workflows and links to data.

Science is accelerated and practice is beginning to shift to emphasise in silico work.

1st Generation Summary

Thanks to Iain Buchan and the chipmunks

Page 17: The Evolution of e-Research: Machines, Methods and Music

2005 – 2010

Generation 2

Page 18: The Evolution of e-Research: Machines, Methods and Music

• Paul writes workflows for identifying biological pathways implicated in resistance to Trypanosomiasis in cattle

• Paul meets Jo. Jo is investigating Whipworm in mouse.

• Jo reuses one of Paul’s workflows without change.

• Jo identifies the biological pathways involved in sex dependence in the mouse model, believed to be involved in the ability of mice to expel the parasite.

• Previously a manual two year study by Jo had failed to do this.

Reuse, Recycling, Repurposing

Carole Goble

Page 19: The Evolution of e-Research: Machines, Methods and Music

Carole Goble “e-Science is me-Science: What do Scientists want?”, EGEE 2006

“There are these great collaboration tools that 12-year-olds are using. It’s all back to front.”

Robert Stevens

Page 20: The Evolution of e-Research: Machines, Methods and Music

“A biologist would rather share their toothbrush than their gene name”

Mike Ashburner and others

Professor in Dept of Genetics, University of Cambridge, UK

Page 21: The Evolution of e-Research: Machines, Methods and Music

Data mining: my data’s mine and your data’s mine

Page 22: The Evolution of e-Research: Machines, Methods and Music

workflows

photos

movies

slides

Page 23: The Evolution of e-Research: Machines, Methods and Music

mySpace for scientists!

Facebook

Not

too open!

too passé!

Page 24: The Evolution of e-Research: Machines, Methods and Music

Open Repositories

Researchers

Social Networkers

Developers

Social Scientists

Page 25: The Evolution of e-Research: Machines, Methods and Music

“Facebook for Scientists” ...but different to Facebook!

A repository of research methods

A community social network of people and things

A Social Virtual Research Environment

A probe into researcher behaviour

Open source (BSD) Ruby on Rails app

REST and SPARQL interfaces, supports Linked Data

Inspiration for: BioCatalogue, MethodBox and SysMO-SEEK

myExperiment currently has 4400 members, 236 groups, 1336 workflows, 351 files and 141 packs
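
The slide mentions REST and SPARQL interfaces without showing a request. The sketch below only builds request URLs and makes no network call. The host names and the `workflow.xml?id=N` path shape are assumptions for illustration and should be checked against the myExperiment wiki; the `query` parameter follows the general SPARQL protocol convention.

```python
from urllib.parse import urlencode

# Assumed base URLs, not confirmed by the slides.
REST_BASE = "http://www.myexperiment.org"
SPARQL_ENDPOINT = "http://rdf.myexperiment.org/sparql"


def rest_workflow_url(workflow_id: int) -> str:
    """Build a REST-style URL asking for one workflow's XML description.

    The 'workflow.xml?id=N' shape is an assumption for illustration.
    """
    return f"{REST_BASE}/workflow.xml?{urlencode({'id': workflow_id})}"


def sparql_query_url(query: str) -> str:
    """Build a GET URL that carries a SPARQL query in the 'query' parameter."""
    return f"{SPARQL_ENDPOINT}?{urlencode({'query': query})}"


# A deliberately generic query: a few things and their Dublin Core titles.
EXAMPLE_QUERY = """
PREFIX dcterms: <http://purl.org/dc/terms/>
SELECT ?thing ?title WHERE { ?thing dcterms:title ?title } LIMIT 5
""".strip()

workflow_url = rest_workflow_url(16)
query_url = sparql_query_url(EXAMPLE_QUERY)
```

Keeping URL construction separate from fetching makes it easy to inspect exactly what would be sent before pointing any client at a live service.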

Page 26: The Evolution of e-Research: Machines, Methods and Music

http://www.myexperiment.org/

Page 27: The Evolution of e-Research: Machines, Methods and Music

Visits to www.myexperiment.org (Oct 2010)

Global collaboration in key areas of science and the next generation of infrastructure that will enable it

http://wiki.myexperiment.org

Page 28: The Evolution of e-Research: Machines, Methods and Music

data

method

Page 29: The Evolution of e-Research: Machines, Methods and Music

Methods should be first class citizens

Celebrate the flux! Let the data flow through the pipelines. Nail down the methods not the data!

Towards “Linked Open Methods”

Though this be madness, yet there is method in it

* Polonius in Hamlet
** Sean Bechhofer in Manchester
*** Not the e-Science Envoy

Data bonanza => Methods bonanza!

Page 30: The Evolution of e-Research: Machines, Methods and Music

It’s not just the data

And what other people do with it

...that you never thought of

It’s what you do with it that counts

Page 31: The Evolution of e-Research: Machines, Methods and Music

[Diagram. Titles: Paul’s Pack, Paul’s Research Object. Items: Workflow 16, Workflow 13, Common pathways, QTL, Results, Logs, Metadata, Paper, Slides. Link labels: feeds into, produces, included in, published in.]

Page 32: The Evolution of e-Research: Machines, Methods and Music

Research Objects enable data-intensive research to be:

1. Replayable – go back and see what happened
2. Repeatable – run the experiment again
3. Reproducible – independent experiment to reproduce
4. Reusable – use as part of new experiments
5. Repurposeable – reuse the pieces in a new experiment
6. Reliable – robust under automation
7. Referenceable – citable and traceable

The Six Rs of Research Object Behaviours

http://blog.openwetware.org/deroure/?p=56
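
One way to picture a Research Object is as a named bundle of resources plus typed links between them. The sketch below is a minimal illustration under that assumption, not the myExperiment pack format; the class, its method names and the example.org identifiers are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class ResearchObject:
    """A named aggregation of resources plus typed links between them."""
    identifier: str
    title: str
    resources: dict = field(default_factory=dict)  # uri -> role
    links: list = field(default_factory=list)      # (subject, relation, object)

    def aggregate(self, uri: str, role: str) -> None:
        """Add a resource (workflow, results, paper...) to the bundle."""
        self.resources[uri] = role

    def link(self, subject: str, relation: str, obj: str) -> None:
        """Record a relation between two resources already in the bundle."""
        for uri in (subject, obj):
            if uri not in self.resources:
                raise KeyError(f"{uri} is not aggregated in {self.identifier}")
        self.links.append((subject, relation, obj))

    def by_role(self, role: str) -> list:
        """Reusable / Repurposeable: pull out just the pieces of one kind."""
        return sorted(uri for uri, r in self.resources.items() if r == role)

    def cite(self) -> str:
        """Referenceable: one string that names the whole bundle."""
        return f"{self.title} <{self.identifier}>"


# Hypothetical identifiers, for illustration only.
ro = ResearchObject("http://example.org/ro/1", "Example pack")
ro.aggregate("http://example.org/workflow/a", "workflow")
ro.aggregate("http://example.org/results/a", "results")
ro.aggregate("http://example.org/paper/a", "paper")
ro.link("http://example.org/workflow/a", "produces", "http://example.org/results/a")
ro.link("http://example.org/results/a", "included in", "http://example.org/paper/a")
```

Only two of the listed behaviours are touched here (pulling out pieces for reuse, and citing the bundle); replay and repeatability would additionally need the execution logs and a workflow engine.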

Page 33: The Evolution of e-Research: Machines, Methods and Music

Semantically enhanced publication versus

Shared digital Research Objects

Challenging the mindset of paper-sized chunks

Page 34: The Evolution of e-Research: Machines, Methods and Music

Documentsunder glass

Page 35: The Evolution of e-Research: Machines, Methods and Music
Page 36: The Evolution of e-Research: Machines, Methods and Music

Projects delivering now. Some institutional embedding. Key characteristic is re-use – of the increasing pool of tools, data and methods across areas/disciplines. Contain some freestanding, recombinant, reproducible research objects. New scientific practices are established and opportunities arise for completely new scientific investigations. Some expert curation.

2nd Generation Summary

Page 37: The Evolution of e-Research: Machines, Methods and Music

2010 – 2015

Generation 3

Page 38: The Evolution of e-Research: Machines, Methods and Music

4th Paradigm

The Fourth Paradigm: Data-Intensive Scientific Discovery

Presenting the first broad look at the rapidly emerging field of data-intensive science

http://research.microsoft.com/en-us/collaboration/fourthparadigm/

Page 39: The Evolution of e-Research: Machines, Methods and Music

http://blogs.nature.com/fourthparadigm/

Page 40: The Evolution of e-Research: Machines, Methods and Music
Page 41: The Evolution of e-Research: Machines, Methods and Music

BioEssays, 26(1):99–105, January 2004

Doug Kell

Page 42: The Evolution of e-Research: Machines, Methods and Music

Francois Belleau

Page 43: The Evolution of e-Research: Machines, Methods and Music

“…to discover proteins that interact with transmembrane proteins, particularly those that can be related to neuro-degenerative diseases in which amyloids play a significant role”

1) Taverna provenance exposed as RDF
2) myExperiment RDF document for a protein discovery workflow
3) Mocked-up BioCatalogue document using myExperiment RDF data as example
4) Provisional RDF documents obtained from the ConceptWiki (conceptwiki.org) development server
5) An RDF document for an example protein, obtained from the RDF interface of the UniProt web site

A Bioinformatics Experiment

Scott Marshall

Marco Roos

Page 44: The Evolution of e-Research: Machines, Methods and Music

LifeGuide http://www.lifeguideonline.org/

Lucy Yardley

Page 45: The Evolution of e-Research: Machines, Methods and Music

MethodBox http://www.methodbox.org/

Enable cross disciplinary research into Major Public Health problems

Ease handling data and sharing results and insights

Page 46: The Evolution of e-Research: Machines, Methods and Music

http://www.galaxyzoo.org/

Page 47: The Evolution of e-Research: Machines, Methods and Music

Arfon Smith

http://www.zooniverse.org/

Page 48: The Evolution of e-Research: Machines, Methods and Music

The solutions we'll be delivering in 5 years. Characterised by global reuse of tools, data and methods across any discipline, and surfacing the right levels of complexity for the researcher. Routine use. Key characteristic is radical sharing. Research is significantly data driven – plundering the backlog of data, results and methods. Publishing by the social network. Increasing automation and decision-support for the researcher – the VRE becomes assistive. Curation is autonomic and social.

3rd Generation Summary

Page 49: The Evolution of e-Research: Machines, Methods and Music
Page 50: The Evolution of e-Research: Machines, Methods and Music

Easy and low risk to start

Progress to advanced skills

For researchers

No obligation

Go as far as you want

Find a service & relax

Intellectual ramps

Malcolm Atkinson

Page 51: The Evolution of e-Research: Machines, Methods and Music

NRAO/AUI/NSF

telescopes for the naked mind

Datascopes

Malcolm Atkinson

From Signal to Understanding

Page 52: The Evolution of e-Research: Machines, Methods and Music

Jeannette M. Wing COMMUNICATIONS OF THE ACM March 2006/Vol. 49, No. 3 Pages 33-35

Page 53: The Evolution of e-Research: Machines, Methods and Music

2010 – 2011 and beyond

Music and Linked Data

Page 54: The Evolution of e-Research: Machines, Methods and Music
Page 55: The Evolution of e-Research: Machines, Methods and Music

http://www.openarchives.org/ore/terms/aggregates

http://eprints.ecs.soton.ac.uk/id/eprint/20817

Page 56: The Evolution of e-Research: Machines, Methods and Music

It’s about enabling the join

Ben Fields, 6th October 2010
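
A rough sketch of what "the join" can mean in Linked Data terms: when two independent sources use the same identifier for the same thing, their statements can be combined on that identifier. The triples, predicate names and example.org URIs below are hypothetical, and plain tuples stand in for RDF.

```python
def join_on_subject(source_a, pred_a, source_b, pred_b):
    """Pair values of pred_a (from source_a) with values of pred_b
    (from source_b) for every subject that appears in both sources."""
    # Index the second source by subject so the join is a dictionary lookup.
    index = {}
    for subject, predicate, value in source_b:
        if predicate == pred_b:
            index.setdefault(subject, []).append(value)
    return [
        (subject, value_a, value_b)
        for subject, predicate, value_a in source_a
        if predicate == pred_a
        for value_b in index.get(subject, [])
    ]


# Hypothetical data: one source knows tracks, another knows places.
ARTIST_1 = "http://example.org/artist/1"
ARTIST_2 = "http://example.org/artist/2"
music_source = [
    (ARTIST_1, "made", "http://example.org/track/7"),
    (ARTIST_2, "made", "http://example.org/track/9"),
]
geo_source = [
    (ARTIST_1, "basedNear", "http://example.org/place/a"),
]

joined = join_on_subject(music_source, "made", geo_source, "basedNear")
```

Only the first artist appears in both sources, so only that artist's track is paired with a place; the second drops out because nothing links it across the two datasets.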

Page 57: The Evolution of e-Research: Machines, Methods and Music

SALAMI: Structural Analysis of Large Amounts of Music Information

David De Roure

J. Stephen Downie

Ichiro Fujinaga

Page 58: The Evolution of e-Research: Machines, Methods and Music

www.diggingintodata.org

Page 59: The Evolution of e-Research: Machines, Methods and Music

The SALAMI collaboration

• DDeR (e-Research South), J. Stephen Downie (Illinois) and Ichiro Fujinaga (McGill)
• NCSA donating 250,000 supercomputer hours
• 350,000 pieces of music (23,000 hours)
– Internet Archive, DRAM, IMIRSEL, McGill
• Feature analysis and structural analysis
• Music Ontology by Yves Raimond (BBC)
• Musicologists from McGill and Southampton
• Sharing of analyses

http://salami.music.mcgill.ca
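
To get a feel for the scale of the figures on this slide, the back-of-envelope arithmetic below divides them through. It assumes the donated hours are spread evenly over the whole collection, which the slides do not state, so the ratios are rough orientation only.

```python
# Figures quoted on the slide.
PIECES = 350_000
MUSIC_HOURS = 23_000
COMPUTE_HOURS = 250_000

# Average length of a piece, in minutes.
avg_minutes_per_piece = MUSIC_HOURS * 60 / PIECES

# Supercomputer hours available per hour of recorded music
# (assumes an even spread, which is an illustration, not a project plan).
compute_hours_per_music_hour = COMPUTE_HOURS / MUSIC_HOURS

# Supercomputer minutes available per piece under the same assumption.
compute_minutes_per_piece = COMPUTE_HOURS * 60 / PIECES

print(f"{avg_minutes_per_piece:.2f} min of audio per piece")
print(f"{compute_hours_per_music_hour:.2f} compute hours per hour of music")
print(f"{compute_minutes_per_piece:.2f} compute minutes per piece")
```

That works out to roughly four minutes of audio per piece and about eleven compute hours per hour of music.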

Page 60: The Evolution of e-Research: Machines, Methods and Music

Digital Music Collections

Crowdsourced ground truth

Community Software

Linked Data Repositories

Supercomputer

23,000 hours of recorded music

250,000 hours NCSA Supercomputer time

Music Information Retrieval Community

Page 61: The Evolution of e-Research: Machines, Methods and Music

Ashley Burgoyne http://www.sonicvisualiser.org/

Page 62: The Evolution of e-Research: Machines, Methods and Music

MIREX Overview

• Began in 2005
• Tasks defined by community debate
• Data sets collected and/or donated
• Participants submit code to IMIRSEL
• Code rarely works first try
• Huge labour consumption getting programs to work
• Meet at ISMIR to discuss results

Stephen Downie

http://www.music-ir.org/mirex

Page 63: The Evolution of e-Research: Machines, Methods and Music

MIREX TASKS

• Audio Artist Identification
• Audio Beat Tracking
• Audio Chord Detection
• Audio Classical Composer ID
• Audio Cover Song Identification
• Audio Drum Detection
• Audio Genre Classification
• Audio Key Finding
• Audio Melody Extraction
• Audio Mood Classification
• Audio Music Similarity
• Audio Onset Detection
• Audio Tag Classification
• Audio Tempo Extraction
• Multiple F0 Estimation
• Multiple F0 Note Detection
• Query-by-Singing/Humming
• Query-by-Tapping
• Score Following
• Symbolic Genre Classification
• Symbolic Key Finding
• Symbolic Melodic Similarity

Page 64: The Evolution of e-Research: Machines, Methods and Music

Meandre

seasr.org/meandre

Page 65: The Evolution of e-Research: Machines, Methods and Music

“Signal”: Digital Audio

“Ground Truth”

Community

It’s web-like!

Q. If and when should community-generated content be assimilated into managed repositories?

Structural Analysis

Page 66: The Evolution of e-Research: Machines, Methods and Music

How country is my country?

Kevin Page and Ben Fields

http://www.nema.ecs.soton.ac.uk/countrycountry/

Page 67: The Evolution of e-Research: Machines, Methods and Music

Stephen Downie

Music and computational thinking

Page 68: The Evolution of e-Research: Machines, Methods and Music

“Again, it [the Analytical Engine] might act upon other things besides number, were objects found whose mutual fundamental relations could be expressed by those of the abstract science of operations, and which should be also susceptible of adaptations to the action of the operating notation and mechanism of the engine...”

Page 69: The Evolution of e-Research: Machines, Methods and Music

“Supposing, for instance, that the fundamental relations of pitched sounds in the science of harmony and of musical composition were susceptible of such expression and adaptations, the engine might compose elaborate and scientific pieces of music of any degree of complexity or extent.”

Ada, The Enchantress of Numbers: Poetical Science by Betty Alexandra Toole

http://www.well.com/user/adatoole/

Betty Alexandra Toole

Page 70: The Evolution of e-Research: Machines, Methods and Music

I can write a workflow that creates workflows based on those of others, and automatically modify it – think genetic mutation and crossovers. Who owns it?

I can register a query over an increasing number and diversity of “linked data” sources to ask new research questions.

http://eresearch-ethics.org/

The computer can learn from the activities of 1,000,000 scientists – and be indistinguishable from them?

What about the ethics of Citizen Social Science? Of citizens designing experiments?
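
The first point on this slide, a workflow built from other people's workflows by mutation and crossover, can be sketched in a few lines. This is a toy illustration: workflows are reduced to ordered step names, and the `Workflow` class, the step names and the owner labels are all hypothetical. Recording `derived_from` is one possible way to keep the "who owns it?" question answerable.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Workflow:
    owner: str
    steps: tuple
    derived_from: tuple = ()  # owners of the parent workflows, if any


def crossover(a: Workflow, b: Workflow, rng: random.Random, owner: str) -> Workflow:
    """One-point crossover: the head of a's steps joined to the tail of b's."""
    shortest = min(len(a.steps), len(b.steps))
    if shortest < 2:
        raise ValueError("both parents need at least two steps")
    cut = rng.randint(1, shortest - 1)
    return Workflow(owner, a.steps[:cut] + b.steps[cut:], (a.owner, b.owner))


def mutate(wf: Workflow, step_pool, rng: random.Random, rate: float = 0.2) -> Workflow:
    """Swap each step for a random one from step_pool with probability rate."""
    steps = tuple(
        rng.choice(step_pool) if rng.random() < rate else step
        for step in wf.steps
    )
    return Workflow(wf.owner, steps, wf.derived_from)


rng = random.Random(0)  # seeded so a run can be repeated
paul = Workflow("paul", ("fetch", "align", "rank", "report"))
jo = Workflow("jo", ("load", "filter", "cluster", "plot", "export"))

child = crossover(paul, jo, rng, owner="machine")
variant = mutate(child, ["normalise", "annotate"], rng)
```

The child always starts with one of Paul's steps and ends with one of Jo's, and it carries both names in `derived_from` even though neither of them wrote it.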

Page 71: The Evolution of e-Research: Machines, Methods and Music

Co-*

Methods

Access ramps

Research Objects

Computational thinking

Ethics of e-Research at scale

Enjoy the Open Day!

Page 72: The Evolution of e-Research: Machines, Methods and Music

Thanks to: Jeremy Frey & CombeChem; Carole Goble, myGrid and myExperiment; Iain Buchan & Obesity e-Lab; Sean Bechhofer; Doug Kell; Marco Roos; Lucy Yardley; Arfon Smith; Malcolm Atkinson; Stephen Downie, Kevin Page, Ben Fields, Ashley Burgoyne and NEMA/SALAMI; Betty Toole.

http://www.myexperiment.org/packs/153

Page 73: The Evolution of e-Research: Machines, Methods and Music