The Evolution of e-Research: Machines, Methods and Music
David De Roure
[Figure: career timeline, 1981 to 2010]
Topics: Maths; Physics; Medical electronics; PhD in distributed declarative programming language design; Hypermedia; Large-scale Distributed Systems; Semantic Sensor Networks; Web Science; Devices; Amorphous Computing; Digital Social Research; Equator; e-Science; Music; Electronics; Programming; Transputers; Temporal Media; Computational Musicology; Advanced Knowledge Technologies; Semantic Web; Process Networks; myExperiment; Web 2; Statistics; Grid; Linked Data; Environmental sensing; Networks; VREs; Agents; Semantic Grid; e-Laboratories; Workflows; QBH
People: MIT AJGH PH WH
Overview
Generation 1: Early adopters
Generation 2: Embedding
Generation 3: Radical sharing
SALAMI
A case study in 3rd generation e-Research
e-Science
• e-Science was defined by John Taylor (Director General of the UK Research Councils) as global collaboration in key areas of science and the next generation of infrastructure that will enable it
• e-Science was the name of the destination
• It became the name of the journey
• When we arrive, the destination is just called science
“e-research extends e-Science and cyberinfrastructure to other disciplines, including the humanities and social sciences.”
e-Research
http://mitpress.mit.edu/catalog/item/default.asp?tid=12185&ttype=2
2000 – 2005
Generation 1
...the imminent flood of scientific data expected from the next generation of experiments, simulations, sensors and satellites
Tony Hey and Anne Trefethen
Source: CERN, CERN-EX-0712023, http://cdsweb.cern.ch/record/1203203
Jeremy Frey
• Workflows are the new rock and roll
• Machinery for coordinating the execution of (scientific) services and linking together (scientific) resources
• The era of Service Oriented Applications
• Repetitive and mundane boring stuff made easier
Carole Goble
E. Science laboris
Kepler
Triana
BPEL
Taverna
Trident
Meandre
Galaxy
co-shaping
co-design
co-creation
co-constitution
co-evolution
co-construction
co-
co-realisation
http://webscience.org
Box of Chemists
My Chemistry Experiment
CombeChem
empower: to equip or supply with an ability; enable
service: the performance of duties or the duties performed as or by a waiter or servant
Early adopters of tools.
Characterised by researchers using tools within their particular problem area, with some re-use of tools, data and methods within the discipline.
Traditional publishing is supplemented by publication of some digital artefacts like workflows and links to data.
Science is accelerated and practice is beginning to shift to emphasise in silico work.
1st Generation Summary
Thanks to Iain Buchan and the chipmunks
2005 – 2010
Generation 2
• Paul writes workflows for identifying biological pathways implicated in resistance to Trypanosomiasis in cattle
• Paul meets Jo. Jo is investigating Whipworm in mouse.
• Jo reuses one of Paul’s workflows without change.
• Jo identifies the biological pathways involved in sex dependence in the mouse model, believed to be involved in the ability of mice to expel the parasite.
• Previously a manual two-year study by Jo had failed to do this.
Reuse, Recycling, Repurposing
Carole Goble
Carole Goble “e-Science is me-Science: What do Scientists want?”, EGEE 2006
“There are these great collaboration tools that 12-year-olds are using. It’s all back to front.”
Robert Stevens
“A biologist would rather share their toothbrush than their gene name”
Mike Ashburner and others
Professor in Dept of Genetics, University of Cambridge, UK
Data mining: my data’s mine and your data’s mine
workflows
photos
movies
slides
mySpace for scientists!
Facebook
Not
too open!
too passé!
Open Repositories
Researchers
Social Networkers
Developers
Social Scientists
“Facebook for Scientists” ...but different to Facebook!
A repository of research methods
A community social network of people and things
A Social Virtual Research Environment
A probe into researcher behaviour
Open source (BSD) Ruby on Rails app
REST and SPARQL interfaces, supports Linked Data
Inspiration for: BioCatalogue, MethodBox and SysMO-SEEK
myExperiment currently has 4400 members, 236 groups, 1336 workflows, 351 files and 141 packs
http://www.myexperiment.org/
Visits to www.myexperiment.org (Oct 2010)
Global collaboration in key areas of science and the next generation of infrastructure that will enable it
http://wiki.myexperiment.org
data
method
Methods should be first class citizens
Celebrate the flux! Let the data flow through the pipelines. Nail down the methods not the data!
Towards “Linked Open Methods”
Though this be madness, yet there is method in it
* Polonius in Hamlet
** Sean Bechhofer in Manchester
*** Not the e-Science Envoy
Data bonanza => Methods bonanza!
It’s not just the data
And what other people do with it
...that you never thought of
It’s what you do with it that counts
[Figure: Paul’s Pack / Paul’s Research Object. Items: QTL, Workflow 16, Workflow 13, Common pathways, Logs, Results, Metadata, Paper, Slides. Relations: feeds into, produces, included in, published in.]
Research Objects enable data-intensive research to be:
1. Replayable – go back and see what happened
2. Repeatable – run the experiment again
3. Reproducible – independent expt to reproduce
4. Reusable – use as part of new experiments
5. Repurposeable – reuse the pieces in new expt
6. Reliable – robust under automation
7. Referenceable – citable and traceable
The Six Rs of Research Object Behaviours
http://blog.openwetware.org/deroure/?p=56
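A Research Object can be pictured as one identifier that aggregates the workflows, data, results and papers belonging to an investigation. The sketch below is only an illustration of that idea, not the myExperiment implementation: the `ResearchObject` class and every `example.org` URI are hypothetical, and the only real identifier is the OAI-ORE `aggregates` term, whose URL also appears on a later slide.

```python
from dataclasses import dataclass, field

# Predicate URI from the OAI-ORE vocabulary (the URL is quoted on a later slide).
ORE_AGGREGATES = "http://www.openarchives.org/ore/terms/aggregates"


@dataclass
class ResearchObject:
    """Hypothetical container: one URI that aggregates other resources."""
    uri: str
    resources: list = field(default_factory=list)

    def aggregate(self, resource_uri: str) -> None:
        """Add a resource (workflow, data, paper, slides), ignoring repeats."""
        if resource_uri not in self.resources:
            self.resources.append(resource_uri)

    def to_ntriples(self) -> str:
        """Serialise the aggregation as N-Triples, one line per resource."""
        return "\n".join(
            f"<{self.uri}> <{ORE_AGGREGATES}> <{r}> ."
            for r in self.resources
        )


# All example.org URIs are placeholders, not real identifiers.
pack = ResearchObject("http://example.org/packs/pauls-pack")
pack.aggregate("http://example.org/workflows/16")
pack.aggregate("http://example.org/workflows/13")
pack.aggregate("http://example.org/files/results.csv")
print(pack.to_ntriples())
```

Because the aggregation is expressed as plain triples, anything else that publishes Linked Data about the same URIs could in principle be joined to it; that is the sense in which the object is referenceable and reusable.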
Semantically enhanced publication versus
Shared digital Research Objects
Challenging the mindset of paper-sized chunks
Documents under glass
Projects delivering now.
Some institutional embedding.
Key characteristic is re-use – of the increasing pool of tools, data and methods across areas/disciplines.
Contain some freestanding, recombinant, reproducible research objects.
New scientific practices are established and opportunities arise for completely new scientific investigations.
Some expert curation.
2nd Generation Summary
2010 – 2015
Generation 3
4th Paradigm
The Fourth Paradigm: Data-Intensive Scientific Discovery
Presenting the first broad look at the rapidly emerging field of data-intensive science
http://research.microsoft.com/en-us/collaboration/fourthparadigm/
http://blogs.nature.com/fourthparadigm/
BioEssays, 26(1):99–105, January 2004
Doug Kell
Francois Belleau
“…to discover proteins that interact with transmembrane proteins, particularly those that can be related to neuro-degenerative diseases in which amyloids play a significant role”
1) Taverna provenance exposed as RDF
2) myExperiment RDF document for a protein discovery workflow
3) Mocked-up BioCatalogue document using myExperiment RDF data as example
4) Provisional RDF documents obtained from the ConceptWiki (conceptwiki.org) development server
5) An RDF document for an example protein, obtained from the RDF interface of the UniProt web site
A Bioinformatics Experiment
Scott Marshall, Marco Roos
LifeGuide http://www.lifeguideonline.org/
Lucy Yardley
MethodBox http://www.methodbox.org/
Enable cross disciplinary research into Major Public Health problems
Ease handling data and sharing results and insights
http://www.galaxyzoo.org/
Arfon Smith
http://www.zooniverse.org/
The solutions we'll be delivering in 5 years.
Characterised by global reuse of tools, data and methods across any discipline, and surfacing the right levels of complexity for the researcher. Routine use.
Key characteristic is radical sharing.
Research is significantly data driven – plundering the backlog of data, results and methods.
Publishing by the social network.
Increasing automation and decision-support for the researcher – the VRE becomes assistive.
Curation is autonomic and social.
3rd Generation Summary
Easy and low risk to start
Progress to advanced skills
For researchers
No obligation
Go as far as you want
Find a service & relax
Intellectual ramps
Malcolm Atkinson
NRAO/AUI/NSF
telescopes for the naked mind
Datascopes
Malcolm Atkinson
From Signal to Understanding
Jeannette M. Wing, Communications of the ACM, March 2006, Vol. 49, No. 3, pp. 33–35
2010 – 2011 and beyond
Music and Linked Data
http://www.openarchives.org/ore/terms/aggregates
http://eprints.ecs.soton.ac.uk/id/eprint/20817
It’s about enabling the join
Ben Fields, 6th October 2010
SALAMI: Structural Analysis of Large Amounts of Music Information
David De Roure
J. Stephen Downie
Ichiro Fujinaga
www.diggingintodata.org
The SALAMI collaboration
• DDeR (e-Research South), J. Stephen Downie (Illinois) and Ichiro Fujinaga (McGill)
• NCSA donating 250,000 supercomputer hours
• 350,000 pieces of music (23,000 hours)
  – Internet Archive, DRAM, IMIRSEL, McGill
• Feature analysis and structural analysis
• Music Ontology by Yves Raimond (BBC)
• Musicologists from McGill and Southampton
• Sharing of analyses
http://salami.music.mcgill.ca
Digital Music Collections
Crowdsourced ground truth
Community Software
Linked Data Repositories
Supercomputer
23,000 hours of recorded music
250,000 hours NCSA Supercomputer time
Music Information Retrieval Community
Ashley Burgoyne http://www.sonicvisualiser.org/
MIREX Overview
• Began in 2005
• Tasks defined by community debate
• Data sets collected and/or donated
• Participants submit code to IMIRSEL
• Code rarely works first try
• Huge labour consumption getting programs to work
• Meet at ISMIR to discuss results
Stephen Downie
http://www.music-ir.org/mirex
MIREX TASKS
Audio Artist Identification
Audio Beat Tracking
Audio Chord Detection
Audio Classical Composer ID
Audio Cover Song Identification
Audio Drum Detection
Audio Genre Classification
Audio Key Finding
Audio Melody Extraction
Audio Mood Classification
Audio Music Similarity
Audio Onset Detection
Audio Tag Classification
Audio Tempo Extraction
Multiple F0 Estimation
Multiple F0 Note Detection
Query-by-Singing/Humming
Query-by-Tapping
Score Following
Symbolic Genre Classification
Symbolic Key Finding
Symbolic Melodic Similarity
Meandre
seasr.org/meandre
“Signal”
Digital Audio
“Ground Truth”
Community
It’s web-like!
Q. If and when should community-generated content be assimilated into managed repositories?
Structural Analysis
How country is my country?
Kevin Page and Ben Fields
http://www.nema.ecs.soton.ac.uk/countrycountry/
Stephen Downie
Music and computational thinking
“Again, it [the Analytical Engine] might act upon other things besides number, were objects found whose mutual fundamental relations could be expressed by those of the abstract science of operations, and which should be also susceptible of adaptations to the action of the operating notation and mechanism of the engine...”
“Supposing, for instance, that the fundamental relations of pitched sounds in the science of harmony and of musical composition were susceptible of such expression and adaptations, the engine might compose elaborate and scientific pieces of music of any degree of complexity or extent.”
Ada, The Enchantress of Numbers: Poetical Science by Betty Alexandra Toole
http://www.well.com/user/adatoole/
Betty Alexandra Toole
I can write a workflow that creates workflows based on those of others, and automatically modify it – think genetic mutation and crossovers. Who owns it?
I can register a query over an increasing number and diversity of “linked data” sources to ask new research questions.
http://eresearch-ethics.org/
The computer can learn from the activities of 1,000,000 scientists – and be indistinguishable from them?
What about the ethics of Citizen Social Science? Of citizens designing experiments?
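The first question on this slide, a workflow that builds new workflows from other people's by "genetic mutation and crossovers", can be made concrete with a small sketch. Everything here is an assumption for illustration: a workflow is reduced to an ordered list of step names, the `STEPS` pool is invented, and the `crossover` and `mutate` functions are hypothetical helpers rather than features of any workflow system named in the talk.

```python
import random

# Hypothetical pool of processing steps; a workflow is an ordered list of them.
STEPS = ["fetch", "filter", "align", "annotate", "cluster", "plot"]


def crossover(a, b, rng):
    """Join the head of workflow a to the tail of workflow b.

    Each parent gets its own cut point so that parents of different
    lengths can be combined. Both parents need at least two steps.
    """
    cut_a = rng.randint(1, len(a) - 1)
    cut_b = rng.randint(1, len(b) - 1)
    return a[:cut_a] + b[cut_b:]


def mutate(workflow, rng):
    """Return a copy with one randomly chosen step replaced from STEPS."""
    child = list(workflow)
    child[rng.randrange(len(child))] = rng.choice(STEPS)
    return child


rng = random.Random(0)  # seeded so that a run can be repeated
parent_a = ["fetch", "filter", "annotate", "plot"]
parent_b = ["fetch", "align", "cluster"]
child = mutate(crossover(parent_a, parent_b, rng), rng)
print(child)
```

The child contains steps written by two different authors plus one change made by the program, which is exactly why the ownership question is hard to answer.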
Co-*
Methods
Access ramps
Research Objects
Computational thinking
Ethics of e-Research at scale
Enjoy the Open Day!
Thanks to: Jeremy Frey & CombeChem; Carole Goble, myGrid and myExperiment; Iain Buchan & Obesity e-Lab; Sean Bechhofer; Doug Kell; Marco Roos; Lucy Yardley; Arfon Smith; Malcolm Atkinson; Stephen Downie, Kevin Page, Ben Fields, Ashley Burgoyne and NEMA/SALAMI; Betty Toole.
http://www.myexperiment.org/packs/153