![Page 1: talk](https://reader036.fdocuments.us/reader036/viewer/2022070315/554e84a8b4c90526358b4592/html5/thumbnails/1.jpg)
Matthew Cockerill, Technical Director, BioMed Central
Text mining and Open Access publishing
![Page 2: talk](https://reader036.fdocuments.us/reader036/viewer/2022070315/554e84a8b4c90526358b4592/html5/thumbnails/2.jpg)
March 30th 2004 BioCreative 2004
Summary
What is Open Access publishing?
Open Access publishing and text mining
About BMC Bioinformatics
The BioCreative supplement
![Page 4: talk](https://reader036.fdocuments.us/reader036/viewer/2022070315/554e84a8b4c90526358b4592/html5/thumbnails/4.jpg)
The current model of publishing scientific research
Scientists carry out research
They write up their results
They submit them to a journal
Other scientists act as peer reviewers and editorial advisers
Finally, the publisher sells access to that research back to the scientific community
![Page 5: talk](https://reader036.fdocuments.us/reader036/viewer/2022070315/554e84a8b4c90526358b4592/html5/thumbnails/5.jpg)
What’s wrong with this status quo?
Restricted access to scientific research is contrary to the interests of
– the scientists who do the research
– the funders who pay for it
– society as a whole
It is an historical artefact of the economics of print publishing
It is a serious obstacle to mining of full text information
![Page 6: talk](https://reader036.fdocuments.us/reader036/viewer/2022070315/554e84a8b4c90526358b4592/html5/thumbnails/6.jpg)
BioMed Central: the Open Access publisher
Commercial organization
Published first article in mid-2000
Strict policy of immediate Open Access to all research articles
![Page 7: talk](https://reader036.fdocuments.us/reader036/viewer/2022070315/554e84a8b4c90526358b4592/html5/thumbnails/7.jpg)
Growth of BioMed Central
[Chart: Open Access research article publications per year, 2000 to 2003; vertical axis 0 to 2000]
[Chart: Fulltext accesses to Open Access articles per year, 2000 to 2003; vertical axis 0m to 5m]
![Page 8: talk](https://reader036.fdocuments.us/reader036/viewer/2022070315/554e84a8b4c90526358b4592/html5/thumbnails/8.jpg)
Momentum for Open Access
PubMed Central
Public Library of Science
Open Access declarations: Budapest/Bethesda/Berlin
Software open-source movement
Mass cancellation of titles from traditional publishers
![Page 9: talk](https://reader036.fdocuments.us/reader036/viewer/2022070315/554e84a8b4c90526358b4592/html5/thumbnails/9.jpg)
BioMed Central’s business model for open access publishing
Keep costs down via
– Online submission and peer review
– Automated tools to streamline article processing, conversion and layout
Processing charge (currently $525) for accepted articles
No processing charge for authors at member institutions
![Page 10: talk](https://reader036.fdocuments.us/reader036/viewer/2022070315/554e84a8b4c90526358b4592/html5/thumbnails/10.jpg)
Institutional membership
More than 400 institutions are members of BioMed Central, including, to name just a few:
CalTech
Cancer Research UK
Columbia University
Cornell University
University of California
Dana-Farber Cancer Institute
Harvard University
INSERM
Imperial College
Institut Pasteur
John Innes Centre
Johns Hopkins University
Kyoto University
Max Planck Institutes
Memorial Sloan-Kettering Cancer Center
MRC Laboratory of Molecular Biology
National Institutes of Health
National Institute for Medical Research
NHS England
Princeton University
Rockefeller University
TIGR
TSRI
Tufts University
Wellcome Trust Sanger Institute
University of Wisconsin
World Health Organization
Yale University
![Page 11: talk](https://reader036.fdocuments.us/reader036/viewer/2022070315/554e84a8b4c90526358b4592/html5/thumbnails/11.jpg)
Summary
What is Open Access publishing?
Open Access publishing and text mining
About BMC Bioinformatics
The BioCreative supplement
![Page 12: talk](https://reader036.fdocuments.us/reader036/viewer/2022070315/554e84a8b4c90526358b4592/html5/thumbnails/12.jpg)
Mining the full text
Analysing results of high-throughput experiments means biologists increasingly need text-mining tools
PubMed is currently the primary resource for text mining (“it’s what’s available”) but:
– Abstracts omit critical information
– Techniques developed for abstracts may not effectively use extra information in full text
Fully Open Access corpora, in standard XML formats, will help
![Page 13: talk](https://reader036.fdocuments.us/reader036/viewer/2022070315/554e84a8b4c90526358b4592/html5/thumbnails/13.jpg)
Data mining - BioMed Central
Entire corpus of full text XML downloadable by ftp as a single zip file
Various groups working with the data
– E.g. Pre-BIND (automatic extraction of possible protein-protein interaction information from full text)
No restrictions on redistribution
This means other groups can use the same corpus to repeat and build on results
http://www.biomedcentral.com/info/about/datamining
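A minimal sketch of how a corpus delivered as a single zip of XML files might be walked, using only the Python standard library. The element names `article-title` and `body`, the member names, and the tiny in-memory archive are assumptions made for illustration; they are not taken from the BioMed Central corpus itself, whose layout should be checked before adapting this.

```python
import io
import zipfile
import xml.etree.ElementTree as ET


def iter_articles(zip_source):
    """Yield (member name, title, body text) for every .xml member of a zip archive."""
    with zipfile.ZipFile(zip_source) as archive:
        for name in sorted(archive.namelist()):
            if not name.lower().endswith(".xml"):
                continue  # skip readme files and other non-article members
            root = ET.fromstring(archive.read(name))
            title_el = root.find(".//article-title")  # assumed element name
            body_el = root.find(".//body")            # assumed element name
            title = ""
            if title_el is not None:
                title = "".join(title_el.itertext()).strip()
            body = ""
            if body_el is not None:
                # Join text fragments and collapse runs of whitespace.
                body = " ".join(" ".join(body_el.itertext()).split())
            yield name, title, body


def make_demo_archive():
    """Build a tiny in-memory zip so the sketch runs without any download."""
    xml = (
        "<article><front><article-title>Demo article</article-title></front>"
        "<body><sec><title>Methods</title><p>Protein A binds protein B.</p></sec></body>"
        "</article>"
    )
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("demo/article1.xml", xml)
        archive.writestr("demo/readme.txt", "not an article")
    buffer.seek(0)
    return buffer


if __name__ == "__main__":
    for name, title, body in iter_articles(make_demo_archive()):
        print(name, "|", title, "|", body)
```

Because the function accepts either a path or a file-like object, the same loop should work on a downloaded archive once the element names are matched to the real format.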
![Page 14: talk](https://reader036.fdocuments.us/reader036/viewer/2022070315/554e84a8b4c90526358b4592/html5/thumbnails/14.jpg)
Data mining - BioMed Central (screen shot)
![Page 15: talk](https://reader036.fdocuments.us/reader036/viewer/2022070315/554e84a8b4c90526358b4592/html5/thumbnails/15.jpg)
Data mining - PubMed Central
Standard NLM archiving/interchange XML DTD: common format across multiple publishers
Only a subset of PubMed Central participating publishers allow download of full text XML
– BioMed Central
– Public Library of Science
Hopefully, more will follow…
XML made available via OAI interface
http://www.pubmedcentral.com/about/oai.html
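As an illustration of what harvesting over an OAI interface involves, here is a hedged sketch of the OAI-PMH `ListRecords` request/response cycle. The base URL `https://example.org/oai`, the `oai_dc` metadata prefix used in the usage lines, and the sample response are placeholders chosen for the example, not the actual PubMed Central endpoint or output; no network call is made.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

OAI_NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}
BASE_URL = "https://example.org/oai"  # placeholder, not a real repository endpoint


def list_records_url(base_url, metadata_prefix=None, set_spec=None, resumption_token=None):
    """Build a ListRecords URL; a resumption token replaces all other arguments."""
    if resumption_token:
        params = {"verb": "ListRecords", "resumptionToken": resumption_token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
        if set_spec:
            params["set"] = set_spec
    return base_url + "?" + urlencode(params)


def parse_list_records(xml_text):
    """Return (record identifiers, resumption token or None) from a response."""
    root = ET.fromstring(xml_text)
    identifiers = [
        el.text
        for el in root.findall(".//oai:record/oai:header/oai:identifier", OAI_NS)
    ]
    token_el = root.find(".//oai:resumptionToken", OAI_NS)
    token = None
    if token_el is not None and token_el.text:
        token = token_el.text.strip()
    return identifiers, token


# Illustrative response, hand-written for this sketch.
SAMPLE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record><header><identifier>oai:example.org:1</identifier></header></record>
    <record><header><identifier>oai:example.org:2</identifier></header></record>
    <resumptionToken>page-2</resumptionToken>
  </ListRecords>
</OAI-PMH>"""

if __name__ == "__main__":
    print(list_records_url(BASE_URL, metadata_prefix="oai_dc"))
    ids, token = parse_list_records(SAMPLE)
    print(ids, token)
    print(list_records_url(BASE_URL, resumption_token=token))
```

A harvester would repeat the request with each returned token until the response carries none, which is how large record sets are paged.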
![Page 16: talk](https://reader036.fdocuments.us/reader036/viewer/2022070315/554e84a8b4c90526358b4592/html5/thumbnails/16.jpg)
Data mining - PubMed Central
![Page 17: talk](https://reader036.fdocuments.us/reader036/viewer/2022070315/554e84a8b4c90526358b4592/html5/thumbnails/17.jpg)
Adding structure to full text data
Some examples of useful structure:
1. Structure of article itself (figure legends, materials and methods, references etc.)
2. MathML, CML etc.
3. Disambiguated references to genes/proteins…
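To make the first kind of structure concrete, the sketch below groups paragraphs by the section they appear in, so that a text-mining tool could treat methods text differently from background text. The `sec`, `title`, and `p` element names and the sample article are assumptions for illustration only; a real corpus may use a different schema.

```python
import xml.etree.ElementTree as ET

# Hand-written sample; element names are assumed, not taken from a real DTD.
SAMPLE = (
    "<article><body>"
    "<sec><title>Background</title><p>Gene X is poorly characterised.</p></sec>"
    "<sec><title>Methods</title><p>Cells were lysed.</p><p>Lysates were blotted.</p></sec>"
    "</body></article>"
)


def text_by_section(xml_text):
    """Map each section title to the list of its direct paragraph texts."""
    root = ET.fromstring(xml_text)
    sections = {}
    for sec in root.iter("sec"):
        title_el = sec.find("title")
        if title_el is not None:
            title = "".join(title_el.itertext()).strip()
        else:
            title = "(untitled)"
        # Collapse whitespace inside each paragraph, keeping inline markup text.
        paragraphs = [" ".join("".join(p.itertext()).split()) for p in sec.findall("p")]
        sections[title] = paragraphs
    return sections


if __name__ == "__main__":
    for title, paragraphs in text_by_section(SAMPLE).items():
        print(title, "->", paragraphs)
```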
![Page 18: talk](https://reader036.fdocuments.us/reader036/viewer/2022070315/554e84a8b4c90526358b4592/html5/thumbnails/18.jpg)
Authoring tools are key
Manuscript structure
– EndNote, TeX/BibTeX pretty good already
MathML
– Publicon, TeX etc.
CML
– Chemsketch etc.
Gene/protein reference markup?
– Semi-automatic markup during authoring
– Author reviews and confirms markup
– System prompts author to clarify ambiguity
– cf. grammar checker, code intelligence
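One way the semi-automatic markup idea could look in practice is a dictionary lookup that proposes identifiers and flags ambiguous names for the author to resolve. This is only a sketch under stated assumptions: the lexicon, the identifiers such as `GENE:0001`, and the function name `suggest_markup` are all hypothetical, and a real authoring tool would need a proper gene/protein database and much more careful matching.

```python
import re

# Hypothetical mini-lexicon: surface form -> candidate identifiers (made up).
LEXICON = {
    "p53": ["GENE:0001"],
    "CAT": ["GENE:0002", "GENE:0003"],  # ambiguous on purpose
}


def suggest_markup(text, lexicon=LEXICON):
    """Return one suggestion per lexicon hit, flagging hits the author must resolve."""
    # Longest terms first so that longer names win over their prefixes.
    terms = sorted(lexicon, key=len, reverse=True)
    pattern = re.compile(r"\b(" + "|".join(re.escape(t) for t in terms) + r")\b")
    suggestions = []
    for match in pattern.finditer(text):
        candidates = lexicon[match.group(1)]
        suggestions.append(
            {
                "term": match.group(1),
                "start": match.start(),
                "end": match.end(),
                "candidates": candidates,
                # More than one candidate means the system should prompt the author.
                "needs_author_choice": len(candidates) > 1,
            }
        )
    return suggestions


if __name__ == "__main__":
    sentence = "Loss of p53 altered CAT expression."
    for s in suggest_markup(sentence):
        action = "ask author" if s["needs_author_choice"] else "confirm"
        print(s["term"], s["candidates"], action)
```

Unambiguous hits only need the author's confirmation, while ambiguous ones trigger a prompt, much as a grammar checker surfaces only the cases it cannot settle itself.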
![Page 19: talk](https://reader036.fdocuments.us/reader036/viewer/2022070315/554e84a8b4c90526358b4592/html5/thumbnails/19.jpg)
Summary
What is Open Access publishing?
Open Access publishing and text mining
BMC Bioinformatics
The BioCreative supplement
![Page 20: talk](https://reader036.fdocuments.us/reader036/viewer/2022070315/554e84a8b4c90526358b4592/html5/thumbnails/20.jpg)
BMC series of online journals
BMC Biochemistry
BMC Bioinformatics
BMC Biotechnology
BMC Cell Biology
BMC Chemical Biology
BMC Developmental Biology
BMC Ecology
BMC Evolutionary Biology
BMC Genetics
BMC Genomics
BMC Immunology
BMC Microbiology
BMC Molecular Biology
BMC Neuroscience
BMC Pharmacology
BMC Physiology
BMC Plant Biology
BMC Structural Biology

BMC Anesthesiology
BMC Blood Disorders
BMC Cancer
BMC Cardiovascular Disorders
BMC Clinical Pathology
BMC Clinical Pharmacology
BMC Complementary and Alternative Medicine
BMC Dermatology
BMC Ear, Nose and Throat Disorders
BMC Emergency Medicine
BMC Endocrine Disorders
BMC Family Practice
BMC Gastroenterology
BMC Geriatrics
BMC Health Services Research
BMC Infectious Diseases
BMC International Health and Human Rights
BMC Medical Education
BMC Medical Ethics
BMC Medical Genetics
BMC Medical Imaging
BMC Medical Informatics and Decision Making
BMC Medical Research Methodology
BMC Musculoskeletal Disorders
BMC Nephrology
BMC Neurology
BMC Nuclear Medicine
BMC Nursing
BMC Ophthalmology
BMC Oral Health
BMC Palliative Care
BMC Pediatrics
BMC Pregnancy and Childbirth
BMC Psychiatry
BMC Public Health
BMC Pulmonary Medicine
BMC Surgery
BMC Urology
BMC Women's Health
![Page 21: talk](https://reader036.fdocuments.us/reader036/viewer/2022070315/554e84a8b4c90526358b4592/html5/thumbnails/21.jpg)
BMC Bioinformatics
![Page 22: talk](https://reader036.fdocuments.us/reader036/viewer/2022070315/554e84a8b4c90526358b4592/html5/thumbnails/22.jpg)
RSS feeds
![Page 23: talk](https://reader036.fdocuments.us/reader036/viewer/2022070315/554e84a8b4c90526358b4592/html5/thumbnails/23.jpg)
Open access leads to high visibility
Indexing/Linking
– PubMed
– MEDLINE
– ISI
– BIOSIS
– CAS
– CrossRef
– Scirus
– Open Archives Initiative
– Citebase
– Google
Archiving
– PubMed Central
– INIST
– LOCKSS
– Max Planck
– OhioLINK
![Page 24: talk](https://reader036.fdocuments.us/reader036/viewer/2022070315/554e84a8b4c90526358b4592/html5/thumbnails/24.jpg)
BMC Bioinformatics - citation impact
[Chart: BMC Bioinformatics, number of articles published and times cited (ISI), 2001 to 2003 (projected); vertical axis 0 to 400]
![Page 25: talk](https://reader036.fdocuments.us/reader036/viewer/2022070315/554e84a8b4c90526358b4592/html5/thumbnails/25.jpg)
Summary
What is Open Access publishing?
Open Access publishing and text mining
About BMC Bioinformatics
The BioCreative supplement
![Page 26: talk](https://reader036.fdocuments.us/reader036/viewer/2022070315/554e84a8b4c90526358b4592/html5/thumbnails/26.jpg)
Process for publishing in BMC Bioinformatics supplement
Follow BMC Bioinformatics ‘Research Article’ instructions for authors
Send articles to BioCreative organizers who will coordinate peer review [do not submit articles online]
Supplement passed on to BioMed Central for XML markup and publication
$400 processing charge/article
![Page 27: talk](https://reader036.fdocuments.us/reader036/viewer/2022070315/554e84a8b4c90526358b4592/html5/thumbnails/27.jpg)
Instructions for authors
![Page 28: talk](https://reader036.fdocuments.us/reader036/viewer/2022070315/554e84a8b4c90526358b4592/html5/thumbnails/28.jpg)
Access to supplement
All articles in supplement covered by BioMed Central’s Open Access licence agreement
– Free access
– Free re-distribution/re-use
Supplement indexed in PubMed and permanently archived in PubMed Central
![Page 29: talk](https://reader036.fdocuments.us/reader036/viewer/2022070315/554e84a8b4c90526358b4592/html5/thumbnails/29.jpg)
That’s it