Post on 18-Jan-2016

EBI is an Outstation of the European Molecular Biology Laboratory.

Literature Resources at the EBI

Information Workshop on European Bioinformatics Resources

Vienna, 3rd-4th September 2009

What is the scientific literature?

Abstracts

Full text research articles

Patents

Grey literature

Books

Accessing the literature: search and browse

PubMed, PubMed Central

Google, Google Scholar

EPO, USPTO

Scopus, WoS, Journal websites

Databases, Websites, blogs, Wikis

Hard copy publications

CiteXplore

Role for CiteXplore

• Opportunity to provide content in the public domain from a number of sources

• Unique position in EBI to add value through other public domain databases

• Text mining capabilities to leverage content

• Forge collaborative relationships to find solutions for different user groups

Components of CiteXplore: Abstracts

CiteXplore: Search Features

• Simple search box

• Query expansion

• Ranking by relevance or publication date

• Some advanced search features e.g. source, full text links

• SOAP web service

http://www.ebi.ac.uk/citexplore/
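
The slides do not say how CiteXplore implements query expansion or ranking, so the following is only a minimal sketch of the two ideas, assuming a hand-made synonym table and a toy relevance score (number of expanded terms found in the title). The names `SYNONYMS`, `Record`, `expand_query`, `relevance` and `search` are all hypothetical and are not part of any CiteXplore interface.

```python
from dataclasses import dataclass

# Hypothetical synonym table; the real expansion vocabulary is not
# described in the slides.
SYNONYMS = {
    "adamts7": ["adamts-7", "adam-ts7"],
}


@dataclass
class Record:
    title: str
    year: int


def expand_query(query):
    """Return the lower-cased query terms plus any known synonyms, in order."""
    terms = []
    for word in query.lower().split():
        for term in [word] + SYNONYMS.get(word, []):
            if term not in terms:
                terms.append(term)
    return terms


def relevance(record, terms):
    """Toy relevance score: how many expanded terms occur in the title."""
    title = record.title.lower()
    return sum(1 for term in terms if term in title)


def search(records, query, sort_by="relevance"):
    """Return matching records, ranked by relevance or by publication year."""
    terms = expand_query(query)
    hits = [r for r in records if relevance(r, terms) > 0]
    if sort_by == "date":
        return sorted(hits, key=lambda r: r.year, reverse=True)
    return sorted(hits, key=lambda r: relevance(r, terms), reverse=True)
```

Calling `search(records, "ADAMTS7")` would also return records that only mention the spelling "ADAMTS-7", and `sort_by="date"` switches from relevance order to newest first.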

Search the website: ADAMTS7 case study

Added value: Anatomy of a CiteXplore record

Full text links

Database links

Text mining

Citation info

Added value: Citation and full text links

Added value: Database Links

Database     #links     CiteXplore records   DB records
UniProtKB    5228552    225854               3801970
InterPro     40011      16730                12646
IntAct       10330      3920                 10330
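
One way to read the table above is as an average number of database links per linked CiteXplore record. The sketch below computes that ratio from the figures on the slide; the name `LINK_STATS` and the function `links_per_citation` are illustrative only, and the ratio is an interpretation rather than something the slides report.

```python
# Figures copied from the table above:
# (number of links, CiteXplore records, database records).
LINK_STATS = {
    "UniProtKB": (5228552, 225854, 3801970),
    "InterPro": (40011, 16730, 12646),
    "IntAct": (10330, 3920, 10330),
}


def links_per_citation(stats):
    """Average number of database links per linked CiteXplore record."""
    return {db: links / cite for db, (links, cite, _db) in stats.items()}


for db, ratio in links_per_citation(LINK_STATS).items():
    print(f"{db}: {ratio:.2f} links per CiteXplore record")
```

On these figures UniProtKB averages roughly 23 links per linked record, while InterPro and IntAct average between 2 and 3.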

Added value: Text mining

Anatomy of a Patent Record

Those are the facts

Now for some fiction ….

Improvements over the next ~6 months include:

• Full text searching

• Addition of content to CiteXplore

• Enhanced text mining functions

• Leveraging of citation information

• Facelift to highlight new features

Full text search

1.5 million articles

1.3 million patents

Search will be available from EBI

Search will be available from UKPMC

~9% of the size of PubMed

Addition of content to CiteXplore

PubMed: 19 million

~ 0.5 million

PubMed Central: 1.8 million

PMC will be a true subset at CiteXplore and UKPMC

Enhanced text mining functions

• Article summaries

• Semantic types:
  • genes/proteins
  • GO terms
  • organisms
  • diseases
  • drugs
  • chemicals

• Leveraging semantic indexing

• Web service for results, where possible
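
Tagging text with semantic types can be illustrated with a simple dictionary lookup. This is a minimal sketch, assuming a tiny hand-made dictionary; it is not the EBI text mining pipeline, and `DICTIONARY` and `annotate` are hypothetical names introduced only for this example.

```python
import re

# Hypothetical mini-dictionary mapping surface forms to semantic types.
DICTIONARY = {
    "adamts7": "gene/protein",
    "homo sapiens": "organism",
    "arthritis": "disease",
    "aspirin": "drug",
}


def annotate(text):
    """Return (start, end, matched_text, semantic_type) tuples in text order."""
    annotations = []
    for term, semantic_type in DICTIONARY.items():
        # Whole-word, case-insensitive match of the dictionary term.
        pattern = r"\b" + re.escape(term) + r"\b"
        for match in re.finditer(pattern, text, flags=re.IGNORECASE):
            annotations.append(
                (match.start(), match.end(), match.group(0), semantic_type)
            )
    return sorted(annotations)
```

The character offsets returned by `annotate` are what a front end would need in order to highlight terms in an abstract.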

Text mining for Patents

Leveraging Citations

Published research articles

Counts

Citation sort order

Social networking

CiteULike

Connotea

Web hits, downloads (cf PLoS One)
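
A citation sort order can be sketched as an ordinary sort on a per-article citation count. The example below assumes each record already carries such a count; the `Article` fields and the tie-break on publication year are assumptions made for illustration, not a description of how CiteXplore orders results.

```python
from dataclasses import dataclass


@dataclass
class Article:
    title: str
    year: int
    cited_by: int  # number of citing articles (assumed to be precomputed)


def sort_by_citations(articles):
    """Most-cited first; ties are broken in favour of the newer article."""
    return sorted(articles, key=lambda a: (-a.cited_by, -a.year))
```

Other signals listed above, such as bookmarks or download counts, could be added as further elements of the sort key in the same way.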

CiteXplore USPs

• Search patent abstracts, Agricola, other

• Search all PubMed and UKPMC abstracts

• Text mining: highlighting for abstracts

• Text mining: e.g. article summaries for Open Access, Patents

• Most complete citation information in the public domain

• Full text patent search

Credits

Core EBI Staff: Peter Stoehr, Sharmila Pilia, Alan Horne, Mark Rijnbeek

Content Providers: PubMed, Agricola, EPO, Chinese Biological Abstracts

Collaborators: Dietrich Rebholz-Schuhmann (EBI text mining); NaCTeM, MIMAS, British Library (UKPMC); European Patent Office

http://www.ebi.ac.uk/citexplore/