Building Bridges: from Europeana Libraries to Europeana Newspapers

Susan Reilly, LIBER (Twitter: @skreilly)

IFLA Newspapers/GENLOC, Helsinki, 13th Aug 2012

Presentation by Susan Reilly at the joint Newspapers & Genealogy session at WLIC2012.

Transcript of Building Bridges: from Europeana Libraries to Europeana Newspapers

Page 1: Building Bridges: from Europeana Libraries to Europeana Newspapers

Building Bridges: from Europeana Libraries to Europeana Newspapers

Susan Reilly, LIBER

Twitter: @skreilly

IFLA Newspapers/GENLOC, Helsinki, 13th Aug 2012

Page 2: Building Bridges: from Europeana Libraries to Europeana Newspapers

This project is partially funded under the ICT Policy Support Programme (ICT PSP) as part of the Competitiveness and Innovation Framework Programme by the European Community http://ec.europa.eu/ict_psp

Overview

• About LIBER

• Introduction to Europeana Newspapers

• The foundation stone: Europeana Libraries

Page 3: Building Bridges: from Europeana Libraries to Europeana Newspapers

LIBER & the European Digital Agenda

Association of European Research Libraries

Our projects:

• Content: Europeana Libraries, Europeana Newspapers

• Policy: MEDOANET

• Infrastructure: APARSEN, AAA Study, ODE

Page 4: Building Bridges: from Europeana Libraries to Europeana Newspapers

Europeana Newspapers

• 17 partner institutions

• 3 years (2012-2015)

• Aggregation of more than 18 million newspapers

• Will use refinement methods for OCR, OLR (article segmentation), named entity recognition (NER), and page class recognition

• Survey existing collections in Europe

• Make content accessible

Page 5: Building Bridges: from Europeana Libraries to Europeana Newspapers

Why newspapers?

“The museum (and the newspaper) today seeks whatever represents normal life in its own native locality and with infinite pains its collections are arranged in a manner which is natural to them in their own habitat”

Lucy Maynard Salmon (1976) in The Newspaper and the Historian

Page 6: Building Bridges: from Europeana Libraries to Europeana Newspapers

Europeana Newspapers: where the content comes from…

Partners (labels from the map): NLF, SBB, ONB, NLP, BnF, NLE, SUB HH, USAL, NLL, KB, LIBER, CCS, NLT, UB, UIBK, LFT, BL

We are looking for more libraries!

Page 7: Building Bridges: from Europeana Libraries to Europeana Newspapers

What we do with the content

• Select 10 million items to be OCR’d

• Structural information by UIBK, e.g. headings, table of contents

• Select 2 million items for OCR and OLR

• Article segmentation and page class recognition by CCS

• Libraries carry out manual correction of recognition and segmentation results

• Named entity recognition applied to English, Dutch and German material

Page 8: Building Bridges: from Europeana Libraries to Europeana Newspapers

Making the content accessible

• OCR enables full text searching

• OLR enables more targeted searching (titles and sections)

• NER enables searching by people and place, and the discovery of new relationships between entities
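As a rough illustration of the three points above, the sketch below builds toy indexes over OCR text, OLR titles, and NER entities. The sample records, field names, and helper functions are all hypothetical and do not reflect the project's actual data model or tooling.

```python
from collections import defaultdict
import re

# Hypothetical sample records: OCR text plus OLR (title) and NER (entities) output.
ARTICLES = [
    {"id": "a1", "title": "Harbour news",
     "text": "The ship arrived in Helsinki on Monday.",
     "entities": {"place": ["Helsinki"], "person": []}},
    {"id": "a2", "title": "Court report",
     "text": "Anna Berg spoke in Vienna about the harbour.",
     "entities": {"place": ["Vienna"], "person": ["Anna Berg"]}},
]

def tokens(s):
    """Split a string into lowercase word tokens."""
    return re.findall(r"\w+", s.lower())

def build_index(articles, field):
    """Inverted index: token -> set of article ids, for one text field."""
    index = defaultdict(set)
    for article in articles:
        for token in tokens(article[field]):
            index[token].add(article["id"])
    return index

def build_entity_index(articles):
    """Index keyed by (entity type, lowercase name) -> set of article ids."""
    index = defaultdict(set)
    for article in articles:
        for kind, names in article["entities"].items():
            for name in names:
                index[(kind, name.lower())].add(article["id"])
    return index

def search(index, key):
    """Return the sorted ids of articles matching the key."""
    return sorted(index.get(key, set()))

full_text = build_index(ARTICLES, "text")    # what OCR makes possible
titles = build_index(ARTICLES, "title")      # what OLR makes possible
entities = build_entity_index(ARTICLES)      # what NER makes possible
```

In this toy data, a full-text query for "harbour" matches only the body of a2, a title query for the same word matches only a1, and an entity query such as `("place", "helsinki")` finds a1 without matching unrelated uses of the word.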

Page 9: Building Bridges: from Europeana Libraries to Europeana Newspapers

No access without aggregation

Europeana Libraries:

• A single library domain aggregator

• Content from European research libraries

• Full-text search capabilities

• Portal for researchers

Access = Critical mass of content:

• 3,319,045 pages

• 598,130 books and theses

• 368,000 articles

• 848,078 images

• 1,200 film and video clips

• 34,000 mixed content objects

Access = Sustainability

Access = Visibility

Page 10: Building Bridges: from Europeana Libraries to Europeana Newspapers

Go to www.theeuropeanlibrary.org

Page 11: Building Bridges: from Europeana Libraries to Europeana Newspapers

Thank you for your attention!

http://www.libereurope.eu

http://www.europeana-newspapers.eu/

http://www.europeana-libraries.eu/

Hall 4/5, stand H104