IFLA 2014 Europeana Newspapers Rossitza Atanassova

Improving the discovery of European Historic Newspapers Rossitza Atanassova, British Library @RossiAtanassova IFLA Newspapers, Lyon, 20 August 2014

Transcript of IFLA 2014 Europeana Newspapers Rossitza Atanassova

Page 1: IFLA 2014 Europeana Newspapers Rossitza Atanassova

Improving the discovery of European Historic Newspapers

Rossitza Atanassova, British Library

@RossiAtanassova

IFLA Newspapers, Lyon, 20 August 2014

Page 2: IFLA 2014 Europeana Newspapers Rossitza Atanassova

This project is partially funded under the ICT Policy Support Programme (ICT PSP) as part of the Competitiveness and Innovation Framework Programme by the European Community

http://ec.europa.eu/ict_psp

Europeana Newspapers is making historic newspaper pages searchable

http://vimeo.com/100313926

Page 3: IFLA 2014 Europeana Newspapers Rossitza Atanassova

Project outcomes

• Content in 22 languages, ranging from the 17th to the 20th century
• 10 million pages of full text
• Article-level records and named entities for 2 million pages
• Aggregation of up to 18 million pages
• Aggregation of metadata for up to an additional 19 million pages
• Cross-searchable newspapers interface at The European Library
• http://www.theeuropeanlibrary.org/tel4/newspapers
• Issue-level metadata via Europeana
http://www.europeana.eu/

Page 4: IFLA 2014 Europeana Newspapers Rossitza Atanassova

Statistics

Currently one can search through:
• full text for over 2 million pages
• metadata records relating to over 1 million issues (links to source libraries)

Page 5: IFLA 2014 Europeana Newspapers Rossitza Atanassova

Page 6: IFLA 2014 Europeana Newspapers Rossitza Atanassova

Search and browse options

Page 7: IFLA 2014 Europeana Newspapers Rossitza Atanassova

Display options

• Metadata, full-text and full zoomable images
• Metadata, full-text and static images (full size or snippets)
• Metadata and full-text
• Metadata

Page 8: IFLA 2014 Europeana Newspapers Rossitza Atanassova

Usability testing

• Remote 60-minute test sessions in April 2014
• Conducted by User Vision, Edinburgh
• 12 participants from 5 countries with a professional or strong personal research interest in the content
• 6 task scenarios
• Pre- and post-test questionnaires
• User Vision Report at http://www.europeana-newspapers.eu/usability-testing-results-for-our-historic-newspapers-browser/

Page 9: IFLA 2014 Europeana Newspapers Rossitza Atanassova

Task success and ease of use ratings

Images in Alan Blackwood, The European Library Newspaper Archive – Usability Testing, 16/04/2014

Page 10: IFLA 2014 Europeana Newspapers Rossitza Atanassova

User response to the interface

• “Strong positive reaction to the availability of the archive”
• “Aggregated view of content from many sources highly valued”
• “Basic search functionalities worked well”
• Presentation of images and image navigation controls are appreciated, as is the display of OCRed text
• Browsing content via the geographical map is popular
• Identified issues with design and functionality: facets, results, navigation
• Further expectations: print, download, saved searches

Page 11: IFLA 2014 Europeana Newspapers Rossitza Atanassova

Before and after

Page 12: IFLA 2014 Europeana Newspapers Rossitza Atanassova

Changes to landing page

• Prominent browse and advanced options
• ‘Discover’ tab for browse options page
• This day in history allows users to scroll through all relevant issues

Page 13: IFLA 2014 Europeana Newspapers Rossitza Atanassova

Changes to browsing options

• Search by issue date modified to include a text input box for the year with auto-suggestions
• Select title from an alphabetical index
• Geographical map of Europe is bigger and uses a better colour palette to indicate the number of issues

Page 14: IFLA 2014 Europeana Newspapers Rossitza Atanassova

Changes to results pages

• Sort by relevance, descending date and ascending date
• Configure number of items per page (10-100)
• Further recommendations: controls to navigate between results, a ‘back to search results’ button and a search input box to allow modification of search terms

Page 15: IFLA 2014 Europeana Newspapers Rossitza Atanassova

Faceted search and newspaper source page

Page 16: IFLA 2014 Europeana Newspapers Rossitza Atanassova

Page 17: IFLA 2014 Europeana Newspapers Rossitza Atanassova

Integration of the viewer into the Europeana portal

Page 18: IFLA 2014 Europeana Newspapers Rossitza Atanassova

Next steps with the browser

• Second usability test in September
• Final version by end of 2014
• Add OCR correction functionality
• Allow access via API
• Further integration of the newspapers viewer within Europeana

Page 19: IFLA 2014 Europeana Newspapers Rossitza Atanassova

Research practices and expectations

• Participants in the usability test have well-established research practices and higher expectations of the site’s functionality
• Preference for search over browsing
• Greater control over search results
• Multiple layers of search through facets
• Would like to search by subject area and historical period
• User account to save search histories
• Download and print options
• New content notifications and feedback submission option

Page 20: IFLA 2014 Europeana Newspapers Rossitza Atanassova

Researchers’ interest in the Europeana Newspapers archive

• Interdisciplinary source of information
• Mass digitised content
• Pan-European cross-searchable archive
• Transnational comparative studies
• Text mining for multilingual content
• Computational analysis and visualisation of the data

Page 21: IFLA 2014 Europeana Newspapers Rossitza Atanassova

What researchers value

“I see enormous value in an archive that breaks down national boundaries automatically, where I can search for content from a range of countries..” – Bob Nicholson

“The difference lies not just in access but in the conversion of a massive amount of print into a searchable resource … This holds the potential to make connections across newspapers in ways previously unimaginable.” – Matt Rubery

“Now software allows us to work with millions of pages. By combining words and expressions, machines uncover patterns that we never even suspected were there …” – Professor Toine Pieters

Page 22: IFLA 2014 Europeana Newspapers Rossitza Atanassova

Digital Humanities approaches to digitised newspaper archives

• ‘Asymmetrical Encounters: E-Humanity Approaches to Reference Cultures in Europe, 1815-1992’
• The project will apply multilingual text mining techniques to long runs of digitised newspapers and other textual materials

Page 23: IFLA 2014 Europeana Newspapers Rossitza Atanassova

The Victorian Meme Machine project

• Partnership between Bob Nicholson, Edge Hill University and British Library Labs
• Extract Victorian jokes from 19th century British newspapers
• Crowdsource transcriptions
• Algorithms to pair text with images
• Share and re-use memes

https://www.youtube.com/watch?v=FN1ZSAz2vMg

Page 24: IFLA 2014 Europeana Newspapers Rossitza Atanassova

Europeana Newspapers Information Days

Page 25: IFLA 2014 Europeana Newspapers Rossitza Atanassova

Final workshop “Newspapers in Europe & the Digital Agenda for Europe”

• British Library, 29-30 September 2014
• The value of digitised historic newspapers
• How to overcome the barriers to improving access to digitised historic newspapers
• Policy makers, researchers, librarians, cultural heritage professionals and newspaper publishers