IIPC GA Curator Tools Fair May 2014 WEB CURATOR TOOL Nicola Bingham Web Archivist.


Page 1:

IIPC GA Curator Tools Fair, May 2014

WEB CURATOR TOOL

Nicola Bingham

Web Archivist

Page 2:

Introduction

• Jointly developed by the BL and NLNZ in 2006 under the auspices of the IIPC

• WCT manages the selective web harvesting process

• Designed for use in libraries by non-technical users

• Open source

• Uses the Heritrix web crawler

Page 3:

What it does and doesn’t do.

• Appraisal and selection: choosing websites for capture
– Subject specialists, curators, external agencies
– BL uses a selection permission tool plugged into WCT

• Metadata/Description
– Basic Dublin Core metadata
– Titles, description, subject and collection tagging

• Scoping and Data Capture
– Scheduling
– Crawl parameters, e.g. path depth, size of download

• QA and Analysis
– Heritrix log files
– Browse tools
– Recommendations based on indicators
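The crawl parameters mentioned under Scoping and Data Capture (path depth, size of download) can be pictured with a small sketch. This is illustrative only: the `CrawlLimits` class, its field names and the `in_scope` helper are hypothetical and are not taken from WCT or Heritrix, which have their own profile settings.

```python
from dataclasses import dataclass
from urllib.parse import urlparse


@dataclass
class CrawlLimits:
    """Hypothetical per-target limits; not WCT's actual profile fields."""
    max_path_depth: int        # deepest URL path allowed, in segments
    max_download_bytes: int    # total download budget for one harvest


def path_depth(url: str) -> int:
    """Count the non-empty path segments of a URL."""
    return len([seg for seg in urlparse(url).path.split("/") if seg])


def in_scope(url: str, downloaded_bytes: int, limits: CrawlLimits) -> bool:
    """Return True while the URL is shallow enough and budget remains."""
    return (path_depth(url) <= limits.max_path_depth
            and downloaded_bytes < limits.max_download_bytes)


# Assumed example values: two path segments deep, roughly 10 MB per harvest.
limits = CrawlLimits(max_path_depth=2, max_download_bytes=10_000_000)
print(in_scope("http://example.org/news/2014/", 0, limits))
print(in_scope("http://example.org/news/2014/05/post.html", 0, limits))
```

A real crawler applies such limits while it fetches pages; the sketch only shows the kind of decision a curator is configuring when setting these values.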

Page 4:

What it does and doesn’t do (continued)

• Storage and Organisation
– WARC files created in WCT
– Passed out of WCT for indexing and long term storage

• Access/Use/Reuse
– Wayback is plugged in as the access tool
– Harvested sites can be viewed within the tool

• Risk Management
– Harvest Authorisation module, rights metadata
– Records the outcome of publisher communications
– Control the display of Targets
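As a rough illustration of how a Wayback-style access tool addresses a harvested page, the sketch below builds a replay URL from a capture time and the original address, using the common Wayback convention of a 14-digit `YYYYMMDDhhmmss` timestamp followed by the original URL. The base address `http://localhost:8080/wayback` and the function names are assumptions made for the example; the deck does not describe the BL deployment's actual URLs.

```python
from datetime import datetime, timezone


def wayback_timestamp(moment: datetime) -> str:
    """Format a timezone-aware capture time as a 14-digit UTC timestamp."""
    return moment.astimezone(timezone.utc).strftime("%Y%m%d%H%M%S")


def replay_url(base: str, moment: datetime, target: str) -> str:
    """Join base, timestamp and original URL into a replay address."""
    return f"{base.rstrip('/')}/{wayback_timestamp(moment)}/{target}"


# Hypothetical base address and capture time, for illustration only.
captured = datetime(2014, 5, 20, 9, 30, 0, tzinfo=timezone.utc)
print(replay_url("http://localhost:8080/wayback", captured, "http://www.bl.uk/"))
```

Keeping the capture time in the address is what lets one archive hold many versions of the same page side by side.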

Page 5:

Development

• Latest version 1.6.1 available now

• UI new features and improvements (x17), including:
– Date pickers for date fields
– Scheduling heat map
– Harvest optimisation

• Bug fixes (x11)

• Development related, e.g.:
– No longer need to install Apache Tomcat server or database etc.

• NLNZ budgeted NZD 50,000 for 2014-15

• Open up the development process to all WCT users
– WCT pages: http://webcurator.sourceforge.net/
– Wiki: http://sourceforge.net/projects/webcurator/ (code, support, mailing lists, bug tracker)

Page 6:

Thank you.

UK Web Archive
http://www.webarchive.org.uk
http://britishlibrary.typepad.co.uk/webarchive/
@UKWebArchive

[email protected]