Converting Millennium ILS Bibliographic records into Dublin-Core XML format for DSpace

Alan Ng, Hong Kong University Libraries. PNC 2009 Annual Conference and Joint Meetings, Taipei, Taiwan.


Page 1: Converting Millennium ILS Bibliographic records into Dublin-Core XML format for DSpace

Converting Millennium ILS Bibliographic records into Dublin-Core XML format for DSpace

Alan Ng
Hong Kong University Libraries

PNC 2009 Annual Conference and Joint Meetings
Taipei, Taiwan

Page 2: Converting Millennium ILS Bibliographic records into Dublin-Core XML format for DSpace

Introduction

Page 3: Converting Millennium ILS Bibliographic records into Dublin-Core XML format for DSpace

HKU Libraries

•established in 1912
•the oldest academic library in HK
•main library and 6 branches

Page 4: Converting Millennium ILS Bibliographic records into Dublin-Core XML format for DSpace

HKU Libraries

•2.84M total physical volumes
•49K print periodical titles
•80K electronic periodical titles
•1.90M e-books

Page 5: Converting Millennium ILS Bibliographic records into Dublin-Core XML format for DSpace

HKU Libraries

•Millennium ILS from Innovative Interfaces, Inc.

•hosting the HKALL union catalog for 8 university libraries in HK

Page 6: Converting Millennium ILS Bibliographic records into Dublin-Core XML format for DSpace

Institutional Repository

Page 7: Converting Millennium ILS Bibliographic records into Dublin-Core XML format for DSpace

HKU Scholars Hub

•collects the intellectual output of HKU for full-text open access

•http://hub.hku.hk/

Page 8: Converting Millennium ILS Bibliographic records into Dublin-Core XML format for DSpace

HKU Scholars Hub

•uses DSpace (version 1.5)
•OAI-compliant
•implements DCMI

Page 9: Converting Millennium ILS Bibliographic records into Dublin-Core XML format for DSpace

HKU Scholars Hub

•25300+ records (as of June 2009)
•articles
•conference papers
•postgraduate theses and others
•1.6M downloads (as of June 2009)

Page 10: Converting Millennium ILS Bibliographic records into Dublin-Core XML format for DSpace

HKU Scholars Hub

•some records originate from the OPAC
•HKU postgraduate theses
•digital editions from HKU Press
•bibliographic MARC fields are mapped to DC XML data

Page 11: Converting Millennium ILS Bibliographic records into Dublin-Core XML format for DSpace

MARC to DC mapping

MARC field    DC element -- qualifier
001           identifier -- other
008           language
020           identifier -- isbn
022           identifier -- issn
050           subject -- lcc
092|a|b       subject -- dcc
110|a         contributor -- author
245|a|b       title
260|b         publisher
260|c         date -- issued
300|a|b|c     format -- extent
490|a         relation -- ispartofseries
5XX           description
650           subject -- lcsh
710|a|b       contributor -- other
856|u         identifier
970           description -- tableofcontents
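The mapping above could be held in a program as a simple lookup table. The sketch below is in Python rather than the Perl used in the talk, and it is only an illustration: the entries are transcribed from the slide, while the dictionary layout, the name `MARC_TO_DC`, and the helper `dc_field_name` are assumptions, not the author's code.

```python
# MARC tag (plus subfield codes, as written on the slide) mapped to a
# (DC element, qualifier) pair. A qualifier of None means "unqualified".
# The entries come from the mapping slide; the structure is illustrative.
MARC_TO_DC = {
    "001": ("identifier", "other"),
    "008": ("language", None),
    "020": ("identifier", "isbn"),
    "022": ("identifier", "issn"),
    "050": ("subject", "lcc"),
    "092|a|b": ("subject", "dcc"),
    "110|a": ("contributor", "author"),
    "245|a|b": ("title", None),
    "260|b": ("publisher", None),
    "260|c": ("date", "issued"),
    "300|a|b|c": ("format", "extent"),
    "490|a": ("relation", "ispartofseries"),
    "5XX": ("description", None),
    "650": ("subject", "lcsh"),
    "710|a|b": ("contributor", "other"),
    "856|u": ("identifier", None),
    "970": ("description", "tableofcontents"),
}


def dc_field_name(marc_key):
    """Return a dotted name such as 'dc.identifier.isbn' for a mapping key."""
    element, qualifier = MARC_TO_DC[marc_key]
    return f"dc.{element}.{qualifier}" if qualifier else f"dc.{element}"
```

Keeping the mapping as data rather than as branching logic means a new MARC field can be supported by adding one line.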

Page 12: Converting Millennium ILS Bibliographic records into Dublin-Core XML format for DSpace

http://library.hku.hk/record=b4200627

A record in OPAC

Page 13: Converting Millennium ILS Bibliographic records into Dublin-Core XML format for DSpace

Same record in Hub

http://hub.hku.hk/handle/123456789/55513

Page 14: Converting Millennium ILS Bibliographic records into Dublin-Core XML format for DSpace

Automated batch processing

Page 15: Converting Millennium ILS Bibliographic records into Dublin-Core XML format for DSpace

Incentives

•needs to convert 100+ records at a time

•manual conversion is tedious and error-prone

•time consuming

Page 16: Converting Millennium ILS Bibliographic records into Dublin-Core XML format for DSpace

Automated approach

•efficiency
•accuracy
•eliminates duplicated data-entry effort
•easier quality control of converted data

Page 17: Converting Millennium ILS Bibliographic records into Dublin-Core XML format for DSpace

Perl programming

•free of charge
•easy to program
•powerful in handling plain text in MARC
•runs on any computer platform
•needs a persistent URL syntax to locate a particular record on OPAC

Page 18: Converting Millennium ILS Bibliographic records into Dublin-Core XML format for DSpace

Perl programming

•reads in a list of bibliographic record numbers
•captures the MARC records on OPAC in real time, one by one, via HTTP
•regards the returned HTML as plain text

Page 19: Converting Millennium ILS Bibliographic records into Dublin-Core XML format for DSpace

MARC record as seen by human

http://library.hku.hk/search~S6?/.b4200627/.b4200627/1%2C1%2C1%2CB/marc~b4200627

Page 20: Converting Millennium ILS Bibliographic records into Dublin-Core XML format for DSpace

MARC record as seen by program

http://library.hku.hk/search~S6?/.b4200627/.b4200627/1%2C1%2C1%2CB/marc~b4200627
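The fetch step described on the preceding slides (read a list of record numbers, request each MARC display over HTTP) might look roughly like this. It is a Python sketch, not the talk's Perl program. The two URL patterns are copied from the example URLs in this deck, and assuming they hold for every record is a guess; the function names, the "#" comment convention in the input list, and the UTF-8 decoding are likewise assumptions.

```python
import urllib.request

# Host taken from the example URLs in the deck.
BASE = "http://library.hku.hk"


def record_url(bib):
    """Persistent URL of a record, following the pattern on the OPAC slide."""
    return f"{BASE}/record={bib}"


def marc_url(bib):
    """MARC display URL, following the pattern on the MARC record slides."""
    return f"{BASE}/search~S6?/.{bib}/.{bib}/1%2C1%2C1%2CB/marc~{bib}"


def read_record_numbers(lines):
    """Yield record numbers, skipping blank lines and '#' comment lines.

    The comment convention is an assumption made for this sketch.
    """
    for line in lines:
        value = line.strip()
        if value and not value.startswith("#"):
            yield value


def fetch_marc_page(bib, timeout=30):
    """Fetch the MARC display page for one record and return it as text.

    The page encoding is assumed to be UTF-8; undecodable bytes are replaced.
    """
    with urllib.request.urlopen(marc_url(bib), timeout=timeout) as response:
        return response.read().decode("utf-8", errors="replace")
```

A driver would open the list file, pass it to `read_record_numbers`, and call `fetch_marc_page` once per number, which matches the one-by-one retrieval described in the slides.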

Page 21: Converting Millennium ILS Bibliographic records into Dublin-Core XML format for DSpace

Perl programming

•extracts the essential MARC fields using Regular Expression
•constructs the DC fields according to the mapping table
•converts 100+ records in a couple of minutes
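The extraction step can be illustrated with a small regular-expression parser. This is a Python sketch, not the author's Perl. The sample page below is made up, and its layout (a three-digit tag, two indicator characters, then subfields introduced by "|" with an implicit leading |a) is an assumption about how the OPAC renders MARC; only three of the mapped fields are handled.

```python
import html
import re

# Hypothetical MARC display page; the values and the layout are invented
# for illustration and are not taken from a real HKU record.
SAMPLE_PAGE = """<html><body><pre>
020    9780000000002
245 10 An example title :|ba made-up subtitle /|cA. Author
260    Hong Kong :|bExample Press,|c2009
</pre></body></html>"""

# One field per line: tag, space, two indicator characters, space, body.
FIELD_RE = re.compile(r"^(\d{3}) (.{2}) (.*)$", re.MULTILINE)


def html_to_text(page):
    """Treat the HTML as plain text: drop tags, decode character entities."""
    return html.unescape(re.sub(r"<[^>]+>", "", page))


def parse_fields(text):
    """Return a list of (tag, {subfield code: value}) pairs."""
    fields = []
    for tag, _indicators, body in FIELD_RE.findall(text):
        if not body.startswith("|"):
            body = "|a" + body  # first subfield assumed to be an implicit |a
        subfields = {}
        for code, value in re.findall(r"\|(\w)([^|]*)", body):
            subfields.setdefault(code, value.strip())
        fields.append((tag, subfields))
    return fields


def clean(value):
    """Trim the ISBD punctuation that trails a MARC subfield."""
    return value.rstrip(" /:,;")


def to_dc(fields):
    """Build (element, qualifier, value) triples for 020, 245 and 260."""
    dc = []
    for tag, sub in fields:
        if tag == "020" and "a" in sub:
            dc.append(("identifier", "isbn", sub["a"]))
        elif tag == "245":
            title = " ".join(sub[c] for c in "ab" if c in sub)
            dc.append(("title", None, clean(title)))
        elif tag == "260":
            if "b" in sub:
                dc.append(("publisher", None, clean(sub["b"])))
            if "c" in sub:
                dc.append(("date", "issued", clean(sub["c"])))
    return dc
```

Calling `to_dc(parse_fields(html_to_text(SAMPLE_PAGE)))` yields one triple per mapped subfield group, ready to be written out as DC XML.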

Page 22: Converting Millennium ILS Bibliographic records into Dublin-Core XML format for DSpace

Converted record in DC XML format
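The slide image is not preserved in this transcript, so the exact output of the author's program is unknown. As a rough Python illustration, the sketch below writes DC fields in the `dublin_core.xml` layout used by DSpace's Simple Archive Format (`dcvalue` elements with `element` and `qualifier` attributes). That the author's program targets this layout is an assumption, and the function name and sample values are hypothetical.

```python
import xml.etree.ElementTree as ET


def build_dublin_core(dc_fields):
    """Serialize (element, qualifier, value) triples as a dublin_core document.

    A missing qualifier is written as "none", the convention used by the
    Simple Archive Format.
    """
    root = ET.Element("dublin_core")
    for element, qualifier, value in dc_fields:
        node = ET.SubElement(
            root, "dcvalue", element=element, qualifier=qualifier or "none"
        )
        node.text = value  # ElementTree escapes &, < and > on output
    return ET.tostring(root, encoding="unicode")


# Made-up values for illustration only.
example_xml = build_dublin_core(
    [
        ("title", None, "An example title"),
        ("publisher", None, "Example Press & Co"),
        ("date", "issued", "2009"),
    ]
)
```

Building the tree with an XML library instead of string concatenation keeps characters such as "&" in titles and publisher names from producing malformed output.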

Page 23: Converting Millennium ILS Bibliographic records into Dublin-Core XML format for DSpace

Running Perl program

•runs natively on Unix, Linux and Mac OS X

•needs Perl interpreter on Windows
•download ActivePerl
•http://www.activestate.com/activeperl/

Page 24: Converting Millennium ILS Bibliographic records into Dublin-Core XML format for DSpace

Running the program on Mac OS X

Page 25: Converting Millennium ILS Bibliographic records into Dublin-Core XML format for DSpace

Demo

Page 26: Converting Millennium ILS Bibliographic records into Dublin-Core XML format for DSpace

Recap

Page 27: Converting Millennium ILS Bibliographic records into Dublin-Core XML format for DSpace

Recap

•uses existing MARC records for DSpace
•uses a Perl program for fast batch conversion
•retrieves MARC in real time via HTTP
•works with any OPAC that has persistent URLs
•source code is free for sharing

Page 28: Converting Millennium ILS Bibliographic records into Dublin-Core XML format for DSpace

Q & A

Page 29: Converting Millennium ILS Bibliographic records into Dublin-Core XML format for DSpace

Thank You !!

My contact : [email protected]