ARCHIVE IMAGING SEARCHABLE VIA THE WEBPAC

Marthie de Kock, The Hong Kong Institute of Education, 9 December 2002

Transcript of ARCHIVE IMAGING SEARCHABLE VIA THE WEBPAC

Page 1: ARCHIVE IMAGING  SEARCHABLE VIA THE WEBPAC

ARCHIVE IMAGING SEARCHABLE VIA THE WEBPAC

Marthie de Kock The Hong Kong Institute of Education

9 December 2002

Page 2: ARCHIVE IMAGING  SEARCHABLE VIA THE WEBPAC

Education Imaging System (EdIS)

Hong Kong Institute of Education Library

Page 3: ARCHIVE IMAGING  SEARCHABLE VIA THE WEBPAC

Points for discussion

• Scope and functions

• EdIS Phase I

• EdIS Phase II

• Background

• Different document classes

• Data retrieval & searching

• INNOPAC and the Z server

Page 4: ARCHIVE IMAGING  SEARCHABLE VIA THE WEBPAC

Scope

• Provide a sophisticated system to manage the growing body of electronic media (text, black & white scanned images, colour photos, audio, video and multimedia presentations) held by or available to the HKIEd Library.

• Provide an effective web interface to retrieve on-line digitised materials.

Page 5: ARCHIVE IMAGING  SEARCHABLE VIA THE WEBPAC

System Functions

• Capture of content, storage & management

• Scanning & OCR

• Supports both English and Chinese indexing and full text searching
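The slides do not say how EdIS indexes mixed-language text. As a minimal sketch only, assuming a simple inverted index in which English is split into words and Chinese into single characters (a common baseline, not necessarily what the IBM or Vitova products do), bilingual full-text searching could look like this. The document ids and sample texts are hypothetical.

```python
import re
from collections import defaultdict

# Runs of ASCII letters/digits become word tokens;
# each CJK ideograph (U+4E00..U+9FFF) becomes its own token.
TOKEN_RE = re.compile(r"[A-Za-z0-9]+|[\u4e00-\u9fff]")


def tokenize(text):
    """Split mixed English/Chinese text into lowercase words and single characters."""
    return [token.lower() for token in TOKEN_RE.findall(text)]


def build_index(documents):
    """Map each token to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for token in tokenize(text):
            index[token].add(doc_id)
    return index


def search(index, query):
    """Return the ids of documents that contain every token of the query (AND search)."""
    tokens = tokenize(query)
    if not tokens:
        return set()
    result = set(index.get(tokens[0], set()))
    for token in tokens[1:]:
        result &= index.get(token, set())
    return result


# Hypothetical sample records, for illustration only.
documents = {
    "clip-001": "Dance education in Hong Kong schools",
    "clip-002": "香港教育學院 圖書館",
    "exam-001": "Education studies examination paper",
}
index = build_index(documents)
print(sorted(search(index, "education")))
print(sorted(search(index, "教育")))
```

Because Chinese is indexed character by character here, a query such as 教育 matches any document that contains both characters, not only documents where they are adjacent; a production system would typically add phrase or n-gram matching.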

Page 6: ARCHIVE IMAGING  SEARCHABLE VIA THE WEBPAC

Background

First digital library initiative of the HKIEd Library

• Joint project between IBM and the Library, with technical support from ITS

• July 1997 - contract signed with IBM for its Digital Library

• 23 June 1998 - the system was launched

Page 7: ARCHIVE IMAGING  SEARCHABLE VIA THE WEBPAC

Search Interface of EdIS > The Main Screen

Page 8: ARCHIVE IMAGING  SEARCHABLE VIA THE WEBPAC

Contents of EdIS Phase I: Four Document Types

Document type: digitised items

• Newspaper clippings: image scanning & OCR

• Examination papers: image scanning & OCR

• Curriculum materials: multimedia objects

• Student projects: multimedia objects

Page 9: ARCHIVE IMAGING  SEARCHABLE VIA THE WEBPAC

Document Types: News Clippings & Exam Papers

• News clippings:

  • Past newspaper clippings

  • Scanning, OCR, indexing

  • Wiser News indexing & CMC operations

• Exam papers:

  • Departments

  • Scanning, OCR, indexing

Page 10: ARCHIVE IMAGING  SEARCHABLE VIA THE WEBPAC

Document Types: Curriculum Materials & Student Projects

• Digitising procedures included:

  • Content analysis

  • Categorising multimedia objects

  • Writing a summary

  • Digitising materials, saving files with logical file names, designing web pages & preparing scripts for uploading

  • Uploading documents & testing

Page 11: ARCHIVE IMAGING  SEARCHABLE VIA THE WEBPAC

Basic Search Screen of Curriculum Materials

Page 12: ARCHIVE IMAGING  SEARCHABLE VIA THE WEBPAC

Search results screen of [Title = dance]

Page 13: ARCHIVE IMAGING  SEARCHABLE VIA THE WEBPAC

Selecting the target page from the hit list

Page 14: ARCHIVE IMAGING  SEARCHABLE VIA THE WEBPAC

EdIS Phase II

• Include Archive materials

• Improve multimedia searching

• Search Archive materials via INNOPAC

• No response – IBM’s DL and CMC

• June 2001 new Tender specifications

• Vitova

Page 15: ARCHIVE IMAGING  SEARCHABLE VIA THE WEBPAC

EdIS Phase II Development

• Customise system

• Project development – July 2001

• Z server

• System delivered – April 2002

• Interface – uploading of Wiser News

Page 16: ARCHIVE IMAGING  SEARCHABLE VIA THE WEBPAC

System Architecture

Three subsystems:

• Client subsystem

  • Front-end PC workstations with a Netscape or Microsoft web browser are available for record retrieval and viewing.

• Capturing subsystem

  • Used for content preparation (scanning, OCR and indexing)

• Server subsystem

  • The production server stores records and manages the system's operations

Page 17: ARCHIVE IMAGING  SEARCHABLE VIA THE WEBPAC

Configuration

• Hardware:

  • Sun Enterprise 250 server

  • 36 GB data storage space

  • Configured as RAID 1 (disk mirroring)

• Operating software:

  • Oracle Database 8i for Sun SPARC Solaris 2.7

  • Z39.50 server for document searching
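The slides name a Z39.50 server but do not show how a search is expressed. As an illustrative sketch, a title search such as [Title = dance] can be written in Prefix Query Notation (the notation used by the YAZ toolkit) with Bib-1 "use" attributes. The helper names `pqn_term` and `pqn_and` are hypothetical, and nothing here is taken from the VitalDoc or INNOPAC configuration; only the Bib-1 attribute numbers come from the Z39.50 attribute set.

```python
# Bib-1 "use" attribute values from the Z39.50 Bib-1 attribute set.
BIB1_USE = {"title": 4, "subject": 21, "author": 1003, "any": 1016}


def pqn_term(field, term):
    """Build one PQN clause, e.g. @attr 1=4 "dance" for a title search."""
    if field not in BIB1_USE:
        raise ValueError(f"unsupported field: {field}")
    if '"' in term:
        # Keep the sketch simple: reject quotes instead of escaping them.
        raise ValueError("double quotes are not supported in this sketch")
    return f'@attr 1={BIB1_USE[field]} "{term}"'


def pqn_and(*clauses):
    """Combine clauses with the binary, prefix-form @and operator."""
    if not clauses:
        raise ValueError("at least one clause is required")
    query = clauses[0]
    for clause in clauses[1:]:
        query = f"@and {query} {clause}"
    return query


title_query = pqn_term("title", "dance")
combined_query = pqn_and(title_query, pqn_term("subject", "education"))
print(title_query)
print(combined_query)
```

A Z39.50 client would send a query like this to the server and receive matching records; which attributes a given server actually supports depends on its configuration.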

Page 18: ARCHIVE IMAGING  SEARCHABLE VIA THE WEBPAC

Hardware and software

• Application software:

  • VitalDoc Document Imaging system, 40-user license

  • Two VitalScan licenses for desktop scanning and OCR

  • Chinese OCR: Tsinghua Wintone ver. 8.0

Page 21: ARCHIVE IMAGING  SEARCHABLE VIA THE WEBPAC

Other hardware

• Two scanning/OCR workstations

• Minolta PS7000 Scanner

• Ricoh IS330DC DF and Flatbed scanner

Page 27: ARCHIVE IMAGING  SEARCHABLE VIA THE WEBPAC

Typical Searching Procedure

• Enter Searching Criteria

• Browsing Hit List

• View Result/Content

• Review History

• New Search

• Select Class/Database
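The searching procedure above reads like a small state machine. The sketch below models it that way, purely as an illustration: the step names come from the slide, but the order and the allowed transitions (including treating "New Search" as a jump back to selecting a class/database) are assumptions, since the original diagram's arrows did not survive.

```python
from enum import Enum


class Step(Enum):
    SELECT_CLASS = "Select Class/Database"
    ENTER_CRITERIA = "Enter Searching Criteria"
    BROWSE_HITS = "Browsing Hit List"
    VIEW_RESULT = "View Result/Content"
    REVIEW_HISTORY = "Review History"


# Assumed transitions; "New Search" is modelled as a return to SELECT_CLASS.
TRANSITIONS = {
    Step.SELECT_CLASS: {Step.ENTER_CRITERIA},
    Step.ENTER_CRITERIA: {Step.BROWSE_HITS},
    Step.BROWSE_HITS: {Step.VIEW_RESULT, Step.SELECT_CLASS},
    Step.VIEW_RESULT: {Step.BROWSE_HITS, Step.REVIEW_HISTORY, Step.SELECT_CLASS},
    Step.REVIEW_HISTORY: {Step.BROWSE_HITS, Step.SELECT_CLASS},
}


def is_valid_path(steps):
    """Check that each consecutive pair of steps is an allowed transition."""
    return all(nxt in TRANSITIONS[cur] for cur, nxt in zip(steps, steps[1:]))


session = [
    Step.SELECT_CLASS,
    Step.ENTER_CRITERIA,
    Step.BROWSE_HITS,
    Step.VIEW_RESULT,
    Step.SELECT_CLASS,  # user starts a new search
]
print(is_valid_path(session))
```

Writing the flow as an explicit transition table makes it easy to check a recorded session against the intended procedure, or to adjust the table if the real interface allows other jumps.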

Page 42: ARCHIVE IMAGING  SEARCHABLE VIA THE WEBPAC

Future?

End