Image Workflow Processes Elspeth Haston, Robert Cubey, Martin Pullan & David J Harris.

Page 3

Large scale digitisation programmes are becoming more common, resulting in:

Large numbers of files – potentially nearly 3,000,000 for Edinburgh herbarium (E)

High quality images

Large file size – c. 150MB each

Images captured with minimal data records

These images need to be managed and made available, and the scale is too large for completely manual processes
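To make the scale concrete, the two figures above can be multiplied out. This is only a back-of-envelope sketch: it assumes every one of the potential 3,000,000 files is close to 150 MB and uses decimal units, neither of which the slide states.

```python
# Back-of-envelope storage estimate from the figures on this slide.
N_IMAGES = 3_000_000   # potential number of files for the Edinburgh herbarium (E)
MB_PER_IMAGE = 150     # approximate size of each image, in MB

total_mb = N_IMAGES * MB_PER_IMAGE
total_tb = total_mb / 1_000_000   # assumption: decimal units, 1 TB = 1,000,000 MB

print(f"{total_tb:.0f} TB")  # prints 450 TB
```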

Page 4

[Workflow diagram. Stages: Capture Image, Edit Image, Save Image, Dropbox, Image polling & metadata capture, OCR, QC, Create jpg & zoomify, Serve Online, Save tiff & raw, Archive. Legend: User, System.]

Image workflow being developed at RBGE incorporating:

image capture

automated image processing

metadata recording

optical character recognition

quality control

image streaming online

archiving

Page 5

Capture Image

Edit Image

Save Image

Image captured using digital camera or scanner

Image edited in Leaf Capture software and/or Adobe Photoshop

Images saved into folders

batches consisting of ¼ day’s work are checked for quality prior to being transferred

Page 6

A series of dropbox folders are used to facilitate the use of parallel processing

an internal folder structure contains the equipment and operator names which form part of the metadata

The image management system polls the dropbox folders

any new image files are registered in a MySQL database and the metadata (equipment, operator, date, etc.) are recorded

Image polling & metadata capture

Dropbox
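The polling step on this slide can be sketched roughly as follows. This is a minimal illustration, not the RBGE implementation: it assumes a folder layout of root/equipment/operator/file, uses Python's built-in sqlite3 as a stand-in for MySQL, takes the date from the file's modification time, and the table name `images` and the function names are hypothetical.

```python
import sqlite3
from datetime import datetime, timezone
from pathlib import Path

IMAGE_SUFFIXES = {".tif", ".tiff"}  # assumption: which file types count as images


def open_registry(db_path=":memory:"):
    """Open the registry and create the (hypothetical) images table if needed."""
    conn = sqlite3.connect(db_path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS images (
               path TEXT PRIMARY KEY,
               equipment TEXT,
               operator TEXT,
               modified TEXT
           )"""
    )
    return conn


def poll_dropbox(root, conn):
    """Register image files not yet in the table and return their paths."""
    root = Path(root)
    new_paths = []
    # Assumed layout: <root>/<equipment>/<operator>/<image file>
    for path in sorted(root.glob("*/*/*")):
        if not path.is_file() or path.suffix.lower() not in IMAGE_SUFFIXES:
            continue
        known = conn.execute(
            "SELECT 1 FROM images WHERE path = ?", (str(path),)
        ).fetchone()
        if known:
            continue
        # The folder names supply part of the metadata.
        equipment, operator = path.relative_to(root).parts[:2]
        modified = datetime.fromtimestamp(
            path.stat().st_mtime, tz=timezone.utc
        ).isoformat()
        conn.execute(
            "INSERT INTO images VALUES (?, ?, ?, ?)",
            (str(path), equipment, operator, modified),
        )
        new_paths.append(str(path))
    conn.commit()
    return new_paths
```

Calling `poll_dropbox` repeatedly (for example from a scheduled job) registers each file once; a second pass over unchanged folders returns an empty list.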

Page 7

A copy of the image is processed using ABBYY Optical Character Recognition (OCR) software

the text is recorded in the MySQL database to facilitate searching

a pdf is available to help users carry out additional data entry from the image

OCR
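How recorded OCR text can support searching is sketched below. The OCR step itself is not shown, since it depends on the ABBYY software. The sketch uses sqlite3 as a stand-in for MySQL and a simple substring match; the table name `ocr_text`, the function names and the sample label text are all invented for illustration.

```python
import sqlite3


def store_ocr_text(conn, image_id, text):
    """Save (or replace) the OCR text recorded for one image."""
    conn.execute(
        "INSERT OR REPLACE INTO ocr_text (image_id, text) VALUES (?, ?)",
        (image_id, text),
    )


def search_ocr_text(conn, term):
    """Return the ids of images whose OCR text contains the term."""
    rows = conn.execute(
        "SELECT image_id FROM ocr_text WHERE text LIKE ? ORDER BY image_id",
        (f"%{term}%",),
    )
    return [row[0] for row in rows]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE ocr_text (image_id TEXT PRIMARY KEY, text TEXT)")
# Placeholder label text, not real specimen data.
store_ocr_text(conn, "img-001", "Flora of Nepal. Coll. A. Botanist, 1965")
store_ocr_text(conn, "img-002", "Herbarium label, locality illegible")

print(search_ocr_text(conn, "nepal"))  # prints ['img-001']
```

A substring match is the simplest option; a production MySQL database could use a full-text index instead, at the cost of some extra configuration.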

QC

We are developing a quality control checking process that:

provides an interface for a user to open images and record a quality assessment

enables correction and appending or overwriting as appropriate
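One possible reading of "appending or overwriting" is that a later quality assessment either adds to the earlier notes or replaces them. The sketch below illustrates that reading only; the class `QCRecord`, its fields and the status values are hypothetical and are not taken from the system under development.

```python
from dataclasses import dataclass, field


@dataclass
class QCRecord:
    """Quality assessment for one image (hypothetical structure)."""
    image_id: str
    status: str = "unchecked"
    notes: list = field(default_factory=list)

    def assess(self, status, note, overwrite=False):
        """Record an assessment, appending to or overwriting earlier notes."""
        if overwrite:
            self.notes = [note]
        else:
            self.notes.append(note)
        self.status = status


record = QCRecord("img-001")
record.assess("fail", "label out of focus")
record.assess("pass", "re-imaged and checked")          # appended
record.assess("pass", "final check", overwrite=True)    # replaces earlier notes
```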

Additional modular components

Page 8

The image management system creates a jpg and a zoomify version of the image files

The tiff and the raw files are saved into a zip folder
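Bundling the tiff and raw files can be sketched with Python's standard zipfile module. The jpg and zoomify steps are not shown because they depend on imaging software. The function name is hypothetical, and storing the files without compression is an assumption, not something the slide specifies.

```python
import zipfile
from pathlib import Path


def bundle_masters(tiff_path, raw_path, zip_path):
    """Write the tiff and raw files for one image into a single zip file."""
    # ZIP_STORED (no compression) is an assumption; the slide does not say
    # whether the files are compressed.
    with zipfile.ZipFile(zip_path, "w", compression=zipfile.ZIP_STORED) as zf:
        for src in (Path(tiff_path), Path(raw_path)):
            # Store only the file name, not the full source path.
            zf.write(src, arcname=src.name)
    return Path(zip_path)
```

Keeping both files for a specimen in one zip means that a single archive entry is enough to restore everything captured for that image.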

Create jpg & zoomify

Serve Online

Archive

Save tiff & raw

The zoomify files are served online, enabling users to zoom in and examine the specimen in detail

The zip folders comprising the tiff and the raw file are then archived onto tape and external hard drives
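Zoomable viewers of this kind typically serve the image as a pyramid of small tiles, so that only the visible region is downloaded at each zoom level. The sketch below estimates how many tiles such a pyramid holds. The tile size of 256 pixels, the halving between tiers and the rounding are assumptions made for illustration, and the image dimensions are arbitrary.

```python
import math


def pyramid_tiers(width, height, tile=256):
    """Return (width, height, n_tiles) for each tier, smallest tier first."""
    tiers = []
    w, h = width, height
    while True:
        n_tiles = math.ceil(w / tile) * math.ceil(h / tile)
        tiers.append((w, h, n_tiles))
        if w <= tile and h <= tile:
            break  # the whole image now fits in a single tile
        # Assumption: each tier halves the previous one, rounding up.
        w, h = math.ceil(w / 2), math.ceil(h / 2)
    return tiers[::-1]


tiers = pyramid_tiers(1024, 768)
total_tiles = sum(n for _, _, n in tiers)
```

For a 1024 x 768 image this gives three tiers with 1, 4 and 12 tiles, 17 tiles in total.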

Page 9

The location of each file is also recorded in the MySQL database
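Recording where each version of an image lives might look something like the sketch below. It again uses sqlite3 as a stand-in for MySQL; the table name `file_locations`, the `kind` values and the example paths are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE file_locations (
           image_id TEXT,
           kind TEXT,       -- e.g. 'jpg', 'zoomify' or 'archive'
           location TEXT,
           PRIMARY KEY (image_id, kind)
       )"""
)


def record_location(conn, image_id, kind, location):
    """Record (or update) where one version of an image is stored."""
    conn.execute(
        "INSERT OR REPLACE INTO file_locations VALUES (?, ?, ?)",
        (image_id, kind, location),
    )


def locate(conn, image_id):
    """Return a {kind: location} mapping for one image."""
    rows = conn.execute(
        "SELECT kind, location FROM file_locations WHERE image_id = ?",
        (image_id,),
    )
    return dict(rows.fetchall())


# Example paths are invented for illustration.
record_location(conn, "img-001", "jpg", "/web/jpg/img-001.jpg")
record_location(conn, "img-001", "archive", "tape-0001/img-001.zip")
```

A lookup such as `locate(conn, "img-001")` then tells a later module which tape or drive holds the archived zip for that image.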

Create jpg & zoomify

Serve Online

Archive

Save tiff & raw

Image polling & metadata capture

Page 10

The image workflow system at RBGE has now processed over 130,000 images.

the modular system is flexible, but each new module may require access to the archived tiff files, and some level of reprocessing may be necessary

it has proved unfeasible to maintain the tiff and raw files on a server

during the development of the workflow, backlogs built up, which can have a large impact on image management and on the curation of the collections

Page 11

The workflow is enabling us to manage the images effectively:

the system helps with the integration of digitisation and curation in the herbarium

requests for images and data are easily managed, and users will shortly be able to download images and data directly

the modular element will allow us to incorporate a georeferencing tool

the workflow is allowing us to manage several large digitisation projects in an integrated system

Page 12

Thank you