Funded by:
© AHDS
What happens when you digitise? An introduction to some key themes
Alastair Dunning
Arts and Humanities Data Service
http://ahds.ac.uk/
Think Digital: Best Practices for Digital Resources across the Cultural Heritage Sector
Overview
• General processes
• Digital objects
• Data models
Source Digitisation Resource
e.g. William Shakespeare: The Complete Works
slides videotape
John Bright bust © Public Monuments and Sculpture Association
sculpture
text
performance
Mark Rylance as Hamlet, Block/TIrmani, June 2000 © Donald Cooper
Source Digitisation Resource
digital audio/movie recording
scan/digital camera/3-D scan/OCR
item to digitise
digital object
Source Digitisation Resource
digital resource
Title: JOHN BRIGHT
Sculptor: Bruce-Joy, Albert
Type: Sculpture
Date Completed: 1890
Description: marble bust
Public Monuments and Sculpture Association collection available through AHDS
Visual Arts image catalogue: visualarts.ahds.ac.uk
Source Digitisation Resource
digital resource
As you like it, William Shakespeare.
Available as DOS ASCII Text (Especially converted for MS-DOS based machines.)
Search Results for William Shakespeare
http://ota.ahds.ac.uk/
Source Digitisation Resource
digital resource
September 4, 1995 ITN News At Ten
Actress Tilda Swinton becomes a living, sleeping, performance art exhibit for artist Cornelia Parker in an exhibition at London's Serpentine Gallery.
Newsfilm Online http://www.bufvc.ac.uk
Source Digitisation Resource
Elements of a digital resource
Users
Knowledge
Experience
Culture
Environment
Hardware
Software (OS)
(Network)
Digital Objects
Binary Data
Data Models
Relationships
• Users are the most important element
• Advice on hardware and software, objects and data models can be sought
• Fit for Purpose: the intended use of a digital object must be the paramount consideration when it is created
Digital objects
• Text
• Image
• Time-based media
Digital objects - Text
• Text is data stored as a stream of characters (numbers, letters, etc.)
– Text Transcription
– OCR
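As a minimal sketch of what "a stream of characters" means in practice, the snippet below shows a short title as a list of character codes and as raw bytes. The sample string and the choice of ASCII encoding are assumptions for illustration only.

```python
# Text is stored as a sequence of character codes.
title = "As you like it"

# Unicode code point for each character in the string.
code_points = [ord(ch) for ch in title]

# The same text as raw bytes, using the ASCII encoding.
ascii_bytes = title.encode("ascii")

print(code_points[:2])   # code points for "A" and "s"
print(len(ascii_bytes))  # ASCII uses one byte per character
```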
Digital objects - Text
Text Transcription
Pros:
• low overhead to start transcription
– person, keyboard, document
• hand-written documents can be transcribed
• transcriber can follow complex disorganised documents
Cons:
• slow and expensive
• human error
© David Leach/Crafts Study Centre 2004
Digital objects - Text
Optical Character Recognition
Pros:
• automatic, suitable for digitising large numbers of documents
• highly accurate for clean, clear typewritten documents
• systematic errors can be easy to find and fix
Cons:
• current technology is very poor on hand-writing
• complex document layout can become scrambled
Digital objects - Image
• Images are data understood as a spatial pattern or shape
– bitmapped/raster images
– vector spatial data
Digital objects - Bitmapped images
• Bitmapped images are made up of many pixels
• Each pixel stores information about its colour (RGB)
• Pixels per inch (ppi) dictates the amount of visible information
• The standard archival file format is uncompressed TIFF v6
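The link between ppi and captured detail can be sketched with a small calculation. The function name and the 8 x 10 inch original are hypothetical, chosen only to show how doubling the ppi doubles each pixel dimension.

```python
def pixel_dimensions(width_in, height_in, ppi):
    """Return (width, height) in pixels for an original of the given size in inches."""
    return round(width_in * ppi), round(height_in * ppi)

# Hypothetical 8 x 10 inch original captured at two resolutions.
print(pixel_dimensions(8, 10, 300))
print(pixel_dimensions(8, 10, 600))
```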
Fit for Purpose Capture
• National Gallery
– Hi-res images
– 20,000 × 20,000 pixels
• Old Bailey Online
– Lower res images
– 1,024 × 1,024 pixels
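To give a rough sense of why capture resolution has to fit the purpose, the sketch below estimates the uncompressed size of the two image sizes above. It assumes 24-bit RGB (8 bits per channel) and ignores file headers; the bit depths actually used by these projects are not stated in the slides.

```python
def uncompressed_bytes(width_px, height_px, bits_per_pixel=24):
    """Approximate size of raw pixel data, ignoring file headers and metadata."""
    return width_px * height_px * bits_per_pixel // 8

hi_res = uncompressed_bytes(20_000, 20_000)   # hi-res example
low_res = uncompressed_bytes(1_024, 1_024)    # lower res example

print(f"{hi_res:,} bytes vs {low_res:,} bytes")
```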
Digital objects - Bitmapped images
Good practice
• Check the optical resolution of the scanner
• Avoid interpolated resolution
• Capture master images at appropriate resolution and bit depth
• Check scanning time
• Record details of scanner/camera settings and any image editing
• http://www.tasi.ac.uk
Scanning versus direct digital capture
• Always depends on the resource
• Is there an analogue version of the resource?
• Time – money – available skills
• Image quality - colour fidelity
Digital objects - Vector images
x,y x,y,z
• A point represents an exact location in two or three dimensional space
• Two points define a line
• A series of connected lines define an area
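The three building blocks above can be sketched in a few lines. The coordinates are hypothetical, and the area calculation uses the standard shoelace formula, which is one common choice rather than anything prescribed by the slides.

```python
from math import hypot

def line_length(p, q):
    """Length of the line defined by two (x, y) points."""
    return hypot(q[0] - p[0], q[1] - p[1])

def polygon_area(points):
    """Area enclosed by a closed series of connected lines (shoelace formula)."""
    total = 0
    for i, (x1, y1) in enumerate(points):
        x2, y2 = points[(i + 1) % len(points)]  # wrap around to close the shape
        total += x1 * y2 - x2 * y1
    return abs(total) / 2

# Hypothetical 4 x 3 rectangle described by its corner points.
corners = [(0, 0), (4, 0), (4, 3), (0, 3)]
print(line_length(corners[0], corners[2]))  # diagonal of the rectangle
print(polygon_area(corners))
```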
Digital objects – Time-based media
• Time-based media
– data understood as a sequence through time
– audio and video (multimedia)
Digital objects – Time-based media
• Audio
– Moving from analogue to digital is called sampling
– Sampling frequency is measured in Hertz (Hz)
– Uncompressed digitisation at 36kHz
– .WAV and .AIF
– MP3 offers good quality compressed (lossy) files
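A minimal sketch of sampling: a continuous tone is measured at regular intervals, and the sampling rate, bit depth and channel count together determine the uncompressed size. The 440 Hz tone and the 44,100 Hz, 16-bit, stereo settings are assumptions for illustration (they match CD audio), not figures from the slides.

```python
import math

def sample_sine(freq_hz, sample_rate_hz, n_samples):
    """Measure a sine wave at regular intervals, as an analogue-to-digital converter would."""
    return [math.sin(2 * math.pi * freq_hz * n / sample_rate_hz)
            for n in range(n_samples)]

def pcm_bytes(sample_rate_hz, bit_depth, channels, seconds):
    """Size of uncompressed audio data, ignoring any file header."""
    return sample_rate_hz * bit_depth // 8 * channels * seconds

samples = sample_sine(440, 44_100, 100)      # first 100 samples of a 440 Hz tone
one_minute = pcm_bytes(44_100, 16, 2, 60)    # one minute of 16-bit stereo audio

print(len(samples), f"{one_minute:,} bytes")
```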
Digital objects – Time-based media
• Video
– Issue of file sizes and bandwidth
– MPEG: the Moving Picture Experts Group standards are the most popular compression standards
– The three standards: MPEG-1, MPEG-2, MPEG-4
– Compression basically works by selecting key frames and only recording changes between the frames (but it gets a lot more complicated!)
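The key-frame idea can be illustrated with a toy example. This is not how MPEG is actually implemented; it is a simplified sketch in which each "frame" is a short list of pixel values, the first frame is stored whole, and later frames store only the pixels that changed. The function names are hypothetical.

```python
def encode(frames):
    """Keep the first frame whole (key frame); for later frames record only changed pixels."""
    key = list(frames[0])
    deltas = []
    previous = key
    for frame in frames[1:]:
        changed = {i: new for i, (old, new) in enumerate(zip(previous, frame))
                   if old != new}
        deltas.append(changed)
        previous = frame
    return key, deltas

def decode(key, deltas):
    """Rebuild every frame by applying each set of changes to the previous frame."""
    frames = [list(key)]
    for changed in deltas:
        frame = list(frames[-1])
        for i, value in changed.items():
            frame[i] = value
        frames.append(frame)
    return frames

frames = [[0, 0, 0, 0], [0, 9, 0, 0], [0, 9, 9, 0]]
key, deltas = encode(frames)
print(key, deltas)
```

Only one pixel changes per frame here, so each later frame is stored as a single index and value rather than four values.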
Data models
• List, one item follows another
• Tree, each item can have several children
• Sets, items belong to one or more groups
• Geography/geometry, items are located using a co-ordinate system
Lists
e.g. spreadsheet
Trees and hierarchies
e.g. Text mark-up
<xml>
  <book title="The Great Gatsby" author="F. Scott Fitzgerald">
    <chapter />
    <page />
    <sentence />
    <word />
  </book>
</xml>
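A short sketch of how marked-up text behaves as a tree. It assumes the elements above are meant to nest (chapter inside book, page inside chapter, and so on); that nesting, and the helper names, are an interpretation for illustration. Python's standard xml.etree.ElementTree module is used to parse and walk the hierarchy.

```python
import xml.etree.ElementTree as ET

# Assumed nesting of the elements from the mark-up example.
doc = """<book title="The Great Gatsby" author="F. Scott Fitzgerald">
  <chapter>
    <page>
      <sentence>
        <word/>
        <word/>
      </sentence>
    </page>
  </chapter>
</book>"""

root = ET.fromstring(doc)

def depth(element):
    """Number of levels in the tree below and including this element."""
    return 1 + max((depth(child) for child in element), default=0)

def walk(element, level=0):
    """Print each element indented by its level in the hierarchy."""
    print("  " * level + element.tag)
    for child in element:
        walk(child, level + 1)

walk(root)
```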
Sets
e.g. Relational database
Table: art_work
Id no. | Artist | Title | Description | Subject | Material
Table: image
Id no. | File name | Creator | Date | Size (pixels) | Resolution | Orientation
One-to-many relationship: avoids redundancy, i.e. art work information is stored only once
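The one-to-many relationship can be sketched with an in-memory SQLite database. The tables are cut down to a few columns, the art_work_id linking column and the image file names are assumptions not shown on the slide, and the art work details reuse the John Bright record from earlier.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE art_work (
    id       INTEGER PRIMARY KEY,
    artist   TEXT,
    title    TEXT,
    material TEXT
);
CREATE TABLE image (
    id          INTEGER PRIMARY KEY,
    art_work_id INTEGER REFERENCES art_work(id),  -- assumed linking column
    file_name   TEXT
);
""")

# The art work is described once...
conn.execute(
    "INSERT INTO art_work VALUES (1, 'Bruce-Joy, Albert', 'JOHN BRIGHT', 'marble')"
)
# ...and any number of images can point back to it (hypothetical file names).
conn.executemany(
    "INSERT INTO image (art_work_id, file_name) VALUES (?, ?)",
    [(1, "bright_front.tif"), (1, "bright_side.tif")],
)

rows = conn.execute("""
    SELECT art_work.title, image.file_name
    FROM art_work
    JOIN image ON image.art_work_id = art_work.id
    ORDER BY image.file_name
""").fetchall()
print(rows)
```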
Geography/geometry
• GIS (spatial) database
• The distribution of things upon the surface of the earth
• Maps and plans - Aerial photographs - Co-ordinate lists
Figure created by Peter Halls using data from the Cottam Project directed by Julian Richards. Image copyright © Archaeology Data Service.
Selecting your Data Model
• Consider how the original source material is organised
• What your users are familiar with
• Fit for purpose
• Seek specialist advice
Selecting software 1
• Avoid little-used software with proprietary features
• Select software that can perform the right tasks
– e.g. Don’t use a word processor as a spreadsheet
– e.g. Don’t use a webpage editor as a database
• Watch out for licensing costs
Selecting software 2
• Do look for software with export and import options.
• Do look for software that supports important standards, e.g.

How data is organised | Data model | Important standards
trees | mark-up | XML (SGML)
sets | relational databases | SQL
coordinates | CAD or GIS | DXF, ESRI shape files
Summary
• A discussion of general processes including:
– Source
– Digitisation
– Resource
• A brief overview of digital objects including:
– Text
– Image
– Time-based media
• A look at data models and selecting software.