Getting to Disk-based Lossless Digital Video Preservation –
An Introduction
Paul Theerman, Walter Cybulski, Glenn Pearson
National Library of Medicine, NIH/HHS
Historical Audiovisuals at the National Library of Medicine
Paul Theerman, Ph.D.
Head, Images and Archives
History of Medicine Division, NLM
Historical Audiovisuals at NLM
• Origins as the National Medical Audiovisual Collection
• A clearinghouse for these materials
• Variously held here and at CDC, Atlanta
• Only relatively recently transferred to the History of Medicine Division
Historical Audiovisuals at NLM
• Current definition of the collection
• All audiovisuals before 1970
• Films and videos of historical interest dating after 1970—that is, of interest for historical value, not informational value
Historical Audiovisuals at NLM
• The collection ranges from the first decade of the 20th century through the 1990s
• Content:
– Early films on “how to go to the doctor” and other public service and public information films
– Films on the U.S. Public Health Service
Historical Audiovisuals at NLM
• Content
– Dental films due to an ADA donation
– Training films for surgical procedures
– Military: battlefield surgical films
– Large recent donations from NIMH and FDA
– “Home movies”
– Research footage
– Films promoting usage of films in medicine
Historical Audiovisuals at NLM
• Size: the largest such collection in the U.S.
• Total number of titles: ~9650
• Number cataloged: 4300
• Number inventoried: 3550
• Number to be inventoried: ~1800
• Number preserved: 2250+
Historical Audiovisuals at NLM
• The ability to collect is dependent on the ability to preserve and to catalog, and, in the short run, to stabilize in order to preserve and to catalog in the future
• Controlled environments
– On-site cool vault for new accessions, masters
– Off-site cool and cold vaults for new accessions, originals
Historical Audiovisuals at NLM
• The decision to preserve and to catalog is not made lightly, because of the investment of resources
• Based on condition and content assessments
Historical Audiovisuals at NLM
• Condition assessment
– Age of medium
– Obsolescence of format
– Possible or actual deterioration of medium
    • Nitrate
    • Acetate
– Generation
Historical Audiovisuals at NLM
• Content assessment
– Ownership and restrictions
– Uniqueness
– Age, especially pre-1950
– Then a sliding scale, based on collection development guidelines
Historical Audiovisuals at NLM
• When both condition and content indicate, then:
– Preservation copying, to three copies (in some cases two)
– Cataloging, either to full or core records
Historical Audiovisuals at NLM
• Currently we are on the cusp of moving to digital formats, but our originals are chiefly analog, and our duplication and viewing copies are as well
– Betacam SP for duplication copies
– VHS for use copies
• This also matches patron needs for Interlibrary Loan and production
Historical Audiovisuals at NLM
• The Preservation and Collection Management Section enters the picture:
– Determining formats
– Technical specifications
– Managing vendor copying
– Managing on-site and off-site cool and cold vaults
– Managing shelving for use copies
Historical Audiovisuals at NLM
• New Ventures with Center for Information Technology (CIT) at NIH
– Videocasting service of “history in the making”
– Possible collaboration with NLM
– Interlocking systems for preservation and cataloging
– New venture for NLM in a large cache of digital materials
Historical Audiovisuals at NLM
• New Library Research at NLM
• NLM’s Lister Hill Center is looking at means of digital preservation
• The origin of this conference: we are excited to see what it will bring
Analog Motion Picture and Tape Preservation at NLM –
Duplication & Offsite Storage
Walter Cybulski, Preservation Librarian
Preservation & Collection Management Section,
Public Services Division, NLM
Examples of Film and Tape Media in the NLM Collections
8mm
16mm
35mm
2” Quadruplex
1” Type C
¾” U-Matic
½” Beta
Deterioration
Nitrate added spice to the idea of deterioration – unfortunately, nothing but hot pepper
(There are no nitrate film materials at NLM)
“250 TEASPOONFULS OF VINEGAR FOR A 1,000 FOOT CAN OF 35mm FILM”
Main Objectives of Preservation
• Identify content that merits preserving
• Mitigate known risks
• Extend useful life of content
Mitigate risk.
Storage condition                      Temperature (F)   Relative Humidity   Years for Acidity to Double
Room Temperature                       70º F             50% RH              5
NLM Cool Vault (Magnetic Tape)         55º F             30% RH              50
NLM Cold Vault (Acetate Movie Film)    35º F             25% RH              200
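The table's figures can be turned into relative life-extension factors with a little arithmetic. The sketch below only restates the three values from the slide; the dictionary keys and the function name are illustrative, not part of the original talk.

```python
# Years for acidity to double, copied from the storage table on the slide.
YEARS_TO_DOUBLE = {
    "room": 5,         # 70 F, 50% RH
    "cool_vault": 50,  # 55 F, 30% RH (magnetic tape)
    "cold_vault": 200, # 35 F, 25% RH (acetate movie film)
}

def extension_factors(table, baseline="room"):
    """Return how many times longer each condition takes than the baseline."""
    base = table[baseline]
    return {name: years / base for name, years in table.items()}

factors = extension_factors(YEARS_TO_DOUBLE)
for name, factor in factors.items():
    print(f"{name}: {factor:.0f}x the room-temperature figure")
```

On these numbers the cool vault slows the doubling of acidity tenfold and the cold vault fortyfold relative to room conditions.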
SECURE, CLIMATE-CONTROLLED STORAGE AT IRON MOUNTAIN
Extend useful life: copy onto new media
For libraries and archives, obtaining new copies may not be possible, and copying content on deteriorated media to the same media (e.g. 35mm to 35mm film transfer) can be prohibitively expensive
At this point, the most widely used AV preservation media are Betacam SP and Digital Betacam
But the clock is ticking even as we copy content onto these formats…
Rapidly changing technology takes its toll
with each technological advance, the storage picture changes …
WE ARE TRANSITIONING FROM FILMS AND TAPES TO DATA, BUT THE QUESTION REMAINS:
HOW TO EXTEND THE USEFUL LIFE OF THE CONTENT
Getting to Disk-based Lossless Digital Video Preservation –
Which Way Forward?
Communications Engineering Branch, Lister Hill National Center for Biomedical Communications
NLM
Glenn Pearson, Ph.D., Senior Software Developer
Generational Loss Once Digital
• Migration as preservation strategy
– To cope with obsolescence of digital formats, gear
• If using lossy image compression algorithms
– No degradation when making exact copy: Master → Master
– Degradation when migrating (or editing): Master → uncompress → recompress → Master
– Examples: M-JPEGs, DVs, MPEG-1, -2, most -4
• Mathematically-lossless algorithms
– Avoid this problem
– Don’t compress as well (2x – 4x) as “virtually lossless” (5x – 9x) or obviously lossy (web streaming)
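The difference between the two migration paths can be illustrated with a toy experiment. This is a minimal sketch, not any real video codec: the "lossy codec" is simple quantization with an assumed step size, the "lossless codec" is zlib (DEFLATE), and the sample data is synthetic.

```python
import zlib

def lossy_roundtrip(samples, step):
    """Toy lossy codec (assumption): snap each 8-bit sample to a multiple of `step`."""
    return [min(255, round(s / step) * step) for s in samples]

def lossless_roundtrip(samples):
    """Compress and decompress with zlib, which reproduces its input exactly."""
    return list(zlib.decompress(zlib.compress(bytes(samples))))

def max_error(a, b):
    """Largest absolute difference between two equal-length sample lists."""
    return max(abs(x - y) for x, y in zip(a, b))

# Synthetic "master": every 8-bit value appears several times.
master = [(i * 37) % 256 for i in range(1024)]

# Lossy path: each migration re-encodes with a different (hypothetical) codec.
gen1 = lossy_roundtrip(master, 6)
gen2 = lossy_roundtrip(gen1, 10)

# Lossless path: five successive migrations.
lossless = master
for _ in range(5):
    lossless = lossless_roundtrip(lossless)

print("lossy, 1 generation :", max_error(master, gen1))
print("lossy, 2 generations:", max_error(master, gen2))
print("lossless, 5 generations:", max_error(master, lossless))
```

In this sketch the error grows when the lossy material is re-encoded with a different quantizer, while the losslessly migrated copy stays bit-identical to the master however many times it is moved.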
Lossless Video Storage
• Uncompressed video
– Can be stored with general binary file compressors (RLL, LZW [zip]), typically 1.6:1 - 2:1 compression
• Lossless video codecs
– Standardized, open (but patents may apply)
    • HuffYUV: original, uses Huffman “entropy” encoding
    • Apple QuickTime “None” codec [documented, not standard]
    • JPEG 2000 Lossless (within, say, Motion JPEG 2000)
    • MPEG4/AVC Lossless
– Proprietary
    • Matrox DigiSuite: Lossless = entropy-only portion of M-JPEG
    • New: MatrixView’s “Adaptive Binary Optimization”, from patented “Repetition Coded Compression” (boolean grids + Huffman)
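The entropy-coding idea behind a codec such as HuffYUV can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the actual HuffYUV bitstream: it predicts each sample from its left neighbor, Huffman-codes the residuals, and counts the coded bits (ignoring the cost of storing the code table). The scan line is synthetic and all function names are illustrative.

```python
import heapq
from collections import Counter

def left_residuals(row):
    """Replace each 8-bit sample by its difference from the left neighbor (mod 256)."""
    prev, out = 0, []
    for s in row:
        out.append((s - prev) % 256)
        prev = s
    return out

def undo_left_residuals(residuals):
    """Invert left_residuals exactly, recovering the original samples."""
    prev, out = 0, []
    for r in residuals:
        prev = (prev + r) % 256
        out.append(prev)
    return out

def huffman_code_lengths(symbols):
    """Return {symbol: code length in bits} for a Huffman code built from `symbols`."""
    counts = Counter(symbols)
    if len(counts) == 1:
        return {next(iter(counts)): 1}
    # Heap entries are (weight, unique tie-breaker, {symbol: depth so far}).
    heap = [(w, i, {sym: 0}) for i, (sym, w) in enumerate(counts.items())]
    heapq.heapify(heap)
    tie = len(heap)
    while len(heap) > 1:
        w1, _, a = heapq.heappop(heap)
        w2, _, b = heapq.heappop(heap)
        merged = {s: d + 1 for s, d in {**a, **b}.items()}
        heapq.heappush(heap, (w1 + w2, tie, merged))
        tie += 1
    return heap[0][2]

def huffman_bits(symbols):
    """Total bits needed to code `symbols`, excluding the code table itself."""
    lengths = huffman_code_lengths(symbols)
    return sum(lengths[s] for s in symbols)

# Synthetic scan line: a slow ramp with a small repeating texture.
row = [(i // 4 + (i % 3)) % 256 for i in range(4096)]
residuals = left_residuals(row)
raw_bits = 8 * len(row)
coded_bits = huffman_bits(residuals)

print(f"raw: {raw_bits} bits, coded: {coded_bits} bits")
print("exact reconstruction:", undo_left_residuals(residuals) == row)
```

Because prediction concentrates the residuals on a few values, the Huffman code spends far fewer than 8 bits on most samples, and the decoder still recovers every sample exactly. Real footage is noisier than this synthetic line, so its savings would be smaller.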
Economics of Digital Storage
Sources: E. Grochowski & R. Halem, IBM Sys J, 42(2), 2003 (Disk, Flash)
R. Harada, Comp Tech Rev, June 2004 (Tape)
[Chart: cost in $ per gigabyte (log scale, 0.01 to 10,000) by year, 1998 to 2020, for DRAM/Flash, HDD storage system, 2.5" hard disk drive, 3.5" hard disk drive, and tape media]
Data is for computer tape, but digital video tape uses the same technology, which drives media price
The Twilight of Tape
[Chart: cost in $ per gigabyte (log scale, 0.01 to 10,000) by year, 1998 to 2020, for DRAM/Flash, HDD storage system, 2.5" hard disk drive, 3.5" hard disk drive, and tape media]
Hierarchical storage yesterday: Hard Disks, then Tapes
Hierarchical storage tomorrow: Flash Disks, then Hard Disks*
*Powered on-demand
Economics of Subsampling and Lossless Compression
• Gold Standard for digital video: 4:4:4 uncompressed
• Not so affordable today for archives
[Diagram: chroma and luma sample positions for 4:4:4 and 4:2:2]

Relative cost     4:4:4    4:2:2
Uncompressed      1        2/3
Lossless          ~1/3     ~1/4

• 4:2:2 lossless
– will be affordable 2 years before 4:4:4 uncompressed
– stay ¼ the cost
• When is 4:2:2 good enough for preservation?
In YUV colorspace:
Y is luma (B&W intensity)
U, V are the blue and red color differences, respectively
4:4:4 = full sample/pixel
4:2:2 = sample for Y at full pixel resolution, for U, V at half horizontal resolution
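The 1 versus 2/3 relationship between uncompressed 4:4:4 and 4:2:2 follows from counting samples. The sketch below uses the usual J:a:b reading of the notation (a block J pixels wide and 2 rows high); the 3:1 lossless ratio is an assumption picked from inside the 2x to 4x range quoted earlier in the talk, not a measured value.

```python
from fractions import Fraction

def samples_per_pixel(j, a, b):
    """Average samples per pixel for J:a:b subsampling over a J-wide, 2-row block."""
    luma = 2 * j          # Y is sampled at every pixel in both rows
    chroma = 2 * (a + b)  # U and V each: `a` samples in row 1, `b` in row 2
    return Fraction(luma + chroma, 2 * j)

full = samples_per_pixel(4, 4, 4)  # 4:4:4
half = samples_per_pixel(4, 2, 2)  # 4:2:2
relative_422 = half / full         # size of 4:2:2 relative to 4:4:4

ASSUMED_LOSSLESS_RATIO = 3         # assumption: mid-range lossless compression
lossless_444 = Fraction(1, ASSUMED_LOSSLESS_RATIO)
lossless_422 = relative_422 / ASSUMED_LOSSLESS_RATIO

print("4:4:4 samples/pixel:", full)
print("4:2:2 samples/pixel:", half)
print("4:2:2 relative size, uncompressed:", relative_422)
print("4:4:4 lossless (assumed 3:1):", lossless_444)
print("4:2:2 lossless (assumed 3:1):", lossless_422)
```

With the assumed 3:1 ratio the lossless entries come out at 1/3 and 2/9, close to the approximate ~1/3 and ~1/4 on the slide; a different ratio within the 2x to 4x range would shift both.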
Film Master → Digital Master
• Traditional good advice: Film → Film
• Can Film → Digital be
– as good as/better than Film → Film
– as affordable?
• Quality of source
– 8mm, 16mm, 35mm, 65/70mm
– B/W vs color
– camera original, intermediate print, distribution print
• Versus quality of target
• HD video has 1920x1080 pixels (“e-Cinema”)
– Variety matching film best: progressive-scan 24 fps (1080p24)
– But video has only 8-10 bits linear/component, less than film’s range
– Good enough for archiving some 16mm B&W distribution prints?
– HD 16:9 aspect matches some sources, not others
Film Master → Digital Master, Hollywood Style
• Better than HD but $$
• 12 bit linear/component (36 bits/pixel)
• Or 10-bit log/component
• No subsampling
• 2K @ 24 fps = most practical res. & rate
– 2K = 2048 x 1080
– That’s outer bounds for various aspect ratios
3 Steps, 3 Types of File Formats
• Sources (Production)
• Digital Intermediate
• Package for Theatrical Release
Sources
• Computer Graphics
• New cinema digital cameras
– Viper, Dalsa Origin 4K, Arri D-20, Kinetta
• Film Scanners
– Kodak Genesis, Northlight, Arriscan, Imagina
• “Datacines” (data telecines)
– Thomson Spirit, Cintel DSX, Millennium
• Raw, Unwrapped Frame-per-file Formats
– Flexible resolution, aspect ratio
– But sound, most metadata in separate files
    • Awkward: per-shot info
– Examples
    • Kodak Cineon scanner .CIN (10-bit log RGB)
    • SMPTE std DPX (derived from Cineon)
    • Others: TIFF, SGI, EXR, JP2
    • “Digital Negative” from 1-CCD camera with Bayer-pattern color filters atop pixels
Magazine has 12 40 GB iPod drives
Digital Intermediate Process
• Creates Digital Masters
– May include “Digital Source Master” from which multiple masters come: DVD master, TV master, DCDM
• Typical Steps
– Color grading, compositing, editing, finishing
– Projects moved along in vendor formats or AAF
– End products archived in vendor formats or MXF
• Such unencrypted masters closely held by studios, but archivists could make their own
Theatrical Distribution
• DCI Distribution Master (DCDM)
– MXF wrapper + JPEG2000 frames
– But lossy due to real-time bandwidth constraints (250 Mb/s peak)
– Something similar for archivists?
    • a lossless variety of this
    • or MJ2 instead of MXF
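A rough calculation suggests why the 250 Mb/s ceiling forces lossy coding. The sketch assumes the 2K, 12-bit, unsubsampled, 24 fps master described earlier in the talk and compares the compression it would need with the 2x to 4x lossless range quoted there; the variable names are illustrative.

```python
# Figures taken from the slides: 2K frame, 36 bits/pixel, 24 fps.
WIDTH, HEIGHT = 2048, 1080
BITS_PER_PIXEL = 36            # 12 bits x 3 components, no subsampling
FPS = 24
PEAK_MBPS = 250                # peak distribution rate quoted on the slide
LOSSLESS_RATIO_RANGE = (2, 4)  # lossless range quoted earlier in the talk

# Uncompressed data rate in megabits per second.
uncompressed_mbps = WIDTH * HEIGHT * BITS_PER_PIXEL * FPS / 1e6

# Compression ratio needed to fit under the peak rate.
required_ratio = uncompressed_mbps / PEAK_MBPS
fits_lossless = required_ratio <= LOSSLESS_RATIO_RANGE[1]

print(f"uncompressed: {uncompressed_mbps:.0f} Mb/s")
print(f"required compression: {required_ratio:.2f}:1")
print("achievable losslessly (within 2x-4x)?", fits_lossless)
```

On these assumptions the required ratio falls well outside the lossless range, which is consistent with the slide's point that a lossless variant would need a higher rate than real-time theatrical playback allows.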
Roadblocks in Getting to a Disk-Based Lossless Archive Master
• Rapid digital-technology change
• High current costs
– Top quality needs massive storage, high-speed pipelines
– An uncompressed color movie (2K @ 24 fps, 12-bit)
    • Would consume ~2 Gigabits per second bandwidth if realtime
    • Needs 0.8 TB storage per hour of length
– Plus $$$ for color grading/restoration services & software
• Analog tape → SD digital is more affordable now
• A proliferation of standards
– File Formats
    • Essence representation/codecs/color spaces
    • Wrappers
– Metadata & Rights Management
• Can we help find a way forward?
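The bandwidth and storage figures quoted for an uncompressed 2K movie can be checked with a short calculation. This sketch assumes three color components per pixel and no subsampling, as in the "Hollywood style" master described earlier; the function names are illustrative.

```python
def uncompressed_bits_per_second(width, height, bits_per_component, fps, components=3):
    """Raw video data rate in bits per second, with no subsampling or compression."""
    return width * height * bits_per_component * components * fps

def bytes_per_hour(bits_per_second):
    """Storage needed for one hour of material at the given data rate."""
    return bits_per_second * 3600 // 8

# 2K @ 24 fps, 12 bits per component, as on the slide.
rate = uncompressed_bits_per_second(2048, 1080, 12, 24)
hour = bytes_per_hour(rate)

gigabits_per_second = rate / 1e9
terabytes_decimal = hour / 1e12  # 1 TB = 10**12 bytes
tebibytes_binary = hour / 2**40  # 1 TiB = 2**40 bytes

print(f"data rate: {gigabits_per_second:.2f} Gb/s")
print(f"storage per hour: {terabytes_decimal:.2f} TB ({tebibytes_binary:.2f} TiB)")
```

The result is a little over 1.9 Gb/s and roughly 0.8 terabytes per hour, whether counted in decimal or binary units, in line with the slide's rounded figures. Passing other dimensions and bit depths to the same functions gives a quick comparison with standard-definition or HD material.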