Long-Term Archive and Digital Preservation at TACC
Donna Harland
Oracle Optimized Solutions: Solutions Architect
June 20, 2011
CHALLENGES OF TODAY’S ARCHIVE
Challenges of Today’s Archive
Challenge (•) and Results (–)
• Bit Rot
– Data loss
– Data corruption
• Obsolescence
– Can no longer access the data or read the data
• Natural Disaster
– Data loss
• Economic Failure
– Data access loss; data loss
• Organizational Failure
– Data access loss; data loss; inappropriate use
• Information Attack
– Data corruption or loss
• Human Error
– Data loss or data access loss
Challenges of Today’s Archive
Challenge (•) and Results (–)
• Lack of context
– Data is available, but there is no access, pointers, or metadata
• Ambiguous IP State (copyright, licensing)
– Loss of data access
• Distribution and Dissipation
– Loss of data access
• Migrations and Transitions: people (2-20 yrs), software (5-10 yrs), hardware (3-5 yrs)
– Data loss and loss of data access
CHARACTERISTICS OF ARCHIVE SOLUTIONS
Availability
• Searchable
• Retrievable
– Dynamic access
– What went in comes out
• Deliverable to new environments, in new contexts
• Over time… a VERY long time
Integrity
• Fixity of the original object
– No data loss
– No data corruption
– No data “augmentation”
• Wholeness
– Contains all of its essential bits
– Transformed content is documented
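Fixity is commonly checked by recording a cryptographic digest at ingest and recomputing it at audit time. The following is a minimal sketch of that idea, not a method described in the slides; the function names and sample bytes are illustrative assumptions.

```python
import hashlib


def sha256_digest(data: bytes) -> str:
    """Return the hex SHA-256 digest of an object's bytes."""
    return hashlib.sha256(data).hexdigest()


def check_fixity(data: bytes, recorded_digest: str) -> bool:
    """True if the object still matches the digest recorded at ingest."""
    return sha256_digest(data) == recorded_digest


# At ingest: record the digest alongside the object (sample bytes are made up).
original = b"archived object contents"
recorded = sha256_digest(original)

# At audit time: loss, corruption, or "augmentation" all change the digest.
intact = check_fixity(original, recorded)
truncated = check_fixity(original[:-1], recorded)
augmented = check_fixity(original + b" extra", recorded)
```

A digest detects a change but cannot repair it, so a check like this is usually paired with redundant copies.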
Authenticity
• Assure that an object is what it purports to be…
• Include a description of the object in its original state as well as transformations
• Include provenance – where an object came from and the chain of custody and processes from its point of origin
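One way to make a chain of custody checkable is an append-only log in which every event carries the hash of the event before it. This is only a sketch under assumptions: the `ProvenanceLog` class, its field names, and the sample actors and digests are hypothetical and do not come from the presentation.

```python
import hashlib
import json
from dataclasses import dataclass, field


def _hash_event(body: dict) -> str:
    """Hash an event's fields in a stable (sorted-key) JSON form."""
    return hashlib.sha256(json.dumps(body, sort_keys=True).encode("utf-8")).hexdigest()


@dataclass
class ProvenanceLog:
    """Append-only chain of custody for one archived object (hypothetical)."""
    object_id: str
    events: list = field(default_factory=list)

    def record(self, actor: str, action: str, object_digest: str) -> dict:
        """Append an event that is linked to the previous event by its hash."""
        prev_hash = self.events[-1]["event_hash"] if self.events else ""
        body = {
            "object_id": self.object_id,
            "actor": actor,
            "action": action,
            "object_digest": object_digest,
            "prev_hash": prev_hash,
        }
        body["event_hash"] = _hash_event(body)
        self.events.append(body)
        return body

    def verify(self) -> bool:
        """Recompute every hash and link; False if any event was altered."""
        prev_hash = ""
        for event in self.events:
            body = {k: v for k, v in event.items() if k != "event_hash"}
            if event["prev_hash"] != prev_hash or _hash_event(body) != event["event_hash"]:
                return False
            prev_hash = event["event_hash"]
        return True


# Illustrative history: original state first, then a documented transformation.
log = ProvenanceLog("newspaper-page-001")
log.record("scanning-lab", "created image master", "digest-a")
log.record("archive-ingest", "ingested into archive", "digest-a")
log.record("archive-ingest", "derived full text via OCR", "digest-b")
chain_ok = log.verify()

# Rewriting history afterwards breaks the chain.
log.events[1]["actor"] = "someone-else"
tampered_ok = log.verify()
```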
Reusability
• Collaboration
• May require the object in its original form or format
• May require a derived form, suitable for a specific purpose
– Case study: what’s more useful, an image of a newspaper page, or the full text of a newspaper page?
• Requires clear understanding of business purpose
Security
• Secure against leakage
• Secure against tampering
• A primary design consideration
• A vital element in trust
Sustainability
• Technically feasible & maintainable
• Economically viable and maintainable
• Organizational alignment and commitment
• Able to adapt
– Technically: changes in technology and scale; have a migration plan that is non-disruptive
– Economically: changes in costs, funding (recessions…)
– Organizationally: layoffs, staff changes, mergers, strategy shifts
Trustworthiness
• Perception of competence, security, long-term commitment
• Prerequisite for confidence by
– Depositors
– Funders
– Content Consumers
ARCHITECTING AN ARCHIVE SOLUTION
Data Archive Layers
[Figure: layered diagram. Labels: Data Preservation and Content Management Applications (manage content); Storage Archive Manager; Flash, Disk, Tape]
Preservation Mindset & Strategies
• Resist the temptation to think of preserved objects as “static”
– Migrations, versions, audits, and disseminations all require constant attention
– New access to old data, old access to new data
– The content will not change, but its home will
– Awareness of retention requirements
• Remember that preservation is a journey, not a destination
Technological Considerations
• Minimize dependencies
– Encapsulate your metadata with your objects
– Storage preservation should not depend on specific storage
– Applications should not depend on specific storage
• Minimize effect of errors
– Embrace redundancy
– Embrace diversity
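Encapsulating metadata with an object can be as simple as packaging both in one self-describing container that any storage system can hold. The sketch below uses a ZIP container with a `metadata.json` member. That layout, the function names, and the sample values are assumptions made for illustration, not a format named in the presentation.

```python
import io
import json
import zipfile


def package_object(name: str, content: bytes, metadata: dict) -> bytes:
    """Bundle content and its metadata into one ZIP container (assumed layout)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_STORED) as archive:
        archive.writestr(f"data/{name}", content)
        archive.writestr("metadata.json", json.dumps(metadata, sort_keys=True, indent=2))
    return buffer.getvalue()


def unpack_object(package: bytes):
    """Return (metadata, files) using only what is inside the container."""
    with zipfile.ZipFile(io.BytesIO(package)) as archive:
        metadata = json.loads(archive.read("metadata.json"))
        files = {
            member: archive.read(member)
            for member in archive.namelist()
            if member.startswith("data/")
        }
    return metadata, files


# Illustrative object: the bytes and metadata values are made up.
package = package_object(
    "page-001.tif",
    b"image bytes",
    {"title": "Front page", "format": "image/tiff", "source": "scanning lab"},
)
metadata, files = unpack_object(package)
```

Because the description travels inside the package, neither the application nor the preservation process needs a particular storage product to make sense of the object.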
Design
• Don’t overspec; don’t overbuild
– Design a scalable architecture
– Build in the ability to grow non-disruptively with customer demand
• Monolithic systems don’t meet requirements
– Complex, expensive, inflexible
– Migration costs can capsize you
• Components should not depend on each other but should be proven to work together
• Keep it simple; have an exit plan for every component
Know Your Designated Community
- Who will be using the content?
- Are there data connectivity requirements?
- How will they be using the data?
- Latency
- Delivery formats
- Security
- Offer (appropriate) access from the start
- Remain flexible as the community changes and grows
Basic Architecture of an Unstructured Data Archive Solution
• Application
– Captures data
– Creates content metadata; optionally stored in DB
– Stores content in a file store
– Provides search engine
– Provides data preservation features
• Database Server
– Content metadata
– Security
– Improved search performance
• File Store
[Figure: Application, Database Server (Metadata), File Store]
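The three roles above can be sketched in a few lines. This is a toy under stated assumptions: SQLite stands in for the database server, a temporary directory stands in for the file store, and the `ArchiveApplication` class, table name, and sample objects are all hypothetical.

```python
import sqlite3
import tempfile
from pathlib import Path


class ArchiveApplication:
    """Toy application: content goes to a file store, metadata to a database."""

    def __init__(self, file_store: Path):
        self.file_store = file_store
        self.db = sqlite3.connect(":memory:")  # stand-in for a database server
        self.db.execute(
            "CREATE TABLE content_metadata (object_id TEXT PRIMARY KEY, title TEXT, path TEXT)"
        )

    def capture(self, object_id: str, title: str, content: bytes) -> Path:
        """Store the content as a file and record its metadata."""
        path = self.file_store / object_id
        path.write_bytes(content)
        self.db.execute(
            "INSERT INTO content_metadata VALUES (?, ?, ?)", (object_id, title, str(path))
        )
        self.db.commit()
        return path

    def search(self, term: str) -> list:
        """Search metadata only; the file store is not touched."""
        rows = self.db.execute(
            "SELECT object_id FROM content_metadata WHERE title LIKE ? ORDER BY object_id",
            (f"%{term}%",),
        )
        return [row[0] for row in rows]

    def retrieve(self, object_id: str) -> bytes:
        """Look up the path in the database, then read from the file store."""
        row = self.db.execute(
            "SELECT path FROM content_metadata WHERE object_id = ?", (object_id,)
        ).fetchone()
        if row is None:
            raise KeyError(object_id)
        return Path(row[0]).read_bytes()


with tempfile.TemporaryDirectory() as tmp:
    app = ArchiveApplication(Path(tmp))
    app.capture("obj-1", "Annual report 2010", b"report bytes")
    app.capture("obj-2", "Newspaper front page", b"page bytes")
    hits = app.search("report")
    content = app.retrieve("obj-1")
    app.db.close()
```

The application reaches the file store only through ordinary file paths, which is what allows the store behind those paths to be replaced later.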
SAM QFS As The File Store
• SAM-QFS
– Dynamically maintains data on defined tiers of storage
– Dynamically stages data for access when requested by the application
– Standard file access via FC, NFS, CIFS
[Figure: Application and Database Server; File Store on SAM-QFS managed tiered storage]
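The general behaviour of a tiered file store (idle data moves to a cheaper tier, then is staged back when read) can be modelled with a toy class. This sketch is not SAM-QFS configuration or its API; the `TieredStore` class, the idle-time policy, and the tier names are assumptions chosen only to illustrate the idea.

```python
from dataclasses import dataclass


@dataclass
class ManagedFile:
    name: str
    data: bytes
    last_access: int  # logical clock tick of the most recent access
    tier: str = "disk"


class TieredStore:
    """Toy model: files idle for more than `max_idle` ticks move to tape."""

    def __init__(self, max_idle: int):
        self.max_idle = max_idle
        self.clock = 0
        self.files = {}

    def write(self, name: str, data: bytes) -> None:
        """New files land on the disk tier."""
        self.files[name] = ManagedFile(name, data, self.clock)

    def tick(self, steps: int = 1) -> None:
        """Advance time and apply the (assumed) idle-time policy."""
        self.clock += steps
        for managed in self.files.values():
            if self.clock - managed.last_access > self.max_idle:
                managed.tier = "tape"

    def read(self, name: str) -> bytes:
        """Reads look the same to the caller; tape-resident data is staged first."""
        managed = self.files[name]
        if managed.tier == "tape":
            managed.tier = "disk"  # stage back to the faster tier
        managed.last_access = self.clock
        return managed.data


store = TieredStore(max_idle=2)
store.write("scan.tif", b"image")
store.tick(3)
tier_after_idle = store.files["scan.tif"].tier
data = store.read("scan.tif")
tier_after_read = store.files["scan.tif"].tier
```

The caller of `read` never names a tier, which mirrors the point that the application sees standard file access while placement is handled underneath.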
Oracle Storage Appropriate for an Archive
SAM QFS File System and Metadata
• FC Array Storage: S6180, S6580, S6780
• High Speed FC Drives
• FC Access
• High Availability
Disk Archive
• High Capacity Disk Storage: S6180, S6580, S6780, 7120, 7320, 7420, 7720
• SATA Drives
• FC or IP access
• High Capacity
• High Availability
Tape Archive
• Tape and Libraries: SL8500, SL3000; LTO, T10K
• T10KC
• Highest capacity
• DIV
• LTO 5
Oracle Enterprise Content Management
• Content Management
– Geared toward business data and workflow
– Customizable for different data types
• Oracle Optimized Solution
• Fully tested, integrated solution (HW, SW, Storage SW)
• Expanding into industry data
– Health Sciences
– Media and Entertainment
What Solutions Integrate SAM QFS?
• Third Party Applications and SAM QFS
− Scalable On-line Archive Repository (S.O.A.R.) from Moca/Arrow and their Channel Partners (see Mark Legott presentation)
  − Sun tested and partner marketed
  − Uses Open Source Software Drupal and Fedora
  − Fully supported by Oracle partner for implementation and first call
− Ex Libris
  − New Zealand National Library implementation and validated solution
− Storage Resource Broker (SRB)
  − Customer implementation at DOD
  − Tight integration with SAM
− PACS Applications
  − In production at many sites since STK was STK
− Home-Grown Application
  − Norwegian National Library: “it just works”
  − 6PB under SAM management (1 on disk archive, 2 on tape archive)
Questions?