
Intelligent Archiving Strategies: Toward ILM
Arun Taneja, Founder and Consulting Analyst, Taneja Group
Alex Gorbansky, Senior Analyst, Taneja Group

Agenda
A Bit of Historical Perspective
Why Archive?
What to Archive?
The ILM Panacea
Developing an Operational Archival Strategy
Key Considerations
Representative Vendors and Solutions
Conclusions

Archival ≠ Backup
BACKUP
Copying production data to an
alternative medium for restorability
in the event of data loss, corruption,
or unavailability.
ARCHIVAL
Retention of historical data for future
access for business reasons such as
audits, customer issues, or litigation.

Some History On Archiving
3000 BCE: Ancient Egypt
• Library of Alexandria
• Engravings
Middle Ages: Shift from Feudalism to Nation State
• Records
• Property rights
1600s: American colonists
• Births
• Marriages
• Businesses
1789: French Revolution
• Property records
1884: American Historical Association
• Archival standards
• Marriages
• Businesses

Archival Business Drivers Today
REGULATORY COMPLIANCE REQUIREMENTS
EXPLOSIVE DATA GROWTH
APPLICATION PERFORMANCE DEGRADATION
RISING COSTS

What to Archive?
Structured Data:
• ERP/CRM DB tiers
• Business transactions
Unstructured Data:
• Documents
• X-Rays
• Check Images
• Voice recording
Semi-structured Data:
• Email
• Instant Messaging

ILM…ShmILM
“ILM” is an abstract framework for
describing the processes and technology
used to manage information throughout
its life according to its business value.
“ILM” is NOT the panacea for your
storage management challenges.

Archival is a key component of what vendors are calling “ILM”
Applications: ERP, CRM, Email, Call Recording, Image Access
Application Data: Structured, Unstructured, Semi-Structured
Policies and Rules: Business Context, Referential Integrity, Regulatory Compliance
Data Movement Technologies: Snapshots, HSM, Replication, Backup, Archival
Storage Infrastructure Tiers: Primary, Secondary, Tertiary

Developing an Archival Strategy
1. PLAN
• When/How
• Data Classification
• Requirements
2. DESIGN
3. IMPLEMENT
4. REPORT & TEST

Why Plan and When to Start
Upfront Planning will Result in Significant Benefits in Future Phases.
Develop an Archival Strategy as part of your application design and development process.
Engage Key Stakeholders:
• Application Owners
• Business Decision Makers: Compliance Officers, Legal
Identify Key Archival Business Drivers:
• Regulatory Compliance
• Other: Data Growth, Increasing Costs, Poor Performance

The Data Classification Puzzle
Assess the application data in your shop according to the following categories:
• Structured: database
• Unstructured: files, videos, images
• Semi-structured: email
Identify specific data sets impacted by regulatory compliance:
• Examples: Email, Medical Records, Call Recordings
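
A first pass at the classification exercise above can be automated. The sketch below sorts files into the three categories by extension. The extension lists and the function name classify are illustrative assumptions, not part of the original slides, and a real assessment would also look at the owning application rather than the file name alone.

```python
from pathlib import PurePath

# Hypothetical extension lists; adjust them to the applications in your shop.
STRUCTURED = {".db", ".sqlite", ".mdf", ".dbf"}
SEMI_STRUCTURED = {".eml", ".msg", ".pst", ".mbox"}


def classify(filename: str) -> str:
    """Return a coarse data class based only on the file extension."""
    ext = PurePath(filename).suffix.lower()
    if ext in STRUCTURED:
        return "structured"
    if ext in SEMI_STRUCTURED:
        return "semi-structured"
    # Everything else (documents, images, video, audio) is treated as unstructured.
    return "unstructured"
```

Treating "everything else" as unstructured keeps the sketch short; it also means unknown database formats would be misfiled until their extensions are added.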

Requirements Definition
Engage Application Owners
Compliance is not the ONLY archival driver
Separate requirements processes for applications impacted by compliance.
Compliance-specific:
• Retention period
• Media characteristics
• Data restorability rates
• Access control policies
• Data availability/DR
General archival:
• Data Access Patterns
• Restore time requirements
• Application performance
• Cost structure
• Access control policies
• Data availability/DR

Taming the Compliance Monster
1. Understand the Regulations: Significant Variance by Industry
2. Assess/Communicate Requirements to Key Business Stakeholders
3. Judge Products for Yourself – Just because a vendor says a solution is “Compliant” doesn’t make it so.
4. Stay abreast of changes in regulatory mandates.

Defining Key Archival Metrics
Archive Distribution Percentages Across:
• Online: Disk, Object-based storage
• Near-line: Optical, Tape (local)
• Off-line: Off-site vaults
Number of data copies:
• Local
• Remote
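
The distribution metric above is simple arithmetic: each tier's share of the total archived capacity. A minimal sketch, assuming capacity per tier is already known in bytes (the function name tier_distribution and the tier labels are illustrative):

```python
from typing import Dict


def tier_distribution(bytes_by_tier: Dict[str, int]) -> Dict[str, float]:
    """Return each tier's share of total archived capacity, in percent."""
    total = sum(bytes_by_tier.values())
    if total == 0:
        # Nothing archived yet: report 0% everywhere instead of dividing by zero.
        return {tier: 0.0 for tier in bytes_by_tier}
    return {
        tier: round(100.0 * size / total, 1)
        for tier, size in bytes_by_tier.items()
    }
```

Tracking these percentages over time shows whether data is actually migrating off the online tier as the policies intend.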

Designing an Archival Solution
Requires an application specific
assessment – look for commonality in
application requirements
Wholly enterprise-wide strategies will be
difficult to build and sustain
Evaluate alternative solutions based on
application requirements and metrics

Don’t Ignore the Organizational Dynamics
Archival Touches Multiple Organizations:
• IT – Applications
• IT – Infrastructure
• Legal
• Users
Consequences of mistakes are enormous:
• Fines
• Litigation
Consider organizing a cross-functional team led by an archival champion with a combination of technical and business expertise

Comprehensive Application Assessment
Data Classification Exercise
Data Set Size and Historical and Predicted Data Growth Rates based on business drivers
Is Regulatory Compliance an Issue?
Data Valuation over Time:
• Access patterns of data 90 days old and beyond.
• Cost of data loss
Going it alone can be difficult
Available resources:
• Services organizations: GlassHouse, Accenture, EDS, Storage Vendor
• Application Management Tools: File-Level SRM, Precise
Budgetary Requirements
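
For file data, the "90 days old and beyond" access-pattern question in this assessment can be approximated from file system timestamps. The sketch below is one way to do it under stated assumptions: it relies on last-access time, which is unreliable on volumes mounted with access-time updates disabled, and the function name stale_files is illustrative.

```python
import os
import time


def stale_files(root, min_age_days=90, now=None):
    """Yield (path, size_in_bytes) for files not accessed in min_age_days."""
    now = time.time() if now is None else now
    cutoff = now - min_age_days * 86400  # 86400 seconds per day
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            info = os.stat(path)
            # st_atime is the last access time as seconds since the epoch.
            if info.st_atime < cutoff:
                yield path, info.st_size
```

Summing the yielded sizes gives a rough estimate of how much capacity is a candidate for archival.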

Components of the Archival Stack
Application Specific Module:
• Discovery and analysis of data assets
• Business rules and policies definitions
• Identification and movement of specific data to appropriate storage medium
• Management, indexing of data and metadata
• Access control mechanism
Storage Infrastructure:
• Physical archive repository
• Data Preservation and Protection
• Indexing Technologies for Retrieval
Diagram labels: Application Data, Management & Control, Physical Repository, Data Flow

Structured Data Archival Challenges to Investigate
ERP deployments are still very nascent
Preventing application downtime during archival
Preserving referential data integrity:
• Archival of core data and associated data in other tables
Enforcing single read-only state across related data
Delivering transparent access to archived/combined data via native app UI:
• Maintaining performance of remote queries and union views.
Update process:
• Restate vs. entire reload
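
The referential integrity and union view points above can be illustrated with a small sketch. It uses Python's built-in sqlite3 module and a hypothetical orders/order_items schema (all table, column, and function names are assumptions for illustration). Parent rows and their associated child rows move to archive tables in a single transaction, and a union view gives combined access to live and archived data.

```python
import sqlite3

# Hypothetical schema: live tables, archive tables, and a combined view.
SCHEMA = """
CREATE TABLE orders (id INTEGER PRIMARY KEY, order_date TEXT);
CREATE TABLE order_items (id INTEGER PRIMARY KEY,
                          order_id INTEGER REFERENCES orders(id), sku TEXT);
CREATE TABLE archived_orders (id INTEGER PRIMARY KEY, order_date TEXT);
CREATE TABLE archived_order_items (id INTEGER PRIMARY KEY,
                                   order_id INTEGER, sku TEXT);
CREATE VIEW all_orders AS
    SELECT * FROM orders UNION ALL SELECT * FROM archived_orders;
"""


def archive_orders(conn, cutoff_date):
    """Move orders older than cutoff_date, with their items, atomically."""
    with conn:  # commits on success, rolls back if any statement fails
        conn.execute(
            "INSERT INTO archived_orders "
            "SELECT * FROM orders WHERE order_date < ?", (cutoff_date,))
        conn.execute(
            "INSERT INTO archived_order_items "
            "SELECT i.* FROM order_items i "
            "JOIN orders o ON o.id = i.order_id "
            "WHERE o.order_date < ?", (cutoff_date,))
        # Delete child rows before parents so no orphans are left behind.
        conn.execute(
            "DELETE FROM order_items WHERE order_id IN "
            "(SELECT id FROM orders WHERE order_date < ?)", (cutoff_date,))
        cur = conn.execute(
            "DELETE FROM orders WHERE order_date < ?", (cutoff_date,))
        return cur.rowcount  # number of parent orders archived
```

In a production ERP or CRM system the archive tables would typically live in a separate database or storage tier, which is where the remote query performance concern comes from.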

Unstructured Data Considerations
Scalability
Sustained performance with data growth:
• Hierarchical file systems are limited at large scales
Content Access and Visibility
Using metadata to intelligently manage and maintain the archive addresses traditional file system limitations
Scalability of Index (Content addresses)
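
To make "content addresses" concrete: in a content-addressed store, an object's address is derived from a hash of its bytes, so identical content is stored once and the index grows with the number of unique objects. The sketch below is a toy in-memory version; the class name ContentStore and the choice of SHA-256 are assumptions, and commercial products may use different hashing schemes.

```python
import hashlib


class ContentStore:
    """Toy in-memory content-addressed store (illustrative only)."""

    def __init__(self):
        self._objects = {}  # the index: content address -> stored bytes

    def put(self, data: bytes) -> str:
        address = hashlib.sha256(data).hexdigest()
        # Identical content maps to the same address and is stored only once.
        self._objects.setdefault(address, data)
        return address

    def get(self, address: str) -> bytes:
        data = self._objects[address]
        # Re-hash on read to detect corruption of the stored object.
        if hashlib.sha256(data).hexdigest() != address:
            raise ValueError("stored object does not match its content address")
        return data

    def __len__(self) -> int:
        return len(self._objects)
```

Because every lookup goes through the index, its size and lookup speed become the scaling limit as the archive grows.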

Email Archival Challenges
Stringent regulations: SEC Rule 17a-4
• Non-rewriteable, non-reusable media
• Verification of writes
• Serialize units of media
Solution Requirements:
• Server-based capture
• Support for multiple distributed Email Servers

Meta Data Holds Real Value
Traditional File Systems:
• Digital asset tied to specific infrastructure
• No value outside of infrastructure context
Object-based systems:
• Self-describing attributes for digital asset
• Enables powerful policy-based data movement applications
Meta Data is data about data:
• Object Age and creation date
• Object Change History
• Associated application/users
• Access control
• Priority/Criticality
• Data Access/Frequency
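
As an illustration of policy-based data movement driven by metadata, the sketch below picks a storage tier from a few of the attributes listed above. Every threshold, field name, and tier label here is a hypothetical placeholder; real policies would come out of the requirements process with application owners.

```python
from dataclasses import dataclass


@dataclass
class ObjectMetadata:
    age_days: int               # object age
    accesses_last_90_days: int  # data access frequency
    critical: bool              # priority/criticality
    under_retention: bool       # subject to a compliance retention rule


def choose_tier(meta: ObjectMetadata) -> str:
    """Map metadata to a storage tier using placeholder policy rules."""
    if meta.critical or meta.accesses_last_90_days >= 10:
        return "primary"
    if meta.under_retention:
        return "object"  # assumed to offer WORM-like preservation
    if meta.age_days < 365:
        return "sata"
    return "tape"
```

The rules are evaluated in order, so criticality and access frequency override age; changing the order changes the policy.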

Choosing the Right Storage Medium
[Chart: Disk Systems, D2D Systems, Object Storage, Libraries, and Drives positioned by Amount of Data and Probability of Reuse. Recovery Time axis: < Seconds, Minutes, Hours to Days. Life Expectancy axis: 1 Week, 1 Month, 3 Months, 1 Year, 18 – 30 Years.]
Key Considerations for Storage Media
Cost
Access time
Application access method:
• NFS/CIFS
• Application-specific API
Reliability/Availability
Data Preservation Capability
Scalability
Archival solution integration

Storage Media Considerations
Primary Storage
Pros: No risk of data loss; Instantaneous access
Cons: Exorbitant costs; Performance degradation
Secondary Storage (SATA)
Pros: Cost effective; Solid access time
Cons: Integration; Enforcing preservation; Management
Object Storage
Pros: Fit for large unstructured files; Elimination of data redundancy; WORM-like preservation
Cons: Price premium; Performance scalability with index growth
Tape
Pros: Most cost effective; Removable; Integrated WORM
Cons: Access time; Reliability

Shifting towards an On-line Model
[Diagram: Primary, Object Storage, Tape, SATA]

Representative Vendors
Archival Solutions
• Structured Data: OuterBay, Princeton Softech, Applimation, Ixos
• Email: Legato, KVS, Assentor
• Unstructured: Documentum, FileNet, NICE
Storage Platforms
• SATA: CLARiiON, STK, IBM, Nexsan
• Object: COPAN, Centera, Archivas, Permabit, DCT
• Tape: STK, Quantum, ADIC, IBM
Start with your application vendor

Trust But Verify
Develop processes to periodically access historical data to test:
• Data integrity
• Access time
Manage capacity growth using vendor-supplied reporting tools
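
The periodic test described above can be sketched as a small verification routine: read back a sample of archived objects, compare each against a checksum recorded at archive time, and record how long the read took. This is a minimal sketch under stated assumptions: the manifest format, the read_object callable, and the use of SHA-256 are illustrative choices, not a vendor API.

```python
import hashlib
import time


def verify_sample(read_object, manifest):
    """Check integrity and access time for a sample of archived objects.

    read_object: callable taking an object id and returning its bytes.
    manifest: dict mapping object id -> SHA-256 hex digest recorded at
              archive time.
    Returns a dict mapping object id -> (integrity_ok, seconds_to_read).
    """
    report = {}
    for object_id, expected in manifest.items():
        start = time.perf_counter()
        data = read_object(object_id)
        elapsed = time.perf_counter() - start
        ok = hashlib.sha256(data).hexdigest() == expected
        report[object_id] = (ok, elapsed)
    return report
```

Running this on a schedule against objects from each storage tier gives early warning of both silent corruption and access times drifting past the stated restore requirements.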

Summary
Archival is not backup and is not just about compliance
Successful strategy requires application-centric approach
Engage with key corporate stakeholders to define requirements and select solutions
Look for automated and interoperable software and hardware modules.
Be Paranoid!

Thank you!
Arun Taneja
Alex Gorbansky