Getting Started Using Database Archiving
Toronto DAMA Chapter Meeting, 16 September 2009
Jack E. Olson
www.svaltech.com
SvalTech
“Database Archiving: How to Keep Lots of Data for a Long Time”, Jack E. Olson, Morgan Kaufmann, 2008
Copyright SvalTech, Inc., 2009
Why This Presentation
• A common position of many IT shops is
– We know we should be doing database archiving
– We know it will be valuable to us
– But we don’t know how to get started
• Database archiving is an enterprise technology: it can be used in many applications
• Not all database applications are suitable for database archiving
• Suitable applications have widely differing return-on-investment potential
The Database Archiving Survey
organize survey team
application enumeration
first-cut feasibility
data-life-cycle analysis
operational analysis
risk analysis
metric gathering
evaluate implementation options
business case development
prioritization
The Survey Organization
Mandate, People, Inputs
Mandate
A management directive that creates the database archiving survey task force and gives them the scope and objectives of the study.
Scope: business units to include, organizational units (divisions, companies, campuses)
Objectives: find best candidates for cost reduction, fixing operational problems, risk reduction
The Survey Team
• Chair
• Full-time members
– IT/enterprise architect
– storage administration
– records retention
• Subject matter members
– database architect
– data management
– business unit data analyst
– database administration
• Incidental members
– legal department
– IT compliance
– data governance
– security administration
– data analyst (BI type)
Starting Materials
• Enterprise data model
• Data classification results
• SLAs
• IT storage strategy
• Regulations/compliance rules
• Data governance mandates
Application Enumeration
• Limit search to those within mandate
– business unit
– location
– enterprise
• Identify operational applications
– classify as transactional vs. static data
– include those already archiving to any extent
• Identify retired applications still retaining data
• Identify applications about to change
– consolidations planned
– planned or recent acquisitions
– replacements/conversions/reengineering
– identify any strategies for application retirement
Application Enumeration
For applications with potential:
• Capture application data model
• Identify business records within the data model
• Connect business records to records retention and legal categories
• Identify database information: system/dbms/file/metadata
• Create a Database Topology chart
• Identify parallel applications within the corporation (even if out of scope)
• Identify operational replicates
• Identify backup/disaster recovery stores and strategies
• Identify recurring data extracts for BI, etc.
• Get rough idea of db size and transaction rates
Database Topology Chart
[Figure: database topology chart. Box labels: create data, operational, replicate, operational, BI stores, CRM, archive, backup, backup, disaster recovery, offline storage.]
First-Cut Feasibility
Factors for continuing to consider:
• important data
• lots of data
• lots of individual business records
• simple data structures
• relatively stable data structures (little change)
• long retention requirement
• long inactive period within retention requirement
• low frequency access requirement in inactive period
• low performance requirement in inactive period
• simple access requirements in inactive period
Apply criteria after each subsequent step to further eliminate bad candidates
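The first-cut screen can be sketched as a checklist score that is re-applied after each later survey step. The factor names below mirror the list above; the `Candidate` class, the equal weighting of factors, and the 0.7 cutoff are illustrative assumptions, not part of the survey method itself.

```python
from dataclasses import dataclass, field

# Factor names follow the feasibility list; the identifiers are hypothetical.
FACTORS = (
    "important_data",
    "lots_of_data",
    "many_business_records",
    "simple_structures",
    "stable_structures",
    "long_retention",
    "long_inactive_period",
    "low_access_frequency_when_inactive",
    "low_performance_need_when_inactive",
    "simple_access_when_inactive",
)


@dataclass
class Candidate:
    name: str
    factors_met: set = field(default_factory=set)

    def score(self) -> float:
        """Fraction of the feasibility factors this application satisfies."""
        return len(self.factors_met & set(FACTORS)) / len(FACTORS)


def first_cut(candidates, cutoff=0.7):
    """Keep candidates whose score meets the (assumed) cutoff."""
    return [c for c in candidates if c.score() >= cutoff]
```

A survey team would likely weight the factors differently per organization; equal weights are used here only to keep the sketch short.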
Examples
Good
• Bank deposits and withdrawals
• Stock trades
• Credit card transactions
• Ticketmaster transactions
• Medical claim data
• Casualty claim data (auto, home)
• Retail sales inventory transactions
• Package tracking
• Passenger flight data
• Driver license records
• Sales tax records
• Property tax records
• Telephone call transactions
• Nuclear reactor monitoring records
• Auto warranty records
Not Good
• Customer master data
• Airplane manufacturing records
• HR records
• Felony records
• Home sales
Data Life Cycle Analysis
Create a database archiving DLCA for each business record type
• Data Retention Chart
• Business Record Process Chart to determine inactive period
• Business Record SLA chart by age of record
Data Retention Chart
The requirement to keep data for a business object for a specified period of time. The object cannot be destroyed until after the time for all such requirements applicable to it has passed.
Business Requirements
Regulatory Requirements
The Data Retention requirement is the longest of all requirement lines.
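The "longest line wins" rule can be shown in a few lines. This is a minimal sketch: the requirement names and year counts in the usage note are made-up examples, and the function names are hypothetical.

```python
from datetime import date


def add_years(d: date, years: int) -> date:
    """Add whole years to a date, mapping Feb 29 to Feb 28 when needed."""
    try:
        return d.replace(year=d.year + years)
    except ValueError:
        # Feb 29 has no counterpart in a non-leap target year.
        return d.replace(year=d.year + years, day=28)


def retention_end(created: date, requirement_years: dict) -> date:
    """Earliest date a record may be destroyed.

    requirement_years maps each business or regulatory requirement
    to its retention period in years; the longest one governs.
    """
    longest = max(requirement_years.values())
    return add_years(created, longest)
```

For example, a record created on 16 September 2009 with an assumed 3-year business requirement and an assumed 7-year regulatory requirement could not be destroyed before 16 September 2016.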
Business Record Process Chart
For a single instance of a data object:
[Figure: business record process chart. Time axis divided into operational, reference, and inactive phases, spanned by the retention requirement.]
Activities shown on the chart, in the order they appear:
• Create PO, Update PO, Create Invoice, Backorder, Create Financial Record, Update on Ship, Update on Ack
• Weekly Sales Report, Quarterly Sales Report
• Extract for data warehouse, Extract for business analysis, Common customer queries, Common business queries
• Ad hoc requests, Lawsuit e-Discovery requests, Investigation data gathering
Business Record SLA Chart by Age
For a single instance of a data object:
[Figure: business record SLA chart by age. Labels: query response time; transaction volume; create/update; security (no. users); read; retention requirement. Time axis divided into operational, reference, and inactive phases.]
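The operational, reference, and inactive phases on these charts can be treated as age bands for a record. The sketch below assumes each phase boundary is expressed as a cumulative age in days; the function name, the boundary parameters, and the "past retention" label are illustrative assumptions.

```python
def record_phase(age_days, operational_days, reference_days, retention_days):
    """Classify a record by age into the phases shown on the charts.

    Boundaries are cumulative ages in days and must be ascending:
    operational ends first, then reference, then the retention requirement.
    """
    if not 0 <= operational_days <= reference_days <= retention_days:
        raise ValueError("phase boundaries must be ascending")
    if age_days < operational_days:
        return "operational"
    if age_days < reference_days:
        return "reference"
    if age_days < retention_days:
        return "inactive"
    # Every applicable retention requirement has passed.
    return "past retention"
```

Records in the inactive band are the natural archive candidates, since their access and performance requirements are lowest there.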
Operational Analysis
Don’t assume there are no problems.
Talk to DBAs and users.
Look for trends.
Look for escalating operational costs.
Get numbers.
Operational Analysis
• Performance Issues
– Not meeting response time SLA
– Longer time to run extracts
– Longer time to run backups
– Longer time to run database reorganizations
– Running reorganizations more frequently
– More difficult to tune
• Risk Issues
– Longer estimated time to run recovery
– Longer estimated time to run disaster recovery
• Cost Issues
– Higher annual hardware costs
– Higher annual MIP-based software cost
– Adding expensive DASD to support database and backups
Risk Analysis
• Data Loss Risk
– Isolation from internet hackers
– Prevent ANY updates or deletes
– Preserve data through multi-site backups and periodic pings
• Data Quality Risk
– Changing data structures and column semantics
– Changing reference data
• Unauthorized Access Risk
– Reduced (or different) user set
– Audit trail of access
• Legal Risk
– Preserve authenticity of data in archive
– Reduce cost and time to produce data for discovery requests
Metric Gathering
• Data
– bytes stored per business object
– new transactions created per day
– bytes for backups, replicates
– growth in transaction rates
– any sudden expected additions
– past history plus future projections
• Storage Costs
– cost per byte: operational
– cost per byte: backup
– cost per byte: archive
– archive compression ratios
• System Costs
– MIPS required to process
– software license fees
– staff for operational
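The data and storage-cost metrics above combine into simple yearly estimates. The sketch below is one hypothetical way to do that arithmetic; the `StorageMetrics` fields, the per-GB-per-year cost unit, and the treatment of compression as a plain divisor are assumptions made for illustration.

```python
from dataclasses import dataclass

GB = 10**9  # decimal gigabyte, assumed unit for the cost figures


@dataclass
class StorageMetrics:
    bytes_per_object: int
    objects_per_day: int
    backup_copies: int                # backups plus replicates of operational data
    cost_per_gb_operational: float    # per GB per year
    cost_per_gb_backup: float         # per GB per year
    cost_per_gb_archive: float        # per GB per year
    archive_compression_ratio: float  # e.g. 4.0 means 4:1


def yearly_growth_gb(m: StorageMetrics) -> float:
    """New data added per year, in GB, at the current transaction rate."""
    return m.bytes_per_object * m.objects_per_day * 365 / GB


def operational_cost(m: StorageMetrics, gb: float) -> float:
    """Yearly cost of keeping gb in the operational store plus its copies."""
    return gb * (m.cost_per_gb_operational + m.backup_copies * m.cost_per_gb_backup)


def archive_cost(m: StorageMetrics, gb: float) -> float:
    """Yearly cost of keeping the same data, compressed, in the archive."""
    return gb / m.archive_compression_ratio * m.cost_per_gb_archive
```

Growth in transaction rates and sudden expected additions are left out here; a fuller model would project `objects_per_day` forward rather than hold it constant.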
Metric Gathering
For retired applications, concentrate on
– displaced system cost
– displaced software cost
– displaced staff cost
[Figure: two stacks compared. NOT shared: IBM mainframe, IMS DBMS, CICS, DBA/SYSPROG. Shared: LINUX server, archive software, JDBC, archive admin.]
Evaluate Implementation Options
• Software
– Vendor provided software
– Custom built solution
• Access tools
– Original application
– Generic report generation/query tools
– Custom built
• Storage for archive
– Storage subsystem
– Hosted storage
Architecture of Database Archiving
[Figure: architecture of database archiving, showing an Operational System and an Archive Server. Component labels: application program, OP DB, archive extractor, Archive Administrator, Archive Designer, Archive Data Manager, Archive Access Manager, archive catalog, archive storage.]
Estimate Implementation Time and Cost
• Archiving systems required
– Servers
– Storage systems (hosted storage?)
– Licensed software
• Application Design
• Implementation
• Test
• Deployment
• Ongoing operation and administration
Business Case Development
– Lower IT costs
– Improved operational efficiency
– Risk reduction
Lower IT Costs
• Systems
– Reduce size/cost of operational systems
– Put off or eliminate need for system upgrades
• Software
– Eliminate or reduce cost of expensive system software
    • DBMS
    • Transaction system
– Eliminate or reduce cost of application software
• Storage costs
– Switch to lower cost storage
– Impact on backups/disaster recovery stores
– Reduction in byte count stored
• Staff
– Eliminate or reduce legacy system staff
Chart it
[Figure: database size today, shown two ways: a single operational db versus an operational db plus an archive.]
• All data in operational db: most expensive system, most expensive storage, most expensive software
• Inactive data in archive db: least expensive system, least expensive storage, least expensive software
In a typical operational db, 60-80% of data is inactive. This percentage is growing.
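The effect of moving inactive data to cheaper storage can be estimated with one formula. This is a rough sketch: the function name and the flat per-GB yearly costs are assumptions, and system and software costs are ignored.

```python
def annual_cost_after_archiving(total_gb, inactive_fraction,
                                op_cost_per_gb, archive_cost_per_gb):
    """Yearly storage cost once the inactive share lives in the archive.

    inactive_fraction is the share of data that is inactive (0.0 to 1.0).
    Costs are per GB per year.
    """
    if not 0.0 <= inactive_fraction <= 1.0:
        raise ValueError("inactive_fraction must be between 0 and 1")
    active_gb = total_gb * (1.0 - inactive_fraction)
    inactive_gb = total_gb * inactive_fraction
    return active_gb * op_cost_per_gb + inactive_gb * archive_cost_per_gb
```

Evaluating this at both ends of the 60-80% range, and comparing each result with the all-operational cost (`inactive_fraction=0.0`), gives a bracket for the savings chart.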
Lower IT Costs
• First year impact
• Time to recover project costs
• Chart cost savings over time
– Plot data growth over time for operational
– Plot data growth over time of archive
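Time to recover project costs is a cumulative-savings calculation. The sketch below assumes savings are supplied as a list of monthly amounts; the function name and the monthly granularity are illustrative choices.

```python
def payback_months(project_cost, monthly_savings):
    """Number of months until cumulative savings cover the project cost.

    Returns None if the supplied savings never reach the project cost.
    """
    cumulative = 0.0
    for month, saving in enumerate(monthly_savings, start=1):
        cumulative += saving
        if cumulative >= project_cost:
            return month
    return None
```

Because data volume keeps growing, the savings series would normally rise over time, so a flat monthly figure gives a conservative payback estimate.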
Operational Improvements
• Itemize improvements expected
– Performance of operations
– Reduction of utility times
– Reduction of recovery times
– Reduction of disaster recovery times
– Reduction of DBA workload
• Provide cost savings where appropriate
Risk Reduction
• Itemize improvements expected
– Less risk of failing e-Discovery request
– Enhanced data quality of older data
– Less exposure to loss of data authenticity
– Better access control
– Better compliance
– Better data governance
– Less dependence on legacy systems
• Provide cost savings where appropriate
Prioritization
– Determine Prioritization Criteria
    • Cost is most common primary factor
– First archiving project may have other goals
    • Lower risk of failure
    • Faster implementation
    • Faster return on investment
    • Usually a retired application project
– Risk may override other factors
    • Preserve data authenticity
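One way to apply these criteria is a sort with a compound key. The sketch below is an assumption-laden illustration: the `Project` fields, the boolean risk flag, and the tie-break on implementation time are hypothetical, and a real survey team would choose its own criteria and weights.

```python
from dataclasses import dataclass


@dataclass
class Project:
    name: str
    annual_savings: float
    months_to_implement: int
    must_do_for_risk: bool = False  # e.g. data authenticity is at stake


def prioritize(projects):
    """Order projects: risk-driven first, then highest savings, then fastest."""
    return sorted(
        projects,
        key=lambda p: (not p.must_do_for_risk, -p.annual_savings, p.months_to_implement),
    )
```

For a first archiving project, the key could instead lead with `months_to_implement` to favor a quick, low-risk win.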
Final Thought
• Always do a survey to find the best applications to start with
• Always do a survey to identify those that make sense to proceed with versus those that do not: don’t waste time on apps that are too hard to implement or that will have little value.
• A good database archive application can save millions of dollars per year, increase performance of operational systems and reduce risk all at the same time. The trick is identifying them and proving it.
• Repeat the Database Archiving Survey from time to time in the future.