Reference Certificate - PRISE Teradata Database Optimization
PRISE Ltd. Bartók Béla út 152, Budapest, HU Phone: +36 1 789 1888 Fax: +36 1 786 3242 Email: [email protected] Web: prise.hu
Reference Certificate
Vendor name: PRISE Recipient: Telenor Hungary Reference issuer: Zsolt Kovács, Head of Infrastructure Operation
Subject of service: PRISE Compress Wizard implementation
Vendor introduction
PRISE enables its customers to realize the maximum value of their Teradata investments by
decreasing storage space and hardware consumption using a proprietary, self-developed
methodology.
After a free analysis of customers’ hardware usage patterns, PRISE can rapidly optimize storage
space and hardware consumption. Achieved benefits significantly reduce cost of ownership.
Recipient
Telenor Hungary
2nd largest Mobile Network Operator
3.6 million active subscribers
Net 9.4 TeraBytes Teradata Data Warehouse
Content of service
PRISE successfully implemented PRISE Compress Wizard at Telenor Hungary in Q2 2011.
Implementation milestones:
Analysis of current data storage
Report of the forecasted storage and performance impact
Implementation of storage optimization
Final report of the real storage and performance impact
PRISE’s business model is simple and effective: analysis and forecast reports are free. Savings are
shared upon measuring the real impact.
Achievements
Analysis performed without any congestion in regular processes
Reports forecasted a possible 1200 Gbytes of net space gain
Implementation performed without any operational issues and released 1258 Gbytes of net free space, which is
  17.0% of the affected tables’ net storage space
  15.5% of the affected databases’ net storage space
  enough for 11 months of data growth
Optimization had no measurable performance impact on load procedures
Optimization caused a total decrease of 13% in overall IO and 15% in overall CPU usage
The project delivered a cost reduction of ~100,000 EUR within 1-2 weeks* (*extended by phasing and resource availability)
Impressions
The vendor demonstrated excellent knowledge of Teradata and related technology.
PRISE Compress Wizard implementation delivered real value to Telenor.
Based on experience I highly recommend PRISE and its products as a professional information
services partner to your company.
Zsolt Kovács, Head of Infrastructure Operation
ncsay Dániel, Director of SSOD
Budapest, 2012.03.02.
Case Study
The Companies
PRISE - Vendor
PRISE enables its customers to realize the maximum value of their Teradata investments by
decreasing storage space and hardware consumption using a proprietary self developed
methodology.
The implemented PRISE Compress Wizard delivers quick win results for customers by immediately
releasing resources like storage space and processing capabilities, and decreasing the specific
operational costs.
PRISE’s experience shows that even the best managed Teradata environments hide significant
optimization opportunities. The company’s unique enablers are its Teradata Masters’ more than
10 years of industry experience and a proven solution for optimizing Teradata’s internal database
compression.
PRISE Compress Wizard implementation projects demand minimal onsite activities and have a short
implementation time, and therefore offer quick wins. Most of the benefits manifest as immediately
usable free database space and increased processing capacity, while others reduce the long-term
costs. For more details please refer to Annex A.
Telenor Hungary - Recipient
Hungary’s second-largest mobile network operator was founded in 1994 and has been fully owned by
the Telenor Group since 2002. The company’s telecommunication services cover voice, text message
(SMS) and mobile internet over 2G and 3G networks. Of its 3.6 million subscribers, 52% are postpaid
and 48% prepaid. The company achieved 650 million EUR revenue in 2009.
http://www.telenor.hu/en/telenor-hungary/facts-figures/
http://www.telenor.com/
In 1999 the firm decided to build a Teradata-based Enterprise Data Warehouse (EDW) solution.
The solution has since been continuously developed according to best practices to fulfill an ever
expanding set of enterprise requirements. In 2006 it was deemed the best data warehouse solution in
the Telenor Group, due to the complex analytical functionality it provides, such as profitability
calculation, social network modeling and churn prediction, and the one-to-one marketing capabilities
established on top of it with the implementation of a campaign management solution. The data
warehouse of Telenor Hungary stores a nearly 360° view of the customers and the company, keeping
all detail data for 13 months and aggregated data from inception. During its lifetime the EDW has
grown to 5 times its original size and at least 3 times its original complexity. It has also been
integrated into most core business processes, making it one of the most important IT systems within
the company.
Methodology
Project lifecycle
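Lifecycle phases shown in Figure 1, in order: Methodology and preparation, Analysis, Forecast report,
Decision, Implementation, Result report and closure. The figure labels phases as Technical or
Business and marks two durations of 1-2 weeks each.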
Figure 1: PRISE Compress Wizard implementation lifecycle
Methodology and preparation
Non-disclosure agreement signed. Technology demonstrated on-site including ways of working,
project milestones and timelines, contractual highlights and guarantees, Q&A. Technical
environment initialized, including access permissions, resource allocation, timing.
Roles:
Vendor: On-site visit, presentation, requirements definition
Recipient: DBA and decision-maker get familiar with the product, resource planning,
technical preparations
Analysis
Target areas identified. PRISE Compress Wizard assembled according to requirements and run.
PRISE evaluates results within 3 working days.
Roles:
Vendor: Consults, prepares and supervises product configuration and scheduled run
Recipient: Participates in identifying focus areas, is available for assistance
Forecast report
Findings and recommendations prepared and presented. Q&A session.
Roles:
Vendor: Calculates forecasted benefits, creates reports and performs the presentation
Recipient: DBA and decision-maker get precise information about the achievable
benefits
Decision
The Recipient decides whether to implement the PRISE Compress Wizard optimization. A no-go
decision concludes the project at zero cost to the Recipient, and no implementation takes place.
Implementation
Optimization scripts are provided, deployed, scheduled and run.
Roles:
Vendor: Deploys optimization scripts and assists in running
Recipient: DBA fully involved in implementation, managing EDW processes and
running optimization scripts
Result report and closure
Log data evaluated and the results of the overall project are assessed and presented. Project closure
with benefits shared as previously agreed. Upon mutual satisfaction Recipient issues a reference
certificate.
Roles:
Vendor: Assesses and presents results
Recipient: Evaluates results
Optimization process
PRISE Compress Wizard storage optimization is based on fine-tuning Teradata’s Multi-Value
Compression (MVC) functionality. MVC is a 100% transparent feature: using it does not affect
the applications. MVC is usually configured manually and does not automatically adapt to the
changing business and data environment.
From a technical point of view the project has two separate tasks: analysis and implementation.
During the analysis phase the Vendor’s expert identifies the set of tables in focus, in cooperation
with the Recipient’s delegated DBA. The qualified tables are analyzed by proprietary software,
PRISE Compress Wizard, which inspects the distribution of data values, calculates the optimal
MVC settings and calculates the possible space gain for each table involved.
The Vendor then evaluates the result statistics and presents the suggested optimization prospects.
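The analysis idea (inspect the value distribution of a column, choose the values to compress, estimate the gain) can be illustrated with a minimal Python sketch. This is not PRISE Compress Wizard’s algorithm, which is proprietary. The function name, the sample data and the cost model are assumptions made for illustration: each row holding a compressed value saves the column width, every row pays a few presence bits, and each compressed value is stored once. The limit of 255 values per column is likewise treated as an assumption.

```python
from collections import Counter
from math import ceil, log2


def estimate_mvc_gain(values, width_bytes, max_values=255):
    """Pick the most frequent values to compress and estimate the bytes saved.

    Simplified, assumed cost model (not Teradata's exact storage layout):
      + each row holding a compressed value saves `width_bytes`
      - every row pays ceil(log2(k + 1)) presence bits for k compressed values
      - each compressed value is stored once, costing `width_bytes`
    """
    n_rows = len(values)
    ranked = Counter(values).most_common(max_values)
    best_k, best_gain = 0, 0.0
    covered_rows = 0
    for k, (_, frequency) in enumerate(ranked, start=1):
        covered_rows += frequency
        presence_bits = ceil(log2(k + 1))
        gain = (covered_rows * width_bytes
                - n_rows * presence_bits / 8
                - k * width_bytes)
        # Keep the shortest value list that reaches the best estimated gain.
        if gain > best_gain:
            best_k, best_gain = k, gain
    return [value for value, _ in ranked[:best_k]], best_gain


# Hypothetical column sample: two dominant values and ten rare ones.
sample = ["HU"] * 900 + ["NO"] * 90 + [f"X{i}" for i in range(10)]
chosen, saved_bytes = estimate_mvc_gain(sample, width_bytes=10)
print(chosen, saved_bytes)
```

In this sample only the two dominant values are worth compressing: adding rare values costs more presence bits than it saves, which is why the value list has to follow the actual data distribution.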
The second technical phase of the project covers the implementation, which enables the Recipient
to realize the benefits in practice. The Vendor’s and Recipient’s DBAs create a deployment schedule
to execute the suggested modifications in the database effectively. According to the plan the Vendor
delivers SQL scripts that relocate data into the optimized table structure. The scripts are specifically
designed to avoid incompatibilities and system disturbances. For details please refer to Annex B.
Benefits of compression
The cost structure of a data warehouse consists of several different components:
Storage and processing hardware costs, e.g. servers and disk arrays
License and support fees in proportion to the system performance
Operational expenses like backup, space and performance management, optimization
Electric power consumption of the hardware components
Server room reservation and cooling costs
Personnel costs
Implementing PRISE Compress Wizard decreases all of these cost components, saving the company
significant expenses without noticeable disadvantages, thanks to Teradata’s technological solution
utilized by PRISE Compress Wizard. We recommend repeating the compression optimization from
time to time in order to exploit the maximum advantages of the product.
Benefits achieved can be classified into two categories. For further details please refer to Annex A.
Instant advantages
Lower storage consumption, instantly freed disk space
Lower resource consumption, so faster queries for users
Shorter backup windows, less archive media
Deferred hardware extension
No side effects or application disturbances
Occasional advantages
Shorter migration, reconfiguration times at upgrades
Shorter disaster recovery time
Slower growth of database size, less frequent capacity expansions
More applications and reports can be served by the current infrastructure
Significantly less DBA time is required for manual database optimization tasks
Achievements at Telenor Hungary
Storage space saving
On the optimized tables PRISE Compress Wizard achieved a 17% storage space saving overall, with
savings on individual tables ranging from 4% to 92%. To put the achievement in perspective, the
gained storage area is enough to cover 11 months of data volume increase. Detailed estimations and
statistics are available in Annex C.
Reduction of resource consumption
The CPU and IO load of the system decreased by 15% and 13% respectively, despite
significantly increased user activity. The chart shows the system measurement results:
Figure 2: Overall Workload and CPU&IO usage and trends, including optimization phases
Telenor Hungary’s Teradata RDBMS has the DBQL (Database Query Logging) feature switched on
continuously, and stores ~1-2 months of historical data. The chart in Figure 2 is based on DBQL
data and shows the total CPU and IO load of the system over time.
“Phases”: shows the optimization implementations in time (phases 4, 5, 7, 8)
System CPU time: 7-day moving average of the daily SUM() of CPU seconds
System IOs: 7-day moving average of the daily SUM() of IO
Workload: 7-day moving average of the number of SQL statements issued daily
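The 7-day moving average of daily sums used for the chart series can be sketched in Python as follows. This is a minimal illustration, not the query actually run against DBQL: the function names and the sample log rows are hypothetical, and the sketch assumes that every calendar day appears in the log (no gaps).

```python
from collections import defaultdict
from datetime import date, timedelta


def daily_totals(log_rows):
    """Sum a metric per calendar day; log_rows yields (day, value) pairs."""
    totals = defaultdict(float)
    for day, value in log_rows:
        totals[day] += value
    return dict(totals)


def moving_average(totals, window=7):
    """Trailing moving average, keyed by the last day of each window.

    Assumes the days in `totals` are consecutive.
    """
    days = sorted(totals)
    averages = {}
    for end in range(window - 1, len(days)):
        window_days = days[end - window + 1:end + 1]
        averages[days[end]] = sum(totals[d] for d in window_days) / window
    return averages


# Hypothetical log: two queries per day for eight days,
# with day i consuming i CPU seconds in total.
start = date(2011, 5, 1)
rows = []
for i in range(1, 9):
    day = start + timedelta(days=i - 1)
    rows += [(day, i / 2), (day, i / 2)]

cpu_trend = moving_average(daily_totals(rows))
print(cpu_trend)
```

The first six days produce no value because a full 7-day window is not yet available; the same two functions apply unchanged to IO counts or to the number of statements per day.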
Vertical axis of Figure 2: trend of CPU time, IO and workload.
Project details
The optimization project was initiated by PRISE in Q1 2011. It required notably few human
resources from Telenor’s side: only one DBA, who prepared the working environment and then
executed the implementation package. Telenor DBA time spent on the project did not exceed 5
working days. Decision maker time involvement was limited to a few hours in meetings.
The time frame of the project was quite long compared to the required human and machine
resources. The reason was low resource availability, which led to 8 implementation phases. The
lengthier timing caused no drawback: all benefits were delivered.
Figure 3: PRISE Compress Wizard implementation project timing
Annex
A. Aspects of benefits
Some of the achieved advantages will not be visible immediately, but in the long run they will
apply: the company’s growing business produces more and more data, and rising market
competition demands more complex algorithms. Both require hardware infrastructure upgrades
sooner or later, or periodically. Optimizing the available resources decreases the investment in
additional hardware and its associated costs.
Teradata’s license and support fee calculation model is based on TPERF, a weighted mixture of
processing capacities. The more CPU and IO capacity a system has, the bigger the TPERF value of
the installation and the higher the fees.
More server nodes and disk bundles consume more electric energy. Research has shown that
the power consumption expenses of IT infrastructure fall into the price range of license fees.
Server room cooling additionally doubles the power consumption.
Other cost components are reduced immediately. Backup and archive times shorten in proportion to
the freed disk capacity. Companies often suffer from a lack of free space. In this case difficult
decisions must be made: investing in additional hardware, or reducing either the scope of stored
information or the depth of history. These will be painful to some division of the company, unlike
compression, where there are no tradeoffs.
It is important to highlight that PRISE Compress Wizard is 100% transparent for all kinds of
applications. Introducing or re-optimizing the product therefore involves no ETL, application
or report development, which means there is no hidden cost.
B. Technical description of project
Prerequisites
For quality assurance purposes we prefer to use Teradata’s DBQL feature. We use 3-4 weeks of log
data as the basis of the resource consumption calculations for the analysis and implementation as
well as for the final results.
Target selection
We used the Teradata data dictionary to make a pre-selection of the candidate tables. Based on
consultation with development, reporting and operations staff, we excluded the unnecessary
databases and tables from the scope.
The project decided to examine the top 1000 tables by size; after excluding the unnecessary and
stale ones, 889 remained as candidates for analysis.
Calculation process
The analysis of the <target> is a 100% automatic process which has the following
prerequisites:
1 database in the EDW to which PRISE Compress Wizard can write, with ~1000 MB of space (depending on the local situation), referenced as <CCWDB>
1 user in the EDW with read-only access to all EDW database tables and the data dictionary, and read-write access to <CCWDB>
Enough resources (priority, TASM limits) for the given user to run the statistical processes. At Telenor Hungary PRISE used a standard “Developer Role” to run the analysis process with its default priority, spool space and access rights
Enough spool space for calculations, depending on the analysis sample size
A Windows workstation with
  o JVM 1.6 installed
  o Teradata client installed
  o Network access to the EDW system (through JDBC)
The process was run in multiple parts during the nights in order to minimize the disturbance of
EDW users; it can also run in the daytime if the priority settings allow seamless EDW usage. The
time the whole process takes depends on
Available processing resources
Number of tables in <target>
Number of columns in <target>’s tables
Storage size of <target>
The calculation process runs as an ordinary ad-hoc report and does not disturb loads (it uses access
locking) or other reporting and analysis tasks.
The calculation process at Telenor Hungary was run with the following details:
Used a 1 million record sample from each table
Ran in 3 parts, which took 10h, 16h and 1h respectively
Utilized a total of 94 CPU hours and 916 million IOs for the calculations
The calculation results (including detailed value distribution data) occupied ~700 Mbytes
The result of the calculation process contains the original and the estimated optimized table sizes,
to be able to estimate the amount of workspace that will be necessary for implementing the
optimization and the storage gain which can be achieved – before any modifications are done on
the database.
Analysis & report
After the completion of the calculation process we found that the following gains are available.
The table contains only the statistics for the tables that are currently not optimal.
Gain % range   Orig. sum (Gbytes)   Gainable (Gbytes)   No. of tables
0..5%                     3856.85              124.57             222
5%..10%                   1269.25              101.94             134
10%..20%                  2388.77              336.13             213
20%..30%                   864.62              227.6              137
30%..40%                   624.15              215.14              64
40%..50%                   369.86              175.57              48
50%..60%                   126.37               68.86              30
60%..70%                    36.01               22.82              27
70%..80%                    21.67               16.2                6
80%..90%                    11.35                9.35               7
90%..100%                    2.02                1.84               1
Total                     9570.92             1300.02             889
Target selection
If we want to cut the time and resource demands of the implementation, it is reasonable to shorten
the <target> table list. We chose to eliminate those tables that did not promise more than
20 Mbytes of space gain. This left 488 tables for the implementation phase.
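The narrowing step amounts to a simple threshold filter over the forecast. The Python sketch below illustrates it under stated assumptions: the table names and gain figures are hypothetical, the function name is invented for this example, and sorting by gain is an added convenience rather than something the project description specifies.

```python
MIN_GAIN_MB = 20  # threshold quoted in the text


def select_targets(forecast, min_gain_mb=MIN_GAIN_MB):
    """Keep tables whose forecast gain exceeds the threshold, largest gain first."""
    kept = [(name, gain) for name, gain in forecast.items() if gain > min_gain_mb]
    return sorted(kept, key=lambda item: item[1], reverse=True)


# Hypothetical forecast: table name -> gainable space in Mbytes.
forecast = {
    "DWH.CALL_DETAIL": 48200.0,
    "DWH.SMS_DETAIL": 9150.5,
    "DWH.TARIFF_LOOKUP": 3.2,
    "DWH.CAMPAIGN_HIST": 20.0,
}
targets = select_targets(forecast)
print(targets)
```

A table promising exactly 20 Mbytes is dropped here, following the wording "did not promise more than 20 Mbytes".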
Implementation
Table compression parameters are defined at creation time and cannot be modified in the available
versions of Teradata. Changing the compression therefore requires recreating the tables, which
makes the objects and their data unavailable for a while.
For this reason we had to choose low-activity time windows to implement the optimizations. The
most appropriate time was the end of business hours: 17:00.
PRISE Compress Wizard generates all the scripts needed to achieve the possible storage gain by
recompressing the tables, but they must be run by the local Teradata DBA. We recommend that the
DBA checks all the SQL statements before they are applied.
The implementation was broken up into 8 phases, since it was quite difficult to find continuous time
frames in which both the DBA and the database had available resources.
C. Result details
Space saving
The total results of the 8 implementation phases are shown in the table below:
Phase   No. of tables   Original (Gbytes)   Gain (Gbytes)   Gain %   Runtime
1                   3              183.4            72.5     39.53   ~2h
2                   8              543.08          158.22    29.13   ~2h
3                   5              848.67          169.92    20.02   ~3.5h
4                  20              388.46           99.81    25.69   ~2h
5                 406             2304.38          484.03    21      ~8h
6                   1              120.67           34.07    28.23   ~0.5h
7                  18              705.37          100.26    14.21   ~3h
8                  27             2308.55          140.1      6.07   ~6h
Total             488             7402.58         1258.91    17.0    ~27h
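The percentages and totals in the table can be re-derived from the per-phase figures. The short Python check below uses only the numbers from the table; the variable and function names are chosen for this example, and the runtimes are the approximate values as reported.

```python
# (phase, tables, original_gbytes, gain_gbytes, reported_gain_pct, runtime_hours)
PHASES = [
    (1, 3, 183.40, 72.50, 39.53, 2.0),
    (2, 8, 543.08, 158.22, 29.13, 2.0),
    (3, 5, 848.67, 169.92, 20.02, 3.5),
    (4, 20, 388.46, 99.81, 25.69, 2.0),
    (5, 406, 2304.38, 484.03, 21.00, 8.0),
    (6, 1, 120.67, 34.07, 28.23, 0.5),
    (7, 18, 705.37, 100.26, 14.21, 3.0),
    (8, 27, 2308.55, 140.10, 6.07, 6.0),
]


def gain_pct(original, gain):
    """Gain as a percentage of the original size."""
    return 100.0 * gain / original


# Each reported percentage should match gain / original within rounding.
for phase, _, original, gain, reported, _ in PHASES:
    computed = gain_pct(original, gain)
    assert abs(computed - reported) < 0.01, (phase, computed, reported)

total_tables = sum(p[1] for p in PHASES)
total_original = sum(p[2] for p in PHASES)
total_gain = sum(p[3] for p in PHASES)
total_runtime = sum(p[5] for p in PHASES)
total_pct = gain_pct(total_original, total_gain)

print(total_tables, round(total_original, 2), round(total_gain, 2),
      round(total_pct, 1), total_runtime)
```

The overall 17.0% is the ratio of the summed gain to the summed original size, not the average of the per-phase percentages, which is why the large, low-gain phase 8 pulls it down.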
During the technical run of the scripts, the implementation phase requires additional free space
equal to the optimized size of the target table, for one table at a time. This space is
de-allocated immediately after the table’s optimization finishes.
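The workspace requirement can be checked in advance from the forecast sizes. The Python sketch below is an illustration under an assumed copy-then-drop model (the optimized copy is written first, then the original table is dropped), which the text does not spell out; the function name, table names and sizes are hypothetical.

```python
def simulate_sequential_rebuild(free_gb, tables):
    """Process tables one at a time in the given order.

    tables: list of (name, original_gb, optimized_gb).
    Returns (name of the first table that does not fit or None, free space reached).
    """
    for name, original_gb, optimized_gb in tables:
        if optimized_gb > free_gb:
            return name, free_gb
        # Assumed model: the optimized copy is written, then the original is dropped,
        # so the net free space grows by the table's gain.
        free_gb += original_gb - optimized_gb
    return None, free_gb


# Hypothetical sizes in Gbytes.
small_first = [("T_SMALL", 120, 80), ("T_LARGE", 300, 130)]
large_first = list(reversed(small_first))

print(simulate_sequential_rebuild(100, small_first))
print(simulate_sequential_rebuild(100, large_first))
```

Under this model the order matters: with 100 Gbytes free, rebuilding the small table first releases enough space for the large one, while the reverse order stops at the first table.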
The implementation consumed
~350 CPU hours
~1450 million IO operations
which are 80% and 25%, respectively, of the regular daily system loads.
Load performance impact
The test measurements showed that an INSERT statement into a compressed table can consume
50-200% more CPU than into its uncompressed equivalent. However, one can easily check that a
suboptimal compression definition demands the same or more CPU.
Load procedures typically have two important time factors: a longer preparation and transformation
part and a significantly shorter final “INSERT”. The more resource-intensive first part is not
affected at all.
After the implementation finished, we performed an investigation of the CPU and IO consumption
of the INSERT statements issued against the optimized tables:
Figure 4: How did CPU usage of affected tables’ INSERT statements change
Performance impact is displayed on the horizontal axis: 1x means no change, 0.5x half and 2x double
the amount of CPU for a similar INSERT operation. The vertical axis shows the number of tables
whose INSERT phase fell into a given performance impact range. The modal impact was around
0.95x, so a typical INSERT statement consumed a little less CPU than before optimization.
Fault tolerance during implementation
Production data warehouses are in continuous operation and cannot easily be taken down for
maintenance. For this reason PRISE developed a safe process that rules out the possibility of data
damage or loss and minimizes the possibility of unavailability.
In Telenor Hungary’s PRISE Compress Wizard implementation project we reached these goals
100%:
No data was lost or corrupted
No database downtime was requested
Personal data security
PRISE undertakes to sign a very comprehensive non-disclosure agreement as a first step. PRISE
Compress Wizard will not modify or keep any personal data. Additionally we only work with your
data in your safe environment.
Change management
PRISE provides detailed documentation for ITIL change & configuration management process
considerations. PRISE Compress Wizard can be tested initially in a test environment, then gradually
implemented across the entire scope with maximum safety.
Figure 4 axes: Data Load CPU change (horizontal, 0.26x to 7.3x); Number of loads (vertical, 0 to 20).