Capacity Planning PDD Final
1
2
Capacity Planning - Version 8.6
Murat Yesilsirt, Principal Consultant
3
Agenda
• Key Architecture Points
• Environment Sizing vs. Capacity Planning
  • Environment Sizing
  • Capacity Planning
• Tools
• Sizing Exercise
• Customer Examples
4
Capacity Planning
• Ensures that sufficient capacity is available at all times to meet business requirements
• Integration capacity is not simply the sum of capacity needs of each application
• Time dimension - Involves more than performance of the system’s components, individually or collectively
• Also deals with resolving incidents and identifying problems relating to capacity issues
5
Have you ever asked yourself?
• Can I save money with server consolidation?
• Could I move my data faster with an expanded environment?
• Is my Informatica Server ‘big’ enough?
• How much more data could I move with my existing system configuration?
• How much faster could I execute my existing loads?
• If I had to add 1 more project – do I have sufficient capacity?
• How about x more projects?
6
Server Sizing/Capacity Planning Goals
• Meet future requirements
• Meet performance requirements
• Satisfy load window requirements
• Minimize contentions due to lack of resources
• Lower maintenance cost and cost of ownership
• Optimize capital expenditures
7
Key Architecture Points
8
Data Integration Environment: Key Architecture Points
Sources: Source File Server and RDBMS CPU/RAM
Network
PowerCenter Server: Server CPU/RAM
Network
Targets: Target File Server and RDBMS CPU/RAM
9
Informatica Real Time Data Integration
Web Application Server
Integrated Customer Portal
Database
Transactional System –
Relational Source (Oracle, DB2, etc)
Mainframe System
Acquired Mid-Range AS/400 System
Acquired Mainframe System
Exception Management
Database
PowerCenter Orchestration Engine
Administration Portal
Data Steward Portal
Customer Portal
10
PowerCenter Data Integration System Characteristics
• Block processing
• Parallelism – Multiple threads, partitioning
• 64-bit Option and Caching
• Random Reads and Sequential Reads
• Database and File processing
• String Manipulation and Unicode
• Pushdown Optimization
• Shared File System
• Checkpoint Recovery
• Web Services
11
Environment Sizing
12
Environment Sizing vs. Capacity Planning
• Environment Sizing
  • New software implementation/install
  • Extremely rudimentary models to predict estimated need
  • Rarely perfect
• Capacity Planning (Existing Environment)
  • Accuracy of exercise is based on statistics from existing environment
  • New projects on existing environment
    • Ensure existing projects are not affected
    • Load window
    • Load times
    • Performance
  • PowerCenter upgrade on existing environment
    • Use of new PowerCenter enhancements for performance gains on existing hardware after upgrade
13
Environment Sizing
• New Environment
  • Conversion of custom code and stored procedures
  • Estimation process because there is no existing environment
  • Consider various architectural options and how they will affect the sizing
    • GRID/HA
    • Windows vs. UNIX/LINUX
    • Shared PowerCenter Environment vs. Dedicated Environment
    • Virtualization
• Hardware sizing considerations
  • CPU
  • Memory
  • I/O Capacity
  • Disk Space
  • Network Bandwidth
  • Repository Database
14
Environment Sizing Inputs
• Data volumes
• Mapping complexity
• Number of mappings
• Concurrent work load
• Peak work load
• Expected growth
15
Environment Sizing Methodology
• Gather performance requirements (Volume, load window etc.)
• Document assumptions e.g. planned architecture, usage period, geographical distribution of data & users
• Evaluate alternatives – Commodity hardware vs. High-End SMP
• 75% CPU Utilization or less
• Minimize memory paging
• Consider future growth
• Use proof of concept benchmark testing to validate
• Based on high-level estimation factors:
  • 2 MB per CPU per second on average
  • 2-4 GB of memory per core
• Cross check with other implementations
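The estimation factors above can be turned into a rough first-pass calculation. The sketch below is one possible reading, not an official Informatica formula: it assumes the 2 MB per CPU per second factor is divided by the 75% utilization target to leave headroom, that 1 GB is 1024 MB, and that 8 GB per hour is an illustrative input. The function names are hypothetical.

```python
import math

MB_PER_GB = 1024  # assumption: binary gigabytes


def estimate_cpus(gb_per_hour, mb_per_cpu_per_sec=2.0, max_utilization=0.75):
    """Rough CPU count so the required throughput uses at most
    max_utilization of the available CPU capacity."""
    required_mb_per_sec = gb_per_hour * MB_PER_GB / 3600
    cpus_at_full_load = required_mb_per_sec / mb_per_cpu_per_sec
    return math.ceil(cpus_at_full_load / max_utilization)


def memory_range_gb(cores, low_gb_per_core=2, high_gb_per_core=4):
    """Apply the 2-4 GB per core rule of thumb."""
    return cores * low_gb_per_core, cores * high_gb_per_core


cpus = estimate_cpus(gb_per_hour=8)
ram_low, ram_high = memory_range_gb(cpus)
print(f"{cpus} CPUs, {ram_low}-{ram_high} GB RAM")
```

A result like this is only a starting point; the methodology above still calls for a proof-of-concept benchmark and a cross-check against other implementations.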
16
Environment Sizing Example
Informatica PowerCenter Sizing Questions

Data Volume Rate
Data volumes are a critical aspect as the CPU cycles must be available to handle the data volume in the appropriate timeframe. Getting a reasonable estimate of the volume of data to be moved on a nightly/daily basis is the cornerstone of a sizing effort.
• Method 1 (Volume based)
  • Number of Gigabytes per hour: 8
  • Number of simultaneous jobs on average? 5
• Method 2 (Existing load process)
  • How many loads? 10
  • How much data is being moved (in GB)? 0.5
  • What is your load window in minutes? 720
• Method 3 (Expected load process)
  • How many target tables do you have to load? 250
  • Size of data to load (source data) in GB? 10
  • What is the load window in minutes? 480

Continuous and/or Real Time
The assumption is that continuous and/or real-time workloads will require more CPU and memory. This is because there is less flexibility in workload management. RT sessions must run, and they must run now and they should not be slowed by other processes
• What percent of sessions/loads will be real time? (Enter a '1')
  • 25% or less: 1
  • 25-60%: 0
  • 60-90%: 0
  • 90%+: 0

Load Window Criticality
• How critical is it that the load window is always met? (Enter a '1')
  • Not at all important: 0
  • Somewhat Important: 0
  • Very Important: 0
  • Critical: 1

Aggregates and Sorting
• What is the expected use of aggregates/sort (Enter a "1")
  • Low (25% or less): 0
  • Medium (25% to 75%): 1
  • High (75% or more): 0

Data Volume Growth
• What is the expected yearly data growth (%): 20%

Operating System
• Unix or NT: Unix
• 64 bit or 32 bit: 64

Lookup Sizing
Lookups (caching data tables to match values) require additional CPU and RAM. It is an important factor in sizing the box. Use an educated guess as to the size of your lookup requirements. If you are loading a warehouse, think in terms of the size of th
• Percent of lookups with > 250k rows? (Enter a "1")
  • Low (25% or less): 0
  • Medium (25% to 75%): 1
  • High (75% or more): 0

Application Type(s)
• What sort of application(s) will PowerCenter be used for?

Other Considerations
Please include any other considerations you feel are important to the sizing effort. Any environmental information, restrictions, needs should be listed here.
17
Capacity Planning
18
Capacity Planning
• Existing Environment
  • Measure actual performance in YOUR environment
  • Use real world performance information to understand current unused capacity
  • Use linear scalability to predict future needs
  • Key review points:
    • Current performance
    • Data growth projections
    • Future integration needs
  • Consider impacts of any technology shift/change
    • Web Services
    • Grid/HA
    • XML Processing etc.
    • New server technologies
19
Capacity Planning Methodology
• Gather performance information
  • Volume (data/records)
  • CPU Usage
  • Memory Usage
  • Network Usage
  • File System Usage
  • System Characteristics (CPU speed etc.)
• Document future assumptions e.g. planned architecture, usage period, geographical distribution of data & users
• Review future growth needs
• Review data growth projections
• Plan for 75% CPU utilization or less
• Determine required capacity
• Update/expand environment as needed
• Use benchmark testing with real production like data
20
Roles
• DBA
• Operations
• Informatica Administrator
• System Administrator
• Network Specialist
• Developer
• Business Analyst
21
Tools
22
Tools
• Monitoring tools to help determine how the servers are performing
• Reports to provide metrics about how PowerCenter is being used
• Analysis to find out your current maximum capacity
• Estimation to determine required capacity for future growth
23
Tools
• Repository Reports
• Repository Queries
  • OPB_SWIDGINST_LOG, OPB_TASK_INST_RUN, OPB_WFLOW_RUN, OPB_TASK_STATS
• Key Results
  • Number of records from SQ per node
  • Number of session runs per node per day
  • Number of concurrent session runs per node per hour
  • CPU/Memory used per session
24
Tools
vmstat
• Reports information about processes, memory, paging, block IO, traps, and cpu activity
• vmstat 5 10 – run with 5 sec delay 10 times
• Processes in the run queue (procs r): procs r consistently greater than the number of CPUs indicates a bottleneck
• Idle time (cpu id): cpu id consistently at 0 indicates a CPU issue
• Scan rate (sr): sr continuously over 200 pages per second indicates a memory shortage
• Key Results
  • Memory usage statistics
iostat
• Report on CPU, input/output statistics for devices and partitions
• iostat 5 10 – run with 5 sec delay 10 times
• Reads/writes per second (r/s, w/s): consistently high reads/writes indicate disk issues
• Percentage busy (%b): %b > 5 may point to an I/O bottleneck
• Service time (svc_t): svc_t > 30 milliseconds calls for a faster disk/controller
• Key Results
  • Disk usage results
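The vmstat and iostat thresholds above lend themselves to a simple automated check. The following is a minimal sketch, assuming the samples have already been parsed into dictionaries (the field names and sample values are hypothetical) and treating "consistently" as "true for every sample collected".

```python
def flag_bottlenecks(vmstat_samples, iostat_samples, num_cpus):
    """Apply the rule-of-thumb thresholds from the slide to parsed samples."""
    findings = []
    if vmstat_samples:
        if all(s["r"] > num_cpus for s in vmstat_samples):
            findings.append("CPU: run queue consistently exceeds CPU count")
        if all(s["id"] == 0 for s in vmstat_samples):
            findings.append("CPU: idle time consistently 0")
        if all(s["sr"] > 200 for s in vmstat_samples):
            findings.append("Memory: scan rate continuously over 200 pages/s")
    for s in iostat_samples:
        if s["b"] > 5:
            findings.append(f"I/O: {s['device']} %b above 5")
        if s["svc_t"] > 30:
            findings.append(f"I/O: {s['device']} service time above 30 ms")
    return findings


# Hypothetical samples for a 4-CPU server
vmstat_samples = [
    {"r": 6, "id": 12, "sr": 250},
    {"r": 7, "id": 9, "sr": 310},
]
iostat_samples = [
    {"device": "disk0", "b": 2, "svc_t": 8.5},
    {"device": "disk1", "b": 14, "svc_t": 42.0},
]
for finding in flag_bottlenecks(vmstat_samples, iostat_samples, num_cpus=4):
    print(finding)
```

Requiring every sample to breach the threshold avoids reacting to a single spike; a longer sampling window makes the check more meaningful.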
25
Tools
netstat
• Displays information about network interfaces on the system
• Network connections, routing tables, and interface statistics
ntop
• Shows a list of hosts using the network
• Provides information about traffic generated by each host
26
Tools
sar – System Activity Reporter
• Exists on many UNIX platforms
• Examine live statistics
  • sar [options…] t n
  • t is number of seconds per sample
  • n is number of samples
• Save sar data for later analysis
  • sar -o filename t n
  • Recall CPU usage: sar -u -f filename
  • Recall Disk usage: sar -d -f filename
  • You can also specify time windows (-s, -e) and an alternate interval with -i
• Key Results
  • Consolidated CPU/Memory/Disk usage statistics
27
Tools
sar – Disk Utilization
• sar -d t n
  • Average I/O size in bytes = (blks/s * 512 bytes) / (r+w/s)
  • %busy is a good indicator of disk bottleneck
  • Shows disk devices; these can be tough to trace back to a specific logical volume
vega7077-root-># sar -d 60 1
HP-UX vega7077 B.11.23 U ia64 10/25/07
10:25:24   device     %busy   avque   r+w/s   blks/s   avwait   avserv
10:26:24   c2t6d0      0.65    0.50       1       23     0.00     9.14
           c76t4d3     0.02    0.50       0        0     0.01    10.03
           c140t2d0    3.13    0.50       2      180     0.00    18.21
           c142t2d0    3.88    0.50       2      180     0.00    22.38
           c148t2d0    0.28    0.50       2      180     0.00     1.69
           c150t2d0    0.42    0.50       2      176     0.00     2.37
           c108t2d0    3.03    0.50       2      179     0.00    17.67
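The average I/O size formula from this slide can be applied directly to rows of the sar -d output. A minimal sketch follows, using three devices from the sample output; the function name is hypothetical, and because sar rounds r+w/s to whole numbers the results are approximations.

```python
BLOCK_SIZE_BYTES = 512  # block size used in the slide's formula


def avg_io_size_bytes(blks_per_sec, rw_per_sec):
    """Average I/O size = (blks/s * 512 bytes) / (r+w/s).
    Returns None for an idle device to avoid dividing by zero."""
    if rw_per_sec == 0:
        return None
    return blks_per_sec * BLOCK_SIZE_BYTES / rw_per_sec


# (device, r+w/s, blks/s) taken from the sample sar -d output
rows = [("c2t6d0", 1, 23), ("c76t4d3", 0, 0), ("c140t2d0", 2, 180)]
for device, rw, blks in rows:
    print(device, avg_io_size_bytes(blks, rw))
```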
28
Tools
sar – CPU utilization
• sar -u t n
  • %sys is system/kernel time
  • %usr is user space time
  • %wio is percent of time “waiting on I/O”
• %wio is the best indicator of whether I/O is a bottleneck
  • Directly reflects how much performance is lost waiting on I/O operations
vega7077-root-># sar -u 60 1
HP-UX vega7077 B.11.23 U ia64 10/25/07
10:49:31   %usr   %sys   %wio   %idle
10:50:31      1      5      6      87
29
Tools
top
• Provides a dynamic real-time view of a running system
• Displays system summary information as well as a list of tasks currently being managed
• Useful for shared environments to identify each application process and their CPU/memory consumption
30
Tools
Windows perfmon
31
Example
32
Capacity Planning Example
• Before upgrade to PowerCenter 8.6, planning for the new environment is initiated
• Current hardware on Unix
• Business activity is expected to increase 20% annually
• Two new Business Units are expected to use Informatica platform
• Explore PowerCenter 8.6 performance enhancements
33
Capacity Planning Example
• Peak Load Time – 1:00 am to 1:35 am
• Number of Sessions – 45
• Most concurrent sessions – 15
• Total Data Processed – 10 GB
• Primarily flat file to DBMS and DBMS to DBMS data load
• Server has 4 CPUs with 16 GB of RAM
• Most sessions include lookups, but with fairly reasonable cache sizes (i.e., no 8 GB customer master)
• Total Load Window requirement is 2 hrs (done by 3am)
34
Capacity Planning Steps
• Using repository reports, establish a timeline for loads
  • Daily
  • Weekly
  • Monthly
• Determine the complexity of mappings
  • High: Multiple sources or targets, 5 or more lookups, complex logic
  • Medium: Multiple sources or targets, 2-5 lookups or an aggregator, full update strategy
  • Low: Straight-through mapping, less than 3 lookups
[Figure: load timeline from 1:00 AM to 3:00 AM in 10-minute steps, showing the phases Extract Files, Extract + Audit Completed, Validations, Dimension Loads, Validations, Daily Fact Loads, Validations]
35
Capacity Planning Steps
Time   CPU 1   CPU 2   CPU 3   CPU 4   Avg   RAM   I/O
1:01   95%     90%     85%     25%     74%   90%   Ok
1:11   90%     90%     65%     3%      62%   35%   Good
1:21   90%     50%     10%     3%      38%   50%   Good
1:31   75%     25%     3%      3%      25%   25%   Good
Avg    87%     64%     41%     9%      50%   50%   Good

Data    Seconds   Data/Sec   Data/CPU/Sec   Max Expected
10 GB   2100      4.8 MB     1.2 MB         2.4 MB/CPU
• Link the results of system metrics to the load timeline
• Identify the peaks in CPU/memory/disk utilizations
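The throughput figures in the table follow from the example's inputs: 10 GB moved in 2,100 seconds (the 35-minute peak) on 4 CPUs. A small sketch of that arithmetic is shown below, assuming 1 GB is 1024 MB; the function name is hypothetical, and the slide's 4.8 MB and 1.2 MB appear to be rounded versions of these values.

```python
def throughput(data_gb, seconds, cpus, mb_per_gb=1024):
    """Return (MB per second, MB per CPU per second) for a measured load."""
    mb_per_sec = data_gb * mb_per_gb / seconds
    return mb_per_sec, mb_per_sec / cpus


mb_per_sec, mb_per_cpu_sec = throughput(data_gb=10, seconds=2100, cpus=4)
max_expected_mb_per_cpu_sec = 2.4  # "Max Expected" column from the slide
share_of_max = mb_per_cpu_sec / max_expected_mb_per_cpu_sec

print(f"{mb_per_sec:.2f} MB/s total, {mb_per_cpu_sec:.2f} MB/s per CPU")
print(f"{share_of_max:.0%} of the expected maximum per CPU")
```

Running at roughly half of the expected per-CPU maximum is consistent with the 50% average CPU utilization in the table.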
36
Capacity Planning Steps
• Review bottlenecks to reveal areas of improvement with additional CPU/memory/disks
• This may also result in code fixes, but performance tuning is only a short term fix
• Value of new Informatica features e.g. using OS profiles for more granular information and process ownership
• Consider architectural changes in the new environment such as Enterprise Grid Option
• Start making projections based on the input available
  • My current peak CPU/memory utilization is at 50% and I am expecting 20% growth per group and 2 new groups will join
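A projection of this kind can be sketched with the linear-scalability assumption used earlier in the deck. The model below is only one interpretation: it assumes each new group adds the same load as an average existing group, that growth applies to every group, and that there are currently 2 groups (the slides do not state the current number, so `existing_groups=2` is hypothetical). The 75% ceiling comes from the methodology slide.

```python
def project_utilization(current_util, growth_rate, existing_groups, new_groups):
    """Linear projection: per-group load grows by growth_rate, and each
    new group is assumed to add the load of an average existing group."""
    per_group = current_util / existing_groups
    return per_group * (1 + growth_rate) * (existing_groups + new_groups)


def capacity_multiplier(projected_util, ceiling=0.75):
    """How much larger the environment must be to stay under the ceiling."""
    return max(1.0, projected_util / ceiling)


projected = project_utilization(0.50, 0.20, existing_groups=2, new_groups=2)
multiplier = capacity_multiplier(projected)
print(f"Projected peak utilization: {projected:.0%}")
print(f"Capacity needed: {multiplier:.1f}x current")
```

Under these assumptions the current server would be oversubscribed, which is the kind of result that should then be validated with a benchmark rather than taken at face value.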
37
Questions for the Example:
• Do you need more CPU?
• Do you need more RAM?
• How much more expected capacity do you have without extending the current load window?
• How much more capacity do you have until you no longer meet load window?
• What could you do to ‘free up’ more capacity?
38
Pitfalls and Common Mistakes
• Apples to Apples
  • “I talked to <customer> at the user group and they are moving 1,000 rows a second – why aren’t we experiencing the same?”
  • “I read an Informatica benchmark and they moved a terabyte in 38 min, which showed 4mb a second per processor – mine should be the same performance right?”
• Growth Projections
  • “Every day we process 100,000 records that equal 5mb of data thus our warehouse is increasing by 5mb a day.”
  • “Every year our warehouse grows by 25% so our daily capacity must be growing by 25%.”
• Adding Horsepower
  • “If I add more CPU and RAM my loads will be faster.”
  • “My hardware vendor promised their new CPU’s are 2x faster so my load should finish in ½ the time.”
• Root Cause
  • “My performance is poor, it must be the Informatica Platform.”
  • “I’m seeing very low rows per second processed, I must have a slow server”
39
Capacity Planning Results
• Better to start low, observe the adoption rates and usage, and then adjust upward as necessary
  • Vertical – Expandable servers
  • Horizontal – Grid Architecture
• Start with adding CPU and memory to existing server
• Then increase number of servers with Grid Architecture
• Allocate abundant storage for infa_shared directory
• Use flexible storage architecture e.g. start with 4 stripes over 4 LUN’s, then grow to 4 stripes over 8 LUN’s to expand from 100 GB to 200 GB
40
Customer Examples
41
Customer Example 1
Scenario
• Planning for release to production for PowerExchange CDC
• First a benchmark test was conducted with a subset of the data
• Projected data volumes were used for the estimation
• Assumptions were documented: Projected data volumes and benchmark results
• The disk space used for the file system during the benchmark test was recorded
Recommendations
• For actual data volumes, session logs are expected to use about 13 GB daily
• There will be a process to purge log files older than two days
• Based on this 26 GB will be allocated for Session Log directory
• Based on the number of lookups, the sizes of the lookup tables, and concurrent sessions, 20 GB for Cache directory should be allocated
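The session log allocation in this example is daily log volume multiplied by the retention period. A minimal sketch of that calculation is shown below; the function name is hypothetical, and no extra safety margin is added because the slide does not mention one.

```python
def log_directory_gb(daily_log_gb, retention_days):
    """Disk space needed to hold logs for the full retention period."""
    return daily_log_gb * retention_days


session_log_gb = log_directory_gb(daily_log_gb=13, retention_days=2)
print(f"Session log directory: {session_log_gb} GB")
```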
42
Customer Example 2
Scenario
• Provide capacity planning assistance for upgrade and server purchase
Some key questions
What is the total volume of data to move?
• Current task – Data volume is less than 2 GB per Month.
What is the largest table (bytes and rows)? Is there any key on this table that could be used to partition load sessions, if needed?
• Existing task – Largest table 6 M rows with average record length 1000 bytes
What is the batch window available for the load?
• Existing task – Batch window is around 6 hours. Future task – 3 hours on weekdays, 10 hours on weekends
What is the expected growth?
• The percentage of data volume growth has been projected to be 25% each year.
• Currently there are 50 interfaces loading an approximate average of 200MB of data each
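The growth answers above can be combined into a simple volume projection. The sketch below assumes the 25% yearly growth compounds and applies uniformly to all 50 interfaces; the function name and the three-year horizon are hypothetical choices for illustration.

```python
def projected_volume_mb(interfaces, avg_mb_per_interface, annual_growth, years):
    """Return total volume in MB for year 0 through `years`,
    compounding the annual growth rate."""
    base = interfaces * avg_mb_per_interface
    return [base * (1 + annual_growth) ** year for year in range(years + 1)]


volumes = projected_volume_mb(
    interfaces=50, avg_mb_per_interface=200, annual_growth=0.25, years=3
)
for year, mb in enumerate(volumes):
    print(f"Year {year}: {mb:,.0f} MB")
```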
Recommendation
• Compute a “base size” using the key driving factors for CPU and RAM. Then, adjust this base size according to some key attributes of the job load
• The key driving factors for calculating the base CPU size are “cpu mb per sec” (data rate) and “cpu per session” (job load)
• Data Rate: CPU mb per sec = cpu mb per sec factor * number of GB/hour
43
Customer Example 3
Scenario
• Upgrade, Server Consolidation, and ICC Organization
Recommendation
Informatica PowerCenter Sizing Results

Component                     Details                        RAM Factor   Initial CPU   Adjusted CPU
Data Volume Rate Method 1     Using CPU/MB sec factor        4.9          2.9           4.9
Data Volume Rate Method 2     Using CPU/MB sec factor        4.9          2.9           4.9
Data Volume Rate Method 3     Using CPU/MB sec factor        4.9          2.9           4.9
Base Size                                                    4.9          2.9           4.9
Continuous And/Or Real Time   Ranges (0,40%,60%,100%)        0%           0%            0%
Load Window Criticality       Ranges (-30%,0,50%,75%)        0%           0%            0%
Aggregates and Sorting                                       0%           0%            0%
Data Volume Growth                                           0%           0%            0%
Operating System              Unix vs NT and 32 vs 64 bit    100%         -25%          -25%
Lookup Sizing                 Ranges (Ram = -20%,0,50%)      0%           0%            0%
Application Types + Other     Subjective Factor (in %)
Total Adjustment Factor                                      100%         -25%          -25%
Final Sizing Raw                                             9.8          2.175         3.675
Final Sizing Adjusted                                        10           2             4
Sizing Upper Range                                           15           4             6

Informatica would recommend a PowerCenter server(s) with 4 to 6 CPUs and 10 to 15 GB of RAM.
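The "Final Sizing Raw" figures in the sizing results for this example can be reproduced by summing the adjustment percentages and applying the total to the base size. The sketch below assumes that is how the worksheet combines them, and that "Final Sizing Adjusted" is the raw value rounded to the nearest whole number; the slide does not state how the upper range is derived, so it is not computed here. The dictionary keys are hypothetical names.

```python
base_size = {"ram_gb": 4.9, "initial_cpu": 2.9, "adjusted_cpu": 4.9}

# Adjustment factors from the results table, as fractions
adjustments = [
    {"ram_gb": 0.0, "initial_cpu": 0.0, "adjusted_cpu": 0.0},     # real time
    {"ram_gb": 0.0, "initial_cpu": 0.0, "adjusted_cpu": 0.0},     # load window
    {"ram_gb": 0.0, "initial_cpu": 0.0, "adjusted_cpu": 0.0},     # aggregates
    {"ram_gb": 0.0, "initial_cpu": 0.0, "adjusted_cpu": 0.0},     # data growth
    {"ram_gb": 1.0, "initial_cpu": -0.25, "adjusted_cpu": -0.25}, # operating system
    {"ram_gb": 0.0, "initial_cpu": 0.0, "adjusted_cpu": 0.0},     # lookup sizing
]


def final_sizing(base, factors):
    """Return (raw, rounded) sizing per column: base * (1 + sum of factors)."""
    raw = {}
    for column, value in base.items():
        total_adjustment = sum(f[column] for f in factors)
        raw[column] = value * (1 + total_adjustment)
    rounded = {column: round(value) for column, value in raw.items()}
    return raw, rounded


raw, rounded = final_sizing(base_size, adjustments)
print(raw)
print(rounded)
```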
44
Customer Example 4
Scenario
• Upgrade, High Business Growth, and End of Life for Servers
Recommendation
• 4 Nodes with 2 Dual Core CPU and 32 GB Memory each
45
Summary
• Capacity planning is a complicated process that requires input from various sources
• Testing the PowerCenter loads in your environment is the most effective way to estimate system behavior
• Choose a flexible architecture to allow incremental growth
• Validate your conclusions with Informatica Professional Services
• Informatica HACOE at your service for reference architecture and testing
46
Questions?
47
Thank you