SPARC migration (DOAG)
Transcript of the DOAG presentation "SPARC migration"
about us
• Software production company founded 2001
• mostly J2EE
• logistics
• telecommunication
• media and publishing
• customers demand full lifecycle support
• hardware resale
• datacenter operations
• 3rd party software
agenda
• motivation
• Oracle VM for SPARC
• estimating performance impact
• SLOB
• data migration TTS
• issues
• ASH/AWR comparisons
the situation
• 2TB OLTP RAC database
• x4150 servers, 4 cores each
• max memory 36GB
• workload is IO-bound
hw refresh options
                 x4170 M2     T4-1
CPU              4c 2.4GHz    8c 2.85GHz
RAM              max 72GB     max 512GB
virtualization   zones        LDom, zones
lifecycle        EOL 08/12    available
why SPARC?
• license: hard partitioning
• high memory/CPU ratio
• product lifecycle
• reliability
• Oracle-on-Oracle
How much of your CPU is idle?
hard partitioning
• how much CPU is needed?
• pay only for what you need
• pay-as-you-grow
• add cores to VM one-by-one
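A minimal sketch of what "pay-as-you-grow" looks like on the Logical Domains CLI; the domain name db1prod and the memory size are placeholders, not taken from the talk:

```shell
# hypothetical sketch: growing a domain core-by-core (domain name assumed)
ldm list-domain db1prod      # check the domain's current resources
ldm add-core 1 db1prod       # add one whole core; whole-core allocation
                             # keeps the hard-partitioning licensing rules intact
ldm set-memory 96G db1prod   # memory can be resized independently of cores
```

Allocating whole cores (rather than individual vcpus/threads) is what makes the configuration count as a hard partition for licensing.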
[chart: 3-year cost comparison]
HW (2*T4-1, 128GB RAM, 3 years support): 68,000
SW license (EE, RAC, PART, DIAG, TUNE + 3 years support): 1,221,760
hard partitioning
• client already licensed: 2 nodes, 4 cores each
• cpu utilization <75% (ok)
• modern CPUs have >4 cores
• and become more powerful per core
• why pay for cores that are not needed?
hard partitioning
http://www.oracle.com/us/corporate/pricing/partitioning-070609.pdf
• soft partitioning (license whole box)
• VMware
• several others
• hard partitioning (license a subset)
• Solaris capped Zones
• Oracle VM (with special config)
• Oracle VM for Sparc (LDom)
• some special (mainframe) hw
• dsd, lpar, vpar
hard partitions on x64
• Oracle VM with pinned CPUs
• overhead (especially for IO)
• no more O-Motion or OVM failover
• Solaris Capped Zone
• already need Solaris (on Oracle HW?)
• small extra step to go sparc
• no “hard” isolation
• RAC is supported, but would you?
OVM for SPARC
• hypervisor built in hardware
• zero overhead
• strict isolation of CPU, mem, IO
• PCIe dio for HBAs
• supported with RAC
• supported for hard partitioning
VM challenges
• avoid overhead
• CPU
• mem
• clocks, RT scheduling
• don’t waste IO latency in vm layers
• make VM as robust as possible
• other VMs must not influence prod VM
LDom PCIe DIO
root@primary:~# ldm list-io -l
NAME            TYPE   BUS    DOMAIN   STATUS
----            ----   ---    ------   ------
niu_0           NIU    niu_0  primary
[niu@480]
niu_1           NIU    niu_1  primary
[niu@580]
pci_0           BUS    pci_0  primary  IOV
[pci@400]
pci_1           BUS    pci_1  primary  IOV
[pci@500]
/SYS/MB/PCIE0   PCIE   pci_0  db1prod  OCC
[pci@400/pci@2/pci@0/pci@8]
    SUNW,assigned-device@0
    SUNW,assigned-device@0,1
/SYS/MB/PCIE2   PCIE   pci_0  primary  EMP
[pci@400/pci@2/pci@0/pci@4]
/SYS/MB/SASHBA  PCIE   pci_0  primary  OCC
[pci@400/pci@2/pci@0/pci@e]
    scsi@0/iport@1
    scsi@0/iport@2
    scsi@0/iport@80/cdrom@p7,0
    scsi@0/iport@v0/disk@w365ae6ad45951589,0
/SYS/MB/NET0    PCIE   pci_0  primary  OCC
[pci@400/pci@1/pci@0/pci@4]
    network@0
    network@0,1
/SYS/MB/PCIE1   PCIE   pci_1  db1test  OCC
[pci@500/pci@2/pci@0/pci@a]
    SUNW,assigned-device@0
    SUNW,assigned-device@0,1
http://portrix-systems.de/blog/brost/using-direct-io-with-ldoms/
Sol 11 vnet
• sol11 crossbow network virtualization
• LACP
• DLMP (new, great if your switch does not support bonding)
• build vnics on top
• everything is possible
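A sketch of the crossbow building blocks mentioned above; the link and vnic names (net0, net1, aggr0, vnic0) are placeholders, not the configuration from the talk:

```shell
# hypothetical sketch: DLMP aggregation with a vnic on top (link names assumed)
dladm create-aggr -m dlmp -l net0 -l net1 aggr0  # DLMP needs no switch-side bonding
dladm create-vnic -l aggr0 vnic0                 # vnic to hand to a guest domain
dladm show-aggr                                  # verify the aggregation
dladm show-vnic                                  # verify the vnic
```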
memory per CPU
• memory means caching
• caching means less IO
• you don’t license DB per GB Ram
• so the more the merrier
• NUMA
• performance penalty for SMP
memory per CPU
[bar chart: max memory per server, scale 0–512GB]
x4170 M2 (nehalem):     72GB
X3-2 (sandy bridge):    256GB
X4-2 (sandy bridge):    256GB
3-channel Intel E5:     384GB
T4:                     512GB
T5:                     512GB
product lifecycle
• the x4170 M2 is EOL since 08/2012
• the T4 is still being sold
• Sun even had roadmaps that guaranteed availability
RAS
• T4-4 has hot-swap PCI
• needed for RAC?
• average MTBF?
• perceived reliability?
Oracle on Oracle
• engineered together
• optimized for Oracle DB workloads
• support under one roof
planning
• CMT history not all bright
• horrible single-thread performance
• results in unacceptable user experience
• insanely expensive RAM
perf estimates
• benchmark!
• find an estimate of impact on latency
• show better throughput at high utilization
perf estimates
problem
we did not have our own test machine (yet)
and there were no useful public benchmarks available
but Oracle was kind enough to run some tests for us
perf benchmarks
• loop around sql that works from buffer cache (eliminate IO)
• count executions per second
• we wrote this ourselves first
results - latency
• at very low utilization
• 90s vs 119s (30%)
• this is intel turbo-boost
• at 1 thread/core same response time
results - throughput
• T4 handles concurrency much better
• hw threads work really well
• ~30% less response time at 4threads/core
results - summary
• single thread “in the same ballpark”
• multi-threaded advantage for T4
• more work/core = more work/$license$
more benchmarks
• SLOB
• LIO test mode
• there are “modes” for PIO, writes, ...
• one suite of tools for multiple platforms
• very simple (and open) logic
• easy to set up, easy to compare
http://kevinclosson.wordpress.com/2012/02/06/introducing-slob-the-silly-little-oracle-benchmark/
SLOB intro
1. download
2. unzip
3. compile (minor tweaking for solaris)
4. ./setup.sh for users (~80MB table per user)
5. set SGA large enough to hold all test data
6. modify readers.sh to loop 500k times
7. runit.sh with different # of threads
8. look at SLOBops/s
9. will generate AWR report for LIO and waits
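The steps above can be sketched as a shell session; the tablespace name IOPS and the user count are assumptions for illustration, and on Solaris the wait-kit compile may need minor Makefile tweaks as noted:

```shell
# sketch of a SLOB run, following the steps above (tablespace name assumed)
unzip SLOB.zip && cd SLOB
( cd wait_kit && make )        # compile the semaphore wait kit
./setup.sh IOPS 64             # create 64 users, ~80MB table each
# size the SGA so all test data fits in the buffer cache (pure-LIO test),
# and edit the reader script to loop 500000 times
./runit.sh 0 64                # 0 writers, 64 readers
# SLOBops/s = users * loops / elapsed seconds; an AWR report is generated
```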
SLOB setup
DECLARE
  x NUMBER := 1;
  fluff VARCHAR2(128) := 'XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX';
BEGIN
  FOR i IN 1..10000 LOOP
    insert into seed values (x, fluff, NULL, NULL, NULL, NULL,
      NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL, fluff);
    x := x + 1;
  END LOOP;
END;
/
SLOB setup
insert into cf1 select * from user1.seed where rownum = 1;
commit;

alter table cf1 minimize records_per_block;
truncate table cf1;
commit;

insert /*+ APPEND */ into cf1 select * from user1.seed order by dbms_random.value();
commit;

create unique index i_cf1 on cf1(custid) NOPARALLEL PCTFREE 0 tablespace $TABLESPACE;

alter index i_cf1 SHRINK SPACE COMPACT;

exec DBMS_STATS.GATHER_TABLE_STATS('$user', 'cf1', estimate_percent=>100, block_sample=>TRUE, degree=>2);
SLOB reader.sql
DECLARE
  x NUMBER := 0;
  v_r PLS_INTEGER;
BEGIN
  dbms_random.initialize(UID * 7777);

  FOR i IN 1..500000 LOOP
    v_r := dbms_random.value(257, 10000);
    SELECT COUNT(c2) into x FROM cf1 where custid > v_r - 256 AND custid < v_r;
  END LOOP;
END;
/
SLOBops/s
#> ./runit.sh 0 64
Tm 1179
64*500000 SLOBops / 1179s = 27142 SLOBops/s
27142 SLOBops/s / 8 cores = 3393 SLOBops/s/c
SLOB results
[bar chart: peak SLOBops/s per core for x5570, E5-2640, E5-2690, SPARC64, T4 and T5; scale 0–5000]
thanks to Philippe Fierens for additional data — twitter: @pfierens, blog: http://pfierens.blogspot.de/
SLOB results
[line chart: time to complete 500k SLOB loops (0s–1600s) vs threads per core (0.25, 0.5, 1, 2, 4, 8) for E5-2640, T4 and T5]
data migration
• rman backup/restore
• not across endianness
• rman convert database
• only supported for migration sparc to exadata
• datapump export/import?
• too long
TTS steps
1. check prerequisites
2. make TBS read-only
3. dp export metadata
4. rman convert tablespace TBS to destination platform
5. move TBS copies to destination
6. import metadata
7. make TBS read-write
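The steps above can be sketched as commands; the tablespace name TBS, the staging path /stage and the datafile path are placeholders, and the platform string should be taken from V$TRANSPORTABLE_PLATFORM on your system:

```shell
# hypothetical sketch of the TTS steps (names and paths assumed)
# 1+2: check prerequisites, then make the tablespace read-only
sqlplus / as sysdba <<'EOF'
exec DBMS_TTS.TRANSPORT_SET_CHECK('TBS', TRUE);
select * from transport_set_violations;
alter tablespace TBS read only;
EOF

# 3: export the tablespace metadata
expdp \"/ as sysdba\" directory=DATA_PUMP_DIR dumpfile=tts.dmp transport_tablespaces=TBS

# 4: convert on the source (x64 is little-endian, SPARC big-endian)
rman target / <<'EOF'
CONVERT TABLESPACE TBS
  TO PLATFORM 'Solaris[tm] OE (64-bit)'
  FORMAT '/stage/%U';
EOF

# 5: move the converted copies and tts.dmp to the destination, then
# 6+7: import the metadata and make the tablespace read-write again
impdp \"/ as sysdba\" directory=DATA_PUMP_DIR dumpfile=tts.dmp \
  transport_datafiles='/oradata/TBS_01.dbf'
sqlplus / as sysdba <<'EOF'
alter tablespace TBS read write;
EOF
```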
TTS data copy
• move data on SAN
• avoid 1GbE network copy
• ASM
• nope, partitions not compatible
• ZFS!
• works with block device
• automatic endian conversion
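A sketch of the SAN hand-over via ZFS; the pool name and LUN device are placeholders. The point is that a zpool created on one endianness imports cleanly on the other, so the staged datafile copies travel on shared storage instead of over 1GbE:

```shell
# hypothetical sketch: moving staged datafiles on SAN via ZFS (names assumed)
# on the source host:
zpool create stagepool c0t600A0B800026XXXXd0   # pool on a SAN LUN
# ... write the converted datafile copies into /stagepool ...
zpool export stagepool

# remap the LUN to the destination host, then:
zpool import stagepool    # ZFS metadata is endian-adaptive; the pool imports
ls /stagepool             # datafile copies ready for impdp
```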
issues
• clusterware does not know the Solaris 11 netN interface names (bug 13604285)
• falls back to “generic” probing for interconnect health
• trouble with network virtualization (eviction issues until sol patch)
• we forgot to transfer SPM baselines (stupid us)
post mortem ASH
• ASH/AWR is fantastic
• archive of SQL performance data
• compare avg sql runtime
• before and after change
ASH
• automatically on by default
• sampled, detailed activity data
• samples taken (in mem) every second
• v$active_session...
• write AWR snapshots to disk
• DBA_HIST_ACTIVE_SESSION...
• increase default keep time
• can keep some baselines forever
• awrextr.sql/awrload.sql after migration
post mortem AWR
SELECT
  sql_id,
  ROUND(sum(elapsed_time_delta)/sum(executions_delta)) exectime,
  sum(executions_delta) executions,
  sum(elapsed_time_delta) total_time
FROM dba_hist_sqlstat a
JOIN dba_hist_snapshot b ON a.snap_id = b.snap_id
WHERE b.begin_interval_time > to_timestamp('201301220845','YYYYMMDDHH24MI')
  AND b.end_interval_time < to_timestamp('201301221345','YYYYMMDDHH24MI')
  AND executions_delta > 0
GROUP BY sql_id;
post mortem AWR

with new_system as (
  SELECT sql_id,
         ROUND(sum(elapsed_time_delta)/sum(executions_delta)) exectime,
         sum(executions_delta) executions,
         sum(elapsed_time_delta) total_time
  FROM dba_hist_sqlstat a
  JOIN dba_hist_snapshot b ON a.snap_id = b.snap_id
  WHERE b.begin_interval_time > to_timestamp('201301220845','YYYYMMDDHH24MI')
    AND b.end_interval_time < to_timestamp('201301221345','YYYYMMDDHH24MI')
    AND executions_delta > 0
  GROUP BY sql_id),
old_system as (
  SELECT sql_id,
         ROUND(sum(elapsed_time_delta)/sum(executions_delta)) exectime,
         sum(executions_delta) executions,
         sum(elapsed_time_delta) total_time
  FROM dba_hist_sqlstat a
  JOIN dba_hist_snapshot b ON a.snap_id = b.snap_id
  WHERE b.begin_interval_time > to_timestamp('201301150845','YYYYMMDDHH24MI')
    AND b.end_interval_time < to_timestamp('201301151345','YYYYMMDDHH24MI')
    AND executions_delta > 0
  GROUP BY sql_id)
select new_system.sql_id,
       new_system.exectime newtime,
       old_system.exectime oldtime,
       round(greatest(old_system.exectime, new_system.exectime)
             / least(old_system.exectime, new_system.exectime)
             * sign(old_system.exectime - new_system.exectime), 0) speedup,
       new_system.executions,
       new_system.total_time newtime,
       old_system.total_time oldtime
from new_system, old_system
where new_system.sql_id = old_system.sql_id
order by new_system.total_time desc;
post mortem ASH

SQL_ID        NEWTIME    OLDTIME    SPEEDUP    EXECUTIONS NEWTIME    OLDTIME
------------- ---------- ---------- ---------- ---------- ---------- ----------
94qfy61wzr979     756417    9169090         12       5142 3889494264 1.0323E+11
9fqj3jnfyupvf     894891   15599003         17       2920 2613080481 1.1609E+11
82gkmapmt6apc        726        300         -2    3462071 2515070807 1211844016
5bzjuzcrh52gy  577309159 1170336372          2          4 2309236634 4681345488
gfrtvm879j1ry     517999     176268         -3       3469 1796936890  284848872
annj0ct5d49mp     563926    8964813         16       2727 1537827089 2.0153E+10
a5h8sbx7b4gzf   18930618  491125857         26         54 1022253386 1.3752E+10
ggcsmyw22ygpr     825227     233433         -4       1221 1007601900  465464972
bsjug2fdvkn3a      85490     652245          8      11356  970824082 2.4279E+10
1cphs8cnbhpsm        275      17161         62    1755282  482780358 5.0277E+10
d739f3ystssbf     125212     226918          2       2714  339825406  168372874
crjus93xycs2w   48755724  249751087          5          6  292534344 1998008696
6mrjk196ayz7r       1423      26445         19     192648  274203984 7557523980
6j5f8rc00kb0x        838      19019         23     193281  161903300 5446650848
abr4rgpaagv17       1045      18193         17     147923  154573924 1068976356
09caxbcryq43r        945      18916         20     162908  153892043 4546423200
79wuh7f3rfqfy        376       5280         14     287087  107972760 2275145736
dvmdvrsmhz76b  104031863  160488695          2          1  104031863  320977390
7wra7pmvddday   96859189   35919437         -3          1   96859189   71838874
7bqnjnuptpatm      10585     116958         11       8927   94490590 1.0220E+10
3aj76umkhstnf       2392      18974          8      33688   80588976  167996156
0bc7w9wr441zq        488       7678         16     164108   80126122 1852890834
akrvwc6urkax3       1047      43265         41      70221   73524606 1337942240
0k4nn1g1fwqr8        348      19872         57     191774   66719694 5630835446
agxyq3xtmbtxt        196       2172         11     251529   49349839  895362046
12qw5z09ufub8        267      15673         59     140150   37465359 1934266074
g434abph3dq42        172       1866         11     189748   32709417  509570152
9vp8s2hu7sg41        185       1494          8     166841   30867239  203240886
gag61u6hjrsvk       8658     124536         14       3091   26760809 4033231394
56f1b4nwg7vc9        163       3828         23      67637   11008187 1955368074
gxk98nqkg9rkz        196       5322         27      47439    9285292 1394379424
69subccxd9b03        668      47280         71       4017    2683163 1985395704
c618bg1pzysfb      22832    1678854         74          7     159822   30219378
3t5zba5fc90aj        987      50171         51        155     152919    1705820
Summary
• reconsider SPARC platform
• save money with hard partitioning
• know and utilize your benchmark tools
• don’t fear endian conversion
• use your ASH/AWR data
thank you
twitter: @brost
http://portrix-systems.de/blog/