OpenDBCamp Virtualization
Transcript of OpenDBCamp Virtualization
Liz van Dijk - @lizztheblizz - [email protected]
VIRTUAL DATABASES?
Optimizing for Virtualization
Sunday 8 May 2011
THE GOAL
“Virtualized Databases suck!” - Is this really true? Does it have to be?
Databases are supposed to be “hard to virtualize” and to perform worse in a virtual environment. There is some truth to this: dumping a native database into a virtual environment without making any changes can cause performance issues.
HOW DO WE GET THERE?
1. Understanding just why the virtual environment impacts performance, and taking the correct steps to adapt our database to its new habitat.
2. Optimize, optimize, optimize...
Action: We have to understand why virtualization affects database performance, and how we can arm ourselves against that impact.
On the other hand, while there used to be less of a need for optimization in an environment where hardware was abundant, a virtual environment runs into contention for resources much sooner. It’s important to keep our application as slim as possible without losing performance. In many cases, performance can be multiplied by taking a closer look at the database.
Message: Why is this interesting for you? This knowledge could convince you to make the switch to a virtual environment, trusting that it won’t hurt your software’s performance, and it will help you review your existing infrastructure and take the steps needed to run your application as efficiently as possible.
THE INFLUENCE OF VIRTUALIZATION
• All “kernel” activity is more costly:
  • Interrupts
  • System Calls (I/O)
  • Memory page management
So, let’s start with the understanding step: what could potentially slow down because of virtualization?
The 3 most important aspects are:
Interrupts - An actual piece of hardware is looking for attention from the CPU. Making use of Jumbo Frames is a very good idea in a virtual environment, because sending the same data causes fewer interrupts (1500 --> 9000 bytes per packet).
System Calls - A process is looking for attention from the kernel to do a privileged task like accessing certain hardware (network/disk I/O).
Page Management - This is the most important one for databases: think caching. The database keeps an enormous amount of data in its own caches, so memory is manipulated constantly. Every time something changes in this memory, the virtual host has to perform a double translation: from virtual memory to the VM page table to the physical address.
Usually, this causes the biggest performance hit when switching from native to virtual. We really have to do everything we can to minimize this problem.
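To put a number on the Jumbo Frames remark above, here is a small sketch that counts how many frames are needed to move the same payload at both frame sizes. The 40-byte header overhead (IPv4 + TCP without options) and the 1 MB payload are assumptions chosen for illustration, and the frame count is only a rough proxy for interrupts, since NICs may coalesce several frames into one interrupt.

```python
import math

def frames_needed(payload_bytes: int, mtu: int, header_bytes: int = 40) -> int:
    """Frames needed to carry payload_bytes at a given MTU.

    header_bytes is an assumed per-frame IPv4 + TCP header overhead.
    """
    per_frame = mtu - header_bytes
    return math.ceil(payload_bytes / per_frame)

payload = 1_000_000  # arbitrary example: 1 MB of result data
standard = frames_needed(payload, 1500)
jumbo = frames_needed(payload, 9000)

print(standard, jumbo)             # 685 112
print(round(standard / jumbo, 1))  # 6.1
```

Under these assumptions the same data needs roughly six times fewer frames, and every frame that is not sent is one less event that has to pass through the hypervisor.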
GENERAL OPTIMIZATION STRATEGY
Making the right hardware choices
Tuning the hypervisor to your database’s needs
Tuning the OS to your database’s needs
Squeezing every last bit of performance out of your database
Performance issues should be dealt with systematically, and we can split that process up into these 4 steps.
HARDWARE CHOICES
• Choosing the right CPUs
  • Intel Xeon 5500/7500 (Nehalem) and later / all AMD quad-core Opterons (HW-assisted MMU virtualization)
• Choosing the right NICs (VMDQ)
• Choosing the right storage system (iSCSI vs FC SAN)
CPUs --> HW Virtualization (dom -1) & HAP
Best price/quality at the moment:
Opteron 6000 series: very good at data mining/decision support
Xeon 5600 series: still very good at OLTP
VMDQ = sorting/queueing offloaded to the NIC
CPU EVOLUTION
OVERVIEW NIC - VMDQ / NETQUEUE
Netqueue Devices                           | Ports | Part nr    | Speed     | Interface
Intel Ethernet Server Adapter X520-SR2     | 2     | E10G42BFSR | 10 Gbps   | SR-LC
Intel Ethernet Server Adapter X520-DA2     | 2     | E10G42BTDA | 10 Gbps   | SFP+
Intel Gigabit ET Dual Port Server Adapter  | 2     | E1G42ET    | 1 Gbps    | RJ-45 - Copper
Intel Gigabit EF Dual Port Server Adapter  | 2     | E1G42EF    | 1 Gbps    | RJ-45 - Fibre
Intel Gigabit ET Quad Port Server Adapter  | 4     | E1G44ET    | 1 Gbps    | RJ-45 - Copper
Intel Gigabit CT Desktop Adapter           | -     | EXPI9301CT | 1 Gbps    | RJ-45 - Copper
Supermicro Add-on Card AOC-SG-I2           | 2     | AOC-SG-I2  | 1 Gbps    | RJ-45 - Copper
Onboard 82576 (8 virtual queues)           | -     | -          | -         | -
Onboard 82574 (no IOV)                     | -     | -          | -         | -
Broadcom NetXtreme II Ethernet chipset     | -     | -          | 1-10 Gbps | -
All Neterion adapters                      | -     | -          | 1-10 Gbps | -
SAN CHOICES
• iSCSI (using 10Gbit if possible)
  • ESX with Hardware Initiator (iSCSI HBA)
  • ESX with Software Initiator
  • Initiator inside the Guest OS
  • vSphere: iSCSI HBA pass-through to Guest OS
• Fibre Channel
  • ESX with FC-HBA
  • vSphere: FC-HBA pass-through to Guest OS
Storage server with (hardware) iSCSI = iSCSI Target
(Virtualization) server with (hardware) iSCSI = iSCSI Initiator
10Gbit = high CPU overhead! We’re talking about roughly 24 GHz of CPU capacity to fill up 9 Gbit/s. This problem can be reduced by the following technologies:
VT-d ---> Moves DMA and address translation into hardware, so a device such as a NIC can be handed to a guest directly
VMDQ/Netqueue ---> Sorting/queueing offloaded to the NIC; Netqueue is pretty much VMware’s implementation of VMDQ
SR-IOV ---> Allows one physical device (NIC) to present itself as multiple virtual devices
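To make the “24 GHz for 9 Gbit” figure more tangible, the sketch below converts it into CPU cycles per transferred bit and into a number of cores. The 2.93 GHz clock speed is a hypothetical value picked for the example, not a figure from the talk.

```python
import math

CPU_GHZ_USED = 24.0    # from the note above: CPU capacity consumed
THROUGHPUT_GBIT = 9.0  # from the note above: traffic generated with it

def cores_needed(total_ghz: float, core_clock_ghz: float) -> int:
    """Whole cores required to supply total_ghz of CPU capacity."""
    return math.ceil(total_ghz / core_clock_ghz)

cycles_per_bit = CPU_GHZ_USED / THROUGHPUT_GBIT
cores = cores_needed(CPU_GHZ_USED, 2.93)  # 2.93 GHz is an assumed core clock

print(round(cycles_per_bit, 2))  # 2.67
print(cores)                     # 9
```

In other words, without offloading, saturating the link could occupy more cores than many hosts of that generation had to spare, which is why VT-d, VMDQ/Netqueue and SR-IOV matter.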
VIRTUAL MEMORY
[Diagram]
Actual Hardware:
  Mem: 0xA, 0xB, 0xC, 0xD, 0xE, 0xF, 0xG, 0xH
  CPU, with TLB: 1 | 0xD, 5 | 0xH, 2 | 0xC, etc.
Managed by software:
  OS, Virtual Memory: pages 1 to 12
  Page Table: 1 | 0xD, 2 | 0xC, 3 | 0xF, 4 | 0xA, 5 | 0xH, 6 | 0xG, 7 | 0xB, 8 | 0xE, etc.
CPUs (hardware-assisted MMU virtualization): AMD: all quad-core Opterons. Intel: Xeon 5500, 5600 and 7500.
Physical memory is divided into chunks of 4 KB, which software addresses as so-called pages: small chunks, each with its own address, which the CPU uses to find the data in physical memory.
Within an OS, a piece of software always gets a contiguous block of “virtual” memory assigned to it, even though the physical memory behind it is fragmented. This prevents a coding nightmare (keeping track of every single page address yourself is madness).
The page table exists so the CPU can walk through it and translate a virtual address to the corresponding physical memory. The CPU also has a hardware cache for these entries, the Translation Lookaside Buffer (TLB). This is an extremely fast buffer that holds the most recently used translations, so the CPU can avoid walking the page table as much as possible.
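The lookup described above can be sketched in a few lines. This is a toy model, not how any particular CPU works: the page table uses the mappings from the slide, while the three-entry TLB size and the least-recently-used eviction policy are assumptions made for the example.

```python
from collections import OrderedDict

# Page table as drawn on the slide: virtual page -> physical frame label.
PAGE_TABLE = {1: "0xD", 2: "0xC", 3: "0xF", 4: "0xA",
              5: "0xH", 6: "0xG", 7: "0xB", 8: "0xE"}

class TLB:
    """Tiny translation cache with LRU eviction (policy is an assumption)."""

    def __init__(self, capacity: int):
        self.capacity = capacity
        self.entries = OrderedDict()
        self.hits = 0
        self.misses = 0

    def translate(self, page: int) -> str:
        if page in self.entries:
            # Fast path: translation is cached.
            self.hits += 1
            self.entries.move_to_end(page)
            return self.entries[page]
        # Slow path: walk the page table, then cache the result.
        self.misses += 1
        frame = PAGE_TABLE[page]
        self.entries[page] = frame
        if len(self.entries) > self.capacity:
            self.entries.popitem(last=False)  # evict least recently used
        return frame

tlb = TLB(capacity=3)
for page in [1, 5, 2, 1, 5, 2, 7, 1]:
    tlb.translate(page)

print(tlb.hits, tlb.misses)  # 3 5
```

The repeated accesses to pages 1, 5 and 2 are served from the TLB; touching page 7 evicts page 1, so the final access to page 1 has to walk the page table again.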
SPT VS HAP
[Diagram: shadow page tables (SPT)]
Layers: Managed by VM OS / Managed by hypervisor / Actual Hardware
Mem: 0xA, 0xB, 0xC, 0xD, 0xE, 0xF, 0xG, 0xH
CPU
VM A: pages 1, 2, 3, 4, 5; “Read-only” Page Table: 1 | 0xD, 5 | 0xH, 2 | 0xC
VM B: pages 1, 2, 3, 4, 12; “Read-only” Page Table: 12 | 0xB, 10 | 0xE, 9 | 0xA, etc.
“Shadow” Page Table A: 1 | 0xG, 5 | 0xD, 2 | 0xF
“Shadow” Page Table B: 12 | 0xE, 10 | 0xB, 9 | 0xC
SPT VS HAP
[Diagram: hardware-assisted paging (HAP)]
Layers: Managed by VM OS / Managed by hypervisor / Actual Hardware
Mem: 0xA, 0xB, 0xC, 0xD, 0xE, 0xF, 0xG, 0xH
CPU, with TLB: A1 | 0xD, A5 | 0xH, A2 | 0xC, B12 | 0xB, B10 | 0xE, B9 | 0xA, etc.
VM A: pages 1, 2, 3, 4, 5; “Read-only” Page Table: 1 | 0xD, 5 | 0xH, 2 | 0xC
VM B: pages 1, 2, 3, 4, 12; “Read-only” Page Table: 12 | 0xB, 10 | 0xE, 9 | 0xA, etc.
In a virtual environment, the guest OS is not allowed direct access to physical memory, so address translation has to be solved differently. With shadow page tables (SPT), each VM gets access to its own page table, but that table is actually locked/read-only: as soon as the guest changes it, a “trap” is generated and the hypervisor takes over to handle the page management, keeping its own “shadow” page table with the real addresses. This causes a lot of overhead, because every single memory management action forces the hypervisor to intervene.
As an alternative, newer CPUs came to market with hardware-assisted paging (HAP): a modified TLB that keeps track of the complete translation path (VM virtual address --> VM physical address --> host physical address).
Downside: filling the TLB became a lot more expensive, because a page that is not yet in the TLB requires a much longer lookup. Once the TLB is properly warmed up, though, most applications rarely have to wait for other pages.
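The two approaches can be contrasted in a short sketch. The guest page table is VM A’s table from the slide; the `host_map` dictionary is an inferred assumption, chosen so that combining the two reproduces the slide’s shadow table for VM A. The walk-cost formula is the commonly cited worst case for a two-dimensional walk over radix page tables with no caching, included only to illustrate why a TLB miss is so much more expensive under HAP.

```python
# VM A's page table from the slide: guest page -> address the VM believes is physical.
guest_page_table = {1: "0xD", 5: "0xH", 2: "0xC"}

# Hypervisor mapping: guest "physical" address -> host physical address.
# Assumption: inferred so the result matches the slide's shadow table for VM A.
host_map = {"0xD": "0xG", "0xH": "0xD", "0xC": "0xF"}

def build_shadow_table(guest: dict, host: dict) -> dict:
    """SPT: the hypervisor precomputes guest page -> host address.

    It must redo this work (after a trap) whenever the guest edits its table.
    """
    return {page: host[guest_addr] for page, guest_addr in guest.items()}

def hap_translate(page: int, guest: dict, host: dict) -> str:
    """HAP: the hardware resolves both stages itself on a TLB miss."""
    return host[guest[page]]

def nested_walk_refs(guest_levels: int, host_levels: int) -> int:
    """Worst-case memory references for one nested page walk, no caching."""
    return (guest_levels + 1) * (host_levels + 1) - 1

shadow = build_shadow_table(guest_page_table, host_map)
print(shadow)                                        # {1: '0xG', 5: '0xD', 2: '0xF'}
print(hap_translate(5, guest_page_table, host_map))  # 0xD
print(nested_walk_refs(4, 4))                        # 24
```

With four-level tables on both sides, a single miss can cost up to 24 memory references instead of 4 on native hardware, which is the price HAP pays for removing the hypervisor traps.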
HAP
As you can see, in general this does help improve performance, though not by a really huge amount. It opens the door to a great combination with another technique, though!
HAP + LARGE PAGES
Setting Large Pages:
• Linux - increase SHMMAX in rc.local
• Windows - grant “Lock Pages in memory”
• MySQL (only InnoDB) - large-pages
• Oracle - ORA_LPENABLE=1 in registry
• SQL Server - Enterprise only, need >8GB RAM. For the buffer pool, start up with trace flag 834 (-T834)
While using HAP, you should definitely make use of Large Pages, because filling the TLB is a lot more expensive. With Large Pages (2 MB instead of 4 KB), a LOT more memory can be reached through a single TLB entry. Combined with the bigger TLBs in the newest CPUs, this helps keep entries from being evicted from the TLB too quickly.
Oracle: HKEY_LOCAL_MACHINE\SOFTWARE\ORACLE\KEY_HOME_NAME
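The benefit of Large Pages is easiest to see as “TLB reach”: the amount of memory that can be addressed without a single TLB miss. The sketch below computes it for both page sizes; the 512-entry TLB is a hypothetical size chosen for the example, not a figure for any specific CPU.

```python
KB = 1024
MB = 1024 * KB

def tlb_reach(entries: int, page_size: int) -> int:
    """Bytes of memory covered when every TLB entry maps one page."""
    return entries * page_size

ENTRIES = 512  # hypothetical TLB size

small = tlb_reach(ENTRIES, 4 * KB)
large = tlb_reach(ENTRIES, 2 * MB)

print(small // MB, "MB with 4 KB pages")  # 2 MB with 4 KB pages
print(large // MB, "MB with 2 MB pages")  # 1024 MB with 2 MB pages
```

With the same number of entries, the covered memory grows by a factor of 512, which makes a real difference for a database whose buffer pool is many gigabytes in size.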
VIRTUAL HBAs
• Choices (ESX)
  • Before vSphere:
    • BusLogic Parallel (Legacy)
    • LSI Logic Parallel (Optimized)
  • Since vSphere:
    • LSI Logic SAS (default as of Win2008)
    • VMware Paravirtual (PVSCSI)
• Thin vs Thick Provisioning (vSphere)
• Snapshots & performance do not go together
BusLogic ---> Generic adapter
LSI Logic ---> Optimized adapter that requires tools
LSI Logic SAS ---> Presents itself as a SAS controller (necessary for Windows clustering)
PVSCSI ---> Fully paravirtualized high-performance adapter, created to use iSCSI from the guest; supports command queueing
VIRTUAL NICs
• Choices (ESX)
  • Before vSphere:
    • Flexible (emulation)
    • E1000 (Intel E1000 emulation, default x64)
    • (enhanced) VMXNET (paravirtual)
  • Since vSphere:
    • VMXNET 3 (third generation paravirtual NIC)
• Jumbo frames, NIC Teaming, VLANs
• Colocation (minimize NIC traffic by sharing a host)
Flexible ---> Required for 32-bit systems. Automatically turns into a VMXNET after installing VMware Tools
VMXNET adds ‘Jumbo Frames’
VMXNET3 adds:
• MSI/MSI-X support (if supported by the guest OS kernel)
• Receive Side Scaling (Windows 2008 only)
• IPv6 checksum & TCP Segmentation Offloading (segmentation of packets --> NIC, not CPU)
• VLAN Offloading
• Bigger TX/RX ring sizes
• Optimizations for iSCSI & VMotion
• Necessary for VMDq!
BEST OS CHOICES
• 64-bit Linux for MySQL
  • MySQL 5.1.32 or later
• ... ? (discuss mode on! :) )
Modified mutexes for InnoDB improve locking in multithreaded environments, which allows for much better scaling.
DON’T FORGET
• VMware Tools
  • Paravirtualized Vmxnet, PVSCSI
  • Ballooning
  • Time Sync
  • ... and more recent drivers
• Integration Services
  • Paravirtualized Drivers
  • Hypercall adapter
  • Time Sync
  • ... and more recent drivers
Definitely install the tools of the hypervisor in question so you can use its newest features. This is very important if you want to use, for example, memory overcommitment in ESX or paravirtualization for Linux on Hyper-V.
CACHING LEVELS
• CPU
• Application
• Filesystem / OS
• RAID Controller (switch off or use a BBU!)
• Disk
CPU: Just buy the right CPU.
App/FS: Use the correct settings (Direct IO).
RAID Controller: Make use of a battery backup unit (BBU). Transactional databases keep lots of random writes in this cache, which is mostly used as a write buffer, so the battery makes sure the controller does not lose them.
Disk: If the disk has its own on-board cache, it’s best to disable it, so nothing can get stuck in that cache when the power drops. HP disables these by default.
DIRECT IO
• Less Page management
• Smallest cache possible vs Less I/O
SQL Server: Automatically
MySQL: only for use with InnoDB! - innodb_flush_method=O_DIRECT
Oracle: filesystemio_options=DIRECTIO
Though in Windows this is on by default, in Linux it should definitely be enabled. Otherwise everything that is already cached in the InnoDB buffer pool may also be cached by the filesystem cache, so two separate but identical caches have to be maintained in memory: far too much memory management.
MySQL’s MyISAM, on the other hand, actually depends on this filesystem cache: it expects the OS to do the brunt of the data caching work itself.
GENERAL MY.CNF OPTIMIZATIONS
• max_connections (151) (File descriptors!)
• Per connection
• read_buffer_size (128K) (Full Scan)
• read_rnd_buffer_size (256K) (Order By)
• sort_buffer_size (2M) (Sorts)
• join_buffer_size (128K) (Full Scan Join)
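Because these four buffers are allocated per connection, their sizes multiply with max_connections. The sketch below estimates a rough upper bound using the default values listed on the slide. It assumes every connection allocates each buffer exactly once at the same moment, which is a simplification: the buffers are only allocated when a query needs them, and some queries can allocate a buffer more than once.

```python
KB = 1024
MB = 1024 * KB

# Defaults as listed on the slide.
MAX_CONNECTIONS = 151
PER_CONNECTION_BUFFERS = {
    "read_buffer_size": 128 * KB,      # full scans
    "read_rnd_buffer_size": 256 * KB,  # ORDER BY
    "sort_buffer_size": 2 * MB,        # sorts
    "join_buffer_size": 128 * KB,      # full scan joins
}

def worst_case_bytes(max_connections: int, buffers: dict) -> int:
    """Upper bound: every connection holds every buffer once."""
    return max_connections * sum(buffers.values())

total = worst_case_bytes(MAX_CONNECTIONS, PER_CONNECTION_BUFFERS)
print(total / MB)  # 377.5
```

Doubling sort_buffer_size alone would add another 302 MB to this bound, so in a memory-constrained VM these settings deserve to be sized together with the buffer pool rather than raised one at a time.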
GENERAL MY.CNF OPTIMIZATIONS
• thread_cache (thread_cache_size; check out max_used_connections)
• table_cache (64) - table_open_cache (5.1.3x)
  • open_tables variable
  • opened_tables ∆ ≈ 0
• Engine dependent
  • innodb_buffer_pool_size
  • innodb_thread_concurrency
Try to fit max_used_connections into the thread cache IF POSSIBLE.
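The two checks on this slide can be written down as simple rules of thumb. The sketch below works on numbers you would read from the server’s status counters yourself; it does not connect to MySQL, and the sample values are made up for illustration.

```python
def opened_tables_rate(first, second):
    """Growth of opened_tables per second between two samples.

    Each sample is a (uptime_seconds, opened_tables) pair. A rate close
    to zero suggests the table cache is large enough.
    """
    (t0, opened0), (t1, opened1) = first, second
    if t1 <= t0:
        raise ValueError("second sample must be taken later than the first")
    return (opened1 - opened0) / (t1 - t0)

def thread_cache_fits(thread_cache_size, max_used_connections):
    """True if the thread cache can hold the observed connection peak."""
    return thread_cache_size >= max_used_connections

# Hypothetical samples taken ten minutes apart.
rate = opened_tables_rate((1000, 5000), (1600, 5003))
print(rate)                       # 0.005
print(thread_cache_fits(16, 42))  # False
```

In this made-up case the table cache looks fine, with only three extra table opens in ten minutes, while the thread cache is smaller than the connection peak and could be raised if memory allows.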
INDEXING
• Heaps
• Unclustered Indexes
• Clustered Indexes (InnoDB)
INDEX FRAGMENTATION
• Happens with clustered indexes
• Large-scale fragmentation of the indexes could cause serious performance problems
• Fixes:
  • SQL Server: REBUILD/REORGANIZE
  • MySQL: ALTER TABLE tbl_name ENGINE=INNODB
  • Oracle: ALTER INDEX index_name REBUILD
[Figure: Clustered Index Leaf Level]
STORAGE ENGINE INTERNALS
[Diagram: DB Front, Buffer Pool / Cache, Datafile and Transaction Log, with Update and Insert/Delete operations and the Checkpoint process]
SQL Server --> Set memory options in Server Properties > Memory > Server memory options
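The components named on this slide fit the general write-ahead logging pattern, sketched below as a toy model. This is an assumption about what the diagram depicted rather than a description of any specific engine: changes are appended to the transaction log and applied in the buffer pool first, and a checkpoint later writes the dirty pages to the datafile. Clearing the whole log at checkpoint time is a deliberate simplification.

```python
class StorageEngine:
    """Toy write-ahead logging model; all names are illustrative."""

    def __init__(self):
        self.datafile = {}         # pages as stored on disk
        self.buffer_pool = {}      # pages cached in memory
        self.dirty = set()         # cached pages not yet written to the datafile
        self.transaction_log = []  # append-only, written sequentially

    def read(self, key):
        # Serve from the buffer pool, loading from the datafile on a miss.
        if key not in self.buffer_pool and key in self.datafile:
            self.buffer_pool[key] = self.datafile[key]
        return self.buffer_pool.get(key)

    def update(self, key, value):
        # Log first, then change the cached page; the datafile is untouched.
        self.transaction_log.append((key, value))
        self.buffer_pool[key] = value
        self.dirty.add(key)

    def checkpoint(self):
        # Flush dirty pages to the datafile; logged changes are now durable there.
        for key in self.dirty:
            self.datafile[key] = self.buffer_pool[key]
        flushed = len(self.dirty)
        self.dirty.clear()
        self.transaction_log.clear()
        return flushed

engine = StorageEngine()
engine.update("row1", "a")
engine.update("row1", "b")
engine.update("row2", "c")
log_before = len(engine.transaction_log)  # 3 sequential log writes
flushed = engine.checkpoint()             # only 2 pages reach the datafile
print(log_before, flushed, engine.datafile)
```

The model shows why the log and the datafile have such different I/O patterns: the log receives a small sequential write for every change, while the datafile only sees scattered page writes at checkpoint time.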
DATA AND LOG PLACEMENT
This is most important for transactional databases.
As you can see, the difference between using a decent SAS disk and an SSD for the database log is negligible. There is no use sinking money into an SSD for the logs; just get a decent, fast SAS disk.
SQL STATEMENT ‘DUHS’
• Every table MUST have a primary key
• If possible, use a clustered index
• Only keep regularly used indexes around (e.g. FKs)
• WHERE > JOIN > ORDER BY > SELECT
• Don’t use SELECT *
• Try not to use COUNT() (in InnoDB always a full table scan)
QUESTIONS?
I don’t have the attention span to keep up a blog :(
Results of benchmarks: http://www.anandtech.com/tag/IT