Insight Case Studies
Tuning the Beloved DB-Engines
Presented By Nithya Koka and Michael Arnold
Who is Nithya Koka?
● Senior Hadoop Administrator on numerous Insight projects:
  ○ Project Lead
  ○ Client Engagement
  ○ On-Call Engineer
  ○ Cluster Ninja
● 5+ years in IT - 4 years with Hadoop
Who is Michael Arnold?
● Principal Systems Engineer
● Automation geek
● 20+ years in IT - 9 years with Hadoop
● I help people deal with:
○ Servers (physical and virtual)
○ Networks
○ Server operating systems
○ Hadoop distributions
○ Making it all run smoothly
Agenda
● Impala Tuning Case Study
● HBase Tuning Case Study
Impala Tuning Case Study
Impala Tuning
Case Study: ClientA Impala Woes
1. Impala threads peak, crash the daemon, and all queries hang, causing a complete outage for end users. This has been happening over:
   ○ 2 years, on and off
   ○ Multiple support tickets
   ○ Several tuning attempts
   No trend in host or timeframe for when these incidents occur.
2. Impala queries in Hue error out with "expired results" messages.
Impala Tuning
Initial Insight Evaluation
Gotchas Captured:
● Role layout: overburdened "master hosts"
● Buggy RHEL kernel (Linux 2.6.32-504.3.3.el6.x86_64)
● Multiple Java versions
● Default swappiness
● Transparent hugepages enabled
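The last two gotchas are quick to verify from a shell. A minimal sketch of the checks, with commonly recommended Hadoop target values (the values are general guidance, not from this deck):

sysctl vm.swappiness   # RHEL 6 defaults to 60; a value of 1 is commonly recommended on Hadoop nodes
cat /sys/kernel/mm/transparent_hugepage/enabled   # [never] is commonly recommended; on RHEL 6 the path is /sys/kernel/mm/redhat_transparent_hugepage/enabled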
Impala Tuning
Impala Threads
Typical Incident Pattern
[Chart slides: Impala daemon thread counts spiking during a typical incident]
Impala Tuning
Impala Threads: Deep Dive
1. Potential disk errors in dmesg output on incident-prone hosts.
2. JVM crashes reported by Impala.
3. HDFS file count snowballing.
Impala Tuning
[Chart: HDFS file count snowballing from 750K to 1.15 million files]
Impala Tuning
Impala Threads: Deep Dive
1. Disk Errors
● Without spill directories configured, scratch space was defaulting to /tmp/impala-scratch, which was unsuitable for the scale and concurrency.
Resolution:
● Spread the disk spill across the data drives.
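A minimal sketch of the fix via the impalad startup flag, assuming a typical /data/N mount layout (the paths are hypothetical); Cloudera Manager exposes this as the Impala Daemon Scratch Directories setting:

-scratch_dirs=/data/1/impala/scratch,/data/2/impala/scratch,/data/3/impala/scratch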
Impala Tuning
Impala Threads : Deep Dive
1. Disk Errors
● Identified a bad RAID controller: three problem disks on a master host hosting a RAID10 virtual disk for the NameNode, a RAID1 virtual disk for the JournalNode, and another RAID1 virtual disk for ZooKeeper.
Resolution:
● The host was decommissioned, the bad disks were replaced, and the host was brought back in a good state.
● Regular scans with the RAID controller CLI have been set up to alert on any future incidents.
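A hedged sketch of such a scan, assuming an LSI/MegaRAID controller managed with MegaCli; the CLI path, output format, and alerting hook all vary by vendor:

# hourly cron entry (hypothetical): log any drive reporting media errors or predicted failure
0 * * * * /opt/MegaRAID/MegaCli/MegaCli64 -PDList -aALL | grep -Ei 'media error count: [1-9]|predictive failure count: [1-9]' >> /var/log/raid-check.log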
Impala Tuning
Impala Threads: Deep Dive
2. JVM Crashes Reported by Impala
● The running OS kernel version is known to cause CDH applications to pause, resulting in the JVM hangs seen in the Impala reports.
Resolution:
● Upgrade the kernel to version 2.6.32-504.16.2.el6 or later.
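A minimal sketch of checking and remediating an affected RHEL 6 host:

uname -r            # confirm whether the host is still on the buggy 2.6.32-504.3.3.el6 kernel
yum update kernel   # install 2.6.32-504.16.2.el6 or later, then reboot into it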
Impala Tuning
Impala Threads: Deep Dive
3. The small files problem:
● Parquet files on the order of kilobytes, which led to slow I/O throughput.
● Coordinator and executor connections fail due to high scan times from the NameNode.
● The failed executor connections kick off more threads, which add up very quickly and crash the daemon.
Resolution:
● By rewriting the Parquet compaction to use dynamic partitions, the client produced 1 file in place of 29, significantly reducing the overall file count.
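A hedged sketch of such a compaction pass in Impala SQL (the table and column names are hypothetical); SET NUM_NODES=1 is a standard Impala query option that funnels the write through a single node so each partition is written as few files as possible:

impala-shell -q "
SET NUM_NODES=1;
INSERT OVERWRITE events_compacted PARTITION (event_date)
SELECT col_a, col_b, event_date FROM events_small_files;
"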
Impala Tuning
Impala Threads: Deep Dive
Tuning for Scale
● Since Impala 2.9, we can assign Impala Daemons as dedicated query coordinators or query executors.
● These two roles can now be tuned according to their responsibilities, giving us more flexibility.
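A minimal sketch of the corresponding impalad startup flags introduced in Impala 2.9:

-is_coordinator=true -is_executor=false    # dedicated coordinator
-is_coordinator=false -is_executor=true    # dedicated executor

By default both flags are true, which is the pre-2.9 combined behavior.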
Impala Tuning
Impala Threads: Deep Dive
Tuning for Scale
Coordinators:
● Perform the network communication to keep metadata up to date and to route query results to the appropriate clients.
● Experience significant network and CPU overhead with queries containing a large number of query fragments.
● Need large JVM heap for caching metadata for all table partitions and data files.
Impala Tuning
Impala Threads: Deep Dive
Tuning for Scale
Executors:
● Need only the default JVM heap, leaving more memory available to process CPU-intensive joins, aggregations, and other operations.
● Executors perform I/O intensive scans.
Impala Tuning
Impala Threads: Deep Dive
Tuning for Scale
Coordinators: How many? [Our cluster: 3]
● Small is good (a minimum of 1 dedicated)
● Considerations: number of Impala Daemons, DDL queries, average query resource usage at various stages
Where do they go? [Our cluster: Utility hosts]
● Coordinators can go on non-worker hosts.
● Avoid losing out on resources, memory, or disk.
Impala Tuning
Choosing the right Load-Balancing Algorithm for High Availability through a proxy.
LeastConn:
High Availability
What? Connects sessions to the coordinator with the fewest connections, to balance the load evenly.
When? Many independent, short-running queries.
Where? Recommended for Impala with F5.
Impala Tuning
Choosing the right Load-Balancing Algorithm for High Availability through a proxy.
RoundRobin:
High Availability
What? Distributes connections across all coordinator nodes; a list of servers with weight parameters can be added to define the distribution.
When? For predictable, stable balancing; requires benchmarking and load testing.
Where? Not recommended by Cloudera for Impala.
Impala Tuning
Choosing the right Load-Balancing Algorithm for High Availability through a proxy.
Source Persistence:
High Availability
What? The source IP address is hashed and divided by the total weight of the running servers to determine which server will receive the request.
When? Impala workloads containing a mix of queries and DDL statements, such as CREATE TABLE and ALTER TABLE.
Where? It is required for setting up high availability with Hue.
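A hedged sketch of how the three algorithms map onto an HAProxy configuration for Impala (hostnames are hypothetical; 21000 is the impala-shell port and 21050 the JDBC/Hue port):

listen impala-shell
    bind :21000
    mode tcp
    balance leastconn          # or: balance roundrobin
    server impalad1 host1.example.com:21000 check
    server impalad2 host2.example.com:21000 check

listen impala-jdbc-hue
    bind :21050
    mode tcp
    balance source             # session persistence, as required for Hue
    server impalad1 host1.example.com:21050 check
    server impalad2 host2.example.com:21050 check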
HBase Tuning Case Study
HBase Tuning
Case Study: ClientB OpenTSDB Platform Upgrade
● Client wanted to upgrade from a manually installed HBase environment to the Cloudera distribution's HBase.
● New hardware with a much larger RAM footprint.
● SSDs, because, why not? (And not important to this tuning.)
HBase Tuning
Initial Insight Evaluation
Gotchas Captured:
● None, really. It is not installed yet, but we will need to tune HBase to utilize a lot more memory.
HBase Tuning
Use the Java Development Kit (JDK) version 8.
Java
HBase Tuning
Enable garbage collection (GC) logging.
Java
-XX:+PrintGC -XX:+PrintGCDetails -XX:+PrintGCDateStamps -XX:+PrintGCTimeStamps -XX:+PrintAdaptiveSizePolicy -XX:+PrintReferenceGC -XX:+PrintFlagsFinal -Xloggc:/var/log/hbase/regionserver-gc.log
HBase Tuning
Enable garbage collection (GC) log rotation.
Java
-XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=10 -XX:GCLogFileSize=200M
HBase Tuning
Enable the G1GC garbage collector for the RegionServer.
Java
-XX:+UseG1GC -XX:MaxGCPauseMillis=100
https://www.oracle.com/technetwork/java/javase/tech/g1-intro-jsp-135488.html
HBase Tuning
Tune G1GC.
Java
-XX:+ParallelRefProcEnabled -XX:-ResizePLAB -XX:ParallelGCThreads=<8 + (logical processors − 8) × 5/8> -XX:+UnlockExperimentalVMOptions -XX:G1NewSizePercent=3
https://www.oracle.com/technetwork/articles/java/g1gc-1984535.html
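A worked example of the thread-count formula above: on a host with 40 logical processors, ParallelGCThreads = 8 + (40 − 8) × 5/8 = 8 + 20 = 28, so the flag becomes -XX:ParallelGCThreads=28.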
HBase Tuning
Where do the HBase GC settings go?
Configuration
Cloudera Manager: HBase -> Configuration -> SCOPE:RegionServer / CATEGORY:Advanced / Java Configuration Options for HBase RegionServer
Ambari: Service/HBase/Configs -> CONFIGS / ADVANCED / Advanced hbase-env / hbase-env template
HBase Tuning
Increase the Java Heap of the HBase RegionServer.
Java
CM: Java Heap Size of HBase RegionServer in Bytes: 31 GiB
Ambari: HBase RegionServer Maximum Memory: 31 GiB
Never set the heap size to a value between 32 and 48 GiB: above roughly 32 GiB the JVM can no longer use compressed ordinary object pointers (oops), so a heap in that range actually holds fewer objects than a 31 GiB heap.
https://blog.codecentric.de/en/2014/02/35gb-heap-less-32gb-java-jvm-memory-oddities/
HBase Tuning
Enable the HBase BucketCache.
HBase
RegionServer Advanced Configuration Snippet (Safety Valve) for hbase-site.xml:
hbase.bucketcache.ioengine: offheap
hbase.bucketcache.size: 32 GiB (or 96 GiB)
hfile.block.cache.size: 0.2
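A hedged sketch of the equivalent hbase-site.xml entries; note that hbase.bucketcache.size, when set to a number of 1.0 or greater, is read as megabytes (32 GiB = 32768, 96 GiB = 98304):

<property>
  <name>hbase.bucketcache.ioengine</name>
  <value>offheap</value>
</property>
<property>
  <name>hbase.bucketcache.size</name>
  <value>32768</value>
</property>
<property>
  <name>hfile.block.cache.size</name>
  <value>0.2</value>
</property>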
HBase Tuning
Enable the HBase BucketCache.
HBase
HBase Client Environment Advanced Configuration Snippet for hbase-env.sh:
HBASE_OFFHEAPSIZE=36G (or 100G)
HBASE_OPTS=-XX:MaxDirectMemorySize=36G (or 100G)
HBase Tuning
Enable HBase MultiWAL support.
HBase
hbase.wal.provider: Multiple HDFS WAL
hbase.wal.regiongrouping.numgroups: (numDrives/3)
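As a worked example of the formula above: on a RegionServer with 12 data drives, hbase.wal.regiongrouping.numgroups = 12 / 3 = 4.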
HBase Tuning
Enable HDFS hedged reads.
HDFS
dfs.client.hedged.read.threadpool.size: 20
dfs.client.hedged.read.threshold.millis: 500 milliseconds
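A minimal sketch of the same settings as hdfs-site.xml entries, applied to the HDFS client configuration that the RegionServers use:

<property>
  <name>dfs.client.hedged.read.threadpool.size</name>
  <value>20</value>
</property>
<property>
  <name>dfs.client.hedged.read.threshold.millis</name>
  <value>500</value>
</property>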
References
● https://impala.apache.org/docs/build/html/topics/impala_scalability.html
● https://impala.apache.org/docs/build/html/topics/impala_partitioning.html
● https://impala.apache.org/docs/build/html/topics/impala_proxy.html
● https://software.intel.com/en-us/blogs/2014/06/18/part-1-tuning-java-garbage-collection-for-hbase
● http://gceasy.io/
Thank You
• Questions
• Get in touch with us:
www.clairvoyantsoft.com
Contact Us
CHANDLER, AZ
SEATTLE, WA
DALLAS, TX
BOSTON, MA
PUNE, INDIA
+1 (623) 282 2385
Nithya Koka
@nithya_koka
https://www.linkedin.com/in/nithyakoka
6185 W Detroit St. Chandler, AZ
Michael Arnold
@hadoopgeek
https://www.linkedin.com/in/michaelarnold