Insight Case Studies
Tuning the Beloved DB-Engines
Presented By Nithya Koka and Michael Arnold
Who is Nithya Koka?
● Senior Hadoop Administrator on numerous Insight projects:
  ○ Project Lead
  ○ Client Engagement
  ○ On-Call Engineer
  ○ Cluster Ninja
● 5+ years in IT - 4 years with Hadoop
Who is Michael Arnold?
● Principal Systems Engineer
● Automation geek
● 20+ years in IT - 9 years with Hadoop
● I help people deal with:
○ Servers (physical and virtual)
○ Networks
○ Server operating systems
○ Hadoop distributions
○ Making it all run smoothly
Agenda
● Impala Tuning Case Study
● HBase Tuning Case Study
Impala Tuning Case Study
Impala Tuning
Case Study: ClientA Impala Woes
1. Impala threads peak, crash the daemon, and all queries hang, causing a complete outage for end users. This has been happening over:
   ○ 2 years, on and off
   ○ Multiple support tickets
   ○ Several tuning attempts
   No trend in host or timeframe for when these incidents occur.
2. Impala queries in Hue error out with "expired results" messages.
Impala Tuning
Initial Insight Evaluation
Gotchas Captured:
● Role layout: overburdened "master hosts"
● Buggy RHEL kernel (Linux 2.6.32-504.3.3.el6.x86_64)
● Multiple Java versions
● Default swappiness
● Transparent hugepages enabled
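The last two gotchas are quick to verify from a shell. A minimal sketch of the checks, with commonly recommended Hadoop target values (the values are general guidance, not from this deck):

sysctl vm.swappiness   # RHEL 6 defaults to 60; a value of 1 is commonly recommended on Hadoop nodes
cat /sys/kernel/mm/transparent_hugepage/enabled   # [never] is commonly recommended; on RHEL 6 the path is /sys/kernel/mm/redhat_transparent_hugepage/enabled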
Impala Tuning
Impala Threads
Typical Incident Pattern
[Chart slides: Impala daemon thread counts spiking during a typical incident]
Impala Tuning
Impala Threads: Deep Dive
1. Potential disk errors in dmesg output on incident-prone hosts.
2. JVM crashes reported by Impala.
3. HDFS file count snowballing.
Impala Tuning
[Chart: HDFS file count snowballing from 750K to 1.15 million files]
Impala Tuning
Impala Threads: Deep Dive
1. Disk Errors
● Without spill directories configured, scratch space was defaulting to /tmp/impala-scratch, which was unsuitable for the scale and concurrency.
Resolution:
● Spread the disk spill across the data drives.
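A minimal sketch of the fix via the impalad startup flag, assuming a typical /data/N mount layout (the paths are hypothetical); Cloudera Manager exposes this as the Impala Daemon Scratch Directories setting:

-scratch_dirs=/data/1/impala/scratch,/data/2/impala/scratch,/data/3/impala/scratch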
Impala Tuning
Impala Threads : Deep Dive
1. Disk Errors
● Identified a bad RAID controller: three problem disks on a master host hosting a RAID10 virtual disk for the NameNode, a RAID1 virtual disk for the JournalNode, and another RAID1 virtual disk for ZooKeeper.
Resolution:
● The host was decommissioned, the bad disks were replaced, and the host was brought back in a good state.
● Regular scans with the RAID controller CLI have been set up to alert on any future incidents.
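A hedged sketch of such a scan, assuming an LSI/MegaRAID controller managed with MegaCli; the CLI path, output format, and alerting hook all vary by vendor:

# hourly cron entry (hypothetical): log any drive reporting media errors or predicted failure
0 * * * * /opt/MegaRAID/MegaCli/MegaCli64 -PDList -aALL | grep -Ei 'media error count: [1-9]|predictive failure count: [1-9]' >> /var/log/raid-check.log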
Impala Tuning
Impala Threads: Deep Dive
2. JVM Crashes Reported by Impala
● The running OS kernel version is known to cause CDH applications to pause, resulting in the JVM hangs seen in the Impala reports.
Resolution:
● Upgrade the kernel to version 2.6.32-504.16.2.el6 or later.
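A minimal sketch of checking and remediating an affected RHEL 6 host:

uname -r            # confirm whether the host is still on the buggy 2.6.32-504.3.3.el6 kernel
yum update kernel   # install 2.6.32-504.16.2.el6 or later, then reboot into it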
Impala Tuning
Impala Threads: Deep Dive
3. The small files problem:
● Parquet files on the order of kilobytes, which led to slow I/O throughput.
● Coordinator and executor connections fail due to high scan times from the NameNode.
● The failed executor connections kick off more threads, which add up very quickly and crash the daemon.
Resolution:
● By rewriting the Parquet compaction to use dynamic partitions, the client produced 1 file in place of 29, significantly reducing the overall file count.
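A hedged sketch of such a compaction pass in Impala SQL (the table and column names are hypothetical); SET NUM_NODES=1 is a standard Impala query option that funnels the write through a single node so each partition is written as few files as possible:

impala-shell -q "
SET NUM_NODES=1;
INSERT OVERWRITE events_compacted PARTITION (event_date)
SELECT col_a, col_b, event_date FROM events_small_files;
"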
Impala Tuning
Impala Threads: Deep Dive
Tuning for Scale
● Since Impala 2.9, we can assign Impala Daemons as dedicated query coordinators or query executors.
● These two roles can now be tuned according to their responsibilities, giving us more flexibility.
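A minimal sketch of the corresponding impalad startup flags introduced in Impala 2.9:

-is_coordinator=true -is_executor=false    # dedicated coordinator
-is_coordinator=false -is_executor=true    # dedicated executor

By default both flags are true, which is the pre-2.9 combined behavior.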
Impala Tuning
Impala Threads: Deep Dive
Tuning for Scale
Coordinators:
● Perform the network communication to keep metadata up to date and to route query results to the appropriate clients.
● Experience significant network and CPU overhead with queries containing a large number of query fragments.
● Need large JVM heap for caching metadata for all table partitions and data files.
Impala Tuning
Impala Threads: Deep Dive
Tuning for Scale
Executors:
● Need only the default JVM heap, leaving more memory available to process CPU-intensive joins, aggregations, and other operations.
● Executors perform I/O intensive scans.
Impala Tuning
Impala Threads: Deep Dive
Tuning for Scale
Coordinators: How many? [Our cluster: 3]
● Small is good (a minimum of 1 dedicated)
● Considerations: number of Impala Daemons, DDL queries, average query resource usage at various stages
Where do they go? [Our cluster: Utility hosts]
● Coordinators can go on non-worker hosts.
● Avoid losing out on resources, memory, or disk.
Impala Tuning
Choosing the right Load-Balancing Algorithm for High Availability through a proxy.
LeastConn:
High Availability
What? Connects sessions to the coordinator with the fewest connections, to balance the load evenly.
When? Many independent, short-running queries.
Where? Recommended for Impala with F5.
Impala Tuning
Choosing the right Load-Balancing Algorithm for High Availability through a proxy.
RoundRobin:
High Availability
What? Distributes connections across all coordinator nodes; a list of servers with weight parameters can be added to define the distribution.
When? For predictable, stable balancing; requires benchmarking and load testing.
Where? Not recommended by Cloudera for Impala.
Impala Tuning
Choosing the right Load-Balancing Algorithm for High Availability through a proxy.
Source Persistence:
High Availability
What? The source IP address is hashed and divided by the total weight of the running servers to determine which server will receive the request.
When? Impala workloads containing a mix of queries and DDL statements, such as CREATE TABLE and ALTER TABLE.
Where? It is required for setting up high availability with Hue.
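A hedged sketch of how the three algorithms map onto an HAProxy configuration for Impala (hostnames are hypothetical; 21000 is the impala-shell port and 21050 the JDBC/Hue port):

listen impala-shell
    bind :21000
    mode tcp
    balance leastconn          # or: balance roundrobin
    server impalad1 host1.example.com:21000 check
    server impalad2 host2.example.com:21000 check

listen impala-jdbc-hue
    bind :21050
    mode tcp
    balance source             # session persistence, as required for Hue
    server impalad1 host1.example.com:21050 check
    server impalad2 host2.example.com:21050 check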
HBase Tuning Case Study
HBase Tuning
Case Study: ClientB OpenTSDB Platform Upgrade
● Client wanted to upgrade from a manually installed HBase environment to the Cloudera distribution's HBase.
● New hardware with a much larger RAM footprint.
● SSDs, because, why not? (And not important to this tuning.)
HBase Tuning
Initial Insight Evaluation
Gotchas Captured:
● None, really. It is not installed yet, but we will need to tune HBase to utilize a lot more memory.
HBase Tuning
Use the Java Development Kit (JDK) version 8.
Java
HBase Tuning
Enable garbage collection (GC) logging.
Java
-XX:+PrintGC -XX:+PrintGCDetails -XX:+PrintGCDateStamps -XX:+PrintGCTimeStamps -XX:+PrintAdaptiveSizePolicy -XX:+PrintReferenceGC -XX:+PrintFlagsFinal -Xloggc:/var/log/hbase/regionserver-gc.log
HBase Tuning
Enable garbage collection (GC) log rotation.
Java
-XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=10 -XX:GCLogFileSize=200M
HBase Tuning
Enable the G1GC garbage collector for the RegionServer.
Java
-XX:+UseG1GC -XX:MaxGCPauseMillis=100
https://www.oracle.com/technetwork/java/javase/tech/g1-intro-jsp-135488.html
HBase Tuning
Tune G1GC.
Java
-XX:+ParallelRefProcEnabled -XX:-ResizePLAB -XX:ParallelGCThreads=<8 + (logical processors − 8) × 5/8> -XX:+UnlockExperimentalVMOptions -XX:G1NewSizePercent=3
https://www.oracle.com/technetwork/articles/java/g1gc-1984535.html
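A worked example of the thread-count formula above: on a host with 40 logical processors, ParallelGCThreads = 8 + (40 − 8) × 5/8 = 8 + 20 = 28, so the flag becomes -XX:ParallelGCThreads=28.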
HBase Tuning
Where do the HBase GC settings go?
Configuration
Cloudera Manager: HBase -> Configuration -> SCOPE:RegionServer / CATEGORY:Advanced / Java Configuration Options for HBase RegionServer
Ambari: Service/HBase/Configs -> CONFIGS / ADVANCED / Advanced hbase-env / hbase-env template
HBase Tuning
Increase the Java Heap of the HBase RegionServer.
Java
CM: Java Heap Size of HBase RegionServer in Bytes: 31 GiB
Ambari: HBase RegionServer Maximum Memory: 31 GiB
Never set the heap size to a value between 32 and 48 GiB: above roughly 32 GiB the JVM can no longer use compressed ordinary object pointers (oops), so a heap in that range actually holds fewer objects than a 31 GiB heap.
https://blog.codecentric.de/en/2014/02/35gb-heap-less-32gb-java-jvm-memory-oddities/
HBase Tuning
Enable the HBase BucketCache.
HBase
RegionServer Advanced Configuration Snippet (Safety Valve) for hbase-site.xml:
hbase.bucketcache.ioengine: offheap
hbase.bucketcache.size: 32 GiB (or 96 GiB)
hfile.block.cache.size: 0.2
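A hedged sketch of the equivalent hbase-site.xml entries; note that hbase.bucketcache.size, when set to a number of 1.0 or greater, is read as megabytes (32 GiB = 32768, 96 GiB = 98304):

<property>
  <name>hbase.bucketcache.ioengine</name>
  <value>offheap</value>
</property>
<property>
  <name>hbase.bucketcache.size</name>
  <value>32768</value>
</property>
<property>
  <name>hfile.block.cache.size</name>
  <value>0.2</value>
</property>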
HBase Tuning
Enable the HBase BucketCache.
HBase
HBase Client Environment Advanced Configuration Snippet for hbase-env.sh:
HBASE_OFFHEAPSIZE=36G (or 100G)
HBASE_OPTS=-XX:MaxDirectMemorySize=36G (or 100G)
HBase Tuning
Enable HBase MultiWAL support.
HBase
hbase.wal.provider: Multiple HDFS WAL
hbase.wal.regiongrouping.numgroups: (numDrives/3)
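As a worked example of the formula above: on a RegionServer with 12 data drives, hbase.wal.regiongrouping.numgroups = 12 / 3 = 4.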
HBase Tuning
Enable HDFS hedged reads.
HDFS
dfs.client.hedged.read.threadpool.size: 20
dfs.client.hedged.read.threshold.millis: 500 milliseconds
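A minimal sketch of the same settings as hdfs-site.xml entries, applied to the HDFS client configuration that the RegionServers use:

<property>
  <name>dfs.client.hedged.read.threadpool.size</name>
  <value>20</value>
</property>
<property>
  <name>dfs.client.hedged.read.threshold.millis</name>
  <value>500</value>
</property>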
References
● https://impala.apache.org/docs/build/html/topics/impala_scalability.html
● https://impala.apache.org/docs/build/html/topics/impala_partitioning.html
● https://impala.apache.org/docs/build/html/topics/impala_proxy.html
● https://software.intel.com/en-us/blogs/2014/06/18/part-1-tuning-java-garbage-collection-for-hbase
● http://gceasy.io/
Thank You
• Questions
• Get in touch with us:
www.clairvoyantsoft.com
Contact Us
CHANDLER, AZ
SEATTLE, WA
DALLAS, TX
BOSTON, MA
PUNE, INDIA
+1 (623) 282 2385
Nithya Koka
@nithya_koka
https://www.linkedin.com/in/nithyakoka
6185 W Detroit St. Chandler, AZ
Michael Arnold
@hadoopgeek
https://www.linkedin.com/in/michaelarnold