Accelerate Database Performance and Reduce Response Times in MongoDB...


Page 1: Accelerate Database Performance and Reduce Response Times in MongoDB …ww1.prweb.com/prfiles/2014/04/02/11730828/Nytro MegaRAID... · 2014-04-02 · White Paer The Rise of MongoDB

White Paper

The Rise of MongoDB

MongoDB, derived from “humongous”, is an open-source, document-oriented database and the leading NoSQL database system. A powerful and flexible data store that provides high performance, high availability and easy scalability, MongoDB has features that rival relational databases, including a shard-capable scale-out model; primary, secondary and geospatial indexing; and built-in support for MapReduce. MongoDB is used by thousands of companies worldwide to provide services for applications such as content management, Platform as a Service (PaaS), real-time monitoring, CRM, custom analytics stores, and many more. The MongoDB data model allows it to automatically split up data across multiple servers, providing easier scalability and high availability. By adding nodes to the cluster, administrators can easily scale capacity and performance. High availability (HA) is also achieved through its automatic failover feature, in which a secondary member of a replica set can automatically be promoted to primary in the event of a failure.

With MongoDB, a document is the basic unit of data (equivalent to a row in an RDBMS), a collection consists of one or more documents (similar to a table in an RDBMS) and a database consists of one or more collections. A single instance of MongoDB can host multiple independent databases. Documents of the same kind are typically grouped together in the same collection to provide data locality and create excellent opportunities for data tiering and caching. As the data set grows, it can be automatically sharded across the cluster for scalability (sharding, the process of storing data records across multiple machines, is MongoDB’s approach to meeting data growth demands). In addition, documents can be batch inserted (and removed) with minimal overhead, so it is important to build an optimal storage subsystem to ensure the fast loading of data. As is typical in most databases, there are regions of frequently accessed data that would benefit from low-latency flash caching. MongoDB by itself does not include caching or tiering features.
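To make the document, collection and database hierarchy and the idea of sharding concrete, the following minimal Python sketch models documents as dictionaries and spreads them across shards by hashing a key. It is an illustration only: `shard_for`, `distribute` and the sample `users` collection are hypothetical names, and the MD5-modulo scheme stands in for hash partitioning in general, not for MongoDB's actual sharding mechanism.

```python
import hashlib
from collections import defaultdict


def shard_for(key: str, num_shards: int) -> int:
    """Map a document key to a shard index by hashing (illustrative only)."""
    digest = hashlib.md5(key.encode("utf-8")).hexdigest()
    return int(digest, 16) % num_shards


def distribute(collection, num_shards):
    """Group the documents of one collection by their assigned shard."""
    shards = defaultdict(list)
    for doc in collection:
        shards[shard_for(doc["_id"], num_shards)].append(doc)
    return shards


# A document is a dict; a collection is a list of documents;
# a database maps collection names to collections.
users = [{"_id": f"user{i}", "visits": i} for i in range(1000)]
database = {"users": users}

# Spread the collection across three hypothetical shards.
shards = distribute(database["users"], 3)
for index in sorted(shards):
    print(f"shard {index}: {len(shards[index])} documents")
```

Because the assignment depends only on the key, any client can compute which shard holds a given document without consulting the others.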

Accelerate Performance in MongoDB Environments with the Nytro MegaRAID Flash Accelerator

Testing conducted by LSI compared a six-disk RAID 10 configuration with and without flash-based cache located on the disk controllers of each node in a 3-node replica set. An additional node was used as the workload driver so that the overhead of workload generation did not interfere with the nodes under test and introduce measurement noise. Each node ran CentOS 6.4 configured with 16GB of RAM, and the Nytro MegaRAID cache size was 44GB configured as a RAID 1 to ensure the highest data availability. Database performance was measured with a version of the Yahoo! Cloud Serving Benchmark (YCSB) forked specifically for MongoDB (version 0.1.4). The client was configured with the default write concern, which acknowledges writes to the primary server.

Summary

One of today’s growing database types is MongoDB, which supports applications such as real-time monitoring and content management.

The flash caching algorithm in Nytro MegaRAID technology allows hot data in large collections of databases to be quickly recognized as the data set changes, helping to provide consistent improvements in performance.

In tests conducted by LSI, MongoDB installations using a Nytro MegaRAID flash acceleration card doubled operations per second and reduced average latency by up to 60% for a uniform distribution of data.

Accelerate Database Performance and Reduce Response Times in MongoDB “Humongous” Environments with the LSI® Nytro™ MegaRAID® Flash Accelerator Card


All reads and writes from YCSB occur on the primary node. The database contains a single collection with 75 million documents and is 100GB in size. YCSB workloads B and F were used. Workload B is primarily read-based (95% reads and 5% writes) and simulates applications such as photo tagging or status updating. Workload F reads a record, modifies it and then writes it back, simulating an I/O pattern found in transactional databases.
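As a rough illustration of how the two workloads might be parameterized, the sketch below generates YCSB-style property text in Python. The property names are those of YCSB's core workload; the record count, thread count and uniform distribution come from this paper. The 50/50 read and read-modify-write split for workload F is taken from YCSB's stock workload F and is an assumption here, since the paper does not state the proportions used. The helper name `ycsb_properties` is hypothetical.

```python
def ycsb_properties(read, update, rmw, record_count=75_000_000, threads=8):
    """Render a YCSB core-workload property set as key=value text."""
    props = {
        "workload": "com.yahoo.ycsb.workloads.CoreWorkload",
        "recordcount": record_count,           # 75 million documents
        "threadcount": threads,                # 8 client threads
        "requestdistribution": "uniform",      # uniform key distribution
        "readproportion": read,
        "updateproportion": update,
        "scanproportion": 0,
        "insertproportion": 0,
        "readmodifywriteproportion": rmw,
    }
    return "\n".join(f"{key}={value}" for key, value in props.items())


# Workload B: 95% reads, 5% updates (as described in the text).
WORKLOAD_B = ycsb_properties(read=0.95, update=0.05, rmw=0)

# Workload F: read-modify-write; the 50/50 split is an assumption
# borrowed from YCSB's stock workload F definition.
WORKLOAD_F = ycsb_properties(read=0.5, update=0, rmw=0.5)

print(WORKLOAD_B)
print()
print(WORKLOAD_F)
```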

In order to measure steady-state performance, testing was performed in six consecutive 10-minute phases to obtain the IOPS and latency statistics (average, 95th percentile and 99th percentile) over each interval. This approach measures the impact of “warming up” the Nytro MegaRAID flash cache.
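The per-phase measurement described above can be sketched in Python as follows. This is a minimal illustration, not the tooling LSI used: it assumes each operation is recorded as a (seconds since test start, latency) pair, that latencies are in milliseconds (the paper does not state the unit), and it uses the nearest-rank definition of a percentile. The names `phase_stats` and `percentile` are hypothetical.

```python
import math

PHASE_SECONDS = 600   # each phase lasts 10 minutes
NUM_PHASES = 6        # six consecutive phases


def percentile(sorted_values, pct):
    """Nearest-rank percentile of an ascending, non-empty list."""
    rank = math.ceil(pct * len(sorted_values) / 100)
    return sorted_values[max(rank, 1) - 1]


def phase_stats(samples):
    """Summarize (timestamp_seconds, latency_ms) samples per phase.

    Returns one entry per phase; a phase with no samples yields None.
    """
    phases = [[] for _ in range(NUM_PHASES)]
    for timestamp, latency in samples:
        index = int(timestamp // PHASE_SECONDS)
        if 0 <= index < NUM_PHASES:
            phases[index].append(latency)

    stats = []
    for index, latencies in enumerate(phases):
        if not latencies:
            stats.append(None)
            continue
        latencies.sort()
        stats.append({
            "phase": index + 1,
            "ops_per_sec": len(latencies) / PHASE_SECONDS,
            "avg_ms": sum(latencies) / len(latencies),
            "p95_ms": percentile(latencies, 95),
            "p99_ms": percentile(latencies, 99),
        })
    return stats
```

Comparing the first phase with later ones shows how long the cache takes to warm up: once the statistics stop changing from phase to phase, the system can be treated as being in steady state.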

Test Results

The latency and operations per second statistics are shown in the charts below. Using the Nytro MegaRAID card’s automatic flash caching technology, the number of achieved operations per second improved by 117% for workload B and 111% for workload F, while response times were reduced by approximately 50%. The warm-up time (the amount of time it took to identify and begin caching hot data) was less than 10 minutes.

The LSI Nytro MegaRAID card offers MegaRAID data protection and flexible onboard flash caching technology that can be used in a variety of ways to help improve performance and server storage density.

Existing server deployments currently using a standard RAID controller or HBA have the potential for higher performance simply by replacing the existing card with a Nytro MegaRAID card.

[Chart: MongoDB YCSB B Workload, Uniform Distribution, 100G, 8 threads, replica set, 3 nodes. Operations per Second: HDD Only 399; HDD with Nytro MegaRAID 866.]


The following chart compares the average read latency for uniform distribution for flash cache-accelerated and hard disk drive (HDD) cases:

[Chart: MongoDB YCSB B Workload, Uniform Distribution, 100G, 8 threads, replica set, 3 nodes. Average Read Response Time: HDD Only 17.734; HDD with Nytro MegaRAID 7.984.]

The following chart compares the average update latency for uniform distribution for flash cache-accelerated and HDD cases:

[Chart: MongoDB YCSB B Workload, Uniform Distribution, 100G, 8 threads, replica set, 3 nodes. Average Update Response Time: HDD Only 50.94; HDD with Nytro MegaRAID 27.93.]


The following chart compares the operations per second for uniform distribution for flash cache-accelerated and HDD cases:

[Chart: MongoDB YCSB F Workload, Uniform Distribution, 100G, 8 threads, replica set, 3 nodes. Operations per Second: HDD Only 249; HDD with Nytro MegaRAID 525.]

The following chart compares the average read latency for uniform distribution for flash cache-accelerated and HDD cases:

[Chart: MongoDB YCSB F Workload, Uniform Distribution, 100G, 8 threads, replica set, 3 nodes. Average Read Response Time: HDD Only 20.9; HDD with Nytro MegaRAID 11.967.]


The following chart compares the average update latency for uniform distribution for flash cache-accelerated and HDD cases:

[Chart: MongoDB YCSB F Workload, Uniform Distribution, 100G, 8 threads, replica set, 3 nodes. Average Update Response Time: HDD Only 42.878; HDD with Nytro MegaRAID 21.698.]

Conclusion

Today’s megatrends, big data and cloud computing, are driving the adoption of NoSQL technologies as a viable alternative to relational databases when operating at scale. The ability to store schema-less data is an attractive solution for the variety, velocity and volume of data captured and processed today. These YCSB test results show that flash caching is a valuable and cost-effective way to accelerate MongoDB installations. The table below summarizes the test results as percentage improvements in operations per second and latency.

                                              Workload B       Workload F
                                              (Web Updates)    (Transactions)

Operations Per Second (% Improvement)         117%             111%

Average Read Response Time (% Reduction)      45%              50%

Average Update Response Time (% Reduction)    55%              43%
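The percentages can be recomputed from the values printed on the charts. The Python sketch below does this under one assumption: in each chart, the higher throughput and the lower latency belong to the Nytro MegaRAID configuration. The function names and the `charts` structure are hypothetical.

```python
def pct_improvement(baseline, accelerated):
    """Percentage increase of accelerated over baseline."""
    return (accelerated - baseline) / baseline * 100


def pct_reduction(baseline, accelerated):
    """Percentage decrease of accelerated relative to baseline."""
    return (baseline - accelerated) / baseline * 100


# (HDD only, HDD with Nytro MegaRAID) pairs read from the charts.
charts = {
    "B": {"ops": (399, 866), "read": (17.734, 7.984), "update": (50.94, 27.93)},
    "F": {"ops": (249, 525), "read": (20.9, 11.967), "update": (42.878, 21.698)},
}

results = {}
for workload, values in charts.items():
    results[workload] = {
        "ops_improvement": round(pct_improvement(*values["ops"])),
        "read_reduction": round(pct_reduction(*values["read"])),
        "update_reduction": round(pct_reduction(*values["update"])),
    }
    print(workload, results[workload])
```

Under these assumptions the throughput gains round to 117% and 111%, matching the table. The latency reductions round to 55% (read) and 45% (update) for workload B and to 43% (read) and 49% (update) for workload F, so the chart values line up with the table only if its read and update rows are exchanged; the source does not say which labeling is intended.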

The LSI Nytro MegaRAID card’s effective caching algorithm allows the hot data to be recognized quickly as the data set changes, helping to provide consistently improved performance. Data management structures such as indices can be retained in flash cache to allow even faster lookups and efficient access of document records.


For more information and sales office locations, please visit the LSI website at: www.lsi.com

LSI, the LSI & Design logo, and the Storage. Networking. Accelerated. tagline are trademarks or registered trademarks of LSI Corporation. All other brand or product names may be trademarks or registered trademarks of their respective companies.

LSI Corporation reserves the right to make changes to any products and services herein at any time without notice. LSI does not assume any responsibility or liability arising out of the application or use of any product or service described herein, except as expressly agreed to in writing by LSI; nor does the purchase, lease, or use of a product or service from LSI convey a license under any patent rights, copyrights, trademark rights, or any other of the intellectual property rights of LSI or of third parties.

Copyright ©2014 by LSI Corporation. All rights reserved. > 0214 WP0016.01

North American Headquarters, San Jose, CA. T: +1.866.574.5741 (within U.S.), T: +1.408.954.3108 (outside U.S.)

LSI Europe Ltd., European Headquarters, United Kingdom. T: [+44] 1344.413200

LSI KK Headquarters, Tokyo, Japan. T: [+81] 3.5463.7165


Test Disclosure

• MongoDB v2.4.8

• JDK 1.6.45 (Oracle)

• OS: CentOS release 6.5

• 4 machines, 1 workload driver, 3 replica nodes

- 16 cores, 64-bit, 16GB RAM

- Intel® Xeon® CPU E5-2650 0 @ 2.00GHz stepping 07, Sandy Bridge

- 10Gb dedicated Ethernet

• LSI Nytro MegaRAID flash accelerator card

- 88GB flash-based write-back cache

- 6x 1TB 7200 RPM SATA disks

• YCSB (Yahoo! Cloud Serving Benchmark) 0.1.4

- 75,000,000 records

- 8 threads

References

https://github.com/brianfrankcooper/YCSB/tree/master/mongodb

http://www.mongodb.org/