Big Data Testing: Ensuring MongoDB Data Quality

Post on 01-Jul-2015


description

You've made the move to MongoDB for its flexible schema and querying capabilities in order to enhance agility and reduce costs for your business. Shouldn't your data quality process be just as organized and efficient? Using QuerySurge for testing your MongoDB data as part of your quality effort will increase your testing speed, boost your testing coverage (up to 100%), and improve the level of quality within your data warehouse. QuerySurge will help you keep your team organized and on track too!

Transcript of Big Data Testing: Ensuring MongoDB Data Quality

Webinar: automating the data testing

for

Bill Hayduk, CEO, President

RTTS

Jeff Bocarsly, PhD, VP & Chief Architect

RTTS

Presenters

Ensuring MongoDB Data Quality

built by

QuerySurge ™

About FACTS

Software: developer of QuerySurge

Founded: 1996

Locations:

New York (HQ), Atlanta, Philadelphia, Phoenix

Primary expertise: Software quality & testing services and solutions

Customer profile: Fortune 1000 & mid-size; > 600 customers

Strategic Partners: IBM, Microsoft, HP, Oracle, Teradata, HortonWorks

RTTS is the leading provider of software quality for critical business systems

about


What is MongoDB?

Name: MongoDB (from "humongous")

¹ NoSQL now means “not only SQL”
² 10gen changed its name to MongoDB, Inc.

Source: Wikipedia


• classified as a NoSQL¹ database
• does not implement the table-based relational db structure
• cross-platform document-oriented database
• makes the integration of data in certain types of apps easier & faster
• free and open source
• originally built by 10gen² and released in 2009
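To illustrate the difference between a document-oriented record and a table-based row, here is a minimal sketch that flattens a nested document into dotted column names, which is one common way to line a document up against a relational row for comparison. The `flatten` helper and the sample `order` document are hypothetical, made up for this illustration; they are not part of MongoDB or QuerySurge.

```python
def flatten(document, prefix=""):
    """Flatten a nested dict into a single-level dict with dotted keys.

    Hypothetical helper for illustration only (not a MongoDB or QuerySurge API).
    """
    row = {}
    for key, value in document.items():
        path = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            # Recurse into sub-documents, extending the dotted path.
            row.update(flatten(value, path))
        else:
            row[path] = value
    return row


# Made-up sample document, shaped the way a document database might store it.
order = {
    "_id": 1001,
    "customer": {"name": "Ann", "address": {"city": "New York"}},
    "total": 42.5,
}

flat = flatten(order)
for column, value in flat.items():
    print(column, "=", value)
```

Arrays inside a document are left as-is by this sketch; how to map them to rows is a separate design decision.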

“MongoDB is in 5th place as the most popular type of database management system, and 1st place for NoSQL database management systems.”

April 2014


MongoDB versus Hadoop

When to use MongoDB?
• Online real-time processing
• Data set is smaller
• Measured in milliseconds

When to use Hadoop?
• Offline big data processing
• Offline analytics
• Measured in minutes & hours

Source: classpattern.com

MongoDB Use Cases


Source: MongoDB, Inc.

Data Warehouse Batch Aggregation

ETL from MongoDB

ETL to MongoDB

Use Cases: Data Warehouse

Relational DB & Data Warehousing

Source Data


BI, Analytics & Reporting


Ingestion

Data Quality Issues


Data Quality Best Practices boost revenue by 66%.

The average organization loses $8.2 million annually through poor Data Quality.

46% of companies cite Data Quality as a barrier for adopting Business Intelligence products.

80% of organizations… will underestimate the costs related to the data acquisition tasks by an average of 50 percent.

News Headlines


Validating Data: 3 Big Issues

- need to verify more data and to do it faster

- need to automate the testing effort

- need to be able to test across different platforms

Need a testing tool!


What is QuerySurge ™?

a collaborative data testing tool that finds bad data & provides a holistic view of your data’s health

Data Testing


Improve the Health of your Data

With QuerySurge™ you can:

• Reduce your costs & risks
• Improve your data quality
• Accelerate your testing cycles
• Share information with your team


Teamwork

• Testers: functional testing, regression testing, result analysis
• Developers: unit testing, result analysis
• Data Analysts: review and analyze data, verify mappings & failures
• Operations teams: monitoring, result analysis
• Managers: oversight, result analysis

Share information on the health of your data


QuerySurge™ Architecture

Web-based

Installs on...

Linux

Connects to…


…any JDBC-compliant data source

Finding Bad Data

How?

• QS pulls data from data source(s)
• QS pulls data from data target(s)
• QS compares data in seconds
• QS generates reports, audit trails

[Figure: QuerySurge data flow; recoverable labels: SQL, reports]
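The pull-and-compare flow on this slide can be sketched in a few lines. This is only an illustration of the general technique (query a source, query a target, diff the result sets), not QuerySurge's implementation. It assumes two in-memory SQLite databases as stand-ins for the source and target; the table names, sample rows, and helper functions are all hypothetical.

```python
import sqlite3
from collections import Counter


def fetch_rows(conn, query):
    """Run a SQL query and return every row as a tuple."""
    return conn.execute(query).fetchall()


def compare_result_sets(source_rows, target_rows):
    """Compare two result sets as multisets, so duplicate rows are counted.

    Returns (missing, unexpected): rows in the source but not the target,
    and rows in the target but not the source.
    """
    source_counts = Counter(source_rows)
    target_counts = Counter(target_rows)
    missing = list((source_counts - target_counts).elements())
    unexpected = list((target_counts - source_counts).elements())
    return missing, unexpected


# Stand-in source and target stores (assumption: SQLite in memory).
source = sqlite3.connect(":memory:")
target = sqlite3.connect(":memory:")
source.execute("CREATE TABLE customers (id INTEGER, name TEXT)")
target.execute("CREATE TABLE dim_customer (id INTEGER, name TEXT)")
source.executemany(
    "INSERT INTO customers VALUES (?, ?)",
    [(1, "Ann"), (2, "Bob"), (3, "Cy")],
)
target.executemany(
    "INSERT INTO dim_customer VALUES (?, ?)",
    [(1, "Ann"), (2, "Bobby")],
)

# Pull from source, pull from target, then compare.
missing, unexpected = compare_result_sets(
    fetch_rows(source, "SELECT id, name FROM customers"),
    fetch_rows(target, "SELECT id, name FROM dim_customer"),
)

# A very small "report": list every mismatch found.
for row in missing:
    print("missing in target:", row)
for row in unexpected:
    print("unexpected in target:", row)
```

Comparing as multisets rather than sets is a deliberate choice here: it also catches rows that were duplicated or dropped during loading.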


Use Case: DWH & Automated Data Testing

Relational DB & Data Warehousing

Source Data


BI, Analytics & Reporting

Ingestion


Value-Add

QuerySurge provides value by either:

• an increase in testing data coverage from < 1% to upwards of 100%
• a decrease in testing time by as much as 1,000x
• a combination of an increase in test coverage and a decrease in testing time


Return on Investment

QuerySurge provides an increase in better data due to a shorter, more thorough testing cycle, saving $$$.


Pharmaceutical Organization Saves $288,000 in Clinical Trials Data Migration Testing Project

These savings do not include savings from avoiding fines from regulatory bodies or lawsuits.¹

¹ Since 2010, the pharmaceutical industry has been assessed over $13 billion in fines.

Source: Wikipedia, http://en.wikipedia.org/wiki/List_of_largest_pharmaceutical_settlements

Total Savings


Jeff Bocarsly, PhD, VP & Chief Architect

RTTS

• Release date: Sept 22nd

• We will email all registrants
• If you have an immediate need, contact us. We can set you up manually for a trial now.

release

Contact us if your team would like:

(1) a Trial in the Cloud, including a self-learning tutorial that works with sample data (3 days), or

(2) a downloaded Trial of QuerySurge, including a self-learning tutorial with sample data or your data (15 days), or

(3) a Proof of Concept of QuerySurge, including a kickoff & setup meeting and weekly meetings with our team of experts (30 days)

For more information, go here: http://www.querysurge.com/compare-trial-options


TRIAL IN THE CLOUD