Querying multiple distributed storage systems with Apache Hive robustly

Post on 12-Aug-2015



1© Cloudera, Inc. All rights reserved.

Querying multiple distributed storage systems with Apache Hive robustly

Ashish Singh | Software Engineer, Cloudera


Programming Model SQL


Storage Handler


Hive + Kafka = HiveKa


Project available on GitHub (https://github.com/HiveKa)


Demo Time


• Strict code review policies
• ~7600 upstream tests
• End-to-end tests: qTests


@Cloudera


• Believe in the open source community
• Invest heavily in improving upstream test infra
• Ptests to reduce turnaround time


But is that enough?


Integration Testing


Compatibility Testing


Scale Testing


Upgrade Testing


Random Query Generator


Thank you

Ashish Singh
asingh@cloudera.com
@singhasdev