Integrating Structure and Analytics with Unstructured Data

© 2014 IBM Corporation 1 Making sense of data Integrating Structure and Analytics with Unstructured Data John Park – Lead Product Manager for dashDB

Transcript of Integrating Structure and Analytics with Unstructured Data


Disclaimer

© Copyright IBM Corporation 2014. All rights reserved. U.S. Government Users Restricted Rights - Use, duplication or disclosure restricted by GSA ADP Schedule Contract with IBM Corp. THE INFORMATION CONTAINED IN THIS PRESENTATION IS PROVIDED FOR INFORMATIONAL PURPOSES ONLY. WHILE EFFORTS WERE MADE TO VERIFY THE COMPLETENESS AND ACCURACY OF THE INFORMATION CONTAINED IN THIS PRESENTATION, IT IS PROVIDED “AS IS” WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED. IN ADDITION, THIS INFORMATION IS BASED ON IBM'S CURRENT PRODUCT PLANS AND STRATEGY, WHICH ARE SUBJECT TO CHANGE BY IBM WITHOUT NOTICE. IBM SHALL NOT BE RESPONSIBLE FOR ANY DAMAGES ARISING OUT OF THE USE OF, OR OTHERWISE RELATED TO, THIS PRESENTATION OR ANY OTHER DOCUMENTATION. NOTHING CONTAINED IN THIS PRESENTATION IS INTENDED TO, NOR SHALL HAVE THE EFFECT OF, CREATING ANY WARRANTIES OR REPRESENTATIONS FROM IBM (OR ITS SUPPLIERS OR LICENSORS), OR ALTERING THE TERMS AND CONDITIONS OF ANY AGREEMENT OR LICENSE GOVERNING THE USE OF IBM PRODUCTS AND/OR SOFTWARE.

IBM's statements regarding its plans, directions, and intent are subject to change or withdrawal without notice at IBM's sole discretion. Information regarding potential future products is intended to outline our general product direction and it should not be relied on in making a purchasing decision. The information mentioned regarding potential future products is not a commitment, promise, or legal obligation to deliver any material, code or functionality. Information about potential future products may not be incorporated into any contract. The development, release, and timing of any future features or functionality described for our products remains at our sole discretion.

IBM, the IBM logo, ibm.com, Information Management, DB2, DB2 Connect, DB2 OLAP Server, pureScale, System Z, Cognos, solidDB, Informix, Optim, InfoSphere, and z/OS are trademarks or registered trademarks of International Business Machines Corporation in the United States, other countries, or both. If these and other IBM trademarked terms are marked on their first occurrence in this information with a trademark symbol (® or ™), these symbols indicate U.S. registered or common law trademarks owned by IBM at the time this information was published. Such trademarks may also be registered or common law trademarks in other countries. A current list of IBM trademarks is available on the Web at “Copyright and trademark information” at www.ibm.com/legal/copytrade.shtml Other company, product, or service names may be trademarks or service marks of others.


Gordon Moore stated, “The future of integrated electronics is the future of electronics itself. The advantages of integration will bring about a proliferation of electronics, pushing this science into many new areas. Integrated circuits will lead to such wonders as home computers–or at least terminals connected to a central computer–automatic controls for automobiles, and personal portable communications equipment.”

John Park (who incidentally is NOT as smart as Gordon Moore, nor as successful) postulates:

“Just as efficiencies in electronics have made electronics more powerful, accessible and consumable, technology itself benefits from this “Moore’s law”: efficiencies and innovation in software lead to more powerful, accessible and consumable technologies.”

Let's Level Set on DBs

Larry Ellison is still rich (Forbes March 2015)

Big Data is still a big opportunity and reality. (insert some ridiculous statistic here)

Innovation & technology make data easier to obtain, analyze and consume

Innovation is driving down TCO and changing the DBMS landscape

Commoditization of technology is driving a new DBMS landscape

Emerging DBMS technologies are replacing today's infrastructure

What database technologies are driving the most rapid change?

o  Database as a Service

o  In Memory

o  Document Store

o  Hadoop

o  Graph

o  Table / Time Series

Database as a Service

What is it?

o  Cloud hosted and provider provisioned

o  Scalable

o  Multi tenant and dedicated

o  Included services (administration, monitoring, security)

o  Abstraction layer of operational functions – focus on user end points

Who’s playing in this space?

o  1010data.com, Microsoft Azure SQL Database, Amazon DynamoDB, Amazon Redshift

Factoids

o  Great for application development, sandbox and POC

o  NoSQL has gained traction for production purposes, specifically in the mobile / Web 2.0 arena

o  SQL / ACID used for analytics processing, consolidation and data mart use cases

In Memory

o  Moves the database into memory, either partially or in totality

o  Supports both transactional and analytical processing

o  In memory is lending itself to the creation of Hybrid Transactional Analytical Processing (HTAP)

o  Vendors: SAP, Oracle, IBM, VoltDB, Kognitio, ParStream

o  If HTAP becomes a reality, real-time analysis could be supported

o  Workload-optimized in-memory DBMS are replacing traditional RDBMS systems
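To make the HTAP idea concrete, here is a minimal sketch in which one in-memory store serves both a transactional write path and an analytical query, with no copy or ETL step in between. It uses SQLite's in-memory mode purely as a stand-in: SQLite is not an HTAP product, and the `orders` table, its columns, and the helper names are assumptions made up for illustration.

```python
import sqlite3

# An in-memory SQLite database stands in for an in-memory DBMS.
# SQLite is NOT an HTAP system; table and column names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, region TEXT, amount REAL)"
)


def record_order(region, amount):
    """Transactional side: insert one order atomically."""
    with conn:  # commits on success, rolls back on an exception
        conn.execute(
            "INSERT INTO orders (region, amount) VALUES (?, ?)",
            (region, amount),
        )


def revenue_by_region():
    """Analytical side: aggregate over the same live data."""
    cursor = conn.execute(
        "SELECT region, SUM(amount) FROM orders GROUP BY region ORDER BY region"
    )
    return dict(cursor.fetchall())


record_order("east", 120.0)
record_order("west", 80.0)
record_order("east", 30.0)
print(revenue_by_region())
```

The point of the sketch is only that the aggregate sees each order as soon as it is committed; a real in-memory or HTAP engine adds the storage layout, concurrency control and durability that this toy leaves out.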

Document Store

o  Supports JSON or XML

o  NoSQL

o  Basically Available, Soft state, Eventual consistency (BASE), NOT ACID

o  JSON datatype support in traditional RDBMS (IBM, Teradata)

o  Open source options as well: MongoDB and CouchDB

o  Perfect for Web 2.0 and gaming applications

o  Technology startups are adopting the technology due to ease of scalability and speed to deployment
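As a rough illustration of the document model, the sketch below keeps JSON documents of differing shapes in one collection and queries them by field. It is a toy built on a Python dict and does not reproduce the API of MongoDB, CouchDB or any other product; the `put` and `find` helpers and the sample game records are hypothetical.

```python
import json

# A toy document collection: JSON strings keyed by document id.
# This mimics the data model only; it is not any real product's API.
collection = {}


def put(doc_id, document):
    """Store a document as serialized JSON."""
    collection[doc_id] = json.dumps(document)


def find(field, value):
    """Return documents whose top-level field equals value."""
    docs = (json.loads(raw) for raw in collection.values())
    return [doc for doc in docs if doc.get(field) == value]


# Documents in one collection need not share a schema.
put("p1", {"player": "ana", "level": 7, "items": ["sword", "shield"]})
put("p2", {"player": "bo", "level": 7})
put("p3", {"player": "cy", "guild": {"name": "red", "rank": 2}})

print([doc["player"] for doc in find("level", 7)])
```

Adding a new field to one document requires no schema change for the others, which is a large part of the "speed to deployment" appeal for web and gaming workloads.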

What are these technologies causing?

2 “..zation”s:

o  Modernization

o  Consolidation

As organizations ask:

How to leverage data from their current infrastructure

How to control the sprawl of desktop DBMS

How to collate and aggregate large data sets

How to control license and capital costs

Technology forces Modernization

Replacing old technology with new. XML to JSON. Row based databases to columnar for analytics.
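The move from row-based to columnar storage for analytics can be sketched with plain Python structures. The sample sales records and the `to_columns` helper are hypothetical; real columnar engines add compression and vectorized execution on top of this basic layout change.

```python
# Hypothetical sales records in a row-oriented layout.
rows = [
    {"id": 1, "region": "east", "amount": 120.0},
    {"id": 2, "region": "west", "amount": 80.0},
    {"id": 3, "region": "east", "amount": 30.0},
]


def to_columns(records):
    """Pivot a list of row dicts into a dict of column lists."""
    columns = {}
    for record in records:
        for name, value in record.items():
            columns.setdefault(name, []).append(value)
    return columns


columns = to_columns(rows)

# Row layout: every whole record is visited to total one field.
total_from_rows = sum(record["amount"] for record in rows)

# Column layout: the aggregate reads a single list and ignores the rest.
total_from_columns = sum(columns["amount"])

print(total_from_rows, total_from_columns)
```

Both totals are the same; the difference is that the columnar form touches only the `amount` values, which is why analytic scans over a few columns of a wide table tend to favor it.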

Moving to Open Source Databases and Cloud based infrastructure

Technology forces Consolidation

Appliances to move data to "one source of truth"

HTAP systems (hybrid transactional analytical processing systems)

Generation D driving Modernization & Consolidation

There are four distinct approaches to data and analytics, and one group of enterprises uses insights, cloud, and data very differently.

Generation D enterprises are:

Data and analytics

o  3x more likely to excel at developing insights regarding their customers and marketplace

o  2x more likely to automate processes and decisions based on insights from analytics

Cloud

o  2x more likely to believe cloud is transforming their business model

Engagement

o  2x more likely to engage customers via digital channels (mobile and social)

[Figure: the four approaches plotted by analytic maturity against data breadth and sophistication: Traditional; Analytically ambitious; Data-rich, analytically enabled; Generation D: Data-rich, analytically driven. Segment shares as extracted, without their segment pairing: 19%, 31%, 21%, 29%.]

Generation D enterprises are extremely effective at addressing business challenges

[Chart: Generation D versus all others across seven business challenges: Developing new revenue streams; Penetrating new markets; Improving interactions with customers; Operating efficiently; Managing risk; Responding to security threats; Faster time to market. Values as extracted, without their category pairing: Generation D 34%, 37%, 43%, 42%, 33%, 46%, 36%; all others 9%, 13%, 18%, 14%, 14%, 18%, 14%; ratios 3.7x, 2.5x, 2.9x, 2.4x, 3.0x, 2.6x, 2.5x.]

Generation D enterprises use data and analytics throughout the business

[Chart: Generation D versus all others on three practices: Educate employees on the use of data and analytics; Provide data and analytics in real time; Extensively share data and analytics internally. Values as extracted, without their category pairing: Generation D 85%, 60%, 70%; all others 31%, 40%, 43%; ratios 2.7x, 1.6x, 1.5x.]

Generation D enterprises view cloud as key to enabling transformation

[Chart: Generation D versus all others on four cloud measures: Believe cloud is transforming their business model; Use cloud for analytics; Use cloud for data management; Use API-based services. Values as extracted, without their category pairing: Generation D 60%, 56%, 54%, 53%; all others 31%, 23%, 30%, 29%; ratios 1.9x, 1.8x, 2.5x, 1.8x.]

Competing faster on the cloud

•  A real estate company wanted to collect and distribute property information between employees faster.

•  Through a cloud database, the company gives employees the ability to sort through existing data and upload their own, such as photographs of properties and location information, while in the field.

•  It now has a competitive advantage during time-sensitive biddings and can predict imminent vacancies.

Using APIs to speed performance

•  A bank wanted to give its institutional clients access to some of its efficient, in-house capabilities.

•  The bank offers API services over the cloud which provide access to proprietary platforms and stream real-time data.

•  Institutional clients can now view real-time information (e.g., exchange rates) and perform actions such as foreign exchange transactions more easily.

How would you answer Generation D’s call?

Improving the customer experience with real-time data

•  A trucking company wanted to revamp its communication with drivers on the road.

•  It equips all of the company’s trucks with a telematics system that logs GPS, engine use, and speed/braking data.

•  The real-time information allows the company to provide updated quotes and delivery status while optimizing fuel costs and reducing its mobile units by 60%.


[Figure: the four approaches: Traditional; Analytically ambitious; Data-rich, analytically enabled; Generation D: Data-rich, analytically driven.]

Generation D is:

o  Infusing the majority of processes and decisions with analytics

o  Tackling complex data sources and applying more predictive and prescriptive analytics

o  Managing more of their data and analytics on the cloud

o  Moving toward mobile and social as their primary methods of engaging customers

o  Changing their culture, not just their technology

One more week to Enterprise Data World 2015! Drop by Booth #306 and let's talk.

John Park – [email protected]

Merci :)