Integrating Structure and Analytics with Unstructured Data
Transcript of Integrating Structure and Analytics with Unstructured Data
© 2014 IBM Corporation 1
Making sense of data
Integrating Structure and Analytics with Unstructured Data
John Park, Lead Product Manager for dashDB
Disclaimer
© Copyright IBM Corporation 2014. All rights reserved. U.S. Government Users Restricted Rights - Use, duplication or disclosure restricted by GSA ADP Schedule Contract with IBM Corp. THE INFORMATION CONTAINED IN THIS PRESENTATION IS PROVIDED FOR INFORMATIONAL PURPOSES ONLY. WHILE EFFORTS WERE MADE TO VERIFY THE COMPLETENESS AND ACCURACY OF THE INFORMATION CONTAINED IN THIS PRESENTATION, IT IS PROVIDED “AS IS” WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED. IN ADDITION, THIS INFORMATION IS BASED ON IBM'S CURRENT PRODUCT PLANS AND STRATEGY, WHICH ARE SUBJECT TO CHANGE BY IBM WITHOUT NOTICE. IBM SHALL NOT BE RESPONSIBLE FOR ANY DAMAGES ARISING OUT OF THE USE OF, OR OTHERWISE RELATED TO, THIS PRESENTATION OR ANY OTHER DOCUMENTATION. NOTHING CONTAINED IN THIS PRESENTATION IS INTENDED TO, NOR SHALL HAVE THE EFFECT OF, CREATING ANY WARRANTIES OR REPRESENTATIONS FROM IBM (OR ITS SUPPLIERS OR LICENSORS), OR ALTERING THE TERMS AND CONDITIONS OF ANY AGREEMENT OR LICENSE GOVERNING THE USE OF IBM PRODUCTS AND/OR SOFTWARE.
IBM's statements regarding its plans, directions, and intent are subject to change or withdrawal without notice at IBM's sole discretion. Information regarding potential future products is intended to outline our general product direction and it should not be relied on in making a purchasing decision. The information mentioned regarding potential future products is not a commitment, promise, or legal obligation to deliver any material, code or functionality. Information about potential future products may not be incorporated into any contract. The development, release, and timing of any future features or functionality described for our products remains at our sole discretion.
IBM, the IBM logo, ibm.com, Information Management, DB2, DB2 Connect, DB2 OLAP Server, pureScale, System Z, Cognos, solidDB, Informix, Optim, InfoSphere, and z/OS are trademarks or registered trademarks of International Business Machines Corporation in the United States, other countries, or both. If these and other IBM trademarked terms are marked on their first occurrence in this information with a trademark symbol (® or ™), these symbols indicate U.S. registered or common law trademarks owned by IBM at the time this information was published. Such trademarks may also be registered or common law trademarks in other countries. A current list of IBM trademarks is available on the Web at “Copyright and trademark information” at www.ibm.com/legal/copytrade.shtml Other company, product, or service names may be trademarks or service marks of others.
Gordon Moore stated, “The future of integrated electronics is the future of electronics itself. The advantages of integration will bring about a proliferation of electronics, pushing this science into many new areas. Integrated circuits will lead to such wonders as home computers–or at least terminals connected to a central computer–automatic controls for automobiles, and personal portable communications equipment.”
John Park (who incidentally is NOT as smart as Gordon Moore, nor as successful) postulates:
“Just as efficiencies in electronics have made electronics more powerful, accessible and consumable, technology itself benefits from this ‘Moore’s law’: efficiencies and innovation in software lead to more powerful, accessible and consumable technologies.”
Let's Level Set on DBs
Larry Ellison is still rich (Forbes March 2015)
Big Data is still a big opportunity and reality. (insert some ridiculous statistic here)
Innovation & technology make data easier to obtain, analyze and consume
Innovation is driving down TCO and changing the DBMS landscape
Commoditization of technology is driving a new DBMS landscape
Emerging DBMS technologies are replacing today's infrastructure
What database technologies are driving the most rapid change?
o Database as a Service
o In Memory
o Document Store
o Hadoop
o Graph
o Table / Time Series
Database as a Service
What is it?
o Cloud hosted and provider provisioned
o Scalable
o Multi-tenant and dedicated
o Included services (administration, monitoring, security)
o Abstraction layer over operational functions, so the focus stays on user end points
Who's playing in this space?
o 1010data.com, Microsoft Azure SQL Database, Amazon DynamoDB, Amazon Redshift
Factoids
o Great for application development, sandbox and POC
o NoSQL has gained traction for production purposes, specifically in the mobile / Web 2.0 arena
o SQL / ACID used for analytics processing, consolidation and data mart use cases
In Memory
o Moves the database into memory either partially or in totality
o Supports both transactional and analytical processing
o In-memory is lending itself to the creation of Hybrid Transactional Analytical Processing (HTAP)
o SAP, Oracle, IBM, VoltDB, Kognitio, ParStream
o If HTAP becomes a reality, real-time analysis could be supported
o Workload-optimized in-memory DBMS are replacing traditional RDBMS systems
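As a rough illustration of the in-memory and HTAP ideas above, here is a minimal sketch using Python's built-in sqlite3 module with an in-memory database. SQLite is only a stand-in and is not one of the products named on this slide; the orders table and its values are made up for illustration. The point is simply that transactional writes and an analytical query run against the same data held in memory.

```python
import sqlite3

# ":memory:" keeps the whole database in RAM; nothing is written to disk.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, region TEXT, amount REAL)"
)

# Transactional side: a small batch of writes committed atomically.
# (Hypothetical sample data, not from the presentation.)
with conn:
    conn.executemany(
        "INSERT INTO orders (region, amount) VALUES (?, ?)",
        [("east", 120.0), ("west", 80.0), ("east", 30.0)],
    )

# Analytical side: an aggregate over the same live data, with no
# separate copy or load step in between.
totals = dict(
    conn.execute("SELECT region, SUM(amount) FROM orders GROUP BY region")
)
print(totals)
```

A production HTAP system adds much more (concurrency control, columnar acceleration, durability), so this should be read as a toy model of the idea rather than a description of any vendor's design.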
Document Store
o Supports JSON or XML
o NoSQL
o Basically Available, Soft state, Eventual consistency (BASE), NOT ACID
o JSON datatype support in traditional RDBMS (IBM, Teradata)
o Open source options as well: MongoDB and CouchDB
o Perfect for Web 2.0 and gaming applications
o Technology startups are adopting the technology due to ease of scalability and speed of deployment
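To make the document-store idea above more concrete, the following is a minimal sketch of a schemaless collection in plain Python. The DocumentStore class, its insert and find methods, and the sample game documents are all hypothetical and exist only for this illustration; they are not the API of MongoDB, CouchDB, or any other product. The sketch shows only schema flexibility (documents with different shapes in one collection) and does not model BASE or eventual consistency.

```python
import json


class DocumentStore:
    """Toy in-memory document collection (hypothetical, illustration only)."""

    def __init__(self):
        self._docs = {}
        self._next_id = 1

    def insert(self, doc):
        # Round-trip through JSON so only JSON-serializable documents are
        # accepted and the store keeps its own copy of the data.
        doc_id = self._next_id
        self._docs[doc_id] = json.loads(json.dumps(doc))
        self._next_id += 1
        return doc_id

    def find(self, **criteria):
        # Return documents whose top-level fields equal every criterion.
        return [
            doc for doc in self._docs.values()
            if all(doc.get(key) == value for key, value in criteria.items())
        ]


store = DocumentStore()
# Documents in the same collection do not need to share a schema.
store.insert({"player": "ana", "level": 7, "inventory": ["sword", "map"]})
store.insert({"player": "raj", "level": 7})
store.insert({"player": "li", "guild": {"name": "north"}})

level7 = [doc["player"] for doc in store.find(level=7)]
print(level7)
```

No table definition or migration was needed to add the inventory or guild fields, which is the kind of flexibility that makes this model attractive for fast-moving Web and gaming applications.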
What are these technologies causing? Two “..zation”s:
o Modernization
o Consolidation
As organizations ask how to leverage data from their current infrastructure
How to control the sprawl of desktop DBMS
How to collate and aggregate large data sets
How to control license and capital costs
Technology forces Modernization
Replacing old technology with new: XML to JSON, row-based databases to columnar for analytics
Moving to Open Source Databases and Cloud based infrastructure
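The move from row-based to columnar storage for analytics can be sketched in a few lines of plain Python. The records and field names below are invented for illustration, and real columnar engines add compression and vectorized execution that this sketch does not attempt to show.

```python
# Row-oriented layout: all fields of a record are stored together.
# (Hypothetical sample data, not from the presentation.)
rows = [
    {"id": 1, "region": "east", "amount": 120.0},
    {"id": 2, "region": "west", "amount": 80.0},
    {"id": 3, "region": "east", "amount": 30.0},
]

# Column-oriented layout: each column is stored as its own list.
columns = {key: [row[key] for row in rows] for key in rows[0]}

# A row store walks every record to aggregate a single field...
total_from_rows = sum(row["amount"] for row in rows)

# ...while a column store reads only the one column the query needs.
total_from_columns = sum(columns["amount"])

print(total_from_rows, total_from_columns)
```

Both layouts hold the same data and give the same answer; the difference is how much data an analytic query has to touch, which is why columnar layouts tend to suit aggregation-heavy workloads.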
Technology forces Consolidation
Appliances to move data to "one source of truth"
HTAP systems (hybrid transactional analytical systems)
Generation D driving Modernization & Consolidation
There are four distinct approaches to data and analytics… and one group of enterprises uses insights, cloud, and data very differently
Generation D enterprises are:
Data and analytics
o 3x more likely to excel at developing insights regarding their customers and marketplace
o 2x more likely to automate processes and decisions based on insights from analytics
Cloud
o 2x more likely to believe cloud is transforming their business model
Engagement
o 2x more likely to engage customers via digital channels (mobile and social)
[Figure: four approaches plotted by analytic maturity against data breadth and sophistication: Traditional; Analytically ambitious; Data-rich, analytically enabled; Generation D: Data-rich, analytically driven. Segment shares shown: 19%, 31%, 21%, 29%.]
Generation D enterprises are extremely effective at addressing business challenges
[Chart: Generation D versus all others on developing new revenue streams; penetrating new markets; improving interactions with customers; operating efficiently; managing risk; responding to security threats; faster time to market. Generation D leads by 2.4x to 3.7x.]
Generation D enterprises use data and analytics throughout the business
[Chart: Generation D versus all others on educating employees on the use of data and analytics; providing data and analytics in real time; extensively sharing data and analytics internally. Generation D leads by 1.5x to 2.7x.]
Generation D enterprises view cloud as key to enabling transformation
[Chart: Generation D versus all others on believing cloud is transforming their business model; using cloud for analytics; using cloud for data management; using API-based services. Generation D leads by 1.8x to 2.5x.]
Competing faster on the cloud
• A real estate company wanted to collect and distribute property information among employees faster.
• Through a cloud database, the company gives employees the ability to sort through existing data and upload their own, such as photographs of properties and location information, while in the field.
• It now has a competitive advantage during time-sensitive biddings and can predict imminent vacancies.
Using APIs to speed performance
• A bank wanted to give its institutional clients access to some of its efficient, in-house capabilities.
• The bank offers API services over the cloud which provide access to proprietary platforms and stream real-time data.
• Institutional clients can now view real-time information (e.g., exchange rates) and perform actions such as foreign exchange transactions more easily.
How would you answer Generation D’s call?
Improving the customer experience with real-time data
• A trucking company wanted to revamp its communication with drivers on the road.
• It equips all of the company’s trucks with a telematics system that logs GPS, engine use, and speed/braking data.
• The real-time information allows the company to provide updated quotes and delivery status while optimizing fuel costs and reducing its mobile units by 60%.
[Figure: progression from Traditional, to Analytically ambitious, to Data-rich, analytically enabled, to Generation D: Data-rich, analytically driven]
Generation D is:
• Infusing the majority of processes and decisions with analytics
• Tackling complex data sources and applying more predictive and prescriptive analytics
• Managing more of their data and analytics on the cloud
• Moving toward mobile and social as their primary methods of engaging customers
• Changing their culture, not just their technology
One more week to Enterprise Data World 2015! Drop by Booth #306. Let's talk.
John Park – [email protected]
Merci :)