Big Data: Technical Introduction to BigSheets for InfoSphere BigInsights
Introducing BigSheets
Spreadsheet-Style Tool
for IBM InfoSphere BigInsights
Cynthia M. Saracco
Senior Solution Architect
IBM Silicon Valley Lab
What is BigSheets?
• Browser-based analytics tool for business users.
Why BigSheets?
• Business users need a non-technical approach for analyzing Big Data.
• Translating untapped data into actionable business insights is a common requirement.
• Visualizing and drilling down into enterprise and Web data promotes new business intelligence.
How can BigSheets help?
• Spreadsheet-like interface enables business users to gather and analyze data easily.
• Built-in “readers” can work with data in several common formats (JSON, CSV, TSV, …)
• Users can combine and explore various types of data to identify “hidden” insights.
© 2013 IBM Corporation
What you can do with BigSheets
• Model “big data” collected from various sources in spreadsheet-like structures
• Filter and enrich content with built-in functions
• Combine data in different workbooks
• Visualize results through spreadsheets, charts
• Export data into common formats (if desired)
No programming knowledge needed!
Sample Scenario
Data gathering
• WebCrawler app
• DBMS import app
• BoardReader app
• Accelerators
• Flume
• Hadoop commands
• . . .
Data storage
• Distributed file system
• Web-based file browser and administration
Data exploration, manipulation, and analysis
• BigSheets
InfoSphere BigInsights
Blue italics = IBM technology
Technology
Working with BigSheets
• Create workbook (spreadsheet-style structure) to model target data
• Customize workbook through graphical editor and built-in functions
– Filter data
– Manipulate data (e.g., concatenate fields)
– Combine data from multiple workbooks
• “Run” workbook: apply work to full data set
• Explore results in spreadsheet format and/or create charts
• Optionally, export your data
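BigSheets performs these steps through its graphical editor, with no programming required. Purely as an analogy, the filter, manipulate, and combine operations can be pictured in plain Python. The sample rows, field names, and helper functions below are hypothetical and are not part of BigSheets.

```python
# Conceptual analogy only: BigSheets does this work through its graphical editor.
# All data, field names, and helper names here are hypothetical.

posts = [
    {"id": 1, "first": "Ann", "last": "Lee", "lang": "en"},
    {"id": 2, "first": "Raj", "last": "Rao", "lang": "fr"},
    {"id": 3, "first": "Mia", "last": "Cho", "lang": "en"},
]
scores = [{"id": 1, "score": 0.9}, {"id": 3, "score": 0.4}]


def filter_rows(rows, predicate):
    """Keep only the rows for which the predicate is true (filter data)."""
    return [row for row in rows if predicate(row)]


def add_column(rows, name, formula):
    """Add a computed column to every row (manipulate data)."""
    return [{**row, name: formula(row)} for row in rows]


def combine(left, right, key):
    """Join two row sets on a shared key (combine data from two workbooks)."""
    index = {row[key]: row for row in right}
    return [{**row, **index[row[key]]} for row in left if row[key] in index]


english = filter_rows(posts, lambda r: r["lang"] == "en")
named = add_column(english, "full_name", lambda r: r["first"] + " " + r["last"])
result = combine(named, scores, "id")

for row in result:
    print(row["full_name"], row["score"])
```

Each helper returns a new list rather than changing its input, loosely mirroring how a workbook is first shaped on sample data and only then "run" against the full data set.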
What are Workbooks?
• Spreadsheet-like structures defined by user
• Based on data accessible in BigInsights
Creating a Workbook (one approach)
• From BigSheets tab of Web console, click New Workbook button
• Supply input
– Workbook name
– Source file (select from file system directory tree)
– Appropriate “reader” (data format translator)
  • Built-in readers for Web data, JSON, CSV, TSV, Hive, etc.
  • User-written plug-ins supported
• Save the workbook
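To make the idea of a "reader" as a data format translator concrete, here is a minimal Python sketch in which each reader turns raw text into the same row structure. It is an illustration only: the `READERS` table, the function names, and the assumption that JSON input holds one object per line are hypothetical and do not describe the BigSheets reader interface.

```python
import csv
import io
import json

# Illustration only: hypothetical readers, not the BigSheets reader interface.


def read_delimited(text, delimiter):
    """Parse delimited text with a header row into a list of dict rows."""
    return list(csv.DictReader(io.StringIO(text), delimiter=delimiter))


def read_json_lines(text):
    """Parse text holding one JSON object per line (an assumed layout)."""
    return [json.loads(line) for line in text.splitlines() if line.strip()]


# Map a format name to the function that translates it into rows.
READERS = {
    "csv": lambda text: read_delimited(text, ","),
    "tsv": lambda text: read_delimited(text, "\t"),
    "json": read_json_lines,
}

samples = {
    "csv": "name,city\nAnn,Paris\n",
    "tsv": "name\tcity\nAnn\tParis\n",
    "json": '{"name": "Ann", "city": "Paris"}\n',
}

rows_by_format = {fmt: READERS[fmt](text) for fmt, text in samples.items()}

for fmt, rows in rows_by_format.items():
    print(fmt, rows)
```

Whatever the source format, the rest of the workbook sees the same rows and columns, which is the point of selecting a reader when the workbook is created.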
Customizing a workbook
• Work with built-in editor
• Add / delete columns
• Filter data
• Specify formulas to compute new values using spreadsheet-style syntax
• Apply built-in or custom macro functions
– Supplied text analytic functions for popular business entities: person, location, phone number, etc.
• . . .
Visualizing results
• Built-in charting facility aids analysis
• Pie charts, bar charts, tag clouds, maps, etc.
• Hover over sections to reveal details
Exporting data
• Useful for sharing with downstream applications
• Several common formats supported
• Save to distributed file system or display in browser (Save As -> local file)
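As a rough picture of what exporting to a common format means for a downstream application, the sketch below serializes the same rows as CSV and as JSON in plain Python. The sample rows and function names are hypothetical; BigSheets performs its exports through the user interface.

```python
import csv
import io
import json

# Illustration only: hypothetical rows and helpers, not BigSheets code.
rows = [{"name": "Ann", "city": "Paris"}, {"name": "Raj", "city": "Pune"}]


def to_csv(rows):
    """Serialize dict rows as CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(rows[0]), lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


def to_json(rows):
    """Serialize dict rows as a JSON array."""
    return json.dumps(rows, indent=2)


csv_text = to_csv(rows)
json_text = to_json(rows)
print(csv_text)
print(json_text)
```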
On-demand videos
• Available on YouTube’s IBM Big Data Channel at http://www.youtube.com/user/ibmbigdata
• “Analyzing Social Media for IBM Watson”
• “Big Data Patent Analysis with BigSheets”
• “Big Data for Business Users”
• “BigSheets in Action”
• See the full list of videos at http://tinyurl.com/biginsights
Supplemental
Inspecting runtime statistics
Displaying the workflow diagram
Built-in text analysis functions
• Included with BigInsights Version 2.1
• BigSheets functions for extracting common business entities from text-based columns
– Address, EmailAddress, Country, Person, etc.
– Based on pre-built text extractor library provided with BigInsights
• Add Sheet -> Function -> Categories -> entities
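To suggest what extracting an entity from a text-based column involves, the sketch below uses a deliberately simple regular expression as a stand-in for an e-mail address extractor. It is not the BigInsights text extractor library, whose behavior is not described here; the pattern, column names, and sample rows are all hypothetical.

```python
import re

# Hypothetical stand-in for an entity extractor: a simple e-mail pattern.
# This is NOT the BigInsights text extractor library.
EMAIL_PATTERN = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9-]+(?:\.[A-Za-z0-9-]+)+")


def extract_emails(text):
    """Return every substring of the text that looks like an e-mail address."""
    return EMAIL_PATTERN.findall(text)


rows = [
    {"comment": "Contact ann@example.com or raj@example.org for details."},
    {"comment": "No address here."},
]

# Add a new column holding the entities found in the text column.
enriched = [{**row, "emails": extract_emails(row["comment"])} for row in rows]

for row in enriched:
    print(row["emails"])
```

The result is a new column derived from an existing text column, which is the same shape of outcome as applying an entity function to a sheet.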