Big Data: Technical Introduction to BigSheets for InfoSphere BigInsights

Introducing BigSheets Spreadsheet-Style Tool for IBM InfoSphere BigInsights Cynthia M. Saracco Senior Solution Architect IBM Silicon Valley Lab

description

Introduces BigSheets, a spreadsheet-style tool for business users working with Big Data. BigSheets is part of IBM's InfoSphere BigInsights platform, which is based on open source technologies (e.g., Apache Hadoop) and IBM-specific technologies.

Transcript of Big Data: Technical Introduction to BigSheets for InfoSphere BigInsights

Page 1: Big Data:  Technical Introduction to BigSheets for InfoSphere BigInsights

Introducing BigSheets

Spreadsheet-Style Tool

for IBM InfoSphere BigInsights

Cynthia M. Saracco

Senior Solution Architect

IBM Silicon Valley Lab

Page 2: Big Data:  Technical Introduction to BigSheets for InfoSphere BigInsights

What is BigSheets?

• Browser-based analytics tool for business users.

Why BigSheets?

• Business users need a non-technical approach for analyzing Big Data.
• Translating untapped data into actionable business insights is a common requirement.
• Visualizing and drilling down into enterprise and Web data promotes new business intelligence.

How can BigSheets help?

• Spreadsheet-like interface enables business users to gather and analyze data easily.
• Built-in “readers” can work with data in several common formats (JSON, CSV, TSV, …)
• Users can combine and explore various types of data to identify “hidden” insights.

© 2013 IBM Corporation

Page 3: Big Data:  Technical Introduction to BigSheets for InfoSphere BigInsights

What you can do with BigSheets

• Model “big data” collected from various sources in spreadsheet-like structures
• Filter and enrich content with built-in functions
• Combine data in different workbooks
• Visualize results through spreadsheets, charts
• Export data into common formats (if desired)

No programming knowledge needed!

Page 4: Big Data:  Technical Introduction to BigSheets for InfoSphere BigInsights

Sample Scenario

Data gathering
• WebCrawler app
• DBMS import app
• BoardReader app
• Accelerators
• Flume
• Hadoop commands
• . . .

Data storage
• Distributed file system
• Web-based file browser and administration

Data exploration, manipulation, and analysis
• BigSheets

InfoSphere BigInsights

Blue italics = IBM technology

Page 5: Big Data:  Technical Introduction to BigSheets for InfoSphere BigInsights

Technology


Page 6: Big Data:  Technical Introduction to BigSheets for InfoSphere BigInsights

Working with BigSheets

• Create workbook (spreadsheet-style structure) to model target data
• Customize workbook through graphical editor and built-in functions
  – Filter data
  – Manipulate data (e.g., concatenate fields)
  – Combine data from multiple workbooks
• “Run” workbook: apply work to full data set
• Explore results in spreadsheet format and/or create charts
• Optionally, export your data
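
The workflow above can be pictured as a list of recorded steps that is previewed on a few rows and then applied to everything. The following is a minimal Python sketch of that idea only; it is not BigSheets code. It assumes the editor works against a subset of rows until the workbook is run, and every name in it (`keep_if`, `add_column`, `run`, the sample rows) is hypothetical.

```python
from itertools import islice
from typing import Callable, Iterable, Iterator

Row = dict[str, str]
Step = Callable[[Iterable[Row]], Iterator[Row]]


def keep_if(predicate: Callable[[Row], bool]) -> Step:
    """Build a step that keeps only the rows matching the predicate."""
    def step(rows: Iterable[Row]) -> Iterator[Row]:
        return (row for row in rows if predicate(row))
    return step


def add_column(name: str, formula: Callable[[Row], str]) -> Step:
    """Build a step that adds a computed column to every row."""
    def step(rows: Iterable[Row]) -> Iterator[Row]:
        return ({**row, name: formula(row)} for row in rows)
    return step


def run(steps: list[Step], rows: Iterable[Row]) -> list[Row]:
    """Apply every recorded step, in order, to the given rows."""
    for step in steps:
        rows = step(rows)
    return list(rows)


# Hypothetical data set standing in for a much larger file.
full_data = [
    {"first": "Ada", "last": "Lovelace", "country": "UK"},
    {"first": "Grace", "last": "Hopper", "country": "US"},
    {"first": "Alan", "last": "Turing", "country": "UK"},
]

# The "workbook": a filter followed by a field concatenation.
workbook = [
    keep_if(lambda row: row["country"] == "UK"),
    add_column("full_name", lambda row: row["first"] + " " + row["last"]),
]

preview = run(workbook, islice(full_data, 2))  # design against a small sample
result = run(workbook, full_data)              # "run" against the full data set
```

Keeping the steps separate from the data is what lets the same definition serve both the quick preview and the full run.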

Page 7: Big Data:  Technical Introduction to BigSheets for InfoSphere BigInsights

What are Workbooks?

• Spreadsheet-like structures defined by user
• Based on data accessible in BigInsights

Page 8: Big Data:  Technical Introduction to BigSheets for InfoSphere BigInsights

Creating a Workbook (one approach)

• From BigSheets tab of Web console, click New Workbook button
• Supply input
  – Workbook name
  – Source file (select from file system directory tree)
  – Appropriate “reader” (data format translator)
    • Built-in readers for Web data, JSON, CSV, TSV, Hive, etc.
    • User-written plug-ins supported
• Save the workbook
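
To illustrate what a “reader” (data format translator) does, here is a small Python sketch that turns CSV, TSV, and JSON text into the same row structure. It is an analogy under stated assumptions, not the BigSheets reader or plug-in interface: the registry `READERS`, the function names, and the sample inputs are all hypothetical.

```python
import csv
import io
import json
from typing import Callable

Row = dict[str, str]


def read_delimited(text: str, delimiter: str) -> list[Row]:
    """Parse delimited text whose first line holds the column names."""
    return list(csv.DictReader(io.StringIO(text), delimiter=delimiter))


def read_json_array(text: str) -> list[Row]:
    """Parse a JSON array of flat objects, one object per row."""
    return [
        {key: str(value) for key, value in obj.items()}
        for obj in json.loads(text)
    ]


# Registry of readers keyed by format name (the names are illustrative).
READERS: dict[str, Callable[[str], list[Row]]] = {
    "csv": lambda text: read_delimited(text, ","),
    "tsv": lambda text: read_delimited(text, "\t"),
    "json": read_json_array,
}


def load(text: str, reader_name: str) -> list[Row]:
    """Translate raw text into rows using the chosen reader."""
    return READERS[reader_name](text)


csv_rows = load("name,city\nAda,London\n", "csv")
tsv_rows = load("name\tcity\nAda\tLondon\n", "tsv")
json_rows = load('[{"name": "Ada", "city": "London"}]', "json")
```

All three calls produce the same rows, which is the point of choosing a reader: the source format differs, but the spreadsheet-style view does not. A user-written reader would, in this sketch, simply be another entry in the registry.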

Page 9: Big Data:  Technical Introduction to BigSheets for InfoSphere BigInsights

Customizing a workbook

• Work with built-in editor
• Add / delete columns
• Filter data
• Specify formulas to compute new values using spreadsheet-style syntax
• Apply built-in or custom macro functions
  – Supplied text analytic functions for popular business entities: person, location, phone number, etc.
• . . .

Page 10: Big Data:  Technical Introduction to BigSheets for InfoSphere BigInsights

Visualizing results

• Built-in charting facility aids analysis
• Pie charts, bar charts, tag clouds, maps, etc.
• Hover over sections to reveal details

Page 11: Big Data:  Technical Introduction to BigSheets for InfoSphere BigInsights

Exporting data

• Useful for sharing with downstream applications
• Several common formats supported
• Save to distributed file system or display in browser (Save As -> local file)
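
As a rough picture of what exporting for a downstream application involves, the Python sketch below serializes the same rows as CSV and as JSON. The slide does not list the supported export formats, so the choice of these two is an assumption, and the function names and sample rows are hypothetical.

```python
import csv
import io
import json

# Hypothetical workbook results.
rows = [
    {"name": "Ada", "city": "London"},
    {"name": "Grace", "city": "New York"},
]


def to_csv(rows: list[dict[str, str]]) -> str:
    """Serialize rows as CSV text with a header line (needs at least one row)."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(rows[0]), lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


def to_json(rows: list[dict[str, str]]) -> str:
    """Serialize rows as a JSON array of objects."""
    return json.dumps(rows, indent=2)


csv_text = to_csv(rows)
json_text = to_json(rows)
```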


Page 12: Big Data:  Technical Introduction to BigSheets for InfoSphere BigInsights

On-demand videos

• Available on YouTube’s IBM Big Data Channel at http://www.youtube.com/user/ibmbigdata
• “Analyzing Social Media for IBM Watson”
• “Big Data Patent Analysis with BigSheets”
• “Big Data for Business Users”
• “BigSheets in Action”
• See the full list of videos at http://tinyurl.com/biginsights

Page 13: Big Data:  Technical Introduction to BigSheets for InfoSphere BigInsights

Supplemental


Page 14: Big Data:  Technical Introduction to BigSheets for InfoSphere BigInsights

Inspecting runtime statistics


Page 15: Big Data:  Technical Introduction to BigSheets for InfoSphere BigInsights

Displaying the workflow diagram


Page 16: Big Data:  Technical Introduction to BigSheets for InfoSphere BigInsights

Built-in text analysis functions

• Included with BigInsights Version 2.1
• BigSheets functions for extracting common business entities from text-based columns
  – Address, EmailAddress, Country, Person, etc.
  – Based on pre-built text extractor library provided with BigInsights
• Add Sheet -> Function -> Categories -> entities
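
To make the idea of extracting an entity from a text-based column concrete, here is a minimal Python sketch that pulls email addresses out of a comment column. It is a stand-in only: it uses a deliberately simple regular expression rather than the pre-built text extractor library provided with BigInsights, and the column names and sample rows are hypothetical.

```python
import re

# Deliberately simple pattern; robust email matching is more involved.
EMAIL_PATTERN = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def extract_emails(text: str) -> list[str]:
    """Return every substring of the text that looks like an email address."""
    return EMAIL_PATTERN.findall(text)


# Hypothetical rows with a free-text column.
rows = [
    {"comment": "Contact ada@example.com or grace@example.org for details."},
    {"comment": "No contact information here."},
]

# Add a new column holding the extracted entities, leaving the source column intact.
enriched = [{**row, "emails": extract_emails(row["comment"])} for row in rows]
```

The result keeps the original column and adds a derived one, which mirrors how an extraction function applied to a sheet yields new values alongside the source text.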