PowerPoint Presentation · TechEd 2012 Keywords: TechEd 2012 Created Date: 3/4/2014 9:37:35 AM ...


Page 1:

1080 ~ 2240

Page 3:

Sources: The Economist, Feb ‘10; IDC

By 2016 the new Large Synoptic Survey Telescope in Chile will acquire 140 terabytes in 5 days, more than Sloan acquired in 10 years

In 2000 the Sloan Digital Sky Survey collected more data in its 1st week than was collected in the entire history of astronomy

The Large Hadron Collider at CERN generates 40 terabytes of data every second

Page 6:

Power Map for Excel is a three-dimensional (3D) data visualization tool for Excel 2013.

http://www.microsoft.com/en-us/powerbi

Page 7:

Big Data in Research

Microsoft Research ATL Europe, Munich

Marcel Tilly, Program Manager

Page 8:

Big Data.

Page 9:

Sources: The Economist, Feb ‘10; DBMS2; Microsoft Corp

Cisco predicts that by 2013 annual internet traffic will reach 667 exabytes

The Twitter community generates over 1 terabyte of tweets every day

Bing ingests more than 7 petabytes a month

Page 11:

Talks

• From Text to Entities and from Entities to Insight: a Perspective on Unstructured Big Data

• Querying and Exploring Big Brain Data

• Big Data with Stratosphere

• SCOPE: Parallel Databases Meet MapReduce

• Online Data Processing with S4 and Omid

• Predictable Data Centers

• From Terabytes to Megabytes: Finding the Needle by Shrinking the Haystack

• Incremental, Iterative, and Interactive Computation using Differential Dataflow

• Big Data on Small Machines

• Graphs and Linear Measurements

• Partitioning & Clustering Big Graphs

• Online Team Formation in Social Networks

• Big Data and Enterprise Analytics

• Streaming Verification of Outsourced Computation

• Big Data Analytics: A Happy Marriage of Systems and Theory?

• Fast Algorithms for Perfect Matchings in Regular Bipartite Graphs

• Cuts, Trees, and Electrical Flows

• Neighborhood Sampling for Estimating Local Properties on a Graph Stream

• What Can't We Compute on Data Streams?

• Querying Big, Dynamic, Distributed Data

http://research.microsoft.com/en-US/events/bda2013/default.aspx

Scope

We are witnessing rapid development of research and technology for the efficient processing of big data. There is a surge of commercial and open source platforms for big data analytics, including platforms for querying massive datasets, batch processing, real-time analytics, streaming computations, iterative computations, graph data processing, and distributed machine learning.

Page 13:

Database queries

How can we efficiently resolve database queries on massive amounts of input data? Here the input data may be presented in the form of a distributed data stream.
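
As an illustration of the question above, here is a minimal sketch (not taken from the talk) of one common pattern for answering a count query over a distributed data stream: each node keeps a small, mergeable summary of its own partition, and a coordinator combines the summaries instead of collecting the raw events. The function names and the sample events are hypothetical, and the sketch assumes that exact per-key counts fit in memory on each node.

```python
from collections import Counter
from typing import Iterable, List, Tuple


def summarize_partition(events: Iterable[str]) -> Counter:
    """Read one partition's stream once and keep only per-key counts."""
    counts: Counter = Counter()
    for key in events:
        counts[key] += 1
    return counts


def merge_summaries(summaries: Iterable[Counter]) -> Counter:
    """Combine per-partition counts; addition makes the merge order-independent."""
    total: Counter = Counter()
    for summary in summaries:
        total.update(summary)
    return total


def top_k(summaries: Iterable[Counter], k: int) -> List[Tuple[str, int]]:
    """Answer a 'k most frequent keys' query from the merged summaries."""
    return merge_summaries(summaries).most_common(k)


if __name__ == "__main__":
    # Hypothetical events observed on two separate nodes.
    partitions = [["a", "b", "a"], ["b", "c", "b"]]
    summaries = [summarize_partition(p) for p in partitions]
    print(top_k(summaries, 2))
```

Because only the summaries travel between machines, the network cost depends on the number of distinct keys rather than on the length of the stream. When the key space is too large for exact counts, an approximate summary would have to take the place of the counter.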

Machine learning

How can we efficiently solve large-scale machine learning problems? Here the input data may be massive, stored in a distributed cluster of machines.

Distributed computing

How can we efficiently solve large-scale optimization problems in distributed computing environments? For example, how can we efficiently solve large-scale combinatorial problems, such as processing large-scale graphs?

Page 16:

RedFIR® is unrivaled worldwide as a tool for analyzing performance in team sports, making it possible to objectively analyze games and assess players against a consistent set of criteria.

http://www.orgs.ttu.edu/debs2013/index.php?goto=cfchallengedetails

Page 21:

“How to Fit when No One Size Fits”, Lim et al., CIDR 2013
