Five Things You Didn't Know DataSift Can Do

Post on 31-Jul-2015

Transcript of Five Things You Didn't Know DataSift Can Do

Brad Hubbard, Product Manager, Developer Relations, DataSift

Five Things You Didn’t Know DataSift Can Do

#DSwebinar

HUMAN DATA INTELLIGENCE

FILTER • TAG • ENRICH • STORE

Stream products will be covered today.

To see PYLON (our aggregated, anonymized Facebook topic data), join our next live demo:

http://lp.datasift.com/20150701-Live-SE-Demo-Registration

DataSift is of Two Minds: Indexed Data & Streaming

VEDO

Launched: 2011

Customers: 1K, across 40 countries

Global offices: 4

• San Francisco • New York • London • Reading, UK

Items processed per day: 2B

(These don’t count toward the 5 things)

Brave New Data World

of all digital data created by consumers

emails a day

of US adults’ location is known

increase in global data by 2020

Word cloud: Thoughts • Emotions • Likes • Dislikes • Intentions • Ideas • Current Events • GEO • Occupation • Age • Topics • Gender

Sources of Human-Generated Data

BLOGS & NEWS • INSIDE YOUR BUSINESS • SOCIAL NETWORKS

The Complexity of Human Data

VOLUME • VARIETY • VELOCITY

Billions of users

Noisy

Generated in real time

per second

Post vs blog vs like

Terabytes per day

Ambiguous

Big spikes

Unstructured

Turn Human Data into Meaning

Unify Human Data

We apply structure to the chaotic world of human data

Facebook

Tencent Weibo

Sina Weibo

Google+

YouTube

Instagram

LexisNexis

Wikipedia

WordPress

Tumblr

Intense Debate

Disqus

NewsCred

Reddit

Topix

Jive

Twitter

EDGAR News

Video

IMDB

Yammer

Unifying data from across the web

Filtering Human Data with CSDL

Filter: CSDL Data Processing Language

WRITE ONCE • USE MANY

Filter against generic objects, or get source-specific
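To make the generic versus source-specific distinction concrete, here is a minimal Python sketch. It is not CSDL and not DataSift's API: the field paths (`interaction.content`, `twitter.user.lang`), the helper names, and the sample records are all hypothetical, chosen only to illustrate the idea of one rule that applies to every source and another that narrows to a single source.

```python
from typing import Any, Callable

# Hypothetical shape: each item has a normalised "interaction" object
# plus an optional source-specific object (here "twitter").
Interaction = dict[str, Any]
Rule = Callable[[Interaction], bool]


def get_path(obj: Any, path: str) -> Any:
    """Look up a dotted path such as 'interaction.content'; None if absent."""
    current = obj
    for key in path.split("."):
        if not isinstance(current, dict) or key not in current:
            return None
        current = current[key]
    return current


def content_contains(keyword: str) -> Rule:
    """Generic rule: matches any source whose normalised content has the keyword."""
    def rule(item: Interaction) -> bool:
        text = get_path(item, "interaction.content")
        return isinstance(text, str) and keyword.lower() in text.lower()
    return rule


def field_equals(path: str, expected: Any) -> Rule:
    """Source-specific rule: matches only items that carry this exact field value."""
    def rule(item: Interaction) -> bool:
        return get_path(item, path) == expected
    return rule


def all_of(*rules: Rule) -> Rule:
    """Combine rules so that every one of them must match."""
    return lambda item: all(r(item) for r in rules)


# Invented sample data for illustration only.
sample = [
    {"interaction": {"type": "twitter", "content": "Loving the DataSift webinar"},
     "twitter": {"user": {"lang": "en"}}},
    {"interaction": {"type": "tumblr", "content": "datasift slides are up"}},
    {"interaction": {"type": "twitter", "content": "Lunch time"},
     "twitter": {"user": {"lang": "en"}}},
]

generic = content_contains("datasift")
specific = all_of(generic, field_equals("twitter.user.lang", "en"))

generic_hits = [i for i, item in enumerate(sample) if generic(item)]
specific_hits = [i for i, item in enumerate(sample) if specific(item)]
```

The generic rule is written once and matches both the Twitter and the Tumblr item; the source-specific rule reuses it and narrows the result to the single Twitter item.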

INFINITE COMPLEXITY

Rules can contain millions of tag and filter criteria; there is no need to limit yourself

Enrich Human Data

LINKS AUGMENTATION

Identifies links in social posts and fetches header data, allowing you to filter against link content

LANGUAGE DETECTION

Write filters on a per-language basis, or limit yourself to only certain languages

Location either disclosed by user or listed in profile

GENDER DETECTION USING PROFILES AND NAME + LANGUAGE

SENTIMENT AND TOPICS

Likely Positive • Neutral • Likely Negative

Topic detection (looking for nouns and disambiguating them)

Categorization, Scoring and Tagging

VEDO enables automatic classification of Human Data based on its meaning

Apply Data Science

OFF THE SHELF CLASSIFIERS

Enable automatic scoring and classification

CUSTOM TAXONOMIES

Hierarchical rules to match your business

CUSTOM SCORING SYSTEM

To expose meaning hidden deep within unstructured, text-rich data
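As a rough illustration of what rule-based scoring over free text can look like, the Python sketch below adds up weights for every rule whose keywords appear in a post. It is an assumption-laden toy and not how VEDO is implemented: the `ScoringRule` type, the tag names, the keywords, and the weights are all invented for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScoringRule:
    """Hypothetical rule: add `weight` to `tag` when any keyword appears."""
    tag: str
    keywords: tuple[str, ...]
    weight: float


def score(text: str, rules: tuple[ScoringRule, ...]) -> dict[str, float]:
    """Return the summed weight per tag for every rule that matches the text.

    Matching is a plain case-insensitive substring test, so 'buy' would also
    match 'buyer'; a real system would need proper tokenisation.
    """
    lowered = text.lower()
    totals: dict[str, float] = {}
    for rule in rules:
        if any(keyword in lowered for keyword in rule.keywords):
            totals[rule.tag] = totals.get(rule.tag, 0.0) + rule.weight
    return totals


# Invented rules: dotted tag names mimic a small hierarchy (taxonomy).
RULES = (
    ScoringRule("intent.purchase", ("buy", "order"), 2.0),
    ScoringRule("intent.purchase", ("price",), 1.0),
    ScoringRule("support.issue", ("broken", "refund"), 3.0),
)

result = score("Where can I buy this, and what is the price?", RULES)
```

Here two rules contribute to the same tag, so the post ends up with a single `intent.purchase` score and no `support.issue` entry.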

Delivery: Use Everywhere

CONSUME A JSON STREAM DIRECTLY
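A minimal Python sketch of the consuming side follows. It assumes the stream arrives as newline-delimited JSON objects, which is an assumption made for this example and not something the slides specify, and it reads from an in-memory buffer instead of a real connection. The function name `iter_interactions` and the sample payloads are hypothetical.

```python
import io
import json
from typing import Any, Iterable, Iterator


def iter_interactions(lines: Iterable[str]) -> Iterator[dict[str, Any]]:
    """Yield one dict per JSON line, skipping blank and malformed lines."""
    for line in lines:
        line = line.strip()
        if not line:
            continue  # blank lines are treated as keep-alives
        try:
            parsed = json.loads(line)
        except json.JSONDecodeError:
            continue  # ignore anything that is not valid JSON
        if isinstance(parsed, dict):
            yield parsed


# Stand-in for a network stream; any iterable of text lines works the same way.
fake_stream = io.StringIO(
    '{"interaction": {"content": "hello"}}\n'
    "\n"
    "not json\n"
    '{"interaction": {"content": "world"}}\n'
)

contents = [item["interaction"]["content"] for item in iter_interactions(fake_stream)]
```

Because the parser is a generator over lines, the same function could be pointed at a file or a streaming HTTP response without holding the whole stream in memory.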

Send your data to any of these pre-built connectors

We handle the infrastructure and send you the data you need

THANK YOU
