Data Ninja Services: empowering data science workflows with text analytics
DATA NINJA Services
Pero Subasic Open Service Innovation Group
DOCOMO Innovations, Inc.
July 13, 2016
Copyright © DOCOMO Innovations, Inc. All Rights Reserved.
@DataNinjaAPI
dataninja.net
NTT DOCOMO, Inc.
• Japan’s largest mobile phone operator
• 67M subscribers in Japan
• 46% are smartphone subscribers
• DOCOMO Innovations is a subsidiary
[Chart: NTT DOCOMO mobile market share in Japan (legend: 4G subscribers, 3G subscribers, 2G subscribers)]
Data Ninja Team
• Part of Open Service Innovation Group at DII
• Formed in 2012 with 10+ researchers and engineers
– 7 members with Ph.D. degrees
– 80+ years of combined experience with more than 50 patents and 120
peer-reviewed papers
– Diverse and extensive international large company and startup
experience
– Experts in data science and text analysis
Data Ninja Technologies, Applications and Customers
• Technologies
– Natural Language Processing (NLP) and Machine Learning
– Large-scale Data Analytics and Cloud Computing
• Applications
– Personal voice assistants, car navigation assistants
– Personalization and recommendation systems
– Data management platforms and online advertising
– Automated text categorization system
• Customers
– Large enterprises including NTT DOCOMO, Toyota, Pioneer, Nissan
– Companies utilizing analytics to enhance service offerings and effectively
provide relevant, appropriate and targeted content
Data Ninja Mission
Our mission is to enable companies of all sizes to build smart services
with content intelligence without having in-house advanced data
science and machine learning teams.
Content Intelligence Platform
• Smart Content
• Smart Sentiment
• Smart Data
• Smart Learning
Data Ninja API
[Diagram: Unstructured Data (text) -> REST API (Smart Content, Smart Data, Smart Learning, Smart Sentiment) -> Structured Data (Concepts, Categories and Entities with Sentiments) -> Smart Applications and Services]
API Endpoints: public cloud or technology licensing
http://dataninja.net
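As a rough illustration of the flow on this slide (unstructured text in, structured data out over REST), the sketch below builds an HTTP POST request that carries raw text. The endpoint URL, the `X-Api-Key` header, and the `text` payload field are hypothetical placeholders, not the documented Data Ninja API; the real interface is described at dataninja.net.

```python
import json
import urllib.request

# Hypothetical values: the real endpoint path, auth header, and payload
# schema are not given in the slides and must come from the API docs.
API_URL = "https://api.example.com/smartcontent/tag"
API_KEY = "YOUR_API_KEY"


def build_request(text: str) -> urllib.request.Request:
    """Build (but do not send) a JSON POST request carrying raw text."""
    payload = json.dumps({"text": text}).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=payload,
        headers={"Content-Type": "application/json", "X-Api-Key": API_KEY},
        method="POST",
    )


req = build_request("Volkswagen posted a decline in sales.")
print(req.get_method(), req.full_url)
```

Sending the request would be a call to `urllib.request.urlopen(req)`; the sketch stops before that step because the response format is not shown in the slides.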
Smart Data Service
The Smart Data service provides access to our knowledge
graphs. It complements the Smart Content service and allows
development of sophisticated data-science applications.
[Figures: Concept-category hierarchy; Concept relationship graph]
Semantic Interpretation
[Figure: example category nodes: Pets, Animals with feathers, Farm animals, Small domestic animals]
- Concept nodes act as network sensors
- Category inference
- Concept inference
- Learning
- Interaction with content
- Communication
- Communication participants’ knowledge bases are different: there is no common grounding -> customization is necessary
- Knowledge bases are continually updated
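One way to picture category inference from concept nodes is a simple voting scheme: each detected concept "votes" for the categories it is linked to. The sketch below is a toy illustration under that assumption. The concepts and their links are invented for the example (only the category names come from the slide), and the slides do not say how the actual interpretation engine works.

```python
from collections import Counter

# Toy concept-to-category links, loosely based on the category labels on
# the slide. The concepts and links are illustrative assumptions, not
# entries from the actual Data Ninja knowledge graph.
CONCEPT_CATEGORIES = {
    "chicken": ["Animals with feathers", "Farm animals"],
    "parrot": ["Animals with feathers", "Pets"],
    "cow": ["Farm animals"],
    "hamster": ["Pets", "Small domestic animals"],
}


def infer_categories(concepts):
    """Count how many detected concepts point at each category."""
    votes = Counter()
    for concept in concepts:
        votes.update(CONCEPT_CATEGORIES.get(concept, []))
    # Highest vote count first; ties broken alphabetically for stability.
    return sorted(votes.items(), key=lambda kv: (-kv[1], kv[0]))


print(infer_categories(["chicken", "cow", "parrot"]))
# [('Animals with feathers', 2), ('Farm animals', 2), ('Pets', 1)]
```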
Knowledge Base Updates
[Diagram labels: Real World, Timely Updates, Knowledge Base, Interpretation Engine]
Environment and customization
• General background knowledge
• Vertical customization
Automated, timely KB updates
• New concepts, categories and relationships
• Custom entities, concepts, taxonomies
New, improved interpretations
• Finer resolution, increased accuracy
• Enriched interpretation
Smart Data Demo
Smart Data Use Cases
• Concept & Category Contextual Search: Keyword suggestion &
expansion
• Related Concept Contextual Search
– Personalization by building user profiles based on usage history
– Recommendation based on similar concepts (e.g. during cold
starts)
• Concept Popularity Lookup
– Smart search using concepts in addition to keywords to increase
accuracy
– Popularity-based disambiguation
– Trending visualization by finding trends of concepts and categories
for better decision making
• Smart Graph
– Generation of linguistics resources for domain-specific applications
• Induction Engine
– Discovery of hidden relationships among concepts to better reason
with text
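Popularity-based disambiguation, mentioned in the list above, can be sketched as picking the most popular candidate concept for an ambiguous keyword. The candidate concepts and popularity scores below are made up for illustration; in practice they would come from a lookup such as the Concept Popularity Lookup, whose actual request and response format is not shown in the slides.

```python
# Hypothetical candidate concepts for an ambiguous keyword, each with a
# made-up popularity score (not values from the Data Ninja service).
CANDIDATES = {
    "jaguar": {
        "Jaguar (animal)": 0.35,
        "Jaguar Cars": 0.60,
        "Jacksonville Jaguars": 0.05,
    },
}


def disambiguate(keyword):
    """Return the most popular candidate concept, or None if unknown."""
    options = CANDIDATES.get(keyword.lower())
    if not options:
        return None
    return max(options, key=options.get)


print(disambiguate("Jaguar"))  # Jaguar Cars
```

A fuller approach would combine popularity with context signals, since the most popular sense is not always the intended one.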
Smart Content Overview
The Smart Content service extracts meaningful categories,
concepts, entities and keywords from unstructured text for
broad use in analytics and data science applications.
Smart Content continually collects relevant data to add to its
knowledge base.
The Smart Content knowledge base is extendable through its
configurable resource repository with custom user-defined
taxonomies and entity dictionaries.
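To make the idea of a user-defined entity dictionary concrete, the sketch below tags text by plain dictionary lookup. The dictionary format and the `tag_entities` helper are hypothetical; the slides do not show how custom dictionaries are actually configured or matched in Smart Content.

```python
import re

# Hypothetical user-defined entity dictionary: surface form -> entity type.
ENTITY_DICTIONARY = {
    "NTT DOCOMO": "Company",
    "Toyota": "Company",
    "Palo Alto": "Place",
}


def tag_entities(text):
    """Return (surface form, type, start offset) for each dictionary hit."""
    hits = []
    for surface, entity_type in ENTITY_DICTIONARY.items():
        # Word boundaries avoid matching inside longer words.
        pattern = r"\b" + re.escape(surface) + r"\b"
        for match in re.finditer(pattern, text):
            hits.append((surface, entity_type, match.start()))
    # Report hits in the order they appear in the text.
    return sorted(hits, key=lambda hit: hit[2])


print(tag_entities("Toyota and NTT DOCOMO opened a lab in Palo Alto."))
```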
Smart Content Use Cases
Example Use Cases:
• Topic Detection and Tracking
• Concept-based Retrieval
• Image Recommendation for Online Publishing
• Contextual Music Recommendation
• Semantic Analysis of News Articles - demo
Smart Sentiment Service
Smart Sentiment assigns a positive, negative, neutral, or
“none” sentiment value to the content of a natural language
text document.
Predefined, custom-trained models are available for three
domains: product reviews, social networks, and news articles.
Sentiments are assigned to each extracted entity and
keyword.
Smart Sentiment Use Cases
Example Use Cases:
• Brand reputation analysis/monitoring
• Product sentiment around release date
• Product reviews
[Chart: Sentiment for “Volkswagen” in Sep. ‘15. x-axis: Date (Sep. 01 to Sep. 30); y-axis: Normalized sentiment score (-1 to 0.4)]
The U.S. Environmental Protection Agency said
Friday that Volkswagen intentionally skirted
clean air laws by using a piece of software that
enabled about 500,000 of its diesel cars to emit
fewer smog-causing pollutants during testing
than in real-world driving conditions.
The agency ordered VW to fix the cars at its
own expense.
[Chart: Sentiment for entity Volkswagen. x-axis: Date; y-axis: Sentiment (-1 to 0.4)]
Toyota sales fell 9 percent, Honda was
down 7 percent and Volkswagen brand
vehicles were down 8 percent.
German auto giant Volkswagen posted a 2.8 per cent decline …
Volkswagen’s finance chief Hans Dieter Poetsch is set to become its next chairman, putting
Europe’s biggest car maker on course for calmer waters after rival factions including ousted
patriarch Ferdinand Piech united to back him.
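A per-day sentiment trend like the one charted for Volkswagen could be produced by averaging entity-level sentiment scores over all articles published on each day. The sketch below assumes that averaging scheme and uses made-up scores; the slides do not state how the normalized score is actually computed, and the values here are not taken from the charts.

```python
from collections import defaultdict
from statistics import mean

# Made-up per-article sentiment scores for one entity, in [-1, 1].
# These are illustrative values, not data from the charts.
ARTICLE_SCORES = [
    ("2015-09-17", 0.2),
    ("2015-09-17", 0.0),
    ("2015-09-18", -0.8),
    ("2015-09-18", -0.6),
    ("2015-09-18", -1.0),
]


def daily_sentiment(scores):
    """Average the scores of all articles that share a date."""
    by_day = defaultdict(list)
    for day, score in scores:
        by_day[day].append(score)
    return {day: round(mean(vals), 2) for day, vals in sorted(by_day.items())}


print(daily_sentiment(ARTICLE_SCORES))
# {'2015-09-17': 0.1, '2015-09-18': -0.8}
```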
Newsbot Ninja
https://newsbot.dataninja.net
Smart Learning Overview
The Smart Learning service identifies the intent of a short piece of
text, such as a natural-language request to perform some
action.
The service recognizes 30+ intent categories such as call
request, email, news, information seeking, play music,
transportation, schedule, shopping, and take a photo.
Smart Learning Use Cases
Use Cases:
• Intelligent Personal Assistants such as Siri, Cortana,
and Google Now
• Car Navigation Assistants
• Query and sentence classification
• Intent identification and query extraction in mobile
apps
Example: “Find me Italian restaurant in Palo Alto.”
Task: Restaurant search
Target: Italian restaurant
Place: Palo Alto
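The example above maps a request to a task plus its target and place. The sketch below reproduces that output shape with a single hand-written regular expression for one intent. It is only a toy stand-in: the slides do not describe how the Smart Learning service identifies intents, and the `parse_request` helper and its field names are hypothetical.

```python
import re

# One hand-written pattern for a single hypothetical intent. The actual
# service covers 30+ intents; its method is not described in the slides.
RESTAURANT_PATTERN = re.compile(
    r"^find (?:me )?(?:an? )?(?P<target>.+?) in (?P<place>.+?)[.?!]?$",
    re.IGNORECASE,
)


def parse_request(utterance):
    """Return task/target/place for a restaurant request, else None."""
    match = RESTAURANT_PATTERN.match(utterance.strip())
    if not match or "restaurant" not in match.group("target").lower():
        return None
    return {
        "task": "Restaurant search",
        "target": match.group("target"),
        "place": match.group("place"),
    }


print(parse_request("Find me Italian restaurant in Palo Alto."))
# {'task': 'Restaurant search', 'target': 'Italian restaurant', 'place': 'Palo Alto'}
```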
Thank You!
- Web sites: dataninja.net, demo.dataninja.net, newsbot.dataninja.net
- Visit us at our booth in the expo
area
- Sign up for a demo or API
Backup
DOCOMO Innovations, Inc.
• DOCOMO Innovations is a subsidiary of NTT DOCOMO
• We collaborate with business partners, research laboratories, start-ups
and engineers to develop innovative products and services
Smart Content Advantages
• More entities extracted than competitors
• Broader concept tagging to provide higher recall
• Exclusive access to knowledge-graph hierarchy and
induction engine to facilitate custom advanced development
• More categories in taxonomy classification for broader
coverage
• Other signals including concept similarity, ranking, and
popularity for further disambiguation
• Outstanding price/performance
Smart Content Pricing