• 16.9B USD in 2015
• 40% Big data project
• Hadoop: CAGR 58%, 2.2B USD by 2020
• Volume
• Velocity
• Variety
Super hot in
• Government
• Communication
• Media
• Banking
• Manufacturing
Technology
• Infrastructure: IaaS, SaaS, DaaS
• Application: BI, social analytics, visualization…
• Domain solution: Finance, Retail, Insurance
• Development: Data scientist, DevOps
• Business process: Operation, Support
ANALYTICS IS THE
Make data live
• Data sitting in storage generates no value
Revenue and profit from data
• Application and solution to get insights from data
• Link insights with business
• Don’t stop at visualization or report
Advanced analytics is the engine of business solution
• Fraud detection
• Customer retention
COMMON ANALYTICS SCENARIOS
Data analysis
• Example: Estimate customer lifetime value
• User: data scientist
• Needs: flexibility to explore and faster iteration
Product analysis
• Example: How many female customers visit the website home page and leave within fewer than 5 clicks?
• User: product manager, data analyst, marketing team
• Needs: no complex coding, SQL query at most
Predictive service
• Example: Is this transaction a fraud?
• User: developer and data scientist
• Needs: pipeline processing
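The product analysis question above (female customers who land on the home page and leave within fewer than 5 clicks) fits the "SQL query at most" requirement. A minimal sketch follows, assuming a hypothetical `sessions` table with one row per visit; the table name, columns and sample rows are illustrative only and are not taken from DataCanvas.

```python
import sqlite3

# Hypothetical schema: one row per visit session (not a DataCanvas schema).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sessions ("
    " session_id INTEGER PRIMARY KEY,"
    " customer_id INTEGER,"
    " gender TEXT,"
    " landing_page TEXT,"
    " clicks INTEGER)"
)

# Made-up sample data for illustration.
rows = [
    (1, 101, "F", "home", 3),
    (2, 102, "F", "home", 7),
    (3, 103, "M", "home", 2),
    (4, 104, "F", "pricing", 1),
    (5, 101, "F", "home", 4),
    (6, 105, "F", "home", 4),
]
conn.executemany("INSERT INTO sessions VALUES (?, ?, ?, ?, ?)", rows)

# Count distinct female customers whose home-page visit ended in fewer than 5 clicks.
QUERY = """
SELECT COUNT(DISTINCT customer_id)
FROM sessions
WHERE gender = 'F' AND landing_page = 'home' AND clicks < 5
"""
short_visit_customers = conn.execute(QUERY).fetchone()[0]
print(short_visit_customers)
```

Counting distinct customers rather than sessions is a design choice here; a product manager interested in visits instead of people would use `COUNT(*)`.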
WHAT DOES DATACANVAS ADDRESS
Powering all these scenarios
• Data analysis: Flexible
• Product analysis: Intuitive
• Prediction service: Complex processing
Enable application, solution and business process
Platform to enable application and connect infrastructure
• Application: Recommendation, Anomaly Detection, Operation Analytics
• DataCanvas: Service, Pipeline
• Infrastructure: Hadoop (Hive/Pig), RDBMS, NoSQL, Spark
• Big data challenges are across services, environments and even locations
Pipeline steps: Data Generation, Storage, Processing, Reporting
• An orchestration platform is required to manage and connect steps in the pipeline
• Bring Pipeline to the game
No more central data store, bring computation to data, not vice versa!
• Unify resource
• Optimize workload
• Automation
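The idea of an orchestration platform that manages and connects pipeline steps can be sketched as a small dependency graph executed in order. This is only an illustration under stated assumptions: the step functions, the `PIPELINE` layout and `run_pipeline` are hypothetical names, not the DataCanvas API, and the sketch uses the standard-library `graphlib` module (Python 3.9+).

```python
from graphlib import TopologicalSorter

# Hypothetical step functions; each receives the outputs of its upstream steps.
def generate(_inputs):
    return [3, 1, 2]

def store(inputs):
    return sorted(inputs["generate"])

def process(inputs):
    return sum(inputs["store"])

def report(inputs):
    return f"total={inputs['process']}"

# Step name -> (function, list of upstream step names).
PIPELINE = {
    "generate": (generate, []),
    "store": (store, ["generate"]),
    "process": (process, ["store"]),
    "report": (report, ["process"]),
}

def run_pipeline(pipeline):
    """Run every step after all of its upstream steps have finished."""
    graph = {name: set(deps) for name, (_, deps) in pipeline.items()}
    order = list(TopologicalSorter(graph).static_order())
    results = {}
    for name in order:
        func, deps = pipeline[name]
        results[name] = func({dep: results[dep] for dep in deps})
    return order, results

order, results = run_pipeline(PIPELINE)
print(order)
print(results["report"])
```

Declaring the steps and their dependencies as data, rather than burying them in scripts, is what lets a platform schedule, reuse and document the same flow.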
Pain points
• Unmanageable
• Redundancy
• Hard to iterate fast
• Gap between documentation and actual workflow
• Monster configuration
• Spaghetti scripts, no reuse
• No idea what’s actually running
WHAT IS DATACANVAS
• Drag & drop to run data flow
• Public or private cloud
• Intuitive job management
• Module repository
• Built-in library
• Make your own recipe
• Powering advanced analytics
• Business solution template
• Address common applications
• Fully customizable
• Team collaboration
• Flow sharing
• Module sharing
• This is the BEST documentation
VALUE
Workflow / Scheduling (Operation)
• User experience
• Production quality
• Easy ops
Module (Developer/Data scientist)
• Data ETL
• Machine learning
• Module repository
Solution Template (Business)
• Business requirement
• Recommendation
• Fraud detection
• Sentiment analysis
WHY CONTAINER MATTERS
• Seamlessly connect to any existing/upcoming computation infrastructure
• Enabler for module management and sharing
• Support Lambda: Processing + Serving + Visualization
Lambda Architecture
COMPETITORS
Compared: AWS DP, Oozie, AzureML, MortarData, Azkaban, DataCanvas
Criteria
• Workflow + Scheduling
• Module management
• Solution template
• Multiple Env support
• Collaboration + Sharing
• Cloud service
Rating legend: Good / Not that great / Bad or not supported
DataCanvas = ((Workflow + Scheduler) * Drag & drop * Module composition) ^ Solution @ Cloud
BUSINESS MODEL
Subscription
Charge services on tiers: Startup, Premium, Enterprise
Free
• 1 user
• Unlimited projects
• Limited workload, good for evaluation
• Forum support
Startup
• Unlimited users
• Unlimited projects
• Decent workload, 3-5 jobs in parallel
• Email support
Premium
• Unlimited users
• Unlimited projects
• Significant workload, >20 jobs in parallel
• Email support
Enterprise
• Unlimited users
• Unlimited projects
• Workload on scale
• Full support
Annual Support Package
• For Premium and Enterprise customers
• Forum support, Email support with SLA, Telephone support
TARGET CUSTOMER
Data scientist
• Assembly line to facilitate exploration
• Team collaboration
Analyst
• Drag and drop to find insights, need any more reason?
Manager
• Faster iteration
• Shorter time to deliver project
• Easier to maintain
WHERE ARE WE NOW
• Demo upon request ([email protected])
• DataCanvasIO @ GitHub
THANK YOU