Algorithm Marketplace and the new "Algorithm Economy"

Post on 20-Feb-2017


Algorithm Marketplaces and the new "Algorithm Economy"

Data Day Texas 1-16-2016

Diego Oppenheimer, CEO and Founder

$100 free to get started. Signup at Algorithmia.com with Promo Code:

DATADAYTX

Diego Oppenheimer - CEO / founder, Algorithmia

• 10+ years building Business Intelligence and Big Data tools

• Led advanced data analysis tool development at Microsoft - 1 billion users reached

• Shipped Excel, SQL Server, PowerBI v1.0

• Previously founded an algorithmic trading startup

• Techstars/Startup Weekend Coach and Mentor

• B.S. and M.S. Carnegie Mellon University

• Passionate data analysis enabler

Email: diego@algorithmia.com @doppenhe

Why Algorithms?

• “In economics productivity is a measure of technological progress. Productivity increases when fewer inputs are used in the production of a unit of output”

• We went from hunter-gatherer to agricultural to industrial societies, and now to the next revolution: the interpretation of data.

• Algorithms are at the center of the next revolution. They are the tools of our generation.

• If data is the new oil, advanced algorithms are the drilling platforms, pipelines, tankers and gas stations.

The briefest history of technology…ever

“…data is inherently dumb. It doesn’t actually do anything unless you know how to use it. And big data is even harder to monetize due to the sheer complexity of it.

Data alone is not going to be the catalyst for the next wave of IT-driven innovation. The next digital gold rush will be focused on how you do something with data, not just what you do with it. This is the promise of the algorithm economy.”

Peter Sondergaard (Gartner Research)

Staggering pace of data collection

Sources: Cisco, ComScore, MadReduce, Radicati Group, DataScienceCentral, Insights wired, IBM, EMC, GMAOnline, Twitter, YouTube, Manthan for Strategic Innovation

• 10,000 Tweets per sec

• 2,283 Images per sec

• 1,792 Skype calls per sec

• 49,466 Google searches per sec

• 103,310 Videos viewed per sec

• 2,406,488 Emails sent per sec

• 55,000,000 Status updates per day

• 28,260 Gigabytes of traffic flowing through the internet per sec

• By 2018 69% of online traffic will be mobile video

• 68% of all unstructured data in 2015 attributed to consumers

• In 2015 enterprise unstructured data will cross 1600 exabytes

Rise of unstructured data

“Unstructured data: data value that has little or no metadata and is therefore difficult to categorize.”

Where is it coming from?

Types: Photos and Videos, Audio Data, Social Media, Transactions, Log Data, Emails

Internal and external sources:

• Brand and Social Media properties

• Customer Service Centers

• Mobile and Market research data

• Employee Performance reviews

• Consumer Survey Data

• Candidate Interviews

• Merchandising photos

• Crowd Sourcing

• Web Scraping

• Social Media

• Blogs and Chat Rooms

• Consumer Product Reviews

Classifying unstructured data

Cognition: “the mental action or process of acquiring knowledge and understanding through thought, experience, and the senses.”

Humans are the gold standard for interpreting unstructured data…we just don’t scale.

Businesses that succeed will be the ones that are able to interpret their unstructured data with near-human efficiency at superhuman scale.

Modelling humans in machines

Human Cognition -> Machine Intelligence

• Learning -> Machine Learning

• Perception -> Computer Vision and Speech Recognition

• Communication -> Natural Language Processing

• Social Intelligence -> Affective Computing

• Planning -> Automated Scheduling

Machines can provide super human scale.

Why now?

1990s: Connectivity $10,000 per month; Servers $20,000 per box; Storage $1,000/GB

2000s: Connectivity $1,000 per month; Servers $1,000 per box; Storage $10/GB

2010s: Connectivity 10 cents/GB; Servers 20 cents/hour; Storage 12 cents/GB

Super human scale = machines…and today they are cheap, plentiful and fast.

Advances in Natural Language Processing

• We now suddenly have dozens of open source libraries available to us in the natural language processing space.

• NLTK – ApacheNLP – ScalaNLP – StanfordNLP – etc

• We understand sentiment, intent and entities, and are getting better at it every day.

• The combination with knowledge graphs lets us interpret subject matter almost immediately.

• StockTwits using tweets as signal for trading.

• Ai2 – Interpret questions - Pass the 8th grade geometry test

• Genomic research

• Great summarizer

• Feed in text and let a machine answer questions about it through inference - Facebook/Lord of the Rings

Advances in Computer Vision

• Again, dozens of libraries per language, and a huge pain to work with.

• Wrangling OpenCV is a dark art form.

• Ai2.org passed the 8th grade geometry test, interpreting graphics

• Google Vision API / Clarifai – submit an image, get back recognized objects

• Visual shopping: finding items that look similar to what you are viewing, made much easier by Deep Learning.

Advances in Speech Recognition

• Siri/Cortana/Google Now

• Amazon Echo

• Skype live translator / Baidu Mandarin-English translator

• CMU Sphinx – training on different lexicons, data sets and sophistication of language levels.

• Tone Sentiment prediction for customer service calls – Wise.io

We now talk to our machines and they “get us”.

“I cannot see ten years into the future. For me, the wall of fog starts at about 5 years.... I think that the most exciting areas over the next five years will be really understanding videos and text. I will be disappointed if in five years time we do not have something that can watch a YouTube video and tell a story about what happened. I have had a lot of disappointments.” -Geoffrey Hinton’s AMA on Reddit

“It’s not about the pieces, it’s how the pieces work together” - Ice Cube

Building blocks of Machine Intelligence

Most common use cases at the intersection of machine intelligence and Big Data

• Marketing

• Product recommendations

• Customer Service

• HR

• Fraud and Churn prevention

• Infrastructure monitoring

• Crime prevention

But…the use cases to which machine intelligence can be applied are growing at a staggering pace.

We move from the era of “capture everything” to being able to “act on everything”.

What do all these use cases have in common?

• Similar techniques trained on different data sets

• Combine multiple techniques and algorithms

• Engineers need to build every step of the pipeline…and then scale it.

What's the problem?

• The skill sets to build models != the skill sets to scale models

• The skill sets to tune algorithms != the skill sets to build pipelines

• Almost every single use case requires re-inventing the wheel.

All this power …now what?

• Huge advances in multiple fields of machine intelligence, but practical implementation is still hard.

• Finding the right algorithm/library or framework still a challenge.

• Huge disconnect between academic/top tech companies and rest of industry.

• Top tech company? Let’s go buy a lab.

• Code reusability mostly a myth.

• Incentives between research and users not aligned leading to disconnect.

A novel approach: Algorithm Marketplaces

Algorithm marketplaces:

• Host algorithms. Anyone can turn their algorithms into scalable, shareable, production-ready web services. Typical users: scientists, academics, domain experts.

• Make algorithms discoverable. Anyone can use and integrate these algorithms into their solutions. Typical users: businesses, data scientists, app developers, IoT makers.

• Are monetizable. Align incentives between algorithm creators and consumers. Typical scenarios: heavy-load use cases with a large user base.

• Are modular. Algorithms can be stacked or piped together. Typical scenarios: interpretation of unstructured data.

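The idea that algorithms can be "stacked or piped together" can be sketched in a few lines of Python. This is only an illustration of the composition pattern, not marketplace code: `pipe`, `html_to_text`, `tokenize` and `top_keywords` are hypothetical names invented for this example, and each step is a toy stand-in for a hosted algorithm.

```python
import re
from collections import Counter
from functools import reduce
from typing import Any, Callable, List

Step = Callable[[Any], Any]


def pipe(*steps: Step) -> Step:
    """Compose steps left to right: each step's output feeds the next step."""
    def run(value: Any) -> Any:
        return reduce(lambda acc, step: step(acc), steps, value)
    return run


# Hypothetical building blocks; in a marketplace each would be a remote call.
def html_to_text(html: str) -> str:
    """Toy stand-in: replace anything that looks like an HTML tag with a space."""
    return re.sub(r"<[^>]+>", " ", html)


def tokenize(text: str) -> List[str]:
    """Lowercase the text and keep runs of letters as tokens."""
    return re.findall(r"[a-z]+", text.lower())


def top_keywords(tokens: List[str], n: int = 2) -> List[str]:
    """Return the n most frequent tokens."""
    return [word for word, _ in Counter(tokens).most_common(n)]


# Three independent blocks stacked into one new "algorithm".
extract_keywords = pipe(html_to_text, tokenize, top_keywords)
result = extract_keywords("<p>Data beats data; algorithms use data.</p>")
print(result)
```

Because every block takes one value and returns one value, a new pipeline is just a different ordering of existing blocks, which is the reuse argument the slide makes.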

Topic Analysis

Twitter Youtube Satellite Imagery

Computer Vision

Artificial Neural Networks

The future is building blocks…

Some Use Cases


Use Case #1: Birth of new algorithms – Nudity Detection

Algorithms Used

● Face Detection

● Nose Detection

● Skin Color Detection

Based on work from LaSalle University


Use Case #2: Unsupervised content recommendation

Algorithms Used

● Breadth First Sitemap

● Analyze URL

● Keywords for Document Set

● Keyword Set Similarity
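The last step of this pipeline, keyword set similarity, can be sketched as follows. The slide does not say which similarity measure is used; Jaccard similarity (shared keywords divided by all keywords) is assumed here as one common choice. The function names, URLs and keyword sets are made up for illustration.

```python
from typing import Dict, List, Set, Tuple


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Jaccard similarity: size of the intersection over size of the union."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def recommend(
    target: str,
    keywords_by_url: Dict[str, Set[str]],
    top_n: int = 3,
) -> List[Tuple[str, float]]:
    """Rank every other page by keyword overlap with the target page."""
    target_keywords = keywords_by_url[target]
    scored = [
        (url, jaccard(target_keywords, keywords))
        for url, keywords in keywords_by_url.items()
        if url != target
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_n]


# Hypothetical output of the earlier crawl and keyword-extraction steps.
pages = {
    "/blog/face-detection": {"face", "detection", "vision", "image"},
    "/blog/smile-detection": {"smile", "detection", "vision", "image"},
    "/blog/pagerank": {"graph", "pagerank", "web"},
}
recommendations = recommend("/blog/face-detection", pages)
print(recommendations)
```

No labels or click data are needed, which is what makes the recommendation "unsupervised": the ranking comes only from the pages' own keywords.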


Use Case #3: Video Recommender

Algorithms Used

● Get Links

● Download Youtube

● Speech 2 Text

● TF-IDF

● Keywords for Document Set

● Keyword Set Similarity

https://algorithmia.com/strata
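The TF-IDF step in the video recommender can be sketched as below, assuming the transcripts have already been produced by a speech-to-text step. This uses the plain textbook weighting (term frequency times the log of inverse document frequency); the hosted algorithm may use a different variant. The function names and sample transcripts are hypothetical.

```python
import math
from collections import Counter
from typing import Dict, List


def tf_idf(documents: List[List[str]]) -> List[Dict[str, float]]:
    """Return one {term: weight} dict per document.

    weight = (count of term in doc / doc length) * log(N / docs containing term)
    """
    n_docs = len(documents)
    doc_freq: Counter = Counter()
    for tokens in documents:
        doc_freq.update(set(tokens))  # count each term once per document

    weights = []
    for tokens in documents:
        counts = Counter(tokens)
        total = len(tokens)
        weights.append({
            term: (count / total) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return weights


def top_keywords(scores: Dict[str, float], n: int = 3) -> List[str]:
    """Pick the n highest-weighted terms of one document."""
    return sorted(scores, key=scores.get, reverse=True)[:n]


# Hypothetical transcripts, already lowercased and tokenized.
transcripts = [
    "neural networks learn image features".split(),
    "neural networks translate speech".split(),
    "cooking pasta with fresh tomato sauce".split(),
]
weights = tf_idf(transcripts)
print(top_keywords(weights[0]))
```

Terms that appear in many transcripts ("neural", "networks") get lower weights than terms unique to one video, so the resulting keyword sets are better inputs for the similarity step that follows.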


Use Case #4: Intelligent Server-less Apps

The Algorithm Economy

• Reusable algorithms are now monetizable IP, driving choice and fostering reuse.

• A shortage of algorithm developers and data scientists will lead to more generic models that can scale to meet demand

• “Bring your own data”

• Marketplaces will bring the benefits of the app economy to software development, lowering software distribution costs and improving access to thousands of algorithms.

• Provides a new avenue where open-source and monetization can co-exist.

• Algorithm creators benefit from constant feedback from the algorithm callers – improving speed of innovation and quality.

Algorithmia

Make state-of-the-art algorithms accessible and discoverable by everyone.

Algorithmia is the leading solution for finding, sharing, and using state-of-the-art algorithms among complex teams with diverse technologies


16k+ developers

1.8k algorithms

86 countries

Sample algorithms

● Text Analysis: summarizer, sentence tagger, profanity detection

● Machine Learning: digit recognizer, recommendation engines

● Web: crawler, scraper, pagerank, emailer, html to text

● Computer Vision: image similarity, face detection, smile detection

● Audio & Video: speech recognition, sound filters, file conversions

● Computation: linear regression, spike detection, fourier filter

● Graph: traveling salesman, maze generator, theta star

● Utilities: parallel for-each, geographic distance, email validator

● Classifiers: deep learning models

The future


Some predictions

• Algorithm marketplaces will be the driving force in lowering the bar for machine intelligence adoption

• Enterprises will worry less about where their data is going in favor of being able to stay ahead of their business as data collection gets unruly.

• Data locality concerns will be solved by ever-moving compute clusters

• Move compute to the data, not vice versa

• Algorithmic inception

• Algorithms that tune other algorithms -> the automated data scientist.

The future…is more autonomous

• AutoML – Auto Machine Learning

• Ensemble learning

• Hyperparameter optimization
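As a rough illustration of "algorithms that tune other algorithms", here is a minimal random-search sketch for hyperparameter optimization. Random search is only one simple strategy and is an assumption of this example, not something the slides prescribe. `random_search` and `validation_loss` are hypothetical names, and the loss is a toy function standing in for "train a model and measure its validation error".

```python
import random
from typing import Callable, Dict, Optional, Tuple

Params = Dict[str, float]


def random_search(
    objective: Callable[[Params], float],
    space: Dict[str, Tuple[float, float]],
    n_trials: int = 50,
    seed: int = 0,
) -> Tuple[Optional[Params], float]:
    """Sample parameters uniformly from `space` and keep the lowest-loss set."""
    rng = random.Random(seed)  # seeded so that runs are repeatable
    best_params: Optional[Params] = None
    best_score = float("inf")
    for _ in range(n_trials):
        params = {
            name: rng.uniform(low, high)
            for name, (low, high) in space.items()
        }
        score = objective(params)
        if score < best_score:
            best_params, best_score = params, score
    return best_params, best_score


def validation_loss(params: Params) -> float:
    """Toy loss with its minimum at learning_rate=0.1, regularization=0.01."""
    return (
        (params["learning_rate"] - 0.1) ** 2
        + (params["regularization"] - 0.01) ** 2
    )


search_space = {"learning_rate": (0.0, 1.0), "regularization": (0.0, 0.1)}
best_params, best_score = random_search(validation_loss, search_space, n_trials=200)
print(best_params, best_score)
```

The tuner never looks inside the model; it only needs a function from parameters to a score, which is why it can be packaged as a generic building block and pointed at other algorithms.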

The future…is more accessible


+1 206.552.9054

Diego Oppenheimer CEO

doppenheimer

diego@algorithmia.com

@doppenhe

THANK YOU!

Copyright © 2015 Algorithmia. All Rights Reserved. A2-1-151111