
Future of AI on the JVM

Scala Days Amsterdam 2015

Adam Gibson, creator of Deeplearning4j (and 4s :)

What is AI?
● Not Terminator (despite our name)
● Many subfields
● Our focus: Machine learning

Big Data?

Problem Space
● Spam Classification
● Summarization
● Face Detection
● Eye Tracking
● Targeted Ads
● Recommendation Engines

Current State of ML
● Simpler models
● Most of industry barely uses logistic regression (see the sketch below)
● Many problems are binary
  o e.g. fraud, spam
● Some unsupervised (clustering, recommendations)
● Lots of ML frameworks on the JVM
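
Most of those binary problems come down to a weight vector, a dot product, and a threshold. A minimal plain-Scala sketch of logistic-regression scoring (the weights and features below are made up for illustration):

  object LogisticDemo extends App {
    def sigmoid(z: Double): Double = 1.0 / (1.0 + math.exp(-z))

    // score = sigmoid(w . x + b); in practice w and b come from training
    def score(w: Array[Double], b: Double, x: Array[Double]): Double =
      sigmoid(w.zip(x).map { case (wi, xi) => wi * xi }.sum + b)

    // e.g. spam vs. not-spam on two hand-picked features
    val isSpam = score(Array(1.2, -0.7), b = -0.3, x = Array(3.0, 1.0)) > 0.5
    println(isSpam) // true: sigmoid(2.6) is about 0.93
  }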

ML Frameworks on JVM...
● Apache Mahout
● Spark’s MLlib
● Weka (is that R?)

ML GUIs
● Prediction.io
● Encog

Problems
● Monolithic
● Makes assumptions about data
● Hard to use
● No separation of concerns

Ring a Bell?
● We call that “monolithic”
● Separate ML concerns:
  o Data pipelines/vectorization
  o Scoring
  o Model training
  o Evaluation

Micro-Services + ML?
● Kinda like micro-services
● Reduce lock-in
● Take math, data cleaning, model training, choosing algorithms...
● ...and separate them (see the sketch below)
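
A minimal sketch of that separation as Scala traits (these names are hypothetical, not Deeplearning4j’s API):

  // Each ML concern behind its own small interface, so implementations can be swapped
  trait Vectorizer[Raw] {
    def vectorize(input: Raw): Array[Double]
  }

  trait Trainer[Model] {
    def train(features: Seq[Array[Double]], labels: Seq[Double]): Model
  }

  trait Scorer[Model] {
    def score(model: Model, features: Array[Double]): Double
  }

  trait Evaluator {
    def evaluate(predictions: Seq[Double], labels: Seq[Double]): Double
  }

A pipeline then composes whichever implementations fit the job, instead of being locked into one framework’s stack.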

Math
● Parametric models (matrices!)
● Non-parametric (random forests)
● Focusing on matrices (the hard part of ML systems)

Matrices
● NDArrays (> 2d)
● Tensors (think of pages of matrices)
● Example: 2 x 2 x 2 (two 2 x 2 matrices stacked along a third axis; see the sketch below)
● Applies to graphs w/ sparse representations
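
What that looks like with ND4J from Scala (a sketch assuming nd4j on the classpath; API details may vary by version):

  import org.nd4j.linalg.factory.Nd4j

  object TensorDemo extends App {
    // A 2 x 2 x 2 tensor: two 2 x 2 matrices stacked along the first axis
    val tensor = Nd4j.linspace(1, 8, 8).reshape(2, 2, 2)

    // Slice out each 2 x 2 "page"
    println(tensor.slice(0)) // [[1, 2], [3, 4]]
    println(tensor.slice(1)) // [[5, 6], [7, 8]]
  }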

Chips/Hardware/Matrices
● CPUs - we work with these
● GPUs - CUDA, ditto
● FPGAs
  o Intel bought Altera, an FPGA maker, for $17 billion this month
  o The edge, the cloud

Why New Chips?

Why New Chips?
● See the numbers yourself:
● http://www.slideshare.net/airbots/cuda-29330283
● http://devblogs.nvidia.com/parallelforall/bidmach-machine-learning-limit-gpus/
● http://jcuda.org

Mixed clusters
● GPUs aren’t good for all workloads
● Because of latency
● Need to upload data: not good for small problems
● Mixed CPU/GPU clusters are the best bet

Data Pipelines
● More data will be binary
● Frameworks today can’t process binary well
● Binary data has different semantics
● Moving windows for audio (see the sketch below)
● 3d for images...
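
A moving window over raw audio, for example, is a one-liner in plain Scala (a hypothetical sketch, not Canova’s API):

  object WindowDemo extends App {
    // Pretend these are raw PCM audio samples
    val samples = Vector.tabulate(16)(i => math.sin(i * 0.5))

    // Overlapping windows of width 4 with stride 2; each window becomes one input vector
    samples.sliding(4, 2).foreach(w => println(w.mkString(", ")))
  }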

People Roll Their Own b/c
● Current frameworks assume clean data :(
● Pipelines are brittle, hard to maintain
● Moving towards being composable (reuse)

Dedicated Libraries
● Let’s focus on vectorization -- now!
● Because IoT
● Because more access to raw media
● Should fit into current big data frameworks

Scoring
● AUC
● F1
● Different loss functions
● Hyperparameter optimization

All independent
● These things work for different models
● Shouldn’t be tied to a particular system
● Should be embeddable (see the sketch below)
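
As an example of how embeddable these metrics are, F1 is a few lines of Scala with no framework attached (a minimal sketch from raw counts):

  object F1Demo extends App {
    // F1 = harmonic mean of precision and recall
    def f1(tp: Int, fp: Int, fn: Int): Double = {
      val precision = tp.toDouble / (tp + fp)
      val recall    = tp.toDouble / (tp + fn)
      2 * precision * recall / (precision + recall)
    }

    println(f1(tp = 80, fp = 10, fn = 20)) // about 0.842
  }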

Training
● Split train/test (see the sketch below)
● Sample data (no, not all the data ;) to validate the model
● Increasingly compute-intensive
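
A train/test split in plain Scala: shuffle first, then hold out a fraction for validation (a minimal sketch with a made-up dataset):

  import scala.util.Random

  object SplitDemo extends App {
    val data = (1 to 100).toVector // stand-in for real examples

    // Shuffle, then hold out 20% for testing
    val shuffled = Random.shuffle(data)
    val (train, test) = shuffled.splitAt((shuffled.size * 0.8).toInt)

    println(s"train: ${train.size}, test: ${test.size}") // train: 80, test: 20
  }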

Deep Learning
● Most done in Python...
● Normal training time is measured in hours/days -- weeks!?
● Work being done in HPC (model parallelism)
● DistBelief (data parallelism)

Automatic Learning
● Good at unstructured data
● Images, text, audio and sensors
● Quick, baseline feature engineering
● Not good at feature introspection

Or are they?

t-SNE

Where Does Scala Fit In?
● Akka - real-time streaming analytics/micro-services
● Spark - DataFrames/number crunching (see the sketch below)
● JVM key/value stores
● Pistachio (powers Yahoo’s ad network)
  o http://yahooeng.tumblr.com/post/118860853846/distributed-word2vec-on-top-of-pistachio
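
On the Spark side, training a binary classifier with MLlib takes only a few lines (a sketch against the Spark 1.x RDD API; the local master and toy data are assumptions):

  import org.apache.spark.{SparkConf, SparkContext}
  import org.apache.spark.mllib.classification.LogisticRegressionWithLBFGS
  import org.apache.spark.mllib.linalg.Vectors
  import org.apache.spark.mllib.regression.LabeledPoint

  object MLlibDemo extends App {
    val sc = new SparkContext(new SparkConf().setAppName("demo").setMaster("local[*]"))

    // Toy binary data: a label plus a feature vector per example
    val data = sc.parallelize(Seq(
      LabeledPoint(1.0, Vectors.dense(2.0, 3.0)),
      LabeledPoint(0.0, Vectors.dense(0.2, 0.1))
    ))

    val model = new LogisticRegressionWithLBFGS().setNumClasses(2).run(data)
    println(model.predict(Vectors.dense(1.5, 2.5)))

    sc.stop()
  }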

The Way We Learn Now
● Monolithic ML frameworks
● No per-chip optimizations
● No tensors (come on guys, it’s 2015...)
● Need isolation and less lock-in
● JVM is the platform to make it happen

Other Links
● http://deeplearning4j.org/
● http://nd4j.org/
● https://github.com/deeplearning4j/Canova

Questions?
● adam@skymind.io
● @agibsonccc
● github.com/agibsonccc