Large-scale Machine Learning using DryadLINQ
Mihai BudiuMicrosoft Research, Silicon Valley
Ambient Intelligence: From Sensor Networks to Smart Environments and Social Media Workshop
Stanford, June 11, 2019
2
Goal of DryadLINQ
3
Software Stack
Windows Server
Cluster services
Cluster storage
Dryad
DryadLINQ
Windows Server
Windows Server
Windows Server
Applications
.Net + LINQ
4
Dryad = Execution Layer
Job (application)
Dryad
Cluster
Pipeline
Unix Shell
Machine≈
Collection
.NET objects of type T
LINQ Data Model
6
LINQ Language Summary
Where (filter)Select (map)GroupByOrderBy (sort)Aggregate (fold)Join
Input
7
LINQ
Dryad
=> DryadLINQ
8
DryadLINQ Data Model
Partition
Collection
.Net objects
9
Collection<T> collection;
static bool IsLegal(Key c);
var results = from c in collection where IsLegal(c.key) select new { Hash(c.key), c.value};
DryadLINQ = LINQ + Dryad
C#
collection
results
C# C# C#
Code
Dryad job
Data
10
Example: Natal Training
11
Natal Problem
• Recognize players from depth map• At frame rate• Low resource usage
12
Learn from Data
Motion Capture(ground truth)
Classifier
Training examplesMachine learning
Rasterize
13
Running on Xbox
14
Cluster-based training
Classifier
Training examples
Dryad
DryadLINQ
Machine learning
You can have it!
• Dryad+DryadLINQ available for download– Academic license– Commercial evaluation license
• Runs on Windows HPC platform• Dryad is in binary form, DryadLINQ in source• Requires signing a 3-page licensing agreement• http://connect.microsoft.com/site/sitehome.aspx?SiteID=891
Conclusions
17
Visual StudioLINQDryad
17
=
Top Related