Experimentation Platform at Netflix

  • Date posted: 17-Oct-2014
  • Category: Technology


Transcript of Experimentation Platform at Netflix

Page 1: Experimentation Platform at Netflix

Experimentation Platform at Netflix

[email protected]

Page 2: Experimentation Platform at Netflix

What is Experimentation?
• It’s the process of randomly dividing users into groups
– Control Group, existing experience/behavior
– Variant 1…n, one or more new experiences
– Gather behavior and core metrics for all variants
– Analyze and evaluate hypothesis by p-value
• Also known as AB Testing
• Examples
– User interface changes, new product features, changes to personalization algorithms, etc.
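To make "evaluate hypothesis by p-value" concrete, here is a minimal sketch of one common approach, a two-sided two-proportion z-test on a conversion-style metric. The slides do not say which statistical test or metric Netflix uses; the function name, the parameters and the choice of test are assumptions for illustration only.

```python
import math


def two_proportion_p_value(conv_control, n_control, conv_variant, n_variant):
    """Two-sided p-value for the difference between two conversion rates.

    Hypothetical helper: the z-test is an assumption, not the deck's method.
    """
    p_control = conv_control / n_control
    p_variant = conv_variant / n_variant
    # Pooled rate under the null hypothesis that both groups behave the same.
    pooled = (conv_control + conv_variant) / (n_control + n_variant)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_control + 1 / n_variant))
    if std_err == 0:
        return 1.0  # no variation at all, so no evidence of a difference
    z = (p_variant - p_control) / std_err
    # Two-sided tail probability of the standard normal distribution.
    return math.erfc(abs(z) / math.sqrt(2))
```

A small p-value (for example below a pre-agreed threshold such as 0.05) suggests the observed difference between control and variant is unlikely to be due to chance alone.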

Page 3: Experimentation Platform at Netflix

Random allocation to experiments

[Diagram labels: Netflix Customers; Experiment 1, Variant A; Experiment 2, Variant E; … Experiment ‘n’, Variant M; Invariant: Control]

Page 4: Experimentation Platform at Netflix

Why?

• Data driven product innovation
– http://www.quora.com/What-types-of-things-does-Netflix-A-B-test-aside-from-member-sign-up
– http://www.hakkalabs.co/articles/hive-controlled-experimentation-2
• Iteration of ideas in a controlled way
– Validate features and presentation based on data
– Fail fast

• It’s the culture

Page 5: Experimentation Platform at Netflix

User interface Experiment

Variant 2 Variant 3

Page 6: Experimentation Platform at Netflix

New feature (Profiles) Experiment

Variant 2

Variant 3

Variant 4

Page 7: Experimentation Platform at Netflix

Personalization Algorithm Experiment

No user interface change

Variant 1: Control - No change

Variant 2: Movies and TV shows

Variant 3: TV shows only

Page 8: Experimentation Platform at Netflix

Experimentation Platform

[Architecture diagram. Labeled components: Clients, HTTP, Magma UI, Allocation engine, Stratification, Segmentation rules, Event stream, memcached, Cassandra, Hive, Teradata]

Page 9: Experimentation Platform at Netflix

Allocation engine
• Responsible for assigning customers to a variant in the experiment
• Randomly distribute customers across variants
• Stratified Sampling
– Bias or variance reduction
– Varying sample size across platforms, e.g. game consoles vs Blu-ray players vs laptops
• Start, Stop and Track allocations in near real-time
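The allocation bullets above can be sketched as follows. This is a minimal illustration, assuming deterministic hash-based bucketing and a per-platform sampling rate as the stratification mechanism; the slides do not describe the actual algorithm, and `bucket`, `allocate`, `sample_rates` and the salt format are hypothetical names.

```python
import hashlib

_SPACE = 2**48  # size of the hash bucket space (assumption)


def bucket(customer_id, salt):
    """Map a customer to a stable pseudo-random integer in [0, 2**48)."""
    digest = hashlib.sha256(f"{salt}:{customer_id}".encode("utf-8")).digest()
    return int.from_bytes(digest[:6], "big")


def allocate(customer_id, experiment_id, platform, sample_rates, variants):
    """Return a variant name, or None if the customer is not sampled."""
    # Stratification: each platform (stratum) gets its own sampling rate,
    # so sample sizes can differ between e.g. consoles and laptops.
    rate = sample_rates.get(platform, 0.0)
    if bucket(customer_id, f"{experiment_id}:sample") >= rate * _SPACE:
        return None
    # Split sampled customers across variants using an independent hash,
    # so the sampling decision does not bias the variant assignment.
    return variants[bucket(customer_id, f"{experiment_id}:variant") % len(variants)]
```

Because the assignment depends only on the customer and experiment identifiers, the same customer always lands in the same variant without any lookup, which is one way to keep allocation latency low.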

Page 10: Experimentation Platform at Netflix

Segmentation
• Divide a broad target population into subsets with similar properties, e.g. customers who have not used a Tablet to access Netflix in the last ‘n’ days
• Have enough data to execute real-time
• Achieve scale and maintain low latency
• Clean data for analysis
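As an illustration of the tablet example above, a segmentation rule can be modelled as a predicate over a customer record. This is only a sketch: the record layout (`last_seen_by_device`) and the function names are hypothetical, and the slides do not show how rules are actually expressed or evaluated.

```python
from datetime import datetime, timedelta


def not_used_device_in_last_n_days(device, n_days, now):
    """Build a rule that is True when the customer has no recent `device` use."""
    cutoff = now - timedelta(days=n_days)

    def rule(customer):
        # Hypothetical record layout: device name -> last access time.
        last_seen = customer["last_seen_by_device"].get(device)
        return last_seen is None or last_seen < cutoff

    return rule


def segment(customers, rules):
    """Keep only the customers that satisfy every rule."""
    return [c for c in customers if all(rule(c) for rule in rules)]
```

Keeping rules as small composable predicates means several conditions can be combined into one segment definition and re-evaluated as customer activity changes.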

Page 11: Experimentation Platform at Netflix

Analysis
• Gather core metrics and behavior data
– Goal: Near real-time (~15 mins) data from all data sources
• Petabytes of multi-dimensional data
– Pre vs post compute/aggregate processing
– Adjust for biases and/or incidents
– Interactive analysis within given constraints
• Explore for patterns
• Scale to support fast growing business and the big datasets
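The "pre vs post compute/aggregate" trade-off can be illustrated with a small sketch: raw events are rolled up once per (variant, platform), and interactive queries then filter the much smaller rollup instead of rescanning events. The event fields, the hours-streamed metric and the function names are assumptions for illustration, not the platform's actual schema.

```python
from collections import defaultdict


def pre_aggregate(events):
    """Roll raw events up to one row per (variant, platform)."""
    totals = defaultdict(lambda: {"customers": set(), "hours": 0.0})
    for event in events:
        row = totals[(event["variant"], event["platform"])]
        row["customers"].add(event["customer_id"])
        row["hours"] += event["hours_streamed"]
    return totals


def hours_per_customer(totals, platforms=None):
    """Mean hours per customer for each variant, optionally filtered by platform."""
    customers = defaultdict(set)
    hours = defaultdict(float)
    for (variant, platform), row in totals.items():
        if platforms is not None and platform not in platforms:
            continue  # dimension filter applied on the rollup, not on raw events
        customers[variant] |= row["customers"]
        hours[variant] += row["hours"]
    return {v: hours[v] / len(customers[v]) for v in hours}
```

Pre-aggregation makes queries fast but fixes the dimensions in advance; filtering on a dimension that was not part of the rollup still requires going back to the raw data.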

Page 12: Experimentation Platform at Netflix

Magma UI

• Experiment lifecycle management
– Hypothesis, experiment variants
– Real-time rule-based segmentation
    • Define rules and conditions to identify the right population for experiments
    • Rules are dynamic and applied real-time
– Scheduling and forecasting
• Insights
– Near real-time insights
• Analysis
– Interactive analysis of petabytes of data
    • Dimension filters
    • Behavior filters
– Data visualization (trends, behaviors etc)

Page 13: Experimentation Platform at Netflix

Some Challenges
• How do we efficiently scale the system
• How do we continue to operate at low latencies, given
– Customer segmentation is based on real-time activity
– Real-time allocations have broad applications
– Data distributed across multiple clusters
• Big data (in petabytes) processing
– Billions of rows of data from various sources
– Efficient management of huge datasets to support interactive analysis
– Rich and flexible filtering to support interactive analysis
• Rich forecasting and insights
• Resiliency
• And more…

Page 14: Experimentation Platform at Netflix

Large scale and growing

Page 15: Experimentation Platform at Netflix

We’re hiring
• Work on large scale, big data and distributed systems
– Distributed systems – Server-side engineers
    • Data Structures & Algorithms
    • Concurrency, Multi-threading, Caching
– Data systems – Server-side engineers
    • Data Structures & Algorithms
    • Distributed data stores, experience processing very large (petabytes) datasets
– UI engineers – Data management and visualization
    • CSS/JavaScript
    • Previous experience in data visualization
• Contact Us
– [email protected]
– bit.ly/ExperimentationAtNetflix
– Netflix has a unique culture. Read about it here.