NeuroBayes® – Big Data Predictive Analytics for High ...feindt/ISC2013.pdf · anamnesis...
Prof. Dr. Michael Feindt, Karlsruhe Institute of Technology; Founder & Chief Scientific Advisor, Blue Yonder GmbH & Co. KG
NeuroBayes® – Big Data Predictive Analytics for High Energy Physics & "Real Life"
WHAT WILL HAPPEN?
Blue Yonder provides predictions based on data: scientifically sound, with quantified uncertainty, testable and falsifiable.
WHAT HAS HAPPENED? WHY?
DATA
PATTERN PATTERN
PREDICTIONS
Data Mining and conventional Business Intelligence Predictive Analytics
Influence on business performance
Blue Yonder is unique
Predictive Analytics Suite with a combination of statistical algorithms and neural networks
Experienced, award-winning physicists and computer scientists from renowned institutions such as CERN make up the development team: more than 70 Ph.D.s
[Figure: probability density P of sales, with the expected value E(X) and the optimal order quantity marked]
General Overview
NeuroBayes®
Sales Forecasts
Fraud Detection
Insurance Premium Optimization
Patient Treatment Optimization
Sports event predictions
....
Fundamental Research
Our roots: 30 years of elementary particle physics, peaking at the LHC at CERN. Built to understand how exactly our universe works.
LHC: 27km circumference
CERN (1960)
CERN (2005)
Photo: CERN
Our Background: High Energy Physics Fundamental research at the forefront of science
The Large Hadron Collider – 100m under ground, 27km circumference
Very strong worldwide competition to get results from data requires very strong statistical methods: fast and robust multivariate algorithms
NeuroBayes® Fundamental research at the forefront of science
Invented in 2000 for the reconstruction of b quark fragmentation in the DELPHI experiment. Further development at Phi-T, later Blue Yonder.
Several hundred successful applications in the DELPHI, CDF II, Belle, CMS, ATLAS, LHCb, H1 and AMS experiments.
More than 400 man-years of development.
Robust and fast algorithm for the reconstruction (= prediction) of
• conditional probability densities
• classifications
with extreme generalization ability by means of Bayesian regularization.
[Figure: NeuroBayes processing chain with the labels Input, Preprocessing, Significance control, Postprocessing, Output]
NeuroBayes example: The LHCb trigger: very fast intelligent decisions with NeuroBayes
At the LHC (CERN) – per experiment: 40 000 000 events per second, which translates into 1 PetaByte (1,000,000,000,000,000 Byte) per second raw data
At the LHCb experiment, 30,000 instances of NeuroBayes running in real time 24/7 filter out the "interesting" events without introducing lifetime bias
But only 1 PB of interesting data per year can be stored, so an online reduction by a factor of 1 : 10,000,000 is needed.
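The reduction factor quoted above can be checked with a few lines of arithmetic. The sketch below assumes roughly 10^7 seconds of data taking per year, a rule-of-thumb value that is not stated on the slide; with uninterrupted year-round running the factor would be about three times larger.

```python
# Rough check of the online reduction factor quoted for the LHC.
RAW_RATE_PB_PER_S = 1                    # raw data rate per experiment (from the slide)
STORABLE_PB_PER_YEAR = 1                 # storable "interesting" data per year (from the slide)
SECONDS_OF_DATA_TAKING_PER_YEAR = 10_000_000  # ASSUMPTION: about 1e7 s of beam per year

raw_pb_per_year = RAW_RATE_PB_PER_S * SECONDS_OF_DATA_TAKING_PER_YEAR
reduction_factor = raw_pb_per_year // STORABLE_PB_PER_YEAR
print(f"required reduction: 1 : {reduction_factor:,}")
```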
NeuroBayes example: Full reconstruction of B mesons at the Japanese B factory experiment Belle Fundamental research at the forefront of science
► Belle experiment at KEK/Japan
► 400 physicists from around the world
► 10 years of data taking and analysis
► World record luminosity
► > 100 publications
► Automatic hierarchical reconstruction system built from 72 NeuroBayes networks reconstructed about 1100 different reactions with a factor 2 larger efficiency than all analyses before
► Much cleaner signal
► Work performed by 2 PhD students and 1 master's student
► Corresponds to 500 "normal" PhD theses
► Corresponds to another 10 years of data taking
© Blue Yonder
Goal: find (classify) all relevant pixel information
Big Data challenge: process approx. 10 Gbit/s
Solution: NeuroBayes on Hardware
Future: intelligent decisions directly on sensor (Belle II pixel detector), before big data reaches any computer …
NeuroBayes @ hardware*:
200 million decisions per second
→ 5 ns for one decision
Belle II experiment:
utilizes 40 boards
→ 8 billion decisions per second
*features dedicated hardware board:
» NeuroBayes on FPGA
» Field Programmable Gate Array: (XILINX Virtex6 VLX75T)
» Clock frequency: 250 MHz
» Approx. 1 decision per clock cycle (fully pipelined architecture)
» Probability decision output possible
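The throughput figures on this slide follow from one another; the short calculation below reproduces them. All three constants come from the slide, and the variable names are illustrative only.

```python
# Throughput arithmetic for the FPGA board figures quoted on the slide.
DECISIONS_PER_SECOND_PER_BOARD = 200_000_000  # 200 million decisions per second
BOARDS = 40                                    # boards used in the Belle II setup
CLOCK_HZ = 250_000_000                         # 250 MHz clock frequency

ns_per_decision = 1e9 / DECISIONS_PER_SECOND_PER_BOARD
total_decisions_per_second = DECISIONS_PER_SECOND_PER_BOARD * BOARDS
decisions_per_clock_cycle = DECISIONS_PER_SECOND_PER_BOARD / CLOCK_HZ

print(f"{ns_per_decision:.0f} ns per decision")
print(f"{total_decisions_per_second:,} decisions per second on {BOARDS} boards")
print(f"{decisions_per_clock_cycle:.1f} decisions per clock cycle")
```

The last number, 0.8 decisions per cycle, is consistent with the slide's "approx. 1 decision per clock cycle" for a fully pipelined design.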
Use all available and relevant information as input, e.g. measurements from the various sub-detectors, …
NeuroBayes will extract statistically significant patterns in the data to derive the prediction.
Prediction will return the best estimator for a measurement including a statistically sound estimation of the expected spread.
[Figure: example input values (energy, momentum, direction, type, sub-detector, distance, ...) and the resulting probability density P of the particle property, with E(X) marked]
NeuroBayes from Science to Industry Predictive Analytics in High Energy Physics
Use all available and relevant information as input, e.g. article properties, previous sales, etc
NeuroBayes will extract statistically significant patterns in the data to derive the prediction.
Prediction will return e.g. the most probable sales rate including a statistically sound estimation of the expected spread.
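[Figure: example inputs (article size, picture size, colour, previous sales, brand, price; sample values M, 21%, red, 19,9, 171, 24, ...) and the resulting probability density of predicted sales, with E(X) marked]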
NeuroBayes from Science to Industry Predictive Analytics in industry
e.g. Retail
NeuroBayes allows data-driven analysis and forecasts – both in science and industry
BIG DATA
MCKINSEY
Growth of the data volume
Each day, companies receive giant quantities of structured and unstructured data from different sources
PREDICTIVE ANALYTICS
MCKINSEY
Increase of operating margins
We are familiar with your questions
Pattern recognition in data to predict risks and developments, providing a forward-looking basis for decision-making
+ 40% + 60%
We are the pioneers
SOCIAL MEDIA
PRODUCT RECOMMENDATION
RETARGETING
MERCHANDISE PLANNING
CUSTOMER ANALYSIS
SALES PREDICTIONS
PREDICTION OF RISKS
DYNAMIC PRICING
PREDICTIVE MAINTENANCE
AREAS OF APPLICATION
Technology Overview
NEUROBAYES SYSTEM
High Performance Data Transformation
Trainer
Industry-specific Output
Predictions
Simulator (industry-specific)
OPERATIONAL SYSTEM (Environment)
Expert
OUTPUT
EXPERTISE (Industry-specific models)
HISTORICAL TRANSACTION DATA
REAL-TIME FEED
BATCH INPUT
WEB UI & SERVICES
STREAM-BASED INPUT
NeuroBayes - a neural network of the 2nd generation and much more
NeuroBayes is an algorithm capable of forecasting whole probability density functions
average patient
individual patient
NeuroBayes System Working principle
Probability density function for target quantity t
Historic Data
Record a = ... b = ... c = ... .... t = …!
NeuroBayes Teacher
NeuroBayes Expert
Current Data
Record a = ... b = ... c = ... .... t = ?
Expertise
Expert System
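The teacher/expert split above can be illustrated with a deliberately naive stand-in. The sketch below is not the NeuroBayes algorithm or its API: it is a plain lookup table of histograms, and the function names `teach` and `expert` as well as the toy records are hypothetical.

```python
from collections import Counter, defaultdict

def teach(historic_records, feature_names, target_name):
    """'Teacher' step: learn from records whose target t is known.

    Returns an 'expertise': for every combination of feature values,
    a histogram of the targets observed with that combination.
    """
    expertise = defaultdict(Counter)
    for record in historic_records:
        key = tuple(record[name] for name in feature_names)
        expertise[key][record[target_name]] += 1
    return dict(expertise)

def expert(expertise, feature_names, current_record):
    """'Expert' step: return a probability distribution for the unknown t."""
    key = tuple(current_record[name] for name in feature_names)
    counts = expertise.get(key)
    if counts is None:
        return {}  # this feature combination never occurred in the historic data
    total = sum(counts.values())
    return {t: n / total for t, n in sorted(counts.items())}

# Toy historic data (ASSUMED): features a, b and known target t.
historic = [
    {"a": "red", "b": "M", "t": 2},
    {"a": "red", "b": "M", "t": 3},
    {"a": "red", "b": "M", "t": 3},
    {"a": "blue", "b": "L", "t": 0},
]
expertise = teach(historic, ["a", "b"], "t")
density = expert(expertise, ["a", "b"], {"a": "red", "b": "M"})
print(density)
```

A lookup table like this returns nothing for feature combinations it has never seen, which is one reason a model that generalizes across features is needed in practice.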
Forecasts conditional on many features
► The forecast depends on several features which themselves have several manifestations
► Features may be arbitrarily correlated
► Features can be ordered / unordered sets and continuous variables
This results in a complex & high-dimensional space – which is impossible to treat with classical methods
Example: What is the right dose for a patient if she is 56 years old, slightly overweight, works out 2 days a week, enjoys late dinners, has already been treated for 2 other diseases, etc.?
Preprocessing
Preprocessing is an extremely important step before training a network. It involves steps like:
► smoothing out statistical fluctuations and outliers in the input variables
► transforming variables to unified characteristics (mean, width)
► decorrelating variables
► finding variables with significant impact and discarding the others
The benefits of a powerful preprocessing algorithm include
► increased robustness
► better network training results (minima easier to find)
► increased training speed
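The first three preprocessing steps listed above can be sketched in a generic form. This is a minimal illustration using textbook techniques (quantile clipping, standardization, removal of the linear correlation between two variables), not the NeuroBayes preprocessing itself; the data and function names are assumptions, and variable selection is not shown.

```python
import statistics

def clip_outliers(values, lower_q=0.05, upper_q=0.95):
    """Limit extreme values to the given empirical quantiles."""
    ordered = sorted(values)
    lo = ordered[int(lower_q * (len(ordered) - 1))]
    hi = ordered[int(upper_q * (len(ordered) - 1))]
    return [min(max(v, lo), hi) for v in values]

def standardize(values):
    """Shift and scale to mean 0 and width (standard deviation) 1."""
    mean = statistics.fmean(values)
    width = statistics.pstdev(values)
    return [(v - mean) / width for v in values]

def correlation(xs, ys):
    """Pearson correlation of two already standardized variables."""
    return sum(x * y for x, y in zip(xs, ys)) / len(xs)

def decorrelate(xs, ys):
    """Remove from ys the part that is linearly explained by xs."""
    rho = correlation(xs, ys)
    residual = [y - rho * x for x, y in zip(xs, ys)]
    return standardize(residual)

# ASSUMED toy inputs: price contains one extreme outlier.
price = [9.9, 19.9, 14.9, 24.9, 12.9, 199.0, 17.9, 22.9]
sales = [40, 21, 30, 15, 35, 1, 24, 18]

x = standardize(clip_outliers(price))
y = standardize(clip_outliers(sales))
y_decorrelated = decorrelate(x, y)

print(f"correlation before: {correlation(x, y):+.2f}")
print(f"correlation after:  {correlation(x, y_decorrelated):+.2f}")
```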
Training a network
The following process describes how a neural net is trained:
1. Start with initial values
2. Measure correct forecasts
3. Vary the weights in order to increase correct forecasts
4. After some loops, an optimal set of weights will (hopefully) be found
5. Save the weights and the topology => This is your expertise (!)
► This method corresponds to an optimization in a high-dimensional and complex space, an extremely hard task (!)
► NB: Finding the global maximum can never be guaranteed
► NB: This is why the preprocessing is so important: it creates a smooth surface
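The five training steps can be written down almost literally as a random search over the weights of a single threshold unit. This is a minimal sketch under stated assumptions: the toy data, the function names and the random-perturbation strategy are illustrative, and practical network training typically uses gradient-based methods instead.

```python
import random

def predict(weights, features):
    """Single linear threshold unit: True if the weighted sum is positive."""
    bias, *coeffs = weights
    return bias + sum(c * f for c, f in zip(coeffs, features)) > 0

def count_correct(weights, data):
    """Step 2: measure how many forecasts are correct."""
    return sum(predict(weights, x) == label for x, label in data)

def train(data, n_weights, loops=2000, step=0.5, seed=1):
    rng = random.Random(seed)
    weights = [0.0] * n_weights                      # step 1: initial values
    best = count_correct(weights, data)
    for _ in range(loops):                           # step 4: repeat for some loops
        candidate = [w + rng.uniform(-step, step) for w in weights]  # step 3: vary
        score = count_correct(candidate, data)
        if score >= best:                            # keep changes that do not hurt
            weights, best = candidate, score
    return weights, best                             # step 5: the weights are the "expertise"

# ASSUMED toy data: the label is True when the two features sum to more than 1.
data = [((0.1, 0.2), False), ((0.9, 0.8), True), ((0.4, 0.3), False),
        ((0.7, 0.6), True), ((0.2, 0.5), False), ((0.8, 0.4), True)]
weights, correct = train(data, n_weights=3)
print(f"{correct} of {len(data)} training records forecast correctly")
```

The search only ever accepts a candidate that is at least as good as the current weights, so the score never decreases, but nothing guarantees that it ends at the global optimum.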
Illustration of the optimization problem in two dimensions
26.09.13 Blue Yonder at BASF
Find the deepest valley
• in only two dimensions
• you cannot look at each valley
• valleys are not smooth
• you only have limited time
Now imagine this task for 100 dimensions
• only local minima can be found
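The valley picture can be reproduced in a few lines. The sketch below uses one dimension instead of two for brevity, with an arbitrarily chosen function that has two valleys of different depth; plain gradient descent ends up in whichever valley lies downhill from its starting point.

```python
def f(x):
    """ASSUMED example landscape with two valleys of different depth."""
    return x**4 - 3 * x**2 + x

def f_prime(x):
    """Derivative of f, used as the downhill direction."""
    return 4 * x**3 - 6 * x + 1

def gradient_descent(start, learning_rate=0.01, steps=2000):
    x = start
    for _ in range(steps):
        x -= learning_rate * f_prime(x)
    return x

left = gradient_descent(start=-2.0)
right = gradient_descent(start=2.0)
print(f"start -2.0 -> x = {left:.3f}, f(x) = {f(left):.3f}")
print(f"start +2.0 -> x = {right:.3f}, f(x) = {f(right):.3f}")
```

Both runs stop where the slope is zero, but only the run starting on the left reaches the deeper valley; the run starting on the right is stuck in a local minimum.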
What is so special about forecasting whole density functions?
► Shape of the distribution becomes visible (e.g. non-Gaussian)
► Uncertainty of the forecast becomes visible and can be extracted as a number (variance, MAD of the distribution)
► Different estimators may be used, such as: mean, mode, median, p% quantile
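Once a whole density is available, each of the estimators named above is a short computation. The following sketch uses an assumed, skewed discrete forecast (probabilities of selling 0 to 5 units); the numbers are illustrative and not taken from the slides.

```python
def mean(pdf):
    """Expected value of a discrete distribution {value: probability}."""
    return sum(x * p for x, p in pdf.items())

def variance(pdf):
    """Spread of the forecast around its mean."""
    m = mean(pdf)
    return sum(p * (x - m) ** 2 for x, p in pdf.items())

def mode(pdf):
    """Most probable value."""
    return max(pdf, key=pdf.get)

def quantile(pdf, q):
    """Smallest value whose cumulative probability reaches q."""
    cumulative = 0.0
    for x in sorted(pdf):
        cumulative += pdf[x]
        if cumulative >= q - 1e-12:   # small tolerance for rounding errors
            return x
    return max(pdf)

# ASSUMED example forecast: skewed, clearly non-Gaussian.
forecast = {0: 0.30, 1: 0.25, 2: 0.20, 3: 0.12, 4: 0.08, 5: 0.05}

print("mean        :", round(mean(forecast), 2))
print("mode        :", mode(forecast))
print("median      :", quantile(forecast, 0.5))
print("90% quantile:", quantile(forecast, 0.9))
print("variance    :", round(variance(forecast), 2))
```

For this forecast the mean, median and mode all differ, which is why the choice of estimator matters when the distribution is not symmetric.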
Neural Networks – concepts adopted from neuroscience
► The human brain solves complex problems very efficiently, detects patterns and stores information (memory)
► It consists of approximately 10^11 neurons and 10^14 connections
► Oversimplified: neurons start sending signals when other neurons reach a certain activity threshold
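The oversimplified picture above corresponds to the classic threshold unit. A minimal sketch, with arbitrarily chosen weights and threshold (a negative weight stands for an inhibiting connection):

```python
def neuron_fires(inputs, weights, threshold):
    """Return True if the weighted input activity exceeds the threshold."""
    activity = sum(i * w for i, w in zip(inputs, weights))
    return activity > threshold

# ASSUMED weights: two exciting connections and one inhibiting connection.
weights = [0.6, 0.6, -0.4]

print(neuron_fires([1, 0, 0], weights, threshold=1.0))  # one active input
print(neuron_fires([1, 1, 0], weights, threshold=1.0))  # two exciting inputs
print(neuron_fires([1, 1, 1], weights, threshold=1.0))  # inhibition added
```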
NeuroBayes output interpretable as a Bayesian posterior
$\text{Posterior} = \dfrac{\text{Likelihood} \times \text{Prior}}{\text{Evidence}}$
NB1: The posterior is the probability that the theory is right given the data.
NB2: Prior distributions need to be applied carefully. A non-informative prior distribution is not flat (tax authorities know about that). Bayesian regularisation at each analysis step is essential to avoid overfitting and to select models with good generalisation properties!
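A small worked example shows how strongly the prior enters the posterior. The numbers below are assumptions chosen for illustration (for instance a fraud check in which 1% of all cases are fraudulent); they do not come from the slides.

```python
def posterior(prior, likelihood_if_true, likelihood_if_false):
    """Bayes' theorem for a yes/no hypothesis.

    posterior = likelihood * prior / evidence, where the evidence sums the
    probability of the observed data over both hypotheses.
    """
    evidence = likelihood_if_true * prior + likelihood_if_false * (1 - prior)
    return likelihood_if_true * prior / evidence

# ASSUMED numbers: the observed pattern appears in 90% of true cases
# and in 5% of false cases.
rare = posterior(prior=0.01, likelihood_if_true=0.90, likelihood_if_false=0.05)
balanced = posterior(prior=0.50, likelihood_if_true=0.90, likelihood_if_false=0.05)

print(f"posterior with  1% prior: {rare:.3f}")
print(f"posterior with 50% prior: {balanced:.3f}")
```

The same evidence yields a posterior of about 15% with the rare prior and about 95% with the balanced prior.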
Individualized Classification & Density Forecasts
► Classification
  ► Target will be true / false
  ► NeuroBayes delivers a number that can be interpreted as a Bayesian posterior probability
► Probability density forecast
  ► Individual PDF for each event with asymmetric uncertainties
Solution:
► Provision of item sales predictions on a daily basis
► Predictions for calculation of the return quota
► Creation of detailed merchandise planning suggestions
Retail Optimization of item sales predictions
Result: Improvement of predictions by 40%
Inventory improvement in the double-digit million € range per year.
"A self-learning system such as NeuroBayes suits our dynamic business model. Our prediction quality is increasing constantly and the sales quantities predicted are becoming ever more precise. The solution helps us adjust early on to future developments." Michael Sinn, Director Purchasing Support
Per item:
» Sales forecast
» Two estimates on spread (68% and 95% confidence intervals)
Sales Forecast Fashion Example: OTTO Group
Sales [units]
ROI calculations for the Otto Group
Around 7% of all perishable foods (e.g. meat, fruit & veg.) have to be disposed of in German supermarkets. That's about 89M tons of food wasted per year…
Perishable goods in supermarkets: meat, fruit & veg, bread, dairy, …
Grocery Chain: Auto Replenishment Predictions from Blue Yonder vs. In-House solution
[Figure: percentage of write-offs per calendar week (CW 06 to CW 14), forecast and actual, Blue Yonder versus in-house solution]
Overconfidence and gut feeling produced up to 40% higher write-offs in stores not fully automated by Blue Yonder
e.g. individual risk predictions for car insurance:
• Accident probability
• Claims distribution
• Large claim prediction
• Contract cancellation prediction
→ Successfully implemented at
Correlations to target variable "Ramler II-Plot"
[Figure: number of customers versus normalized premium, old tariff compared with NeuroBayes®; panel title: Premium volume]
NeuroBayes® delivers precise forecasts of the number and size of claims for each individual customer
Customer structure optimisation: Bind your "good" customers and take the "bad" customers
Profitability improvement: Simultaneously increase your total premium volume and decrease your claims rate with a fairer tariff system
Premium differentiation: NeuroBayes® adjusts premium to customer-individual risk
[Figure: claims rate versus risk, old (previous) tariff compared with NeuroBayes®]
Private health insurance claims per year are anything but normally distributed...
NeuroBayes® has the solution for difficult distributions of type
Many insured persons (fraction 1 - P) do not generate any claim.
$f(t) = (1 - P) \cdot \delta(t) + P \cdot f(t \mid t > 0)$
When there is at least one claim (fraction P), the claims are distributed according to f(t | t > 0). This distribution has "fat tails" (extremely high claims).
Difficult to handle by classical methods
$f(t \mid x) = (1 - P(x)) \cdot \delta(t) + P(x) \cdot f(t \mid t > 0, x)$
NeuroBayes® calculates for each insured person x the individualised Bayesian probability density.
Insured person x will have no claims with probability 1 - P(x).
If insured person x has any claim, the costs will be distributed according to f(t | t > 0, x).
δ(t) = Dirac delta "function" (distribution)
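The mixture of a delta peak at zero and a fat-tailed positive part is easy to simulate for a single insured person. In the sketch below the log-normal shape for f(t | t > 0, x) and all parameter values are assumptions made for illustration; the slides do not specify the form of the positive part.

```python
import math
import random

def sample_claim(p_claim, mu, sigma, rng):
    """Draw one yearly claim total from (1 - P) * delta(t) + P * f(t | t > 0)."""
    if rng.random() >= p_claim:
        return 0.0                         # delta peak at t = 0: no claim
    return rng.lognormvariate(mu, sigma)   # ASSUMED fat-tailed f(t | t > 0)

def expected_claim(p_claim, mu, sigma):
    """E[t] = P * E[t | t > 0]; the delta term contributes nothing."""
    return p_claim * math.exp(mu + sigma ** 2 / 2)

# ASSUMED parameters for one insured person x: P(x) = 0.3, log-normal (6.0, 1.0).
rng = random.Random(42)
p_claim, mu, sigma = 0.3, 6.0, 1.0
draws = [sample_claim(p_claim, mu, sigma, rng) for _ in range(100_000)]

zero_fraction = sum(d == 0.0 for d in draws) / len(draws)
sample_mean = sum(draws) / len(draws)
print(f"fraction without claim: {zero_fraction:.3f} (model: {1 - p_claim:.3f})")
print(f"mean yearly cost:       {sample_mean:.1f} (model: {expected_claim(p_claim, mu, sigma):.1f})")
```

In an individualised model, the three parameters would differ from person to person, which is what the dependence on x expresses.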
Healthcare insurance – long term prediction from anamnesis
NeuroBayes® Expert Estimation (risk premium loading)
► Expert estimations are at best random; for patients with a long history they are even systematically wrong.
► NeuroBayes® forecasts costs correctly and significantly beats expert estimations more than 10 years into the future
Revenue Forecast Example: dm, a large German drug-store chain
Key Challenge:
» Revenue prediction for each individual store,
» Used for staff planning
» Up to ½ year in advance
» Keep track of opening times, public holidays, weather,..
[Figure: sales forecast written as a double sum (formula not recoverable), with the holidays Easter, Ascension and Whitsun marked]
Forecasts for individual stores
» Prediction of the full probability density function.
» Precise forecast of the expected revenue including expected spread (68% and 95% confidence intervals)
Revenue Forecast Example: dm– Large German drug-store chain
[Figure: probability density function of revenue]
Further Examples
» Churn-Management in telecommunications
» Identify customers who have a high risk to cancel their monthly contract
» Forecast of targeted promotions and individual measures to prevent churn.
» Churn-Management for daily newspapers
» Identify 67% of all customers likely to cancel their contract by predicting the “most interesting” 10% of all customers to target.
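The "67% of cancellations within the top 10% of customers" figure is a statement about ranking quality, and the underlying measurement is simple to compute. The sketch below uses made-up scores and outcomes for 20 customers; the function name `captured_fraction` is hypothetical.

```python
def captured_fraction(scores, cancelled, top_share):
    """Fraction of all cancellations found among the top-scored customers."""
    ranked = sorted(zip(scores, cancelled), key=lambda pair: pair[0], reverse=True)
    n_top = round(top_share * len(ranked))
    found = sum(flag for _, flag in ranked[:n_top])
    return found / sum(cancelled)

# ASSUMED toy data: 20 customers with predicted churn scores (highest first)
# and the actual outcome (1 = cancelled). Three customers cancelled.
scores = [0.95 - 0.045 * i for i in range(20)]
cancelled = [1, 1, 0, 0, 0, 0, 1] + [0] * 13

share = captured_fraction(scores, cancelled, top_share=0.10)
print(f"top 10% of customers contain {share:.0%} of all cancellations")
```

Targeting 10% of customers at random would be expected to capture only about 10% of the cancellations, so the ratio of the two numbers measures the gain from the prediction.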
"Blue Yonder beats all our churn prediction models. The more complicated and challenging the task the better. NeuroBayes® outperforms the competition."
[Figure: fraction of cancellations]
Further Examples
Risk-Management
Among all customers who have a high risk of not paying their debts, identify those who are most likely to pay their outstanding debts.
Many more
[Figure: amounts in € (0 € to 300,000 €) after 35 days and 60 days, test group (NeuroBayes) versus control group (conventional approach)]
Prognosis of sports events from historical data: NeuroNetzer
Results: Probabilities for home / tie / guest
Blue Yonder: Awards for Big Science Startup
Three-time winner of the Data Mining Cup
Retail Technology Award 2012
bwcon Hightech Award 2012
Special Prize Deutsche Boerse 2012
DLD 2013: Best Enterprise Solution
Finalist 2012 Finalist 2013
Disclaimer
This Presentation (the Presentation) has been prepared by Blue Yonder GmbH & Co KG (collectively, with any officer, director, employee, advisor or agent of any of them, the Preparers) for the purpose of setting out certain confidential information in respect of Blue Yonder’s business activities and strategy. References to the “Presentation” includes any information which has been or may be supplied in writing or orally in connection with the Presentation or in connection with any further inquiries in respect of the Presentation. This Presentation is for the exclusive use of the recipients to whom it is addressed. This Presentation and the information contained herein is confidential. In addition to the terms of any confidentiality undertaking that a recipient may have entered into with Blue Yonder, by its acceptance of the Presentation, each recipient agrees that it will not, and it will procure that each of its agents, representatives, advisors, directors or employees (collectively, Representatives), will not, and will not permit any third party to, copy, reproduce or distribute to others this Presentation, in whole or in part, at any time without the prior written consent of Blue Yonder, and that it will keep confidential all information contained herein not already in the public domain and will use this Presentation for the sole purpose of setting out [familiarizing itself with] certain limited background information concerning Blue Yonder and its business strategy and activities. The foregoing confidentiality obligation shall be legally binding for the recipient infinitely. This Presentation is not intended to serve as basis for any investment decision. If a recipient has signed a confidentiality undertaking with Blue Yonder, this Presentation also constitutes Confidential Information for the purposes of such undertaking. 
While the information contained in this Presentation is believed to be accurate, the Preparers have not conducted any investigation with respect to such information. The Preparers expressly disclaim any and all liability for representations or warranties, expressed or implied, contained in, or for omissions from, this Presentation or any other written or oral communication transmitted to any interested party in connection with this Presentation so far as is permitted by law. In particular, but without limitation, no representation or warranty is given as to the achievement or reasonableness of, and no reliance should be placed on, any projections, estimates, forecasts, analyses or forward looking statements contained in this Presentation which involve by their nature a number of risks, uncertainties or assumptions that could cause actual results or events to differ materially from those expressed or implied in this Presentation. Only those particular representations and warranties which may be made in a definitive written agreement, when and if one is executed, and subject to such limitations and restrictions as may be specified in such agreement, shall have any legal effect. By its acceptance hereof, each recipient agrees that none of the Preparers nor any of their respective Representatives shall be liable for any direct, indirect or consequential loss or damages suffered by any person as a result of relying on any statement in or omission from this Presentation, along with other information furnished in connection therewith, and any such liability is expressly disclaimed. Except to the extent otherwise indicated, this Presentation presents information as of the date hereof. The delivery of this Presentation shall not, under any circumstances, create any implication that there will be no change in the affairs of Blue Yonder after the date hereof. 
In furnishing this Presentation, the Preparers reserve the right to amend or replace this Presentation at any time and undertake no obligation to update any of the information contained in the Presentation or to correct any inaccuracies that may become apparent. This Presentation shall remain the property of Blue Yonder. Blue Yonder may, at any time, request any recipient, or its Representatives, shall promptly deliver to Blue Yonder or, if directed in writing by Blue Yonder, destroy all confidential information relating to this Presentation received in written, electronic or other tangible form whatsoever, including without limitation all copies, reproductions, computer diskettes or written materials which contain such confidential information. At such time, all other notes, analyses or compilations constituting or containing confidential information in the recipient’s, or their Representatives’, possession shall be destroyed. Such destruction shall be certified to Blue Yonder by the recipient in writing. Neither the dissemination of this Presentation nor any part of its contents is to be taken as any form of commitment on the part of the Preparers or any of their respective affiliates to enter any contract or otherwise create any legally binding obligation or commitment. The Preparers expressly reserve the right, in their absolute discretion, without prior notice and without any liability to any recipient to terminate discussions with any recipient or any other parties. The distribution of this Presentation in certain jurisdictions may be restricted by law and, accordingly, recipients of this Presentation represent that they are able to receive this Presentation without contravention of any unfulfilled registration requirements or other legal restrictions in the jurisdiction in which they reside or conduct business.