Forecasting Analytics
Group Project
Forecasting sales of Other Dairy (Lassi, Srikhand) & Ice-creams
Group members:
PG ID Name
61310442 Arpita Bhattad
61310801 Ushhan Gundevia
61310076 Ridhima Gupta
61310865 Kapil Dev Tejwani
61310106 Kaushik Sur
Executive Summary
Problem Definition: ABC retail is a large-format hypermarket that sells food, fashion and electronics. Data-led
insights and analytics form the foundation of all its decision making and communication. It now wishes to
ascertain customers' unique shopping needs in order to serve them better. To that end, our group chose to
study sales of everyday items: dairy products such as lassi and srikhand, and ice-creams. These belong to
two classes, “Other Dairy” and “Ice Creams & Gelato”. We analyze the data to determine trends, seasonality and
other patterns, and use them to forecast daily sales of the dairy products (lassi and srikhand) and ice-cream. The
forecasts give a reasonable prediction of future demand and help the retailer manage stock better. We model the
two series separately because they show different seasonal patterns.
Other Dairy (Lassi & Srikhand)
Brief description of the data: The plotted data showed no upward or downward trend but did show
significant weekly seasonality: sales were highest on Sunday, followed by Saturday, and lowest on Monday and
Tuesday. This was no surprise, as we expect more people to shop over the weekend and far fewer on Monday and
Tuesday, when they are well stocked; sales pick up again from Wednesday and peak towards the weekend. There
are a few outliers: sales are occasionally higher in the middle of the week owing to festivals or public
holidays.
High level description of final method and performance: To build a forecasting model that would best
capture the seasonality, we experimented with various techniques. We found that a day’s sales were
somewhat related to the previous two days’ sales, and with this knowledge we built a model that predicts sales
for the next two days from the last two days’ sales. There is not much correlation between weekly sales. Using a
Moving Average, we developed a model that can reasonably predict sales for the next 2 days. Of the models we
tested, this one showed the smallest deviation from the actual values, as measured by MAPE (mean absolute
percentage error).
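The idea of forecasting from the last two days and scoring the result with MAPE can be sketched as follows. This is a minimal illustration, assuming a trailing two-day average as the forecast; the function names and the sample quantities are hypothetical and are not taken from the report's data or tooling.

```python
def moving_average_forecast(history, window=2):
    """Forecast the next value as the mean of the last `window` observations."""
    if len(history) < window:
        raise ValueError("need at least `window` observations")
    recent = history[-window:]
    return sum(recent) / window


def mape(actual, forecast):
    """Mean absolute percentage error, in percent; days with zero sales are skipped."""
    terms = [abs(a - f) / abs(a) for a, f in zip(actual, forecast) if a != 0]
    return 100.0 * sum(terms) / len(terms)


# Hypothetical daily quantities (assumption, not the report's data).
sales = [20, 30, 40, 50, 20]

# Roll forward one day at a time: each forecast uses only the two prior days.
forecasts = [moving_average_forecast(sales[:i], window=2) for i in range(2, len(sales))]
actuals = sales[2:]
error = mape(actuals, forecasts)
```

On the last day of this toy series the forecast of 45 against an actual of 20 contributes an error of 125%, which shows how low-volume days can push MAPE above 100%.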
Ice-Cream and Gelatos
Brief description of the data: In terms of data available, this series is clearly dominated by two subclasses,
‘Cups, Cones and Bars’ and ‘Family Packs’. We analyzed each series individually and found two
kinds of seasonality:
(i) Weekly seasonality, in the form of a weekend versus weekday effect. Sales were higher on weekends than
on weekdays.
(ii) Half-yearly seasonality, in the form of summer and winter months. Sales were considerably higher in the
summer months (Mar-Aug) than in winter.
In addition, we filled the gaps in the data with the average quantity sold and identified the outliers
(festival dates), though we neither changed nor removed these values, because there wasn’t
enough data to justify our assumptions. To understand weekly variation we drew a Trellis plot of the sum of
quantity sold by day of week. It clearly shows higher demand over the weekend than on weekdays (Fig. 8 in the
Appendix).
High level description of final method and performance: While building the forecasting model, we first ran a
Naïve forecast and used its results as our reference. We then fitted a multiple
linear regression with a single (weekly) seasonality and a linear trend. Thereafter, we ran a Holt-Winters
model with a trend and additive seasonality, followed by a multiple regression with multiplicative seasonality.
Our performance measures kept improving in the order mentioned, with ‘Multiple Regression with Dual
Seasonality’ giving the best results.
Forecasts:
After training the model on the training data and validating it on the validation data, we used the model to
forecast the next two days. The forecasts are given in the table below.
Managerial Implications: For the dairy line, we conclude that the model can predict the next two days’
sales from the previous two days’ sales, so the retailer can use the forecasts to decide how much
lassi and srikhand to stock on any given day. For the Ice Cream and Gelato class, we can accurately
predict sales for the next week using a regression model. With further advancement in the IT infrastructure, an
ERP/order management system could be used to integrate the supply chain and share this information with the
dairy products supplier. This would let the supplier plan stock better and spare the retailer
the hassle of discussing the requirement every day. Needless to say, this would result in financial
gains.
Our models assume that the current pattern in demand for the products (including the trend, seasonality, level
etc.) will remain consistent with the historical data on which the forecasts are based (this includes customer
preferences and also product attributes).
Date        Forecast   LCI          UCI
9/1/2012    33         -13.88585    79.88585
9/2/2012    33         -13.88585    79.88585
Technical Summary – Details of methods used
Data
• Source: Provided by HansaCEquity
• Period: 13 months of data, from Aug 2011 to Aug 2012
• The daily transaction data is for a particular store in Mumbai and contains quantity sold and
extended price, among other variables
• Data Availability Assumption: We assume that data will be available in this format on an ongoing basis.
• Data Partitioning: We partitioned the data into a training set (Aug 2011 to Jul 2012) and a validation set (Aug
2012: 4 weeks)
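A sequential partition of this kind keeps the time order and holds out the final stretch for validation. The sketch below is illustrative only: `sequential_split` is a hypothetical helper, and the series is a placeholder with one value per calendar day, whereas the actual data set may have fewer rows (the partition output in the Appendix lists 329 training and 28 validation rows).

```python
from datetime import date, timedelta


def sequential_split(dates, values, n_validation):
    """Keep time order: the last `n_validation` points form the validation set."""
    if not 0 < n_validation < len(values):
        raise ValueError("n_validation must be between 1 and len(values) - 1")
    cut = len(values) - n_validation
    train = list(zip(dates[:cut], values[:cut]))
    valid = list(zip(dates[cut:], values[cut:]))
    return train, valid


# Placeholder series (assumption): one value per day, Aug 2011 to Aug 2012.
start = date(2011, 8, 1)
end = date(2012, 8, 31)
n_days = (end - start).days + 1
dates = [start + timedelta(days=i) for i in range(n_days)]
values = [0] * n_days  # stand-in for daily quantity sold

# Hold out the last 4 weeks (28 days) for validation.
train, valid = sequential_split(dates, values, n_validation=28)
```

Splitting by position rather than at random matters for time series: a random split would let the model see days that come after the ones it is asked to predict.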
Other Dairy Products
Data Exploration & Visualization: We used Spotfire to aggregate, visualize and explore the available data. This
exercise generated a few time series, which are shown in Fig. 1 (a, b, c) in the Appendix. We looked
at three daily series: Total Quantity Sold, Total Sales and # of Transactions per day.
Outlier treatment: We observed a few outliers in Feb 2012 and May 2012 but did not remove them, as we
believe some months may see spikes for specific reasons and our model should be able to take
such outliers into account as well.
To understand weekly variation we drew a Trellis plot of the sum of quantity sold by day of week. It clearly shows
higher demand over the weekend than on weekdays (Fig. 2 in the Appendix).
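The aggregation behind such a day-of-week view can be sketched in a few lines. The quantities below are hypothetical and only illustrate the weekend peak described above; the report itself used Spotfire for this step.

```python
from collections import defaultdict
from datetime import date, timedelta

DAY_NAMES = ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"]


def quantity_by_weekday(dates, quantities):
    """Sum quantity sold for each day of the week."""
    totals = defaultdict(float)
    for day, qty in zip(dates, quantities):
        totals[DAY_NAMES[day.weekday()]] += qty
    return {name: totals[name] for name in DAY_NAMES}


# Hypothetical two weeks of daily quantities starting on Monday 1 Aug 2011.
start = date(2011, 8, 1)
dates = [start + timedelta(days=i) for i in range(14)]
quantities = [10, 12, 18, 20, 25, 40, 55,
              11, 9, 17, 22, 27, 42, 50]

totals = quantity_by_weekday(dates, quantities)
peak_day = max(totals, key=totals.get)
```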
We ran an autocorrelation (ACF) analysis on the data and found that sales today are correlated with
sales yesterday and with sales 6 and 7 days ago (Fig. 3 in Appendix).
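For readers unfamiliar with the ACF, the sample autocorrelation at a given lag can be computed as below. This is a sketch on a hypothetical, perfectly periodic series; it is not the report's data, and real sales would give noisier values.

```python
def autocorrelation(series, lag):
    """Sample autocorrelation at `lag`: lag-k autocovariance over lag-0 autocovariance."""
    n = len(series)
    if not 0 <= lag < n:
        raise ValueError("lag must be between 0 and len(series) - 1")
    mean = sum(series) / n
    denom = sum((x - mean) ** 2 for x in series)
    if denom == 0:
        raise ValueError("series is constant")
    num = sum((series[t] - mean) * (series[t - lag] - mean) for t in range(lag, n))
    return num / denom


# Hypothetical series with a strict 7-day pattern (weekend peak), 8 weeks long.
week = [10, 10, 15, 20, 25, 40, 50]
series = week * 8

acf = {lag: autocorrelation(series, lag) for lag in range(1, 8)}
strongest_lag = max(acf, key=acf.get)
```

With a weekly pattern the lag-7 value dominates, which is the kind of signal that motivates weekly dummy variables or a 7-day seasonal period.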
Forecasting:
We produce a rolling forecast on a daily basis. The forecast will be used to predict daily sales. Methods:
• We tried multiple methods, starting with Naïve as our benchmark (Fig. 4 in Appendix). The ACF plot indicated
correlations that did not make much sense, apart from the one indicating weekly correlation. The MAPE
was 130%, so there was much scope to improve.
• Then we moved on to Multiple Linear Regression with weekly dummy variables. The MAPE was
still high, and the ACF indicated a correlation between today’s sales and the next 2 days’ sales that
was not being captured by our model (Fig. 5 (a, b, c) in Appendix). Further improvement was needed.
• Output from Moving Average with 7 days (MA7) seasonality was even worse, as MAPE increased to 241%.
But the plot of actual vs forecasted values on the validation set indicated that the model does well when
predicting the next 2 days’ sales. (Fig. 6 in Appendix)
• To improve on this, we used Holt-Winters with additive seasonality and a 7-day period, which was a great
improvement as MAPE came down to 80%. (Fig. 7 in Appendix)
• But it was evident from the earlier experiments that there is a connection with the last 2 days’ sales, so we
used Holt-Winters with a 2-day period, which improved MAPE further to 56%.
• Finally, we used MA (2), which gave the best results in terms of MAPE: 48%. Results are provided
below.
Note: The main limitation of this method, despite its good results, is that it has to be used on a rolling
forward basis and can predict accurately for only the next two days.
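The Holt-Winters steps in the list above, with either a 7-day or a 2-day period, follow the standard additive recursions for level, trend and seasonal terms. The sketch below is a plain-Python illustration of those recursions; the smoothing constants, the initialisation and the sample series are assumptions, since the report does not state the settings used in its software.

```python
def holt_winters_additive(y, period, alpha=0.2, beta=0.1, gamma=0.1, horizon=2):
    """Additive Holt-Winters with linear trend; returns `horizon` forecasts."""
    if len(y) < 2 * period:
        raise ValueError("need at least two full seasons of data")
    # Simple initialisation from the first two seasons (an assumption).
    first = sum(y[:period]) / period
    second = sum(y[period:2 * period]) / period
    level = first
    trend = (second - first) / period
    seasonal = [y[i] - first for i in range(period)]

    for t in range(period, len(y)):
        s = seasonal[t % period]
        prev_level = level
        level = alpha * (y[t] - s) + (1 - alpha) * (prev_level + trend)
        trend = beta * (level - prev_level) + (1 - beta) * trend
        seasonal[t % period] = gamma * (y[t] - level) + (1 - gamma) * s

    n = len(y)
    # Step h ahead: extrapolate the trend and reuse the matching seasonal term.
    return [level + h * trend + seasonal[(n + h - 1) % period]
            for h in range(1, horizon + 1)]


# Hypothetical series with a fixed weekly pattern and no trend.
week = [10, 10, 15, 20, 25, 40, 50]
history = week * 4

next_week = holt_winters_additive(history, period=7, horizon=7)
next_two_days = holt_winters_additive(history, period=2, horizon=2)
```

Changing `period` from 7 to 2 is the only difference between the two Holt-Winters variants described above; on real data the choice would be judged by validation error, as the report does with MAPE.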
Ice-Cream and Gelatos
Data Exploration & Visualization: We used Spotfire to aggregate, visualize and explore the available data. To
understand weekly variation we drew a Trellis plot of the sum of quantity sold by day of week. It clearly shows
higher demand over the weekend than on weekdays (Fig. 8 in the Appendix).
Forecasts:
We produce a rolling forecast on a daily basis. The forecast will be used to predict daily sales. Methods used:
• We tried multiple methods, starting with Naïve as our benchmark.
• Then we moved on to Multiple Linear Regression with weekly dummy variables.
• After learning from this model we used Moving Average with 7 days (MA7) seasonality, which
did not make much sense.
• Then we used Holt-Winters with additive seasonality and a 7-day period (Fig. 9 in Appendix).
• Finally, we used Multiple Linear Regression with weekly and half-yearly dummy
variables and a polynomial trend (t and t^2). We obtained the best MAPE in this case.
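The final regression in the list above combines day-of-week dummies, a half-yearly dummy and a quadratic trend. The sketch below shows one way such a design matrix could be built and fitted by ordinary least squares. It rests on several assumptions: Monday is taken as the baseline day, the half-yearly dummy is read as the Mar-Aug summer flag from the data description, time is rescaled to [0, 1] for numerical stability, and the data are synthetic. All function names are hypothetical, and in practice a statistics library would normally replace the hand-written solver.

```python
from datetime import date, timedelta


def design_row(day, t, n):
    """Regressors: intercept, 6 weekday dummies, summer dummy, t, t^2."""
    # Tue..Sun dummies; Monday (weekday() == 0) is the baseline.
    weekday = [1.0 if day.weekday() == k else 0.0 for k in range(1, 7)]
    summer = 1.0 if 3 <= day.month <= 8 else 0.0  # Mar-Aug
    ts = t / (n - 1)  # time rescaled to [0, 1]
    return [1.0] + weekday + [summer, ts, ts * ts]


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def fit_ols(rows, y):
    """Ordinary least squares via the normal equations (X'X) beta = X'y."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * v for r, v in zip(rows, y)) for i in range(k)]
    return solve(xtx, xty)


# Synthetic, noise-free data generated from known coefficients (assumption).
start, n = date(2011, 8, 1), 366
days = [start + timedelta(days=i) for i in range(n)]
rows = [design_row(d, t, n) for t, d in enumerate(days)]
true_beta = [20.0, 0.0, 2.0, 4.0, 6.0, 15.0, 18.0, 10.0, 5.0, -3.0]
y = [sum(b * x for b, x in zip(true_beta, r)) for r in rows]

beta = fit_ols(rows, y)
```

Because the synthetic data contain no noise, the fitted coefficients should match `true_beta` up to rounding; on real sales the same design would be fitted on the training set and scored on the validation set.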
Appendix
Fig. 1(a) – Total Daily Quantity Sold
Fig. 1(b) – Total Daily Sales (Rs.)
Fig. 1(c) – No. of Daily Transactions
Fig. 2 – Trellis for Daily Quantity Sold
Fig. 3 – ACF for Total Daily Sales
Fig. 4 – Naïve Forecast
Fig. 5 (a) – Dummy Variables for Day of Week
Columns: Row Labels, Sum of Quantity_Sold, Naïve, Residual, Day of week, Weekday_Fri, Weekday_Mon, Weekday_Sat, Weekday_Sun, Weekday_Thu, Weekday_Tue, Weekday_Wed
Data partitioning:
Data source: CategoryVar1!$D$11:$O$367
Partitioning Method: Sequential
# training rows: 329
# validation rows: 28
Fig. 5 (b) – Linear Regression with Dummy Variables
Fig. 5 (c) – ACF/PACF from Regression
Fig. 6 – Moving Average MA (7)
Error Measures (Validation)
MAPE 80.534533
MAD 26.412711
MSE 713.09044
Fig. 7 – Holt-Winters Additive Seasonality Model
Error Measures (Validation)
MAPE 56.46093
MAD 18.750176
MSE 367.86672
Fig. 10 – Holt-Winters Additive Seasonality Model
[Plot: Time Plot of Actual Vs Forecast (Validation Data). Y-axis: Sum of Quantity_Sold (0 to 45). X-axis: Row Labels. Series: Actual, Forecast]
Fig. 8 – Trellis for Daily Quantity Sold
Fig. 9 – Methods used to predict Ice-Cream Sales