Retail POS Data Analytics Using MS BI Tools


Business Intelligence White Paper


Introduction

Overview

There is no doubt that businesses today are driven by data. Companies, big or small, put considerable effort into collecting large amounts of data from a wide range of sources and media, such as contact details, financial and operational data, buyer behavior, and even social media. With the help of this data, companies try to understand their own strengths and weaknesses, as well as their competitors’, in order to make better business decisions.

In the retail sector, retailers often have to rely on store-level POS data, which is huge in volume, is generated daily or in real time, and is not well organized for analysis. Therefore, to understand their customers and provide them with the best service and shopping experience, store / retail operators need to convert raw store data into intelligent information.

This paper describes how organizations can use business intelligence tools and technologies to derive meaningful information from large volumes of data.

Purpose

The purpose of this paper is to highlight an architectural and technical approach to optimizing the analysis of retail Point of Sale (POS) data.

Scope

The scope of this paper is limited to the basic concepts, tools, and technologies of Microsoft Business Intelligence.

Intended Audience

This paper is intended for decision makers, strategists, and top-level management professionals who make critical decisions at the strategic, tactical, and operational levels of their organizations.

© Contata Solutions 2015 Page 2


Problem Statement

In today’s competitive retail business landscape, some of the major challenges faced by retail / store operators worldwide are:

• Aligning the speed of data capture: recording and converting data into information so as to make the right decisions at the right time
• Breaking information silos
• Data and information consistency at every level of an organization
• Trend / pattern analysis to make tactical and strategic decisions

Since critical information directly affects sales and profitability, retail / store operators need to make quick strategic, marketing, and operational decisions. The unavailability of such information often has a disastrous business impact, such as:

• Ineffective decision making due to unprocessed and incorrect data
• Loss of time and money involved in extracting and compiling information from multiple locations / systems / subsystems
• A time gap between the availability of information and the communication needed to act on it
• Misalignment among strategic, tactical, and operational decisions

Microsoft DWBI Tools and Technologies

SQL Server

SQL Server 2014 Standard vs. Enterprise Edition: SQL Server is used as the relational database, both to store the transactional data and to define and store the data warehouse. Opting for the Enterprise Edition over the Standard Edition can improve performance significantly, because the Enterprise Edition offers features such as table partitioning and columnstore indexes.

SQL Server Integration Services (SSIS)

SSIS provides Extract, Transform and Load (ETL) capabilities for data import, data integration, and data warehousing needs. Its GUI tools help build workflows that extract data from various sources, query and transform it, and convert the processed data into the required shape. It can be used in day-to-day business operations as well as in data mining and data warehouse applications.

SQL Server Analysis Services (SSAS)

SSAS adds OLAP and data mining capabilities to SQL Server databases.

SQL Server Reporting Services (SSRS)

SSRS provides a server-based platform designed to support a wide variety of reporting needs. It delivers relevant information across the entire enterprise and helps in creating and managing both static and parameterized reports.


Technical Solution

Contata Solutions undertook a project to help a retail store analyze its POS data. The data was in CSV format, and the analysis had to cover customer segmentation, geography, product consolidation, and seasonality / trend analysis. The project was executed on the basis of the requirements given below.

Source Database

Source data was available as multiple CSV files.

Required Outcome

The following analysis outcomes were required:

Customer Analysis

• Customer segmentation on the basis of:
  o Number of days since last visit
  o Number of orders in past 12 months
  o Dollar value of transactions
• Polarity between high-value and low-value customers
• Customer loyalty

Store Analysis

• Total number of store visits on a daily, weekly, and monthly basis
• Total sales on a daily, weekly, and monthly basis
• High-selling products

Product Analysis

• Products commonly bought together
• Sales by product category
• Product consolidation strategy on the basis of high-value, low-cost products

Seasonality / Trend Analysis

• Average order value by month
• Sales during the festival season
• Increase in sales of a particular product during a baseball or football series
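The three customer segmentation inputs listed under Customer Analysis (days since last visit, orders in the past 12 months, and dollar value of transactions) can be derived from raw transaction rows. The Python sketch below is only an illustration: the sample rows, the `segment_inputs` function, and the choice to total the dollar value over all available rows rather than over a fixed window are assumptions, not details taken from the project.

```python
from collections import defaultdict
from datetime import date, timedelta

# Hypothetical sample of POS transactions: (customer_id, visit_date, amount).
transactions = [
    ("C1", date(2015, 1, 5), 120.0),
    ("C1", date(2015, 3, 20), 80.0),
    ("C2", date(2014, 2, 1), 15.0),
    ("C2", date(2014, 11, 10), 30.0),
    ("C3", date(2015, 3, 28), 500.0),
]


def segment_inputs(rows, as_of):
    """Return {customer: (days_since_last_visit, orders_last_12_months, total_value)}."""
    window_start = as_of - timedelta(days=365)  # assumed definition of "past 12 months"
    last_visit = {}
    orders = defaultdict(int)
    value = defaultdict(float)
    for customer, visit_date, amount in rows:
        # Track the most recent visit per customer.
        if customer not in last_visit or visit_date > last_visit[customer]:
            last_visit[customer] = visit_date
        # Count only orders that fall inside the 12-month window.
        if visit_date > window_start:
            orders[customer] += 1
        # Assumption: dollar value is summed over every row, not just the window.
        value[customer] += amount
    return {
        c: ((as_of - last_visit[c]).days, orders[c], value[c])
        for c in last_visit
    }


metrics = segment_inputs(transactions, as_of=date(2015, 3, 31))
```

Each customer ends up with one tuple of three numbers, which a segmentation rule (for example, thresholds per metric) could then bucket into high-value and low-value groups.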

Hardware Infrastructure

The following hardware infrastructure was used for the project:

Server 1
DB server: 8-core processor, 64 GB RAM, storage size: 1.5 TB

Server 2
SSIS server: 8-core processor, 64 GB RAM, storage size: 0.5 TB

Decision on SQL Server Edition

Case 1: SQL Server 2014 Standard Edition

SQL Server 2014 Standard Edition was used initially, but the following issues were encountered:

• When the data was transferred from CSV into the SQL Server staging database, its size was approximately 100 GB, with 80% of the data concentrated in 2 main tables holding daily transactions and transaction line item details.
• Counting the number of records, of which there were over 200 million, took 2 minutes.
• Optimizing the database required features such as table partitioning and columnstore indexes. However, Standard Edition did not have those features, so it was decided to move to SQL Server Enterprise Edition.


Case 2: SQL Server 2014 Enterprise Edition

To improve performance, the following steps were taken:

1. SSIS optimization

To get the best results for data loads, SSIS and the database were run on two separate servers. This was because SQL Server uses user-mode cooperative multitasking and resource control that assume full ownership of the system, and it therefore tends to consume all the available memory. In addition, Lookups were cached to improve performance.
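The benefit of caching Lookups can be illustrated outside SSIS. The Python sketch below is an analogy rather than SSIS code: `ReferenceTable`, `PRODUCT_DIM`, and the SKU values are hypothetical, and the simulated table simply counts how often it is queried. Loading the reference data once and resolving every row from memory is similar in spirit to a fully cached Lookup.

```python
# Hypothetical reference data standing in for a dimension table.
PRODUCT_DIM = {"SKU-1": 101, "SKU-2": 102, "SKU-3": 103}


class ReferenceTable:
    """Simulated database table that counts how often it is queried."""

    def __init__(self, rows):
        self._rows = rows
        self.queries = 0

    def fetch_one(self, sku):
        self.queries += 1
        return self._rows.get(sku)

    def fetch_all(self):
        self.queries += 1
        return dict(self._rows)


def load_without_cache(source_rows, table):
    # One round trip to the reference table per incoming row.
    return [table.fetch_one(sku) for sku in source_rows]


def load_with_cache(source_rows, table):
    # One round trip up front, then every row is resolved from memory.
    cache = table.fetch_all()
    return [cache.get(sku) for sku in source_rows]


source = ["SKU-1", "SKU-2", "SKU-1", "SKU-3", "SKU-1"]
uncached = ReferenceTable(PRODUCT_DIM)
cached = ReferenceTable(PRODUCT_DIM)
keys_a = load_without_cache(source, uncached)
keys_b = load_with_cache(source, cached)
```

Both loads produce the same surrogate keys, but the uncached version queries the reference table once per source row, which becomes costly at the scale of hundreds of millions of records.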

2. Table partition: the table is partitioned on the basis of months

a. A hard drive with 250 GB of storage was selected, keeping in mind the scope for future scalability of both the transaction database and the data warehouse.
b. A filegroup was created for each month, and each quarter’s filegroups were mapped to the hard drive.
c. Data was partitioned on the basis of months, with the data for the first month of any year going to partition range 1 (see the diagram below).
d. A partition scheme was defined to map each partition range to a filegroup.
e. The table was associated with the partition scheme on the month field during table creation.


SSIS packages read the data for each partition from the staging database and transferred it to the data warehouse. The main tables of both the staging database and the data warehouse were partitioned.
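The month-based routing described in the partitioning steps can be sketched in Python. This is a simplified model, not the project's actual partition function: the boundary values 1 to 11, the rule that a boundary value belongs to the partition on its left, and the `FG_MONTH_nn` filegroup names are all assumptions made for illustration.

```python
from bisect import bisect_left
from collections import Counter
from datetime import date

# Assumed boundary values on the month field: 1..11 gives 12 partitions.
BOUNDARIES = list(range(1, 12))


def partition_number(transaction_date):
    """1-based partition for a date; each boundary value falls in the partition on its left."""
    return bisect_left(BOUNDARIES, transaction_date.month) + 1


def filegroup_for(partition):
    """Hypothetical naming convention: one filegroup per month partition."""
    return f"FG_MONTH_{partition:02d}"


sample_dates = [
    date(2013, 1, 15),
    date(2014, 1, 3),
    date(2014, 6, 30),
    date(2014, 12, 25),
]

# January rows from different years land in the same partition (range 1).
rows_per_partition = Counter(partition_number(d) for d in sample_dates)
```

Because the key is the month alone, January data from any year is routed to partition 1, which matches the description above, and a package that loads one partition at a time only has to touch the rows for that month.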

3. Clustered columnstore index

Since reports had to be built on the data warehouse, a clustered columnstore index was used to improve query performance over traditional row-oriented storage. The improvement came from storing the data in a columnar format and compressing it, so that queries read far less data than they would from the uncompressed rows. As a result, the time taken to count the total number of records dropped from 2 minutes to 2 seconds.
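Why columnar storage and compression help can be seen in a toy model. The Python sketch below is not how SQL Server implements columnstore indexes, which use more sophisticated encodings. It only shows the general idea with hypothetical rows: values of one column are stored together, repeated values compress well, and a count or sum touches a single column instead of every full row.

```python
from itertools import groupby

# Hypothetical transaction rows: (store_id, product_category, amount).
rows = [
    (1, "Grocery", 12.5),
    (1, "Grocery", 7.0),
    (1, "Apparel", 40.0),
    (2, "Apparel", 55.0),
    (2, "Apparel", 20.0),
    (2, "Grocery", 3.5),
]


def to_columns(records):
    """Pivot row-oriented records into one list per column."""
    store_ids, categories, amounts = zip(*records)
    return {
        "store_id": list(store_ids),
        "category": list(categories),
        "amount": list(amounts),
    }


def run_length_encode(values):
    """Collapse consecutive repeats into (value, run_length) pairs."""
    return [(value, len(list(run))) for value, run in groupby(values)]


columns = to_columns(rows)
encoded_store = run_length_encode(columns["store_id"])

# Counting rows only needs the run lengths of one compressed column.
row_count = sum(length for _, length in encoded_store)

# Summing amounts reads the amount column and nothing else.
total_amount = sum(columns["amount"])
```

In this toy data the six `store_id` values shrink to two pairs, and the record count is recovered from those pairs without visiting any row.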


4. Query optimization

Queries were optimized as follows:

• Name the required columns in the SELECT statement instead of using SELECT *
• Minimize the use of subqueries
• Create proper indexes on tables
• Avoid full table scans wherever possible
• Avoid “GROUP BY” over multiple keys
• Avoid fetching data through multiple left joins
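Two of the guidelines above, selecting named columns and using indexes to avoid full table scans, can be demonstrated with a small experiment. The sketch below uses Python's built-in `sqlite3` module purely as a stand-in, since SQLite is not SQL Server and its planner behaves differently. The `sales` table, its columns, and the index name are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sales (id INTEGER PRIMARY KEY, store_id INTEGER, amount REAL, note TEXT)"
)
conn.executemany(
    "INSERT INTO sales (store_id, amount, note) VALUES (?, ?, ?)",
    [(i % 50, float(i), "x") for i in range(1000)],
)

# Named columns instead of SELECT *: only the fields the report needs.
QUERY = "SELECT store_id, amount FROM sales WHERE store_id = 7"


def plan(sql):
    """Return the planner's description of each step as one string."""
    return " | ".join(row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql))


# Without an index the filter has to scan the whole table.
plan_before = plan(QUERY)

# With an index on the filter column the planner can seek instead of scan.
conn.execute("CREATE INDEX idx_sales_store ON sales (store_id)")
plan_after = plan(QUERY)

matching_rows = conn.execute(QUERY).fetchall()
```

Comparing `plan_before` with `plan_after` shows the change from a table scan to an index search. On SQL Server the equivalent check would be done by inspecting the execution plan of the query.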

5. Partial cube processing

To allow partial processing of the cube for incremental data, the cube was partitioned by month using views, with one view per month.
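The idea behind partial processing is that only the month partitions touched by new data need to be processed again. The Python sketch below models that selection step only and does not call SSAS. The year-month partition keys, the set of already processed partitions, and the incoming dates are all hypothetical.

```python
from datetime import date

# Hypothetical bookkeeping: month partitions of the cube already processed.
processed_partitions = {"2014-10", "2014-11", "2014-12"}

# Hypothetical incremental load: transaction dates that arrived since the last run.
incoming_dates = [date(2014, 12, 30), date(2014, 12, 31), date(2015, 1, 2)]


def partition_key(d):
    """Name of the month partition (and its backing view) that a date falls into."""
    return f"{d.year:04d}-{d.month:02d}"


def partitions_to_process(new_dates):
    """Only months touched by new data need processing; the rest are left alone."""
    return sorted({partition_key(d) for d in new_dates})


to_process = partitions_to_process(incoming_dates)
untouched = sorted(processed_partitions - set(to_process))
```

Here the late December rows and the first January rows mark two partitions for processing, while October and November keep their existing aggregations.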

© Contata Solutions 2015 For more information visit: www.contata.com

Summary

In summary, the following techniques were used to optimize the overall architecture and query performance:

• SSIS optimization by running SSIS on a separate server from the DB server
• SSIS optimization using lookup cache
• Query optimization
• Table partitioning
• Clustered columnstore index
• Cube partitioning
