Dw allegro alain ozan.
Allegro’s DWH Implementation on
Oracle Database Machine
with OWB & OBIEE
Rafał Kudliński, BI Manager Allegro Group
Allegro Group operates market-leading e-commerce trading platforms; general, automotive and real
estate classified sites; a price comparison site; and payment web services across Eastern Europe
under various brands. Allegro Group operates 46 platforms in 13 countries.
Do you really know Allegro?
Allegro is the most successful eCommerce trading system in Poland and the largest non-eBay auction
platform worldwide.
Do you really know Allegro?
MORE THAN 14.5 MILLION USERS
Over 9,500 new users every day. Over 3.5 million new users every year.
1000 EMPLOYEES
250 employees in the IT department: 150 in the development division and 100 in other IT divisions (IS, DWH, R&D, etc.).
MORE THAN 500 MILLION PAGE VIEWS (Peak)
The number of page views has doubled within the last 3 years.
MORE THAN 90 MILLION LISTED ITEMS
The number of listed items has increased by 75 million over the past 3 years.
Our figures are extraordinary in all areas. We use leading-edge technology wherever possible. Our
data centers consume more power than 800 average households combined!
Do you really know Allegro?
Business requirements (Priority 1)
What is our internet services performance?
We need to play with data, run simulations and influence growth. Users expect quality and we
expect more sales. We need to measure the categories. We need to know whether our pricing model
is optimal and when growth is slowing down. What influences the auction success rate in
different categories? We need to know our refunds and fees. We need to analyse payment methods.
For auctions on the front page, we need to analyse the conversion rate and what drives it. We
want to measure the effect of changes in the categories. We need to benchmark countries. Which
products are sold most frequently, new or used? What is the source of our profit? What are the
results of a search, and what do users do after it? We need to know where users click and what
they do on our site.
We decided that measuring our internet services performance is most important for our business
growth. It covers both services and user areas. We have to really understand what is happening on
our websites and who our most valuable users are.
Business requirements (Priority 2 and 3)
What is our operational performance?
• We need to do our job faster, better and more efficiently. We need to know which projects to
carry out and which are profitable. We need to measure activity.
What is our marketing campaign performance?
• We need to measure the campaigns. We need to know the effect of marketing actions and the
sources of traffic, and we need to track the results of our spending.
What is our co-brand performance?
• What are the sources of new registrations?
What is our affiliate performance?
• We need to know the impact of the affiliation program.
What is the performance of the products we offer?
• We need to measure products.
Operational performance and marketing campaign effectiveness are also crucial to our business.
Co-brand and affiliate programs, as valuable sources of traffic and new user registrations, have to
be carefully monitored as well.
Agenda
Projects in Numbers
The first project took 6 months to complete, with 8 people working on it. Support from external companies
was necessary because new technology and software were being implemented.
• Project duration: 12 months
• Project team: 8 – 12 people
• Man-days spent: 800
• Active users: 120
• Implemented reports: 100
• Implemented KPIs: 160
• Biggest source system size: 7 TB
• Largest tables: 2.8 billion records
Data warehouse architecture
We load data from a real-time copy of the production system. Extraction and transformation processes
load the data into the DWH production schema. Finally, aggregations are built to improve query
performance. We use OWB as the ETL tool.
[Architecture diagram. Production Environment (2 × IBM P590 machines): Allegro Production Oracle DB and
Logical Standby Oracle DB, linked by Oracle Data Guard. Click Stream recording Environment (10 × DL360
machines): MySQL DB 1 to DB 10. Data Warehouse Environment (Oracle Database Machine): DWH Staging Area,
DWH Production and DataMart Oracle DBs, connected by Load and ETL (OWB) steps.]
Allegro DWH & BI system architecture
We use Oracle Business Intelligence Enterprise Edition (OBIEE) as our BI tool. OBIEE is connected to both
the Target and DataMart schemas. We have almost 120 active users; 10 power users run ad-hoc queries.
[Architecture diagram. Sources: Transaction Platforms and Other Systems. Data Warehouse Environment
(Oracle Database Machine, Oracle 11g): DWH Production Target and DataMart schemas. Oracle BI Server on
top, serving Ad-hoc Analysis, Interactive Dashboards, Delivers and Alerts, and the MS Office Plug-in.]
The BI Portal presents the most important reports and KPIs describing the performance of our major auction
platforms in all the countries where we operate. It provides information about open auctions, registered
users, bids, sales and charges.
Allegro Performance KPIs
Product managers can analyze a number of measures drilling down in the product category tree. They
can filter data by selecting an auction type or a seller type.
Auction Category Analysis
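The drill-down described on this slide can be pictured as a roll-up over category paths. The following is only a minimal sketch: the sample rows, the field layout and the `drill_down` function are hypothetical and are not taken from the Allegro system or from OBIEE.

```python
from collections import defaultdict

# Hypothetical sample rows: (category path, auction type, seller type, sales)
AUCTIONS = [
    (("Electronics", "Phones"), "buy_now", "company", 120.0),
    (("Electronics", "Phones"), "bidding", "private", 80.0),
    (("Electronics", "Laptops"), "buy_now", "company", 300.0),
    (("Home", "Furniture"), "bidding", "private", 50.0),
]


def drill_down(rows, parent=(), auction_type=None, seller_type=None):
    """Sum sales per child category directly below `parent`.

    Optional filters restrict the rows to one auction type or seller type.
    """
    depth = len(parent)
    totals = defaultdict(float)
    for path, a_type, s_type, sales in rows:
        # Keep only rows that live strictly below the selected tree node.
        if path[:depth] != parent or len(path) <= depth:
            continue
        if auction_type is not None and a_type != auction_type:
            continue
        if seller_type is not None and s_type != seller_type:
            continue
        totals[path[depth]] += sales
    return dict(totals)
```

Calling `drill_down(AUCTIONS)` gives totals for the top-level categories; passing `parent=("Electronics",)` moves one level down the tree, and `seller_type="private"` narrows the same view to private sellers.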
The BI Portal also contains information about IT department performance. Managers can see current
budget realization, SLAs, traffic and the status of the most important current IT projects.
IT Department KPIs
We deliver information about the number of clicks grouped by users, user locations, services, scripts and,
most importantly (not available yet), by product categories. Users can drill down to detailed information.
Click Stream analysis
The DB Machine is very efficient in all types of ETL processing. We parse, clean, merge and join
almost 500 million records each day. A number of aggregation tables are calculated and refreshed.
Lesson 2 – ETL Processing – DB Machine does it all
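The four ETL steps named on this slide (parse, clean, merge, join) can be illustrated with a toy pipeline. This is a sketch under stated assumptions, not the OWB mappings used at Allegro: the record format `user_id;script;timestamp`, the sample data and all function names are hypothetical.

```python
def parse(line):
    """Parse one raw record 'user_id;script;timestamp' (hypothetical format)."""
    parts = line.strip().split(";")
    if len(parts) != 3:
        return None  # malformed line
    user_id, script, ts = parts
    if not user_id.isdigit() or not script:
        return None
    return {"user_id": int(user_id), "script": script.lower(), "ts": ts}


def clean(records):
    """Drop records that failed parsing and exact duplicates."""
    seen = set()
    for rec in records:
        if rec is None:
            continue
        key = (rec["user_id"], rec["script"], rec["ts"])
        if key in seen:
            continue
        seen.add(key)
        yield rec


def merge(*sources):
    """Concatenate raw lines coming from several source systems."""
    for source in sources:
        yield from source


def join_users(records, users):
    """Inner join click records to a user dimension keyed by user_id."""
    for rec in records:
        user = users.get(rec["user_id"])
        if user is not None:
            yield {**rec, "location": user["location"]}


def run_etl(sources, users):
    parsed = (parse(line) for line in merge(*sources))
    return list(join_users(clean(parsed), users))


# Hypothetical sample data
SOURCE_1 = ["1;Search.php;2010-05-01T10:00:00", "bad line", "2;item.php;2010-05-01T10:00:05"]
SOURCE_2 = ["1;Search.php;2010-05-01T10:00:00", "3;item.php;2010-05-01T10:01:00"]
USERS = {1: {"location": "Poznan"}, 2: {"location": "Warsaw"}}
```

`run_etl([SOURCE_1, SOURCE_2], USERS)` drops the malformed line, removes the duplicate click that appears in both sources, and discards the record for user 3, which has no match in the user dimension.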
As usual, in order to have excellent performance, you have to think about partitioning, compression and
parallel query execution. No indices; full table scan performance needs to be considered.
Lesson 3 – Data Architecture – Standard but Improved
• We use a standard star schema with a collection of fact tables
• Our largest fact tables have almost 3 billion records (billings, clicks)
• Our largest dimension tables have more than 15 million records (users, locations)
• No indices: there is no need, and in some cases using them made performance worse
• Full Table Smart Scan works very efficiently
• We heavily use partitions (days, months) and sub-partitions (attributes)
• Compression saves space (avg. 30%) and improves performance
• Parallel query execution with the hint /*+parallel(table,8,3)*/ works very well: queries run about
10x faster on average
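Why partitioning makes index-free full scans affordable can be shown with a small in-memory model: a query only scans the partitions whose key can match, so the amount of data read shrinks with the filter. This is a conceptual sketch, not Oracle's implementation; the class, the day and attribute keys and the sample data are hypothetical.

```python
from collections import defaultdict
from datetime import date


class PartitionedFactTable:
    """Toy fact table partitioned by day and sub-partitioned by one attribute."""

    def __init__(self):
        # (day, attribute) -> list of amounts stored in that sub-partition
        self.partitions = defaultdict(list)

    def insert(self, day, attribute, amount):
        self.partitions[(day, attribute)].append(amount)

    def scan(self, day_from, day_to, attribute=None):
        """Fully scan only the partitions that can contain matching rows.

        Returns (sum of amounts, number of partitions scanned).
        """
        total, scanned = 0.0, 0
        for (day, attr), rows in self.partitions.items():
            if not (day_from <= day <= day_to):
                continue  # partition pruned by day
            if attribute is not None and attr != attribute:
                continue  # sub-partition pruned by attribute
            scanned += 1
            total += sum(rows)
        return total, scanned


# Hypothetical data: three days, two attribute values (here: country codes)
table = PartitionedFactTable()
for d in (1, 2, 3):
    for country in ("PL", "CZ"):
        table.insert(date(2010, 5, d), country, 10.0 * d)
```

With six sub-partitions loaded, `table.scan(date(2010, 5, 2), date(2010, 5, 2), "PL")` touches a single sub-partition, while a scan over the full date range with no attribute filter touches all six.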
Even with the DB Machine, aggregation tables are needed to achieve the required user interface
performance. What you get is a scalable and fast aggregation and reporting environment.
Lesson 4 – User Access – fast and reliable
• We create a number of aggregation tables to avoid joins between million-record tables
• Reports and dashboards are delivered within seconds (<5s) even with >100 users working
• OBIEE works very well with the DB Machine, especially in reporting and dashboarding
• The OBIEE ad-hoc Answers application is very powerful, but some users still need to use SQL to get
what they want (automatically generated queries are not adjusted to use all DB Machine features)
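The aggregation-table idea can be sketched with Python's built-in sqlite3 module standing in for the warehouse: the join between the fact and dimension tables is paid once at refresh time, and reports read the small pre-aggregated table. Table names, columns and data are hypothetical and only illustrate the pattern; they are not the Allegro schema.

```python
import sqlite3


def build_demo_db():
    """Create a tiny in-memory star schema with hypothetical data."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE dim_user (user_id INTEGER, country TEXT);
        CREATE TABLE fact_billing (user_id INTEGER, day TEXT, amount REAL);
        INSERT INTO dim_user VALUES (1, 'PL'), (2, 'PL'), (3, 'CZ');
        INSERT INTO fact_billing VALUES
            (1, '2010-05-01', 10.0), (2, '2010-05-01', 5.0),
            (3, '2010-05-01', 7.0), (1, '2010-05-02', 3.0);
    """)
    return conn


def refresh_aggregate(conn):
    """Rebuild the aggregation table; the fact/dimension join happens here."""
    conn.executescript("""
        DROP TABLE IF EXISTS agg_billing_by_country_day;
        CREATE TABLE agg_billing_by_country_day AS
        SELECT u.country, f.day,
               SUM(f.amount) AS total_amount,
               COUNT(*) AS billings
        FROM fact_billing f
        JOIN dim_user u ON u.user_id = f.user_id
        GROUP BY u.country, f.day;
    """)


def report(conn, country):
    """Dashboard-style query that reads only the aggregation table."""
    cur = conn.execute(
        "SELECT day, total_amount FROM agg_billing_by_country_day "
        "WHERE country = ? ORDER BY day",
        (country,),
    )
    return cur.fetchall()
```

After `conn = build_demo_db()` and `refresh_aggregate(conn)`, a call such as `report(conn, "PL")` never touches the fact table. The trade-off is that the aggregate has to be refreshed after each load.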
The Exadata storage server brings additional value without adding many tasks. It can be handled by
a DBA without any special skills.
Lesson 5 – DB Administration – no complexity
• Just a typical RAC environment
• Fully integrated with Grid Control; the storage cells are monitored with a dedicated plug-in
• Distributed command execution
• Easy storage layer administration: replace or create a disk or diskgroup with no more than 3 commands
• Comprehensive command shell on a storage cell
• Additional hardware/software components are needed for integration with the SAN backup environment
• Self-monitoring storage layer with email notifications
Support from Oracle and experienced external consultants is necessary for a successful DW & BI
implementation in a new environment.
Lesson 7 – External support – helps and speeds up
Business Needs
• Business Discovery (performed with the help of Oracle Consulting) was very valuable for prioritizing
business requirements – Jamal El Faiz
ETL Process
• Experience in massive data processing from ISE – Igor Michaljow
OBIEE
• Expertise in building robust reports and dashboards from Oracle Consulting – Alessandro Sabelli,
Małgorzata Baran, Marzena Krzanowska
DB Machine administration
• Some initial configuration was done by Oracle
• Support from the RAC PAC team (best practices, service requests)
• Update patches are released frequently
• An experienced internal DBA is crucial – Wojciech Semenowicz
Next Steps and Outlook
Right now we are working on loading Click Stream data into the DWH. We have more than 400 million page
views every day. In the next few months, data from our payment and classified services will be loaded.
[Timeline diagram: PAST, PRESENT, FUTURE]
Recommendations
• Think big, act small, and before you start, search for the right staff!
• Plan and manage the project carefully
• Have the right sponsor and support from the business side
• Oracle Database Machine is definitely the right choice
• At the beginning, support from Oracle Consulting is crucial
• Stand for Information Democracy in your company
Q&A