Post on 08-Mar-2018
Customer Case
Smart Meter at home project
• Data loading (daily batch)
• Analytics on very large data set, ad-hoc queries
  • “average consumption on a geography and population compared to ...”
  • 2 to max 100 concurrent queries depending on queries, SLA of 10 sec to several minutes, or no SLA
• Online queries for customer portal / CRM
  • “customer X average consumption over 7 days compared to last month”
  • About 2000 concurrent queries with SLA < 2 s
Deployment Plan: 100x growth
• 300 K smart meters deployed in pilot
• 4 M by 2015
• 35 M by 2018
POC perimeter
The customer asked for the POC to be realistic with respect to the 2018 deployment
• Provided 2 years of data in TB-size files
• Deployed GemFire in the EMC lab
  • And then at 10x scale at the EMC factory in Cork, with a 3-rack Greenplum
Data Volume
Smart Meter collects data every 10 minutes x 35M, x 24x7, x 1 year
• 50 400 000 000 “rows” / year
• Several years !
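As a rough cross-check of the volume, the sketch below counts raw readings under one stated assumption: exactly one stored row per meter per 10-minute interval over a 365-day year. The slide does not define what a “row” is, so this is an illustration and not the customer's sizing method. Under that assumption the yearly total comes out well above the 50 400 000 000 quoted, which corresponds to ten days of readings, so the quoted figure presumably counts rows differently.

```python
# Assumption (not from the slides): one row per meter per 10-minute reading,
# and a 365-day year.
METERS = 35_000_000
READINGS_PER_DAY = 24 * 60 // 10          # 144 readings per meter per day

rows_per_day = METERS * READINGS_PER_DAY  # all meters, one day
rows_per_year = rows_per_day * 365        # all meters, one year

print(f"{rows_per_day:,} rows/day")       # 5,040,000,000 rows/day
print(f"{rows_per_year:,} rows/year")     # 1,839,600,000,000 rows/year
print(f"{rows_per_day * 10:,} rows in 10 days")  # 50,400,000,000 rows in 10 days
```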
SQL-centric data model provided by customer
About 15 tables
• Reference data tables
• Big Data tables
Enabling Extreme Data Scalability and Elasticity
Application Data Lives Here
Application Data Sleeps Here
Cloud Elasticity
Add or remove servers dynamically
Fabric is elastic so it can grow or shrink dynamically with no interruption of service or data loss
Big Data meets Fast Data = Greenplum + GemFire/SQLFire
[Diagram labels: Apps; OLTP; OLAP; Real-Time Data; Reference Data; Real-Time Data Processing; Real-Time Data Query**; Continuous Data Query**; Historical Data Query***; Incremental Data Load; Data Push; Lazy Write; Analytics Map Reduce]
* GemFire retains only up to X minutes of real-time data, after which the data is written to Greenplum.
For scenarios where the app requires all results, Greenplum cross-checks with GemFire to ensure that all data (real-time + historical) is returned to the query.
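The retention and lazy-write behaviour described in the notes on this slide can be sketched as a two-tier store. This is a minimal in-memory illustration, not GemFire or Greenplum code: the class `TwoTierStore`, its methods, and the 600-second window are all hypothetical names and values chosen for the example, with plain dictionaries standing in for the two products.

```python
from dataclasses import dataclass, field


@dataclass
class TwoTierStore:
    """Hypothetical stand-in: `hot` plays GemFire, `historical` plays Greenplum."""
    retention_s: float                              # the "X minutes" window, in seconds
    hot: dict = field(default_factory=dict)         # (meter_id, ts) -> value
    historical: dict = field(default_factory=dict)  # (meter_id, ts) -> value

    def put(self, meter_id, ts, value):
        """New readings always land in the hot tier first."""
        self.hot[(meter_id, ts)] = value

    def flush_expired(self, now):
        """Lazy write: move readings older than the retention window."""
        expired = [key for key in self.hot if now - key[1] > self.retention_s]
        for key in expired:
            self.historical[key] = self.hot.pop(key)
        return len(expired)

    def query_all(self, meter_id):
        """Return historical + real-time readings for one meter, ordered by time."""
        merged = {**self.historical, **self.hot}
        return sorted((ts, v) for (m, ts), v in merged.items() if m == meter_id)


store = TwoTierStore(retention_s=600)
store.put(1, 0, 100)      # old reading
store.put(1, 900, 120)    # recent reading
moved = store.flush_expired(now=1000)
print(moved)               # 1
print(store.query_all(1))  # [(0, 100), (900, 120)]
```

The combined query merges both tiers so that a caller asking for "all results" does not miss readings that are still only in the hot tier.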
Project Target Architecture
[Diagram labels]
OLTP
OLAP
Parallel batch data loading (flat file)
Real-time compute of aggregates
Tactical Queries
Web Apps / App Servers
Computers / Clients / Aggregator
Reference Data Backoffice Apps
Extraction / Analysis
(30M x 10 min = 50K/s, real time or batch, aggregate or flat file)
Out of POC Scope
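The 50K/s figure in the diagram follows from spreading one reading per meter evenly over each 10-minute interval. The helper below is a small sketch of that arithmetic; the function name is hypothetical and the even spread is an assumption, since real meters may report in bursts.

```python
def readings_per_second(meters: int, interval_minutes: int) -> float:
    """Average ingest rate if every meter reports once per interval."""
    return meters / (interval_minutes * 60)


print(readings_per_second(30_000_000, 10))         # 50000.0
print(round(readings_per_second(35_000_000, 10)))  # 58333
```

At the full 35 M meters planned for 2018, the same calculation gives a somewhat higher average rate than the 30 M used in the diagram.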
Data Model
Customer tables + many tables designed by the team for batch-computed aggregates
• All non-blue tables are aggregated, pre-computed data
Legend:
• Table without modification
• Table with field(s) modified or added
• Table added
• Aggregated table
agg_cdc_moy_commune_10min: id_caracteristique smallint, fk_commune integer, date_releve timestamp with time zone, puissance_moyenne real, petec character(2)
agg_cdc_moy_commune_1h: id_caracteristique smallint, fk_commune integer, date_releve timestamp with time zone, puissance_moyenne real, petec character(2)
agg_cdc_moy_departement_10min: id_caracteristique smallint, code_departement character(2), code_region character(2), date_releve timestamp with time zone, puissance_moyenne real, petec character(2)
(table name missing in source): id_caracteristique smallint, code_departement character(2), code_region character(2), date_releve timestamp with time zone, puissance_moyenne real, petec character(2)
agg_client_jour: fk_client bigint, date_releve date, puissance_moyenne real, puissance_cumul bigint, petec character(2)
caracteristique_client: id_caracteristique smallint, optarif character(4), psousc smallint, nb_phases smallint, type_local character(1), type_usage character(1), energie_chauffage character, energie_ecs character
cdc_profilee: fk_pdl bigint, date_profilage timestamp with time zone, puissance numeric(17,15), petec character(2)
client: id_client bigint, identite character
commune: id_commune integer, code_insee character, code_postal character, fk_station integer, densite integer, nb_foyers integer, nb_entreprises integer, nb_apart integer, nb_maison integer, nb_rp integer, nb_rse integer, nb_logvac integer
compteur: id_compteur bigint, fk_pdl bigint, fk_concentrateur bigint, date_debut timestamp with time zone, date_fin timestamp with time zone, modele character
concentrateur: id_concentrateur bigint, fk_commune integer, longitude double, latitude double, debut_hc integer, fin_hc integer
concentrateur_tranche: id_concentrateur bigint, fk_commune integer, longitude double, latitude double, debut_hc integer, fin_hc integer, debut_hp_tranche1 interval, fin_hp_tranche1 interval, debut_hp_tranche2 interval, fin_hp_tranche2 interval
conso_energie: fk_pdl bigint, date_releve_prev timestamp, date_releve_cur timestamp, energie double, fu numeric(16,15), petec character(2)
conso_releve: fk_client bigint, date_releve timestamp with time zone, date_reception_si timestamp with time zone, petec character(2), puissance integer, releve integer
contrat: id_contrat bigint, fk_client bigint, fk_pdl bigint, fk_commune integer, fk_caracteristique smallint, optarif character(4), psousc smallint
contrat_recoflux: id_contrat bigint, fk_client bigint, fk_pdl bigint, fk_commune integer, fk_caracteristique smallint, fk_fourn smallint, fk_re smallint, fk_profil smallint, type_conso_releve character(3), type_contrat character, date_debut timestamp, date_fin timestamp, optarif character(4), psousc smallint
fournisseur: id_fourn integer, identite character
france: id_commune integer, fk_station integer, nom_commune character, code_insee character, code_departement character, nom_departement character, code_region character, nom_region character
jdb_excur: fk_compteur bigint, date timestamp with time zone, phase smallint, etat character(1), tension integer
jdb_oc: fk_compteur bigint, date timestamp with time zone, etat character(1)
meteo_releve: fk_station integer, date_releve timestamp, temp_reelle double
meteo_station: id_station integer, longitude double, latitude double
pdl: id_pdl bigint, fk_commune integer, surface integer, nb_occupants integer, nb_pieces smallint, nb_phases smallint, type_local character(1), type_usage character(1), label_energie character(1), energie_chauffage character, energie_ecs character, annee_construction character, addr_local_voie_num smallint, addr_local_voie_nom character, addr_local_2 character, addr_local_3 character, addr_local_cp character, addr_local_ville character
pmax: fk_pdl bigint, date_releve timestamp with time zone, pmax integer, depassement boolean
profil: id_profil integer, libelle character
profil_coefs_new: fk_profil smallint, date timestamp with time zone, petec character(2), pa double
profil_sum_pa_glissant: fk_profil smallint, petec character(2), date_start timestamp with time zone, date_end timestamp with time zone, sum_pa double
re: id_re integer, identite character
agg_cdc_moy_national_10min: id_caracteristique smallint, petec character(2), date_releve timestamp with time zone, puissance_moyenne real
agg_cdc_moy_national_1h: id_caracteristique smallint, petec character(2), date_releve timestamp with time zone, puissance_moyenne real
(table name missing in source): id_caracteristique smallint, code_region character(2), date_releve timestamp with time zone, puissance_moyenne real, petec character(2)
(table name garbled in source): id_caracteristique smallint, code_region character(2), date_releve timestamp with time zone, puissance_moyenne real, petec character(2)
agg_concentrateur_mois: date_releve date, id_concentrateur bigint, puissance_cumul numeric, nb_client bigint
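To illustrate what a batch-computed aggregate such as agg_cdc_moy_commune_1h might hold, the sketch below averages 10-minute power readings per hour. It assumes the readings have already been joined to their id_caracteristique and fk_commune; the function name, the sample values, and the "HP" code used for petec are hypothetical, and the team's actual aggregation logic is not shown in the slides.

```python
from collections import defaultdict
from datetime import datetime


def hourly_average(readings):
    """Average `puissance` per (id_caracteristique, fk_commune, hour, petec).

    `readings` is an iterable of tuples:
    (id_caracteristique, fk_commune, date_releve, petec, puissance)
    """
    sums = defaultdict(lambda: [0, 0])  # key -> [running sum, count]
    for id_car, fk_commune, date_releve, petec, puissance in readings:
        # Truncate the timestamp to the start of its hour.
        hour = date_releve.replace(minute=0, second=0, microsecond=0)
        acc = sums[(id_car, fk_commune, hour, petec)]
        acc[0] += puissance
        acc[1] += 1
    return {key: total / count for key, (total, count) in sums.items()}


# Hypothetical sample data: two readings in the 08:00 hour, one in the 09:00 hour.
readings = [
    (1, 75056, datetime(2018, 1, 1, 8, 0), "HP", 300),
    (1, 75056, datetime(2018, 1, 1, 8, 10), "HP", 500),
    (1, 75056, datetime(2018, 1, 1, 9, 0), "HP", 400),
]
result = hourly_average(readings)
for key, puissance_moyenne in sorted(result.items()):
    print(key, puissance_moyenne)
```

Pre-computing rows like these trades extra storage and loading work for faster queries, which is the trade-off the deck returns to in the Greenplum results.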
GemFire Greenplum Architecture
[Diagram labels: GemFire Java/.NET/C/C++ client SDK (app server, etc.); GemFire Locator (active) x2; GemFire Cache Server process x2; High RAM capacity]
Disks. Lots of Disks.
Super-fast SAS disks. There are 12 in each blade, and there are 16 blades. We had 3 racks like this one. 576 disks in total.
The RIGHT hot data with GemFire
Year of deployment | Smart meter count | Data kept for | User data capacity | GemFire modules (minimum)
2011               | 100 K             | 1 year        | < 1 TB (20 GB)     | 1
Pilot 2012+        | 300 K             | 1 year        | < 1 TB (40 GB)     | 1
2015               | 4 M               | 1 year        | < 1 TB (550 GB)    | 1
2018               | 35 M              | 1 year        | 4.8 TB             | 3
2018               | 35 M              | 2 months      | 800 GB             | 1
POC                | 10 M              | 1 year        | 1.4 TB             | 2
« 20% of data for 80% of queries »
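Most of the capacity figures in the table are close to what a simple linear scaling from the 2018 full-year row would give. The sketch below assumes capacity is proportional to meter count and retention period; that proportionality is an assumption for illustration and is not stated in the slides, and the function name is hypothetical.

```python
# Baseline taken from the table: 35 M meters kept for 1 year = 4.8 TB.
BASELINE_TB = 4.8
BASELINE_METERS = 35_000_000
BASELINE_MONTHS = 12


def estimated_capacity_tb(meters: int, months: int) -> float:
    """Linear estimate of hot-data capacity (assumed model, not from the slides)."""
    return BASELINE_TB * (meters / BASELINE_METERS) * (months / BASELINE_MONTHS)


print(round(estimated_capacity_tb(35_000_000, 2), 2))   # 0.8  (table: 800 GB)
print(round(estimated_capacity_tb(10_000_000, 12), 2))  # 1.37 (table: 1.4 TB)
print(round(estimated_capacity_tb(4_000_000, 12), 2))   # 0.55 (table: 550 GB)
```

Keeping 2 months instead of 1 year is what brings the 2018 deployment back under 1 TB and down to a single GemFire module.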
GemFire results for queries (we mean customer « query », not OQL)
Concurrent queries | Time (min - max) | Time (avg)
700                | 7 - 300 ms       | 18 ms
400                | 5 - 200 ms       | 13 ms
100                | 3 - 200 ms       | 11 ms
Results: 99.98% < 1s (110ms!) at 700 concurrent
[Chart: Response Time per “query” (GemFire module), series 3.1.1, 3.1.2 and 3.1.3, at 100, 400 and 700 concurrent queries; vertical axis from 0 to 140. Data labels: 3, 6, 9; 3, 5, 7; 72, 77, 115]
700 concurrent parallel functions: we are CPU bound with 64 cores (x 100 ms each) = 640 concurrent / sec theoretical
These queries are a single get or a few gets
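The 640 per second ceiling is a simple CPU budget: each core can complete 1000 / 100 = 10 function executions per second, and there are 64 cores. A minimal sketch of that calculation, with a hypothetical helper name:

```python
def max_throughput_per_sec(cores: int, cpu_ms_per_query: float) -> float:
    """Upper bound on completions per second when every query is CPU bound."""
    return cores * (1000 / cpu_ms_per_query)


print(max_throughput_per_sec(64, 100))  # 640.0
```

With 700 functions in flight against a ceiling of about 640 per second, some requests necessarily queue, which is consistent with the system being described as CPU bound.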
Greenplum only results
[Table columns: Chapter; SLA; Before optimization (80 concurrent requests); After optimization (180 concurrent requests); Gain factor]
After 4 days of optimization and lots of pre-computed tables
• That creates lots of extra data and data loading complexity as well!
Response time of around 3 to 20 sec for Greenplum (GP)
• GemFire is 100x faster at 10x the load.
• But you still care about GP query time if you do “read through” on a cache miss!
In a real system the “online” queries would over-consume compared with analytics, so even an 800 ms response time is not “good enough” for OLTP! GemFire is a must-have.
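The point about “read through” on a cache miss can be made concrete with a small sketch. The class below is a generic read-through cache in plain Python, not the GemFire API; `slow_backend_query` is a hypothetical stand-in for a Greenplum query. The latency function uses illustrative inputs only: an 18 ms hit time (the 700-concurrent average above), a 3 s miss time (the low end of the Greenplum range), and an assumed 80% hit ratio.

```python
class ReadThroughCache:
    """Generic read-through cache: on a miss, load from the backend and keep it."""

    def __init__(self, loader):
        self._loader = loader  # called with the key on a cache miss
        self._data = {}
        self.hits = 0
        self.misses = 0

    def get(self, key):
        if key in self._data:
            self.hits += 1
            return self._data[key]
        self.misses += 1
        value = self._loader(key)  # the slow path: goes to the backend
        self._data[key] = value
        return value


def slow_backend_query(key):
    # Hypothetical stand-in for a query against the historical store.
    return f"result-for-{key}"


def expected_latency_ms(hit_ratio, hit_ms, miss_ms):
    """Average latency when misses fall through to the slow store."""
    return hit_ratio * hit_ms + (1 - hit_ratio) * miss_ms


cache = ReadThroughCache(slow_backend_query)
cache.get("customer-42")  # miss: loaded from the backend
cache.get("customer-42")  # hit: served from memory
print(cache.hits, cache.misses)                       # 1 1
print(round(expected_latency_ms(0.8, 18, 3000), 1))   # 614.4
```

Even with most requests served from memory, the average is dominated by the misses, which is why the backend query time still matters.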
GemFire and Greenplum
GemFire
• Real-time Data: Transactions
• Write-behind to Greenplum
• Read-through to Greenplum
• Query
• Continuous/Standing Query
• Data Aware Functions
• Map-Reduce
• Geographic Distribution
• Survive Network Outage
• Reach Back
• Elastic
• Linear scalability
• Multi-Terabyte Data Scale
• Built-in Load Balancing
• HA
• Runs in VM
Greenplum
• Batch updates: Full Data Set
• Query
• Complex High Performance Query
• Map-Reduce
• Elastic
• Linear scalability
• Petabyte Data Scale
• HA
• Runs in VM
ONE EMC: Greenplum & GemFire (“BELIEVE”)
[Diagram labels: Greenplum, Greenplum, GemFire, Greenplum]
Distribution
• Greenplum & GemFire from ONE EMC
Integration
• Same rack
• Inter-operable
Support
• Single point of contact
Scalability
• Scale-out architecture
Redundancy
• 1% only in volume
• 800 GB for 2 months of data