MarketShare confidential and proprietary 1
Predictive Analytics as a Service
Prateem Mandal, Tech Lead Architect
October 28, 2015
Analytics Life Cycle
New Client
Onboarding
Scoring & Attribution
Modeling
ETL
Scenario Analysis & Reporting
Stack Generation
Analytics Workflow
Distributed Cache
Tool
Application Engines
Calculation Engines
Execution Systems
Elastic Load Balancer
Client Onboarding
Model Store
Config Store
Attribution Funnel creation
Post Processing
Service APIs
Metadata Store
Modeling Stack Transformation Stack
ETL
Configurations
ETL
Model UAT
Attribution Models
Evaluate
Automated Model Generation
Deployment Workflow
Modeling Workflow
Analytics Life Cycle
New Client
Onboarding
Scoring & Attribution
Modeling
ETL
Discovery
Scenario Analysis & Reporting
Stack Generation
[Figure: USER SEQUENCE (labels USER, CAUSE, EFFECT, LOOKBACK) feeding an AGGREGATED USER SEQUENCE table with columns EFFECT, USER, METRIC 1, METRIC 2, ..., METRIC i.]

$$GHT(x_i) = \frac{1}{1 + \min(x_i)}$$

Example: for the values 13, 7, 5 the metric is $\frac{1}{1 + \min(13, 7, 5)} = \frac{1}{6} = 0.1666667$.
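The worked example on this slide can be checked with a few lines of Python. This is a minimal sketch: the function name `ght` and the list input are assumptions made for illustration; only the formula 1 / (1 + min(x_i)) and the values 13, 7, 5 come from the slide.

```python
def ght(values):
    """Return 1 / (1 + min(values)), the GHT metric as written on the slide."""
    if not values:
        raise ValueError("ght needs at least one value")
    return 1.0 / (1.0 + min(values))


result = ght([13, 7, 5])
print(round(result, 7))  # 0.1666667
```

Because only the minimum enters the formula, the result depends on the smallest of the values alone; the other values do not change it.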
[Figure: AGGREGATED USER SEQUENCE table with columns EFFECT, USER, METRIC 1, METRIC 2, ..., METRIC i. The values 13, 7, 5 give the metric value 0.1666667; other cell values shown are 0.9999999, 01, 1231.4553. The columns EFFECT, USER, METRIC 2 are then singled out.]
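[Figure: attribution. The EFFECT, USER, METRIC 2 row is linked to its CAUSE rows 13, 7, 5, producing EFFECT, USER, ATTR output with an effect-level ATTR of 0.20 and a per-cause ATTR. In the first variant the 0.20 is spread evenly over the three causes (0.067, 0.067, 0.067); in the second it is assigned entirely to one cause (0.0, 0.0, 0.2).]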
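The two splits shown on these slides can be reproduced with a short sketch. The function names and the explicit choice of which cause receives the full amount are assumptions for illustration; the slides show only the resulting numbers, not the rule that produces them.

```python
def split_evenly(total, causes):
    """Give every cause an equal share of the effect-level attribution."""
    if not causes:
        raise ValueError("at least one cause is required")
    share = total / len(causes)
    return {cause: share for cause in causes}


def assign_to_one(total, causes, chosen):
    """Give the whole attribution to one cause and zero to the others."""
    return {cause: (total if cause == chosen else 0.0) for cause in causes}


causes = [13, 7, 5]
even = split_evenly(0.20, causes)
single = assign_to_one(0.20, causes, chosen=5)
print({c: round(v, 3) for c, v in even.items()})  # {13: 0.067, 7: 0.067, 5: 0.067}
print(single)                                     # {13: 0.0, 7: 0.0, 5: 0.2}
```

In both variants the per-cause values add up to the effect-level attribution of 0.20.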
[Figure: the USER SEQUENCE table (USER, CAUSE, EFFECT) is extended with an ATTR column. Reporting dimensions keyed by CID (C1 ... Cm), SID (S1 ... Sp), and EID (E1 ... En) are then attached, giving rows of the form C1 ... Cm, S1 ... Sp, E1 ... En, USER, CAUSE, EFFECT, ATTR.]
stack_parameters.xml
stack_transforms.xml
stack.hql
CodeGenerator
BaaS
ConfigGenerator
stack_gen.sh
stack_gen_config.xml
ORCHESTRATION
RUN, GET, PUT
ACTION BACKEND
Input Config
Generated Config
(Diagram steps are numbered 1 to 6.)

<set>
  <name>EVT_MARKETING_SEG_DIMS_SET</name>
  <elements>
    <elements><attribute>channel_type</attribute><value>"Display"</value></elements>
  </elements>
  <elements>
    <elements><attribute>channel_type</attribute><value>"Paid Search"</value></elements>
  </elements>
</set>

<select> ........ <param>METRICS_EVTSTACK</param> ........</select>........

<set name="EVT_MARKETING_SEG_DIMS">
  <attr>channel_type</attr>
</set>

#!/bin/bash
.................
cmdstatus=`java -jar ${codegenerator_jar_dist_loc}/${code_generator_jar_name} -s ${common_stages_loc}/$PHASE_NAME -d ${script_loc}/$PHASE_NAME -c $input_schema_config_loc -g $code_generator_grammar_loc`
.................

<param>METRICS_EVTSTACK</param>
<metric> <name>NUM</name> <defn>.....</defn></metric>
<metric> <name>HASDONEBEFORE</name> <defn>.....</defn></metric>

<param>METRICS_EVTSTACK</param>
<metric> <name>HASDONEBEFORE</name> <defn>.....</defn></metric>

CREATE TABLE stackevtterms AS
SELECT
  .. AS stackmetric_hasdonebefore_event_display,
  .. AS stackmetric_hasdonebefore_event_paidsearch,
  ......
FROM
......

<set name="EVT_STACKMETRICS">
  <val enable="TRUE">HASDONEBEFORE</val>
  <val enable="FALSE">NUM</val>
</set>

#!/bin/bash
.................
cmdstatus=`hive -f $PHASE_NAME.hql`
.................
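The generated aliases above appear to follow a pattern: one column per enabled metric and per segment in EVT_MARKETING_SEG_DIMS_SET. The Python sketch below reproduces that naming under stated assumptions; it is not the CodeGenerator's actual logic. The function names, the in-memory dict and tuple shapes, and the normalization rule (lowercase, spaces removed) are inferred from the aliases on the slide.

```python
def normalize(value):
    """Lowercase a segment value and drop spaces ("Paid Search" -> "paidsearch")."""
    return value.lower().replace(" ", "")


def metric_columns(metric_flags, segments):
    """Build one alias per enabled metric and segment.

    metric_flags: dict of metric name -> enabled flag (stands in for EVT_STACKMETRICS).
    segments: list of tuples of segment values (stands in for EVT_MARKETING_SEG_DIMS_SET).
    """
    columns = []
    for metric, enabled in metric_flags.items():
        if not enabled:
            continue  # disabled metrics produce no columns
        for segment in segments:
            suffix = "_".join(normalize(v) for v in segment)
            columns.append(f"stackmetric_{metric.lower()}_event_{suffix}")
    return columns


flags = {"HASDONEBEFORE": True, "NUM": False}
segments = [("Display",), ("Paid Search",)]
for name in metric_columns(flags, segments):
    print(name)
# stackmetric_hasdonebefore_event_display
# stackmetric_hasdonebefore_event_paidsearch
```

Under the same assumptions, a two-value segment such as `("Display", "Online Display")` yields `stackmetric_hasdonebefore_event_display_onlinedisplay`, so the column count grows with the number of enabled metrics times the number of segments.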
Data change: "Organic Search" appears in EVT_MARKETING_SEG_DIMS_SET, and a matching column appears in the generated query. The rest of the configuration is as on the previous slide.

<set>
  <name>EVT_MARKETING_SEG_DIMS_SET</name>
  <elements>
    <elements><attribute>channel_type</attribute><value>"Display"</value></elements>
  </elements>
  <elements>
    <elements><attribute>channel_type</attribute><value>"Paid Search"</value></elements>
  </elements>
  <elements>
    <elements><attribute>channel_type</attribute><value>"Organic Search"</value></elements>
  </elements>
</set>

CREATE TABLE stackevtterms AS
SELECT
  .. AS stackmetric_hasdonebefore_event_display,
  .. AS stackmetric_hasdonebefore_event_paidsearch,
  .. AS stackmetric_hasdonebefore_event_organicsearch,
  ......
FROM
......
Adding segmentation dimension: evt_sub_channel joins channel_type in EVT_MARKETING_SEG_DIMS, and each segment becomes a (channel_type, evt_sub_channel) pair. The metric selection is as on the previous slide.

<set name="EVT_MARKETING_SEG_DIMS">
  <attr>channel_type</attr>
  <attr>evt_sub_channel</attr>
</set>

<set>
  <name>EVT_MARKETING_SEG_DIMS_SET</name>
  <elements>
    <elements><attribute>channel_type</attribute><value>"Display"</value></elements>
    <elements><attribute>evt_sub_channel</attribute><value>"Online Display"</value></elements>
  </elements>
  <elements>
    <elements><attribute>channel_type</attribute><value>"Display"</value></elements>
    <elements><attribute>evt_sub_channel</attribute><value>"Direct Response Video"</value></elements>
  </elements>
  <elements>
    <elements><attribute>channel_type</attribute><value>"Paid Search"</value></elements>
    <elements><attribute>evt_sub_channel</attribute><value>"Brand"</value></elements>
  </elements>
  <elements>
    <elements><attribute>channel_type</attribute><value>"Organic Search"</value></elements>
    <elements><attribute>evt_sub_channel</attribute><value>"Unknown"</value></elements>
  </elements>
</set>

CREATE TABLE stackevtterms AS
SELECT
  .. AS stackmetric_hasdonebefore_event_display_onlinedisplay,
  .. AS stackmetric_hasdonebefore_event_display_directresponsevideo,
  .. AS stackmetric_hasdonebefore_event_paidsearch_brand,
  .. AS stackmetric_hasdonebefore_event_organicsearch_unknown,
  ......
FROM
......
Selecting NUM metric: NUM is enabled in EVT_STACKMETRICS, so METRICS_EVTSTACK now carries both metrics. Segmentation is as on the previous slide.

<set name="EVT_STACKMETRICS">
  <val enable="TRUE">HASDONEBEFORE</val>
  <val enable="TRUE">NUM</val>
</set>

<param>METRICS_EVTSTACK</param>
<metric> <name>HASDONEBEFORE</name> <defn>.....</defn></metric>
<metric> <name>NUM</name> <defn>.....</defn></metric>

CREATE TABLE stackevtterms AS
SELECT
  .. AS stackmetric_hasdonebefore_event_display_onlinedisplay,
  .. AS stackmetric_hasdonebefore_event_display_directresponsevideo,
  .. AS stackmetric_hasdonebefore_event_paidsearch_brand,
  .. AS stackmetric_hasdonebefore_event_organicsearch_unknown,
  .. AS stackmetric_num_event_display_onlinedisplay,
  .. AS stackmetric_num_event_display_directresponsevideo,
  .. AS stackmetric_num_event_paidsearch_brand,
  .. AS stackmetric_num_event_organicsearch_unknown,
  ......
FROM
......
Adding new metric: GSM is added to the metric definitions and enabled in EVT_STACKMETRICS, with its steps listed in GSM_STEP. Segmentation is as on the previous slide.

<set name="EVT_STACKMETRICS">
  <val enable="TRUE">HASDONEBEFORE</val>
  <val enable="TRUE">NUM</val>
  <val enable="TRUE">GSM</val>
</set>

<set name="GSM_STEP">
  <val>90</val>
  <val>180</val>
  <val>365</val>
</set>

<param>METRICS_EVTSTACK</param>
<metric> <name>HASDONEBEFORE</name> <defn>.....</defn></metric>
<metric> <name>NUM</name> <defn>.....</defn></metric>
<metric> <name>GSM</name> <defn>.....</defn></metric>

CREATE TABLE stackevtterms AS
SELECT
  .. AS stackmetric_hasdonebefore_event_display_onlinedisplay,
  .. AS stackmetric_hasdonebefore_event_display_directresponsevideo,
  .. AS stackmetric_hasdonebefore_event_paidsearch_brand,
  .. AS stackmetric_hasdonebefore_event_organicsearch_unknown,
  .. AS stackmetric_num_event_display_onlinedisplay,
  .. AS stackmetric_num_event_display_directresponsevideo,
  .. AS stackmetric_num_event_paidsearch_brand,
  .. AS stackmetric_num_event_organicsearch_unknown,
  .. AS stackmetric_gsm90_event_display_onlinedisplay,
  .. AS stackmetric_gsm90_event_display_directresponsevideo,
  ........
  .. AS stackmetric_gsm365_event_paidsearch_brand,
  .. AS stackmetric_gsm365_event_organicsearch_unknown,
  ......
FROM
......
Adding new metric (continued): the stackevtterms query template carries a <METRICS_EVTSTACK> placeholder matching the METRICS_EVTSTACK parameter. On this slide GSM_STEP holds the single value 3 and the GSM columns carry no step suffix.

<set name="EVT_STACKMETRICS">
  <val enable="TRUE">HASDONEBEFORE</val>
  <val enable="TRUE">NUM</val>
  <val enable="TRUE">GSM</val>
</set>
<set name="GSM_STEP">
  <val>3</val>
</set>

<param>METRICS_EVTSTACK</param>
<metric> <name>NUM</name> <defn>.....</defn></metric>
<metric> <name>HASDONEBEFORE</name> <defn>.....</defn></metric>
<metric> <name>GSM</name> <defn>.....</defn></metric>

<select> ........ <param>METRICS_EVTSTACK</param> ........</select>........

stackevtterms_(<STCKGEN_RP><SDATE>i)
-- This query calculates all the NUM event metrics (stack variables or stack metrics).
-- stackevtterms_<RP_START_i> is created separately for each of the RP with RP_START in its name
DROP TABLE stackevtterms_(<STCKGEN_RP><SDATE>i);
CREATE TABLE stackevtterms_(<STCKGEN_RP><SDATE>i) AS
SELECT
  <METRICS_EVTSTACK>
  <OTHER_METRICS_EVTSTACK>
  t2.rp_start AS rp_start,
  t2.rp_end AS rp_end,
  t1.userid AS userid,
  t1.actuuid AS actuuid,
  t1.<ACTIVITYGROUP> AS <ACTIVITYGROUP>,
  t1.refdate_rp AS refdate_rp,
  t1.response AS response
FROM tmpUsersActivityDate_(<STCKGEN_RP><SDATE>i) t1
LEFT OUTER JOIN elbis_(<STCKGEN_RP><SDATE>i) t2
  ON (t1.userid = t2.userid)
WHERE (t1.sample_flag = 1)
GROUP BY t2.rp_start, t2.rp_end, t1.userid, t1.actuuid, t1.<ACTIVITYGROUP>, t1.refdate_rp, t1.response;

CREATE TABLE stackevtterms AS
SELECT
  .. AS stackmetric_hasdonebefore_event_display_onlinedisplay,
  .. AS stackmetric_hasdonebefore_event_display_directresponsevideo,
  .. AS stackmetric_hasdonebefore_event_paidsearch_brand,
  .. AS stackmetric_hasdonebefore_event_organicsearch_unknown,
  .. AS stackmetric_num_event_display_onlinedisplay,
  .. AS stackmetric_num_event_display_directresponsevideo,
  .. AS stackmetric_num_event_paidsearch_brand,
  .. AS stackmetric_num_event_organicsearch_unknown,
  .. AS stackmetric_gsm_event_display_onlinedisplay,
  .. AS stackmetric_gsm_event_display_directresponsevideo,
  .. AS stackmetric_gsm_event_paidsearch_brand,
  .. AS stackmetric_gsm_event_organicsearch_unknown,
  ......
FROM
......
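How a `<METRICS_EVTSTACK>` placeholder might be filled can be sketched with plain string substitution. This is only an illustration under assumptions: the trimmed-down `TEMPLATE`, the `render` function, and the use of `..` for the elided metric expressions are hypothetical, and the real CodeGenerator is driven by a grammar whose details the slides do not show.

```python
# A shortened, hypothetical stand-in for the stackevtterms template.
TEMPLATE = """CREATE TABLE stackevtterms AS
SELECT
<METRICS_EVTSTACK>
  t1.userid AS userid
FROM tmpUsersActivityDate t1;"""


def render(template, select_items):
    """Replace the placeholder with one 'expr AS alias,' line per select item."""
    body = "\n".join(f"  {expr} AS {alias}," for expr, alias in select_items)
    return template.replace("<METRICS_EVTSTACK>", body)


items = [
    ("..", "stackmetric_hasdonebefore_event_display"),
    ("..", "stackmetric_hasdonebefore_event_paidsearch"),
]
sql = render(TEMPLATE, items)
print(sql)
```

Keeping the query skeleton fixed and substituting only the metric list is what lets a configuration change alter the output columns without editing the query by hand.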
SUM(1 / (1 + DATEDIFF(refdate_rp,
      CASE WHEN (
        (channel_type IN ('Display') AND evt_sub_channel IN ('Online Display'))
        AND (evt_time > refdate_rp - (60 + 1) AND evt_time <= refdate_rp)
      ) THEN evt_time ELSE null END
    ) / 365)
) AS stackmetric_gsm365_event_display_onlinedisplay,

SUM(1 / (1 + DATEDIFF(refdate_rp,
      CASE WHEN (
        <and>
          <xprod>
            <in>
              <xprodlist>EVT_MARKETING_SEG_DIMS_SET<element>attribute</element></xprodlist>
              <xprodlist>EVT_MARKETING_SEG_DIMS_SET<element>value</element></xprodlist>
            </in>
          </xprod>
        </and>
        AND (evt_time > refdate_rp - (60 + 1) AND evt_time <= refdate_rp)
      ) THEN evt_time ELSE null END
    ) / <val>GSM_STEP</val>)
) AS stackmetric_gsm<val>GSM_STEP</val>_event_<xprod><xprodlist>EVT_MARKETING_SEG_DIMS_SET<element>value</element></xprodlist></xprod>,
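To make the arithmetic in the GSM expression concrete, the sketch below evaluates the same sum in Python for one user and one segment. It assumes that `DATEDIFF` returns whole days, that events outside the 61-day window contribute nothing (the `CASE` yields null and `SUM` skips nulls), and that event times can be treated as dates. The names `gsm` and `window_days` and the sample dates are hypothetical.

```python
from datetime import date, timedelta


def gsm(ref_date, event_dates, step, window_days=61):
    """Sum 1 / (1 + days_since_event / step) over events inside the lookback window."""
    window_start = ref_date - timedelta(days=window_days)
    total = 0.0
    for evt in event_dates:
        # Mirrors: evt_time > refdate_rp - (60 + 1) AND evt_time <= refdate_rp
        if window_start < evt <= ref_date:
            days = (ref_date - evt).days
            total += 1.0 / (1.0 + days / step)
    return total


ref = date(2015, 10, 28)
events = [date(2015, 10, 28), date(2015, 10, 18), date(2015, 1, 1)]
print(gsm(ref, events, step=365))
```

Here the same-day event contributes 1, the event ten days earlier contributes 365/375, and the January event falls outside the window and is ignored. A smaller step makes older events decay faster.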
MarketShare confidential and proprietary 50
stack_parameters.xml
stack_transforms.xml
stack.hql
CodeGenerator
BaaS
ConfigGenerator
stack_gen.sh
<set> <name>EVT_MARKETING_SEG_DIMS_SET</name> <elements> <elements><attribute>channel_type</attribute><value>“Display”</value></elements> <elements><attribute>evt_sub_channel</attribute><value>“Online Display”</value></elements> </elements> <elements> <elements><attribute>channel_type</attribute><value>“Display”</value></elements> <elements><attribute>evt_sub_channel</attribute><value>“Direct Response Video”</value></elements> </elements> <elements> <elements><attribute>channel_type</attribute><value>“Paid Search”</value></elements> <elements><attribute>evt_sub_channel</attribute><value>“Brand”</value></elements> </elements> <elements> <elements><attribute>channel_type</attribute><value>“Organic Search”</value></elements> <elements><attribute>evt_sub_channel</attribute><value>“Unknown”</value></elements> </elements></set>
<set name="EVT_MARKETING_SEG_DIMS"> <attr>channel_type</attr> <attr>evt_sub_channel</attr></set>
#!/bin/bash.................cmdstatus=`java -jar ${codegenerator_jar_dist_loc}/${code_generator_jar_name} -s ${common_stages_loc}/$PHASE_NAME -d ${script_loc}/$PHASE_NAME -c $input_schema_config_loc -g $code_generator_grammar_loc`.................
OR
CH
ES
TRAT
ION
RUN
GET
PUT
ACTION BACKEND
1
3
4
5stack_gen_config.xml
2
<param>METRICS_EVTSTACK</param><metric> <name>HASDONEBEFORE</name> <defn>.....</defn></metric><metric> <name>NUM</name> <defn>.....</defn></metric><metric> <name>GSM</name> <defn>.....</defn></metric>
CREATE TABLE stackevtterms ASSELECT.. AS stackmetric_hasdonebefore_event_display_onlinedisplay,.. AS stackmetric_hasdonebefore_event_display_directresponsevideo,.. AS stackmetric_hasdonebefore_event_paidsearch_brand,.. AS stackmetric_hasdonebefore_event_organicsearch_unknown,.. AS stackmetric_num_event_display_onlinedisplay,.. AS stackmetric_num_event_display_directresponsevideo,.. AS stackmetric_num_event_paidsearch_brand,.. AS stackmetric_num_event_organicsearch_unknown,.. AS stackmetric_gsm_event_display_onlinedisplay,.. AS stackmetric_gsm_event_display_directresponsevideo,.. AS stackmetric_gsm_event_paidsearch_brand,.. AS stackmetric_gsm_event_organicsearch_unknown,......FROM......
<set name="EVT_STACKMETRICS">
  <val enable="TRUE">HASDONEBEFORE</val>
  <val enable="TRUE">NUM</val>
  <val enable="TRUE">GSM</val>
</set>
<set name="GSM_STEP">
  <val>3</val>
</set>
Input Config
Adding new metric
#!/bin/bash
.................
cmdstatus=`hive -f $PHASE_NAME.hql`
.................
6
Generated Config
<param>METRICS_EVTSTACK</param>
<metric> <name>NUM</name> <defn>.....</defn> </metric>
<metric> <name>HASDONEBEFORE</name> <defn>.....</defn> </metric>
<metric> <name>GSM</name> <defn>.....</defn> </metric>
stackevtterms_(<STCKGEN_RP><SDATE>i)
-- This query calculates all the NUM event metrics (stack variables or stack metrics).
-- stackevtterms_<RP_START_i> is created separately for each RP, with RP_START in its name.
DROP TABLE stackevtterms_(<STCKGEN_RP><SDATE>i);
CREATE TABLE stackevtterms_(<STCKGEN_RP><SDATE>i) AS
SELECT
  <METRICS_EVTSTACK>
  <OTHER_METRICS_EVTSTACK>
  t2.rp_start AS rp_start,
  t2.rp_end AS rp_end,
  t1.userid AS userid,
  t1.actuuid AS actuuid,
  t1.<ACTIVITYGROUP> AS <ACTIVITYGROUP>,
  t1.refdate_rp AS refdate_rp,
  t1.response AS response
FROM tmpUsersActivityDate_(<STCKGEN_RP><SDATE>i) t1
LEFT OUTER JOIN elbis_(<STCKGEN_RP><SDATE>i) t2
  ON (t1.userid = t2.userid)
WHERE (t1.sample_flag = 1)
GROUP BY t2.rp_start, t2.rp_end, t1.userid, t1.actuuid, t1.<ACTIVITYGROUP>, t1.refdate_rp, t1.response;
<param>METRICS_EVTSTACK</param>
<metric> <name>NUM</name> <defn>.....</defn> </metric>
<metric> <name>HASDONEBEFORE</name> <defn>.....</defn> </metric>
<metric> <name>GSM</name> <defn> </defn> </metric>
<select> ........ <param>METRICS_EVTSTACK</param> ........ </select>
........
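The slides show angle-bracket placeholders such as <METRICS_EVTSTACK> inside the query template being filled in from the generated config. The actual code generator is a Java jar driven by a grammar file and is not shown, so the following is only a minimal sketch of the substitution idea. The function name render and the sample strings are hypothetical.

```python
import re


def render(template: str, params: dict) -> str:
    """Replace <NAME> placeholders with values from params.

    Placeholders with no entry in params are left untouched, so later
    passes (for example a per-run <SDATE>) can still fill them in.
    """
    def substitute(match):
        name = match.group(1)
        return params.get(name, match.group(0))

    return re.sub(r"<([A-Z_]+)>", substitute, template)


# Hypothetical template and metric expression, for illustration only.
template = "SELECT <METRICS_EVTSTACK> t1.userid AS userid FROM elbis_<SDATE> t1"
rendered = render(template, {"METRICS_EVTSTACK": "COUNT(*) AS stackmetric_num,"})
```

Leaving unknown placeholders in place is a design choice of this sketch, not something the slides confirm.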
<set>
  <name>EVT_MARKETING_SEG_DIMS_SET</name>
  <elements>
    <elements><attribute>channel_type</attribute><value>"Display"</value></elements>
    <elements><attribute>evt_sub_channel</attribute><value>"Online Display"</value></elements>
  </elements>
  <elements>
    <elements><attribute>channel_type</attribute><value>"Display"</value></elements>
    <elements><attribute>evt_sub_channel</attribute><value>"Direct Response Video"</value></elements>
  </elements>
  <elements>
    <elements><attribute>channel_type</attribute><value>"Paid Search"</value></elements>
    <elements><attribute>evt_sub_channel</attribute><value>"Brand"</value></elements>
  </elements>
  <elements>
    <elements><attribute>channel_type</attribute><value>"Organic Search"</value></elements>
    <elements><attribute>evt_sub_channel</attribute><value>"Unknown"</value></elements>
  </elements>
</set>
<set name="GSM_STEP"> <val>90</val> <val>180</val> <val>365</val> </set>
SUM(1 / (1 + DATEDIFF(refdate_rp,
  CASE WHEN (
    <and> <xprod> <in>
      <xprodlist>EVT_MARKETING_SEG_DIMS_SET<element>attribute</element></xprodlist>
      <xprodlist>EVT_MARKETING_SEG_DIMS_SET<element>value</element></xprodlist>
    </in> </xprod> </and>
    AND (evt_time > refdate_rp - (60 + 1) AND evt_time <= refdate_rp)
  ) THEN evt_time ELSE null END
) / <val>GSM_STEP</val>)) AS stackmetric_gsm<val>GSM_STEP</val>_event_<xprod><xprodlist>EVT_MARKETING_SEG_DIMS_SET<element>value</element></xprodlist></xprod>,
12 different columns will be generated from this single expression (3 steps × 4 segments):
....................
.. AS stackmetric_gsm90_event_display_onlinedisplay,
.. AS stackmetric_gsm90_event_display_directresponsevideo,
.. AS stackmetric_gsm90_event_paidsearch_brand,
.. AS stackmetric_gsm90_event_organicsearch_unknown,
.. AS stackmetric_gsm180_event_display_onlinedisplay,
.. AS stackmetric_gsm180_event_display_directresponsevideo,
.. AS stackmetric_gsm180_event_paidsearch_brand,
.. AS stackmetric_gsm180_event_organicsearch_unknown,
.. AS stackmetric_gsm365_event_display_onlinedisplay,
.. AS stackmetric_gsm365_event_display_directresponsevideo,
.. AS stackmetric_gsm365_event_paidsearch_brand,
.. AS stackmetric_gsm365_event_organicsearch_unknown,
......
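The expansion of one templated expression into 12 column aliases is a cross product of the GSM_STEP values with the segment definitions. A minimal sketch of that expansion follows. It only reproduces the alias names shown on the slide, not the metric expressions. The helper names slug and gsm_columns are hypothetical, as is the lowercase-and-strip-spaces naming rule, which is inferred from the generated aliases.

```python
from itertools import product

# (channel_type, evt_sub_channel) pairs from EVT_MARKETING_SEG_DIMS_SET.
SEGMENTS = [
    ("Display", "Online Display"),
    ("Display", "Direct Response Video"),
    ("Paid Search", "Brand"),
    ("Organic Search", "Unknown"),
]
GSM_STEPS = [90, 180, 365]  # values of the GSM_STEP set


def slug(value: str) -> str:
    """Lowercase a value and drop spaces, e.g. 'Online Display' -> 'onlinedisplay'."""
    return value.lower().replace(" ", "")


def gsm_columns(steps, segments):
    """Return one column alias per (step, segment) pair, steps varying slowest."""
    return [
        f"stackmetric_gsm{step}_event_{slug(channel)}_{slug(sub_channel)}"
        for step, (channel, sub_channel) in product(steps, segments)
    ]


columns = gsm_columns(GSM_STEPS, SEGMENTS)
```

Adding a fifth segment or a fourth step to the config would grow the column list without touching the expression itself, which is the point of the configuration-driven design.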
Analytics Life Cycle
New Client
Onboarding
Scoring & Attribution
ETL
Discovery
Scenario Analysis & Reporting
1
Stack Generation
Modeling
Automated Model Search
Modelling Spec
• Initial set of variables
• Business rules
Stack
Model
Build N Choice Models
Model Evaluation and Ranking
• Business Rules
• Goodness of Fit
• Statistical diagnosis
Finalized Model
GA Iteration
• Top models cross-over to generate offspring models
• Variable mutation
• Bayesian priors
• Coeff bounds
• Attribution bounds
Attribution
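The search loop on this slide (build N models, rank them, let the top models cross over, mutate variables, repeat) can be sketched as a small genetic algorithm over variable subsets. This is a toy illustration under stated assumptions, not the production search. The score function is a hypothetical stand-in for the slide's business rules, goodness of fit and statistical diagnosis, and all names (ga_search, model_score, FEATURES, USEFUL) are invented for the example.

```python
import random

FEATURES = ["x1", "x2", "x3", "x4", "x5", "x6"]
USEFUL = frozenset({"x1", "x3", "x4"})  # hypothetical "true" drivers, toy score only


def model_score(model):
    """Toy stand-in for model ranking: reward useful variables, penalize the rest."""
    return len(model & USEFUL) - 0.5 * len(model - USEFUL)


def ga_search(features, score, pop_size=10, iterations=20, seed=0):
    """Search variable subsets with cross-over, mutation and fresh random models."""
    rng = random.Random(seed)

    def new_model():
        return frozenset(rng.sample(features, rng.randint(1, len(features))))

    def crossover(a, b):
        # Each variable present in either parent is inherited with probability 0.5.
        child = frozenset(f for f in sorted(a | b) if rng.random() < 0.5)
        return child or a  # never return an empty model

    def mutate(model):
        child = model ^ {rng.choice(features)}  # toggle one variable in or out
        return child or model

    population = [new_model() for _ in range(pop_size)]
    history = []
    for _ in range(iterations):
        ranked = sorted(population, key=score, reverse=True)
        history.append(score(ranked[0]))
        top = ranked[: pop_size // 2]  # top models survive unchanged
        offspring = [crossover(*rng.sample(top, 2)) for _ in range(pop_size // 4)]
        mutants = [mutate(rng.choice(top)) for _ in range(pop_size // 4)]
        n_fresh = pop_size - len(top) - len(offspring) - len(mutants)
        fresh = [new_model() for _ in range(n_fresh)]
        population = top + offspring + mutants + fresh
    return max(population, key=score), history


best_model, best_per_iteration = ga_search(FEATURES, model_score)
```

Because the top half of each generation is carried over unchanged, the best score per iteration in this sketch never decreases. The real system may well use a different survival rule.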
GA Progression over Iterations
Max of modelScore by Model Source
Iterations  CrossOver    Mutation     New           Max Score
1                                     0.814674133   0.814674133
2           0.840250827  0.874642259  -0.142370371  0.874642259
3           0.86051811   0.843359676  0.726339045   0.86051811
4           0.861854898  0.850294643  0.434102915   0.861854898
5           0.868137103  0.851901188  0.709897779   0.868137103
6           0.890517983  0.873505914  0.835870771   0.890517983
7           0.890538416  0.857600064  0.538250044   0.890538416
8           0.900386103  0.877312775  0.389444869   0.900386103
9           0.896563775  0.861402806  0.227565829   0.896563775
10          0.893281436  0.866702979  0.429202581   0.893281436
11          0.907564542  0.869666755  0.738092262   0.907564542
12          0.904108687  0.887093232  0.850036487   0.904108687
13          0.904159874  0.884415807  0.860378596   0.904159874
14          0.905516755  0.898474717  0.4425833     0.905516755
15          0.911024584  0.904605785  0.776345059   0.911024584
16          0.911007328  0.905694898  0.496509904   0.911007328
17          0.910692312  0.898231697  0.815998341   0.910692312
18          0.895302999  0.901081899  0.45632258    0.901081899
19          0.907471425  0.860367793  0.675483196   0.907471425
20          0.904312199  0.872822381  0.787050984   0.904312199
Max Score   0.911024584  0.905694898  0.860378596   0.911024584
Model with Max Score is generated from {Cross-Over, Mutation} and not from {New}
Model Score Improvement for Cross-over models over Iterations
[Figure: x-axis: Models over Iterations (1 to 124); y-axis: Model Score (-0.800 to 1.000)]
Cross-over models over Iterations
[Figure: series C - Average of modelScore, C - StdDev of modelScore, C - Max of modelScore; x-axis: Iterations (2 to 20); y-axis: Model Score (0 to 1)]
Analytics Workflow – Performance – Large Client
Distributed Cache
Tool
Application Engines
Calculation Engines
Execution Systems
Elastic Load Balancer
Client Onboarding
Model Store
Config Store
Attribution Funnel creation
Post Processing
Orchestrator
Metadata Store
Modeling Stack
Transformation Stack
ETL Configurations
ETL
Model UAT
Attribution Models
Evaluate
Automated Model Generation
Step durations shown on the diagram: 2 h, 2 d, 0-2 w, 4 h, .5 h, 25 h, 3 h, 1.5 h, 2 h, 3 days - 2 weeks, 1 d
3
Result: significantly less effort for significantly larger deployments
Software automation has driven a decline in service requirements
[Figure: Total hours per variable by year (2010, 2012, 2014, 2015); series: Strategist, Data Mgmt., Modeler, Ops Support; y-axis: Hours]
Representative deployment: BANK CO., FINANCE CO., TELECOM CO., TRAVEL CO.
Modeling scale:
4.5 B Data points, 600MM Variables
170 MM Data points, 90,000 Variables
20 MM Data points, 7,000 Variables
2MM Data points, 1,200 Variables
The size of models has grown significantly, yet the deployment effort required has decreased significantly
3
Revenue per service head for deployment increasing >4X over time via greater automation
The revenue per deployment resource has 4X’d with scale gains
[Figure: Revenue per resource by year (2010, 2012, 2014, 2015); rows: ACV Revenue/year ($ ‘000), Deployment FTE Equivalent/year, Revenue per FTE ($ ‘000)]
Representative deployment: BANK CO., FINANCE CO., TELECOM CO., TRAVEL CO.
3
Questions & Discussion
What’s Needed In A Decision Support System
SINGLE PLATFORM
MMM, Attribution, Targeting, Planning all need to share a common modeling platform
HIGHLY RESPONSIVE
Needs to provide responses in seconds, not days or weeks
ANALYTICS RIGOR
Must use robust models to calculate forward-looking estimations and optimization
INTEGRATIONS TO ECOSYSTEM
Needs the ability to integrate into the larger ecosystem
END-TO-END PLANNING
CMO, CFO, Media planners, Digital Planners
Three pillars of technology
Automated Modeling
Prescriptive Recommendations
UX
Decision Support Platform
Modeling Platform Results
2 34
Introduction : Scenario Analysis
1
Large-scale analytics requires a configuration- and metadata-driven modeling platform
Automated Modeling
Prescriptive Recommendations
UX
2
TIME PRODUCT REGION # IMPRESSIONS SPEND ($$) GQV
TID_1 PID_1 ALL 176843600 19885.75 8092237
TID_1 PID_2 ALL 185465400 300730 3691062
TID_1 PID_3 ALL 56838300 286989.8 421021
TID_1 PID_1 GID_1 0 4679.25 8091825
TID_1 PID_2 GID_1 0 40392 3684004
TID_1 PID_3 GID_1 0 37986.5 414270
TID_1 PID_1 GID_2 0 15206.5 412
TID_1 PID_2 GID_2 0 260338 7058
TID_1 PID_3 GID_2 0 249003.25 6751
ID CODE
PID_1 COMP
PID_2 DIGI
PID_3 GAME
ID CODE
GID_1 GL
GID_2 MW
ID CODE
TID_1 1/1/2015
BI is well understood
FACT
Geo DIM
Product DIM
Time DIM
1
Data: Spend in $, number of impressions bought, and the corresponding Google Query Volume (GQV) for the 1st Quarter of 2015
Reporting: What happened?
Analysis: Why did it happen?
Monitoring: What’s happening now?
[Chart axes: Complexity vs. Business Value]
BI can answer a lot of important questions about the past
Query, reporting & search tools
Dashboards, scorecards, listening, real time reporting
OLAP and visualization tools
Business Intelligence
Complex event processing; NLP; Text mining
Time series analysis, data mining, clustering
1
TIME PRODUCT REGION # IMPRESSIONS SPEND ($$) GQV
TID_1 PID_1 ALL 176843600 19885.75 8092237
TID_1 PID_2 ALL 185465400 300730 3691062
TID_1 PID_3 ALL 56838300 286989.8 421021
TID_1 PID_1 GID_1 0 4679.25 8091825
TID_1 PID_2 GID_1 0 40392 3684004
TID_1 PID_3 GID_1 0 37986.5 414270
TID_1 PID_1 GID_2 0 15206.5 412
TID_1 PID_2 GID_2 0 260338 7058
TID_1 PID_3 GID_2 0 249003.25 6751
ID CODE
PID_1 COMP
PID_2 DIGI
PID_3 GAME
ID CODE
GID_1 GL
GID_2 MW
ID CODE
TID_1 1/1/2015
TID_2 4/1/2015
TID_3 7/1/2015
Impact of increasing spend by 10% to total spend in 3rd Quarter?
FACT
Geo DIM
Product DIM
Time DIM
1
TIME PRODUCT REGION # IMPRESSIONS SPEND ($$) GQV
TID_1 PID_1 ALL 176843600 19885.75 8092237
TID_1 PID_2 ALL 185465400 300730 3691062
TID_1 PID_3 ALL 56838300 286989.8 421021
TID_1 PID_1 GID_1 0 4679.25 8091825
TID_1 PID_2 GID_1 0 40392 3684004
TID_1 PID_3 GID_1 0 37986.5 414270
TID_1 PID_1 GID_2 0 15206.5 412
TID_1 PID_2 GID_2 0 260338 7058
TID_1 PID_3 GID_2 0 249003.25 6751
TID_3 PID_1 GID_1 0 4679.25
TID_3 PID_2 GID_1 0 40392
TID_3 PID_3 GID_1 0 37986.5
TID_3 PID_1 GID_2 0 15206.5
TID_3 PID_2 GID_2 0 260338
TID_3 PID_3 GID_2 0 249003.25
ID CODE
PID_1 COMP
PID_2 DIGI
PID_3 GAME
ID CODE
GID_1 GL
GID_2 MW
ID CODE
TID_1 1/1/2015
TID_2 4/1/2015
TID_3 7/1/2015
The Third Quarter spend could be replicated from the 1st Quarter
FACT
Geo DIM
Product DIM
Time DIM
1
TIME PRODUCT REGION # IMPRESSIONS SPEND ($$) GQV
TID_3 ALL ALL 419147300 607605.5
TID_1 PID_1 ALL 176843600 19885.75 8092237
TID_1 PID_2 ALL 185465400 300730 3691062
TID_1 PID_3 ALL 56838300 286989.8 421021
TID_1 PID_1 GID_1 0 4679.25 8091825
TID_1 PID_2 GID_1 0 40392 3684004
TID_1 PID_3 GID_1 0 37986.5 414270
TID_1 PID_1 GID_2 0 15206.5 412
TID_1 PID_2 GID_2 0 260338 7058
TID_1 PID_3 GID_2 0 249003.25 6751
TID_3 PID_1 GID_1 0 4679.25
TID_3 PID_2 GID_1 0 40392
TID_3 PID_3 GID_1 0 37986.5
TID_3 PID_1 GID_2 0 15206.5
TID_3 PID_2 GID_2 0 260338
TID_3 PID_3 GID_2 0 249003.25
ID CODE
PID_1 COMP
PID_2 DIGI
PID_3 GAME
ID CODE
GID_1 GL
GID_2 MW
ID CODE
TID_1 1/1/2015
TID_2 4/1/2015
TID_3 7/1/2015
Total Spend could be calculated
FACT
Geo DIM
Product DIM
Time DIM
To get the GQV value, one has to build predictive models
1
Reporting: What happened?
Analysis: Why did it happen?
Monitoring: What’s happening now?
[Chart axes: Complexity vs. Business Value]
Query, reporting & search tools
Dashboards, scorecards, listening, real time reporting
OLAP and visualization tools
Business Intelligence
Complex event processing; NLP; Text mining
Time series analysis, data mining, clustering
Marketers adopting next-gen analytics to drive decision making
Prediction: What might happen?
Decision: What should I do now?
Predictive analytics
Decision support and management
Time series analysis, predictive modeling, ensemble modeling, machine learning
Constraint based optimization; choice modeling; decision trees
1
TIME PRODUCT REGION # IMPRESSIONS SPEND ($$) GQV
TID_1 PID_1 ALL 176843600 19885.75 8092237
TID_1 PID_2 ALL 185465400 300730 3691062
TID_1 PID_3 ALL 56838300 286989.8 421021
TID_1 PID_1 GID_1 0 4679.25 8091825
TID_1 PID_2 GID_1 0 40392 3684004
TID_1 PID_3 GID_1 0 37986.5 414270
TID_1 PID_1 GID_2 0 15206.5 412
TID_1 PID_2 GID_2 0 260338 7058
TID_1 PID_3 GID_2 0 249003.25 6751
ID CODE
PID_1 COMP
PID_2 DIGI
PID_3 GAME
ID CODE
GID_1 GL
GID_2 MW
ID CODE
TID_1 1/1/2015
TID_2 4/1/2015
TID_3 7/1/2015
To build models from data
FACT
Geo DIM
Product DIM
Time DIM
1
The Data Is Flattened Out
DENORMALIZED FLATTENED DATA
TIME PRODUCT REGION # IMPRESSIONS SPEND ($$) GQV
1/1/2015 COMP ALL 176843600 19885.75 8092237
1/1/2015 DIGI ALL 185465400 300730 3691062
1/1/2015 GAME ALL 56838300 286989.75 421021
1/1/2015 COMP GL 0 4679.25 8091825
1/1/2015 DIGI GL 0 40392 3684004
1/1/2015 GAME GL 0 37986.5 414270
1/1/2015 COMP MW 0 15206.5 412
1/1/2015 DIGI MW 0 260338 7058
1/1/2015 GAME MW 0 249003.25 6751
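Flattening here means replacing the dimension IDs in the fact table with their codes from the Product, Geo and Time dimension tables. A minimal sketch of that lookup follows, using a few rows from the slides. The function name flatten and the tuple layout are assumptions made for the example, and a real deployment would presumably do this as a join in the ETL layer.

```python
# Dimension tables from the slides (ID -> CODE).
PRODUCT_DIM = {"PID_1": "COMP", "PID_2": "DIGI", "PID_3": "GAME"}
GEO_DIM = {"GID_1": "GL", "GID_2": "MW"}
TIME_DIM = {"TID_1": "1/1/2015", "TID_2": "4/1/2015", "TID_3": "7/1/2015"}

# A few fact rows: (time, product, region, impressions, spend, gqv).
FACT = [
    ("TID_1", "PID_1", "ALL", 176843600, 19885.75, 8092237),
    ("TID_1", "PID_1", "GID_1", 0, 4679.25, 8091825),
    ("TID_1", "PID_2", "GID_2", 0, 260338, 7058),
]


def flatten(fact_rows):
    """Denormalize fact rows by substituting dimension codes for IDs.

    Roll-up members such as 'ALL' have no dimension entry and pass through.
    """
    return [
        (TIME_DIM[t], PRODUCT_DIM.get(p, p), GEO_DIM.get(g, g), impressions, spend, gqv)
        for t, p, g, impressions, spend, gqv in fact_rows
    ]


flat = flatten(FACT)
```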
1
Data Is Grouped Into Sets Called Features
TIME PRODUCT REGION VARIABLE # IMPRESSIONS SPEND ($$) GQV
1/1/2015 COMP ALL TV_PRD_IM 176843600 19885.75 8092237
1/1/2015 DIGI ALL TV_PRD_IM 185465400 300730 3691062
1/1/2015 GAME ALL TV_PRD_IM 56838300 286989.75 421021
1/1/2015 COMP GL TV_LOCAL_PRD_SP 0 4679.25 8091825
1/1/2015 DIGI GL TV_LOCAL_PRD_SP 0 40392 3684004
1/1/2015 GAME GL TV_LOCAL_PRD_SP 0 37986.5 414270
1/1/2015 COMP MW TV_LOCAL_PRD_SP 0 15206.5 412
1/1/2015 DIGI MW TV_LOCAL_PRD_SP 0 260338 7058
1/1/2015 GAME MW TV_LOCAL_PRD_SP 0 249003.25 6751
DENORMALIZED FLATTENED DATA
Feature selection, also known as variable selection, attribute selection or variable subset selection, is the process of selecting a subset of relevant features for use in model construction.
1
Features Are Assembled Into An Equation
TIME PRODUCT REGION VARIABLE # IMPRESSIONS SPEND ($$) GQV
1/1/2015 COMP ALL TV_PRD_IM 176843600 19885.75 8092237
1/1/2015 DIGI ALL TV_PRD_IM 185465400 300730 3691062
1/1/2015 GAME ALL TV_PRD_IM 56838300 286989.75 421021
1/1/2015 COMP GL TV_LOCAL_PRD_SP 0 4679.25 8091825
1/1/2015 DIGI GL TV_LOCAL_PRD_SP 0 40392 3684004
1/1/2015 GAME GL TV_LOCAL_PRD_SP 0 37986.5 414270
1/1/2015 COMP MW TV_LOCAL_PRD_SP 0 15206.5 412
1/1/2015 DIGI MW TV_LOCAL_PRD_SP 0 260338 7058
1/1/2015 GAME MW TV_LOCAL_PRD_SP 0 249003.25 6751
FLATTENED DENORMALIZED DATA
Model Equation LOG(GQV_PD + 1) := TV_PRD_IM_LOGC*C(1) + TV_LOCAL_PRD_SP_LOGC*C(2)
1
Corresponding Coefficients Are Estimated
TIME PRODUCT REGION VARIABLE # IMPRESSIONS SPEND ($$) GQV COEFF_VALUE
1/1/2015 COMP ALL TV_PRD_IM 176843600 19885.75 8092237 0.045756241
1/1/2015 DIGI ALL TV_PRD_IM 185465400 300730 3691062 0.01985766
1/1/2015 GAME ALL TV_PRD_IM 56838300 286989.75 421021 0.007270448
1/1/2015 COMP GL TV_LOCAL_PRD_SP 0 4679.25 8091825 0.027113343
1/1/2015 DIGI GL TV_LOCAL_PRD_SP 0 40392 3684004 0.027113343
1/1/2015 GAME GL TV_LOCAL_PRD_SP 0 37986.5 414270 0.027113343
1/1/2015 COMP MW TV_LOCAL_PRD_SP 0 15206.5 412 0.027113343
1/1/2015 DIGI MW TV_LOCAL_PRD_SP 0 260338 7058 0.027113343
1/1/2015 GAME MW TV_LOCAL_PRD_SP 0 249003.25 6751 0.027113343
Model Equation
FLATTENED DENORMALIZED DATA
LOG(GQV_PD + 1) := TV_PRD_IM_LOGC*C(1) + TV_LOCAL_PRD_SP_LOGC*C(2)
1
Calculations Are Matrix Operations On The Model And The Coefficients
Model Input
ROW                  TV_PRD_IM (C1, PRD=COMP)  TV_PRD_IM (C1, PRD=DIGI)  TV_PRD_IM (C1, PRD=GAME)  TV_LOCAL_PRD_SP (C2, PRD=ALL)
PRD=COMP & GEO = GL  176843600                 0                         0                         4679.25
PRD=DIGI & GEO = GL  0                         185465400                 0                         40392
PRD=GAME & GEO = GL  0                         0                         56838300                  37986.5
PRD=COMP & GEO = MW  176843600                 0                         0                         15206.5
PRD=DIGI & GEO = MW  0                         185465400                 0                         260338
PRD=GAME & GEO = MW  0                         0                         56838300                  249003.3
Coeff. Stack
C1, PRD=COMP  0.045756241
C1, PRD=DIGI  0.01985766
C1, PRD=GAME  0.007270448
C2, PRD=ALL   0.027113343
Outcome
ROW                  GQV_PD
PRD=COMP & GEO = GL  8091825
PRD=DIGI & GEO = GL  3684004
PRD=GAME & GEO = GL  414270
PRD=COMP & GEO = MW  412
PRD=DIGI & GEO = MW  7058
PRD=GAME & GEO = MW  6751
The number of columns is the number of coefficients and the number of rows is the number of distinct combinations of dimensions
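The shape rule above (one column per coefficient, one row per dimension combination) can be sketched as a plain matrix-vector product. The input values and coefficients are taken from the slides. The `*_LOGC` transform is not defined in the deck, so `transform` below is a hypothetical stand-in using log(1 + x). For that reason the numbers this sketch produces should not be expected to match the Outcome column on the slide.

```python
import math

# One row per (product, geo) combination; one column per coefficient:
# C1 for COMP, C1 for DIGI, C1 for GAME, then the shared C2.
MODEL_INPUT = [
    [176843600, 0, 0, 4679.25],
    [0, 185465400, 0, 40392],
    [0, 0, 56838300, 37986.5],
    [176843600, 0, 0, 15206.5],
    [0, 185465400, 0, 260338],
    [0, 0, 56838300, 249003.25],
]
COEFF_STACK = [0.045756241, 0.01985766, 0.007270448, 0.027113343]


def transform(x):
    """Hypothetical stand-in for the undocumented *_LOGC transform."""
    return math.log1p(x)


def linear_predictor(rows, coeffs):
    """Multiply the transformed input matrix by the coefficient vector."""
    return [sum(transform(x) * c for x, c in zip(row, coeffs)) for row in rows]


# The model equation has LOG(GQV_PD + 1) on the left-hand side,
# so the prediction is recovered with expm1.
eta = linear_predictor(MODEL_INPUT, COEFF_STACK)
gqv_hat = [math.expm1(value) for value in eta]
```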
1
Impact of increasing spend by 10% to total spend in 3rd Quarter?
ID CODE
PID_1 COMP
PID_2 DIGI
PID_3 GAME
ID CODE
GID_1 GL
GID_2 MW
ID CODE
TID_1 1/1/2015
TID_2 4/1/2015
TID_3 7/1/2015
FACT
Geo DIM
Product DIM
Time DIM
TIME PRODUCT REGION # IMPRESSIONS SPEND ($$) GQV
TID_3 ALL ALL 419147300 607605.5
TID_1 PID_1 ALL 176843600 19885.75 8092237
TID_1 PID_2 ALL 185465400 300730 3691062
TID_1 PID_3 ALL 56838300 286989.8 421021
TID_1 PID_1 GID_1 0 4679.25 8091825
TID_1 PID_2 GID_1 0 40392 3684004
TID_1 PID_3 GID_1 0 37986.5 414270
TID_1 PID_1 GID_2 0 15206.5 412
TID_1 PID_2 GID_2 0 260338 7058
TID_1 PID_3 GID_2 0 249003.25 6751
TID_3 PID_1 GID_1 0 4679.25
TID_3 PID_2 GID_1 0 40392
TID_3 PID_3 GID_1 0 37986.5
TID_3 PID_1 GID_2 0 15206.5
TID_3 PID_2 GID_2 0 260338
TID_3 PID_3 GID_2 0 249003.25
1
TIME PRODUCT REGION # IMPRESSIONS SPEND ($$) GQV
TID_3 ALL ALL 419147300 668366.05
TID_1 PID_1 ALL 176843600 19885.75 8092237
TID_1 PID_2 ALL 185465400 300730 3691062
TID_1 PID_3 ALL 56838300 286989.8 421021
TID_1 PID_1 GID_1 0 4679.25 8091825
TID_1 PID_2 GID_1 0 40392 3684004
TID_1 PID_3 GID_1 0 37986.5 414270
TID_1 PID_1 GID_2 0 15206.5 412
TID_1 PID_2 GID_2 0 260338 7058
TID_1 PID_3 GID_2 0 249003.25 6751
TID_3 PID_1 GID_1 0 5147.18
TID_3 PID_2 GID_1 0 44431.2
TID_3 PID_3 GID_1 0 41785.15
TID_3 PID_1 GID_2 0 16727.15
TID_3 PID_2 GID_2 0 286371.8
TID_3 PID_3 GID_2 0 273903.6
ID CODE
PID_1 COMP
PID_2 DIGI
PID_3 GAME
ID CODE
GID_1 GL
GID_2 MW
ID CODE
TID_1 1/1/2015
TID_2 4/1/2015
TID_3 7/1/2015
FACT
Geo DIM
Product DIM
Time DIM
Distribute : Distribute The Spend At The Level Where The Model Is Defined
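The Distribute step takes the scenario total (Q1 total spend plus 10%) and spreads it across the product-by-geo cells at which the model is defined. A minimal sketch follows, assuming the total is allocated in proportion to each cell's share of the reference quarter. That assumption is consistent with the numbers in the table, but the slides do not state the allocation rule. The names distribute, Q1_SPEND and q3_spend are hypothetical.

```python
# Q1 spend at the level where the model is defined: (product, geo) -> spend.
Q1_SPEND = {
    ("COMP", "GL"): 4679.25,
    ("DIGI", "GL"): 40392.0,
    ("GAME", "GL"): 37986.5,
    ("COMP", "MW"): 15206.5,
    ("DIGI", "MW"): 260338.0,
    ("GAME", "MW"): 249003.25,
}


def distribute(total, reference):
    """Allocate total across cells in proportion to the reference spend."""
    reference_total = sum(reference.values())
    return {cell: total * spend / reference_total for cell, spend in reference.items()}


q1_total = sum(Q1_SPEND.values())
q3_total = q1_total * 1.10  # scenario: increase total spend by 10%
q3_spend = distribute(q3_total, Q1_SPEND)
```

The distributed values then feed the Calculate step as the TV_LOCAL_PRD_SP column of the model input.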
1
TIME PRODUCT REGION VARIABLE # IMPRESSIONS SPEND ($$)
7/1/2015 ALL ALL M_TV_N_BRD_SP 419147300 668366.05
7/1/2015 COMP ALL M_TV_N_PD_IM 176843600
7/1/2015 DIGI ALL M_TV_N_PD_IM 185465400
7/1/2015 GAME ALL M_TV_N_PD_IM 56838300
7/1/2015 COMP GL M_TV_L_P_SP 0 5147.18
7/1/2015 DIGI GL M_TV_L_P_SP 0 44431.2
7/1/2015 GAME GL M_TV_L_P_SP 0 41785.15
7/1/2015 COMP MW M_TV_L_P_SP 0 16727.15
7/1/2015 DIGI MW M_TV_L_P_SP 0 286371.8
7/1/2015 GAME MW M_TV_L_P_SP 0 273903.6
Calculate : Run The Model
TV_PRD_IM TV_PRD_IM TV_PRD_IM TV_LOCAL_PRD_SP
176843600 0 0 5147.18
0 185465400 0 44431.2
0 0 56838300 41785.15
176843600 0 0 16727.15
0 185465400 0 286371.8
0 0 56838300 273903.6
GQV_PD
88091838
3684114
414372.8
453.23
7763.86
7426.13
1
TIME PRODUCT REGION # IMPRESSIONS SPEND ($$) GQV
TID_3 ALL ALL 419147300 668366.05 92205968
TID_1 PID_1 ALL 176843600 19885.75 8092237
TID_1 PID_2 ALL 185465400 300730 3691062
TID_1 PID_3 ALL 56838300 286989.8 421021
TID_1 PID_1 GID_1 0 4679.25 8091825
TID_1 PID_2 GID_1 0 40392 3684004
TID_1 PID_3 GID_1 0 37986.5 414270
TID_1 PID_1 GID_2 0 15206.5 412
TID_1 PID_2 GID_2 0 260338 7058
TID_1 PID_3 GID_2 0 249003.25 6751
TID_3 PID_1 GID_1 0 5147.18 88091838
TID_3 PID_2 GID_1 0 44431.2 3684114
TID_3 PID_3 GID_1 0 41785.15 414372.8
TID_3 PID_1 GID_2 0 16727.15 453.23
TID_3 PID_2 GID_2 0 286371.8 7763.86
TID_3 PID_3 GID_2 0 273903.6 7426.13
ID CODE
PID_1 COMP
PID_2 DIGI
PID_3 GAME
ID CODE
GID_1 GL
GID_2 MW
ID CODE
TID_1 1/1/2015
TID_2 4/1/2015
TID_3 7/1/2015
FACT
Geo DIM
Product DIM
Time DIM
Aggregate : Increase the Total Marketing Spend by 10%
1
Publisher
Marketing Driver
Tactic
Creative Concept
Geo
Time
Campaign
Year Quarter Month Week
Cube
Product
Non-Marketing Driver
Coupon redemption, Discounts, Macro economics, Pricing, Tabs, Weather, eCircular
Measures
Placement
Online Media Channel
Offline Media Channel
Social Paid Social Other, Affiliate, Display Mobile Video Desktop Other, Paid Search Branded Non Branded, Email, Audio
Magazine, Radio, TV, Leads, Tab, Product Listing, Services Directory
866
1590948
23910
336
15790
28
82
156
33
23
Assists, Orders, Revenue, Clicks, Events, Impressions, Spend, Last touches, Click converting rate, Converting click
131M cross sections, 10 KPIs
Dimension
1
INPUT CONFIGS
GENERATED CONFIGS
MODEL BUILDING
ATTRIBUTION
ETL
Scenario Analysis & Reporting
CORE
STACK GENERATION MODELING SCORING & ATTRIBUTION
INPUT CONFIGS
GENERATED CONFIGS
MODEL BUILDING
ATTRIBUTION
COEFFSTACK
COEFFSTACK TRANSVAR
MODELING
ATTRSTACK
SCORING
EVENT ATTRIBUTION
INPUT CONFIGS
GENERATED CONFIGS
MODEL BUILDING
ATTRIBUTION
INPUT CONFIGS
GENERATED CONFIGS
MODEL BUILDING
ATTRIBUTION
ETL
POST PROCESSING
CORE
EVENT TYPE EVTUUID USERID
........... ........... ...........
IM U0E1 UUID0
IM U0E2 UUID0
PS U1E1 UUID1
IM U1E2 UUID1
IM U1E3 UUID1
IM U1E4 UUID1
PS U1E5 UUID1
AFCL U1E6 UUID1
AFCL U1E7 UUID1
IM U1E8 UUID1
IM U1E9 UUID1
CL U1E10 UUID1
IM U2E1 UUID2
PS U2E2 UUID2
PS U2E3 UUID2
PS U2E4 UUID2
........... ........... ...........
ACTUUID USERID REVENUE
........... ........... ...........
U0A1 UUID0 199
U1A1 UUID1 100
U2A1 UUID2 249
U2A2 UUID2 100
........... ........... ...........
MARKETING EVENTS IN LOOKBACK PERIOD
RECENT CONVERSIONS
ATTRIBUTION:
• Share of each marketing event in a sequence in driving the conversion
DATA:
• ELBIS: Event table
  • Contains marketing data
• ALBIS: Activity table
  • Conversions/Purchases
USER SEQUENCE:
• User history ending at conversion/purchase
• Marketing events shown to user
• Past conversions/purchases
INPUT CONFIGS
GENERATED CONFIGS
MODEL BUILDING
ATTRIBUTION
ETL
POST PROCESSING
CORE
INPUT CONFIGS
GENERATED CONFIGS
MODEL BUILDING
ATTRIBUTION
COEFFSTACK
COEFFSTACK TRANSVAR
MODELING
ATTRSTACK
SCORING
EVENT ATTRIBUTION
INPUT CONFIGS
GENERATED CONFIGS
MODEL BUILDING
ATTRIBUTION
MODEL.JSON
SYSTEMSETDEFINITION_GLOBAL
CLIENTSETDEFINITION.XML
<STAGE>_CONFIG.XML
CLIENTPARAMDEFINITION.XML
SYSTEMPARAMDEFINITION_GLOBAL
<SCHEMA>.XML
SYSTEMSETDEFINITION
<SCHEMA>.XML
<STAGE>_PARAMETERS.XML
COEFFSTACK
COEFFSTACK TRANSVAR
MODELING
ATTRSTACK
SCORING
EVENT ATTRIBUTION
CONFIG/INPUTS/
CONFIG/SCHEMA_MAPPING/
COMMON/GLOBALCONFIG/
CONFIG/TEMP/
COEFFSTACK
COEFFSTACK TRANSVAR
MODELING
ATTRSTACK
SCORING
EVENT ATTRIBUTION
GENERATED CONFIGS
INPUT CONFIGS
STACKMETRICS:
• Aggregated user behaviour information
• Aggregated marketing info per sequence
• Aggregated past conversion/purchase info
• Serves as base for variables for model building
• Calculated for each sequence
EVENT TYPE EVTUUID USERID
........... ........... ...........
IM U0E1 UUID0
IM U0E2 UUID0
PS U1E1 UUID1
IM U1E2 UUID1
IM U1E3 UUID1
IM U1E4 UUID1
PS U1E5 UUID1
AFCL U1E6 UUID1
AFCL U1E7 UUID1
IM U1E8 UUID1
IM U1E9 UUID1
CL U1E10 UUID1
IM U2E1 UUID2
PS U2E2 UUID2
PS U2E3 UUID2
PS U2E4 UUID2
........... ........... ...........
ACTUUID USERID REVENUE
........... ........... ...........
U0A1 UUID0 199
U1A1 UUID1 100
U2A1 UUID2 249
U2A2 UUID2 100
........... ........... ...........
RECENT CONVERSIONS
MARKETING EVENTS IN LOOKBACK PERIOD
RESPONSE
...........
1
1
1
1
...........
COEFFSTACK
COEFFSTACK TRANSVAR
MODELING
ATTRSTACK
SCORING
EVENT ATTRIBUTION
GENERATED CONFIGS
INPUT CONFIGS
RESPONSE
STACKMETRICS:
• Aggregated user behaviour information
• Aggregated marketing info per sequence
• Aggregated past conversion/purchase info
• Serves as base for variables for model building
• Calculated for each sequence
EVENT TYPE EVTUUID USERID
........... ........... ...........
IM U0E1 UUID0
IM U0E2 UUID0
PS U1E1 UUID1
IM U1E2 UUID1
IM U1E3 UUID1
IM U1E4 UUID1
PS U1E5 UUID1
AFCL U1E6 UUID1
AFCL U1E7 UUID1
IM U1E8 UUID1
IM U1E9 UUID1
CL U1E10 UUID1
IM U2E1 UUID2
PS U2E2 UUID2
PS U2E3 UUID2
PS U2E4 UUID2
........... ........... ...........
ACTUUID USERID REVENUE
........... ........... ...........
U0A1 UUID0 199
U1A1 UUID1 100
U2A1 UUID2 249
U2A2 UUID2 100
........... ........... ...........
NULL UUID4 NULL
NULL UUID5 NULL
........... ........... ...........
INCLUDING NON-CONVERTED USERS
MARKETING EVENTS IN LOOKBACK PERIOD
RESPONSE
...........
1
1
1
1
...........
0
0
...........
COEFFSTACK
COEFFSTACK TRANSVAR
MODELING
ATTRSTACK
SCORING
EVENT ATTRIBUTION
GENERATED CONFIGS
INPUT CONFIGS
RESPONSE
USER POOL:
• Collection of users used in model building
• Converted users
• Non-converted users
STACKMETRICS:
• Aggregated user behaviour information
• Aggregated marketing info per sequence
• Aggregated past conversion/purchase info
• Serves as base for variables for model building
• Calculated for each sequence
EVENT TYPE EVTUUID USERID
........... ........... ...........
IM U0E1 UUID0
IM U0E2 UUID0
PS U1E1 UUID1
IM U1E2 UUID1
IM U1E3 UUID1
IM U1E4 UUID1
PS U1E5 UUID1
AFCL U1E6 UUID1
AFCL U1E7 UUID1
IM U1E8 UUID1
IM U1E9 UUID1
CL U1E10 UUID1
IM U2E1 UUID2
PS U2E2 UUID2
PS U2E3 UUID2
PS U2E4 UUID2
........... ........... ...........
ACTUUID USERID REVENUE
........... ........... ...........
U0A1 UUID0 199
U1A1 UUID1 100
U2A1 UUID2 249
U2A2 UUID2 100
........... ........... ...........
NULL UUID4 NULL
NULL UUID5 NULL
........... ........... ...........
USER-ACTIVITY POOL
MARKETING EVENTS IN LOOKBACK PERIOD
NUM_IM NUM_CL NUM_AFCL NUM_PS USERID
........... ........... ........... ........... ...........
2 0 0 0 UUID0
5 1 2 2 UUID1
1 0 0 3 UUID2
1 0 0 3 UUID2
........... ........... ........... ........... ...........
x x x x UUID4
x x x x UUID5
........... ........... ........... ........... ...........
STACKMETRICS
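The NUM_* stackmetrics in this table are per-user counts of each event type in the lookback period. A minimal sketch of that aggregation over the sample event rows follows. The function name stackmetrics is hypothetical, and the production pipeline computes these in generated HiveQL rather than in Python.

```python
from collections import Counter, defaultdict

# (event type, event uuid, user id) rows from the sample event table.
EVENTS = [
    ("IM", "U0E1", "UUID0"), ("IM", "U0E2", "UUID0"),
    ("PS", "U1E1", "UUID1"), ("IM", "U1E2", "UUID1"), ("IM", "U1E3", "UUID1"),
    ("IM", "U1E4", "UUID1"), ("PS", "U1E5", "UUID1"), ("AFCL", "U1E6", "UUID1"),
    ("AFCL", "U1E7", "UUID1"), ("IM", "U1E8", "UUID1"), ("IM", "U1E9", "UUID1"),
    ("CL", "U1E10", "UUID1"),
    ("IM", "U2E1", "UUID2"), ("PS", "U2E2", "UUID2"), ("PS", "U2E3", "UUID2"),
    ("PS", "U2E4", "UUID2"),
]


def stackmetrics(events):
    """Count events per user and event type, keyed as NUM_<EVENT TYPE>."""
    counts = defaultdict(Counter)
    for event_type, _event_uuid, user_id in events:
        counts[user_id][f"NUM_{event_type}"] += 1
    return counts


metrics = stackmetrics(EVENTS)
```

In the slide, UUID2 appears twice because the metrics are calculated per sequence (one per conversion); this sketch aggregates per user only.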
COEFFSTACK
COEFFSTACK TRANSVAR
MODELING
ATTRSTACK
SCORING
EVENT ATTRIBUTION
GENERATED CONFIGS
INPUT CONFIGS
RESPONSE
RESPONSE
...........
1
1
1
1
...........
0
0
...........
EVENT TYPE EVTUUID USERID
........... ........... ...........
IM U0E1 UUID0
IM U0E2 UUID0
PS U1E1 UUID1
IM U1E2 UUID1
IM U1E3 UUID1
IM U1E4 UUID1
PS U1E5 UUID1
AFCL U1E6 UUID1
AFCL U1E7 UUID1
IM U1E8 UUID1
IM U1E9 UUID1
CL U1E10 UUID1
IM U2E1 UUID2
PS U2E2 UUID2
PS U2E3 UUID2
PS U2E4 UUID2
........... ........... ...........
ACTUUID USERID REVENUE
........... ........... ...........
U0A1 UUID0 199
U1A1 UUID1 100
U2A1 UUID2 249
U2A2 UUID2 100
........... ........... ...........
NULL UUID4 NULL
NULL UUID5 NULL
........... ........... ...........
RECENT CONVERSIONS
MARKETING EVENTS IN LOOKBACK PERIOD
(NUM_IM + NUM_CL) LOG(1 + NUM_AFCL) LOG(1 + NUM_PS) USERID
........... ........... ........... ...........
2 LOG(1+0) LOG(1+0) UUID0
6 LOG(1+2) LOG(1+2) UUID1
1 LOG(1+0) LOG(1+3) UUID2
1 LOG(1+0) LOG(1+3) UUID2
........... ........... ........... ...........
x x x UUID4
x x x UUID5
........... ........... ........... ...........
COEFFSTACK
COEFFSTACK TRANSVAR
MODELING
ATTRSTACK
SCORING
EVENT ATTRIBUTION
GENERATED CONFIGS
INPUT CONFIGS
RESPONSE
RESPONSE
...........
1
1
1
1
...........
0
0
...........
TRANSVAR
TRANSVAR:
• Universe of all model features
• Transforms on stackmetrics
• Input to GA, automated model selection
• The selected model will have a subset of these features
EVENT TYPE EVTUUID USERID
........... ........... ...........
IM U0E1 UUID0
IM U0E2 UUID0
PS U1E1 UUID1
IM U1E2 UUID1
IM U1E3 UUID1
IM U1E4 UUID1
PS U1E5 UUID1
AFCL U1E6 UUID1
AFCL U1E7 UUID1
IM U1E8 UUID1
IM U1E9 UUID1
CL U1E10 UUID1
IM U2E1 UUID2
PS U2E2 UUID2
PS U2E3 UUID2
PS U2E4 UUID2
........... ........... ...........
ACTUUID USERID REVENUE
........... ........... ...........
U0A1 UUID0 199
U1A1 UUID1 100
U2A1 UUID2 249
U2A2 UUID2 100
........... ........... ...........
NULL UUID4 NULL
NULL UUID5 NULL
........... ........... ...........
RECENT CONVERSIONS
MARKETING EVENTS IN LOOKBACK PERIOD
(NUM_IM + NUM_CL) LOG(1 + NUM_AFCL) LOG(1 + NUM_PS) USERID
........... ........... ........... ...........
2 LOG(1+0) LOG(1+0) UUID0
6 LOG(1+2) LOG(1+2) UUID1
1 LOG(1+0) LOG(1+3) UUID2
1 LOG(1+0) LOG(1+3) UUID2
........... ........... ........... ...........
x x x UUID4
x x x UUID5
........... ........... ........... ...........
TRANSVAR
COEFFSTACK
COEFFSTACK
TRANSVAR MODELING ATTRSTACK SCORING EVENT ATTRIBUTION
GENERATED CONFIGS
INPUT CONFIGS
$$P(\mathrm{Conv}) = \frac{1}{1 + \exp\left(-\sum_{i=0}^{n} \alpha_i x_i\right)}$$
M : a1*x1 + a2*x2
x1 = (NUM_IM + NUM_CL)
x2 = LOG(1 + NUM_PS)
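As an illustration of how the logistic form scores one user, here is a minimal Python sketch. The coefficient values are hypothetical (the slides give no fitted values), the i = 0 term is assumed to be an intercept with x0 = 1, and LOG is assumed to be the natural logarithm.

```python
import math


def conversion_probability(coeffs, features):
    """P(Conv) = 1 / (1 + exp(-sum_i alpha_i * x_i)), with x_0 = 1 (assumed intercept)."""
    xs = [1.0] + list(features)
    z = sum(a * x for a, x in zip(coeffs, xs))
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical coefficients (a0, a1, a2), chosen only for illustration.
ALPHA = (-2.0, 0.3, 0.8)

x1 = 5 + 1            # NUM_IM + NUM_CL for UUID1
x2 = math.log(1 + 2)  # LOG(1 + NUM_PS) for UUID1
p = conversion_probability(ALPHA, (x1, x2))
```

With all coefficients at zero the function returns 0.5, which is a quick sanity check on the sign convention.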
RESPONSE
RESPONSE
...........
1
1
1
1
...........
0
0
...........
COEFFSTACK COEFFSTACK TRANSVAR MODELING
INPUT CONFIGS
GENERATED CONFIGS
MODEL BUILDING
MODEL.JSON
ATTRSTACK SCORING EVENT ATTRIBUTION
INPUT CONFIGS
GENERATED CONFIGS
ATTRIBUTION
MODEL.JSON
EVENT TYPE EVTUUID USERID
........... ........... ...........
IM U0E1 UUID0
IM U0E2 UUID0
PS U1E1 UUID1
IM U1E2 UUID1
IM U1E3 UUID1
IM U1E4 UUID1
PS U1E5 UUID1
AFCL U1E6 UUID1
AFCL U1E7 UUID1
IM U1E8 UUID1
IM U1E9 UUID1
CL U1E10 UUID1
IM U2E1 UUID2
PS U2E2 UUID2
PS U2E3 UUID2
PS U2E4 UUID2
........... ........... ...........
ACTUUID USERID REVENUE
........... ........... ...........
U0A1 UUID0 199
U1A1 UUID1 100
U2A1 UUID2 249
U2A2 UUID2 100
........... ........... ...........
RECENT CONVERSIONS
MARKETING EVENTS IN LOOKBACK PERIOD
NUM_IM NUM_CL NUM_AFCL NUM_PS USERID
........... ........... ........... ........... ...........
2 0 0 0 UUID0
5 1 2 2 UUID1
1 0 0 3 UUID2
1 0 0 3 UUID2
........... ........... ........... ........... ...........
STACKMETRICS
RESPONSE
...........
1
1
1
1
...........
COEFFSTACK
COEFFSTACK
TRANSVAR MODELING ATTRSTACK SCORING EVENT ATTRIBUTION
GENERATED CONFIGS
INPUT CONFIGS
RESPONSE
EVENT TYPE EVTUUID USERID
........... ........... ...........
IM U0E1 UUID0
IM U0E2 UUID0
PS U1E1 UUID1
IM U1E2 UUID1
IM U1E3 UUID1
IM U1E4 UUID1
PS U1E5 UUID1
AFCL U1E6 UUID1
AFCL U1E7 UUID1
IM U1E8 UUID1
IM U1E9 UUID1
CL U1E10 UUID1
IM U2E1 UUID2
PS U2E2 UUID2
PS U2E3 UUID2
PS U2E4 UUID2
........... ........... ...........
ACTUUID USERID REVENUE
........... ........... ...........
U0A1 UUID0 199
U1A1 UUID1 100
U2A1 UUID2 249
U2A2 UUID2 100
........... ........... ...........
RECENT CONVERSIONS
MARKETING EVENTS IN LOOKBACK PERIOD
(NUM_IM + NUM_CL )
LOG( 1 + NUM_PS )
........... ...........
2 LOG(1+0)
6 LOG(1+2)
1 LOG(1+3)
1 LOG(1+3)
........... ...........
TRANSVAR
RESPONSE
...........
1
1
1
1
...........
COEFFSTACK
COEFFSTACK
TRANSVAR MODELING ATTRSTACK SCORING EVENT ATTRIBUTION
GENERATED CONFIGS
INPUT CONFIGS
RESPONSE
FRAC_x1 FRAC_x2
........... ...........
0.35 0.65
0.46 0.54
0.57 0.43
0.39 0.61
........... ...........
MODELATTRIBUTION
$$P(\mathrm{Conv}) = \frac{1}{1 + \exp\left(-\sum_{i=0}^{n} \alpha_i x_i\right)}$$
M : a1*x1 + a2*x2
x1 = (NUM_IM + NUM_CL)
x2 = LOG(1 + NUM_PS)
EVENT TYPE EVTUUID USERID
........... ........... ...........
IM U0E1 UUID0
IM U0E2 UUID0
PS U1E1 UUID1
IM U1E2 UUID1
IM U1E3 UUID1
IM U1E4 UUID1
PS U1E5 UUID1
AFCL U1E6 UUID1
AFCL U1E7 UUID1
IM U1E8 UUID1
IM U1E9 UUID1
CL U1E10 UUID1
IM U2E1 UUID2
PS U2E2 UUID2
PS U2E3 UUID2
PS U2E4 UUID2
........... ........... ...........
ACTUUID USERID REVENUE
........... ........... ...........
U1A1 UUID1 100
........... ........... ...........
MARKETING EVENTS IN LOOKBACK PERIOD
TRANSVAR
RESPONSE
...........
1
...........
COEFFSTACK
COEFFSTACK
TRANSVAR MODELING ATTRSTACK SCORING EVENT ATTRIBUTION
GENERATED CONFIGS
INPUT CONFIGS
x FRAC_x FRACATTR
...... ......
...... ......
x1 0.46 .0766
x1 0.46 .0766
x1 0.46 .0766
...... ......
x2 0.54 0.27
x2 0.54 0.27
x1 0.46 .0766
x1 0.46 .0766
x1 0.46 .0766
...... ......
EVENT ATTRIBUTION
$$P(\mathrm{Conv}) = \frac{1}{1 + \exp\left(-\sum_{i=0}^{n} \alpha_i x_i\right)}$$
EVENT TYPE EVTUUID USERID
........... ........... ...........
PS U1E1 UUID1
IM U1E2 UUID1
IM U1E3 UUID1
IM U1E4 UUID1
PS U1E5 UUID1
AFCL U1E6 UUID1
AFCL U1E7 UUID1
IM U1E8 UUID1
IM U1E9 UUID1
CL U1E10 UUID1
........... ........... ...........
M : a1*x1 + a2*x2
x1 = (NUM_IM + NUM_CL)
x2 = LOG(1 + NUM_PS)
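The FRACATTR values in the event attribution table are consistent with splitting each feature's fraction evenly across the events that feed it: 0.46 / 6 ≈ 0.0767 for the six IM and CL events behind x1, and 0.54 / 2 = 0.27 for the two PS events behind x2. The slides do not state this rule, so the Python sketch below treats the even split as an assumption inferred from the numbers; the function and variable names are hypothetical.

```python
from collections import Counter

# (event_type, event_id) rows for UUID1, copied from the example table.
USER_EVENTS = [
    ("PS", "U1E1"), ("IM", "U1E2"), ("IM", "U1E3"), ("IM", "U1E4"),
    ("PS", "U1E5"), ("AFCL", "U1E6"), ("AFCL", "U1E7"), ("IM", "U1E8"),
    ("IM", "U1E9"), ("CL", "U1E10"),
]

# Which event types feed which model feature (from x1 and x2 above).
FEATURE_EVENT_TYPES = {"x1": {"IM", "CL"}, "x2": {"PS"}}

# Model-level fractions for this conversion (FRAC_x1, FRAC_x2 from the table).
FEATURE_FRACTIONS = {"x1": 0.46, "x2": 0.54}


def attribute_events(events, feature_types, fractions):
    """Split each feature's fraction evenly over its contributing events (assumed rule)."""
    type_to_feature = {t: f for f, types in feature_types.items() for t in types}
    per_feature = Counter(
        type_to_feature[t] for t, _ in events if t in type_to_feature
    )
    credit = {}
    for event_type, event_id in events:
        feature = type_to_feature.get(event_type)
        if feature is not None:  # events outside the model (e.g. AFCL) get no credit
            credit[event_id] = fractions[feature] / per_feature[feature]
    return credit


credit = attribute_events(USER_EVENTS, FEATURE_EVENT_TYPES, FEATURE_FRACTIONS)
```

Because AFCL is not part of model M, the two AFCL events receive no credit, and the remaining credits sum to 1.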
BaaS
Template Inputs
Orchestrator Scripts
stack_gen
stack_gen_config.xml
Run
GET
PUT
<set name="EVT_MARKETING_SEG_DIMS">
  <!-- Client Input -->
  <!-- Event Segmentation Dimensions are determined before the LBIS is created.
       These are determined during the client discovery phase based on the campaign strategy of the client -->
  <description>ELBIS columns defining event segments</description>
  <attr>channel_type</attr>
  <attr>evt_sub_channel</attr>
</set>
<set name="STCKGEN_DAYSRANGE_EVTSTACK">
  <description>List of time buckets used as T(max)_(min) for event stack terms</description>
  <elements>
    <min>1</min><max>2</max>
  </elements>
  <elements>
    <min>3</min><max>10</max>
  </elements>
  <elements>
    <min>11</min><max>21</max>
  </elements>
  <elements>
    <min>22</min><max>35</max>
  </elements>
</set>
2
BaaS
ConfigGenerator
Template Inputs
Orchestrator Scripts
stack_gen.sh
stack_gen_config.xml
Phase Parameters
stack_parameters.xml
Run
GET
PUT
<set>
  <name>EVT_MARKETING_SEG_DIMS_SET</name>
  <elements>
    <elements><attribute>channel_type</attribute><value>"Display"</value></elements>
    <elements><attribute>evt_sub_channel</attribute><value>"Online Display"</value></elements>
  </elements>
  <elements>
    <elements><attribute>channel_type</attribute><value>"Display"</value></elements>
    <elements><attribute>evt_sub_channel</attribute><value>"Direct Response Video"</value></elements>
  </elements>
  <elements>
    <elements><attribute>channel_type</attribute><value>"Paid Search"</value></elements>
    <elements><attribute>evt_sub_channel</attribute><value>"Brand"</value></elements>
  </elements>
  <elements>
    <elements><attribute>channel_type</attribute><value>"Paid Social"</value></elements>
    <elements><attribute>evt_sub_channel</attribute><value>"Unknown"</value></elements>
  </elements>
  <elements>
    <elements><attribute>channel_type</attribute><value>"Organic Search"</value></elements>
    <elements><attribute>evt_sub_channel</attribute><value>"Unknown"</value></elements>
  </elements>
</set>
2
BaaS
ConfigGenerator
Template Inputs
Orchestrator Scripts
stack_gen.sh
Phase Parameters
Phase Logic
stack_gen_config.xml
stack_parameters.xml
stack_transforms.xml
Run
GET
PUT
<param>NUM_METRICS_EVTSTACK</param>
<xprod order="1" typekey="hasdonebefore" groupkey="event" otherskey="0">
  <expras>
    <alias>
      StackMetric_HasDoneBefore_Event
      <xprod order="2">
        _
        <xprodlist order="2">
          <xprodlist order="1">
            EVT_MARKETING_SEG_DIMS_SET
            <element>
              <element>value</element>
            </element>
          </xprodlist>
        </xprodlist>
      </xprod>
    </alias>
    ..................... ..................... ..................... .....................
  </expras>
</xprod>
<xprod order="1" typekey="num" groupkey="event" otherskey="0">
  <expras>
    <alias>
      StackMetric_Num_Event
      <xprod order="2">
        _
        <xprodlist order="2">
          <xprodlist order="1">
            EVT_MARKETING_SEG_DIMS_SET
            <element>
              <element>value</element>
            </element>
          </xprodlist>
        </xprodlist>
      </xprod>
      _T<xprodlist order="1">STCKGEN_DAYSRANGE_EVTSTACK<element>max</element></xprodlist>_<xprodlist order="1">STCKGEN_DAYSRANGE_EVTSTACK<element>min</element></xprodlist>
    </alias>
    ..................... ..................... ..................... .....................
  </expras>
</xprod>
Phase Transform: https://git.marketshare.com:8443/projects/DIG/repos/msaction_backend/browse/common/BU3.0_core/stages/coeffstack/coeffstack_transforms.xml?at=refs%2Fheads%2Fautomatedmodeling
Phase Parameters: https://git.marketshare.com:8443/projects/DIG/repos/msaction_backend/browse/common/setup/config/globalconfig/systemparamdefinitions_global.xml?at=refs%2Fheads%2Fautomatedmodeling
Config Generator: https://git.marketshare.com:8443/projects/DIG/repos/msaction_backend/browse/common/BU3.0_core/util/configgenerator?at=refs%2Fheads%2Fautomatedmodeling
2
BaaS
CREATE TABLE stackevtterms AS
SELECT
  ...... AS stackmetric_hasdonebefore_event_display_onlinedisplay,
  ...... AS stackmetric_hasdonebefore_event_paidsearch_brand,
  ...... AS stackmetric_hasdonebefore_event_paidsocial_unknown,
  ...... AS stackmetric_hasdonebefore_event_organicsearch_unknown,
  ...... AS stackmetric_hasdonebefore_event_display_directresponsevideo,
  ...... AS stackmetric_num_event_display_onlinedisplay_t2_1,
  ...... AS stackmetric_num_event_display_onlinedisplay_t10_3,
  ...... AS stackmetric_num_event_display_onlinedisplay_t21_11,
  ...... AS stackmetric_num_event_display_onlinedisplay_t35_22,
  ...... AS stackmetric_num_event_paidsearch_brand_t2_1,
  ...... AS stackmetric_num_event_paidsearch_brand_t10_3,
  ...... AS stackmetric_num_event_paidsearch_brand_t21_11,
  ...... AS stackmetric_num_event_paidsearch_brand_t35_22,
  ...... AS stackmetric_num_event_paidsocial_unknown_t2_1,
  ...... AS stackmetric_num_event_paidsocial_unknown_t10_3,
  ...... AS stackmetric_num_event_paidsocial_unknown_t21_11,
  ...... AS stackmetric_num_event_paidsocial_unknown_t35_22,
  ...... AS stackmetric_num_event_organicsearch_unknown_t2_1,
  ...... AS stackmetric_num_event_organicsearch_unknown_t10_3,
  ...... AS stackmetric_num_event_organicsearch_unknown_t21_11,
  ...... AS stackmetric_num_event_organicsearch_unknown_t35_22,
  ...... AS stackmetric_num_event_display_directresponsevideo_t2_1,
  ...... AS stackmetric_num_event_display_directresponsevideo_t10_3,
  ...... AS stackmetric_num_event_display_directresponsevideo_t21_11,
  ...... AS stackmetric_num_event_display_directresponsevideo_t35_22,
  ......
FROM
  ......
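The column names in the generated query follow from a cross product of the event segments and the time buckets. The Python sketch below reproduces only the naming; it is not the ConfigGenerator or CodeGenerator, and the helper names (`slug`, `segment_key`, and so on) are hypothetical. The name normalisation (lower-casing and dropping spaces) is inferred from the aliases shown.

```python
from itertools import product

# (channel_type, evt_sub_channel) pairs from EVT_MARKETING_SEG_DIMS_SET.
SEGMENTS = [
    ("Display", "Online Display"),
    ("Display", "Direct Response Video"),
    ("Paid Search", "Brand"),
    ("Paid Social", "Unknown"),
    ("Organic Search", "Unknown"),
]

# (min, max) day ranges from STCKGEN_DAYSRANGE_EVTSTACK.
DAY_RANGES = [(1, 2), (3, 10), (11, 21), (22, 35)]


def slug(value):
    """Lower-case a dimension value and drop spaces, as the generated aliases do."""
    return value.replace(" ", "").lower()


def segment_key(segment):
    """Join the normalised dimension values of one segment with underscores."""
    return "_".join(slug(v) for v in segment)


def has_done_before_aliases(segments):
    """One HasDoneBefore column per segment."""
    return [f"stackmetric_hasdonebefore_event_{segment_key(s)}" for s in segments]


def num_aliases(segments, day_ranges):
    """One Num column per (segment, time bucket) pair, named ..._t<max>_<min>."""
    return [
        f"stackmetric_num_event_{segment_key(s)}_t{hi}_{lo}"
        for s, (lo, hi) in product(segments, day_ranges)
    ]


hdb_columns = has_done_before_aliases(SEGMENTS)
num_columns = num_aliases(SEGMENTS, DAY_RANGES)
```

Five segments and four time buckets give 5 HasDoneBefore columns and 20 Num columns, the same set that appears in the query above.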
ConfigGenerator
CodeGenerator
Template Inputs
Orchestrator Scripts
stack_gen.sh
stack_parameters.xml
stack_transforms.xml
Phase Parameters
Phase Logic
stack.hql
stack_gen_config.xml
Run
GET
PUT
2
ETL
Attribution Funnel creation
ETL
Customer Data (Raw Files)
Modeler
Attribution Models
Tool
Strategist
Elastic Load Balancer
Application Engines
Modelling Spec
• Initial set of variables
• Business rules
Model
Build N Choice Models
Model Evaluation and Ranking
• Business Rules
• Goodness of Fit
• Statistical diagnosis
Finalized Model
GA Iteration
• Top models cross over to generate offspring models
• Variable mutation

• Bayesian priors
• Coeff bounds
• Attribution bounds
$$P(\mathrm{Conv}) = \frac{1}{1 + \exp\left(-\sum_{i=0}^{n} \alpha_i x_i\right)}$$
Attribution
EnhancedStack
Automated Model Search
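The GA iteration described on this slide (rank the models, let the top ones cross over, mutate variables) can be sketched in Python as a search over feature subsets. This is a toy illustration under stated assumptions: each candidate model is a tuple of booleans marking which variables are included, and `toy_fitness` is a hypothetical stand-in for the real evaluation (business rules, goodness of fit, statistical diagnosis), which the slides do not detail.

```python
import random


def evolve(n_features, fitness, pop_size=20, generations=30,
           top_k=5, mutation_rate=0.1, seed=0):
    """Search feature subsets with single-point crossover, bit-flip mutation and elitism."""
    rng = random.Random(seed)
    population = [
        tuple(rng.random() < 0.5 for _ in range(n_features))
        for _ in range(pop_size)
    ]
    best_history = []
    for _ in range(generations):
        # Model evaluation and ranking: the top_k models survive unchanged.
        ranked = sorted(population, key=fitness, reverse=True)
        parents = ranked[:top_k]
        best_history.append(fitness(parents[0]))
        offspring = []
        while len(offspring) < pop_size - top_k:
            mother, father = rng.sample(parents, 2)
            cut = rng.randrange(1, n_features)  # single-point crossover
            child = mother[:cut] + father[cut:]
            # Variable mutation: flip each inclusion flag with a small probability.
            child = tuple(bit != (rng.random() < mutation_rate) for bit in child)
            offspring.append(child)
        population = parents + offspring
    return max(population, key=fitness), best_history


# Hypothetical target: pretend these are the variables a good model would include.
USEFUL = (True, False, True, True, False, False, True, False)


def toy_fitness(mask):
    """Count positions where the candidate agrees with the hypothetical target."""
    return sum(1 for m, u in zip(mask, USEFUL) if m == u)


best_mask, history = evolve(len(USEFUL), toy_fitness)
```

Because the top models are carried over unchanged, the best fitness per generation never decreases. A real run would replace `toy_fitness` with a score from fitted choice models and would enforce the coefficient and attribution bounds.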
2
stack_parameters.xml
stack_transforms.xml
stack.hql
CodeGenerator
BaaS
ConfigGenerator
stack_gen.sh
<set>
  <name>EVT_MARKETING_SEG_DIMS_SET</name>
  <elements>
    <elements><attribute>channel_type</attribute><value>"Display"</value></elements>
    <elements><attribute>evt_sub_channel</attribute><value>"Online Display"</value></elements>
  </elements>
  <elements>
    <elements><attribute>channel_type</attribute><value>"Display"</value></elements>
    <elements><attribute>evt_sub_channel</attribute><value>"Direct Response Video"</value></elements>
  </elements>
  <elements>
    <elements><attribute>channel_type</attribute><value>"Paid Search"</value></elements>
    <elements><attribute>evt_sub_channel</attribute><value>"Brand"</value></elements>
  </elements>
  <elements>
    <elements><attribute>channel_type</attribute><value>"Organic Search"</value></elements>
    <elements><attribute>evt_sub_channel</attribute><value>"Unknown"</value></elements>
  </elements>
</set>
<select>
  ........
  <param>METRICS_EVTSTACK</param>
  ........
</select>
........
<set name="EVT_MARKETING_SEG_DIMS">
  <attr>channel_type</attr>
  <attr>evt_sub_channel</attr>
</set>
#!/bin/bash
.................
cmdstatus=`java -jar ${codegenerator_jar_dist_loc}/${code_generator_jar_name} -s ${common_stages_loc}/$PHASE_NAME -d ${script_loc}/$PHASE_NAME -c $input_schema_config_loc -g $code_generator_grammar_loc`
.................
ORCHESTRATION
RUN
GET
PUT
ACTION BACKEND
1
3
4
5
stack_gen_config.xml
2
<param>METRICS_EVTSTACK</param>
<metric>
  <name>HASDONEBEFORE</name>
  <defn>.....</defn>
</metric>
<metric>
  <name>NUM</name>
  <defn>.....</defn>
</metric>
<metric>
  <name>GSM</name>
  <defn>.....</defn>
</metric>
CREATE TABLE stackevtterms AS
SELECT
  .. AS stackmetric_hasdonebefore_event_display_onlinedisplay,
  .. AS stackmetric_hasdonebefore_event_display_directresponsevideo,
  .. AS stackmetric_hasdonebefore_event_paidsearch_brand,
  .. AS stackmetric_hasdonebefore_event_organicsearch_unknown,
  .. AS stackmetric_num_event_display_onlinedisplay,
  .. AS stackmetric_num_event_display_directresponsevideo,
  .. AS stackmetric_num_event_paidsearch_brand,
  .. AS stackmetric_num_event_organicsearch_unknown,
  .. AS stackmetric_gsm_event_display_onlinedisplay,
  .. AS stackmetric_gsm_event_display_directresponsevideo,
  .. AS stackmetric_gsm_event_paidsearch_brand,
  .. AS stackmetric_gsm_event_organicsearch_unknown,
  ......
FROM
  ......
<set name="EVT_STACKMETRICS">
  <val enable="TRUE">HASDONEBEFORE</val>
  <val enable="TRUE">NUM</val>
  <val enable="TRUE">GSM</val>
</set>
<set name="GSM_STEP">
  <val>3</val>
</set>
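To illustrate how an input config like EVT_STACKMETRICS could switch metrics on and off, the Python sketch below reads the `enable` flags with the standard library XML parser. The wrapping `<config>` element, the `enabled_metrics` helper, and the choice to disable GSM in the sample are all hypothetical; the real ConfigGenerator may read these sets differently.

```python
import xml.etree.ElementTree as ET

# Sample input config; GSM is disabled here only to show the effect of the flag.
CONFIG = """
<config>
  <set name="EVT_STACKMETRICS">
    <val enable="TRUE">HASDONEBEFORE</val>
    <val enable="TRUE">NUM</val>
    <val enable="FALSE">GSM</val>
  </set>
  <set name="GSM_STEP">
    <val>3</val>
  </set>
</config>
"""


def enabled_metrics(xml_text, set_name="EVT_STACKMETRICS"):
    """Return the metric names whose enable attribute is TRUE in the named set."""
    root = ET.fromstring(xml_text)
    metric_set = root.find(f".//set[@name='{set_name}']")
    if metric_set is None:
        return []
    return [
        val.text.strip()
        for val in metric_set.findall("val")
        if val.get("enable", "").upper() == "TRUE"
    ]


metrics = enabled_metrics(CONFIG)
```

Under this reading, adding a new metric means adding one `<val>` entry to the set plus one `<metric>` definition, with no change to the generator itself.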
<param>METRICS_EVTSTACK</param>
<metric>
  <name>NUM</name>
  <defn>.....</defn>
</metric>
<metric>
  <name>HASDONEBEFORE</name>
  <defn>.....</defn>
</metric>
<metric>
  <name>GSM</name>
  <defn>.....</defn>
</metric>
Input Config
Adding new metric
#!/bin/bash
.................
cmdstatus=`hive -f $PHASE_NAME.hql`
.................
6
Generated Config
<transform>T14: stackevtterms_(<STCKGEN_RP><SDATE>i)
-- This query calculates all the NUM event metrics (stack variables or stack metrics).
-- stackevtterms_<RP_START_i> is created separately for each RP, with RP_START in its name.
DROP TABLE stackevtterms_(<STCKGEN_RP><SDATE>i);
CREATE TABLE stackevtterms_(<STCKGEN_RP><SDATE>i) AS
SELECT
  <NUM_METRICS_EVTSTACK>
  <OTHER_METRICS_EVTSTACK>
  t2.rp_start AS rp_start,
  t2.rp_end AS rp_end,
  t1.userid AS userid,
  t1.actuuid AS actuuid,
  t1.<ACTIVITYGROUP> AS <ACTIVITYGROUP>,
  t1.refdate_rp AS refdate_rp,
  t1.response AS response
FROM tmpUsersActivityDate_(<STCKGEN_RP><SDATE>i) t1
LEFT OUTER JOIN elbis_(<STCKGEN_RP><SDATE>i) t2
  ON (t1.userid = t2.userid)
WHERE ((t1.sample_flag = 1))
GROUP BY t2.rp_start, t2.rp_end, t1.userid, t1.actuuid, t1.<ACTIVITYGROUP>, t1.refdate_rp, t1.response
;
<param>COALESCE(SUM(CASE WHEN (
    t2.evt_event_type IN ('IM')
    AND TO_DATE(t2.evt_time) >= DATE_SUB(TO_DATE(t1.refdate_rp), (2 - 1))
    AND TO_DATE(t2.evt_time) <= DATE_SUB(TO_DATE(t1.refdate_rp), (1 - 1))
  ) THEN 1.0 ELSE 0.0 END), 0)
  AS num_im_t2_1,
COALESCE(SUM(CASE WHEN (
    t2.evt_event_type IN (<xprodlist>STCKGEN_EVTTYPE</xprodlist>)
    AND TO_DATE(t2.evt_time) >= DATE_SUB(TO_DATE(t1.refdate_rp), (<xprodlist>STCKGEN_DAYSRANGE_ACTSTACK<element>max</element></xprodlist> - 1))
    AND TO_DATE(t2.evt_time) <= DATE_SUB(TO_DATE(t1.refdate_rp), (<xprodlist>STCKGEN_DAYSRANGE_ACTSTACK<element>min</element></xprodlist> - 1))
  ) THEN 1.0 ELSE 0.0 END), 0)
  AS num_<xprodlist>STCKGEN_EVTTYPE</xprodlist>_t<xprodlist>STCKGEN_DAYSRANGE_ACTSTACK<element>max</element></xprodlist>_<xprodlist>STCKGEN_DAYSRANGE_ACTSTACK<element>min</element></xprodlist>,
COALESCE(SUM(CASE WHEN (
    t2.evt_event_type IN (<xprodlist>STCKGEN_EVTTYPE</xprodlist>)
    AND TO_DATE(t2.evt_time) >= DATE_SUB(TO_DATE(t1.refdate_rp), (<xprodlist>STCKGEN_DAYSRANGE_EVTSTACK<element>max</element></xprodlist> - 1))
    AND TO_DATE(t2.evt_time) <= DATE_SUB(TO_DATE(t1.refdate_rp), (<xprodlist>STCKGEN_DAYSRANGE_EVTSTACK<element>min</element></xprodlist> - 1))
  ) THEN 1.0 ELSE 0.0 END), 0)
  AS num_<xprodlist>STCKGEN_EVTTYPE</xprodlist>_t<xprodlist>STCKGEN_DAYSRANGE_EVTSTACK<element>max</element></xprodlist>_<xprodlist>STCKGEN_DAYSRANGE_EVTSTACK<element>min</element></xprodlist>,
Parameterization*
<!-- List of all events -->
<set>
  <name>STCKGEN_EVTTYPE</name>
  <val>'IM'</val>
  <val>'CL'</val>
  <val>'PS'</val>
  <val>'OS'</val>
  <val>'EO'</val>
  <val>'EC'</val>
  <val>'ED'</val>
  <val>'AFCL'</val>
  <val>'SOCL'</val>
  <val>'DR'</val>
</set>
<!-- List of T<max>_<min> ranges for event stack terms -->
<set>
  <name>STCKGEN_DAYSRANGE_EVTSTACK</name>
  <elements>
    <min><val>1</val></min><max><val>2</val></max>
  </elements>
  <elements>
    <min><val>3</val></min><max><val>7</val></max>
  </elements>
  <elements>
    <min><val>8</val></min><max><val>14</val></max>
  </elements>
  <elements>
    <min><val>15</val></min><max><val>21</val></max>
  </elements>
  <elements>
    <min><val>22</val></min><max><val>49</val></max>
  </elements>
</set>
50 different columns will be generated from this single expression
10 event types
5 time buckets
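The count of 50 follows from expanding the single templated expression over every (event type, time bucket) pair. The Python sketch below imitates that expansion with plain string formatting; it is not the CodeGenerator itself, and `TEMPLATE` and `expand` are hypothetical names. The SQL text is a simplified, parenthesis-balanced form of the expression shown on the slide.

```python
# Values from STCKGEN_EVTTYPE and STCKGEN_DAYSRANGE_EVTSTACK above.
EVENT_TYPES = ["IM", "CL", "PS", "OS", "EO", "EC", "ED", "AFCL", "SOCL", "DR"]
DAY_RANGES = [(1, 2), (3, 7), (8, 14), (15, 21), (22, 49)]  # (min, max)

# Simplified form of the templated NUM metric expression.
TEMPLATE = (
    "COALESCE(SUM(CASE WHEN (t2.evt_event_type IN ('{evt}') "
    "AND TO_DATE(t2.evt_time) >= DATE_SUB(TO_DATE(t1.refdate_rp), ({hi} - 1)) "
    "AND TO_DATE(t2.evt_time) <= DATE_SUB(TO_DATE(t1.refdate_rp), ({lo} - 1))) "
    "THEN 1.0 ELSE 0.0 END), 0) AS num_{alias}_t{hi}_{lo}"
)


def expand(event_types, day_ranges):
    """Emit one column expression per (event type, time bucket) pair."""
    return [
        TEMPLATE.format(evt=evt, alias=evt.lower(), hi=hi, lo=lo)
        for evt in event_types
        for lo, hi in day_ranges
    ]


columns = expand(EVENT_TYPES, DAY_RANGES)
```

Ten event types times five time buckets yields the 50 column expressions; adding an event type or a bucket to the parameter sets changes the output without touching the template.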