Strommen Tutoring - University of California,...

Final Presentation May 3, 2013 Strommen Tutoring Ian Herbert Ben Vibhagool Joncarlo Putman Link Doucedame Marlen Knusel Max Eady Group #7


Garrett Strommen, President/Founder

Strommen Tutoring: private language tutoring company

Los Angeles, CA

Founded 2009

2 full-time Employees

Over 70 contracted Teachers

Over 1,700 students

Clients include Sylvester Stallone’s kids

Organization Overview.

Provided Services:

Group Classes: Recurring classes scheduled for certain times in certain locations.

Private Tutoring: One-on-one or small group tutoring.

Translation Services: Translations for documents or books.

Organization Overview.

Track Company Transactions

Observe Company Operations

Manage Marketing Campaigns

Analysis and Reporting

Existing Database Support:

MySQL web database and PHP back end

New Database Support:

Microsoft Access

Project Goals.

Project Progress.

DP 1

• Preliminary database design

• Simplified EER Diagram (25 entities)

• Familiarization with the Strommen business model

DP 2

• 5 queries conceptualized

• EER diagram expanded substantially

• Relational design schema developed

• Relations implemented in Access

Project Progress.

DP 3

• Major query revision

• Complexity added with AMPL and sub-queries

• Query implementation in SQL

• Relations defined in database

Final Result

• Queries implemented in Access

• Practical forms and reports created

• Relationships altered slightly

• Normalization analysis

EER Diagram.

Relational Schema.

Note: a superscript (^n) on an attribute marks a foreign key referencing relation n.

1. Person (PersonID, FName, LName, MInitial, Address, Phone, Email, Gender, BirthDate)
2. Prospect (ProspectID, PersonID^1, ContactMethod, DateAdded)
3. EmployeeProspect (EmployeeProspectID, ProspectID^2, YearsExperience, Skills, Position)
4. TeacherProspect (TeacherProspectID, ProspectID^2, YearsExperience, Occupation)
5. ContractorProspect (ContractorProspectID, ProspectID^2, YearsExperience, CompanyName, ContractorType)
6. ClientProspect (ClientProspectID, ProspectID^2, EventID^17, ZipCode^14)
7. Contractor (ContractorID, PersonID^1, ContractorProspectID^5, CompanyName, ContractorType)
8. Employee (EmployeeID, PersonID^1, EmployeeProspectID^3, Skills, Position, StartDate, Salary, BankAccount)
9. Client (ClientID, PersonID^1, EventID^17, ClientProspectID^6, ZipCode^14, ReferredByID^9)
10. Teacher (TeacherID, PersonID^1, TeacherProspectID^4)
11. TeacherEvaluation (TeacherEvaluationID, TeacherID^10, ClientID^9, TopicID^21, EvaluationDate, TeacherRating, Comments)
12. Reward (RewardID, ClientID^9, RewardType, Description, DateAwarded, Received)
13. Translation (TranslationID, InitialLanguage, FinalLanguage, ClientID^9, EstCost, FinalCost, StartDate, FinishDate, TeacherID^10, ContractorID^7)
14. ZipCode (ZipCode, City, Description, Latitude, Longitude)
15. TimeSlot (TimeSlotID, TimeSlotDate, TimeSlotTime)
16. Campaign (CampaignID, CampaignName, StartDate, EndDate, Budget, EmployeeID^8)
17. Event (EventID, CampaignID^16, Date, Cost, Location, ZipCode^14)
18. Bonus (BonusID, Amount, EmployeeID^8, TeacherID^10, DateEarned)
19. ContractJob (ContractJobID, ContractJobName, Description, EstCost, FinalCost, ContractorID^7, StartDate, FinishDate)
20. ContractItem (ContractItemID, ContractJobID^19, Description, EstHours, FinalHours, HourlyRate, EstCost, FinalCost)

Relational Schema Continued.

21. Topic (TopicID, TopicName, TopicType, Description)
22. Session (SessionID, SessionName, ExperienceLevel^29, TopicID^21, TeacherID^10, ZipCode^14, PrivateSessionID^23, GroupClassID^24)
23. PrivateSession (PrivateSessionID, TravelDistance, AtClientHome)
24. GroupClass (GroupClassID, Location, Capacity)
25. SessionMaterial (SessionMaterialID, SessionID^22, MaterialName, Cost, Description, Required, Provided)
26. Transaction (TransactionID, Amount, OpenDate, CloseDate, Paid, WithdrawalID^28, DepositID^27)
27. Deposit (DepositID, SessionID^22, TranslationID^13, ClientID^9)
28. Withdrawal (WithdrawalID, CampaignID^16, BonusID^18, SessionID^22, TranslationID^13, ContractJobID^19, EmployeeID^8)
29. ExperienceLevel (ExperienceLevelID, Level, Description)
30. Event_TimeSlot (EventID^17, TimeSlotID^15)
31. Session_Client (SessionID^22, ClientID^9)
32. Session_TimeSlot (SessionID^22, TimeSlotID^15)
33. Teacher_TimeSlot (TeacherID^10, TimeSlotID^15)
34. Teacher_Topic (TeacherID^10, TopicID^21)
35. Teacher_ZipCode (TeacherID^10, ZipCode^14)
36. TeacherProspect_Topic (TeacherProspectID^4, TopicID^21)
37. TeacherProspect_ZipCode (TeacherProspectID^4, ZipCode^14)
38. Client_Topic (ClientID^9, TopicID^21)
39. ClientProspect_Topic (ClientProspectID^6, TopicID^21)

Relationship Diagram.

Normalization.

12. Reward (RewardID, ClientID^9, RewardType, Description, DateAwarded, Received)

Functional Dependencies

{RewardID} → {RewardType}

{RewardID, ClientID} → {DateAwarded, Received, Description}

Normalization.

1NF: There are no multi-valued attributes.

Reward (RewardID, ClientID^9, RewardType, Description, DateAwarded, Received)

2NF & 3NF & BCNF:

RewardType (RewardID, RewardType)

RewardInfo (RewardID, ClientID^9, Description, DateAwarded, Received)

Normalization.

25. SessionMaterial (SessionMaterialID, SessionID^22, MaterialName, Cost, Description, Required, Provided)

Functional Dependencies

{SessionMaterialID} → {MaterialName}

{SessionMaterialID, SessionID} → {Required, Provided}

{MaterialName} → {Cost, Description}

Normalization.

1NF: Currently in 1NF because there are no multi-valued attributes.

SessionMaterial (SessionMaterialID, SessionID^22, MaterialName, Cost, Description, Required, Provided)

2NF: Removes partial dependencies that non-prime attributes have on the candidate key.

SessionMaterial (SessionMaterialID, SessionID^22, Required, Provided)

Material (SessionMaterialID, MaterialName, Cost, Description)

Normalization.

3NF: Removes the transitive dependency in which MaterialName determines Cost and Description.

SessionMaterial (SessionMaterialID, SessionID^22, Required, Provided)

Material (SessionMaterialID, MaterialName)

MaterialInfo (MaterialName, Cost, Description)

This is also in BCNF!
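
The key and partial-dependency claims above can be checked mechanically with an attribute-closure computation. The following is a minimal Python sketch, not part of the original project. The function name `closure` and the constant names are illustrative assumptions; the functional dependencies are the ones listed for SessionMaterial.

```python
def closure(attrs, fds):
    """Return every attribute functionally determined by `attrs` under `fds`."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # Apply an FD when its left side is covered and it adds something new.
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


# Functional dependencies listed on the slide for SessionMaterial.
SESSION_MATERIAL_FDS = [
    (frozenset({"SessionMaterialID"}), frozenset({"MaterialName"})),
    (frozenset({"SessionMaterialID", "SessionID"}), frozenset({"Required", "Provided"})),
    (frozenset({"MaterialName"}), frozenset({"Cost", "Description"})),
]

ALL_ATTRS = {
    "SessionMaterialID", "SessionID", "MaterialName",
    "Cost", "Description", "Required", "Provided",
}

# The composite key determines every attribute, so it is a candidate key.
key_closure = closure({"SessionMaterialID", "SessionID"}, SESSION_MATERIAL_FDS)

# Part of the key alone already determines MaterialName, Cost and Description:
# this is the partial dependency that the 2NF step removes.
partial_closure = closure({"SessionMaterialID"}, SESSION_MATERIAL_FDS)

print(key_closure == ALL_ATTRS)
print(sorted(partial_closure))
```

The closure of {SessionMaterialID} reaches Cost and Description only through MaterialName, which is the transitive dependency that the 3NF step splits into MaterialInfo.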

Queries.

Query 1:

ROI for Marketing Campaigns

Objective: Determine the most cost-effective marketing campaigns and events in terms of Return on Investment (ROI).

[Figure: High ROI Campaigns; Profitable Clients]

Query 1:

ROI for Marketing Campaigns

Purpose: Marketing campaigns with the highest ROI are the most cost-effective. ROI analysis would allow Strommen Tutoring to replicate the design of effective campaigns to better target profitable clients.

Query 1:

ROI for Marketing Campaigns

Query: Select the name, cost, revenue, and ROI for each Marketing Campaign.

[Figure labels: Campaign, Events, Clients, Net Profit; Campaign, Events, Campaign Cost]

ROI = Net Profit / Campaign Cost

Query 1:

ROI for Marketing Campaigns

SELECT RevenueQuery.Campaign AS CampaignID, RevenueQuery.CampName AS CampaignName,
       CostQuery.Cost AS TotalCost, RevenueQuery.Revenue AS TotalRevenue,
       IIF(TotalCost > 0, TotalRevenue * 100 / TotalCost & "%", 0 & "%") AS ROI
FROM (
    SELECT c.CampaignID AS Campaign, c.CampaignName AS CampName,
           IIF(SUM(dt.Amount) > 0, 0.3 * SUM(dt.Amount), 0) AS Revenue
    FROM Campaign AS c, Event AS e, Client AS cl, Session_Client AS sc, [Session] AS s,
         (SELECT * FROM Deposit AS d, [Transaction] AS t1
          WHERE d.DepositID = t1.DepositID) AS dt
    WHERE dt.Paid = TRUE AND c.CampaignID = e.CampaignID AND e.EventID = cl.EventID
      AND cl.ClientID = sc.ClientID AND sc.SessionID = s.SessionID
      AND dt.SessionID = s.SessionID AND dt.ClientID = cl.ClientID
    GROUP BY c.CampaignID, c.CampaignName) AS RevenueQuery,
    (SELECT c2.CampaignID AS Campaign, IIF(SUM(wt.Amount) > 0, SUM(wt.Amount), 0) AS Cost
     FROM Campaign AS c2,
          (SELECT * FROM Withdrawal AS w, [Transaction] AS t2
           WHERE w.WithdrawalID = t2.WithdrawalID) AS wt
     WHERE wt.CampaignID = c2.CampaignID
     GROUP BY c2.CampaignID) AS CostQuery
WHERE RevenueQuery.Campaign = CostQuery.Campaign
GROUP BY RevenueQuery.Campaign, RevenueQuery.CampName, CostQuery.Cost,
         RevenueQuery.Revenue;

SQL>
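
As a rough illustration of the arithmetic in the Query 1 SQL, the Python sketch below recomputes ROI from flat lists of deposits and withdrawals. It is a simplification under stated assumptions: the function name `campaign_roi`, the tuple layouts, and the sample figures are hypothetical, and the joins through Event, Client, and Session are assumed to have already resolved each deposit to a campaign. The 0.3 factor and the "0 when cost is 0" rule come from the query itself.

```python
from collections import defaultdict

PROFIT_MARGIN = 0.3  # the 0.3 factor the query applies to paid deposits


def campaign_roi(deposits, withdrawals):
    """Return {campaign_id: (total_cost, total_revenue, roi_percent)}.

    deposits:    iterable of (campaign_id, amount, paid)   (hypothetical layout)
    withdrawals: iterable of (campaign_id, amount)         (hypothetical layout)
    """
    revenue = defaultdict(float)
    for campaign_id, amount, paid in deposits:
        if paid:  # mirrors WHERE dt.Paid = TRUE
            revenue[campaign_id] += PROFIT_MARGIN * amount

    cost = defaultdict(float)
    for campaign_id, amount in withdrawals:
        cost[campaign_id] += amount

    report = {}
    # Only campaigns present on both sides survive, like the final WHERE join.
    for campaign_id in revenue.keys() & cost.keys():
        total_cost = cost[campaign_id]
        total_revenue = revenue[campaign_id]
        roi = total_revenue * 100 / total_cost if total_cost > 0 else 0.0
        report[campaign_id] = (total_cost, total_revenue, roi)
    return report


# Made-up sample data, for illustration only.
sample_deposits = [(1, 1000.0, True), (1, 500.0, False), (2, 2000.0, True)]
sample_withdrawals = [(1, 150.0), (2, 1200.0), (3, 400.0)]
roi_report = campaign_roi(sample_deposits, sample_withdrawals)
for campaign_id in sorted(roi_report):
    print(campaign_id, roi_report[campaign_id])
```

In this sample, campaign 3 has costs but no paid deposits, so it drops out of the report, which matches the inner-join behavior of the SQL.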

Query 1:

ROI for Marketing Campaigns

Results:

Query 2:

Optimal Teacher Assignment

Objective: Find the best-suited teacher for each session not yet assigned a teacher, based on teacher rating, zip code, availability, and topic.

Query 2:

Optimal Teacher Assignment

Purpose: To reward the most highly rated teachers by assigning them the most new clients, while also providing the best possible first impression of Strommen Tutoring.

Query 2:

Optimal Teacher Assignment

Query: For each session not yet assigned a teacher, select a teacher who teaches the session's topic, can teach for the entire session, and works in the same zip code. If there are multiple feasible teachers, select the one with the highest average teacher rating from teacher evaluations.

Query 2:

Optimal Teacher Assignment

SELECT FinalQuery.SuperSession AS SessionID,
       (SELECT TOP 1 FName & ' ' & LName
        FROM Person p, Teacher t2, [Session] s2, Teacher_Topic tt2, Teacher_ZipCode tzc2
        WHERE s2.SessionID = FinalQuery.SuperSession
          AND FinalQuery.MaxRating = (SELECT AVG(te2.TeacherRating) FROM TeacherEvaluation te2
                                      WHERE te2.TeacherID = t2.TeacherID)
          AND tzc2.ZipCode = s2.ZipCode AND tzc2.TeacherID = t2.TeacherID
          AND tt2.TeacherID = t2.TeacherID AND tt2.TopicID = s2.TopicID
          AND p.PersonID = t2.PersonID
        ORDER BY t2.TeacherID ASC) AS TeacherName,
       (SELECT TOP 1 t3.TeacherID
        FROM Teacher t3, [Session] s3, Teacher_Topic tt3, Teacher_ZipCode tzc3
        WHERE s3.SessionID = FinalQuery.SuperSession
          AND FinalQuery.MaxRating = (SELECT AVG(te3.TeacherRating) FROM TeacherEvaluation te3
                                      WHERE te3.TeacherID = t3.TeacherID)
          AND tzc3.ZipCode = s3.ZipCode AND tzc3.TeacherID = t3.TeacherID
          AND tt3.TeacherID = t3.TeacherID AND tt3.TopicID = s3.TopicID
        ORDER BY t3.TeacherID ASC) AS TeacherID
FROM (
    SELECT SuperQuery.SubSession AS SuperSession, Max(SuperQuery.SubRating) AS MaxRating
    FROM (SELECT SubQuery.Sessions AS SubSession, SubQuery.Teachers AS SubTeacher,
                 (SELECT Avg(te.TeacherRating) FROM TeacherEvaluation te
                  WHERE te.TeacherID = SubQuery.Teachers) AS SubRating
          FROM (SELECT s.SessionID AS Sessions, t.TeacherID AS Teachers
                FROM Teacher AS t, [Session] AS s, Teacher_ZipCode AS tzc, Teacher_Topic AS tt
                WHERE t.TeacherID = tzc.TeacherID AND tt.TopicID = s.TopicID
                  AND tt.TeacherID = t.TeacherID AND tzc.ZipCode = s.ZipCode
                  AND s.TeacherID Is NULL
                  AND (SELECT COUNT(*) FROM Session_TimeSlot AS sts2
                       WHERE sts2.SessionID = s.SessionID)
                    = (SELECT COUNT(*) FROM Session_TimeSlot AS sts, Teacher_TimeSlot AS tts
                       WHERE t.TeacherID = tts.TeacherID AND tts.TimeSlotID = sts.TimeSlotID
                         AND sts.SessionID = s.SessionID)) AS SubQuery) AS SuperQuery
    GROUP BY SuperQuery.SubSession) AS FinalQuery;

SQL>
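
The selection rule in Query 2 may be easier to follow outside SQL. The Python sketch below is a simplified illustration, not the project's implementation. The function name `best_teacher`, the dictionary layout, and the sample data are hypothetical. It keeps the rules visible in the query: the teacher must match topic and zip code, must be available in every time slot of the session, and ties on the highest average rating go to the lowest TeacherID.

```python
def best_teacher(session, teachers, ratings):
    """Pick the feasible teacher with the highest average rating.

    session:  {"topic": ..., "zip": ..., "slots": set_of_timeslot_ids}
    teachers: {teacher_id: {"topics": set, "zips": set, "slots": set}}
    ratings:  {teacher_id: [rating, ...]}
    (All three layouts are hypothetical stand-ins for the database tables.)
    """
    feasible = [
        teacher_id
        for teacher_id, info in teachers.items()
        if session["topic"] in info["topics"]
        and session["zip"] in info["zips"]
        # Available for the entire session: every session slot is a teacher slot.
        and session["slots"] <= info["slots"]
    ]
    # Teachers without evaluations have no average and are skipped here.
    rated = [
        (sum(ratings[teacher_id]) / len(ratings[teacher_id]), teacher_id)
        for teacher_id in feasible
        if ratings.get(teacher_id)
    ]
    if not rated:
        return None
    top_rating = max(average for average, _ in rated)
    # Tie-break on the lowest TeacherID, like ORDER BY TeacherID ASC with TOP 1.
    return min(teacher_id for average, teacher_id in rated if average == top_rating)


# Made-up sample data, for illustration only.
sample_session = {"topic": "Spanish", "zip": "Z1", "slots": {1, 2}}
sample_teachers = {
    10: {"topics": {"Spanish"}, "zips": {"Z1"}, "slots": {1, 2, 3}},
    11: {"topics": {"Spanish"}, "zips": {"Z1"}, "slots": {1}},  # misses slot 2
    12: {"topics": {"Spanish", "French"}, "zips": {"Z1"}, "slots": {1, 2}},
}
sample_ratings = {10: [4, 5], 11: [5, 5], 12: [5, 4]}
chosen = best_teacher(sample_session, sample_teachers, sample_ratings)
print(chosen)
```

Teacher 11 has the best rating in the sample but cannot cover the whole session, so the choice falls to teachers 10 and 12, and the tie is broken by ID.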

Query 2:

Optimal Teacher Assignment

Sessions without Teachers assigned

Query 2:

Optimal Teacher Assignment

Results

Query 3:

Reward Allocation System

Objective: Find clients who are eligible for rewards and what rewards they are eligible for.

Query 3:

Reward Allocation System

Purpose: Rewards are a way to encourage customer loyalty through a variety of incentives that may appeal to clients. Rewards also draw new clients.

Query 3:

Reward Allocation System

Reward Type: Example

Shirts: After 5 Sessions

Water Bottle: After 10 Sessions

Sessions: Every 10 Sessions

Query 3:

Reward Allocation System

Query: For each client, determine the number of rewards he/she is eligible for but has not yet been assigned.

n = # sessions a client has purchased

T-Shirt = 1 if n >= 5, 0 otherwise

Water Bottle = 1 if n >= 10, 0 otherwise

Sessions = Floor(n / 10)

Eligible rewards = each value above - # of each reward already assigned
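
The eligibility rule on this slide can be sketched in a few lines of Python. This is an illustration only: the function name `eligible_rewards` and the dictionary keys are hypothetical, and clamping negative results to 0 is an added assumption rather than something the slide states. The SQL that follows also adds referral counts to the free-session total, which this sketch leaves out.

```python
def eligible_rewards(sessions_purchased, already_assigned):
    """Return the number of each reward a client can still be assigned.

    sessions_purchased: n, the number of sessions the client has purchased
    already_assigned:   {reward_type: count already assigned} (hypothetical layout)
    """
    n = sessions_purchased
    earned = {
        "T-Shirt": 1 if n >= 5 else 0,
        "Water Bottle": 1 if n >= 10 else 0,
        "Free Session": n // 10,  # Floor(n / 10)
    }
    # Subtract what was already assigned; never report a negative count
    # (the clamp is an assumption added for this sketch).
    return {
        reward: max(count - already_assigned.get(reward, 0), 0)
        for reward, count in earned.items()
    }


# Made-up example: 23 sessions, one T-shirt and one free session already given.
remaining = eligible_rewards(23, {"T-Shirt": 1, "Free Session": 1})
print(remaining)
```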

Query 3:

Reward Allocation System

SELECT Final.Client AS Client,
       (SELECT FName & ' ' & LName FROM Person, Client
        WHERE Client.ClientID = Final.Client AND Person.PersonID = Client.PersonID) AS ClientName,
       Final.WaterBottle, Final.TShirt,
       INT(ClientSCount / 10) + RefCount - FreeSCount AS EligibleFreeSession
FROM (
    SELECT CountCalc.Client AS Client, RefCount, FreeSCount, ClientSCount, WaterBottle, TShirt
    FROM (SELECT DISTINCT Refer.Client AS Client, Refer.RCount AS RefCount,
                 IIF(ISNULL(Free.FCount), 0, Free.FCount) AS FreeSCount
          FROM (SELECT c1.ClientID AS Client, Count(*) AS RCount
                FROM Client AS c1, Client AS c2
                WHERE c1.ClientID = c2.ReferredByID
                GROUP BY c1.ClientID) AS Refer
          LEFT JOIN (SELECT DISTINCT r.ClientID AS Client, Count(*) AS FCount
                     FROM Reward AS r
                     WHERE r.RewardType = 'Free Session'
                     GROUP BY r.ClientID) AS Free
          ON Refer.Client = Free.Client) AS CountCalc
    INNER JOIN (SELECT itemRe.Client,
                       IIF((NeverWater > 0 AND ClientSCount >= 10), 1, 0) AS WaterBottle,
                       IIF((NeverTShirt > 0 AND ClientSCount >= 5), 1, 0) AS TShirt,
                       itemRe.ClientSCount
                FROM (SELECT SesCount.Client AS Client, ClientSCount, NeverWater, NeverTShirt
                      FROM (SELECT DISTINCT ClientSessions.ClientID AS Client,
                                   IIF(NOT EXISTS (SELECT r.ClientID FROM Reward AS r
                                                   WHERE r.ClientID = ClientSessions.ClientID
                                                     AND RewardType = 'Water Bottle'), 1, 0) AS NeverWater,
                                   IIF(NOT EXISTS (SELECT r.ClientID FROM Reward AS r
                                                   WHERE r.ClientID = ClientSessions.ClientID
                                                     AND RewardType = 'T-Shirt'), 1, 0) AS NeverTShirt
                            FROM (SELECT sc.ClientID AS ClientID
                                  FROM Client AS c INNER JOIN Session_Client AS sc
                                  ON sc.ClientID = c.ClientID) AS ClientSessions) AS ItemCheck
                      INNER JOIN (SELECT ClientSessions.ClientID AS Client, Count(*) AS ClientSCount
                                  FROM (SELECT sc.ClientID AS ClientID
                                        FROM Client AS c INNER JOIN Session_Client AS sc
                                        ON sc.ClientID = c.ClientID) AS ClientSessions
                                  GROUP BY ClientSessions.ClientID) AS SesCount
                      ON SesCount.Client = ItemCheck.Client) AS itemRe) AS ItemCalc
    ON CountCalc.Client = ItemCalc.Client) AS Final;

SQL>

Query 3:

Reward Allocation System

Results

Query 3:

Reward Allocation System

Form Application:

Click this Button!

Query 3:

Reward Allocation System

The query updates the # of eligible rewards to 0 for each Reward Type.

New Rewards are created in the database using Visual Basic.

Query 4: Monthly Session Process Control

Objective: Determine trends in the # of sessions for different topics over the last 2 years. Analyze the significance of outliers in the data and identify actions to support or prevent these outliers.

Query 4: Monthly Session Process Control

Purpose: Help target appropriate markets at appropriate times. Adjust hiring practices to seasonal trends.

Query: For each topic, for each of the last 24 months, find the # of sessions provided and the deviation from the average.

Query 4: Monthly Session Process Control

Query Results:

E[x_i] = μ

Var[x_i] = σ²/n

Deviation = x_i - μ, for every month i, for each topic

Statistical Process Control: X-Chart

Upper and Lower Control Limits for 95% Probability Range: μ ± 2σ/√n

SELECT tc.TopicID, tc.TopicName, tc.Year, tc.Month,
       DateValue(tc.Month & "/1/" & tc.Year) AS ProcessDate,
       TopicCount, AverageTopicCount,
       TopicCount - AverageTopicCount AS DeviationFromAverage,
       StandardDeviation AS SD,
       IIF(StandardDeviation > 0 AND (DeviationFromAverage / StandardDeviation <= -2
           OR DeviationFromAverage / StandardDeviation >= 2),
           "Out Of Control", "In Control") AS Status
FROM (
    SELECT t.TopicID AS TopicID, t.TopicName AS TopicName,
           Year(ts.TimeSlotDate) AS [Year], Month(ts.TimeSlotDate) AS [Month],
           COUNT(s.SessionID) AS TopicCount
    FROM Topic AS t, [Session] AS s, TimeSlot AS ts, Session_TimeSlot AS sts
    WHERE sts.SessionID = s.SessionID AND s.TopicID = t.TopicID
      AND sts.TimeSlotID = ts.TimeSlotID
      AND ((Month(ts.TimeSlotDate) >= Month(Date()) AND Year(Date()) - Year(ts.TimeSlotDate) = 2)
           OR (Year(Date()) - Year(ts.TimeSlotDate) < 2))
    GROUP BY t.TopicID, t.TopicName, Year(ts.TimeSlotDate), Month(ts.TimeSlotDate)) AS tc
INNER JOIN (
    SELECT TopicID, ROUND(Avg(TopicCount), 2) AS AverageTopicCount,
           ROUND(StDev(TopicCount), 2) AS StandardDeviation
    FROM (SELECT t.TopicID AS TopicID,
                 Year(ts.TimeSlotDate) AS [Year], Month(ts.TimeSlotDate) AS [Month],
                 COUNT(s.SessionID) AS TopicCount
          FROM Topic AS t, [Session] AS s, TimeSlot AS ts, Session_TimeSlot AS sts
          WHERE sts.SessionID = s.SessionID AND s.TopicID = t.TopicID
            AND sts.TimeSlotID = ts.TimeSlotID
            AND ((Month(ts.TimeSlotDate) >= Month(Date()) AND Year(Date()) - Year(ts.TimeSlotDate) = 2)
                 OR (Year(Date()) - Year(ts.TimeSlotDate) < 2))
          GROUP BY t.TopicID, Year(ts.TimeSlotDate), Month(ts.TimeSlotDate)) AS SubQuery
    GROUP BY TopicID) AS atc
ON tc.TopicID = atc.TopicID
ORDER BY tc.TopicID, tc.Year DESC, tc.Month DESC;

Query 4: Monthly Session Process Control

SQL>
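
The "In Control" / "Out Of Control" flag in the Query 4 SQL can be reproduced with a short Python sketch for a single topic. The function name `control_status`, the month labels, and the sample counts are hypothetical. Like Access's StDev, `statistics.stdev` computes the sample standard deviation; unlike the SQL, this sketch does not round the average and standard deviation to two decimals.

```python
from statistics import mean, stdev


def control_status(monthly_counts, threshold=2.0):
    """Flag months whose session count deviates from the mean by >= threshold SDs.

    monthly_counts: list of (month_label, session_count) for one topic
    Returns a list of (month_label, session_count, deviation, status).
    """
    counts = [count for _, count in monthly_counts]
    average = mean(counts)
    # A sample standard deviation needs at least two data points.
    sd = stdev(counts) if len(counts) > 1 else 0.0

    rows = []
    for label, count in monthly_counts:
        deviation = count - average
        # Mirrors the IIF: only flag when SD > 0 and |deviation / SD| >= 2.
        out_of_control = sd > 0 and abs(deviation) / sd >= threshold
        status = "Out Of Control" if out_of_control else "In Control"
        rows.append((label, count, deviation, status))
    return rows


# Made-up counts for one topic: nine ordinary months and one spike.
sample_counts = [("month %d" % i, 10) for i in range(1, 10)] + [("month 10", 30)]
status_rows = control_status(sample_counts)
for row in status_rows:
    print(row)
```

With these sample counts the mean is 12 and the sample standard deviation is about 6.32, so only the spike month (deviation 18) crosses the two-standard-deviation threshold.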

Query 4: Monthly Session Process Control

Results:

Query 4: Monthly Session Process Control

Report Application:

Query 5: Optimal Class Location

Objective: Find the optimal location (zip code) for future group classes based on the number of current clients and prospective clients who have expressed interest in the topic in each zip code (Set Packing Problem).

Query 5: Optimal Class Location

Purpose: Strommen Tutoring would like to expand its group class services. Optimizing class location will increase the probability of high attendance (and profitability) by offering classes in areas with more interested people.

Query 5: Optimal Class Location

Goal: Choose the best (zip code, topic) pairs that maximize the # of interested clients and prospective clients within a given radius.

[Figure: map illustration with numeric labels]

User-Defined Parameters:

- Radius

- # of Classes

Query 5: Optimal Class Location

SELECT UnionQuery.ZipCode AS ZipCode, UnionQuery.Topic AS Topic,
       UnionQuery.Latitude AS Latitude, UnionQuery.Longitude AS Longitude,
       SUM(UnionQuery.NumberofClients) AS NumberofClients
FROM (SELECT SubQuery.ZipCode AS ZipCode, SubQuery.Topic AS Topic,
             SubQuery.Latitude AS Latitude, SubQuery.Longitude AS Longitude,
             COUNT(SubQuery.Client) AS NumberofClients
      FROM (SELECT ZipCodeDistances.ZipCode AS ZipCode, ZipCodeDistances.Topic AS Topic,
                   ZipCodeDistances.Latitude AS Latitude, ZipCodeDistances.Longitude AS Longitude,
                   ZipCodeDistances.Client AS Client
            FROM (SELECT z1.ZipCode AS ZipCode, t1.TopicID AS Topic, z1.Latitude AS Latitude,
                         z1.Longitude AS Longitude, c1.ClientID AS Client
                  FROM ZipCode AS z1, Topic AS t1, Client AS c1, Client_Topic AS ct
                  WHERE c1.ZipCode = z1.ZipCode AND ct.TopicID = t1.TopicID
                    AND ct.ClientID = c1.ClientID) AS ZipCodeDistances) AS SubQuery
      GROUP BY SubQuery.ZipCode, SubQuery.Topic, SubQuery.Latitude, SubQuery.Longitude
      UNION ALL
      SELECT SubQuery2.ZipCode AS ZipCode, SubQuery2.Topic AS Topic,
             SubQuery2.Latitude AS Latitude, SubQuery2.Longitude AS Longitude,
             COUNT(SubQuery2.ClientProspect) AS NumberofClients
      FROM (SELECT ZipCodeDistances2.ZipCode AS ZipCode, ZipCodeDistances2.Topic AS Topic,
                   ZipCodeDistances2.Latitude AS Latitude, ZipCodeDistances2.Longitude AS Longitude,
                   ZipCodeDistances2.ClientProspect AS ClientProspect
            FROM (SELECT z3.ZipCode AS ZipCode, t2.TopicID AS Topic, z3.Latitude AS Latitude,
                         z3.Longitude AS Longitude, cp1.ClientProspectID AS ClientProspect
                  FROM ZipCode AS z3, Topic AS t2, ClientProspect AS cp1, ClientProspect_Topic AS cpt
                  WHERE cp1.ZipCode = z3.ZipCode AND cpt.TopicID = t2.TopicID
                    AND cpt.ClientProspectID = cp1.ClientProspectID
                    AND NOT EXISTS (SELECT * FROM Client AS c3
                                    WHERE c3.ClientProspectID = cp1.ClientProspectID)) AS ZipCodeDistances2) AS SubQuery2
      GROUP BY SubQuery2.ZipCode, SubQuery2.Topic, SubQuery2.Latitude, SubQuery2.Longitude) AS UnionQuery
GROUP BY UnionQuery.ZipCode, UnionQuery.Topic, UnionQuery.Latitude, UnionQuery.Longitude
ORDER BY SUM(UnionQuery.NumberofClients) DESC;

SQL>

Query 5: Optimal Class Location

Results (Access)

Query 5: Optimal Class Location

spherical distance formula.

linear program variables.
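
The slide's spherical distance formula is not reproduced in this transcript. As a hedged illustration of how a radius test between zip codes could work, the Python sketch below uses the haversine great-circle formula, which is one common choice and may differ from the formula the authors used. The names `haversine_km` and `zips_within`, the Earth-radius constant, and the sample coordinates are all assumptions for this sketch.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; an assumed constant


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two (latitude, longitude) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    delta_phi = phi2 - phi1
    delta_lambda = math.radians(lon2 - lon1)
    a = (math.sin(delta_phi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(delta_lambda / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def zips_within(center, zip_coords, radius_km):
    """Return the sorted zip codes whose coordinates lie within radius_km of `center`.

    zip_coords: {zip_code: (latitude, longitude)}, standing in for the ZipCode table
    """
    center_lat, center_lon = zip_coords[center]
    return sorted(
        zip_code
        for zip_code, (lat, lon) in zip_coords.items()
        if haversine_km(center_lat, center_lon, lat, lon) <= radius_km
    )


# Made-up coordinates with placeholder labels instead of real zip codes.
sample_zip_coords = {"A": (34.0, -118.0), "B": (34.0, -118.1), "C": (35.0, -118.0)}
nearby = zips_within("A", sample_zip_coords, 20.0)
print(nearby)
```

A table of pairwise distances like this would supply the "within a given radius" data that an optimization model needs, with the user-defined radius passed as `radius_km`.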

Query 5: Optimal Class Location

linear program formulation.

Query 5: Optimal Class Location

Formulation of LP in AMPL

model file. data file.

Query 5: Optimal Class Location

Results (AMPL)

optimal solution output

User Interface.

Report: Prospective Clients List

Report: Pending Transactions

Form: Teacher Annual Performance Report

Form: New Person

Questions?