A Hybrid Simulation Approach for Competitive Open Software … · 2018. 11. 6. · SDSF 2018...
Transcript of A Hybrid Simulation Approach for Competitive Open Software … · 2018. 11. 6. · SDSF 2018...
-
SDSF 2018 November 7, 2018
A Hybrid Simulation Approach for Competitive Open Software Development Process
Sponsor: DASD(SE)By
Razieh Saremi6th Annual SERC Doctoral Students Forum
November 7, 2018FHI 360 CONFERENCE CENTER1825 Connecticut Avenue NW
8th FloorWashington, DC 20009
www.sercuarc.org
-
Table of Content
Background
Research Question and Methodology
Research Design and Analysis
Conclusion
2
-
Open Competitive Software Workflow
3
?
Open Competitive
Scheduling
Re-Work
Crowdsourced
Platform
Crow
d W
orke
rs
Tasks O
wner Potentially large number of unknown
workers Have access to the internet Collaborate and coordinate on the
tasks Workers take the work they choose
Open Market Software Development [Weiss 2005]:
-
Task Scheduling in Software Engineering
Cost (Cheap)[Stole 2014]
Quality (Good)[Stole 2014]
Duration (Fast)
[ Alba et al 2006, Amiri et al 2015]
Challenges in OSD Scheduling: [Gao et al 2015, Barreto et al 2008]
Not knowing workers in person, Working from different time zone, Workers interest in other tasks among pool of open
tasks
4
Scheduling a software project:Setting sequence of time dependent tasks that make a project [Mingozi et al 1998]Assign tasks to workers to be done in specific time frame [Alba et al 2007]
Complex tasks require higher cooperation amongst co-workers [Malon 1994].
Understanding the crowd workers’ sensitivity and performance to the arrival task [Difallah 2016]
Zero registration, Zero submission and Not qualified submission.
Improper scheduling would result in task failure [Faradiani et al 2011]
-
Motivation Example5
0
0.05
0.1
0.15
0.2
0.25
0
10
20
30
40
50
60
8 11 13 14 15 17 18
# Si
mila
r Ope
n Ta
sks
Task IDSimilar Open Tasks Reliability Factor
Relia
bilit
y Fa
ctor
Project Duration: 110 DaysProject Failure Ratio: 57%
5
-
Limitations of Existing Methods
6
Task Context
Worker Availability
Required Skill-Set
Basic Space Sharing [chi 2013] FIFO
Round RobinShorter Job Fair
HIT Bundle [Difallah2016]
QOS Based Scheduling [Khazankin 2011]
Delay Scheduling [Rajan 2013]
Game with a purpose[Zaharia 2010]
Fair Scheduling /Hodoop-Yarn [Ghodsi 2011]
Weighted Fair Sharing
Fair Sharing
Flash Organization[Valentine et al 2017]
Task Similarity,Resource Reliability
Required Skill-Set
Same Batches of Tasks,Switch Context
Resource Availability, Task Size, Task Priority
-
Research Question
7
Is it feasible to provide a more effective automated task schedulingframework in competitive open software development environments inorder to reduce task failure ratio?
Presented Model:
– CSD data from Jan 2014 to Feb 2015– extracted from TopCoder website – 403 individual projects – 4907 component development tasks – 8108 workers, 5062 active workers– 60433 worker-task participation records
Research Approach:
Hybrid simulation model: Systems Dynamic Discrete Event Agent Based Model
Trained Data Set:
-
CSD Market Place
Demand:
CSD Mini-Tasks per week
76 New Arrival2 Cancel1 Starve
8 Fail
Supply:
CSD Workers per week
Belt Rating Range (X) # Workers % Workers
Gray X
-
Macro LevelCompetition
Model
Meso LevelTask
Completion
Micro LevelAgent Model
System Dynamic
Simulation
Discrete Event
Simulation
Agent Based
Simulation
Task Execution Decision
MakingP
erfo
rman
ce
Hyb
rid S
imul
atio
n
Worker Skill Set
CSD Market Requester
Company
Task Similarity
Task
Worker Decision
Submission Quality
Worker Profile
FailureSuccess
Task ArrivalWorker Arrival
Study Design
Overall View of Hybrid Simulation ModelRelaxed Assumptions:
Workers’ trust factor is constant
Tasks will not be cancelled by client requests
Tasks will not face zero registration scenario
9
-
Micro level: Agent Model
59% of workers respond to a task call
24% of the workers will make submissions
10
Knowledge:
Submitting:
Agent’s Decision Making:
0.51
1
P(Sj )=
0.6 𝑗𝑗𝑗𝑗 𝑅𝑅𝑅𝑅𝑅𝑅0.6 𝑗𝑗 𝑗𝑗 𝑌𝑌𝑅𝑅𝑌𝑌𝑌𝑌𝑌𝑌𝑌𝑌0.39 𝑗𝑗 𝑗𝑗 𝐵𝐵𝑌𝑌𝐵𝐵𝑅𝑅0.45 𝑗𝑗 𝑗𝑗 𝐺𝐺𝐺𝐺𝑅𝑅𝑅𝑅𝐺𝐺0.25 𝑗𝑗 𝑗𝑗 𝐺𝐺𝐺𝐺𝐺𝐺𝐺𝐺
P(Rj )=�1 𝑅𝑅𝑅𝑅𝑅𝑅𝑅𝑅 ≤ 18𝐵𝐵𝑅𝑅𝐺𝐺𝐺𝐺𝑌𝑌𝐵𝐵𝑌𝑌𝑌𝑌𝐵𝐵 0.3 𝑅𝑅𝑅𝑅𝑅𝑅𝑅𝑅 > 18
Registering:
Workers’ Reliability (Re): Pert(0, 1, 0.19)
10
-
Meso level: Task Completion Model
11
P(TRi )=�0 AR Massege ≠ 11 AR Massege = 1
P(TWi )=�𝐶𝐶𝐺𝐺𝐺𝐺𝐶𝐶𝑅𝑅𝑌𝑌𝑌𝑌𝑅𝑅𝑅𝑅 Score < 75𝐶𝐶𝑌𝑌𝐶𝐶𝐶𝐶𝑌𝑌𝑅𝑅𝐶𝐶𝑅𝑅 Score ≥ 75
P(TSi )=�0 AS Massege ≠ 11 AS Massege = 1
Task Score/ Quality: uniform (0,100)
FPRk =
∑𝑖𝑖𝑛𝑛 𝑅𝑅𝑅𝑅𝑗𝑗 ∗ 𝑃𝑃𝑗𝑗
3∑𝑖𝑖𝑛𝑛 𝑅𝑅𝑅𝑅𝑗𝑗 > 2
∑𝑖𝑖𝑛𝑛 𝑅𝑅𝑅𝑅𝑗𝑗 ∗ 𝑃𝑃𝑗𝑗
21 < ∑𝑖𝑖𝑛𝑛 𝑅𝑅𝑅𝑅𝑗𝑗 < 2
∑𝑖𝑖𝑛𝑛 𝑅𝑅𝑅𝑅𝑗𝑗 ∗ 𝑃𝑃𝑗𝑗
1∑𝑖𝑖𝑛𝑛 𝑅𝑅𝑅𝑅𝑗𝑗 < 1
FPSk = 0.0473(𝑇𝑇𝑇𝑇𝑅𝑅𝑘𝑘) + 0.014
11
-
12
Macro Level: Competition Mode
Workers’ Experience
CSD Market
Requester Company
Task SimilarityTask
Worker Decision
Submission Quality
Worker Profile
FailureSuccessTask Arrival
Worker Arrival
Workers’ Experience: Beta (1 , 5 , 0 , 3000)
Workers’ Arrival: Poisson(18, 800, 20, 0, 1)
Task Similarity: uniform (0.33,0.98)
Tasks’ Arrival: Task Model Schedule
12
-
Model Accuracy
13
Time(Week)
Sim
ulat
ion
Rat
io
Failu
re P
redi
ctio
n
MR
E
Failure Prediction in Registration Phase
Failure Prediction in Submissions Phase
Registration FPSubmission FPSimulation Failure Simulation Dropped
Simulation Success
00.10.20.30.40.50.60.70.8
0 20 40 600
0.01
0.02
0.03
0.04
0.05
0.06
0.07
0 20 40 60
Success Ratio ~ 71%Failure ratio ~ 13%
Failure prediction i Submissions(SFP) ~ 15%Failure Prediction in Registration (RFP) ~ 6.5%
MRE(SFP) = 2%MRE (RFP) = 1.1%Actual Failure Ratio ~ 14%
V.S.
0.19
0.07
0.71
00.10.20.30.40.50.60.70.80.9
1 11 21 31 41 51 61 71 81
-
Model Insights
14
Time(Week)
M= 0.43, Std = 0.15
UL = 0.88
LL = 00
0.2
0.4
0.6
0.8
1
1.2
0 10 20 30 40 50 60 70 80
25 5150
64
63
73 75
Task
Sim
ilarit
y
0
0.1
0.2
0.3
0.4
0.5
0.6
0
0.1
0.2
0.3
0.4
0.5
0.6
25 50 51 63 64 73 75
Wor
kers
’ Util
izat
ion
Wor
kers
’ Ava
ilabi
lity
Task Similarity LevelRed CategoryGray Category Green Category
Blue Category Yellow Category
Worker Utilization
Task Similarity level < 60%
Close availability of midlevel experienced worker
Time(Week)
Scenario 1: How diverse?
Scenario 2: How open?
-
Scenario 1: Agents’ Diversity
15
Scenario ran 30 times ↓ Failure%↑Submissions% PM only can manage the
diversity of registrants’ experience
↓ Failure%↑Submissions%
-
Scenario 2: Task Openness
16
Scenario 1 60% Similarity70%
Similarity80%
Similarity90%
Similarity
Task
Sta
tus Fail 18 22 25 23
Success 12 8 5 7
Failure Prediction 60% 73% 83% 77%
Scenario ran 30 times PM can manage openness of the pool of tasks only
↓↓Failure%↑↑ Openness
-
Conclusion and Future Work
Conclusion:
This study provides a hybrid simulation model to help providing more insights in order to have a more efficient task scheduling in OSD.
Attracting higher number of middle ranked agents to compete on the task, would provide lower chance of task failure in general.
Similarity level of 60% and lower in the pool of available tasks, provides lower chance of task failure.
17
Future Work:
Updating the model with schedule available projects form the entire data set and report the recommendation metrics
17
-
Limitation 1
Assuming that monitory prize and task duration represents task complexity
Presented model created based on competitive crowdsourcing only
No access to the management dataset and overheads
No access to the actual task sequential per project
Different factors that may influence workers’ decision-making process
18
-
Thank you!
19
Q&A
A Hybrid Simulation Approach for Competitive Open Software Development ProcessTable of ContentOpen Competitive Software WorkflowTask Scheduling in Software EngineeringMotivation ExampleLimitations of Existing Methods �Research QuestionCSD Market PlaceStudy DesignMicro level: Agent ModelMeso level: Task Completion ModelSlide Number 12Model Accuracy�Model InsightsScenario 1: Agents’ DiversityScenario 2: Task OpennessConclusion and Future WorkLimitation Q&A