Learning From Families
Inferring Behavioral Variability From Software Product Lines
Carlos Diego Nascimento Damasceno
PhD candidate @ University of Sao Paulo, BR [1]
Visiting PhD researcher @ University of Leicester, UK [2]
December 3rd 2019
[1] Supervisor: Adenilso Simao
[2] Supervisor: Mohammad Mousavi
Damasceno C.D.N. et al. Learning from Families @ iFM 2019 December 3rd 2019 1 / 32
Context
Figure: In software product lines (SPL), software variants are developed simultaneously from a common set of reusable assets
Context - Family models
[Diagram: two product state machines, BowlingGame and PongGame, each with states StartGame and Pause and transitions on Start, Exit, and Pause, e.g., Start/1, Exit/0, Pause/1]
Figure: A family model unifies multiple state machines of a product line into a single model where states and transitions are annotated with feature constraints [3]
[3] e.g., featured finite state machine - FFSM (Fragal et al., 2017)
Context - Family models
[Diagram: FFSM for the Arcade Game Maker (AGM) product line, with features Pong [N], Bowling [W], Brickles [B], and Save Game [S]; states Start Game, Pause Game, and Save Game[S], and transitions annotated with feature constraints such as Exit[W&&!S]/1, Exit[!W||S]/0, and Pause[W]/1]
Figure: A family model unifies multiple state machines of a product line into a single model where states and transitions are annotated with feature constraints [3]
Problem Statement
- Family-based analysis (e.g., model-based testing [4] and model checking [5])
  - Cost as a function of the number of features and the amount of feature sharing
  - Redundant analyses are avoided/minimised
- Creation and maintenance of family and product models are challenging
- Outdated models may arise as products evolve
[4] Fragal et al. (2017)
[5] ter Beek et al. (2017)
Research objective
Investigate approaches to support the automated construction of family models from SPLs
RQ1) How can we effectively infer product models from evolving systems?
Model Learning
[Diagram: the Learning Algorithm (L*M) interacts with a Teacher that wraps the SUL]
Figure: The Minimally Adequate Teacher (MAT) framework (Angluin, 1987)
Model Learning
[Diagram: the learner maintains an observation table (prefix sets S and S·I, suffix set E) and poses Membership Queries (MQ): a reset plus inputs are sent to the SUL, which answers with outputs]
Figure: The Minimally Adequate Teacher (MAT) framework (Angluin, 1987)
Model Learning
[Diagram: in addition to MQs, the learner formulates hypotheses and poses Equivalence Queries (EQ); the Teacher answers via model-based testing (MBT) against the SUL, replying yes (all tests pass) or with a counterexample (failed test)]
Figure: The Minimally Adequate Teacher (MAT) framework (Angluin, 1987)
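As a minimal illustration of the two query types, the Teacher can be simulated when the SUL is itself a known Mealy machine. The sketch below is a toy, not any published implementation: the dictionary encoding, the state names, and the bounded-depth stand-in for the EQ are our own assumptions.

```python
from itertools import product

# Toy SUL: a two-state Mealy machine (state -> input -> (next state, output)).
# State names and the encoding are illustrative, not from a real tool.
SUL = {
    "off": {"swItv": ("itv", 1), "rain": ("off", 0)},
    "itv": {"swItv": ("off", 0), "rain": ("itv", 1)},
}

def membership_query(machine, word, initial="off"):
    """Reset, feed the input word, and return the observed output word."""
    state, outputs = initial, []
    for symbol in word:
        state, out = machine[state][symbol]
        outputs.append(out)
    return outputs

def equivalence_query(hypothesis, sul, alphabet, depth=4):
    """Bounded stand-in for an EQ: search all words up to `depth` for a
    counterexample; return one, or None if the machines agree."""
    for n in range(1, depth + 1):
        for word in product(alphabet, repeat=n):
            if membership_query(hypothesis, word) != membership_query(sul, word):
                return word
    return None
```

For instance, `membership_query(SUL, ("swItv", "rain"))` yields `[1, 1]`, and an EQ of the SUL against an identical hypothesis returns `None`. A real Teacher cannot enumerate all words, which is why EQs are approximated with model-based testing.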
Model Learning (Example)
[Diagram: single-state hypothesis with state off and self-loops rain/0 and swItv/1]
Figure: Initial Hypothesis
                rain  swItv
S    ε            0     1
S·I  rain         0     1
     swItv        1     0
Table: Initial observation table (OT)
Model Learning (Example)
[Diagram: single-state hypothesis with state off and self-loops rain/0 and swItv/1]
Figure: First Hypothesis
                rain  swItv
S    ε            0     1
S·I  rain         0     1
     swItv        1     0
Table: First observation table
Model Learning (Example)
[Diagram: two-state hypothesis with states off and itv; off: swItv/1 to itv, rain/0 self-loop; itv: rain/1 and swItv/0]
Figure: Second Hypothesis
                    rain  swItv
S    ε                0     1
     swItv            1     0
S·I  rain             0     1
     swItv·rain       0     1
     swItv·swItv      0     1
Table: Second observation table
Model Learning (Example)
[Diagram: two-state hypothesis with states off and itv; off: swItv/1 to itv, rain/0 self-loop; itv: rain/1 and swItv/0]
Figure: Second Hypothesis
                    rain  swItv
S    ε                0     1
     swItv            1     0
S·I  rain             0     1
     swItv·rain       0     1
     swItv·swItv      0     1
Table: Second observation table (H ≠ SUL)
EQ counterexample: swItv · rain · rain · rain, on which H outputs 1 · 1 · 1 · 1 but the SUL outputs 1 · 1 · 0 · 1
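This disagreement can be replayed concretely. The sketch below is our own reconstruction: the dictionary encodings and the three-state machine are assumptions chosen to be consistent with the outputs shown on the slide, not an official artifact.

```python
# Two-state hypothesis H and a three-state SUL reconstruction; both are
# illustrative encodings (state -> input -> (next state, output)).
H = {
    "off": {"swItv": ("itv", 1), "rain": ("off", 0)},
    "itv": {"swItv": ("off", 0), "rain": ("itv", 1)},
}
SUL = {
    "off":  {"swItv": ("itv", 1),  "rain": ("off", 0)},
    "itv":  {"swItv": ("off", 0),  "rain": ("rain", 1)},
    "rain": {"swItv": ("rain", 1), "rain": ("itv", 0)},
}

def run(machine, word, state="off"):
    """Run the machine from its initial state and collect the outputs."""
    outputs = []
    for symbol in word:
        state, out = machine[state][symbol]
        outputs.append(out)
    return outputs

ce = ("swItv", "rain", "rain", "rain")
print(run(H, ce))    # [1, 1, 1, 1]
print(run(SUL, ce))  # [1, 1, 0, 1] -- so H must be refined
```

The mismatch at the third output is exactly what forces the learner to extend the observation table and build a third state.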
Model Learning (Example)
[Diagram: three-state Mealy machine with states off, itv, and rain, with transitions on rain and swItv, e.g., off: swItv/1, rain/0; itv: rain/1, swItv/0]
Figure: Mealy machine of a windscreen wiper supporting intervaled and fast wiping
                         rain  swItv  rain·rain
S    ε                     0     1      0·0
     swItv                 1     0      1·0
     swItv·rain            0     1      0·1
S·I  rain                  0     1      0·0
     swItv·swItv           0     1      0·0
     swItv·rain·rain       1     0      1·0
     swItv·rain·swItv      0     1      0·1
Table: Final OT
What if our SUL evolves?
Adaptive model learning for evolving systems
- Reuse transfer and/or separating sequences from pre-existing models
  - Reduce the time for model checking (Groce et al., 2002; Chaki et al., 2008)
  - Find states maintained in newer versions (Windmuller et al., 2013)
- Research gaps:
  - Reusing low-quality sequences leads to irrelevant MQs (Huistra et al., 2018)
  - How can we calculate good-quality subsets of sequences?
The partial-Dynamic L∗M algorithm [6]
[Diagram: the ∂L*M workflow. A reference model Mr (version vref) and its observation table OTr (Sr, Sr·Ir, Er), built manually or automatically, feed (1) on-the-fly exploration and (2) experiment cover construction; their outcome seeds the observation table OTu (Su, Su·Iu, Eu) for (3) algorithm L*M, which learns the updated model Mu from the SUL (version vupdt) via MQs and EQs]
[6] Accepted at iFM 2019 (Damasceno et al., 2019b) - Thu @ 10:30 - Aud 1
Step 1: On-the-fly exploration of the reused OT
Figure: The partial-Dynamic L∗M algorithm starts by exploring reused OTs on-the-fly to discard redundant transfer sequences [7]
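One way to picture the on-the-fly idea (our own simplified sketch, not the published algorithm): a reused transfer sequence is kept only if the row of outputs it induces, under the current suffixes, has not been seen before. Function names and the toy machine are assumptions.

```python
def explore_on_the_fly(mq, prefixes, suffixes):
    """mq(word) -> output word of the updated SUL.
    Keep only reused prefixes whose suffix-response rows are new."""
    seen, kept = set(), []
    for prefix in sorted(prefixes, key=len):   # shorter prefixes first
        # Row: for each suffix, the outputs produced after the prefix.
        row = tuple(tuple(mq(prefix + s)[len(prefix):]) for s in suffixes)
        if row not in seen:                    # a genuinely new state row
            seen.add(row)
            kept.append(prefix)
    return kept

# Toy SUL: the three-state wiper machine from the running example
# (our reconstruction; encoding is illustrative).
WIPER = {
    "off":  {"swItv": ("itv", 1),  "rain": ("off", 0)},
    "itv":  {"swItv": ("off", 0),  "rain": ("rain", 1)},
    "rain": {"swItv": ("rain", 1), "rain": ("itv", 0)},
}

def mq(word, state="off"):
    outputs = []
    for symbol in word:
        state, out = WIPER[state][symbol]
        outputs.append(out)
    return outputs

old_prefixes = [(), ("swItv",), ("swItv", "swItv")]
kept = explore_on_the_fly(mq, old_prefixes, [("rain",), ("swItv",)])
# ("swItv", "swItv") leads back to the initial state, so it is discarded.
```

Discarding such redundant prefixes is what saves membership queries compared to blindly replaying the whole reused table.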
[7] Improvement #1: We optimized Chaki et al. (2008)
Step 2: Building an experiment cover tree
Figure: The partial-Dynamic L∗M algorithm searches for deprecated separating sequences [8]
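The breadth-first flavour of this step can be sketched as follows (our illustration under simplifying assumptions, not the paper's exact procedure): visit candidate separating sequences shortest-first, and keep one only while some pair of states is still indistinguishable.

```python
def minimise_separators(states, candidates, response):
    """response(state, seq) -> output word observed from that state.
    Greedily keep shortest-first sequences until all state pairs differ."""
    pairs = {(a, b) for a in states for b in states if a < b}
    kept = []
    for seq in sorted(candidates, key=len):    # breadth-first order
        split = {p for p in pairs
                 if response(p[0], seq) != response(p[1], seq)}
        if split:                              # still useful: keep it
            kept.append(seq)
            pairs -= split
        if not pairs:
            break
    return kept

# Toy machine: the three-state wiper from the running example
# (our reconstruction; encoding is illustrative).
WIPER = {
    "off":  {"swItv": ("itv", 1),  "rain": ("off", 0)},
    "itv":  {"swItv": ("off", 0),  "rain": ("rain", 1)},
    "rain": {"swItv": ("rain", 1), "rain": ("itv", 0)},
}

def response(state, seq):
    outputs = []
    for symbol in seq:
        state, out = WIPER[state][symbol]
        outputs.append(out)
    return outputs

kept = minimise_separators(
    ["off", "itv", "rain"],
    [("rain",), ("swItv",), ("rain", "rain")],
    response,
)
# ("swItv",) separates no remaining pair of states and is dropped.
```

The shortest-first visiting order is what keeps the retained separating sequences small, mirroring the BFS-based minimisation mentioned in the footnote.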
[8] Improvement #2: We used breadth-first search to minimize the set of separating sequences
Step 3: Running L∗M using the outcomes of partial-Dynamic L∗M
Figure: The L∗M algorithm starts by reusing transfer and separating sequences to reach and distinguish more states than in the traditional setup (i.e., initial state only) [9]
[9] Improvement #3: We use the subsets of reused sequences as the initial setup for model learning
partial-Dynamic L∗M - Empirical evaluation
Versions (number of states): 0.9.7 (17), 0.9.7c (14), 0.9.7e (14), 0.9.8f (14), 0.9.8l (11), 0.9.8m (10), 0.9.8s (10), 0.9.8u (12), 0.9.8za (9), 1.0.0 (10), 1.0.0f (10), 1.0.0h (12), 1.0.0m (9), 1.0.1 (14), 1.0.1h (9), 1.0.1k (8), 1.0.2 (7), 1.1.0-pre1 (6)
Figure: OpenSSL toolkit: 18 FSMs versions used as SUL (de Ruiter, 2016)
partial-Dynamic L∗M - Main findings
[Plot: Δ in the number of MQs vs. Δ time (years) between versions, for ∂L*M (left, range -150 to 50) and Adaptive L*M, ADL*M (right, range 0 to 800)]
Figure: Our technique required fewer MQs
partial-Dynamic L∗M - Main findings
Figure: Our technique was not influenced by the temporal distance between versions
RQ2) How can we merge state machines into family models?
The FFSMDiff algorithm [10]
[Diagram: FFSMDiff compares Product 1 (AGM & N & ¬B) and Product 2 (AGM & B & ¬N), finds common states, and constructs feature constraints, yielding the merged Product 1 | 2 ((AGM & N & ¬B) | (AGM & B & ¬N)) with transitions such as Start[B|N]/1, Exit[N]/1, and Exit[B]/0]
Figure: An automated technique to learn fresh FFSMs and include new FSMs into existing FFSMs by comparing product models and incorporating variability to express product-specific behaviors with feature constraints
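The core merging idea can be illustrated with a toy sketch. This is our own simplification, not the published FFSMDiff: a transition shared by several products is emitted once, annotated with the disjunction of their feature constraints, while product-specific transitions keep their own constraint. All names and transition fragments below are hypothetical.

```python
def merge_products(products):
    """products: feature constraint -> set of transitions, where a
    transition is (state, input, output, next state).
    Returns a merged mapping transition -> feature constraint."""
    merged = {}
    for constraint, transitions in products.items():
        for t in transitions:
            # Shared transition: widen its constraint with a disjunction.
            merged[t] = f"{merged[t]}|{constraint}" if t in merged else constraint
    return merged

# Two AGM-like products: Pong (feature N) and Brickles (feature B).
# Transition sets are illustrative fragments, not the full slide models.
products = {
    "N": {("StartGame", "Start", 1, "Playing"),
          ("Playing", "Exit", 1, "StartGame")},
    "B": {("StartGame", "Start", 1, "Playing"),
          ("Playing", "Exit", 0, "Playing")},
}
merged = merge_products(products)
print(merged[("StartGame", "Start", 1, "Playing")])  # N|B
```

The real algorithm additionally matches states across products before comparing transitions; this sketch assumes state names already coincide.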
[10] This paper has been published at SPLC 2019 (Damasceno et al., 2019a)
The FFSMDiff algorithm - Main findings
[Diagram: product models of Prod 1 .. Prod n, learned from each SUT, are fed to FFSMDiff and merged into a family model of the SPL whose size is at most the sum of the product models' sizes]
Figure: Product models can be effectively merged into succinct FFSMs, especially if there is high feature sharing
The FFSMDiff algorithm - Main findings
[Diagram: FFSM for the AGM product line with features Pong [N], Bowling [W], Brickles [B], and Save Game [S], as shown earlier]
Figure: Alternative FFSM for AGM with fewer states (Fragal, 2017)
The FFSMDiff algorithm - Main findings
[Diagram: alternative FFSM for AGM in which Brickles, Pong, and Bowling are merged into a single state Brickles*Pong*Bowling [B||N||W], with states Start Game, Pause Game, and Save Game[S], and transitions such as Save[(B||N)&&S]/1, Exit[W&&S]/1, and Pause[W]/1]
Figure: Alternative FFSM for AGM with fewer states (Fragal, 2017)
RQ3) Family model learning(Optimization)
Family model learning (Optimization)
[Diagram: active family model learning. Active automata learning produces product models for SULs Prod 01, Prod 02, ..., Prod n; FFSMDiff incrementally merges them into partial (∂) family models Prod {1,2}, ..., Prod {1,2,...,n}]
Figure: Family Model Learning
Family model learning (Optimization)
- Replace traditional FSMs by partial family models
  - Expected result: reduction in the number of queries
- Current issue:
  - Lack of FSMs large and complex enough for family model learning
Summary
RQ1: Product model learning
RQ2: Product model merging
RQ3: Family model learning (optimization)
Figure: Summary
Future work
RQ1: Product model learning
RQ2: Product model merging
RQ3: Family model learning (optimization)
  RQ3.1: Learning family models by reuse
  RQ3.2: Learning by sampling
  RQ3.3: Learning using discrimination trees
Figure: Future work
Questions?
https://damascenodiego.github.io/projects/
References I
Angluin, D. (1987). Learning regular sets from queries and counterexamples. Information and Computation, 75(2):87–106.
Chaki, S., Clarke, E., Sharygina, N., and Sinha, N. (2008). Verification of evolving software via component substitutability analysis. Formal Methods in System Design, 32(3):235–266.
Damasceno, C. D. N., Mousavi, M. R., and da Silva Simao, A. (2019a). Learning from difference: An automated approach for learning family models from software product lines. In Proceedings of the 23rd International Systems and Software Product Line Conference - Volume 1, SPLC 2019, Paris, France. ACM Press.
Damasceno, C. D. N., Mousavi, M. R., and Simao, A. (2019b). Learning to reuse: Adaptive model learning for evolving systems. In Integrated Formal Methods - 15th International Conference, IFM 2019, Bergen, Norway, December 2-6, 2019, Proceedings. Springer.
de Ruiter, J. (2016). A tale of the OpenSSL state machine: A large-scale black-box analysis. In Secure IT Systems - 21st Nordic Conference, NordSec 2016, Oulu, Finland, November 2-4, 2016, Proceedings, pages 169–184.
References II
Fragal, V. H. (2017). Automatic generation of configurable test-suites for software product lines. PhD thesis, Universidade de Sao Paulo. [Online] http://www.teses.usp.br/teses/disponiveis/55/55134/tde-10012019-085746/.
Fragal, V. H., Simao, A., and Mousavi, M. R. (2017). Validated Test Models for Software Product Lines: Featured Finite State Machines, pages 210–227. Springer International Publishing, Cham.
Groce, A., Peled, D., and Yannakakis, M. (2002). Adaptive model checking. In Proceedings of the 8th International Conference on Tools and Algorithms for the Construction and Analysis of Systems, TACAS '02, pages 357–370, London, UK. Springer-Verlag.
Huistra, D., Meijer, J., and Pol, J. (2018). Adaptive learning for learn-based regression testing. In Howar, F. and Barnat, J., editors, Formal Methods for Industrial Critical Systems, Lecture Notes in Computer Science, pages 162–177, Switzerland. Springer Publishers.
ter Beek, M. H., de Vink, E. P., and Willemse, T. A. C. (2017). Family-Based Model Checking with mCRL2, pages 387–405. Springer, Berlin, Heidelberg.
Windmuller, S., Neubauer, J., Steffen, B., Howar, F., and Bauer, O. (2013). Active continuous quality control. In Proceedings of the 16th International ACM Sigsoft Symposium on Component-based Software Engineering, CBSE '13, pages 111–120, New York, NY, USA. ACM.