Design, Monitoring, and Testing of Microservices Systems ...
Continuous Performance Testing for Microservices
Vincenzo Ferme
Reliable Software Systems, Institute for Software Technologies, University of Stuttgart, Germany
HPI Future SOC Lab Day
Outline
1. Continuous Software Development Environments
2. Continuous Performance Testing of Microservices:
   a. Challenges
   b. Our Approach
   c. Preliminary Results
   d. BenchFlow Automation Framework
3. Approach Extensions:
   a. ContinuITy Framework
4. Future Work
Continuous Software Development Environments
[Diagram: Developers, Testers, and Architects → Repo → CI Server → CD Server → Production.]
Challenges
- The system changes frequently: need to keep the test definition updated (manual is not an option).
- Too many tests to be executed: need to reduce the number of executed tests.
- Need to integrate tests in delivery pipelines: need complete automation for test execution, and clear KPIs to be evaluated in the pipeline.
Our Approach
[Diagram: (1) Observed load situations in Production over time (Load Level vs. Time) are summarized as an Empirical Distribution of Load situations (Rel. Freq. per Load Level, 50–500). (2) The distribution is aggregated into Sampled Load Tests (Aggr. Rel. Freq. per Sampled Load Level, 100–500), executed together with a Baseline Test, Scalability Criteria, and a Deployment Conf. (3) The Test Results (e.g., 0.12, 0.14, 0.20, 0.16, 0.11) are combined into (4) a Domain Metric (e.g., 0.734).]
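The first two steps of the approach can be sketched in a few lines of Python. The slides do not specify how observed load situations are bucketed into load levels, nor how the fine-grained distribution is aggregated into the fewer sampled test levels, so the upper-bound binning and nearest-level aggregation below are assumptions for illustration only.

```python
from collections import Counter

def empirical_distribution(observed_load, levels):
    """Step 1: relative frequency of observed load situations,
    bucketed into discrete load levels (binning rule is assumed)."""
    bucket = lambda x: min(l for l in levels if x <= l)
    counts = Counter(bucket(x) for x in observed_load)
    total = len(observed_load)
    return {l: counts.get(l, 0) / total for l in levels}

def sampled_load_tests(dist, test_levels):
    """Step 2: aggregate the distribution into the fewer load levels
    that are actually executed (nearest-level aggregation assumed)."""
    agg = {t: 0.0 for t in test_levels}
    for level, freq in dist.items():
        nearest = min(test_levels, key=lambda t: abs(t - level))
        agg[nearest] += freq
    return agg

observed = [40, 60, 120, 180, 260]   # hypothetical users observed over time
dist = empirical_distribution(observed, levels=[50, 100, 150, 200, 250, 300])
tests = sampled_load_tests(dist, test_levels=[100, 200, 300])
```

The aggregated frequencies still sum to 1, so each sampled load test carries the probability mass of the production load situations it represents.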
Preliminary Experiments
- Production: 12 microservices, with a custom operation mix.
- Sampled load tests: the empirical distribution of load situations, sampled at 6 load levels (50, 100, 150, 200, 250, 300 users).
- Deployment configurations: 10 configurations, varying the number of replicas, CPU, and RAM.
- Scalability criterion: Scal = avg + 3σ.
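A minimal sketch of how such a criterion could be evaluated. The slide only gives the formula Scal = avg + 3σ; the assumptions below (avg and σ computed over the baseline test's response times, and a load test passing when its mean response time stays under the resulting threshold) are illustrative, not stated on the slide.

```python
import statistics

def scal_threshold(baseline_response_times_ms):
    # Scal = avg + 3*sigma over the baseline test's response times
    # (which statistics feed avg and sigma is an assumption here).
    avg = statistics.mean(baseline_response_times_ms)
    sigma = statistics.stdev(baseline_response_times_ms)
    return avg + 3 * sigma

def passes(load_test_response_times_ms, threshold_ms):
    # Illustrative PASS/FAIL decision for one API call under load.
    return statistics.mean(load_test_response_times_ms) <= threshold_ms

baseline = [100, 105, 95, 110, 90]        # ms, hypothetical baseline samples
threshold = scal_threshold(baseline)      # ≈ 123.7 ms
ok = passes([118, 121, 120], threshold)   # True: still within avg + 3σ
```

A three-sigma threshold over the baseline gives each API a PASS/FAIL verdict per load level, which is exactly the shape of the results tables on the following slides.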
Experiments Infrastructure
- Virtual Users: 32 GB RAM, 24 cores
- System Under Test: 896 GB RAM, 80 cores
Preliminary Experiments Results
Deployment Configuration: 1 GB RAM, 0.25 CPU, 1 Replica

Custom Op. Mix — Scalability Criteria per API:
API          Scalability Criteria (PASS/FAIL)
GET /        PASS
GET /cart    PASS
POST /item   FAIL

Aggregated relative frequency per load level:
Users   Aggr. Rel. Freq.
50      0.10582
100     0.18519
150     0.22222
200     0.22222
250     0.20370
300     0.06085

Contribution to the Domain Metric at 250 users: Max 0.20370 (its aggregated relative frequency), Actual 0.13580 (reduced because POST /item fails the scalability criteria).
Preliminary Experiments Results
Deployment Configuration: 1 GB RAM, 0.25 CPU, 1 Replica

Contribution to the Domain Metric per load level:
Users   Contribution
50      0.10582
100     0.18519
150     0.22222
200      0.07999
250     0.13580
300     0.04729

Domain Metric: Max 1, Actual 0.77631 (the sum of the per-level contributions).
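The numbers above can be reproduced with a short sketch. The exact weighting scheme is not spelled out on the slides; the helper below assumes that a load level's contribution is its aggregated relative frequency scaled by the operation-mix weight of the APIs that PASS. With an equal three-way mix this reproduces the 250-user level: 0.20370 × 2/3 ≈ 0.13580.

```python
def contribution(aggr_rel_freq, op_weights, passed):
    # Contribution of one sampled load test to the Domain Metric:
    # the level's aggregated relative frequency, scaled by the
    # operation-mix weight of the APIs meeting the scalability
    # criteria (this scaling rule is an assumption, see above).
    return aggr_rel_freq * sum(w for op, w in op_weights.items() if passed[op])

# 250-user level from the slides: GET / and GET /cart PASS, POST /item FAILs.
op_weights = {"GET /": 1 / 3, "GET /cart": 1 / 3, "POST /item": 1 / 3}  # assumed equal mix
passed = {"GET /": True, "GET /cart": True, "POST /item": False}
c250 = contribution(0.20370, op_weights, passed)        # ≈ 0.13580

# The Domain Metric is the sum of the per-level contributions (slide values):
contributions = [0.10582, 0.18519, 0.22222, 0.07999, 0.13580, 0.04729]
domain_metric = sum(contributions)                      # ≈ 0.77631
```

Summing the six contributions yields exactly the reported Domain Metric of 0.77631, so the per-level table and the headline KPI are consistent.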
Preliminary Experiments Results

Table 4: Domain Metric per Configuration in the two environments (HPI, FUB). The configuration with the highest Domain Metric is highlighted.

RAM     CPU   # Cart Replicas  Domain Metric (HPI)  Domain Metric (FUB)
0.5 GB  0.25  1                0.61499              0.54134
1 GB    0.25  1                0.77631              0.53884
1 GB    0.5   1                0.53559              0.54106
0.5 GB  0.5   1                0.51536              0.54773
0.5 GB  0.5   2                0.50995              0.54111
1 GB    0.25  2                0.74080              0.54785
1 GB    0.5   2                0.53401              0.54106
0.5 GB  0.5   4                0.50531              0.54939
1 GB    0.25  4                0.37162              0.54272
1 GB    0.5   4                0.56718              0.54271

… at HPI results in performance degradation. For example, the configuration with 1 GB of RAM, 0.25 CPU share, and four cart replicas was assessed as 0.37162, while the configuration with 1 GB of RAM, 0.5 CPU share, and one cart replica was assessed as 0.55356. At FUB, the domain metric oscillates over a narrow range. Scaling beyond 0.5 GB of RAM, 0.5 CPU share, and 4 cart replicas does not lead to a better performance if the number of users is higher than 150. In addition, the choice of the HPI or the FUB deployments has a significant impact on the domain metric, as shown in Table 4. These findings suggest that bottleneck analysis and careful performance engineering activities should be executed before additional resources are added to the architecture deployment configuration.

6 Threats to Validity

The following threats to validity to our research were identified:

– Operational profile data analysis: the domain metric introduced in this paper relies on the careful analysis of production usage operational profile data. Many organizations will not have access to accurate operational profile data, which might impact the accuracy of the domain metric assessments. Several approaches can be used to overcome the lack of accurate operational profile data [4], such as: using related systems as a proxy for the system under study, conducting user surveys, and analyzing log data from previous versions of the system under study.

– Experiment generation: experiment generation requires the estimation of each performance test case's probability of occurrence, which is based on the operational profile data. When the operational profile data granularity is coarse, there is a threat to the accuracy of the estimated operational profile distribution. Some of the suggested approaches to overcome the coarse granularity of the operational profile data are: performing the computation …
BenchFlow Automation Framework
BenchFlow (BF) DSL:
- Workload Modeling
- SUT Configuration
- SUT Deployment
- BenchFlow Adapters
Test Execution: Faban
Data Collection: Minio
Data Analysis
https://github.com/benchflow
BenchFlow Automation Framework
[Diagram — Generation, Execution, Analysis: Test Generation produces a Test Bundle; Experiment Generation derives Experiment Bundles from it; Experiment Execution reports Failures or, on Success, hands results to Result Analysis, which produces Metrics.]
[Ferme et al.] Vincenzo Ferme and Cesare Pautasso. A Declarative Approach for Performance Tests Execution in Continuous Software Development Environments. In Proc. of ICPE 2018. 261-272.
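The lifecycle can be sketched as a small pipeline. Everything below is an illustrative placeholder, not BenchFlow's actual API: the function and field names are invented, and the execution step is faked instead of driving a real SUT via Faban or storing data in Minio.

```python
from dataclasses import dataclass

@dataclass
class ExperimentResult:
    users: int
    failed: bool
    response_times_ms: list

def generate_experiments(test_definition):
    # Test Generation + Experiment Generation: expand one declarative
    # test definition into one experiment bundle per sampled load level.
    return [{"users": u, "mix": test_definition["mix"]}
            for u in test_definition["load_levels"]]

def execute_experiment(bundle):
    # Experiment Execution: the real framework drives the SUT and
    # collects raw data; here the measurement is faked for illustration.
    return ExperimentResult(bundle["users"], failed=False,
                            response_times_ms=[0.5 * bundle["users"]] * 3)

def analyze(result):
    # Result Analysis: reduce the collected data to metrics.
    avg = sum(result.response_times_ms) / len(result.response_times_ms)
    return {"users": result.users, "avg_ms": avg}

definition = {"load_levels": [50, 100], "mix": {"GET /": 1.0}}
metrics = [analyze(r)
           for r in map(execute_experiment, generate_experiments(definition))
           if not r.failed]
```

The point of the declarative setup is visible even in this toy version: the user writes only the test definition, and generation, execution, and analysis are fully automated stages behind it.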
Approach: Summary
[Diagram: Production → (1, 2) Empirical Distribution of Load situations → Sampled Load Tests; together with a Baseline Test, Scalability Criteria, and a Deployment Conf., (3) BenchFlow (BF) executes the tests and (4) produces the Domain Metric (e.g., 0.734).]
Examples of Use of the Domain Metric KPI:
1. Decide if the system is ready for deployment
2. Assess different deployment configurations
[Chart: Contribution to the Domain Metric per Sampled Load Test, showing the maximum contribution for each deployment configuration.]
3. Configure scaling plans and autoscalers
Approach: Extensions
[Diagram: the complete approach — (1) observed load situations in Production over time, (2) Empirical Distribution of Load situations and Sampled Load Tests, (3) Baseline Test, Scalability Criteria, Deployment Conf., and Test Results (0.12, 0.14, 0.20, 0.16, 0.11), (4) Domain Metric (0.73).]
ContinuITy Framework
[Diagram: Production data drives the Sampled Load Tests (Load Levels, API Calls, Op. Mix); a Workload Model, an APIs Specification, and an Annotations Model are combined to generate them.]
[Schulz et al.] Henning Schulz, Tobias Angerstein, and André van Hoorn. Towards Automating Representative Load Testing in Continuous Software Engineering. In Proc. of ICPE 2018 Companion. 123-126.
Future Work
- Automated update of test definitions: automatically update load levels, API calls, and the operations mix.
- Declarative definitions for load testing: make performance engineering activities more accessible.
- Declarative performance test regression: extend the proposed approach to continuously assess performance regressions and PASS/FAIL CI/CD pipelines.
- Prioritisation and selection of load tests: exploit performance models to further reduce the number of tests.
Highlights
[Recap of the key slides:]
- Challenges: the system changes frequently (keep test definitions updated), too many tests to execute (reduce their number), and tests must be integrated into delivery pipelines (complete automation, clear KPIs).
- Our Approach: derive sampled load tests from the empirical distribution of observed production load situations, run them with a baseline test, scalability criteria, and a deployment configuration, and aggregate the test results into a Domain Metric.
- Preliminary Experiments: 12 microservices with a custom operation mix, 6 load levels (50–300 users), 10 deployment configurations (replicas, CPU, RAM), criterion avg + 3σ.
- Extensions and Future Work: the complete approach pipeline, from observed load situations to the Domain Metric.