Autoscaled Distributed Automation using AWS at Selenium London MeetUp

AUTOSCALED DISTRIBUTED AUTOMATION SELENIUM GRID / AWS Ragavan Ambighananthan @ragsambi Expedia London Selenium MeetUp Group 2016! AKA ‘RUNNING TESTS WITHIN THE TIME TAKEN BY THE SLOWEST TEST CASE’

Transcript of Autoscaled Distributed Automation using AWS at Selenium London MeetUp

AUTOSCALED DISTRIBUTED AUTOMATION

SELENIUM GRID / AWS

Ragavan Ambighananthan @ragsambi

Expedia

London Selenium MeetUp Group 2016!

AKA ‘RUNNING TESTS WITHIN THE TIME TAKEN BY THE SLOWEST TEST CASE’

WHAT DO I GET?

• SeleniumGridScaler = Selenium Grid + AWS + Autoscaling

• DA (distributed automation) will dramatically shorten the UI automation run time, to a few minutes

• Faster feedback cycle

• A handful of Jenkins jobs to run automation, instead of a few hundred

• Cost effective and reliable

• Enables Continuous Integration / Continuous Deployment

AGENDA

• Setting up

• Making the Grid stable

• Grid topologies

• Cost saving

• Reporting / Dashboard

PROBLEM DESCRIPTION

TOO MANY UI TESTS

PROBLEM DESCRIPTION

SLOW TEST / EXECUTION

PROBLEM DESCRIPTION

• Hundreds of Jenkins jobs to run all the tests (monolithic apps)

• The lack of a system that runs hundreds of UI automation tests quickly, reliably, scalably and cost effectively is a blocker for CI / CD

• No intelligent automation report to narrow down failures quickly!

SOLUTION

• To be able to run all UI automation scenarios within the time taken by the slowest test case

• Cost effective, scalable and reliable

• Teams focussing on automation

• Note: This is not about cross-browser test coverage, but about using the grid for parallel test execution

SETTING UP

TECHNOLOGIES / TOOLS USED

SELENIUMGRIDSCALER

SETTING UP

BIG PICTURE

SETTING UP

checkout/lx:
  features/lx_fraud.feature:21:en_US
  features/lx_fraud.feature:47:en_US
  features/lx_responsive_design.feature:25:en_US
  features/lx_responsive_design.feature:26:en_GB
  features/lx_responsive_design.feature:27:en_US
  features/lx_responsive_design.feature:90:de_DE
  features/lx_responsive_design.feature:240:en_US
search_landing_pages/flights_tg:
  features/tg_flights_revamp_hero_image.feature:120:en_US
  features/tg_flights_revamp_social_sharing.feature:156:en_US
  features/tg_flights_revamp_search_wizard.feature:202:en_US
  features/tg_flights_revamp_search_wizard.feature:203:nl_NL
  features/tg_flights_revamp_top_destinations.feature:159:en_US
  features/tg_flights_revamp_top_destinations.feature:160:en_US
  features/tg_flights_revamp_top_destinations.feature:161:en_US
  features/tg_flights_revamp_top_destinations.feature:207:en_US

• Only scenarios that match (@stubbed | @live) and (@acceptance | @regression) will be included in the list to run

• All these tests will be executed concurrently

SAMPLE GENERATED SCENARIOS
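The tag rule and the generated entries can be illustrated with a minimal Python sketch. It assumes the rule reads as "(@stubbed or @live) and (@acceptance or @regression)" and that each entry has the form path:line:locale. The function names and the tag assignments in the sample data are hypothetical and are not taken from the Expedia code base.

```python
def should_run(tags):
    """True when a scenario has (@stubbed or @live) and (@acceptance or @regression)."""
    tags = set(tags)
    env_ok = bool(tags & {"@stubbed", "@live"})
    suite_ok = bool(tags & {"@acceptance", "@regression"})
    return env_ok and suite_ok


def parse_entry(entry):
    """Split 'features/x.feature:21:en_US' into (path, line, locale)."""
    path, line, locale = entry.rsplit(":", 2)
    return path, int(line), locale


# Hypothetical tag assignments, for illustration only.
scenarios = {
    "features/lx_fraud.feature:21:en_US": ["@stubbed", "@acceptance"],
    "features/lx_fraud.feature:47:en_US": ["@live", "@wip"],
}
to_run = [entry for entry, tags in scenarios.items() if should_run(tags)]
```

Every entry left in `to_run` would then be handed to the grid as its own concurrent test.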

SETTING UP

./gradlew -PnumBrowsers=150 :modulex.ui:scalaAcceptance -i -Denvironment=JENKINS_STUBBED -Dbrowser=Grid

SAMPLE GENERATED SCENARIOS

use ParallelTestExecution Trait

SETTING UP

• c3.4xlarge (16 cpu / 30 GB RAM / High BW) for thousands of tests

• c3.large (2 cpu / 3.75 GB RAM / Enhanced Net) for a few hundred tests

• Hub should have enough network bandwidth but low CPU / Memory is fine

• AMI with bootstrap SeleniumGridScaler jar, which will act as the hub that can autoscale

• https://github.com/mhardin/SeleniumGridScaler

SELENIUM GRID HUB SETUP

SETTING UP

• Open Source

• Acts as an intelligent hub

• Auto scales grid nodes depending on the number of tests

• Optimized termination of nodes when not in use

• Adhoc launch of new nodes is also possible

• Talks to AWS using EC2

• Nodes are bootstrapped to attach themselves to the hub

• Supports AWS Windows as well

SELENIUMGRIDSCALER - HUB

• c3.xlarge

• Capable of running a maximum of 24 Firefox instances

• Fewer Chrome instances can be run (~15)

• A node created out of the AMI has bootstrap code to help it attach to the hub

SETTING UP

SELENIUM GRID NODE SETUP

SETTING UP

• To have your own node AMI:

• Either get the node AMI, or create an AWS instance, bootstrap it, create an AMI out of it and refer to it in the Hub config

• Hub creates the node based on a config: AMI ID, subnet, security group, node type, etc.

SELENIUMGRIDSCALER - NODE

SELENIUM NODE BOOTSTRAP CODE

[root@ip-10-2-12-167 ~]# more /home/grid/grid/grid_start_node.sh
#!/bin/sh
PATH=/sbin:/usr/sbin:/bin:/usr/bin
cd /home/grid/grid
export EC2_INSTANCE_ID="`wget -q -O - http://169.254.169.254/latest/meta-data/instance-id || die \"wget instance-id has failed: $?\"`"
# Pull down the user data, which will be a zip file containing necessary information
export NODE_TEMPLATE="/home/grid/grid/nodeConfigTemplate.json"
curl http://169.254.169.254/latest/user-data -o /home/grid/grid/data.zip
# Now, unzip the data downloaded from the userdata
unzip -o /home/grid/grid/data.zip -d /home/grid/ubuntu/grid
# Replace the instance ID in the node config file
sed "s/<INSTANCE_ID>/$EC2_INSTANCE_ID/g" $NODE_TEMPLATE > /home/grid/grid/nodeConfig.json
# Finally, run the java process in a window so browsers can run
xvfb-run --auto-servernum --server-args='-screen 0, 1600x1200x24' java -jar /home/grid/grid/selenium-server-node.jar -role node -nodeConfig /home/grid/grid/nodeConfig.json -Dwebdriver.chrome.driver="/home/grid/grid/chromedriver" -log /home/grid/grid/grid.log &

MAKING THE GRID STABLE

• Timeouts in json config

• “timeout”:240000 (ms)

• “browserTimeout”:390000 (ms)

• browserTimeout has to be bigger than ‘timeout’ and ‘webDriver’ timeout

• browserTimeout is specified in secs in command line

TIMEOUTS
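A minimal Python sketch of the timeout rules above: it checks that browserTimeout exceeds both ‘timeout’ and the webDriver timeout, and converts the JSON value (ms) into the seconds expected on the command line. The function names and the 300000 ms webDriver timeout are assumptions made for illustration; only the two JSON values come from the slide.

```python
def check_timeouts(config, webdriver_timeout_ms):
    """Return a list of problems found in the hub timeout settings (values in ms)."""
    timeout = config["timeout"]
    browser_timeout = config["browserTimeout"]
    problems = []
    if browser_timeout <= timeout:
        problems.append("browserTimeout must be bigger than timeout")
    if browser_timeout <= webdriver_timeout_ms:
        problems.append("browserTimeout must be bigger than the webDriver timeout")
    return problems


def browser_timeout_cli_seconds(config):
    """The JSON value is in ms; the command line expects seconds."""
    return config["browserTimeout"] // 1000


hub_config = {"timeout": 240000, "browserTimeout": 390000}
# 300000 ms is a hypothetical client-side webDriver timeout.
problems = check_timeouts(hub_config, webdriver_timeout_ms=300000)
```

With the values from the slide, `problems` is empty and the command-line equivalent of browserTimeout is 390 seconds.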

• If a browser instance hangs (for any reason whatsoever), it will take 3hrs (http client socket timeout) for that particular slot to become free

• This causes the Jenkins job to time out

• Solution:

  • Fix the particular test scenario causing this issue

  • Add a cronjob to kill any browser instance that has been running for more than 10mins

  • Make this part of your Chef knife plugin

  • Ref: selenium repo, PR: 227 / fixed in 285

MAKING THE GRID STABLE

TIMEOUTS
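The cronjob idea can be sketched in Python as follows. This is an illustration under assumptions, not the script used on the grid nodes: it assumes a Linux node where `ps -eo pid,etimes,comm` is available, and the browser process names and function names are hypothetical.

```python
import os
import signal
import subprocess

BROWSER_NAMES = ("firefox", "chrome")  # assumed process names
MAX_AGE_SECONDS = 10 * 60              # the 10mins limit from the slide


def stale_browser_pids(ps_output, max_age=MAX_AGE_SECONDS):
    """Pick PIDs of browser processes older than max_age from 'ps -eo pid,etimes,comm' output."""
    pids = []
    for line in ps_output.strip().splitlines():
        parts = line.split(None, 2)
        if len(parts) != 3:
            continue
        pid, elapsed, command = parts
        if not (pid.isdigit() and elapsed.isdigit()):
            continue  # skips the header row
        if int(elapsed) > max_age and any(name in command for name in BROWSER_NAMES):
            pids.append(int(pid))
    return pids


def kill_stale_browsers():
    """Intended to be run periodically, e.g. from cron."""
    result = subprocess.run(
        ["ps", "-eo", "pid,etimes,comm"],
        capture_output=True, text=True, check=True,
    )
    for pid in stale_browser_pids(result.stdout):
        os.kill(pid, signal.SIGKILL)
```

Keeping the parsing separate from the kill step makes the selection logic easy to check against sample `ps` output before it is allowed to kill anything.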

• Grid setup should be in the same AWS subnet

• Using multiple subnets will result in lots of FORWARDING_TO_NODE_FAILED errors

MAKING THE GRID STABLE

AWS - SUBNET

• Subnet you are using should have enough free IP addresses

• Otherwise, a shortage of addresses will be a blocker for autoscaling the grid nodes

MAKING THE GRID STABLE

AWS - IP ADDRESS
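A rough capacity check for the subnet, as a Python sketch. AWS reserves five addresses in every subnet; the CIDR blocks, the number of addresses already in use and the function names are hypothetical, and the 24 slots per node is the Firefox figure quoted for c3.xlarge.

```python
import ipaddress
import math

AWS_RESERVED_PER_SUBNET = 5  # AWS reserves the first four and the last address


def usable_addresses(cidr):
    """Addresses AWS lets you assign in a subnet of the given size."""
    total = ipaddress.ip_network(cidr).num_addresses
    return max(total - AWS_RESERVED_PER_SUBNET, 0)


def can_autoscale(cidr, addresses_in_use, tests, slots_per_node=24):
    """True when the subnet has a free address for every node the run needs."""
    nodes_needed = math.ceil(tests / slots_per_node)
    free = usable_addresses(cidr) - addresses_in_use
    return nodes_needed <= free
```

For example, a /28 subnet offers 11 assignable addresses, so a 250-test run that needs 11 nodes cannot scale there once the hub and other hosts have taken some of them.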

• The webDriver object creation consumes bandwidth in the range of 6Gbits/5min in the Hub for 250+ tests in parallel

MAKING THE GRID STABLE

AWS - HUB BANDWIDTH

c3.4xlarge bandwidth is “High”

c3.large can also be used for smaller apps
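To put the 6Gbits/5min figure into more familiar units, a small Python sketch. It assumes decimal units and treats 250 as the number of parallel tests, so the per-test figure is an upper bound for "250+"; the function names are illustrative.

```python
def average_mbit_per_s(gbits, minutes):
    """Average rate for a transfer volume spread evenly over a time window."""
    return gbits * 1000 / (minutes * 60)


def mbit_per_test(gbits, tests):
    """Share of the transfer volume attributable to one test."""
    return gbits * 1000 / tests


hub_average = average_mbit_per_s(6, 5)  # average over the 5 minute run
per_test = mbit_per_test(6, 250)        # Mbit per test, i.e. per_test / 8 megabytes
```

The average hides the peaks: the traffic is concentrated at webDriver creation, so the instantaneous load on the hub is higher than the average suggests.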

• Fine tune your

  • -Xms

  • -Xmx

  • -DPOOL_MAX

MAKING THE GRID STABLE

AWS - HUB / NODE MEMORY

• HUB becomes unstable after running thousands of tests

• Automate restarting of Hub after every 2000+ tests or at the end of your test job

MAKING THE GRID STABLE

AWS - RESTARTING HUB

• The Jenkins executor that runs hundreds of tests in parallel needs to have enough CPU power.

MAKING THE GRID STABLE

AWS - JENKINS EXECUTOR CPU

c3.8xlarge when running 250+ tests in parallel

• Don’t rely too much on Selenium Grid’s queuing policy

• If your average test execution time is greater than the webDriver timeout, queued tests will time out at webDriver creation itself

MAKING THE GRID STABLE

HUB QUEUING POLICY
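The queuing risk can be expressed as a simple check. This Python sketch assumes that a queued session request waits roughly one average test duration for a slot to free up; the function name and the sample numbers are hypothetical.

```python
def queued_requests_may_time_out(num_tests, slots, avg_test_s, webdriver_timeout_s):
    """True when tests must queue and the expected wait exceeds the webDriver timeout."""
    if num_tests <= slots:
        return False  # every test gets a slot immediately, nothing queues
    return avg_test_s > webdriver_timeout_s
```

For example, 500 tests against 250 slots with a 120 s average test and a 60 s webDriver timeout would be flagged, which suggests sizing the grid (or batching the tests) so that requests do not depend on the queue.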

• Update browsers in the node and create a new node AMI

• Necessary browser settings:

MAKING THE GRID STABLE

UPDATE BROWSERS

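require 'selenium-webdriver'

profile = Selenium::WebDriver::Firefox::Profile.new
# Stop Firefox from updating itself on the node
profile['app.update.auto'] = false
profile['app.update.enabled'] = false
profile['app.update.service.enabled'] = false
# Allow scripts to run for up to 60 seconds before the slow-script warning
profile['dom.max_script_run_time'] = 60
profile['dom.max_chrome_script_run_time'] = 60
# Let focus events fire even when the browser window is not focused
profile['focusmanager.testmode'] = true
profile['accept_untrusted_certs'] = true
profile['assume_untrusted_certificate_issuer'] = false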

MAKING THE GRID STABLE

SCALE THE TEST INFRASTRUCTURE

GRID TOPOLOGIES

• Decide what you want before selecting the topology, to be cost efficient!

• I want to release code to production ..

1. Every CL (change list)
2. Once a day
3. Once a week
4. Whenever I want (on demand!)

• Based on the above answers, do I want to run all UI automation for

5. Every CL?
6. Every 2 hours
7. Four times a day

GRID TOPOLOGY - 1

[Diagram: Jenkins Job, HUB and nodes; instance types shown: c3.8xlarge, c3.large, c3.xlarge, ….]

• Parallel execution for small projects

• 1 executor - 1 hub - 14 nodes

• e.g. a c3.8xlarge can execute 250+ tests in parallel

• Test run would finish in ~5mins

GRID TOPOLOGY - 2

[Diagram: two Job Executors, HUB and nodes; instance types shown: c3.4xlarge, c3.8xlarge, c3.xlarge, ….]

• Suitable for medium size projects (500+ tests)

• Adding one more executor (2 executors - 1 hub - 28 nodes) could double the number of tests executed in parallel, while still taking only ~5mins

GRID TOPOLOGY - 3

[Diagram: two Job Executors ("job runs sequentially"), HUB and nodes; instance types shown: c3.8xlarge, c3.xlarge, c3.4xlarge, ….]

• Takes twice as long as the previous topology, but at half the cost! (1 executor - 1 hub - 14 nodes)

• Suitable for medium size projects

• Test run would finish in ~10mins

GRID TOPOLOGY

[Diagram: two Job Executors, HUB and nodes; instance types shown: c3.4xlarge, c3.8xlarge, c3.xlarge, ….]

• One more job? Probably NOT, as HUB network traffic would make it unstable, especially during webDriver creation

• c3.8xlarge network bandwidth limit is 10Gbit

GRID TOPOLOGY - 4

[Diagram: two HUBs with their executors and nodes; instance types shown: c3.8xlarge, c3.xlarge, c3.4xlarge, c3.4xlarge]

• Use two hubs to double the tests (1000+)

• But speed is same as topology 2 (~5mins)

• Double the cost

COST SAVING

OPTIMAL USE OF GRID NODES

• Running 250+ tests on a grid setup with 250 slots will take around 5mins

• Nodes then idle for the remaining 55mins of the hour, which AWS has already billed

• Even during the 5mins of the run, only a small minority of the tests take around 4mins; the majority of the tests complete in less than 1 min

COST SAVING

OPTIMAL USE OF GRID NODES

COST SAVING

• On a c3.8xlarge, 250 tests can be run in one go before all 32 CPUs reach 100%

• Start 250 cases

• Then, every ~50 seconds or so, start a further batch of 100 tests; repeat this until all tests are executed

• Fine tune the delay according to your observation

BATCH PROCESSING

COST SAVING
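The batching rule above can be sketched as a launch schedule in Python. The defaults (250 first, then 100 every 50 seconds) are the figures from the slide; the function name is hypothetical and, as the slide says, the delay should be tuned from your own observations.

```python
def batch_schedule(total_tests, first_batch=250, batch_size=100, delay_s=50):
    """Return (start_offset_seconds, tests_to_start) pairs until every test is launched."""
    schedule = []
    start = 0
    remaining = total_tests
    size = first_batch
    while remaining > 0:
        count = min(size, remaining)
        schedule.append((start, count))
        remaining -= count
        start += delay_s
        size = batch_size  # every batch after the first uses the smaller size
    return schedule


plan = batch_schedule(500)
```

For 500 tests this yields four launches, at 0, 50, 100 and 150 seconds. The schedule only covers when tests start; the total run time also depends on how long the last batch takes to finish.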

GRID TOPOLOGY - BATCH PROCESSING

[Diagram: Job Executor ("job runs sequentially"), HUB and nodes; instance types shown: c3.8xlarge, c3.xlarge, c3.4xlarge]

• Cost-saving topology: 1 executor - 1 hub - 16 nodes

• Can run any number of tests

• Can run 5000 UI tests within ~1hr 10mins

COST SAVING

COMPARING AWS COST VS DATA CENTRE

• 1 Medium box (~$8000 per month)

• 1 Large box (~$10000 per month)

• 1 VM (~$2000 per month)

• Total AWS cost for 2 Batch Processing Topologies

• ~$2400 per month (fully autoscaled and runs 9500+ UI tests)

• Frequency: 9-11 times a day

COST SAVING

AUTOSCALING OF GRID NODES

• SeleniumGridScaler autoscales the grid nodes

• It creates AWS nodes on demand based on a configuration file and the number of tests to run

• Optimized termination of nodes

COST SAVING

• http://x.x.x.x:4444/grid/admin/AutomationTestRunServlet?uuid=testRun1&threadCount=250&browser=firefox

• For 250 test cases, it will create 250/24 ~ 11 nodes

• It returns status codes

• 202 - request can be fulfilled by current capacity

• 201 - request can be fulfilled but AMI must be started to meet capacity (wait for ~5mins)

AUTOSCALING OF GRID NODES

COST SAVING
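A small Python sketch of the request shown above: it builds the servlet URL, estimates how many nodes the hub would need, and maps the two status codes from the slide to their meaning. The helper names are hypothetical, no request is actually sent, and status codes other than 201 and 202 are deliberately not interpreted.

```python
import math
from urllib.parse import urlencode

SLOTS_PER_NODE = 24  # Firefox instances per c3.xlarge node, as quoted earlier

STATUS_MEANING = {
    202: "request can be fulfilled by current capacity",
    201: "request can be fulfilled, but nodes must be started first (wait ~5mins)",
}


def build_test_run_url(hub_host, uuid, thread_count, browser):
    """Build the AutomationTestRunServlet URL for a test run."""
    query = urlencode({"uuid": uuid, "threadCount": thread_count, "browser": browser})
    return f"http://{hub_host}:4444/grid/admin/AutomationTestRunServlet?{query}"


def nodes_needed(thread_count, slots_per_node=SLOTS_PER_NODE):
    """Number of nodes required to offer one slot per test."""
    return math.ceil(thread_count / slots_per_node)


def describe_status(code):
    return STATUS_MEANING.get(code, "status not covered in this sketch")


url = build_test_run_url("x.x.x.x", "testRun1", 250, "firefox")
```

A test job would typically call this URL before starting its tests and, on a 201, wait for the new nodes to register with the hub.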

• c3.xlarge = $0.21 per Hour (can run 24 Firefox instances)

• t2.micro = $0.013 per Hour

• 16 t2.micro for the price of 1 c3.xlarge = 16 Firefox

Conclusion:

• I would prefer to use c3.xlarge as it offers better value

• I would not have to use 15 extra IP addresses

But this always depends on your observation of your own setup!

LARGE VS SMALLER NODE TYPES

COST SAVING
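The comparison can be worked through in a few lines of Python. The prices and the 24-slot figure are the ones quoted on the slide; one Firefox per t2.micro follows from the "16 t2.micro = 16 Firefox" bullet, and the function name is illustrative.

```python
def cost_per_slot_hour(price_per_hour, firefox_slots):
    """Hourly price of one Firefox slot on a given instance type."""
    return price_per_hour / firefox_slots


c3_xlarge_slot = cost_per_slot_hour(0.21, 24)  # 24 Firefox on one c3.xlarge
t2_micro_slot = cost_per_slot_hour(0.013, 1)   # 1 Firefox on one t2.micro

# How many t2.micro instances the price of one c3.xlarge buys
t2_for_same_price = int(0.21 // 0.013)
extra_ip_addresses = t2_for_same_price - 1
```

On these figures a Firefox slot on c3.xlarge costs under a cent per hour, less than on t2.micro, and running on one instance instead of 16 saves 15 IP addresses in the subnet.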

• Shut down the hub when not in use

• Benefit: you pay AWS only a tiny amount for a stopped instance compared with a running one

• Automate these stop and start tasks

STOPPING THE HUB

COST SAVING

PIPELINE

[Pipeline diagram: HUB, CI build, Deploy Job, CI automation job. Step labels:]

• check if nodes still attached to the hub

• autoscale nodes (5min)

• start the hub

• wait for the hub to come online

• if yes, shutdown the hub

• if no, let the hub run to terminate rest of the nodes

• create alarm to stop hub if no activity for 1hr

REPORTING / DASHBOARD

TREND CHARTS

REPORTING / DASHBOARD

POINT OF SALE GRID

REPORTING / DASHBOARD

UNIQUE ERROR REPORT

REPORTING / DASHBOARD

FAILURE HISTORY / ONE PAGE

REPORTING / DASHBOARD

HIPCHAT NOTIFICATION

REPORTING / DASHBOARD

INTELLIGENCE REPORTING

Automate the decision whether a failure is a bug or an automation issue

• Use OCR to read failed screenshot images to get error messages not captured by automation

• Use JavaScript errors in the browser console

• Use logs (Splunk) to get exceptions specific to the test

• Follow good practices for logging automation failures

FEW WORDS

• A few differences in the Expedia-specific SeleniumGridScaler

• https://github.com/ambirag/SeleniumGridScaler, branch: SeleniumGridScalerExp

• Dockerised!

QUESTIONS?