Escaping Test Hell - ACCU 2014

Automated Test Hell, or There and Back Again. Wojciech Seliga, [email protected], @wseliga

description

My talk, delivered on 10 April 2014 at the ACCU Conference in Bristol. It combines a few talks I delivered in 2012 and 2013, with some recent updates. It is an experience report based on the work of many developers from Atlassian and Spartez who have worked for years on Atlassian JIRA. If you have (or are going to have) thousands of automated tests and want to know how they may affect you, this presentation is for you.

Transcript of Escaping Test Hell - ACCU 2014

Page 1: Escaping Test Hell - ACCU 2014

Automated Test Hell

Wojciech Seliga, [email protected], @wseliga

or There and Back Again

Page 2: Escaping Test Hell - ACCU 2014

About me

• Coding since 6yo

• Former C++ developer (’90s, early ’00s)

• Agile Practices (inc. TDD) since 2003

• Dev Nerd, Tech Leader, Agile Coach, Speaker, PHB

• 6.5 years with Atlassian (JIRA Dev Manager)

• Spartez Co-founder & CEO

Page 3: Escaping Test Hell - ACCU 2014

XP Promise

[Chart: Cost of Change over Time, comparing Waterfall and XP]

Page 4: Escaping Test Hell - ACCU 2014

The Story

Page 5: Escaping Test Hell - ACCU 2014

2.5 years ago

Page 6: Escaping Test Hell - ACCU 2014

About 50 engineers

Page 7: Escaping Test Hell - ACCU 2014

Obsessed with Quality

Page 8: Escaping Test Hell - ACCU 2014

Almost 10 years of accumulating

garbage automatic tests

Page 9: Escaping Test Hell - ACCU 2014

18 000 tests* on all levels

13 000 unit tests

4 000 func and integration tests

1 000 Selenium tests

*excluding tests of the libraries

Page 10: Escaping Test Hell - ACCU 2014

Atlassian JIRA

Page 11: Escaping Test Hell - ACCU 2014

Our Continuous Integration

environment

Page 12: Escaping Test Hell - ACCU 2014

Test frameworks

• JUnit 3 and 4

• JMock, Easymock, Mockito

• Powermock, Hamcrest

• QUnit, HTMLUnit, Jasmine.js, Sinon.js

• JWebUnit, Selenium, WebDriver

• Custom runners

Page 13: Escaping Test Hell - ACCU 2014

Bamboo Setup

• Dedicated server with 70+ remote agents (including Amazon Elastic)

• Build engineers

• Bamboo devs on-site

Page 14: Escaping Test Hell - ACCU 2014

Looks good so far?

Page 15: Escaping Test Hell - ACCU 2014

for each main branch

Page 16: Escaping Test Hell - ACCU 2014

Run in parallel in batches

Run first

Page 17: Escaping Test Hell - ACCU 2014

There is

Much More

Page 18: Escaping Test Hell - ACCU 2014

Type of tests

• Unit

• Functional

• Integration

• Platform

• Performance

Page 19: Escaping Test Hell - ACCU 2014

Platforms

• Dimension - DB: MySQL, PostgreSQL, MS SQL, Oracle

• Dimension - OS: Linux, Windows

• Dimension - Java ver.: 1.5, 1.6, 1.7, 1.8

• Dimension - CPU arch.: 32-bit, 64-bit

• Dimension - Deployment Mode: Standalone, Tomcat, Websphere, Weblogic

Run Nightly

Coming

Page 20: Escaping Test Hell - ACCU 2014

Triggering Builds

• On Commit (hooks, polling)

• Dependent Builds

• Nightly Builds

• Manual Builds

Page 21: Escaping Test Hell - ACCU 2014

Very slow (long hours) and fragile feedback loop

Page 22: Escaping Test Hell - ACCU 2014

Serious performance and reliability issues

Page 23: Escaping Test Hell - ACCU 2014

It takes time to fix it...

Page 24: Escaping Test Hell - ACCU 2014

Sometimes very long

Page 25: Escaping Test Hell - ACCU 2014

You commit at 3 PM

You get a “Unit Test Green” email at 4 PM

You get a flood of “Red Test X” emails from 4 to 9 PM

Your colleagues on the other side of the globe

You happily go home

You

Page 26: Escaping Test Hell - ACCU 2014

“We probably spend more time dealing with the JIRA

test codebase than the production codebase”

Page 27: Escaping Test Hell - ACCU 2014

Dispirited devs accepting RED as a norm

Page 28: Escaping Test Hell - ACCU 2014

Broken window theory

Page 29: Escaping Test Hell - ACCU 2014

Feedback Speed

Test Quality

Page 30: Escaping Test Hell - ACCU 2014

Problem: Catching up with UI changes

Solution: Page Objects Pattern

Page 31: Escaping Test Hell - ACCU 2014

Page Objects Pattern

• Page Objects model UI elements (pages, components, dialogs, areas) your tests interact with

• Page Objects shield tests from changes to the internal structure of the page

• Page Objects generally do not make assertions about data. They can assert the state.

• Designed for chaining

Page 32: Escaping Test Hell - ACCU 2014

Page Objects Example

    public class AddUserPage extends AbstractJiraPage
    {
        private static final String URI = "/secure/admin/user/AddUser!default.jspa";

        @ElementBy(name = "username")
        private PageElement username;

        @ElementBy(name = "password")
        private PageElement password;

        @ElementBy(name = "confirm")
        private PageElement passwordConfirmation;

        @ElementBy(name = "fullname")
        private PageElement fullName;

        @ElementBy(name = "email")
        private PageElement email;

        @ElementBy(name = "sendemail")
        private PageElement sendEmail;

        @ElementBy(id = "user-create-submit")
        private PageElement submit;

        @ElementBy(id = "user-create-cancel")
        private PageElement cancelButton;

        @Override
        public String getUrl()
        {
            return URI;
        }

        ...

        @Override
        public TimedCondition isAt()
        {
            return and(username.timed().isPresent(),
                    password.timed().isPresent(), fullName.timed().isPresent());
        }

        public AddUserPage addUser(final String username, final String password,
                final String fullName, final String email, final boolean receiveEmail)
        {
            this.username.type(username);
            this.password.type(password);
            this.passwordConfirmation.type(password);
            this.fullName.type(fullName);
            this.email.type(email);
            if (receiveEmail) {
                this.sendEmail.select();
            }
            return this;
        }

        public ViewUserPage createUser()
        {
            return createUser(ViewUserPage.class);
        }

        public <T extends Page> T createUser(Class<T> nextPage, Object... args)
        {
            submit.click();
            return pageBinder.bind(nextPage, args);
        }
    }

Page 33: Escaping Test Hell - ACCU 2014

Using Page Objects

    @Test
    public void testServerError()
    {
        jira.gotoLoginPage().loginAsSysAdmin(AddUserPage.class)
                .addUser("username", "mypassword", "My Name", "[email protected]", false)
                .createUser();

        // assertions here
    }

Page 34: Escaping Test Hell - ACCU 2014

Problem: Opaque Test Fixtures

Solution: REST-based Set-up

Page 35: Escaping Test Hell - ACCU 2014

REST-based Setup

    @Before
    public void setUpTest() {
        restore("some-big-xml-file-with-everything-needed-inside.xml");
    }

VS

    @Before
    public void setUpTest() {
        restClient.restoreEmptyInstance();
        restClient.createProject(/* project params */);
        restClient.createUser(/* user params */);
        restClient.createUser(/* user params */);
        restClient.createSomethingElse(/* ... */);
    }

Page 36: Escaping Test Hell - ACCU 2014

Problem: Flaky Tests

Solution: Timed Conditions, Mock Unreliable Deps, Test-friendly Markup
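
A timed condition replaces a fixed sleep with polling until something becomes true or a timeout expires. The sketch below shows that idea in plain Java. The `TimedWait` class and its `waitUntil` method are hypothetical names used for illustration; this is not the Atlassian Selenium API.

```java
import java.util.function.BooleanSupplier;

// Hypothetical helper (not the Atlassian Selenium API): polls a condition
// until it holds or a timeout expires, instead of sleeping a fixed time.
final class TimedWait {
    private TimedWait() {}

    /** Returns true as soon as the condition holds, false if the timeout expires first. */
    static boolean waitUntil(BooleanSupplier condition, long timeoutMillis, long pollMillis) {
        long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
        while (true) {
            if (condition.getAsBoolean()) {
                return true;
            }
            if (System.nanoTime() - deadline >= 0) {
                return false;
            }
            try {
                Thread.sleep(pollMillis);
            } catch (InterruptedException e) {
                // Preserve the interrupt flag and give up waiting.
                Thread.currentThread().interrupt();
                return false;
            }
        }
    }

    public static void main(String[] args) {
        long start = System.currentTimeMillis();
        // Stand-in for "element is present": becomes true after about 200 ms.
        boolean appeared = waitUntil(() -> System.currentTimeMillis() - start >= 200, 2_000, 20);
        System.out.println("condition met: " + appeared);
    }
}
```

The test waits only as long as it has to, and a slow page makes it slower rather than red, as long as the page responds within the timeout.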

Page 37: Escaping Test Hell - ACCU 2014

Problem: Flaky Tests

Solution: Quarantine, Fix, Eradicate

Page 38: Escaping Test Hell - ACCU 2014

Quarantine

• @Ignore

• @Category

• Quarantine on CI server

• Recover or Die
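
The quarantine options above all come down to marking a test so that the main build skips it while a separate build keeps running it. Below is a dependency-free sketch of that mechanism. The `@Quarantined` annotation, `QuarantineFilter` and `SearchTests` are hypothetical names; with JUnit 4 the marker role would be played by `@Ignore` or `@Category`.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical marker; JUnit 4 offers @Ignore and @Category for the same purpose.
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.METHOD)
@interface Quarantined {
    String reason();
}

final class QuarantineFilter {
    private QuarantineFilter() {}

    /** Sorted names of test methods that are (or are not) quarantined. */
    static List<String> select(Class<?> testClass, boolean quarantined) {
        List<String> names = new ArrayList<>();
        for (Method m : testClass.getDeclaredMethods()) {
            if (m.isSynthetic()) {
                continue; // skip compiler- or tool-generated methods
            }
            if (m.isAnnotationPresent(Quarantined.class) == quarantined) {
                names.add(m.getName());
            }
        }
        Collections.sort(names);
        return names;
    }

    public static void main(String[] args) {
        System.out.println("main build:       " + select(SearchTests.class, false));
        System.out.println("quarantine build: " + select(SearchTests.class, true));
    }
}

// Illustrative test class with one healthy and one quarantined test.
class SearchTests {
    public void findsIssueByKey() {}

    @Quarantined(reason = "fails intermittently on slow agents")
    public void showsAutocompleteSuggestions() {}
}
```

Recording a reason next to the marker makes the "Recover or Die" decision easier later, because the quarantined test still says why it was set aside.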

Page 39: Escaping Test Hell - ACCU 2014

Non-deterministic tests are a strong inhibitor of change

instead of a catalyst

Page 40: Escaping Test Hell - ACCU 2014

Execution Time: Test Level

Unit Tests

REST API Tests

JWebUnit/HTMLUnit Tests

Selenium/WebDriver Tests

Speed Confidence

Page 41: Escaping Test Hell - ACCU 2014

Our example: Front-end-heavy web app

100 WebDriver tests: 15 minutes

100 QUnit tests: 1.2 seconds

Page 42: Escaping Test Hell - ACCU 2014

Test Pyramid

Unit Tests (including JS tests)

REST / HTML Tests

Selenium

Good!

Page 43: Escaping Test Hell - ACCU 2014

Test Code is Not Trash

Design

Maintain

Refactor

Share

Review

Prune

Respect

Discuss

Restructure

Page 44: Escaping Test Hell - ACCU 2014

Optimum Balance

Isolation Speed Coverage Level Access Effort

Page 45: Escaping Test Hell - ACCU 2014

Dangerous to tamper with

Maintainability

Quality / Determinism

Page 46: Escaping Test Hell - ACCU 2014

Two years later…

Page 47: Escaping Test Hell - ACCU 2014

People - Motivation

Making GREEN the norm

Page 48: Escaping Test Hell - ACCU 2014

Shades of Red

Page 49: Escaping Test Hell - ACCU 2014

Pragmatic CI Health

Page 50: Escaping Test Hell - ACCU 2014

Build Tiers and Policy

Tier A1 - green soon after all commits

Tier A2 - green at the end of the day

Tier A3 - green at the end of the iteration

unit tests and functional* tests

WebDriver and bundled plugins tests

supported platforms tests, compatibility tests

Page 51: Escaping Test Hell - ACCU 2014

Wallboards: Constant

Awareness

Page 52: Escaping Test Hell - ACCU 2014

Training

• assertThat over assertTrue/False and assertEquals

• avoiding races - Atlassian Selenium with its TimedElement

• Favouring unit tests over functional tests

• Promoting Page Objects

• Brownbags, blog posts, code reviews

Page 53: Escaping Test Hell - ACCU 2014

Quality

Page 54: Escaping Test Hell - ACCU 2014

Automatic Flakiness Detection Quarantine

Re-run failed tests and see if they pass
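
The re-run rule above can be sketched as a small classifier: a test that fails and then passes with no code change is flagged as flaky rather than simply red. `FlakinessDetector` and its `Verdict` enum are hypothetical names; the slides do not show how the actual detection was implemented.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch: classifies a test by re-running it after a failure.
final class FlakinessDetector {
    enum Verdict { PASSED, FLAKY, FAILED }

    private FlakinessDetector() {}

    /** Runs the test once, then up to maxReruns more times if it failed. */
    static Verdict classify(Runnable test, int maxReruns) {
        if (passes(test)) {
            return Verdict.PASSED;
        }
        for (int i = 0; i < maxReruns; i++) {
            if (passes(test)) {
                return Verdict.FLAKY; // failed, then passed without any code change
            }
        }
        return Verdict.FAILED;
    }

    private static boolean passes(Runnable test) {
        try {
            test.run();
            return true;
        } catch (RuntimeException | AssertionError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        AtomicInteger calls = new AtomicInteger();
        // Simulated flaky test: fails on the first run only.
        Runnable flaky = () -> {
            if (calls.incrementAndGet() == 1) {
                throw new IllegalStateException("element not found yet");
            }
        };
        System.out.println(classify(flaky, 2));
    }
}
```

A `FLAKY` verdict is the signal to move the test into quarantine instead of blaming the latest commit.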

Page 55: Escaping Test Hell - ACCU 2014

Quarantine - Healing

Page 56: Escaping Test Hell - ACCU 2014

SlowMo - expose races

Page 57: Escaping Test Hell - ACCU 2014

Selenium 1

Page 58: Escaping Test Hell - ACCU 2014

Selenium ditching

Sky did not fall in

Page 59: Escaping Test Hell - ACCU 2014

Ditching - benefits

• Freed build agents - better system throughput

• Boosted morale

• Gazillions of developer hours saved

• Money saved on infrastructure

Page 60: Escaping Test Hell - ACCU 2014

Ditching - due diligence

• conducting the audit - analysis of the coverage we lost

• determining which tests need to be rewritten (e.g. security related)

• rewriting the tests (good job for new hires + a senior mentor)

Page 61: Escaping Test Hell - ACCU 2014

Flaky Browser-based Tests

Races between test code and asynchronous page logic

Playing with "loading" CSS class does not really help

Page 62: Escaping Test Hell - ACCU 2014

Races Removal with Tracing

    // in the browser:
    function mySearchClickHandler() {
        doSomeXhr().always(function() {
            // This executes when the XHR has completed (either success or failure)
            JIRA.trace("search.completed");
        });
    }
    // In production code JIRA.trace is a no-op

    // in my page object:
    @Inject
    TraceContext traceContext;

    public SearchResults doASearch() {
        Tracer snapshot = traceContext.checkpoint();
        getSearchButton().click(); // causes mySearchClickHandler to be invoked
        // This waits until the "search.completed"
        // event has been emitted, *after* previous snapshot
        traceContext.waitFor(snapshot, "search.completed");
        return pageBinder.bind(SearchResults.class);
    }
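
The checkpoint is what removes the race: the wait only accepts events traced after the snapshot, so a stale "search.completed" from an earlier search cannot satisfy it. Below is a minimal in-memory sketch of that contract. `SimpleTraceContext` is a hypothetical stand-in; the real `TraceContext` reads traces emitted in the browser, which is not modelled here.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical in-memory stand-in for a trace context; it only models the
// "wait for an event emitted after my checkpoint" contract.
final class SimpleTraceContext {
    private final List<String> events = new ArrayList<>();

    /** Called by the code under test when an asynchronous step finishes. */
    synchronized void trace(String key) {
        events.add(key);
        notifyAll();
    }

    /** Remembers how many events exist now, so later waits ignore older ones. */
    synchronized int checkpoint() {
        return events.size();
    }

    /** Blocks until key is traced after the checkpoint; returns false on timeout. */
    synchronized boolean waitFor(int since, String key, long timeoutMillis) {
        long deadline = System.currentTimeMillis() + timeoutMillis;
        while (!events.subList(since, events.size()).contains(key)) {
            long remaining = deadline - System.currentTimeMillis();
            if (remaining <= 0) {
                return false;
            }
            try {
                wait(remaining);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        SimpleTraceContext traces = new SimpleTraceContext();
        traces.trace("search.completed");            // stale event from an earlier search
        int snapshot = traces.checkpoint();
        // Simulates the XHR finishing on another thread after the click.
        new Thread(() -> traces.trace("search.completed")).start();
        System.out.println(traces.waitFor(snapshot, "search.completed", 1_000));
    }
}
```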

Page 63: Escaping Test Hell - ACCU 2014

Can we halve our build times?

Speed

Page 64: Escaping Test Hell - ACCU 2014

Parallel Execution - Theory

End of Build

Batches

Start of Build

Page 65: Escaping Test Hell - ACCU 2014

Parallel Execution

End of Build

Batches

Start of Build

Page 66: Escaping Test Hell - ACCU 2014

Parallel Execution - Reality Bites

End of Build

Batches

Start of Build

Agent availability

Page 67: Escaping Test Hell - ACCU 2014

Dynamic Test Execution Dispatch - Hallelujah
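
The slides do not describe the dispatch mechanism. One common reading is that agents pull the next test from a shared queue instead of receiving a batch fixed up front. The simulation below compares the two under that assumption; `DispatchSimulation` and the durations in `main` are made up for illustration.

```java
import java.util.PriorityQueue;

// Hypothetical simulation: compares total build time (makespan) for static
// batches versus agents pulling tests from a shared queue.
final class DispatchSimulation {
    private DispatchSimulation() {}

    /** Static batching: tests are split up front into equal-sized contiguous batches. */
    static int staticMakespan(int[] durations, int agents) {
        int batchSize = (durations.length + agents - 1) / agents;
        int worst = 0;
        for (int start = 0; start < durations.length; start += batchSize) {
            int end = Math.min(start + batchSize, durations.length);
            int sum = 0;
            for (int i = start; i < end; i++) {
                sum += durations[i];
            }
            worst = Math.max(worst, sum); // the build ends when the slowest batch ends
        }
        return worst;
    }

    /** Dynamic dispatch: whichever agent becomes free first takes the next test. */
    static int dynamicMakespan(int[] durations, int agents) {
        PriorityQueue<Integer> freeAt = new PriorityQueue<>();
        for (int a = 0; a < agents; a++) {
            freeAt.add(0);
        }
        int finish = 0;
        for (int d : durations) {
            int end = freeAt.poll() + d; // earliest-free agent runs this test
            freeAt.add(end);
            finish = Math.max(finish, end);
        }
        return finish;
    }

    public static void main(String[] args) {
        int[] minutes = {9, 8, 1, 1, 1, 1, 1, 1}; // two slow tests, six fast ones
        System.out.println("static batches:   " + staticMakespan(minutes, 2));
        System.out.println("dynamic dispatch: " + dynamicMakespan(minutes, 2));
    }
}
```

With static batches, both slow tests can land in the same batch while the other agent sits idle; pulling from a queue spreads them across agents automatically.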

Page 68: Escaping Test Hell - ACCU 2014

"You can't manage what you can't measure."

not by W. Edwards Deming

If you believe in just that, you are doomed.

Page 69: Escaping Test Hell - ACCU 2014

You can't improve something if you can't measure it

Profiler, Build statistics, Logs, statsd → Graphite

Page 70: Escaping Test Hell - ACCU 2014

Anatomy of Build*

Compilation

Packaging

Executing Tests

Fetching Dependencies

*Any resemblance to a Maven build is entirely accidental

SCM Update

Agent Availability/Setup

Publishing Results

Page 71: Escaping Test Hell - ACCU 2014

JIRA Unit Tests Build

Compilation (7min)

Packaging (0min)

Executing Tests (7min)

Fetching Dependencies (1.5min)

SCM Update (2min)

Agent Availability/Setup (mean 10min)

Publishing Results (1min)

Page 72: Escaping Test Hell - ACCU 2014

Decreasing Test Execution Time to ZERO alone would not let us achieve our goal!
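
The arithmetic behind this claim, using the stage times from the "JIRA Unit Tests Build" slide: the stages add up to 28.5 minutes, so halving the build means reaching 14.25 minutes, yet removing the 7 minutes of test execution entirely still leaves 21.5 minutes. The `BuildAnatomy` class below is only an illustrative wrapper around those numbers.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative only: stage durations in minutes, copied from the
// "JIRA Unit Tests Build" slide (agent availability is a mean value).
final class BuildAnatomy {
    private BuildAnatomy() {}

    static Map<String, Double> stages() {
        Map<String, Double> s = new LinkedHashMap<>();
        s.put("Agent Availability/Setup", 10.0);
        s.put("SCM Update", 2.0);
        s.put("Fetching Dependencies", 1.5);
        s.put("Compilation", 7.0);
        s.put("Packaging", 0.0);
        s.put("Executing Tests", 7.0);
        s.put("Publishing Results", 1.0);
        return s;
    }

    /** Sum of all stage durations in minutes. */
    static double total(Map<String, Double> stages) {
        double sum = 0;
        for (double minutes : stages.values()) {
            sum += minutes;
        }
        return sum;
    }

    public static void main(String[] args) {
        Map<String, Double> s = stages();
        double total = total(s);
        double withoutTests = total - s.get("Executing Tests");
        System.out.println("total build:          " + total + " min");
        System.out.println("target (half):        " + total / 2 + " min");
        System.out.println("with tests at zero:   " + withoutTests + " min");
    }
}
```

Test execution is roughly a quarter of the build, which is why the following slides go through every other stage as well.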

Page 73: Escaping Test Hell - ACCU 2014

Agent Availability/Setup

• starved builds due to busy agents building very long builds

• time synchronization issue - NTPD problem

Page 74: Escaping Test Hell - ACCU 2014

SCM Update - Checkout time

• Proximity of SCM repo

• Shallow git clones are not so fast and lightweight, and they generate extra CPU load on the git server

• git clone per agent/plan + git pull + git clone per build (hard links!)

• Stash was thankful (queue)

2 min → 5 seconds

Page 75: Escaping Test Hell - ACCU 2014
Page 76: Escaping Test Hell - ACCU 2014

Fetching Dependencies

• Fix Predator

• Sandboxing/isolation agent trade-off:
  rm -rf $HOME/.m2/repository/com/atlassian/*
  into
  find $HOME/.m2/repository/com/atlassian/ -name "*SNAPSHOT*" | xargs rm

• Network hardware failure found (dropping packets)

1.5 min → 10 seconds

Page 77: Escaping Test Hell - ACCU 2014

Compilation

• Restructuring multi-pom Maven project and dependencies

• Maven 3 parallel compilation FTW: -T 1.5C*

*optimal factor thanks to scientific trial and error research

7 min → 1 min

Page 78: Escaping Test Hell - ACCU 2014

Unit Test Execution

• Splitting unit tests into 2 buckets: good and legacy (much longer)

• Maven 3 parallel test execution (-T 1.5C)

7 min → 5 min

3000 poor tests (5min)

11000 good tests (1.5min)

Page 79: Escaping Test Hell - ACCU 2014

Functional Tests

• Selenium 1 removal did help

• Faster reset/restore (avoid unnecessary stuff, intercepting SQL operations for debug purposes - building stacktraces is costly)

• Restoring via Backdoor REST API

• Using REST API for common setup/teardown operations

Page 80: Escaping Test Hell - ACCU 2014

Functional Tests

Page 81: Escaping Test Hell - ACCU 2014

Publishing Results

• Server log allocation per test → now using the Backdoor REST API (was Selenium)

• Bamboo DB performance degradation for rich build history - to be addressed

1 min → 40 s

Page 82: Escaping Test Hell - ACCU 2014

Unexpected Problem

• Stability Issues with our CI server

• The bottleneck changed from I/O to CPU

• Too many agents per physical machine

Page 83: Escaping Test Hell - ACCU 2014

JIRA Unit Tests Build Improved

Compilation (1min)

Packaging (0min)

Executing Tests (5min)

Fetching Dependencies (10sec)

SCM Update (5sec)

Agent Availability/Setup (3min)*

Publishing Results (40sec)

Page 84: Escaping Test Hell - ACCU 2014

Improvements Summary

Tests              Before    After    Improvement %

Unit tests         29 min    17 min   41%

Functional tests   56 min    34 min   39%

WebDriver tests    39 min    21 min   46%

Overall            124 min   72 min   42%

* Additional ca. 5% improvement expected once new git clone strategy is consistently rolled-out everywhere

Page 85: Escaping Test Hell - ACCU 2014

Better speed increases responsibility

Fewer commits (authors) per single build

vs.

Page 86: Escaping Test Hell - ACCU 2014

The Quality Follows

Page 87: Escaping Test Hell - ACCU 2014

But that's still bad

We want a CI feedback loop of a few minutes at most

Page 88: Escaping Test Hell - ACCU 2014

Splitting The Codebase

Page 89: Escaping Test Hell - ACCU 2014

Inevitable Split - Fears

• Organizational concerns - understanding, managing, integrating, releasing

• Mindset change - if something worked for 10+ years, why change it?

• Trust - does this library still work?

• We damned ourselves with big buckets for all tests - where do they belong?

Page 90: Escaping Test Hell - ACCU 2014

Splitting code base

• Step 0 - JIRA Importers Plugin (3.5 years ago)

• Step 1 - New Issue View and Navigator

• Step 2 - now everything else follows JIRA 6.0

Page 91: Escaping Test Hell - ACCU 2014

We are still escaping hell. Hell sucks in your soul.

Page 92: Escaping Test Hell - ACCU 2014

Conclusions

• Visibility and problem awareness help

• Maintaining a huge testbed is difficult and costly

• Measure the problem - to baseline

• No prejudice - no sacred cows

• Automated tests are not a one-off investment; they are a continuous journey

• Performance is a damn important feature

Page 93: Escaping Test Hell - ACCU 2014

Test performance is a damn important

feature!

Page 94: Escaping Test Hell - ACCU 2014

XP vs Sad Reality

[Chart: Cost of Change over Time, comparing Waterfall, XP - ideal, and Sad Reality]

Page 95: Escaping Test Hell - ACCU 2014

Interested in such stuff?

http://www.spartez.com/careers

We are hiring in Gdańsk

Page 96: Escaping Test Hell - ACCU 2014

Images - Credits

• Turtle - by Jonathan Zander, CC-BY-SA-3.0

• Loading - by MatthewJ13, CC-SA-3.0

• Magic Potion - by Koolmann1, CC-BY-SA-2.0

• Merlin Tool - by L. Mahin, CC-BY-SA-3.0

• Choose Pills - by *rockysprings, CC-BY-SA-3.0

• Flashing Red Light - by Chris Phan, CC BY 2.0

• Frustration - http://www.flickr.com/photos/striatic

• Broken window - http://www.flickr.com/photos/leeadlaf/

Page 97: Escaping Test Hell - ACCU 2014

Thank You!