Distributed Development

Distributed Development
Project FreeSpace
Dmitri Nesteruk, dmitri@activemesa.com
Alexey Suvorov, alexey@activemesa.com


Slides from the talk given at the 28th meeting of Spbalt.net


Page 2: Distributed Development

This Talk Is About…

Speed
  How to get things done faster

Quality
  How to get feedback faster
  How to get more testing done

Manageability
  Cloud monitoring & control
  Decentralization/fault-tolerance

Not only development!

Page 3: Distributed Development

In large/complex projects

  IDE interaction is slow
  Code analysis is slow
  Compilation is slow
  Testing is slow
  (Re)deployment is slow

Page 4: Distributed Development

IDE interaction is slow

IDEs are slow, but we cannot ditch them

We have very few software options for optimizing IDEs
  E.g., VS is both disk I/O-bound (SSD a must) and CPU-bound
  We cannot relocate, e.g., the ReSharper cache into a distributed service

IDEs can be spawned on the same project on many machines
  Multiple screens/remote desktop windows
  Synchronizable with Dropbox, SugarSync, etc.
  But project/solution reloads in VS will kill you

Page 5: Distributed Development

Code analysis is slow

In-depth analysis of either compiled or source code is computationally intensive

NDepend, FxCop and others can all be run remotely
  Not just on the build server

Most of these tools output report files
  These can be sent back to the originating machine

Some of these tools can be made to work per-file/per-project rather than per-solution

Page 6: Distributed Development

Compilation is slow

Compilation is
  C#/VB.NET – acceptable
  F# – slow
  C++ – atrociously slow

Made worse by pre/post-build steps
  PostSharp
  Entity Framework
  Code Contracts
  Moles
  Etc.

VS compilation process is inefficient
  Will rebuild projects that haven’t changed
  Will not parallelize by default

MSBuild is parallelizable
  /m:n
  Can spawn multiple processes
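A minimal sketch of driving a parallel MSBuild run from a script. Only the `/m:n` switch comes from the slide; the helper names `msbuild_command` and `build`, and the assumption that `msbuild` is on the PATH, are illustrative.

```python
import os
import subprocess


def msbuild_command(solution, max_processes=None):
    """Return an argv list asking MSBuild to use several worker processes.

    Falls back to the local CPU count when max_processes is not given.
    """
    n = max_processes or os.cpu_count() or 1
    return ["msbuild", solution, f"/m:{n}"]


def build(solution, max_processes=None):
    """Run the build and return MSBuild's exit code (0 means success).

    Assumption: an `msbuild` executable is reachable on the PATH.
    """
    completed = subprocess.run(msbuild_command(solution, max_processes), check=False)
    return completed.returncode
```

Calling `msbuild_command("App.sln", 4)` yields the arguments for `msbuild App.sln /m:4`; `App.sln` is a placeholder name.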

Page 7: Distributed Development

Testing is (painfully) slow

Unit testing is badly parallelized
  MbUnit’s [Parallelizable]
  Same in NUnit 3

Can easily parallelize at different granularities
  Test case/method
  Test fixture
  Test assembly
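The three granularities can be illustrated by grouping tests into independent work units. This is a sketch only: the `assembly::fixture::case` naming scheme and the function `work_units` are hypothetical and do not reflect how MbUnit or NUnit identify tests.

```python
from collections import defaultdict

# How many leading name components define one unit of parallel work.
LEVELS = {"assembly": 1, "fixture": 2, "case": 3}


def work_units(test_ids, granularity):
    """Group test ids of the form 'assembly::fixture::case' into work units.

    Each returned unit could be handed to a separate worker (thread,
    process or machine); tests inside one unit run sequentially.
    """
    depth = LEVELS[granularity]
    units = defaultdict(list)
    for test_id in test_ids:
        key = "::".join(test_id.split("::")[:depth])
        units[key].append(test_id)
    return dict(units)
```

Coarser units (assembly) keep scheduling overhead low, while finer units (case) expose more parallelism but require that tests do not share state.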

Page 8: Distributed Development

Fear of Builds/Tests

Developers are loath to compile or run tests too frequently

Disruption of focus leads them to
  Surf the web
  Go for coffee
  <insert your pastime here>

Everyone loses
  Loss of concentration/motivation
  Developers never ‘in the zone’
  TDD does not work

Page 9: Distributed Development

Who cares?

Developers
  Fixed salary
  Don’t care about time to market (TTM)
  Accustomed to substandard tools/equipment
  View compilation as a one-time process
  Don’t care about frequent/continuous testing
  See manual deployment as normal

Employers
  More concerned with saving money than getting things done
  Uninformed about good/best practices
  Not concerned with quick delivery (in case of service companies that charge by the hour)

Page 10: Distributed Development

Problem: nobody recognizes compilation/testing as wasted time

Page 11: Distributed Development

How to speed things up?

Optimize or buy a faster computer
  SSDs
  More RAM
  Faster CPUs
  Costly! Has to be done for every developer.

Alternatively… use existing infrastructure
  Both physical (e.g., dev machines) and virtual
  Distribute workload between machines
  Use idle resources – no need to buy new machines.

Page 12: Distributed Development

Status Quo

Computers get faster
  More cores per CPU
  Faster hard drives (SSD, hybrid)

Software gets more demanding
  Windows eats more and more RAM & HDD
  VS is slower
  Everyone else follows suit

The overall development experience is not getting any better

Page 13: Distributed Development

Why Distributed?

Resource under-utilization
  A typical enterprise (IT-specific or not) is unlikely to use 100% of processing resources

Resource overload
  Bottlenecks in servers

Resource costs
  Server-grade hardware
  Reliability concerns (e.g. RAID)

Page 14: Distributed Development

Three Pillars of Distribution

Get data on everyone’s machine
  Cloud storage/file sync
  Verification necessary

Get machines chatting with one another
  XMPP client on each node
  Load balance and optimize execution plan
  Send commands to do work, get results

Work items synchronized via cloud storage
  Redundancy/reliability guarantees
  Integrate with existing systems
    Easy because XMPP uses XML
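A rough sketch of the second pillar: choose the least-loaded node, then express a work command as XML that could travel inside an XMPP message. The element and attribute names (`command`, `target`, `result`, `status`) are invented for illustration and are not the FreeSpace wire format; no XMPP library is used here, only the payload is shown.

```python
import xml.etree.ElementTree as ET


def pick_node(loads):
    """Return the node name with the lowest reported load (0.0 to 1.0)."""
    return min(loads, key=loads.get)


def build_command(work_item_id, action, target):
    """Serialize a work command as an XML string (hypothetical schema)."""
    cmd = ET.Element("command", {"id": work_item_id, "action": action})
    ET.SubElement(cmd, "target").text = target
    return ET.tostring(cmd, encoding="unicode")


def parse_result(xml_text):
    """Extract (work item id, status) from a result payload (hypothetical schema)."""
    root = ET.fromstring(xml_text)
    return root.get("id"), root.get("status")
```

Because the payload is plain XML, any existing system that can parse XML could consume or produce such messages, which is the integration point the slide alludes to.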

Page 15: Distributed Development

Scale Vectors

Core
  Better support for multi-core
  Exists in some cases
    MSBuild
    MbUnit (+ NUnit 3)
  Could be leveraged in the general case
    Not easy!
    Needs to mind end user’s preferences

Machine
  Support arbitrary networks (both on- and off-site)
  Need to control code sync (security)
  Can go for full resource utilization (esp. off-hours)
    Speculative processing
      E.g., Monte-Carlo simulations
    Operations which are prohibitively resource-intensive
      E.g., mutation tests

Page 16: Distributed Development

Leveraging the Model

Compilation
  Compiling on dev’s machine is counterproductive
  Compilation of some languages (C++, Scala) takes far too much time
    But the problem exists everywhere (.NET, Java)

Deployment

Testing
  Large test base cannot be run on a dev’s machine
  CI is not the answer
    Constrained to a single machine
    Can be distributed, but not straightforward

Code analysis
  Very costly

Coverage analysis
  Extremely costly

Page 17: Distributed Development

Compilation

MSBuild
  Builds all major types of VS projects
  Can parallelize locally (/m:n, n = number of processes)

Builds block VS
  Build runs on the UI thread

Builds are often inefficient
  Cannot build only projects affected by changes

Cannot use multiple machines
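For contrast, here is a sketch of how a build coordinator might work out which projects a change actually affects. The dependency map and the function name `affected_projects` are hypothetical; a real tool would read the project references from the solution.

```python
from collections import defaultdict, deque


def affected_projects(deps, changed):
    """Return the changed projects plus everything that depends on them.

    deps maps each project to the set of projects it references.
    """
    # Invert the graph: for each project, who depends on it?
    dependents = defaultdict(set)
    for project, requires in deps.items():
        for dep in requires:
            dependents[dep].add(project)

    # Breadth-first walk from the changed projects through their dependents.
    affected = set(changed)
    queue = deque(changed)
    while queue:
        current = queue.popleft()
        for dependent in dependents[current]:
            if dependent not in affected:
                affected.add(dependent)
                queue.append(dependent)
    return affected
```

Everything outside the returned set can keep its previous build output.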

Page 18: Distributed Development

Distributed Compilation

Dramatically speed up solution build
  Determine project dependencies
  Build different projects on separate machines
  Use multiple MSBuild processes per machine
    Depending on CPU count & power

Does not distract the developer
  Development machine usable without interruption
  Quicker feedback on errors
  Makes it possible to institute a continuous build policy
    Build on every file save
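One way to turn project dependencies into a distributed build plan is to group projects into "waves" whose members do not depend on one another, then spread each wave across machines. This is a sketch assuming Python 3.9+ for `graphlib`; the names `build_waves` and `assign`, the project names and the round-robin policy are illustrative, not the FreeSpace scheduler.

```python
from graphlib import TopologicalSorter


def build_waves(deps):
    """Split projects into waves that can each be built in parallel.

    deps maps each project to the set of projects it references.
    Raises graphlib.CycleError if the references form a cycle.
    """
    sorter = TopologicalSorter(deps)
    sorter.prepare()
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # all dependencies already built
        waves.append(ready)
        sorter.done(*ready)
    return waves


def assign(wave, machines):
    """Spread one wave over the available machines, round-robin."""
    return {project: machines[i % len(machines)] for i, project in enumerate(wave)}
```

A smarter assignment would weigh CPU count and current load per machine instead of using round-robin.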

Page 19: Distributed Development

Distributed Testing

Testing is slow
  Unit testing is largely not parallelized
    MbUnit [Parallelizable]
    NUnit will only support it in v3
  Not parallelized between several machines

Testing in a specific environment is difficult
  Requires complicated (possibly manual) deployment processes

Developers typically only test on their own box (+ maybe CI server)

Distributed testing ensures tests work everywhere
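Spreading a test suite over several machines is a load-balancing problem. The sketch below uses a greedy "longest test first" heuristic, which is a swapped-in textbook technique and not something the talk describes; the function name `distribute` and the use of recorded durations are assumptions.

```python
import heapq


def distribute(durations, machines):
    """Assign tests to machines so that total run times stay balanced.

    durations maps a test name to its expected run time in seconds.
    Returns {machine: (total_seconds, [tests])}.
    """
    # Min-heap keyed on accumulated time; machine names are unique,
    # so the list in third position is never compared.
    heap = [(0.0, name, []) for name in machines]
    heapq.heapify(heap)
    # Place the longest tests first, always on the least-loaded machine.
    for test, seconds in sorted(durations.items(), key=lambda kv: -kv[1]):
        total, name, tests = heapq.heappop(heap)
        tests.append(test)
        heapq.heappush(heap, (total + seconds, name, tests))
    return {name: (total, tests) for total, name, tests in heap}
```

The wall-clock time of the whole run is the largest per-machine total, so keeping the totals close together is what shortens feedback.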

Page 20: Distributed Development

Side Effects

Side effects are unwelcome on users’ machines
  Environment changes may have undesired consequences

Builds are typically exempt from this
  They do not affect anything beyond solution work folders

Unit tests may or may not affect the host system

Integration tests typically do affect hosts
  Require ‘clean’ set-ups

Page 21: Distributed Development

Isolation

No side effects
  Isolation is irrelevant, just take care of load

Side effects
  Process-level virtualization (for existing machines)
    JauntePE
    App-V
  Virtual machines
    Hyper-V
    ESX
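The isolation choice can be read as a small decision rule. The sketch below is one possible reading: the enum, the function name and the `needs_clean_setup` flag are hypothetical and only mirror the categories listed in the talk.

```python
from enum import Enum


class Isolation(Enum):
    NONE = "run directly on the host"
    PROCESS = "process-level virtualization"
    VM = "virtual machine"


def choose_isolation(has_side_effects, needs_clean_setup=False):
    """Pick an isolation level for a work item.

    Work without side effects (e.g., a build) needs no isolation.
    Work that needs a clean machine (e.g., integration tests) gets a VM.
    Everything else is sandboxed at process level on an existing machine.
    """
    if not has_side_effects:
        return Isolation.NONE
    if needs_clean_setup:
        return Isolation.VM
    return Isolation.PROCESS
```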

Page 22: Distributed Development

Virtualized Testing

Creation of multiple physical nodes is costly

Physical machine re-configuration takes too much time

Can configure a virtual test environment with
  Hyper-V
  System Center Virtual Machine Manager
    Virtual/physical migrations

Different hardware requirements
  Multi-CPU system
  Fast disks
  Very large amounts of RAM

Page 23: Distributed Development

Project FreeSpace

Private Cloud Infrastructure
  XMPP + file sync

Single-MSI deployment
Plugin architecture
Fully self-updating

Decentralized service orchestration
  Self-organizing
  Each node has identical capability

Easy to administer

Page 24: Distributed Development

Project FreeSpace Features

Distributed Compilation
  Initially MSBuild

Distributed Unit Testing
  Initially via Gallio test automation framework

Distributed Integration Testing
  Virtual machine management
  Initially via Hyper-V

Distributed Deployment
  E.g., create new VM for testers with appropriate binaries etc.

Page 25: Distributed Development

Benefit Summary

Better than Continuous Integration
Better than Continuous Testing
Better than local compilation
Better than local testing

Page 26: Distributed Development

Better Than Continuous Integration

CI is good for long-running builds/tests

CI happens on a single machine
  Can set up, e.g., multiple instances, but it’s not straightforward
  Not designed for distributed builds

CI is not optimized for idle processing
  Assumes server is dedicated

CI does not give immediate feedback
  Typically works on commit
    I.e., detects source control changes

Page 27: Distributed Development

Better Than Continuous Testing

Testing is often more costly than compilation

Typically, tests run on commit

Continuous testing systems (e.g., Mighty Moose) ensure that
  Tests run on each save
  Only tests affected by changes are executed
  Fast feedback…
    But not instant – you still need to recompile.

Given the option, why not build/test things all the time?

Page 28: Distributed Development

Better Than Local Compilation

Does not block IDE
Scales across your network
Much faster builds
Immediate feedback

Page 29: Distributed Development

Better Than Local Testing

Much faster recompilation
Tests do not tax developer CPU
Allows for immediate testing in different environments
Tests happen in parallel (where possible)

Page 30: Distributed Development

That’s all!

Questions?