Issue 27 Research Horizons

Post on 22-Jul-2016


University of Cambridge's research magazine


Pioneering research from the University of Cambridge

Research Horizons

Issue 27

Spotlight: Big data

Feature: The great British takeoff

Feature: How to read a digital footprint

www.cam.ac.uk/research

Contents

Issue 27, June 2015

News

Research news

Features

A real piece of work
Counting on sheep
The great British takeoff
Haunting of the Black Book
Play’s the thing
Bad air days

Things

A is for Albatross

Spotlight: Big data

Let’s get statted
Masters of the universe
The big dating game
Mining for corruption
Computer tutor
How to read a digital footprint

I always feel like somebody’s watching me…

Tragedy in Nepal

Inside out

Welcome

At the heart of almost all research lies data and its interpretation. But today’s datasets are the largest, most diverse and fastest accumulating ever experienced – so much so, they have acquired a moniker all of their own, ‘Big Data’, the focus of this issue.

The Square Kilometre Array of telescopes, for instance, will produce more data than the entire traffic on the global internet at any given moment.

Global comparisons of DNA databases, meanwhile, are helping researchers find patients with some of the rarest of diseases, so clinicians don’t have to start from scratch each time they encounter a new case.

Some data is of a type unthinkable a few years ago. Social media – Twitter and Facebook, for example – is providing information that could revolutionise psychological profiling, employment and commerce.

Getting the most out of big data requires new methods to handle large volumes of information and the clever use of statistical algorithms to distil meaningful knowledge out of disorder. In 2013, we launched a University-wide initiative, Cambridge Big Data, to help researchers respond to challenges like these. Cambridge is also one of five universities that will carry out research in organising, storing and interrogating big data as part of the Alan Turing Institute.

Of course, not all data is ‘big’, but can still be incredibly difficult to gather – such as understanding the impact, decades later, of how much a child plays. Quantifiable evidence in this area is needed for educational practice and urban development, as described in this issue, and a new research centre in the Faculty of Education will help to provide it.

We also cover research on the most extreme environment known to engineering – the jet engine – as well as air quality monitoring, how sheep are helping us to understand a devastating infant brain disease, a new theory of the industrial revolution and the fascinating story of a ‘haunted’ medieval book.

Professor Lynn Gladden
Pro-Vice-Chancellor for Research

Editor: Dr Louise Walsh
Design: The District
Printers: Micropress
Contributors: Craig Brierley, Sarah Collins, Jennifer Hayward, Tom Kirk, Fred Lewsey, Stuart Roberts, Louise Walsh
T: +44 (0)1223 765 443
E: research.horizons@admin.cam.ac.uk
W: cam.ac.uk/research

Copyright © 2015 University of Cambridge and Contributors as identified. The content of Research Horizons, with the exception of images and illustrations, is made available for non-commercial re-use in another work under the terms of the Creative Commons Attribution-Non-Commercial-ShareAlike Licence (http://creativecommons.org/licenses/by-nc-sa/3.0/), subject to acknowledgement of the original author/s, the title of the individual work and the University of Cambridge. This Licence requires any new work with an adaptation of content to be distributed and re-licensed under the same licence terms. Research Horizons is produced by the University of Cambridge’s Office of External Affairs and Communications.


News

25.03.15 The Chemistry of Health programme is awarded £17 million to research Alzheimer’s and Parkinson’s diseases.

01.04.15 Archaeologists unearth one of Britain’s largest medieval hospital cemeteries, containing over 1,000 human remains.

News in brief

More information at www.cam.ac.uk/research


‘Big data’ social

A new Centre is teaching undergraduate social scientists the quantitative skills they will need to tackle ‘big data’.

The UK lags behind other countries in preparing social scientists for the world of ‘big data’ says Dr Brendan Burchell, Director of a Centre set up to teach the advanced quantitative skills they will need to work with large datasets.

The Cambridge Undergraduate Quantitative Methods Centre (CUQM), rooted in the Department of Sociology, aims to ensure that at least 25% of social scientist graduates leaving Cambridge will have some statistical expertise.

“The UK is already way ahead of many other countries in the availability of large datasets that can be used to inform both policy and social science research,” he adds. “Over the next few decades – the career span of current undergraduates – we are likely to see huge advances in the use of quantitative data including ‘messier’ datasets that can only be analysed with recent advances in big data techniques.

“These skills will become increasingly vital for careers in social science research, but they will also make students much more employable in most other sectors as well.”

CUQM is extending the exposure to statistics in the social science undergraduate courses at Cambridge, as well as providing vacation courses and work placements. It is part of a wider initiative to train social scientists in research methods. Cambridge’s Social Science Research Methods Centre, for instance, complements the work of CUQM by teaching quantitative methods to graduate students.

www.cuqm.cshss.cam.ac.uk

Words of Waterloo

Documents forming the “first draft of history” in the aftermath of the Battle of Waterloo go on display.

“The field of Battle exhibits this morning a most shocking spectacle too dreadful to describe….” This letter written from the body-strewn battlefield at Waterloo, together with an invasion map of the UK and a book from Napoleon’s personal library in exile, is among the exhibits that have gone on display in Cambridge University Library during one of the first major Waterloo exhibitions of the bicentenary commemorations.

Looking at how Waterloo was written about in the immediate aftermath of the battle fought on 18 June 1815, the exhibition draws on the rich and varied collections at the Library and includes political propaganda, broadsheets, military drill books, coloured engravings and early historical accounts of the bloodshed.

“The exhibition really shows us the first draft of history as it was being written in the days, months and years after the battle,” explains historian Dr Mark Nicholls, who co-curated the exhibition.

The exhibition also features artefacts and mementoes from the battlefield itself, including a musket ball and a charred fragment of Hougoumont, the farmhouse which occupied a vital position in the Duke of Wellington’s line. The relics were collected by a teenage girl visiting the field 10 years after the battle.

“Waterloo is the most famous battle in modern European history, and from the very first moment soldiers and civilians alike wanted to put their experiences and emotions into words,” adds co-curator John Wells. “We examine how the battle’s impact was expressed through the written word, and how the documentary records of the time continue to have resonance for us today. Waterloo is still news, 200 years later.”

Image: Boy’s Book of British Battles

Credit: University Library

‘A Damned Serious Business: Waterloo 1815, the Battle and its Books’ runs until 16 September 2015

12.03.15 Research on new family forms, such as same-sex parents, looks at what these forms mean for the parents and children involved.

24.03.15 A National Research Facility for Infrastructure Sensing will receive £18 million in funding to support the application of sensor technologies.

16.02.15 Cambridge is one of three flagship Drug Discovery Institutes that will fast-track development of new treatments for dementia.


Antennas and chips

Discovery of the ‘last frontier’ of semiconductor design could be a massive leap forward for wireless communications.

Researchers from the Department of Engineering have unravelled one of the mysteries of electromagnetism, which could enable the design of antennas small enough to be integrated into an electronic chip.

The purpose of any antenna, whether in a communications tower or a mobile phone, is to launch energy into free space in the form of electromagnetic or radio waves, and to collect energy from free space to feed into the device.

One of the biggest problems, however, is that antennas are still quite big and are incompatible with electronic circuits – which are ultra-small and getting smaller all the time.

“An aerial’s size is determined by the wavelength associated with the transmission frequency of the application, and in most cases it’s a matter of finding a compromise between aerial size and the characteristics required for that application,” explains Professor Gehan Amaratunga, who led the recently published research.

Working with researchers from the National Physical Laboratory and Cambridge-based dielectric antenna company Antenova Ltd, the team used thin films of piezoelectric materials, a type of insulator which is deformed or vibrated when voltage is applied. At a certain frequency, these materials become not only efficient resonators, but efficient radiators as well, meaning that they can be used as aerials.

Future applications of the discovery include implementation of the ‘Internet of Things’, where almost everything in our homes and offices, from toasters to thermostats, is connected to the internet.

‘Serial killers’ caught on film

A dramatic video has captured the behaviour of white blood cells as they destroy cancer cells.

Inside all of us lurks an army of serial killers whose primary function is to kill again and again. Cytotoxic T cells, a type of white blood cell, ‘hunt down’ and destroy cancer cells and virally infected cells before moving on to their next target.

Now, the moment of killing has been captured on film in 3D by a collaboration of researchers from the UK and the USA. The research was led by Professor Gillian Griffiths at the Cambridge Institute for Medical Research with funding from the Wellcome Trust.

The researchers used high-resolution time-lapse multi-colour imaging techniques that capture slices through an object and then ‘stitch’ them together. As a result, they have managed to elucidate the order of the events that lead to delivery of the lethal ‘hit’ from these serial killers.

There are billions of T cells within our blood, each of which is engaged in the ferocious and unrelenting battle to keep us healthy. The T cell extends membrane protrusions that explore the surface of the cell, checking for tell-tale signs that it is an uninvited guest. It then injects poisonous proteins known as cytotoxins […] between the T cell and the cancer cell, before puncturing the surface and delivering its deadly cargo.

“Once the cytotoxins are injected into the cancer cell, its fate is sealed and we can watch as it withers and dies,” explains Griffiths. “The T cell then moves on, hungry to find another victim.”

Image: A T cell (green) delivers the lethal hit

Credit: Gillian Griffiths

Film available: bit.ly/1GMgBJU

Credit: Science Museum

A real piece of work

Image: Flames and smoke billow from the open coke hearths of the Bedlam furnaces in Coalbrookdale, Shropshire, as imagined by Philippe Jacques de Loutherbourg in 1801

In 2003, researchers embarked on a project to piece together a picture of changes in British working life over the course of 600 years. The emerging results seem to demand a rewrite of the most important chapter in our social and economic history.

“We’re talking about a fundamental change in what we understand about the past”

Dr Leigh Shaw-Taylor
lmws2@cam.ac.uk
Faculty of History

There comes a point when talking with Dr Leigh Shaw-Taylor at which it seems necessary to go over the facts again, if only to establish that he really does mean what he appears to have just said.

While many historians will spend their careers chipping away at the past with gentle care, 12 years into his research project, The Occupational Structure of Britain, 1379–1911, Shaw-Taylor seems to be calling for a wholesale rewrite. If his emerging results are correct, then they have the potential to transform not only the most important chapter in our social and economic history – the industrial revolution (so-called) – but with it the wellspring of much of our local and national identity.

So isn’t this a little drastic? “We’re talking about a fundamental change in what we understand about the past,” he says. “That is a fairly widespread view of our work. I’ve always felt that you can do more with historical research than people think, but I never thought that we could do this much. And it’s nothing compared with what we could achieve if we can keep the project going.”

The project, as its name suggests, is a hugely ambitious, wide-scale attempt to reconstruct the picture of how working life changed and developed in Britain from the late Middle Ages through to the early 20th century. Co-directed by Shaw-Taylor and his Cambridge colleague Professor Sir Tony Wrigley, the research team has spent years assembling information about matters such as population size, transport infrastructure and sector-by-sector employment, at different points in time.

It’s a complex job and, before this, nobody had really tried it. Much of what we know about social and economic history is based on records such as wills and parish registers, which are patchy, inconsistent or highly selective. As well as collating information, the team therefore had to develop a method of controlling for this lack of coherence, to avoid distorting the resulting picture of the past. “We had to develop a system of weighting the importance of the data when analysing it,” Shaw-Taylor explains. “We still can’t be sure that it’s right, but it puts a limit on the extent to which we can be wrong.”

Textbook orthodoxy says that, before the industrial revolution, most people in Britain worked in primary sector employment, overwhelmingly in agriculture. During the ‘revolutionary’ 80-year period starting in about 1760, this landscape was transformed as secondary industries – like processing and manufacturing – took off. Only in the 1950s did Britain supposedly begin to evolve into the tertiary, service-based economy that we have today.

On such things are national and local myths founded – tales of a green and pleasant land that rapidly became black with the smog of industry, for example, or of a country that used to make things, but doesn’t any more.

When Shaw-Taylor and colleagues looked at the data that they had assembled, however, they found that it didn’t fit the existing picture. Nationally, for example, secondary sector employment seems to have grown more between 1500 and 1750 than between 1750 and 1850. “We’ve always presumed that the major structural shift in employment from the primary to the secondary sector took place between 1750 and 1850,” he says. “Well, according to what we’ve found, that change took place about 100 years earlier than we thought.”

Similarly, the data transforms our picture of the evolution of tertiary, service-based industries in Britain. Rather than taking off in the mid-20th century, these seem to have been growing all the way through the 18th and 19th. By 1911, one man in 10 was, for example, working in transport – others were shopkeepers, merchants, clerks or professionals.

If this is true, it means an adjustment to our ‘island story’ that has some radical implications for the history of places far beyond these shores as well. For instance, it is often argued that Britain’s industrialisation was made possible thanks to the raw materials gathered by the slaves of Empire. If industrialisation began before the Empire existed, however, as these findings suggest, the story changes. “Moreover, for a small island off the coast of north-west Europe to start projecting its power around the world, something unusual must have happened internally before that, not after,” Shaw-Taylor points out.

Equally, if the shift to secondary sector employment happened before the dark, Satanic mills that populate the nation’s consciousness as temples of the industrial revolution even existed, then we need to modify our picture of what people were actually doing. If not farming, then what?

It seems likely that more early-modern Brits than we thought were carpenters, shoemakers, bakers, butchers, tailors and masons. This, in turn, raises puzzles about when and why agricultural and primary labour ceased to be dominant. The likelihood is that the evolution of more productive, less labour-intensive farming led to a decline in the relative importance of primary work. Over time, the children and grandchildren of agriculturalists would have been drawn to new opportunities in the secondary sector, or even tertiary, service industries.

Much remains to be done and there are still significant gaps in the research, most notably around the role of women in British employment history. Many historians associate the industrial revolution with new opportunities for female employment; others believe, just as fervently, that female employment collapsed. Only with more work and more funding will the team be able to establish exactly how women’s lives, and the family, changed during this period, and the consequences that this had for women’s social status.

What exists at the moment is, nevertheless, a compelling case for a data-led approach to writing the story of the past. “Methodologically, explaining why things happened in history is very difficult because it only happens once and you can’t run it under controlled conditions,” Shaw-Taylor observes. “Yet the processes historians are trying to describe are often vastly more complex than those described by science. Our approach has been to eschew questions of why until we have the data at our disposal. Until you have those patterns, you’re just trying to explain things that may or may not have happened, and that’s a waste of time.”

Funded by the British Academy, the Economic and Social Research Council, The Leverhulme Trust and the Isaac Newton Trust

Counting on sheep

Sheep are smarter than we might think, with brains surprisingly similar to ours. These similarities are helping researchers to study a devastating and incurable infant brain disease.

“Shall we take one of the sheep for a walk?” asks Professor Jenny Morton before we head down to the farmyard.

This seems a strange question at first: we’re all familiar with sheep behaving with a flock mentality, unable to think for themselves. So much so, in fact, that ‘follow like a sheep’ is a commonly used, derogatory phrase in the English language.

Yet, on meeting the sheep, it is immediately clear that these are not just dumb animals. The individual characters portrayed in the animated film Shaun the Sheep might be closer to the truth. “These animals are really smart,” explains Morton, who leads a team in the Department of Physiology, Development and Neuroscience. “They all have their own personalities.”

Morton’s colleague Dr Nicholas Perentos lets Isabella, one of his sheep, out of her pen. She is excited to be out, but doesn’t bound off; rather, she follows Perentos closely at heel, like a Labrador following its master. Once outside, she runs up and down the farmyard, stopping ‘to say hello’ to other sheep before returning expectantly to her handler. “She’s definitely Nic’s sheep,” says Morton. “She knows who I am, but I’m not wearing my usual farm clothes today, so she’s a little wary of me.”

Morton and colleagues are studying the cognitive skills and behaviour of these sheep, using experiments adapted from those carried out with humans. A standard task they use is to give the sheep two options and measure their behaviour: choose option A and they receive pellets, choose B and they receive nothing.

Using electroencephalography (EEG), the researchers can measure patterns of electrical activity across the brain to see what is happening when the sheep make decisions. Recently, they have begun making measurements from deep inside the brain. “We can now record from individual neurons as they fire,” says Perentos. “This might be in response to a particular task or a decision they’re making, or it might be cells that ‘fire’ depending on where they are standing or which way they are turning.” The discovery of these location-specific cells in mice – so-called ‘place cells’ – last year won Professor John O’Keefe from University College London a Nobel Prize.

Once the animal knows the task, the researchers will reverse the choices: now option B gives the pellets, but nudging the lever for option A offers no reward. Rats, monkeys, sheep and humans all learn to switch; but, compared with rodents, sheep react very differently, explains Morton. “When they don’t get their reward they’ll turn around and walk up to Nic, baa-ing, as though they’re saying ‘The apparatus isn’t working, go and sort it out’.”

The sheep’s intelligence is one reason why Morton believes they are a useful animal to help us understand how the brain works. There are some practical reasons – their docile nature makes them easy to manage and their large body size means they can easily carry equipment such as GPS trackers in a harness on their backs, allowing researchers to measure their natural behaviour – but it is the size and structure of their brains that is key.

Sheep’s brains are much larger than those of rodents, similar in size to the brain of a rhesus macaque, and with the complex folds that are seen in primate brains. Crucially, their brains also have basal ganglia similar to ours – this is the area deep in the brain that, along with the cerebral cortex, is responsible for important functions such as the control of movement and ‘executive functions’ such as decision-making, learning and habit formation. It’s this latter facet that makes sheep a useful model for studying brain diseases such as Huntington’s disease and Batten disease that affect the basal ganglia and cerebral cortex.

You may never have heard of Batten disease: it’s extremely rare, and only a handful of infants or children are diagnosed each year in the UK. It is a genetic disease caused when a child carries two copies of an aberrant gene – one copy from each parent. But it is also extremely serious – symptoms include progressive blindness, severe seizures and the loss of language, swallowing and motor skills. Death at a young age is inevitable and there is no cure.

Although Batten disease affects humans, it has never been seen in other primates. It does, however, occur naturally in sheep, though it’s unclear how common it is, as most farmed sheep are killed as lambs for human consumption. The disease was identified in sheep in New Zealand, and it is from these sheep that Morton’s animals were bred. Some of her sheep are imported, others are studied in New Zealand.

Batten disease is very similar in sheep and humans. At first, it is difficult to spot a Batten sheep, but after about a year, they begin to lose their eyesight and show unusual behaviour. After 18 months to two years, they show signs of dementia, often standing motionless in space, and can become agitated if handled by someone other than their usual handler.

Recording brain activity, particularly in areas such as the hippocampus, which is crucial for memory and learning, will give Morton and her team insights into what goes wrong in the disease in sheep. This is one step along the long path towards treating – even curing – the disease in humans.

With collaborators in Australia, Morton is also studying Huntington’s disease, a more common but equally devastating disease. Unlike those with Batten disease, people – and sheep – with Huntington’s do not begin showing symptoms until adulthood. “We have good mouse models for studying Huntington’s disease, but mice are short-lived animals, whereas sheep can live to at least 12 years. This is another huge benefit of studying the disease in sheep.”

There is no question that research using animals remains controversial. There are some who believe that animal research can never be justified. Morton has herself encountered extreme examples of such people in the past and has faced death threats because of her work. But she knows that her work is extremely important for the families of children with Batten disease.

“There’s only one thing worse than being a parent with a child who is blind, losing their motor skills and developing dementia,” she says, “and that’s being a parent with a child who is blind, losing their motor skills and developing dementia, and thinking that no one is asking why. That’s why we have a duty to do our research.”

“We can now record from individual neurons as they fire”

Professor Jenny Morton
ajm41@cam.ac.uk
Dr Nicholas Perentos
Department of Physiology, Development and Neuroscience


The Great British Takeoff

The Periodic Table may not sound like a list of ingredients but, for a group of materials scientists, it’s the starting point for designing the perfect chemical make-up of tomorrow’s jet engines.

“Increasing one ingredient might produce one sought-after property, but at the sake of another... we need to find the perfect chemical recipe”

Credit: Rolls-Royce plc

Dr Cathie Rae
cr18@cam.ac.uk
Dr Howard Stone
hjs1002@cam.ac.uk
Department of Materials Science and Metallurgy

Inside a jet engine is one of the most extreme environments known to engineering.

In less than a second, a tonne of air is sucked into the engine, squeezed to a fiftieth of its normal volume and then passed across hundreds of blades rotating at speeds of up to 10,000 rpm; reaching the combustor, the air is mixed with kerosene and ignited; the resulting gases are about a third as hot as the sun’s surface and hurtle at speeds of almost 1,500 km per hour towards a wall of turbines, where each blade generates power equivalent to the thrust of a Formula One racing car.

Turbine blades made from ‘super’ materials with outstanding properties are needed to withstand these unimaginably challenging conditions – where the temperatures soar to above the melting point of the turbine components and the centrifugal forces are equivalent to hanging a double-decker bus from each blade.

Even with these qualities, the blades require a ceramic layer and an air cooling system to prevent them from melting when the engine reaches its top temperatures. But with ever-increasing demands for greater performance and reduced emissions, the aerospace industry needs engines to run even hotter and faster, and this means expecting more and more from the materials they are made from.

This, says Dr Cathie Rae, is the materials grand challenge. “Turbine blades are made using nickel-based superalloys, which are capable of withstanding the phenomenal stresses and temperatures they need to operate under within the jet engine. But we are running close to their critical limits.”

An alloy is a mixture of metals, such as you might find in steel or brass. A superalloy, however, is a mixture that imparts superior mechanical strength and resistance to heat-induced deformation and corrosion.

Rae is one of a team of scientists in the Rolls-Royce University Technology Centre (UTC) at the Department of Materials Science and Metallurgy. The team’s research efforts are focused on extracting the greatest possible performance from nickel-based superalloys, and on designing superalloys of the future.

Current jet engines predominantly use alloys containing nickel and aluminium, which form a strong cuboidal lattice. Within and around this brick-like structure are up to eight other components that form a ‘mortar’. Together, the components give the material its superior qualities.

“Even tiny adjustments in the amount of each component can have a huge effect on the microscopic structure, and this can cause radical changes in the superalloy’s properties,” explains Dr Howard Stone. “It’s rather like adjusting the ingredients in a cake – increasing one ingredient might produce one sought-after property, but at the sake of another. We need to find the perfect chemical recipe.”

Stone is the Principal Investigator overseeing a £50 million Strategic Partnership on structural metallic systems for advanced gas turbine applications funded jointly by Rolls-Royce and the Engineering and Physical Sciences Research Council (EPSRC), and involving the Universities of Birmingham, Swansea, Manchester, Oxford and Sheffield, and Imperial College London.

The researchers melt together precise amounts of each of the different elements to obtain a 5 cm bar, then exhaustively test the bar’s mechanical properties and analyse its microscopic structure. Their past experience in atomic engineering is vital for homing in on where the incremental improvements might be found – without this, they would need to make many millions of bars to test each reasonable mixture of components.

Now, they are looking beyond the usual components to exotic elements, although always with an eye on keeping costs as low as possible, which means not using extremely rare materials. “The Periodic Table is our playground… we’re picking and mixing elements, guided by our computer models and experimental experience, to find the next generation of superalloys,” he adds.

The team now have 12 patents with Rolls-Royce. One of the most recent has been in collaboration with Imperial College London, and involves the discovery that the extremely strong matrix structure of nickel-based aluminium superalloys can also be achieved using a mixture of nickel, aluminium, cobalt and tungsten.

“Instead of the cake being flavoured with two main ingredients, we can make it with four,” Stone explains. “This gives the structure even better properties, many of which we are only just discovering.”

“We’ve also been looking at new intermetallic reinforced superalloys using chromium, tantalum and silicon – no nickel at all. We haven’t quite got the final balance to achieve what we want, but we’re working towards it.”

Stone highlights the importance of collaboration between industry and academia: “New alloys typically take 10 years and many millions of pounds to develop for operational components. We simply couldn’t do this work without Rolls-Royce. For the best part of two decades we’ve had a collaboration that links fundamental materials research through to industrial application and commercial exploitation.”

It’s a sentiment echoed by Dr Justin Burrows, Project Manager at Rolls-Royce: “Our academic partners understand the materials and design challenges we face in the development of gas turbine technology. Improvements like the novel nickel and steel alloys developed in Cambridge are key to helping us meet these challenges and to maintaining our competitive advantage.”

The Cambridge UTC, which was founded by its Director Professor Sir Colin Humphreys in 1994, is one of a global network of over 30 UTCs. These form part of Rolls-Royce’s £1 billion annual investment in research and development, which also includes the Department of Engineering’s University Gas Turbine Partnership. Rolls-Royce and EPSRC also fund Doctoral Training Centres in Cambridge that help to ensure a continuing supply of highly trained scientists and engineers ready to move into industry.

The UK aerospace industry is the largest in Europe, with a turnover in 2011 of £24.2 billion; worldwide, it’s second only to that of the USA. Meanwhile, increasing global air traffic is estimated to require 35,000 new passenger aircraft by 2030, worth about $4.8 trillion.

For the researchers, it’s fascinating to see global engineering challenges being solved from the atom up, as Rae explains: “The commercial success of a new engine can be dependent on very small differences in fuel efficiency, which can only be achieved by innovations in materials and design. There’s something really exciting about working at the atomic scale and seeing this translate into innovation with big powerful machines.”

Film available online

Haunting of the Black Book

The 16th-century owner of one of Wales’ oldest manuscripts probably thought they were ‘tidying up’ when they assiduously erased ancient doodles and verses scribbled in its margins. Now, Cambridge researchers have brought them back to life.

Professor Paul Russell and PhD student Myriah Williams had been peering at the ancient manuscript for several hours, methodically turning page after page and adjusting the ultraviolet (UV) lamp in the hope of casting new light and understanding on a 750-year-old masterpiece.

Other readers and researchers had come and gone from the Reading Room at the National Library of Wales as the pair ploughed on. But, despite their efforts, the vellum pages had revealed only the medieval Welsh poetry they knew so well, plus a few tiny fragments of text in the margins, none of which were particularly remarkable or noteworthy.

Then, as the UV light fell on folio 39v of the manuscript, Russell turned in astonishment to his colleague and asked: “Are you seeing what I’m seeing?”

There, invisible to the naked eye but appearing under the glare of UV, were a pair of ethereal faces and a line of accompanying text. With image-enhancement techniques, they were to find an entire page of erased verse that was (and remains) unknown in the canon of Welsh poetry.

The manuscript containing the ghostly images was The Black Book of Carmarthen – the earliest surviving medieval manuscript written solely in Welsh. Containing some of the earliest references to the legendary tales of King Arthur and Merlin, the Black Book (so-called because of the colour of its binding) is a collection of 9th- to 12th-century religious and secular poetry, and draws on the traditions of the Welsh folk-heroes and legends from the early medieval period.

However, despite its importance and decades of scholarly research, it is the work of the two Cambridge researchers that is illuminating new glimpses of verse from this ancient book.

“We knew that there had been significant erasure in the margins of the manuscript but we never expected to find two faces staring out at us,” says Williams. “We thought we might recover some text but not images. You never find images.”

Williams and Russell, from the Department of Anglo-Saxon, Norse and Celtic, have worked together on the Black Book for the past three years. Russell has studied the language – the nuts and bolts of spelling, punctuation, grammar, and so on – whereas for Williams the book as a whole is the subject of her PhD.

The 54-page book, which dates from 1250 and is only just larger than a hand’s length, is thought to have been written by a single scribe who was probably collecting and recording poetry during a long period of his life. Then, as the manuscript changed hands over the centuries that followed, its various owners made their own additions.

Until, that is, it was ‘tidied up’. They believe that a 16th-century owner of the book, possibly a man named Jaspar Gryffyth, summarily erased the marginalia.

Credit: All images, National Library of Wales

AmongtheerasedmaterialissomepreviouslyunrecordedWelshverse.Althoughthetextisfragmentaryandinneedofmoreanalysis,itseemstobethecontinuationofapoemontheprecedingpagetogetherwithanewpoematthefootofthepage.

“It’seasytothinkweknowallwecanknowaboutamanuscriptliketheBlackBook,”addsWilliams.“Buttoseetheseghostsfromthepastbroughtbacktolifeinfrontofoureyeshasbeenincrediblyexciting.Thedrawingsandversethatwe’reintheprocessofrecoveringdemonstratethevalueofgivingthesebooksanotherlook.

“Themarginsofmanuscriptsoftencontainmedievalandearlymodernreactionstothetext,andthesecancastlightonwhatourancestorsthoughtaboutwhattheywerereading.TheBlackBookwasparticularlyheavilyannotatedbeforetheendofthe16thcentury.Forinstance,WelshscholarDrJohnDaviesofMallwydwroteinthemargin‘Idon’tunderstandtheBlackBook’.Thisiswonderful!ThiswasamanwhowroteoneofthefirstWelshgrammarsanddictionaries!Thistypeofreactionbringsthepagestolife.”

ThepairalsorecoveredadrawingofafishunderneathapoemaboutthedrowningofCardiganBay.Thebaywasflooded,solegendsays,throughthewrathofGod,andbothRussellandWilliamsbelievethefishwasdrawninconnectionwiththepoem’ssubjectmatter.Ironically,thepagecontainingthepoemaboutfloodingshowssomeevidenceofwaterdamage.

Contents of the Black Book range from religious verse to praise poetry to narrative poetry. An example of the latter is the earliest poem concerning the adventures of Arthur, which sees the famed hero seeking entrance to an unidentified court and expounding the virtues of his men in order to gain admittance.

Other heroes are praised and lamented in a lengthy text known as Englynion y Beddau, the Stanzas of the Graves, in which a narrator presents geographic lore by claiming to know the burial places of upwards of 80 warriors. Arthur makes an appearance here as well, but only insofar as to say that he cannot be found: anoeth bid bet y arthur, ‘the grave of Arthur is a wonder’.

Further famous figures also appear throughout, including Myrddin, more familiarly known by the English as ‘Merlin’.

There are two prophetic poems attributed to him during his ‘wild man’ phase located in the middle of the manuscript, but additionally the very first poem of the book is presented as a dialogue between him and the celebrated Welsh poet Taliesin. Ever since Geoffrey of Monmouth composed Historia Regum Britanniae in the 12th century there has been a connection between Carmarthen and Merlin, and it may be no accident that the Black Book opens with this text.

“We thought we might recover some text but not images. You never find images.”

Myriah Williams, mjw202@cam.ac.uk
Professor Paul Russell, pr270@cam.ac.uk
Department of Anglo-Saxon, Norse and Celtic

Russell believes that the new discoveries may only be the tip of the iceberg in terms of what can be recovered as imaging techniques are enhanced: “These drawings and other marginalia help us to go beyond the text to show what people thought about it, sometimes seriously, sometimes in a playful way. The manuscript is extremely valuable and incredibly important – yet there may still be so much we don’t know about it.”

Images of The Black Book of Carmarthen, including the erased faces (left)

Play’s the thing

Children’s play is under threat from increased urbanisation, perceptions of risk and educational pressures. The first research centre of its kind aims to understand the role played by play in how a child develops.

Brick by brick, six-year-old Alice is building a magical kingdom. Imagining fairy-tale turrets and fire-breathing dragons, wicked sorcerers and gallant heroes, she’s creating an enchanting world. Although she isn’t aware of it, this fantasy will have important repercussions in her adult life: it is helping her take her first steps towards her capacity for abstract thought and creativity.

Minutes later, Alice has abandoned the kingdom in favour of wrestling with her brother – or, according to educational psychologists, developing her capacity for strong emotional attachments. When she bosses him around as ‘his teacher’, she’s practising how to regulate her emotions through pretence. When they settle down with a board game, she’s learning about rules and turn-taking.

“Play in all its rich variety is one of the highest achievements of the human species,” says Dr David Whitebread from Cambridge’s Faculty of Education. “It underpins how we develop as intellectual, problem-solving, emotional adults and is crucial to our success as a highly adaptable species.”

Recognising the importance of play is not new: over two millennia ago, Plato extolled its virtues as a means of developing skills for adult life, and ideas about play-based learning have been developing since the 19th century.

But we live in changing times, and Whitebread is mindful of a worldwide decline in play. “Over half the world’s population live in cities. Play is curtailed by perceptions of risk to do with traffic, crime, abduction and germs, and by the emphasis on ‘earlier is better’ in academic learning and competitive testing in schools.

“The opportunities for free play, which I experienced almost every day of my childhood, are becoming increasingly scarce. Today, play is often a scheduled and supervised activity.”

International bodies like the United Nations and the European Union have begun to develop policies concerned with children’s right to play, and to consider implications for leisure facilities and educational programmes. But what they often lack is the evidence to base policies on, as Whitebread explains: “Those of us who are involved in early childhood education know that children learn best through play and that this has long-lasting consequences for achievement and wellbeing. But the kind of hard quantifiable evidence that is understood by policymakers is difficult to obtain. Researching play is inherently tricky.”

“The type of play we are interested in is child-initiated, spontaneous and unpredictable – but, as soon as you ask a five-year-old ‘to play’, then you as the researcher have intervened,” explains Dr Sara Baker. “And we want to know what the impact of play is years, even decades, later. It’s a real challenge.”

Dr Jenny Gibson agrees: “Although some of the steps in the puzzle of how and why play is important have been looked at, there is very little, high-quality evidence that takes you from the amount and type of play a child experiences through to its impact on the rest of its life.”

Now, thanks to the new Centre for Research on Play in Education, Development and Learning (PEDaL), Whitebread, Baker, Gibson and a team of researchers hope to provide evidence on the role played by play in how a child develops.

“A strong possibility is that play supports the early development of children’s self-control,” explains Baker. “These are our abilities to develop awareness of our own thinking processes – it influences how effectively we go about undertaking challenging activities.”

In a study carried out by Baker with toddlers and young pre-schoolers, she found that children with greater self-control solved problems quicker when exploring an unfamiliar set-up requiring scientific reasoning, regardless of their IQ. “This sort of evidence makes us think that giving children the chance to play will make them more successful and creative problem-solvers in the long run.”

If playful experiences do facilitate this aspect of development, say the researchers, it could be extremely significant for educational practices because the ability to self-regulate has been shown to be a key predictor of academic performance.

Gibson adds: “Playful behaviour is also an important indicator of healthy social and emotional development. In my previous research, I investigated how observing children at play can give us important clues about their wellbeing and can even be useful in the diagnosis of neurodevelopmental disorders like autism.”

Whitebread’s recent research has involved developing a playful approach to supporting children’s writing. “Many primary school children find writing difficult, but we showed in a previous study that a playful stimulus was far more effective than an instructional one.” Children wrote longer and better structured stories when they first played with dolls representing characters in the story. In the latest study, children first built their story with LEGO, with similar results. “Many teachers commented that they had always previously had children saying they didn’t know what to write about. With the LEGO building, however, not a single child said this through the whole year of the project.”

The strand of research he leads in the Centre will focus on the results of large-scale longitudinal studies, such as the University of London’s Millennium Cohort Study, which is charting the social, economic and health conditions of individual children. Whitebread hopes to determine how much a child plays, the quality of their playfulness, and with what end result.

Even when this evidence is known, it is often difficult to develop practices that best support children’s play. The two research strands led by Gibson and Baker will aid this: Gibson will be developing an understanding of the cognitive processes involved in play and measures of playfulness, and Baker will be constructing and evaluating play-based educational interventions.

Whitebread, who directs PEDaL, trained as a primary school teacher in the early 1970s, when, as he describes, “the teaching of young children was largely a quiet backwater, untroubled by any serious intellectual debate or controversy.” Now, the landscape is very different, with hotly debated topics such as school starting age and the introduction of baseline assessment to those starting school in September 2015.

“Somehow the importance of play has been lost in recent decades. It’s regarded as something trivial, or even as something negative that contrasts with ‘work’. Let’s not lose sight of its benefits, and the fundamental contributions it makes to human achievements in the arts, sciences and technology. Let’s make sure children have a rich diet of play experiences.”

Dr David Whitebread, dgw1004@cam.ac.uk
Dr Sara Baker, stb32@cam.ac.uk
Dr Jenny Gibson, jlg53@cam.ac.uk
Faculty of Education

Bad air days

Pollution causes 30,000 people a year in the UK to die early yet most of us are unaware of the degree to which we are exposed to it. Low-cost pollution detectors could provide the answer.

Professor Rod Jones, rlj1001@cam.ac.uk, Department of Chemistry

Rush hour can be maddening. Roads congested with traffic, public transport overcrowded, pavements heaving with people. But as well as the frustration, there’s a sinister side to the commute to work: every breath you take could be adding to your risk of dying prematurely.

Air pollution is the world’s largest single environmental health risk, causing one in every eight deaths according to figures released last year by the World Health Organization. In the UK, 30,000 people die prematurely every year as a result of poor air quality, and it costs the NHS and wider economy many billions each year.

Traffic is the main culprit; however, industry, domestic heating, power generation and burning are all contributors to pollution. And although the effects of pollution might be noticeable on a particularly smoggy day in a large city, decades of exposure to only slightly higher levels – a level we wouldn’t even notice – can increase the risk of heart and lung diseases, stroke and cancer.

“To work out the factors we should be worried about, and how we can intervene, we need to rethink how we measure what’s going on,” explains atmospheric scientist Professor Rod Jones.

In the UK, the Automatic Urban and Rural Network provides valuable hour-by-hour assessments of air quality. But with only 171 monitoring stations at fixed sites nationwide, large areas of the country remain uncovered. Cost is the main limitation to developing a higher density network.

With this in mind, Jones’ team, together with industrial partners and other universities, has been developing low-cost pollution detectors that are small enough to fit in your pocket, stable enough to be installed as long-term static detectors around a city, and sensitive enough to detect small changes in air quality on a street-by-street basis. Their findings are now informing research projects aimed at improving air quality in major cities across Europe and North America.

The detectors are based on electrochemical sensors developed by project partner Alphasense for industrial safety, where detection of toxic gases is needed at the parts-per-million level. Monitoring air quality, however, requires parts-per-billion sensitivity. “Rod and I had the confidence to believe that we could push our sensors to lower concentration levels, and yet keep sensor costs low,” says Dr John Saffell, Technical Director at Alphasense.

The electrochemical devices the team developed can measure a wide range of pollutants, including carbon monoxide, nitrogen dioxide and ozone, and they contain laser technology (developed by the University of Hertfordshire) to detect particulates from cars and lorries. The addition of a GPS aerial allows air quality data and location to be mapped simultaneously.

A series of proof-of-concept studies followed. Personal devices were strapped to bicycles, carried in cars and on buses, and static devices were attached to lampposts and stationed at roadsides and at critical pollutant sites. Fifty static devices were also deployed around London Heathrow Airport to record 22 months in the life of one of the busiest airports in the world.

“This was the first time technology like this had been tested in real-world situations as a high-density network,” says Jones, whose research at Heathrow was funded by the Natural Environment Research Council. “We could see huge variability in the exposure to pollution that people encounter as they move around the urban environment, including ‘hotspots’. At Heathrow, we could see the airport turning on and off during the day, individual aircraft taxiing and taking off, and the effects of wind direction and the perimeter and M25 motorway road traffic.”

They also discovered that sensor performance can create new opportunities. Jones and colleagues had to develop new smart software methods capable of separating local pollution events from background signals (pollution transported from long range) and then to calibrate sensors across networks. Plus, they needed to move from being able to process the data after it has been collected to doing so in real-time.
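One simple way to picture the separation of local events from background is to treat the slowly varying floor of a sensor trace as the background and whatever rises above it as local. The sketch below is illustrative only and is not the Cambridge team’s software: the function name `split_background`, the window length and the readings are all made-up assumptions.

```python
def split_background(readings, window=5):
    """Split a sensor series into (background, local) components.

    background[i] is the lowest reading in a window centred on i (an
    assumed stand-in for long-range pollution); local[i] is whatever
    remains above that floor.
    """
    half = window // 2
    n = len(readings)
    background = []
    for i in range(n):
        lo = max(0, i - half)
        hi = min(n, i + half + 1)
        background.append(min(readings[lo:hi]))
    local = [r - b for r, b in zip(readings, background)]
    return background, local


# Invented readings (parts per billion): a steady floor with one short spike,
# such as a lorry passing a roadside sensor.
readings = [20, 21, 20, 55, 22, 21, 23, 22]
background, local = split_background(readings, window=5)
print(background)
print(local)
```

A rolling minimum is a deliberately crude choice here; a real network would need something more robust to sensor noise and drift, which is presumably part of why new methods were required.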

The team has been working with Cambridge Environmental Research Consultants – developers of world-leading air quality modelling software – combining the unprecedented level of data created by the pollution-monitoring studies with model output to enhance the understanding of pollution dispersion.

For instance, sensors can be used to ask whether pollution along bus routes is improved by upgrading the exhaust processing on a bus fleet; whether people living at the top of high-rise buildings experience more or less pollution than people at street level; and to what extent changing a route to work, even from one side of the road to another, can affect an individual’s exposure.

Last year, the first commercial product (AQMesh) was released by UK manufacturer Geotech, which specialises in environmental monitoring equipment. AQMesh uses Alphasense sensors to sample every 10 seconds, and data processing is carried out in real-time using cloud computing software similar to that developed by the Cambridge team.

“When the project started in 2006 there were lone voices calling for a different approach to air quality monitoring,” explains Geotech’s Commercial Manager Amanda Randle. “The Cambridge team and Alphasense helped us to understand the sensor’s full potential, and now we have a product that can be placed exactly where it’s needed and provides valuable information.”

And now the approach pioneered in Cambridge is helping to inform two of the largest air quality research studies of their kind.

The AirSensa project, run by the non-profit organisation Change London, aims to deploy large numbers of air quality sensors across the whole of Greater London. Alphasense is providing the sensors and supporting the engineering; and Cambridge is helping with data interpretation in a project whose ethos is “you can’t manage what you can’t measure.”

Meanwhile, the methodologies the researchers developed in the pilot study at Heathrow are contributing to CITI-SENSE, an EU-funded €12.7 million project providing wireless networks to eight cities across Europe. CITI-SENSE involves 27 partner institutions from academia, the healthcare sector and industry (including Alphasense and Geotech), as well as the general public. Citizens across Europe will be involved in data collection through personal monitors and in community decision-making to choose monitoring solutions for spaces such as schools and urban public spaces.

“Even though the effects of poor air quality on health are well known, irrefutable evidence of the scale of the air quality issue and the benefits of ameliorating strategies is urgently needed,” adds Jones. “CITI-SENSE provides a test-bed for both rolling out the new technologies that are coming online and for drawing on the ‘power of the Citizen’ to guide how society responds.”

Film available online

Things

A is for Albatross

A fascinating 26-part series – the Cambridge Animal Alphabet – has launched online. It celebrates Cambridge’s connections with animals through literature, art, science and society.

Horses frolic on the plaster cast of the Parthenon frieze in the Museum of Classical Archaeology; research into chickens is helping to understand a major source of food poisoning; a new book explores the making of the modern dog as a pampered urban pet; and, every day, millions of fruit flies are fed by the fly kitchen in the Department of Genetics.

These are just some of the stories we discovered when we set out to create a Cambridge Animal Alphabet: an A to Z of animals with a Cambridge connection. The series, which is available on our website, launches with A is for Albatross…

In June 1910, Dr Edward Wilson set sail to Antarctica on board the Terra Nova on the British Antarctic Expedition led by Captain Scott. A supremely talented artist, Wilson sketched what he saw – including the majestic albatross.

The expedition ended in tragedy. The members of the British expedition perished on their return from the pole having discovered that the Norwegians had got there first. Wilson’s sketchbook was retrieved from the tent where he and his companions spent their last days.

Today, around 1,900 of Wilson’s drawings and sketches are held by Cambridge’s Scott Polar Research Institute (SPRI), which houses a unique collection of materials illustrating polar exploration, history and science.

“Wilson is one of the greatest artists of the heroic age of polar exploration,” says Heather Lane, former Keeper of the Polar Museum at SPRI. “He captured with stunning accuracy both the anatomical structure and the fragile beauty of living things.”

Read more about Wilson’s sketches and our research on albatrosses at www.cam.ac.uk/research, and watch out for B is for Bear, C is for Chicken, D is for Dragon as we work our way through the Cambridge Animal Alphabet.

Images: Edward Wilson’s sketches of albatrosses
Credit: All images, Scott Polar Research Institute

Feature and film available bit.ly/1EmWZdL

Spotlight: Big data

Let’s get statted

With more information than ever at our fingertips, statisticians are vital to innumerable fields and industries. Welcome to the world of the datarati, where humans and machines team up to crunch the numbers.

Professor Zoubin Ghahramani, zg201@cam.ac.uk, Department of Engineering
Professor Richard Samworth, rjs57@cam.ac.uk, Department of Pure Mathematics and Mathematical Statistics

“I keep saying that the sexy job in the next 10 years will be statisticians, and I’m not kidding,” Hal Varian, Chief Economist at Google, famously observed in 2009. It seems a difficult assertion to take seriously, but six years on, there is little question that their skills are at a premium.

Indeed, we may need statisticians now more than at any time in our history. Even compared with a decade ago, we can now gather, produce and consume unimaginably large quantities of information. As Varian predicted, statisticians who can crunch these numbers are all the rage. A new discipline, ‘Data Science’, which fuses statistics and computational work, has emerged.

“People are awash in data,” reflects Zoubin Ghahramani, Professor of Information Engineering at Cambridge. “This is occurring across industry, it’s changing society as we become more digitally connected, and it’s true of the sciences as well, where fields like biology and astronomy generate vast amounts of data.”

Over the past few years, Richard Samworth, Professor of Statistics, has watched the datarati step out from the shadows. “It’s probably fair to say that statistics didn’t have the world’s best PR for quite a long time,” he says. “Since this explosion in the amount of data that we can collect and store, opportunities have arisen to answer questions we previously had no hope of being able to address. These demand an awful lot of new statistical techniques.”

‘Big data’ is most obviously relevant to the sciences, where large volumes of information are gathered to answer questions in fields such as genetics, astronomy and particle physics, but it also has more familiar applications. Transport authorities gather data from electronic ticketing systems like Oyster cards to understand more about passenger movements; supermarkets closely monitor customer transactions to react to shoppers’ predilections. As users of social media, many of us disclose data about ourselves that is as valuable to marketing as it is relevant to psychoanalytics. Increasingly, we are also ‘lifeloggers’, monitoring our own behaviour, health, diet and fitness, through smart technology.

This information, as Ghahramani points out, is no use on its own: “It fills hard drives, but to extract value from it, we need methods that learn patterns in the data and allow us to make predictions and intelligent decisions.” This is what statisticians, computer scientists and machine learning specialists bring to the party – they build algorithms, which are coded as computer software, to see patterns. At root, the datarati are interpreters.

Despite their ‘sexy’ new image, however, not enough data scientists exist to meet this rocketing demand. Could some aspects of the interpretation be automated using artificial intelligence instead, Ghahramani wondered? And so, in 2014 and with funding from Google, the first incarnation of The Automatic Statistician was launched online. Despite minimal publicity, 3,000 users uploaded datasets to it within a few months.

Once fed a dataset, the Automatic Statistician assesses it against various statistical models, interprets the data and – uniquely – translates this interpretation into a short report of readable English. It does this without human intervention, drawing on an open-ended ‘grammar’ of statistical models. It is also deliberately conservative, only basing its assessments on sound statistical methodology, and even critiquing its own approach.

Ghahramani and his team are now refining the system to cope with the messy, incomplete nature of real-world data, and also plan to develop its base of knowledge and to offer interactive reports. In the longer term, they hope that the Automatic Statistician will learn from its own work: “The idea is that it will look at a new dataset and say, ‘Ah, I’ve seen this kind of thing before, so maybe I should check the model I used last time’,” he explains.

While automated systems rely on existing models, new algorithms are needed to extract useful information from evolving and expanding datasets. Here, the role of human statisticians is vital.

To characterise the problem, Samworth presents a then-and-now comparison. During the past century, a typical statistical problem might, for instance, have been to understand the relationship between the initial speed and stopping distance of cars based on a sample size of 50.

These days, however, we can record information on a huge number of variables at once – the weather, road surface, make of car, wind direction, and so on. Although the extra information has the potential to yield better models and reduce uncertainty, in many areas, the number of features measured is so high it may even exceed the number of observations. Identifying appropriate models in this context is a serious challenge, which requires the development of new algorithms.

To resolve this, statisticians rely on a principle called ‘sparsity’: the idea that only a few bits of the dataset are really important. The statistician identifies these needles in the haystack. Various algorithms have been developed to select the important variables, so that the initial sprawl of information starts to become manageable and patterns can be extracted.

Together with his colleague Dr Rajen Shah in the Department of Pure Mathematics and Mathematical Statistics, Samworth has developed a method for refining any such variable selection technique called ‘Complementary Pairs Stability Selection’. This applies the original method to random subsamples of the data instead of the whole, and does this over and over again. Eventually, the variables that appear on a high proportion of the subsamples emerge as those meriting further attention.
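The resampling idea can be sketched in a few lines. What follows is a minimal illustration of the procedure as described above, not the authors’ implementation: the base selector (keep the `k` variables most correlated with the response), the 0.8 threshold, the function names and the simulated data are all assumptions chosen for brevity. Each round splits the data at random into two non-overlapping halves, the ‘complementary pair’, and runs the base selector on each.

```python
import random


def top_k_by_correlation(X, y, k):
    """Hypothetical base selector: the k columns most correlated with y."""
    n, p = len(X), len(X[0])
    mean_y = sum(y) / n
    var_y = sum((b - mean_y) ** 2 for b in y)
    scores = []
    for j in range(p):
        col = [row[j] for row in X]
        mean_x = sum(col) / n
        cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(col, y))
        var_x = sum((a - mean_x) ** 2 for a in col)
        denom = (var_x * var_y) ** 0.5
        scores.append(abs(cov / denom) if denom > 0 else 0.0)
    ranked = sorted(range(p), key=lambda j: scores[j], reverse=True)
    return set(ranked[:k])


def complementary_pairs_frequencies(X, y, k, n_pairs=50, seed=0):
    """Fraction of the 2 * n_pairs half-samples in which each variable is picked."""
    rng = random.Random(seed)
    n, p = len(X), len(X[0])
    half = n // 2
    counts = [0] * p
    for _ in range(n_pairs):
        idx = list(range(n))
        rng.shuffle(idx)
        # Two disjoint halves of the same shuffle form one complementary pair.
        for part in (idx[:half], idx[half:2 * half]):
            Xs = [X[i] for i in part]
            ys = [y[i] for i in part]
            for j in top_k_by_correlation(Xs, ys, k):
                counts[j] += 1
    return [c / (2 * n_pairs) for c in counts]


# Simulated data: 10 variables, of which only the first two drive the response.
rng = random.Random(1)
n, p = 200, 10
X = [[rng.gauss(0, 1) for _ in range(p)] for _ in range(n)]
y = [2 * row[0] + 2 * row[1] + rng.gauss(0, 1) for row in X]

freqs = complementary_pairs_frequencies(X, y, k=2)
stable = [j for j, f in enumerate(freqs) if f >= 0.8]
print(stable)
```

Any selection method could be substituted for `top_k_by_correlation`; the point of the wrapper is that variables picked consistently across many half-samples are less likely to be flukes of one particular dataset.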

Scanning Google Scholar for citations of the paper in which this was proposed, Samworth finds that his algorithm has been used in numerous research projects. One looks at how to improve fundraising for disaster zones, another examines potential biomarkers for breast cancer survival, and a third identifies risk factors connected with childhood malnutrition.

How does he feel when he sees his work being applied so far and wide? “It’s funny,” he says. “My training is in mathematics and I still get a kick from proving a theorem, but it’s also rewarding to see people using your work. It’s often said that the good thing about being a statistician is that you get to play in everyone’s backyard. I suppose this demonstrates why that’s true.”

The ‘world’s largest IT project’ – a system with the power of one hundred million home computers – may help to unravel many of the mysteries of our universe: how it began, how it developed and whether humanity is alone in the cosmos.

Imagine having to design a completely automated system that could take all of the live video from all of the hundreds of thousands of cameras monitoring London, and automatically dispatch an ambulance any time any person falls and hurts themselves, anywhere in the city, without any human intervention whatsoever. That is the scale of the problem facing the team designing the software and computing behind the world’s largest radio telescope.

When it becomes operational in 2023, the Square Kilometre Array (SKA) will probe the origins, evolution and expansion of our universe; test one of the world’s most famous scientific theories; and perhaps even answer the greatest mystery of all – are we alone?

Construction on the massive international project, which involves and is funded by 11 different countries and 100 organisations, will start in 2018. When complete, it will be able to map the sky in unprecedented detail – 10,000 times faster and 50 times more sensitively than any existing radio telescope – and detect extremely weak extraterrestrial signals, greatly expanding our ability to search for planets capable of supporting life.

The SKA will be co-located in South Africa and Australia, where radio interference is least and views of our galaxy are best. The instrument itself will be made up of thousands of dishes that can operate as one gigantic telescope or multiple smaller telescopes – a phenomenon known as astronomical interferometry, which was developed in Cambridge by Sir Martin Ryle almost 70 years ago.

“The SKA is one of the major big data challenges in science,” explains Professor Paul Alexander, who leads the Science Data Processor (SDP) consortium, which is responsible for designing all of the software and computing for the telescope. In 2013, the University’s High Performance Computing Service unveiled ‘Wilkes’ – one of the world’s greenest supercomputers with the computing power of 4,000 desktop machines running at once, and a key test-bed for the development of the SKA computing platform.

During its projected 50-year lifespan, the SKA will carry out several experiments to study the nature of the universe. Cambridge researchers will focus on two of these, the first of which will follow hydrogen through billions of years of cosmic time.

“Hydrogen is the raw material from which everything in the universe developed,” says Alexander. “Everything we can see in the universe and everything that we’re made from started out in the form of hydrogen and a small amount of helium. What we want to do is to figure out how that happened.”

The second of the two experiments will look at pulsars – spinning neutron stars that emit short, quick pulses of radiation. Since the radiation is emitted at regular intervals, pulsars also turn out to be extremely accurate natural clocks, and can be used to test our understanding of space, time and gravity, as proposed by Einstein in his general theory of relativity.

By tracking a pulsar as it orbits a black hole, the telescope will be able to examine general relativity to its absolute limits. As the pulsar moves around the black hole, the SKA will follow how the clock behaves in the very strong gravitational field.

“General relativity tells us that massive objects like black holes warp the space–time around them, and what we call gravity is the effect of that warp,” says Alexander. “This experiment will enable us to test our theory of gravity with much greater precision than ever before, and perhaps even show that our current theories need to be changed.”

Although the SKA experiments will tell us much more than we currently know about the nature of the universe, they also present a massive computing challenge. At any one time, the amount of data gathered from the telescope will be equivalent to five times the global internet traffic, and the SKA’s software must process that vast stream of data quickly enough to keep up with what the telescope is doing.

Masters of the universe

Professor Paul Alexander, pa@mrao.cam.ac.uk, Department of Physics

Moreover, the software also needs to grow and adapt along with the project. The first phase of the SKA will be just 10% of the telescope’s total area. Each time the number of dishes on the ground doubles, the computing load will be increased by more than the square of that, meaning that the computing power required for the completed telescope will be more than 100 times what is required for phase one.
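The step from ‘each doubling more than squares the load’ to ‘more than 100 times’ can be checked with a line of arithmetic. The sketch below assumes, purely for illustration, that the number of dishes scales with collecting area and that load grows as the square of the dish count, which the text gives only as a lower bound; the function name is hypothetical.

```python
def relative_compute_load(dish_ratio, exponent=2.0):
    """Load multiplier when the dish count grows by dish_ratio.

    exponent=2.0 is the assumed lower bound ("more than the square").
    """
    return dish_ratio ** exponent


# Phase one is 10% of the full array, so the full array has 10 times the dishes.
full_vs_phase_one = 1 / 0.10

doubling = relative_compute_load(2)                    # one doubling of dishes
lower_bound = relative_compute_load(full_vs_phase_one)  # phase one to full array
print(doubling, lower_bound)
```

Under these assumptions a doubling gives at least a fourfold load and the full array at least a hundredfold load, consistent with the figure quoted by the team.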

“You can always solve a problem by throwing more and more money and computing power at it,” says Alexander. “We have to make it work sensibly as a single system that is completely automated and capable of learning over time what the best way of getting rid of bad data is. At the moment, scientists tend to look at data but we can’t do that with the SKA, because the volumes are just too large.”

The challenges faced by the SKA team echo those faced in many different fields, and so Alexander’s group is working closely with industrial partners such as Intel and NVIDIA, as well as with academic and funding partners including the Universities of Manchester and Oxford, and the Science and Technology Facilities Council. The big data solutions developed by the SKA partners to solve the challenges faced by a massive radio telescope can then be applied across a range of industries.

Image: Artist’s impression of the SKA, which will be made up of thousands of dishes that operate as one gigantic telescope
Credit: SKA Organisation

One of these challenges is how to process data efficiently and affordably, and convert it into images of the sky. The target for the first phase of the project is a 300 ‘petaflop’ computer that uses no more than eight megawatts of power: more than 10 times the performance of the world’s current fastest supercomputer, for the same amount of energy. ‘Flops’ (floating point operations per second) are a standard measure of computing performance, and one petaflop is equivalent to a million billion calculations per second.
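To see what that target implies for energy efficiency, the two figures quoted above can be divided directly. This is a back-of-envelope sketch using only the numbers in the text; the variable names are illustrative.

```python
PETA = 10 ** 15  # one petaflop = a million billion operations per second
MEGA = 10 ** 6
GIGA = 10 ** 9

target_flops = 300 * PETA        # 300 petaflops, the phase-one target
power_budget_watts = 8 * MEGA    # eight megawatts, the stated power ceiling

flops_per_watt = target_flops / power_budget_watts
gigaflops_per_watt = flops_per_watt / GIGA
print(gigaflops_per_watt)  # 37.5
```

In other words, the phase-one machine would need to deliver at least 37.5 billion calculations per second for every watt it draws.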

“The investment in the software behind the SKA is as much as €50 million,” adds Alexander. “And if our system isn’t able to grow and adapt, we’d be throwing that investment away, which is the same problem as anyone in this area faces. We want the solutions we’re developing for understanding the most massive objects in the universe to be applied to any number of the big data challenges that society will face in the years to come.”

The Big Dating Game

When is a rare disease not a rare disease? The answer: when big data gets involved. An ambitious new research project aims to show patients that they are not alone.

At some point in their career, every doctor will encounter a patient whose condition perplexes them, requiring detailed investigation and discussion with colleagues before diagnosis is possible. After all, not every disease is as common as cancer, which affects around one in three of us, or depression, which affects one in 10.

Dr Lucy Raymond from the Department of Medical Genetics specialises in rare diseases. Technically, this means diseases that affect fewer than one in 2,000 people, but in fact, Raymond sees children with learning disabilities so rare that they may be the only person in the UK to be affected.

These conditions are usually caused by one of two scenarios: a spontaneous change to their DNA, not inherited, or a ‘recessive disorder’ where two copies of the same, rare variant are necessary for the disease and each parent unwittingly passes on a copy. Comparing the child’s and their parents’ genomes enables the researchers to pinpoint the gene responsible. In extremely rare cases – where the patient appears to be truly unique – the researchers need to study whether the same variant in mice or zebrafish creates a similar condition.

“Or,” Raymond explains, “we might essentially generate a ‘dating agency’ to try to match our patient with a similar case somewhere else in the world.” With these diseases as rare as they are, the only way for this to be viable would be to have access to tens, possibly hundreds, of thousands of potential matches: something the era of ‘big data’ makes possible.
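As a rough picture of how such a ‘dating agency’ might rank candidates, the sketch below scores cases by how many features they share. It is an illustration under loud assumptions rather than a description of any real matching service: each case is reduced to a set of labels, the gene names and case identifiers are invented placeholders, and overlap is measured with a simple Jaccard score (shared features divided by all features).

```python
def jaccard(a, b):
    """Shared features divided by all features across the two cases."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def best_matches(query, registry, top_n=3):
    """Rank registry cases by similarity to the query, dropping zero scores."""
    scored = [(jaccard(query, features), case_id)
              for case_id, features in registry.items()]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [(case_id, score) for score, case_id in scored[:top_n] if score > 0]


# Entirely hypothetical registry: placeholder gene names and clinical features.
registry = {
    "case-A": {"GENE-X", "learning disability", "seizures"},
    "case-B": {"GENE-Y", "short stature"},
    "case-C": {"GENE-X", "learning disability", "hearing loss"},
}
query = {"GENE-X", "learning disability", "seizures"}

matches = best_matches(query, registry)
print(matches)  # [('case-A', 1.0), ('case-C', 0.5)]
```

With only three cases the ranking is trivial; the approach only becomes useful when the registry holds the tens or hundreds of thousands of entries the researchers describe.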

But this presents a potential problem: how to share information about the patient without breaking their confidentiality. Unlike in the USA, where projects such as the Broad Institute’s Exome Aggregation Consortium (ExAC) place genome data in the public domain, data in the UK is deposited in a ‘managed-access’ database: bona fide researchers with a clear research proposal are allowed access, and only then after signing a commitment saying they will not attempt to identify individual patients.

“We have to remember that big data is great, but it isn’t our data: it’s people’s data and we need to be respectful of this. People in the UK are often altruistic; we have free blood donation, we have a tremendous tradition of patients giving to help others. We must not jeopardise this relationship.

“Parents know that even if finding the gene abnormality that is responsible will not immediately help their child, it may help ensure that others don’t have to wait 20 years before their child receives a diagnosis. They’re happy to share the data on that basis, but are less keen on the idea that they’ll lose control of the information.”

For several years, Raymond, Professor Willem Ouwehand and Dr John Bradley have been leading the National Institute for Health Research BioResource for Rare Diseases in Cambridge, which has recruited some 5,800 patients. They are now part of a major initiative launched by Prime Minister David Cameron: the 100,000 Genomes Project. Cambridge University Hospitals NHS Foundation Trust will lead the East of England Genomic Medicine Centre, one of 11 centres across the UK aimed at realising this project and sequencing the genomes of patients affected by cancer or rare diseases.

“The 100,000 Genomes Project is about going forward to having a truly national health service, not a provincial, regional health service,” explains Raymond. “The data will be central, will be national, will be available to researchers and healthcare professionals across the country.”

The sheer number of people recruited will create a powerful dataset and ensure that clinicians and researchers don’t have to start from scratch each time they encounter a new case. In fact, the value of a patient’s genome extends beyond just helping identify the cause of their disease: it’s also important as a ‘control’ to compare against and help find the cause of another patient’s disease. “It’s a form of ‘enforced altruism’. Having all the data stored in a central place means that everybody’s data acts as a control for everybody else’s. It has a multiplying effect.”

Big data also reveals an otherwise glaringly obvious fact that the name ‘rare diseases’ obscures: one in 2,000, even in a population of 64 million, is not an insignificant number of people. “Ten years ago people used to ask ‘Why study rare diseases when they’re so rare?’ It’s only recently that people are coming round to see that, with big data, rare is common.

“Rare diseases are becoming increasingly tractable, too, so now there’s a huge interest in them, which is good: it’s not your fault if your disease is rare. Solving these problems is the next big challenge,” says Raymond with a glint in her eye. “If it was all easy, we wouldn’t be doing it – in typical Cambridge style.”

Left to right: Dr Lucy Raymond (flr24@cam.ac.uk), Department of Medical Genetics; Dr Lydia Drumright (lnd23@cam.ac.uk), Department of Medicine

Trust me, I’m an e-doctor

Big data ‘dating agencies’ are not just for people with rare conditions. A similar concept could help patients with far more common conditions receive the best possible hospital treatment.

Addenbrooke’s Hospital in Cambridge is one of the first ‘eHospitals’ in England, explains Dr Lydia Drumright from the Department of Medicine. Everything that happens to you within the hospital – every test result, every diagnosis, every drug prescribed – is captured in an electronic record. Drumright and her colleague Dr Afzal Chaudhry believe that the wealth of information in these records can be used to better inform the treatments of individuals.

“Around 10–20% of our patients may have diabetes or acute kidney injury, but that’s not necessarily why they’re here,” explains Drumright. “They might have had a heart attack, so they’re being cared for by the cardiology team, but the drugs they’re prescribed might have an impact on their other conditions. Added to that, they’re now more susceptible to infection.

“It’s the junior doctors that have to look after the patients and do the basic prescribing. They’re still learning, but need to know which drugs work best and the hospital’s policy for prescribing antibiotics.”

Could a patient ‘dating agency’ not dissimilar to that suggested by Raymond, based on everyone’s medical records, help these junior doctors? “The doctor can search for other patients that look like their own. They can go back historically and see what drugs were prescribed and what their outcomes looked like.”

Drumright is mindful of setting up a system that tells doctors what to prescribe; the literature about how we interface with technology suggests that people can too easily surrender their responsibility. Instead, it’s about building on collective knowledge, “What we’re trying to do is enhance the doctor’s experience so that it’s not ‘my experience as me’, it’s the experience of every prescriber in the hospital.”
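The idea of searching for past patients who ‘look like’ a current one can be illustrated with a toy example. This is only a sketch under stated assumptions: the article does not describe how the hospital’s system would measure similarity, so the record fields, the Jaccard overlap of condition sets and all names below (`PatientRecord`, `similar_patients`, the drug labels) are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PatientRecord:
    """Hypothetical, heavily simplified electronic record."""
    patient_id: str
    conditions: frozenset
    drug: str
    outcome: str


def jaccard(a: frozenset, b: frozenset) -> float:
    """Overlap between two condition sets: 0.0 (disjoint) to 1.0 (identical)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def similar_patients(query_conditions, records, top_n=2):
    """Return the top_n historical records most similar to the query."""
    query = frozenset(query_conditions)
    ranked = sorted(records, key=lambda r: jaccard(query, r.conditions), reverse=True)
    return ranked[:top_n]


# Made-up records for illustration only.
records = [
    PatientRecord("A", frozenset({"heart attack", "diabetes"}), "drug-1", "recovered"),
    PatientRecord("B", frozenset({"heart attack"}), "drug-2", "readmitted"),
    PatientRecord("C", frozenset({"acute kidney injury"}), "drug-3", "recovered"),
]

matches = similar_patients({"heart attack", "diabetes"}, records)
for m in matches:
    print(m.patient_id, m.drug, m.outcome)
```

In keeping with Drumright’s caution, the sketch only surfaces what was prescribed and what happened; it does not recommend a drug.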

Mining for corruption

Researchers have developed a new technique that trawls the enormous amounts of public procurement data now available across the EU to highlight unscrupulous uses of public funds: from national and regional levels to individual contracts, companies and politicians.

The American economist Alan Greenspan once described corruption as “the way human nature functions”, it’s just that successful economies manage to keep it to a minimum. The question, of course, is how.

In the digital age, with its ‘freedom of information’, corrupt uses of public finance for political and corporate cronyism should have fewer dark corners to hide in.

Since the late 2000s, virtually all developed countries digitised and made available public procurement data. However, this data deluge can create the illusion of transparency, with a fog of information so vast as to seem impenetrable.

Previously, exposing corruption often relied on the diligence of journalists and campaigners to sift through data and make connections. Such investigations require time and luck, and can be biased.

But now a team of data-driven sociologists have created a new measurement system for detecting exploitation of public finance, designed to take advantage of the new data avalanche. It’s a system that is likely to rattle those profiting corruptly at the public’s expense (and give activists good cause to salivate).

The team defined key ‘red flags’: contractual situations that suggest high risks of corrupt behaviour. By unleashing ‘creeper’ algorithms and sophisticated text-mining programs on public procurement data to sniff these flags out, the team can map levels of corruption risk at regional and national scale, track corrupt behaviour in tendering organisations, and pinpoint suppliers and even individual contracts that look fishy.

The Corruption Risk Index (CRI) mines available information about expenditure of public finances for political collusion, competition rigging and crony capitalism, all with unrivalled speed and accuracy. Developed by Dr Mihály Fazekas and Professor Lawrence King from the Department of Sociology, it forms the basis of the Digital Whistleblower, or ‘DigiWhist’, led by Cambridge with a consortium of European institutes, and which has just secured €3 million of European Union (EU) Horizon 2020 funding.

“Corruption is probably the number one complaint about people in power, but there were no really objective ways to measure corruption,” explains King.

“Using our methodology, institutionalised corruption can be measured right down to the level of individual contracts and tenders in about 50 countries around the globe since 2008 to 2009 – opening up a whole universe of scientific and policy applications. We aim to make CRI available to citizens, civil society groups and journalists, to hold politicians and political parties accountable for corrupt behaviour.”

The project began when Fazekas had a brainwave while working on his PhD with King. In many developed nations since 2007, whenever the government purchased something over around €20,000 (or equivalent), the contract and tender data were made digitally available. In many countries, this is around 7% of the GDP – a big chunk of the economy.

Fazekas spoke to experts on public procurement to uncover the box of tricks often employed to fleece the public purse. Cannily, he also talked to companies who had fallen out of favour since their country’s government changed, “so they were happy to tell me how it was back in the day”. This work eventually led to the CRI’s 13 ‘red flags’ of corruption.

For example: very short tender periods (“if a tender is issued on a Friday and awarded on a Monday – red flag”); very specific or suspiciously complex tenders compared with the field (“like writing a job description for a role you want your friend to get”); tender modifications leading to bigger contracts; inaccessible tender documents; very few bidders in highly competitive markets. Different scales and combinations of flags allow researchers to create the risk rankings of the CRI.
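To make the ‘red flag’ idea concrete, here is a minimal sketch of how a handful of the flags named above might be checked for one contract and combined into a score. It is not the CRI itself: the article does not give the 13 flags’ definitions, thresholds or weights, so the `Tender` fields, the seven-day and two-bidder thresholds and the unweighted average are all assumptions made for illustration.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Tender:
    """Hypothetical, simplified procurement record."""
    issued: date
    awarded: date
    bidders: int
    estimated_value: float
    final_value: float
    documents_public: bool


def red_flags(t: Tender, min_days: int = 7, min_bidders: int = 2) -> dict:
    """Check four illustrative flags; thresholds are assumptions, not CRI values."""
    return {
        "short_tender_period": (t.awarded - t.issued).days < min_days,
        "few_bidders": t.bidders < min_bidders,
        "value_increased": t.final_value > t.estimated_value,
        "inaccessible_documents": not t.documents_public,
    }


def risk_score(t: Tender) -> float:
    """Share of flags raised, from 0.0 (none) to 1.0 (all)."""
    flags = red_flags(t)
    return sum(flags.values()) / len(flags)


# Issued on a Friday, awarded the following Monday, one bidder, price inflated.
friday_to_monday = Tender(date(2015, 6, 5), date(2015, 6, 8), 1, 100_000, 150_000, False)
# A month-long, well-contested tender that came in on estimate.
ordinary = Tender(date(2015, 6, 1), date(2015, 7, 1), 5, 100_000, 100_000, True)

print(risk_score(friday_to_monday))  # 1.0
print(risk_score(ordinary))          # 0.0
```

Scores like these could then be averaged per supplier, organisation or region to produce rankings at different levels, which is the kind of aggregation the article describes.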

Using an initial EU grant, the team conducted a proof of principle with data from Hungary, Slovakia and the Czech Republic. They found that firms with a higher CRI score made more money: the final contract value frequently came in much higher than the original estimate. These companies are also more likely to have politicians involved – either managing or owning them – and be registered in tax havens.

Over the next three years, the team aims to do this for procurement data across 34 European countries and the EU institutions, creating a corruption ranking that ranges from national to contract level. “Previous corruption indicators tended to be very blunt instruments. We can analyse regions and sectors but also individual organisations and loan officers. It’s an enormously powerful and fine-grained tool,” adds King.

The DigiWhist project will encompass four different data labs across Europe to collect and ‘clean’ data, and build databases. While their current mechanism has manual elements, the next version – developed by Dr Eiko Yoneki’s team in Cambridge’s Computer Laboratory – will have self-learning algorithms that recognise errors and link to existing solutions from the database. “After an initial teaching phase, it will kind of run on its own,” says Fazekas.

All their findings will be made publicly available, with downloadable databases that can be interrogated by academics, journalists and, indeed, anyone with an interest in what happens to public money and in holding businesses and political parties accountable for corrupt behaviour.

Fazekas believes their results could be married with public crowdsourcing to build a more complete picture of the consequences of siphoning public funds. “Imagine a mobile app containing local CRI data, and a street that’s in bad need of repair. You can find out when public funds were allocated, who to, how the contract was awarded, how the company ranks for corruption. Then you can take a photo of the damaged street and add it to the database, tagging contracts and companies,” says Fazekas, who is already working with DigiWhist advisors on prototypes.

“The idea that the public are going to be able to interrogate this data on a very localised basis and contribute to it themselves through things like smartphone apps is a compelling one!” Fazekas adds.

For King, health will be a big focus. “One of the big debates is around deregulation and privatisation of health, and whether it increases efficiency. But does it increase corruption?

“There’s been a lot of talk of big data for a while now but not much has come out of it… By having researchers like Mihály, who straddle both tech and social science, I think we’ll start to see the potential for big data to turn into important findings that really do make the world better,” says King.

Left to right: Professor Lawrence King (lk285@cam.ac.uk), Dr Mihály Fazekas (mf436@cam.ac.uk), Department of Sociology

Computer tutor

Millions of English language tests are taken each year by non-native English speakers. Researchers at Cambridge’s ALTA Institute are building ‘computer tutors’ to help learners prepare for the exam that could change their lives.

“We arrived to our destination and we looked each other.”

To a native English speaker, the mistakes in this sentence are clear. But someone learning English would need a teacher to point them out, explain the correct use of prepositions and check later that they have improved. All of which takes time.

Now imagine the learner was able to submit a few paragraphs of text online and, in a matter of seconds, receive an accurate grade, sentence-by-sentence feedback on its linguistic quality and useful suggestions for improvement.

This is Cambridge English Write & Improve – an online learning system, or ‘computer tutor’, to help English language learners – and it’s built on information from almost 65 million words gathered over a 20-year period from tests taken by real exam candidates speaking 148 different languages living in 217 different countries or territories.

Built by Professor Ted Briscoe’s team in Cambridge’s Computer Laboratory, it’s an example of a new kind of tool that uses natural language processing and machine learning to assess and give guidance on text it has never seen before, and to do this indistinguishably from a human examiner.

“About a billion people worldwide are studying English as a further language, with a projected peak in 2050 of about two billion,” says Briscoe. “There are 300 million people actively preparing for English exams at any one time. All of them will need multiple tests during this learning process.”

Language testing affects the lives of millions of people every year; a successful test result could open the door to jobs, further education and even countries.

But marking tests and giving individual feedback is one of the most time-consuming tasks that a teacher can face. Automating the process makes sense, says Dr Nick Saville, Director of Research and Validation at Cambridge Assessment.

“Humans are good teachers because they show understanding of people’s problems, but machines are good at dealing with routine things and large amounts of data, seeing patterns, and giving feedback that the teacher or the learner can use. These tools can free up the teacher’s time to focus on actual teaching.”

Cambridge Assessment, a not-for-profit part of the University, produces and marks English language tests taken by over five million people each year. Two years ago, they teamed up with Briscoe’s team and Professor Mark Gales in the Department of Engineering and Dr Paula Buttery in the Department of Theoretical and Applied Linguistics to launch the Automated Language Teaching and Assessment (ALTA) Institute, directed by Briscoe. Their aim is to create tools to support learners of both written and spoken English.

Underpinning Write & Improve is information gleaned from a vast dataset of quality-scored text – the Cambridge Learner Corpus. Built by Cambridge University Press and Cambridge Assessment, this is the world’s largest collection of exam papers taken by English language learners around the world.

Each test has been transcribed and information gathered about the learner’s age, language and grade achieved. Crucially, all errors (grammar, spelling, misuse, word sequences, and so on) have been annotated so that a computer can process the natural language used by the learner.

Write & Improve works by supervised machine learning – having learnt from the Corpus of errors, it can make inferences about new unannotated data. Since its launch as a beta version in March 2014, the program has attracted over 20,000 repeat users. And each new piece of text it receives continues this process of learning and improving its accuracy, which is already running at almost equal to the most experienced human markers.

Briscoe believes that this sort of technology has the potential to change the landscape of teaching and assessment practices: “Textbooks are rapidly morphing into courseware where people can test their understanding as they go along. This fits with pedagogical frameworks in which the emphasis is on individual profiling of students and giving them tailored advice on what they can most usefully move on to next.”

He regards the set-up of ALTA as the “best type” of technology transfer: “We do applied research and have a pipeline for transferring this to products. But that pipeline also produces data that feeds back into research.”

The complex algorithms that underpin Write & Improve are being further developed and customised by iLexIR, a company Briscoe and others set up to convert university research into practical applications; and a new company, English Language iTutoring, has been created to deliver Write & Improve and similar web-based products via the cloud and to capture the data that will feed back into the R&D effort to improve the tutoring products.

Now, the researchers are looking beyond text to speech. Assessing spoken English brings a set of very different challenges to assessing written English. The technology needs to be able to cope with the complexities of the human voice: the rhythm, stress and intonation of speech, the uhms and ahhs, the pauses.

“The fact that you can get speech recognition on your phone tends to imply in some people’s minds that speech recognition is solved,” says Gales, Professor of Information Engineering. “But the technology still struggles with second language speech. We need to be able to assess the richness in people’s spoken responses, including whether it’s the correct expression of emotion or the development of an argument.” Gales is developing new forms of machine learning, again using databases of examples of spoken English.

“The data-driven approach is the only way to create tools like these,” adds Briscoe. “Building automated tests that use multiple choice is easy. The stuff we are doing is messy, and it’s ever-changing. We’ve shown that if you train a system to this year’s exam on data from 10 years ago the system is less accurate than if you train it on data from last year.”

This is why, says Briscoe, it’s unimaginable to reach a point where the machines have learned enough to understand and predict almost all of the typical mistakes learners make: “Language is a moving target. English is constantly being globalised; vocabulary changes; grammar evolves; and methods of assessment change as progress in pedagogy happens. I don’t think there will ever be a point when we can say ‘we are done now’.”

Professor Ted Briscoe (ejb@cl.cam.ac.uk), Computer Laboratory

Researchers are using social media data to build a picture of the personalities of millions, changing core ideas of how psychological profiling works. They say it could revolutionise employment and commerce, but the work must be done transparently.

In 2007, Dr David Stillwell built an application for an online networking site that was starting to explode: Facebook. His app, myPersonality, allowed users to complete a range of psychometric tests, get feedback on their scores and share it with friends. It went viral.

By 2012, more than six million people had completed the test, with many users allowing researchers access to their profile data. This huge database of psychological scores and social media information, including status updates, friendship networks and ‘Likes’, is the largest of its kind in existence. It contains the moods, musings and characteristics of millions – a holy grail of psychological data unthinkable until a few years ago.

Stillwell and colleagues at Cambridge’s Psychometrics Centre provided open access to the database for other academics. Academic researchers from over 100 institutions globally now use it, producing 39 journal articles since 2011.

Meanwhile, the Cambridge Psychometrics team devised their own complex algorithms to read patterns in the data. Resulting publications caused media scrums, with a paper published in early 2015 generating nervous headlines around the world about computers knowing your personality better than your parents.

But how surprising is this really, given the amount we casually share about ourselves online every day? And not just through social media, but also through web browsing, internet purchases, and so on. Every interaction creates a trace, which all add up to a ‘digital footprint’ of who we are, what we do and how we feel.

We know that, behind closed doors, corporations and governments use this data to ‘target’ us – our online actions mark us out as future customers, or even possible terrorists – and, for many, this reduction in privacy is a disturbing fact of 21st-century life.

The Cambridge researchers believe that the new era of psychological ‘big data’ can be used to improve commercial and government services as well as furthering scientific research, but openness is essential.

“If you ask a company to make their data available for research, usually it will go to some corporate responsibility office which deems it too risky – there’s nothing in it for them. Whereas if you tell them you can improve their business, but as part of that they make some data available to the research community, you find a lot more open doors,” says Stillwell, who co-directs the Centre.

Around half of the Centre’s current work involves commercial companies, who come to them for “statistical expertise combined with psychological understanding” – often in an attempt to improve online marketing, an area still in its infancy.

The team has recently launched an interface called Apply Magic Sauce, based on the myPersonality results, which can be used as a marketing and research tool that turns digital ‘footprints’ into psycho-demographic profiles.

“If you use the internet you will be targeted by advertisers, but at the moment that targeting happens in the shadows and isn’t particularly accurate,” says Vesselin Popov, the Centre’s development strategist.

“We all have to suffer advertising, so perhaps it’s better to be recommended products that we might actually want? Using opt-in anonymous personality profiling based on digital records such as Facebook Likes or Last.fm scores could vastly improve targeted advertising and allow users to set the level of data-sharing they are comfortable with,” says Popov. “This data could then, with the permission of users, be used to enrich scientific research databases.”

Measuring psychological traits has long been difficult for researchers and boring for participants, usually involving laborious questionnaires. This will sound familiar to anyone who has used an employment agency or job centre. The team are now building on their previous work with algorithms to take psychometric testing even further into uncharted territory – video games. Job centres might be the first to benefit.

“A job centre gets about seven minutes with each job seeker every two weeks, so providing personalised support in that time is challenging,” explains Stillwell. “We are working with a company to build a game that measures a person’s strengths in a ‘gamified’ way that’s engaging but still accurate.”

In ‘JobCity’, currently an iPad proof of concept, users explore job opportunities in a simulated city. The game measures psychological strengths and weaknesses along the way, offering career suggestions at the end, and providing the job centre with feedback to help them guide the applicant. The team has tested the game with a group of under-25s and the results are promising.

For the Centre’s Director Professor John Rust, the team’s background in psychology means they don’t lose sight of the people within the oceans of data: “We’re dealing with organisations that are using ‘big data’ to make actuarial decisions about who gets lent money, who gets a job – you don’t want this left solely to computer engineers who just see statistics.”

“We want machines that can recognise you as a person. Much of the information for doing that already exists in the servers of Google, Facebook, Amazon, and so on. Your searches and statuses are all reflections of questions, experiences and emotions you have: all psychometric data. It’s the basis for a future where computers can truly interact with human beings.”

Cyberspace has, for Rust, opened a ‘Pandora’s box’ that’s taken psychological testing to a new level. But, he says, the current explosion in big data bears comparison to a previous shift that happened a century ago – the advent of IQ tests shortly before the First World War. Millions of servicemen were tested to determine role allocation within the military. Suddenly, says Rust, overexcited scientists had massive psychological datasets. IQ tests influenced societies long after the war, leading, he says, to some of the most shameful episodes of the 20th century, including scientific racism and sterilisation of the ‘feebleminded’.

“Today you have another psychological big data situation being used to challenge a perceived global threat: terrorism. Government data scientists hunting would-be terrorists are enthusiastically adopting big data, but there will be social consequences again. In many ways, we already have Big Brother – whatever that now means,” Rust says.

“The new psychological data revolution needs serious research, and ethical debates about it need to be happening in the public arena – and they’re not. We have a responsibility to say to people working on this in secret in companies and institutions: ‘You’ve got to come and discuss this in an open place’. It’s what universities are for.”

Left to right: Dr David Stillwell (ds617@cam.ac.uk), Professor John Rust (jnr24@cam.ac.uk), Vesselin Popov (vp288@cam.ac.uk), The Psychometrics Centre, Department of Psychology

I always feel like somebody’s watching me…

What power can individuals have over their data when their every move online is being tracked? Researchers at the Cambridge Computer Laboratory are building new systems that shift the power back to individual users, and could make personal data faster to access and at much lower cost.

It’s a fact of modern life – with every click, every tweet, every Facebook Like, we hand over information about ourselves to organisations who are desperate to know all of our secrets, in the hope that those secrets can be used to sell us something.

Companies have been collecting every possible scrap of information from their customers since long before the internet age, but with more powerful computers, cheaper storage and ubiquitous online use, the methods organisations use to gather information about people have become ever-more sophisticated. And sometimes those organisations know us better than our own families or friends.

For example, several years ago, data analysis tools used by the US retailer Target had become so precise that they were able to determine, with an astonishing accuracy, whether a woman was pregnant and how far along she was, based on her purchase of certain products. And in one particularly embarrassing incident, Target knew that a teenage girl was pregnant before her father did, much to her father’s displeasure.

“What Target learned from that incident is that marketing too accurately can really make people squeamish,” says Professor Jon Crowcroft of the University’s Computer Laboratory. “But if they made their marketing a little less accurate by increasing the amount of privacy they give their customers, they found they can still retain or increase their customer base without making people feel as if they’re being spied on.”

Crowcroft’s research is in the area of ‘privacy by design’ – systems that allow us to live in the digital world and protect our privacy at the same time. As the concept of the Internet of Things – internet-connected washing machines, toasters and televisions – becomes reality, Crowcroft insists that privacy by design is needed to address the massive power imbalance that occurs when our personal data is shared with, and sold by, corporations, governments and other organisations.

But privacy by design doesn’t mean disconnecting from the online world and putting on a tinfoil hat – far from it. “There’s already a lot of data stored about each and every one of us – the things we buy, the food we eat, the health issues we have – and for each of these market segments, there are perfectly legitimate uses for that data,” adds Crowcroft. “Collecting healthcare data is fantastically useful for tracking pandemics, preventative care, more-efficient treatment, public health – those are all perfectly reasonable and positive uses for big data. At the same time, most sites gather information in order to target ads more accurately, and most people are actually okay with that. So the question then becomes, what is privacy by design?”

“What we’re trying to do is develop processing frameworks that would allow this data to be useful and to be used, without the somewhat creepy feeling that you’re constantly being watched,” says Crowcroft’s colleague Dr Richard Mortier.

The type of system that Crowcroft and Mortier envision is one in which the user has the scope to allow access to their data on a case-by-case basis, rather than it be harvested whether they like it or not: computations are performed where the data is gathered, and the results are pushed back to the organisation that wants the data.

“We can change the big data problem completely by moving where the data is processed,” explains Mortier. “Rather than having systems where all of the data is gathered in some huge central location and processed, if you reconstruct the system so that the data is processed in the same place it’s gathered, individuals would be able to take some of the control of their information back from corporations and surveillance organisations. Instead of one huge central processing node, we want to see billions of smaller nodes, which would make information quicker to access, and could potentially be stored at lower overall cost.”

Crowcroft and Mortier have designed and partially built systems where a person’s data stays local to them, and they can have the option to decide what is shared and with whom. For example, a patient can share their healthcare data with their GP, but the GP would have to get authorisation from the patient before sharing that data with a pharmaceutical company.

“People realise they’re being marketed to, but I don’t think they realise the scale of it – it really is a hidden menace,” says Crowcroft. “The point is that we could build systems that could stop that completely, and re-enable it on the basis of a level playing field. We want to see systems where people have agency over their data, giving them the ability to allow or prevent certain types of access.”

Contrary to what some people may assume about the nature of digital life, adds Crowcroft, the vast majority of people highly value their own privacy. He points to the launch and then recall of Google Glass, a wearable computer worn like eyeglasses. “People started wearing these things into restaurants and other diners wouldn’t put up with it, because they didn’t want to be recorded while eating their lunch – it really creeped people out,” he says. “And that’s in a public space: imagine the same sort of thing happening in a private space. It’s about the asymmetry and the idea that this is being done to you and you have no comeback. The problem with digital infrastructures is you don’t see them, and to a certain extent companies depend on people not understanding them – we can build systems where there are mechanisms through which they can be understood.”

Crowcroft and Mortier recognise that they’ll never convince everyone to ditch cloud computing and switch to a decentralised system. But that isn’t their goal. “It takes a while to show that new ways of doing things can really work,” says Crowcroft. “If these sorts of systems become a reasonably widely used alternative, it will go a long way towards keeping companies and cloud storage providers honest. The very small number of providers leads to the exploitation of the network effect, where they have a strong monopolistic position over a certain type of data. And monopolies are not good for economies. If a decentralised system is more ethical, enough people using it may incentivise the big providers to be more ethical too.”
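The approach Crowcroft and Mortier describe, in which computations run where the data is gathered and only results go to parties the individual has authorised, can be sketched in a few lines. This is a toy illustration rather than their system: the class `PersonalDataStore`, its methods and the example readings are all hypothetical, and a real design would need authentication, auditing and networking that are omitted here.

```python
from dataclasses import dataclass, field
from statistics import mean


@dataclass
class PersonalDataStore:
    """Hypothetical store that keeps raw data local to its owner."""
    readings: list
    allowed: set = field(default_factory=set)

    def grant(self, requester: str) -> None:
        """The owner authorises a requester, case by case."""
        self.allowed.add(requester)

    def revoke(self, requester: str) -> None:
        self.allowed.discard(requester)

    def run(self, requester: str, computation):
        """Run a computation locally; only its result leaves the store."""
        if requester not in self.allowed:
            raise PermissionError(f"{requester} is not authorised by the owner")
        return computation(self.readings)


# Made-up heart-rate readings held on the patient's own device.
store = PersonalDataStore(readings=[70, 72, 74])
store.grant("gp")

print(store.run("gp", mean))  # 72

try:
    store.run("pharma", mean)  # never authorised by the patient
except PermissionError as err:
    print(err)
```

The requester receives the average but never the individual readings, and a party without the patient’s authorisation gets nothing, which mirrors the GP and pharmaceutical company example in the article.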

Left to right: Professor Jon Crowcroft (Jon.Crowcroft@cl.cam.ac.uk), Dr Richard Mortier (richard.mortier@cl.cam.ac.uk), Computer Laboratory

Inside out

Tragedy in Nepal

Following the recent devastation in Nepal, Evan Miles reflects on Himalayan glaciology and the natural hazards faced by those living in an area to which he was about to return for fieldwork.

April 25, 2015. I awoke in Cambridge keen to finish preparations and packing for my fifth season of fieldwork in the Langtang Valley of Nepal. I was due to fly out the following day. I checked my email for last-minute updates, and was jerked out of any sense of routine by the title ‘Big earthquake just hit Nepal’.

A barrage of Twitter, Facebook, news and media searches, Skype calls and inquiries lasted for the rest of the day as my colleagues and I began to realise the scale of the event. At first we were wondering if we could get to our field site but, in a Skype conference that evening, it became clear that our scientific work was out of the question. Our priorities shifted from scientific to humanitarian – trying to contact our friends and colleagues, and gather and share the scant information from beyond Kathmandu.

Geoscientists had highlighted the likelihood of an 8.0-magnitude earthquake for decades, and reports had specifically assessed the human and financial consequences of such a tremor. But Nepal is a very poor country – it struggles to pave roads connecting its widespread mountainous terrain and has no real means for emergency preparedness in the modern sense of the word. Earlier this year, a Turkish Airlines plane crash-landed in Kathmandu, and it was several days before the country’s single international runway could be reopened to commercial flights.

As scientists, we have to acknowledge the risks of any study site, and had not overlooked the potential for an earthquake, or the severe avalanches and landslides that occur frequently in the mountains.

However, the Nepalese Himalaya also present a knowledge gap within glaciology: when the Intergovernmental Panel on Climate Change published its 4th Assessment Report in 2007, the community of glacier scientists had little understanding of Himalayan glaciers owing to remoteness, weather and altitude. It was therefore a priority for further research and the motivation for my PhD, working with Drs Ian Willis and Neil Arnold at the Scott Polar Research Institute, and collaborating with researchers in Switzerland, the Netherlands and Nepal.

The focus of our studies has been the Lirung Glacier in the upper Langtang Valley north of Kathmandu. It’s a good example of the debris-covered glaciers that are common in High Mountain Asia, where avalanches and rockfalls deposit large amounts of rubble onto the ice.

The rubble accumulates at the surface as the ice melts, forming a blanket comprising sand, gravel, cobbles and boulders, which substantially alters how the glacier interacts with the atmosphere. This type of glacier is much less understood than its clean-surface counterparts.

Our studies on Lirung Glacier are aimed at resolving a conundrum about debris-covered glaciers that can help us understand how these glaciers will respond to climate change.

The thick debris should reduce the melt of glacier ice, but many debris-covered glaciers seem to be melting much faster than expected – nearly on par with clean-ice glaciers. We think this is due to the presence of exceptional surface features – bare ice-cliffs and small lakes on top of the glacier – that, although covering only a fraction of the glacier’s surface, appear to melt much faster than the surface under the debris layer.

After years of detailed observations, our team is developing numerical models of ice-cliffs and ponds, which do look to be partly responsible for the high rates of ice loss from High Mountain Asia’s glaciers. This is an important step to understanding the region’s response to climate change, as they are not yet accounted for in projections of glacier melt in a future climate.

Now, though, science has to take a back seat. It’s unclear how or when our observations will continue, but the earthquake is not simply another obstacle for our research. During four field seasons, we’ve built relationships with Nepali villagers along the Langtang Valley and scientists in Kathmandu. The destruction in Kathmandu is terrible – large numbers of casualties, World Heritage Sites destroyed – but reports suggest outlying villages have fared even worse, with few buildings having withstood the tremors, and devastating avalanches and landslides widespread. The Langtang Valley is no exception, as several villages appear to have been wiped out entirely by landslides burying hundreds of villagers and a long Tibetan heritage.

We wonder about our instruments, but we are much more concerned about the villagers we’ve got to know. For the present, we are trying to map landslides, to prioritise for immediate rescue operations and then eventual rebuilding.

Evan Miles (esm40@cam.ac.uk), Scott Polar Research Institute, Department of Geography

Evan is funded by a Gates Cambridge PhD Studentship, the University’s Fieldwork Fund, Trinity College’s Rouse Ball and Eddington Fund, the Department of Geography’s Philip Lake and William Vaughan Lewis Fund, and the Scott Polar Research Institute B.B. Roberts Fund

T +44(0)1223765443 E research.horizons@admin.cam.ac.uk W cam.ac.uk/research f facebook.com/cambridge.university twitter.com/cambridge_uni youtube.com/cambridgeuniversity instagram.com/cambridgeuniversity

Contact: Research Horizons, Office of External Affairs and Communications, The Pitt Building, Trumpington Street, Cambridge, CB2 1RP

Cover: Big data ‘dating agencies’ are being used to match patients who have rare diseases worldwide to help clinicians diagnose and treat them; find out more on p.24 this issue.