Algorithms in C++ Part 5: Graph Algorithms

Algorithms in C++, Third Edition, Part 5: Graph Algorithms

Robert Sedgewick, Princeton University

Boston • San Francisco • New York • Toronto • Montreal • London • Munich • Paris • Madrid • Capetown • Sydney • Tokyo • Singapore • Mexico City

Many of the designations used by manufacturers and sellers to distinguish their products are claimed as trademarks. Where those designations appear in this book and Addison-Wesley was aware of a trademark claim, the designations have been printed in initial capital letters or all capitals.

The author and publisher have taken care in the preparation of this book, but make no expressed or implied warranty of any kind and assume no responsibility for errors or omissions. No liability is assumed for incidental or consequential damages in connection with or arising out of the use of the information or programs contained herein.

The publisher offers discounts on this book when ordered in quantity for special sales. For more information, please contact: Pearson Education Corporate Sales Division, 201 W. 103rd Street, Indianapolis, IN 46290, (800) 428-5331, [email protected]/cseng/.

Library of Congress Cataloging-in-Publication Data

Sedgewick, Robert, 1946–
Algorithms in C++ / Robert Sedgewick. -- 3d ed.
p. cm.
Includes bibliographical references and index.
Contents: v. 2, pt. 5. Graph algorithms
1. C++ (Computer program language) 2. Computer algorithms. I. Title.
QA76.73.C15S38 2002
005.13’3--dc20 92-901
CIP

Copyright © 2002 by Pearson Education, Inc.

All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, electronic, mechanical, photocopying, recording, or otherwise, without the prior written permission of the publisher. Printed in the United States of America. Published simultaneously in Canada.

ISBN 0-201-36118-3

Text printed on recycled paper

78910–DOH–070605

Seventh printing, February 2006


http://www.awl.com/cseng/

Preface

GRAPHS AND GRAPH algorithms are pervasive in modern computing applications. This book describes the most important known methods for solving the graph-processing problems that arise in practice. Its primary aim is to make these methods and the basic principles behind them accessible to the growing number of people in need of knowing them. The material is developed from first principles, starting with basic information and working through classical methods up through modern techniques that are still under development. Carefully chosen examples, detailed figures, and complete implementations supplement thorough descriptions of algorithms and applications.

Algorithms

This book is the second of three volumes that are intended to survey the most important computer algorithms in use today. The first volume (Parts 1–4) covers fundamental concepts (Part 1), data structures (Part 2), sorting algorithms (Part 3), and searching algorithms (Part 4); this volume (Part 5) covers graphs and graph algorithms; and the (yet to be published) third volume (Parts 6–8) covers strings (Part 6), computational geometry (Part 7), and advanced algorithms and applications (Part 8).

The books are useful as texts early in the computer science curriculum, after students have acquired basic programming skills and familiarity with computer systems, but before they have taken specialized courses in advanced areas of computer science or computer applications. The books also are useful for self-study or as a reference for people engaged in the development of computer systems or applications programs because they contain implementations of useful algorithms and detailed information on these algorithms’ performance characteristics. The broad perspective taken makes the series an appropriate introduction to the field.

Together the three volumes comprise the Third Edition of a book that has been widely used by students and programmers around the world for many years. I have completely rewritten the text for this edition, and I have added thousands of new exercises, hundreds of new figures, dozens of new programs, and detailed commentary on all the figures and programs. This new material provides both coverage of new topics and fuller explanations of many of the classic algorithms. A new emphasis on abstract data types throughout the books makes the programs more broadly useful and relevant in modern object-oriented programming environments. People who have read previous editions will find a wealth of new information throughout; all readers will find a wealth of pedagogical material that provides effective access to essential concepts.

These books are not just for programmers and computer science students. Everyone who uses a computer wants it to run faster or to solve larger problems. The algorithms that we consider represent a body of knowledge developed during the last 50 years that is the basis for the efficient use of the computer for a broad variety of applications. From N-body simulation problems in physics to genetic-sequencing problems in molecular biology, the basic methods described here have become essential in scientific research; and from database systems to Internet search engines, they have become essential parts of modern software systems. As the scope of computer applications becomes more widespread, so grows the impact of basic algorithms, particularly the fundamental graph algorithms covered in this volume. The goal of this book is to serve as a resource so that students and professionals can know and make intelligent use of graph algorithms as the need arises in whatever computer application they might undertake.

Scope

This book, Algorithms in C++, Third Edition, Part 5: Graph Algorithms, contains six chapters that cover graph properties and types, graph search, directed graphs, minimal spanning trees, shortest paths, and networks. The descriptions here are intended to give readers an understanding of the basic properties of as broad a range of fundamental graph algorithms as possible.

You will most appreciate the material here if you have had a course covering basic principles of algorithm design and analysis and programming experience in a high-level language such as C++, Java, or C. Algorithms in C++, Third Edition, Parts 1–4 is certainly adequate preparation. This volume assumes basic knowledge about arrays, linked lists, and ADT design, and makes use of priority-queue, symbol-table, and union-find ADTs, all of which are described in detail in Parts 1–4 (and in many other introductory texts on algorithms and data structures).

Basic properties of graphs and graph algorithms are developed from first principles, but full understanding often can lead to deep and difficult mathematics. Although the discussion of advanced mathematical concepts is brief, general, and descriptive, you certainly need a higher level of mathematical maturity to appreciate graph algorithms than you do for the topics in Parts 1–4. Still, readers at various levels of mathematical maturity will be able to profit from this book. The topic dictates this approach: some elementary graph algorithms that should be understood and used by everyone differ only slightly from some advanced algorithms that are not understood by anyone. The primary intent here is to place important algorithms in context with other methods throughout the book, not to teach all of the mathematical material. But the rigorous treatment demanded by good mathematics often leads us to good programs, so I have tried to provide a balance between the formal treatment favored by theoreticians and the coverage needed by practitioners, without sacrificing rigor.

Use in the Curriculum

There is a great deal of flexibility in how the material here can be taught, depending on the taste of the instructor and the preparation of the students. The algorithms described have found widespread use for years, and represent an essential body of knowledge for both the practicing programmer and the computer science student. There is sufficient coverage of basic material for the book to be used in a course on data structures and algorithms, and there is sufficient detail and coverage of advanced material for the book to be used for a course on graph algorithms. Some instructors may wish to emphasize implementations and practical concerns; others may wish to emphasize analysis and theoretical concepts.

For a more comprehensive course, this book is also available in a special bundle with Parts 1–4; thereby instructors can cover fundamentals, data structures, sorting, searching, and graph algorithms in one consistent style. A set of slide masters for use in lectures, sample programming assignments, interactive exercises for students, and other course materials may be found by accessing the book’s home page.

The exercises (nearly all of which are new to this edition) fall into several types. Some are intended to test understanding of material in the text, and simply ask readers to work through an example or to apply concepts described in the text. Others involve implementing and putting together the algorithms, or running empirical studies to compare variants of the algorithms and to learn their properties. Still other exercises are a repository for important information at a level of detail that is not appropriate for the text. Reading and thinking about the exercises will pay dividends for every reader.

Algorithms of Practical Use

Anyone wanting to use a computer more effectively can use this book for reference or for self-study. People with programming experience can find information on specific topics throughout the book. To a large extent, you can read the individual chapters in the book independently of the others, although, in some cases, algorithms in one chapter make use of methods from a previous chapter.

The orientation of the book is to study algorithms likely to be of practical use. The book provides information about the tools of the trade to the point that readers can confidently implement, debug, and put to work algorithms to solve a problem or to provide functionality in an application. Full implementations of the methods discussed are included, as are descriptions of the operations of these programs on a consistent set of examples. Because we work with real code, rather than write pseudocode, the programs can be put to practical use quickly. Program listings are available from the book’s home page.

Indeed, one practical application of the algorithms has been to produce the hundreds of figures throughout the book. Many algorithms are brought to light on an intuitive level through the visual dimension provided by these figures.

Characteristics of the algorithms and of the situations in which they might be useful are discussed in detail. Connections to the analysis of algorithms and theoretical computer science are developed in context. When appropriate, empirical and analytic results are presented to illustrate why certain algorithms are preferred. When interesting, the relationship of the practical algorithms being discussed to purely theoretical results is described. Specific information on performance characteristics of algorithms and implementations is synthesized, encapsulated, and discussed throughout the book.

Programming Language

The programming language used for all of the implementations is C++. The programs use a wide range of standard C++ idioms, and the text includes concise descriptions of each construct.

Chris Van Wyk and I developed a style of C++ programming based on classes, templates, and overloaded operators that we feel is an effective way to present the algorithms and data structures as real programs. We have striven for elegant, compact, efficient, and portable implementations. The style is consistent whenever possible, so that programs that are similar look similar.

A goal of this book is to present the algorithms in as simple and direct a form as possible. For many of the algorithms, the similarities remain regardless of which language is used: Dijkstra’s algorithm (to pick one prominent example) is Dijkstra’s algorithm, whether expressed in Algol-60, Basic, Fortran, Smalltalk, Ada, Pascal, C, C++, Modula-3, PostScript, Java, or any of the countless other programming languages and environments in which it has proved to be an effective graph-processing method. On the one hand, our code is informed by experience with implementing algorithms in these and numerous other languages (a C version of this book is also available, and a Java version will appear soon); on the other hand, some of the properties of some of these languages are informed by their designers’ experience with some of the algorithms and data structures that we consider in this book. In the end, we feel that the code presented in the book both precisely defines the algorithms and is useful in practice.

Acknowledgments

Many people gave me helpful feedback on earlier versions of this book. In particular, thousands of students at Princeton and Brown have suffered through preliminary drafts over the years. Special thanks are due to Trina Avery and Tom Freeman for their help in producing the first edition; to Janet Incerpi for her creativity and ingenuity in persuading our early and primitive digital computerized typesetting hardware and software to produce the first edition; to Marc Brown for his part in the algorithm visualization research that was the genesis of the figures in the book; to Dave Hanson and Andrew Appel for their willingness to answer my questions about programming languages; and to Kevin Wayne, for patiently answering my basic questions about networks. Kevin urged me to include the network simplex algorithm in this book, but I was not persuaded that it would be possible to do so until I saw a presentation by Ulrich Lauther at Dagstuhl of the ideas on which the implementations in Chapter 22 are based. I would also like to thank the many readers who have provided me with detailed comments about various editions, including Guy Almes, Jon Bentley, Marc Brown, Jay Gischer, Allan Heydon, Kennedy Lemke, Udi Manber, Dana Richards, John Reif, M. Rosenfeld, Stephen Seidman, Michael Quinn, and William Ward.

To produce this new edition, I have had the pleasure of working with Peter Gordon and Helen Goldstein at Addison-Wesley, who patiently shepherded this project as it has evolved from a standard update to a massive rewrite. It has also been my pleasure to work with several other members of the professional staff at Addison-Wesley. The nature of this project made the book a somewhat unusual challenge for many of them, and I much appreciate their forbearance. In particular, Marilyn Rash did an outstanding job managing the book’s production within a very tightly compressed schedule.

I have gained three new mentors in writing this book, and particularly want to express my appreciation to them. First, Steve Summit carefully checked early versions of the manuscript on a technical level, and provided me with literally thousands of detailed comments, particularly on the programs. Steve clearly understood my goal of providing elegant, efficient, and effective implementations, and his comments not only helped me to provide a measure of consistency across the implementations, but also helped me to improve many of them substantially. Second, Lyn Dupré also provided me with thousands of detailed comments on the manuscript, which were invaluable in helping me not only to correct and avoid grammatical errors, but also, more important, to find a consistent and coherent writing style that helps bind together the daunting mass of technical material here. Third, Chris Van Wyk implemented and debugged all my algorithms in C++, answered numerous questions about C++, helped to develop an appropriate C++ programming style, and carefully read the manuscript twice. Chris also patiently stood by as I took apart many of his C++ programs and then, as I learned more and more about C++ from him, had to put them back together much as he had written them. I am extremely grateful for the opportunity to learn from Steve, Lyn, and Chris; their input was vital in the development of this book.

Much of what I have written here I have learned from the teaching and writings of Don Knuth, my advisor at Stanford. Although Don had no direct influence on this work, his presence may be felt in the book, for it was he who put the study of algorithms on the scientific footing that makes a work such as this possible. My friend and colleague Philippe Flajolet, who has been a major force in the development of the analysis of algorithms as a mature research area, has had a similar influence on this work.

I am deeply thankful for the support of Princeton University, Brown University, and the Institut National de Recherche en Informatique et Automatique (INRIA), where I did most of the work on the books; and of the Institute for Defense Analyses and the Xerox Palo Alto Research Center, where I did some work on the books while visiting. Many parts of these books are dependent on research that has been generously supported by the National Science Foundation and the Office of Naval Research. Finally, I thank Bill Bowen, Aaron Lemonick, and Neil Rudenstine for their support in building an academic environment at Princeton in which I was able to prepare this book, despite my numerous other responsibilities.

Robert Sedgewick
Marly-le-Roi, France, 1983
Princeton, New Jersey, 1990
Jamestown, Rhode Island, 2001

C++ Consultant’s Preface

Bob Sedgewick and I wrote many versions of most of these programs in our quest to implement graph algorithms in clear and natural programs. Because there are so many kinds of graphs and so many different questions to ask about them, we agreed early on not to pursue a single class scheme that would work across the whole book. Remarkably, we ended up using only two schemes: a simple one in Chapters 17 through 19, where the edges of a graph are either present or absent; and an approach similar to STL containers in Chapters 20 through 22, where more information is associated with edges.

C++ classes offer many advantages for presenting graph algorithms. We use classes to collect useful generic functions on graphs (like input/output). In Chapter 18, we use classes to factor out the operations common to several different graph-search methods. Throughout the book, we use an iterator class on the edges emanating from a vertex so that the programs work no matter how the graph is stored. Most important, we package graph algorithms in classes whose constructor processes the graph and whose member functions give us access to the answers discovered. This organization allows graph algorithms to readily use other graph algorithms as subroutines: see, for example, Program 19.13 (transitive closure via strong components), Program 20.8 (Kruskal’s algorithm for minimum spanning tree), Program 21.4 (all shortest paths via Dijkstra’s algorithm), and Program 21.6 (longest path in a directed acyclic graph). This trend culminates in Chapter 22, where most of the programs are built at a higher level of abstraction, using classes that are defined earlier in the book.

For consistency with Algorithms in C++, Third Edition, Parts 1–4, our programs rely on the stack and queue classes defined there, and we write explicit pointer operations on singly-linked lists in two low-level implementations. We have adopted two stylistic changes from Parts 1–4: constructors use initialization rather than assignment, and we use STL vectors instead of arrays. Here is a summary of the STL vector functions we use in our programs:

• The default constructor creates an empty vector.
• The constructor vec(n) creates a vector of n elements.
• The constructor vec(n, x) creates a vector of n elements each initialized to the value x.
• Member function vec.assign(n, x) makes vec a vector of n elements each initialized to the value x.
• Member function vec.resize(n) grows or shrinks vec to have size n.
• Member function vec.resize(n, x) grows or shrinks vec to have size n and initializes any new elements to the value x.
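The vector operations summarized above can be exercised in a few lines. The following is a minimal sketch, assuming a standard C++11 compiler; the function name vectorSummaryDemo is hypothetical and does not come from the book, and the element type int is chosen only for illustration.

```cpp
#include <cassert>
#include <vector>

// Hypothetical demo function (not from the book): exercises each vector
// operation in the summary and returns the final contents of c.
std::vector<int> vectorSummaryDemo()
{
    std::vector<int> a;          // default constructor: empty vector
    std::vector<int> b(5);       // five elements, value-initialized (0 for int)
    std::vector<int> c(3, 7);    // three elements, each initialized to 7
    assert(a.empty() && b.size() == 5 && c.size() == 3);

    a.assign(4, 1);              // a now holds four elements, each equal to 1
    assert(a.size() == 4 && a[3] == 1);

    b.resize(2);                 // b shrinks to its first two elements
    assert(b.size() == 2 && b[1] == 0);

    c.resize(5, 9);              // c grows to five elements; the new ones are 9
    assert(c.size() == 5 && c[2] == 7 && c[4] == 9);

    return c;
}
```

Note that resize changes the number of elements that size() reports; the amount of storage reserved, as reported by capacity(), may be larger.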

The STL also defines the assignment operator, copy constructor, and destructor needed to make vectors first-class objects.

Before I started working on these programs, I had read informal descriptions and pseudocode for many of the algorithms, but had only implemented a few of them. I have found it very instructive to work out the details needed to turn algorithms into working programs, and fun to watch them in action. I hope that reading and running the programs in this book will also help you to understand the algorithms better.

Thanks: to Jon Bentley, Brian Kernighan, and Tom Szymanski, from whom I learned much of what I know about programming; to Debbie Lafferty, who asked whether I would be interested in this project; and to Drew University, Lucent Technologies, and Princeton University, for institutional support.

Christopher Van Wyk
Chatham, New Jersey, 2001

To Adam, Andrew, Brett, Robbie, and especially Linda

Notes on Exercises

Classifying exercises is an activity fraught with peril, because readers of a book such as this come to the material with various levels of knowledge and experience. Nonetheless, guidance is appropriate, so many of the exercises carry one of four annotations, to help you decide how to approach them.

Exercises that test your understanding of the material are marked with an open triangle, as follows:

▷ 18.34 Consider the graph

3-7 1-4 7-8 0-5 5-2 3-8 2-9 0-6 4-9 2-6 6-4.

Draw its DFS tree and use the tree to find the graph’s bridges and edge-connected components.

Most often, such exercises relate directly to examples in the text. They should present no special difficulty, but working them might teach you a fact or concept that may have eluded you when you read the text.

Exercises that add new and thought-provoking information to the material are marked with an open circle, as follows:

○ 19.106 Write a program that counts the number of different possible results of topologically sorting a given DAG.

Such exercises encourage you to think about an important concept that is related to the material in the text, or to answer a question that may have occurred to you when you read the text. You may find it worthwhile to read these exercises, even if you do not have the time to work them through.

Exercises that are intended to challenge you are marked with a black dot, as follows:

• 20.73 Describe how you would find the MST of a graph so large that only V edges can fit into main memory at once.

Such exercises may require a substantial amount of time to complete, depending on your experience. Generally, the most productive approach is to work on them in a few different sittings.

A few exercises that are extremely difficult (by comparison with most others) are marked with two black dots, as follows:

•• 20.37 Develop a reasonable generator for random graphs with V vertices and E edges such that the running time of the heap-based PFS implementation of Dijkstra’s algorithm is superlinear.

These exercises are similar to questions that might be addressed in the research literature, but the material in the book may prepare you to enjoy trying to solve them (and perhaps succeeding).

The annotations are intended to be neutral with respect to your programming and mathematical ability. Those exercises that require expertise in programming or in mathematical analysis are self-evident. All readers are encouraged to test their understanding of the algorithms by implementing them. Still, an exercise such as this one is straightforward for a practicing programmer or a student in a programming course, but may require substantial work for someone who has not recently programmed:

• 17.74 Write a program that generates V random points in the plane, then builds a network with edges (in both directions) connecting all pairs of points within a given distance d of one another (see Program 3.20), setting each edge’s weight to the distance between the two points that it connects. Determine how to set d so that the expected number of edges is E.

In a similar vein, all readers are encouraged to strive to appreciate the analytic underpinnings of our knowledge about properties of algorithms. Still, an exercise such as this one is straightforward for a scientist or a student in a discrete mathematics course, but may require substantial work for someone who has not recently done mathematical analysis:

• 19.5 How many digraphs correspond to each undirected graph with V vertices and E edges?

There are far too many exercises for you to read and assimilate them all; my hope is that there are enough exercises here to stimulate you to strive to come to a broader understanding of the topics that interest you than you could glean by simply reading the text.

Contents

Graph Algorithms

Chapter 17. Graph Properties and Types
17.1 Glossary
17.2 Graph ADT
17.3 Adjacency-Matrix Representation
17.4 Adjacency-Lists Representation
17.5 Variations, Extensions, and Costs
17.6 Graph Generators
17.7 Simple, Euler, and Hamilton Paths
17.8 Graph-Processing Problems

Chapter 18. Graph Search
18.1 Exploring a Maze
18.2 Depth-First Search
18.3 Graph-Search ADT Functions
18.4 Properties of DFS Forests
18.5 DFS Algorithms
18.6 Separability and Biconnectivity
18.7 Breadth-First Search
18.8 Generalized Graph Search
18.9 Analysis of Graph Algorithms

Chapter 19. Digraphs and DAGs
19.1 Glossary and Rules of the Game
19.2 Anatomy of DFS in Digraphs
19.3 Reachability and Transitive Closure
19.4 Equivalence Relations and Partial Orders
19.5 DAGs
19.6 Topological Sorting
19.7 Reachability in DAGs
19.8 Strong Components in Digraphs
19.9 Transitive Closure Revisited
19.10 Perspective

Chapter 20. Minimum Spanning Trees
20.1 Representations
20.2 Underlying Principles of MST Algorithms
20.3 Prim’s Algorithm and Priority-First Search
20.4 Kruskal’s Algorithm
20.5 Boruvka’s Algorithm
20.6 Comparisons and Improvements
20.7 Euclidean MST

Chapter 21. Shortest Paths
21.1 Underlying Principles
21.2 Dijkstra’s Algorithm
21.3 All-Pairs Shortest Paths
21.4 Shortest Paths in Acyclic Networks
21.5 Euclidean Networks
21.6 Reduction
21.7 Negative Weights
21.8 Perspective

Chapter 22. Network Flow
22.1 Flow Networks
22.2 Augmenting-Path Maxflow Algorithms
22.3 Preflow-Push Maxflow Algorithms
22.4 Maxflow Reductions
22.5 Mincost Flows
22.6 Network Simplex Algorithm
22.7 Mincost-Flow Reductions
22.8 Perspective

References for Part Five

PART FIVE
Graph Algorithms

CHAPTER SEVENTEEN
Graph Properties and Types

MANY COMPUTATIONAL APPLICATIONS naturally involve not just a set of items, but also a set of connections between pairs of those items. The relationships implied by these connections lead immediately to a host of natural questions: Is there a way to get from one item to another by following the connections? How many other items can be reached from a given item? What is the best way to get from this item to this other item?

To model such situations, we use abstract objects called graphs. In this chapter, we examine basic properties of graphs in detail, setting the stage for us to study a variety of algorithms that are useful for answering questions of the type just posed. These algorithms make effective use of many of the computational tools that we considered in Parts 1 through 4. They also serve as the basis for attacking problems in important applications whose solution we could not even contemplate without good algorithmic technology.

Graph theory, a major branch of combinatorial mathematics, has been studied intensively for hundreds of years. Many important and useful properties of graphs have been proved, yet many difficult problems remain unresolved. In this book, while recognizing that there is much still to be learned, we draw from this vast body of knowledge about graphs what we need to understand and use a broad variety of useful and fundamental algorithms.

Like so many of the other problem domains that we have studied, the algorithmic investigation of graphs is relatively recent. Although a few of the fundamental algorithms are old, the majority of the interesting ones have been discovered within the last few decades. Even the simplest graph algorithms lead to useful computer programs, and the nontrivial algorithms that we examine are among the most elegant and interesting algorithms known.

To illustrate the diversity of applications that involve graph processing, we begin our exploration of algorithms in this fertile area by considering several examples.

Maps. A person who is planning a trip may need to answer questions such as, “What is the least expensive way to get from Princeton to San Jose?” A person more interested in time than in money may need to know the answer to the question “What is the fastest way to get from Princeton to San Jose?” To answer such questions, we process information about connections (travel routes) between items (towns and cities).

Hypertexts. When we browse the Web, we encounter documents that contain references (links) to other documents, and we move from document to document by clicking on the links. The entire Web is a graph, where the items are documents and the connections are links. Graph-processing algorithms are essential components of the search engines that help us locate information on the Web.

Circuits. An electric circuit comprises elements such as transistors, resistors, and capacitors that are intricately wired together. We use computers to control machines that make circuits, and to check that the circuits perform desired functions. We need to answer simple questions such as, “Is a short-circuit present?” as well as complicated questions such as, “Can we lay out this circuit on a chip without making any wires cross?” In this case, the answer to the first question depends on only the properties of the connections (wires), whereas the answer to the second question requires detailed information about the wires, the items that those wires connect, and the physical constraints of the chip.

Schedules. A manufacturing process requires a variety of tasks to be performed, under a set of constraints that specifies that certain tasks cannot be started until certain other tasks have been completed. We represent the constraints as connections between the tasks (items), and we are faced with a classical scheduling problem: How do we schedule the tasks such that we both respect the given constraints and complete the whole process in the least amount of time?

Transactions. A telephone company maintains a database of telephone-call traffic. Here the connections represent telephone calls. We are interested in knowing about the nature of the interconnection structure because we want to lay wires and build switches that can handle the traffic efficiently.
As another example, a financial institution tracks buy/sell orders in a market. A connection in this situation represents the transfer of cash between two customers. Knowledge of the nature of the connection structure in this instance may enhance our understanding of the nature of the market.

Matching. Students apply for positions in selective institutions such as social clubs, universities, or medical schools. Items correspond to the students and the institutions; connections correspond to the applications. We want to discover methods for matching interested students with available positions.

Networks. A computer network consists of interconnected sites that send, forward, and receive messages of various types. We are interested not just in knowing that it is possible to get a message from every site to every other site, but also in maintaining this connectivity for all pairs of sites as the network changes. For example, we might wish to check a given network to be sure that no small set of sites or connections is so critical that losing it would disconnect any remaining pair of sites.

Program structure. A compiler builds graphs to represent the call structure of a large software system. The items are the various functions or modules that comprise the system; connections are associated either with the possibility that one function might call another (static analysis) or with actual calls while the system is in operation (dynamic analysis). We need to analyze the graph to determine how best to allocate resources to the program most efficiently.

These examples indicate the range of applications for which graphs are the appropriate abstraction and also the range of computational problems that we might encounter when we work with graphs. Such problems will be our focus in this book. In many of these applications as they are encountered in practice, the volume of data involved is truly huge, and efficient algorithms make the difference between whether or not a solution is at all feasible.

We have already encountered graphs, briefly, in Part 1.
Indeed, the first algorithms that we considered in detail, the union-find algorithms in Chapter 1, are prime examples of graph algorithms. We also used graphs in Chapter 3 as an illustration of applications of two-dimensional arrays and linked lists, and in Chapter 5 to illustrate the relationship between recursive programs and fundamental data structures. Any linked data structure is a representation of a graph, and some familiar algorithms for processing trees and other linked structures are special cases of graph algorithms. The purpose of this chapter is to provide a context for developing an understanding of graph algorithms ranging from the simple ones in Part 1 to the sophisticated ones in Chapters 18 through 22.

As always, we are interested in knowing which are the most efficient algorithms that solve a particular problem. The study of the performance characteristics of graph algorithms is challenging because

• The cost of an algorithm depends not just on properties of the set of items, but also on numerous properties of the set of connections (and global properties of the graph that are implied by the connections).
• Accurate models of the types of graphs that we might face are difficult to develop.

We often work with worst-case performance bounds on graph algorithms, even though they may represent pessimistic estimates on actual performance in many instances. Fortunately, as we shall see, a number of algorithms are optimal and involve little unnecessary work. Other algorithms consume the same resources on all graphs of a given size. We can predict accurately how such algorithms will perform in specific situations. When we cannot make such accurate predictions, we need to pay particular attention to properties of the various types of graphs that we might expect in practical applications and must assess how these properties might affect the performance of our algorithms.

We begin by working through the basic definitions of graphs and the properties of graphs, examining the standard nomenclature that is used to describe them. Following that, we define the basic ADT (abstract data type) interfaces that we use to study graph algorithms and the two most important data structures for representing graphs (the adjacency-matrix representation and the adjacency-lists representation), and various approaches to implementing basic ADT functions. Then, we consider client programs that can generate random graphs, which we can use to test our algorithms and to learn properties of graphs. All this material provides a basis for us to introduce graph-processing algorithms that solve three classical problems related to finding paths in graphs, which illustrate that the difficulty of graph problems can differ dramatically even when they might seem similar. We conclude the chapter with a review of the most important graph-processing problems that we consider in this book, placing them in context according to the difficulty of solving them.
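As a rough preview of the two representations just named, the following minimal sketch builds each one from a list of edges of an undirected graph. It is not the Graph ADT that the chapter develops: the names Edge, buildMatrix, and buildLists are hypothetical, and the code assumes C++11, vertex names 0 through V−1, and an edge list with no parallel edges or self-loops.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Hypothetical edge type for this sketch: a pair of vertex indices.
using Edge = std::pair<int, int>;

// Adjacency matrix: adj[v][w] is true exactly when an edge connects v and w.
// Space is proportional to V*V regardless of the number of edges.
std::vector<std::vector<bool>> buildMatrix(int V, const std::vector<Edge>& edges)
{
    std::vector<std::vector<bool>> adj(V, std::vector<bool>(V, false));
    for (const Edge& e : edges) {
        assert(e.first >= 0 && e.first < V && e.second >= 0 && e.second < V);
        adj[e.first][e.second] = true;
        adj[e.second][e.first] = true;   // undirected: record both directions
    }
    return adj;
}

// Adjacency lists: adj[v] holds the vertices adjacent to v.
// Space is proportional to V + E.
std::vector<std::vector<int>> buildLists(int V, const std::vector<Edge>& edges)
{
    std::vector<std::vector<int>> adj(V);
    for (const Edge& e : edges) {
        assert(e.first >= 0 && e.first < V && e.second >= 0 && e.second < V);
        adj[e.first].push_back(e.second);
        adj[e.second].push_back(e.first); // undirected: record both directions
    }
    return adj;
}
```

The matrix answers "is there an edge between v and w?" with one lookup, whereas the lists make it cheap to visit only the vertices adjacent to v; which tradeoff matters more depends on how dense the graph is and on which operations the client performs most often.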

17.1 Glossary

A substantial amount of nomenclature is associated with graphs. Most of the terms have straightforward definitions, and, for reference, it is convenient to consider them in one place: here. We have already used some of these concepts when considering basic algorithms in Part 1; others of them will not become relevant until we address associated advanced algorithms in Chapters 18 through 22.

Definition 17.1 A graph is a set of vertices and a set of edges that connect pairs of distinct vertices (with at most one edge connecting any pair of vertices).

We use the names 0 through V−1 for the vertices in a V-vertex graph. The main reason that we choose this system is that we can access quickly information corresponding to each vertex, using vector indexing. In Section 17.6, we consider a program that uses a symbol table to establish a 1–1 mapping to associate V arbitrary vertex names with the V integers between 0 and V−1. With

that program in hand, we can use indices as vertex names (for notational convenience) without loss of generality. We sometimes assume that the set of vertices is defined implicitly, by taking the set of edges to define the graph and considering only those vertices that are included in at least one edge. To avoid cumbersome usage such as “the ten-vertex graph with the following set of edges,” we do not explicitly mention the number of vertices when that number is clear from the context. By convention, we always denote the number of vertices in a given graph by V, and denote the number of edges by E.

We adopt as standard this definition of a graph (which we first encountered in Chapter 5), but note that it embodies two technical simplifications. First, it disallows duplicate edges (mathematicians sometimes refer to such edges as parallel edges, and a graph that can contain them as a multigraph). Second, it disallows edges that connect vertices to themselves; such edges are called self-loops. Graphs that have no parallel edges or self-loops are sometimes referred to as simple graphs.

We use simple graphs in our formal definitions because it is easier to express their basic properties and because parallel edges and self-loops are not needed in many applications. For example, we can bound the number of edges in a simple graph with a given number of vertices.

Property 17.1 A graph with V vertices has at most V(V−1)/2 edges.

Proof: The total of V^2 possible pairs of vertices includes V self-loops and accounts twice for each edge between distinct vertices, so the number of edges is at most (V^2 − V)/2 = V(V−1)/2. •

No such bound holds if we allow parallel edges: a graph that is not simple might consist of two vertices and billions of edges connecting them (or even a single vertex and billions of self-loops).

For some applications, we might consider the elimination of parallel edges and self-loops to be a data-processing problem that our implementations must address. For other applications, ensuring that a given set of edges represents a simple graph may not be worth the trouble. Throughout the book, whenever it is more convenient to address an application or to develop an algorithm by using an extended definition that includes parallel edges or self-loops, we shall do so. For example, self-loops play a critical role in a classical algorithm that we will examine in Section 17.4; and parallel edges are common in the applications that we address in Chapter 22. Generally, it is clear from the context whether we intend the term “graph” to mean “simple graph” or “multigraph” or “multigraph with self-loops.”
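As an illustration of treating the elimination of parallel edges and self-loops as a data-processing problem, the following is a minimal sketch. It assumes that edges are given as pairs of vertex indices; the names simplify and maxSimpleEdges are hypothetical and are not part of the graph ADT used in this book.

```cpp
#include <algorithm>
#include <utility>
#include <vector>

// Hypothetical helper: reduce an arbitrary edge list to the edge set of a
// simple graph by dropping self-loops and keeping one copy of each edge.
std::vector<std::pair<int, int> > simplify(std::vector<std::pair<int, int> > edges)
{
    // v-w and w-v denote the same undirected edge, so store the smaller index first.
    for (auto &e : edges)
        if (e.first > e.second) std::swap(e.first, e.second);
    // Drop self-loops.
    edges.erase(std::remove_if(edges.begin(), edges.end(),
                    [](const std::pair<int, int> &e) { return e.first == e.second; }),
                edges.end());
    // Sorting brings parallel edges together so that unique can remove them.
    std::sort(edges.begin(), edges.end());
    edges.erase(std::unique(edges.begin(), edges.end()), edges.end());
    return edges;
}

// The bound of Property 17.1: a simple graph with V vertices has at most V(V-1)/2 edges.
long long maxSimpleEdges(long long V) { return V * (V - 1) / 2; }
```

After this step, the number of edges that remain can never exceed maxSimpleEdges(V), whatever the input looked like.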

Mathematicians use the words vertex and node interchangeably, but we generally use vertex when discussing graphs and node when discussing representations (for example, in C++ data structures). We normally assume that a vertex can have a name and can carry other associated information. Similarly, the words arc, edge, and link are all widely used by mathematicians to describe the abstraction embodying a connection between two vertices, but we consistently use edge when discussing graphs and link when discussing C++ data structures.

When there is an edge connecting two vertices, we say that the vertices are adjacent to one another and that the edge is incident on both vertices. The degree of a vertex is the number of edges incident on it. We use the notation v-w to represent an edge that connects v and w; the notation w-v is an alternative way to represent the same edge.

A subgraph is a subset of a graph’s edges (and associated vertices) that constitutes a graph. Many computational tasks involve identifying subgraphs of various types. If we identify a subset of a graph’s vertices, we call that subset, together with all edges that connect two of its members, the induced subgraph associated with those vertices.

We can draw a graph by marking points for the vertices and drawing lines connecting them for the edges. A drawing gives us intuition about the structure of the graph; but this intuition can be misleading, because the graph is defined independently of the representation. For example, the two drawings in Figure 17.1 and the list of edges represent the same graph, because the graph is only its (unordered) set of vertices and its (unordered) set of edges (pairs of vertices), nothing more. Although it suffices to consider a graph simply as a set of edges, we examine other representations that are particularly suitable as the basis for graph data structures in Section 17.4.
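The definitions of degree and induced subgraph given above translate directly into code when a graph is held simply as a set of edges. The following sketch assumes an edge list of index pairs; the names vertexDegrees and inducedSubgraph are hypothetical, chosen only for this illustration.

```cpp
#include <set>
#include <utility>
#include <vector>

// Degree of each vertex: the number of edges incident on it.
std::vector<int> vertexDegrees(int V, const std::vector<std::pair<int, int> > &edges)
{
    std::vector<int> deg(V, 0);
    for (const auto &e : edges) { deg[e.first]++; deg[e.second]++; }
    return deg;
}

// Induced subgraph: the edges whose two endpoints both belong to the chosen vertex subset.
std::vector<std::pair<int, int> > inducedSubgraph(
    const std::vector<std::pair<int, int> > &edges, const std::set<int> &keep)
{
    std::vector<std::pair<int, int> > result;
    for (const auto &e : edges)
        if (keep.count(e.first) && keep.count(e.second)) result.push_back(e);
    return result;
}
```

For the five-edge graph 0-1 0-2 0-3 1-3 2-3, vertices 0 and 3 have degree 3, and the subgraph induced by the vertices 0, 1, and 3 is the triangle 0-1 0-3 1-3.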

Figure 17.1 Three different representations of the same graph

A graph is defined by its vertices and its edges, not by the way that we choose to draw it. These two drawings depict the same graph, as does the list of edges (bottom), given the additional information that the graph has 13 vertices labeled 0 through 12.

Placing the vertices of a given graph on the plane and drawing them and the edges that connect them is known as graph drawing. The possible vertex placements, edge-drawing styles, and aesthetic constraints on the drawing are limitless. Graph-drawing algorithms that respect various natural constraints have been studied heavily and have many successful applications (see reference section). For example, one of the simplest constraints is to insist that edges do not intersect. A planar graph is one that can be drawn in the plane without any edges crossing. Determining whether or not a graph is planar is a fascinating algorithmic problem that we discuss briefly in Section 17.8. Being able to produce a helpful visual representation is a useful goal, and graph drawing is a fascinating field of study, but successful drawings are often difficult to realize. Many graphs that have huge numbers of vertices and edges are abstract objects for which no suitable drawing is feasible.

For some applications, such as graphs that represent maps or circuits, a graph drawing can carry considerable information because the vertices correspond to points in the plane and the distances between them are relevant. We refer to such graphs as Euclidean graphs. For many other applications, such as graphs that represent relationships or schedules, the graphs simply embody connectivity information, and no particular geometric placement of vertices is ever implied. We consider examples of algorithms that exploit the geometric information in Euclidean graphs in Chapters 20 and 21, but we primarily work with algorithms that make no use of any geometric information, and stress that graphs are generally independent of any particular representation in a drawing or in a computer.

Focusing solely on the connections themselves, we might wish to view the vertex labels as merely a notational convenience, and to regard two graphs as being the same if they differ in only the vertex labels. Two graphs are isomorphic if we can change the vertex labels on one to make its set of edges identical to the other. Determining whether or not two graphs are isomorphic is a difficult computational problem (see Figure 17.2 and Exercise 17.5). It is challenging because there are V! possible ways to label the vertices, far too many for us to try all the possibilities. Therefore, despite the potential appeal of reducing the number of different graph structures that we have to consider by treating isomorphic graphs as identical structures, we rarely do so.

As we saw with trees in Chapter 5, we are often interested in basic structural properties that we can deduce by considering specific sequences of edges in a graph.

Definition 17.2 A path in a graph is a sequence of vertices in which each successive vertex (after the first) is adjacent to its predecessor in the path. In a simple path, the vertices and edges are distinct. A cycle is a path that is simple except that the first and final vertices are the same.

Figure 17.2 Graph isomorphism examples

The top two graphs are isomorphic because we can relabel the vertices to make the two sets of edges identical (to make the middle graph the same as the top graph, change 10 to 4, 7 to 3, 2 to 5, 3 to 1, 12 to 0, 5 to 2, 9 to 11, 0 to 12, 11 to 9, 1 to 7, and 4 to 10). The bottom graph is not isomorphic to the others because there is no way to relabel the vertices to make its set of edges identical to either.

We sometimes use the term cyclic path to refer to a path whose first and final vertices are the same (and is otherwise not necessarily simple); and we use the term tour to refer to a cyclic path that includes every vertex. An equivalent way to define a path is as the sequence of edges that connect the successive vertices. We emphasize this in our notation by connecting vertex names in a path in the same way as we connect them in an edge. For example, the simple paths in Figure 17.1 include 3-4-6-0-2 and 9-12-11, and the cycles in the graph include 0-6-4-3-5-0 and 5-4-3-5. We define the length of a path or a cycle to be its number of edges.

Figure 17.3 Graph terminology

This graph has 55 vertices, 70 edges, and 3 connected components. One of the connected components is a tree (right). The graph has many cycles, one of which is highlighted in the large connected component (left). The diagram also depicts a spanning tree in the small connected component (center). The graph as a whole does not have a spanning tree, because it is not connected.

We adopt the convention that each single vertex is a path of length 0 (a path from the vertex to itself with no edges on it, which is different from a self-loop). Apart from this convention, in a graph with no parallel edges and no self-loops, a pair of vertices uniquely determines an edge, paths must have on them at least two distinct vertices, and cycles must have on them at least three distinct edges and three distinct vertices.

We say that two simple paths are disjoint if they have no vertices in common other than, possibly, their endpoints. Placing this condition is slightly weaker than insisting that the paths have no vertices at all in common, and is useful because we can combine simple disjoint paths from s to t and t to u to get a simple disjoint path from s to u if s and u are different (and to get a cycle if s and u are the same). The term vertex disjoint is sometimes used to distinguish this condition from the stronger condition of edge disjoint, where we require that the paths have no edge in common.

Definition 17.3 A graph is a connected graph if there is a path from every vertex to every other vertex in the graph. A graph that is not connected consists of a set of connected components, which are maximal connected subgraphs.

The term maximal connected subgraph means that there is no path from a subgraph vertex to any vertex in the graph that is not in the subgraph. Intuitively, if the vertices were physical objects, such as knots or beads, and the edges were physical connections, such as strings or wires, a connected graph would stay in one piece if picked up by any vertex, and a graph that is not connected comprises two or more such pieces.

Definition 17.4 An acyclic connected graph is called a tree (see Chapter 4). A set of trees is called a forest. A spanning tree of a connected graph is a subgraph that contains all of that graph’s vertices and is a single tree. A spanning forest of a graph is a subgraph that contains all of that graph’s vertices and is a forest.

For example, the graph illustrated in Figure 17.1 has three connected components, and is spanned by the forest 7-8 9-10 9-11 9-12 0-1 0-2 0-5 5-3 5-4 4-6 (there are many other spanning forests). Figure 17.3 highlights these and other features in a larger graph.

We explore further details about trees in Chapter 4, and look at various equivalent definitions. For example, a graph G with V vertices is a tree if and only if it satisfies any of the following four conditions:

• G has V−1 edges and no cycles.
• G has V−1 edges and is connected.
• Exactly one simple path connects each pair of vertices in G.
• G is connected, but removing any edge disconnects it.
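The first of these conditions (V−1 edges and no cycles) is easy to test mechanically. The following is a sketch only: it borrows the union-find idea from Chapter 1 to detect cycles, and the names findRoot and isTree are hypothetical, as is the choice of an edge list of index pairs as input.

```cpp
#include <numeric>
#include <utility>
#include <vector>

// Follow parent links to the root of x's set, halving the path along the way.
static int findRoot(std::vector<int> &parent, int x)
{
    while (parent[x] != x) { parent[x] = parent[parent[x]]; x = parent[x]; }
    return x;
}

// Tests the condition "G has V-1 edges and no cycles".
bool isTree(int V, const std::vector<std::pair<int, int> > &edges)
{
    if (V < 1 || (int) edges.size() != V - 1) return false;
    std::vector<int> parent(V);
    std::iota(parent.begin(), parent.end(), 0);   // every vertex starts in its own set
    for (const auto &e : edges) {
        int a = findRoot(parent, e.first), b = findRoot(parent, e.second);
        if (a == b) return false;   // endpoints already connected: this edge closes a cycle
        parent[a] = b;              // otherwise merge the two sets
    }
    return true;
}
```

A path 0-1 1-2 2-3 on four vertices passes the test; the triangle 0-1 1-2 2-0 on four vertices has the right number of edges but fails because of its cycle.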

Any one of these conditions is necessary and sufficient to prove the other three, and we can develop other combinations of facts about trees from them (see Exercise 17.1). Formally, we should choose one condition to serve as a definition; informally, we let them collectively serve as the definition, and freely engage in usage such as the “acyclic connected graph” choice in Definition 17.4.

Graphs with all edges present are called complete graphs (see Figure 17.4). We define the complement of a graph G by starting with a complete graph that has the same set of vertices as the original graph and then removing the edges of G. The union of two graphs is the graph induced by the union of their sets of edges. The union of a graph and its complement is a complete graph. All graphs that have V vertices are subgraphs of the complete graph that has V vertices. The total number of different graphs that have V vertices is 2^{V(V−1)/2} (the number of different ways to choose a subset from the V(V−1)/2 possible edges). A complete subgraph is called a clique.
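The count of graphs on V vertices can be checked with a few lines of arithmetic. The sketch below uses a hypothetical function name, labeledGraphCount, and assumes V is at most 11 so that the result fits in a 64-bit unsigned integer.

```cpp
#include <cstdint>

// Number of different graphs on V labeled vertices: each of the V(V-1)/2
// possible edges is either present or absent. Assumes V <= 11 (no overflow check).
std::uint64_t labeledGraphCount(int V)
{
    return std::uint64_t(1) << (V * (V - 1) / 2);
}
```

Summing labeledGraphCount(V) for V from 5 through 9 gives 68,990,043,136, which agrees with the "more than 68 billion" graphs mentioned in the caption to Figure 17.4.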

Figure 17.4 Complete graphs

These complete graphs, with every vertex connected to every other vertex, have 10, 15, 21, 28, and 36 edges (bottom to top). Every graph with between 5 and 9 vertices (there are more than 68 billion such graphs) is a subgraph of one of these graphs.

Most graphs that we encounter in practice have relatively few of the possible edges present. To quantify this concept, we define the density of a graph to be the average vertex degree, or 2E/V. A dense graph is a graph whose average vertex degree is proportional to V; a sparse graph is a graph whose complement is dense. In other words, we consider a graph to be dense if E is proportional to V^2 and sparse otherwise. This asymptotic definition is not necessarily meaningful for a particular graph, but the distinction is generally clear: A graph that has millions of vertices and tens of millions of edges is certainly sparse, and a graph that has thousands of vertices and millions of edges is certainly dense. We might contemplate processing a sparse graph with billions of vertices, but a dense graph with billions of vertices would have an overwhelming number of edges.

Knowing whether a graph is sparse or dense is generally a key factor in selecting an efficient algorithm to process the graph. For example, for a given problem, we might develop one algorithm that takes about V^2 steps and another that takes about E lg E steps. These formulas tell us that the second algorithm would be better for sparse graphs, whereas the first would be preferred for dense graphs. For example, a dense graph with millions of edges might have only thousands of vertices: in this case V^2 and E would be comparable in value, and the V^2 algorithm would be 20 times faster than the E lg E algorithm (because lg E is about 20 when E is about one million). On the other hand, a sparse graph with millions of edges also has millions of vertices, so the E lg E algorithm could be millions of times faster than the V^2 algorithm. We could make specific tradeoffs on the basis of analyzing these formulas in more detail, but it generally suffices in practice to use the terms sparse and dense informally to help us understand fundamental performance characteristics.

When analyzing graph algorithms, we assume that V/E is bounded above by a small constant, so that we can abbreviate expressions such as V(V+E) to VE. This assumption comes into play only when the number of edges is tiny in comparison to the number of vertices, a rare situation. Typically, the number of edges far exceeds the number of vertices (V/E is much less than 1).

A bipartite graph is a graph whose vertices we can divide into two sets such that all edges connect a vertex in one set with a vertex in the other set. Figure 17.5 gives an example of a bipartite graph. Bipartite graphs arise in a natural way in many situations, such as the matching problems described at the beginning of this chapter. Any subgraph of a bipartite graph is bipartite.
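The definition of a bipartite graph suggests a direct test: try to assign each vertex to one of two sets so that no edge joins two vertices in the same set. The following sketch does so with a breadth-first traversal of each connected component. It is an illustration under stated assumptions (an edge list of index pairs as input, and the hypothetical name isBipartite), not the method developed later in the book.

```cpp
#include <queue>
#include <utility>
#include <vector>

// Returns true if the vertices 0..V-1 can be divided into two sets such that
// every edge connects a vertex in one set with a vertex in the other.
bool isBipartite(int V, const std::vector<std::pair<int, int> > &edges)
{
    std::vector<std::vector<int> > adj(V);
    for (const auto &e : edges) {
        adj[e.first].push_back(e.second);
        adj[e.second].push_back(e.first);
    }
    std::vector<int> side(V, -1);                // -1 means "not yet assigned"
    for (int s = 0; s < V; s++) {
        if (side[s] != -1) continue;             // component already processed
        side[s] = 0;
        std::queue<int> q;
        q.push(s);
        while (!q.empty()) {
            int v = q.front(); q.pop();
            for (int w : adj[v]) {
                if (side[w] == -1) { side[w] = 1 - side[v]; q.push(w); }
                else if (side[w] == side[v]) return false;   // edge within one set
            }
        }
    }
    return true;
}
```

A cycle of length four passes the test, and a triangle does not.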

Figure 17.5 A bipartite graph

All edges in this graph connect odd-numbered vertices with even-numbered ones, so it is bipartite. The bottom diagram makes the property obvious.

Graphs as defined to this point are called undirected graphs. In directed graphs, also known as digraphs, edges are one-way: we consider the pair of vertices that defines each edge to be an ordered pair that specifies a one-way adjacency where we think about having the ability to get from the first vertex to the second but not from the second vertex to the first. Many applications (for example, graphs that represent the Web, scheduling constraints, or telephone-call transactions) are naturally expressed in terms of digraphs.

We refer to edges in digraphs as directed edges, though that distinction is generally obvious in context (some authors reserve the term arc for directed edges). The first vertex in a directed edge is called the source; the second vertex is called the destination. (Some authors use the terms tail and head, respectively, to distinguish the vertices in directed edges, but we avoid this usage because of overlap with our use of the same terms in data-structure implementations.) We draw directed edges as arrows pointing from source to destination, and often say that the edge points to the destination. When we use the notation v-w in a digraph, we mean it to represent an edge that points from v to w; it is different from w-v, which represents an edge that points from w to v. We speak of the indegree and outdegree of a vertex (the number of edges where it is the destination and the number of edges where it is the source, respectively).

Sometimes, we are justified in thinking of an undirected graph as a digraph that has two directed edges (one in each direction); other times, it is useful to think of undirected graphs simply in terms of connections. Normally, as discussed in detail in Section 17.4, we use the same representation for directed and undirected graphs (see Figure 17.6). That is, we generally maintain two representations of each edge for undirected graphs, one pointing in each direction, so that we can immediately answer questions such as, “Which vertices are connected to vertex v?”
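The notions of indegree and outdegree, and the view of an undirected graph as a digraph with two directed edges per connection, can be sketched as follows. The type DegreeTable and the functions directedDegrees and asDigraph are hypothetical names introduced for this illustration; edges are assumed to be (source, destination) pairs.

```cpp
#include <utility>
#include <vector>

struct DegreeTable { std::vector<int> in, out; };

// Indegree and outdegree of every vertex of a digraph given as (source, destination) pairs.
DegreeTable directedDegrees(int V, const std::vector<std::pair<int, int> > &edges)
{
    DegreeTable d;
    d.in.assign(V, 0);
    d.out.assign(V, 0);
    for (const auto &e : edges) { d.out[e.first]++; d.in[e.second]++; }
    return d;
}

// View an undirected edge list as a digraph: one directed edge in each direction.
std::vector<std::pair<int, int> > asDigraph(const std::vector<std::pair<int, int> > &undirected)
{
    std::vector<std::pair<int, int> > result;
    for (const auto &e : undirected) {
        result.push_back(e);
        result.push_back(std::make_pair(e.second, e.first));
    }
    return result;
}
```

In the digraph produced by asDigraph, the indegree and outdegree of each vertex both equal its degree in the undirected graph.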

Figure 17.6 Two digraphs

The drawing at the top is a representation of the example graph in Figure 17.1 interpreted as a directed graph, where we take the edges to be ordered pairs and represent them by drawing an arrow from the first vertex to the second. It is also a DAG. The drawing at the bottom is a representation of the undirected graph from Figure 17.1 that indicates the way that we usually represent undirected graphs: as digraphs with two edges corresponding to each connection (one in each direction).

Chapter 19 is devoted to exploring the structural properties of digraphs; they are generally more complicated than the corresponding properties for undirected graphs. A directed cycle in a digraph is a cycle in which all adjacent vertex pairs appear in the order indicated by (directed) graph edges. A directed acyclic graph (DAG) is a digraph that has no directed cycles. A DAG (an acyclic digraph) is not the same as a tree (an acyclic undirected graph). Occasionally, we refer to the underlying undirected graph of a digraph, meaning the undirected graph defined by the same set of edges, but where these edges are not interpreted as directed.

Chapters 20 through 22 are generally concerned with algorithms for solving various computational problems associated with graphs in which other information is associated with the vertices and edges. In weighted graphs, we associate numbers (weights) with each edge, which generally represent distances or costs. We also might associate a weight with each vertex, or multiple weights with each vertex and edge. In Chapter 20 we work with weighted undirected graphs; in Chapters 21 and 22 we study weighted digraphs, which we also refer to as networks. The algorithms in Chapter 22 solve classic problems that arise from a particular interpretation of networks known as flow networks.

As was evident even in Chapter 1, the combinatorial structure of graphs is extensive. The extent of this structure is all the more remarkable because it springs forth from a simple mathematical abstraction. This underlying simplicity will be reflected in much of the code that we develop for basic graph processing. However, this simplicity sometimes masks complicated dynamic properties that require deep understanding of the combinatorial properties of graphs themselves. It is often far more difficult to convince ourselves that a graph algorithm works as intended than the compact nature of the code might suggest.

Exercises

17.1 Prove that any acyclic connected graph that has V vertices has V−1 edges.

• 17.2 Give all the connected subgraphs of the graph

0-1 0-2 0-3 1-3 2-3.

• 17.3 Write down a list of the nonisomorphic cycles of the graph in Figure 17.1. For example, if your list contains 3-4-5-3, it should not contain 3-5-4-3, 4-5-3-4, 4-3-5-4, 5-3-4-5, or 5-4-3-5.

17.4 Consider the graph

3-7 1-4 7-8 0-5 5-2 3-8 2-9 0-6 4-9 2-6 6-4.

Determine the number of connected components, give a spanning forest, list all the simple paths with at least three vertices, and list all the nonisomorphic cycles (see Exercise 17.3).

• 17.5 Consider the graphs defined by the following four sets of edges:

0-1 0-2 0-3 1-3 1-4 2-5 2-9 3-6 4-7 4-8 5-8 5-9 6-7 6-9 7-8
0-1 0-2 0-3 0-3 1-4 2-5 2-9 3-6 4-7 4-8 5-8 5-9 6-7 6-9 7-8
0-1 1-2 1-3 0-3 0-4 2-5 2-9 3-6 4-7 4-8 5-8 5-9 6-7 6-9 7-8
4-1 7-9 6-2 7-3 5-0 0-2 0-8 1-6 3-9 6-3 2-8 1-5 9-8 4-5 4-7.

Which of these graphs are isomorphic to one another? Which of them are planar?

17.6 Consider the more than 68 billion graphs referred to in the caption to Figure 17.4. What percentage of them has fewer than nine vertices?

• 17.7 How many different subgraphs are there in a given graph with V vertices and E edges?

• 17.8 Give tight upper and lower bounds on the number of connected components in graphs that have V vertices and E edges.

• 17.9 How many different undirected graphs are there that have V vertices and E edges?

••• 17.10 If we consider two graphs to be different only if they are not isomorphic, how many different graphs are there that have V vertices and E edges?

17.11 How many V-vertex graphs are bipartite?

17.2 Graph ADT

We develop our graph-processing algorithms using an ADT that defines the fundamental tasks, using the standard mechanisms introduced in Chapter 4. Program 17.1 is the ADT interface that we use for this purpose. Basic graph representations and implementations for this ADT are the topic of Sections 17.3 through 17.5. Later in the book, whenever we consider a new graph-processing problem, we consider the algorithms that solve it and their implementations in the context of client programs and ADTs that access graphs through this interface. This scheme allows us to address graph-processing tasks ranging from elementary maintenance functions to sophisticated solutions of difficult problems.

The interface is based on our standard mechanism that hides representations and implementations from client programs (see Section 4.8). It also includes a simple structure type definition that allows our programs to manipulate edges in a uniform way. The interface provides the basic mechanisms that allow clients to build graphs (by constructing the graph and then adding the edges), to maintain the

Program 17.1 Graph ADT interface

This interface is a starting point for implementing and testing graph algorithms. It defines two data types: a trivial Edge data type, including a constructor function that creates an Edge from two given vertices; and a GRAPH data type, which is defined with the standard representation-independent ADT interface methodology from Chapter 4.

The GRAPH constructor takes two arguments: an integer giving the number of vertices and a boolean that tells whether the graph is undirected or directed (a digraph), with undirected the default.

The basic operations that we use to process graphs and digraphs are ADT functions to create and destroy them; to report the number of vertices and edges; and to add and delete edges. The iterator class adjIterator allows clients to process each of the vertices adjacent to any given vertex. Programs 17.2 and 17.3 illustrate its use.

graphs (by removing some edges and adding others), and to examine the graphs (using an iterator for processing the vertices adjacent to any given vertex).

The ADT in Program 17.1 is primarily a vehicle to allow us to develop and test algorithms; it is not a general-purpose interface. As usual, we work with the simplest interface that supports the basic graph-processing operations that we wish to consider. Defining such an interface for use in practical applications involves making numerous tradeoffs among simplicity, efficiency, and generality. We consider a few of these tradeoffs next; we address many others in the context of implementations and applications throughout this book.

The graph constructor takes the maximum possible number of vertices in the graph as an argument, so that implementations can allocate memory accordingly. We adopt this convention solely to make the code compact and readable. A more general graph ADT might include in its interface the capability to add and remove vertices as well as edges; this would impose more demanding requirements on the data structures used to implement the ADT. We might also choose to work at an intermediate level of abstraction, and consider the design of interfaces that support higher-level abstract operations on graphs that we can use in implementations. We revisit this idea briefly in Section 17.5, after we consider several concrete representations and implementations.

A general graph ADT needs to take into account parallel edges and self-loops, because nothing prevents a client program from calling insert with an edge that is already present in the graph (parallel edge) or with an edge whose two vertex indices are the same (self-loop). It might be necessary to disallow such edges in some applications, desirable to include them in other applications, and possible to ignore them in still other applications. Self-loops are trivial to handle, but parallel edges can be costly to handle, depending on the graph representation. In certain situations, including a remove parallel edges ADT function might be appropriate; then, implementations can let parallel edges collect, and clients can remove or otherwise process parallel edges when warranted. We will revisit these issues when we examine graph representations in Sections 17.3 and 17.4.

Program 17.2 is a function that illustrates the use of the iterator class in the graph ADT. It is a function that extracts a graph’s

Program 17.2 Example of a graph-processing client function

This function shows one way to use the graph ADT to implement a basic graph-processing operation in a manner independent of the representation. It returns all the graph’s edges in a vector.

This implementation illustrates the basis for most of the programs that we consider: we process each edge in the graph by checking all the vertices adjacent to each vertex. We generally do not call beg, end, and nxt except as illustrated here, so that we can better understand the performance characteristics of our implementations (see Section 17.5).

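template <class Graph>
vector <Edge> edges(Graph &G)
  { int E = 0;
    vector <Edge> a(G.E());
    for (int v = 0; v < G.V(); v++)
      {
        // visit each vertex w adjacent to v
        typename Graph::adjIterator A(G, v);
        for (int w = A.beg(); !A.end(); w = A.nxt())
          // in an undirected graph, record each edge once (when v < w)
          if (G.directed() || v < w)
            a[E++] = Edge(v, w);
      }
    return a;
  }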

set of edges and returns it in a C++ Standard Template Library (STL) vector. A graph is nothing more nor less than its set of edges, and we often need a way to retrieve a graph in this form, regardless of its internal representation. The order in which the edges appear in the vector is immaterial and will differ from implementation to implementation. We use a template for such functions to allow for using multiple implementations of the graph ADT.

Program 17.3 is another example of the use of the iterator class in the graph ADT, to print out a table of the vertices adjacent to each vertex, as shown in Figure 17.7. The code in these two examples is quite similar and is similar to the code in numerous graph-processing algorithms. Remarkably, we can build all of the algorithms that we consider in this book on this basic abstraction of processing all the vertices adjacent to each vertex (which is equivalent to processing all the edges in the graph), as in these functions.
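To make the abstraction concrete, the following self-contained sketch pairs the edges function of Program 17.2 with a stand-in graph class. The class SketchGraph is hypothetical: it is not one of the implementations developed in Sections 17.3 through 17.5, it uses a simple table of booleans, and it silently ignores parallel edges. Only the member names (V, E, directed, insert, remove, and adjIterator with beg, nxt, and end) follow the usage shown in the text.

```cpp
#include <vector>

struct Edge {
    int v, w;
    Edge(int v = -1, int w = -1) : v(v), w(w) { }
};

// Hypothetical stand-in for a graph ADT implementation (assumption, see lead-in).
class SketchGraph {
    int Vcnt, Ecnt;
    bool digraph;
    std::vector<std::vector<bool> > adj;     // adj[v][w] is true if edge v-w is present
public:
    SketchGraph(int V, bool directedFlag = false)
        : Vcnt(V), Ecnt(0), digraph(directedFlag),
          adj(V, std::vector<bool>(V, false)) { }
    int V() const { return Vcnt; }
    int E() const { return Ecnt; }
    bool directed() const { return digraph; }
    void insert(Edge e) {
        if (!adj[e.v][e.w]) Ecnt++;          // an edge already present is not counted again
        adj[e.v][e.w] = true;
        if (!digraph) adj[e.w][e.v] = true;  // undirected: keep both directions
    }
    void remove(Edge e) {
        if (adj[e.v][e.w]) Ecnt--;
        adj[e.v][e.w] = false;
        if (!digraph) adj[e.w][e.v] = false;
    }
    class adjIterator {
        const SketchGraph &G;
        int v, i;
    public:
        adjIterator(const SketchGraph &G, int v) : G(G), v(v), i(-1) { }
        int beg() { i = -1; return nxt(); }
        int nxt() {                          // advance to the next vertex adjacent to v
            for (i++; i < G.V(); i++)
                if (G.adj[v][i]) return i;
            return -1;
        }
        bool end() { return i >= G.V(); }
    };
    friend class adjIterator;
};

// The client function of Program 17.2, unchanged apart from std:: qualification.
template <class Graph>
std::vector<Edge> edges(Graph &G)
{
    int E = 0;
    std::vector<Edge> a(G.E());
    for (int v = 0; v < G.V(); v++) {
        typename Graph::adjIterator A(G, v);
        for (int w = A.beg(); !A.end(); w = A.nxt())
            if (G.directed() || v < w) a[E++] = Edge(v, w);
    }
    return a;
}
```

Because edges is a template, the same client code would work with any other class that provides these member functions, which is the point of hiding the representation behind the interface.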

Figure 17.7 Adjacency lists format

This table illustrates yet another way to represent the graph in Figure 17.1: we associate each vertex with its set of adjacent vertices (those connected to it by a single edge). Each edge affects two sets: for every edge u-v in the graph, u appears in v’s set and v appears in u’s set.

Program 17.3 A client function that prints a graph

This implementation of the show function from the IO class of Program 17.4 uses the graph ADT to print a table of the vertices adjacent to each graph vertex. The order in which the vertices appear depends upon the graph representation and the ADT implementation (see Figure 17.7).


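template <class Graph>
void IO<Graph>::show(const Graph &G)
  {
    for (int s = 0; s < G.V(); s++)
      {
        // print the vertex, then the vertices adjacent to it
        cout.width(2); cout << s << ":";
        typename Graph::adjIterator A(G, s);
        for (int t = A.beg(); !A.end(); t = A.nxt())
          { cout.width(2); cout << t << " "; }
        cout << endl;
      }
  }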

As discussed in Section 17.5, we often package related graph-processing functions into a single class. Program 17.4 is an interface for such a class. It defines the show function of Program 17.3 and two functions for inserting into a graph edges taken from standard input (see Exercise 17.12 and Program 17.14 for implementations of these functions).

Generally, the graph-processing tasks that we consider in this book fall into one of three broad categories:

• Compute the value of some measure of the graph.
• Compute some subset of the edges of the graph.
• Answer queries about some property of the graph.

Examples of the first are the number of connected components and the length of the shortest path between two given vertices in the graph; examples of the second are a spanning tree and the longest cycle containing a given vertex; an example of the third is whether two given vertices are in the same connected component. Indeed, the terms that we defined in Section 17.1 immediately bring to mind a host of computational problems.

Program 17.4 Graph-processing input/output interface

This class illustrates how we might package related graph-processing functions together in a single class. It defines functions for printing a graph (see Program 17.3); inserting edges defined by pairs of integers on standard input (see Exercise 17.12); and inserting edges defined by pairs of symbols on standard input (see Program 17.14).


template <class Graph>
class IO
  {
    public:
      static void show(const Graph &);
      static void scanEZ(Graph &);
      static void scan(Graph &);
  };

Our convention for addressing such tasks will be to build ADTs that are clients of the basic ADT in Program 17.1, but that, in turn, allow us to separate client programs that need to solve the problem from implementations. For example, Program 17.5 is an interface for a graph-connectivity ADT. We can write client programs that use this ADT to create objects that can provide the number of connected components in the graph and that can test whether or not any two vertices are in the same connected component. We describe implementations of this ADT and their performance characteristics in Section 18.5, and we develop similar ADTs throughout the book. Typically, such ADTs include a preprocessing public member function (usually the constructor), private data members that keep information learned during the preprocessing, and query public member functions that use this information to provide clients with information about the graph.

In this book, we generally work with static graphs, which have a fixed number of vertices V and edges E. Generally, we build the graphs by executing E calls to insert, then process them either by calling some ADT function that takes a graph as argument and returns some information about that graph, or by using objects of the kind just described to preprocess the graph so as to be able to efficiently answer queries about it. In either case, changing the graph by calling insert or remove necessitates reprocessing the graph. Dynamic problems,

Program 17.5 Connectivity interface

This ADT interface illustrates a typical paradigm that we use for implementing graph-processing algorithms. It allows a client to construct an object that processes a graph so that it can answer queries about the graph’s connectivity. The count member function returns the number of connected components and the connect member function tests whether two given vertices are connected. Program 18.4 is an implementation of this interface.


template <class Graph>
class CC
  {
    private:
      // implementation-dependent code
    public:
      CC(const Graph &);
      int count();
      bool connect(int, int);
  };

where we want to intermix graph processing with edge and vertex insertion and removal, take us into the realm of online algorithms (also known as dynamic algorithms), which present a different set of challenges. For example, the connectivity problem that we solved with union-find algorithms in Chapter 1 is an example of an online algorithm, because we can get information about the connectivity of a graph as we insert edges. The ADT in Program 17.1 supports insert edge and remove edge operations, so clients are free to use them to make changes in graphs, but there may be performance penalties for certain sequences of operations. For example, union-find algorithms may require reprocessing the whole graph if a client uses remove edge. For most of the graph-processing problems that we consider, adding or deleting a few edges can dramatically change the nature of the graph and thus necessitate reprocessing it.

One of our most important challenges in graph processing is to have a clear understanding of performance characteristics of implementations and to make sure that client programs make appropriate use of them. As with the simpler problems that we considered in

Program 17.6 Example of a graph-processing client program

This program illustrates the use of the graph-processing ADTs described in this section, using the ADT conventions described in Section 4.5. It constructs a graph with V vertices, inserts edges taken from standard input, prints the resulting graph if it is small, and computes (and prints) the number of connected components. It assumes that Program 17.1, Program 17.4, and Program 17.5 (with implementations) are in the files GRAPH.cc, IO.cc, and CC.cc (respectively).

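#include <iostream>
#include <cstdlib>
using namespace std;
#include "GRAPH.cc"
#include "IO.cc"
#include "CC.cc"
int main(int argc, char *argv[])
  { if (argc < 2) return 1;          // the number of vertices is required
    int V = atoi(argv[1]);
    GRAPH G(V);
    IO<GRAPH>::scan(G);              // insert edges taken from standard input
    if (V < 20) IO<GRAPH>::show(G);  // print the graph only if it is small
    cout << G.E() << " edges ";
    CC<GRAPH> Gcc(G);                // preprocess the graph for connectivity queries
    cout << Gcc.count() << " components" << endl;
    return 0;
  }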

Parts 1 through 4, our use of ADTs makes it possible to address such issues in a coherent manner.

Program 17.6 is an example of a graph-processing client program. It uses the basic ADT of Program 17.1, the input-output class of Program 17.4 to read the graph from standard input and print it to standard output, and the connectivity class of Program 17.5 to find its number of connected components. We use similar but more sophisticated clients to generate other types of graphs, to test algorithms, to learn other properties of graphs, and to use graphs to solve other problems. The basic scheme is amenable for use in any graph-processing application.

In Sections 17.3 through 17.5, we examine the primary classical graph representations and implementations of the ADT functions in Program 17.1. These implementations provide a basis for us to expand the interface to include the graph-processing tasks that are our focus for the next several chapters.

The first decision that we face in developing an ADT implementation is which graph representation to use. We have three basic requirements. First, we must be able to accommodate the types of graphs that we are likely to encounter in applications (and we also would prefer not to waste space). Second, we should be able to construct the requisite data structures efficiently. Third, we want to develop efficient algorithms to solve our graph-processing problems without being unduly hampered by any restrictions imposed by the representation. Such requirements are standard ones for any domain that we consider; we emphasize them again here because, as we shall see, different representations give rise to huge performance differences for even the simplest of problems.

For example, we might consider a vector of edges representation as the basis for

an ADT implementation (see Exercise 17.16). That direct representation is simple, but it does not allow us to perform efficiently the basic graph-processing operations that we shall be studying. As we will see, most graph-processing applications can be handled reasonably with one of two straightforward classical representations that are only slightly more complicated than the vector-of-edges representation: the adjacency-matrix or the adjacency-lists representation. These representations, which we consider in detail in Sections 17.3 and 17.4, are based on elementary data structures (indeed, we discussed them both in Chapters 3 and 5 as example applications of sequential and linked allocation). The choice between the two depends primarily on whether the graph is dense or sparse, although, as usual, the nature of the operations to be performed also plays an important role in the decision on which to use.
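To see why the vector-of-edges representation is inconvenient, consider the following minimal sketch. The names Edge, EdgeList, and adjacentTo are assumptions made for illustration (they are not the interface of Program 17.1), and the sketch omits remove and the iterator. Inserting an edge is cheap, but finding the vertices adjacent to a single vertex requires examining every edge.

```cpp
#include <cassert>
#include <vector>

// Hypothetical names for illustration only; not the text's Program 17.1.
struct Edge { int v, w; };

class EdgeList {
    int Vcnt;                    // number of vertices
    std::vector<Edge> edges;     // one entry per edge, in insertion order
public:
    explicit EdgeList(int V) : Vcnt(V) {}
    int V() const { return Vcnt; }
    int E() const { return static_cast<int>(edges.size()); }

    // Appending an edge takes constant (amortized) time.
    void insert(int v, int w) {
        assert(v >= 0 && v < Vcnt && w >= 0 && w < Vcnt);
        edges.push_back({v, w});
    }

    // Collecting the vertices adjacent to v scans all E edges,
    // no matter how few of them involve v.
    std::vector<int> adjacentTo(int v) const {
        std::vector<int> result;
        for (const Edge& e : edges) {
            if (e.v == v) result.push_back(e.w);
            else if (e.w == v) result.push_back(e.v);
        }
        return result;
    }
};
```

A client that asks for the neighbors of every vertex in turn would therefore spend time proportional to VE, which is the kind of cost that the two classical representations are designed to avoid.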

Exercises

• 17.12 Implement the scanEZ function from Program 17.4: write a function that builds a graph by reading edges (pairs of integers between 0 and V − 1) from standard input.

• 17.13 Write an ADT client that adds all the edges in a given vector to a given graph.

• 17.14 Write a function that calls edges and prints out all the edges in the graph, in the format used in this text (vertex numbers separated by a hyphen).

• 17.15 Develop an implementation for the connectivity ADT of Program 17.5, using a union-find algorithm (see Chapter 1).

• 17.16 Provide an implementation of the ADT functions in Program 17.1 that uses a vector of edges to represent the graph. Use a brute-force implementation of remove that removes an edge v-w by scanning the vector to find v-w or w-v and then exchanges the edge found with the final one in the vector. Use a similar scan to implement the iterator. Note: Reading Section 17.3 first might make this exercise easier.

17.3 Adjacency-Matrix Representation

An adjacency-matrix representation of a graph is a V-by-V matrix of Boolean values, with the entry in row v and column w defined to be 1 if there is an edge connecting vertex v and vertex w in the graph, and to be 0 otherwise. Figure 17.8 depicts an example.

Program 17.7 is an implementation of the graph ADT interface that uses a direct representation of this matrix, built as a vector of vectors, as depicted in Figure 17.9. It is a two-dimensional existence table with the entry adj[v][w] set to true if there is an edge connecting v and w in the graph, and set to false otherwise. Note that maintaining this property in an undirected graph requires that each edge be represented by two entries: the edge v-w is represented by true values in both adj[v][w] and adj[w][v], as is the edge w-v.

The name DenseGRAPH in Program 17.7 emphasizes that the implementation is more suited for dense graphs than for sparse ones, and distinguishes it from other implementations. Clients may use typedef to make this type equivalent to GRAPH or use DenseGRAPH explicitly.

In the adjacency matrix that represents a graph G, row v is a vector that is an existence table whose ith entry is true if vertex i is adjacent to v (the edge v-i is in G). Thus, to provide clients with the capability to process the vertices adjacent to v, we need only provide code that scans through this vector to find true entries, as shown in Program 17.8. We need to be mindful that, with this implementation, processing all of the vertices adjacent to a given vertex requires (at least) time proportional to V, no matter how many such vertices exist.

As mentioned in Section 17.2, our interface requires that the number of vertices is known to the client when the graph is initialized. If desired, we could allow for inserting and deleting vertices (see Exercise 17.21). A key feature of the constructor in Program 17.7 is

Figure 17.8 Adjacency-matrix graph representation

This Boolean matrix is another representation of the graph depicted in Figure 17.1. It has a 1 (true) in row v and column w if there is an edge connecting vertex v and vertex w and a 0 (false) in row v and column w if there is no such edge. The matrix is symmetric about the diagonal. For example, the sixth row (and the sixth column) says that vertex 6 is connected to vertices 0 and 4. For some applications, we will adopt the convention that each vertex is connected to itself, and assign 1s on the main diagonal. The large blocks of 0s in the upper right and lower left corners are artifacts of the way we assigned vertex numbers for this example, not characteristic of the graph (except that they do indicate the graph to be sparse).

Program 17.7 Graph ADT implementation (adjacency matrix)

This class is a straightforward implementation of the interface in Program 17.1 that is based on representing the graph with a vector of boolean vectors (see Figure 17.9). Edges are inserted and removed in constant time. Duplicate edge insert requests are silently ignored, but clients can use edge to test whether an edge exists. Constructing the graph takes time proportional to V^2.
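A class with the properties just described can be sketched as follows. This is a minimal illustration under stated assumptions, not the text's Program 17.7: the class name DenseGraphSketch and the bounds checks are additions made here, and the edge-iterator support of the full interface is omitted.

```cpp
#include <cassert>
#include <vector>

// Illustrative sketch only; the class name and the assert-based bounds
// checks are assumptions, and this is not the text's Program 17.7.
class DenseGraphSketch {
    int Vcnt, Ecnt;
    bool digraph;
    std::vector<std::vector<bool>> adj;   // adj[v][w] is true if v-w is an edge
public:
    // Setting all V*V entries to false takes time proportional to V^2.
    DenseGraphSketch(int V, bool digraph = false)
        : Vcnt(V), Ecnt(0), digraph(digraph),
          adj(V, std::vector<bool>(V, false)) {}

    int V() const { return Vcnt; }
    int E() const { return Ecnt; }
    bool directed() const { return digraph; }

    // Constant time; a duplicate insert request has no effect.
    void insert(int v, int w) {
        assert(v >= 0 && v < Vcnt && w >= 0 && w < Vcnt);
        if (!adj[v][w]) Ecnt++;
        adj[v][w] = true;
        if (!digraph) adj[w][v] = true;   // undirected: two entries per edge
    }

    // Constant time; removing a nonexistent edge has no effect.
    void remove(int v, int w) {
        assert(v >= 0 && v < Vcnt && w >= 0 && w < Vcnt);
        if (adj[v][w]) Ecnt--;
        adj[v][w] = false;
        if (!digraph) adj[w][v] = false;
    }

    // Constant-time existence test.
    bool edge(int v, int w) const {
        assert(v >= 0 && v < Vcnt && w >= 0 && w < Vcnt);
        return adj[v][w];
    }
};
```

Because the matrix is built from STL vectors, objects of such a class are copied and destroyed correctly without any extra code.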

Program 17.8 Iterator for adjacency-matrix representation

This implementation of the iterator for Program 17.7 uses an index i to scan past false entries in row v of the adjacency matrix (adj[v]). A call to beg() followed by a sequence of calls to nxt() (checking that end() is false before each call) gives a sequence of the vertices adjacent to v in G in order of their vertex index.

class DenseGRAPH::adjIterator
{ const DenseGRAPH &G;
  int i, v;
public:
  adjIterator(const DenseGRAPH &G, int v) :
    G(G), i(-1), v(v) { }
  int beg()
    { i = -1; return nxt(); }
  int nxt()
    {
      // scan past false entries in row v
      for (i++; i < G.V(); i++)
        if (G.adj[v][i] == true) return i;
      return -1;
    }
  bool end()
    { return i >= G.V(); }
};

that it initializes the graph by setting the matrix entries all to false. We need to be mindful that this operation takes time proportional to V^2, no matter how many edges are in the graph. Error checks for insufficient memory are not included in Program 17.7 for brevity; it is prudent programming practice to add them before using this code (see Exercise 17.24).

To add an edge, we set the indicated matrix entries to true (one for digraphs, two for undirected graphs). This representation does not allow parallel edges: If an edge is to be inserted for which the matrix entries are already 1, the code has no effect. In some ADT designs, it might be preferable to inform the client of the attempt to insert a parallel edge, perhaps using a return code from insert. This representation does allow self-loops: An edge v-v is represented by a nonzero entry in a[v][v].

Figure 17.9 Adjacency matrix data structure

This figure depicts the C++ representation of the graph in Figure 17.1, as a vector of vectors.

To remove an edge, we set the indicated matrix entries to false. If a nonexistent edge (one for which the matrix entries are already false) is to be removed, the code has no effect. Again, in some ADT designs, we might wish to arrange to inform the client of such conditions.

If we are processing huge graphs or huge numbers of small graphs, or space is otherwise tight, there are several ways to save space. For example, adjacency matrices that represent undirected graphs are symmetric: a[v][w] is always equal to a[w][v]. Thus, we could save space by storing only one-half of this symmetric matrix (see Exercise 17.22). Another way to save a significant amount of space is to use a matrix of bits (assuming that vector<bool> does not do so). In this way, for instance, we could represent graphs of up to about 64,000 vertices in about 64 million 64-bit words (see Exercise 17.23). These implementations have the slight complication that we need to add an ADT operation to test for the existence of an edge (see Exercise 17.20). (We do not use such an operation in our implementations because we can test for the existence of an edge v-w by simply testing a[v][w].) Such space-saving techniques are effective, but come at the cost of extra overhead that may fall in the inner loop in time-critical applications.

Many applications involve associating other information with each edge; in such cases, we can generalize the adjacency matrix to hold any information whatever, not just bools. Whatever data type that we use for the matrix elements, we need to include an indication whether the indicated edge is present or absent. In Chapters 20 and 21, we explore such representations.

Use of adjacency matrices depends on associating vertex names with integers between 0 and V − 1. This assignment might be done in one of many ways (for example, we consider a program that does so in Section 17.6). Therefore, the specific matrix of 0-1 values that we represent with a vector of vectors in C++ is but one possible representation of any given graph as an adjacency matrix, because another program might assign different vertex names to the indices we use to specify rows and columns. Two matrices that appear to be markedly different could represent the same graph (see Exercise 17.17). This observation is a restatement of the graph isomorphism problem: Although we might like to determine whether or not two different matrices represent the same graph, no one has devised an algorithm that can always do so efficiently. This difficulty is fundamental. For example, our ability to find an efficient solution to various important graph-processing problems depends completely on the way in which the vertices are numbered (see, for example, Exercise 17.26).

Program 17.3, which we considered in Section 17.2, prints out a table with the vertices adjacent to each vertex. When used with the implementation in Program 17.7, it prints the vertices in order of their vertex index, as in Figure 17.7. Notice, though, that it is not part of the definition of adjIterator that it visits vertices in index order, so developing an ADT client that prints out the adjacency-matrix representation of a graph is not a trivial task (see Exercise 17.18). The outputs produced by these programs are themselves graph representations that clearly illustrate a basic performance tradeoff. To print out the matrix, we need room on the page for all V^2 entries; to print out the lists, we need room for just V + E numbers. For sparse graphs, when V^2 is huge compared to V + E, we prefer the lists; for dense graphs, when E and V^2 are comparable, we prefer the matrix. As we shall soon see, we make the same basic tradeoff when we compare the adjacency-matrix representation with its primary alternative: an explicit representation of the lists.

The adjacency-matrix representation is not satisfactory for huge sparse graphs: We need at least V^2 bits of storage and V^2 steps just to construct the representation. In a dense graph, when the number of edges (the number of 1 bits in the matrix) is proportional to V^2, this cost may be acceptable, because time proportional to V^2 is required to process the edges no matter what representation we use. In a sparse graph, however, just initializing the matrix could be the dominant factor in the running time of an algorithm. Moreover, we may not even have enough space for the matrix. For example, we may be faced with graphs with millions of vertices and tens of millions of edges, but we may not want (or be able) to pay the price of reserving space for trillions of 0 entries in the adjacency matrix.

On the other hand, when we do need to process a huge dense graph, then the 0-entries that represent absent edges increase our space needs by only a constant factor and provide us with the ability to determine whether any particular edge is present in constant time. For example, disallowing parallel edges is automatic in an adjacency matrix but is costly in some other representations. If we do have space available to hold an adjacency matrix, and either V^2 is so small as to represent a negligible amount of time or we will be running a complex algorithm that requires more than V^2 steps to complete, the adjacency-matrix representation may be the method of choice, no matter how dense the graph.
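The matrix-of-bits idea mentioned in this section can be sketched as follows. This is a minimal illustration, assuming 64-bit words; the names BitMatrixSketch and wordsNeeded are hypothetical, and the sketch stores the full matrix rather than exploiting symmetry. Entry (v, w) lives at bit position v*V + w, so a graph with V vertices needs about V^2/64 words.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical class: packs the V*V Boolean entries into 64-bit words.
class BitMatrixSketch {
    std::size_t V;
    std::vector<std::uint64_t> bits;
public:
    explicit BitMatrixSketch(std::size_t V)
        : V(V), bits((V * V + 63) / 64, 0) {}

    std::size_t words() const { return bits.size(); }

    // Set or clear the entry in row v and column w.
    void set(std::size_t v, std::size_t w, bool value) {
        assert(v < V && w < V);
        std::size_t k = v * V + w;                        // bit position of (v, w)
        std::uint64_t mask = std::uint64_t(1) << (k % 64);
        if (value) bits[k / 64] |= mask;
        else       bits[k / 64] &= ~mask;
    }

    // Test the entry in row v and column w.
    bool test(std::size_t v, std::size_t w) const {
        assert(v < V && w < V);
        std::size_t k = v * V + w;
        return (bits[k / 64] >> (k % 64)) & 1u;
    }
};

// Number of 64-bit words needed for V vertices, without allocating anything.
inline std::uint64_t wordsNeeded(std::uint64_t V) { return (V * V + 63) / 64; }
```

For V = 64,000 the count is 64,000 × 64,000 / 64 = 64,000,000 words, which matches the figure quoted in the text. The shifting and masking in set and test are the extra overhead that the text warns may fall in an inner loop.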

Exercises

• 17.17 Give the adjacency-matrix representations of the three graphs depicted in Figure 17.2.

• 17.18 Give an implementation of show for the representation-independent io package of Program 17.4 that prints out a two-dimensional matrix of 0s and 1s like the one illustrated in Figure 17.8. Note: You cannot depend upon the iterator producing vertices in order of their indices.

17.19 Given a graph, consider another graph that is identical to the first, except that the names of (integers corresponding to) two vertices are interchanged. How are the adjacency matrices of these two graphs related?

• 17.20 Add a function edge to the graph ADT that allows clients to test whether there is an edge connecting two given vertices, and provide an implementation for the adjacency-matrix representation.

• 17.21 Add functions to the graph ADT that allow clients to insert and delete vertices, and provide implementations for the adjacency-matrix representation.

• 17.22 Modify Program 17.7, augmented as described in Exercise 17.20, to cut its space requirements about in half by not including array entries a[v][w] for w greater than v.

17.23 Modify Program 17.7, augmented as described in Exercise 17.20, to ensure that, if your computer has B bits per word, a graph with V vertices is represented in about V^2/B words (as opposed to V^2). Do empirical tests to assess the effect of packing bits into words on the time required for the ADT operations.

17.24 Describe what happens if there is insufficient memory available to represent the matrix when the constructor in Program 17.7 is invoked, and suggest appropriate modifications to the code to handle this situation.

17.25 Develop a version of Program 17.7 that uses a single vector with V^2 entries.

• 17.26 Suppose that all k vertices in a group have consecutive indices. How can you determine from the adjacency matrix whether or not that group of vertices constitutes a clique? Write a client ADT function that finds, in time proportional to V^2, the largest group of vertices with consecutive indices that constitutes a clique.

17.4 Adjacency-Lists Representation

The standard representation that is preferred for graphs that are not dense is called the adjacency-lists representation, where we keep track of all the vertices connected to each vertex on a linked list that is associated with that vertex. We maintain a vector of lists so that, given a vertex, we can immediately access its list; we use linked lists so that we can add new edges in constant time.

Program 17.9 is an implementation of the ADT interface in Program 17.1 that is based on this approach, and Figure 17.10 depicts an example. To add an edge connecting v and w to this representation of the graph, we add w to v’s adjacency list and v to w’s adjacency list. In this way, we still can add new edges in constant time, but the total amount of space that we use is proportional to the number of vertices plus the number of edges (as opposed to the number of vertices squared, for the adjacency-matrix representation). For undirected graphs, we again represent each edge in two different places: an edge connecting v and w is represented as nodes on both adjacency lists. It is important to include both; otherwise, we could not answer efficiently simple questions such as, “Which vertices are adjacent to vertex v?” Program 17.10 implements the iterator that answers this question for clients, in time proportional to the number of such vertices.

The implementation in Programs 17.9 and 17.10 is a low-level one. An alternative is to use the STL list to implement each linked list (see Exercise 17.30). The disadvantage of doing so is that STL list implementations need to support many more operations than we need and therefore typically carry extra overhead that might affect the performance of all of our algorithms (see Exercise 17.31). Indeed, all of our graph algorithms use the Graph ADT interface, so this implementation is an appropriate place to encapsulate all the low-level operations and concentrate on efficiency without affecting our other code. Another advantage of using the linked-list representation is that it provides a concrete basis for understanding the performance characteristics of our implementations.

But an important factor to consider is that the linked-list–based implementation in Programs 17.9 and 17.10 is incomplete, because it lacks a destructor and a copy constructor. For many applications, this defect could lead to unexpected results or severe performance problems.

Figure 17.10 Adjacency-lists data structure

This figure depicts a representation of the graph in Figure 17.1 as an array of linked lists. The space used is proportional to the number of nodes plus the number of edges. To find the indices of the vertices connected to a given vertex v, we look at the vth position in an array, which contains a pointer to a linked list containing one node for each vertex connected to v. The order in which the nodes appear on the lists depends on the method that we use to construct the lists.

Program 17.9 Graph ADT implementation (adjacency lists)

This implementation of the interface in Program 17.1 uses a vector of linked lists, one corresponding to each vertex. It is equivalent to the representation of Program 3.15, where an edge v-w is represented by a node for w on list v and a node for v on list w. Implementations of remove and edge are left for exercises, as are the copy constructor and the destructor. The insert code keeps insertion time constant by not checking for duplicate edges, and the total amount of space used is proportional to V + E; hence, this representation is most suitable for sparse multigraphs.

Clients may use typedef to make this type equivalent to GRAPH or use SparseMultiGRAPH explicitly.
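A class along the lines just described can be sketched as follows. This is a minimal illustration, not the text's Program 17.9: the class name SparseMultiGraphSketch and the neighbors helper are assumptions, and the destructor and the disabled copy operations are additions made here so that the sketch manages its memory safely.

```cpp
#include <cassert>
#include <vector>

// Illustrative sketch only; not the text's Program 17.9.
class SparseMultiGraphSketch {
    struct node {
        int v; node* next;
        node(int x, node* t) : v(x), next(t) {}
    };
    typedef node* link;
    int Vcnt, Ecnt;
    bool digraph;
    std::vector<link> adj;       // adj[v] is the head of v's adjacency list
public:
    SparseMultiGraphSketch(int V, bool digraph = false)
        : Vcnt(V), Ecnt(0), digraph(digraph),
          adj(V, static_cast<link>(nullptr)) {}

    // Added here: free every list node.
    ~SparseMultiGraphSketch() {
        for (link t : adj)
            while (t) { link next = t->next; delete t; t = next; }
    }
    // Added here: forbid copying rather than share list nodes by accident.
    SparseMultiGraphSketch(const SparseMultiGraphSketch&) = delete;
    SparseMultiGraphSketch& operator=(const SparseMultiGraphSketch&) = delete;

    int V() const { return Vcnt; }
    int E() const { return Ecnt; }

    // Constant time: no check for duplicates, so parallel edges are kept.
    void insert(int v, int w) {
        assert(v >= 0 && v < Vcnt && w >= 0 && w < Vcnt);
        adj[v] = new node(w, adj[v]);                // push w on the front of v's list
        if (!digraph) adj[w] = new node(v, adj[w]);  // and v on the front of w's list
        Ecnt++;
    }

    // Hypothetical helper: copy v's list, in list order, for inspection.
    std::vector<int> neighbors(int v) const {
        assert(v >= 0 && v < Vcnt);
        std::vector<int> out;
        for (link t = adj[v]; t; t = t->next) out.push_back(t->v);
        return out;
    }
};
```

Because each new node goes on the front of a list, the neighbors of a vertex appear in the reverse of the order in which its edges were inserted, and a repeated edge simply appears twice.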

Program 17.10 Iterator for adjacency-lists representation

This implementation of the iterator for Program 17.9 maintains a link t to traverse the linked list associated with vertex v. A call to beg() followed by a sequence of calls to nxt() (checking that end() is false before each call) gives a sequence of the vertices adjacent to v in G.

class SparseMultiGRAPH::adjIterator
{ const SparseMultiGRAPH &G;
  int v;
  link t;
public:
  adjIterator(const SparseMultiGRAPH &G, int v) :
    G(G), v(v) { t = 0; }
  int beg()
    { t = G.adj[v]; return t ? t->v : -1; }
  int nxt()
    { if (t) t = t->next; return t ? t->v : -1; }
  bool end()
    { return t == 0; }
};

These functions are direct extensions of those in the first-class queue implementation of Program 4.22 (see Exercise 17.29). We assume throughout the book that SparseMultiGRAPH objects have them. Using the STL list instead of low-level singly-linked lists has the distinct advantage that this extra code is unnecessary because a proper destructor and copy constructor are automatically defined. For example, DenseGRAPH objects built by Program 17.7 are properly destroyed and copied in client programs that manipulate them, because they are built from STL objects.

By contrast to Program 17.7, Program 17.9 builds multigraphs, because it does not remove parallel edges. Checking for duplicate edges in the adjacency-lists structure would necessitate searching through the lists and could take time proportional to V. Similarly, Program 17.9 does not include an implementation of the remove edge operation or the edge existence test. Adding implementations for these functions is an easy exercise (see Exercise 17.28), but each operation might take time proportional to V, to search through the lists for the nodes that represent the edges. These costs make the basic adjacency-lists representation unsuitable for applications involving huge graphs where parallel edges cannot be tolerated, or applications involving heavy use of remove edge or of edge existence tests. In Section 17.5, we discuss adjacency-list implementations that support constant-time remove edge and edge existence operations.

When a graph’s vertex names are not integers, then (as with adjacency matrices) two different programs might associate vertex names with the integers from 0 to V − 1 in two different ways, leading to two different adjacency-list structures (see, for example, Program 17.15). We cannot expect to be able to tell whether two different structures represent the same graph because of the difficulty of the graph isomorphism problem.

Moreover, with adjacency lists, there are numerous representations of a given graph even for a given vertex numbering. No matter in what order the edges appear on the adjacency lists, the adjacency-list structure represents the same graph (see Exercise 17.33). This characteristic of adjacency lists is important to know because the order in which edges appear on the adjacency lists affects, in turn, the order in which edges are processed by algorithms. That is, the adjacency-list structure determines how our various algorithms see the graph. Although an algorithm should produce a correct answer no matter how the edges are ordered on the adjacency lists, it might get to that answer by different sequences of computations for different orderings. If an algorithm does not need to examine all the graph’s edges, this effect might affect the time that it takes. And, if there is more than one correct answer, different input orderings might lead to different output results.

The primary advantage of the adjacency-lists representation over the adjacency-matrix representation is that it always uses space proportional to E + V, as opposed to V^2 in the adjacency matrix. The primary disadvantage is that testing for the existence of specific edges can take time proportional to V, as opposed to constant time in the adjacency matrix. These differences trace, essentially, to the difference between using linked lists and vectors to represent the set of vertices incident on each vertex.

Thus, we see again that an understanding of the basic properties of linked data structures and vectors is critical if we are to develop efficient graph ADT implementations. Our interest in these performance differences is that we want to avoid implementations that are inappropriately inefficient under unexpected circumstances when a wide range of operations is to be demanded of the ADT. In Section 17.5, we discuss the application of basic data structures to realize many of the theoretical benefits of both structures. Nonetheless, Program 17.9 is a simple implementation with the essential characteristics that we need to learn efficient algorithms for processing sparse graphs.
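The observation that one graph has many adjacency-lists representations, even for a fixed vertex numbering, can be illustrated with the following minimal sketch. It uses std::forward_list as a stand-in for the low-level linked lists of the text; the names AdjLists, build, and sameGraph are hypothetical. Inserting the same edges in two different orders yields two structures that differ as data but describe the same graph.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <forward_list>
#include <utility>
#include <vector>

// Hypothetical stand-in for a vector of linked adjacency lists.
typedef std::vector<std::forward_list<int>> AdjLists;

// Build undirected adjacency lists by pushing each edge on the front of both lists.
AdjLists build(int V, const std::vector<std::pair<int, int>>& edges) {
    AdjLists adj(V);
    for (const auto& e : edges) {
        assert(e.first >= 0 && e.first < V && e.second >= 0 && e.second < V);
        adj[e.first].push_front(e.second);
        adj[e.second].push_front(e.first);
    }
    return adj;
}

// True if, for every vertex, both structures hold the same multiset of neighbors
// (the vertex numbering is taken to be the same in both).
bool sameGraph(const AdjLists& a, const AdjLists& b) {
    if (a.size() != b.size()) return false;
    for (std::size_t v = 0; v < a.size(); v++) {
        std::vector<int> x(a[v].begin(), a[v].end());
        std::vector<int> y(b[v].begin(), b[v].end());
        std::sort(x.begin(), x.end());
        std::sort(y.begin(), y.end());
        if (x != y) return false;
    }
    return true;
}
```

An algorithm that follows the first edge on each list would visit different vertices first in the two structures, which is why the order of insertion can change both running time and, when several answers are correct, the answer itself.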

Exercises

• 17.27 Show, in the style of Figure 17.10, the adjacency-lists structure produced when you use Program 17.9 to insert the edges in the graph

3-7 1-4 7-8 0-5 5-2 3-8 2-9 0-6 4-9 2-6 6-4

(in that order) into an initially empty graph.

17.28 Provide implementations of remove and edge for the adjacency-lists graph class (Program 17.9). Note: Duplicates may be present, but it suffices to remove any edge connecting the specified vertices.

17.29 Add a copy constructor and a destructor to the adjacency-lists graph class (Program 17.9). Hint: See Program 4.22.

• 17.30 Modify the implementation of SparseMultiGRAPH (Programs 17.9 and 17.10) to use an STL list instead of a linked list for each adjacency list.

17.31 Run empirical tests to compare your SparseMultiGRAPH implementation of Exercise 17.30 with the implementation in the text. For a well-chosen set of values for V, compare running times for a client program that builds complete graphs with V vertices, then extracts the edges using Program 17.2.

• 17.32 Give a simple example of an adjacency-lists graph representation that could not have been built by repeated insertion of edges by Program 17.9.

17.33 How many different adjacency-lists representations represent the same graph as the one depicted in Figure 17.10?

• 17.34 Add a public member function declaration to the graph ADT (Program 17.1) that removes self-loops and parallel edges. Provide the trivial implementation of this function for the adjacency-matrix–based class (Program 17.7), and provide an implementation of the function for the adjacency-list–based class (Program 17.9) that uses time proportional to E and extra space proportional to V.

17.35 Write a version of Program 17.9 that disallows parallel edges (by scanning through the adjacency list to avoid adding a duplicate entry on each edge insertion) and self-loops. Compare your implementation with the implementation described in Exercise 17.34. Which is better for static graphs? Note: See Exercise 17.49 for an efficient implementation.

17.36 Write a client of the graph ADT that returns the result of removing self-loops, parallel edges, and degree-0 (isolated) vertices from a given graph. Note: The running time of your program should be linear in the size of the graph representation.

• 17.37 Write a client of the graph ADT that returns the result of removing self-loops, collapsing paths that consist solely of degree-2 vertices from a given graph. Specifically, every degree-2 vertex in a graph with no parallel edges appears on some path u-…-w where u and w are either equal or not of degree 2. Replace any such path with u-w, and then remove all unused degree-2 vertices as in Exercise 17.36. Note: This operation may introduce self-loops and parallel edges, but it preserves the degrees of vertices that are not removed.

• 17.38 Give a (multi)graph that could result from applying the transformation described in Exercise 17.37 on the sample graph in Figure 17.1.

17.5 Variations, Extensions, and Costs

In this section, we describe a number of options for improving the graph representations discussed in Sections 17.3 and 17.4. The topics fall into one of three categories. First, the basic adjacency-matrix and adjacency-lists mechanisms extend readily to allow us to represent other types of graphs. In the relevant chapters, we consider these extensions in detail and give examples; here, we look at them briefly. Second, we discuss graph ADT designs with more features than our basic one and implementations that use more advanced data structures to efficiently implement them. Third, we discuss our general approach to addressing graph-processing tasks, by developing task-specific classes that use the basic graph ADT.

Our implementations in Programs 17.7 and 17.9 build digraphs if the call to the constructor includes a second argument with value true. We represent each edge just once, as illustrated in Figure 17.11. An edge v-w in a digraph is represented by a 1 in the entry in row v and column w in the adjacency matrix or by the appearance of w on v’s adjacency list in the adjacency-lists representation. These representations are simpler than the corresponding representations that we have been considering for undirected graphs, but the asymmetry makes digraphs more complicated combinatorial objects than undirected graphs, as we see in Chapter 19. For example, the standard adjacency-lists representation gives no direct way to find all edges coming into a vertex in a digraph, so we would need to choose a different representation if that operation needs to be supported.

Figure 17.11 Digraph representations

The adjacency-matrix and adjacency-lists representations of a digraph have only one representation of each edge, as illustrated in the adjacency-matrix (top) and adjacency-lists (bottom) representation of the set of edges in Figure 17.1 interpreted as a digraph (see Figure 17.6, top).

For weighted graphs and networks, we fill the adjacency matrix with structures containing information about edges (including their presence or absence) instead of Boolean values; in the adjacency-lists representation, we include this information in adjacency-list elements.

It is often necessary to associate still more information with the vertices or edges of a graph, to allow that graph to model more complicated objects. We can associate extra information with each edge by extending the Edge type in Program 17.1 as appropriate, then using instances of that type in the adjacency matrix, or in the list nodes in the adjacency lists. Or, since vertex names are integers between 0 and V − 1, we can use vertex-indexed vectors to associate extra information for vertices, perhaps using an appropriate ADT. We consider ADTs of this sort in Chapters 20 through 22. Alternatively, we could use a separate symbol-table ADT to associate extra information with each vertex and edge (see Exercise 17.48 and Program 17.15).

To handle various specialized graph-processing problems, we often define classes that contain specialized auxiliary data structures related to the graph. The most common such data structure is a vertex-indexed vector, as we saw already in Chapter 1, where we used vertex-indexed vectors to answer connectivity queries. We use vertex-indexed vectors in numerous implementations throughout the book.

As an example, suppose that we wish to know whether a vertex v in a graph is isolated. Is v of degree 0? For the adjacency-lists representation, we can find this information immediately, simply by checking whether adj[v] is null. But for the adjacency-matrix representation, we need to check all V entries in the row or column corresponding to v to know that each one is not connected to any other vertex; and for the vector-of-edges representation, we have no better approach than to check all E edges to see whether there are any that involve v. We need to enable clients to avoid these potentially time-consuming computations. As discussed in Section 17.2, one way to do so is to define a client ADT for the problem, such as the example in Program 17.11. This implementation, after preprocessing the graph in time proportional to the size of its representation, allows clients to find the degree of any

Program 17.11 Vertex-degrees class implementation

This class provides a way for clients to learn the degree of any given vertex in a GRAPH in constant time, after linear-time preprocessing in the constructor. The implementation is based on maintaining a vertex-indexed vector of vertex degrees as a private data member and overloading [] as a public function member. We initialize all entries to 0, then process all edges in the graph, incrementing the appropriate entry for each edge. We use classes like this one throughout the book to develop object-oriented implementations of graph-processing functions as clients of class GRAPH.

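template <class Graph> class DEGREE
{ const Graph &G;
  vector<int> degree;
public:
  DEGREE(const Graph &G) : G(G), degree(G.V(), 0)
    {
      // visit every vertex v and count the entries on its adjacency list
      for (int v = 0; v < G.V(); v++)
        { typename Graph::adjIterator A(G, v);
          for (int w = A.beg(); !A.end(); w = A.nxt())
            degree[v]++;
        }
    }
  int operator[](int v) const
    { return degree[v]; }
};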

vertex in constant time. That is no improvement if the client needs the degree of just one vertex, but it represents a substantial savings for clients that need to know the degrees of many vertices. Such a substantial performance differential for such a simple problem is typical in graph processing.

For each of the graph-processing tasks that we consider in this book, we encapsulate solutions in classes like this one, with private data and function members and public function members specific to the task. Clients create objects whose member functions do the graph processing. This approach amounts to extending the graph ADT interface by defining a cooperating set of classes. Any set of such classes defines a graph-processing interface, but each encapsulates its own private data and function members.

There are many other ways to build upon an interface in C++. One way to proceed is to simply add public function members (and whatever private data and function members we might need) to the basic GRAPH ADT definition. While this approach has all of the virtues extolled in Chapter 4, it also has some serious drawbacks, because the world of graph-processing is significantly more expansive than the kinds of basic data structures that are the subject of Chapter 4. Chief among these drawbacks are the following:

• There are many more graph-processing functions to implement than we can accurately define in a single interface.

• Simple graph-processing tasks have to use the same interface needed by complicated tasks.

• One member function can access a data member intended for use by another member function, contrary to encapsulation principles that we would like to follow.

Interfaces of this kind have come to be known as fat interfaces. In a book filled with graph-processing algorithms, an interface of this sort would be fat indeed.

Another approach is to use inheritance to define various types of graphs that provide clients with various sets of graph-processing tasks. Comparing the intricacies of this approach with the simpler approach that we use is a worthwhile exercise in the study of software engineering, but would take us still further afield from the subject of graph-processing algorithms, our main focus.

Table 17.1 shows the dependence of the cost of various simple graph-processing operations on the representation that we use. This table is worth examining before we consider the implementation of more complicated operations; it will help you to develop an intuition for the difficulty of various primitive operations. Most of the costs listed follow immediately from inspecting the code, with the exception of the bottom row, which we consider in detail at the end of this section.

In several cases, we can modify the representation to make simple operations more efficient, although we have to take care that doing so does not increase costs for other simple operations. For example, the entry for adjacency-matrix destroy is an artifact of our vector-of-vectors

Table 17.1 Worst-case cost of graph-processing operations

The performance characteristics of basic graph-processing ADT operations for different graph representations vary widely, even for simple tasks, as indicated in this table of the worst-case costs (all within a constant factor for large V and E). These costs are for the simple implementations we have described in previous sections; various modifications that affect the costs are described in the text of this section.

allocation scheme for two-dimensional matrices (see Section 3.7). It is not difficult to reduce this cost to be constant (see Exercise 17.25). On the other hand, if graph edges are sufficiently complex structures that the matrix entries are pointers, then to destroy an adjacency matrix would take time proportional to V^2.

Because of their frequent use in typical applications, we consider the find edge and remove edge operations in detail. In particular, we need a find edge operation to be able to remove or disallow parallel edges. As we saw in Section 17.3, these operations are trivial if we are using an adjacency-matrix representation: we need only to check or set a matrix entry that we can index directly. But how can we implement these operations efficiently in the adjacency-lists representation? In C++, we could use the STL; here we describe underlying mechanisms, to gain perspective on efficiency issues. One approach is described next, and another is described in Exercise 17.50. Both approaches are based on symbol-table implementations. If we use, for example, dynamic hash table implementations (see Section 14.5), both approaches take space proportional to E and allow us to perform either operation in constant time (on the average, amortized).

Specifically, to implement find edge when we are using adjacency lists, we could use an auxiliary symbol table for the edges. We can assign an edge v-w the integer key v*V+w and use an STL map or any of the symbol-table implementations from Part 4. (For undirected graphs, we might assign the same key to v-w and w-v.) We can insert each edge into the symbol table, after first checking whether it has already been inserted. We can choose either to disallow parallel edges (see Exercise 17.49) or to maintain duplicate records in the symbol table for parallel edges (see Exercise 17.50). In the present context, our main interest in this technique is that it provides a constant-time find edge implementation for adjacency lists.

To be able to remove edges, we need a pointer in the symbol-table record for each edge that refers to its representation in the adjacency-lists structure. But even this information is not sufficient to allow us to remove the edge in constant time unless the lists are doubly linked (see Section 3.4). Furthermore, in undirected graphs, it is not sufficient to remove the node from the adjacency list, because each edge appears on two different adjacency lists. One solution to this difficulty is to put both pointers in the symbol table; another is to link together the two list nodes that correspond to a particular edge (see Exercise 17.46). With either of these solutions, we can remove an edge in constant time.

Removing vertices is more expensive.
In the adjacency-matrix representation, we essentially need to remove a row and a column from the matrix, which is not much less expensive than starting over again with a smaller matrix (although that cost can be amortized using the same mechanism as for dynamic hash tables). If we are using an adjacency-lists representation, we see immediately that it is not sufficient to remove nodes from the vertex’s adjacency list, because each node on the adjacency list specifies another vertex whose adjacency list we must search to remove the other node that represents the same edge. We need the extra links to support constant-time edge removal as described in the previous paragraph if we are to remove a vertex in time proportional to V.

We omit implementations of these operations here because they are straightforward programming exercises using basic techniques from Part 1, because the STL provides implementations that we could use, because maintaining complex structures with multiple pointers per node is not justified in typical applications that involve static graphs, and because we wish to avoid getting bogged down in layers of abstraction or in low-level details of maintaining multiple pointers when implementing graph-processing algorithms that do not otherwise use them. In Chapter 22, we do consider implementations of a similar structure that play an essential role in the powerful general algorithms that we consider in that chapter.

For clarity in describing and developing implementations of algorithms of interest, we use the simplest appropriate representation. Generally, we strive to use data structures that are directly relevant to the task at hand. Many programmers practice this kind of minimalism as a matter of course, knowing that maintaining the integrity of a data structure with multiple disparate components can be a challenging task, indeed.

We might also consider alternate implementations that modify the basic data structures in a performance-tuning process to save space or time, particularly when processing huge graphs (or huge numbers of small graphs). For example, we can dramatically improve the performance of algorithms that process huge static graphs represented with adjacency lists by stripping down the representation to use vectors of varying length instead of linked lists to represent the set of vertices incident on each vertex. With this technique, we can ultimately represent a graph with just 2E integers less than V and V integers less than V^2 (see Exercises 17.52 and 17.54). Such representations are attractive for processing huge static graphs.

The algorithms that we consider adapt readily to all the variations that we have discussed in this section, because they are based on a few high-level abstract operations such as “perform the following operation for each edge adjacent to vertex v” that are supported by our basic ADT.

In some instances, our algorithm-design decisions depend on certain properties of the representation. Working at a higher level of abstraction might obscure our knowledge of that dependence. If we know that one representation would lead to poor performance but another would not, we would be taking an unnecessary risk were we to consider the algorithm at the wrong level of abstraction.
As usual, our goal is to craft implementations such that we can make precise statements about performance. For this reason, though both implement the basic graph ADT, we retain separate DenseGRAPH and SparseMultiGRAPH types for the adjacency-matrix and adjacency-lists representations, respectively, to emphasize that clients can use these implementations as appropriate to suit the task at hand.

All of the operations that we have considered so far are simple, albeit necessary, data-processing functions; and the bottom line of the discussion in this section is that basic algorithms and data structures from Parts 1 through 3 are effective for handling them. As we develop more sophisticated graph-processing algorithms, we face more difficult challenges in finding the best implementations for specific practical problems. To illustrate this point, we consider the last row in Table 17.1, which gives the costs of determining whether there is a path connecting two given vertices.

In the worst case, the simple path-finding algorithm in Section 17.7 examines all E edges in the graph (as do several other methods that we consider in Chapter 18). The entries in the center and right columns on the bottom row in Table 17.1 indicate, respectively, that the algorithm may examine all V^2 entries in an adjacency-matrix representation, and all V list heads and all E nodes on the lists in an adjacency-lists representation. These facts imply that the algorithm’s running time is linear in the size of the graph representation, but they also exhibit two anomalies: The worst-case running time is not linear in the number of edges in the graph if we are using an adjacency-matrix representation for a sparse graph or either representation for an extremely sparse graph (one with a huge number of isolated vertices). To avoid repeatedly considering these anomalies, we assume throughout that the size of the representation that we use is proportional to the number of edges in the graph. This point is moot in the majority of applications because they involve huge sparse graphs and thus require an adjacency-lists representation.

The left column on the bottom row in Table 17.1 derives from the use of the union-find algorithms in Chapter 1 (see Exercise 17.15). This method is attractive because it only requires space proportional to V, but has the drawback that it cannot exhibit the path.
This entry highlights the importance of completely and precisely specifying graph-processing problems.

Even after taking all of these factors into consideration, one of the most significant challenges that we face when developing practical graph-processing algorithms is assessing the extent to which the results of worst-case performance analyses, such as those in Table 17.1, overestimate time and space needs for processing graphs that we encounter in practice. Most of the literature on graph algorithms describes performance in terms of such worst-case guarantees, and, while this information is helpful in identifying algorithms that can have unacceptably poor performance, it may not shed much light on which of several simple, direct programs may be most suitable for a given application. This situation is exacerbated by the difficulty of developing useful models of average-case performance for graph algorithms, leaving us with (perhaps unreliable) benchmark testing and (perhaps overly conservative) worst-case performance guarantees to work with. For example, the graph-search methods that we discuss in Chapter 18 are all effective linear-time algorithms for finding a path between two given vertices, but their performance characteristics differ markedly, depending both upon the graph being processed and its representation. When using graph-processing algorithms in practice, we constantly fight this disparity between the worst-case performance guarantees that we can prove and the actual performance characteristics that we can expect. This theme will recur throughout the book.
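To make the auxiliary edge symbol table discussed earlier in this section concrete, the following is a minimal sketch, not an implementation from this book. It assumes an undirected graph, uses std::unordered_set as a stand-in for the dynamic hash tables of Part 4, and maps v-w and w-v to the same key by ordering the endpoints before computing v*V+w. The class name EdgeIndex and its member names are hypothetical. The sketch records only which edges are present; it does not hold the pointers into the adjacency lists that constant-time removal from the lists themselves would need.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <unordered_set>

// Hypothetical helper (not part of the book's graph ADT): an auxiliary
// symbol table that holds one integer key per edge of an undirected graph.
class EdgeIndex
{
    std::int64_t V;                          // number of vertices
    std::unordered_set<std::int64_t> keys;   // one key per edge present

    // Give v-w and w-v the same key by ordering the endpoints first.
    std::int64_t key(int v, int w) const
    {
        assert(0 <= v && v < V && 0 <= w && w < V);
        if (v > w) { int t = v; v = w; w = t; }
        return v * V + w;                    // 64-bit arithmetic avoids overflow
    }

public:
    explicit EdgeIndex(int vertices) : V(vertices) { }

    // Expected constant time with a hash table.
    bool find(int v, int w) const { return keys.count(key(v, w)) != 0; }

    // Returns false (and changes nothing) if the edge is already present,
    // which is how a client could disallow parallel edges.
    bool insert(int v, int w) { return keys.insert(key(v, w)).second; }

    // Returns false if the edge was not present.
    bool remove(int v, int w) { return keys.erase(key(v, w)) != 0; }

    std::size_t size() const { return keys.size(); }
};
```

A client that wants to disallow parallel edges could call insert on such a table first and add the edge to the adjacency lists only when insert returns true.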

Exercises

• 17.39 Develop an adjacency-matrix representation for dense multigraphs, and provide an ADT implementation for Program 17.1 that uses it.

• 17.40 Why not use a direct representation for graphs (a data structure that models the graph exactly, with vertex objects that contain adjacency lists with references to the vertices)?

• 17.41 Why does Program 17.11 not increment both deg[v] and deg[w] when it discovers that w is adjacent to v?

• 17.42 Add to the graph class that uses adjacency matrices (Program 17.7) a vertex-indexed vector that holds the degree of each vertex. Add a public member function degree that returns the degree of a given vertex.

17.43 Do Exercise 17.42 for the adjacency-lists representation.

• 17.44 Add a row to Table 17.1 for the problem of determining the number of isolated vertices in a graph. Support your answer with function implementations for each of the three representations.

• 17.45 Give a row to add to Table 17.1 for the problem of determining whether a given digraph has a vertex with indegree V and outdegree 0. Support your answer with function implementations for each of the three representations. Note: Your entry for the adjacency-matrix representation should be V.

17.46 Use doubly-linked adjacency lists with cross links as described in the text to implement a constant-time remove edge function remove for the graph ADT implementation that uses adjacency lists (Program 17.9).

17.47 Add a remove vertex function remove to the doubly-linked adjacency-lists graph class described in the previous exercise.

• 17.48 Modify your solution to Exercise 17.16 to use a dynamic hash table, as described in the text, such that insert edge and remove edge take constant amortized time.

17.49 Add to the graph class that uses adjacency lists (Program 17.9) a symbol table to ignore duplicate edges, so that it represents graphs instead of multigraphs. Use dynamic hashing for your symbol-table implementation so that your implementation uses space proportional to E and can insert, find, and remove edges in constant time (on the average, amortized).

17.50 Develop a multigraph class based on a vector-of-symbol-tables representation (with one symbol table for each vertex, which contains its list of adjacent edges). Use dynamic hashing for your symbol-table implementation so that your implementation uses space proportional to E and can insert, find, and remove edges in constant time (on the average, amortized).

17.51 Develop a graph ADT intended for static graphs, based upon a constructor that takes a vector of edges as an argument and uses the basic graph ADT to build a graph. (Such an implementation might be useful for performance comparisons with the implementations described in Exercises 17.52 through 17.55.)

17.52 Develop an implementation for the constructor described in Exercise 17.51 that uses a compact representation based on the following data structures:

    struct node { int cnt; vector <int> edges; };
    struct graph { int V; int E; vector <node> adj; };

A graph is a vertex count, an edge count, and a vector of vertices. A vertex contains an edge count and a vector with one vertex index corresponding to each adjacent edge.

• 17.53 Add to your solution to Exercise 17.52 a function that eliminates self-loops and parallel edges, as in Exercise 17.34.

• 17.54 Develop an implementation for the static-graph ADT described in Exercise 17.51 that uses just two vectors to represent the graph: one vector of E vertices, and another of V indices or pointers into the first vector. Implement io::show for this representation.

• 17.55 Add to your solution to Exercise 17.54 a function that eliminates self-loops and parallel edges, as in Exercise 17.34.

17.56 Develop a graph ADT interface that associates (x, y) coordinates with each vertex, so that you can work with graph drawings. Include functions drawV and drawE to draw a vertex and to draw an edge, respectively.

17.57 Write a client program that uses your interface from Exercise 17.56 to produce drawings of edges that are being added to a small graph.

17.58 Develop an implementation of your interface from Exercise 17.56 that produces a PostScript program with drawings as output (see Section 4.3).

17.59 Find an appropriate graphics interface that allows you to develop an implementation of your interface from Exercise 17.56 that directly draws graphs in a window on your display.

• 17.60 Extend your solution to Exercises 17.56 and 17.59 to include functions to erase vertices and edges and to draw them in different styles, so that you can write client programs that provide dynamic graphical animations of graph algorithms in operation.

17.6 Graph Generators

To develop further appreciation for the diverse nature of graphs as combinatorial structures, we now consider detailed examples of the types of graphs that we use later to test the algorithms that we study. Some of these examples are drawn from applications. Others are drawn from mathematical models that are intended both to have properties that we might find in real graphs and to expand the range of input trials available for testing our algorithms.

To make the examples concrete, we present them as client functions of Program 17.1, so that we can put them to immediate use when we test implementations of the graph algorithms that we consider. In addition, we consider the implementation of io::scan from Program 17.4, which reads a sequence of pairs of arbitrary names from standard input and builds a graph with vertices corresponding to the names and edges corresponding to the pairs.

The implementations that we consider in this section are based upon the interface of Program 17.1, so they function properly, in theory, for any graph representation. In practice, however, some combinations of interface and representation can have unacceptably poor performance, as we shall see.

Program 17.12 Random graph generator (random edges)

This function adds random edges to a graph by generating E random pairs of integers, interpreting the integers as vertex labels and the pairs of vertex labels as edges. It leaves the decision about the treatment of parallel edges and self-loops to the implementation of the insert member function of Graph. This method is generally not suitable for generating huge dense graphs because of the number of parallel edges that it generates.

    static void randE(Graph &G, int E)
      {
        for (int i = 0; i < E; i++)
          {
            // Scale rand() to a vertex index in the range 0 to V-1.
            int v = int(G.V()*rand()/(1.0+RAND_MAX));
            int w = int(G.V()*rand()/(1.0+RAND_MAX));
            G.insert(Edge(v, w));
          }
      }

As usual, we are interested in having “random problem instances,” both to exercise our programs with arbitrary inputs and to get an idea of how the programs might perform in real applications. For graphs, the latter goal is more elusive than for other domains that we have considered, although it is still a worthwhile objective. We shall encounter various different models of randomness, starting with these two.

Random edges This model is simple to implement, as indicated by the generator given in Program 17.12. For a given number of vertices V, we generate random edges by generating pairs of numbers between 0 and V−1. The result is likely to be a random multigraph with self-loops. A given pair could have two identical numbers (hence, self-loops could occur); and any pair could be repeated multiple times (hence, parallel edges could occur). Program 17.12 generates edges until the graph is known to have E edges, leaving to the implementation the decision of whether to eliminate parallel edges. If parallel edges are eliminated, the number of edges generated is substantially higher than the number of edges used (E) for dense graphs (see Exercise 17.62); so this method is normally used for sparse graphs.
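The gap between edges generated and edges used can be observed directly. The following is a minimal sketch under stated assumptions rather than the code of Program 17.12: it works on a plain edge list instead of the graph ADT, uses the facilities of <random> in place of rand(), discards self-loops and parallel edges with a std::set, and keeps drawing pairs until E distinct edges have been collected. The names RandomEdges and randomSimpleEdges are hypothetical.

```cpp
#include <cassert>
#include <random>
#include <set>
#include <utility>
#include <vector>

// Result of the sketch: the edges kept and the number of pairs drawn.
struct RandomEdges
{
    std::vector<std::pair<int,int> > edges;  // distinct edges, no self-loops
    long long generated;                      // pairs drawn, kept or not
};

// Hypothetical variant of the random-edges model: draw random pairs until
// E distinct edges (each stored with smaller endpoint first) are collected.
RandomEdges randomSimpleEdges(int V, long long E, unsigned seed)
{
    assert(V >= 1 && E >= 0 && E <= (long long)V * (V - 1) / 2);
    std::mt19937 gen(seed);
    std::uniform_int_distribution<int> pick(0, V - 1);
    std::set<std::pair<int,int> > seen;
    RandomEdges r;
    r.generated = 0;
    while ((long long)r.edges.size() < E)
    {
        int v = pick(gen), w = pick(gen);
        r.generated++;
        if (v == w) continue;                 // discard self-loops
        if (v > w) std::swap(v, w);           // treat v-w and w-v alike
        if (seen.insert(std::make_pair(v, w)).second)
            r.edges.push_back(std::make_pair(v, w));
    }
    return r;
}
```

Comparing generated with E for a fixed V and increasing E gives an empirical view of why this approach is normally reserved for sparse graphs.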

Figure 17.12 Two random graphs

Both of these random graphs have 50 vertices. The sparse graph at the top has 50 edges, while the dense graph at the bottom has 500 edges. The sparse graph is not connected, with each vertex connected only to a few others; the dense graph is certainly connected, with each vertex connected to 20 others, on the average. These diagrams also indicate the difficulty of developing algorithms that can draw arbitrary graphs (the vertices here are placed in random position).

Program 17.13 Random graph generator (random graph)

Like Program 17.12, this function generates random pairs of integers between 0 and V−1 to add random edges to a graph, but it uses a different probabilistic model where each possible edge occurs independently with some probability p. The value of p is calculated such that the expected number of edges (pV(V−1)/2) is equal to E. The number of edges in any particular graph generated by this code will be close to E but is unlikely to be precisely equal to E. This method is primarily suitable for dense graphs, because its running time is proportional to V^2.

    static void randG(Graph &G, int E)
      { // Choose p so that the expected number of edges is E.
        double p = 2.0*E/G.V()/(G.V()-1);
        for (int i = 0; i < G.V(); i++)
          for (int j = 0; j < i; j++)
            if (rand() < p*RAND_MAX)
              G.insert(Edge(i, j));
      }
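The model used by Program 17.13 can also be sketched outside the graph ADT. The version below is only an illustration under stated assumptions: it returns an edge list rather than inserting into a Graph, and it uses std::bernoulli_distribution from <random> in place of the comparison against rand(). The function name randomGraph is hypothetical.

```cpp
#include <cassert>
#include <random>
#include <utility>
#include <vector>

// Hypothetical standalone version of the random-graph model: consider each
// of the V(V-1)/2 possible edges once and keep it with probability p,
// where p is chosen so that the expected number of edges is E.
std::vector<std::pair<int,int> > randomGraph(int V, double E, unsigned seed)
{
    assert(V >= 2);
    double p = 2.0 * E / V / (V - 1);
    assert(0.0 <= p && p <= 1.0);
    std::mt19937 gen(seed);
    std::bernoulli_distribution keep(p);
    std::vector<std::pair<int,int> > edges;
    for (int i = 0; i < V; i++)
        for (int j = 0; j < i; j++)          // each unordered pair once
            if (keep(gen))
                edges.push_back(std::make_pair(i, j));
    return edges;
}
```

Because every pair is examined exactly once, the sketch never produces a self-loop or a parallel edge, and its running time is proportional to V^2 regardless of E.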

Random graph The classic mathematical model for random graphs is to consider all possible edges and to include each in the graph with a fixed probability p. If we want the expected number of edges in the graph to be E, we can choose p = 2E/(V(V−1)). Program 17.13 is a function that uses this model to generate random graphs. This model precludes duplicate edges, but the number of edges in the graph is only equal to E on the average. This implementation is well-suited for dense graphs, but not for sparse graphs, since it runs in time proportional to V(V−1)/2 to generate just E = pV(V−1)/2 edges. That is, for sparse graphs, the running time of Program 17.13 is quadratic in the size of the graph (see Exercise 17.68).

These models are well-studied and are not difficult to implement, but they do not necessarily generate graphs with properties similar to the ones that we see in practice. In particular, graphs that model maps, circuits, schedules, transactions, networks, and other practical situations are usually not only sparse but also exhibit a locality property: edges are much more likely to connect a given vertex to vertices in a particular set than to vertices that are not in the set. We might consider many different ways of modeling locality, as illustrated in the following examples.

k-neighbor graph The graph depicted at the top in Figure 17.13 is drawn from a simple modification to a random-edges graph generator, where we randomly pick the first vertex v, then randomly pick the second from among those whose

indices are within a fixed constant k of v (wrapping around from V−1 to 0, when the vertices are arranged in a circle as depicted). Such graphs are easy to generate and certainly exhibit locality not found in random graphs.

Euclidean neighbor graph The graph depicted at the bottom in Figure 17.13 is drawn from a generator that generates V points in the plane with random coordinates between 0 and 1, and then generates edges connecting any two points within distance d of one another. If d is small, the graph is sparse; if d is large, the graph is dense (see Exercise 17.74). This graph models the types of graphs that we might expect when we process graphs from maps, circuits, or other applications where vertices are associated with geometric locations. They are easy to visualize, exhibit properties of algorithms in an intuitive manner, and exhibit many of the structural properties that we find in such applications.

One possible defect in this model is that the graphs are not likely to be connected when they are sparse; other difficulties are that the graphs are unlikely to have high-degree vertices and that they do not have any long edges. We can change the models to handle such situations, if desired, or we can consider numerous similar examples to try to model other situations (see, for example, Exercises 17.72 and 17.73).

Or, we can test our algorithms on real graphs. In many applications, there is no shortage of problem instances drawn from actual data that we can use to test our algorithms. For example, huge graphs drawn from actual geographic data are easy to find; two more examples are listed in the next two paragraphs. The advantage of working with real data instead of a random graph model is that we can see solutions to real problems as algorithms evolve. The disadvantage is that we may lose the benefit of being able to predict the performance of our algorithms through mathematical analysis. We return to this topic when we are ready to compare several algorithms for the same task, at the end of Chapter 18.

Transaction graph Figure 17.14 illustrates a tiny piece of a graph that we might find in a telephone company’s computers. It has a

Figure 17.13 Random neighbor graphs

These figures illustrate two models of sparse graphs. The neighbor graph at the top has 33 vertices and 99 edges, with each edge restricted to connect vertices whose indices differ by less than 10 (modulo V). The Euclidean neighbor graph at the bottom models the types of graphs that we might find in applications where vertices are tied to geometric locations. Vertices are random points in the plane; edges connect any pair of vertices within a specified distance d of each other. This graph is sparse (177 vertices and 1001 edges); by adjusting d, we can generate graphs of any desired density.
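Both neighbor models can be sketched in a few lines. The code below is a hedged illustration, not one of this book's programs: it produces plain edge lists rather than using the graph ADT, it excludes an offset of 0 in the k-neighbor model so that no self-loops arise (the text does not say whether they are allowed), and it finds Euclidean neighbors by checking all pairs of points, which takes time proportional to V^2. The names EdgeList, Point, kNeighborEdges, randomPoints, and euclideanNeighborEdges are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <random>
#include <utility>
#include <vector>

typedef std::vector<std::pair<int,int> > EdgeList;
struct Point { double x, y; };

// k-neighbor model: pick v at random, then pick w among the vertices whose
// indices are within k of v, wrapping around modulo V.
EdgeList kNeighborEdges(int V, int E, int k, unsigned seed)
{
    assert(V >= 2 && k >= 1 && k < V);
    std::mt19937 gen(seed);
    std::uniform_int_distribution<int> pickVertex(0, V - 1);
    std::uniform_int_distribution<int> pickOffset(1, k);
    std::bernoulli_distribution backward(0.5);
    EdgeList edges;
    for (int i = 0; i < E; i++)
    {
        int v = pickVertex(gen);
        int off = pickOffset(gen);
        if (backward(gen)) off = -off;
        int w = ((v + off) % V + V) % V;     // wrap around the circle
        edges.push_back(std::make_pair(v, w));
    }
    return edges;
}

// V points with coordinates chosen uniformly between 0 and 1.
std::vector<Point> randomPoints(int V, unsigned seed)
{
    std::mt19937 gen(seed);
    std::uniform_real_distribution<double> coord(0.0, 1.0);
    std::vector<Point> pts(V);
    for (int i = 0; i < V; i++) { pts[i].x = coord(gen); pts[i].y = coord(gen); }
    return pts;
}

// Euclidean neighbor model: connect every pair of points within distance d.
EdgeList euclideanNeighborEdges(const std::vector<Point> &pts, double d)
{
    EdgeList edges;
    for (std::size_t i = 0; i < pts.size(); i++)
        for (std::size_t j = 0; j < i; j++)
        {
            double dx = pts[i].x - pts[j].x, dy = pts[i].y - pts[j].y;
            if (dx*dx + dy*dy <= d*d)
                edges.push_back(std::make_pair((int) i, (int) j));
        }
    return edges;
}
```

For example, kNeighborEdges(33, 99, 9, seed) uses the vertex count, edge count, and index restriction given in the caption for the top graph, though it will not reproduce that particular drawing. For large V, the all-pairs check would normally be replaced by a grid-based method.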

Figure 17.14 Transaction graph

A sequence of pairs of numbers like this one might represent a list of telephone calls in a local exchange, or financial transfers between accounts, or any similar situation involving transactions between entities with unique identifiers. The graphs are hardly random: some phones are far more heavily used than others and some accounts are far more active than others.

Program 17.14 Building a graph from pairs of symbols

This implementation of the scan function from Program 17.4 uses a symbol table to build a graph by reading pairs of symbols from standard input. The symbol-table ADT function index associates an integer with each symbol: on unsuccessful search in a table of size N it adds the symbol to the table with associated integer N+1; on successful search, it simply returns the integer previously associated with the symbol. Any of the symbol-table methods in Part 4 can be adapted for this use; for example, see Program 17.15.

    #include "ST.cc"
    template <class Graph>
    void IO<Graph>::scan(Graph &G)
      { string v, w;
        ST st;  // maps each vertex name to an integer index
        while (cin >> v >> w)
          G.insert(Edge(st.index(v), st.index(w)));
      }

vertex defined for each phone number, and an edge for each pair i and j with the property that i made a telephone call to j within some fixed period. This set of edges represents a huge multigraph. It is certainly sparse, since each person places calls to only a tiny fraction of the available telephones. It is representative of many other applications. For example, a financial institution’s credit card and merchant account records might have similar information.

Function call graph We can associate a graph with any computer program with functions as vertices and an edge connecting X and Y whenever function X calls function Y. We can instrument the program to create such a graph (or have a compiler do it). Two completely different graphs are of interest: the static version, where we create edges at compile time corresponding to the function calls that appear in the program text of each function; and a dynamic version, where we create edges at run time when the calls actually happen. We use static function call graphs to study program structure and dynamic ones to study program behavior. These graphs are typically huge and sparse.

In applications such as these, we face massive amounts of data, so we might prefer to study the performance of algorithms on real

Program 17.15 Symbol indexing for vertex names

This implementation of symbol-table indexing for string keys (which is described in the commentary for Program 17.14) accomplishes the task by adding an index field to each node in an existence-table TST (see Program 15.8). The index associated with each key is kept in the index field in the node corresponding to its end-of-string character.

We use the characters in the search key to move down the TST, as usual. When we reach the end of the key, we set its index if necessary and also set the private data member val, which is returned to the caller after all recursive calls to the function have returned.
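The indexing operation itself does not depend on the TST. As a hedged sketch of the same interface, the class below swaps in std::unordered_map for the TST; that substitution, the class name SymbolIndex, and the reverse-lookup function name are assumptions of the sketch, not part of Program 17.15. On an unsuccessful search the sketch assigns the current table size as the new index, so that V distinct names receive the integers 0 through V−1; that numbering is also an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical stand-in for the symbol table: maps each distinct name to
// an integer in order of first appearance, starting at 0.
class SymbolIndex
{
    std::unordered_map<std::string, int> table;  // name -> index
    std::vector<std::string> names;              // index -> name

public:
    // Successful search: return the stored index.
    // Unsuccessful search: add the name with the next unused index.
    int index(const std::string &key)
    {
        std::unordered_map<std::string, int>::iterator it = table.find(key);
        if (it != table.end()) return it->second;
        int i = (int) names.size();
        table[key] = i;
        names.push_back(key);
        return i;
    }

    // Reverse lookup, useful when printing results with the original names.
    const std::string &name(int i) const
    {
        assert(0 <= i && (std::size_t) i < names.size());
        return names[i];
    }

    int size() const { return (int) names.size(); }
};
```

A scan function could call index on each of the two names in a pair and insert the resulting edge, with size giving the number of vertices once the input is exhausted.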

Figure 17.15 Degrees-of-separation graph

The graph at the bottom is defined by the groups at the top, with one vertex for each person and an edge connecting a pair of people whenever they are in the same group. Shortest path lengths in the graph correspond to degrees of separation. For example, Frank is three degrees of separation from Alice and Bob.
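Degrees of separation in a graph defined by groups can be computed without creating an edge for every pair of people in a group. The following is a minimal sketch under stated assumptions: people are already numbered 0 through V−1, each group is given as a list of those numbers, and the function name degreesOfSeparation is hypothetical. It is a breadth-first search that expands each group only once, so its running time is proportional to V plus the sum of the group sizes.

```cpp
#include <cassert>
#include <cstddef>
#include <queue>
#include <vector>

// Hypothetical sketch: breadth-first search over the implicit separation
// graph. Returns, for each person, the number of degrees of separation
// from source, or -1 if there is no connecting path.
std::vector<int> degreesOfSeparation(
    int V, const std::vector<std::vector<int> > &groups, int source)
{
    assert(0 <= source && source < V);
    // memberOf[p] lists the groups that contain person p.
    std::vector<std::vector<int> > memberOf(V);
    for (std::size_t g = 0; g < groups.size(); g++)
        for (std::size_t i = 0; i < groups[g].size(); i++)
        {
            int person = groups[g][i];
            assert(0 <= person && person < V);
            memberOf[person].push_back((int) g);
        }

    std::vector<int> dist(V, -1);
    std::vector<bool> expanded(groups.size(), false);
    std::queue<int> q;
    dist[source] = 0;
    q.push(source);
    while (!q.empty())
    {
        int v = q.front(); q.pop();
        for (std::size_t i = 0; i < memberOf[v].size(); i++)
        {
            int g = memberOf[v][i];
            if (expanded[g]) continue;       // each group is scanned once
            expanded[g] = true;
            for (std::size_t j = 0; j < groups[g].size(); j++)
            {
                int w = groups[g][j];
                if (dist[w] == -1) { dist[w] = dist[v] + 1; q.push(w); }
            }
        }
    }
    return dist;
}
```

Scanning a group only from the first of its members to leave the queue is safe because that member has the smallest distance of any member, so every unreached member is exactly one degree farther away.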

sample data rather than on random models. We might choose to try to avoid degenerate situations by randomly ordering the edges or by introducing randomness in the decision making in our algorithms, but that is a different matter from generating a random graph. Indeed, in many applications, learning the properties of the graph structure is a goal in itself.

In several of these examples, vertices are natural named objects, and edges appear as pairs of named objects. For example, a transaction graph might be built from a sequence of pairs of telephone numbers, and a Euclidean graph might be built from a sequence of pairs of cities or towns. Program 17.14 is an implementation of the scan function in Program 17.4, which we can use to build a graph in this common situation. For the client’s convenience, it takes the set of edges as defining the graph and deduces the set of vertex names from their use in edges. Specifically, the program reads a sequence of pairs of symbols from standard input, uses a symbol table to associate the vertex numbers 0 to V−1 to the symbols (where V is the number of different symbols in the input), and builds a graph by inserting the edges, as in Programs 17.12 and 17.13. We could adapt any symbol-table implementation to support the needs of Program 17.14; Program 17.15 is an example that uses ternary search trees (TSTs) (see Chapter 15). These programs make it easy for us to test our algorithms on real graphs that may not be characterized accurately by any probabilistic model.

Program 17.14 is also significant because it validates the assumption we have made in all of our algorithms that the vertex names are integers between 0 and V−1. If we have a graph that has some other set of vertex names, then the first step in representing the graph is to use Program 17.15 to map the vertex names to integers between 0 and V−1.

Some graphs are based on implicit connections among items. We do not focus on such graphs, but we note their existence in the next few examples and devote a few exercises to them.
When faced with processing such a graph, we can certainly write a program to construct explicit graphs by enumerating all the edges; but there also may be solutions to specific problems that do not require that we enumerate all the edges and therefore can run in sublinear time.

Degrees-of-separation graph Consider a collection of subsets drawn from V items. We define a graph with one vertex corresponding to each element in the union of the subsets and edges between two vertices if both vertices appear in some subset (see Figure 17.15). If desired, the graph might be a multigraph, with

edge labels naming the appropriate subsets. All items incident on a given item v are said to be 1 degree of separation from v. Otherwise, all items incident on any item that is i degrees of separation from v (that are not already known to be i or fewer degrees of separation from v) are (i+1) degrees of separation from v. This construction has amused people ranging from mathematicians (Erdős number) to movie buffs (separation from Kevin Bacon).

Interval graph Consider a collection of V intervals on the real line (pairs of real numbers). We define a graph with one vertex corresponding to each interval, with edges between vertices if the corresponding intervals intersect (have any points in common).

de Bruijn graph Suppose that V is a power of 2. We define a digraph with one vertex corresponding to each nonnegative integer less than V, with edges from each vertex i to 2i mod V and (2i+1) mod V. These graphs are useful in the study of the sequence of values that can occur in a fixed-length shift register for a sequence of operations where we repeatedly shift all the bits one position to the left, throw away the leftmost bit, and fill the rightmost bit with 0 or 1. Figure 17.16 depicts the de Bruijn graphs with 8, 16, 32, and 64 vertices.

The various types of graphs that we have considered in this section have a wide variety of different characteristics. However, they all look the same to our programs: They are simply collections of edges. As we saw in Chapter 1, learning even the simplest facts about them can be a computational challenge. In this book, we consider numerous ingenious algorithms that have been developed for solving practical problems related to many types of graphs.

Based just on the few examples presented in this section, we can see that graphs are complex combinatorial objects, far more complex than those underlying other algorithms that we studied in Parts 1 through 4. In many instances, the graphs that we need to consider in applications are difficult or impossible to characterize. Algorithms that perform well on random graphs are often of limited applicability because it is often difficult to be persuaded that random graphs have

Figure 17.16 de Bruijn graphs

A de Bruijn digraph of order n has 2^n vertices with edges from i to 2i mod 2^n and (2i+1) mod 2^n, for all i. Pictured here are the underlying undirected de Bruijn graphs of order 6, 5, 4, and 3 (top to bottom).
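The edge rule for de Bruijn digraphs is short enough to state as code. This is a minimal sketch, assuming that both successors of vertex i are taken modulo V, which is what the shift-register description implies: shifting left and dropping the leftmost bit gives 2i mod V, and the rightmost bit is then filled with 0 or 1. The function name deBruijnEdges is hypothetical, and the result is a plain list of directed edges rather than a graph ADT object.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Hypothetical sketch: the 2V directed edges of the de Bruijn digraph on
// V vertices, where V is a power of 2.
std::vector<std::pair<int,int> > deBruijnEdges(int V)
{
    assert(V >= 2 && (V & (V - 1)) == 0);    // V must be a power of 2
    std::vector<std::pair<int,int> > edges;
    for (int i = 0; i < V; i++)
    {
        edges.push_back(std::make_pair(i, (2 * i) % V));      // fill with 0
        edges.push_back(std::make_pair(i, (2 * i + 1) % V));  // fill with 1
    }
    return edges;
}
```

For V = 8, vertex 5 (binary 101) has edges to 2 (010) and 3 (011). Vertices 0 and V−1 each have a self-loop, corresponding to a register that stays all 0s or all 1s.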

structural characteristics the same as those of the graphs that arise in applications. The usual approach to overcome this objection is to design algorithms that perform well in the worst case. While this approach is successful in some instances, it falls short (by being too conservative) in others.

While we are often not justified in assuming that performance studies on graphs generated from one of the random graph models that we have discussed will give information sufficiently accurate to allow us to predict performance on real graphs, the graph generators that we have considered in this section are useful in helping us to test implementations and to understand our algorithms’ performance. Before we even attempt to predict performance for applications, we must at least verify any assumptions that we might have made about the relationship between the application’s data and whatever models or sample data we may have used. While such verification is wise when we are working in any applications domain, it is particularly important when we are processing graphs, because of the broad variety of types of graphs that we encounter.

Exercises

• 17.61 When we use Program 17.12 to generate random graphs of density αV, what fraction of edges produced are self-loops?

• 17.62 Calculate the expected number of parallel edges produced when we use Program 17.12 to generate random graphs with V vertices of density α. Use the result of your calculation to draw plots showing the fraction of parallel edges produced as a function of α, for V = 10, 100, and 1000.

17.63 Use an STL map to develop an alternate implementation of the ST class of Program 17.15.

• 17.64 Find a large undirected graph somewhere online, perhaps based on network-connectivity information, or a separation graph defined by coauthors in a set of bibliographic lists or by actors in movies.

• 17.65 Write a program that generates sparse random graphs for a well-chosen set of values of V and E, and prints the amount of space that it used for the graph representation and the amount of time that it took to build it. Test your program with a sparse-graph class (Program 17.9) and with the random-graph generator (Program 17.12), so that you can do meaningful empirical tests on graphs drawn from this model.

• 17.66 Write a program that generates dense random graphs for a well-chosen set of values of V and E, and prints the amount of space that it used for the graph representation and the amount of time that it took to build it. Test your program with a dense-graph class (Program 17.7) and with the random-graph generator (Program 17.13), so that you can do meaningful empirical tests on graphs drawn from this model.

• 17.67 Give the standard deviation of the number of edges produced by Program 17.13.

• 17.68 Write a program that produces each possible graph with precisely the same probability as does Program 17.13, but uses time and space proportional to only V+E, not V^2. Test your program as described in Exercise 17.65.

• 17.69 Write a program that produces each possible graph with precisely the same probability as does Program 17.12, but uses time proportional to E, even when the density is close to 1. Test your program as described in Exercise 17.66.

• 17.70 Write a program that produces, with equal likelihood, each of the possible graphs with V vertices and E edges (see Exercise 17.9). Test your program as described in Exercise 17.65 (for low densities) and as described in Exercise 17.66 (for high densities).

• 17.71 Write a program that generates random graphs by connecting vertices arranged in a √V-by-√V grid to their neighbors (see Figure 1.2), with k extra edges connecting each vertex to a randomly chosen destination vertex (each destination vertex equally likely). Determine how to set k such that the expected number of edges is E. Test your program as described in Exercise 17.65.

17.72 Write a program that generates random digraphs by randomly connecting vertices arranged in a √V-by-√V grid to their neighbors, with each of the possible edges occurring with probability p (see Figure 1.2). Determine how to set p such that the expected number of edges is E. Test your program as described in Exercise 17.65.

• 17.73 Augment your program from Exercise 17.72 to add R extra random edges, computed as in Program 17.12. For large R, shrink the grid so that the total number of edges remains about V.

17.74 Write a program that generates V random points in the plane, then builds a graph consisting of edges connecting all pairs of points within a given distance d of one another (see Figure 17.13 and Program 3.20). Determine how to set d such that the expected number of edges is E. Test your program as described in Exercise 17.65 (for low densities) and as described in Exercise 17.66 (for high densities).

• 17.75 Write a program that generates V random intervals in the unit interval, all of length d, then builds the corresponding interval graph. Determine how to set d such that the expected number of edges is E. Test your program as described in Exercise 17.65 (for low densities) and as described in Exercise 17.66 (for high densities). Hint: Use a BST.

• 17.76 Write a program that chooses V vertices and E edges at random from the real graph that you found for Exercise 17.64. Test your program as described in Exercise 17.65 (for low densities) and as described in Exercise 17.66 (for high densities).

• 17.77 One way to define a transportation system is with a set of sequences of vertices, each sequence defining a path connecting the vertices. For example, the sequence 0-9-3-2 defines the edges 0-9, 9-3, and 3-2. Write a program that builds a graph from an input file consisting of one sequence per line, using symbolic names. Develop input suitable to allow you to use your program to build a graph corresponding to the Paris metro system.

17.78 Extend your solution to Exercise 17.77 to include vertex coordinates, along the lines of Exercise 17.60, so that you can work with graphical representations.

• 17.79 Apply the transformations described in Exercises 17.34 through 17.37 to various graphs (see Exercises 17.63–76), and tabulate the number of vertices and edges removed by each transformation.

• 17.80 Implement a constructor for Program 17.1 that allows clients to build separation graphs without having to call a function for each implied edge. That is, the number of function calls required for a client to build a graph should be proportional to the sum of the sizes of the groups. Develop an efficient implementation of this modified ADT (based on data structures involving groups, not implied edges).

17.81 Give a tight upper bound on the number of edges in any separation graph with N different groups of k people.

• 17.82 Draw graphs in the style of Figure 17.16 that, for V = 8, 16, and 32, have V vertices numbered from 0 to V−1 and an edge connecting each vertex i with ⌊i/2⌋.

17.83 Modify the ADT interface in Program 17.1 to allow clients to use symbolic vertex names and edges to be pairs of instances of a generic Vertex type. Hide the vertex-index representation and the symbol-table ADT usage completely from clients.

17.84 Add a function to the ADT interface from Exercise 17.83 that supports a join operation for graphs, and provide implementations for the adjacency-matrix and adjacency-lists representations. Note: Any vertex or edge in either graph should be in the join, but vertices that are in both graphs appear only once in the join, and you should remove parallel edges.

17.7 Simple, Euler, and Hamilton Paths

Our first nontrivial graph-processing algorithms solve fundamental problems concerning paths in graphs. They introduce the general recursive paradigm that we use throughout the book, and they illustrate that apparently similar graph-processing problems can range widely in difficulty.

These problems take us from local properties such as the existence of edges or the degrees of vertices to global properties that tell us about a graph’s structure. The most basic such property is whether two vertices are connected. If they are, we are interested in finding a simple path that connects them.

Program 17.16 Simple path search

This class uses a recursive depth-first search function searchR to find a simple path connecting two given vertices in a graph and provides a member function exists to allow clients to check path existence. Given two vertices v and w, searchR checks each edge v-t adjacent to v to see whether it could be the first edge on a path to w. The vertex-indexed vector visited keeps the function from revisiting any vertex, so only simple paths are traversed.

template <class Graph> class sPATH
{ const Graph &G;
  vector <bool> visited;
  bool found;
  bool searchR(int v, int w)
    {
      if (v == w) return true;
      visited[v] = true;           // never come back to v
      typename Graph::adjIterator A(G, v);
      for (int t = A.beg(); !A.end(); t = A.nxt())
        if (!visited[t])           // try v-t as the next edge
          if (searchR(t, w)) return true;
      return false;
    }
public:
  sPATH(const Graph &G, int v, int w) :
    G(G), visited(G.V(), false)
    { found = searchR(v, w); }
  bool exists() const
    { return found; }
};

Figure 17.17 Trace for simple path search

This trace shows the operation of the recursive function in Program 17.16 for the call searchR(2, 6) to find a simple path from 2 to 6 in the graph shown at the top. There is a line in the trace for each edge considered, indented one level for each recursive call. To check 2-0, we call searchR(0, 6). This call causes us to check 0-1, 0-2, and 0-5. To check 0-1, we call searchR(1, 6), which causes us to check 1-0 and 1-2, which do not lead to recursive calls because 0 and 2 are marked. For this example, the function discovers the path 2-0-5-4-6.
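The search that this caption walks through can be sketched as a small self-contained program. This is only an illustration, not the book's code: the SimpleGraph struct, the simplePath and sampleGraph helpers, and the extra depth and path arguments are all hypothetical names introduced here, and the edge set is an assumption chosen so that the search behaves as the caption describes.

```cpp
#include <cassert>
#include <iostream>
#include <string>
#include <vector>

// Hypothetical stand-in for the book's Graph ADT: plain adjacency lists.
struct SimpleGraph {
    std::vector<std::vector<int>> adj;
    explicit SimpleGraph(int V) : adj(V) {}
    void insert(int v, int w) { adj[v].push_back(w); adj[w].push_back(v); }
    int V() const { return static_cast<int>(adj.size()); }
};

// Depth-first search for a simple path from v to w. Every edge considered is
// printed, indented by the recursion depth. On success, the path's vertices
// are appended to `path` in reverse order (w first) as the recursion unwinds.
bool searchR(const SimpleGraph& G, std::vector<bool>& visited,
             int v, int w, int depth, std::vector<int>& path) {
    if (v == w) { path.push_back(v); return true; }
    visited[v] = true;
    for (int t : G.adj[v]) {
        std::cout << std::string(2 * depth, ' ') << v << '-' << t << '\n';
        if (!visited[t] && searchR(G, visited, t, w, depth + 1, path)) {
            path.push_back(v);
            return true;
        }
    }
    return false;
}

// Returns the vertices on a simple path from v to w (empty if none exists).
std::vector<int> simplePath(const SimpleGraph& G, int v, int w) {
    assert(v >= 0 && v < G.V() && w >= 0 && w < G.V());
    std::vector<bool> visited(G.V(), false);
    std::vector<int> path;
    searchR(G, visited, v, w, 0, path);
    return std::vector<int>(path.rbegin(), path.rend());
}

// Assumed sample graph with 7 vertices (an illustration, not taken from the figure).
SimpleGraph sampleGraph() {
    SimpleGraph G(7);
    const int edges[][2] = {{0,1},{0,2},{0,5},{0,6},{1,2},
                            {2,3},{2,4},{3,4},{4,5},{4,6}};
    for (const auto& e : edges) G.insert(e[0], e[1]);
    return G;
}
```

Calling simplePath(sampleGraph(), 2, 6) prints one indented line per edge considered and returns the vertices of the path found. Collecting vertices while the recursion unwinds is the same idea as printing t-v just after a successful recursive call; reversing the collected vector at the end plays the role of swapping v and w.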

Simple path Given two vertices, is there a simple path in the graph that connects them? In some applications, we might be satisfied to know merely whether or not such a path exists, but we are concerned here with the problem of finding a specific path.

Program 17.16 is a direct solution that finds a path. It is based on depth-first search, a fundamental graph-processing paradigm that we considered briefly in Chapters 3 and 5 and shall study in detail in Chapter 18. The algorithm is based on a recursive private function member that determines whether there is a simple path from v to w by checking, for each edge v-t incident on v, whether there is a simple path from t to w that does not go through v. It uses a vertex-indexed vector to mark v so that no path through v will be checked in any recursive call.

The code in Program 17.16 simply tests for the existence of a path. How can we augment it to print the path’s edges? Thinking recursively suggests an easy solution:

• Add a statement to print t-v just after the recursive call in searchR finds a path from t to w.

• Switch w and v in the call on searchR in the constructor.

Alone, the first change would cause the path from v to w to be printed in reverse order: If the call to searchR(t, w) finds a path from t to w (and prints that path’s edges in reverse order), then printing t-v completes the job for the path from v to w. The second change reverses the order: To print the edges on the path from v to w, we print the edges on the path from w to v in reverse order. (This trick only works for undirected graphs.) We could use this same strategy to implement an ADT function that calls a client-supplied function for each of the path’s edges (see Exercise 17.88).

Figure 17.17 gives an example of the dynamics of the recursion. As with any recursive program (indeed, any program with function calls at all), such a trace is easy to produce: To modify Program 17.16 to produce one, we can add a variable depth that is incremented on entry and decremented on exit to keep track of the depth of the recursion, then add code at the beginning of the recursive function to print out depth spaces followed by the appropriate information (see Exercises 17.86 and 17.87).

Property 17.2 We can find a path connecting two given vertices in a graph in linear time.

The recursive depth-first search function in Program 17.16 immediately implies a proof by induction that the ADT function determines whether or not a path exists. Such a proof is easily extended to establish that, in the worst case, Program 17.16 checks all the entries in the adjacency matrix exactly once. Similarly, we can show that the analogous program for adjacency lists checks all of the graph edges exactly twice (once in each direction), in the worst case. •

We use the phrase linear in the context of graph algorithms to mean a quantity whose value is within a constant factor of V + E, the size of the graph. As discussed at the end of Section 17.5, such a value is also normally within a constant factor of the size of the graph representation. Property 17.2 is worded so as to allow for the use of the adjacency-lists representation for sparse graphs and the adjacency-matrix representation for dense graphs, our general practice. It is not appropriate to use the term “linear” to describe an algorithm that uses an adjacency matrix and runs in time proportional to V^2 (even though it is linear in the size of the graph representation) unless the graph is dense. Indeed, if we use the adjacency-matrix representation for a sparse graph, we cannot have a linear-time algorithm for any graph-processing problem that could require examination of all the edges.

We study depth-first search in detail in a more general setting in the next chapter, and we consider several other connectivity algorithms there. For example, a slightly more general version of Program 17.16 gives us a way to pass through all the edges in the graph, building a vertex-indexed vector that allows a client to test in constant time whether there exists a path connecting any two vertices.

Property 17.2 can substantially overestimate the actual running time of Program 17.16, because it might find a path after examining only a few edges. For the moment, our interest is in knowing a method that is guaranteed to find in linear time a path connecting any two vertices in any graph. By contrast, other problems that appear similar are much more difficult to solve. For example, consider the following problem, where we seek paths connecting pairs of vertices, but add the restriction that they visit all the other vertices in the graph, as well.

Hamilton path Given two vertices, is there a simple path connecting them that visits every vertex in the graph exactly once? If the path is from a vertex back to itself, this problem is known as the Hamilton tour problem. Is there a cycle that visits every vertex in the graph exactly once?

Figure 17.18 Hamilton tour

The graph at the top has the Hamilton tour 0-6-4-2-1-3-5-0, which visits each vertex exactly once and returns to the start vertex, but the graph at the bottom has no such tour.

Figure 17.19 Hamilton-tour-search trace

This trace shows the edges checked by Program 17.17 when discovering that the graph shown at the top has no Hamilton tour. For brevity, edges to marked vertices are omitted.

Program 17.17 Hamilton path

This recursive function differs from the one in Program 17.16 in just two respects: First, it takes the length of the path sought as its third argument and returns successfully only if it finds a path of exactly that length; second, it resets the visited marker before returning unsuccessfully.

If we replace the recursive function in Program 17.16 by this one, and add a third argument G.V()-1 to the searchR call in the constructor, then the class looks for a Hamilton path. But do not expect the search to end except in tiny graphs (see text).

bool searchR(int v, int w, int d)
  {
    if (v == w) return (d == 0);
    visited[v] = true;
    typename Graph::adjIterator A(G, v);
    for (int t = A.beg(); !A.end(); t = A.nxt())
      if (!visited[t])
        if (searchR(t, w, d-1)) return true;
    visited[v] = false;            // undo the mark before failing
    return false;
  }

At first blush, this problem seems to admit a simple solution: We can use the simple modification to the recursive part of the path-finding class that is shown in Program 17.17. But this program is not likely to be useful for many graphs, because its worst-case running time is exponential in the number of vertices in the graph.

Property 17.3 A recursive search for a Hamilton tour could take exponential time.

Proof: Consider a graph where vertex V−1 is isolated and the edges, with the other V−1 vertices, constitute a complete graph. Program 17.17 will never find a Hamilton path, but it is easy to see by induction that it will examine all of the (V−1)! paths in the complete graph, all of which involve V−1 recursive calls. The total number of recursive calls is therefore about V!, or about (V/e)^V, which is higher than any constant to the Vth power. •

Program 17.16 for finding simple paths and Program 17.17 for finding Hamilton paths are extremely similar. If no path exists, both programs terminate when all the elements of the visited vector are set to true. Why are the running times so dramatically different? Program 17.16 is guaranteed to finish quickly because it sets at least one element of the visited vector to true each time searchR is called. Program 17.17, on the other hand, can set visited elements back to false, so we cannot guarantee that it will finish quickly.

When searching for simple paths, in Program 17.16, we know that, if there is a path from v to w, we will find it by taking one of the edges v-t from v, and the same is true for Hamilton paths. But there this similarity ends. If we cannot find a simple path from t to w, then we can conclude that there is no simple path from v to w that goes through t; but the analogous situation for Hamilton paths does not hold. It could be the case that there is no Hamilton path to w that starts with v-t, but there is one that starts with v-x-t for some other vertex x. We have to make a recursive call from t corresponding to every path that leads to it from v. In short, we may have to check every path in the graph.

It is worthwhile to reflect on just how slow a factorial-time algorithm is. If we could process a graph with 15 vertices in 1 second, it would take 1 day to process a graph with 19 vertices, over 1 year for 21 vertices, and over 6 centuries for 23 vertices. Faster computers do not help much, either. A computer that is 200,000 times faster than our original one would still take more than a day to solve that 23-vertex problem. The cost to process graphs with 100 or 1000 vertices is too high to contemplate, let alone graphs of the size that we might encounter in practice. It would take millions of pages in this book just to write down the number of centuries required to process a graph that contained millions of vertices.

In Chapter 5, we examined a number of simple recursive programs that are similar in character to Program 17.17 but that could be drastically improved with top-down dynamic programming. This recursive program is entirely different in character: The number of intermediate results that would have to be saved is exponential. Despite many people doing an extensive amount of work on the problem, no one has been able to find any algorithm that can promise reasonable performance for large (or even medium-sized) graphs.
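To see the growth described above on a concrete input, the following sketch counts recursive calls made by a search shaped like Program 17.17 on the graph family from the proof of Property 17.3 (a complete graph on V−1 vertices plus one isolated vertex). The MatrixGraph and HamiltonSearch types, the calls counter, and the cliquePlusIsolated helper are hypothetical names introduced for this illustration; they are not part of the book's ADT.

```cpp
#include <cassert>
#include <vector>

// Hypothetical adjacency-matrix graph, standing in for the book's Graph ADT.
struct MatrixGraph {
    int n;
    std::vector<std::vector<bool>> adj;
    explicit MatrixGraph(int V) : n(V), adj(V, std::vector<bool>(V, false)) {}
    void insert(int v, int w) { adj[v][w] = adj[w][v] = true; }
    int V() const { return n; }
};

// Exhaustive search for a Hamilton path from v to w that also counts
// how many times the recursive function is invoked.
struct HamiltonSearch {
    const MatrixGraph& G;      // the graph must outlive this object
    std::vector<bool> visited;
    long long calls = 0;
    bool found = false;

    HamiltonSearch(const MatrixGraph& G, int v, int w)
        : G(G), visited(G.V(), false) {
        assert(v >= 0 && v < G.V() && w >= 0 && w < G.V());
        found = searchR(v, w, G.V() - 1);   // a Hamilton path has V-1 edges
    }

    // d is the number of edges that the rest of the path must still use.
    bool searchR(int v, int w, int d) {
        ++calls;
        if (v == w) return d == 0;
        visited[v] = true;
        for (int t = 0; t < G.V(); ++t)
            if (G.adj[v][t] && !visited[t])
                if (searchR(t, w, d - 1)) return true;
        visited[v] = false;                 // undo the mark before failing
        return false;
    }
};

// Complete graph on vertices 0..V-2, with vertex V-1 left isolated.
MatrixGraph cliquePlusIsolated(int V) {
    MatrixGraph G(V);
    for (int v = 0; v < V - 1; ++v)
        for (int w = v + 1; w < V - 1; ++w)
            G.insert(v, w);
    return G;
}
```

Building cliquePlusIsolated(V) for V = 6, 7, 8, and so on, and reading the calls field after each unsuccessful search, shows the count growing by a factor that itself increases with V, which is the signature of factorial growth. Because HamiltonSearch keeps a reference, the graph should be stored in a named variable rather than passed as a temporary.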

Figure 17.20 Euler tour and path examples

The graph at the top has the Euler tour 0-1-2-0-6-4-3-2-4-5-0, which uses all the edges exactly once. The graph at the bottom has no such tour, but it has the Euler path 1-2-0-1-3-4-2-3-5-4-6-0-5.

Now, suppose that we change the restriction from having to visit all the vertices to having to visit all the edges. Is this problem easy, like finding a simple path, or hopelessly difficult, like finding a Hamilton path?

Euler path Is there a path connecting two given vertices that uses each edge in the graph exactly once? The path need not be simple; vertices may be visited multiple times. If the path is from a vertex back to itself, we have the Euler tour problem. Is there a cyclic path that uses each edge in the graph exactly once? We prove in the corollary to Property 17.4 that the path problem is equivalent to the tour problem in a graph formed by adding an edge connecting the two vertices. Figure 17.20 gives two small examples.

This classical problem was first studied by L. Euler in 1736. Indeed, some people trace the origin of the study of graphs and graph theory to Euler’s work on this problem, starting with a special case known as the bridges of Königsberg problem (see Figure 17.21). The Prussian town of Königsberg had seven bridges connecting riverbanks and islands, and people in the town found that they could not seem to cross all the bridges without crossing one of them twice. Their problem amounts to the Euler tour problem.

These problems are familiar to puzzle enthusiasts. They are commonly seen in the form of puzzles where you are to draw a given figure without lifting your pencil from the paper, perhaps under the constraint that you must start and end at particular points. It is natural for us to consider Euler paths when developing graph-processing algorithms, because an Euler path is an efficient representation of the graph (putting the edges in a particular order) that we might consider as the basis for developing efficient algorithms.

Euler showed that it is easy to determine whether or not a path exists, because all that we need to do is to check the degree of each of the vertices. The property is easy to state and apply, but the proof is a tricky exercise in graph theory.

Property 17.4 A graph has an Euler tour if and only if it is connected and all its vertices are of even degree.

Proof: To simplify the proof, we allow self-loops and parallel edges, though it is not difficult to modify the proof to show that this property also holds for simple graphs (see Exercise 17.94).

If a graph has an Euler tour, then it must be connected because the tour defines a path connecting each pair of vertices. Also, any given vertex v must be of even degree because when we traverse the tour (starting anywhere else), we enter v on one edge and leave on a different edge (neither of which appears again on the tour); so the number of edges incident upon v must be twice the number of times we visit v when traversing the tour, an even number.

To prove the converse, we use induction on the number of edges. The claim is certainly true for graphs with no edges. Consider any connected graph that has at least one edge, with all vertices of even degree. Suppose that, starting at any vertex v, we follow and remove any edge, and we continue doing so until arriving at a vertex that has no more edges. This process certainly must terminate, since we delete an edge at every step, but what are the possible outcomes? Figure 17.22 illustrates examples. Immediately, we see that we must end up back at v, because we end at a vertex other than v if and only if that vertex had an odd degree when we started.

One possibility is that we trace the full tour; if so, we are done. Otherwise, all the vertices in the graph that remains have even degree, but it may not be connected. Still, each connected component has an Euler tour by the inductive hypothesis. Moreover, the cyclic path just removed connects those tours together into an Euler tour for the original graph: traverse the cyclic path, taking detours to do the Euler tours for the connected components. Each detour is a proper Euler tour that takes us back to the vertex on the cyclic path where it started. Note that a detour may touch the cyclic path multiple times (see Exercise 17.98). In such a case, we take the detour only once (say, when we first encounter it). •

Corollary A graph has an Euler path if and only if it is connected and exactly two of its vertices are of odd degree.

Proof: This statement is equivalent to Property 17.4 in the graph formed by adding an edge connecting the two vertices of odd degree (the ends of the path). •

Therefore, for example, there is no way for anyone to traverse all the bridges of Königsberg in a continuous path without retracing their steps, since all four vertices in the corresponding graph have odd degree (see Figure 17.21).
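The degree conditions in Property 17.4 and its corollary translate directly into a short test. The sketch below is an illustration under stated assumptions rather than the book's implementation: it takes a multigraph as a plain edge list, the names EulerKind, classifyEuler, and konigsberg are hypothetical, and the bridge list assumes the labelling used for the Königsberg multigraph (island 0, banks 1 and 2, land between the forks 3).

```cpp
#include <cassert>
#include <utility>
#include <vector>

enum class EulerKind { None, Path, Tour };

// Classifies an undirected multigraph (self-loops and parallel edges allowed)
// by the degree conditions: connected with no odd-degree vertex means a tour;
// connected with exactly two odd-degree vertices means a path between them.
EulerKind classifyEuler(int V, const std::vector<std::pair<int, int>>& edges) {
    assert(V > 0);
    std::vector<std::vector<int>> adj(V);
    std::vector<int> degree(V, 0);
    for (const auto& e : edges) {
        adj[e.first].push_back(e.second);
        adj[e.second].push_back(e.first);
        ++degree[e.first];
        ++degree[e.second];              // a self-loop adds 2 to its vertex
    }
    // Connectivity check: iterative depth-first search from vertex 0.
    std::vector<bool> seen(V, false);
    std::vector<int> stack{0};
    seen[0] = true;
    while (!stack.empty()) {
        int v = stack.back();
        stack.pop_back();
        for (int t : adj[v])
            if (!seen[t]) { seen[t] = true; stack.push_back(t); }
    }
    for (int v = 0; v < V; ++v)
        if (!seen[v]) return EulerKind::None;
    int odd = 0;
    for (int d : degree)
        if (d % 2 != 0) ++odd;
    if (odd == 0) return EulerKind::Tour;
    if (odd == 2) return EulerKind::Path;
    return EulerKind::None;
}

// Assumed encoding of the seven bridges: two from each bank to the island,
// one from the island to the land between the forks, one from each bank to it.
std::vector<std::pair<int, int>> konigsberg() {
    return {{0, 1}, {0, 1}, {0, 2}, {0, 2}, {0, 3}, {1, 3}, {2, 3}};
}
```

With this encoding every vertex has odd degree, so classifyEuler(4, konigsberg()) reports that neither a tour nor a path exists. Adding one more bridge between the two banks leaves exactly two odd-degree vertices, which is the situation that Exercise 17.95 asks about.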

Figure 17.21 Bridges of Königsberg

A well-known problem studied by Euler has to do with the town of Königsberg, in which there is an island at the point where the river Pregel forks. There are seven bridges connecting the island with the two banks of the river and the land between the forks, as shown in the diagram at top. Is there a way to cross the seven bridges in a continuous walk through the town, without recrossing any of them? If we label the island 0, the banks 1 and 2, and the land between the forks 3 and define an edge corresponding to each bridge, we get the multigraph shown at the bottom. The problem is to find a path through this graph that uses each edge exactly once.

Figure 17.22 Partial tours

Following edges from any vertex in a graph that has an Euler tour always takes us back to that vertex, as shown in these examples. The cycle does not necessarily use all the edges in the graph.

As discussed in Section 17.5, we can find all the vertex degrees in time proportional to E for the adjacency-lists or set-of-edges representation and in time proportional to V^2 for the adjacency-matrix representation, or we can maintain a vertex-indexed vector with vertex degrees as part of the graph representation (see Exercise 17.42). Given the vector, we can check whether Property 17.4 is satisfied in time proportional to V. Program 17.18 implements this strategy and demonstrates that determining whether a given graph has an Euler path is an easy computational problem. This fact is significant because we have little intuition to suggest that the problem should be easier than determining whether a given graph has a Hamilton path.

Now, suppose that we actually wish to find an Euler path. We are treading on thin ice because a direct recursive implementation (find a path by trying an edge and then making a recursive call to find a path for the rest of the graph) will have the same kind of factorial-time performance as Program 17.17. We expect not to have to live with such performance because it is so easy to test whether a path exists, so we seek a better algorithm. It is possible to avoid factorial-time blowup with a fixed-cost test for determining whether or not to use an edge (rather than unknown costs from the recursive call), but we leave this approach as an exercise (see Exercises 17.96 and 17.97).

Another approach is suggested by the proof of Property 17.4. Traverse a cyclic path, deleting the edges encountered and pushing onto a stack the vertices that it encounters, so that (i) we can trace back along that path, printing out its edges, and (ii) we can check each vertex for additional side paths (which can be spliced into the main path). This process is illustrated in Figure 17.23.

Program 17.19 is an implementation along these lines. It assumes that an Euler path exists, and it destroys its local copy of the graph; thus, it is important that the Graph class that this program uses have a copy constructor that creates a completely separate copy of the graph. The code is tricky; novices may wish to postpone trying to understand it until gaining more experience with graph-processing algorithms in the next few chapters. Our purpose in including it here is to illustrate that good algorithms and clever implementations can be very effective for solving some graph-processing problems.

Property 17.5 We can find an Euler tour in a graph, if one exists, in linear time.

Program 17.18 Euler path existence

With this class, clients can test for the existence of an Euler path in a graph. It keeps v and w as private data members so that clients can use the function member show (which uses the private function member tour) to print the path (see Program 17.19).

The test is based upon the corollary to Property 17.4 and uses Program 17.11. It takes time proportional to V, not including preprocessing time to check connectivity and to build the vertex-degree table in DEGREE.

template <class Graph> class ePATH
{ Graph G;
  int v, w;
  bool found;
  STACK <int> S;
  int tour(int v);
public:
  ePATH(const Graph &G, int v, int w) :
    G(G), v(v), w(w)
    { DEGREE<Graph> deg(G);
      int t = deg[v] + deg[w];
      if ((t % 2) != 0) { found = false; return; }
      for (t = 0; t < G.V(); t++)
        if ((t != v) && (t != w))
          if ((deg[t] % 2) != 0)
            { found = false; return; }
      found = true;
    }
  bool exists() const
    { return found; }
  void show();
};

We leave a full induction proof as an exercise (see Exercise 17.100). Informally, after the first call on tour, the stack contains a path from v to w, and the graph that remains (after removal of isolated vertices) consists of the smaller connected components (sharing at least one vertex with the path so far found) that also have Euler tours. We pop isolated vertices from the stack and use tour to find Euler tours that contain the nonisolated vertices, in the same manner. Each edge in the graph is pushed onto (and popped from) the stack exactly once, so the total running time is proportional to E. •

Program 17.19 Linear-time Euler path

This implementation of show for the class in Program 17.18 prints an Euler path between two given vertices, if one exists. Unlike most of our other implementations, this code relies on the Graph ADT implementation having a proper copy constructor, because it makes a copy of the graph and destroys the copy by removing edges from it while printing the path. With a constant-time implementation of remove (see Exercise 17.46), show runs in linear time. The private function member tour follows and removes edges on a cyclic path and pushes vertices onto a stack, to be checked for side loops (see text). The main loop calls tour as long as there are vertices with side loops to traverse.

template <class Graph>
int ePATH<Graph>::tour(int v)
  {
    while (true)
      { typename Graph::adjIterator A(G, v);
        int w = A.beg(); if (A.end()) break;
        S.push(v);
        G.remove(Edge(v, w));
        v = w;
      }
    return v;
  }
template <class Graph>
void ePATH<Graph>::show()
  {
    if (!found) return;
    while (tour(v) == v && !S.empty())
      { v = S.pop(); cout << "-" << v; }
    cout << endl;
  }

Figure 17.23 Euler tour by removing cycles

This figure shows how Program 17.19 discovers an Euler tour from 0 back to 0 in a sample graph. Thick black edges are those on the tour, the stack contents are listed below each diagram, and adjacency lists for the non-tour edges are shown at left.

First, the program adds the edge 0-1 to the tour and removes it from the adjacency lists (in two places) (top left, lists at left). Second, it adds 1-2 to the tour in the same way (left, second from top). Next, it winds up back at 0 but continues to do another cycle 0-5-4-6-0, winding up back at 0 with no more edges incident upon 0 (right, second from top). Then it pops the isolated vertices 0 and 6 from the stack until 4 is at the top and starts a tour from 4 (right, third from top), which takes it to 3, 2, and back to 4, whereupon it pops all the now-isolated vertices 4, 2, 3, and so forth. The sequence of vertices popped from the stack defines the Euler tour 0-6-4-2-3-4-5-0-2-1-0 of the whole graph.

Despite their appeal as a systematic way to traverse all the edges and vertices, we rarely use Euler tours in practice because few graphs have them. Instead, we typically use depth-first search to explore graphs, as described in detail in Chapter 18. Indeed, as we shall see, doing depth-first search in an undirected graph amounts to computing a two-way Euler tour: a path that traverses each edge exactly twice, once in each direction.

In summary, we have seen in this section that it is easy to find simple paths in graphs, that it is even easier to know whether we can tour all the edges of a large graph without revisiting any of them (by just checking that all vertex degrees are even), and that there is even a clever algorithm to find such a tour; but that it is practically impossible to know whether we can tour all the graph’s vertices without revisiting any. We have simple recursive solutions to all these problems, but the potential for exponential growth in the running time renders some of the solutions useless in practice. Others provide insights that lead to fast practical algorithms.

This range of difficulty among apparently similar problems that is illustrated by these examples is typical in graph processing, and is fundamental to the theory of computing. As discussed briefly in Section 17.8 and in detail in Part 8, we must acknowledge what seems to be an insurmountable barrier between problems that seem to require exponential time (such as the Hamilton tour problem and many other commonly encountered problems) and problems for which we know algorithms that can guarantee to find a solution in polynomial time (such as the Euler tour problem and many other commonly encountered problems). In this book, our primary objective is to develop efficient algorithms for problems in the latter class.
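For readers who want to experiment with the stack-based Euler tour method from this section without the book's Graph, STACK, and Edge classes, here is a minimal self-contained sketch. It is a variant, not Program 17.19 itself: instead of removing edges from a copy of the graph, it marks edges as used through a flag per edge, which gives the constant-time "removal" that the linear bound needs. The function name eulerTour and the edge-list input format are assumptions made for this illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Returns the vertex sequence of an Euler tour that starts and ends at
// `start`. Assumes (without checking) that every edge is reachable from
// `start` and that every vertex has even degree.
std::vector<int> eulerTour(int V,
                           const std::vector<std::pair<int, int>>& edges,
                           int start) {
    assert(start >= 0 && start < V);
    // adj[v] holds (neighbor, edge id) so both copies of an edge share a flag.
    std::vector<std::vector<std::pair<int, int>>> adj(V);
    for (std::size_t id = 0; id < edges.size(); ++id) {
        adj[edges[id].first].push_back({edges[id].second, static_cast<int>(id)});
        adj[edges[id].second].push_back({edges[id].first, static_cast<int>(id)});
    }
    std::vector<bool> used(edges.size(), false);
    std::vector<std::size_t> next(V, 0);   // next unexamined slot in adj[v]
    std::vector<int> stack{start};
    std::vector<int> tour;
    while (!stack.empty()) {
        int v = stack.back();
        // Skip edges already traversed from the other endpoint.
        while (next[v] < adj[v].size() && used[adj[v][next[v]].second])
            ++next[v];
        if (next[v] == adj[v].size()) {
            tour.push_back(v);             // v is isolated now: emit it
            stack.pop_back();
        } else {
            const std::pair<int, int>& e = adj[v][next[v]];
            used[e.second] = true;         // "remove" the edge
            stack.push_back(e.first);      // and follow it
        }
    }
    return tour;
}
```

Each adjacency-list slot is examined at most once and each edge is pushed and popped once, so the running time is proportional to V + E. As in the text, the tour is the sequence of vertices in the order in which they are popped from the stack.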

Exercises

• 17.85 Show, in the style of Figure 17.17, the trace of recursive calls (and vertices that are skipped) when Program 17.16 finds a path from 0 to 5 in the graph

3-7 1-4 7-8 0-5 5-2 3-8 2-9 0-6 4-9 2-6 6-4.

17.86 Modify the recursive function in Program 17.16 to print out a trace like Figure 17.17, using a global variable as described in the text.

17.87 Do Exercise 17.86 by adding an argument to the recursive function to keep track of the depth.

• 17.88 Using the method described in the text, give an implementation of sPATH that provides a public member function that calls a client-supplied function for each edge on a path from v to w, if any such path exists.

• 17.89 Modify Program 17.16 such that it takes a third argument d and tests the existence of a path connecting u and v of length greater than d. In particular, search(v, v, 2) should be nonzero if and only if v is on a cycle.

• 17.90 Run experiments to determine empirically the probability that Program 17.16 finds a path between two randomly chosen vertices for various graphs (see Exercises 17.63–76) and to calculate the average length of the paths found for each type of graph.

• 17.91 Consider the graphs defined by the following four sets of edges:

0-1 0-2 0-3 1-3 1-4 2-5 2-9 3-6 4-7 4-8 5-8 5-9 6-7 6-9 7-8
0-1 0-2 0-3 1-3 0-3 2-5 5-6 3-6 4-7 4-8 5-8 5-9 6-7 6-9 8-8
0-1 1-2 1-3 0-3 0-4 2-5 2-9 3-6 4-7 4-8 5-8 5-9 6-7 6-9 7-8
4-1 7-9 6-2 7-3 5-0 0-2 0-8 1-6 3-9 6-3 2-8 1-5 9-8 4-5 4-7.

Which of these graphs have Euler tours? Which of them have Hamilton tours?

• 17.92 Give necessary and sufficient conditions for a directed graph to have a (directed) Euler tour.

17.93 Prove that every connected undirected graph has a two-way Euler tour.

17.94 Modify the proof of Property 17.4 to make it work for graphs with parallel edges and self-loops.

• 17.95 Show that adding one more bridge could give a solution to the bridges-of-Königsberg problem.

• 17.96 Prove that a connected graph has an Euler path from v to w only if it has an edge incident on v whose removal does not disconnect the graph (except possibly by isolating v).

• 17.97 Use Exercise 17.96 to develop an efficient recursive method for finding an Euler tour in a graph that has one. Beyond the basic graph ADT functions, you may use the classes from this chapter that can give vertex degrees (see Program 17.11) and test whether a path exists between two given vertices (see Program 17.16). Implement and test your program for both sparse and dense graphs.

• 17.98 Give an example where the graph remaining after the first call to tour in Program 17.19 is not connected (in a graph that has an Euler tour).

• 17.99 Describe how to modify Program 17.19 so that it can be used to detect whether or not a given graph has an Euler tour, in linear time.

17.100 Give a complete proof by induction that the linear-time Euler tour algorithm described in the text and implemented in Program 17.19 properly finds an Euler tour.

• 17.101 Find the number of V-vertex graphs that have an Euler tour, for as large a value of V as you can feasibly afford to do the computation.

• 17.102 Run experiments to determine empirically the average length of the path found by the first call to tour in Program 17.19 for various graphs (see Exercises 17.63–76). Calculate the probability that this path is cyclic.

• 17.103 Write a program that computes a sequence of 2^n + n − 1 bits in which no two pairs of n consecutive bits match. (For example, for n = 3, the sequence 0001110100 has this property.) Hint: Find an Euler tour in a de Bruijn digraph.

• 17.104 Show, in the style of Figure 17.19, the trace of recursive calls (and vertices that are skipped), when Program 17.17 finds a Hamilton tour in the graph

3-7 1-4 7-8 0-5 5-2 3-8 2-9 0-6 4-9 2-6 6-4.

17.105 Modify Program 17.17 to print out the Hamilton tour if it finds one.

• 17.106 Find a Hamilton tour of the graph

1-2 5-2 4-2 2-6 0-8 3-0 1-3 3-6 1-0 1-4 4-0 4-6 6-5 2-6 6-9 9-0 3-1 4-3 9-2 4-9 6-9 7-9 5-0 9-7 7-3 4-5 0-5 7-8

or show that none exists.

•• 17.107 Determine how many V-vertex graphs have a Hamilton tour, for as large a value of V as you can feasibly afford to do the computation.

17.8 Graph-Processing Problems

Armed with the basic tools developed in this chapter, we consider in Chapters 18 through 22 a broad variety of algorithms for solving graph-processing problems. These algorithms are fundamental ones and are useful in many applications, but they serve as only an introduction to the subject of graph algorithms. Many interesting and useful algorithms have been developed that are beyond the scope of this book, and many interesting problems have been studied for which good algorithms have not yet been invented.

As is true in any domain, the first challenge that we face in addressing a new graph-processing problem is determining how difficult it is to solve. For graph processing, this decision can be far more difficult than we might imagine, even for problems that appear to be simple to solve. Moreover, our intuition is not always helpful in distinguishing easy problems from difficult or hitherto unsolved ones. In this section, we describe briefly important classical problems and the state of our knowledge of them.

Given a new graph-processing problem, what type of challenge do we face in developing an implementation to solve it? The unfortunate truth is that there is no good method to answer this question for any problem that we might encounter, but we can provide a general description of the difficulty of solving various classical graph-processing problems. To this end, we will roughly categorize the problems according to the difficulty of solving them, as follows:

• Easy
• Tractable
• Intractable
• Unknown

These terms are intended to convey information relative to one another and to the current state of knowledge about graph algorithms.

As indicated by the terminology, our primary reason for categorizing problems in this way is that there are many graph problems, such as the Hamilton tour problem, that no one knows how to solve efficiently. We will eventually see (in Part 8) how to make that statement meaningful in a precise technical sense; at this point, we can at least be warned that we face significant obstacles to writing an efficient program to solve these problems.

We defer full context on many of the graph-processing problems until later in the book. Here, we present brief statements that are easily understood, in order to introduce the general issue of classifying the difficulty of graph-processing problems.

An easy graph-processing problem is one that can be solved by the kind of elegant and efficient short programs to which we have grown accustomed in Parts 1 through 4. We often find the running time to be linear in the worst case, or bounded by a small-degree polynomial in the number of vertices or the number of edges. Generally, as we have done in many other domains, we can establish that a problem is easy by developing a brute-force solution that, although it may be too slow for huge graphs, is useful for small and even intermediate-sized problems. Then, once we know that the problem is easy, we look for efficient solutions that we might use in practice and try to identify the best among those. The Euler tour problem that we considered in Section 17.7 is a prime example of such a problem, and we shall see many others in Chapters 18 through 22 including, most notably, the following.

Simple connectivity Is a given graph connected? That is, is there a path connecting every pair of vertices? Is there a cycle in the graph, or is it a forest? Given two vertices, are they on a cycle? We first considered these basic graph-processing questions in Chapter 1. We consider numerous solutions to such problems in Chapter 18. Some are trivial to implement in linear time; others have rather sophisticated linear-time solutions that bear careful study.

Strong connectivity in digraphs Is there a directed path connecting every pair of vertices in a digraph? Given two vertices, are they connected by directed paths in both directions (are they on a directed cycle)? Implementing efficient solutions to these problems is a much more challenging task than for the corresponding simple-connectivity problem in undirected graphs, and much of Chapter 19 is devoted to studying them. Despite the clever intricacies involved in solving them, we classify the problems as easy because we can write a compact, efficient, and useful implementation.

Transitive closure What set of vertices can be reached by following directed edges from each vertex in a digraph? This problem is closely related to strong connectivity and to other fundamental computational problems. We study classical solutions that amount to a few lines of code in Chapter 19.

Minimum spanning tree In a weighted graph, find a minimum-weight set of edges that connects all the vertices. This is one of the oldest and best-studied graph-processing problems; Chapter 20 is devoted to the study of various classical algorithms to solve it. Researchers still seek faster algorithms for this problem.

Single-source shortest paths What are the shortest paths connecting a given vertex v with each other vertex in a weighted digraph (network)? Chapter 21 is devoted to the study of this problem, which is extremely important in numerous applications. The problem is decidedly not easy if edge weights can be negative.

A tractable graph-processing problem is one for which an algorithm is known whose time and space requirements are guaranteed to be bounded by a polynomial function of the size of the graph (V + E). All easy problems are tractable, but we make a distinction because many tractable problems have the property that developing an efficient practical program to solve them is an extremely challenging, if not impossible, task. Solutions may be too complicated to present in this book, because implementations might require hundreds or even thousands of lines of code. The following examples are two of the most important problems in this class.

Planarity. Can we draw a given graph without any of the lines that represent edges intersecting? We have the freedom to place the vertices anywhere, so we can solve this problem for many graphs, but it is impossible to solve for many other graphs. A remarkable classical result known as Kuratowski’s theorem provides an easy test for determining whether a graph is planar: it says that the only graphs that cannot be drawn with no edge intersections are those that contain some subgraph that, after removing vertices of degree 2, is isomorphic to one of the graphs in Figure 17.24. A straightforward implementation of that test, even without taking the vertices of degree 2 into consideration, would be too slow for large graphs (see Exercise 17.110), but in 1974 R. Tarjan developed an ingenious (but intricate) algorithm for solving the problem in linear time, using a depth-first search scheme that extends those that we consider in Chapter 18. Tarjan’s algorithm does not necessarily give a practical layout; it just certifies that a layout exists. As discussed in Section 17.1, developing a visually pleasing layout in applications where vertices do not necessarily relate directly to the physical world has turned out to be a challenging research problem.

Matching. Given a graph, what is the largest subset of its edges with the property that no two are connected to the same vertex? This classical problem is known to be solvable in time proportional to a polynomial function of V and E, but a fast algorithm that is suitable for huge graphs is still an elusive research goal. The problem is easier to solve when restricted in various ways. For example, the problem of matching students to available positions in selective institutions is an example of bipartite matching: We have two different types of vertices (students and institutions) and we are concerned with only those edges that connect a vertex of one type with a vertex of the other type. We see a solution to this problem in Chapter 22.

The solutions to some tractable problems have never been written down as programs, or have running times so high that we could not contemplate using them in practice. The following example is in this class. It also demonstrates the capricious nature of the mathematical reality of the difficulty of graph processing.

Figure 17.24 Forbidden subgraphs in planar graphs

Neither of these graphs can be drawn in the plane without intersecting edges, nor can any graph that contains either of these graphs as a subgraph (after we remove vertices of degree two); but all other graphs can be so drawn.

Even cycles in digraphs. Does a given digraph have a cycle of even length? This question would seem simple to resolve because the corresponding question for undirected graphs is easy to solve (see Section 18.4), as is the question of whether a digraph has a cycle of odd length. However, for many years, the problem was not sufficiently well understood for us even to know whether or not there exists an efficient algorithm for solving it (see reference section). A theorem establishing the existence of an efficient algorithm was proved in 1999, but the method is so complicated that no mathematician or programmer would attempt to implement it.

One of the important themes of Chapter 22 is that many tractable graph problems are best handled by algorithms that can solve a whole class of problems in a general setting. The shortest-paths algorithms of Chapter 21, the network-flow algorithms of Chapter 22, and the powerful network-simplex algorithm of Chapter 22 are capable of solving many graph problems that otherwise might present a significant challenge. Examples of such problems include the following.

Assignment. This problem, also known as bipartite weighted matching, is to find a perfect matching of minimum weight in a bipartite graph. It is easily solved with network-flow algorithms. Specific methods that attack the problem directly are known, but they have been shown to be essentially equivalent to network-flow solutions.

General connectivity. What is the minimum number of edges whose removal will separate a graph into two disjoint parts (edge connectivity)? What is the minimum number of vertices whose removal will separate a graph into two disjoint parts (vertex connectivity)? As we see in Chapter 22, these problems, although difficult to solve directly, can both be solved with network-flow algorithms.

Mail carrier. Given a graph, find a tour with a minimal number of edges that uses every edge in the graph at least once (but is allowed to use edges multiple times). This problem is much more difficult than the Euler tour problem but much less difficult than the Hamilton tour problem.

The step from convincing yourself that a problem is tractable to providing software that can be used in practical situations can be a large step, indeed. On the one hand, when proving that a problem is tractable, researchers tend to brush past numerous details that have to be dealt with in an implementation; on the other hand, they have to account for numerous potential situations even though they may not arise in practice. This gap between theory and practice is particularly acute when investigators are considering graph algorithms, both because mathematical research is filled with deep results describing a bewildering variety of structural properties that we may need to take into account when processing graphs, and because the relationships between those results and the properties of graphs that arise in practice are little understood. The development of general schemes such as the network-simplex algorithm has been an extremely effective approach to dealing with such problems.

An intractable graph-processing problem is one for which there is no known algorithm that is guaranteed to solve the problem in a reasonable amount of time. Many such problems have the characteristic that we can use a brute-force method where we can try all possibilities to compute the solution; we consider them to be intractable because there are far too many possibilities to consider. This class of problems is extensive and includes many important problems that we would like to know how to solve. The term NP-hard describes the problems in this class. Most experts believe that no efficient algorithms exist for these problems. We consider the bases for this terminology and this belief in more detail in Part 8. The Hamilton path problem that we discussed in Section 17.7 is a prime example of an NP-hard graph-processing problem, as are those on the following list.

Longest path. What is the longest simple path connecting two given vertices in a

graph? Despite its apparent similarity to shortest-paths problems, this problem is a version of the Hamilton tour problem, and is NP-hard.

Colorability. Is there a way to assign one of k colors to each of the vertices of a graph such that no edge connects two vertices of the same color? This classical problem is easy for k = 2 (see Section 18.4), but it is NP-hard for k = 3.

Independent set. What is the size of the largest subset of the vertices of a graph with the property that no two are connected by an edge? Just as we saw when contrasting the Euler and Hamilton tour problems, this problem is NP-hard, despite its apparent similarity to the matching problem, which is solvable in polynomial time.

Clique. What is the size of the largest clique (complete subgraph) in a given graph? This problem generalizes part of the planarity problem, because if the largest clique has more than four nodes, the graph is not planar.

These problems are formulated as existence problems: we are asked to determine whether or not a particular subgraph exists. Some of the problems ask for the size of the largest subgraph of a particular type, which we can do by framing an existence problem where we test for the existence of a subgraph of size k that satisfies the property of interest, then use binary search to find the largest. In practice, we actually often want to find a complete solution, which is potentially much harder to do. For example, the famous four-color theorem tells us that it is possible to use just four colors to color all the vertices of a planar graph such that no edge connects two vertices of the same color. But the theorem does not tell us how to do so for a particular planar graph: knowing that a coloring exists does not help us find a complete solution to the problem. Another famous example is the traveling salesperson problem, which asks us to find the minimum-length tour through all the vertices of a weighted graph. This problem is related to the Hamilton path problem, but it is certainly no easier: if we cannot find an efficient solution to the Hamilton path problem, we cannot expect to find one for the traveling salesperson problem. As a rule, when faced with difficult problems, we work with the simplest version that we cannot solve. Existence problems are within the spirit of this rule, but they also play an essential role in the theory, as we shall see in Part 8.

The problems just listed are but a few of the thousands of NP-hard problems that have been identified. They arise in all types of computational applications, as we shall discuss in Part 8; they are particularly prevalent in graph processing, so we have to be aware of their existence throughout this book.

Note that we are insisting that our algorithms guarantee efficiency, in the worst

case. Perhaps we should instead settle for algorithms that are efficient for typical inputs (but not necessarily in the worst case). Similarly, many of the problems involve optimization. Perhaps we should instead settle for a long path (not necessarily the longest) or a large clique (not necessarily the maximum). For graph processing, it might be easy to find a good answer for graphs that arise in practice, and we may not even be interested in looking for an algorithm that could find an optimal solution in fictional graphs that we will never see. Indeed, intractable problems can often be attacked with straightforward or general-purpose algorithms similar to Program 17.17 that, although they have exponential running time in the worst case, can quickly find a solution (or a good approximation) for specific problem instances that arise in practice. We would be reluctant to use a program that will crash or produce a bad answer for certain inputs, but we do sometimes find ourselves using programs that run in exponential time for certain inputs. We consider this situation in Part 8.

There are also many research results proving that various intractable problems remain intractable even when we relax various restrictions. Moreover, there are many specific practical problems that we cannot solve because no one knows a sufficiently fast algorithm. In this part of the book, we will label problems NP-hard when we encounter them and interpret this label as meaning, at the very least, that we are not going to expect to find efficient algorithms to solve them and that we are not going to attack them without using advanced techniques of the type discussed in Part 8 (except perhaps to use brute-force methods to solve tiny problems).

There are some graph-processing problems whose difficulty is unknown. Neither is there an efficient algorithm known for them, nor are they known to be NP-hard. It is possible, as our knowledge of graph-processing algorithms and properties of graphs expands, that some of these problems will turn out to be tractable, or even easy. The following important natural problem, which we have already encountered (see Figure 17.2), is the best-known problem in this class.

Graph isomorphism. Can we make two given graphs identical by renaming vertices? Efficient algorithms are known for this problem for many special types of graphs, but the difficulty of the general problem remains open.

The number of significant problems whose intrinsic difficulty is unknown is very small in comparison to the other categories that we have considered, because of intense research in this field over the past several decades. Certain problems in this class, such as graph isomorphism, are of immense practical interest; other problems in this class are of significance primarily by virtue of having resisted

classification.

Table 17.2 Difficulty of classical graph-processing problems

This table summarizes the discussion in the text about the relative difficulty of various classical graph-processing problems, comparing them in rough subjective terms. These examples indicate not only the range of difficulty of the problems but also that classifying a given problem can be a challenging task.

For the class of easy algorithms, we are used to the idea of comparing algorithms with different worst-case performance characteristics and trying to predict performance through analysis and empirical tests. For graph processing, these tasks are particularly challenging because of the difficulty of characterizing the types of graphs that might arise in practice. Fortunately, many of the important classical algorithms have optimal or near-optimal worst-case performance, or have a running time that depends on only the number of vertices and edges, rather than on the graph structure; we thus can concentrate on streamlining implementations and still can predict performance with confidence.

In summary, there is a wide spectrum of problems and algorithms known for graph processing. Table 17.2 summarizes some of the information that we have discussed. Every problem comes in different versions for different types of graphs (directed, weighted, bipartite, planar, sparse, dense), and there are thousands of problems and algorithms to consider. We certainly cannot expect to solve every problem that we might encounter, and some problems that appear to be simple are still baffling the experts. Despite a natural a priori expectation that we should have no problem distinguishing easy problems from intractable ones, the many examples that we have discussed illustrate that placing a problem even into these rough categories can turn into a significant research challenge.

As our knowledge about graphs and graph algorithms expands, given problems may move among these categories. Despite a flurry of research activity in the 1970s and intensive work by many researchers since then, the possibility still remains that all the problems that we are discussing will someday be categorized as “easy” (solvable by an algorithm that is compact, efficient, and possibly ingenious).

Having developed this context, we shall press on to consider numerous useful graph-processing algorithms. Problems that we can solve do arise often, the graph algorithms that we study serve well in a great variety of applications, and these algorithms serve as the basis for attacking numerous other problems that we need to handle even if we cannot guarantee efficient solutions.
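The remark in this section about answering a "largest subgraph" question by repeated existence tests can be made concrete for the clique problem. The following is a minimal sketch, not code from this book: the names AdjMatrix, extendClique, hasClique, and largestClique are hypothetical, the graph is assumed to be a symmetric adjacency matrix, and the existence test is a brute-force search whose running time is exponential in the worst case. Binary search applies here because a clique of size k contains cliques of every smaller size.

```cpp
#include <cassert>
#include <vector>

// Assumed representation: adj[u][v] is true when edge u-v is present.
using AdjMatrix = std::vector<std::vector<bool>>;

// Brute force: try to grow the partial clique 'chosen' to k vertices,
// considering only candidate vertices numbered 'next' or higher.
bool extendClique(const AdjMatrix& adj, std::vector<int>& chosen,
                  int next, int k)
{
    if (static_cast<int>(chosen.size()) == k) return true;
    const int V = static_cast<int>(adj.size());
    for (int v = next; v < V; v++) {
        bool adjacentToAll = true;
        for (int u : chosen)
            if (!adj[u][v]) { adjacentToAll = false; break; }
        if (!adjacentToAll) continue;
        chosen.push_back(v);
        if (extendClique(adj, chosen, v + 1, k)) return true;
        chosen.pop_back();
    }
    return false;
}

// Existence problem: is there a clique with exactly k vertices?
bool hasClique(const AdjMatrix& adj, int k)
{
    assert(k >= 0);
    std::vector<int> chosen;
    return extendClique(adj, chosen, 0, k);
}

// Optimization problem solved through the existence problem:
// binary search for the largest k such that hasClique(adj, k) holds.
int largestClique(const AdjMatrix& adj)
{
    int lo = 0, hi = static_cast<int>(adj.size());  // hasClique(adj, 0) holds
    while (lo < hi) {
        int mid = lo + (hi - lo + 1) / 2;
        if (hasClique(adj, mid)) lo = mid;
        else hi = mid - 1;
    }
    return lo;
}
```

The binary search makes only about lg V calls on the existence test, so the cost of the optimization version is dominated by the cost of the existence test itself, which is where the difficulty of the problem lies.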

Exercises

• 17.108 Prove that neither of the two graphs depicted in Figure 17.24 is planar.

17.109 Write a graph ADT client that determines whether or not a graph contains one of the graphs depicted in Figure 17.24, using a brute-force algorithm where you test all possible subsets of five vertices for the clique and all possible subsets of six vertices for the complete bipartite graph. Note: This test does not suffice to show whether the graph is planar, because it ignores the condition that removing vertices of degree 2 in some subgraph might give one of the two forbidden subgraphs.

17.110 Give a drawing of the graph

3-7 1-4 7-8 0-5 5-2 3-0 2-9 0-6 4-9 2-6 6-4 1-5 8-2 9-0 8-3 4-5 2-3 1-6 3-5 7-6

that has no intersecting edges, or prove that no such drawing exists.

17.111 Find a way to assign three colors to the vertices of the graph

3-7 1-4 7-8 0-5 5-2 3-0 2-9 0-6 4-9 2-6 6-4 1-5 8-2 9-0 8-3 4-5 2-3 1-6 3-5 7-6

such that no edge connects two vertices of the same color, or show that it is not possible to do so.

17.112 Solve the independent-set problem for the graph

3-7 1-4 7-8 0-5 5-2 3-0 2-9 0-6 4-9 2-6 6-4 1-5 8-2 9-0 8-3 4-5 2-3 1-6 3-5 7-6.

17.113 What is the size of the largest clique in a de Bruijn graph of order n?

CHAPTER EIGHTEEN
Graph Search

WE OFTEN LEARN properties of a graph by systematically examining each of its vertices and each of its edges. Determining some simple graph properties, such as the degrees of all the vertices, is easy if we just examine each edge (in any order whatever). Many other graph properties are related to paths, so a natural way to learn them is to move from vertex to vertex along the graph’s edges. Nearly all the graph-processing algorithms that we consider use this basic abstract model. In this chapter, we consider the fundamental graph-search algorithms that we use to move through graphs, learning their structural properties as we go.

Graph searching in this way is equivalent to exploring a maze. Specifically, passages in the maze correspond to edges in the graph, and points where passages intersect in the maze correspond to vertices in the graph. When a program changes the value of a variable from vertex v to vertex w because of an edge v-w, we view it as equivalent to a person in a maze moving from point v to point w. We begin this chapter by examining a systematic exploration of a maze. By correspondence with this process, we see precisely how the basic graph-search algorithms proceed through every edge and every vertex in a graph.

In particular, the recursive depth-first search (DFS) algorithm corresponds precisely to the particular maze-exploration strategy of Section 18.1. DFS is a classic and versatile algorithm that we use to solve connectivity and numerous other graph-processing problems. The basic algorithm admits two simple implementations: one that is recursive, and another that uses an explicit stack. Replacing the stack with a FIFO queue leads to another classic algorithm, breadth-first search (BFS), which we use to solve another class of graph-processing problems related to shortest paths.
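The observation that swapping a stack for a FIFO queue turns one search into another can be illustrated with a small sketch. The code below is not one of this chapter's programs: the names AdjLists and visitOrder are hypothetical, the graph is assumed to be given as adjacency lists, and the stack-based variant shown is only DFS-like, since it need not visit vertices in the same order as the recursive algorithm.

```cpp
#include <cassert>
#include <deque>
#include <vector>

// Assumed representation: adj[v] lists the vertices adjacent to v.
using AdjLists = std::vector<std::vector<int>>;

// Returns the order in which the vertices reachable from 'start' are visited.
// useQueue == false: remove the most recently saved vertex (stack).
// useQueue == true : remove the least recently saved vertex (FIFO queue).
std::vector<int> visitOrder(const AdjLists& adj, int start, bool useQueue)
{
    assert(start >= 0 && start < static_cast<int>(adj.size()));
    std::vector<bool> visited(adj.size(), false);
    std::vector<int> order;
    std::deque<int> fringe;          // serves as either stack or queue
    fringe.push_back(start);
    while (!fringe.empty()) {
        int v;
        if (useQueue) { v = fringe.front(); fringe.pop_front(); }
        else          { v = fringe.back();  fringe.pop_back();  }
        if (visited[v]) continue;    // a vertex may be saved more than once
        visited[v] = true;
        order.push_back(v);
        for (int w : adj[v])
            if (!visited[w]) fringe.push_back(w);
    }
    return order;
}
```

The only difference between the two searches in this sketch is the end of the deque from which the next vertex is taken; everything else, including the marking of visited vertices, is shared.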

Figure 18.1 Exploring a maze

We can explore every passageway in a simple maze by following a simple rule such as “keep your right hand on the wall.” Following this rule in the maze at the top, we explore the whole maze, going through each passage once in each direction. But if we follow this rule in a maze with a cycle, we return to the starting point without exploring the whole maze, as illustrated in the maze at the bottom.

The main topics of this chapter are DFS, BFS, their related algorithms, and their application to graph processing. We briefly considered DFS and BFS in Chapter 5; we treat them from first principles here, in the context of search-based graph-processing classes, and use them to demonstrate relationships among various graph algorithms. In particular, we consider a general approach to searching graphs that encompasses a number of classical graph-processing algorithms, including both DFS and BFS.

As illustrations of the application of these basic graph-searching methods to solve more complicated problems, we consider algorithms for finding connected components, biconnected components, spanning trees, and shortest paths, and for solving numerous other graph-processing problems. These implementations exemplify the approach that we shall use to solve more difficult problems in Chapters 19 through 22.

We conclude the chapter by considering the basic issues involved in the analysis of graph algorithms, in the context of a case study comparing several different algorithms for finding the number of connected components in a graph.

18.1 Exploring a Maze

It is instructive to think about the process of searching through a graph in terms of an equivalent problem that has a long and distinguished history (see reference section): finding our way through a maze that consists of passages connected by intersections. This section presents a detailed study of a basic method for exploring every passage in any given maze. Some mazes can be handled with a simple rule, but most mazes require a more sophisticated strategy (see Figure 18.1). Using the terminology maze instead of graph, passage instead of edge, and intersection instead of vertex is making mere semantic distinctions, but, for the moment, doing so will help to give us an intuitive feel for the problem.

One trick for exploring a maze without getting lost that has been known since antiquity (dating back at least to the legend of Theseus and the Minotaur) is to unroll a ball of string behind us. The string guarantees that we can always find a way out, but we are also interested in being sure that we have explored every part of the maze, and we do not want to retrace our steps unless we have to. To accomplish these goals, we need some way to mark places that we have been. We could use the string for this purpose as well, but we use an alternative approach that models a computer implementation more closely.

We assume that there are lights, initially off, in every intersection, and doors, initially closed, at both ends of every passage. We further assume that the doors have windows and that the lights are sufficiently strong and the passages sufficiently straight that we can determine, by opening the door at one end of a passage, whether or not the intersection at the other end is lit (even if the door at the other end is closed). Our goals are to turn on all the lights and to open all the doors. To reach them, we need a set of rules to follow, systematically. The following maze-exploration strategy, which we refer to as Trémaux exploration, has been known at least since the nineteenth century (see reference section):

(i) If there are no closed doors at the current intersection, go to step (iii). Otherwise, open any closed door to any passage leading out of the current intersection (and leave it open).

(ii) If you can see that the intersection at the other end of that passage is already lighted, try another door at the current intersection (step (i)). Otherwise (if you can see that the intersection at the other end of the passage is dark), follow the passage to that intersection, unrolling the string as you go, turn on the light, and go to step (i).

(iii) If all the doors at the current intersection are open, check whether you are back at the start point. If so, stop. If not, use the string to go back down the passage that brought you to this intersection for the first time, rolling the string back up as you go, and look for another closed door there (that is, return to step (i)).

Figures 18.2 and 18.3 depict a traversal of a sample graph and show that, indeed, every light is lit and every door is opened for that example. The figures depict just one of many possible outcomes of the exploration, because we are free to open the doors in any order at each intersection. Convincing ourselves that this method is always effective is an interesting exercise in mathematical induction.

Figure 18.2 Trémaux maze exploration example

In this diagram, places that we have not visited are shaded (dark) and places that we have visited are white (light). We assume that there are lights in the intersections, and that, when we have opened doors into lighted intersections on both ends of a passage, the passage is lighted. To explore the maze, we begin at 0 and take the passage to 2 (left, top). Then we proceed to 6, 4, 3, and 5, opening the doors to the passages, lighting the intersections as we proceed through them, and leaving a string trailing behind us (left). When we open the door that leads to 0 from 5, we can see that 0 is lighted, so we skip that passage (top right). Similarly, we skip the passage from 5 to 4 (right, second from top), leaving us with nowhere to go from 5 except back to 3 and then back to 4, rolling up our ball of string. When we open the doorway from the passage from 4 to 5, we can see that 5 is lighted through the open door at the other end, and we therefore skip that passage (right, bottom). We never walked down the passage connecting 4 and 5, but we lighted it by opening the doors at both ends.

Figure 18.3 Trémaux maze exploration example (continued)

Next, we proceed to 7 (top left), open the door to see that 0 is lighted (left, second from top), and then proceed to 1 (left, third from top). At this point, most of the maze is traversed, and we use our string to take us back to the beginning, moving from 1 to 7 to 4 to 6 to 2 to 0. Back at 0, we complete our exploration by checking the passages to 5 (right, second from bottom) and 7 (bottom right), leaving all passages and intersections lighted. Again, the passages connecting 0 to 5 and 0 to 7 are both lighted because we opened the doors at both ends, but we did not walk through them.

Figure 18.4 Decomposing a maze

To prove by induction that Trémaux exploration takes us everywhere in a maze (top), we break it into two smaller pieces, by removing all edges connecting the first intersection with any intersection that can be reached from the first passage without going back through the first intersection (bottom).

Property 18.1 When we use Trémaux maze exploration, we light all lights and open all doors in the maze and end up back where we started.

Proof: To prove this assertion by induction, we first note that it holds, trivially, for a maze that contains one intersection and no passages: we just turn on the light. For any maze that contains more than one intersection, we assume the property to be true for all mazes with fewer intersections. It suffices to show that we visit all intersections, since we open all the doors at every intersection that we visit. Now, consider the first passage that we take from the first intersection, and divide the intersections into two subsets: (i) those that we can reach by taking that passage without returning to the start, and (ii) those that we cannot reach from that passage without returning to the start. Applying the inductive hypothesis, we know that we visit all intersections in (i) (ignoring any passages back to the start intersection, which is lit) and end up back at the start intersection. Then, applying the inductive hypothesis again, we know that we visit all intersections in (ii) (ignoring the passages from the start to intersections in (i), which are lit). •

From the detailed example in Figures 18.2 and 18.3, we see that there are four different possible situations that arise for each passage that we consider taking:

(i) The passage is dark, so we take it.

(ii) The passage is the one that we used to enter (it has our string in it), so we use it to exit (and we roll up the string).

(iii) The door at the other end of the passage is closed (but the intersection is lit), so we skip the passage.

(iv) The door at the other end of the passage is open (and the intersection is lit), so we skip it.

The first and second situations apply to any passage that we traverse, first at one end and then at the other end. The third and fourth situations apply to any passage that we skip, first at one end and then at the other end. Next, we see how this perspective on maze exploration translates directly to graph search.

Exercises

• 18.1 Assume that intersections 6 and 7 (and all the hallways connected to them) are removed from the maze in Figures 18.2 and 18.3, and a hallway is added that connects 1 and 2. Show a Trémaux exploration of the resulting maze, in the style of Figures 18.2 and 18.3.

• 18.2 Which of the following could not be the order in which lights are turned on at the intersections during a Trémaux exploration of the maze depicted in Figures 18.2 and 18.3?

0-7-4-5-3-1-6-2
0-2-6-4-3-7-1-5
0-5-3-4-7-1-6-2
0-7-4-6-2-1-3-5

• 18.3 How many different ways are there to traverse the maze depicted in Figures 18.2 and 18.3 with a Trémaux exploration?

18.2 Depth-First Search

Our interest in Trémaux exploration is that this technique leads us immediately to the classic recursive function for traversing graphs: To visit a vertex, we mark it as having been visited, then (recursively) visit all the vertices that are adjacent to it and that have not yet been marked. This method, which we considered briefly in Chapters 3 and 5 and used to solve path problems in Section 17.7, is called depth-first search (DFS). It is one of the most important algorithms that we shall encounter. DFS is deceptively simple because it is based on a familiar concept and is easy to implement; in fact, it is a subtle and powerful algorithm that we put to use for numerous difficult graph-processing tasks.

Program 18.1 is a DFS class that visits all the vertices and examines all the edges in a connected graph. Like the simple path-search functions that we considered in Section 17.7, it is based on a recursive function that keeps a private vector to mark vertices as having been visited. In this implementation, ord is a vector of integers that records the order in which vertices are visited. Figure 18.5 is a trace that shows the order in which Program 18.1 visits the edges and vertices for the example depicted in Figures 18.2 and 18.3 (see also Figure 18.17), when the adjacency-matrix graph implementation DenseGRAPH of Section 17.3 is used. Figure 18.6 depicts the maze-exploration process using standard graph drawings.

These figures illustrate the dynamics of a recursive DFS and show the correspondence with Trémaux exploration of a maze. First, the vertex-indexed vector corresponds to the lights in the intersections: When we encounter an edge to a vertex that we have already visited (see a light at the end of the passage), we do not make a recursive

Figure 18.5 DFS trace

This trace shows the order in which DFS checks the edges and vertices for the adjacency-matrix representation of the graph corresponding to the example in Figures 18.2 and 18.3 (top) and traces the contents of the ord vector (right) as the search progresses (asterisks represent -1, for unseen vertices). There are two lines in the trace for every graph edge (once for each orientation). Indentation indicates the level of recursion.

Program 18.1 Depth-first search of a connected component

This DFS class corresponds to Trémaux exploration. The constructor marks as visited all vertices in the same connected component as v by calling the recursive searchC, which visits all the vertices adjacent to v by checking them all and calling itself for each edge that leads from v to an unmarked vertex. Clients can use the count function to learn the number of vertices encountered and the overloaded [] operator to learn the order in which the search visited the vertices.

call to follow that edge (go down that passage). Second, the function call–return mechanism in the program corresponds to the string in the maze: When we have processed all the edges adjacent to a vertex (explored all the passages leaving an intersection), we “return” (in both senses of the word). In the same way that we encounter each passage in the maze twice (once at each end), we encounter each edge in the graph twice (once at each of its vertices). In Trémaux exploration, we open the doors at each end of each passage; in DFS of an undirected graph, we

Figure 18.6 Depth-first search

These diagrams are a graphical view of the process depicted in Figure 18.5, showing the DFS recursive-call tree as it evolves. Thick black edges in the graph correspond to edges in the DFS tree shown to the right of each graph diagram. Shaded edges are the candidates to be added to the tree next. In the early stages (left) the tree grows down in a straight line, as we make recursive calls for 0, 2, 6, and 4 (left). Then we make recursive calls for 3, then 5 (right, top two diagrams); and return from those calls to make a recursive call for 7 from 4 (right, second from bottom) and to 1 from 7 (right, bottom).

Figure 18.7 DFS trace (adjacency lists)

This trace shows the order in which DFS checks the edges and vertices for the adjacency-lists representation of the same graph as in Figure 18.5.

check each of the two representations of each edge. If we encounter an edge v-w, we either do a recursive call (if w is not marked) or skip the edge (if w is marked). The second time that we encounter the edge, in the opposite orientation w-v, we always ignore it, because the destination vertex v has certainly already been visited (the first time that we encountered the edge).

One difference between DFS as implemented in Program 18.1 and Trémaux exploration as depicted in Figures 18.2 and 18.3, although it is inconsequential in many contexts, is worth taking the time to understand. When we move from vertex v to vertex w, we have not examined any of the entries in the adjacency matrix that correspond to edges from w to other vertices in the graph. In particular, we know that there is an edge from w to v and that we will ignore that edge when we get to it (because v is marked as visited). That decision happens at a different time than in Trémaux exploration, where we open the doors corresponding to the edge from v to w when we go to w for the first time, from v. If we were to close those doors on the way in and open them on the way out (having identified the passage with the string), then we would have a precise correspondence between DFS and Trémaux exploration.

Figure 18.6 also depicts the tree corresponding to the recursive calls as it evolves, in correspondence with Figure 18.5. This recursive-call tree, which is known as the DFS tree, is a structural description of the search process. As we see in Section 18.4, the DFS tree, properly augmented, can provide a full description of the search dynamics, in addition to just the call structure.

The order in which vertices are visited depends not just on the graph, but on its representation and ADT implementation. For example, Figure 18.7 shows the search dynamic when the SparseMultiGRAPH adjacency-lists implementation of Section 17.4 is used. For the adjacency-matrix representation, we examine the edges incident on each vertex in numerical order; for the adjacency-lists representation, we examine them in the order that they appear on the lists. This difference leads to a completely different recursive search dynamic, as would differences in the order in which edges appear on the lists (which occur, for example, when the same graph is constructed by inserting edges in a different order). Note also that the existence of parallel edges is inconsequential for DFS because any edge that is parallel to an edge that has already been traversed is ignored, since its destination vertex has been visited.

Despite all of these possibilities, the critical fact remains that DFS visits all the edges and all the vertices connected to the start vertex, regardless of the order in which it examines the edges incident on each vertex. This fact is a direct consequence of Property 18.1, since the proof of that property does not depend on the order in which the doors are opened at any given intersection. All the DFS-based algorithms that we examine have this same essential property. Although the dynamics of their operation might vary substantially depending on the graph representation and details of the implementation of the search, the recursive structure gives us a way to make relevant inferences about the graph itself, no matter how it is represented and no matter which order we choose to examine the edges incident upon each vertex.
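As a concrete illustration of the recursive search described in this section, the following is a minimal self-contained sketch. It is not Program 18.1: the names AdjMatrix and DepthFirstOrder are hypothetical, and a plain symmetric adjacency matrix stands in for the graph ADT. It does follow the description above in recording, in a vertex-indexed vector ord initialized to -1, the order in which vertices are visited.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for a graph ADT: adj[v][w] is true for edge v-w.
using AdjMatrix = std::vector<std::vector<bool>>;

class DepthFirstOrder
{
    const AdjMatrix& adj;   // the caller must keep the graph alive
    int cnt = 0;            // number of vertices visited so far
    std::vector<int> ord;   // ord[v] is -1 until v is visited

    void visit(int v)
    {
        ord[v] = cnt++;                        // turn on the light at v
        for (int w = 0; w < static_cast<int>(adj.size()); w++)
            if (adj[v][w] && ord[w] == -1)     // dark intersection: go there
                visit(w);
        // Returning from visit plays the role of rolling up the string.
    }

  public:
    DepthFirstOrder(const AdjMatrix& g, int start)
        : adj(g), ord(g.size(), -1)
    {
        assert(start >= 0 && start < static_cast<int>(g.size()));
        visit(start);
    }
    int count() const { return cnt; }
    int operator[](int v) const { return ord[v]; }
};
```

Reading the passages off the captions of Figures 18.2 and 18.3 gives the edges 0-2, 0-5, 0-7, 1-7, 2-6, 3-4, 3-5, 4-5, 4-6, and 4-7. Started at 0 on that graph, this sketch examines neighbors in increasing numerical order and visits the vertices in the order 0, 2, 6, 4, 3, 5, 7, 1, which agrees with the exploration described in those captions.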

Exercises

• 18.4 Add a public member function to Program 18.1 that returns the size of the connected component searched by the constructor.

18.5 Write a client program like Program 17.6 that scans a graph from standard input, uses Program 18.1 to run a search starting at each vertex, and prints out the parent-link representation of each spanning forest. Use the DenseGRAPH graph ADT implementation from Section 17.3.

• 18.6 Show, in the style of Figure 18.5, a trace of the recursive function calls made when a cDFS<DenseGRAPH> object is constructed for the graph

0-2 0-5 1-2 3-4 4-5 3-5.

Draw the corresponding DFS recursive-call tree.

18.7 Show, in the style of Figure 18.6, the progress of the search for the example in Exercise 18.6.

18.3 Graph-Search ADT Functions

DFS and the other graph-search methods that we consider later in this chapter all involve following graph edges from vertex to vertex, with the goal of systematically visiting every vertex and every edge in the graph. But following graph edges from vertex to vertex can lead us to all the vertices in only the same connected component as the starting vertex. In general, of course, graphs might not be connected, so we need one call on a search function for each connected component. We

Program 18.2 Graph search

This base class is for processing graphs that may not be connected. Derived classes must define a function searchC that, when called with a self-loop to v as its argument, sets ord[t] to cnt++ for each vertex t in the same connected component as v. Typically, constructors in derived classes call search, which calls searchC once for each connected component in the graph.

template <class Graph> class SEARCH
{
  protected:
    const Graph &G;
    int cnt;
    vector<int> ord;
    virtual void searchC(Edge) = 0;
    void search()
      { for (int v = 0; v < G.V(); v++)
          if (ord[v] == -1) searchC(Edge(v, v)); }
  public:
    SEARCH(const Graph &G) : G(G),
      ord(G.V(), -1), cnt(0) { }
    int operator[](int v) const { return ord[v]; }
};

will typically use graph-search functions that perform the following steps until all of the vertices of the graph have been marked as having been visited:

• Find an unmarked vertex (a start vertex).
• Visit (and mark as visited) all the vertices in the connected component that contains the start vertex.

The method for marking vertices is not specified in this description, but we most often use the same method that we used for the DFS implementations in Section 18.2: We initialize all entries in a private vertex-indexed vector to a negative integer, and mark vertices by setting their corresponding entry to a nonnegative value. Using this procedure amounts to using a single bit (the sign bit) for the mark; most implementations are also concerned with keeping other information associated with marked vertices in the vector (such as, for the DFS implementation in Section 18.2, the order in which vertices are marked). The method for looking for a vertex in the next connected component is also not specified, but we most often use a scan through the vector in order of increasing index.

We pass an edge to the search function (using a dummy self-loop in the first call for each connected component), instead of passing its destination vertex, because the edge tells us how we reached the vertex. Knowing the edge corresponds to knowing which passage led to a particular intersection in a maze. This information is useful in many DFS classes. When we are simply keeping track of which vertices we have visited, this information is of little consequence; but more interesting problems require that we always know from whence we came.

Program 18.2 is an implementation that illustrates these choices. Figure 18.8 gives an example that illustrates how every vertex is visited, by the effect on the ord vector of any derived class. Typically, the derived classes that we consider also examine all edges incident upon each vertex visited. In such cases, knowing that we visit all vertices tells us that we visit all edges as well, as in Trémaux traversal.

Program 18.3 is an example that shows how we derive a DFS-based class for computing a spanning forest from the SEARCH base class of Program 18.2. We include a private vector st in the derived class to hold a parent-link representation of the tree that we initialize in the constructor; define a searchC that is similar to searchC from Program 18.1, except that it takes an edge v-w as argument and sets st[w] to v; and add a public member function that allows clients to learn the parent of any vertex. Spanning forests are of interest in many applications, but our primary interest in them in this chapter is their relevance in understanding the dynamic behavior of the DFS, the topic of Section 18.4.

In a connected graph, the search function in Program 18.2 calls searchC once for 0-0 and then finds that all the other vertices are marked. In a graph with more than one connected component, it checks all the connected components in a straightforward manner. DFS is the first of several methods that we consider for searching a connected component. No matter which method (and no matter what graph representation) we use, Program 18.2 is an effective method for visiting all the graph vertices.

Figure 18.8 Graph search

The table at the bottom shows vertex marks (contents of the ord vector) during a typical search of the graph at the top. Initially, the SEARCH constructor in Program 18.2 unmarks all vertices by setting the marks all to -1 (indicated by an asterisk). Then search calls searchC for the dummy edge 0-0, which marks all of the vertices in the same connected component as 0 (second row) by setting them to nonnegative values (indicated by 0s). In this example, it marks 0, 1, 4, and 9 with the values 0 through 3 in that order. Next, it scans from left to right to find the unmarked vertex 2 and calls searchC for the dummy edge 2-2 (third row), which marks the seven vertices in the same connected component as 2. Continuing the left-to-right scan, it calls searchC for 8-8 to mark 8 and 11 (bottom row). Finally, search completes the scan by discovering that 9 through 12 are all marked.

Program 18.3 Derived class for depth-first search

This code shows how we derive a spanning-forest DFS class from the base class defined in Program 18.2. The constructor builds a representation of the forest in st (parent links) along with ord (from the base class). Clients can use a DFS object to find any given vertex's parent in the forest (ST), or any given vertex's position in a preorder walk of the forest (overloaded [] operator). Properties of these forests and representations are the topic of Section 18.4.


template <class Graph>
class DFS : public SEARCH<Graph>
{ vector<int> st;
  void searchC(Edge e)
    { int w = e.w;
      ord[w] = cnt++; st[e.w] = e.v;
      typename Graph::adjIterator A(G, w);
      for (int t = A.beg(); !A.end(); t = A.nxt())
        if (ord[t] == -1) searchC(Edge(w, t));
    }
public:
  DFS(const Graph &G) : SEARCH<Graph>(G),
    st(G.V(), -1) { search(); }
  int ST(int v) const { return st[v]; }
};
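To try Programs 18.2 and 18.3 without the full graph ADT of Chapter 17, we can supply small stand-ins for the pieces that they rely on. The sketch below is an assumption-laden illustration rather than the book's library: Edge is reduced to two public int members, and TinyGraph is a hypothetical adjacency-matrix class that offers only V(), insert, and a nested adjIterator with beg, nxt, and end. Standard-conforming compilers do not look up unqualified names in a dependent base class, so this version adds using-declarations and calls this->search(); it also lists the initializers in declaration order.

```cpp
#include <cassert>
#include <vector>

// Stand-in for the Edge type (assumed: two public int members v and w).
struct Edge {
    int v, w;
    Edge(int v = -1, int w = -1) : v(v), w(w) {}
};

// Hypothetical minimal adjacency-matrix graph exposing only the members
// that the two programs use: V() and a nested adjIterator.
class TinyGraph {
    int Vcnt;
    std::vector<std::vector<bool>> adj;
public:
    explicit TinyGraph(int V) : Vcnt(V), adj(V, std::vector<bool>(V, false)) {}
    int V() const { return Vcnt; }
    void insert(Edge e)
    {
        assert(e.v >= 0 && e.v < Vcnt && e.w >= 0 && e.w < Vcnt);
        adj[e.v][e.w] = true;
        adj[e.w][e.v] = true;
    }
    class adjIterator {
        const TinyGraph &G;
        int v, i;
    public:
        adjIterator(const TinyGraph &G, int v) : G(G), v(v), i(-1) {}
        int beg() { i = -1; return nxt(); }
        int nxt()   // advance to the next vertex adjacent to v, in index order
        {
            for (i++; i < G.V(); i++)
                if (G.adj[v][i]) return i;
            return -1;
        }
        bool end() const { return i >= G.V(); }
    };
};

// Base class in the style of Program 18.2.
template <class Graph> class SEARCH {
protected:
    const Graph &G;
    int cnt;
    std::vector<int> ord;
    virtual void searchC(Edge) = 0;
    void search()
    {
        for (int v = 0; v < G.V(); v++)
            if (ord[v] == -1) searchC(Edge(v, v));   // dummy self-loop
    }
public:
    SEARCH(const Graph &G) : G(G), cnt(0), ord(G.V(), -1) {}
    virtual ~SEARCH() {}
    int operator[](int v) const { return ord[v]; }
};

// Derived spanning-forest class in the style of Program 18.3.
template <class Graph> class DFS : public SEARCH<Graph> {
    using SEARCH<Graph>::G;     // make dependent base members visible
    using SEARCH<Graph>::cnt;
    using SEARCH<Graph>::ord;
    std::vector<int> st;
    void searchC(Edge e) override
    {
        int w = e.w;
        ord[w] = cnt++; st[w] = e.v;
        typename Graph::adjIterator A(G, w);
        for (int t = A.beg(); !A.end(); t = A.nxt())
            if (ord[t] == -1) searchC(Edge(w, t));
    }
public:
    DFS(const Graph &G) : SEARCH<Graph>(G), st(G.V(), -1) { this->search(); }
    int ST(int v) const { return st[v]; }
};
```

A client would build a TinyGraph, insert edges, and construct a DFS<TinyGraph> object; for the five-vertex graph with edges 0-1 0-2 1-2 3-4, the preorder numbers are 0 1 2 3 4 and the parent links are 0 0 1 3 3, with each root recorded as its own parent because of the dummy self-loop.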

Property 18.2 A graph-search function checks each edge and marks each vertex in a graph if and only if the search function that it uses marks each vertex and checks each edge in the connected component that contains the start vertex.

Proof: By induction on the number of connected components. •

Graph-search functions provide a systematic way of processing each vertex and each edge in a graph. Generally, our implementations are designed to run in linear or near-linear time, by doing a fixed amount of processing per edge. We prove this fact now for DFS, noting that the same proof technique works for several other search strategies.

Property 18.3 DFS of a graph represented with an adjacency matrix requires time proportional to V^2.

Proof: An argument similar to the proof of Property 18.1 shows that searchC not only marks all vertices connected to the start vertex but also calls itself exactly once for each such vertex (to mark that vertex). An argument similar to the proof of Property 18.2 shows that a call to search leads to exactly one call to searchC for each graph vertex. In searchC, the iterator checks every entry in the vertex's row in the adjacency matrix. In other words, the search checks each entry in the adjacency matrix precisely once. •

Property 18.4 DFS of a graph represented with adjacency lists requires time proportional to V + E.

Proof: From the argument just outlined, it follows that we call the recursive function precisely V times (hence the V term), and we examine each entry on each adjacency list (hence the E term). •

The primary implication of Properties 18.3 and 18.4 is that they establish the running time of DFS to be linear in the size of the data structure used to represent the graph. In most situations, we are also justified in thinking of the running time of DFS as being linear in the size of the graph, as well: If we have a dense graph (with the number of edges proportional to V^2) then either representation gives this result; if we have a sparse graph, then we assume use of an adjacency-lists representation. Indeed, we normally think of the running time of DFS as being linear in E. That statement is technically not true if we are using adjacency matrices for sparse graphs or for extremely sparse graphs with E << V and most vertices isolated, but we can usually avoid the former situation, and we can remove isolated vertices (see Exercise 17.34) in the latter situation.

As we shall see, these arguments all apply to any algorithm that has a few of the same essential features of DFS. If the algorithm marks each vertex and examines all the latter's incident vertices (and does any other work that takes time per vertex bounded by a constant), then these properties apply. More generally, if the time per vertex is bounded by some function f(V, E), then the time for the search is guaranteed to be proportional to E + V f(V, E).
In Section 18.8, we see that DFS is one of a family of algorithms that has just these characteristics; in Chapters 19 through 22, we see that algorithms from this family serve as the basis for a substantial fraction of the code that we consider in this book.

Much of the graph-processing code that we examine is ADT-implementation code for some particular task, where we develop a class that does a basic search to compute structural information in other vertex-indexed vectors. We can derive the class from Program 18.2 or, in simple cases, just reimplement the search. Many of our graph-processing classes are of this nature because we typically can uncover a graph's structure by searching it. We normally add code to the search function that is executed when each vertex is marked, instead of working with a more generic search (for example, one that calls a specified function each time a vertex is visited), solely to keep the code compact and self-contained. Providing a more general ADT mechanism for clients to process all the vertices with a client-supplied function is a worthwhile exercise (see Exercises 18.13 and 18.14).

In Sections 18.5 and 18.6, we examine numerous graph-processing functions that are based on DFS. In Sections 18.7 and 18.8, we look at other implementations of search and at some graph-processing functions that are based on them. Although we do not build this layer of abstraction into our code, we take care to identify the basic graph-search strategy underlying each algorithm that we develop. For example, we use the term DFS class to refer to any implementation that is based on the recursive DFS scheme. The simple-path–search class Program 17.11 and the spanning-forest class Program 18.3 are examples of DFS classes.

Many graph-processing functions are based on the use of vertex-indexed vectors. We typically include such vectors as private data members in class implementations, to hold information about the structure of graphs (which is discovered during the search) that helps us solve the problem at hand. Examples of such vectors are the deg vector in Program 17.11 and the ord vector in Program 18.1. Some implementations that we will examine use multiple vectors to learn complicated structural properties.

Our convention in graph-search functions is to initialize all entries in vertex-indexed vectors to -1, and to set the entries corresponding to each vertex visited to nonnegative values in the search function. Any such vector can play the role of the ord vector (marking vertices as visited) in Programs 18.2 and 18.3. When a graph-search function is based on using or computing a vertex-indexed vector, we often just implement the search and use that vector to mark vertices, rather than deriving the class from SEARCH or maintaining the ord vector.

The specific outcome of a graph search depends not just on the nature of the search function but also on the graph representation and even the order in which search examines the vertices.
For specificity in the examples and exercises in this book, we use the term standard adjacency-lists DFS to refer to the process of inserting a sequence of edges into a graph ADT implemented with an adjacency-lists representation (Program 17.9), then doing a DFS with, for example, Program 18.3. For the adjacency-matrix representation, the order of edge insertion does not affect search dynamics, but we use the parallel term standard adjacency-matrix DFS to refer to the process of inserting a sequence of edges into a graph ADT implemented with an adjacency-matrix representation (Program 17.7), then doing a DFS with, for example, Program 18.3.
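The cost claims of Properties 18.3 and 18.4 can be observed directly by counting. The following sketch is illustrative only and does not use the book's classes: MatrixDFS and ListsDFS are hypothetical instrumented searches, one per representation, that count recursive calls and the number of matrix or list entries examined. It assumes a graph without self-loops or parallel edges.

```cpp
#include <cassert>
#include <utility>
#include <vector>

using EdgeList = std::vector<std::pair<int, int>>;

// Hypothetical instrumented DFS on an adjacency matrix.
struct MatrixDFS {
    std::vector<std::vector<bool>> adj;
    std::vector<int> ord;
    int cnt = 0;
    long calls = 0;    // recursive calls made
    long checks = 0;   // matrix entries examined

    MatrixDFS(int V, const EdgeList& edges)
        : adj(V, std::vector<bool>(V, false)), ord(V, -1)
    {
        for (const auto& e : edges) {
            assert(e.first >= 0 && e.first < V && e.second >= 0 && e.second < V);
            adj[e.first][e.second] = true;
            adj[e.second][e.first] = true;
        }
        for (int v = 0; v < V; v++)
            if (ord[v] == -1) searchC(v);
    }
    void searchC(int v)
    {
        calls++;
        ord[v] = cnt++;
        for (int t = 0; t < (int)adj.size(); t++) {
            checks++;                                  // one entry of row v
            if (adj[v][t] && ord[t] == -1) searchC(t);
        }
    }
};

// Hypothetical instrumented DFS on adjacency lists.
struct ListsDFS {
    std::vector<std::vector<int>> adj;
    std::vector<int> ord;
    int cnt = 0;
    long calls = 0;    // recursive calls made
    long checks = 0;   // list entries examined

    ListsDFS(int V, const EdgeList& edges) : adj(V), ord(V, -1)
    {
        for (const auto& e : edges) {
            assert(e.first >= 0 && e.first < V && e.second >= 0 && e.second < V);
            adj[e.first].push_back(e.second);
            adj[e.second].push_back(e.first);
        }
        for (int v = 0; v < V; v++)
            if (ord[v] == -1) searchC(v);
    }
    void searchC(int v)
    {
        calls++;
        ord[v] = cnt++;
        for (int t : adj[v]) {
            checks++;                                  // one list entry
            if (ord[t] == -1) searchC(t);
        }
    }
};
```

For a graph with V = 6 and the three edges 0-1 1-2 3-4, both searches make 6 recursive calls; the matrix version examines 36 entries (V^2), whereas the lists version examines 6 (two per edge), which matches the V + E behavior described in Property 18.4.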

Exercises

18.8 Show, in the style of Figure 18.5, a trace of the recursive function calls made for a standard adjacency-matrix DFS of the graph

3-7 1-4 7-8 0-5 5-2 3-8 2-9 0-6 4-9 2-6 6-4.

18.9 Show, in the style of Figure 18.7, a trace of the recursive function calls made for a standard adjacency-lists DFS of the graph

3-7 1-4 7-8 0-5 5-2 3-8 2-9 0-6 4-9 2-6 6-4.

18.10 Modify the adjacency-matrix graph ADT implementation in Program 17.7 to use a dummy vertex that is connected to all the other vertices. Then, provide a simplified DFS implementation that takes advantage of this change.

18.11 Do Exercise 18.10 for the adjacency-lists ADT implementation in Program 17.9.

• 18.12 There are 13! different permutations of the vertices in the graph depicted in Figure 18.8. How many of these permutations could specify the order in which vertices are visited by Program 18.2?

18.13 Implement a graph ADT client function that calls a client-supplied function for each vertex in the graph.

18.14 Implement a graph ADT client that calls a client-supplied function for each edge in the graph. Such a function might be a reasonable alternative to GRAPHedges (see Program 17.2).

Figure 18.9 DFS tree representations

If we augment the DFS recursive-call tree to represent edges that are checked but not followed, we get a complete description of the DFS process (left). Each tree node has a child representing each of the nodes adjacent to it, in the order they were considered by the DFS, and a preorder traversal gives the same information as Figure 18.5: first we follow 0-0, then 0-2, then we skip 2-0, then we follow 2-6, then we skip 6-2, then we follow 6-4, then 4-3, and so forth. The ord vector specifies the order in which we visit tree vertices during this preorder walk, which is the same as the order in which we visit graph vertices in the DFS. The st vector is a parent-link representation of the DFS recursive-call tree (see Figure 18.6).

There are two links in the tree for every edge in the graph, one for each of the two times it encounters the edge. The first is to an unshaded node and either corresponds to making a recursive call (if it is to an internal node) or to skipping a recursive call because it goes to an ancestor for which a recursive call is in progress (if it is to an external node). The second is to a shaded external node and always corresponds to skipping a recursive call, either because it goes back to the parent (circles) or because it goes to a descendant of the parent for which a recursive call is in progress (squares). If we eliminate shaded nodes (center), then replace the external nodes with edges, we get another drawing of the graph (right).

18.4 Properties of DFS Forests

As noted in Section 18.2, the trees that describe the recursive structure of DFS function calls give us the key to understanding how DFS operates. In this section, we examine properties of the algorithm by examining properties of DFS trees.

If we add external nodes to the DFS tree to record the moments when we skipped recursive calls for vertices that had already been visited, we get the compact representation of the dynamics of DFS illustrated in Figure 18.9. This representation is worthy of careful study. The tree is a representation of the graph, with a vertex corresponding to each graph vertex and an edge corresponding to each graph edge. We can choose to show the two representations of the edge that we process (one in each direction), as shown in the left part of the figure, or just one representation of each edge, as shown in the center and right parts of the figure. The former is useful in understanding that the algorithm processes each and every edge; the latter is useful in understanding that the DFS tree is simply another graph representation. Traversing the internal nodes of the tree in preorder gives the vertices in the order in which DFS visits them; moreover, the order in which we visit the edges of the tree as we traverse it in preorder is the same as the order in which DFS examines the edges of the graph.

Indeed, the DFS tree in Figure 18.9 contains the same information as the trace in Figure 18.5 or the step-by-step illustration of Trémaux traversal in Figures 18.2 and 18.3. Edges to internal nodes represent edges (passages) to unvisited vertices (intersections), edges to external nodes represent occasions where DFS checks edges that lead to previously visited vertices (intersections), and shaded nodes represent edges to vertices for which a recursive DFS is in progress (when we open a door to a passage where the door at the other end is already open). With these interpretations, a preorder traversal of the tree tells the same story as that of the detailed maze-traversal scenario.

To study more intricate graph properties, we classify the edges in a graph according to the role that they play in the search. We have two distinct edge classes:

• Edges representing a recursive call (tree edges)
• Edges connecting a vertex with an ancestor in its DFS tree that is not its parent (back edges)

When we study DFS trees for digraphs in Chapter 19, we examine other types of edges, not just to take the direction into account, but also because we can have edges that go across the tree, connecting nodes that are neither ancestors nor descendants in the tree.

Since there are two representations of each graph edge that each correspond to a link in the DFS tree, we divide the tree links into four classes, using the preorder numbers and the parent links (in the ord and st arrays, respectively) that our DFS code computes. We refer to a link from v to w in a DFS tree that represents a tree edge as

• A tree link if w is unmarked
• A parent link if st[v] is w

and a link from v to w that represents a back edge as

• A back link if ord[w] < ord[v]
• A down link if ord[w] > ord[v]

Each tree edge in the graph corresponds to a tree link and a parent link in the DFS tree, and each back edge in the graph corresponds to a back link and a down link in the DFS tree.

In the graphical DFS representation illustrated in Figure 18.9, tree links point to unshaded circles, parent links point to shaded circles, back links point to unshaded squares, and down links point to shaded squares. Each graph edge is represented either as one tree link and one parent link or as one down link and one back link. These classifications are tricky and worthy of study. For example, note that even though parent links and back links both point to ancestors in the tree, they are quite different: A parent link is just the other representation of a tree link, but a back link gives us new information about the structure of the graph.
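The four-way classification can be expressed as a short chain of tests made at the moment each edge representation is examined. The sketch below is a hypothetical tracer, not the book's code: LinkTrace keeps its own adjacency lists, ord, and st vectors, and it assumes a simple graph (no self-loops or parallel edges). Since st holds each vertex's parent, as in Program 18.3, the parent test for a link from v to w is written here as st[v] == w.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical tracer: runs a DFS over adjacency lists and logs the class
// of the DFS tree link for every edge representation that it examines.
struct LinkTrace {
    std::vector<std::vector<int>> adj;
    std::vector<int> ord, st;
    std::vector<std::string> log;
    int cnt = 0;

    LinkTrace(int V, const std::vector<std::pair<int, int>>& edges)
        : adj(V), ord(V, -1), st(V, -1)
    {
        for (const auto& e : edges) {
            assert(e.first >= 0 && e.first < V && e.second >= 0 && e.second < V);
            adj[e.first].push_back(e.second);
            adj[e.second].push_back(e.first);
        }
        for (int v = 0; v < V; v++)
            if (ord[v] == -1) { st[v] = v; searchC(v); }  // root is its own parent
    }

    void searchC(int v)
    {
        ord[v] = cnt++;
        for (int w : adj[v]) {
            std::string kind;
            if (ord[w] == -1)          kind = "tree";    // w not yet marked
            else if (st[v] == w)       kind = "parent";  // w is v's parent
            else if (ord[w] < ord[v])  kind = "back";    // w is a proper ancestor
            else                       kind = "down";    // w is a descendant
            log.push_back(std::to_string(v) + "-" + std::to_string(w) + " " + kind);
            if (ord[w] == -1) { st[w] = v; searchC(w); }
        }
    }
};
```

For the triangle 0-1 1-2 2-0, the log reads 0-1 tree, 1-0 parent, 1-2 tree, 2-1 parent, 2-0 back, 0-2 down: two tree edges, each seen once as a tree link and once as a parent link, and one back edge, seen first as a back link and then as a down link.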

Figure 18.10 DFS trace (tree link classifications)

This version of Figure 18.5 shows the classification of the DFS tree link corresponding to each graph edge representation encountered. Tree edges (which correspond to recursive calls) are represented as tree links on the first encounter and parent links on the second encounter. Back edges are back links on the first encounter and down links on the second encounter.

The definitions just given provide sufficient information to distinguish among tree, parent, back, and down links in a DFS class implementation. Note that parent links and back links both have ord[w] < ord[v], so we have also to know that st[v] is not w to know that v-w is a back link. Figure 18.10 depicts the result of printing out the classification of the DFS tree link for each graph edge as that edge is encountered during a sample DFS. It is yet another complete representation of the basic search process that is an intermediate step between Figure 18.5 and Figure 18.9.

The four types of tree links correspond to the four different ways in which we treat edges during a DFS, as described (in maze-exploration terms) at the end of Section 18.1. A tree link corresponds to DFS encountering the first of the two representations of a tree edge, leading to a recursive call (to as-yet-unseen vertices); a parent link corresponds to DFS encountering the other representation of the tree edge (when going through the adjacency list on that first recursive call) and ignoring it. A back link corresponds to DFS encountering the first of the two representations of a back edge, which points to a vertex for which the recursive search function has not yet completed; a down link corresponds to DFS encountering a vertex for which the recursive search has completed at the time that the edge is encountered. In Figure 18.9, tree links and back links connect unshaded nodes, represent the first encounter with the corresponding edge, and constitute a representation of the graph; parent links and down links go to shaded nodes and represent the second encounter with the corresponding edge.

We have considered this tree representation of the dynamic characteristics of recursive DFS in detail not just because it provides a complete and compact description of both the graph and the operation of the algorithm, but also because it gives us a basis for understanding numerous important graph-processing algorithms. In the remainder of this chapter, and in the next several chapters, we consider a number of examples of graph-processing problems that draw conclusions about a graph's structure from the DFS tree.

Search in a graph is a generalization of tree traversal. When invoked on a tree, DFS is precisely equivalent to recursive tree traversal; for graphs, using it corresponds to traversing a tree that spans the graph

Figure 18.11 DFS forest

The DFS forest at the top represents a DFS of an adjacency-matrix representation of the graph at the bottom right. The graph has three connected components, so the forest has three trees. The ord vector is a preorder numbering of the nodes in the tree (the order in which they are examined by the DFS) and the st vector is a parent-link representation of the forest. The cc vector associates each vertex with a connected-component index (see Program 18.4). As in Figure 18.9, edges to circles are tree edges; edges that go to squares are back edges; and shaded nodes indicate that the incident edge was encountered earlier in the search, in the other direction.

and that is discovered as the search proceeds. As we have seen, the particular tree traversed depends on how the graph is represented. DFS corresponds to preorder tree traversal. In Section 18.6, we examine the graph-searching analog to level-order tree traversal and explore its relationship to DFS; in Section 18.7, we examine a general schema that encompasses any traversal method.

When traversing graphs with DFS, we have been using the ord vector to assign preorder numbers to the vertices in the order that we start processing them. We can also assign postorder numbers to vertices, in the order that we finish processing them (just before returning from the recursive search function). When processing a graph, we do more than simply traverse the vertices: as we shall see, the preorder and postorder numbering give us knowledge about global graph properties that helps us to accomplish the task at hand. Preorder numbering suffices for the algorithms that we consider in this chapter, but we use postorder numbering in later chapters.

We describe the dynamics of DFS for a general undirected graph with a DFS forest that has one DFS tree for each connected component. An example of a DFS forest is illustrated in Figure 18.11.

With an adjacency-lists representation, we visit the edges connected to each vertex in an order different from that for the adjacency-matrix representation, so we get a different DFS forest, as illustrated in Figure 18.12. DFS trees and forests are graph representations that describe not only the dynamics of DFS but also the internal representation

Figure 18.12 Another DFS forest

This forest describes depth-first search of the same graph as Figure 18.11, but using an adjacency-list representation, so the search order is different because it is determined by the order that nodes appear in adjacency lists. Indeed, the forest itself tells us that order: it is the order in which children are listed for each node in the tree. For instance, the nodes on 0's adjacency list were found in the order 5 2 1 6, the nodes on 4's list are in the order 6 5 3, and so forth. As before, all vertices and edges in the graph are examined during the search, in a manner that is precisely described by a preorder walk of the tree. The ord and st vectors depend upon the graph representation and the search dynamics and are different from Figure 18.11, but the vector cc depends on graph properties and is the same.

of the graph. For example, by reading the children of any node in Figure 18.12 from left to right, we see the order in which they appear on the adjacency list of the vertex corresponding to that node. We can have many different DFS forests for the same graph: each ordering of nodes on adjacency lists leads to a different forest.

Details of the structure of a particular forest inform our understanding of how DFS operates for particular graphs, but most of the important DFS properties that we consider depend on graph properties that are independent of the structure of the forest. For example, the forests in Figures 18.11 and 18.12 both have three trees (as would any other DFS forest for the same graph) because they are just different representations of the same graph, which has three connected components. Indeed, a direct consequence of the basic proof that DFS visits all the nodes and edges of a graph (see Properties 18.2 through 18.4) is that the number of connected components in the graph is equal to the number of trees in the DFS forest. This example illustrates the basis for our use of graph search throughout the book: A broad variety of graph-processing class implementations are based on learning graph properties by processing a particular graph representation (a forest corresponding to the search).

Potentially, we could analyze DFS tree structures with the goal of improving algorithm performance. For example, should we attempt to speed up an algorithm by rearranging the adjacency lists before starting the search? For many of the important classical DFS-based algorithms, the answer to this question is no, because they are optimal: their worst-case running time depends on neither the graph structure nor the order in which edges appear on the adjacency lists (they essentially process each edge exactly once). Still, DFS forests have a characteristic structure that is worth understanding because it distinguishes them from other fundamental search schema that we consider later in this chapter.

Figure 18.13 shows a DFS tree for a larger graph that illustrates the basic characteristics of DFS search dynamics. The tree is tall and thin, and it demonstrates several characteristics of the graph being searched and of the DFS process.

• There exists at least one long path that connects a substantial fraction of the nodes.
• During the search, most vertices have at least one adjacent vertex that we have not yet seen.
• We rarely make more than one recursive call from any node.
• The depth of the recursion is proportional to the number of vertices in the graph.

This behavior is typical for DFS, though these characteristics are not guaranteed for all graphs. Verifying facts of this kind for graph models of interest and various types of graphs that arise in practice requires detailed study. Still, this example gives an intuitive feel for DFS-based algorithms that is often borne out in practice. Figure 18.13 and similar figures for other graph-search algorithms (see Figures 18.24 and 18.29) help us understand differences in their behavior.
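To see both the typical behavior and the caveat that it is not guaranteed, we can measure the recursion depth on two extreme inputs. The following is a small sketch under stated assumptions: it uses a bare adjacency-lists structure rather than the graph ADT, and the names buildLists, depthR, and maxRecursionDepth are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <utility>
#include <vector>

// Hypothetical minimal adjacency-lists structure (not the book's graph ADT).
using AdjLists = std::vector<std::vector<int>>;

AdjLists buildLists(int V, const std::vector<std::pair<int, int>>& edges)
{
    AdjLists adj(V);
    for (const auto& e : edges) {
        assert(e.first >= 0 && e.first < V && e.second >= 0 && e.second < V);
        adj[e.first].push_back(e.second);
        adj[e.second].push_back(e.first);
    }
    return adj;
}

// Deepest recursion level reached at or below v, where v is at the given level.
int depthR(const AdjLists& adj, int v, int level, std::vector<bool>& seen)
{
    seen[v] = true;
    int deepest = level;
    for (int w : adj[v])
        if (!seen[w])
            deepest = std::max(deepest, depthR(adj, w, level + 1, seen));
    return deepest;
}

// Maximum recursion depth over a DFS of the whole graph,
// scanning for start vertices in increasing index order.
int maxRecursionDepth(const AdjLists& adj)
{
    std::vector<bool> seen(adj.size(), false);
    int deepest = 0;
    for (int v = 0; v < (int)adj.size(); v++)
        if (!seen[v])
            deepest = std::max(deepest, depthR(adj, v, 1, seen));
    return deepest;
}
```

On the five-vertex path 0-1 1-2 2-3 3-4 the depth is 5, proportional to V as in the list above; on the five-vertex star 0-1 0-2 0-3 0-4, searched from its center, the depth is only 2.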

Exercises

18.15 Draw the DFS forest that results from a standard adjacency-matrix DFS of the graph

3-7 1-4 7-8 0-5 5-2 3-8 2-9 0-6 4-9 2-6 6-4.

18.16 Draw the DFS forest that results from a standard adjacency-lists DFS of the graph

3-7 1-4 7-8 0-5 5-2 3-8 2-9 0-6 4-9 2-6 6-4.

18.17 Write a DFS trace program to produce output that classifies each of the two representations of each graph edge as corresponding to a tree, parent, back, or down link in the DFS tree, in the style of Figure 18.10.

• 18.18 Write a program that computes a parent-link representation of the full DFS tree (including the external nodes), using a vector of E integers between 0 and V−1. Hint: The first V entries in the vector should be the same as those in the st vector described in the text.

Figure 18.13 Depth-first search

This figure illustrates the progress of DFS in a random Euclidean near-neighbor graph (left). The figures show the DFS tree vertices and edges in the graph as the search progresses through 1/4, 1/2, 3/4, and all of the vertices (top to bottom). The DFS tree (tree edges only) is shown at the right. As is evident from this example, the search tree for DFS tends to be quite tall and thin for this type of graph (as it is for many other types of graphs commonly encountered in practice). We normally find a vertex nearby that we have not seen before.

• 18.19 Instrument the spanning-forest DFS class (Program 18.3) by adding member functions (and appropriate private data members) that return the height of the tallest tree in the forest, the number of back edges, and the percentage of edges processed to see every vertex.

• 18.20 Run experiments to determine empirically the average values of the quantities described in Exercise 18.19 for graphs of various sizes, drawn from various graph models (see Exercises 17.64–76).

• 18.21 Write a function that builds a graph by inserting edges from a given vector into an initially empty graph, in random order. Using this function with an adjacency-lists implementation of the graph ADT, run experiments to determine empirically properties of the distribution of the quantities described in Exercise 18.19 for all the adjacency-lists representations of large sample graphs of various sizes, drawn from various graph models (see Exercises 17.64–76).

18.5 DFS Algorithms

Regardless of the graph structure or the representation, any DFS forest allows us to identify edges as tree or back edges and gives us dependable insights into graph structure that allow us to use DFS as a basis for solving numerous graph-processing problems. We have already seen, in Section 17.7, basic examples related to finding paths. In this section, we consider DFS-based ADT function implementations for these and other typical problems; in the remainder of this chapter and in the next several chapters, we look at numerous solutions to much more difficult problems.

Cycle detection Does a given graph have any cycles? (Is the graph a forest?) This problem is easy to solve with DFS because any back edge in a DFS tree belongs to a cycle consisting of the edge plus the tree path connecting the two nodes (see Figure 18.9). Thus, we can use DFS immediately to check for cycles: A graph is acyclic if and only if we encounter no back (or down!) edges during a DFS. For example, to test this condition in Program 18.1, we simply add an else clause to the if statement to test whether t is equal to v. If it is, we have just encountered the parent link w-v (the second representation of the edge v-w that led us to w). If it is not, w-t completes a cycle with the edges from t down to w in the DFS tree. Moreover, we do not need to examine all the edges: We know that we must find a cycle or finish the search without finding one before examining V edges, because any graph with V or more edges must have a cycle. Thus, we can test whether a graph is acyclic in time proportional to V with the adjacency-lists representation, although we may need time proportional to V^2 (to find the edges) with the adjacency-matrix representation.

Simple path Given two vertices, is there a path in the graph that connects them? We saw in Section 17.7 that a DFS class that can solve this problem in linear time is easy to devise.

Simple connectivity As discussed in Section 18.3, we determine whether or not a graph is connected whenever we use DFS, in linear time. Indeed, our graph-search strategy is based upon calling a search function for each connected component. In a DFS, the graph is connected if and only if the graph-search function calls the recursive DFS function just once (see Program 18.2). The number of connected components in the graph is precisely the number of times that the recursive function is called from the search function, so we can find the number of connected components by simply keeping track of the number of such calls.

More generally, Program 18.4 illustrates a DFS class that supports constant-time connectivity queries after a linear-time preprocessing step in the constructor. It visits vertices in the same order as does Program 18.3. The recursive function uses a vertex as its argument instead of an edge, since it does not need to know the identity of the parent. Each tree in the DFS forest identifies a connected component, so we arrange to decide quickly whether two vertices are in the same component by including a vertex-indexed vector in the graph representation, to be filled in by a DFS and accessed for connectivity queries. In the recursive DFS function, we assign the current value of the component counter to the entry corresponding to each vertex visited. Then, we know that two vertices are in the same component if and only if their entries in this vector are equal. Again, note that this vector reflects structural properties of the graph, rather than artifacts of the graph representation or of the search dynamics.

Program 18.4 typifies the basic approach that we shall use in addressing numerous graph-processing tasks. We develop a task-specific class so that clients can create objects to perform the task. Typically, we invest preprocessing time in the constructor to compute private data about relevant structural graph properties that help us to provide efficient implementations of public query functions. In this case, the constructor

Program 18.4 Graph connectivity

The CC constructor computes, in linear time, the number of connected components in a graph and stores a component index associated with each vertex in the private vertex-indexed vector id. Clients can use a CC object to find the number of connected components (count) or test whether any pair of vertices are connected (connect), in constant time.

template <class Graph> class CC
{ const Graph &G;
  int ccnt;
  vector<int> id;
  void ccR(int w)
    {
      id[w] = ccnt;
      typename Graph::adjIterator A(G, w);
      for (int v = A.beg(); !A.end(); v = A.nxt())
        if (id[v] == -1) ccR(v);
    }
public:
  CC(const Graph &G) : G(G), ccnt(0), id(G.V(), -1)
    {
      for (int v = 0; v < G.V(); v++)
        if (id[v] == -1) { ccR(v); ccnt++; }
    }
  int count() const { return ccnt; }
  bool connect(int s, int t) const
    { return id[s] == id[t]; }
};

preprocesses with a (linear-time) DFS and keeps a private data member (the vertex-indexed vector id) that allows it to answer connectivity queries in constant time. For other graph-processing problems, our constructors might use more space, preprocessing time, or query time. As usual, our focus is on minimizing such costs, although doing so is often challenging. For example, much of Chapter 19 is devoted to solving the connectivity problem for digraphs, where achieving linear-time preprocessing and constant query time, as in Program 18.4, is an elusive goal.
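To experiment with the component-index idea outside the book's Graph ADT, a minimal standalone sketch may help. It assumes a hypothetical SimpleGraph struct built on plain adjacency lists; the names SimpleGraph, Components, and insert are illustrative assumptions, not part of the book's code, and the logic simply mirrors the description above.

```cpp
#include <cassert>
#include <vector>

// Hypothetical minimal adjacency-lists graph (an assumption for this sketch;
// it stands in for the book's Graph ADT and its adjIterator).
struct SimpleGraph {
    std::vector<std::vector<int>> adj;
    explicit SimpleGraph(int V) : adj(V) {}
    int V() const { return static_cast<int>(adj.size()); }
    void insert(int v, int w) {
        assert(v >= 0 && v < V() && w >= 0 && w < V());
        adj[v].push_back(w);   // undirected: store the edge in both lists
        adj[w].push_back(v);
    }
};

// Labels every vertex with the index of its connected component.
// The referenced graph must outlive the Components object.
class Components {
    const SimpleGraph& G;
    int ccnt = 0;              // number of components found so far
    std::vector<int> id;       // id[v] == -1 means "not yet visited"

    void ccR(int w) {
        id[w] = ccnt;
        for (int v : G.adj[w])
            if (id[v] == -1) ccR(v);
    }

public:
    explicit Components(const SimpleGraph& g) : G(g), id(g.V(), -1) {
        // Linear-time preprocessing: one recursive DFS per component.
        for (int v = 0; v < G.V(); ++v)
            if (id[v] == -1) { ccR(v); ++ccnt; }
    }
    int count() const { return ccnt; }
    // Constant-time query: same component if and only if same index.
    bool connect(int s, int t) const { return id[s] == id[t]; }
};
```

A client would build a SimpleGraph, construct a Components object from it once, and then call connect as often as needed; each query costs a single comparison.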

Program 18.5 Two-way Euler tour

This DFS class prints each edge twice, once in each orientation, in a two-way–Euler-tour order. We go back and forth on back edges and ignore down edges (see text). It is derived from the SEARCH base class in Program 18.2.


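template <class Graph>
class EULER : public SEARCH<Graph>
{
  void searchC(Edge e)
  { int v = e.v, w = e.w;
    ord[w] = cnt++;
    cout << "-" << w;
    typename Graph::adjIterator A(G, w);
    for (int t = A.beg(); !A.end(); t = A.nxt())
      if (ord[t] == -1) searchC(Edge(w, t));
      else if (ord[t] < ord[v])       // back edge: go there and back
        cout << "-" << t << "-" << w;
    if (v != w) cout << "-" << v; else cout << endl;
  }
public:
  EULER(const Graph &G) : SEARCH<Graph>(G)
  { search(); }
};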

How does the DFS-based solution for graph connectivity in Program 18.4 compare with the union-find approach that we considered in Chapter 1 for the problem of determining whether a graph is connected, given an edge list? In theory, DFS is faster than union-find because it provides a constant-time guarantee, which union-find does not; in practice, this difference is negligible, and union-find is faster because it does not have to build a full representation of the graph. More important, union-find is an online algorithm (we can check whether two vertices are connected in near-constant time at any point), whereas the DFS solution preprocesses the graph to answer connectivity queries in constant time. Therefore, for example, we prefer union-find when determining connectivity is our only task or when we have a large number

Figure 18.14 A two-way Euler tour

Depth-first search gives us a way to explore any maze, traversing both passages in each direction. We modify Trémaux exploration to take the string with us wherever we go and take a back-and-forth trip on passages without any string in them that go to intersections that we have already visited. This figure shows a different traversal order than shown in Figures 18.2 and 18.3, primarily so that we can draw the tour without crossing itself. This ordering might result, for example, if the edges were processed in some different order when building an adjacency-lists representation of the graph; or, we might explicitly modify DFS to take the geometric placement of the nodes into account (see Exercise 18.26). Moving along the lower track leading out of 0, we move from 0 to 2 to 6 to 4 to 7, then take a trip from 7 to 0 and back because ord[0] is less than ord[7]. Then we go to 1, back to 7, back to 4, to 3, to 5, from 5 to 0 and back, from 5 to 4 and back, back to 3, back to 4, back to 6, back to 2, and back to 0. This path may be obtained by a recursive pre- and postorder walk of the DFS tree (ignoring the shaded vertices that represent the second time we encounter the edges) where we print out the vertex name, recursively visit the subtrees, then print out the vertex name again.

of queries intermixed with edge insertions but may find the DFS solution more appropriate for use in a graph ADT because it makes efficient use of existing infrastructure. Neither approach handles efficiently huge numbers of intermixed edge insertions, edge deletions, and connectivity queries; both require a separate DFS to compute the path. These considerations illustrate the complications that we face when analyzing graph algorithms; we explore them in detail in Section 18.9.

Two-way Euler tour Program 18.5 is a DFS-based class for finding a path that uses all the edges in a graph exactly twice, once in each direction (see Section 17.7). The path corresponds to a Trémaux exploration in which we take our string with us everywhere that we go, check for the string instead of using lights (so we have to go down the passages that lead to intersections that we have already visited), and first arrange to go back and forth on each back link (the first time that we encounter each back edge), then ignore down links (the second time that we encounter each back edge). We might also choose to ignore the back links (first encounter) and to go back and forth on down links (second encounter) (see Exercise 18.25).

Spanning forest Given a connected graph with V vertices, find a set of V−1 edges that connects the vertices. If the graph has C connected components, find a spanning forest (with V−C edges). We have already seen a DFS class that solves this problem: Program 18.3.

Vertex search How many vertices are in the same connected component as a given vertex? We can solve this problem easily by starting a DFS at the vertex and counting the number of vertices marked. In a dense graph, we can speed up the process considerably by stopping the DFS after we have marked V vertices: at that point, we know that

Figure 18.15 Two-coloring a DFS tree

To two-color a graph, we alternate colors as we move down the DFS tree, then check the back edges for inconsistencies. In the tree at the top, a DFS tree for the sample graph illustrated in Figure 18.9, the back edges 5-4 and 7-0 prove that the graph is not two-colorable because of the odd-length cycles 4-3-5-4 and 0-2-6-4-7-0, respectively. In the tree at the bottom, a DFS tree for the bipartite graph illustrated in Figure 17.5, there are no such inconsistencies, and the indicated shading is a two-coloring of the graph.

Program 18.6 Two-colorability (bipartiteness)

The constructor in this DFS class sets OK to true if and only if it is able to assign the values 0 or 1 to the vertex-indexed vector vc such that, for each graph edge v-w, vc[v] and vc[w] are different.

no edge will take us to a vertex that we have not yet seen, so we will be ignoring all the rest of the edges. This improvement is likely to allow us to visit all vertices in time proportional to V log V, not E (see Section 18.8).

Two-colorability, bipartiteness, odd cycle Is there a way to assign one of two colors to each vertex of a graph such that no edge connects two vertices of the same color? Is a given graph bipartite (see Section 17.1)? Does a given graph have a cycle of odd length? These three problems are all equivalent: The first two are different nomenclature for the same problem; any graph with an odd cycle is clearly not two-colorable, and Program 18.6 demonstrates that any graph that is free of odd cycles can be two-colored. The program is a DFS-based ADT function implementation that tests whether a graph is bipartite, two-colorable, and free of odd cycles. The recursive function is an outline for a proof by induction that the program two-colors any graph with no odd cycles (or finds an odd cycle as evidence that a graph that is not free of odd cycles cannot be two-colored). To two-color a graph with a given color assigned to a vertex v, two-color the remaining graph, assigning the other color to each vertex adjacent to v. This process is equivalent to assigning alternate colors on levels as we proceed down the DFS tree, checking back edges for consistency in the coloring, as illustrated in Figure 18.15. Any back edge connecting two vertices of the same color is evidence of an odd cycle.

These basic examples illustrate ways in which DFS can give us insight into the structure of a graph. They also demonstrate that we can learn various important graph properties in a single linear-time sweep through the graph, where we examine every edge twice, once in each direction. Next, we consider an example that shows the utility of DFS in discovering more intricate details about the graph structure, still in linear time.
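The two-coloring process described above can be sketched in a few lines. This is a standalone illustration under stated assumptions, not the book's Program 18.6: the TwoColoring struct and its members addEdge, dfsR, and run are hypothetical names, and the graph is kept as plain adjacency lists rather than in the Graph ADT. Only the vector name vc follows the program caption.

```cpp
#include <cassert>
#include <vector>

// Hypothetical standalone sketch of DFS two-coloring (bipartiteness test).
struct TwoColoring {
    std::vector<std::vector<int>> adj;  // adjacency lists
    std::vector<int> vc;                // vc[v] is 0 or 1 once colored, -1 before

    explicit TwoColoring(int V) : adj(V), vc(V, -1) {}
    int V() const { return static_cast<int>(adj.size()); }

    void addEdge(int v, int w) {
        assert(v >= 0 && v < V() && w >= 0 && w < V());
        adj[v].push_back(w);
        adj[w].push_back(v);
    }

    // Color v with c, then give the other color to every vertex adjacent to v.
    // Returns false as soon as an edge joins two vertices of the same color,
    // which is evidence of an odd cycle.
    bool dfsR(int v, int c) {
        vc[v] = c;
        for (int t : adj[v]) {
            if (vc[t] == -1) {
                if (!dfsR(t, 1 - c)) return false;
            } else if (vc[t] == c) {
                return false;
            }
        }
        return true;
    }

    // True if and only if every connected component can be two-colored.
    bool run() {
        vc.assign(adj.size(), -1);
        for (int v = 0; v < V(); ++v)
            if (vc[v] == -1 && !dfsR(v, 0)) return false;
        return true;
    }
};
```

On a four-vertex cycle run succeeds and leaves alternating values in vc; on a triangle it fails, because the edge that closes the odd cycle joins two vertices of the same color.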

Exercises

• 18.22 Implement a DFS-based cycle-testing class that preprocesses a graph in time proportional to V in the constructor to support public member functions for detecting whether a graph has any cycles and for printing a cycle if one exists.

18.23 Describe a family of graphs with V vertices for which a standard adjacency-matrix DFS requires time proportional to V² for cycle detection.

• 18.24 Implement the graph-connectivity class of Program 18.4 as a derived graph-search class, like Program 18.3.

• 18.25 Specify a modification to Program 18.5 that will produce a two-way Euler tour that does the back-and-forth traversal on down edges instead of back edges.

• 18.26 Modify Program 18.5 such that it always produces a two-way Euler tour that, like the one in Figure 18.14, can be drawn such that it does not cross itself at any vertex. For example, if the search in Figure 18.14 were to take the edge 4-3 before the edge 4-7, then the tour would have to cross itself; your task is to ensure that the algorithm avoids such situations.

18.27 Develop a version of Program 18.5 that sorts the edges of a graph in order of a two-way Euler tour. Your program should return a vector of edges that corresponds to a two-way Euler tour.

18.28 Prove that a graph is two-colorable if and only if it contains no odd cycle. Hint: Prove by induction that Program 18.6 determines whether or not any given graph is two-colorable.

• 18.29 Explain why the approach taken in Program 18.6 does not generalize to give an efficient method for determining whether a graph is three-colorable.

18.30 Most graphs are not two-colorable, and DFS tends to discover that fact quickly. Run empirical tests to study the number of edges examined by Program 18.6, for graphs of various sizes, drawn from various graph models (see Exercises 17.64–76).

• 18.31 Prove that every connected graph has a vertex whose removal will not disconnect the graph, and write a DFS function that finds such a vertex. Hint: Consider the leaves of the DFS tree.

18.32 Prove that every graph with more than one vertex has at least two vertices whose removal will not increase the number of connected components.

18.6 Separability and Biconnectivity

To illustrate the power of DFS as the basis for graph-processing algorithms, we turn to problems related to generalized notions of connectivity in graphs. We study questions such as the following: Given two vertices, are there two different paths connecting them?

If it is important that a graph be connected in some situation, it might also be important that it stay connected when an edge or a vertex is removed. That is, we may want to have more than one route between each pair of vertices in a graph, so as to handle possible failures. For example, we can fly from New York to San Francisco even if Chicago is snowed in by going through Denver instead. Or, we might imagine a wartime situation where we want to arrange our railroad network such that an enemy must destroy at least two stations to cut our rail lines. Similarly, we might expect the main communications lines in an integrated circuit or a communications network to be connected such that the rest of the circuit still can function if one wire is broken or one link is down.

These examples illustrate two distinct concepts: In the circuit and in the communications network, we are interested in staying connected if an edge is removed; in the air or train routes, we are interested in staying connected if a vertex is removed. We begin by examining the former in detail.

Figure 18.16 An edge-separable graph

This graph is not edge connected. The edges 0-5, 6-7, and 11-12 (shaded) are separating edges (bridges). The graph has 4 edge-connected components: one comprising vertices 0, 1, 2, and 6; another comprising vertices 3, 4, 5, 9, and 11; another comprising vertices 7, 8, and 10; and the single vertex 12.

Definition 18.1 A bridge in a graph is an edge that, if removed, would separate a connected graph into two disjoint subgraphs. A graph that has no bridges is said to be edge-connected.

When we speak of removing an edge, we mean to delete that edge from the set of edges that define the graph, even when that action might leave one or both of the edge’s vertices isolated. An edge-connected graph remains connected when we remove any single edge. In some contexts, it is more natural to emphasize our ability to disconnect the graph rather than the graph’s ability to stay connected, so we freely use alternate terminology that provides this emphasis: We refer to a graph that is not edge-connected as an edge-separable graph, and we call bridges separation edges. If we remove all the bridges in an edge-separable graph, we divide it into edge-connected components or bridge-connected components: maximal subgraphs with no bridges. Figure 18.16 is a small example that illustrates these concepts.

Finding the bridges in a graph seems, at first blush, to be a nontrivial graph-processing problem, but it actually is an application of DFS where we can exploit basic properties of the DFS tree that we have already noted. Specifically, back edges cannot be bridges because we know that the two nodes they connect are also connected by a path in the DFS tree. Moreover, we can add a simple test to our recursive DFS function to test whether or not tree edges are bridges. The basic idea, stated formally next, is illustrated in Figure 18.17.

Property 18.5 In any DFS tree, a tree edge v-w is a bridge if and only if there are no back edges that connect a descendant of w to an ancestor of w.

Proof: If there is such an edge, v-w cannot be a bridge. Conversely, if v-w is not a bridge, then there has to be some path from w to v in the graph other than w-v itself. Every such path has to have some such edge. •

Asserting this property is equivalent to saying that the only link in the subtree rooted at w that points to a node not in the subtree is the parent link from w back to v. This condition holds if and only if every path connecting any of the nodes in w’s subtree to any node that

Figure 18.17 DFS tree for finding bridges

Nodes 5, 7, and 12 in this DFS tree for the graph in Figure 18.16 all have the property that no back edge connects a descendant with an ancestor, and no other nodes have that property. Therefore, as indicated, breaking the edge between one of these nodes and its parent would disconnect the subtree rooted at that node from the rest of the graph. That is, the edges 0-5, 11-12, and 6-7 are bridges. We use the vertex-indexed array low to keep track of the lowest preorder number (ord value) referenced by any back edge in the subtree rooted at the vertex. For example, the value of low[9] is 2 because one of the back edges in the subtree rooted at 9 points to 4 (the vertex with preorder number 2), and no other back edge points higher in the tree. Nodes 5, 7, and 12 are the ones for which the low value is equal to the ord value.

is not in w’s subtree includes v-w. In other words, removing v-w would disconnect from the rest of the graph the subgraph corresponding to w’s subtree.

Program 18.7 shows how we can augment DFS to identify bridges in a graph, using Property 18.5. For every vertex v, we use the recursive function to compute the lowest preorder number that can be reached by a sequence of zero or more tree edges followed by a single back edge from any node in the subtree rooted at v. If the computed number is equal to v’s preorder number, then there is no edge connecting a descendant with an ancestor, and we have identified a bridge. The computation for each vertex is straightforward: We proceed through the adjacency list, keeping track of the minimum of the numbers that we can reach by following each edge. For tree edges, we do the computation recursively; for back edges, we use the preorder number of the adjacent vertex. If the call to the recursive function for an edge w-t does not uncover a path to a node with a preorder number less than t’s preorder number, then w-t is a bridge.

Property 18.6 We can find a graph’s bridges in linear time.

Proof: Program 18.7 is a minor modification to DFS that involves adding a few constant-time tests, so it follows directly from Properties 18.3 and 18.4 that finding the bridges in a graph requires time proportional to V² for the adjacency-matrix representation and to V+E for the adjacency-lists representation. •
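The low-value computation just described can be sketched as follows. This is a minimal standalone illustration, not the book's Program 18.7: the BridgeFinder struct and its members addEdge, dfsR, run, and bridges are hypothetical names, the graph is held as plain adjacency lists, and the sketch assumes a simple graph (no parallel edges), since it recognizes the parent link only by the parent's vertex index.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Hypothetical standalone sketch of DFS bridge finding with ord/low values.
struct BridgeFinder {
    std::vector<std::vector<int>> adj;
    std::vector<int> ord;   // preorder number, -1 if unvisited
    std::vector<int> low;   // lowest preorder number reachable from the subtree
    std::vector<std::pair<int, int>> bridges;  // tree edges found to be bridges
    int cnt = 0;

    explicit BridgeFinder(int V) : adj(V), ord(V, -1), low(V, -1) {}
    int V() const { return static_cast<int>(adj.size()); }

    void addEdge(int v, int w) {
        assert(v >= 0 && v < V() && w >= 0 && w < V());
        adj[v].push_back(w);
        adj[w].push_back(v);
    }

    // Visit w, reached from parent (the root is its own parent).
    void dfsR(int parent, int w) {
        ord[w] = low[w] = cnt++;
        for (int t : adj[w]) {
            if (ord[t] == -1) {                       // tree edge w-t
                dfsR(w, t);
                if (low[t] < low[w]) low[w] = low[t];
                // Nothing in t's subtree reaches above t: w-t is a bridge.
                if (low[t] == ord[t]) bridges.push_back({w, t});
            } else if (t != parent) {                 // back (or down) edge
                if (ord[t] < low[w]) low[w] = ord[t];
            }
        }
    }

    void run() {
        for (int v = 0; v < V(); ++v)
            if (ord[v] == -1) dfsR(v, v);
    }
};
```

In a connected graph, the number of edge-connected components is one more than the number of bridges that run records.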

Program 18.7 Edge connectivity

This DFS class counts the bridges in a graph. Clients can use an EC object to find the number of edge-connected components; adding a member function for testing whether two vertices are in the same edge-connected component is left as an exercise (see Exercise 18.36). The low vector keeps track of the lowest preorder number that can be reached from each vertex by a sequence of tree edges followed by one back edge.

In Program 18.7, we use DFS to discover properties of the graph. The graph representation certainly affects the order of the search, but it does not affect the results because bridges are a characteristic of the graph rather than of the way that we choose to represent or search

Figure 18.18 Another DFS tree for finding bridges

This diagram shows a different DFS tree than the one in Figure 18.17 for the graph in Figure 18.16, where we start the search at a different node. Although we visit the nodes and edges in a completely different order, we still find the same bridges (of course). In this tree, 0, 7, and 11 are the ones for which the low value is equal to the ord value, so the edges connecting each of them to their parents (5-0, 6-7, and 12-11, respectively) are bridges.

the graph. As usual, any DFS tree is simply another representation of the graph, so all DFS trees have the same connectivity properties. The correctness of the algorithm depends on this fundamental fact. For example, Figure 18.18 illustrates a different search of the graph, starting from a different vertex, that (of course) finds the same bridges. Despite Property 18.6, when we examine different DFS trees for the same graph, we see that some search costs may depend not just on properties of the graph but also on properties of the DFS tree. For example, the amount of space needed for the stack to support the recursive calls is larger for the example in Figure 18.18 than for the example in Figure 18.17.

As we did for regular connectivity in Program 18.4, we may wish to use Program 18.7 to build a class for testing whether a graph is edge-connected or to count the number of edge-connected components. If desired, we can proceed as for Program 18.4 to give clients the ability to create (in linear time) objects that can respond in constant time to queries that ask whether two vertices are in the same edge-connected component (see Exercise 18.36).

We conclude this section by considering other generalizations of connectivity, including the problem of determining which vertices are critical to keeping a graph connected. By including this material here, we keep in one place the basic background material for the more complex algorithms that we consider in Chapter 22. If you are new to connectivity problems, you may wish to skip to Section 18.7 and return here when you study Chapter 22.

When we speak of removing a vertex, we also mean that we remove all its incident edges. As illustrated in Figure 18.19, removing either of the vertices on a bridge would disconnect a graph (unless the bridge were the only edge incident on one or both of the vertices), but there are also other vertices, not associated with bridges, that have the same property.

Definition 18.2 An articulation point in a graph is a vertex that, if removed, would separate a connected graph into at least two disjoint subgraphs.

We also refer to articulation points as separation vertices or cut vertices. We might use the term “vertex connected” to describe a graph that has no separation vertices, but we use different terminology based on a related characterization that turns out to be equivalent.

Definition 18.3 A graph is said to be biconnected if every pair of vertices is connected by two disjoint paths.

The requirement that the paths be disjoint is what distinguishes biconnectivity from edge connectivity. An alternate definition of edge connectivity is that every pair of vertices is connected by two edge-disjoint paths: these paths can have a vertex (but no edge) in common. Biconnectivity is a stronger condition: An edge-connected graph remains connected if we remove any edge, but a biconnected graph remains connected if we remove any vertex (and all that vertex’s incident edges). Every biconnected graph is edge-connected, but an edge-connected graph need not be biconnected. We also use the term separable to refer to graphs that are not biconnected, because they can be separated into two pieces by removal of just one vertex. The separation vertices are the key to biconnectivity.

Property 18.7 A graph is biconnected if and only if it has no separation vertices (articulation points).

Proof: Assume that a graph has a separation vertex. Let s and t be vertices that would be in two different pieces if the separation vertex were removed. All paths between s and t must contain the separation vertex; therefore, the graph is not biconnected. The proof in the other direction is more difficult and is a worthwhile exercise for the mathematically inclined reader (see Exercise 18.40). •

Figure 18.19 Graph separability terminology

This graph has two edge-connected components and one bridge. The edge-connected component above the bridge is also biconnected; the one below the bridge consists of two biconnected components that are joined at an articulation point.

Figure 18.20 Articulation points (separation vertices)

This graph is not biconnected. The vertices 0, 4, 5, 6, 7, and 11 (shaded) are articulation points. The graph has five biconnected components: one comprising edges 4-9, 9-11, and 4-11; another comprising edges 7-8, 8-10, and 7-10; another comprising edges 0-1, 1-2, 2-6, and 6-0; another comprising edges 3-5, 4-5, and 3-4; and the single vertex 12. Adding an edge connecting 12 to 7, 8, or 10 would biconnect the graph.

We have seen that we can partition the edges of a graph that is not connected into a set of connected subgraphs, and that we can partition the edges of a graph that is not edge-connected into a set of bridges and edge-connected subgraphs (which are connected by bridges). Similarly, we can divide any graph that is not biconnected into a set of bridges and biconnected components, which are each biconnected subgraphs. The biconnected components and bridges are not a proper partition of the graph because articulation points may appear on multiple biconnected components (see, for example, Figure 18.20). The biconnected components are connected at articulation points, perhaps by bridges.

A connected component of a graph has the property that there exists a path between any two vertices in the graph. Analogously, a biconnected component has the property that there exist two disjoint paths between any pair of vertices.

We can use the same DFS-based approach that we used in Program 18.7 to determine whether or not a graph is biconnected and to identify the articulation points. We omit the code because it is very similar to Program 18.7, with an extra test to check whether the root of the DFS tree is an articulation point (see Exercise 18.43). Developing code to print out the biconnected components is also a worthwhile exercise that is only slightly more difficult than the corresponding code for edge connectivity (see Exercise 18.44).

Property 18.8 We can find a graph’s articulation points and biconnected components in linear time.

Proof: As for Property 18.6, this fact follows from the observation that the solutions to Exercises 18.43 and 18.44 involve minor modifications to DFS that amount to adding a few constant-time tests per edge.

Biconnectivity generalizes simple connectivity. Further generalizations have been the subjects of extensive studies in classical graph theory and in the design of graph algorithms. These generalizations indicate the scope of graph-processing problems that we might face, many of which are easily posed but less easily solved.

Definition 18.4 A graph is k-connected if there are at least k vertex-disjoint paths connecting every pair of vertices in the graph. The vertex connectivity of a graph is the minimum number of vertices that need to be removed to separate it into two pieces.

In this terminology, “1-connected” is the same as “connected” and “2-connected” is the same as “biconnected.” A graph with an articulation point has vertex connectivity 1 (or 0), so Property 18.7 says that a graph is 2-connected if and only if its vertex connectivity is not less than 2. It is a special case of a classical result from graph theory, known as Whitney’s theorem, which says that a graph is k-connected if and only if its vertex connectivity is not less than k. Whitney’s theorem follows directly from Menger’s theorem (see Section 22.7), which says that the minimum number of vertices whose removal disconnects two vertices in a graph is equal to the maximum number of vertex-disjoint paths between the two vertices (to prove Whitney’s theorem, apply Menger’s theorem to every pair of vertices).

Definition 18.5 A graph is k–edge-connected if there are at least k edge-disjoint paths connecting every pair of vertices in the graph. The edge connectivity of a graph is the minimum number of edges that need to be removed to separate it into two pieces.

In this terminology, “2–edge-connected” is the same as “edge-connected” (that is, an edge-connected graph has edge connectivity greater than 1, and a graph with at least one bridge has edge connectivity 1). Another version of Menger’s theorem says that the minimum number of edges whose removal disconnects two vertices in a graph is equal to the maximum number of edge-disjoint paths between the two vertices, which implies that a graph is k–edge-connected if and only if its edge connectivity is not less than k.

With these definitions, we are led to generalize the connectivity problems that we considered at the beginning of this section.

st-connectivity What is the minimum number of edges whose removal will separate two given vertices s and t in a given graph? What is the minimum number of vertices whose removal will separate two given vertices s and t in a given graph?

General connectivity Is a given graph k-connected? Is a given graph k–edge-connected? What is the edge connectivity and the vertex connectivity of a given graph?

Although these problems are much more difficult to solve than are the simple connectivity problems that we have considered in this section, they are members of a large class of graph-processing problems that we can solve using the general algorithmic tools that we consider in Chapter 22 (with DFS playing an important role); we consider specific solutions in Section 22.7.
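The text of this section leaves the articulation-point code as an exercise. As a hedged illustration of the DFS-based approach it describes (low values plus a separate test for the root), the following standalone sketch may be useful. The ArticulationFinder struct and its members addEdge, dfsR, run, and isCut are hypothetical names, the graph is held as plain adjacency lists, and a simple graph is assumed.

```cpp
#include <cassert>
#include <vector>

// Hypothetical standalone sketch: articulation points via ord/low values.
struct ArticulationFinder {
    std::vector<std::vector<int>> adj;
    std::vector<int> ord, low;     // preorder numbers and low values, -1 if unvisited
    std::vector<bool> isCut;       // isCut[v] is true if v is an articulation point
    int cnt = 0;

    explicit ArticulationFinder(int V)
        : adj(V), ord(V, -1), low(V, -1), isCut(V, false) {}
    int V() const { return static_cast<int>(adj.size()); }

    void addEdge(int v, int w) {
        assert(v >= 0 && v < V() && w >= 0 && w < V());
        adj[v].push_back(w);
        adj[w].push_back(v);
    }

    // Visit w, reached from parent (the root is its own parent).
    void dfsR(int parent, int w) {
        ord[w] = low[w] = cnt++;
        int children = 0;          // number of tree edges leaving w
        for (int t : adj[w]) {
            if (ord[t] == -1) {
                ++children;
                dfsR(w, t);
                if (low[t] < low[w]) low[w] = low[t];
                // Non-root w separates t's subtree when nothing in that
                // subtree reaches a proper ancestor of w.
                if (parent != w && low[t] >= ord[w]) isCut[w] = true;
            } else if (t != parent) {
                if (ord[t] < low[w]) low[w] = ord[t];
            }
        }
        // Root test: the root is an articulation point if and only if it has
        // two or more children in the DFS tree.
        if (parent == w && children >= 2) isCut[w] = true;
    }

    void run() {
        for (int v = 0; v < V(); ++v)
            if (ord[v] == -1) dfsR(v, v);
    }
};
```

The only differences from the bridge test are the comparison against ord[w] instead of ord[t] and the special handling of the root.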

Exercises

• 18.33 If a graph is a forest, all its edges are separation edges; but which vertices are separation vertices?

Draw the standard adjacency-lists DFS tree. Use it to find the bridges and the edge-connected components.

18.35 Prove that every vertex in any graph belongs to exactly one edge-connected component.

• 18.36 Add a public member function to Program 18.7 that allows clients to test whether two vertices are in the same edge-connected component.

Draw the standard adjacency-lists DFS tree. Use it to find the articulation points and the biconnected components.

• 18.38 Do the previous exercise using the standard adjacency-matrix DFS tree.

18.39 Prove that every edge in a graph either is a bridge or belongs to exactly one biconnected component.

• 18.40 Prove that any graph with no articulation points is biconnected. Hint: Given a pair of vertices s and t and a path connecting them, use the fact that none of the vertices on the path are articulation points to construct two disjoint paths connecting s and t.

18.41 Derive a class from Program 18.2 for determining whether a graph is biconnected, using a brute-force algorithm that runs in time proportional to V(V+E). Hint: If you mark a vertex as having been seen before you start the search, you effectively remove it from the graph.

• 18.42 Extend your solution to Exercise 18.41 to derive a class that determines whether a graph is 3-connected. Give a formula describing the approximate number of times your program examines a graph edge, as a function of V and E.

18.43 Prove that the root of a DFS tree is an articulation point if and only if it has two or more (internal) children.

• 18.44 Derive a class from Program 18.2 that prints the biconnected components of the graph.

18.45 What is the minimum number of edges that must be present in any biconnected graph with V vertices?

18.46 Modify Program 18.7 to simply determine whether or not the graph is edge-connected (returning as soon as it identifies a bridge if the graph is not) and instrument it to keep track of the number of edges examined. Run empirical tests to study this cost, for graphs of various sizes, drawn from various graph models (see Exercises 17.64–76).

• 18.47 Derive a class from Program 18.2 that allows clients to create objects that know the number of articulation points, bridges, and biconnected components in a graph.

• 18.48 Run experiments to determine empirically the average values of the quantities described in Exercise 18.47 for graphs of various sizes, drawn from various graph models (see Exercises 17.64–76).

18.49 Give the edge connectivity and the vertex connectivity of the graph

0-1 0-2 0-8 2-1 2-8 8-1 3-8 3-7 3-6 3-5 3-4 4-6 4-5 5-6 6-7 7-8.

18.7 Breadth-First Search

Suppose that we want to find a shortest path between two specific vertices in a graph: a path connecting the vertices with the property that no other path connecting those vertices has fewer edges. The classical method for accomplishing this task, called breadth-first search (BFS), is also the basis of numerous algorithms for processing graphs, so we consider it in detail in this section. DFS offers us little assistance in solving this problem, because the order in which it takes us through the graph has no relationship to the goal of finding shortest paths. In contrast, BFS is based on this goal. To find a shortest path from v to w, we start at v and check for w among all the vertices that we can reach by following one edge, then we check all the vertices that we can reach by following two edges, and so forth.

When we come to a point during a graph search where we have more than one edge to traverse, we choose one and save the others to be explored later. In DFS, we use a pushdown stack (that is managed by the system to support the recursive search function) for this purpose. Using the LIFO rule that characterizes the pushdown stack corresponds to exploring passages that are close by in a maze: We

Figure 18.21 Breadth-first search

This figure traces the operation of BFS on our sample graph. We begin with all the edges adjacent to the start vertex on the queue (top left). Next, we move edge 0-2 from the queue to the tree and process its incident edges 2-0 and 2-6 (second from top, left). We do not put 2-0 on the queue because 0 is already on the tree. Third, we move edge 0-5 from the queue to the tree; again 5’s incident edge (to 0) leads nowhere new, but we add 5-3 and 5-4 to the queue (third from top, left). Next, we add 0-7 to the tree and put 7-1 on the queue (bottom left). The edge 7-4 is printed in gray because we could also avoid putting it on the queue, since there is another edge that will take us to 4 that is already on the queue. To complete the search, we take the remaining edges off the queue, completely ignoring the gray edges when they come to the front of the queue (right). Edges enter and leave the queue in order of their distance from 0.

choose, of the passages yet to be explored, the one that was most recently encountered. In BFS, we want to explore the vertices in order of their distance from the start. For a maze, doing the search in this order might require a search team; within a computer program, however, it is easily arranged: We simply use a FIFO queue instead of a stack.

Program 18.8 is an implementation of BFS. It is based on maintaining a queue of all edges that connect a visited vertex with an unvisited vertex. We put a dummy self-loop to the start vertex on the queue, then perform the following steps until the queue is empty:

• Take edges from the queue until finding one that points to an unvisited vertex.

• Visit that vertex; put onto the queue all edges that go from that vertex to unvisited vertices.
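The two steps above can be sketched directly with a standard FIFO queue of edges. This is a minimal standalone illustration under stated assumptions, not the book's Program 18.8: the BfsSketch struct and its members addEdge, search, and distance are hypothetical names, and the graph is held as plain adjacency lists. The vectors ord (visit order) and st (parent links) are named after the vectors that the surrounding text mentions.

```cpp
#include <cassert>
#include <queue>
#include <utility>
#include <vector>

// Hypothetical standalone sketch of edge-queue BFS.
struct BfsSketch {
    std::vector<std::vector<int>> adj;
    std::vector<int> ord;   // visit order, -1 if unvisited
    std::vector<int> st;    // st[w] is the vertex from which w was first reached
    int cnt = 0;

    explicit BfsSketch(int V) : adj(V), ord(V, -1), st(V, -1) {}
    int V() const { return static_cast<int>(adj.size()); }

    void addEdge(int v, int w) {
        assert(v >= 0 && v < V() && w >= 0 && w < V());
        adj[v].push_back(w);
        adj[w].push_back(v);
    }

    void search(int start) {
        std::queue<std::pair<int, int>> q;   // edges (from, to)
        q.push({start, start});              // dummy self-loop to the start vertex
        while (!q.empty()) {
            std::pair<int, int> e = q.front();
            q.pop();
            int v = e.first, w = e.second;
            if (ord[w] != -1) continue;      // edge points to a visited vertex: skip it
            ord[w] = cnt++;                  // visit w
            st[w] = v;
            for (int t : adj[w])             // queue edges to unvisited vertices
                if (ord[t] == -1) q.push({w, t});
        }
    }

    // Number of edges on the tree path from w back to the start vertex.
    int distance(int w) const {
        assert(ord[w] != -1);                // w must have been reached by search
        int d = 0;
        while (st[w] != w) { w = st[w]; ++d; }
        return d;
    }
};
```

Because this version marks a vertex only when its edge leaves the queue, several edges to the same vertex can sit on the queue at once; the first one to come off wins and the rest are skipped.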

Figure 18.21 shows the step-by-step development of BFS on a sample graph.

As we saw in Section 18.4, DFS is analogous to one person exploring a maze. BFS is analogous to a group of people exploring by fanning out in all directions. Although DFS and BFS are different in many respects, there is an essential underlying relationship between the two methods, one that we noted when we briefly considered the methods in Chapter 5. In Section 18.8, we consider a generalized graph-searching method that we can specialize to include these two algorithms and a host of others. Each algorithm has particular dynamic characteristics that we use to solve associated graph-processing problems. For BFS, the distance from each vertex to the start vertex (the length of a shortest path connecting the two) is the key property of interest.

Property 18.9 During BFS, vertices enter and leave the FIFO queue in order of their distance from the start vertex.

Proof: A stronger property holds: The queue always consists of zero or more vertices of distance k from the start, followed by zero or more vertices of distance k+1 from the start, for some integer k. This stronger property is easy to prove by induction.

For DFS, we understood the dynamic characteristics of the algorithm with the aid of the DFS search forest that describes the recursive-call structure of the algorithm. An essential property of that forest is that the forest represents the paths from each vertex back to the place

Program 18.8 Breadth-first search

This graph-search class visits a vertex by scanning through its incident edges, putting any edges to unvisited vertices onto the queue of vertices to be visited. Vertices are marked in the visit order given by the vector ord. The search function that is called by the constructor builds an explicit parent-link representation of the BFS tree (the edges that first take us to each node) in another vector st, which can be used to solve basic shortest-paths problems (see text).

that the search started for its connected component. As indicated in the implementation and shown in Figure 18.22, such a spanning tree also helps us to understand BFS. As with DFS, we have a forest that characterizes the dynamics of the search, one tree for each connected component, one tree node for each graph vertex, and one tree edge for each graph edge. BFS corresponds to traversing each of the trees in this forest in level order. As with DFS, we use a vertex-indexed vector

Program 18.9 Improved BFS

To guarantee that the queue that we use during BFS has at most V entries, we mark the vertices as we put them on the queue.

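void searchC(Edge e)
  { QUEUE<Edge> Q;
    Q.put(e); ord[e.w] = cnt++;       // mark the start vertex as it is queued
    while (!Q.empty())
      {
        e = Q.get(); st[e.w] = e.v;
        typename Graph::adjIterator A(G, e.w);
        for (int t = A.beg(); !A.end(); t = A.nxt())
          if (ord[t] == -1)
            { Q.put(Edge(e.w, t)); ord[t] = cnt++; }
      }
  }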

to represent explicitly the forest with parent links. For BFS, this forest carries essential information about the graph structure:

Property 18.10 For any node w in the BFS tree rooted at v, the tree path from v to w corresponds to a shortest path from v to w in the corresponding graph.

Proof: The tree-path lengths from nodes coming off the queue to the root are nondecreasing, and all nodes closer to the root than w are on the queue; so no shorter path to w was found before it comes off the queue, and no path to w that is discovered after it comes off the queue can be shorter than w’s tree path length. •

As indicated in Figure 18.21 and noted in Chapter 5, there is no need to put an edge on the queue with the same destination vertex as any edge already on the queue, since the FIFO policy ensures that we will process the old queue edge (and visit the vertex) before we get to the new edge. One way to implement this policy is to use a queue ADT implementation where such duplication is disallowed by an ignore-the-new-item policy (see Section 4.7). Another choice is to use the global vertex-marking vector for this purpose: Instead of marking a vertex as having been visited when we take it off the queue,

Figure 18.22 BFS tree

This tree provides a compact description of the dynamic properties of BFS, in a manner similar to the tree depicted in Figure 18.9. Traversing the tree in level order tells us how the search proceeds, step by step: first we visit 0; then we visit 2, 5, and 7; then we check from 2 that 0 was visited and visit 6; and so forth. Each tree node has a child representing each of the nodes adjacent to it, in the order they were considered by the BFS. As in Figure 18.9, links in the BFS tree correspond to edges in the graph: if we replace edges to external nodes by lines to the indicated node, we have a drawing of the graph. Links to external nodes represent edges that were not put onto the queue because they led to marked nodes: they are either parent links or cross links that point to a node either on the same level or one level closer to the root.

The st vector is a parent-link representation of the tree, which we can use to find a shortest path from any node to the root. For example, 3-5-0 is a path in the graph from 3 to 0, since st[3] is 5 and st[5] is 0. No other path from 3 to 0 is shorter.

we do so when we put it on the queue. Testing whether a vertex is marked (whether its entry has changed from its initial sentinel value) then stops us from putting any other edges that point to the same vertex on the queue. This change, shown in Program 18.9, gives a BFS implementation where there are never more than V edges on the queue (one edge pointing to each vertex, at most).

Property 18.11 BFS visits all the vertices and edges in a graph in time proportional to $V^2$ for the adjacency-matrix representation and to V + E for the adjacency-lists representation.

Proof: As we did in proving the analogous DFS properties, we note by inspecting the code that we check each entry in the adjacency-matrix row or in the adjacency list precisely once for every vertex that we visit, so it suffices to show that we visit each vertex. Now, for each connected component, the algorithm preserves the following invariant: All vertices that can be reached from the start vertex (i) are on the BFS tree, (ii) are on the queue, or (iii) can be reached from a vertex on the queue. Each vertex moves from (iii) to (ii) to (i), and the number of vertices in (i) increases on each iteration of the loop, so that the BFS tree eventually contains all the vertices that can be reached from the start vertex.

Thus, as we did for DFS, we consider BFS to be a linear-time algorithm.

With BFS, we can solve the spanning tree, connected components, vertex search, and several other basic connectivity problems that we described in Section 18.4, since the solutions that we considered depend on only the ability of the search to examine every node and edge connected to the starting point. As we shall see,

BFS and DFS are representative of numerous algorithms that have this property.

Our primary interest in BFS, as mentioned at the outset of this section, is that it is the natural graph-search algorithm for applications where we want to know a shortest path between two specified vertices. Next, we consider a specific solution to this problem and its extension to solve two related problems.

Shortest path Find a shortest path in the graph from v to w. We can accomplish this task by starting a BFS that maintains the parent-link representation st of the search tree at v, then stopping when we reach w. The path up the tree from w to v is a shortest path. For example, after constructing a BFS<Graph> object bfs, a client could use the following code to print the path connecting w to v:

for (t = w; t != v; t = bfs.ST(t)) cout << t << "-";
cout << v << endl;

To get the path from v to w, replace the cout operations in this code by stack pushes, then go into a loop that prints the vertex indices after popping them from the stack. Or, start the search at w and stop at v in the first place.

Single-source shortest paths Find shortest paths connecting a given vertex v with each other vertex in the graph. The full BFS tree rooted at v provides a way to accomplish this task: The path from each vertex to the root is a shortest path to the root. Therefore, to solve the problem, we run BFS to completion starting at v. The st vector that results from this computation is a parent-link representation of the BFS tree, and the code in the previous paragraph will give the shortest path to any other vertex w.

All-pairs shortest paths Find shortest paths connecting each pair of vertices in the graph. The way to accomplish this task is to use a BFS class that solves the single-source problem for each vertex in the graph and supports member functions that can handle huge numbers of shortest-path queries efficiently by storing the path lengths and parent-link tree representations for each vertex (see Figure 18.23). This preprocessing requires time proportional to VE and space proportional to $V^2$, a potentially prohibitive cost for huge sparse graphs. However, it allows us to build an ADT with optimal performance: After investing in the preprocessing (and the space to hold the results), we can return shortest-path lengths in constant time and the paths themselves in time proportional to their length (see Exercise 18.55).

These BFS-based solutions are effective, but we do not consider implementations in any further detail here, because they are special cases of algorithms that we consider in detail in Chapter 21. The term shortest paths in graphs is generally taken to describe the corresponding problems for digraphs and networks. Chapter 21 is devoted to this topic. The solutions that we examine there are strict generalizations of the BFS-based solutions described here.

The basic characteristics of BFS search dynamics contrast sharply with those for DFS search, as illustrated in the large graph depicted in Figure 18.24, which you should compare with Figure 18.13. The tree

Figure 18.23 All-pairs shortest paths example

These figures depict the result of doing BFS from each vertex, thus computing the shortest paths connecting all pairs of vertices. Each search gives a BFS tree that defines the shortest paths connecting all graph vertices to the vertex at the root. The results of all the searches are summarized in the two matrices at the bottom. In the left matrix, the entry in row v and column w gives the length of the shortest path from v to w (the depth of v in w's tree). Each row of the right matrix contains the st array for the corresponding search. For example, the shortest path from 3 to 2 has three edges, as indicated by the entry in row 3 and column 2 of the left matrix. The third BFS tree from the top on the left tells us that the path is 3-4-6-2, and this information is encoded in row 2 in the right matrix. The matrix is not necessarily symmetric when there is more than one shortest path, because the paths found depend on the BFS search order. For example, the BFS tree at the bottom on the left and row 3 of the right matrix tell us that the shortest path from 2 to 3 is 2-0-5-3.
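The preprocessing behind the two matrices of Figure 18.23 can be sketched as follows. This is only an illustration under stated assumptions: the graph is a plain `std::vector<std::vector<int>>` of adjacency lists rather than the book's Graph ADT, and the names `AllPairsBFS`, `length`, and `path` are hypothetical. One BFS per source fills a row of path lengths and a row of parent links.

```cpp
#include <cassert>
#include <queue>
#include <vector>

// Illustrative adjacency-lists type; an assumption, not the book's Graph ADT.
using AdjLists = std::vector<std::vector<int>>;

class AllPairsBFS   // hypothetical name
{
    int V;
    std::vector<std::vector<int>> dist; // dist[s][w]: edges on a shortest s-w path, -1 if none
    std::vector<std::vector<int>> st;   // st[s][w]: parent of w in the BFS tree rooted at s
public:
    explicit AllPairsBFS(const AdjLists& adj)
      : V(static_cast<int>(adj.size())),
        dist(V, std::vector<int>(V, -1)),
        st(V, std::vector<int>(V, -1))
    {
        for (int s = 0; s < V; s++) {       // one BFS per source vertex
            std::queue<int> q;
            dist[s][s] = 0; st[s][s] = s;   // mark the root as it goes on the queue
            q.push(s);
            while (!q.empty()) {
                int x = q.front(); q.pop();
                for (int t : adj[x])
                    if (dist[s][t] == -1) { // unmarked: record length and parent, enqueue
                        dist[s][t] = dist[s][x] + 1;
                        st[s][t] = x;
                        q.push(t);
                    }
            }
        }
    }
    // Shortest-path length in constant time (-1 if w is unreachable from v).
    int length(int v, int w) const
    {
        assert(v >= 0 && v < V && w >= 0 && w < V);
        return dist[v][w];
    }
    // Vertices on a shortest path, listed from w back up the tree to v;
    // time proportional to the path length. Empty if w is unreachable.
    std::vector<int> path(int v, int w) const
    {
        std::vector<int> p;
        if (dist[v][w] == -1) return p;
        for (int t = w; t != v; t = st[v][t]) p.push_back(t);
        p.push_back(v);
        return p;
    }
};
```

A client would construct the object once, for example `AllPairsBFS ap(adj);`, and then call `ap.length(v, w)` or `ap.path(v, w)` as often as needed. The two V-by-V tables are exactly the space cost that the text warns may be prohibitive for huge sparse graphs.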

Figure 18.24 Breadth-first search

This figure illustrates the progress of BFS in a random Euclidean near-neighbor graph (left), in the same style as Figure 18.13. As is evident from this example, the search tree for BFS tends to be quite short and wide for this type of graph (and many other types of graphs commonly encountered in practice). That is, vertices tend to be connected to one another by rather short paths. The contrast between the shapes of the DFS and BFS trees is striking testimony to the differing dynamic properties of the algorithms.

is shallow and broad, and demonstrates a set of facts about the graph being searched different from those shown by DFS. For example,

• There exists a relatively short path connecting each pair of vertices in the graph.
• During the search, most vertices are adjacent to numerous unvisited vertices.

Again, this example is typical of the behavior that we expect from BFS, but verifying facts of this kind for graph models of interest and graphs that arise in practice requires detailed analysis.

DFS wends its way through the graph, storing on the stack the points where other paths branch off; BFS sweeps through the graph, using a queue to remember the frontier of visited places. DFS explores the graph by looking for new vertices far away from the start point, taking closer vertices only when dead ends are encountered; BFS completely covers the area close to the starting point, moving farther away only when everything nearby has been examined. The order in which the vertices are visited depends on the graph structure and representation, but these global properties of the search trees are more informed by the algorithms than by the graphs or their representations.

The key to understanding graph-processing algorithms is to realize not only that various different search strategies are effective ways to learn various different graph properties, but also that we can implement many of them uniformly. For example, the DFS illustrated in Figure 18.13 tells us that the graph has a long path, and the BFS illustrated in Figure 18.24 tells us that it has many short paths. Despite these marked dynamic differences, DFS and BFS are similar, essentially differing in only the data structure that we use to save edges that are not yet explored (and the fortuitous circumstance that we can use a recursive implementation for DFS with the system maintaining an implicit stack for us). Indeed, we turn next to a generalized graph-search algorithm that encompasses DFS, BFS, and a host of other useful strategies and will serve as the basis for solutions to numerous classic graph-processing problems.
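The shortest-path task described earlier in this section (run BFS from v, keep parent links, stop on reaching w, then walk up the tree) can be sketched as a self-contained program. The sketch makes several assumptions that are not in the text: the graph is a `std::vector<std::vector<int>>` of adjacency lists, the queue holds vertices rather than edges, the parent-link vector doubles as the mark vector, and the names `bfsParents` and `pathTo` are invented for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <queue>
#include <vector>

// Illustrative adjacency-lists type; an assumption, not the book's Graph ADT.
using AdjLists = std::vector<std::vector<int>>;

// BFS from v. Returns parent links st, with st[v] == v and st[x] == -1 for
// vertices not reached. If stopAt >= 0, the search ends once stopAt comes
// off the queue, so only part of the tree may be built.
std::vector<int> bfsParents(const AdjLists& adj, int v, int stopAt = -1)
{
    assert(v >= 0 && v < static_cast<int>(adj.size()));
    std::vector<int> st(adj.size(), -1);
    std::queue<int> q;
    st[v] = v;                       // mark when put on the queue, as in Program 18.9
    q.push(v);
    while (!q.empty()) {
        int x = q.front(); q.pop();
        if (x == stopAt) break;      // w reached: its tree path is already fixed
        for (int t : adj[x])
            if (st[t] == -1) { st[t] = x; q.push(t); }
    }
    return st;
}

// Shortest path from v to w, inclusive, read off the parent links.
// Returns an empty vector if w was not reached.
std::vector<int> pathTo(const std::vector<int>& st, int v, int w)
{
    std::vector<int> path;
    if (st[w] == -1) return path;
    for (int t = w; t != v; t = st[t]) path.push_back(t);   // walk up the tree
    path.push_back(v);
    std::reverse(path.begin(), path.end());                 // report in v-to-w order
    return path;
}
```

Calling `bfsParents(adj, v)` with no third argument runs the search to completion, which gives the single-source solution: `pathTo(st, v, w)` then works for every w in the component of v. Reversing the collected vertices plays the role of the stack pushes and pops suggested in the text.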

Exercises

18.50 Draw the BFS forest that results from a standard adjacency-lists BFS of the graph

3-7 1-4 7-8 0-5 5-2 3-8 2-9 0-6 4-9 2-6 6-4.

18.51 Draw the BFS forest that results from a standard adjacency-matrix BFS of the graph

3-7 1-4 7-8 0-5 5-2 3-8 2-9 0-6 4-9 2-6 6-4.

• 18.52 Modify Programs 18.8 and 18.9 to use an STL queue instead of the ADT from Section 4.8.

• 18.53 Give a BFS implementation (a version of Program 18.9) that uses a queue of vertices (see Program 5.22). Include a test in the BFS search code to ensure that no duplicates go on the queue.

18.54 Give the all-shortest-path matrices (in the style of Figure 18.23) for the graph

3-7 1-4 7-8 0-5 5-2 3-8 2-9 0-6 4-9 2-6 6-4,

assuming that you use the adjacency-matrix representation.

18.55 Develop a shortest-paths class, which supports shortest-path queries after preprocessing to compute all shortest paths. Specifically, define a two-dimensional matrix as a private data member, and write a constructor that assigns values to all its entries as illustrated in Figure 18.23. Then, implement two query functions length(v, w) (that returns the shortest-path length between v and w) and path(v, w) (that returns the vertex adjacent to v that is on a shortest path between v and w).

• 18.56 What does the BFS tree tell us about the distance from v to w when neither is at the root?

18.57 Develop a class whose objects know the path length that suffices to connect any pair of vertices in a graph. (This quantity is known as the graph's diameter.) Note: You need to define a convention for the return value in the case that the graph is not connected.

18.58 Give a simple optimal recursive algorithm for finding the diameter of a tree (see Exercise 18.57).

• 18.59 Instrument the BFS class Program 18.9 by adding member functions (and appropriate private data members) that return the height of the BFS tree and the percentage of edges that must be processed for every vertex to be seen.

• 18.60 Run experiments to determine empirically the average values of the quantities described in Exercise 18.59 for graphs of various sizes, drawn from various graph models (see Exercises 17.64–76).

18.8 Generalized Graph Search

DFS and BFS are fundamental and essential graph-traversal methods that lie at the heart of numerous graph-processing algorithms. Knowing their essential properties, we now move to a higher level of abstraction, where we see that both methods are special cases of a generalized strategy for moving through a graph, one that is suggested by our BFS implementation (Program 18.9).

The basic idea is simple: We revisit our description of BFS from Section 18.6, but we use the generic term fringe, instead of queue, to describe the set of edges that are possible candidates for being next added to the tree. We are led immediately to a general strategy for searching a connected component of a graph. Starting with a self-loop to a start vertex on the fringe and an empty tree, perform the following operation until the fringe is empty:

Move an edge from the fringe to the tree. If the vertex to which it leads is unvisited, visit that vertex, and put onto the fringe all edges that lead from that vertex to unvisited vertices.

This strategy describes a family of search algorithms that will visit all the vertices and edges of a connected graph, no matter what type of generalized queue we use to hold the fringe edges.

When we use a queue to implement the fringe, we get BFS, the topic of Section 18.6. When we use a stack to implement the fringe, we get DFS. Figure 18.25, which you should compare with Figures 18.6

Figure 18.25 Stack-based DFS

Together with Figure 18.21, this figure illustrates that BFS and DFS differ only in the underlying data structure. For BFS, we used a queue; for DFS we use a stack. We begin with all the edges adjacent to the start vertex on the stack (top left). Second, we move edge 0-7 from the stack to the tree and push onto the stack its incident edges that go to vertices not yet on the tree 7-1, 7-4, and 7-6 (second from top, left). The LIFO stack discipline implies that, when we put an edge on the stack, any edges that point to the same vertex are obsolete and will be ignored when they reach the top of the stack. Such edges are printed in gray. Third, we move edge 7-6 from the stack to the tree and push its incident edges on the stack (third from top, left). Next, we pop edge 4-6 and push its incident edges on the stack, two of which will take us to new vertices (bottom left). To complete the search, we take the remaining edges off the stack, completely ignoring the gray edges when they come to the top of the stack (right).

and 18.21, illustrates this phenomenon in detail. Proving this equivalence between recursive and stack-based DFS is an interesting exercise in recursion removal, where we essentially transform the stack underlying the recursive program into the stack implementing the fringe (see Exercise 18.63). The search order for the DFS depicted in Figure 18.25 differs from the one depicted in Figure 18.6 only because the stack discipline implies that we check the edges incident on each vertex in reverse of the order that we encounter them in the adjacency matrix (or the adjacency lists). The basic fact remains that, if we change the data structure used by Program 18.8 to be a stack instead of a queue (which is trivial to do because the ADT interfaces of the two data structures differ in only the function names), then we change that program from BFS to DFS.

Now, as we discussed in Section 18.7, this general method may not be as efficient as we would like, because the fringe becomes cluttered up with edges that point to vertices that are moved to the tree during the time that the edge is on the fringe. For FIFO queues, we avoid this situation by marking destination vertices when we put edges on the queue. We ignore edges to fringe vertices because we know that they will never be used: The old one will come off the queue (and the vertex visited) before the new one does (see Program 18.9). For a stack implementation, we want the opposite: When an edge is to be added to the fringe that has the same destination vertex as the one already there, we know that the old edge will never be used, because the new one will come off the stack (and the vertex visited) before the old one. To encompass these two extremes and to allow for fringe implementations that can use some other policy to disallow edges on the fringe that point to the same vertex, we modify our general scheme as follows:

Move an edge from the fringe to the tree. Visit the vertex that it leads to, and put all edges that lead from that vertex to unvisited vertices onto the fringe, using a replacement policy on the fringe that ensures that no two edges on the fringe point to the same vertex.

The no-duplicate-destination-vertex policy on the fringe guarantees that we do not need to test whether the destination vertex of the edge coming off the queue has been visited. For BFS, we use a queue implementation with an ignore-the-new-item policy; for DFS, we need

Figure 18.26 Graph search terminology

During a graph search, we maintain a search tree (black) and a fringe (gray) of edges that are candidates to be next added to the tree. Each vertex is either on the tree (black), the fringe (gray), or not yet seen (white). Tree vertices are connected by tree edges, and each fringe vertex is connected by a fringe edge to a tree vertex.

Figure 18.27 Graph search strategies

This figure illustrates different possibilities when we take a next step in the search illustrated in Figure 18.26. We move a vertex from the fringe to the tree (in the center of the wheel at the top right) and check all its edges, putting those to unseen vertices on the fringe and using an algorithm-specific replacement rule to decide whether those to fringe vertices should be skipped or should replace the fringe edge to the same vertex. In DFS, we always replace the old edges; in BFS, we always skip the new edges; and in other strategies, we might replace some and skip others.

a stack with a forget-the-old-item policy; but any generalized queue and any replacement policy at all will still yield an effective method for visiting all the vertices and edges of the graph in linear time and extra space proportional to V. Figure 18.27 is a schematic illustration of these differences. We have a family of graph-searching strategies that includes both DFS and BFS and whose members differ only in the generalized-queue implementation that they use. As we shall see, this family encompasses numerous other classical graph-processing algorithms.

Program 18.10 is an implementation based on these ideas for graphs represented with adjacency lists. It puts fringe edges on a generalized queue and uses the usual vertex-indexed vectors to identify fringe vertices so that it can use an explicit update ADT operation whenever it encounters another edge to a fringe vertex. The ADT implementation can choose to ignore the new edge or to replace the old one.

Property 18.12 Generalized graph searching visits all the vertices and edges in a graph in time proportional to $V^2$ for the adjacency-matrix representation and to V + E for the adjacency-lists representation plus, in the worst case, the time required for V insert, V remove, and E update operations in a generalized queue of size V.

Proof: The proof of Property 18.11 does not depend on the queue implementation, and therefore applies. The stated extra time requirements for the generalized-queue operations follow immediately from the implementation.

There are many other effective ADT designs for the fringe that we might consider. For example, as with our first BFS implementation, we could stick with our first general scheme and simply put all the edges on the fringe, then ignore those that go to tree vertices when

Program 18.10 Generalized graph search

This graph-search class generalizes BFS and DFS and supports numerous graph-processing algorithms (see Section 21.2 for a discussion of these algorithms and alternate implementations). It maintains a generalized queue of edges called the fringe. We initialize the fringe with a self-loop to the start vertex; then, while the fringe is not empty, we move an edge e from the fringe to the tree (attached at e.v) and scan e.w's adjacency list, moving unseen vertices to the fringe and calling update for new edges to fringe vertices.

This code makes judicious use of ord and st to guarantee that no two edges on the fringe point to the same vertex. A vertex v is the destination vertex of a fringe edge if and only if it is marked (ord[v] is not -1) but it is not yet on the tree (st[v] is -1).
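The scheme just described can be sketched in self-contained form as follows. This is an independent illustration, not the book's Program 18.10: it assumes a `std::vector<std::vector<int>>` of adjacency lists in place of the Graph ADT, and the names `FifoFringe`, `StackFringe`, and `fringeSearch` are made up. The search function is written once; plugging in a FIFO fringe that ignores new edges gives BFS, and a LIFO fringe that forgets old edges gives DFS.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <vector>

struct Edge { int v, w; };                        // edge from v to w
using AdjLists = std::vector<std::vector<int>>;   // assumed representation

// FIFO fringe with an ignore-the-new-item policy: yields BFS.
struct FifoFringe
{
    std::deque<Edge> q;
    bool empty() const { return q.empty(); }
    void put(Edge e) { q.push_back(e); }
    void update(Edge) { }                         // keep the old edge
    Edge get() { Edge e = q.front(); q.pop_front(); return e; }
};

// LIFO fringe with a forget-the-old-item policy: yields DFS.
// The linear scan in update keeps the sketch short; it is not efficient.
struct StackFringe
{
    std::vector<Edge> s;
    bool empty() const { return s.empty(); }
    void put(Edge e) { s.push_back(e); }
    void update(Edge e)                           // drop the old edge to e.w, push the new one
    {
        for (std::size_t i = 0; i < s.size(); i++)
            if (s[i].w == e.w) { s.erase(s.begin() + i); break; }
        s.push_back(e);
    }
    Edge get() { Edge e = s.back(); s.pop_back(); return e; }
};

// Searches the component containing start. Returns the vertices in the
// order in which they move from the fringe to the tree; st receives the
// parent links (st[start] == start, -1 for vertices not reached).
template <class Fringe>
std::vector<int> fringeSearch(const AdjLists& adj, int start, std::vector<int>& st)
{
    assert(start >= 0 && start < static_cast<int>(adj.size()));
    std::vector<int> ord(adj.size(), -1), order;
    st.assign(adj.size(), -1);
    int cnt = 0;
    Fringe fringe;
    fringe.put(Edge{start, start}); ord[start] = cnt++;   // self-loop to the start vertex
    while (!fringe.empty()) {
        Edge e = fringe.get();
        st[e.w] = e.v;                                    // e.w joins the tree
        order.push_back(e.w);
        for (int t : adj[e.w])
            if (ord[t] == -1)                             // unseen: move to the fringe
                { fringe.put(Edge{e.w, t}); ord[t] = cnt++; }
            else if (st[t] == -1)                         // marked but not on the tree: fringe vertex
                fringe.update(Edge{e.w, t});
    }
    return order;
}
```

A call such as `fringeSearch<FifoFringe>(adj, 0, st)` performs BFS from vertex 0, and `fringeSearch<StackFringe>(adj, 0, st)` performs DFS with each adjacency list effectively scanned in reverse. Any other type with the same four member functions could be substituted for the fringe.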

Program 18.11 Random queue implementation

When we remove an item from this data structure, it is equally likely to be any one of the items currently in the data structure. We can use this code to implement the generalized-queue ADT for graph searching to search a graph in a "random" fashion (see text).

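#include <cstdlib>
#include <vector>
using std::vector;

template <class Item>
class GQ
{
  private:
    vector<Item> s; int N;               // the items are kept in s[0..N-1]
  public:
    GQ(int maxN) : s(maxN+1), N(0) { }
    int empty() const
      { return N == 0; }
    void put(Item item)
      { s[N++] = item; }
    void update(Item x) { }              // no updates: the first edge to a fringe vertex is kept
    Item get()
      { // scale rand() to [0, 1) before multiplying, so that the product cannot overflow an int
        int i = int(N*(std::rand()/(1.0+RAND_MAX)));
        Item t = s[i];                   // exchange the chosen item with the last one
        s[i] = s[N-1];
        s[N-1] = t;
        return s[--N]; }
};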

we take them off. The disadvantage of this approach, as with BFS, is that the maximum queue size has to be E instead of V. Or, we could handle updates implicitly in the ADT implementation, just by specifying that no two edges with the same destination vertex can be on the queue. But the simplest way for the ADT implementation to do so is essentially equivalent to using a vertex-indexed vector (see Exercises 4.51 and 4.54), so the test fits more comfortably into the client graph-search program.

The combination of Program 18.10 and the generalized-queue abstraction gives us a general and flexible graph-search mechanism. To illustrate this point, we now consider briefly two interesting and useful alternatives to BFS and DFS.

The first alternative strategy is based on randomized queues (see Section 4.6). In a randomized queue, we remove items randomly: Each item on the data structure is equally likely to be the one removed. Program 18.11 is an implementation that provides this functionality. If we use this code to implement the generalized queue ADT for Program 18.10, then we get a randomized graph-searching algorithm, where each vertex on the fringe is equally likely to be the next one added to the tree. The edge (to that vertex) that is added to the tree depends on the implementation of the update operation. The implementation in Program 18.11 does no updates, so each fringe vertex is added to the tree with the edge that caused it to be moved to the fringe. Alternatively, we might choose to always do updates (which results in the most recently encountered edge to each fringe vertex being added to the tree), or to make a random choice.

Another strategy, which is critical in the study of graph-processing algorithms because it serves as the basis for a number of the classical algorithms that we address in Chapters 20 through 22, is to use a priority-queue ADT (see Chapter 9) for the fringe: We assign priority values to each edge on the fringe, update them as appropriate, and choose the highest-priority edge as the one to be added next to the tree. We consider this formulation in detail in Chapter 20. The queue-maintenance operations for priority queues are more costly than are those for stacks and queues because they involve implicit comparisons among items on the queue, but they can support a much broader class of graph-search algorithms. As we shall see, several critical graph-processing problems can be addressed simply with judicious choice of priority assignments in a priority-queue-based generalized graph search.

All generalized graph-searching algorithms examine each edge just once and take extra space proportional to V in the worst case; they do differ, however, in some performance measures. For example, Figure 18.28 shows the size of the fringe as the search progresses for DFS, BFS, and randomized search; Figure 18.29 shows the tree computed by randomized search for the same example as Figure 18.13 and Figure 18.24. Randomized search has neither the long paths of DFS nor the high-degree nodes of BFS. The shapes of these trees and the fringe plots depend on the structure of the particular graph being searched, but they also characterize the different algorithms.

Figure 18.28 Fringe sizes for DFS, randomized graph search, and BFS

These plots of the fringe size during the searches illustrated in Figures 18.13, 18.24, and 18.29 indicate the dramatic effects that the choice of data structure for the fringe can have on graph searching. When we use a stack, in DFS (top), we fill up the fringe early in the search as we find new nodes at every step, then we end the search by removing everything. When we use a randomized queue (center), the maximum queue size is much lower. When we use a FIFO queue in BFS (bottom), the maximum queue size is still lower, and we discover new nodes throughout the search.

Figure 18.29 Randomized graph search

This figure illustrates the progress of randomized graph searching (left), in the same style as Figures 18.13 and 18.24. The search tree shape falls somewhere between the BFS and DFS shapes. The dynamics of these three algorithms, which differ only in the data structure for work to be completed, could hardly be more different.

We could generalize graph searching still further by working with a forest (not necessarily a tree) during the search. Although we stop short of working at this level of generality throughout, we consider a few algorithms of this sort in Chapter 20.
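As a variation on the randomized queue of Program 18.11, the same interface can be written with the C++11 `<random>` facilities instead of `rand()`. The following is a sketch under stated assumptions, not a replacement endorsed by the text: the class name `RandomQueue` and the fixed default seed are hypothetical choices, and the storage grows on demand rather than being sized in the constructor.

```cpp
#include <cassert>
#include <cstddef>
#include <random>
#include <utility>
#include <vector>

template <class Item>
class RandomQueue   // hypothetical name; same operations as the generalized-queue ADT
{
    std::vector<Item> s;    // current items, in no particular order
    std::mt19937 gen;       // pseudorandom generator owned by the queue
public:
    explicit RandomQueue(unsigned seed = 12345u) : gen(seed) { }
    bool empty() const { return s.empty(); }
    void put(const Item& item) { s.push_back(item); }
    void update(const Item&) { }     // no updates, as in Program 18.11
    Item get()
    {
        assert(!s.empty());
        // Pick an index uniformly from 0..size-1, move that item to the
        // end, and remove it from there in constant time.
        std::uniform_int_distribution<std::size_t> pick(0, s.size() - 1);
        std::swap(s[pick(gen)], s.back());
        Item t = s.back();
        s.pop_back();
        return t;
    }
};
```

Because the generator is seeded explicitly, two queues built with the same seed and fed the same items return them in the same order, which can make randomized searches reproducible during experiments.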

Exercises

• 18.61 Discuss the advantages and disadvantages of a generalized graph-searching implementation that is based on the following policy: "Move an edge from the fringe to the tree. If the vertex that it leads to is unvisited, visit that vertex and put all its incident edges onto the fringe."

18.62 Develop an adjacency-lists ADT implementation that keeps edges (not just destination vertices) on the lists, then implement a graph search based on the strategy described in Exercise 18.61 that visits every edge but destroys the graph, taking advantage of the fact that you can move all of a vertex's edges to the fringe with a single link change.

• 18.63 Prove that recursive DFS (Program 18.3) is equivalent to generalized graph search using a stack (Program 18.10), in the sense that both programs will visit all vertices in precisely the same order for all graphs if and only if the programs scan the adjacency lists in opposite orders.

18.64 Give three different possible traversal orders for randomized search through the graph

3-7 1-4 7-8 0-5 5-2 3-8 2-9 0-6 4-9 2-6 6-4.

18.65 Could randomized search visit the vertices in the graph

3-7 1-4 7-8 0-5 5-2 3-8 2-9 0-6 4-9 2-6 6-4

in numerical order of their indices? Prove your answer.

18.66 Use the STL to build a generalized-queue implementation for graph edges that disallows edges with duplicate vertices on the queue, using an ignore-the-new-item policy.

• 18.67 Develop a randomized graph search that chooses each edge on the fringe with equal likelihood. Hint: See Program 18.8.

• 18.68 Describe a maze-traversal strategy that corresponds to using a standard pushdown stack for generalized graph searching (see Section 18.1).

• 18.69 Instrument generalized graph searching (see Program 18.10) to print out the height of the tree and the percentage of edges processed for every vertex to be seen.

• 18.70 Run experiments to determine empirically the average values of the quantities described in Exercise 18.69 for generalized graph search with a random queue in graphs of various sizes, drawn from various graph models (see Exercises 17.64–76).

• 18.71 Implement a derived class that does dynamic graphical animations of generalized graph search for graphs that have (x, y) coordinates associated with each vertex (see Exercises 17.55 through 17.59). Test your program on random Euclidean neighbor graphs, using as many points as you can process in a reasonable amount of time. Your program should produce images like the snapshots shown in Figures 18.13, 18.24, and 18.29, although you should feel free to use colors instead of shades of gray to denote tree, fringe, and unseen vertices and edges.

18.9 Analysis of Graph Algorithms

We have for our consideration a broad variety of graph-processing problems and methods for solving them, so we do not always compare numerous different algorithms for the same problem, as we have in other domains. Still, it is always valuable to gain experience with our algorithms by testing them on real data, or on artificial data that we understand and that have relevant characteristics that we might expect to find in actual applications.

As we discussed briefly in Chapter 2, we seek, ideally, natural input models that have three critical properties:

• They reflect reality to a sufficient extent that we can use them to predict performance.
• They are sufficiently simple that they are amenable to mathematical analysis.
• We can write generators that provide problem instances that we can use to test our algorithms.

With these three components, we can enter into a design-analysis-implementation-test scenario that leads to efficient algorithms for solving practical problems.

For domains such as sorting and searching, we have seen spectacular success along these lines in Parts 3 and 4. We can analyze algorithms, generate random problem instances, and refine implementations to provide extremely efficient programs for use in a host of practical situations. For some other domains that we study, various difficulties can arise. For example, mathematical analysis at the level that we would like is beyond our reach for many geometric problems, and developing an accurate model of the input is a significant challenge for many string-processing algorithms (indeed, doing so is an essential part of the computation). Similarly, graph algorithms take us to a situation where, for many applications, we are on thin ice with respect to all three properties listed in the previous paragraph:

• The mathematical analysis is challenging, and many basic analytic questions remain unanswered.
• There is a huge number of different types of graphs, and we cannot reasonably test our algorithms on all of them.
• Characterizing the types of graphs that arise in practical problems is, for the most part, a poorly understood problem.

Graphs are sufficiently complicated that we often do not fully understand the essential properties of the ones that we see in practice or of the artificial ones that we can perhaps generate and analyze.

The situation is perhaps not as bleak as just described for one primary reason: Many of the graph algorithms that we consider are optimal in the worst case, so predicting performance is a trivial exercise. For example, Program 18.7 finds the bridges after examining each edge and each vertex just once. This cost is the same as the cost of building the graph data structure, and we can confidently predict, for example, that doubling the number of edges will double the running time, no matter what kind of graphs we are processing.

When the running time of an algorithm depends on the structure of the input graph, predictions are much harder to come by. Still, when we need to process huge numbers of huge graphs, we want efficient algorithms for the same reasons that we want them for any other problem domain, and we will continue to pursue the goals of understanding the essential properties of algorithms and applications, striving to identify those methods that can best handle the graphs that might arise in practice.

To illustrate some of these issues, we revisit the study of graph connectivity, a problem that we considered already in Chapter 1 (!). Connectivity in random graphs has fascinated mathematicians for years, and it has been the subject of an extensive literature. That literature is beyond the scope of this book, but it provides a backdrop that validates our use of the problem as the basis for some experimental studies that help us understand the basic algorithms that we use and the types of graphs that we are considering.

For example, growing a graph by adding random edges to a set of initially isolated vertices (essentially, the process behind Program 17.12) is a well-studied process that has served as the basis for

Table 18.1 Connectivity in two random graph models

Key:
C number of connected components
L size of largest connected component

This table shows the number of connected components and the size of the largest connected component for 100000-vertex graphs drawn from two different distributions. For the random graph model, these experiments support the well-known fact that the graph is highly likely to consist primarily of one giant component if the average vertex degree is larger than a small constant. The right two columns give experimental results when we restrict the edges to be chosen from those that connect each vertex to just one of 10 specified neighbors.

classical random graph theory. It is well known that, as the number of edges grows, the graph coalesces into just one giant component. The literature on random graphs gives extensive information about the nature of this process. For example,

Property 18.13 If $E > \frac{1}{2} V \ln V + \mu V$ (with $\mu$ positive), a random graph with V vertices and E edges consists of a single connected component and an average of less than $e^{-2\mu}$ isolated vertices, with probability approaching 1 as V approaches infinity.

Figure 18.30 Connectivity in random graphs

This figure shows the evolution of two types of random graphs at 10 equal increments as a total of 2E edges are added to initially empty graphs. Each plot is a histogram of the number of vertices in components of size 1 through V − 1 (left to right). We start out with all vertices in components of size 1 and end with nearly all vertices in a giant component. The plot at left is for a standard random graph: the giant component forms quickly, and all other components are small. The plot at right is for a random neighbor graph: components of various sizes persist for a longer time.
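The kind of experiment summarized in this figure can be sketched with a union-find structure that tracks the number of components and the size of the largest one as random edges arrive. Everything specific here is an assumption made for illustration: the names `UF` and `connectivity`, the use of `std::mt19937`, the decision to allow self-loops and duplicate edges, and in particular the neighbor model, which simply joins v to v + d for a random d between 1 and `maxGap` and discards draws that run past the last vertex.

```cpp
#include <cassert>
#include <numeric>
#include <random>
#include <utility>
#include <vector>

// Weighted quick union with path halving, extended to track the number of
// components and the size of the largest one.
class UF
{
    std::vector<int> id, sz;
    int comps, largest;
public:
    explicit UF(int V) : id(V), sz(V, 1), comps(V), largest(V > 0 ? 1 : 0)
    { std::iota(id.begin(), id.end(), 0); }
    int find(int x)
    {
        while (x != id[x]) { id[x] = id[id[x]]; x = id[x]; }   // halve the path
        return x;
    }
    void unite(int p, int q)
    {
        int i = find(p), j = find(q);
        if (i == j) return;
        if (sz[i] < sz[j]) std::swap(i, j);     // link the smaller tree below the larger
        id[j] = i; sz[i] += sz[j];
        if (sz[i] > largest) largest = sz[i];
        comps--;
    }
    int count() const { return comps; }
    int maxSize() const { return largest; }
};

// Adds E random edges to V isolated vertices and returns {C, L}: the number
// of components and the size of the largest component.
//   maxGap == 0: both endpoints uniform (random-graph model).
//   maxGap  > 0: w = v + d, d uniform in 1..maxGap (assumed neighbor model).
std::pair<int, int> connectivity(int V, long E, int maxGap, unsigned seed)
{
    assert(V > 0 && E >= 0 && maxGap >= 0);
    std::mt19937 gen(seed);
    std::uniform_int_distribution<int> vertex(0, V - 1);
    UF uf(V);
    for (long i = 0; i < E; i++) {
        int v = vertex(gen), w;
        if (maxGap == 0) w = vertex(gen);
        else {
            std::uniform_int_distribution<int> gap(1, maxGap);
            w = v + gap(gen);
            if (w >= V) continue;               // no wraparound: this draw is wasted
        }
        uf.unite(v, w);
    }
    return {uf.count(), uf.maxSize()};
}
```

Calling `connectivity(V, E, 0, seed)` and `connectivity(V, E, 10, seed)` for a range of E values gives C and L pairs for the two models, which can then be compared in the spirit of Table 18.1. The numbers obtained depend on the seed and on the modeling assumptions above.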

Proof: This fact was established by seminal work of Erdős and Rényi in 1960. The proof is beyond the scope of this book (see reference section). •

This property tells us that we can expect large nonsparse random graphs to be connected. For example, if V > 1000 and E > 10V, then $\mu > 10 - \frac{1}{2}\ln 1000 > 6.5$ and the average number of vertices not in the giant component is (almost surely) less than $e^{-13} < .000003$. If we generate a million random 1000-vertex graphs of density greater than 10, we might get a few with a single isolated vertex, but the rest will all be connected.

Figure 18.30 compares random graphs with random neighbor graphs, where we allow only edges that connect vertices whose indices are within a small constant of one another. The neighbor-graph model yields graphs that are evidently substantially different in character from random graphs. We eventually get a giant component, but it appears suddenly, when two large components merge.

Table 18.1 shows that these structural differences between random graphs and neighbor graphs persist for V and E in ranges of practical interest. Such structural differences certainly may be reflected in the performance of our algorithms.

Table 18.2 gives empirical results for the cost of finding the number of connected components in a random graph, using various algorithms. Although the algorithms perhaps are not subject to direct comparison for this specific task because they are designed to handle different tasks, these experiments do validate a subset of the general conclusions that we have drawn.

First, it is plain from the table that we should not use the adjacency-matrix representation for large sparse graphs (and cannot use it for huge ones), not just because the cost of initializing the matrix is prohibitive, but also because the algorithm inspects every entry in the matrix, so its running time is proportional to the size ($V^2$) of the matrix rather than to the number of 1s in it (E). The table shows, for example, that it takes about as long to process a graph that contains 1000 edges as it does to process one that contains 100000 edges when we are using an adjacency matrix.

Second, it is also plain from Table 18.2 that the cost of allocating memory for the list nodes is significant when we build adjacency lists for large sparse graphs. The cost of building the lists is more than five

Table 18.2 Empirical study of graph-search algorithms

Key:
U weighted quick union with halving (Program 1.4)
I initial construction of the graph representation
D recursive DFS (Program 18.3)
B BFS (Program 18.9)
* exit when graph is found to be fully connected

This table shows relative timings for various algorithms for the task of determining the number of connected components (and the size of the largest one) for graphs with various numbers of vertices and edges. As expected, algorithms that use the adjacency-matrix representation are slow for sparse graphs but competitive for dense graphs. For this specialized task, the union-find algorithms that we considered in Chapter 1 are the fastest, because they build a data structure tailored to solve the problem and do not need otherwise to represent the graph. Once the data structure representing the graph has been built, however, DFS and BFS are faster and more flexible. Adding a test to stop when the graph is known to consist of a single connected component significantly speeds up DFS and union-find (but not BFS) for dense graphs.

times the cost of traversing them. In the typical situation where we are going to perform numerous searches of various types after building the graph, this cost is acceptable. Otherwise, we might consider alternate implementations to reduce this cost.

Third, the absence of numbers in the DFS columns for large sparse graphs is significant. These graphs cause excessive recursion depth, which (eventually) causes the program to crash. If we want to use DFS on such graphs, we need to use the nonrecursive version that we discussed in Section 18.7.

Fourth, the table shows that the union-find–based method from Chapter 1 is faster than DFS or BFS, primarily because it does not have to represent the entire graph. Without such a representation, however, we cannot answer simple queries such as “Is there an edge connecting v and w?” so union-find–based methods are not suitable if we want to do more than what they are designed to do (answer “Is there a path between v and w?” queries intermixed with adding edges). Once the internal representation of the graph has been built, it is not worthwhile to implement a union-find algorithm just to determine whether it is connected, because DFS or BFS can provide the answer about as quickly.

When we run empirical tests that lead to tables of this sort, various anomalies might require further explanation. For example, on many computers, the cache architecture and other features of the memory system might have dramatic impact on performance for large graphs. Improving performance in critical applications may require detailed knowledge of the machine architecture in addition to all the factors that we are considering.

Careful study of these tables will reveal more properties of these algorithms than we are able to address. Our aim is not to do an exhaustive comparison but to illustrate that, despite the many challenges that we face when we compare graph algorithms, we can and should run empirical studies and make use of any available analytic results, both to get a feeling for the algorithms’ important characteristics and to predict performance.

Exercises

• 18.72 Do an empirical study culminating in a table like Table 18.2 for the problem of determining whether or not a graph is bipartite (two-colorable).

18.73 Do an empirical study culminating in a table like Table 18.2 for the problem of determining whether or not a graph is biconnected.

• 18.74 Do an empirical study to find the expected size of the second largest connected component in sparse graphs of various sizes, drawn from various graph models (see Exercises 17.64–76).

18.75 Write a program that produces plots like those in Figure 18.30, and test it on graphs of various sizes, drawn from various graph models (see Exercises 17.64–76).

• 18.76 Modify your program from Exercise 18.75 to produce similar histograms for the sizes of edge-connected components.

•• 18.77 The numbers in the tables in this section are results for only one sample. We might wish to prepare a similar table where we run 1000 experiments for each entry and give the sample mean and standard deviation, but we probably could not include nearly as many entries. Would this approach be a better use of computer time? Defend your answer.

CHAPTER NINETEEN
Digraphs and DAGs

WHEN WE ATTACH significance to the order in which the two vertices are specified in each edge of a graph, we have an entirely different combinatorial object known as a digraph, or directed graph. Figure 19.1 shows a sample digraph. In a digraph, the notation s-t describes an edge that goes from s to t but provides no information about whether or not there is an edge from t to s. There are four different ways in which two vertices might be related in a digraph: no edge; an edge s-t from s to t; an edge t-s from t to s; or two edges s-t and t-s, which indicate connections in both directions. The one-way restriction is natural in many applications, easy to enforce in our implementations, and seems innocuous; but it implies added combinatorial structure that has profound implications for our algorithms and makes working with digraphs quite different from working with undirected graphs. Processing digraphs is akin to traveling around in a city where all the streets are one-way, with the directions not necessarily assigned in any uniform pattern. We can imagine that getting from one point to another in such a situation could be a challenge indeed.

Figure 19.1 A directed graph (digraph)

A digraph is defined by a list of nodes and edges (bottom), with the order that we list the nodes when specifying an edge implying that the edge is directed from the first node to the second. When drawing a digraph, we use arrows to depict directed edges (top).

We interpret edge directions in digraphs in many ways. For example, in a telephone-call graph, we might consider an edge to be directed from the caller to the person receiving the call. In a transaction graph, we might have a similar relationship where we interpret an edge as representing cash, goods, or information flowing from one entity to another. We find a modern situation that fits this classic model on the Internet, with vertices representing Web pages and edges the links between the pages. In Section 19.4, we examine other examples, many of which model situations that are more abstract.

One common situation is for the edge direction to reflect a precedence relationship. For example, a digraph might model a manufacturing line: Vertices correspond to jobs to be done, and an edge exists from vertex s to vertex t if the job corresponding to vertex s must be done before the job corresponding to vertex t. Another way to model the same situation is to use a PERT chart: edges represent jobs and vertices implicitly specify the precedence relationships (at each vertex, all incoming jobs must complete before any outgoing jobs can begin). How do we decide when to perform each of the jobs so that none of the precedence relationships are violated? This is known as a scheduling problem. It makes no sense if the digraph has a cycle so, in such situations, we are working with directed acyclic graphs (DAGs). We shall consider basic properties of DAGs and algorithms for this simple scheduling problem, which is known as topological sorting, in Sections 19.5 through 19.7. In practice, scheduling problems generally involve weights on the vertices or edges that model the time or cost of each job. We consider such problems in Chapters 21 and 22.

The number of possible digraphs is truly huge. Each of the $V^2$ possible directed edges (including self-loops) could be present or not, so the total number of different digraphs is $2^{V^2}$. As illustrated in Figure 19.2, this number grows very quickly, even by comparison with the number of different undirected graphs and even when V is small. As with undirected graphs, there is a much smaller number of classes of digraphs that are isomorphic to each other (the vertices of one can be relabeled to make it identical to the other), but we cannot take advantage of this reduction because we do not know an efficient algorithm for digraph isomorphism.

Figure 19.2 Graph enumeration

While the number of different undirected graphs with V vertices is huge, even when V is small, the number of different digraphs with V vertices is much larger. For undirected graphs, the number is given by the formula $2^{V(V+1)/2}$; for digraphs, the formula is $2^{V^2}$.
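To make the growth rates concrete, the following minimal sketch computes how many decimal digits each count has, using the two formulas just given. The function names are hypothetical and chosen only for illustration; the sketch relies on the fact that $2^k$ has $\lfloor k \log_{10} 2 \rfloor + 1$ decimal digits.

```cpp
#include <cassert>
#include <cmath>

// Number of decimal digits in 2^k, computed as floor(k * log10(2)) + 1.
// (Hypothetical helper for illustration; k must be nonnegative.)
long long digitsInPowerOfTwo(long long k) {
    assert(k >= 0);
    return static_cast<long long>(std::floor(k * std::log10(2.0))) + 1;
}

// Exponent of 2 in the number of undirected graphs with V vertices
// (self-loops allowed): V(V+1)/2 possible edges.
long long graphExponent(long long V) { return V * (V + 1) / 2; }

// Exponent of 2 in the number of digraphs with V vertices
// (self-loops allowed): V^2 possible directed edges.
long long digraphExponent(long long V) { return V * V; }
```

For V = 10, digitsInPowerOfTwo(graphExponent(10)) gives 17 digits, whereas digitsInPowerOfTwo(digraphExponent(10)) gives 31 digits.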

Certainly, any program will have to process only a tiny fraction of the possible digraphs; indeed, the numbers are so large that we can be certain that virtually all digraphs will not be among those processed by any given program. Generally, it is difficult to characterize the digraphs that we might encounter in practice, so we design our algorithms such that they can handle any possible digraph as input. On the one hand, this situation is not new to us (for example, virtually none of the 1000! permutations of 1000 elements have ever been processed by any sorting program). On the other hand, it is perhaps unsettling to know that, for example, even if all the electrons in the universe could run supercomputers capable of processing $10^{10}$ graphs per second for the estimated lifetime of the universe, those supercomputers would see far fewer than $10^{-100}$ percent of the 10-vertex digraphs (see Exercise 19.9).

This brief digression on graph enumeration perhaps underscores several points that we cover whenever we consider the analysis of algorithms and indicates their particular relevance to the study of digraphs. Is it important to design our algorithms to perform well in the worst case, when we are so unlikely to see any particular worst-case digraph? Is it useful to choose algorithms on the basis of average-case analysis, or is that a mathematical fiction? If our intent is to have implementations that perform efficiently on digraphs that we see in practice, we are immediately faced with the problem of characterizing those digraphs. Mathematical models that can convincingly describe the digraphs that we might expect in applications are even more difficult to develop than are models for undirected graphs.

In this chapter, we revisit, in the context of digraphs, a subset of the basic graph-processing problems that we considered in Chapter 17, and we examine several problems that are specific to digraphs. In particular, we look at DFS and several of its applications, including cycle detection (to determine whether a digraph is a DAG); topological sort (to solve, for example, the scheduling problem for DAGs that was just described); and computation of the transitive closure and the strong components (which have to do with the basic problem of determining whether or not there is a directed path between two given vertices). As in other graph-processing domains, these algorithms range from the trivial to the ingenious; they are both informed by and give us insight into the complex combinatorial structure of digraphs.

Exercises

• 19.1 Find a large digraph somewhere online, perhaps a transaction graph in some online system, or a digraph defined by links on Web pages.

• 19.2 Find a large DAG somewhere online, perhaps one defined by function-definition dependencies in a large software system, or by directory links in a large file system.

19.3 Make a table like Figure 19.2, but exclude from the counts graphs and digraphs with self-loops.

19.4 How many digraphs are there that contain V vertices and E edges?

• 19.5 How many digraphs correspond to each undirected graph that contains V vertices and E edges?

• 19.6 How many digits do we need to express the number of digraphs that have V vertices as a base-10 number?

• 19.7 Draw the nonisomorphic digraphs that contain three vertices.

••• 19.8 How many different digraphs are there with V vertices and E edges if we consider two digraphs to be different only if they are not isomorphic?

• 19.9 Compute an upper bound on the percentage of 10-vertex digraphs that could ever be examined by any computer, under the assumptions described in the text and the additional ones that the universe has less than $10^{80}$ electrons and that the age of the universe will be less than $10^{20}$ years.

19.1 Glossary and Rules of the Game

Our definitions for digraphs are nearly identical to those in Chapter 17 for undirected graphs (as are some of the algorithms and programs that we use), but they are worth restating. The slight differences in the wording to account for edge directions imply structural properties that will be the focus of this chapter.

Definition 19.1 A digraph (or directed graph) is a set of vertices plus a set of directed edges that connect ordered pairs of vertices (with no duplicate edges). We say that an edge goes from its first vertex to its second vertex.

As we did with undirected graphs, we disallow duplicate edges in this definition but reserve the option of allowing them when convenient for various applications and implementations. We explicitly allow self-loops in digraphs (and usually adopt the convention that every vertex has one) because they play a critical role in the basic algorithms.

Definition 19.2 A directed path in a digraph is a list of vertices in which there is a (directed) digraph edge connecting each vertex in the list to its successor in the list. We say that a vertex t is reachable from a vertex s if there is a directed path from s to t.

We adopt the convention that each vertex is reachable from itself and normally implement that assumption by ensuring that self-loops are present in our digraph representations.

Understanding many of the algorithms in this chapter requires understanding the connectivity properties of digraphs and the effect of these properties on the basic process of moving from one vertex to another along digraph edges. Developing such an understanding is more complicated for digraphs than for undirected graphs. For example, we might be able to tell at a glance whether a small undirected graph is connected or contains a cycle; these properties are not as easy to spot in digraphs, as indicated in the typical example illustrated in Figure 19.3.

Figure 19.3 A grid digraph

This small digraph is similar to the large grid network that we first considered in Chapter 1, except that it has a directed edge on every grid line, with the direction randomly chosen. Even though the graph has relatively few nodes, its connectivity properties are not readily apparent. Is there a directed path from the upper left corner to the lower right corner?

While examples like this highlight differences, it is important to note that what a human considers difficult may or may not be relevant to what a program considers difficult; for instance, writing a DFS class to find cycles in digraphs is no more difficult than for undirected graphs. More important, digraphs and graphs have essential structural differences. For example, the fact that t is reachable from s in a digraph indicates nothing about whether s is reachable from t. This distinction is obvious, but critical, as we shall see.

As mentioned in Section 17.3, the representations that we use for digraphs are essentially the same as those that we use for undirected graphs. Indeed, they are more straightforward because we represent each edge just once, as illustrated in Figure 19.4. In the adjacency-lists representation, an edge s-t is represented as a list node containing t in the linked list corresponding to s. In the adjacency-matrix representation, we need to maintain a full V-by-V matrix and to represent an edge s-t by a 1 bit in row s and column t. We do not put a 1 bit in row t and column s unless there is also an edge t-s. In general, the adjacency matrix for a digraph is not symmetric about the diagonal.

Figure 19.4 Digraph representations

The adjacency-matrix and adjacency-lists representations of a digraph have only one representation of each edge, as illustrated in the adjacency-matrix (top) and adjacency-lists (bottom) representation of the graph depicted in Figure 19.1. These representations both include self-loops at every vertex, which is typical in digraph processing.

There is no difference in these representations between an undirected graph and a directed graph with self-loops at every vertex and two directed edges for each edge connecting distinct vertices in the undirected graph (one in each direction). Thus, we can use the algorithms that we develop in this chapter for digraphs to process undirected graphs, provided that we interpret the results appropriately. In addition, we use the programs that we considered in Chapter 17 as the basis for our digraph programs: Our DenseGRAPH and SparseMultiGRAPH class implementations (Programs 17.7 through 17.10) build digraphs when the constructor has true as a second argument.

The indegree of a vertex in a digraph is the number of directed edges that lead to that vertex. The outdegree of a vertex in a digraph is the number of directed edges that emanate from that vertex. No vertex is reachable from a vertex of outdegree 0, which is called a sink; a vertex of indegree 0, which is called a source, is not reachable from any other vertex. A digraph where self-loops are allowed and every vertex has outdegree 1 is called a map (a function from the set of integers from 0 to V−1 onto itself). We can easily compute the indegree and outdegree of each vertex, and find sources and sinks, in linear time and space proportional to V, using vertex-indexed vectors (see Exercise 19.19).
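The degree computation just described can be sketched as follows. This is only an illustration under stated assumptions: the digraph is given as a plain list of (from, to) pairs rather than through the graph ADT of Chapter 17, self-loops are assumed to be left out of that list (otherwise no vertex would have degree 0), and the names Degrees, computeDegrees, sources, and sinks are hypothetical.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Hypothetical helper type; not part of the graph ADT used in the text.
struct Degrees {
    std::vector<int> in;   // in[v]  = number of edges that lead to v
    std::vector<int> out;  // out[v] = number of edges that emanate from v
};

// One pass over the edges fills two vertex-indexed vectors.
Degrees computeDegrees(int V, const std::vector<std::pair<int, int>>& edges) {
    Degrees d{std::vector<int>(V, 0), std::vector<int>(V, 0)};
    for (const auto& e : edges) {
        assert(0 <= e.first && e.first < V && 0 <= e.second && e.second < V);
        ++d.out[e.first];
        ++d.in[e.second];
    }
    return d;
}

// Sources are the vertices of indegree 0.
std::vector<int> sources(const Degrees& d) {
    std::vector<int> result;
    for (int v = 0; v < static_cast<int>(d.in.size()); ++v)
        if (d.in[v] == 0) result.push_back(v);
    return result;
}

// Sinks are the vertices of outdegree 0.
std::vector<int> sinks(const Degrees& d) {
    std::vector<int> result;
    for (int v = 0; v < static_cast<int>(d.out.size()); ++v)
        if (d.out[v] == 0) result.push_back(v);
    return result;
}
```

For the four-vertex digraph with edges 0-1, 0-2, 1-2, and 2-3, the only source is 0 and the only sink is 3.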

Program 19.1 Reversing a digraph

This function adds the edges of the digraph in its first argument to the digraph in its second argument, with their directions reversed. It uses two template parameters, so that the graphs can have different representations.

template <class inGraph, class outGraph>
void reverse(const inGraph &G, outGraph &R)
  {
    // For each vertex v, walk its adjacency list in G.
    for (int v = 0; v < G.V(); v++)
      {
        typename inGraph::adjIterator A(G, v);
        // Each edge v-w of G becomes the edge w-v of R.
        for (int w = A.beg(); !A.end(); w = A.nxt())
          R.insert(Edge(w, v));
      }
  }

The reverse of a digraph is the digraph that we obtain by switching the direction of all the edges. Figure 19.5 shows the reverse and its representations for the digraph of Figure 19.1. We use the reverse in digraph algorithms when we need to know from where edges come because our standard representations tell us only where the edges go. For example, indegree and outdegree change roles when we reverse a digraph.
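Readers who wish to experiment with reversal without the graph ADT of Chapter 17 might start from a sketch like the following. It assumes, purely for illustration, that a digraph is stored as a vector of adjacency vectors; the alias AdjLists and the function reverseOf are hypothetical names, and the code is a standalone analogue of the same idea rather than a replacement for Program 19.1.

```cpp
#include <cassert>
#include <vector>

// Assumed representation: g[v] lists the vertices w with an edge v-w.
using AdjLists = std::vector<std::vector<int>>;

// Build the reverse: every edge v-w of g becomes the edge w-v.
// Takes time proportional to V + E.
AdjLists reverseOf(const AdjLists& g) {
    const int V = static_cast<int>(g.size());
    AdjLists r(V);
    for (int v = 0; v < V; ++v)
        for (int w : g[v]) {
            assert(0 <= w && w < V);
            r[w].push_back(v);
        }
    return r;
}
```

Because indegree and outdegree change roles under reversal, the length of r[v] in the result is the indegree of v in the original digraph.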

Figure 19.5 Digraph reversal

Reversing the edges of a digraph corresponds to transposing the adjacency matrix but requires rebuilding the adjacency lists (see Figures 19.1 and 19.4).

For an adjacency-matrix representation, we could compute the reverse by making a copy of the matrix and transposing it (interchanging its rows and columns). If we know that the graph is not going to be modified, we can actually use the reverse without any extra computation by simply interchanging the vertices in our references to edges when we want to refer to the reverse. For example, an edge s-t in a digraph G is indicated by a 1 in adj[s][t]. Thus, if we were to compute the reverse R of G, it would have a 1 in adj[t][s]. We do not need to do so, however, if we base our client implementations on the edge test edge(s, t), because to switch to the reverse we just replace every such reference by edge(t, s). This opportunity may seem obvious, but it is often overlooked. For an adjacency-lists representation, the reverse is a completely different data structure, and we need to take time proportional to the number of edges to build it, as shown in Program 19.1.

Yet another option, which we shall address in Chapter 22, is to maintain two representations of each edge, in the same manner as we do for undirected graphs (see Section 17.3) but with an extra bit that indicates edge direction. For example, to use this method in an adjacency-lists representation we would represent an edge s-t by a node for t on the adjacency list for s (with the direction bit set to indicate that to move from s to t is a forward traversal of the edge) and a node for s on the adjacency list for t (with the direction bit set to indicate that to move from t to s is a backward traversal of the edge). This representation supports algorithms that need to traverse edges in digraphs in both directions. It is also generally convenient to include pointers connecting the two representations of each edge in such cases. We defer considering this representation in detail to Chapter 22, where it plays an essential role.

In digraphs, by analogy to undirected graphs, we speak of directed cycles, which are directed paths from a vertex back to itself, and simple directed paths and cycles, where the vertices and edges are distinct. Note that s-t-s is a cycle of length 2 in a digraph but that cycles in undirected graphs must have at least three distinct vertices.

In many applications of digraphs, we do not expect to see any cycles, and we work with yet another type of combinatorial object.

Definition 19.3 A directed acyclic graph (DAG) is a digraph with no directed cycles.

We expect DAGs, for example, in applications where we are using digraphs to model precedence relationships. DAGs not only arise naturally in these and other important applications, but also, as we shall see, in the study of the structure of general digraphs. A sample DAG is illustrated in Figure 19.6.

Figure 19.6 A directed acyclic graph (DAG)

This digraph has no cycles, a property that is not immediately apparent from the edge list or even from examining its drawing.

Directed cycles are therefore the key to understanding connectivity in digraphs that are not DAGs. An undirected graph is connected if there is a path from every vertex to every other vertex; for digraphs, we modify the definition as follows:

Definition 19.4 A digraph is strongly connected if every vertex is reachable from every vertex.

The graph in Figure 19.1 is not strongly connected because, for example, there are no directed paths from vertices 9 through 12 to any of the other vertices in the graph.

As indicated by strongly, this definition implies a relationship between each pair of vertices stronger than reachability. In any digraph, we say that a pair of vertices s and t are strongly connected or mutually reachable if there is a directed path from s to t and a directed path from t to s. (Our convention that each vertex is reachable from itself implies that each vertex is strongly connected to itself.) A digraph is strongly connected if and only if all pairs of vertices are strongly connected. The defining property of strongly connected digraphs is one that we take for granted in connected undirected graphs: If there is a path from s to t, then there is a path from t to s. In the case of undirected graphs, we know this fact because the same path fits the bill, traversed in the other direction; in digraphs, it must be a different path.

Another way of saying that a pair of vertices is strongly connected is that they lie on some directed cyclic path. Recall that we use the term cyclic path instead of cycle to indicate that the path does not need to be simple. For example, in Figure 19.1, 5 and 6 are strongly connected because 6 is reachable from 5 via the directed path 5-4-2-0-6 and 5 is reachable from 6 via the directed path 6-4-3-5; and these paths imply that 5 and 6 lie on the directed cyclic path 5-4-2-0-6-4-3-5, but they do not lie on any (simple) directed cycle. Note that no DAG that contains more than one vertex is strongly connected.

Like simple connectivity in undirected graphs, this relation is transitive: If s is strongly connected to t, and t is strongly connected to u, then s is strongly connected to u. Strong connectivity is an equivalence relation that divides the vertices into equivalence classes containing vertices that are strongly connected to each other. (See Section 19.4 for a detailed discussion of equivalence relations.) Again, strong connectivity provides a property for digraphs that we take for granted with respect to connectivity in undirected graphs.
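The definition of mutual reachability translates directly into a brute-force test: search from s for t, then from t for s. The sketch below is one way to write that test, assuming a vector-of-vectors adjacency representation and using a nonrecursive DFS; the names AdjLists, reachable, and stronglyConnected are hypothetical, and the chapter later develops far more efficient methods for answering such queries.

```cpp
#include <cassert>
#include <vector>

// Assumed representation: g[v] lists the vertices w with an edge v-w.
using AdjLists = std::vector<std::vector<int>>;

// True if there is a directed path from s to t (every vertex reaches itself).
// Uses an explicit stack, so deep digraphs do not exhaust the call stack.
bool reachable(const AdjLists& g, int s, int t) {
    const int V = static_cast<int>(g.size());
    assert(0 <= s && s < V && 0 <= t && t < V);
    std::vector<bool> seen(V, false);
    std::vector<int> stack(1, s);
    seen[s] = true;
    while (!stack.empty()) {
        int v = stack.back();
        stack.pop_back();
        if (v == t) return true;
        for (int w : g[v])
            if (!seen[w]) {
                seen[w] = true;
                stack.push_back(w);
            }
    }
    return false;
}

// Two vertices are strongly connected if each is reachable from the other.
bool stronglyConnected(const AdjLists& g, int s, int t) {
    return reachable(g, s, t) && reachable(g, t, s);
}
```

Each query costs time proportional to V + E in the worst case, which is acceptable for a handful of queries but not as the basis of an ADT operation that is called repeatedly.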

Figure 19.7 Digraph terminology

Sources (vertices with no edges coming in) and sinks (vertices with no edges going out) are easy to identify in digraph drawings like this one, but directed cycles and strongly connected components are more difficult to identify. What is the longest directed cycle in this digraph? How many strongly connected components with more than one vertex are there?

Property 19.1 A digraph that is not strongly connected comprises a set of strongly connected components (strong components, for short), which are maximal strongly connected subgraphs, and a set of directed edges that go from one component to another.

Proof: Like components in undirected graphs, strong components in digraphs are induced subgraphs of subsets of vertices: Each vertex is in exactly one strong component. To prove this fact, we first note that every vertex belongs to at least one strong component, which contains (at least) the vertex itself. Then we note that every vertex belongs to at most one strong component: If a vertex were to belong to two different components, then there would be paths through that vertex connecting vertices in those components to each other, in both directions, which contradicts the maximality of both components.

For example, a digraph that consists of a single directed cycle has just one strong component. At the other extreme, each vertex in a DAG is a strong component, so each edge in a DAG goes from one component to another. In general, not all edges in a digraph are in the strong components. This situation is in contrast to the analogous situation for connected components in undirected graphs, where every vertex and every edge belongs to some connected component, but similar to the analogous situation for edge-connected components in undirected graphs. The strong components in a digraph are connected by edges that go from a vertex in one component to a vertex in another but do not go back again.

Property 19.2 Given a digraph D, define another digraph K(D) with one vertex corresponding to each strong component of D and one edge in K(D) corresponding to each edge in D that connects vertices in different strong components (connecting the vertices in K that correspond to the strong components that it connects in D). Then, K(D) is a DAG (which we call the kernel DAG of D).

Proof: If K(D) were to have a directed cycle, then vertices in two different strong components of D would fall on a directed cycle, and that would be a contradiction.

Figure 19.8 shows the strong components and the kernel DAG for a sample digraph. We look at algorithms for finding strong components and building kernel DAGs in Section 19.6.
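Once the strong components are known, the construction in Property 19.2 is mechanical. The sketch below assumes that a vertex-indexed vector id already gives the component number of each vertex (how to compute it is the subject of Section 19.6) and that the digraph is available as a list of (from, to) pairs; the function name kernelEdges is hypothetical. As a design choice that departs slightly from the property as stated, the sketch collapses parallel kernel edges into one by storing them in a set.

```cpp
#include <cassert>
#include <set>
#include <utility>
#include <vector>

// Given id[v] = strong-component number of vertex v and the edges of D,
// return the edges of the kernel DAG K(D), with duplicates collapsed.
std::set<std::pair<int, int>> kernelEdges(
        const std::vector<int>& id,
        const std::vector<std::pair<int, int>>& edges) {
    const int V = static_cast<int>(id.size());
    std::set<std::pair<int, int>> kernel;
    for (const auto& e : edges) {
        assert(0 <= e.first && e.first < V && 0 <= e.second && e.second < V);
        int a = id[e.first], b = id[e.second];
        if (a != b) kernel.insert({a, b});  // skip edges inside a component
    }
    return kernel;
}
```

For instance, if vertices 0 and 1 form one strong component (id 0) and vertices 2 and 3 form another (id 1), then the edges 1-2 and 0-3 both map to the single kernel edge 0-1.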

Figure 19.8 Strong components and kernel DAG

This digraph (top) consists of four strong components, as identified (with arbitrary integer labels) by the vertex-indexed array id (center). Component 0 consists of the vertices 9, 10, 11 and 12; component 1 consists of the single vertex 1; component 2 consists of the vertices 0, 2, 3, 4, 5, and 6; and component 3 consists of the vertices 7 and 8. If we draw the graph defined by the edges between different components, we get a DAG (bottom).

From these definitions, properties, and examples, it is clear that we need to be precise when referring to paths in digraphs. We need to consider at least the following three situations:

Connectivity We reserve the term connected for undirected graphs. In digraphs, we might say that two vertices are connected if they are connected in the undirected graph defined by ignoring edge directions, but we generally avoid such usage.

Reachability In digraphs, we say that vertex t is reachable from vertex s if there is a directed path from s to t. We generally avoid the term reachable when referring to undirected graphs, although we might consider it to be equivalent to connected because the idea of one vertex being reachable from another is intuitive in certain undirected graphs (for example, those that represent mazes).

Strong connectivity Two vertices in a digraph are strongly connected if they are mutually reachable; in undirected graphs, two connected vertices imply the existence of paths from each to the other. Strong connectivity in digraphs is similar in certain ways to edge connectivity in undirected graphs.

We wish to support digraph ADT operations that take two vertices s and t as arguments and allow us to test whether
• t is reachable from s
• s and t are strongly connected (mutually reachable)

What resource requirements are we willing to expend for these operations? As we saw in Section 17.5, DFS provides a simple solution for connectivity in undirected graphs that takes time proportional to V, but if we are willing to invest preprocessing time proportional to V + E and space proportional to V, we can answer connectivity queries in constant time. Later in this chapter, we examine algorithms for strong connectivity that have these same performance characteristics.

But our primary aim is to address the fact that reachability queries in digraphs are more difficult to handle than connectivity or strong connectivity queries. In this chapter, we examine classical algorithms that require preprocessing time proportional to VE and space proportional to $V^2$, develop implementations that can achieve constant-time reachability queries with linear space and preprocessing time for some digraphs, and study the difficulty of achieving this optimal performance for all digraphs.
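To illustrate the tradeoff between preprocessing and query time, the following sketch precomputes a V-by-V table of reachability by running one DFS from each vertex, after which both queries take constant time. It is only one simple way to meet the interface described above, not necessarily the method that the chapter goes on to develop; the class name Reachability, its member names, and the vector-of-vectors representation are all assumptions made for illustration.

```cpp
#include <cassert>
#include <vector>

// Assumed representation: g[v] lists the vertices w with an edge v-w.
using AdjLists = std::vector<std::vector<int>>;

// Hypothetical class: preprocessing time proportional to V(V + E),
// space proportional to V^2, constant-time queries afterward.
class Reachability {
    std::vector<std::vector<bool>> reach_;  // reach_[s][t]: is t reachable from s?

    // Recursive DFS that marks everything reachable from s.
    void dfs(const AdjLists& g, int s, int v) {
        reach_[s][v] = true;
        for (int w : g[v])
            if (!reach_[s][w]) dfs(g, s, w);
    }

public:
    explicit Reachability(const AdjLists& g)
        : reach_(g.size(), std::vector<bool>(g.size(), false)) {
        for (int s = 0; s < static_cast<int>(g.size()); ++s) dfs(g, s, s);
    }

    bool reachable(int s, int t) const {
        const int V = static_cast<int>(reach_.size());
        assert(0 <= s && s < V && 0 <= t && t < V);
        return reach_[s][t];
    }

    bool stronglyConnected(int s, int t) const {
        return reachable(s, t) && reachable(t, s);
    }
};
```

Starting each DFS by marking the start vertex implements the convention that every vertex is reachable from itself, without requiring explicit self-loops in the input.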

Exercises

• 19.10 Give the adjacency-lists structure that is built by Program 17.9 for the digraph

3-7 1-4 7-8 0-5 5-2 3-8 2-9 0-6 4-9 2-6 6-4.

• 19.11 Write a program to generate random sparse digraphs for a well-chosen set of values of V and E such that you can use it to run meaningful empirical tests on digraphs drawn from the random-edges model.

• 19.12 Write a program to generate random sparse graphs for a well-chosen set of values of V and E such that you can use it to run meaningful empirical tests on graphs drawn from the random-graph model.

• 19.13 Write a program that generates random digraphs by connecting vertices arranged in a $\sqrt{V}$-by-$\sqrt{V}$ grid to their neighbors, with edge directions randomly chosen (see Figure 19.3).

• 19.14 Augment your program from Exercise 19.13 to add R extra random edges (all possible edges equally likely). For large R, shrink the grid so that the total number of edges remains about V. Test your program as described in Exercise 19.11.

• 19.15 Modify your program from Exercise 19.14 such that an extra edge goes from a vertex s to a vertex t with probability inversely proportional to the Euclidean distance between s and t.

• 19.16 Write a program that generates V random intervals in the unit interval, all of length d, then builds a digraph with an edge from interval s to interval t if and only if at least one of the endpoints of s falls within t (see Exercise 17.75). Determine how to set d so that the expected number of edges is E. Test your program as described in Exercise 19.11 (for low densities) and as described in Exercise 19.12 (for high densities).

• 19.17 Write a program that chooses V vertices and E edges from the real digraph that you found for Exercise 19.1. Test your program as described in Exercise 19.11 (for low densities) and as described in Exercise 19.12 (for high densities).

• 19.18 Write a program that produces each of the possible digraphs with V vertices and E edges with equal likelihood (see Exercise 17.70). Test your program as described in Exercise 19.11 (for low densities) and as described in Exercise 19.12 (for high densities).

• 19.19 Implement a class that provides clients with the capability to learn the indegree and outdegree of any given vertex in a digraph, in constant time, after linear-time preprocessing in the constructor. Then add member functions that return the number of sources and sinks, in constant time.

• 19.20 Use your program from Exercise 19.19 to find the average number of sources and sinks in various types of digraphs (see Exercises 19.11–18).

• 19.21 Show the adjacency-lists structure that is produced when you use Program 19.1 to find the reverse of the digraph

3-7 1-4 7-8 0-5 5-2 3-8 2-9 0-6 4-9 2-6 6-4.

• 19.22 Characterize the reverse of a map.

19.23 Design a digraph class that explicitly provides clients with the capability to refer to both a digraph and its reverse, and provide an implementation, for any representation that supports edge queries.

19.24 Provide an alternate implementation for your class in Exercise 19.23 that maintains both orientations of edges on adjacency lists.

• 19.25 Describe a family of strongly connected digraphs with V vertices and no (simple) directed cycles of length greater than 2.

• 19.26 Give the strong components and a kernel DAG of the digraph

3-7 1-4 7-8 0-5 5-2 3-8 2-9 0-6 4-9 2-6 6-4.

• 19.27 Give a kernel DAG of the grid digraph shown in Figure 19.3.

19.28 How many digraphs have V vertices, all of outdegree k?

• 19.29 What is the expected number of different adjacency-lists representations of a random digraph? Hint: Divide the total number of possible representations by the total number of digraphs.

19.2 Anatomy of DFS in Digraphs

We can use our DFS code for undirected graphs from Chapter 18 to visit each edge and each vertex in a digraph. The basic principle behind the recursive algorithm holds: To visit every vertex that can be reached from a given vertex, we mark the vertex as having been visited, then (recursively) visit all the vertices that can be reached from each of the vertices on its adjacency list.

In an undirected graph, we have two representations of each edge, but the second representation that is encountered in a DFS always leads to a marked vertex and is ignored (see Section 18.2). In a digraph, we have just one representation of each edge, so we might expect DFS algorithms to be more straightforward. But digraphs themselves are more complicated combinatorial objects than undirected graphs, so this expectation is not justified. For example, the search trees that we use to understand the operation of the algorithm have a more complicated structure for digraphs than for undirected graphs. This complication makes digraph-processing algorithms more difficult to devise. For example, as we will see, it is more difficult to make inferences about directed paths in digraphs than it is to make inferences about paths in graphs.

As we did in Chapter 18, we use the term standard adjacency-lists DFS to refer to the process of inserting a sequence of edges into a digraph ADT implemented with an adjacency-lists representation (Program 17.9, invoked with true as the constructor’s second argument), then doing a DFS with, for example, Program 18.3, and the parallel term standard adjacency-matrix DFS to refer to the process of inserting a sequence of edges into a digraph ADT implemented with an adjacency-matrix representation (Program 17.7, invoked with true as the constructor’s second argument), then doing a DFS with, for example, Program 18.3.

For example, Figure 19.9 shows the recursive-call tree that describes the operation of a standard adjacency-lists DFS on the sample digraph in Figure 19.1. Just as for undirected graphs, such trees have internal nodes that correspond to calls on the recursive DFS function for each vertex, with links to external nodes that correspond to edges that take us to vertices that have already been seen. Classifying the nodes and links gives us information about the search (and the digraph), but the classification for digraphs is quite different from the classification for undirected graphs.

Figure 19.9 DFS forest for a digraph

This forest describes a standard adjacency-lists DFS of the sample digraph in Figure 19.1. External nodes represent previously visited internal nodes with the same label; otherwise the forest is a representation of the digraph, with all edges pointing down. There are four types of edges: tree edges, to internal nodes; back edges, to external nodes representing ancestors (shaded circles); down edges, to external nodes representing descendants (shaded squares); and cross edges, to external nodes representing nodes that are neither ancestors nor descendants (white squares). We can determine the type of edges to visited nodes by comparing the preorder and postorder numbers (bottom) of their source and destination. For example, 7-6 is a cross edge because 7’s preorder and postorder numbers are both larger than 6’s.

In undirected graphs, we assigned each link in the DFS tree to one of four classes according to whether it corresponded to a graph edge that led to a recursive call and to whether it corresponded to the first or second representation of the edge encountered by the DFS. In digraphs, there is a one-to-one correspondence between tree links and graph edges, and they fall into four distinct classes:

• Those representing a recursive call (tree edges)
• Those from a vertex to an ancestor in its DFS tree (back edges)
• Those from a vertex to a descendant in its DFS tree (down edges)
• Those from a vertex to another vertex that is neither an ancestor nor a descendant in its DFS tree (cross edges)

A tree edge is an edge to an unvisited vertex, corresponding to a recursive call in the DFS. Back, cross, and down edges go to visited vertices. To identify the type of a given edge, we use preorder and postorder numbering (the order in which nodes are visited in preorder and postorder walks of the forest, respectively).

Property 19.3 In a DFS forest corresponding to a digraph, an edge to a visited node is a back edge if it leads to a node with a higher postorder number; otherwise, it is a cross edge if it leads to a node with a lower preorder number and a down edge if it leads to a node with a higher preorder number.

Proof: These facts follow from the definitions. A node’s ancestors in a DFS tree have lower preorder numbers and higher postorder numbers; its descendants have higher preorder numbers and lower postorder numbers. It is also true that both numbers are lower in previously visited nodes in other DFS trees, and both numbers are higher in yet-to-be-visited nodes in other DFS trees, but we do not need code that tests for these cases.

Program 19.2 is a DFS class that identifies the type of each edge in the digraph. Figure 19.10 illustrates its operation on the example digraph of Figure 19.1. During the search, testing to see whether an edge leads to a node with a higher postorder number is equivalent to testing whether a postorder number has yet been assigned. Any node for which a preorder number has been assigned but for which a postorder number has not yet been assigned is an ancestor in the DFS tree and will therefore have a postorder number higher than that of the current node.

Figure 19.10 Digraph DFS trace

This DFS trace is the output of Program 19.2 for the example digraph in Figure 19.1. It corresponds precisely to a preorder walk of the DFS tree depicted in Figure 19.9.

As we saw in Chapter 18 for undirected graphs, the edge types are properties of the dynamics of the search, rather than of only the graph. Indeed, different DFS forests of the same graph can differ remarkably in character, as illustrated in Figure 19.11. For example, even the number of trees in the DFS forest depends upon the start vertex.

Despite these differences, several classical digraph-processing algorithms are able to determine digraph properties by taking appropriate action when they encounter the various types of edges during a DFS. For example, consider the following basic problem:

Directed cycle detection Does a given digraph have any directed cycles? (Is the digraph a DAG?) In undirected graphs, any edge to a visited vertex indicates a cycle in the graph; in digraphs, we must restrict our attention to back edges.

Program 19.2 DFS of a digraph

This DFS class uses preorder and postorder numberings to show the role that each edge in the graph plays in the DFS (see Figure 19.10).
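The classification rule of Property 19.3 can be sketched as follows. This is not the book's Program 19.2 and does not use the book's graph ADT: it assumes a digraph stored as plain adjacency lists (`std::vector<std::vector<int>>`), and the names `ClassifiedEdge`, `EdgeClassifier`, `edges`, and `isDAG` are hypothetical, chosen only for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Illustrative names only; this is not the book's Program 19.2.
struct ClassifiedEdge {
    int v, w;          // the directed edge v-w
    std::string type;  // "tree", "back", "down", or "cross"
};

class EdgeClassifier {
    const std::vector<std::vector<int>>& adj;  // adjacency lists (assumed representation)
    std::vector<int> pre, post;                // -1 means "not yet assigned"
    int cntPre = 0, cntPost = 0;
    std::vector<ClassifiedEdge> out;

    void dfs(int v) {
        pre[v] = cntPre++;
        for (int w : adj[v]) {
            if (pre[w] == -1) {            // unvisited: leads to a recursive call
                out.push_back({v, w, "tree"});
                dfs(w);
            } else if (post[w] == -1) {    // visited, not finished: w is an ancestor
                out.push_back({v, w, "back"});
            } else if (pre[w] > pre[v]) {  // finished, entered after v: a descendant
                out.push_back({v, w, "down"});
            } else {                       // finished, entered before v
                out.push_back({v, w, "cross"});
            }
        }
        post[v] = cntPost++;
    }

public:
    explicit EdgeClassifier(const std::vector<std::vector<int>>& g)
        : adj(g), pre(g.size(), -1), post(g.size(), -1) {
        for (std::size_t v = 0; v < g.size(); v++)
            for (int w : g[v])
                assert(w >= 0 && static_cast<std::size_t>(w) < g.size());
        // Top-level search: start a new DFS tree at each unvisited vertex.
        for (std::size_t v = 0; v < g.size(); v++)
            if (pre[v] == -1) dfs(static_cast<int>(v));
    }

    // Edges in the order the DFS examined them.
    const std::vector<ClassifiedEdge>& edges() const { return out; }

    // Property 19.4: the digraph is a DAG if and only if there is no back edge.
    bool isDAG() const {
        for (const ClassifiedEdge& e : out)
            if (e.type == "back") return false;
        return true;
    }
};
```

For the digraph with edges 0-1, 0-2, 1-2, 2-0, and 3-2, a client would construct `EdgeClassifier c(g)` and read the types from `c.edges()`. Testing `post[w] == -1` rather than comparing postorder numbers follows the observation in the text that an unassigned postorder number identifies an ancestor.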

Property 19.4 A digraph is a DAG if and only if we encounter no back edges when we use DFS to examine every edge.

Proof: Any back edge belongs to a directed cycle that consists of the edge plus the tree path connecting the two nodes, so we will find no back edges when using DFS on a DAG. To prove the converse, we show that if the digraph has a cycle, then the DFS encounters a back edge. Suppose that v is the first of the vertices on the cycle that is visited by the DFS. That vertex has the lowest preorder number of all the vertices on the cycle. The edge that points to it will therefore be a back edge: It will be encountered during the recursive call for v (for a proof that it must be, see Property 19.5), and it points from some node on the cycle to v, a node with a lower preorder number (see Property 19.3).

Figure 19.11 DFS forests for a digraph

These forests describe depth-first searches of the same graph as Figure 19.9, when the graph search function checks the vertices (and calls the recursive function for the unvisited ones) in the order s, s+1, …, V-1, 0, 1, …, s-1 for each s. The forest structure is determined both by the search dynamics and the graph structure. Each node has the same children (the nodes on its adjacency list, in order) in every forest. The leftmost tree in each forest contains all the nodes reachable from its root, but reachability inferences about other nodes are complicated because of back, cross, and down edges. Even the number of trees in the forest depends on the starting node, so we do not necessarily have a direct correspondence between trees in the forest and strong components, the way that we did for components in undirected graphs. For example, we see that all vertices are reachable from 8 only when we start the DFS at 8.

We can convert any digraph into a DAG by doing a DFS and removing any graph edges that correspond to back edges in the DFS. For example, Figure 19.9 tells us that removing the edges 2-0, 3-5, 2-3, 9-11, 10-12, 4-2, and 8-7 makes the digraph in Figure 19.1 a DAG. The specific DAG that we get in this way depends on the graph representation and the associated implications for the dynamics of the DFS (see Exercise 19.37). This method is a useful way to generate large arbitrary DAGs randomly (see Exercise 19.76) for use in testing DAG-processing algorithms.

Directed cycle detection is a simple problem, but contrasting the solution just described with the solution that we considered in Chapter 18 for undirected graphs gives insight into the necessity of considering the two types of graphs as different combinatorial objects, even though their representations are similar and the same programs work on both types for some applications. By our definitions, we seem to be using the same method to solve this problem as for cycle detection in undirected graphs (look for back edges), but the implementation that we used for undirected graphs would not work for digraphs. For example, in Section 18.5 we were careful to distinguish between parent links and back links since the existence of a parent link does not indicate a cycle (cycles in undirected graphs must involve at least three vertices). But to ignore links back to a node’s

parents in digraphs would be incorrect; we do consider a doubly-connected pair of vertices in a digraph to be a cycle. Theoretically, we could have defined back edges in undirected graphs in the same way as we have done here, but then we would have needed an explicit exception for the two-vertex case. More important, we can detect cycles in undirected graphs in time proportional to V (see Section 18.5), but we may need time proportional to E to find a cycle in a digraph (see Exercise 19.32).

The essential purpose of DFS is to provide a systematic way to visit all the vertices and all the edges of a graph. It therefore gives us a basic approach for solving reachability problems in digraphs, although, again, the situation is more complicated than for undirected graphs.

Single-source reachability Which vertices in a given digraph can be reached from a given start vertex s? How many such vertices are there?

Property 19.5 With a recursive DFS starting at s, we can solve the single-source reachability problem for a vertex s in time proportional to the number of edges in the subgraph induced by the reachable vertices.

Proof: This proof is essentially the same as the proof of Property 18.1, but it is worth restating to underline the distinction between reachability in digraphs and connectivity in undirected graphs. The property is certainly true for a digraph that has one vertex and no edges. For any digraph that has more than one vertex, we assume the property to be true for all digraphs that have fewer vertices. Now, the first edge that we take from s divides the digraph into the subgraphs induced by two subsets of vertices (see Figure 19.12): (i) the vertices that we can reach by directed paths that begin with that edge and do not otherwise include s; and (ii) the vertices that we cannot reach with a directed path that begins with that edge without returning to s. We apply the inductive hypothesis to these subgraphs, noting that there are no directed edges from a vertex in the first subgraph to any vertex other than s in the second subgraph (such an edge would be a contradiction because its destination vertex should be in the first subgraph), that directed edges to s will be ignored because it has the lowest preorder number, and that all the vertices in the first subgraph have lower preorder numbers than any vertex in the second subgraph, so all directed edges from a vertex in the second subgraph to a vertex in the first subgraph will be ignored.
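Property 19.5 translates into very little code. The following is a minimal sketch, assuming a digraph stored as plain adjacency lists rather than the book's graph ADT; the names `AdjLists`, `markReachable`, `reachableFrom`, and `countReachable` are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using AdjLists = std::vector<std::vector<int>>;  // assumed representation

// Recursive DFS: mark v, then visit every unmarked vertex on its adjacency list.
// Only edges that leave a reachable vertex are ever examined.
void markReachable(const AdjLists& adj, int v, std::vector<bool>& seen) {
    seen[v] = true;
    for (int w : adj[v])
        if (!seen[w]) markReachable(adj, w, seen);
}

// Which vertices can be reached from s?
std::vector<bool> reachableFrom(const AdjLists& adj, int s) {
    assert(s >= 0 && static_cast<std::size_t>(s) < adj.size());
    std::vector<bool> seen(adj.size(), false);
    markReachable(adj, s, seen);
    return seen;
}

// How many such vertices are there? (s itself is counted.)
int countReachable(const AdjLists& adj, int s) {
    int n = 0;
    for (bool b : reachableFrom(adj, s))
        if (b) n++;
    return n;
}
```

On the digraph with edges 0-1, 1-2, 2-0, and 3-0, a search from 0 never learns anything about vertex 3, while a search from 3 reaches every vertex. That asymmetry is the distinction between reachability and connectivity that the proof underlines.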

Figure 19.12 Decomposing a digraph

To prove by induction that DFS takes us everywhere reachable from a given node in a digraph, we use essentially the same proof as for Trémaux exploration. The key step is depicted here as a maze (top), for comparison with Figure 18.4. We break the graph into two smaller pieces (bottom), induced by two sets of vertices: those vertices that can be reached by following the first edge from the start vertex without revisiting it (bottom piece), and those vertices that cannot be reached by following the first edge without going back through the start vertex (top piece). Any edge that goes from a vertex in the first set to the start vertex is skipped during the search of the first set because of the mark on the start vertex. Any edge that goes from a vertex in the second set to a vertex in the first set is skipped because all vertices in the first set are marked before the search of the second subgraph begins.

By contrast with undirected graphs, a DFS on a digraph does not give full information about reachability from any vertex other than the start node, because tree edges are directed and because the search structures have cross edges. When we leave a vertex to travel down a tree edge, we cannot assume that there is a way to get back to that vertex via digraph edges; indeed, there is not, in general. For example, there is no way to get back to 4 after we take the tree edge 4-11 in Figure 19.9. Moreover, when we ignore cross and down edges (because they lead to vertices that have been visited and are no longer active), we are ignoring information that they imply (the set of vertices that are reachable from the destination). For example, following the cross edge 6-9 in Figure 19.9 is the only way for us to find out that 10, 11, and 12 are reachable from 6.

To determine which vertices are reachable from another vertex, we apparently need to start over with a new DFS from that vertex (see Figure 19.11). Can we make use of information from previous searches to make the process more efficient for later ones? We consider such reachability questions in Section 19.7.

To determine connectivity in undirected graphs, we rely on knowing that vertices are connected to their ancestors in the DFS tree, through (at least) the path in the tree. By contrast, the tree path goes in the wrong direction in a digraph: There is a directed path from a vertex in a digraph to an ancestor only if there is a back edge from a descendant to that or a more distant ancestor. Moreover, connectivity in undirected graphs for each vertex is restricted to the DFS tree rooted at that vertex; in contrast, in digraphs, cross edges can take us to any previously visited part of the search structure, even one in another tree in the DFS forest. For undirected graphs, we were able to take advantage of these properties of connectivity to identify each vertex with a connected component in a single DFS, then to use that information as the basis for a constant-time ADT operation to determine whether any two vertices are connected. For digraphs, as we see in this chapter, this goal is elusive.

We have emphasized throughout this and the previous chapter that different ways of choosing unvisited vertices lead to different search dynamics for DFS. For digraphs, the structural complexity of the DFS trees leads to differences in search dynamics that are even more pronounced than those we saw for undirected graphs. For example, Figure 19.11 illustrates that we get marked differences for digraphs even when we simply vary the order in which the vertices are examined in the top-level search function. Only a tiny fraction of even these possibilities is shown in the figure; in principle, each of the V! different orders of examining vertices might lead to different results. In Section 19.7, we shall examine an important algorithm that specifically takes advantage of this flexibility, processing the unvisited vertices at the top level (the roots of the DFS trees) in a particular order that immediately exposes the strong components.

Exercises

19.30 Draw the DFS forest that results from a standard adjacency-lists DFS of the digraph
3-7 1-4 7-8 0-5 5-2 3-8 2-9 0-6 4-9 2-6 6-4.

19.31 Draw the DFS forest that results from a standard adjacency-matrix DFS of the digraph
3-7 1-4 7-8 0-5 5-2 3-8 2-9 0-6 4-9 2-6 6-4.

• 19.32 Describe a family of digraphs with V vertices and E edges for which a standard adjacency-lists DFS requires time proportional to E for cycle detection.

• 19.33 Show that, during a DFS in a digraph, no edge connects a node to another node whose preorder and postorder numbers are both smaller.

• 19.34 Show all possible DFS forests for the digraph
0-1 0-2 0-3 1-3 2-3.
Tabulate the number of tree, back, cross, and down edges for each forest.

19.35 If we denote the number of tree, back, cross, and down edges by t, b, c, and d, respectively, then we have t + b + c + d = E and t < V for any DFS of any digraph with V vertices and E edges. What other relationships among these variables can you infer? Which of the values are dependent solely on graph properties, and which are dependent on dynamic properties of the DFS?

• 19.36 Prove that every source in a digraph must be a root of some tree in the forest corresponding to any DFS of that digraph.

• 19.37 Construct a connected DAG that is a subgraph of Figure 19.1 by removing five edges (see Figure 19.11).

19.38 Implement a digraph class that provides the capability for a client to check that a digraph is indeed a DAG, and provide a DFS-based implementation.

19.39 Use your solution to Exercise 19.38 to estimate (empirically) the probability that a random digraph with V vertices and E edges is a DAG for various types of digraphs (see Exercises 19.11–18).

19.40 Run empirical studies to determine the relative percentages of tree, back, cross, and down edges when we run DFS on various types of digraphs (see Exercises 19.11–18).

19.41 Describe how to construct a sequence of directed edges on V vertices for which there will be no cross or down edges and for which the number of back edges will be proportional to V^2 in a standard adjacency-lists DFS.

• 19.42 Describe how to construct a sequence of directed edges on V vertices for which there will be no back or down edges and for which the number of cross edges will be proportional to V^2 in a standard adjacency-lists DFS.

19.43 Describe how to construct a sequence of directed edges on V vertices for which there will be no back or cross edges and for which the number of down edges will be proportional to V^2 in a standard adjacency-lists DFS.

• 19.44 Give rules corresponding to Trémaux traversal for a maze where all the passages are one-way.

• 19.45 Extend your solutions to Exercises 17.56 through 17.60 to include arrows on edges (see the figures in this chapter for examples).

19.3 Reachability and Transitive Closure

To develop efficient solutions to reachability problems in digraphs, we begin with the following fundamental definition.

Definition 19.5 The transitive closure of a digraph is a digraph with the same vertices but with an edge from s to t in the transitive closure if and only if there is a directed path from s to t in the given digraph.

In other words, the transitive closure has an edge from each vertex to all the vertices reachable from that vertex in the digraph. Clearly, the transitive closure embodies all the requisite information for solving reachability problems. Figure 19.13 illustrates a small example.

Figure 19.13 Transitive closure

This digraph (top) has just eight directed edges, but its transitive closure (bottom) shows that there are directed paths connecting 19 of the 30 pairs of vertices. Structural properties of the digraph are reflected in the transitive closure. For example, rows 0, 1, and 2 in the adjacency matrix for the transitive closure are identical (as are columns 0, 1, and 2) because those vertices are on a directed cycle in the digraph.

One appealing way to understand the transitive closure is based on adjacency-matrix digraph representations, and on the following basic computational problem.

Boolean matrix multiplication A Boolean matrix is a matrix whose entries are all binary values, either 0 or 1. Given two Boolean matrices A and B, compute a Boolean product matrix C, using the logical and and or operations instead of the arithmetic operations * and +, respectively.

The textbook algorithm for computing the product of two V-by-V matrices computes, for each s and t, the dot product of row s in the first matrix and column t in the second matrix, as follows:

  for (s = 0; s < V; s++)
    for (t = 0; t < V; t++)
      for (i = 0, C[s][t] = 0; i < V; i++)
        C[s][t] += A[s][i]*B[i][t];

This operation is defined for matrices comprising any type of entry for which 0, +, and * are defined. In particular, if we interpret a+b to be the logical or operation and a*b to be the logical and operation, then we have Boolean matrix multiplication. In C++, we can use the following version:

  for (s = 0; s < V; s++)
    for (t = 0; t < V; t++)
      for (i = 0, C[s][t] = 0; i < V; i++)
        if (A[s][i] && B[i][t]) C[s][t] = 1;

To compute C[s][t] in the product, we initialize it to 0, then set it to 1 if we find some value i for which A[s][i] and B[i][t] are both 1. Running this computation is equivalent to setting C[s][t] to 1 if and only if the result of a bitwise logical and of row s in A with column t in B has a nonzero entry.

Now suppose that A is the adjacency matrix of a digraph and that we use the preceding code to compute C = A * A = A^2 (simply by changing the reference to B in the code into a reference to A). Reading the code in terms of the interpretation of the adjacency-matrix entries immediately tells us what it computes: For each pair of vertices s and t, we put an edge from s to t in C if and only if there is some vertex i for which there is both an edge from s to i and an edge from i to t in the digraph. In other words, directed edges in A^2 correspond precisely to directed paths of length 2 in the digraph. If we include self-loops at every vertex, then A^2 also has the edges of A; otherwise, it does not. This relationship between Boolean matrix multiplication and paths in digraphs is illustrated in Figure 19.14. It leads immediately to an elegant method for computing the transitive closure of any digraph.

Figure 19.14 Squaring an adjacency matrix

If we put 0s on the diagonal of a digraph’s adjacency matrix, the square of the matrix represents a graph with an edge corresponding to each path of length 2 (top). If we put 1s on the diagonal, the square of the matrix represents a graph with an edge corresponding to each path of length 1 or 2 (bottom).

Property 19.6 We can compute the transitive closure of a digraph by constructing the latter’s adjacency matrix A, adding self-loops for every vertex, and computing A^V.

Proof: Continuing the argument in the previous paragraph, A^3 has an edge for every path of length less than or equal to 3 in the digraph, A^4 has an edge for every path of length less than or equal to 4 in the digraph, and so forth. We do not need to consider paths of length greater than V because of the pigeonhole principle: Any such path must revisit some vertex (since there are only V of them) and therefore adds no information to the transitive closure because the same two vertices are connected by a directed path of length less than V (which we could obtain by removing the cycle to the revisited vertex).

Figure 19.15 shows the adjacency-matrix powers for a sample digraph converging to transitive closure. This method takes V matrix multiplications, each of which takes time proportional to V^3, for a grand total of V^4. We can actually compute the transitive closure for any digraph with just ⌈lg V⌉ Boolean matrix-multiplication operations: We compute A^2, A^4, A^8, … until we reach an exponent greater than or equal to V. As shown in the proof of Property 19.6, A^t = A^V for any t > V; so the result of this computation, which requires time proportional to V^3 lg V, is A^V, the transitive closure.
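The repeated-squaring computation can be sketched as follows. This is a minimal illustration under stated assumptions, not code from the book: matrices are stored as `std::vector<std::vector<char>>` with entries 0 or 1, and the names `BoolMatrix`, `boolProduct`, and `closureBySquaring` are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using BoolMatrix = std::vector<std::vector<char>>;  // entries are 0 or 1

// Boolean product: C[s][t] = 1 iff A[s][i] and B[i][t] are both 1 for some i.
BoolMatrix boolProduct(const BoolMatrix& A, const BoolMatrix& B) {
    const std::size_t V = A.size();
    assert(B.size() == V);
    BoolMatrix C(V, std::vector<char>(V, 0));
    for (std::size_t s = 0; s < V; s++)
        for (std::size_t t = 0; t < V; t++)
            for (std::size_t i = 0; i < V; i++)
                if (A[s][i] && B[i][t]) { C[s][t] = 1; break; }
    return C;
}

// Transitive closure by repeated squaring (Property 19.6 and the remark after it).
BoolMatrix closureBySquaring(BoolMatrix A) {
    const std::size_t V = A.size();
    for (std::size_t v = 0; v < V; v++) A[v][v] = 1;  // add self-loops
    // Invariant: A has an edge for every path of length <= power.
    for (std::size_t power = 1; power < V; power *= 2)
        A = boolProduct(A, A);
    return A;
}
```

The self-loops are what make squaring cumulative: with 1s on the diagonal, each squaring keeps the edges already found and doubles the path length covered, so the loop stops after about lg V products.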

Figure 19.15 Adjacency matrix powers and directed paths

This sequence shows the first, second, third, and fourth powers (right, top to bottom) of the adjacency matrix at the top right, which give graphs with edges for each of the paths of lengths at most 1, 2, 3, and 4, respectively (left, top to bottom), in the graph that the matrix represents. The bottom graph is the transitive closure for this example, since there are no paths of length greater than 4 that connect vertices not connected by shorter paths.

Although the approach just described is appealing in its simplicity, an even simpler method is available. We can compute the transitive closure with just one operation of this kind, building up the transitive closure from the adjacency matrix in place, as follows:

  for (i = 0; i < V; i++)
    for (s = 0; s < V; s++)
      for (t = 0; t < V; t++)
        if (A[s][i] && A[i][t]) A[s][t] = 1;

This classical method, invented by S. Warshall in 1962, is the method of choice for computing the transitive closure of dense digraphs. The code is similar to the code that we might try to use to square a Boolean matrix in place: The difference (which is significant!) lies in the order of the for loops.

Property 19.7 With Warshall’s algorithm, we can compute the transitive closure of a digraph in time proportional to V^3.

Proof: The running time is immediately evident from the structure of the code. We prove that it computes the transitive closure by induction on i. After the first iteration of the loop, the matrix has a 1 in row s and column t if and only if we have either the paths s-t or s-0-t. The second iteration checks all the paths between s and t that include 1 and perhaps 0, such as s-1-t, s-1-0-t, and s-0-1-t. We are led to the following inductive hypothesis: The ith iteration of the loop sets the bit in row s and column t in the matrix to 1 if and only if there is a directed path from s to t in the digraph that does not include any vertices with indices greater than i (except possibly the endpoints s and t). As just argued, the condition is true when i is 0, after the first iteration of the loop. Assuming that it is true for the ith iteration of the loop, there is a path from s to t that does not include any vertices with indices greater than i+1 if and only if (i) there is a path from s to t that does not include any vertices with indices greater than i, in which case A[s][t] was set on a previous iteration of the loop (by the inductive hypothesis); or (ii) there is a path from s to i+1 and a path from i+1 to t, neither of which includes any vertices with indices greater than i (except endpoints), in which case A[s][i+1] and A[i+1][t] were previously set to 1 (by hypothesis), so the inner loop sets A[s][t].

Figure 19.16 Warshall’s algorithm

This sequence shows the development of the transitive closure (bottom) of an example digraph (top) as computed with Warshall’s algorithm. The first iteration of the loop (left column, top) adds the edges 1-2 and 1-5 because of the paths 1-0-2 and 1-0-5, which include vertex 0 (but no vertex with a higher number); the second iteration of the loop (left column, second from top) adds the edges 2-0 and 2-5 because of the paths 2-1-0 and 2-1-0-5, which include vertex 1 (but no vertex with a higher number); and the third iteration of the loop (left column, bottom) adds the edges 0-1, 3-0, 3-1, and 3-5 because of the paths 0-2-1, 3-2-1-0, 3-2-1, and 3-2-1-0-5, which include vertex 2 (but no vertex with a higher number). The right column shows the edges added when paths through 3, 4, and 5 are considered. The last iteration of the loop (right column, bottom) adds the edges from 0, 1, and 2 to 4, because the only directed paths from those nodes to 4 include 5, the highest-numbered vertex.

We can improve the performance of Warshall’s algorithm with a simple transformation of the code: We move the test of A[s][i] out of the inner loop because its value does not change as t varies. This move allows us to avoid executing the t loop entirely when A[s][i] is zero. The savings that we achieve from this improvement depend on the digraph and are substantial for many digraphs (see Exercises 19.53 and 19.54). Program 19.3 implements this improvement and packages Warshall’s method such that clients can preprocess a digraph (compute the transitive closure), then compute the answer to any reachability query in constant time.

We are interested in pursuing more efficient solutions, particularly for sparse digraphs. We would like to reduce both the preprocessing time and the space because both make the use of Warshall’s method prohibitively costly for huge sparse digraphs.

In modern applications, abstract data types provide us with the ability to separate out the idea of an operation from any particular implementation so that we can focus on efficient implementations. For the transitive closure, this point of view leads to a recognition that we do not necessarily need to compute the entire matrix to provide clients with the transitive-closure abstraction. One possibility might be that the transitive closure is a huge sparse matrix, so an adjacency-lists representation is called for because we cannot store the matrix representation. Even when the transitive closure is dense, client programs might test only a tiny fraction of possible pairs of vertices, so computing the whole matrix is wasteful.

Program 19.3 Warshall’s algorithm

The constructor for class TC computes the transitive closure of G in the private data member T, so that clients can use TC objects to test whether any given vertex in a digraph is reachable from any other given vertex. The constructor initializes T with a copy of G, adds self-loops, then uses Warshall’s algorithm to complete the computation. The tcGraph class must include an implementation of the edge existence test.

template <class tcGraph, class Graph> class TC
{ tcGraph T;
public:
  TC(const Graph &G) : T(G)
    {
      for (int s = 0; s < T.V(); s++)
        T.insert(Edge(s, s));
      for (int i = 0; i < T.V(); i++)
        for (int s = 0; s < T.V(); s++)
          if (T.edge(s, i))
            for (int t = 0; t < T.V(); t++)
              if (T.edge(i, t))
                T.insert(Edge(s, t));
    }
  bool reachable(int s, int t) const
    { return T.edge(s, t); }
};
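Program 19.3 depends on the graph ADT developed in earlier chapters. For experimentation, the same computation can be sketched without that ADT. The version below assumes a plain 0/1 adjacency matrix as input; the class name `WarshallTC` is hypothetical, and only the loop structure (including the hoisted test of the entry in row s, column i) follows the text.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Stand-alone sketch of the abstract transitive closure via Warshall's algorithm.
class WarshallTC {
    std::vector<std::vector<char>> T;  // T[s][t] == 1 iff t is reachable from s

public:
    explicit WarshallTC(const std::vector<std::vector<char>>& A) : T(A) {
        const std::size_t V = T.size();
        for (std::size_t s = 0; s < V; s++) {
            assert(T[s].size() == V);  // the matrix must be square
            T[s][s] = 1;               // add self-loops
        }
        for (std::size_t i = 0; i < V; i++)
            for (std::size_t s = 0; s < V; s++)
                if (T[s][i])           // hoisted test: skip the t loop when it fails
                    for (std::size_t t = 0; t < V; t++)
                        if (T[i][t]) T[s][t] = 1;
    }

    // Constant-time reachability query after V^3 preprocessing.
    bool reachable(std::size_t s, std::size_t t) const {
        assert(s < T.size() && t < T.size());
        return T[s][t] != 0;
    }
};
```

A client builds the adjacency matrix, constructs a `WarshallTC` once, and then calls `reachable(s, t)` as often as needed. The V-by-V matrix is the space cost that the text describes as prohibitive for huge sparse digraphs.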

We use the term abstract transitive closure to refer to an ADT that provides clients with the ability to test reachability after preprocessing a digraph, like Program 19.3. In this context, we need to measure an algorithm not just by its cost to compute the transitive closure (preprocessing cost) but also by the space required and the query time achieved. That is, we rephrase Property 19.7 as follows:

Property 19.8 We can support constant-time reachability testing (abstract transitive closure) for a digraph, using space proportional to V^2 and time proportional to V^3 for preprocessing.

This property follows immediately from the basic performance characteristics of Warshall’s algorithm.

For most applications, our goal is not just to compute the transitive closure of a digraph quickly but also to support constant query time for the abstract transitive closure using far less space and far less preprocessing time than specified in Property 19.8. Can we find an implementation that will allow us to build clients that can afford to handle such digraphs? We return to this question in Section 19.8.

There is an intimate relationship between the problem of computing the transitive closure of a digraph and a number of other fundamental computational problems, and that relationship can help us to understand this problem’s difficulty. We conclude this section by considering two examples of such problems.

First, we consider the relationship between the transitive closure and the all-pairs shortest-paths problem (see Section 18.7). For digraphs, the problem is to find, for each pair of vertices, a directed path with a minimal number of edges. Given a digraph, we initialize a V-by-V integer matrix A by setting A[s][t] to 1 if there is an edge from s to t and to the sentinel value V if there is no such edge. Our goal is to set A[s][t] equal to the length of (the number of edges on) a shortest directed path from s to t, using the sentinel value V to indicate that there is no such path. The following code accomplishes this objective:

  for (i = 0; i < V; i++)
    for (s = 0; s < V; s++)
      for (t = 0; t < V; t++)
        if (A[s][i] + A[i][t] < A[s][t])
          A[s][t] = A[s][i] + A[i][t];

This code differs from the version of Warshall’s algorithm that we saw just before Property 19.7 in only the if statement in the inner loop. Indeed, in the proper abstract setting, the computations are precisely the same (see Exercises 19.55 and 19.56). Converting the proof of Property 19.7 into a direct proof that this method accomplishes the desired objective is straightforward. This method is a special case of Floyd’s algorithm for finding shortest paths in weighted graphs (see Chapter 21). The BFS-based solution for undirected graphs that we considered in Section 18.7 also finds shortest paths in digraphs (appropriately modified). Shortest paths are the subject of Chapter 21, so we defer considering detailed performance comparisons until then.

Second, as we have seen, the transitive-closure problem is also closely related to the Boolean matrix-multiplication problem. The basic algorithms that we have seen for both problems require time proportional to V^3, using similar computational schema. Boolean matrix multiplication is known to be a difficult computational problem: Algorithms that are asymptotically faster than the straightforward method are known, but it is debatable whether the savings are sufficiently large to justify the effort of implementing any of them. This fact is significant in the present context because we could use a fast algorithm for Boolean matrix multiplication to develop a fast transitive-closure algorithm (slower by just a factor of lg V) using the repeated-squaring method illustrated in Figure 19.15. Conversely, we have a lower bound on the difficulty of computing the transitive closure:

Property 19.9 We can use any transitive-closure algorithm to compute the product of two Boolean matrices with at most a constant-factor difference in running time.

Proof: Given two V-by-V Boolean matrices A and B, we construct the following 3V-by-3V matrix:

$$\begin{pmatrix} I & A & 0 \\ 0 & I & B \\ 0 & 0 & I \end{pmatrix}$$

Here, 0 denotes the V-by-V matrix with all entries equal to 0, and I denotes the V-by-V identity matrix with all entries equal to 0 except those on the diagonal, which are equal to 1. Now, we consider this matrix to be the adjacency matrix for a digraph and compute its transitive closure by repeated squaring. But we only need one step:

$$\begin{pmatrix} I & A & 0 \\ 0 & I & B \\ 0 & 0 & I \end{pmatrix}^{2} = \begin{pmatrix} I & A & A*B \\ 0 & I & B \\ 0 & 0 & I \end{pmatrix}$$

The matrix on the right-hand side of this equation is the transitive closure because further multiplications give back the same matrix. But this matrix has the V-by-V product A*B in its upper-right corner. Whatever algorithm we use to solve the transitive-closure problem, we can use it to solve the Boolean matrix-multiplication problem at the same cost (to within a constant factor).

The significance of this property depends on the conviction of experts that Boolean matrix multiplication is difficult: Mathematicians have been working for decades to try to learn precisely how difficult it is, and the question is unresolved; the best known results say that the running time should be proportional to about V^2.5 (see reference section). Now, if we could find a linear-time (proportional to V^2) solution to the transitive-closure problem, then we would have a linear-time solution to the Boolean matrix-multiplication problem as well. This relationship between problems is known as reduction: We say that the Boolean matrix-multiplication problem reduces to the transitive-closure problem (see Section 21.6 and Part 8). Indeed, the proof actually shows that Boolean matrix multiplication reduces to finding the paths of length 2 in a digraph.

Despite a great deal of research by many people, no one has been able to find a linear-time Boolean matrix-multiplication algorithm, so we cannot present a simple linear-time transitive-closure algorithm. On the other hand, no one has proved that no such algorithm exists, so we hold open that possibility for the future. In short, we take Property 19.9 to mean that, barring a research breakthrough, we cannot expect the worst-case running time of any transitive-closure algorithm that we can concoct to be proportional to V^2.
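The reduction in the proof of Property 19.9 can be checked on small inputs with a sketch such as the following. It assumes 0/1 matrices stored as `std::vector<std::vector<char>>`, uses Warshall's algorithm as a stand-in for "any transitive-closure algorithm", and the names `BoolMatrix`, `transitiveClosure`, and `productViaClosure` are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using BoolMatrix = std::vector<std::vector<char>>;  // entries are 0 or 1

// In-place transitive closure (Warshall); any closure algorithm would do here.
void transitiveClosure(BoolMatrix& M) {
    const std::size_t n = M.size();
    for (std::size_t v = 0; v < n; v++) M[v][v] = 1;  // identity blocks on the diagonal
    for (std::size_t i = 0; i < n; i++)
        for (std::size_t s = 0; s < n; s++)
            if (M[s][i])
                for (std::size_t t = 0; t < n; t++)
                    if (M[i][t]) M[s][t] = 1;
}

// Boolean product of two V-by-V matrices, read off a 3V-by-3V closure.
BoolMatrix productViaClosure(const BoolMatrix& A, const BoolMatrix& B) {
    const std::size_t V = A.size();
    assert(B.size() == V);
    BoolMatrix M(3 * V, std::vector<char>(3 * V, 0));
    for (std::size_t s = 0; s < V; s++)
        for (std::size_t t = 0; t < V; t++) {
            M[s][V + t] = A[s][t];          // block row 0, block column 1 holds A
            M[V + s][2 * V + t] = B[s][t];  // block row 1, block column 2 holds B
        }
    transitiveClosure(M);
    BoolMatrix C(V, std::vector<char>(V, 0));
    for (std::size_t s = 0; s < V; s++)
        for (std::size_t t = 0; t < V; t++)
            C[s][t] = M[s][2 * V + t];      // upper-right block is the product
    return C;
}
```

In the 3V-vertex digraph, the only way from a vertex s in the first group to a vertex t in the third group is through some vertex i in the second group, which requires both the A entry for (s, i) and the B entry for (i, t) to be 1. This is the path-of-length-2 observation made at the end of the discussion above.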
Despite this conclusion, we can develop fast algorithms for certain classes of digraphs. For example, we have already touched on a simple method for computing the transitive closure that is much faster than Warshall’s algorithm for sparse digraphs.

Property 19.10 With DFS, we can support constant query time for the abstract transitive closure of a digraph, with space proportional to V^2 and time proportional to V(E+V) for preprocessing (computing the transitive closure).

Proof: As we observed in the previous section, DFS gives us all the vertices reachable from the start vertex in time proportional to E, if we use the adjacency-lists representation (see Property 19.5 and Figure 19.11). Therefore, if we run DFS V times, once with each vertex as the start vertex, then we can compute the set of vertices reachable from each vertex, which is the transitive closure, in time proportional to V(E+V). The same argument holds for any linear-time generalized search (see Section 18.8 and Exercise 19.66).

Program 19.4 DFS-based transitive closure

This DFS class implements the same interface as does Program 19.3. It computes the transitive closure T by doing a separate DFS starting at each vertex of G to compute its set of reachable nodes. Each call on the recursive function adds an edge from the start vertex and makes recursive calls to fill the corresponding row in the transitive-closure matrix. The matrix also serves to mark the visited vertices during the DFS, so it requires that the Graph class support the edge existence test.

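template <class Graph> class tc
{ Graph T; const Graph &G;
  void tcR(int v, int w)
    {
      T.insert(Edge(v, w));
      typename Graph::adjIterator A(G, w);
      for (int t = A.beg(); !A.end(); t = A.nxt())
        if (!T.edge(v, t)) tcR(v, t);
    }
public:
  tc(const Graph &G) : G(G), T(G.V(), true)
    { for (int v = 0; v < G.V(); v++) tcR(v, v); }
  bool reachable(int v, int w)
    { return T.edge(v, w); }
};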
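A stand-alone variant of the same idea, for readers who want to run it without the graph ADT, might look like this. It is a sketch under stated assumptions: the digraph is given as plain adjacency lists, the closure is kept in a V-by-V 0/1 matrix that doubles as the visited marks (as described above), and the class name `DfsTC` is hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Stand-alone sketch of search-based transitive closure: one DFS per start vertex.
class DfsTC {
    const std::vector<std::vector<int>>& adj;  // adjacency lists (assumed representation)
    std::vector<std::vector<char>> T;          // T[v][w] == 1 iff w is reachable from v

    // Record that w is reachable from v, then continue the DFS from w.
    // Row v of T serves as the "visited" marks for the search that started at v.
    void tcR(int v, int w) {
        T[v][w] = 1;
        for (int t : adj[w])
            if (!T[v][t]) tcR(v, t);
    }

public:
    explicit DfsTC(const std::vector<std::vector<int>>& g)
        : adj(g), T(g.size(), std::vector<char>(g.size(), 0)) {
        for (std::size_t v = 0; v < g.size(); v++)
            tcR(static_cast<int>(v), static_cast<int>(v));
    }

    bool reachable(int v, int w) const {
        assert(v >= 0 && static_cast<std::size_t>(v) < T.size());
        assert(w >= 0 && static_cast<std::size_t>(w) < T.size());
        return T[v][w] != 0;
    }
};
```

Each of the V searches examines every edge at most once, which gives the V(E+V) preprocessing bound of Property 19.10; the space is still proportional to V^2 because the full matrix is stored.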

Program 19.4 is an implementation of this search-based transitive-closure algorithm. This class implements the same interface as does Program 19.3. The result of running this program on the sample digraph in Figure 19.1 is illustrated in the first tree in each forest in Figure 19.11.

For sparse digraphs, this search-based approach is the method of choice. For example, if E is proportional to V, then Program 19.4 computes the transitive closure in time proportional to V^2. How can it do so, given the reduction to Boolean matrix multiplication that we just considered? The answer is that this transitive-closure algorithm does indeed give an optimal way to multiply certain types of Boolean matrices (those with O(V) nonzero entries). The lower bound tells us that we should not expect to find a transitive-closure algorithm that runs in time proportional to V^2 for all digraphs, but it does not preclude the possibility that we might find algorithms, like this one, that are faster for certain classes of digraphs. If such graphs are the ones that we need to process, the relationship between transitive closure and Boolean matrix multiplication may not be relevant to us.

It is easy to extend the methods that we have described in this section to provide clients with the ability to find a specific path connecting two vertices by keeping track of the search tree, as described in Section 17.8. We consider specific ADT implementations of this sort in the context of the more general shortest-paths problems in Chapter 21.

Table 19.1 shows empirical results comparing the elementary transitive-closure algorithms described in this section. The adjacency-lists implementation of the search-based solution is by far the fastest method for sparse digraphs. The implementations all compute an adjacency matrix (of size V^2), so none of them is suitable for huge sparse digraphs.

Table 19.1 Empirical study of transitive-closure algorithms

This table shows running times that exhibit dramatic performance differences for various algorithms for computing the transitive closure of random digraphs, both dense and sparse. For all but the adjacency-lists DFS, the running time goes up by a factor of 8 when we double V, which supports the conclusion that it is essentially proportional to V^3. The adjacency-lists DFS takes time proportional to VE, which explains the running time roughly increasing by a factor of 4 when we double both V and E (sparse graphs) and by a factor of about 2 when we double E (dense graphs), except that list-traversal overhead degrades performance for high-density graphs.

For sparse digraphs whose transitive closure is also sparse, we might use an adjacency-lists implementation for the closure so that the size of the output is proportional to the number of edges in the transitive closure. This number certainly is a lower bound on the cost of computing the transitive closure, which we can achieve for certain types of digraphs using various algorithmic techniques (see Exercises 19.64 and 19.65). Despite this possibility, we generally view the objective of a transitive-closure computation to be dense, so we can use a representation like DenseGRAPH that can easily answer reachability queries, and we regard transitive-closure algorithms that compute the matrix in time proportional to V^2 as being optimal since they take time proportional to the size of their output.

If the adjacency matrix is symmetric, it is equivalent to an undirected graph, and finding the transitive closure is the same as finding the connected components: the transitive closure is the union of complete graphs on the vertices in the connected components (see Exercise 19.48). Our connectivity algorithms in Section 18.5 amount to an abstract–transitive-closure computation for symmetric digraphs (undirected graphs) that uses space proportional to V and still supports constant-time reachability queries. Can we do as well in general digraphs? Can we reduce the preprocessing time still further? For what types of graphs can we compute the transitive closure in linear time? To answer these questions, we need to study the structure of digraphs in more detail, including, specifically, that of DAGs.
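The undirected case mentioned above can be sketched in a few lines. This is an illustration only, assuming an adjacency-lists graph in which each undirected edge appears in both lists; the names Components, id, and count are hypothetical and are not taken from Section 18.5.

```cpp
#include <vector>

// Sketch: label each vertex of an undirected graph with a component id.
// Space is proportional to V, and reachable(v, w) is a constant-time test.
struct Components {
    std::vector<int> id;   // id[v] = component number of v
    int count = 0;         // number of components found so far

    explicit Components(const std::vector<std::vector<int>>& adj)
        : id(adj.size(), -1) {
        for (int v = 0; v < (int)adj.size(); v++)
            if (id[v] == -1) { dfs(adj, v); count++; }
    }
    bool reachable(int v, int w) const { return id[v] == id[w]; }

private:
    void dfs(const std::vector<std::vector<int>>& adj, int v) {
        id[v] = count;
        for (int t : adj[v])
            if (id[t] == -1) dfs(adj, t);
    }
};
```

The whole transitive closure is implicit in the single vector id, which is why no V-by-V matrix is needed for symmetric digraphs.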

Exercises

• 19.46 What is the transitive closure of a digraph that consists solely of a directed cycle with V vertices?

19.47 How many edges are there in the transitive closure of a digraph that consists solely of a simple directed path with V vertices?

• 19.48 Give the transitive closure of the undirected graph

3-7 1-4 7-8 0-5 5-2 3-8 2-9 0-6 4-9 2-6 6-4.

• 19.49 Show how to construct a digraph with V vertices and E edges with the property that the number of edges in the transitive closure is proportional to t, for any t between E and V^2. As usual, assume that E > V.

19.50 Give a formula for the number of edges in the transitive closure of a digraph that is a directed forest as a function of structural properties of the forest.

19.51 Show, in the style of Figure 19.15, the process of computing the transitive closure of the digraph

3-7 1-4 7-8 0-5 5-2 3-8 2-9 0-6 4-9 2-6 6-4

through repeated squaring.

19.52 Show, in the style of Figure 19.16, the process of computing the transitive closure of the digraph

3-7 1-4 7-8 0-5 5-2 3-8 2-9 0-6 4-9 2-6 6-4

with Warshall’s algorithm.

• 19.53 Give a family of sparse digraphs for which the improved version of Warshall’s algorithm for computing the transitive closure (Program 19.3) runs in time proportional to VE.

• 19.54 Find a sparse digraph for which the improved version of Warshall’s algorithm for computing the transitive closure (Program 19.3) runs in time proportional to V^3.

• 19.55 Develop a base class from which you can derive classes that implement both Warshall’s algorithm and Floyd’s algorithm. (This exercise is a version of Exercise 19.56 for people who are more familiar with abstract data types than with abstract algebra.)

• 19.56 Use abstract algebra to develop a generic algorithm that encompasses both Warshall’s algorithm and Floyd’s algorithm. (This exercise is a version of Exercise 19.55 for people who are more familiar with abstract algebra than with abstract data types.)

• 19.57 Show, in the style of Figure 19.16, the development of the all-shortest-paths matrix for the example graph in the figure as computed with Floyd’s algorithm.

19.58 Is the Boolean product of two symmetric matrices symmetric? Explain your answer.

19.59 Add a public function member to Programs 19.3 and 19.4 to allow clients to use tc objects to find the number of edges in the transitive closure.

19.60 Design a way to maintain the count of the number of edges in the transitive closure by modifying it when edges are added and removed. Give the cost of adding and removing edges with your scheme.

• 19.61 Add a public member function for use with Programs 19.3 and 19.4 that returns a vertex-indexed vector that indicates which vertices are reachable from a given vertex.

• 19.62 Run empirical studies to determine the number of edges in the transitive closure, for various types of digraphs (see Exercises 19.11–18).

• 19.63 Consider the bit-matrix graph representation that is described in Exercise 17.23. Which method can you speed up by a factor of B (where B is the number of bits per word on your computer): Warshall’s algorithm or the DFS-based algorithm? Justify your answer by developing an implementation that does so.

• 19.64 Give a program that computes the transitive closure of a digraph that is a directed forest in time proportional to the number of edges in the transitive closure.

• 19.65 Implement an abstract–transitive-closure algorithm for sparse graphs that uses space proportional to T and can answer reachability requests in constant time after preprocessing time proportional to VE + T, where T is the number of edges in the transitive closure. Hint: Use dynamic hashing.

• 19.66 Provide a version of Program 19.4 that is based on generalized graph search (see Section 18.8), and run empirical studies to see whether the choice of graph-search algorithm has any effect on performance.

19.4 Equivalence Relations and Partial Orders

This section is concerned with basic concepts in set theory and their relationship to abstract–transitive-closure algorithms. Its purposes are to put the ideas that we are studying into a larger context and to demonstrate the wide applicability of the algorithms that we are considering. Mathematically inclined readers who are familiar with set theory may wish to skip to Section 19.5 because the material that we cover is elementary (although our brief review of terminology may be helpful); readers who are not familiar with set theory may wish to consult an elementary text on discrete mathematics because our treatment is rather succinct. The connections between digraphs and these fundamental mathematical concepts are too important for us to ignore.

Given a set, a relation among its objects is defined to be a set of ordered pairs of the objects. Except possibly for details relating to parallel edges and self-loops, this definition is the same as our definition of a digraph: Relations and digraphs are different representations of the same abstraction. The mathematical concept is somewhat more powerful because the sets may be infinite, whereas our computer programs all work with finite sets, but we ignore this difference for the moment.

Typically, we choose a symbol R and use the notation s R t as shorthand for the statement “the ordered pair (s, t) is in the relation R.” For example, we use the symbol “<” to represent the “less than” relation among numbers. Using this terminology, we can characterize various properties of relations. For example, a relation R is said to be symmetric if s R t implies that t R s for all s and t; it is said to be reflexive if s R s for all s. Symmetric relations are the same as undirected graphs. Reflexive relations correspond to graphs in which all vertices have self-loops; relations that correspond to graphs where no vertices have self-loops are said to be irreflexive.

A relation R is said to be transitive when s R t and t R u implies that s R u for all s, t, and u. The transitive closure of a relation is a well-defined concept; but instead of redefining it in set-theoretic terms, we appeal to the definition that we gave for digraphs in Section 19.3. Any relation is equivalent to a digraph, and the transitive closure of the relation is equivalent to the transitive closure of the digraph. The transitive closure of any relation is transitive.

In the context of graph algorithms, we are particularly interested in two types of transitive relations that are defined by further constraints. These two types, which are widely applicable, are known as equivalence relations and partial orders.

An equivalence relation ≡ is a transitive relation that is also reflexive and symmetric. Note that a symmetric, transitive relation that includes each object in some ordered pair must be an equivalence relation: If s ≡ t, then t ≡ s (by symmetry) and s ≡ s (by transitivity). Equivalence relations divide the objects in a set into subsets known as equivalence classes. Two objects s and t are in the same equivalence class if and only if s ≡ t. The following examples are typical equivalence relations:

Modular arithmetic Any positive integer k defines an equivalence relation on the set of integers, with s ≡ t (mod k) if and only if the remainder that results when we divide s by k is equal to the remainder that results when we divide t by k. The relation is obviously symmetric; a short proof establishes that it is also transitive (see Exercise 19.67) and therefore is an equivalence relation.

Connectivity in graphs The relation “is in the same connected component as” among vertices is an equivalence relation because it is symmetric and transitive. The equivalence classes correspond to the connected components in the graph.

When we build a graph ADT that gives clients the ability to test whether two vertices are in the same connected component, we are implementing an equivalence-relation ADT that provides clients with the ability to test whether two objects are equivalent. In practice, this correspondence is significant because the graph is a succinct representation of the equivalence relation (see Exercise 19.71). In fact, as we saw in Chapters 1 and 18, to build such an ADT we need to maintain only a single vertex-indexed vector.

A partial order ≺ is a transitive relation that is also irreflexive. As a direct consequence of transitivity and irreflexivity, it is trivial to prove that partial orders are also asymmetric: If s ≺ t and t ≺ s, then s ≺ s (by transitivity), which contradicts irreflexivity, so we cannot have both s ≺ t and t ≺ s. Moreover, extending the same argument shows that a partial order cannot have a cycle, such as s ≺ t, t ≺ u, and u ≺ s. The following examples are typical partial orders:

Subset inclusion The relation “includes but is not equal to” (⊂) among subsets of a given set is a partial order: it is certainly irreflexive, and if s ⊂ t and t ⊂ u, then certainly s ⊂ u.

Paths in DAGs The relation “can be reached by a nonempty directed path from” is a partial order on vertices in DAGs with no self-loops because it is transitive and irreflexive. Like equivalence relations and undirected graphs, this particular partial order is significant for many applications because a DAG provides a succinct implicit representation of the partial order. For example, Figure 19.17 illustrates DAGs for subset containment partial orders whose number of edges is only a fraction of the cardinality of the partial order (see Exercise 19.73).
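The defining properties of a partial order can be checked mechanically on a small example. The following sketch is an illustration under assumptions of its own: subsets of an n-element set are encoded as n-bit masks, the relation is stored as a Boolean matrix, and all function names are hypothetical.

```cpp
#include <cstddef>
#include <vector>

// Sketch: the relation "is a proper subset of" on the subsets of an
// n-element set, with each subset encoded as an n-bit mask.
std::vector<std::vector<bool>> properSubsetRelation(int n) {
    int m = 1 << n;
    std::vector<std::vector<bool>> R(m, std::vector<bool>(m, false));
    for (int s = 0; s < m; s++)
        for (int t = 0; t < m; t++)
            R[s][t] = (s != t) && ((s & t) == s);   // s is inside t, s != t
    return R;
}

bool isIrreflexive(const std::vector<std::vector<bool>>& R) {
    for (std::size_t s = 0; s < R.size(); s++)
        if (R[s][s]) return false;
    return true;
}

bool isTransitive(const std::vector<std::vector<bool>>& R) {
    std::size_t m = R.size();
    for (std::size_t s = 0; s < m; s++)
        for (std::size_t t = 0; t < m; t++)
            if (R[s][t])
                for (std::size_t u = 0; u < m; u++)
                    if (R[t][u] && !R[s][u]) return false;
    return true;
}

// Number of ordered pairs in the relation.
int cardinality(const std::vector<std::vector<bool>>& R) {
    int pairs = 0;
    for (const auto& row : R)
        for (bool b : row) pairs += b;
    return pairs;
}
```

For n = 3 the relation passes both checks and contains 19 ordered pairs, noticeably more than the number of edges a DAG needs to represent it implicitly.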

Figure 19.17 Set-inclusion DAG

In the DAG at the top, we interpret vertex indices to represent subsets of a set of 3 elements, as shown in the table at the bottom. The transitive closure of this DAG represents the subset inclusion partial order: There is a directed path between two nodes if and only if the subset represented by the first is included in the subset represented by the second.

Indeed, we rarely define partial orders by enumerating all their ordered pairs, because there are too many such pairs. Instead, we generally specify an irreflexive relation (a DAG) and consider its transitive closure. This usage is a primary reason for considering abstract–transitive-closure ADT implementations for DAGs. Using DAGs, we consider examples of partial orders in Section 19.5.

A total order T is a partial order where either s T t or t T s for all s ≠ t. Familiar examples of total orders are the “less than” relation among integers or real numbers and lexicographic ordering among strings of characters. Our study of sorting and searching algorithms in Parts 3 and 4 was based on a total-order ADT implementation for sets. In a total order, there is one and only one way to arrange the elements in the set such that s T t whenever s is before t in the arrangement; in a partial order that is not total, there are many ways to do so. In Section 19.5, we examine algorithms for this task.

In summary, the following correspondences between sets and graph models help us to understand the importance and wide applicability of fundamental graph algorithms.

• Relations and digraphs

• Symmetric relations and undirected graphs

• Transitive relations and paths in graphs

• Equivalence relations and paths in undirected graphs

• Partial orders and paths in DAGs

This context places in perspective the types of graphs and algorithms that we are considering and provides one motivation for us to move on to consider basic properties of DAGs and algorithms for processing those DAGs.

Exercises

19.67 Show that “has the same remainder after dividing by k” is a transitive relation (and therefore is an equivalence relation) on the set of integers.

19.68 Show that “is in the same edge-connected component as” is an equivalence relation among vertices in any graph.

19.69 Show that “is in the same biconnected component as” is not an equivalence relation among vertices in all graphs.

19.70 Prove that the transitive closure of an equivalence relation is an equivalence relation and that the transitive closure of a partial order is a partial order.

• 19.71 The cardinality of a relation is its number of ordered pairs. Prove that the cardinality of an equivalence relation is equal to the sum of the squares of the cardinalities of that relation’s equivalence classes.

• 19.72 Using an online dictionary, build a graph that represents the equivalence relation “has k letters in common with” among words. Determine the number of equivalence classes for k = 1 through 5.

19.73 The cardinality of a partial order is its number of ordered pairs. What is the cardinality of the subset containment partial order for an n-element set?

• 19.74 Show that “is a factor of” is a partial order among integers.

19.5 DAGs

In this section, we consider various applications of directed acyclic graphs (DAGs). We have two reasons to do so. First, because they serve as implicit models for partial orders, we work directly with DAGs in many applications and need efficient algorithms to process them. Second, these various applications give us insight into the nature of DAGs, and understanding DAGs is essential to understanding general digraphs.

Since DAGs are a special type of digraph, all DAG-processing problems trivially reduce to digraph-processing problems. Although we expect processing DAGs to be easier than processing general digraphs, we know when we encounter a problem that is difficult to solve on DAGs we should not expect to do any better solving the same problem on general digraphs. As we shall see, the problem of computing the transitive closure lies in this category. Conversely, understanding the difficulty of processing DAGs is important because every digraph has a kernel DAG (see Property 19.2), so we encounter DAGs even when we work with digraphs that are not DAGs.

The prototypical application where DAGs arise directly is called scheduling. Generally, solving scheduling problems has to do with arranging for the completion of a set of tasks, under a set of constraints, by specifying when and how the tasks are to be performed. Constraints might involve functions of the time taken or other resources consumed by the tasks. The most important type of constraints are precedence constraints, which specify that certain tasks must be performed before certain others, thus comprising a partial order among the tasks. Different types of additional constraints lead to many different types of scheduling problems, of varying difficulty. Literally thousands of different problems have been studied, and researchers still seek better algorithms for many of them. Perhaps the simplest nontrivial scheduling problem may be stated as follows:

Scheduling Given a set of tasks to be completed, with a partial order that specifies that certain tasks have to be completed before certain other tasks are begun, how can we schedule the tasks such that they are all completed while still respecting the partial order?

In this basic form, the scheduling problem is called topological sorting; it is not difficult to solve, as we shall see in the next section by examining two algorithms that do so. In more complicated practical applications, we might need to add other constraints on how the tasks might be scheduled, and the problem can become much more difficult. For example, the tasks might correspond to courses in a student’s schedule, with the partial order specifying prerequisites. Topological sorting gives a feasible course schedule that meets the prerequisite requirements, but perhaps not one that respects other constraints that need to be added to the model, such as course conflicts, limitations on enrollments, and so forth. As another example, the tasks might be part of a manufacturing process, with the partial order specifying sequential requirements of the particular process. Topological sorting gives us a way to schedule the tasks, but perhaps there is another way to do so that uses less time, money, or some other resources not included in the model. We examine versions of the scheduling problem that capture more general situations such as these in Chapters 21 and 22.

Often, our first task is to check whether or not a given DAG indeed has no directed cycles. As we saw in Section 19.2, we can easily implement a class that allows clients to test whether a general digraph is a DAG in linear time, by running a standard DFS and checking that the DFS forest has no back edges (see Exercise 19.75). To implement DAG-specific algorithms, we implement task-specific client classes of our standard GRAPH ADT that assume they are processing digraphs with no cycles, leaving the client the responsibility of checking for cycles. This arrangement allows for the possibility that a DAG-processing algorithm produces useful results even when run on a digraph with cycles, which is sometimes the case. Sections 19.6 and 19.7 are devoted to implementations of classes for topological sorting (DAGts) and reachability in DAGs (DAGtc and DAGreach); Program 19.13 is an example of a client of such a class.

In a sense, DAGs are part tree, part graph. We can certainly take advantage of their special structure when we process them. For example, we can view a DAG almost as we view a tree, if we wish. Suppose that we want to traverse the vertices of the DAG D as though it were a tree rooted at w, so that, for example, the result of traversing the two DAGs in Figure 19.18 with this program would be the same. The following simple program accomplishes this task in the same manner as would a recursive tree traversal:

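void traverseR(Dag D, int v)
{
  visit(v);
  typename Dag::adjIterator A(D, v);
  // recur on every vertex adjacent to v, with no visited marks
  for (int t = A.beg(); !A.end(); t = A.nxt())
    traverseR(D, t);
}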
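To see how much work the tree-style traversal above can repeat, the following sketch counts visits with and without visited marks. It is an illustration only: it assumes adjacency lists in standard vectors, and fibonacciDag is a hypothetical helper that builds a DAG with the dependence structure of the Fibonacci computation (vertex k points to k-1 and k-2).

```cpp
#include <vector>

// Sketch: count visits when a DAG is traversed as if it were a tree
// (no marking) versus with a standard DFS that marks visited vertices.
long treeVisits(const std::vector<std::vector<int>>& adj, int v) {
    long n = 1;                                   // the visit to v itself
    for (int t : adj[v]) n += treeVisits(adj, t);
    return n;
}

long markedVisits(const std::vector<std::vector<int>>& adj, int v,
                  std::vector<bool>& seen) {
    seen[v] = true;
    long n = 1;
    for (int t : adj[v])
        if (!seen[t]) n += markedVisits(adj, t, seen);
    return n;
}

// Hypothetical helper: vertex k has edges to k-1 and k-2 for k >= 2.
std::vector<std::vector<int>> fibonacciDag(int n) {
    std::vector<std::vector<int>> adj(n + 1);
    for (int k = 2; k <= n; k++) adj[k] = {k - 1, k - 2};
    return adj;
}
```

Starting from vertex 10, the unmarked traversal makes 177 visits while the marked DFS makes 11, one per vertex.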

We rarely use a full traversal of this kind, however, because we normally want to take advantage of the same economies that save space in a DAG to save time in traversing it (for example, by marking visited nodes in a normal DFS). The same idea applies to a search, where we make a recursive call for only one link incident on each vertex. In such an algorithm, the search cost will be the same for the DAG and the tree, but the DAG uses far less space.

Because they provide a compact way to represent trees that have identical subtrees, we often use DAGs instead of trees when we represent computational abstractions. In the context of algorithm design, the distinction between the DAG representation and the tree representation of a program in execution is the essential distinction behind dynamic programming (see, for example, Figure 19.18 and Exercise 19.78). DAGs are also widely used in compilers as intermediate representations of arithmetic expressions and programs (see, for example, Figure 19.19) and in circuit-design systems as intermediate representations of combinational circuits.

Along these lines, an important example that has many applications arises when we consider binary trees. We can apply the same restriction to DAGs that we applied to trees to define binary trees.

Definition 19.6 A binary DAG is a directed acyclic graph with two edges leaving each node, identified as the left edge and the right edge, either or both of which may be null.

Figure 19.18 DAG model of Fibonacci computation

The tree at the top shows the dependence of computing each Fibonacci number on computing its two predecessors. The DAG at the bottom shows the same dependence with only a fraction of the nodes.

The distinction between a binary DAG and a binary tree is that in the binary DAG we can have more than one link pointing to a node. As did our definition for binary trees, this definition models a natural representation, where each node is a structure with a left link and a right link that point to other nodes (or are null), subject to only the global restriction that no directed cycles are allowed. Binary DAGs are significant because they provide a compact way to represent binary trees in certain applications. For example, we can compress an existence trie into a binary DAG without changing the search implementation, as shown in Figure 19.20 and Program 19.5.

Figure 19.19 DAG representation of an arithmetic expression

Both of these DAGs are representations of the arithmetic expression (c*(a+b))-((a+b)*((a+b)+e)). In the binary parse tree at left, leaf nodes represent operands and internal nodes each represent operators to be applied to the expressions represented by their two subtrees (see Figure 5.31). The DAG at right is a more compact representation of the same tree. More important, we can compute the value of the expression in time proportional to the size of the DAG, which is typically significantly less than the size of the tree (see Exercises 19.112 and 19.113).

An equivalent application is to view the trie keys as corresponding to rows in the truth table of a Boolean function for which the function is true (see Exercises 19.84 through 19.87). The binary DAG is a model for an economical circuit that computes the function. In this application, binary DAGs are known as binary decision diagrams (BDDs).

Motivated by these applications, we turn, in the next two sections, to the study of DAG-processing algorithms. Not only do these algorithms lead to efficient and useful DAG ADT function implementations, but also they provide insight into the difficulty of processing digraphs. As we shall see, even though DAGs would seem to be substantially simpler structures than general digraphs, some basic problems are apparently no easier to solve.

Exercises

• 19.75 Implement a DFS class for use by clients for verifying that a DAG has no cycles.

• 19.76 Write a program that generates random DAGs by generating random digraphs, doing a DFS from a random starting point, and throwing out the back edges (see Exercise 19.40). Run experiments to decide how to set parameters in your program to expect DAGs with E edges, given V.

• 19.77 How many nodes are there in the tree and in the DAG corresponding to Figure 19.18 for F_N, the Nth Fibonacci number?

19.78 Give the DAG corresponding to the dynamic-programming example for the knapsack model from Chapter 5 (see Figure 5.17).

• 19.79 Develop an ADT for binary DAGs.

Program 19.5 Representing a binary tree with a binary DAG

This code snippet is a postorder walk that constructs a compact representation of a binary DAG corresponding to a binary tree structure (see Chapter 12) by identifying common subtrees. It uses an indexing class like ST in Program 17.15 (modified to accept pairs of integers instead of string keys) to assign a unique integer to each distinct tree structure for use in representing the DAG as a vector of 2-integer structures (see Figure 19.20). The empty tree (null link) is assigned index 0, the single-node tree (node with two null links) is assigned index 1, and so forth.

The index corresponding to each subtree is computed recursively. Then a key is created such that any node with the same subtrees will have the same index, and that index is returned after the DAG’s edge (subtree) links are filled.

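int compressR(link h)
{ static STx st;              // one table shared by all calls, so that equal
                              // (l, r) pairs always receive the same index
  if (h == NULL) return 0;    // the empty tree has index 0
  int l = compressR(h->l);    // index of the left subtree
  int r = compressR(h->r);    // index of the right subtree
  int t = st.index(l, r);     // index for this tree shape
  adj[t].l = l; adj[t].r = r; // fill the DAG's two edge links
  return t;
}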
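A self-contained variant of the same postorder walk may help in experimenting with the idea. It is a sketch under assumptions, not the book's implementation: std::map stands in for the ST-like indexing class, and the names Node, Compressor, and compress are hypothetical.

```cpp
#include <map>
#include <utility>
#include <vector>

struct Node { Node* l; Node* r; };

// Sketch: compress a binary tree into a binary DAG by giving every distinct
// (left index, right index) pair one index. Index 0 is the null link.
struct Compressor {
    std::map<std::pair<int, int>, int> index;   // stands in for the ST-like class
    std::vector<std::pair<int, int>> adj = {std::make_pair(0, 0)};  // adj[0]: placeholder for null

    int compress(const Node* h) {
        if (h == nullptr) return 0;
        int l = compress(h->l);
        int r = compress(h->r);
        std::pair<int, int> key = std::make_pair(l, r);
        auto it = index.find(key);
        if (it != index.end()) return it->second;   // this shape was seen before
        int t = (int)adj.size();                    // next unused index
        index[key] = t;
        adj.push_back(key);                         // left and right links of node t
        return t;
    }
};
```

For a complete binary tree with seven nodes, all four leaves share index 1, both internal nodes share index 2, and the root gets index 3, so the DAG has three non-null nodes.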

Figure 19.20 Binary tree compression

The table of nine pairs of integers at the bottom left is a compact representation of a binary DAG (bottom right) that is a compressed version of the binary tree structure at top. Node labels are not explicitly stored in the data structure: The table represents the eighteen edges 1-0, 1-0, 2-1, 2-1, 3-1, 3-2, and so forth, but designates a left edge and a right edge leaving each node (as in a binary tree) and leaves the source vertex for each edge implicit in the table index.

An algorithm that depends only upon the tree shape will work effectively on the DAG. For example, suppose that the tree is an existence trie for binary keys corresponding to the leaf nodes, so it represents the keys 0000, 0001, 0010, 0110, 1100, and 1101. A successful search for the key 1101 in the trie moves right, right, left, and right to end at a leaf node. In the DAG, the same search goes from 9 to 8 to 7 to 2 to 1.

• 19.80 Can every DAG be represented as a binary DAG (see Property 5.4)?

• 19.81 Write a function that performs an inorder traversal of a single-source binary DAG. That is, the function should visit all vertices that can be reached via the left edge, then visit the source, then visit all the vertices that can be reached via the right edge.

• 19.82 In the style of Figure 19.20, give the existence trie and corresponding binary DAG for the keys 0100101010010101001000011110110001010001001000010000011101010011.

19.83 Implement an ADT based on building an existence trie from a set of 32-bit keys, compressing it as a binary DAG, then using that data structure to support existence queries.

• 19.84 Draw the BDD for the truth table for the odd parity function of four variables, which is 1 if and only if the number of variables that have the value 1 is odd.

19.85 Write a function that takes a 2^n-bit truth table as argument and returns the corresponding BDD. For example, given the input 1110001000001100, your program should return a representation of the binary DAG in Figure 19.20.

19.86 Write a function that takes a 2^n-bit truth table as argument, computes every permutation of its argument variables, and, using your solution to Exercise 19.85, finds the permutation that leads to the smallest BDD.

• 19.87 Run empirical studies to determine the effectiveness of the strategy of Exercise 19.86 for various Boolean functions, both standard and randomly generated.

19.88 Write a program like Program 19.5 that supports common subexpression removal: Given a binary tree that represents an arithmetic expression, compute a binary DAG that represents the same expression with common subexpressions removed.

• 19.89 Draw all the nonisomorphic DAGs with two, three, four, and five vertices.

•• 19.90 How many different DAGs are there with V vertices and E edges?

••• 19.91 How many different DAGs are there with V vertices and E edges, if we consider two DAGs to be different only if they are not isomorphic?

Figure 19.21 Topological sort (relabeling)

Given any DAG (top), topological sorting allows us to relabel its vertices so that every edge points from a lower-numbered vertex to a higher-numbered one (bottom). In this example, we relabel 4, 5, 7, and 8 to 7, 8, 5, and 4, respectively, as indicated in the array tsI. There are many possible labelings that achieve the desired result.

19.6 Topological Sorting

The goal of topological sorting is to be able to process the vertices of a DAG such that every vertex is processed before all the vertices to which it points. There are two natural ways to define this basic operation; they are essentially equivalent. Both tasks call for a permutation of the integers 0 through V-1, which we put in vertex-indexed vectors, as usual.

Topological sort (relabel) Given a DAG, relabel its vertices such that every directed edge points from a lower-numbered vertex to a higher-numbered one (see Figure 19.21).

Topological sort (rearrange) Given a DAG, rearrange its vertices on a horizontal line such that all the directed edges point from left to right (see Figure 19.22).

Figure 19.22 Topological sorting (rearrangement)

This diagram shows another way to look at the topological sort in Figure 19.21, where we specify a way to rearrange the vertices, rather than relabel them. When we place the vertices in the order specified in the array ts, from left to right, then all directed edges point from left to right. The inverse of the permutation ts is the permutation tsI that specifies the relabeling described in Figure 19.21.

As indicated in Figure 19.22, it is easy to establish that the relabeling and rearrangement permutations are inverses of one another: Given a rearrangement, we can obtain a relabeling by assigning the label 0 to the first vertex on the list, 1 to the second vertex on the list, and so forth. For example, if a vector ts has the vertices in topologically sorted order, then the loop

  for (i = 0; i < V; i++) tsI[ts[i]] = i;

defines a relabeling in the vertex-indexed vector tsI. Conversely, we can get the rearrangement from the relabeling with the loop

  for (i = 0; i < V; i++) ts[tsI[i]] = i;

which puts the vertex that would have label 0 first in the list, the vertex that would have label 1 second in the list, and so forth. Most often, we use the term topological sort to refer to the rearrangement version of the problem. Note that ts is not a vertex-indexed vector.
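Both loops compute the inverse of a permutation, so a single helper covers the two directions. The following is a small sketch; the function name inverse is hypothetical, and the permutation is assumed to be a valid arrangement of 0 through V-1.

```cpp
#include <vector>

// Sketch: a rearrangement ts (position -> vertex) and a relabeling tsI
// (vertex -> position) are inverse permutations of 0..V-1.
std::vector<int> inverse(const std::vector<int>& p) {
    std::vector<int> q(p.size());
    for (int i = 0; i < (int)p.size(); i++) q[p[i]] = i;
    return q;
}
```

For instance, if ts is 2 0 3 1, then inverse(ts) is 1 3 0 2, and applying inverse again gives back ts.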

In general, the vertex order produced by a topological sort is not unique. Forexample,

are all topological sorts of the example DAG in Figure 19.6 (and there are many others). In a scheduling application, this situation arises whenever one task has no direct or indirect dependence on another and thus they can be performed either before or after the other (or even in parallel). The number of possible schedules grows exponentially with the number of such pairs of tasks.

As we have noted, it is sometimes useful to interpret the edges in a digraph the other way around: We say that an edge directed from s to t means that vertex s “depends” on vertex t. For example, the vertices might represent terms to be defined in a book, with an edge from s to t if the definition of s uses t. In this case, it would be useful to find an ordering with the property that every term is defined before it is used in another definition. Using this ordering corresponds to positioning the vertices in a line such that edges all go from right to left: a reverse topological sort. Figure 19.23 illustrates a reverse topological sort of our sample DAG.

Figure 19.23 Reverse topological sort

In this reverse topological sort of our sample digraph, the edges all point from right to left. Numbering the vertices as specified by the inverse permutation tsI gives a graph where every edge points from a higher-numbered vertex to a lower-numbered vertex.

Now, it turns out that we have already seen an algorithm for reverse topological sorting: our standard recursive DFS! When the input graph is a DAG, a postorder numbering puts the vertices in reverse topological order. That is, we number each vertex as the final action of the recursive DFS function, as in the post vector in the DFS implementation in Program 19.2. As illustrated in Figure 19.24, using this numbering is equivalent to numbering the nodes in the DFS forest in postorder, and gives a topological sort: The vertex-indexed vector post gives the relabeling and its inverse the rearrangement depicted in Figure 19.23 (a reverse topological sort of the DAG).

Property 19.11 Postorder numbering in DFS yields a reverse topological sort for any DAG.

Proof: Suppose that s and t are two vertices such that s appears before t in the postorder numbering even though there is a directed edge s-t in the graph. Since we are finished with the recursive DFS for s at the time that we assign s its number, we have examined, in particular, the edge s-t. But if s-t were a tree, down, or cross edge, the recursive DFS for t would be complete, and t would have a lower number; however, s-t cannot be a back edge because that would imply a cycle. This contradiction implies that such an edge s-t cannot exist.

Thus, we can easily adapt a standard DFS to do a topological sort, as shown in Program 19.6. This implementation does a reverse topological sort: It computes the postorder numbering permutation and its inverse, so that clients can relabel or rearrange vertices.

Program 19.6 Reverse topological sort

This DFS class computes postorder numbering of the DFS forest (a reverse topological sort). Clients can use a TS object to relabel a DAG’s vertices so that every edge points from a higher-numbered vertex to a lower-numbered one or to arrange vertices such that the source vertex of every edge appears after the destination vertex (see Figure 19.23).
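The postorder-numbering idea can be sketched with standard containers as follows. This is an illustration under assumptions, not the listing of Program 19.6: the DAG is a vector of adjacency lists, the struct name ReverseTopologicalSort is hypothetical, and only the vector names post and postI follow the text.

```cpp
#include <vector>

// Sketch: postorder numbering of a DFS forest. For a DAG, post[v] is the
// position of v in a reverse topological order and postI is its inverse.
struct ReverseTopologicalSort {
    std::vector<int> post, postI;

    explicit ReverseTopologicalSort(const std::vector<std::vector<int>>& adj)
        : post(adj.size(), -1), postI(adj.size(), -1) {
        std::vector<bool> seen(adj.size(), false);
        int cnt = 0;
        for (int v = 0; v < (int)adj.size(); v++)
            if (!seen[v]) dfs(adj, v, seen, cnt);
    }

private:
    void dfs(const std::vector<std::vector<int>>& adj, int v,
             std::vector<bool>& seen, int& cnt) {
        seen[v] = true;
        for (int t : adj[v])
            if (!seen[t]) dfs(adj, t, seen, cnt);
        post[v] = cnt;          // number v as the final action of the call
        postI[cnt++] = v;
    }
};
```

On a DAG, every edge v-w then satisfies post[v] > post[w], which is the relabeling property that Property 19.11 establishes.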

Figure 19.24 DFS forest for a DAG

A DFS forest of a digraph has no back edges (edges to nodes with a higher postorder number) if and only if the digraph is a DAG. The non-tree edges in this DFS forest for the DAG of Figure 19.21 are either down edges (shaded squares) or cross edges (unshaded squares). The order in which vertices are encountered in a postorder walk of the forest, shown at the bottom, is a reverse topological sort (see Figure 19.23).

Computationally, the distinction between topological sort and reverse topological sort is not crucial. We can simply change the [] operator to return postI[G.V()-1-v], or we can modify the implementation in one of the following ways:
• Do a reverse topological sort on the reverse of the given DAG.
• Rather than using it as an index for postorder numbering, push the vertex number on a stack as the final act of the recursive procedure. After the search is complete, pop the vertices from the stack. They come off the stack in topological order.
• Number the vertices in reverse order (start at V-1 and count down to 0). If desired, compute the inverse of the vertex numbering to get the topological order.

Program 19.7 Topological sort
If we use this implementation of tsR in Program 19.6, the constructor computes a topological sort, not the reverse (for any DAG implementation that supports edge), because it replaces the reference to edge(v, w) in the DFS by edge(w, v), thus processing the reverse graph (see text).

void tsR(int v)
  {
    pre[v] = cnt++;
    for (int w = 0; w < D.V(); w++)
      if (D.edge(w, v))
        if (pre[w] == -1) tsR(w);
    post[v] = tcnt; postI[tcnt++] = v;
  }
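The stack-based option in the list above can be sketched in a few lines. This is only an illustration under stated assumptions: Digraph is a hypothetical adjacency-lists type, and the function names dfsPush and topoSortWithStack are ours, not part of any program in this chapter.

```cpp
#include <stack>
#include <vector>

// Assumption: a DAG is given as adjacency lists, dag[v] = vertices w with edge v-w.
using Digraph = std::vector<std::vector<int>>;

// Recursive DFS that pushes v as its final act, after all vertices
// reachable from v have been pushed.
static void dfsPush(const Digraph& dag, int v,
                    std::vector<bool>& seen, std::stack<int>& finished) {
    seen[v] = true;
    for (int w : dag[v])
        if (!seen[w]) dfsPush(dag, w, seen, finished);
    finished.push(v);
}

// Returns the vertices in topological order (sources first).
std::vector<int> topoSortWithStack(const Digraph& dag) {
    const int V = static_cast<int>(dag.size());
    std::vector<bool> seen(V, false);
    std::stack<int> finished;
    for (int v = 0; v < V; v++)
        if (!seen[v]) dfsPush(dag, v, seen, finished);

    std::vector<int> order;
    while (!finished.empty()) {         // popping reverses the postorder
        order.push_back(finished.top());
        finished.pop();
    }
    return order;
}
```

Popping the stack simply reverses the postorder, which is why this variant needs no reverse graph and no extra numbering pass.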

The proofs that these changes give a proper topological ordering are left for you to do as an exercise (see Exercise 19.97).

To implement the first of the options listed in the previous paragraph for sparse graphs (represented with adjacency lists), we would need to use Program 19.1 to compute the reverse graph. Doing so essentially doubles our space usage, which may thus become onerous for huge graphs. For dense graphs (represented with an adjacency matrix), as noted in Section 19.1, we can do DFS on the reverse without using any extra space or doing any extra work, simply by exchanging rows and columns when referring to the matrix, as illustrated in Program 19.7.

Next, we consider an alternative classical method for topological sorting that is more like breadth-first search (BFS) (see Section 18.7). It is based on the following property of DAGs.

Figure 19.25 Topologically sorting a DAG by removing sources
Since it is a source (no edges point to it), 0 can appear first in a topological sort of this example graph (left, top). If we remove 0 (and all the edges that point from it to other vertices), then 1 and 2 become sources in the resulting DAG (left, second from top), which we can then sort using the same algorithm. This figure illustrates the operation of Program 19.8, which picks from among the sources (the shaded nodes in each diagram) using the FIFO discipline, though any of the sources could be chosen at each step. See Figure 19.26 for the contents of the data structures that control the specific choices that the algorithm makes. The result of the topological sort illustrated here is the node order 0 8 2 1 7 3 6 5 4 9 11 10 12.

Property 19.12 Every DAG has at least one source and at least one sink.

Proof: Suppose that we have a DAG that has no sinks. Then, starting at any vertex, we can build an arbitrarily long directed path by following any edge from that vertex to any other vertex (there is at least one edge, since there are no sinks), then following another edge from that vertex, and so on. But once we have been to V+1 vertices, we must have seen a directed cycle, by the pigeonhole principle (see Property 19.6), which contradicts the assumption that we have a DAG. Therefore, every DAG has at least one sink. It follows that every DAG also has at least one source: its reverse’s sink.

From this fact, we can derive a topological-sort algorithm: Label any source with the smallest unused label, then remove it and label the rest of the DAG, using the same algorithm. Figure 19.25 is a trace of this algorithm in operation for our sample DAG.

Implementing this algorithm efficiently is a classic exercise in data-structure design (see reference section). First, there may be multiple sources, so we need to maintain a queue to keep track of them (any generalized queue will do). Second, we need to identify the sources in the DAG that remains when we remove a source. We can accomplish this task by maintaining a vertex-indexed vector that keeps track of the indegree of each vertex. Vertices with indegree 0 are sources, so we can initialize the queue with one scan through the DAG (using DFS or any other method that examines all of the edges). Then, we perform the following operations until the source queue is empty:
• Remove a source from the queue and label it.
• Decrement the entries in the indegree vector corresponding to the destination vertex of each of the removed vertex’s edges.
• If decrementing any entry causes it to become 0, insert the corresponding vertex onto the source queue.

Program 19.8 is an implementation of this method, using a FIFO queue, and Figure 19.26 illustrates its operation on our sample DAG, providing the details behind the dynamics of the example in Figure 19.25.
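The three steps above can be sketched as a single function. This is a minimal illustration, not the Program 19.8 class: it assumes a hypothetical adjacency-lists Digraph type and uses std::queue for the FIFO source queue, and the name sourceQueueTopoSort is ours.

```cpp
#include <queue>
#include <vector>

// Assumption: a digraph is given as adjacency lists, g[v] = vertices w with edge v-w.
using Digraph = std::vector<std::vector<int>>;

// Returns the vertices in the order in which they leave the source queue.
// For a DAG this is a topological order containing all V vertices; if the
// result has fewer than V entries, the unlabeled vertices contain a cycle.
std::vector<int> sourceQueueTopoSort(const Digraph& g) {
    const int V = static_cast<int>(g.size());

    // One scan over all edges fills the vertex-indexed indegree table.
    std::vector<int> indegree(V, 0);
    for (int v = 0; v < V; v++)
        for (int w : g[v]) indegree[w]++;

    // Vertices with indegree 0 are the initial sources.
    std::queue<int> sources;
    for (int v = 0; v < V; v++)
        if (indegree[v] == 0) sources.push(v);

    std::vector<int> order;
    while (!sources.empty()) {
        int v = sources.front();
        sources.pop();
        order.push_back(v);             // label v with the next unused label
        for (int w : g[v])              // "remove" v's edges
            if (--indegree[w] == 0) sources.push(w);
    }
    return order;
}
```

Replacing std::queue with another container changes only which source is picked next; any such choice still yields a valid topological order.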

Figure 19.26 Indegree table and queue contents
This sequence depicts the contents of the indegree table (left) and the source queue (right) during the execution of Program 19.8 on the sample DAG corresponding to Figure 19.25. At any given point in time, the source queue contains the nodes with indegree 0. Reading from top to bottom, we remove the leftmost node from the source queue, decrement the indegree entry corresponding to every edge leaving that node, and add any vertices whose entries become 0 to the source queue. For example, the second line of the table reflects the result of removing 0 from the source queue, then (because the DAG has the edges 0-1, 0-2, 0-3, 0-5, and 0-6) decrementing the indegree entries corresponding to 1, 2, 3, 5, and 6 and adding 2 and 1 to the source queue (because decrementing made their indegree entries 0). Reading the leftmost entries in the source queue from top to bottom gives a topological ordering for the graph.

Program 19.8 Source-queue–based topological sort
This class implements the same interface as do Programs 19.6 and 19.7. It maintains a queue of sources and uses a table that keeps track of the indegree of each vertex in the DAG induced by the vertices that have not been removed from the queue. When we remove a source from the queue, we decrement the indegree entries corresponding to each of the vertices on its adjacency list (and put on the queue any vertices corresponding to entries that become 0). Vertices come off the queue in topologically sorted order.

The source queue does not empty until every vertex in the DAG is labeled, because the subgraph induced by the vertices not yet labeled is always a DAG, and every DAG has at least one source. Indeed, we can use the algorithm to test whether a graph is a DAG by inferring that there must be a cycle in the subgraph induced by the vertices not yet labeled if the queue empties before all the vertices are labeled (see Exercise 19.104).

Processing vertices in topologically sorted order is a basic technique in processing DAGs. A classic example is the problem of finding the length of the longest path in a DAG. Considering the vertices in reverse topologically sorted order, the length of the longest path originating at each vertex v is easy to compute: Add one to the maximum of the lengths of the longest paths originating at each of the vertices reachable by a single edge from v. The topological sort ensures that all those lengths are known when v is processed, and that no other paths from v will be found afterwards. For example, taking a left-to-right scan of the reverse topological sort shown in Figure 19.23, we can quickly compute the following table of lengths of the longest paths originating at each vertex in the sample graph in Figure 19.21.

For example, the 6 corresponding to 0 (third column from the right) says that there is a path of length 6 originating at 0, which we know because there is an edge 0-2, we previously found the length of the longest path from 2 to be 5, and no other edge from 0 leads to a node having a longer path.
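The longest-path computation just described can be sketched by processing each vertex after the recursive calls of a DFS, which visits the vertices in reverse topological order. The sketch assumes a hypothetical adjacency-lists Digraph type; the names longestFrom and longestPathLengths are illustrative, and the input is assumed to be a DAG (the code does not check for cycles).

```cpp
#include <algorithm>
#include <vector>

// Assumption: a DAG is given as adjacency lists, dag[v] = vertices w with edge v-w.
using Digraph = std::vector<std::vector<int>>;

// Computes len[v] after the recursive calls for v's neighbors, so every
// len[w] that is read below is already final (a DAG has no back edges).
static void longestFrom(const Digraph& dag, int v,
                        std::vector<bool>& done, std::vector<int>& len) {
    done[v] = true;
    int best = 0;                       // a sink originates only the empty path
    for (int w : dag[v]) {
        if (!done[w]) longestFrom(dag, w, done, len);
        best = std::max(best, len[w] + 1);
    }
    len[v] = best;
}

// len[v] = number of edges on the longest path originating at v.
std::vector<int> longestPathLengths(const Digraph& dag) {
    const int V = static_cast<int>(dag.size());
    std::vector<bool> done(V, false);
    std::vector<int> len(V, 0);
    for (int v = 0; v < V; v++)
        if (!done[v]) longestFrom(dag, v, done, len);
    return len;
}
```

The maximum entry of the returned vector is the length of the longest path in the whole DAG.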

Whenever we use topological sorting for such an application, we have several choices in developing an implementation:
• Use DAGts in a DAG ADT, then proceed through the vector it computes to process the vertices.
• Process the vertices after the recursive calls in a DFS.
• Process the vertices as they come off the queue in a source-queue–based topological sort.

All of these methods are used in DAG-processing implementations in the literature, and it is important to know that they are all equivalent. We will consider other topological-sort applications in Exercises 19.111 and 19.114 and in Sections 19.7 and 21.4.

Exercises
• 19.92 Write a function that checks whether or not a given permutation of a DAG’s vertices is a proper topological sort of that DAG.
19.93 How many different topological sorts are there of the DAG that is depicted in Figure 19.6?
• 19.94 Give the DFS forest and the reverse topological sort that results from doing a standard adjacency-lists DFS (with postorder numbering) of the DAG
3-7 1-4 7-8 0-5 5-2 3-8 2-9 0-6 4-9 2-6 6-4 4-3 2-3.
• 19.95 Give the DFS forest and the topological sort that results from building a standard adjacency-lists representation of the DAG
3-7 1-4 7-8 0-5 5-2 3-8 2-9 0-6 4-9 2-6 6-4 4-3 2-3,
then using Program 19.1 to build the reverse, then doing a standard adjacency-lists DFS with postorder numbering.
• 19.96 Program 19.6 uses postorder numbering to do a reverse topological sort. Why not use preorder numbering to do a topological sort? Give a three-vertex example that illustrates the reason.
• 19.97 Prove the correctness of each of the three suggestions given in the text for modifying DFS with postorder numbering such that it computes a topological sort instead of a reverse topological sort.
• 19.98 Give the DFS forest and the topological sort that results from doing a standard adjacency-matrix DFS with implicit reversal (and postorder numbering) of the DAG
3-7 1-4 7-8 0-5 5-2 3-8 2-9 0-6 4-9 2-6 6-4 4-3 2-3
(see Program 19.7).
• 19.99 Given a DAG, does there exist a topological sort that cannot result from applying a DFS-based algorithm, no matter what order the vertices adjacent to each vertex are chosen? Prove your answer.
• 19.100 Show, in the style of Figure 19.26, the process of topologically sorting the DAG
3-7 1-4 7-8 0-5 5-2 3-8 2-9 0-6 4-9 2-6 6-4 4-3 2-3
with the source-queue algorithm (Program 19.8).
• 19.101 Give the topological sort that results if the data structure used in the example depicted in Figure 19.25 is a stack rather than a queue.
• 19.102 Given a DAG, does there exist a topological sort that cannot result from applying the source-queue algorithm, no matter what queue discipline is used? Prove your answer.
19.103 Modify the source-queue topological-sort algorithm to use a generalized queue. Use your modified algorithm with a LIFO queue, a stack, and a randomized queue.
• 19.104 Use Program 19.8 to provide an implementation for a class for verifying that a DAG has no cycles (see Exercise 19.75).
• 19.105 Convert the source-queue topological-sort algorithm into a sink-queue algorithm for reverse topological sorting.
19.106 Write a program that generates all possible topological orderings of a given DAG, or, if the number of such orderings exceeds a bound taken as an argument, prints that number.
19.107 Write a program that converts any digraph with V vertices and E edges into a DAG by doing a DFS-based topological sort and changing the orientation of any back edge encountered. Prove that this strategy always produces a DAG.
•• 19.108 Write a program that produces each of the possible DAGs with V vertices and E edges with equal likelihood (see Exercise 17.70).
19.109 Give necessary and sufficient conditions for a DAG to have just one possible topologically sorted ordering of its vertices.
19.110 Run empirical tests to compare the topological-sort algorithms given in this section for various DAGs (see Exercise 19.2, Exercise 19.76, Exercise 19.107, and Exercise 19.108). Test your program as described in Exercise 19.11 (for low densities) and as described in Exercise 19.12 (for high densities).
• 19.111 Modify Program 19.8 so that it computes the number of different simple paths from any source to each vertex in a DAG.
• 19.112 Write a class that evaluates DAGs that represent arithmetic expressions (see Figure 19.19). Use a vertex-indexed vector to hold values corresponding to each vertex. Assume that values corresponding to leaves have been established.
• 19.113 Describe a family of arithmetic expressions with the property that the size of the expression tree is exponentially larger than the size of the corresponding DAG (so the running time of your program from Exercise 19.112 for the DAG is proportional to the logarithm of the running time for the tree).
• 19.114 Develop a method for finding the longest simple directed path in a DAG, in time proportional to V. Use your method to implement a class for printing a Hamilton path in a given DAG, if one exists.

19.7 Reachability in DAGs

To conclude our study of DAGs, we consider the problem of computing the transitive closure of a DAG. Can we develop algorithms for DAGs that are more efficient than the algorithms for general digraphs that we considered in Section 19.3?

Any method for topological sorting can serve as the basis for a transitive-closure algorithm for DAGs, as follows: We proceed through the vertices in reverse topological order, computing the reachability vector for each vertex (its row in the transitive-closure matrix) from the rows corresponding to its adjacent vertices. The reverse topological sort ensures that all those rows have already been computed. In total, we check each of the V entries in the vector corresponding to the destination vertex of each of the E edges, for a total running time proportional to VE. Although it is simple to implement, this method is no more efficient for DAGs than for general digraphs.

When we use a standard DFS for the topological sort (see Program 19.7), we can improve performance for some DAGs, as shown in Program 19.9. Since there are no cycles in a DAG, there are no back edges in any DFS. More important, both cross edges and down edges point to nodes for which the DFS has completed. To take advantage of this fact, we develop a recursive function to compute all vertices reachable from a given start vertex, but (as usual in DFS) we make no recursive calls for vertices for which the reachable set has already been computed. In this case, the reachable vertices are represented by a row in the transitive closure, and the recursive function takes the logical or of all the rows associated with its adjacent edges. For tree edges, we do a recursive call to compute the row; for cross edges, we can skip the recursive call because we know that the row has been computed by a previous recursive call; for down edges, we can skip the whole computation because any reachable nodes that it would add have already been accounted for in the set of reachable nodes for the destination vertex (lower and earlier in the DFS tree).

Using this version of DFS might be characterized as using dynamic programming to compute the transitive closure because we make use of results that have already been computed (and saved in the adjacency matrix rows corresponding to previously-processed vertices) to avoid making unnecessary recursive calls. Figure 19.27 illustrates the computation of the transitive closure for the sample DAG in Figure 19.6.

Property 19.13 With dynamic programming and DFS, we can support constant query time for the abstract transitive closure of a DAG with space proportional to V^2 and time proportional to V^2 + VX for preprocessing (computing the transitive closure), where X is the number of cross edges in the DFS forest.

Figure 19.27 Transitive closure of a DAG
This sequence of row vectors is the transitive closure of the DAG in Figure 19.21, with rows created in reverse topological order, computed as the last action in a recursive DFS function (see Program 19.9). Each row is the logical or of the rows for adjacent vertices, which appear earlier in the list. For example, to compute the row for 0 we take the logical or of the rows for 5, 2, 1, and 6 (and put a 1 corresponding to 0 itself) because the edges 0-5, 0-2, 0-1, and 0-6 take us from 0 to any vertex that is reachable from any of those vertices. We can ignore down edges because they add no new information. For example, we ignore the edge from 0 to 3 because the vertices reachable from 3 are already accounted for in the row corresponding to 2.

Program 19.9 Transitive closure of a DAG
The constructor in this class computes the transitive closure of a DAG with a single DFS. It recursively computes the reachable vertices from each vertex from the reachable vertices of its children in the DFS tree.
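A minimal sketch of this DFS-based computation follows; it is an illustration under stated assumptions rather than the Program 19.9 listing. It assumes a hypothetical adjacency-lists Digraph type and stores the closure as a V-by-V matrix of bits; the class name DagTC and the member reachable are ours. Following the convention of Figure 19.27, each vertex is marked as reaching itself.

```cpp
#include <vector>

// Assumption: a DAG is given as adjacency lists, dag[v] = vertices t with edge v-t.
using Digraph = std::vector<std::vector<int>>;

class DagTC {
    const Digraph& g;
    int V;
    int cnt = 0;
    std::vector<int> pre;                       // preorder numbers, -1 if unvisited
    std::vector<std::vector<bool>> reach;       // reach[s][t]: t reachable from s

    void tcR(int v) {
        pre[v] = cnt++;
        reach[v][v] = true;
        for (int t : g[v]) {
            reach[v][t] = true;
            // Down edge: t is an already-finished descendant of v, so its row
            // was merged into v's row through an earlier tree edge.
            if (pre[t] > pre[v]) continue;
            if (pre[t] == -1) tcR(t);           // tree edge: compute t's row first
            // Tree or cross edge: t's row is complete, so or it into v's row.
            for (int i = 0; i < V; i++)
                if (reach[t][i]) reach[v][i] = true;
        }
    }

public:
    explicit DagTC(const Digraph& dag)
        : g(dag), V(static_cast<int>(dag.size())), pre(dag.size(), -1),
          reach(dag.size(), std::vector<bool>(dag.size(), false)) {
        for (int v = 0; v < V; v++)
            if (pre[v] == -1) tcR(v);
    }

    // Constant-time reachability query.
    bool reachable(int s, int t) const { return reach[s][t]; }
};
```

Only tree and cross edges pay for a full row merge, which is where the VX term in Property 19.13 comes from.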

Proof: The proof is immediate by induction from the recursive function in Program 19.9. We visit the vertices in reverse topological order. Every edge points to a vertex for which we have already computed all reachable vertices, and we can therefore compute the set of reachable vertices of any vertex by merging together the sets of reachable vertices associated with the destination vertex of each edge. Taking the logical or of the specified rows in the adjacency matrix accomplishes this merge. We access a row of size V for each tree edge and each cross edge. There are no back edges, and we can ignore down edges because we accounted for any vertices they reach when we processed any ancestors of both nodes earlier in the search.

If our DAG has no down edges (see Exercise 19.42), the running time of Program 19.9 is proportional to VE and represents no improvement over the transitive-closure algorithms that we examined for general digraphs in Section 19.3 (such as, for example, Program 19.4) or the approach based on topological sorting that is described at the beginning of this section. On the other hand, if the number of down edges is large (or, equivalently, the number of cross edges is small), Program 19.9 will be significantly faster than these methods.

The problem of finding an optimal algorithm (one that is guaranteed to finish in time proportional to V^2) for computing the transitive closure of dense DAGs is still unsolved. The best known worst-case performance bound is VE. However, we are certainly better off using an algorithm that runs faster for a large class of DAGs, such as Program 19.9, than we are using one that always runs in time proportional to VE, such as Program 19.4. As we see in Section 19.9, this performance improvement for DAGs has direct implications for our ability to compute the transitive closure of general digraphs as well.

Exercises
• 19.115 Show, in the style of Figure 19.27, the reachability vectors that result when we use Program 19.9 to compute the transitive closure of the DAG
3-7 1-4 7-8 0-5 5-2 3-8 2-9 0-6 4-9 2-6 6-4 4-3 2-3.
• 19.116 Develop a version of Program 19.9 that uses a representation of the transitive closure that does not support edge testing and that runs in time proportional to V^2 + Σ_e v(e), where the sum is over all edges in the DAG and v(e) is the number of vertices reachable from the destination vertex of edge e. This cost will be significantly less than VE for some sparse DAGs (see Exercise 19.65).
• 19.117 Implement an abstract–transitive-closure class for DAGs that uses extra space at most proportional to V (and is suitable for huge DAGs). Use topological sorting to provide quick response when the vertices are not connected, and use a source-queue implementation to return the length of the path when they are connected.
• 19.118 Develop a transitive-closure implementation based on a sink-queue–based reverse topological sort (see Exercise 19.105).
• 19.119 Does your solution to Exercise 19.118 require that you examine all edges in the DAG, or are there edges that can be ignored, such as the down edges in DFS? Give an example that requires examination of all edges, or characterize the edges that can be skipped.

19.8 Strong Components in Digraphs

Undirected graphs and DAGs are both simpler structures than general digraphs because of the structural symmetry that characterizes the reachability relationships among the vertices: In an undirected graph, if there is a path from s to t, then we know that there is also a path from t to s; in a DAG, if there is a directed path from s to t, then we know that there is no directed path from t to s. For general digraphs, knowing that t is reachable from s gives no information about whether s is reachable from t.

To understand the structure of digraphs, we consider strong connectivity, which has the symmetry that we seek. If s and t are strongly connected (each reachable from the other), then, by definition, so are t and s. As discussed in Section 19.1, this symmetry implies that the vertices of the digraph divide into strong components, which consist of mutually reachable vertices. In this section, we discuss three algorithms for finding the strong components in a digraph.

We use the same interface as for connectivity in our general graph-searching algorithms for undirected graphs (see Program 18.4). The goal of our algorithms is to assign component numbers to each vertex in a vertex-indexed vector, using the labels 0, 1, ..., for the strong components. The highest number assigned is one less than the number of strong components, and we can use the component numbers to provide a constant-time test of whether two vertices are in the same strong component.

A brute-force algorithm to solve the problem is simple to develop. Using an abstract–transitive-closure ADT, check every pair of vertices s and t to see whether t is reachable from s and s is reachable from t. Define an undirected graph with an edge for each such pair: The connected components of that graph are the strong components of the digraph. This algorithm is simple to describe and to implement, and its running time is dominated by the costs of the abstract–transitive-closure implementation, as described by, say, Property 19.10.

The algorithms that we consider in this section are triumphs of modern algorithm design that can find the strong components of any graph in linear time, a factor of V faster than the brute-force algorithm. For 100 vertices, these algorithms will be 100 times faster than the brute-force algorithm; for 1000 vertices, they will be 1000 times faster; and we can contemplate addressing problems involving billions of vertices. This problem is a prime example of the power of good algorithm design, one which has motivated many people to study graph algorithms closely. Where else might we contemplate reducing resource usage by a factor of 1 billion or more with an elegant solution to an important practical problem?

The history of this problem is instructive (see reference section). In the 1950s and 1960s, mathematicians and computer scientists began to study graph algorithms in earnest in a context where the analysis of algorithms itself was under development as a field of study. The broad variety of graph algorithms to be considered, coupled with ongoing developments in computer systems, languages, and our understanding of performing computations efficiently, left many difficult problems unsolved. As computer scientists began to understand many of the basic principles of the analysis of algorithms, they began to understand which graph problems could be solved efficiently and which could not and then to develop increasingly efficient algorithms for the former set of problems. Indeed, R. Tarjan introduced linear-time algorithms for strong connectivity and other graph problems in 1972, the same year that R. Karp documented the intractability of the traveling-salesperson problem and many other graph problems. Tarjan’s algorithm has been a staple of advanced courses in the analysis of algorithms for many years because it solves an important practical problem using simple data structures. In the 1980s, R. Kosaraju took a fresh look at the problem and developed a new solution; people later realized that a paper that describes essentially the same method appeared in the Russian scientific literature in 1972. Then, in 1999, H. Gabow found a simple implementation of one of the first approaches tried in the 1960s, giving a third linear-time algorithm for this problem.

The point of this story is not just that difficult graph-processing problems can have simple solutions, but also that the abstractions that we are using (DFS and adjacency lists) are more powerful than we might realize. As we become more accustomed to using these and similar tools, we should not be surprised to discover simple solutions to other important graph problems as well. Researchers still seek concise implementations like these for numerous other important graph algorithms; many such algorithms remain to be discovered.

Figure 19.28 Computing strong components (Kosaraju’s algorithm)
To compute the strong components of the digraph at the lower left, we first do a DFS of its reverse (top left), computing a postorder vector that gives the vertex indices in the order in which the recursive DFS completed (top). This order is equivalent to a postorder walk of the DFS forest (top right). Then we use the reverse of that order to do a DFS of the original digraph (bottom). First we check all nodes reachable from 9, then we scan from right to left through the vector to find that 1 is the rightmost unvisited vertex, so we do the recursive call for 1, and so forth. The trees in the DFS forest that results from this process define the strong components: All vertices in each tree have the same value in the vertex-indexed id vector (bottom).

Kosaraju’s method is simple to explain and implement. To find the strong components of a graph, first run DFS on its reverse, computing the permutation of vertices defined by the postorder numbering. (This process constitutes a topological sort if the digraph is a DAG.) Then, run DFS again on the graph, but to find the next vertex to search (when calling the recursive search function, both at the outset and each time that the recursive search function returns to the top-level search function), use the unvisited vertex with the highest postorder number.

The magic of the algorithm is that, when the unvisited vertices are checked according to the topological sort in this way, the trees in the DFS forest define the strong components just as trees in a DFS forest define the connected components in undirected graphs: two vertices are in the same strong component if and only if they belong to the same tree in this forest. Figure 19.28 illustrates this fact for our example, and we will prove it in a moment. Therefore, we can assign component numbers as we did for undirected graphs, incrementing the component number each time that the recursive function returns to the top-level search function. Program 19.10 is a full implementation of the method.

Property 19.14 Kosaraju’s method finds the strong components of a graph in linear time and space.

Proof: The method consists of minor modifications to two DFS procedures, so the running time is certainly proportional to V^2 for dense graphs and V + E for sparse graphs (using an adjacency-lists representation), as usual. To prove that it computes the strong components properly, we have to prove that two vertices s and t are in the same tree in the DFS forest for the second search if and only if they are mutually reachable.

If s and t are mutually reachable, they certainly will be in the same DFS tree because when the first of the two is visited, the second is unvisited and is reachable from the first and so will be visited before the recursive call for the root terminates.

To prove the converse, we assume that s and t are in the same tree, and let r be the root of the tree. The fact that s is reachable from r (through a directed path of tree edges) implies that there is a directed path from s to r in the reverse digraph. Now, the key to the proof is that there must also be a path from r to s in the reverse digraph because r has a higher postorder number than s (since r was chosen first in the second DFS at a time when both were unvisited) and there is a path from s to r: If there were no path from r to s, then the path from s to r in the reverse would leave s with a higher postorder number. Therefore, there are directed paths from s to r and from r to s in the digraph and its reverse: s and r are strongly connected. The same argument proves that t and r are strongly connected, and therefore s and t are strongly connected.

The implementation for Kosaraju’s algorithm for the adjacency-matrix digraph representation is even simpler than Program 19.10 because we do not need to compute the reverse explicitly; that problem is left as an exercise (see Exercise 19.125).

Program 19.10 Strong components (Kosaraju’s algorithm)
Clients can use objects of this class to find the number of strong components of a digraph (count) and to do strong-connectivity tests (stronglyconnected). The SC constructor first builds the reverse digraph and does a DFS to compute a postorder numbering. Next, it does a DFS of the original digraph, using the reverse of the postorder from the first DFS in the search loop that calls the recursive function. Each recursive call in the second DFS visits all the vertices in a strong component.
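Kosaraju’s method as described above can be sketched as follows. This is an illustration under stated assumptions, not the Program 19.10 listing: Digraph is a hypothetical adjacency-lists type, and the names KosarajuSC, postorderR, labelR, and stronglyConnected are ours (count and the vertex-indexed id vector follow the description in the text).

```cpp
#include <vector>

// Assumption: a digraph is given as adjacency lists, g[v] = vertices w with edge v-w.
using Digraph = std::vector<std::vector<int>>;

class KosarajuSC {
    int scnt = 0;                       // number of strong components found
    std::vector<int> id_;               // component number per vertex, -1 if unassigned
    std::vector<int> postI;             // vertices in postorder of the reverse digraph

    // First pass: DFS of the reverse digraph, recording postorder.
    void postorderR(const Digraph& r, int v, std::vector<bool>& seen) {
        seen[v] = true;
        for (int w : r[v])
            if (!seen[w]) postorderR(r, w, seen);
        postI.push_back(v);
    }

    // Second pass: DFS of the original digraph, labeling one component.
    void labelR(const Digraph& g, int v) {
        id_[v] = scnt;
        for (int w : g[v])
            if (id_[w] == -1) labelR(g, w);
    }

public:
    explicit KosarajuSC(const Digraph& g) : id_(g.size(), -1) {
        const int V = static_cast<int>(g.size());

        Digraph r(V);                   // build the reverse digraph explicitly
        for (int v = 0; v < V; v++)
            for (int w : g[v]) r[w].push_back(v);

        std::vector<bool> seen(V, false);
        for (int v = 0; v < V; v++)
            if (!seen[v]) postorderR(r, v, seen);

        // Start each new search at the unvisited vertex with the highest
        // postorder number; each top-level call labels one strong component.
        for (int i = V - 1; i >= 0; i--) {
            int v = postI[i];
            if (id_[v] == -1) { labelR(g, v); scnt++; }
        }
    }

    int count() const { return scnt; }
    int id(int v) const { return id_[v]; }
    bool stronglyConnected(int s, int t) const { return id_[s] == id_[t]; }
};
```

The three passes over the edges (building the reverse, then two DFS passes) are all visible in the constructor.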

Program 19.10 represents an optimal solution to the strong-connectivity problem that is analogous to our solutions for connectivity in Chapter 18. In Section 19.9, we examine the task of extending this solution to compute the transitive closure and to solve the reachability (abstract–transitive-closure) problem for digraphs. First, however, we consider Tarjan’s algorithm and Gabow’s algorithm, ingenious methods that require only a few simple modifications to our basic DFS procedure. They are preferable to Kosaraju’s algorithm because they use only one pass through the graph and because they do not require computation of the reverse for sparse graphs.

Tarjan’s algorithm is similar to the program that we studied in Chapter 18 for finding bridges in undirected graphs (see Program 18.7). The method is based on two observations that we have already made in other contexts. First, we consider the vertices in reverse topological order so that when we reach the end of the recursive function for a vertex we know we will not encounter any more vertices in the same strong component (because all the vertices that can be reached from that vertex have been processed). Second, the back links in the tree provide a second path from one vertex to another and bind together the strong components.

The recursive DFS function uses the same computation as Program 18.7 to find the highest vertex reachable (via a back edge) from any descendant of each vertex. It also uses a vertex-indexed vector to keep track of the strong components and a stack to keep track of the current search path. It pushes the vertex names onto a stack on entry to the recursive function, then pops them and assigns component numbers after visiting the final member of each strong component. The algorithm is based on our ability to identify this moment with a simple test (based on keeping track of the highest ancestor reachable via one up link from all descendants of each node) at the end of the recursive procedure that tells us that all vertices encountered since entry (except those already assigned to a component) belong to the same strong component.

The implementation in Program 19.11 is a succinct and complete description of the algorithm that fills in the details missing from the brief sketch just given. Figure 19.29 illustrates the operation of the algorithm for our sample digraph from Figure 19.1.

Program 19.11 Strong components (Tarjan’s algorithm)
This DFS class is another implementation of the same interface as Program 19.10. It uses a stack S to hold each vertex until determining that all the vertices down to a certain point at the top of the stack belong to the same strong component. The vertex-indexed vector low keeps track of the lowest preorder number reachable via a series of down links followed by one up link from each node (see text).
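A sketch of Tarjan’s algorithm along the lines just described is given below. It is an illustration under stated assumptions rather than the Program 19.11 listing: Digraph is a hypothetical adjacency-lists type, a std::vector serves as the stack S, and the class name TarjanSC and the accessor names are ours; pre, low, cnt, scnt, and scR follow the names used in the text and in Figure 19.29.

```cpp
#include <vector>

// Assumption: a digraph is given as adjacency lists, g[v] = vertices t with edge v-t.
using Digraph = std::vector<std::vector<int>>;

class TarjanSC {
    const Digraph& g;
    int cnt = 0, scnt = 0;
    std::vector<int> pre, low, id_;
    std::vector<int> S;                 // stack of vertices not yet assigned a component

    void scR(int w) {
        int lowest = low[w] = pre[w] = cnt++;
        S.push_back(w);
        for (int t : g[w]) {
            if (pre[t] == -1) scR(t);
            if (low[t] < lowest) lowest = low[t];
        }
        // Some vertex reachable from w's subtree leads above w:
        // w is not the first-visited member of its component, so keep it stacked.
        if (lowest < low[w]) { low[w] = lowest; return; }

        // Otherwise w and everything above it on the stack form one component.
        int t;
        do {
            t = S.back();
            S.pop_back();
            id_[t] = scnt;
            low[t] = static_cast<int>(g.size());   // never lowers a later minimum
        } while (t != w);
        scnt++;
    }

public:
    explicit TarjanSC(const Digraph& digraph)
        : g(digraph), pre(digraph.size(), -1), low(digraph.size(), -1),
          id_(digraph.size(), -1) {
        for (int v = 0; v < static_cast<int>(g.size()); v++)
            if (pre[v] == -1) scR(v);
    }

    int count() const { return scnt; }
    int id(int v) const { return id_[v]; }
    bool stronglyConnected(int s, int t) const { return id_[s] == id_[t]; }
};
```

Setting low[t] to V for vertices that have been assigned a component is what keeps cross edges into finished components from affecting later minimum computations.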

Figure 19.29 Computing strong components (Tarjan and Gabow algorithms)
Tarjan’s algorithm is based on a recursive DFS, augmented to push vertices on a stack. It computes a component index for each vertex in a vertex-indexed vector id, using auxiliary vectors pre and low (center). The DFS tree for our sample graph is shown at the top and an edge trace at the bottom left. In the center at the bottom is the main stack: We push vertices reached by tree edges. Using a DFS to consider the vertices in reverse topological order, we compute, for each v, the highest point reachable via a back link from an ancestor (low[v]). When a vertex v has pre[v] = low[v] (vertices 11, 1, 0, and 7 here) we pop it and all the vertices above it (shaded) and assign them all the next component number.
In Gabow’s algorithm, we push vertices on the main stack, just as in Tarjan’s algorithm, but we also keep a second stack (bottom right) with vertices on the search path that are known to be in different strong components, by popping all vertices after the destination of each back edge. When we complete a vertex v with v at the top of this second stack (shaded), we know that all vertices above v on the main stack are in the same strong component.

Property 19.15 Tarjan’s algorithm finds the strong components of a digraph in linear time.

Proof sketch: If a vertex s has no descendants or up links in the DFS tree, or if it has a descendant in the DFS tree with an up link that points to s and no descendants with up links that point higher up in the tree, then it and all its descendants (except those vertices that satisfy the same property and their descendants) constitute a strong component. To establish this fact, we note that every descendant t of s that does not satisfy the stated property has some descendant that has an up link pointing higher than t in the tree. There is a path from s to t down through the tree and we can find a path from t to s as follows: Go down from t to the vertex with the up link that reaches past t, then continue the same process from that vertex until reaching s.

As usual, the method is linear time because it consists of adding a few constant-time operations to a standard DFS.

In 1999 Gabow discovered the version of Tarjan’s algorithm in Program 19.12. The algorithm maintains the same stack of vertices in the same way as does Tarjan’s algorithm, but it uses a second stack (instead of a vertex-indexed vector of preorder numbers) to decide when to pop all the vertices in each strong component from the main stack. The second stack contains vertices on the search path. When a back edge shows that a sequence of such vertices all belong to the same strong component, we pop that stack to leave only the destination vertex of the back edge, which is nearer the root of the tree than are any of the other vertices. After processing all the edges for each vertex (making recursive calls for the tree edges, popping the path stack for the back edges, and ignoring the down edges), we check to see whether the current vertex is at the top of the path stack. If it is, it and all the vertices above it on the main stack make a strong component, and we pop them and assign the next strong component number to them, as we did in Tarjan’s algorithm.

The example in Figure 19.29 also shows the contents of this second stack. Thus, this figure also illustrates the operation of Gabow’s algorithm.

Property 19.16 Gabow’s algorithm finds the strong components of a digraph in linear time.

Program 19.12 Strong components (Gabow’s algorithm)
This alternate implementation of the recursive member function in Program 19.11 uses a second stack path instead of the vertex-indexed vector low to decide when to pop the vertices in each strong component from the main stack (see text).
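The two-stack idea can be sketched as follows. Again this is an illustration under stated assumptions, not the Program 19.12 listing: Digraph is a hypothetical adjacency-lists type, std::vector objects stand in for the stacks S and path, and the class name GabowSC and the accessor names are ours.

```cpp
#include <vector>

// Assumption: a digraph is given as adjacency lists, g[v] = vertices t with edge v-t.
using Digraph = std::vector<std::vector<int>>;

class GabowSC {
    const Digraph& g;
    int cnt = 0, scnt = 0;
    std::vector<int> pre, id_;
    std::vector<int> S;                 // main stack: vertices not yet assigned
    std::vector<int> path;              // second stack: candidate component roots

    void scR(int w) {
        pre[w] = cnt++;
        S.push_back(w);
        path.push_back(w);
        for (int t : g[w]) {
            if (pre[t] == -1) {
                scR(t);                 // tree edge
            } else if (id_[t] == -1) {
                // Edge to a vertex still on the main stack: everything on the
                // path stack visited after t belongs to t's component.
                while (pre[path.back()] > pre[t]) path.pop_back();
            }
        }
        if (path.back() != w) return;   // w is not the root of its component
        path.pop_back();

        int v;
        do {                            // pop w's component off the main stack
            v = S.back();
            S.pop_back();
            id_[v] = scnt;
        } while (v != w);
        scnt++;
    }

public:
    explicit GabowSC(const Digraph& digraph)
        : g(digraph), pre(digraph.size(), -1), id_(digraph.size(), -1) {
        for (int v = 0; v < static_cast<int>(g.size()); v++)
            if (pre[v] == -1) scR(v);
    }

    int count() const { return scnt; }
    int id(int v) const { return id_[v]; }
    bool stronglyConnected(int s, int t) const { return id_[s] == id_[t]; }
};
```

Compared with the low vector in Tarjan’s algorithm, the path stack encodes the same information implicitly: a vertex remains on it only while no edge has been found that ties it to an earlier vertex on the search path.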

Formalizing the argument just outlined and proving the relationship between the stack contents that it depends upon is an instructive exercise for mathematically inclined readers (see Exercise 19.132). As usual, the method is linear time because it consists of adding a few constant-time operations to a standard DFS.

The strong-components algorithms that we have considered in this section are all ingenious and are deceptively simple. We have considered all three because they are testimony to the power of fundamental data structures and carefully crafted recursive programs. From a practical standpoint, the running time of all the algorithms is proportional to the number of edges in the digraph, and performance differences are likely to be dependent upon implementation details. For example, pushdown-stack ADT operations constitute the inner loop of Tarjan’s and Gabow’s algorithms. Our implementations use the bare-bones stack class implementations of Chapter 4; implementations that use STL stacks, which do error-checking and carry other overhead, may be slower. The implementation of Kosaraju’s algorithm is perhaps the simplest of the three, but it suffers the slight disadvantage (for sparse digraphs) of requiring three passes through the edges (one to make the reverse and two DFS passes).

Next, we consider a key application of computing strong components: building an efficient reachability (abstract–transitive-closure) ADT for digraphs.

Exercises

• 19.120 Describe what happens when you use Kosaraju’s algorithm to find the strong components of a DAG.

• 19.121 Describe what happens when you use Kosaraju’s algorithm to find the strong components of a digraph that consists of a single cycle.

•• 19.122 Can we avoid computing the reverse of the digraph in the adjacency-lists version of Kosaraju’s method (Program 19.10) by using one of the three techniques mentioned in Section 19.4 for avoiding the reverse computation when doing a topological sort? For each technique, give either a proof that it works or a counterexample that shows that it does not work.

• 19.123 Show, in the style of Figure 19.28, the DFS forests and the contents of the auxiliary vertex-indexed vectors that result when you use Kosaraju’s algorithm to compute the strong components of the reverse of the digraph in Figure 19.5. (You should have the same strong components.)

19.124 Show, in the style of Figure 19.28, the DFS forests and the contents of the auxiliary vertex-indexed vectors that result when you use Kosaraju’s algorithm to compute the strong components of the digraph

3-7 1-4 7-8 0-5 5-2 3-8 2-9 0-6 4-9 2-6 6-4.

• 19.125 Implement Kosaraju’s algorithm for finding the strong components of a digraph for a digraph representation that supports edge existence testing. Do not explicitly compute the reverse. Hint: Consider using two different recursive DFS functions.

• 19.126 Describe what happens when you use Tarjan’s algorithm to find the strong components of a DAG.

• 19.127 Describe what happens when you use Tarjan’s algorithm to find the strong components of a digraph that consists of a single cycle.

• 19.128 Show, in the style of Figure 19.29, the DFS forest, stack contents during the execution of the algorithm, and the final contents of the auxiliary vertex-indexed vectors that result when you use Tarjan’s algorithm to compute the strong components of the reverse of the digraph in Figure 19.5. (You should have the same strong components.)

19.129 Show, in the style of Figure 19.29, the DFS forest, stack contents during the execution of the algorithm, and the final contents of the auxiliary vertex-indexed vectors that result when you use Tarjan’s algorithm to compute the strong components of the digraph

3-7 1-4 7-8 0-5 5-2 3-8 2-9 0-6 4-9 2-6 6-4.

• 19.130 Modify the implementations of Tarjan’s algorithm in Program 19.11 and of Gabow’s algorithm in Program 19.12 such that they use sentinel values to avoid the need to check explicitly for cross links.

19.131 Show, in the style of Figure 19.29, the DFS forest, contents of both stacks during the execution of the algorithm, and the final contents of the auxiliary vertex-indexed vectors that result when you use Gabow’s algorithm to compute the strong components of the digraph

3-7 1-4 7-8 0-5 5-2 3-8 2-9 0-6 4-9 2-6 6-4.

• 19.132 Give a full proof of Property 19.16.

• 19.133 Develop a version of Gabow’s algorithm that finds bridges and edge-connected components in undirected graphs.

• 19.134 Develop a version of Gabow’s algorithm that finds articulation points and biconnected components in undirected graphs.

19.135 Develop a table in the spirit of Table 18.1 to study strong connectivity in random digraphs (see Table 19.2). Let S be the set of vertices in the largest strong component. Keep track of the size of S and study the percentages of edges in the following four classes: those connecting two vertices in S, those pointing out of S, those pointing into S, those connecting two vertices not in S.

19.136 Run empirical tests to compare the brute-force method for computing strong components described at the beginning of this section, Kosaraju’s algorithm, Tarjan’s algorithm, and Gabow’s algorithm, for various types of digraphs (see Exercises 19.11–18).

••• 19.137 Develop a linear-time algorithm for strong 2-connectivity: Determine whether a strongly connected digraph has the property that it remains strongly connected after removing any vertex (and all its incident edges).

19.9 Transitive Closure Revisited

By putting together the results of the previous two sections, we can develop an algorithm to solve the abstract–transitive-closure problem for digraphs that, although it offers no improvement over a DFS-based solution in the worst case, will provide an optimal solution in many situations.

The algorithm is based on preprocessing the digraph to build the latter’s kernel DAG (see Property 19.2). The algorithm is efficient if the kernel DAG is small relative to the size of the original digraph. If the digraph is a DAG (and therefore is identical to its kernel DAG) or if it has just a few small cycles, we will not see any significant cost savings; however, if the digraph has large cycles or large strong components (and therefore a small kernel DAG), we can develop optimal or near-optimal algorithms. For clarity, we assume the kernel DAG to be sufficiently small that we can afford an adjacency-matrix representation, although the basic idea is still effective for larger kernel DAGs.

To implement the abstract transitive closure, we preprocess the digraph as follows:

• Find its strong components
• Build its kernel DAG
• Compute the transitive closure of the kernel DAG

We can use Kosaraju’s, Tarjan’s, or Gabow’s algorithm to find the strong components; a single pass through the edges to build the kernel DAG (as described in the next paragraph); and DFS (Program 19.9) to compute its transitive closure. After this preprocessing, we can immediately access the information necessary to determine reachability.

Once we have a vertex-indexed vector with the strong components of a digraph, building an adjacency-matrix representation of its kernel DAG is a simple matter. The vertices of the DAG are the component numbers in the digraph. For each edge s-t in the original digraph, we simply set D->adj[sc[s]][sc[t]] to 1. We would have to cope with duplicate edges in the kernel DAG if we were using an adjacency-lists representation; in an adjacency matrix, duplicate edges simply correspond to setting to 1 a matrix entry that has already been set to 1. This small point is significant because the number of duplicate edges is potentially huge (relative to the size of the kernel DAG) in this application.
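The single pass that builds the kernel DAG can be sketched as follows. This is a minimal illustration rather than the book's code: the function name kernelDAG, the edge list of std::pair values, and the vector-of-vectors matrix are assumptions standing in for the graph ADT and for D->adj, and sc is assumed to hold component numbers between 0 and numComponents - 1, as produced by any of the three strong-components algorithms.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Hypothetical helper (not from the book): build the adjacency matrix of the
// kernel DAG from an edge list and a vertex-indexed vector of strong-component
// numbers. Each sc[v] is assumed to lie in [0, numComponents).
std::vector<std::vector<bool>> kernelDAG(
    const std::vector<std::pair<int, int>>& edges,
    const std::vector<int>& sc,
    int numComponents)
{
    std::vector<std::vector<bool>> adj(
        numComponents, std::vector<bool>(numComponents, false));
    for (const auto& e : edges) {
        int s = e.first, t = e.second;
        assert(s >= 0 && s < static_cast<int>(sc.size()));
        assert(t >= 0 && t < static_cast<int>(sc.size()));
        // A duplicate kernel edge merely sets an entry that is already set.
        adj[sc[s]][sc[t]] = true;
    }
    return adj;
}
```

Edges whose endpoints lie in the same strong component set a diagonal entry, which matches the convention that every vertex is reachable from itself.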

Property 19.17 Given two vertices s and t in a digraph D, let sc(s) and sc(t), respectively, be their corresponding vertices in D’s kernel DAG K. Then, t is reachable from s in D if and only if sc(t) is reachable from sc(s) in K.

This simple fact follows from the definitions. In particular, this property assumes the convention that a vertex is reachable from itself (all vertices have self-loops). If the vertices are in the same strong component (sc(s) = sc(t)), then they are mutually reachable.

Program 19.13 Strong-component–based transitive closure
This class implements the (abstract) transitive closure interface for digraphs by computing the strong components (using, say, Program 19.11), kernel DAG, and the transitive closure of the kernel DAG (using Program 19.9). It assumes that class SC has a public member function ID that returns the strong component index (from the id array) for any given vertex. These numbers are the vertex indices in the kernel DAG. A vertex t is reachable from a vertex s in the digraph if and only if ID(t) is reachable from ID(s) in the kernel DAG.

We determine whether a given vertex t is reachable from a given vertex s in the same way as we built the kernel DAG: We use the vertex-indexed vector computed by the strong-components algorithm to get the component numbers sc(s) and sc(t) (in constant time), which we interpret as abstract vertex indices for the kernel DAG. Using them as vertex indices for the transitive closure of the kernel DAG tells us the result.

Program 19.13 is an implementation of the abstract–transitive-closure ADT that embodies these ideas. We use an abstract–transitive-closure interface for the kernel DAG as well. The running time of this implementation depends not just on the number of vertices and edges in the digraph but also on properties of its kernel DAG. For purposes of analysis, we suppose that we use an adjacency-matrix representation for the kernel DAG because we expect the kernel DAG to be small, if not also dense.

Property 19.18 We can support constant query time for the abstract transitive closure of a digraph with space proportional to $V + v^2$ and time proportional to $E + v^2 + vx$ for preprocessing (computing the transitive closure), where $v$ is the number of vertices in the kernel DAG and $x$ is the number of cross edges in its DFS forest.

Proof: Immediate from Property 19.13.

If the digraph is a DAG, then the strong-components computation provides no new information, and this algorithm is the same as Program 19.9; in general digraphs that have cycles, however, this algorithm is likely to be significantly faster than Warshall’s algorithm or the DFS-based solution. For example, Property 19.18 immediately implies the following result.

Property 19.19 We can support constant query time for the abstract transitive closure of any digraph whose kernel DAG has fewer than $\sqrt[3]{V}$ vertices with space proportional to $V$ and time proportional to $E + V$ for preprocessing.

Proof: Take $v < \sqrt[3]{V}$ in Property 19.18 and note that $x < v^2$.

We might consider other variations on these bounds. For example, if we are willing to use space proportional to $E$, we can achieve the same time bounds when the kernel DAG has up to $\sqrt[3]{E}$ vertices. Moreover, these time bounds are conservative because they assume that the kernel DAG is dense with cross edges, and certainly it need not be so.

The primary limiting factor in the applicability of this method is the size of the kernel DAG. The more similar our digraph is to a DAG (the larger its kernel DAG), the more difficulty we face in computing its transitive closure. Note that (of course) we still have not violated the lower bound implicit in Property 19.9, since the algorithm runs in time proportional to $V^3$ for dense DAGs; we have, however, significantly broadened the class of graphs for which we can avoid this worst-case performance. Indeed, constructing a random-digraph model that produces digraphs for which the algorithm is slow is a challenge (see Exercise 19.142).
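The three preprocessing steps and the constant-time query can be put together in a short sketch. It is not Program 19.13: the class name KernelReachability and its interface are hypothetical, the strong-component numbers sc are assumed to have been computed already (by Kosaraju's, Tarjan's, or Gabow's algorithm), and, for brevity, the closure of the kernel DAG is computed with Warshall's algorithm on the small kernel matrix instead of the DFS of Program 19.9.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Hypothetical sketch of strong-component-based transitive closure.
class KernelReachability {
    std::vector<int> sc_;                    // component number of each vertex
    std::vector<std::vector<bool>> reach_;   // closure of the kernel DAG
public:
    KernelReachability(const std::vector<std::pair<int, int>>& edges,
                       const std::vector<int>& sc, int numComponents)
        : sc_(sc),
          reach_(numComponents, std::vector<bool>(numComponents, false))
    {
        const int n = numComponents;
        // Convention from the text: every vertex is reachable from itself.
        for (int c = 0; c < n; ++c) reach_[c][c] = true;
        // Build the kernel DAG: one matrix entry per (possibly repeated) edge.
        for (const auto& e : edges) {
            assert(sc_[e.first] < n && sc_[e.second] < n);
            reach_[sc_[e.first]][sc_[e.second]] = true;
        }
        // Warshall's algorithm on the n-by-n kernel matrix.
        for (int i = 0; i < n; ++i)
            for (int s = 0; s < n; ++s)
                if (reach_[s][i])
                    for (int t = 0; t < n; ++t)
                        if (reach_[i][t]) reach_[s][t] = true;
    }
    // Constant-time query: translate vertices to kernel-DAG indices.
    bool reachable(int s, int t) const { return reach_[sc_[s]][sc_[t]]; }
};
```

The query touches only the vertex-indexed vector and one matrix entry, so its cost does not depend on the size of the digraph; all the work that depends on the kernel DAG happens in the constructor.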

Table 19.2 Properties of random digraphs
This table shows the numbers of edges and vertices in the kernel DAGs for random digraphs generated from two different models (the directed versions of the models in Table 18.1). In both cases, the kernel DAG becomes small (and is sparse) as the density increases.

Table 19.2 displays the results of an empirical study; it shows that random digraphs have small kernel DAGs even for moderate densities and even in models with severe restrictions on edge placement. Although there can be no guarantees in the worst case, we can expect to see huge digraphs with small kernel DAGs in practice. When we do have such a digraph, we can provide an efficient implementation of the abstract–transitive-closure ADT.

Exercises

• 19.138 Develop a version of the implementation of the abstract transitive closure for digraphs based on using a sparse-graph representation of the kernel DAG. Your challenge is to eliminate duplicates on the list without using an excessive amount of time or space (see Exercise 19.65).

• 19.139 Show the kernel DAG computed by Program 19.13 and its transitive closure for the digraph

3-7 1-4 7-8 0-5 5-2 3-8 2-9 0-6 4-9 2-6 6-4.

• 19.140 Convert the strong-component–based abstract–transitive-closure implementation (Program 19.13) into an efficient program that computes the adjacency matrix of the transitive closure for a digraph represented with an adjacency matrix, using Gabow’s algorithm to compute the strong components and the improved Warshall’s algorithm to compute the transitive closure of the DAG.

19.141 Do empirical studies to estimate the expected size of the kernel DAG for various types of digraphs (see Exercises 19.11–18).

•• 19.142 Develop a random-digraph model that generates digraphs that have large kernel DAGs. Your generator must generate edges one at a time, but it must not make use of any structural properties of the resulting graph.

19.143 Develop an implementation of the abstract transitive closure in a digraph by finding the strong components and building the kernel DAG, then answering reachability queries in the affirmative if the two vertices are in the same strong component, and doing a DFS in the DAG to determine reachability otherwise.

19.10 Perspective

In this chapter, we have considered algorithms for solving the topological-sorting, transitive-closure, and shortest-paths problems for digraphs and for DAGs, including fundamental algorithms for finding cycles and strong components in digraphs. These algorithms have numerous important applications in their own right and also serve as the basis for the more difficult problems involving weighted graphs that we consider in the next two chapters. Worst-case running times of these algorithms are summarized in Table 19.3.

Table 19.3 Worst-case cost of digraph-processing operations
This table summarizes the cost (worst-case running time) of algorithms for various digraph-processing problems considered in this chapter, for random graphs and graphs where edges randomly connect each vertex to one of 10 specified neighbors. All costs assume use of the adjacency-list representation; for the adjacency-matrix representation the $E$ entries become $V^2$ entries, so, for example, the cost of computing all shortest paths is $V^3$. The linear-time algorithms are optimal, so the costs will reliably predict the running time on any input; the others may be overly conservative estimates of cost, so the running time may be lower for certain types of graphs. Performance characteristics of the fastest algorithms for computing the transitive closure of a digraph depend on the digraph’s structure, particularly the size of its kernel DAG.

In particular, a common theme through the chapter has been the solution of the abstract–transitive-closure problem, where we wish to support an ADT that can determine quickly, after preprocessing, whether there is a directed path from one given vertex to another. Despite a lower bound that implies that our worst-case preprocessing costs are significantly higher than $V^2$, the method discussed in Section 19.7 melds the basic methods from throughout the chapter into a simple solution that provides optimal performance for many types of digraphs (the significant exception being dense DAGs). The lower bound suggests that better guaranteed performance on all graphs will be difficult to achieve, but we can use these methods to get good performance on practical graphs.

The goal of developing an algorithm with performance characteristics similar to the union-find algorithms of Chapter 1 for dense digraphs remains elusive. Ideally, we would like to define an ADT where we can add directed edges or test whether one vertex is reachable from another and to develop an implementation where we can support all the operations in constant time (see Exercises 19.153 through 19.155). As discussed in Chapter 1, we can come close to that goal for undirected graphs, but comparable solutions for digraphs or DAGs are still not known. (Note that removing edges presents a challenge even for undirected graphs.) Not only does this dynamic reachability problem have both fundamental appeal and direct application in practice, but also it plays a critical role in the development of algorithms at a higher level of abstraction. For example, reachability lies at the heart of the problem of implementing the network simplex algorithm for the mincost flow problem, a problem-solving model of wide applicability that we consider in Chapter 22.

Many other algorithms for processing digraphs and DAGs have important practical applications and have been studied in detail, and many digraph-processing problems still call for the development of efficient algorithms. The following list is representative.

Dominators Given a DAG with all vertices reachable from a single source r, a vertex s dominates a vertex t if every path from r to t contains s. (In particular, each vertex dominates itself.) Every vertex v other than the source has an immediate dominator that dominates v but does not dominate any dominator of v but v and itself. The set of immediate dominators is a tree that spans all vertices reachable from the source. This structure is important in compiler implementations. The dominator tree can be computed in linear time with a DFS-based approach that uses several ancillary data structures, although a slightly slower version is typically used in practice.

Transitive reduction Given a digraph, find a digraph that has the same transitive closure and the smallest number of edges among all such digraphs. This problem is tractable (see Exercise 19.150); but if we restrict it to insist that the result be a subgraph of the original graph, it is NP-hard.

Directed Euler path Given a digraph, is there a directed path connecting two given vertices that uses each edge in the digraph exactly once? This problem is easy by essentially the same arguments as for the corresponding problem for undirected graphs, which we considered in Section 17.7 (see Exercise 17.92).

Directed mail carrier Given a digraph, find a directed tour with a minimal number of edges that uses every edge in the graph at least once (but is allowed to use edges multiple times). As we shall see in Section 22.7, this problem reduces to the mincost-flow problem and is therefore tractable.

Directed Hamilton path Find the longest simple directed path in a digraph. This problem is NP-hard, but it is easy if the digraph is a DAG (see Exercise 19.114).

Uniconnected subgraph A digraph is said to be uniconnected if there is at most one directed path between any pair of vertices. Given a digraph and an integer k, determine whether there is a uniconnected subgraph with at least k edges. This problem is known to be NP-hard for general k.

Feedback vertex set Decide whether a given digraph has a subset of at most k vertices that contains at least one vertex from every directed cycle in G. This problem is known to be NP-hard.

Even cycle Decide whether a given digraph has a cycle of even length. As mentioned in Section 17.8, this problem, while not intractable, is so difficult to solve that no one has yet devised an algorithm that is useful in practice.

Just as for undirected graphs, myriad digraph-processing problems have been studied, and knowing whether a problem is easy or intractable is often a challenge (see Section 17.8). As indicated throughout this chapter, some of the facts that we have discovered about digraphs are expressions of more general mathematical phenomena, and many of our algorithms have applicability at levels of abstraction different from that at which we have been working. On the one hand, the concept of intractability tells us that we might encounter fundamental roadblocks in our quest for efficient algorithms that can guarantee efficient solutions to some problems. On the other hand, the classic algorithms described in this chapter are of fundamental importance and have broad applicability, as they provide efficient solutions to problems that arise frequently in practice and would otherwise be difficult to solve.

Exercises

19.144 Adapt Programs 17.18 and 17.19 to implement an ADT function for printing an Euler path in a digraph, if one exists. Explain the purpose of any additions or changes that you need to make in the code.

• 19.145 Draw the dominator tree of the digraph

3-7 1-4 7-8 0-5 5-2 3-0 2-9 0-6 4-9 2-6 6-4 1-5 8-2 9-0 8-3 4-5 2-3 1-6 3-5 7-6.

•• 19.146 Write a class that uses DFS to create a parent-link representation of the dominator tree of a given digraph (see reference section).

• 19.147 Find a transitive reduction of the digraph

3-7 1-4 7-8 0-5 5-2 3-0 2-9 0-6 4-9 2-6 6-4 1-5 8-2 9-0 8-3 4-5 2-3 1-6 3-5 7-6.

• 19.148 Find a subgraph of the digraph

3-7 1-4 7-8 0-5 5-2 3-0 2-9 0-6 4-9 2-6 6-4 1-5 8-2 9-0 8-3 4-5 2-3 1-6 3-5 7-6

that has the same transitive closure and the smallest number of edges among all such subgraphs.

• 19.149 Prove that every DAG has a unique transitive reduction, and give an efficient ADT function implementation for computing the transitive reduction of a DAG.

• 19.150 Write an efficient ADT function for digraphs that computes a transitive reduction.

19.151 Give an algorithm that determines whether or not a given digraph is uniconnected. Your algorithm should have a worst-case running time proportional to VE.

19.152 Find the largest uniconnected subgraph in the digraph

3-7 1-4 7-8 0-5 5-2 3-0 2-9 0-6 4-9 2-6 6-4 1-5 8-2 9-0 8-3 4-5 2-3 1-6 3-5 7-6.

• 19.153 Develop a digraph class implementation that supports inserting an edge, removing an edge, and testing whether two vertices are in the same strong component, such that construction, edge insertion, and edge removal all take linear time and strong-connectivity queries take constant time, in the worst case.

• 19.154 Solve Exercise 19.153, in such a way that edge insertion, edge removal, and strong-connectivity queries all take time proportional to log V in the worst case.

••• 19.155 Solve Exercise 19.153, in such a way that edge insertion, edge removal, and strong-connectivity queries all take near-constant time (as they do for the union-find algorithms for connectivity in undirected graphs).

CHAPTER TWENTY
Minimum Spanning Trees

GRAPH MODELS WHERE we associate weights or costs with each edge are called for in many applications. In an airline map where edges represent flight routes, these weights might represent distances or fares. In an electric circuit where edges represent wires, the weights might represent the length of the wire, its cost, or the time that it takes a signal to propagate through it. In a job-scheduling problem, weights might represent time or the cost of performing tasks or of waiting for tasks to be performed.

Questions that entail cost minimization naturally arise for such situations. We examine algorithms for two such problems: (i) find the lowest-cost way to connect all of the points, and (ii) find the lowest-cost path between two given points. The first type of algorithm, which is useful for undirected graphs that represent objects such as electric circuits, finds a minimum spanning tree; it is the subject of this chapter. The second type of algorithm, which is useful for digraphs that represent objects such as an airline route map, finds the shortest paths; it is the subject of Chapter 21. These algorithms have broad applicability beyond circuit and map applications, extending to a variety of problems that arise on weighted graphs.

When we study algorithms that process weighted graphs, our intuition is often supported by thinking of the weights as distances: We speak of “the vertex closest to x,” and so forth. Indeed, the term “shortest path” embraces this bias. Despite numerous applications where we actually do work with distance and despite the benefits of geometric intuition in understanding the basic algorithms, it is important to remember that the weights do not need to be proportional to a distance at all; they might represent time or cost or an entirely different variable. Indeed, as we see in Chapter 21, weights in shortest-paths problems can even be negative.

To appeal to intuition in describing algorithms and examples while still retaining general applicability, we use ambiguous terminology where we refer interchangeably to edge lengths and weights. When we refer to a “short” edge, we mean a “low-weight” edge, and so forth. For most of the examples in this chapter, we use weights that are proportional to the distances between the vertices, as shown in Figure 20.1. Such graphs are more convenient for examples, because we do not need to carry the edge labels and can still tell at a glance that longer edges have weights higher than those of shorter edges. When the weights do represent distances, we can consider algorithms that gain efficiency by taking into account geometric properties (Sections 20.7 and 21.5). With that exception, the algorithms that we consider simply process the edges and do not take advantage of any implied geometric information (see Figure 20.2).

The problem of finding the minimum spanning tree of an arbitrary weighted undirected graph has numerous important applications, and algorithms to solve it have been known since at least the 1920s; but the efficiency of implementations varies widely, and researchers still seek better methods. In this section, we examine three classical algorithms that are easily understood at a conceptual level; in Sections 20.3 through 20.5, we examine implementations of each in detail; and in Section 20.6, we consider comparisons of and improvements on these basic approaches.

Definition 20.1 A minimum spanning tree (MST) of a weighted graph is a spanning tree whose weight (the sum of the weights of its edges) is no larger than the weight of any other spanning tree.

If the edge weights are all positive, it suffices to define the MST as the set of edges with minimal total weight that connects all the vertices, as such a set of edges must form a spanning tree. The spanning-tree condition in the definition is included so that it applies for graphs that may have negative edge weights (see Exercises 20.2 and 20.3).

Figure 20.1 A weighted undirected graph and its MST
A weighted undirected graph is a set of weighted edges. The MST is a set of edges of minimal total weight that connects the vertices (black in the edge list, thick edges in the graph drawing). In this particular graph, the weights are proportional to the distances between the vertices, but the basic algorithms that we consider are appropriate for general graphs and make no assumptions about the weights (see Figure 20.2).

If edges can have equal weights, the minimum spanning tree may not be unique. For example, Figure 20.2 shows a graph that has two different MSTs. The possibility of equal weights also complicates the descriptions and correctness proofs of some of our algorithms. We have to consider equal weights carefully, because they are not unusual in applications and we want to know that our algorithms operate correctly when they are present.

Not only might there be more than one MST, but also the nomenclature does not capture precisely the concept that we are minimizing the weight rather than the tree itself. The proper adjective to describe a specific tree is minimal (one having the smallest weight). For these reasons, many authors use more accurate terms like minimal spanning tree or minimum-weight spanning tree. The abbreviation MST, which we shall use most often, is universally understood to capture the basic concept.

Still, to avoid confusion when describing algorithms for networks that may have edges with equal weights, we take care to use the term “minimal” to refer to “an edge of minimum weight” (among all edges in some specified set) and “maximal” to refer to “an edge of maximum weight.” That is, if edge weights are distinct, a minimal edge is the shortest edge (and is the only minimal edge); but if there is more than one edge of minimum weight, any one of them might be a minimal edge.

We work exclusively with undirected graphs in this chapter. The problem of finding a minimum-weight directed spanning tree in a digraph is different, and is more difficult.

Several classical algorithms have been developed for the MST problem. These methods are among the oldest and most well-known algorithms in this book. As we have seen before, the classical methods provide a general approach, but modern algorithms and data structures can give us compact and efficient implementations. Indeed, these implementations provide a compelling example of the effectiveness of careful ADT design and proper choice of fundamental ADT data structure and algorithm implementations in solving increasingly difficult algorithmic problems.

Exercises

20.1 Assume that the weights in a graph are positive. Prove that you can rescale them by adding a constant to all of them or by multiplying them all by a constant without affecting the MSTs, provided only that the rescaled weights are positive.

Figure 20.2 Arbitrary weights
In this example, the edge weights are arbitrary and do not relate to the geometry of the drawn graph representation at all. This example also illustrates that the MST is not necessarily unique if edge weights may be equal: we get one MST by including 3-4 (shown) and a different MST by including 0-5 instead (although 7-6, which has the same weight as those two edges, is not in any MST).

20.2 Show that, if edge weights are positive, a set of edges that connects all the vertices whose weights sum to a quantity no larger than the sum of the weights of any other set of edges that connects all the vertices is an MST.

20.3 Show that the property stated in Exercise 20.2 holds for graphs with negative weights, provided that there are no cycles whose edges all have non-positive weights.

• 20.4 How would you find a maximum spanning tree of a weighted graph?

• 20.5 Show that if a graph’s edges all have distinct weights, the MST is unique.

• 20.6 Consider the assertion that a graph has a unique MST only if its edge weights are distinct. Give a proof or a counterexample.

• 20.7 Assume that a graph has t < V edges with equal weights and that all other weights are distinct. Give upper and lower bounds on the number of different MSTs that the graph might have.

20.1 Representations

In this chapter, we concentrate on weighted undirected graphs, the most natural setting for MST problems. Perhaps the simplest way to begin is to extend the basic graph representations from Chapter 17 to represent weighted graphs as follows: In the adjacency-matrix representation, the matrix can contain edge weights rather than Boolean values; in the adjacency-lists representation, we can add a field for the weights to the list elements that represent edges. This classic approach is appealing in its simplicity, but we will use a different method that is not much more complicated and will make our programs useful in more general settings. Comparative performance characteristics of the two approaches are briefly treated later in this section.

To address the issues raised when we shift from graphs where our only interest is in the presence or absence of edges to graphs where we are interested in information associated with edges, it is useful to imagine scenarios where vertices and edges are objects of unknown complexity, perhaps part of a huge client data base that is built and maintained by some other application. For example, we might like to view streets, roads, and highways in a map data base as abstract edges with their lengths as weights, but, in the data base, a road’s entry might carry a substantial amount of other information such as its name and type, a detailed description of its physical properties, how much traffic is currently using it, and so forth. Now, a client program could extract the information that it needs, build a graph, process it, and then interpret the results in the context of the data base, but that might be a difficult and costly process. In particular, it amounts to (at least) making a copy of the graph.

Program 20.1 ADT interface for graphs with weighted edges
This code defines an interface for graphs with weights and other information associated with edges. It consists of an EDGE ADT interface and a templatized GRAPH interface that may be used for any EDGE implementation. GRAPH implementations manipulate edge pointers (provided by clients with insert), not edges. The edge class also includes member functions that give information about the orientation of the edge: Either e->from(v) is true, e->v() is v and e->other(v) is e->w(); or e->from(v) is false, e->w() is v and e->other(v) is e->v().

Program 20.2 Example of a graph-processing client function
This function illustrates the use of our weighted-graph interface in Program 20.1. For any implementation of the interface, edges returns a vector containing pointers to all the graph’s edges. As in Chapters 17 through 19, we generally use the iterator functions only in the manner illustrated here.

A better alternative is for the client to define an Edge data type and for our implementations to manipulate pointers to edges. Program 20.1 shows the details of the EDGE, GRAPH, and iterator ADTs that we use to support this approach. Clients can easily meet our minimal requirements for the EDGE data type while retaining the flexibility to define application-specific data types for use in other contexts. Our implementations can manipulate edge pointers and use the interface to extract the information they need from EDGEs without regard to how they are represented. Program 20.2 is an example of a client program that uses this interface.

For testing algorithms and basic applications, we use a class EDGE, which has two ints and a double as private data members that are initialized from constructor arguments and are the return values of the v(), w(), and wt() member functions, respectively (see Exercise 20.8). To avoid proliferation of simple types, we use edge weights of type double throughout this chapter and Chapter 21. In our examples, we use edge weights that are real numbers between 0 and 1. This decision does not conflict with various alternatives that we might need in applications, because we can explicitly or implicitly rescale the weights to fit this model (see Exercises 20.1 and 20.10). For example, if the weights are positive integers less than a known maximum value, we can divide them all by the maximum value to convert them to real numbers between 0 and 1.
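The test edge class and the rescaling step described above might look as follows. This is a sketch written from the prose description, not the listing of Program 20.1: the private member names are invented, and the helper rescale (together with its maxWeight parameter) is a hypothetical illustration of dividing integer weights by a known maximum.

```cpp
#include <cassert>
#include <vector>

// Sketch of a simple edge type: two ints and a double, set by the constructor
// and returned by v(), w(), and wt().
class EDGE {
    int pv, pw;      // endpoints (member names are assumptions)
    double pwt;      // weight
public:
    EDGE(int v, int w, double wt) : pv(v), pw(w), pwt(wt) {}
    int v() const { return pv; }
    int w() const { return pw; }
    double wt() const { return pwt; }
    // Orientation: from(x) is true when x is the v() endpoint.
    bool from(int x) const { return pv == x; }
    // The endpoint that is not x (x is assumed to be one of the endpoints).
    int other(int x) const { return from(x) ? pw : pv; }
};

// Hypothetical helper: map positive integer weights that do not exceed
// maxWeight to real numbers in (0, 1] by dividing by maxWeight.
std::vector<double> rescale(const std::vector<int>& weights, int maxWeight) {
    assert(maxWeight > 0);
    std::vector<double> out;
    out.reserve(weights.size());
    for (int x : weights)
        out.push_back(static_cast<double>(x) / maxWeight);
    return out;
}
```

Because multiplying all weights by a positive constant does not change which spanning trees are minimal (Exercise 20.1), an MST computed on the rescaled weights is also an MST for the original integer weights.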

Program 20.3 Weighted-graph class (adjacency matrix)
For dense weighted graphs, we use a matrix of pointers to Edges, with a pointer to the edge v-w in row v and column w. For undirected graphs, we put another pointer to the edge in row w and column v. The null pointer indicates the absence of an edge; when we remove an edge, we remove our pointer to it. This implementation does not check for parallel edges, but clients could use edge to do so.
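The matrix-of-edge-pointers design in this caption can be sketched as below. The names DenseGraphSketch and EdgeT, the stand-in Edge struct, and the exact member functions are assumptions made for illustration, not the listing of Program 20.3; the sketch also assumes that the client owns the edges and keeps them alive while the graph refers to them.

```cpp
#include <cassert>
#include <vector>

// Stand-in for a client edge type; only v() and w() are used by the graph.
struct Edge {
    int v_, w_;
    double wt_;
    int v() const { return v_; }
    int w() const { return w_; }
    double wt() const { return wt_; }
};

// Hypothetical dense weighted graph: a V-by-V matrix of pointers to client
// edges, with nullptr marking the absence of an edge.
template <class EdgeT>
class DenseGraphSketch {
    int Vcnt, Ecnt;
    bool directed_;
    std::vector<std::vector<EdgeT*>> adj;
public:
    explicit DenseGraphSketch(int V, bool directed = false)
        : Vcnt(V), Ecnt(0), directed_(directed),
          adj(V, std::vector<EdgeT*>(V, nullptr)) {}
    int V() const { return Vcnt; }
    int E() const { return Ecnt; }
    void insert(EdgeT* e) {
        int v = e->v(), w = e->w();
        assert(v >= 0 && v < Vcnt && w >= 0 && w < Vcnt);
        if (adj[v][w] == nullptr) ++Ecnt;
        adj[v][w] = e;
        if (!directed_) adj[w][v] = e;    // second pointer for undirected graphs
    }
    void remove(EdgeT* e) {
        int v = e->v(), w = e->w();
        if (adj[v][w] != nullptr) --Ecnt;
        adj[v][w] = nullptr;              // drop our pointer, not the edge itself
        if (!directed_) adj[w][v] = nullptr;
    }
    // Returns the edge v-w, or nullptr if there is none.
    EdgeT* edge(int v, int w) const { return adj[v][w]; }
};
```

A client that wants to reject parallel edges could call edge(v, w) before insert and skip the insertion when the result is not null.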

Program 20.4 Iterator class for adjacency-matrix representation
This code is a straightforward adaptation of Program 17.8 to return edge pointers.

If we wanted to do so, we could build a more general ADT interface and use any data type for edge weights that supports addition, subtraction, and comparisons, since we do little more with the weights than to accumulate sums and make decisions based on their values. In Chapter 22, our algorithms are concerned with comparing linear combinations of edge weights, and the running time of some algorithms depends on arithmetic properties of the weights, so we switch to integer weights to allow us to more easily analyze the algorithms.

Programs 20.3 and 20.4 implement the weighted-graph ADT of Program 20.1 with an adjacency-matrix representation. As before, inserting an edge into an undirected graph amounts to storing a pointer to it in two places in the matrix, one for each orientation of the edge. As is true of algorithms that use the adjacency-matrix representation for unweighted graphs, the running time of any algorithm that uses this representation is proportional to $V^2$ (to initialize the matrix) or higher.

With this representation, we test for the existence of an edge v-w by testing whether the pointer in row v and column w is null. In some situations, we could avoid such tests by using sentinel values for the weights, but we avoid the use of sentinel values in our implementations.

Program 20.5 gives the implementation details of the weighted-graph ADT that uses edge pointers for an adjacency-lists representation. A vertex-indexed vector associates each vertex with a linked list of that vertex’s incident edges. Each list node contains a pointer to an edge. As with the adjacency matrix, we could, if desired, save space by putting just the destination vertex and the weight in the list nodes (leaving the source vertex implicit) at the cost of a more complicated iterator (see Exercises 20.11 and 20.14).

At this point, it is worthwhile to compare these representations with the simple representations that were mentioned at the beginning of this section (see Exercises 20.11 and 20.12). If we are building a graph from scratch, using pointers certainly requires more space. Not only do we need space for the pointers, but also we need space for the indices (vertex names), which are implicit in the simple representations. To use edge pointers in the adjacency-matrix representation, we need extra space for $V^2$ edge pointers and $E$ pairs of indices. Similarly, to use edge pointers in the adjacency-list representation we need extra space for $E$ edge pointers and $E$ indices.

On the other hand, the use of edge pointers is likely to lead to faster code, because the compiled client code will follow one pointer to get direct access to an edge’s weight, in contrast to the simple representation, where an Edge needs to be constructed and its fields accessed. If the space cost is prohibitive, using the minimal representations (and perhaps streamlining the iterators to save time) certainly is a reasonable alternative; otherwise, the flexibility afforded by the pointers is probably worth the space.

Program 20.5 Weighted-graph class (adjacency lists)

This implementation of the interface in Program 20.1 is based on adjacency lists and is therefore appropriate for sparse weighted graphs. As with unweighted graphs, we represent each edge with a list node, but here each node contains a pointer to the edge that it represents, not just the destination vertex. The iterator class is a straightforward adaptation of Program 17.10 (see Exercise 20.13).

To reduce clutter, we do use the simple representations in all of our figures. That is, rather than showing a matrix of pointers to edge structures that contain pairs of integers and weights, we simply show a matrix of weights, and rather than showing list nodes that contain pointers to edge structures, we show nodes that contain destination vertices. The adjacency-matrix and adjacency-lists representations of our sample graph are shown in Figure 20.3.

Figure 20.3 Weighted-graph representations (undirected)

The two standard representations of weighted undirected graphs include weights with each edge representation, as illustrated in the adjacency-matrix (left) and adjacency-lists (right) representation of the graph depicted in Figure 20.1. For simplicity in these figures, we show the weights in the matrix entries and list nodes; in our programs we use pointers to client edges. The adjacency matrix is symmetric and the adjacency lists contain two nodes for each edge, as in unweighted undirected graphs. Nonexistent edges are represented by null pointers in the matrix (indicated by asterisks in the figure) and are not present at all in the lists. Self-loops are absent in both of the representations illustrated here because MST algorithms are simpler without them; other algorithms that process weighted graphs use them (see Chapter 21).

As with our undirected-graph implementations, we do not explicitly test for parallel edges in either implementation. Depending upon the application, we might alter the adjacency-matrix representation to keep the parallel edge of lowest or highest weight, or to effectively coalesce parallel edges to a single edge with the sum of their weights. In the adjacency-lists representation, we allow parallel edges to remain in the data structure, but we could build more powerful data structures to eliminate them using one of the rules just mentioned for adjacency matrices (see Exercise 17.49).

How should we represent the MST itself? The MST of a graph G is a subgraph of G that is also a tree, so we have numerous options. Chief among them are

• A graph
• A linked list of edges
• A vector of pointers to edges
• A vertex-indexed vector with parent links

Figure 20.4 illustrates these options for the example MST in Figure 20.1. Another alternative is to define and use an ADT for trees.

The same tree might have many different representations in any of the schemes.

In what order should the edges be presented in the list-of-edges representation? Which node should be chosen as the root in the parent-link representation (see Exercise 20.21)? Generally speaking, when we run an MST algorithm, the particular MST representation that results is an artifact of the algorithm used and does not reflect any important features of the MST.

Figure 20.4 MST representations

This figure depicts various representations of the MST in Figure 20.1. The most straightforward is a list of its edges, in no particular order (left). The MST is also a sparse graph, and might be represented with adjacency lists (center). The most compact is a parent-link representation: we choose one of the vertices as the root and keep two vertex-indexed vectors, one with the parent of each vertex in the tree, the other with the weight of the edge from each vertex to its parent (right). The orientation of the tree (choice of root vertex) is arbitrary, not a property of the MST. We can convert from any one of these representations to any other in linear time.

From an algorithmic point of view, the choice of MST representation is of little consequence because we can convert easily from each of these representations to any of the others. To convert from the graph representation to a vector of edges, we can use the GRAPHedges function of Program 20.2. To convert from the parent-link representation in a vector st (with weights in another vector wt) to a vector of pointers to edges in mst, we can use the loop

    for (k = 1; k < G.V(); k++)
      mst[k] = new EDGE(k, st[k], wt[k]);

This code is for the typical case where the MST is rooted at 0, and it does not put the dummy edge 0-0 onto the MST edge list.

These two conversions are trivial, but how do we convert from the vector-of-edge-pointers representation to the parent-link representation? We have the basic tools to accomplish this task easily as well: We convert to the graph representation using a loop like the one just given (changed to call insert for each edge), then run a DFS starting at any vertex to compute the parent-link representation of the DFS tree, in linear time.

In short, although the choice of MST representation is a matter of convenience, we package all of our algorithms in a graph-processing class MST that computes a private vector mst of pointers to edges. Depending on the needs of applications, we can implement member functions for this class to return this vector or to give client programs other information about the MST, but we do not specify further details of this interface, other than to include a show member function that calls a similar function for each edge in the MST (see Exercise 20.8).

Exercises

• 20.8 Write a WeightedEdge class that implements the EDGE interface of Program 20.1 and also includes a member function show that prints edges and their weights in the format used in the figures in this chapter.

• 20.9 Implement an io class for weighted graphs that has show, scan, and scanEZ member functions (see Program 17.4).

• 20.10 Build a graph ADT that uses integer weights, but keep track of the minimum and maximum weights in the graph and include an ADT function that always returns weights that are numbers between 0 and 1.

• 20.11 Give an interface like Program 20.1 such that clients and implementations manipulate Edges (not pointers to them).

• 20.12 Develop an implementation of your interface from Exercise 20.11 that uses a minimal matrix-of-weights representation, where the iterator function nxt uses the information implicit in the row and column indices to create an Edge to return to the client.

20.13 Implement the iterator class for use with Program 20.5 (see Program 20.4).

• 20.14 Develop an implementation of your interface from Exercise 20.11 that uses a minimal adjacency-lists representation, where list nodes contain the weight and the destination vertex (but not the source) and the iterator function nxt uses implicit information to create an Edge to return to the client.

• 20.15 Modify the sparse-random-graph generator in Program 17.12 to assign a random weight (between 0 and 1) to each edge.

• 20.16 Modify the dense-random-graph generator in Program 17.13 to assign a random weight (between 0 and 1) to each edge.

20.17 Write a program that generates random weighted graphs by connecting vertices arranged in a V-by-V grid to their neighbors (as in Figure 19.3, but undirected) with random weights (between 0 and 1) assigned to each edge.

20.18 Write a program that generates a random complete graph that has weights chosen from a Gaussian distribution.

• 20.19 Write a program that generates V random points in the plane then builds a weighted graph by connecting each pair of points within a given distance d of one another with an edge whose weight is the distance (see Exercise 17.74). Determine how to set d so that the expected number of edges is E.

• 20.20 Find a large weighted graph online, perhaps a map with distances, telephone connections with costs, or an airline rate schedule.

20.21 Write down an 8-by-8 matrix that contains parent-link representations of all the orientations of the MST of the graph in Figure 20.1. Put the parent-link representation of the tree rooted at i in the ith row of the matrix.

• 20.22 Assume that an MST class constructor produces a vector-of-edge-pointers representation of an MST in mst[1] through mst[V]. Add a member function ST for clients (as in, for example, Program 18.3) such that ST(v) returns v’s parent in the tree (v if it is the root).

• 20.23 Under the assumptions of Exercise 20.22, write a member function that returns the total weight of the MST.

• 20.24 Suppose that an MST class constructor produces a parent-link representation of an MST in a vector st. Give code to be added to the constructor to compute a vector-of-edge-pointers representation in entries 1 through V of a private vector mst.

• 20.25 Define a TREE class. Then, under the assumptions of Exercise 20.22, write a member function that returns a TREE.

20.2 Underlying Principles of MST Algorithms

The MST problem is one of the most heavily studied problems that we encounter in this book. Basic approaches to solving it were invented long before the development of modern data structures and modern techniques for analyzing the performance of algorithms, at a time when finding the MST of a graph that contained, say, thousands of edges was a daunting task. As we shall see, several new MST algorithms differ from old ones essentially in their use and implementation of modern algorithms and data structures for basic tasks, which (coupled with modern computing power) makes it possible for us to compute MSTs with millions or even billions of edges.

One of the defining properties of a tree (see Section 5.4) is that adding an edge to a tree creates a unique cycle. This property is the basis for proving two fundamental properties of MSTs, which we now consider. All the algorithms that we encounter are based on one or both of these two properties.

The first property, which we refer to as the cut property, has to do with identifying edges that must be in an MST of a given graph. The few basic terms from graph theory that we define next make possible a concise statement of this property, which follows.

Definition 20.2 A cut in a graph is a partition of the vertices into two disjoint sets. A crossing edge is one that connects a vertex in one set with a vertex in the other.

We sometimes specify a cut by specifying a set of vertices, leaving implicit the assumption that the cut comprises that set and its complement. Generally, we use cuts where both sets are nonempty; otherwise there are no crossing edges.

Property 20.1 (Cut property) Given any cut in a graph, every minimal crossing edge belongs to some MST of the graph, and every MST contains a minimal crossing edge.

Proof: The proof is by contradiction. Suppose that e is a minimal crossing edge that is not in any MST, and let T be any MST; or suppose that T is an MST that contains no minimal crossing edge, and let e be any minimal crossing edge. In either case, T is an MST that does not contain the minimal crossing edge e. Now consider the graph formed by adding e to T. This graph has a cycle that contains e, and that cycle must contain at least one other crossing edge, say f, which is of equal or higher weight than e (since e is minimal). We can get a spanning tree of equal or lower weight by deleting f and adding e, contradicting either the minimality of T or the assumption that e is not in T.

If a graph’s edge weights are distinct, it has a unique MST; and the cut property says that the shortest crossing edge for every cut must be in the MST. When equal weights are present, we may have multiple minimal crossing edges. At least one of them will be in any given MST and the others may be present or absent.

Figure 20.5 illustrates several examples of this cut property. Note that there is no requirement that the minimal edge be the only MST edge connecting the two sets; indeed, for typical cuts there are several MST edges that connect a vertex in one set with a vertex in the other. If we could be sure that there were only one such edge, we might be able to develop divide-and-conquer algorithms based on judicious selection of the sets; but that is not the case.

We use the cut property as the basis for algorithms to find MSTs, and it also can serve as an optimality condition that characterizes MSTs. Specifically, the cut property implies that every edge in an MST is a minimal crossing edge for the cut defined by the vertices in the two subtrees connected by the edge.
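To make Definition 20.2 and the cut property concrete, the following sketch finds a minimal crossing edge for a given cut by brute force. The Edge struct and the function minCrossingEdge are hypothetical names introduced only for this illustration, and the cut is assumed to be given as a vertex-indexed vector of booleans.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for the book's EDGE type.
struct Edge { int v, w; double wt; };

// Returns the index in edges of a minimal crossing edge for the cut in which
// inSet[v] is true for the vertices on one side, or -1 if no edge crosses.
int minCrossingEdge(const std::vector<Edge>& edges,
                    const std::vector<bool>& inSet) {
    int best = -1;
    for (int i = 0; i < (int)edges.size(); i++) {
        const Edge& e = edges[i];
        assert(e.v < (int)inSet.size() && e.w < (int)inSet.size());
        if (inSet[e.v] == inSet[e.w]) continue;   // both ends on the same side
        if (best == -1 || e.wt < edges[best].wt) best = i;
    }
    return best;
}
```

By Property 20.1, the edge that this function returns belongs to some MST of the graph. The algorithms in this chapter differ mainly in how they avoid such a full scan of the edges for each cut.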

Figure 20.5 Cut property

These four examples illustrate Property 20.1. If we color one set of vertices gray and another set white, then the shortest edge connecting a gray vertex with a white one belongs to an MST.

The second property, which we refer to as the cycle property, has to do with identifying edges that do not have to be in a graph’s MST. That is, if we ignore these edges, we can still find an MST.

Property 20.2 (Cycle property) Given a graph G, consider the graph G′ defined by adding an edge e to G. Adding e to an MST of G and deleting a maximal edge on the resulting cycle gives an MST of G′.

Proof: If e is longer than all the other edges on the cycle, it cannot be on an MST of G′, because of Property 20.1: Removing e from any such MST would split the latter into two pieces, and e would not be the shortest edge connecting vertices in each of those two pieces, because some other edge on the cycle must do so. Otherwise, let t be a maximal edge on the cycle created by adding e to the MST of G. Removing t would split the original MST into two pieces, and edges of G connecting those pieces are no shorter than t; so e is a minimal edge in G′ connecting vertices in those two pieces. The subgraphs induced by the two subsets of vertices are identical for G and G′, so an MST for G′ consists of e and the MSTs of those two subsets.

In particular, note that if e is maximal on the cycle, then we have shown that there exists an MST of G′ that does not contain e (the MST of G).

Figure 20.6 illustrates this cycle property. Note that the process of taking any spanning tree, adding an edge that creates a cycle, and then deleting a maximal edge on that cycle gives a spanning tree of weight less than or equal to the original. The new tree weight will be less than the original if and only if the added edge is shorter than some edge on the cycle.

The cycle property also serves as the basis for an optimality condition that characterizes MSTs: It implies that every edge in a graph that is not in a given MST is a maximal edge on the cycle that it forms with MST edges.

The cut property and the cycle property are the basis for the classical algorithms that we consider for the MST problem. We consider edges one at a time, using the cut property to accept them as MST edges or the cycle property to reject them as not needed. The algorithms differ in their approaches to efficiently identifying cuts and cycles.
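The cycle property suggests a direct way to update an MST when one edge is added to the graph. The following is a minimal sketch under stated assumptions: the MST is held as a plain list of edges, and the names Edge, findPath, and addEdgeToMST are hypothetical ones introduced here. Because findPath rescans the edge list at every vertex, it is a simple illustration and not an efficient implementation.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for the book's EDGE type.
struct Edge { int v, w; double wt; };

// Collects in path the indices of the tree edges on the path from 'from' to 'to'.
static bool findPath(const std::vector<Edge>& tree, int from, int to,
                     int parent, std::vector<int>& path) {
    if (from == to) return true;
    for (int i = 0; i < (int)tree.size(); i++) {
        int next;
        if (tree[i].v == from) next = tree[i].w;
        else if (tree[i].w == from) next = tree[i].v;
        else continue;                       // edge i is not incident on 'from'
        if (next == parent) continue;        // do not walk back up the tree
        path.push_back(i);
        if (findPath(tree, next, to, from, path)) return true;
        path.pop_back();
    }
    return false;
}

// Given an MST of G as an edge list, updates it to an MST of G plus edge e.
void addEdgeToMST(std::vector<Edge>& tree, const Edge& e) {
    assert(e.v != e.w);
    std::vector<int> path;                   // tree edges on the cycle through e
    bool connected = findPath(tree, e.v, e.w, -1, path);
    assert(connected);                       // the tree must span both endpoints
    (void)connected;
    int longest = path[0];
    for (int i : path)
        if (tree[i].wt > tree[longest].wt) longest = i;
    // If e itself is maximal on the cycle, the old tree is still an MST.
    if (tree[longest].wt > e.wt) tree[longest] = e;
}
```

For example, with tree edges 0-1 (weight 1), 1-2 (weight 5), and 2-3 (weight 2), adding edge 0-2 of weight 3 creates the cycle 0-1-2-0, and the sketch replaces 1-2 by 0-2.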

Figure 20.6 Cycle property

Adding the edge 1-3 to the graph in Figure 20.1 invalidates the MST (top). To find the MST of the new graph, we add the new edge to the MST of the old graph, which creates a cycle (center). Deleting the longest edge on the cycle (4-7) yields the MST of the new graph (bottom). One way to verify that a spanning tree is minimal is to check that each edge not on the MST has the largest weight on the cycle that it forms with tree edges. For example, in the bottom graph, 4-6 has the largest weight on the cycle 4-6-7-1-3-4.

The first approach to finding the MST that we consider in detail is to build the MST one edge at a time: Start with any vertex as a single-vertex MST, then add V − 1 edges to it, always taking next a minimal edge that connects a vertex on the MST to a vertex not yet on the MST. This method is known as Prim’s algorithm; it is the subject of Section 20.3.

Property 20.3 Prim’s algorithm computes an MST of any connected graph.

Proof: As described in detail in Section 20.3, the method is a generalized graph-search method. Implicit in the proof of Property 18.12 is the fact that the edges chosen are a spanning tree. To show that they are an MST, apply the cut property, using vertices on the MST as the first set and vertices not on the MST as the second set.

Another approach to computing the MST is to apply the cycle property repeatedly: We add edges one at a time to a putative MST, deleting a maximal edge on the cycle if one is formed (see Exercises 20.33 and 20.71). This method has received less attention than the others that we consider because of the comparative difficulty of maintaining a data structure that supports efficient implementation of the “delete the longest edge on the cycle” operation.

The second approach to finding the MST that we consider in detail is to process the edges in order of their length (shortest first), adding to the MST each edge that does not form a cycle with edges previously added, stopping after V − 1 edges have been added. This method is known as Kruskal’s algorithm; it is the subject of Section 20.4.

Property 20.4 Kruskal’s algorithm computes an MST of any connected graph.

Proof: We prove by induction that the method maintains a forest of MST subtrees. If the next edge to be considered would create a cycle, it is a maximal edge on the cycle (since all others appeared before it in the sorted order), so ignoring it still leaves an MST, by the cycle property. If the next edge to be considered does not form a cycle, apply the cut property, using the cut defined by the set of vertices connected to one of the edge’s vertices by MST edges (and its complement). Since the edge does not create a cycle, it is the only crossing edge, and since we consider the edges in sorted order, it is a minimal edge and therefore in an MST. The basis for the induction is the V individual vertices; once we have chosen V − 1 edges, we have one tree (the MST). No unexamined edge is shorter than an MST edge, and all would create a cycle, so ignoring all of the rest of the edges leaves an MST, by the cycle property.

The third approach to building an MST that we consider in detail is known as Boruvka’s algorithm; it is the subject of Section 20.5. The first step is to add to the MST the edges that connect each vertex to its closest neighbor.
If edge weights are distinct, this step creates a forest of MST subtrees (we prove this fact and consider a refinement that does so even when equal-weight edges are present in a moment). Then, we add to the MST the edges that connect each tree to a closest neighbor (a minimal edge connecting a vertex in one tree with a vertex in any other), and iterate the process until we are left with just one tree.

Property 20.5 Boruvka’s algorithm computes the MST of any connected graph.

Proof: First, suppose that the edge weights are distinct. In this case, each vertex has a unique closest neighbor, the MST is unique, and we know that each edge added is an MST edge by applying the cut property (it is the shortest edge crossing the cut from a vertex to all the other vertices). Since every edge chosen is from the unique MST, there can be no cycles, each edge added merges two trees from the forest into a bigger tree, and the process continues until a single tree, the MST, remains.

If edge weights are not distinct, there may be more than one closest neighbor, and a cycle could form when we add the edges to closest neighbors (see Figure 20.7). Put another way, we might include two edges from the set of minimal crossing edges for some vertex, when only one belongs on the MST. To avoid this problem, we need an appropriate tie-breaking rule. One choice is to choose, among the minimal neighbors, the one with the lowest vertex number. Then any cycle would present a contradiction: If v is the highest-numbered vertex in the cycle, then neither neighbor of v would have led to its choice as the closest, and v would have led to the choice of only one of its lower-numbered neighbors, not both.

Figure 20.7 Cycles in Boruvka’s algorithm

In the graph of four vertices and four edges shown here, the edges are all the same length. When we connect each vertex to a nearest neighbor, we have to make a choice between minimal edges. In the example at the top, we choose 1 from 0, 2 from 1, 3 from 2, and 0 from 3, which leads to a cycle in the putative MST. Each of the edges is in some MST, but not all are in every MST. To avoid this problem, we adopt a tie-breaking rule, as shown in the bottom: Choose the minimal edge to the vertex with the lowest index. Thus, we choose 1 from 0, 0 from 1, 1 from 2, and 0 from 3, which yields an MST. The cycle is broken because highest-numbered vertex 3 is not chosen from either of its neighbors 2 or 0, and it can choose only one of them (0).
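The first step of Boruvka’s method with the lowest-index tie-breaking rule can be sketched as follows. This is an illustration only: the Edge struct and the function closestNeighborEdges are hypothetical names, the graph is assumed to be given as a plain edge list without self-loops, and the sketch performs just the first step, not the full algorithm of Section 20.5.

```cpp
#include <algorithm>
#include <cassert>
#include <set>
#include <utility>
#include <vector>

// Hypothetical stand-in for the book's EDGE type.
struct Edge { int v, w; double wt; };

// Each vertex picks a minimal incident edge, breaking ties in favor of the
// lowest-numbered neighbor. Returns the distinct chosen edges as (low, high).
std::set<std::pair<int, int>> closestNeighborEdges(int V,
                                                   const std::vector<Edge>& edges) {
    std::vector<int> choice(V, -1);       // chosen neighbor of each vertex
    std::vector<double> best(V, 0.0);     // weight of the edge to that neighbor
    auto consider = [&](int v, int w, double wt) {
        if (choice[v] == -1 || wt < best[v] ||
            (wt == best[v] && w < choice[v])) {
            choice[v] = w;
            best[v] = wt;
        }
    };
    for (const Edge& e : edges) {
        assert(e.v != e.w);               // self-loops are assumed absent
        consider(e.v, e.w, e.wt);
        consider(e.w, e.v, e.wt);
    }
    std::set<std::pair<int, int>> chosen; // the set removes edges picked twice
    for (int v = 0; v < V; v++)
        if (choice[v] != -1)
            chosen.emplace(std::min(v, choice[v]), std::max(v, choice[v]));
    return chosen;
}
```

On the four-vertex cycle 0-1, 1-2, 2-3, 3-0 with equal weights, the choices are 1 from 0, 0 from 1, 1 from 2, and 0 from 3, so the sketch returns the three edges 0-1, 1-2, and 0-3, which form a tree.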

These algorithms are all special cases of a general paradigm that is still being used by researchers seeking new MST algorithms. Specifically, we can apply in arbitrary order the cut property to accept an edge as an MST edge or the cycle property to reject an edge, continuing until neither can increase the number of accepted or rejected edges. At that point, any division of the graph’s vertices into two sets has an MST edge connecting them (so applying the cut property cannot increase the number of MST edges), and all graph cycles have at least one non-MST edge (so applying the cycle property cannot increase the number of non-MST edges). Together, these properties imply that a complete MST has been computed.

More specifically, the three algorithms that we consider in detail can be unified with a generalized algorithm where we begin with a forest of single-vertex MST subtrees (each with no edges) and perform the step of adding to the MST a minimal edge connecting any two subtrees in the forest, continuing until V − 1 edges have been added and a single MST remains. By the cut property, no edge that causes a cycle need be considered for the MST, since some other edge was previously a minimal edge crossing a cut between MST subtrees containing each of its vertices. With Prim’s algorithm, we grow a single tree an edge at a time; with Kruskal’s and Boruvka’s algorithms, we coalesce trees in a forest.

As described in this section and in the classical literature, the algorithms involve certain high-level abstract operations, such as the following:

• Find a minimal edge connecting two subtrees.
• Determine whether adding an edge would create a cycle.
• Delete the longest edge on a cycle.

Our challenge is to develop algorithms and data structures that implement these operations efficiently. Fortunately, this challenge presents us with an opportunity to put to good use basic algorithms and data structures that we developed earlier in this book.

MST algorithms have a long and colorful history that is still evolving; we discuss that history as we consider them in detail. Our evolving understanding of different methods of implementing the basic abstract operations has created some confusion surrounding the origins of the algorithms over the years. Indeed, the methods were first described in the 1920s, pre-dating the development of computers as we know them, as well as pre-dating our basic knowledge about sorting and other algorithms. As we now know, the choices of underlying algorithm and data structure can have substantial influences on performance, even when we are implementing the most basic schemes. In recent years, research on the MST problem has concentrated on such implementation issues, still using the classical schemes. For consistency and clarity, we refer to the basic approaches by the names listed here, although abstract versions of the algorithms were considered earlier, and modern implementations use algorithms and data structures invented long after these methods were first contemplated.

As yet unsolved in the design and analysis of algorithms is the quest for a linear-time MST algorithm. As we shall see, many of our implementations are linear-time in a broad variety of practical situations, but they are subject to a nonlinear worst case. The development of an algorithm that is guaranteed to be linear-time for sparse graphs is still a research goal.

Beyond our normal quest in search of the best algorithm for this fundamental problem, the study of MST algorithms underscores the importance of understanding the basic performance characteristics of fundamental algorithms. As programmers continue to use algorithms and data structures at increasingly higher levels of abstraction, situations of this sort become increasingly common. Our ADT implementations have varying performance characteristics; as we use higher-level ADTs as components when solving yet higher-level problems, the possibilities multiply. Indeed, we often use algorithms that are based on using MSTs and similar abstractions (enabled by the efficient implementations that we consider in this chapter) to help us solve other problems at a yet higher level of abstraction.

Exercises

• 20.26 Label the following points in the plane 0 through 5, respectively:

    (1, 3) (2, 1) (6, 5) (3, 4) (3, 7) (5, 3).

Taking edge lengths to be weights, give an MST of the graph defined by the edges

    1-0 3-5 5-2 3-4 5-1 0-3 0-4 4-2 2-3.

20.27 Suppose that a graph has distinct edge weights. Does its shortest edge have to belong to the MST? Prove that it does or give a counterexample.

20.28 Answer Exercise 20.27 for the graph’s longest edge.

20.29 Give a counterexample that shows why the following strategy does not necessarily find the MST: “Start with any vertex as a single-vertex MST, then add V − 1 edges to it, always taking next a minimal edge incident upon the vertex most recently added to the MST.”

20.30 Suppose that a graph has distinct edge weights. Does a minimal edge on every cycle have to belong to the MST? Prove that it does or give a counterexample.

20.31 Given an MST for a graph G, suppose that an edge in G is deleted. Describe how to find an MST of the new graph in time proportional to the number of edges in G.

20.32 Show the MST that results when you repeatedly apply the cycle property to the graph in Figure 20.1, taking the edges in the order given.

20.33 Prove that repeated application of the cycle property gives an MST.

20.34 Describe how each of the algorithms described in this section can be adapted (if necessary) to the problem of finding a minimal spanning forest of a weighted graph (the union of the MSTs of its connected components).

20.3 Prim’s Algorithm and Priority-First Search

Prim’s algorithm is perhaps the simplest MST algorithm to implement, and it is the method of choice for dense graphs. We maintain a cut of the graph that is composed of tree vertices (those chosen for the MST) and nontree vertices (those not yet chosen for the MST). We start by putting any vertex on the MST, then put a minimal crossing edge on the MST (which changes its nontree vertex to a tree vertex) and repeat the same operation V − 1 times, to put all vertices on the tree.

A brute-force implementation of Prim’s algorithm follows directly from this description. To find the edge to add next to the MST, we could examine all the edges that go from a tree vertex to a nontree vertex, then pick the shortest of the edges found to put on the MST. We do not consider this implementation here because it is overly expensive (see Exercises 20.35 through 20.37). Adding a simple data structure to eliminate excessive recomputation makes the algorithm both simpler and faster.

Adding a vertex to the MST is an incremental change: To implement Prim’s algorithm, we focus on the nature of that incremental change. The key is to note that our interest is in the shortest distance from each nontree vertex to the tree. When we add a vertex v to the tree, the only possible change for each nontree vertex w is that adding v brings w closer than before to the tree. In short, we do not need to check the distance from w to all tree vertices; we just need to keep track of the minimum and check whether the addition of v to the tree necessitates that we update that minimum.

Figure 20.8 Prim’s MST algorithm

The first step in computing the MST with Prim’s algorithm is to add 0 to the tree. Then we find all the edges that connect 0 to other vertices (which are not yet on the tree) and keep track of the shortest (top left). The edges that connect tree vertices with nontree vertices (the fringe) are shadowed in gray and listed below each graph drawing. For simplicity in this figure, we list the fringe edges in order of their length so that the shortest is the first in the list. Different implementations of Prim’s algorithm use different data structures to maintain this list and to find the minimum. The second step is to move the shortest edge 0-2 (along with the vertex that it takes us to) from the fringe to the tree (second from top, left). Third, we move 0-7 from the fringe to the tree, replace 0-1 by 7-1 and 0-6 by 7-6 on the fringe (because adding 7 to the tree brings 1 and 6 closer to the tree), and add 7-4 to the fringe (because adding 7 to the tree makes 7-4 an edge that connects a tree vertex with a nontree vertex) (third from top, left). Next, we move edge 7-1 to the tree (bottom, left). To complete the computation, we take 7-6, 7-4, 4-3, and 3-5 off the queue, updating the fringe after each insertion to reflect any shorter or new paths discovered (right, top to bottom). An oriented drawing of the growing MST is shown at the right of each graph drawing. The orientation is an artifact of the algorithm: We generally view the MST itself as a set of edges, unordered and unoriented.

To implement this idea, we need data structures that can give us the following information:

• The tree edges
• The shortest edge connecting each nontree vertex to the tree
• The length of that edge

The simplest implementation for each of these data structures is a vertex-indexed vector (we can use such a vector for the tree edges by indexing on vertices as they are added to the tree). Program 20.6 is an implementation of Prim’s algorithm for dense graphs. It uses the vectors mst, fr, and wt (respectively) for these three data structures.

After adding a new edge (and vertex) to the tree, we have two tasks to accomplish:

• Check to see whether adding the new edge brought any nontree vertex closer to the tree.
• Find the next edge to add to the tree.

The implementation in Program 20.6 accomplishes both of these tasks with a single scan through the nontree vertices, updating wt[w] and fr[w] if v-w brings w closer to the tree, then updating the current minimum if wt[w] (the length of fr[w]) indicates that w is closer to the tree than any other nontree vertex with a lower index.

Property 20.6 Using Prim’s algorithm, we can find the MST of a dense graph in linear time.

Proof: It is immediately evident from inspection of the program that the running time is proportional to V² and therefore is linear for dense graphs.

Figure 20.8 shows an example MST construction with Prim’s algorithm; Figure 20.9 shows the evolving MST for a larger example.

Program 20.6 is based on the observation that we can interleave the find the minimum and update operations in a single loop where we examine all the nontree edges. In a dense graph, the number of edges that we may have to examine to update the distance from the nontree vertices to the tree is proportional to V, so looking at all the nontree edges to find the one that is closest to the tree does not represent

excessive extra cost. But in a sparse graph, we can expect to use substantially fewer than V steps to perform each of these operations. The crux of the strategy that we will use to do so is to focus on the set of potential edges to be added next to the MST, a set that we call the fringe. The number of fringe edges is typically substantially smaller than the number of nontree edges, and we can recast our description of the algorithm as follows. Starting with a self-loop to a start vertex on the fringe and an empty tree, we perform the following operation until the fringe is empty:

Move a minimal edge from the fringe to the tree. Visit the vertex that it leads to, and put onto the fringe any edges that lead from that vertex to a nontree vertex, replacing the longer edge when two edges on the fringe point to the same vertex.

Figure 20.9 Prim’s MST algorithm

This sequence shows how the MST grows as Prim’s algorithm discovers 1/4, 1/2, 3/4, and all of the edges in the MST (top to bottom). An oriented representation of the full MST is shown at the right.

Program 20.6 Prim’s MST algorithm

This implementation of Prim’s algorithm is the method of choice for dense graphs and can be used for any graph representation that supports the edge existence test. The outer loop grows the MST by choosing a minimal edge crossing the cut between the vertices on the MST and vertices not on the MST. The w loop finds the minimal edge while at the same time (if w is not on the MST) maintaining the invariant that the edge fr[w] is the shortest edge (of weight wt[w]) from w to the MST. The result of the computation is a vector of edge pointers. The first (mst[0]) is unused; the rest (mst[1] through mst[G.V()]) comprise the MST of the connected component of the graph that contains 0.
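The dense-graph method that the text attributes to Program 20.6 (one scan over the nontree vertices for each vertex added, using the vectors fr and wt) might be sketched as follows. This is not the book’s listing. It is a sketch under stated assumptions: the graph is a symmetric matrix of weights in which the hypothetical constant NO_EDGE marks a missing edge (a sentinel, where the book uses null edge pointers), and the hypothetical function primDense returns parent links instead of a vector of edge pointers.

```cpp
#include <cassert>
#include <limits>
#include <vector>

// Assumed sentinel for "no edge"; the book's programs use null pointers instead.
const double NO_EDGE = std::numeric_limits<double>::infinity();

// Returns st, where st[w] is the tree vertex to which w was attached
// (st[0] == 0), or -1 if w is not connected to vertex 0.
std::vector<int> primDense(const std::vector<std::vector<double>>& a) {
    int V = (int)a.size();
    assert(V > 0);
    std::vector<int> st(V, -1);          // tree links; -1 marks a nontree vertex
    std::vector<int> fr(V, -1);          // closest tree vertex to each nontree vertex
    std::vector<double> wt(V, NO_EDGE);  // length of the edge w-fr[w]
    fr[0] = 0;
    wt[0] = 0.0;
    for (int v = 0, next = -1; v != -1; v = next) {
        st[v] = fr[v];                   // move v onto the tree
        next = -1;
        for (int w = 0; w < V; w++) {    // single scan: update, then find minimum
            if (st[w] != -1) continue;   // skip tree vertices
            if (a[v][w] < wt[w]) { wt[w] = a[v][w]; fr[w] = v; }
            if (wt[w] < NO_EDGE && (next == -1 || wt[w] < wt[next])) next = w;
        }
    }
    return st;
}
```

The outer loop runs at most V times and the inner loop always takes V steps, which matches the V² running time cited in Property 20.6.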

From the fringe-based formulation, it is clear that Prim’s algorithm is nothing more than a generalized graph search (see Section 18.8), where the fringe is a priority queue based on a remove the minimum operation (see Chapter 9). We refer to generalized graph searching with priority queues as priority-first search (PFS). With edge weights for priorities, PFS implements Prim’s algorithm.

This formulation encompasses a key observation that we made already in connection with implementing BFS in Section 18.7. An even simpler general approach is to simply keep on the fringe all of the edges that are incident upon tree vertices, letting the priority-queue mechanism find the shortest one and ignore longer ones (see Exercise 20.41). As we saw with BFS, this approach is unattractive because the fringe data structure becomes unnecessarily cluttered with edges that will never get to the MST. The size of the fringe could grow to be proportional to E (with whatever attendant costs having a fringe this size might involve), while the PFS approach just outlined ensures that the fringe will never have more than V entries.

As with any implementation of a general algorithm, we have a number of available approaches for interfacing with priority-queue ADTs. One approach is to use a priority queue of edges, as in our generalized graph-search implementation of Program 18.10. Program 20.7 is an implementation that is essentially equivalent to Program 18.10 but uses a vertex-based approach so that it can use an indexed priority queue, as described in Section 9.6. (A complete implementation of the specific priority-queue interface used by Program 20.7 is at the end of this chapter, in Program 20.10.) We identify the fringe vertices, the subset of nontree vertices that are connected by fringe edges to tree vertices, and keep the same vertex-indexed vectors mst, fr, and wt as in Program 20.6. The priority queue contains the index of each fringe vertex, and that entry gives access to the shortest edge connecting the fringe vertex with the tree and the length of that edge, through the second and third vectors.

The first call to pfs in the constructor in Program 20.7 finds the MST in the connected component that contains vertex 0, and subsequent calls find MSTs in other components, so this class actually finds minimal spanning forests in graphs that are not connected (see Exercise 20.34).

Property 20.7 Using a PFS implementation of Prim’s algorithm that uses a heap for the priority-queue implementation, we can compute an MST in time proportional to E lg V.

Proof: The algorithm directly implements the generic idea of Prim’s algorithm (add next to the MST a minimal edge that connects a vertex on the MST with a vertex not on the MST). Each priority-queue operation requires fewer than lg V steps. Each vertex is chosen with a remove the minimum operation; and, in the worst case, each edge might require a change priority operation.

Priority-first search is a proper generalization of breadth-first and depth-first search because those methods also can be derived through appropriate priority settings. For example, we can (somewhat artificially) use a variable cnt to assign a unique priority cnt++ to each vertex when we put that vertex on the priority queue. If we define P to be cnt, we get preorder numbering and DFS because newly encountered nodes have the highest priority. If we define P to be V-cnt, we get BFS because old nodes have the highest priority. These priority assignments make the priority queue operate like a stack and a queue, respectively. This equivalence is purely of academic interest since the priority-queue operations are unnecessary for DFS and BFS. Also, as discussed in Section 18.8, a formal proof of equivalence would require a precise attention to replacement rules to obtain the same sequence of vertices as result from the classical algorithms.

Program 20.7 Priority-first search

The function pfs is a generalized graph search that uses a priority queue for the fringe (see Section 18.8). The priority P is defined such that this class implements Prim’s MST algorithm; other priority definitions give other algorithms. The main loop moves the highest-priority (lowest-weight) edge from the fringe to the tree, then checks every edge adjacent to the new tree vertex to see whether it implies changes in the fringe. Edges to vertices not on the fringe or the tree are added to the fringe; shorter edges to fringe vertices replace corresponding fringe edges.

The PQi class is an indirect priority-queue interface (see Section 9.6) modified to pass a reference to the priority array to the constructor, to substitute getmin for delmax, and to substitute lower for change. Program 20.10 is an implementation of this interface.
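The listing for Program 20.7 is not reproduced here. The following is a minimal sketch of the same idea, not the book's code: it keeps vertex-indexed vectors fr and wt as described above, but it assumes a std::set of (priority, vertex) pairs in place of the PQi class, so that decrease key becomes an erase followed by an insert. The names Edge, AdjList, addEdge, SpanningForest, and primPFS are hypothetical.

```cpp
#include <cassert>
#include <limits>
#include <set>
#include <utility>
#include <vector>

// Hypothetical types for this sketch; they are not the book's GRAPH or PQi classes.
struct Edge { int v, w; double wt; };            // edge v-w with weight wt
using AdjList = std::vector<std::vector<Edge>>;  // adj[v]: edges leaving v

void addEdge(AdjList& adj, int v, int w, double wt) {
    adj[v].push_back({v, w, wt});
    adj[w].push_back({w, v, wt});
}

struct SpanningForest {
    std::vector<int> fr;     // fr[w]: tree parent of w, or -1 for a root
    std::vector<double> wt;  // wt[w]: length of the edge fr[w]-w (0 for a root)
};

SpanningForest primPFS(const AdjList& adj) {
    const int V = static_cast<int>(adj.size());
    const double INF = std::numeric_limits<double>::infinity();
    SpanningForest f{std::vector<int>(V, -1), std::vector<double>(V, INF)};
    std::vector<bool> onTree(V, false);
    std::set<std::pair<double, int>> fringe;  // ordered (priority, vertex) pairs

    for (int s = 0; s < V; ++s) {             // one search per connected component
        if (onTree[s]) continue;
        f.wt[s] = 0.0;
        fringe.emplace(0.0, s);
        while (!fringe.empty()) {
            const int v = fringe.begin()->second;  // remove the minimum
            fringe.erase(fringe.begin());
            onTree[v] = true;
            for (const Edge& e : adj[v]) {
                assert(e.w >= 0 && e.w < V);
                const int w = e.w;
                if (onTree[w] || e.wt >= f.wt[w]) continue;  // fringe unchanged
                if (f.wt[w] != INF)                          // w is already on the fringe:
                    fringe.erase(std::make_pair(f.wt[w], w));  // drop its old entry
                f.wt[w] = e.wt;                              // record the shorter edge
                f.fr[w] = v;
                fringe.emplace(e.wt, w);
            }
        }
    }
    return f;
}
```

A client would build an AdjList with addEdge, call primPFS, and read the forest from fr and wt. The set-based fringe has the same logarithmic cost per operation as a heap, though probably with a larger constant factor than an indexed heap.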

As we shall see, PFS encompasses not just DFS, BFS, and Prim’s MST algorithm but also several other classical algorithms. The various algorithms differ only in their priority functions. Thus, the running times of all these algorithms depend on the performance of the priority-queue ADT. Indeed, we are led to a general result that encompasses not just the two implementations of Prim’s algorithm that we have examined in this section but also a broad class of fundamental graph-processing algorithms.

Property 20.8 For all graphs and all priority functions, we can compute a spanning tree with PFS in linear time plus time proportional to the time required for V insert, V delete the minimum, and E decrease key operations in a priority queue of size at most V.

Proof: The proof of Property 20.7 establishes this more general result. We have to examine all the edges in the graph; hence the “linear time” clause. The algorithm never increases the priority (it changes the priority to only a lower value); by more precisely specifying what we need from the priority-queue ADT (decrease key, not necessarily change priority), we strengthen this statement about performance.

In particular, use of an unordered-array priority-queue implementation gives an optimal solution for dense graphs that has the same worst-case performance as the classical implementation of Prim’s algorithm (Program 20.6). That is, Properties 20.6 and 20.7 are special cases of Property 20.8; throughout this book we shall see numerous other algorithms that essentially differ in only their choice of priority function and their priority-queue implementation.

Property 20.7 is an important general result: The time bound stated is a worst-case upper bound that guarantees performance within a factor of lg V of optimal (linear time) for a large class of graph-processing problems. Still, it is somewhat pessimistic for many of the graphs that we encounter in practice, for two reasons. First, the lg V bound for priority-queue operation holds only when the number of vertices on the fringe is proportional to V, and even then it is just an upper bound. For a real graph in a practical application, the fringe might be small (see Figures 20.10 and 20.11), and some priority-queue operations might take many fewer than lg V steps. Although noticeable, this effect is likely to account for only a small constant factor in the running time; for example, a proof that the fringe never has more than $\sqrt{V}$ vertices on it would improve the bound by only a factor of 2, because $\lg \sqrt{V} = (\lg V)/2$. More important, we generally perform many fewer than E decrease key operations since we do that operation only when we find an edge to a fringe node that is shorter than the current best-known edge to that node. This event is relatively rare: Most edges have no effect on the priority queue (see Exercise 20.40). It is reasonable to regard PFS as an essentially linear-time algorithm unless V lg V is significantly greater than E.

Figure 20.10 PFS implementation of Prim’s MST algorithm

With PFS, Prim’s algorithm processes just the vertices and edges closest to the MST (in gray).

Figure 20.11 Fringe size for PFS implementation of Prim’s algorithm

The plot at the bottom shows the size of the fringe as the PFS proceeds for the example in Figure 20.10. For comparison, the corresponding plots for DFS, randomized search, and BFS from Figure 18.28 are shown above in gray.

The priority-queue ADT and generalized graph-searching abstractions make it easy for us to understand the relationships among various algorithms. Since these abstractions (and software mechanisms to support their use) were developed many years after the basic methods, relating the algorithms to classical descriptions of them becomes an exercise for historians. However, knowing basic facts about the history is useful when we encounter descriptions of MST algorithms in the research literature or in other texts, and understanding how these few simple abstractions tie together the work of numerous researchers over a time span of decades is persuasive evidence of their value and power. Thus, we consider briefly the origins of these algorithms here.

An MST implementation for dense graphs essentially equivalent to Program 20.6 was first presented by Prim in 1957 and, independently, by Dijkstra soon thereafter. It is usually referred to as Prim’s algorithm, although Dijkstra’s presentation was more general, so some scholars refer to the MST algorithm as a special case of Dijkstra’s algorithm. But the basic idea was also presented by Jarnik in 1930, so some authors refer to the method as Jarnik’s algorithm, thus characterizing Prim’s (or Dijkstra’s) role as finding an efficient implementation of the algorithm for dense graphs. As the priority-queue ADT came into use in the early 1970s, its application to finding MSTs of sparse graphs was straightforward; the fact that MSTs of sparse graphs could be computed in time proportional to E lg V became widely known without attribution to any particular researcher. Since that time, as we discuss in Section 20.6, many researchers have concentrated on finding efficient priority-queue implementations as the key to finding efficient MST algorithms for sparse graphs.

Exercises

• 20.35 Analyze the performance of the brute-force implementation of Prim’s algorithm mentioned at the beginning of this section for a complete weighted graph with V vertices. Hint: The following combinatorial sum might be useful: $\sum_{1 \le k < V} k(V-k) = (V+1)V(V-1)/6$.

• 20.36 Answer Exercise 20.35 for graphs in which all vertices have the same fixed degree t.

• 20.37 Answer Exercise 20.35 for general sparse graphs that have V vertices and E edges. Since the running time depends on the weights of the edges and on the degrees of the vertices, do a worst-case analysis. Exhibit a family of graphs for which your worst-case bound is confirmed.

20.38 Show, in the style of Figure 20.8, the result of computing the MST of the network defined in Exercise 20.26 with Prim’s algorithm.

• 20.39 Describe a family of graphs with V vertices and E edges for which the worst-case running time of the PFS implementation of Prim’s algorithm is confirmed.

•• 20.40 Develop a reasonable generator for random graphs with V vertices and E edges such that the running time of the PFS implementation of Prim’s algorithm (Program 20.7) is nonlinear.

• 20.41 Modify Program 20.7 so that it works like Program 18.8, in that it keeps on the fringe all edges incident upon tree vertices. Run empirical studies to compare your implementation with Program 20.7, for various weighted graphs (see Exercises 20.9–14).

20.42 Derive a priority queue implementation for use by Program 20.7 from the interface defined in Program 9.12 (so any implementation of that interface could be used).

20.43 Use the STL priority_queue to implement the priority-queue interface that is used by Program 20.7.

20.44 Suppose that you use a priority-queue implementation that maintains a sorted list. What would be the worst-case running time for graphs with V vertices and E edges, to within a constant factor? When would this method be appropriate, if ever? Defend your answer.

• 20.45 An MST edge whose deletion from the graph would cause the MST weight to increase is called a critical edge. Show how to find all critical edges in a graph in time proportional to E lg V.

20.46 Run empirical studies to compare the performance of Program 20.6 to that of Program 20.7, using an unordered array implementation for the priority queue, for various weighted graphs (see Exercises 20.9–14).

• 20.47 Run empirical studies to determine the effect of using an index-heap–tournament (see Exercise 9.53) priority-queue implementation instead of Program 9.12 in Program 20.7, for various weighted graphs (see Exercises 20.9–14).

20.48 Run empirical studies to analyze tree weights (see Exercise 20.23) as a function of V, for various weighted graphs (see Exercises 20.9–14).

20.49 Run empirical studies to analyze maximum fringe size as a function of V, for various weighted graphs (see Exercises 20.9–14).

20.50 Run empirical studies to analyze tree height as a function of V, for various weighted graphs (see Exercises 20.9–14).

20.51 Run empirical studies to study the dependence of the results of Exercises 20.49 and 20.50 on the start vertex. Would it be worthwhile to use a random starting point?

• 20.52 Write a client program that does dynamic graphical animations of Prim’s algorithm. Your program should produce images like Figure 20.10 (see Exercises 17.56 through 17.60). Test your program on random Euclidean neighbor graphs and on grid graphs (see Exercises 20.17 and 20.19), using as many points as you can process in a reasonable amount of time.

20.4 Kruskal’s Algorithm

Prim’s algorithm builds the MST one edge at a time, finding a new edge to attach to a single growing tree at each step. Kruskal’s algorithm also builds the MST one edge at a time; but, by contrast, it finds an edge that connects two trees in a spreading forest of growing MST subtrees. We start with a degenerate forest of V single-vertex trees and perform the operation of combining two trees (using the shortest edge possible) until there is just one tree left: the MST.

Figure 20.12 shows a step-by-step example of the operation of Kruskal’s algorithm; Figure 20.13 illustrates the algorithm’s dynamic characteristics on a larger example. The disconnected forest of MST subtrees evolves gradually into a tree. Edges are added to the MST in order of their length, so the forests comprise vertices that are connected to one another by relatively short edges. At any point during the execution of the algorithm, each vertex is closer to some vertex in its subtree than to any vertex not in its subtree.

Kruskal’s algorithm is simple to implement, given the basic algorithmic tools that we have considered in this book. Indeed, we can use any sort from Part 3 to sort the edges by weight and any of the connectivity algorithms from Chapter 1 to eliminate those that cause cycles! Program 20.8 is an implementation along these lines of an MST function for a graph ADT that is functionally equivalent to the other MST implementations that we consider in this chapter. The implementation does not depend on the graph representation: It calls a GRAPH client to return a vector that contains the graph’s edges, then computes the MST from that vector.

Figure 20.12 Kruskal’s MST algorithm

Given a list of a graph’s edges in arbitrary order (left edge list), the first step in Kruskal’s algorithm is to sort them by weight (right edge list). Then we go through the edges on the list in order of their weight, adding edges that do not create cycles to the MST. We add 5-3 (the shortest edge), then 7-1, then 7-6 (left), then 0-2 (right, top) and 0-7 (right, second from top). The edge with the next largest weight, 0-1, creates a cycle and is not included. Edges that we do not add to the MST are shown in gray on the sorted list. Then we add 4-3 (right, third from top). Next, we reject 5-4 because it causes a cycle, then we add 7-4 (right, bottom). Once the MST is complete, any edges with larger weights would cause cycles and be rejected (we stop when we have added V-1 edges to the MST). These edges are marked with asterisks on the sorted list.

Note that there are two ways in which Kruskal’s algorithm can terminate. If we find V − 1 edges, then we have a spanning tree and can stop. If we examine all the edges without finding V − 1 tree edges, then we have determined that the graph is not connected, precisely as we did in Chapter 1.

Analyzing the running time of Kruskal’s algorithm is a simple matter because we know the running time of its constituent ADT operations.

Property 20.9 Kruskal’s algorithm computes the MST of a graph in time proportional to E lg E.

Proof: This property is a consequence of the more general observation that the running time of Program 20.8 is proportional to the cost of sorting E numbers plus the cost of E find and V − 1 union operations. If we use standard ADT implementations such as mergesort and weighted union-find with halving, the cost of sorting dominates.

We consider the comparative performance of Kruskal’s and Prim’s algorithm in Section 20.6. For the moment, note that a running time proportional to E lg E is not necessarily worse than E lg V, because E is at most $V^2$, so lg E is at most 2 lg V. Performance differences for specific graphs are due to what the properties of the implementations are and to whether the actual running time approaches these worst-case bounds.

In practice, we might use quicksort or a fast system sort (which is likely to be based on quicksort). Although this approach may give the usual unappealing (in theory) quadratic worst case for the sort, it is likely to lead to the fastest run time. On the other hand, we could use a radix sort to do the sort in linear time (under certain conditions on the weights) so that the cost of the E find operations dominates and then adjust Property 20.9 to say that the running time of Kruskal’s algorithm is within a constant factor of E lg* E under those conditions on the weights (see Chapter 2). Recall that the function lg* E is the number of iterations of the binary logarithm function before the result is less than 1, which is less than 5 if E is less than $2^{65536}$. In other words, these adjustments make Kruskal’s algorithm effectively linear in most practical circumstances.

Figure 20.13 Kruskal’s MST algorithm

This sequence shows 1/4, 1/2, 3/4, and the full MST as it evolves.

Program 20.8 Kruskal’s MST algorithm

This implementation uses our sorting ADT from Chapter 6 and our union-find ADT from Chapter 4 to find the MST by considering the edges in order of their weights, discarding edges that create cycles until finding V − 1 edges that comprise a spanning tree. Not shown are a wrapper class EdgePtr that wraps pointers to Edges so sort can compare them using an overloaded <, as described in Section 6.8, and a version of Program 20.2 with a third template argument.
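The listing for Program 20.8 is not reproduced here. The following is a minimal sketch of the technique, not the book's code: it assumes std::sort in place of the book's sorting ADT and a small weighted quick-union class with path halving in place of the book's union-find ADT. The names Edge, UnionFind, unite, and kruskal are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <numeric>
#include <utility>
#include <vector>

struct Edge { int v, w; double wt; };  // hypothetical edge type for this sketch

// Weighted quick-union with path compression by halving.
class UnionFind {
    std::vector<int> id, sz;
public:
    explicit UnionFind(int n) : id(n), sz(n, 1) {
        std::iota(id.begin(), id.end(), 0);  // each vertex starts as its own root
    }
    int find(int x) {
        while (x != id[x]) { id[x] = id[id[x]]; x = id[x]; }
        return x;
    }
    // Merges the sets of a and b; returns false if they were already connected.
    bool unite(int a, int b) {
        int i = find(a), j = find(b);
        if (i == j) return false;
        if (sz[i] < sz[j]) std::swap(i, j);  // hang the smaller tree on the larger
        id[j] = i;
        sz[i] += sz[j];
        return true;
    }
};

// Returns the MST edges (or a spanning forest if the graph is not connected).
std::vector<Edge> kruskal(int V, std::vector<Edge> edges) {
    assert(V >= 0);
    std::sort(edges.begin(), edges.end(),
              [](const Edge& a, const Edge& b) { return a.wt < b.wt; });
    UnionFind uf(V);
    std::vector<Edge> mst;
    for (const Edge& e : edges) {
        if (static_cast<int>(mst.size()) == V - 1) break;  // spanning tree complete
        if (uf.unite(e.v, e.w)) mst.push_back(e);          // skip edges that close a cycle
    }
    return mst;
}
```

The loop ends in one of the two ways described above: either V − 1 edges have been found, or the edge list is exhausted, in which case the returned vector holds fewer than V − 1 edges and the graph is not connected.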

Typically, the cost of finding the MST with Kruskal’s algorithm is even lower than the cost of processing all edges because the MST is complete well before a substantial fraction of the (long) graph edges is ever considered. We can take this fact into account to reduce the running time significantly in many practical situations by keeping edges that are longer than the longest MST edge entirely out of the sort. One easy way to accomplish this objective is to use a priority queue, with an implementation that does the construct operation in linear time and the remove the minimum operation in logarithmic time.

For example, we can achieve these performance characteristics with a standard heap implementation, using bottom-up construction (see Section 9.4). Specifically, we make the following changes to Program 20.8: First, we change the call on sort to a call on pq.construct() to build a heap in time proportional to E. Second, we change the inner loop to take the shortest edge off the priority queue with e = pq.delmin() and to change all references to a[i] to refer to e.

Property 20.10 A priority-queue–based version of Kruskal’s algorithm computes the MST of a graph in time proportional to E + X lg V, where X is the number of graph edges not longer than the longest edge in the MST.

Proof: See the preceding discussion, which shows the cost to be the cost of building a priority queue of size E plus the cost of running the X delete the minimum, X find, and V − 1 union operations. Note that the priority-queue–construction costs dominate (and the algorithm is linear time) unless X is greater than E/lg V.

We can also apply the same idea to reap similar benefits in a quicksort-based implementation. Consider what happens when we use a straight recursive quicksort, where we partition at i, then recursively sort the subfile to the left of i and the subfile to the right of i. We note that, by construction of the algorithm, the first i elements are in sorted order after completion of the first recursive call (see Program 9.2). This obvious fact leads immediately to a fast implementation of Kruskal’s algorithm: If we put the check for whether the edge a[i] causes a cycle between the recursive calls, then we have an algorithm that, by construction, has checked the first i edges, in sorted order, after completion of the first recursive call! If we include a test to return when we have found V-1 MST edges, then we have an algorithm that sorts only as many edges as we need to compute the MST, with a few extra partitioning stages involving larger elements (see Exercise 20.57). Like straight sorting implementations, this algorithm could run in quadratic time in the worst case, but we can provide a probabilistic guarantee that the worst-case running time will not come close to this limit. Also, like straight sorting implementations, this program is likely to be faster than a heap-based implementation because of its shorter inner loop.

If the graph is not connected, the partial-sort versions of Kruskal’s algorithm offer no advantage because all the edges have to be considered. Even for a connected graph, the longest edge in the graph might be in the MST, so any implementation of Kruskal’s method would still have to examine all the edges. For example, the graph might consist of tight clusters of vertices all connected together by short edges, with one outlier connected to one of the vertices by a long edge. Despite such anomalies, the partial-sort approach is probably worthwhile because it offers significant gain when it applies and incurs little if any extra cost.

Historical perspective is relevant and instructive here as well. Kruskal presented this algorithm in 1956, but, again, the relevant ADT implementations were not carefully studied for many years, so the performance characteristics of implementations such as the priority-queue version of Program 20.8 were not well understood until the 1970s. Other interesting historical notes are that Kruskal’s paper mentioned a version of Prim’s algorithm (see Exercise 20.59) and that Boruvka mentioned both approaches. Efficient implementations of Kruskal’s method for sparse graphs preceded implementations of Prim’s method for sparse graphs because union-find (and sort) ADTs came into use before priority-queue ADTs. Generally, as was true of implementations of Prim’s algorithm, advances in the state of the art for Kruskal’s algorithm are attributed primarily to advances in ADT performance. On the other hand, the applicability of the union-find abstraction to Kruskal’s algorithm and the applicability of the priority-queue abstraction to Prim’s algorithm have been prime motivations for many researchers to seek better implementations of those ADTs.
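The priority-queue variant of Kruskal’s algorithm described earlier in this section (Property 20.10) can be sketched as follows. This is an illustration under stated assumptions rather than the book's modification of Program 20.8: std::make_heap stands in for pq.construct() and std::pop_heap for pq.delmin(), and the names Edge, UnionFind, PartialKruskal, and kruskalPQ are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <numeric>
#include <utility>
#include <vector>

struct Edge { int v, w; double wt; };  // hypothetical edge type for this sketch

// Weighted quick-union with path compression by halving.
class UnionFind {
    std::vector<int> id, sz;
public:
    explicit UnionFind(int n) : id(n), sz(n, 1) {
        std::iota(id.begin(), id.end(), 0);
    }
    int find(int x) {
        while (x != id[x]) { id[x] = id[id[x]]; x = id[x]; }
        return x;
    }
    bool unite(int a, int b) {  // false if a and b were already connected
        int i = find(a), j = find(b);
        if (i == j) return false;
        if (sz[i] < sz[j]) std::swap(i, j);
        id[j] = i;
        sz[i] += sz[j];
        return true;
    }
};

struct PartialKruskal {
    std::vector<Edge> mst;
    int examined = 0;  // number of edges removed from the heap
};

PartialKruskal kruskalPQ(int V, std::vector<Edge> edges) {
    assert(V >= 0);
    // The comparison is reversed so that the standard heap functions,
    // which build max-heaps, keep the shortest edge at the front.
    auto longer = [](const Edge& a, const Edge& b) { return a.wt > b.wt; };
    std::make_heap(edges.begin(), edges.end(), longer);  // linear-time construction
    UnionFind uf(V);
    PartialKruskal r;
    auto last = edges.end();
    while (last != edges.begin() && static_cast<int>(r.mst.size()) < V - 1) {
        std::pop_heap(edges.begin(), last, longer);  // shortest edge moves to last - 1
        --last;
        ++r.examined;
        if (uf.unite(last->v, last->w)) r.mst.push_back(*last);
    }
    return r;
}
```

For a connected graph with distinct edge weights, the examined count is the quantity X of Property 20.10: edges longer than the longest MST edge never leave the heap.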

Exercises

• 20.53 Show, in the style of Figure 20.12, the result of computing the MST of the network defined in Exercise 20.26 with Kruskal’s algorithm.

• 20.54 Run empirical studies to analyze the length of the longest edge in the MST and the number of graph edges that are not longer than that one, for various weighted graphs (see Exercises 20.9–14).

• 20.55 Develop an implementation of the union-find ADT that implements find in constant time and union in time proportional to lg V.

20.56 Run empirical tests to compare your ADT implementation from Exercise 20.55 to weighted union-find with halving (Program 1.4) when Kruskal’s algorithm is the client, for various weighted graphs (see Exercises 20.9–14). Separate out the cost of sorting the edges so that you can study the effects of the change both on the total cost and on the part of the cost associated with the union-find ADT.

20.57 Develop an implementation based on the idea described in the text where we integrate Kruskal’s algorithm with quicksort so as to check MST membership of each edge as soon as we know that all smaller edges have been checked.

• 20.58 Adapt Kruskal’s algorithm to implement two ADT functions that fill a client-supplied vertex-indexed vector classifying vertices into k clusters with the property that no edge of length greater than d connects two vertices in different clusters. For the first function, take k as an argument and return d; for the second, take d as an argument and return k. Test your program on random Euclidean neighbor graphs and on grid graphs (see Exercises 20.17 and 20.19) of various sizes for various values of k and d.

20.59 Develop an implementation of Prim’s algorithm that is based on presorting the edges.

• 20.60 Write a client program that does dynamic graphical animations of Kruskal’s algorithm (see Exercise 20.52). Test your program on random Euclidean neighbor graphs and on grid graphs (see Exercises 20.17 and 20.19), using as many points as you can process in a reasonable amount of time.

Figure 20.14 Boruvka’s MST algorithm

The diagram at the top shows a directed edge from each vertex to its closest neighbor. These edges show that 0-2, 1-7, and 3-5 are each the shortest edge incident on both their vertices, 6-7 is 6’s shortest edge, and 4-3 is 4’s shortest edge. These edges all belong to the MST and comprise a forest of MST subtrees (center), as computed by the first phase of Boruvka’s algorithm. In the second phase, the algorithm completes the MST computation (bottom) by adding the edge 0-7, which is the shortest edge incident on any of the vertices in the subtrees it connects, and the edge 4-7, which is the shortest edge incident on any of the vertices in the bottom subtree.

20.5 Boruvka’s Algorithm

The next MST algorithm that we consider is also the oldest. Like Kruskal’s algorithm, we build the MST by adding edges to a spreading forest of MST subtrees; but we do so in stages, adding several MST edges at each stage. At each stage, we find the shortest edge that connects each MST subtree with a different one, then add all such edges to the MST.

Again, our union-find ADT from Chapter 1 leads to an efficient implementation. For this problem, it is convenient to extend the interface to make the find operation available to clients. We use this function to associate an index with each subtree so that we can tell quickly to which subtree a given vertex belongs. With this capability, we can implement efficiently each of the necessary operations for Boruvka’s algorithm.

First, we maintain a vertex-indexed vector that identifies, for each MST subtree, the nearest neighbor. Then, we perform the following operations on each edge in the graph:

• If it connects two vertices in the same tree, discard it.

• Otherwise, check the nearest-neighbor distances between the two trees the edge connects and update them if appropriate.

After this scan of all the graph edges, the nearest-neighbor vector has the information that we need to connect the subtrees. For each vertex index, we perform a union operation to connect it with its nearest neighbor. In the next stage, we discard all the longer edges that connect other pairs of vertices in the now-connected MST subtrees. Figures 20.14 and 20.15 illustrate this process on our sample graph.

Program 20.9 is a direct implementation of Boruvka’s algorithm. There are three major factors that make this implementation efficient:

• The cost of each find operation is essentially constant.

• Each stage decreases the number of MST subtrees in the forest by at least a factor of 2.

• A substantial number of edges is discarded during each stage.

It is difficult to quantify precisely all these factors, but the following bound is easy to establish.

Property 20.11 The running time of Boruvka’s algorithm for computing the MST of a graph is O(E lg V lg* E).

Proof: Since the number of trees in the forest is halved at each stage, the number of stages is no larger than lg V. The time for each stage is at most proportional to the cost of E find operations, which is less than E lg* E, or linear for practical purposes.

The running time given in Property 20.11 is a conservative upper bound since it does not take into account the substantial reduction in the number of edges during each stage. The find operations take constant time in the early passes, and there are very few edges in the later passes. Indeed, for many graphs, the number of edges decreases exponentially with the number of vertices, and the total running time is proportional to E. For example, as illustrated in Figure 20.16, the algorithm finds the MST of our larger sample graph in just four stages.

It is possible to remove the lg* E factor to lower the theoretical bound on the running time of Boruvka’s algorithm to be proportional to E lg V, by representing MST subtrees with doubly-linked lists instead of using the union and find operations. However, this improvement is sufficiently more complicated to implement and the potential performance improvement sufficiently marginal that it is not likely to be worth considering for use in practice (see Exercises 20.66 and 20.67).

Figure 20.15 Union-find array in Boruvka’s algorithm

This figure depicts the contents of the union-find array corresponding to the example depicted in Figure 20.14. Initially, each entry contains its own index, indicating a forest of singleton vertices. After the first stage, we have three components, represented by the indices 0, 1, and 3 (the union-find trees are all flat for this tiny example). After the second stage, we have just one component, represented by 1.

Program 20.9 Boruvka’s MST algorithm

This implementation of Boruvka’s MST algorithm uses a version of the union-find ADT from Chapter 4 (with a single-argument find added to the interface) to associate indices with MST subtrees as they are built. Each phase checks all the remaining edges; those that connect disjoint subtrees are kept for the next phase. The array a has the edges not yet discarded and not yet in the MST. The index N is used to store those being saved for the next phase (the code resets E from N at the end of each phase) and the index h is used to access the next edge to be checked. Each component’s nearest neighbor is kept in the array b with find component numbers as indices. At the end of each phase, each component is united with its nearest neighbor and the nearest-neighbor edges added to the MST.
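The listing for Program 20.9 is not reproduced here. The following is a minimal sketch of the phases described above, not the book's code. It keeps the roles of the arrays a and b and the index h from the description, but it assumes a small union-find class with a public find, and it stores in b the position of the nearest-neighbor edge rather than a pointer. The names Edge, UnionFind, unite, kept, and boruvka are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <numeric>
#include <utility>
#include <vector>

struct Edge { int v, w; double wt; };  // hypothetical edge type for this sketch

// Weighted quick-union with path compression by halving.
class UnionFind {
    std::vector<int> id, sz;
public:
    explicit UnionFind(int n) : id(n), sz(n, 1) {
        std::iota(id.begin(), id.end(), 0);
    }
    int find(int x) {
        while (x != id[x]) { id[x] = id[id[x]]; x = id[x]; }
        return x;
    }
    bool unite(int a, int b) {  // false if a and b were already connected
        int i = find(a), j = find(b);
        if (i == j) return false;
        if (sz[i] < sz[j]) std::swap(i, j);
        id[j] = i;
        sz[i] += sz[j];
        return true;
    }
};

std::vector<Edge> boruvka(int V, std::vector<Edge> a) {
    assert(V >= 0);
    UnionFind uf(V);
    std::vector<Edge> mst;
    while (!a.empty()) {
        std::vector<int> b(V, -1);  // b[c]: position in a of component c's shortest edge
        int kept = 0;               // edges saved for this and later phases
        for (std::size_t h = 0; h < a.size(); ++h) {
            const int i = uf.find(a[h].v), j = uf.find(a[h].w);
            if (i == j) continue;   // both ends in the same subtree: discard
            if (b[i] == -1 || a[h].wt < a[b[i]].wt) b[i] = kept;
            if (b[j] == -1 || a[h].wt < a[b[j]].wt) b[j] = kept;
            a[kept++] = a[h];       // compact the surviving edges to the front
        }
        a.erase(a.begin() + kept, a.end());
        for (int c = 0; c < V; ++c) {
            if (b[c] == -1) continue;
            const Edge& e = a[b[c]];
            // unite fails when another component already added this edge
            // (or an equal-length alternative) during this phase.
            if (uf.unite(e.v, e.w)) mst.push_back(e);
        }
    }
    return mst;
}
```

Each phase that keeps at least one edge performs at least one union, so the loop ends once every remaining edge joins two vertices of the same subtree.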

As we mentioned, Boruvka’s is the oldest of the algorithms that we consider: It was originally conceived in 1926, for a power-distribution application. The method was rediscovered by Sollin in 1961; it later attracted attention as the basis for MST algorithms with efficient asymptotic performance and as the basis for parallel MST algorithms.

Exercises

• 20.61 Show, in the style of Figure 20.14, the result of computing the MST of the network defined in Exercise 20.26 with Boruvka’s algorithm.

• 20.62 Why does Program 20.9 do a find test before doing the union operation? Hint: Consider equal-length edges.

• 20.63 Explain why b(h) could be null in the test in Program 20.9 that guards the union operation.

• 20.64 Describe a family of graphs with V vertices and E edges for which the number of edges that survive each stage of Boruvka’s algorithm is sufficiently large that the worst-case running time is achieved.

20.65 Develop an implementation of Boruvka’s algorithm that is based on a presorting of the edges.

• 20.66 Develop an implementation of Boruvka’s algorithm that uses doubly-linked circular lists to represent MST subtrees, so that subtrees can be merged and renamed in time proportional to E during each stage (and the equivalence-relations ADT is therefore not needed).

• 20.67 Do empirical studies to compare your implementation of Boruvka’s algorithm in Exercise 20.66 with the implementation in the text (Program 20.9), for various weighted graphs (see Exercises 20.9–14).

• 20.68 Do empirical studies to tabulate the number of stages and the number of edges processed per stage in Boruvka’s algorithm, for various weighted graphs (see Exercises 20.9–14).

20.69 Develop an implementation of Boruvka’s algorithm that constructs a new graph (one vertex for each tree in the forest) at each stage.

• 20.70 Write a client program that does dynamic graphical animations of Boruvka’s algorithm (see Exercises 20.52 and 20.60). Test your program on random Euclidean neighbor graphs and on grid graphs (see Exercises 20.17 and 20.19), using as many points as you can process in a reasonable amount of time.

Figure 20.16 Boruvka’s MST algorithm

The MST evolves in just four stages for this example (top to bottom).

20.6 Comparisons and Improvements

Table 20.1 summarizes the running times of the basic MST algorithms that we have considered; Table 20.2 presents the results of an empirical study comparing the algorithms. From these tables, we can conclude that the adjacency-matrix implementation of Prim’s algorithm is the method of choice for dense graphs, that all the other methods perform within a small constant factor of the best possible (the time that it takes to extract the edges) for graphs of intermediate density, and that Kruskal’s method essentially reduces the problem to sorting for sparse graphs.

In short, we might consider the MST problem to be “solved” for practical purposes. For most graphs, the cost of finding the MST is only slightly higher than the cost of extracting the graph’s edges. This rule holds except for huge graphs that are extremely sparse, but the available performance improvement over the best known algorithms even in this case is approximately a factor of 10 at best. The results in Table 20.2 are dependent on the model used to generate graphs, but they are borne out for many other graph models as well (see, for example, Exercise 20.80). Still, the theoretical results do not deny the existence of an algorithm that is guaranteed to run in linear time for all graphs; here we take a look at the extensive research on improved implementations of these methods.

First, much research has gone into developing better priority-queue implementations. The Fibonacci heap data structure, an extension of the binomial queue, achieves the theoretically optimal performance of taking constant time for decrease key operations and logarithmic time for remove the minimum operations, behavior that translates, by Property 20.8, to a running time proportional to E + V lg V for Prim’s algorithm. Fibonacci heaps are more complicated than binomial queues and are somewhat unwieldy in practice, but some simpler priority-queue implementations have similar performance characteristics (see reference section).

One effective approach is to use radix methods for the priority-queue implementation. Performance of such methods is typically equivalent to that of radix sorting for Kruskal’s method, or even to that of using a radix quicksort for the partial-sorting method that we discussed in Section 20.4.

Another simple early approach, proposed by D. Johnson in 1977, is one of the most effective: Implement the priority queue for Prim’s algorithm with d-ary heaps, instead of with standard binary heaps (see Figure 20.17). Program 20.10 is a complete implementation of the priority-queue interface that we have been using that is based on this method. For this priority-queue implementation, decrease key takes less than $\log_d V$ steps, and remove the minimum takes time proportional to $d \log_d V$. By Property 20.8, this behavior leads to a running time proportional to $V d \log_d V + E \log_d V$ for Prim’s algorithm, which is linear for graphs that are not sparse.

Table 20.1 Cost of MST algorithms

This table summarizes the cost (worst-case running time) of various MST algorithms considered in this chapter. The formulas are based on the assumptions that an MST exists (which implies that $E \ge V - 1$) and that there are X edges not longer than the longest edge in the MST (see Property 20.10). These worst-case bounds may be too conservative to be useful in predicting performance on real graphs. The algorithms run in near-linear time in a broad variety of practical situations.

Property 20.12 Given a graph with V vertices and E edges, let d denote the density E/V. If d < 2, then the running time of Prim’s algorithm is proportional to V lg V. Otherwise, we can improve the worst-case running time by a factor of lg(E/V) by using an E/V-ary heap for the priority queue.

Proof: Continuing the discussion in the previous paragraph, the number of steps is $V d \log_d V + E \log_d V$, so the running time is at most proportional to $E \log_d V = (E \lg V)/\lg d$.

When E is proportional to $V^{1+\epsilon}$, Property 20.12 leads to a worst-case running time proportional to $E/\epsilon$, and that value is linear for any constant $\epsilon$. The reason is that the density is then $d = E/V = V^{\epsilon}$, so $\log_d V = 1/\epsilon$. For example, if the number of edges is proportional to $V^{3/2}$, the cost is less than 2E; if the number of edges is proportional to $V^{4/3}$, the cost is less than 3E; and if the number of edges is proportional to $V^{5/4}$, the cost is less than 4E. For a graph with 1 million vertices, the cost is less than 6E unless the density is less than 10.

Table 20.2 Empirical study of MST algorithms

This table shows relative timings for various algorithms for finding the MST, for random weighted graphs of various density. For low densities, Kruskal’s algorithm is best because it amounts to a fast sort. For high densities, the classical implementation of Prim’s algorithm is best because it does not incur list-processing overhead. For intermediate densities, the PFS implementation of Prim’s algorithm runs within a small factor of the time that it takes to examine each graph edge.

Key:
C extract edges only
H Prim’s algorithm (adjacency lists/indexed heap)
J Johnson’s version of Prim’s algorithm (d-heap priority queue)
P Prim’s algorithm (adjacency-matrix representation)
K Kruskal’s algorithm
K* Partial-sort version of Kruskal’s algorithm
B Boruvka’s algorithm
e edges examined (union operations)

The temptation to minimize the bound on the worst-case running time in this way needs to be tempered with the realization that the $V d \log_d V$ part of the cost is not likely to be avoided (for remove the minimum, we have to examine d successors in the heap as we sift down), but the $E \log_d V$ part is not likely to be achieved (since most edges will not require a priority-queue update, as we showed in the discussion following Property 20.8).

For typical graphs such as those in the experiments in Table 20.2, decreasing $\log_d V$ has no effect on the running time, and using a large value of d can slow down the implementation slightly. Still, the slight protection offered for worst-case performance makes the method worthwhile since it is so easy to implement. In principle, we could tune the implementation to pick the best value of d for certain types of graphs (choose the largest value that does not slow down the algorithm), but a small fixed value (such as 4, 5, or 6) will be fine except possibly for some particular huge classes of graphs that have atypical characteristics.

Using d-heaps is not effective for sparse graphs because d has to be an integer greater than or equal to 2, a condition that implies that we cannot bring the asymptotic running time lower than V lg V. If the density is a small constant, then a linear-time MST algorithm would have to run in time proportional to V.

The goal of developing practical algorithms for computing the MST of sparse graphs in linear time remains elusive. A great deal of research has been done on variations of Boruvka’s algorithm as the basis for nearly linear-time MST algorithms for extremely sparse graphs (see reference section). Such research still holds the potential to lead us eventually to a practical linear-time MST algorithm and has even shown the existence of a randomized linear-time algorithm. While these algorithms are generally quite complicated, simplified versions of some of them may yet be shown to be useful in practice. In the meantime, we can use the basic algorithms that we have considered here to compute the MST in linear time in most practical situations, perhaps paying an extra factor of lg V for some sparse graphs.

Figure 20.17 2-, 3-, and 4-ary heaps

When we store a standard binary heap-ordered complete tree in an array (top), we use implicit links to take us from a node i down the tree to its children 2i and 2i+1 and up the tree to its parent $\lfloor i/2 \rfloor$. In a 3-ary heap (center), implicit links for i are to its children 3i−1, 3i, and 3i+1 and to its parent $\lfloor (i+1)/3 \rfloor$; and in a 4-ary heap (bottom), implicit links for i are to its children 4i−2, 4i−1, 4i, and 4i+1 and to its parent $\lfloor (i+2)/4 \rfloor$. Increasing the branching factor in an implicit heap implementation can be valuable in applications, like Prim’s algorithm, that require a significant number of decrease key operations.

Program 20.10 Multiway heap PQ implementation

This class uses multiway heaps to implement the indirect priority-queue interface that we use in this book. It is based on changing Program 9.12 to have the constructor take a reference to a vector of priorities, to implement getmin and lower instead of delmax and change, and to generalize fixUp and fixDown so that they maintain a d-way heap (so remove the minimum takes time proportional to $d \log_d V$, but decrease key requires less than $\log_d V$ steps).
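A minimal sketch of such a class follows. It is not the listing of Program 20.10; it only illustrates the operations named above. The class name DHeapPQ, the 0-based array layout (children of position k at d*k+1 through d*k+d, parent at (k-1)/d), and the default branching factor of 4 are assumptions made for illustration.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Indirect d-way min-heap over item indices 0..N-1 (illustrative sketch).
// Priorities live in the client's vector `a`; the heap stores only indices.
template <class Key>
class DHeapPQ {
    int d;                      // branching factor (assumed >= 2)
    const std::vector<Key>& a;  // client-owned priorities
    std::vector<int> pq;        // pq[k] = item at heap position k
    std::vector<int> qp;        // qp[v] = heap position of item v, or -1

    void exch(int i, int j) {
        std::swap(pq[i], pq[j]);
        qp[pq[i]] = i;
        qp[pq[j]] = j;
    }
    // Move position k up while it beats its parent: at most log_d N steps.
    void fixUp(int k) {
        while (k > 0 && a[pq[k]] < a[pq[(k - 1) / d]]) {
            exch(k, (k - 1) / d);
            k = (k - 1) / d;
        }
    }
    // Move position k down; each level examines up to d children.
    void fixDown(int k) {
        const int n = static_cast<int>(pq.size());
        while (true) {
            int first = d * k + 1;
            if (first >= n) break;
            int best = first;
            for (int j = first + 1; j < first + d && j < n; ++j)
                if (a[pq[j]] < a[pq[best]]) best = j;
            if (!(a[pq[best]] < a[pq[k]])) break;
            exch(k, best);
            k = best;
        }
    }

public:
    DHeapPQ(int N, const std::vector<Key>& priorities, int branching = 4)
        : d(branching), a(priorities), qp(N, -1) { pq.reserve(N); }

    bool empty() const { return pq.empty(); }

    void insert(int v) {
        pq.push_back(v);
        qp[v] = static_cast<int>(pq.size()) - 1;
        fixUp(qp[v]);
    }
    // Remove and return the item with the smallest priority.
    int getmin() {
        int v = pq[0];
        exch(0, static_cast<int>(pq.size()) - 1);
        pq.pop_back();
        qp[v] = -1;
        if (!pq.empty()) fixDown(0);
        return v;
    }
    // Call after the client has decreased a[v] for an item on the queue.
    void lower(int v) { fixUp(qp[v]); }
};
```

In this sketch the client decreases an entry of its own priority vector and then calls lower for that item, so a decrease key costs only the upward pass, while getmin pays for scanning up to d children at each level.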

Exercises

• 20.71 [V. Vyssotsky] Develop an implementation of the algorithm discussed in Section 20.2 that builds the MST by adding edges one at a time and deleting the longest edges on the cycle formed (see Exercise 20.33). Use a parent-link representation of a forest of MST subtrees. Hint: Reverse pointers when traversing paths in trees.

• 20.72 Run empirical tests to compare the running time of your implementation in Exercise 20.71 with that of Kruskal’s algorithm, for various weighted graphs (see Exercises 20.9–14). Check whether randomizing the order in which the edges are considered affects your results.

• 20.73 Describe how you would find the MST of a graph so large that only V edges can fit into main memory at once.

• 20.74 Develop a priority-queue implementation for which remove the minimum and find the minimum are constant-time operations, and for which decrease key takes time proportional to the logarithm of the priority-queue size. Compare your implementation with 4-heaps when you use Prim’s algorithm to find the MST of sparse graphs, for various weighted graphs (see Exercises 20.9–14).

20.75 Run empirical studies to compare the performance of various priority-queue implementations when used in Prim’s algorithm for various weighted graphs (see Exercises 20.9–14). Consider d-heaps for various values of d, binomial queues, the STL priority_queue, balanced trees, and any other data structure that you think might be effective.

20.76 Develop an implementation that generalizes Boruvka’s algorithm to maintain a generalized queue containing the forest of MST subtrees. (Using Program 20.9 corresponds to using a FIFO queue.) Experiment with other generalized-queue implementations, for various weighted graphs (see Exercises 20.9–14).

• 20.77 Develop a generator for random connected cubic graphs (each vertex of degree 3) that have random weights on the edges. Fine-tune for this case the MST algorithms that we have discussed, then determine which is the fastest.

• 20.78 For $V = 10^6$, plot the ratio of the upper bound on the cost for Prim’s algorithm with d-heaps to E as a function of the density d, for d in the range from 1 to 100.

• 20.79 Table 20.2 suggests that the standard implementation of Kruskal’s algorithm is significantly faster than the partial-sort implementation for low-density graphs. Explain this phenomenon.

• 20.80 Run an empirical study, in the style of Table 20.2, for random complete graphs that have Gaussian weights (see Exercise 20.18).

20.7 Euclidean MST

Suppose that we are given N points in the plane and we want to find the shortest set of lines connecting all the points. This geometric problem is called the Euclidean MST problem (see Figure 20.18). One way to solve it is to build a complete graph with N vertices and N(N−1)/2 edges, one edge connecting each pair of vertices, weighted with the distance between the corresponding points. Then, we can use Prim’s algorithm to find the MST in time proportional to $N^2$.

This solution is generally too slow. The Euclidean problem is somewhat different from the graph problems that we have been considering because all the edges are implicitly defined. The size of the input is just proportional to N, so the solution that we have sketched is a quadratic algorithm for the problem. Research has proved that it is possible to do better. The geometric structure makes most of the edges in the complete graph irrelevant to the problem, and we do not need to add most of them to the graph before we construct the minimum spanning tree.

Property 20.13 We can find the Euclidean MST of N points in time proportional to N log N.

This fact is a direct consequence of two basic facts about points in the plane that we discuss in detail in Part 7. First, a graph known as the Delaunay triangulation contains the MST, by definition. Second, the Delaunay triangulation is a planar graph whose number of edges is proportional to N.

In principle, then, we could compute the Delaunay triangulation in time proportional to N log N, then run either Kruskal’s algorithm or the priority-first search method to find the Euclidean MST, in time proportional to N log N. But writing a program to compute the Delaunay triangulation is a challenge for even an experienced programmer, so this approach may be overkill for this problem in practice.
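The quadratic baseline described at the start of this section can be sketched without ever storing the complete graph, by computing each distance on demand inside the dense version of Prim’s algorithm. The names Point, euclideanMST, and totalLength below are hypothetical helpers introduced only for this illustration.

```cpp
#include <cassert>
#include <cmath>
#include <limits>
#include <utility>
#include <vector>

struct Point { double x, y; };

// Dense Prim's algorithm on the implicit complete graph of the points.
// Returns the N-1 MST edges as (tree vertex, newly added vertex) index pairs.
// Time is proportional to N^2; extra space is proportional to N.
std::vector<std::pair<int, int>> euclideanMST(const std::vector<Point>& p) {
    const int n = static_cast<int>(p.size());
    const double INF = std::numeric_limits<double>::infinity();
    std::vector<double> best(n, INF);   // shortest known distance to the tree
    std::vector<int> parent(n, -1);     // tree vertex achieving that distance
    std::vector<bool> inTree(n, false);
    std::vector<std::pair<int, int>> mst;
    if (n == 0) return mst;
    best[0] = 0.0;
    for (int step = 0; step < n; ++step) {
        // Pick the nontree vertex closest to the tree.
        int v = -1;
        for (int i = 0; i < n; ++i)
            if (!inTree[i] && (v == -1 || best[i] < best[v])) v = i;
        inTree[v] = true;
        if (parent[v] != -1) mst.push_back({parent[v], v});
        // Update distances from the tree to the remaining vertices.
        for (int w = 0; w < n; ++w) {
            if (inTree[w]) continue;
            double dist = std::hypot(p[v].x - p[w].x, p[v].y - p[w].y);
            if (dist < best[w]) { best[w] = dist; parent[w] = v; }
        }
    }
    return mst;
}

// Sum of the Euclidean lengths of the given edges.
double totalLength(const std::vector<Point>& p,
                   const std::vector<std::pair<int, int>>& edges) {
    double sum = 0.0;
    for (const auto& e : edges)
        sum += std::hypot(p[e.first].x - p[e.second].x,
                          p[e.first].y - p[e.second].y);
    return sum;
}
```

For the four corners of the unit square, for instance, this sketch returns three edges of total length 3. It avoids the quadratic space of an explicit complete graph, but not the quadratic time.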

Figure 20.18 Euclidean MST

Given a set of N points in the plane (top), the Euclidean MST is the shortest set of lines connecting them together (bottom). This problem is not just a graph-processing problem, because we need to make use of global geometric information about the points to avoid having to process all $N^2$ implicit edges connecting the points.

Other approaches derive from the geometric algorithms that we consider in Part 7. For randomly distributed points, we can divide up the plane into squares such that each square is likely to contain about lg N / 2 points, as we did for the closest-point computation in Program 3.20. Then, even if we include in the graph only the edges connecting each point to the points in the neighboring squares, we are likely (but are not guaranteed) to get all the edges in the minimum spanning tree; in that case, we could use Kruskal’s algorithm or the PFS implementation of Prim’s algorithm to finish the job efficiently. The example that we have used in Figures 20.10, 20.13, 20.16, and similar figures was created in this way (see Figure 20.19). Or, we could develop a version of Prim’s algorithm based on using near-neighbor algorithms to avoid updating distant vertices.

With all the possible choices that we have for approaching this problem and with the possibility of linear algorithms for the general MST problem, it is important to note that there is a simple lower bound on the best that we could do.

Property 20.14 Finding the Euclidean MST of N points is no easier than sorting N numbers.

Proof: Given a list of numbers to be sorted, convert the list into a list of points where the x coordinate is taken from the corresponding number of the list and the y coordinate is 0. Find the MST of that list of points. Then (as we did for Kruskal’s algorithm), put the points into a graph ADT and run DFS to produce a spanning tree, starting at the point with the lowest x coordinate. That spanning tree amounts to a linked-list sort of the numbers in order; thus, we have solved the sorting problem.

Precise interpretations of this lower bound are complicated because the basic operations used for the two problems (comparisons of coordinates for the sorting problem, distances for the MST problem) are different and because there is a possibility of using methods such as radix sort and grid methods. However, we may interpret the bound to mean that, as we do sorting, we should consider a Euclidean MST algorithm that uses N lg N comparisons to be optimal unless we exploit numerical properties of the coordinates, in which case we might expect it to be linear time (see reference section).
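The reduction in the proof of Property 20.14 can be traced in a short sketch. It assumes that the numbers are distinct integers of moderate size (so that differences do not overflow), and it uses a simple quadratic Prim’s algorithm as the MST step, since only the reduction is being illustrated, not an efficient sort. The function name sortViaMST is hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <limits>
#include <vector>

// Sort DISTINCT integers by the reduction in the proof: treat each number x
// as the point (x, 0), build the MST of those points, then walk the tree
// from the smallest number.
std::vector<long long> sortViaMST(const std::vector<long long>& x) {
    const int n = static_cast<int>(x.size());
    std::vector<long long> sorted;
    if (n == 0) return sorted;
    const int root =
        static_cast<int>(std::min_element(x.begin(), x.end()) - x.begin());

    // Dense Prim's algorithm; the distance between two points is |x[v] - x[w]|.
    const long long INF = std::numeric_limits<long long>::max();
    std::vector<long long> best(n, INF);
    std::vector<int> parent(n, -1);
    std::vector<bool> inTree(n, false);
    std::vector<std::vector<int>> child(n);   // MST as a rooted tree
    best[root] = 0;
    for (int step = 0; step < n; ++step) {
        int v = -1;
        for (int i = 0; i < n; ++i)
            if (!inTree[i] && (v == -1 || best[i] < best[v])) v = i;
        inTree[v] = true;
        if (parent[v] != -1) child[parent[v]].push_back(v);
        for (int w = 0; w < n; ++w) {
            if (inTree[w]) continue;
            long long dist = x[v] > x[w] ? x[v] - x[w] : x[w] - x[v];
            if (dist < best[w]) { best[w] = dist; parent[w] = v; }
        }
    }

    // DFS (preorder) from the point with the lowest x coordinate.
    std::vector<int> stack{root};
    while (!stack.empty()) {
        int v = stack.back();
        stack.pop_back();
        sorted.push_back(x[v]);
        for (int c : child[v]) stack.push_back(c);
    }
    return sorted;
}
```

For distinct numbers, the MST of points on a line is the path through consecutive values, so the walk from the smallest value visits the numbers in increasing order. With duplicate values the tree need not be a path, which is why the sketch restricts itself to distinct inputs.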

Figure 20.19 Euclidean near-neighbor graphs

One way to compute the Euclidean MST is to generate a graph with edges connecting every pair of points within a distance d, as in the graph in Figure 20.8 et al. However, this method yields too many edges if d is too large (top) and is not guaranteed to have edges connecting all the points if d is smaller than the longest edge in the MST (bottom).

It is interesting to reflect on the relationship between graph and geometric algorithms that is brought out by the Euclidean MST problem. Many of the practical problems that we might encounter could be formulated either as geometric problems or as graph problems. If the physical placement of objects is a dominating characteristic, then the geometric algorithms of Part 7 may be called for; but if interconnections between objects are of fundamental importance, then the graph algorithms of this section may be better.

The Euclidean MST seems to fall at the interface between these two approaches (the input involves geometry and the output involves interconnections), and the development of simple, straightforward methods for the Euclidean MST remains an elusive goal. In Chapter 21, we see a similar problem that falls at this interface, but where a Euclidean approach admits substantially faster algorithms than do the corresponding graph problems.

Exercises

• 20.81 Give a counterexample to show why the following method for finding the Euclidean MST does not work: “Sort the points on their x coordinates, then find the minimum spanning trees of the first half and the second half, then find the shortest edge that connects them.”

• 20.82 Develop a fast version of Prim’s algorithm for computing the Euclidean MST of a uniformly distributed set of points in the plane based on ignoring distant points until the tree approaches them.

•• 20.83 Develop an algorithm that, given a set of N points in the plane, finds a set of edges of cardinality proportional to N that is certain to contain the MST and is sufficiently easy to compute that you can develop a concise and efficient implementation of your algorithm.

• 20.84 Given a random set of N points in the unit square (uniformly distributed), empirically determine a value of d, to within two decimal places, such that the set of edges defined by all pairs of points within distance d of one another is 99 percent certain to contain the MST.

• 20.85 Work Exercise 20.84 for points where each coordinate is drawn from a Gaussian distribution with mean 0.5 and standard deviation 0.1.

• 20.86 Describe how you would improve the performance of Kruskal’s and Boruvka’s algorithms for sparse Euclidean graphs.

CHAPTER TWENTY-ONE
Shortest Paths

EVERY PATH IN a weighted digraph has an associated path weight, the value of which is the sum of the weights of that path’s edges. This essential measure allows us to formulate such problems as “find the lowest-weight path between two given vertices.” These shortest-paths problems are the topic of this chapter. Not only are shortest-paths problems intuitive for many direct applications, but they also take us into a powerful and general realm where we seek efficient algorithms to solve general problems that can encompass a broad variety of specific applications.

Several of the algorithms that we consider in this chapter relate directly to various algorithms that we examined in Chapters 17 through 20. Our basic graph-search paradigm applies immediately, and several of the specific mechanisms that we used in Chapters 17 and 19 to address connectivity in graphs and digraphs provide the basis for us to solve shortest-paths problems.

For economy, we refer to weighted digraphs as networks. Figure 21.1 shows a sample network, with standard representations. We have already developed an ADT interface with adjacency-matrix and adjacency-lists class implementations for networks in Section 20.1: we just pass true as a second argument when we call the constructor so that the class keeps one representation of each edge, precisely as we did when deriving digraph representations in Chapter 19 from the undirected graph representations in Chapter 17 (see Programs 20.1 through 20.4).

As discussed at length in Chapter 20, we use pointers to abstract edges for weighted digraphs to broaden the applicability of our implementations. This approach has certain implications that are different

Figure 21.1 Sample network and representations

This network (weighted digraph) is shown in four representations: list of edges, drawing, adjacency matrix, and adjacency lists (left to right). As we did for MST algorithms, we show the weights in matrix entries and in list nodes, but use edge pointers in our programs. While we often use edge weights that are proportional to their lengths in the drawing (as we did for MST algorithms), we do not insist on this rule because most shortest-paths algorithms handle arbitrary nonnegative weights (negative weights do present special challenges). The adjacency matrix is not symmetric, and the adjacency lists contain one node for each edge (as in unweighted digraphs). Nonexistent edges are represented by null pointers in the matrix (blank in the figure) and are not present at all in the lists. Self-loops of length 0 are present because they simplify our implementations of shortest-paths algorithms. They are omitted from the list of edges at left for economy and to indicate the typical scenario where we add them by convention when we create an adjacency-matrix or adjacency-lists representation.

for digraphs than the ones that we considered for undirected graphs in Section 20.1 and are worth noting. First, since there is only one representation of each edge, we do not need to use the from function in the edge class (see Program 20.1) when using an iterator: In a digraph, e->from(v) is true for every edge pointer e returned by an iterator for v. Second, as we saw in Chapter 19, it is often useful when processing a digraph to be able to work with its reverse graph, but we need a different approach than that taken by Program 19.1, because that implementation creates edges to create the reverse, and we assume that a graph ADT whose clients provide pointers to edges should not create edges on its own (see Exercise 21.3).

In applications or systems for which we need all types of graphs, it is a textbook exercise in software engineering to define a network ADT from which ADTs for the unweighted undirected graphs of Chapters 17 and 18, the unweighted digraphs of Chapter 19, or the weighted undirected graphs of Chapter 20 can be derived (see Exercise 21.10).

When we work with networks, it is generally convenient to keep self-loops in all the representations. This convention allows algorithms the flexibility to use a sentinel maximum-value weight to indicate that a vertex cannot be reached from itself. In our examples, we use self-loops of weight 0, although positive-weight self-loops certainly make sense in many applications. Many applications also call for parallel edges, perhaps with differing weights. As we mentioned in Section 20.1, various options for ignoring or combining such edges are appropriate in various applications. In this chapter, for simplicity, none of our examples use parallel edges, and we do not allow parallel edges in the adjacency-matrix representation; we also do not check for parallel edges or remove them in adjacency lists.

All the connectivity properties of digraphs that we considered in Chapter 19 are relevant in networks. In that chapter, we wished to

Figure 21.2 Shortest-path trees

A shortest-path tree (SPT) defines shortest paths from the root to other vertices (see Definition 21.2). In general, different paths may have the same length, so there may be multiple SPTs defining the shortest paths from a given vertex. In the example network shown at left, all shortest paths from 0 are subgraphs of the DAG shown to the right of the network. A tree rooted at 0 spans this DAG if and only if it is an SPT for 0. The two trees at right are such trees.

know whether it is possible to get from one vertex to another; in this chapter, we take weights into consideration: we wish to find the best way to get from one vertex to another.

Definition 21.1 A shortest path between two vertices s and t in a network is a directed simple path from s to t with the property that no other such path has a lower weight.

This definition is succinct, but its brevity masks points worth examining. First, if t is not reachable from s, there is no path at all, and therefore there is no shortest path. For convenience, the algorithms that we consider often treat this case as equivalent to one in which there exists an infinite-weight path from s to t. Second, as we did for MST algorithms, we use networks where edge weights are proportional to edge lengths in examples, but the definition has no such requirement and our algorithms (other than the one in Section 21.5) do not make this assumption. Indeed, shortest-paths algorithms are at their best when they discover counterintuitive shortcuts, such as a path between two vertices that passes through several other vertices but has total weight smaller than that of a direct edge connecting those vertices. Third, there may be multiple paths of the same weight from one vertex to another; we typically are content to find one of them. Figure 21.2 shows an example with general weights that illustrates these points.

The restriction in the definition to simple paths is unnecessary in networks that contain edges that have nonnegative weight, because any cycle in a path in such a network can be removed to give a path that is no longer (and is shorter unless the cycle comprises zero-weight edges). But when we consider networks with edges that could have negative weight, the need for the restriction to simple paths is readily apparent: Otherwise, the concept of a shortest path is meaningless if there is a cycle in the network that has negative weight. For example, suppose that the edge 3-5 in the network in Figure 21.1 were to have weight -.38, and edge 5-1 were to have weight -.31. Then, the weight of the cycle 1-4-3-5-1 would be .32 + .36 - .38 - .31 = -.01, and we could spin around that cycle to generate arbitrarily short paths. Note carefully that, as is true in this example, it is not necessary for all the edges on a negative-weight cycle to be of negative weight; what counts is the sum of the edge weights. For brevity, we use the term negative cycle to refer to directed cycles whose total weight is negative.

In the definition, suppose that some vertex on a path from s to t is also on a negative cycle. In this case, the existence of a (nonsimple) shortest path from s to t would be a contradiction, because we could use the cycle to construct a path that had a weight lower than any given value. To avoid this contradiction, we restrict to simple paths in the definition so that the concept of a shortest path is well defined in any network. However, we do not consider negative cycles in networks until Section 21.7, because, as we see there, they present a truly fundamental barrier to the solution of shortest-paths problems.

To find shortest paths in a weighted undirected graph, we build a network with the same vertices and with two edges (one in each direction) corresponding to each edge in the graph. There is a one-to-one correspondence between simple paths in the network and simple paths in the graph, and the costs of the paths are the same; so shortest-paths problems are equivalent. Indeed, we build precisely such a network when we build the standard adjacency-lists or adjacency-matrix representation of a weighted undirected graph (see, for example, Figure 20.3). This construction is not helpful if weights can be negative, because it gives negative cycles in the network, and we do not know how to solve shortest-paths problems in networks that have negative cycles (see Section 21.7). Otherwise, the algorithms for networks that we consider in this chapter also work for weighted undirected graphs.

In certain applications, it is convenient to have weights on vertices instead of, or in addition to, weights on edges; and we might also consider more complicated problems where both the number of edges on the path and the overall weight of the path play a role. We can handle such problems by recasting them in terms of edge-weighted networks (see, for example, Exercise 21.4) or by slightly extending the basic algorithms (see, for example, Exercise 21.52).

Figure 21.3 All shortest paths

This table gives all the shortest paths in the network of Figure 21.1 and their lengths. This network is strongly connected, so there exist paths connecting each pair of vertices.

The goal of a source-sink shortest-path algorithm is to compute one of the entries in this table; the goal of a single-source shortest-paths algorithm is to compute one of the rows in this table; and the goal of an all-pairs shortest-paths algorithm is to compute the whole table. Generally, we use more compact representations, which contain essentially the same information and allow clients to trace any path in time proportional to its number of edges (see Figure 21.8).

Because the distinction is clear from the context, we do not introduce special terminology to distinguish shortest paths in weighted graphs from shortest paths in graphs that have no weights (where a path’s weight is simply its number of edges; see Section 17.7). The usual nomenclature refers to (edge-weighted) networks, as used in this chapter, since the special cases presented by undirected or unweighted graphs are handled easily by algorithms that process networks.

We are interested in the same basic problems that we defined for undirected and unweighted graphs in Section 18.7. We restate them here, noting that Definition 21.1 implicitly generalizes them to take weights into account in networks.

Source–sink shortest path: Given a start vertex s and a finish vertex t, find a shortest path in the graph from s to t. We refer to the start vertex as the source and to the finish vertex as the sink, except in contexts where this usage conflicts with the definition of sources (vertices with no incoming edges) and sinks (vertices with no outgoing edges) in digraphs.

Single-source shortest paths: Given a start vertex s, find shortest paths from s to each other vertex in the graph.

All-pairs shortest paths: Find shortest paths connecting each pair of vertices in the graph. For brevity, we sometimes use the term all shortest paths to refer to this set of $V^2$ paths.

If there are multiple shortest paths connecting any given pair of vertices, we are content to find any one of them. Since paths have varying numbers of edges, our implementations provide member functions that allow clients to trace paths in time proportional to the paths’ lengths. Any shortest path also implicitly gives us the shortest-path length, but our implementations explicitly provide lengths. In summary, to be precise, when we say “find a shortest path” in the problem statements just given, we mean “compute the shortest-path length and a way to trace a specific path in time proportional to that path’s length.”

Figure 21.3 illustrates shortest paths for the example network in Figure 21.1. In networks with V vertices, we need to specify V paths to solve the single-source problem, and to specify $V^2$ paths to solve the all-pairs problem. In our implementations, we use a representation more compact than these lists of paths; we first noted it in Section 18.7, and we consider it in detail in Section 21.1.

In C++ implementations, we build our algorithmic solutions to these problems into ADT implementations that allow us to build efficient client programs that can solve a variety of practical graph-processing problems. For example, as we see in Section 21.3, we implement solutions to the all-pairs shortest-paths problem as constructors within classes that support constant-time shortest-path queries. We also build classes to solve single-source problems so that clients who need to compute shortest paths from a specific vertex (or a small set of them) can avoid the expense of computing shortest paths for other vertices. Careful consideration of such issues and proper use of the algorithms that we examine can mean the difference between an efficient solution to a practical problem and a solution that is so costly that no client could afford to use it.

Shortest-paths problems arise in various guises in numerous applications. Many of the applications appeal immediately to geometric intuition, but many others involve arbitrary cost structures. As we did with minimum spanning trees (MSTs) in Chapter 20, we sometimes take advantage of geometric intuition to help develop an understanding of algorithms that solve the problems but stay cognizant that our algorithms operate properly in more general settings. In Section 21.5, we do consider specialized algorithms for Euclidean networks. More important, in Sections 21.6 and 21.7, we see that the basic algorithms are effective for numerous applications where networks represent an abstract model of the computation.

Road maps: Tables that give distances between all pairs of major cities are a prominent feature of many road maps. We presume that the map maker took the trouble to be sure that the distances are the shortest ones, but our assumption is not necessarily always valid (see, for example, Exercise 21.11). Generally, such tables are for undirected graphs that we should treat as networks with edges in both directions corresponding to each road, though we might contemplate handling one-way streets for city maps and some similar applications.

Figure 21.4 Distances and paths

Road maps typically contain distance tables like the one in the center for this tiny subset of French cities connected by highways as shown in the graph at the top. Though rarely found in maps, a table like the one at the bottom would also be useful, as it tells what signs to follow to execute the shortest path. For example, to decide how to get from Paris to Nice, we can check the table, which says to begin by following signs to Lyon.

As we see in Section 21.3, it is not difficult to provide other useful information, such as a table that tells how to execute the shortest paths (see Figure 21.4). In modern applications, embedded systems provide this kind of capability in cars and transportation systems. Maps are Euclidean graphs; in Section 21.5, we examine shortest-paths algorithms that take into account the vertex position when they seek shortest paths.

Airline routes: Route maps and schedules for airlines or other transportation systems can be represented as networks for which various shortest-paths problems are of direct importance. For example, we might wish to minimize the time that it takes to fly between two cities, or to minimize the cost of the trip. Costs in such networks might involve functions of time, of money, or of other complicated resources. For example, flights between two cities typically take more time in one direction than the other because of prevailing winds. Air travelers also know that the fare is not necessarily a simple function of the distance between the cities: situations where it is cheaper to use a circuitous route (or endure a stopover) than to take a direct flight are all too common. Such complications can be handled by the basic shortest-paths algorithms that we consider in this chapter; these algorithms are designed to handle any positive costs.

The fundamental shortest-paths computations suggested by these applications only scratch the surface of the applicability of shortest-paths algorithms. In Section 21.6, we consider problems from applications areas that appear unrelated to these natural ones, in the context of a discussion of reduction, a formal mechanism for proving relationships among problems. We solve problems for these applications by transforming them into abstract shortest-paths problems that do not have the intuitive geometric feel of the problems just described. Indeed, some applications lead us to consider shortest-paths problems in networks with negative weights. Such problems can be far more difficult to solve than are problems where negative weights cannot occur. Shortest-paths problems for such applications not only bridge a gap between elementary algorithms and unsolved algorithmic challenges but also lead us to powerful and general problem-solving mechanisms.

As with MST algorithms in Chapter 20, we often mix the weight, cost, and distance metaphors. Again, we normally exploit the natural appeal of geometric intuition even when working in more general settings with arbitrary edge weights; thus we refer to the “length” of paths and edges when we should say “weight” and to one path as “shorter” than another when we should say that it “has lower weight.” We also might say that v is “closer” to s than w when we should say that “the lowest-weight directed path from s to v has weight lower than that of the lowest-weight directed path from s to w,” and so forth. This usage is inherent in the standard use of the term “shortest paths” and is natural even when weights are not related to distances (see Figure 21.2); however, when we expand our algorithms to handle negative weights in Section 21.6, we must abandon such usage.

This chapter is organized as follows. After introducing the basic underlying principles in Section 21.1, we introduce basic algorithms for the single-source and all-pairs shortest-paths problems in Sections 21.2 and 21.3. Then, we consider acyclic networks (or, in a clash of shorthand terms, weighted DAGs) in Section 21.4 and ways of exploiting geometric properties for the source–sink problem in Euclidean graphs in Section 21.5. We then cast off in the other direction to look at more general problems in Sections 21.6 and 21.7, where we explore shortest-paths algorithms, perhaps involving networks with negative weights, as a high-level problem-solving tool.

Exercises

• 21.1 Label the following points in the plane 0 through 5, respectively:

(1, 3) (2, 1) (6, 5) (3, 4) (3, 7) (5, 3).

Taking edge lengths to be weights, consider the network defined by the edges

1-0 3-5 5-2 3-4 5-1 0-3 0-4 4-2 2-3.

Draw the network and give the adjacency-lists structure that is built by Program 20.5.

21.2 Show, in the style of Figure 21.3, all shortest paths in the network defined in Exercise 21.1.

• 21.3 Develop a network class implementation that represents the reverse of the weighted digraph defined by the edges inserted. Include a “reverse copy” constructor that takes a graph as argument and inserts all that graph’s edges to build its reverse.

• 21.4 Show that shortest-paths computations in networks with nonnegative weights on both vertices and edges (where the weight of a path is defined to be the sum of the weights of the vertices and the edges on the path) can be handled by building a network ADT that has weights on only the edges.

21.5 Find a large network online, perhaps a geographic database with entries for roads that connect cities or an airline or railroad schedule that contains distances or costs.

21.6 Write a random-network generator for sparse networks based on Program 17.12. To assign edge weights, define a random-edge–weight ADT and write two implementations: one that generates uniformly distributed weights, another that generates weights according to a Gaussian distribution. Write client programs to generate sparse random networks for both weight distributions with a well-chosen set of values of V and E so that you can use them to run empirical tests on graphs drawn from various distributions of edge weights.

• 21.7 Write a random-network generator for dense networks based on Program 17.13 and edge-weight generators as described in Exercise 21.6. Write client programs to generate random networks for both weight distributions with a well-chosen set of values of V and E so that you can use them to run empirical tests on graphs drawn from these models.

21.8 Implement a representation-independent network client function that builds a network by taking edges with weights (pairs of integers between 0 and V−1 with weights between 0 and 1) from standard input.

• 21.9 Write a program that generates V random points in the plane, then builds a network with edges (in both directions) connecting all pairs of points within a given distance d of one another (see Exercise 17.74), setting each edge’s weight to the distance between the two points that that edge connects. Determine how to set d so that the expected number of edges is E.

• 21.10 Write a base class and derived classes that implement ADTs for graphs that may be undirected or directed graphs, weighted or unweighted, and dense or sparse.

• 21.11 The following table from a published road map purports to give the length of the shortest routes connecting the cities. It contains an error. Correct the table. Also, add a table that shows how to execute the shortest routes, in the style of Figure 21.4.

21.1 Underlying Principles

Our shortest-paths algorithms are based on a simple operation known as relaxation. We start a shortest-paths algorithm knowing only the network’s edges and weights. As we proceed, we gather information about the shortest paths that connect various pairs of vertices. Our algorithms all update this information incrementally, making new inferences about shortest paths from the knowledge gained so far. At each step, we test whether we can find a path that is shorter than some known path. The term “relaxation” is commonly used to describe this step, which relaxes constraints on the shortest path. We can think of a rubber band stretched tight on a path connecting two vertices: A successful relaxation operation allows us to relax the tension on the rubber band along a shorter path.

Our algorithms are based on applying repeatedly one of two types of relaxation operations:

• Edge relaxation: Test whether traveling along a given edge gives a new shortest path to its destination vertex.

• Path relaxation: Test whether traveling through a given vertex gives a new shortest path connecting two other given vertices.
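The second operation in this list can be sketched on a matrix of best known path lengths. The matrix name d, the function name relaxPath, and the use of floating-point infinity to mark "no known path" are assumptions made for illustration, not conventions taken from the text.

```cpp
#include <cassert>
#include <limits>
#include <vector>

using Matrix = std::vector<std::vector<double>>;

// Path relaxation: test whether going from s to t through x is shorter
// than the best known path from s to t, and record the improvement if so.
// d[i][j] holds the length of the best known path from i to j
// (infinity if none is known yet). Returns true if d[s][t] was lowered.
bool relaxPath(Matrix& d, int s, int x, int t) {
    if (d[s][t] > d[s][x] + d[x][t]) {
        d[s][t] = d[s][x] + d[x][t];
        return true;
    }
    return false;
}
```

Edge relaxation is the special case in which the known path from x to t is the single edge x-t. With infinity as the marker, a relaxation through a vertex that has no known path to or from it never succeeds, because infinity is never smaller than the current entry.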

Edge relaxation is a special case of path relaxation; we consider the operations separately, however, because we use them separately (the former in single-source algorithms; the latter in all-pairs algorithms). In both cases, the prime requirement that we impose on the data structures that we use to represent the current state of our knowledge about a network’s shortest paths is that we can update them easily to reflect changes implied by a relaxation operation.

First, we consider edge relaxation, which is illustrated in Figure 21.5. All the single-source shortest-paths algorithms that we consider are based on this step: Does a given edge lead us to consider a shorter path to its destination from the source?

Figure 21.5 Edge relaxation

These diagrams illustrate the relaxation operation that underlies our single-source shortest-paths algorithms. We keep track of the shortest known path from the source s to each vertex and ask whether an edge v-w gives us a shorter path to w. In the top example, it does not; so we would ignore it. In the bottom example, it does; so we would update our data structures to indicate that the best known way to get to w from s is to go to v, then take v-w.

The data structures that we need to support this operation are straightforward. First, we have the basic requirement that we need to compute the shortest-paths lengths from the source to each of the other vertices. Our convention will be to store in a vertex-indexed vector wt the lengths of the shortest known paths from the source to each vertex. Second, to record the paths themselves as we move from vertex to vertex, our convention will be the same as the one that we used for other graph-search algorithms that we examined in Chapters 18 through 20: We use a vertex-indexed vector spt to record the last edge on a shortest path from the source to the indexed vertex. These edges constitute a tree.

With these data structures, implementing edge relaxation is a straightforward task. In our single-source shortest-paths code, we use the following code to relax along an edge e from v to w:

    if (wt[w] > wt[v] + e->wt())
      { wt[w] = wt[v] + e->wt(); spt[w] = e; }
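To see the wt and spt vectors at work, the fragment can be placed in a small self-contained sketch that relaxes every edge repeatedly until no relaxation succeeds. This driver loop is only one simple way to apply the operation and is not presented as the book’s algorithm. The Edge class here is a hypothetical stand-in with v(), w(), and wt() members; the names SPT, shortestPaths, and pathTo are likewise assumptions, and weights are assumed to lie between 0 and 1 so that V can serve as the sentinel.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical stand-in for an edge class: a directed edge v->w with a weight.
class Edge {
    int pv, pw;
    double pwt;
public:
    Edge(int v, int w, double wt) : pv(v), pw(w), pwt(wt) {}
    int v() const { return pv; }
    int w() const { return pw; }
    double wt() const { return pwt; }
};

struct SPT {
    std::vector<double> wt;        // wt[v]: length of shortest known path s..v
    std::vector<const Edge*> spt;  // spt[v]: last edge on that path, or nullptr
};

// Relax all edges over and over until nothing changes.
// Assumes nonnegative weights in [0, 1], so the value V is a safe sentinel.
// The result points into `edges`, which must outlive it unmodified.
SPT shortestPaths(int V, const std::vector<Edge>& edges, int s) {
    SPT r{std::vector<double>(V, static_cast<double>(V)),
          std::vector<const Edge*>(V, nullptr)};
    r.wt[s] = 0.0;
    bool changed = true;
    while (changed) {
        changed = false;
        for (const Edge& edge : edges) {
            const Edge* e = &edge;
            int v = e->v(), w = e->w();
            if (r.wt[w] > r.wt[v] + e->wt()) {   // edge relaxation
                r.wt[w] = r.wt[v] + e->wt();
                r.spt[w] = e;
                changed = true;
            }
        }
    }
    return r;
}

// Trace the path from s to t by following spt links backward from t.
// Returns an empty vector if t is not reachable from s.
std::vector<int> pathTo(const SPT& r, int s, int t) {
    std::vector<int> path;
    if (t != s && r.spt[t] == nullptr) return path;
    for (int v = t; v != s; v = r.spt[v]->v()) path.push_back(v);
    path.push_back(s);
    std::reverse(path.begin(), path.end());
    return path;
}
```

The test inside the loop is the fragment shown above, with one added flag that records whether any relaxation succeeded. Tracing a path takes time proportional to its number of edges, because each spt entry leads directly to the previous vertex on the path.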

This code fragment is both simple and descriptive; we include it in this form in our implementations, rather than defining relaxation as a higher-level abstract operation.

Definition 21.2 Given a network and a designated vertex s, a shortest-paths tree (SPT) for s is a subnetwork containing s and all the vertices reachable from s that forms a directed tree rooted at s such that every tree path is a shortest path in the network.

There may be multiple paths of the same length connecting a given pair of nodes, so SPTs are not necessarily unique. In general, as illustrated in Figure 21.2, if we take shortest paths from a vertex s to every vertex reachable from s in a network and form the subnetwork induced by the edges in the paths, we may get a DAG. Different shortest paths connecting pairs of nodes may each appear as a subpath in some longer path containing both nodes. Because of such effects, we generally are content to compute any SPT for a given digraph and start vertex.

Our algorithms generally initialize the entries in the wt vector with a sentinel value. That value needs to be sufficiently small that the addition in the relaxation test does not cause overflow and sufficiently large that no simple path has a

larger weight. For example, if edge weights are between 0 and 1, we can use the value V. Note that we have to take extra care to check our assumptions when using sentinels in networks that could have negative weights. For example, if both vertices have the sentinel value, the relaxation code just given takes no action if e->wt() is nonnegative (which is probably what we intend in most implementations), but it will change wt[w] and spt[w] if the weight is negative.

Our code always uses the destination vertex as the index to save the SPT edges (spt[w]->w() == w). For economy and consistency with Chapters 17 through 19, we use the notation st[w] to refer to the vertex spt[w]->v() (in the text and particularly in the figures) to emphasize that the spt vector is actually a parent-link representation of

Figure 21.6 Shortest paths trees

The shortest paths from 0 to the other nodes in this network are 0-1, 0-5-4-2, 0-5-4-3, 0-5-4, and 0-5, respectively. These paths define a spanning tree, which is depicted in three representations (gray edges in the network drawing, oriented tree, and parent links with weights) in the center. Links in the parent-link representation (the one that we typically compute) run in the opposite direction from links in the digraph, so we sometimes work with the reverse digraph. The spanning tree defined by shortest paths from 3 to each of the other nodes in the reverse is depicted on the right. The parent-link representation of this tree gives the shortest paths from each of the other nodes to 3 in the original graph. For example, we can find the shortest path 0-5-4-3 from 0 to 3 by following the links st[0] = 5, st[5] = 4, and st[4] = 3.

the shortest-paths tree, as illustrated in Figure 21.6. We can compute the shortest path from s to t by traveling up the tree from t to s; when we do so, we are traversing edges in the direction opposite from their direction in the network and are visiting the vertices on the path in reverse order (t, st[t], st[st[t]], and so forth).

One way to get the edges on the path in order from source to sink from an SPT is to use a stack. For example, the following code prints a path from the source to a given vertex w:

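stack<EDGE> P; EDGE e = spt[w];
// Walk up the tree from w, pushing each edge; the source has no spt edge.
while (e) { P.push(e); e = spt[e->v()]; }
// The top of the stack is the first edge on the path; print its source vertex.
if (!P.empty()) cout << P.top()->v();
// Pop the edges in source-to-sink order, printing each destination vertex.
while (!P.empty())
  { cout << '-' << P.top()->w(); P.pop(); }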
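The same stack idea can be packaged as a function that returns the edges rather than printing them. The following is a minimal sketch under stated assumptions: the Edge struct and the function name pathTo are hypothetical, introduced only for illustration, and spt is taken to hold a null pointer for the source and for any vertex with no known path.

```cpp
#include <cassert>
#include <stack>
#include <vector>

// Hypothetical edge type for this sketch (not the book's EDGE class).
struct Edge {
    int from, to;
    double weight;
    int v() const { return from; }
    int w() const { return to; }
    double wt() const { return weight; }
};

// Return the edges on the tree path from the source to w, in order from
// source to sink. spt[x] is the last edge on the path to x, or nullptr.
std::vector<const Edge*> pathTo(const std::vector<const Edge*>& spt, int w)
{
    assert(w >= 0 && w < static_cast<int>(spt.size()));
    std::stack<const Edge*> P;
    // Follow parent links from w back toward the source.
    for (const Edge* e = spt[w]; e != nullptr; e = spt[e->v()])
        P.push(e);
    // Popping the stack reverses the order: first edge on the path comes out first.
    std::vector<const Edge*> path;
    while (!P.empty()) { path.push_back(P.top()); P.pop(); }
    return path;
}
```

An empty result means either that w is the source or that no path to w has been recorded; a client that needs to tell these cases apart would have to consult the distance data as well.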

In a class implementation, we could use code similar to this to provide clients with a vector that contains the edges of the path.

If we simply want to print or otherwise process the edges on the path, going all the way through the path in reverse order to get to the first edge in this way may be undesirable. One approach to get around this difficulty is to work with the reverse network, as illustrated in Figure 21.6. We use reverse order and edge relaxation in single-source problems because the SPT gives a compact representation of the shortest paths from the source to all the other vertices, in a vector with just V entries.

Next, we consider path relaxation, which is the basis of some of our all-pairs algorithms: Does going through a given vertex lead us to a shorter path that connects two other given vertices? For example, suppose that, for three vertices s, x, and t, we wish to know whether it is better to go from s to x and then from x to t or to go from s to t without going through x. For straight-line connections in a Euclidean space, the triangle inequality tells us that the route through x cannot be shorter than the direct route from s to t, but for paths in a network, it could be (see Figure 21.7). To determine which, we need to know the lengths of paths from s to x, x to t, and of those from s to t (that do not include x). Then, we simply test whether or not the sum of the first two is less than the third; if it is, we update our records accordingly.

Path relaxation is appropriate for all-pairs solutions where we maintain the lengths of the shortest paths that we have encountered between all pairs of vertices. Specifically, in all-pairs shortest-paths code of this kind, we maintain a vector of vectors d such that d[s][t] is the shortest-path length from s to t, and we also maintain a vector of vectors p such that p[s][t] is the next vertex on a shortest path from s to t. We refer to the former as the distances matrix and the latter as the paths matrix. Figure 21.8 shows the two matrices for our example network. The distances matrix is a prime objective of the computation, and we use the paths matrix because it is clearly more compact than, but carries the same information as, the full list of paths that is illustrated in Figure 21.3.

Figure 21.7 Path relaxation

These diagrams illustrate the relaxation operation that underlies our all-pairs shortest-paths algorithms. We keep track of the best known path between all pairs of vertices and ask whether a vertex i is evidence that the shortest known path from s to t could be improved. In the top example, it is not; in the bottom example, it is. Whenever we encounter a vertex i such that the length of the shortest known path from s to i plus the length of the shortest known path from i to t is smaller than the length of the shortest known path from s to t, then we update our data structures to indicate that we now know a shorter path from s to t (head towards i first).

In terms of these data structures, path relaxation amounts to the following code:

if (d[s][t] > d[s][x] + d[x][t])
  { d[s][t] = d[s][x] + d[x][t]; p[s][t] = p[s][x]; }

Like edge relaxation, this code reads as a restatement of the informal description that we have given, so we use it directly in our implementations. More formally, path relaxation reflects the following.

Property 21.1 If a vertex x is on a shortest path from s to t, then that path consists of a shortest path from s to x followed by a shortest path from x to t.

Proof: By contradiction. We could use any shorter path from s to x or from x to t to build a shorter path from s to t.

We encountered the path-relaxation operation when we discussed transitive-closure algorithms, in Section 19.3. If the edge and path weights are either 1 or infinite (that is, a path’s weight is 1 only if all that path’s edges have weight 1), then path relaxation is the operation that we used in Warshall’s algorithm (if we have a path from s to x and a path from x to t, then we have a path from s to t). If we define a path’s weight to be the number of edges on that path, then

Figure 21.8 All shortest paths

The two matrices on the right are compact representations of all the shortest paths in the sample network on the left, containing the same information as the exhaustive list in Figure 21.3. The distances matrix on the left contains the shortest-path length: The entry in row s and column t is the length of the shortest path from s to t. The paths matrix on the right contains the information needed to execute the path: The entry in row s and column t is the next vertex on the path from s to t.

Warshall’s algorithm generalizes to Floyd’s algorithm for finding all shortest paths in unweighted digraphs; it further generalizes to apply to networks, as we see in Section 21.3.

From a mathematician’s perspective, it is important to note that these algorithms all can be cast in a general algebraic setting that unifies and helps us to understand them. From a programmer’s perspective, it is important to note that we can implement each of these algorithms using an abstract + operator (to compute path weights from edge weights) and an abstract < operator (to compute the minimum value in a set of path weights), both solely in the context of the relaxation operation (see Exercises 19.55 and 19.56).

Property 21.1 implies that a shortest path from s to t contains shortest paths from s to every other vertex along the path to t. Most shortest-paths algorithms also compute shortest paths from s to every vertex that is closer to s than t is (whether or not the vertex is on the path from s to t), although that is not a requirement (see Exercise 21.18). Solving the source–sink shortest-paths problem with such an algorithm when t is the vertex that is farthest from s is equivalent to solving the single-source shortest-paths problem for s. Conversely, we could use a solution to the single-source shortest-paths problem from s as a method for finding the vertex that is farthest from s.

The paths matrix that we use in our implementations for the all-pairs problem is also a representation of the shortest-paths trees for each of the vertices. We

defined p[s][t] to be the vertex that follows s on a shortest path from s to t. It is thus the same as the vertex that precedes s on the shortest path from t to s in the reverse network. In other words, column t in the paths matrix of a network is a vertex-indexed vector that represents the SPT for vertex t in its reverse. Conversely, we can build the paths matrix for a network by

Figure 21.9 All shortest paths in a network

These diagrams depict the SPTs for each vertex in the reverse of the network in Figure 21.8 (0 to 5, top to bottom), as network subtrees (left), oriented trees (center), and parent-link representation including a vertex-indexed array for path length (right). Putting the arrays together to form path and distance matrices (where each array becomes a column) gives the solution to the all-pairs shortest-paths problem illustrated in Figure 21.8.

filling each column with the vertex-indexed vector representation of the SPT for the appropriate vertex in the reverse. This correspondence is illustrated in Figure 21.9.

In summary, relaxation gives us the basic abstract operations that we need to build our shortest paths algorithms. The primary complication is the choice of whether to provide the first or final edge on the shortest path. For example, single-source algorithms are more naturally expressed by providing the final edge on the path so that we need only a single vertex-indexed vector to reconstruct the path, since all paths lead back to the source. This choice does not present a fundamental difficulty because we can either use the reverse graph as warranted or provide member functions that hide this difference from clients. For example, we could specify a member function in the interface that returns the edges on the shortest path in a vector (see Exercises 21.15 and 21.16).

Accordingly, for simplicity, all of our implementations in this chapter include a member function dist that returns a shortest-path length and either a member function path that returns the first edge on a shortest path or a member function pathR that returns the final edge on a shortest path. For example, our single-source implementations that use edge relaxation typically implement these functions as follows:

Edge *pathR(int w) const { return spt[w]; }
double dist(int v) { return wt[v]; }

Similarly, our all-paths implementations that use path relaxation typically implement these functions as follows:

Edge *path(int s, int t) { return p[s][t]; }
double dist(int s, int t) { return d[s][t]; }

In some situations, it might be worthwhile to build interfaces that standardize on one or the other or both of these options; we choose the most natural one for the algorithm at hand.
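The distances and paths matrices, together with path relaxation, can be exercised in a small self-contained sketch. Everything named here (the AllPairs struct, addEdge, relaxThrough, and path) is hypothetical and introduced only for illustration. Two assumptions differ from the book's conventions: the sketch stores the next vertex in p, as in the definition given earlier, rather than an edge pointer, and it uses floating-point infinity in place of a finite sentinel, which is safe here because infinity plus a finite weight is still infinity.

```cpp
#include <cassert>
#include <limits>
#include <vector>

const double INF = std::numeric_limits<double>::infinity();

struct AllPairs {
    std::vector<std::vector<double>> d;  // d[s][t]: shortest known length from s to t
    std::vector<std::vector<int>> p;     // p[s][t]: next vertex after s on that path, or -1

    explicit AllPairs(int V)
        : d(V, std::vector<double>(V, INF)), p(V, std::vector<int>(V, -1))
    { for (int s = 0; s < V; s++) d[s][s] = 0.0; }

    // Record a directed edge s-t; keep the lighter one if parallel edges occur.
    void addEdge(int s, int t, double w)
    { if (w < d[s][t]) { d[s][t] = w; p[s][t] = t; } }

    // Apply path relaxation through vertex x to every pair (s, t).
    void relaxThrough(int x)
    {
        int V = static_cast<int>(d.size());
        for (int s = 0; s < V; s++)
            for (int t = 0; t < V; t++)
                if (d[s][t] > d[s][x] + d[x][t])
                    { d[s][t] = d[s][x] + d[x][t]; p[s][t] = p[s][x]; }
    }

    // Follow the paths matrix from s to t; empty if no path is known.
    std::vector<int> path(int s, int t) const
    {
        std::vector<int> vs;
        if (d[s][t] == INF) return vs;
        for (int v = s; v != t; v = p[v][t]) vs.push_back(v);
        vs.push_back(t);
        return vs;
    }
};
```

Calling relaxThrough once for each vertex in turn is the all-pairs scheme that the text attributes to Floyd's algorithm; a single call, as in a quick experiment, only accounts for paths through that one intermediate vertex.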

Exercises

• 21.12 Draw the SPT from 0 for the network defined in Exercise 21.1 and for its reverse. Give the parent-link representation of both trees.

21.13 Consider the edges in the network defined in Exercise 21.1 to be undirected edges such that each edge corresponds to equal-weight edges in both directions in the network. Answer Exercise 21.12 for this corresponding network.

• 21.14 Change the direction of edge 0-2 in Figure 21.2. Draw two different SPTs that are rooted at 2 for this modified network.

21.15 Write a function that uses the pathR member function from a single-source implementation to put pointers to the edges on the path from the source v to a given vertex w in an STL vector.

21.16 Write a function that uses the path member function from an all-paths implementation to put pointers to the edges on the path from a given vertex v to another given vertex w in an STL vector.

21.17 Write a program that uses your function from Exercise 21.16 to print out all of the paths, in the style of Figure 21.3.

21.18 Give an example that shows how we could know that a path from s to t is shortest without knowing the length of a shorter path from s to x for some x.

21.2 Dijkstra’s Algorithm

In Section 20.3, we discussed Prim’s algorithm for finding the minimum spanning tree (MST) of a weighted undirected graph: We build it one edge at a time, always taking next the shortest edge that connects a vertex on the MST to a vertex not yet on the MST. We can use a nearly identical scheme to compute an SPT. We begin by putting the source on the SPT; then, we build the SPT one edge at a time, always taking next the edge that gives a shortest path from the source to a vertex not on the SPT. In other words, we add vertices to the SPT in order of their distance (through the SPT) to the start vertex. This method is known as Dijkstra’s algorithm.

As usual, we need to make a distinction between the algorithm at the level of abstraction in this informal description and various concrete implementations (such as Program 21.1) that differ primarily in graph representation and priority-queue implementations, even though such a distinction is not always made in the literature. We shall consider other implementations and discuss their relationships with Program 21.1 after establishing that Dijkstra’s algorithm correctly performs the single-source shortest-paths computation.

Property 21.2 Dijkstra’s algorithm solves the single-source shortest-paths problem in networks that have nonnegative weights.

Proof: Given a source vertex s, we have to establish that the tree path from the root s to each vertex x in the tree computed by Dijkstra’s algorithm corresponds to a shortest path in the graph from s to x. This fact follows by induction. Assuming that the subtree so far computed has the property, we need only to prove that adding a new vertex x adds a shortest path to that vertex. But all other paths to x must begin with a tree path followed by an edge to a vertex not on the tree. By construction, all such paths are longer than the one from s to x that is under consideration.

The same argument shows that Dijkstra’s algorithm solves the source–sink shortest-paths problem, if we start at the source and stop when the sink comes off the priority queue.

The proof breaks down if the edge weights could be negative, because it assumes that a path’s length does not decrease when we add more edges to the path. In a network with negative edge weights, this assumption is not valid because any edge that we encounter might lead to some tree vertex and might have a sufficiently large negative weight to give a path to that vertex shorter than the tree path. We consider this defect in Section 21.7 (see Figure 21.28).

Figure 21.10 shows the evolution of an SPT for a sample graph when computed with Dijkstra’s algorithm; Figure 21.11 shows an oriented drawing of a larger SPT tree. Although Dijkstra’s algorithm differs from Prim’s MST algorithm in only the choice of priority, SPT trees are different in character from MSTs. They are rooted at the start vertex and all edges are directed away from the root, whereas MSTs are unrooted and undirected. We sometimes represent MSTs as directed, rooted trees when we use Prim’s algorithm, but such structures are still different in character from SPTs (compare the oriented drawing in Figure 20.9 with the drawing in Figure 21.11). Indeed, the nature of the SPT somewhat depends on the choice of start vertex as well, as depicted in Figure 21.12.

Dijkstra’s original implementation, which is suitable for dense graphs, is precisely like Prim’s MST algorithm. Specifically, we simply change the assignment of the priority P in Program 20.6 from

P = e->wt()

(the edge weight) to

P = wt[v] + e->wt()

(the distance from the source to the edge’s destination). This change gives the classical implementation of Dijkstra’s algorithm: We grow

Figure 21.10 Dijkstra’s algorithm

This sequence depicts the construction of a shortest-paths spanning tree rooted at vertex 0 by Dijkstra’s algorithm for a sample network. Thick black edges in the network diagrams are tree edges, and thick gray edges are fringe edges. Oriented drawings of the tree as it grows are shown in the center, and a list of fringe edges is given on the right.

The first step is to add 0 to the tree and the edges leaving it, 0-1 and 0-5, to the fringe (top). Second, we move the shortest of those edges, 0-5, from the fringe to the tree and check the edges leaving it: The edge 5-4 is added to the fringe and the edge 5-1 is discarded because it is not part of a shorter path from 0 to 1 than the known path 0-1 (second from top). The priority of 5-4 on the fringe is the length of the path from 0 that it represents, 0-5-4. Third, we move 0-1 from the fringe to the tree, add 1-2 to the fringe, and discard 1-4 (third from top). Fourth, we move 5-4 from the fringe to the tree, add 4-3 to the fringe, and replace 1-2 with 4-2 because 0-5-4-2 is a shorter path than 0-1-2 (fourth from top). We keep at most one edge to any vertex on the fringe, choosing the one on the shortest path from 0. We complete the computation by moving 4-2 and then 4-3 from the fringe to the tree (bottom).

Figure 21.11 Shortest-paths spanning tree

This figure illustrates the progress of Dijkstra’s algorithm in solving the single-source shortest-paths problem in a random Euclidean near-neighbor digraph (with directed edges in both directions corresponding to each line drawn), in the same style as Figures 18.13, 18.24, and 20.9. The search tree is similar in character to BFS because vertices tend to be connected to one another by short paths, but it is slightly deeper and less broad because distances lead to slightly longer paths than path lengths.

an SPT one edge at a time, each time updating the distance to the tree of all vertices adjacent to its destination while at the same time checking all the nontree vertices to find an edge to move to the tree whose destination vertex is a nontree vertex of minimal distance from the source.

Property 21.3 With Dijkstra’s algorithm, we can find any SPT in a dense network in linear time.

Proof: As for Prim’s MST algorithm, it is immediately clear, from inspection of the code of Program 20.6, that the running time is proportional to $V^2$, which is linear for dense graphs.

For sparse graphs, we can do better, by viewing Dijkstra’s algorithm as a generalized graph-searching method that differs from depth-first search (DFS), from breadth-first search (BFS), and from Prim’s MST algorithm in only the rule used to add edges to the tree. As in Chapter 20, we keep edges that connect tree vertices to nontree vertices on a generalized queue called the fringe, use a priority queue to implement the generalized queue, and provide for updating priorities so as to encompass DFS, BFS, and Prim’s algorithm in a single implementation (see Section 20.3). This priority-first search (PFS) scheme also encompasses Dijkstra’s algorithm. That is, changing the assignment of P in Program 20.7 to

P = wt[v] + e->wt()

(the distance from the source to the edge’s destination) gives an implementation of Dijkstra’s algorithm that is suitable for sparse graphs.

Program 21.1 is an alternative PFS implementation for sparse graphs that is slightly simpler than Program 20.7 and that directly matches the informal description of Dijkstra’s algorithm given at the beginning of this section. It differs from Program 20.7 in that it initializes the priority queue with all the vertices in the network and maintains the queue with the aid of a sentinel value for those vertices that are neither on the tree nor on the fringe (unseen vertices with sentinel values); in contrast, Program 20.7 keeps on the priority queue only those vertices that are reachable by a single edge from the tree. Keeping all the vertices on the queue simplifies the code but can incur a small performance penalty for some graphs (see Exercise 21.31).

The general results that we considered concerning the performance of priority-first search (PFS) in Chapter 20 give us specific information about the performance of these implementations of Dijkstra’s algorithm for sparse graphs (Program 21.1 and Program 20.7, suitably modified). For reference, we restate those results in the present context. Since the proofs do not depend on the priority function, they apply without modification. They are worst-case results that apply to both programs, although Program 20.7 may be more efficient for many classes of graphs because it maintains a smaller fringe.

Property 21.4 For all networks and all priority functions, we can compute a spanning tree with PFS in time proportional to the time required for V insert, V delete the minimum, and E decrease key operations in a priority queue of size at most V.

Proof: This fact is immediate from the priority-queue–based implementations in Program 20.7 or Program 21.1. It represents a conservative upper bound because the size of the priority queue is often much smaller than V, particularly for Program 20.7.

Property 21.5 With a PFS implementation of Dijkstra’s algorithm that uses a heap for the priority-queue implementation, we can compute any SPT in time proportional to E lg V.

Program 21.1 Dijkstra’s algorithm (priority-first search)

This class implements a single-source shortest-paths ADT with linear-time preprocessing, private data that takes space proportional to V, and constant-time member functions that give the length of the shortest path and the final vertex on the path from the source to any given vertex. The constructor is an implementation of Dijkstra’s algorithm that uses a priority queue of vertices (in order of their distance from the source) to compute an SPT. The priority-queue interface is the same one used in Program 20.7 and implemented in Program 20.10.

The constructor is also a generalized graph search that implements other PFS algorithms with other assignments to the priority P (see text). The statement to reassign the weight of tree vertices to 0 is needed for a general PFS implementation but not for Dijkstra’s algorithm, since the priorities of the vertices added to the SPT are nondecreasing.
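The following is a minimal sketch of a priority-queue-based Dijkstra computation in the spirit of this description, under clearly different assumptions. It uses std::priority_queue from the C++ standard library, which offers no decrease-key operation, so it pushes a fresh entry on every successful relaxation and skips stale entries as they come off the queue. That is a substitute technique, not the priority-queue interface of Program 20.7 and Program 20.10, and the queue may hold up to E entries rather than at most V. The names Edge, SPT, dist, and st, the adjacency-list layout, and the use of infinity as the sentinel are all illustrative choices; st records the parent vertex rather than an edge pointer.

```cpp
#include <cassert>
#include <functional>
#include <limits>
#include <queue>
#include <utility>
#include <vector>

struct Edge { int v, w; double wt; };   // hypothetical: directed edge v-w with weight wt

class SPT {
    std::vector<double> wt_;   // wt_[x]: shortest-path length from the source to x
    std::vector<int> st_;      // st_[x]: parent of x in the SPT, or -1
public:
    // adj[v] lists the edges leaving v; all weights are assumed nonnegative.
    SPT(const std::vector<std::vector<Edge>>& adj, int s)
        : wt_(adj.size(), std::numeric_limits<double>::infinity()),
          st_(adj.size(), -1)
    {
        assert(s >= 0 && s < static_cast<int>(adj.size()));
        using Item = std::pair<double, int>;            // (distance, vertex)
        std::priority_queue<Item, std::vector<Item>, std::greater<Item>> pq;
        wt_[s] = 0.0;
        pq.push({0.0, s});
        while (!pq.empty()) {
            Item top = pq.top(); pq.pop();
            int v = top.second;
            if (top.first > wt_[v]) continue;           // stale entry: v already settled
            for (const Edge& e : adj[v]) {
                double P = wt_[v] + e.wt;               // priority: distance via v
                if (P < wt_[e.w])                       // edge relaxation
                    { wt_[e.w] = P; st_[e.w] = v; pq.push({P, e.w}); }
            }
        }
    }
    double dist(int v) const { return wt_[v]; }
    int st(int v) const { return st_[v]; }
};
```

A client builds the adjacency lists, constructs an SPT object for a source, and then queries dist and st in constant time; vertices unreachable from the source keep an infinite distance and a parent of -1.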

Figure 21.12 SPT examples

These three examples show growing SPTs for three different source locations: left edge (top), upper left corner (center), and center (bottom).

Proof: This result is a direct consequence of Property 21.4.

Property 21.6 Given a graph with V vertices and E edges, let d denote the density E/V. If d < 2, then the running time of Dijkstra’s algorithm is proportional to V lg V. Otherwise, we can improve the worst-case running time by a factor of lg(E/V), to $O(E \lg_d V)$ (which is linear if E is at least $V^{1+\epsilon}$) by using a $\lceil E/V \rceil$-ary heap for the priority queue.

Proof: This result directly mirrors Property 20.12 and the multiway-heap priority-queue implementation discussed directly thereafter.
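The choice of heap arity in Property 21.6 is simple arithmetic, sketched below. The function names heapArity, parent, and child are hypothetical, and the 0-based array layout is an assumption made for this illustration; it is not taken from Program 20.10.

```cpp
#include <cassert>

// Arity suggested by Property 21.6: a binary heap when E/V < 2,
// otherwise a ceil(E/V)-ary heap.
int heapArity(long long V, long long E)
{
    assert(V > 0 && E >= 0);
    long long d = (E + V - 1) / V;          // ceil(E/V) in integer arithmetic
    return d < 2 ? 2 : static_cast<int>(d);
}

// Index arithmetic for a d-ary heap stored in a 0-based array (an assumed layout).
long long parent(long long i, int d) { return (i - 1) / d; }          // valid for i > 0
long long child(long long i, int d, int k) { return d * i + 1 + k; }  // k in [0, d)
```

A larger arity makes the tree shallower, which cheapens the E decrease key operations at the cost of examining d children in each of the V delete the minimum operations.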

Table 21.1 Priority-first search algorithms

These four classical graph-processing algorithms all can be implemented with PFS, a generalized priority-queue–based graph search that builds graph spanning trees one edge at a time. Details of search dynamics depend upon graph representation, priority-queue implementation, and PFS implementation; but the search trees generally characterize the various algorithms, as illustrated in the figures referenced in the fourth column.

Table 21.1 summarizes pertinent information about the four major PFS algorithms that we have considered. They differ in only the priority function used, but this difference leads to spanning trees that are entirely different from one another in character (as required). For the example in the figures referred to in the table (and for many other graphs), the DFS tree is tall and thin, the BFS tree is short and fat, the SPT is like the BFS tree but neither quite as short nor quite as fat, and the MST is neither short and fat nor tall and thin.

We have also considered four different implementations of PFS. The first is the classical dense-graph implementation that encompasses Dijkstra’s algorithm and Prim’s MST algorithm (Program 20.6); the other three are sparse-graph implementations that differ in priority-queue contents:

• Fringe edges (Program 18.10)
• Fringe vertices (Program 20.7)
• All vertices (Program 21.1)

Of these, the first is primarily of pedagogical value, the second is the most refined of the three, and the third is perhaps the simplest. This framework already describes 16 different implementations of classical graph-search algorithms; when we factor in different priority-queue implementations, the possibilities multiply further. This proliferation

Table 21.2 Cost of implementations of Dijkstra’s algorithm

This table summarizes the cost (worst-case running time) of various implementations of Dijkstra’s algorithm. With appropriate priority-queue implementations, the algorithm runs in linear time (time proportional to $V^2$ for dense networks, E for sparse networks), except for networks that are extremely sparse.

of networks, algorithms, and implementations underscores the utility of the general statements about performance in Properties 21.4 through 21.6, which are also summarized in Table 21.2.

As is true of MST algorithms, actual running times of shortest-paths algorithms are likely to be lower than these worst-case time bounds suggest, primarily because most edges do not necessitate decrease key operations. In practice, except for the sparsest of graphs, we regard the running time as being linear.

The name Dijkstra’s algorithm is commonly used to refer both to the abstract method of building an SPT by adding vertices in order of their distance from the source and to its implementation as the $V^2$ algorithm for the adjacency-matrix representation, because Dijkstra presented both in his 1959 paper (and also showed that the same approach could compute the MST). Performance improvements for sparse graphs are dependent on later improvements in ADT technology and priority-queue implementations that are not specific to the shortest-paths problem. Improved performance of Dijkstra’s algorithm is one of the most important applications of that technology (see reference section). As with MSTs, we use terminology such as the “PFS implementation of Dijkstra’s algorithm using d-heaps” to identify specific combinations.

We saw in Section 18.8 that, in unweighted undirected graphs, using preorder numbering for priorities causes the priority queue to operate as a FIFO queue and leads to a BFS. Dijkstra’s algorithm gives us another realization of BFS: When all edge weights are 1, it visits vertices in order of the number of edges on the shortest path to the start vertex. The priority queue does not operate precisely as a FIFO queue would in this case, because items with equal priority do not necessarily come out in the order in which they went in.

Each of these implementations puts the edges of an SPT from vertex 0 in the

vertex-indexed vector spt, with the shortest-path length to each vertex in the SPT in the vertex-indexed vector wt, and provides member functions that give clients access to this data. As usual, we can build various graph-processing functions and classes around this basic data (see Exercises 21.21 through 21.28).
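The classical dense-graph approach described earlier in this section, in which each step scans all nontree vertices for the one of minimal distance, can be sketched as follows. This is an illustration under stated assumptions, not Program 20.6: the names denseSPT, DenseResult, and NO_EDGE are hypothetical, the network is given as an adjacency matrix of nonnegative weights with infinity marking a missing edge, and the scan for the minimum and the update of distances are written as two separate loops for clarity.

```cpp
#include <cassert>
#include <limits>
#include <vector>

const double NO_EDGE = std::numeric_limits<double>::infinity();

struct DenseResult {
    std::vector<double> wt;   // wt[x]: shortest-path length from the source to x
    std::vector<int> st;      // st[x]: parent of x in the SPT, or -1
};

// a[v][w] is the weight of edge v-w, or NO_EDGE; the running time is
// proportional to V*V regardless of the number of edges.
DenseResult denseSPT(const std::vector<std::vector<double>>& a, int s)
{
    int V = static_cast<int>(a.size());
    assert(s >= 0 && s < V);
    DenseResult r{ std::vector<double>(V, NO_EDGE), std::vector<int>(V, -1) };
    std::vector<bool> onTree(V, false);
    r.wt[s] = 0.0;
    for (int step = 0; step < V; step++) {
        // Find the reachable nontree vertex of minimal distance from the source.
        int v = -1;
        for (int u = 0; u < V; u++)
            if (!onTree[u] && r.wt[u] < NO_EDGE && (v == -1 || r.wt[u] < r.wt[v]))
                v = u;
        if (v == -1) break;               // every remaining vertex is unreachable
        onTree[v] = true;
        // Relax every edge leaving v whose destination is not yet on the tree.
        for (int w = 0; w < V; w++) {
            double P = r.wt[v] + a[v][w];
            if (!onTree[w] && P < r.wt[w]) { r.wt[w] = P; r.st[w] = v; }
        }
    }
    return r;
}
```

No priority queue is needed here, which is why this form suits dense networks: the V scans of V vertices already match the cost of examining every entry of the matrix.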

Exercises

• 21.19 Show, in the style of Figure 21.10, the result of using Dijkstra’s algorithm to compute the SPT of the network defined in Exercise 21.1 with start vertex 0.

• 21.20 How would you find a second shortest path from s to t in a network?

21.21 Write a client function that uses an SPT object to find the most distant vertex from a given vertex s (the vertex whose shortest path from s is the longest).

21.22 Write a client function that uses an SPT object to compute the average of the lengths of the shortest paths from a given vertex to each of the vertices reachable from it.

21.23 Develop a class based on Program 21.1 with a path member function that returns an STL vector containing pointers to the edges on the shortest path connecting s and t, in order from s to t.

• 21.24 Write a client function that uses your class from Exercise 21.23 to print the shortest paths from a given vertex to each of the other vertices in a given network.

21.25 Write a client function that uses an SPT object to find all vertices within a given distance d of a given vertex in a given network. The running time of your function should be proportional to the size of the subgraph induced by those vertices and the vertices incident on them.

21.26 Develop an algorithm for finding an edge whose removal causes maximal increase in the shortest-path length from one given vertex to another given vertex in a given network.

• 21.27 Implement a class that uses SPT objects to perform a sensitivity analysis on the network’s edges with respect to a given pair of vertices s and t: Compute a V-by-V matrix such that, for every u and v, the entry in row u and column v is 1 if u-v is an edge in the network whose weight can be increased without the shortest-path length from s to t being increased and is 0 otherwise.

• 21.28 Implement a class that uses SPT objects to find a shortest path connecting one given set of vertices with another given set of vertices in a given network.

21.29 Use your solution from Exercise 21.28 to implement a client function that finds a shortest path from the left edge to the right edge in a random grid network (see Exercise 20.17).

21.30 Show that an MST of an undirected graph is equivalent to a bottleneck SPT of the graph: For every pair of vertices v and w, it gives the path connecting them whose longest edge is as short as possible.

21.31 Run empirical studies to compare the performance of the two versions of Dijkstra’s algorithm for the sparse graphs that are described in this section (Program 21.1 and Program 20.7, with suitable priority definition), for various networks (see Exercises 21.4–8). Use a standard-heap priority-queue implementation.

21.32 Run empirical studies to learn the best value of d when using a d-heap priority-queue implementation (see Program 20.10) for each of the three PFS implementations that we have discussed (Program 18.10, Program 20.7, and Program 21.1), for various networks (see Exercises 21.4–8).

• 21.33 Run empirical studies to determine the effect of using an index-heap-tournament priority-queue implementation (see Exercise 9.53) in Program 21.1, for various networks (see Exercises 21.4–8).

• 21.34 Run empirical studies to analyze height and average path length in SPTs, for various networks (see Exercises 21.4–8).

21.35 Develop a class for the source–sink shortest-paths problem that is based on code like Program 21.1 but that initializes the priority queue with both the source and the sink. Doing so leads to the growth of an SPT from each vertex; your main task is to decide precisely what to do when the two SPTs collide.

• 21.36 Describe a family of graphs with V vertices and E edges for which the worst-case running time of Dijkstra’s algorithm is achieved.

• 21.37 Develop a reasonable generator for random graphs with V vertices and E edges for which the running time of the heap-based PFS implementation of Dijkstra’s algorithm is superlinear.

• 21.38 Write a client program that does dynamic graphical animations of Dijkstra’s algorithm. Your program should produce images like Figure 21.11 (see Exercises 17.56 through 17.60). Test your program on random Euclidean networks (see Exercise 21.9).

21.3 All-Pairs Shortest Paths

In this section, we consider two classes that solve the all-pairs shortest-paths problem. The algorithms that we implement directly generalize two basic algorithms that we considered in Section 19.3 for the transitive closure problem. The first method is to run Dijkstra’s algorithm from each vertex to get the shortest paths from that vertex to each of the others. If we implement the priority queue with a heap, the worst-case running time for this approach is proportional to V E lg V, and we can improve this bound to V E for many types of networks by using a d-ary heap. The second method, which allows us to solve the problem directly in time proportional to $V^3$, is an extension of Warshall’s algorithm that is known as Floyd’s algorithm.

Both of these classes implement an abstract–shortest-paths ADT interface for finding shortest distances and paths. This interface, which is shown in Program 21.2, is a generalization to weighted digraphs of the abstract–transitive closure interface for connectivity queries in digraphs that we studied in Chapter 19. In both class implementations, the constructor solves the all-pairs shortest-paths problem and saves the result in private data members to support query functions that return the shortest-path length from one given vertex to another and either the first or last edge on the path. Implementing such an ADT is a primary reason to use all-pairs shortest-paths algorithms in practice.

Program 21.3 is a sample client program that uses the all–shortest-paths ADT interface to find the weighted diameter of a network. It checks all pairs of vertices to find the one for which the shortest-path length is longest; then, it traverses the path, edge by edge. Figure 21.13 shows the path computed by this program for our Euclidean network example.

Figure 21.13 Diameter of a network

The largest entry in a network’s all-shortest-paths matrix is the diameter of the network: the length of the longest of the shortest paths, depicted here for our sample Euclidean network.

The goal of the algorithms in this section is to support constant-time implementations of the query functions. Typically, we expect to have a huge number of such requests, so we are willing to invest substantial resources in private data members and preprocessing in the constructor to be able to answer the queries quickly. Both of the algorithms that we consider use space proportional to $V^2$ for the private data members.

The primary disadvantage of this general approach is that, for a huge network, we may not have so much space available (or we might

Program 21.2 All-pairs shortest-paths ADT

Our solutions to the all-pairs shortest-paths problem are all classes with a constructor and two query functions: a dist function that returns the length of the shortest path from the first argument to the second; and one of two possible path functions, either path, which returns a pointer to the first edge on the shortest path, or pathR, which returns a pointer to the final edge on the shortest path. If there is no such path, the path function returns 0 and dist is undefined.

We use path or pathR as convenient for the algorithm under scrutiny; in practice, we might need to settle upon one or the other (or both) in the interface and use various transfer functions in implementations, as discussed in Section 21.1 and in the exercises at the end of this section.

template <class Graph, class Edge> class SPall
  {
    public:
      SPall(const Graph &);
      Edge *path(int, int) const;
      Edge *pathR(int, int) const;
      double dist(int, int) const;
  };

notbeabletoaffordtherequisitepreprocessingtime).Inprinciple,ourinterfaceprovidesuswiththelatitudetotradeoffpreprocessingtimeandspaceforquerytime. Ifweexpectonlya fewqueries,wecandonopreprocessingandsimplyrunasingle-sourcealgorithmforeachquery,butintermediatesituationsrequiremore advanced algorithms (see Exercises 21.48 through 21.50). This problemgeneralizes one that challenged us for much of Chapter 19: the problem ofsupportingfastreachabilityqueriesinlimitedspace.Thefirstall-pairsshortest-pathsADTfunctionimplementationthatweconsidersolves the problem by using Dijkstra’s algorithm to solve the single-sourceproblemforeachvertex.InC++,wecanexpressthemethoddirectly,asshownin Program 21.4: We build a vector of SPT objects, one to solve the single-sourceproblemforeachvertex.ThismethodgeneralizestheBFS-basedmethodforunweightedundirectedgraphs thatweconsideredinSection17.7. It isalsosimilar

Program21.3ComputingthediameterofanetworkThisclientfunctionillustratestheuseoftheinterfaceinProgram21.2.Itfindsthe

longestoftheshortestpathsinthegivennetwork,printsthepath,andreturnsitsweight(thediameterofthenetwork).

template<classGraph,classEdge>

doublediameter(Graph&G)

{intvmax=0,wmax=0;

allSP<Graph,Edge>all(G);


for(intw=0;w<G.V();w++)

if(all.path(v,w))

if(all.dist(v,w)>all.dist(vmax,wmax))

{vmax=v;wmax=w;}

intv=vmax;cout<<v;

while(v!=wmax)

{v=all.path(v,wmax)->w();cout<<”-”<<v;}

returnall.dist(vmax,wmax);

}

toouruseofaDFSthatstartsateachvertextocomputethetransitiveclosureofunweighteddigraphs,inProgram19.4.Property 21.7With Dijkstra’s algorithm, we can find all shortest paths in anetworkthathasnonnegativeweightsintimeproportionaltoVElogdV,whered=2ifE<2V,andd=E/Votherwise.Proof:ImmediatefromProperty21.6.Asareourbounds for the single-source shortest-pathsand theMSTproblems,thisboundisconservative;andarunningtimeofVEislikelyfortypicalgraphs.To compare this implementationwith others, it is useful to study thematricesimplicit in thevector-of-vectors structureof theprivatedatamembers.ThewtvectorsformpreciselythedistancesmatrixthatweconsideredinSection21.1:Theentryinrowsandcolumntisthelengthoftheshortestpathfromstot.Asillustrated in Figures 21.8 and 21.9, the spt vectors from the transpose of thepathsmatrix:Theentry in rowsandcolumn t is the lastentryon the shortestpathfromstot.

Program 21.4 Dijkstra’s algorithm for all shortest paths

This class uses Dijkstra’s algorithm to build an SPT for each vertex so that it can answer pathR and dist queries for any pair of vertices.

#include "SPT.cc"
template <class Graph, class Edge> class allSP
{ const Graph &G;
  vector< SPT<Graph, Edge> *> A;   // A[s] is the SPT rooted at s
public:
  allSP(const Graph &G) : G(G), A(G.V())
    { for (int s = 0; s < G.V(); s++)
        A[s] = new SPT<Graph, Edge>(G, s); }
  Edge *pathR(int s, int t) const
    { return A[s]->pathR(t); }
  double dist(int s, int t) const
    { return A[s]->dist(t); }
};
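Program 21.4 depends on the SPT class from SPT.cc, which is not reproduced in this section. As a rough, self-contained illustration of the same idea (one single-source Dijkstra computation per vertex), the sketch below may help. It is not the book's code: the names Arc, Network, SPTree, dijkstraSPT, DijkstraAllSP, and kUnreached are hypothetical, the graph is assumed to be a plain vector of adjacency lists with nonnegative weights, and std::priority_queue with lazy deletion stands in for the book's priority queue. Here pathR returns the index of the vertex that precedes t (or -1), not an edge pointer.

```cpp
#include <functional>
#include <limits>
#include <queue>
#include <utility>
#include <vector>

// Hypothetical types for this sketch (not the book's Graph/Edge classes).
struct Arc { int to; double wt; };
using Network = std::vector<std::vector<Arc>>;   // adjacency lists

const double kUnreached = std::numeric_limits<double>::infinity();

// Result of one single-source computation from s.
struct SPTree {
    std::vector<double> wt;   // wt[t]: shortest-path length from s to t
    std::vector<int> pred;    // pred[t]: vertex before t on that path, or -1
};

// Dijkstra's algorithm; assumes all edge weights are nonnegative.
SPTree dijkstraSPT(const Network& G, int s) {
    SPTree T{std::vector<double>(G.size(), kUnreached),
             std::vector<int>(G.size(), -1)};
    using Item = std::pair<double, int>;          // (distance, vertex)
    std::priority_queue<Item, std::vector<Item>, std::greater<Item>> pq;
    T.wt[s] = 0.0;
    pq.push({0.0, s});
    while (!pq.empty()) {
        Item top = pq.top();
        pq.pop();
        int v = top.second;
        if (top.first > T.wt[v]) continue;        // stale queue entry
        for (const Arc& a : G[v])
            if (T.wt[v] + a.wt < T.wt[a.to]) {    // relax edge v-a.to
                T.wt[a.to] = T.wt[v] + a.wt;
                T.pred[a.to] = v;
                pq.push({T.wt[a.to], a.to});
            }
    }
    return T;
}

// All-pairs class: the constructor builds one tree per source vertex.
class DijkstraAllSP {
    std::vector<SPTree> A;    // A[s] holds the result for source s
public:
    explicit DijkstraAllSP(const Network& G) {
        for (int s = 0; s < static_cast<int>(G.size()); s++)
            A.push_back(dijkstraSPT(G, s));
    }
    double dist(int s, int t) const { return A[s].wt[t]; }
    int pathR(int s, int t) const { return A[s].pred[t]; }
};
```

After construction, both queries are simple table lookups, which is the constant-time behavior the interface is meant to support; the space used is proportional to V^2.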

For dense graphs, we could use an adjacency-matrix representation and avoid computing the reverse graph by implicitly transposing the matrix (interchanging the row and column indices), as in Program 19.7. Developing an implementation along these lines is an interesting programming exercise and leads to a compact implementation (see Exercise 21.43); however, a different approach, which we consider next, admits an even more compact implementation.

The method of choice for solving the all-pairs shortest-paths problem in dense graphs, which was developed by R. Floyd, is precisely the same as Warshall’s method, except that instead of using the logical or operation to keep track of the existence of paths, it checks distances for each edge to determine whether that edge is part of a new shorter path. Indeed, as we have noted, Floyd’s and Warshall’s algorithms are identical in the proper abstract setting (see Sections 19.3 and 21.1).

Program 21.5 is an all-pairs shortest-paths ADT function that implements Floyd’s algorithm. It explicitly uses the matrices from Section 21.1 as private data members: a V-by-V vector of vectors d for the distances matrix, and another V-by-V vector of vectors p for the paths table. For every pair of vertices s and t, the constructor sets d[s][t] to the shortest-path length from s to t (to be returned by the dist member function) and p[s][t] to the index of the next vertex on the shortest path from s to t (to be returned by the path member function). The implementation is based upon the path relaxation operation that we considered in Section 21.1.

Program 21.5 Floyd’s algorithm for all shortest paths

This implementation of the interface in Program 21.2 uses Floyd’s algorithm, a generalization of Warshall’s algorithm (see Program 19.3) that finds the shortest paths between each pair of points instead of just testing for their existence.

After initializing the distances and paths matrices with the graph’s edges, we do a series of relaxation operations to compute the shortest paths. The algorithm is simple to implement, but verifying that it computes the shortest paths is more complicated (see text).

Property 21.8 With Floyd’s algorithm, we can find all shortest paths in a network in time proportional to V^3.

Proof: The running time is immediate from inspection of the code. We prove that the algorithm is correct by induction in precisely the same way as we did for Warshall’s algorithm. The ith iteration of the loop computes a shortest path from s to t in the network that does not include any vertices with indices greater than i (except possibly the endpoints s and t). Assuming this fact to be true for the ith iteration of the loop, we prove it to be true for the (i+1)st iteration of the loop. A shortest path from s to t that does not include any vertices with indices greater than i+1 is either (i) a path from s to t that does not include any vertices with indices greater than i, of length d[s][t], that was found on a previous iteration of the loop, by the inductive hypothesis; or (ii) composed of a path from s to i+1 and a path from i+1 to t, neither of which includes any vertices with indices greater than i, in which case the inner loop sets d[s][t].

Figure 21.14 is a detailed trace of Floyd’s algorithm on our sample network. If we convert each blank entry to 0 (to indicate the absence of an edge) and convert each nonblank entry to 1 (to indicate the presence of an edge), then these matrices describe the operation of Warshall’s algorithm in precisely the same manner as we did in Figure 19.15. For Floyd’s algorithm, the nonblank entries indicate more than the existence of a path; they give information about the shortest known path. An entry in the distance matrix has the length of the shortest known path connecting the vertices corresponding to the given row and column; the corresponding entry in the paths matrix gives the next vertex on that path. As the matrices become filled with nonblank entries, running Warshall’s algorithm amounts to just double-checking that new paths connect pairs of vertices already known to be connected by a path; in contrast, Floyd’s algorithm must compare (and update if necessary) each new path to see whether the new path leads to shorter paths.
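The code for Program 21.5 does not appear in this section, so the following is only a minimal sketch of Floyd’s algorithm as the text describes it, not the book’s implementation. It assumes a hypothetical edge record WEdge and an edge-list input instead of the book’s Graph and Edge classes; the class name FloydAllSP, the constant kNoFloydPath, and the member name next (which returns a vertex index, where the book’s path returns an edge pointer) are likewise illustrative choices.

```cpp
#include <limits>
#include <vector>

// Hypothetical edge record for this sketch (not the book's EDGE class).
struct WEdge { int v, w; double wt; };

// Sentinel distance meaning "no path known".
const double kNoFloydPath = std::numeric_limits<double>::infinity();

class FloydAllSP {
    std::vector<std::vector<double>> d;  // d[s][t]: shortest known s-t length
    std::vector<std::vector<int>> p;     // p[s][t]: vertex after s, or -1
public:
    FloydAllSP(int V, const std::vector<WEdge>& edges)
        : d(V, std::vector<double>(V, kNoFloydPath)),
          p(V, std::vector<int>(V, -1)) {
        // Initialize the matrices from the edges.
        for (int s = 0; s < V; s++) d[s][s] = 0.0;
        for (const WEdge& e : edges)
            if (e.wt < d[e.v][e.w]) { d[e.v][e.w] = e.wt; p[e.v][e.w] = e.w; }
        // Iteration i allows vertex i as an intermediate vertex.
        for (int i = 0; i < V; i++)
            for (int s = 0; s < V; s++) {
                if (d[s][i] == kNoFloydPath) continue;  // no path s..i
                for (int t = 0; t < V; t++)
                    if (d[s][i] + d[i][t] < d[s][t]) {  // path relaxation
                        d[s][t] = d[s][i] + d[i][t];
                        p[s][t] = p[s][i];              // first step toward i
                    }
            }
    }
    double dist(int s, int t) const { return d[s][t]; }
    int next(int s, int t) const { return p[s][t]; }
};
```

A client can follow a path from s to t by repeatedly replacing s with next(s, t) until s equals t. The three nested loops over the vertices account for the running time proportional to V^3 stated in Property 21.8.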

Figure 21.14 Floyd’s algorithm

This sequence shows the construction of the all-pairs shortest-paths matrices with Floyd’s algorithm. For i from 0 to 5 (top to bottom), we consider, for all s and t, all of the paths from s to t having no intermediate vertices greater than i (the shaded vertices). Initially, the only such paths are the network’s edges, so the distances matrix (center) is the graph’s adjacency matrix and the paths matrix (right) is set with p[s][t] = t for each edge s-t. For vertex 0 (top), the algorithm finds that 3-0-1 is shorter than the sentinel value that is present because there is no edge 3-1 and updates the matrices accordingly. It does not do so for paths such as 3-0-5, which is not shorter than the known path 3-5. Next the algorithm considers paths through 0 and 1 (second from top) and finds the new shorter paths 0-1-2, 0-1-4, 3-0-1-2, 3-0-1-4, and 5-1-2. The third row from the top shows the updates corresponding to shorter paths through 0, 1, and 2 and so forth.

Black numbers overstriking gray ones in the matrices indicate situations where the algorithm finds a shorter path than one it found earlier. For example, .91 overstrikes 1.37 in row 3 and column 2 in the bottom diagram because the algorithm discovered that 3-5-4-2 is shorter than 3-0-1-2.

Comparing the worst-case bounds on the running times of Dijkstra’s and Floyd’s algorithms, we can draw the same conclusion for these all-pairs shortest-paths algorithms as we did for the corresponding transitive-closure algorithms in Section 19.3. Running Dijkstra’s algorithm on each vertex is clearly the method of choice for sparse networks, because the running time is close to VE. As density increases, Floyd’s algorithm, which always takes time proportional to V^3, becomes competitive (see Exercise 21.67); it is widely used because it is so simple to implement.

A more fundamental distinction between the algorithms, which we examine in detail in Section 21.7, is that Floyd’s algorithm is effective in even those networks that have negative weights (provided that there are no negative cycles). As we noted in Section 21.2, Dijkstra’s method does not necessarily find shortest paths in such graphs.

The classical solutions to the all-pairs shortest-paths problem that we have described presume that we have space available to hold the distances and paths matrices. Huge sparse graphs, where we cannot afford to have any V-by-V matrices, present another set of challenging and interesting problems. As we saw in Chapter 19, it is an open problem to reduce this space cost to be proportional to V while still supporting constant-time shortest-path-length queries. We found the analogous problem to be difficult even for the simpler reachability problem (where we are satisfied with learning in constant time whether there is any path connecting a given pair of vertices), so we cannot expect a simple solution for the all-pairs shortest-paths problem. Indeed, the number of different shortest path lengths is, in general, proportional to V^2 even for sparse graphs. That value, in some sense, measures the amount of information that we need to process, and perhaps indicates that when we do have restrictions on space, we must expect to spend more time on each query (see Exercises 21.48 through 21.50).

Exercises

• 21.39 Estimate, to within a factor of 10, the largest graph (measured by its number of vertices) that your computer and programming system could handle if you were to use Floyd’s algorithm to compute all its shortest paths in 10 seconds.

• 21.40 Estimate, to within a factor of 10, the largest graph of density 10 (measured by its number of edges) that your computer and programming system could handle if you were to use Dijkstra’s algorithm to compute all its shortest paths in 10 seconds.

21.41 Show, in the style of Figure 21.9, the result of using Dijkstra’s algorithm to compute all shortest paths of the network defined in Exercise 21.1.

21.42 Show, in the style of Figure 21.14, the result of using Floyd’s algorithm to compute all shortest paths of the network defined in Exercise 21.1.

• 21.43 Combine Program 20.6 and Program 21.4 to make an implementation of the all-pairs shortest-paths ADT interface (based on Dijkstra’s algorithm) for dense networks that supports path queries but does not explicitly compute the reverse network. Do not define a separate function for the single-source solution; put the code from Program 20.6 directly in the inner loop and put results directly in private data members d and p like those in Program 21.5.

21.44 Run empirical tests, in the style of Table 20.2, to compare Dijkstra’s algorithm (Program 21.4 and Exercise 21.43) and Floyd’s algorithm (Program 21.5), for various networks (see Exercises 21.4–8).

21.45 Run empirical tests to determine the number of times that Floyd’s and Dijkstra’s algorithms update the values in the distances matrix, for various networks (see Exercises 21.4–8).

21.46 Give a matrix in which the entry in row s and column t is equal to the number of different simple directed paths connecting s and t in Figure 21.1.

21.47 Implement a class whose constructor computes the path-count matrix that is described in Exercise 21.46 so that it can provide count queries through a public member function in constant time.

• 21.48 Develop a class implementation of the abstract–shortest-paths ADT for sparse graphs that cuts the space cost to be proportional to V, by increasing the query time to be proportional to V.

• 21.49 Develop a class implementation of the abstract–shortest-paths ADT for sparse graphs that uses substantially less than O(V^2) space but supports queries in substantially less than O(V) time. Hint: Compute all shortest paths for a subset of the vertices.

• 21.50 Develop a class implementation of the abstract–shortest-paths ADT for sparse graphs that uses substantially less than O(V^2) space and (using randomization) supports queries in constant expected time.

• 21.51 Develop a class implementation of the abstract–shortest-paths ADT that takes the lazy approach of using Dijkstra’s algorithm to build the SPT (and associated distance vector) for each vertex s the first time that the client issues a shortest-path query from s, then references the information on subsequent queries.

21.52 Modify the shortest-paths ADT and Dijkstra’s algorithm to handle shortest-paths computations in networks that have weights on both vertices and edges. Do not rebuild the graph representation (the method described in Exercise 21.4); modify the code instead.

• 21.53 Build a small model of airline routes and connection times, perhaps based upon some flights that you have taken. Use your solution to Exercise 21.52 to compute the fastest way to get from one of the served destinations to another. Then test your program on real data (see Exercise 21.5).

21.4 Shortest Paths in Acyclic Networks

In Chapter 19, we found that, despite our intuition that DAGs should be easier to process than general digraphs, developing algorithms with substantially better performance for DAGs than for general digraphs is an elusive goal. For shortest-paths problems, we do have algorithms for DAGs that are simpler and faster than the priority-queue–based methods that we have considered for general digraphs. Specifically, in this section we consider algorithms for acyclic networks that

• Solve the single-source problem in linear time.
• Solve the all-pairs problem in time proportional to VE.
• Solve other problems, such as finding longest paths.

In the first two cases, we cut the logarithmic factor from the running time that is present in our best algorithms for sparse networks; in the third case, we have simple algorithms for problems that are intractable for general networks. These algorithms are all straightforward extensions to the algorithms for reachability and transitive closure in DAGs that we considered in Chapter 19.

Since there are no cycles at all, there are no negative cycles; so negative weights present no difficulty in shortest-paths problems on DAGs. Accordingly, we place no restrictions on edge-weight values throughout this section.

Next, a note about terminology: We might choose to refer to directed graphs with weights on the edges and no cycles either as weighted DAGs or as acyclic networks. We use both terms interchangeably to emphasize their equivalence and to avoid confusion when we refer to the literature, where both are widely used. It is sometimes convenient to use the former to emphasize differences from unweighted DAGs that are implied by weights and the latter to emphasize differences from general networks that are implied by acyclicity.

The four basic ideas that we applied to derive efficient algorithms for unweighted DAGs in Chapter 19 are even more effective for weighted DAGs.

• Use DFS to solve the single-source problem.
• Use a source queue to solve the single-source problem.
• Invoke either method, once for each vertex, to solve the all-pairs problem.
• Use a single DFS (with dynamic programming) to solve the all-pairs problem.

These methods solve the single-source problem in time proportional to E and the all-pairs problem in time proportional to VE. They are all effective because of topological ordering, which allows us to compute shortest paths for each vertex without having to revisit any decisions. We consider one implementation for each problem in this section; we leave the others for exercises (see Exercises 21.62 through 21.65).

We begin with a slight twist. Every DAG has at least one source but could have several, so it is natural to consider the following shortest-paths problem.

Multisource shortest paths Given a set of start vertices, find, for each other vertex w, a shortest path among the shortest paths from each start vertex to w.

This problem is essentially equivalent to the single-source shortest-paths problem. We can convert a multisource problem into a single-source problem by adding a dummy source vertex with zero-length edges to each source in the network. Conversely, we can convert a single-source problem to a multisource problem by working with the induced subnetwork defined by all the vertices and edges reachable from the source. We rarely construct such subnetworks explicitly, because our algorithms automatically process them if we treat the start vertex as though it were the only source in the network (even when it is not).

Topological sorting immediately presents a solution to the multisource shortest-paths problem and to numerous other problems. We maintain a vertex-indexed vector wt that gives the weight of the shortest known path from any source to each vertex. To solve the multisource shortest-paths problem, we initialize the wt vector to 0 for sources and a large sentinel value for all the other vertices. Then, we process the vertices in topological order. To process a vertex v, we perform a relaxation operation for each outgoing edge v-w that updates the shortest path to w if v-w gives a shorter path from a source to w (through v). This process checks all paths from any source to each vertex in the graph; the relaxation operation keeps track of the minimum-length such path, and the topological sort ensures that we process the vertices in an appropriate order.

We can implement this method directly in one of two ways. The first is to add a few lines of code to the topological sort code in Program 19.8: Just after we remove a vertex v from the source queue, we perform the indicated relaxation operation for each of its edges (see Exercise 21.56). The second is to put the vertices in topological order, then to scan through them and to perform the relaxation operations precisely as described in the previous paragraph.

These same processes (with other relaxation operations) can solve many graph-processing problems.
For example, Program 21.6 is an implementation of the second approach (sort, then scan) for solving the multisource longest-paths problem: For each vertex in the network, what is a longest path from some source to that vertex? We interpret the wt entry associated with each vertex to be the length of the longest known path from any source to that vertex, initialize all of the weights to 0, and change the sense of the comparison in the relaxation operation. Figure 21.15 traces the operation of Program 21.6 on a sample acyclic network.

Property 21.9 We can solve the multisource shortest-paths problem and the multisource longest-paths problem in acyclic networks in linear time.

Proof: The same proof holds for longest path, shortest path, and many other path properties. To match Program 21.6, we state the proof for longest paths. We show by induction on the loop variable i that, for all vertices v = ts[j] with j < i that have been processed, wt[v] is the length of the longest path from a source to v. When v = ts[i], let t be the vertex preceding v on any path from a source to v. Since vertices in the ts vector are in topologically sorted order, t must have been processed already. By the induction hypothesis, wt[t] is the length of the longest path to t, and the relaxation step in the code checks whether that path gives a longer path to v through t. The induction hypothesis also implies that all paths to v are checked in this way as v is processed.

This property is significant because it tells us that processing acyclic networks is considerably easier than processing networks that have cycles. For shortest paths, the method is faster than Dijkstra’s algorithm by a factor proportional to the cost of the priority-queue operations in Dijkstra’s algorithm. For longest paths, we have a linear algorithm for acyclic networks but an intractable problem for general networks. Moreover, negative weights present no special difficulty here, but they present formidable barriers for algorithms on general networks, as discussed in Section 21.7.

The method just described depends on only the fact that we process the vertices in topological order. Therefore, any topological-sorting algorithm can be adapted to solve shortest- and longest-paths problems and other problems of this type (see, for example, Exercises 21.56 and 21.62).

Figure 21.15 Computing longest paths in an acyclic network

In this network, each edge has the weight associated with the vertex that it leads from, listed at the top left. Sinks have edges to a dummy vertex 10, which is not shown in the drawings. The wt array contains the length of the longest known path to each vertex from some source, and the st array contains the previous vertex on the longest path. This figure illustrates the operation of Program 21.6, which picks from among the sources (the shaded nodes in each diagram) using the FIFO discipline, though any of the sources could be chosen at each step. We begin by removing 0 and checking each of its incident edges, discovering one-edge paths of length .41 to 1, 7, and 9. Next, we remove 5 and record the one-edge path from 5 to 10 (left, second from top). Next, we remove 9 and record the paths 0-9-4 and 0-9-6, of length .70 (left, third from top). We continue in this way, changing the arrays whenever we find longer paths. For example, when we remove 7 (left, second from bottom) we record paths of length .73 to 8 and 3; then, later, when we remove 6, we record longer paths (of length .91) to 8 and 3 (right, top). The point of the computation is to find the longest path to the dummy node 10. In this case, the result is the path 0-9-6-8-2, of length 1.73.

Program 21.6 Longest paths in an acyclic network

To find the longest paths in an acyclic network, we consider the vertices in topological order, keeping the weight of the longest known path to each vertex in a vertex-indexed vector wt by doing a relaxation step for each edge. The vector lpt defines a spanning forest of longest paths (rooted at the sources) so that path(v) returns the last edge on the longest path to v.

Figure 21.16 Shortest paths in an acyclic network

This diagram depicts the computation of the all-shortest-distances matrix (bottom right) for a sample weighted DAG (top left), computing each row as the last action in a recursive DFS function. Each row is computed from the rows for adjacent vertices, which appear earlier in the list, because the rows are computed in reverse topological order (postorder traversal of the DFS tree shown at the bottom left). The array on the top right shows the rows of the matrix in the order that they are computed. For example, to compute each entry in the row for 0 we add .41 to the corresponding entry in the row for 1 (to get the distance to it from 0 after taking 0-1), then add .45 to the corresponding entry in the row for 3 (to get the distance to it from 0 after taking 0-3), and take the smaller of the two. The computation is essentially the same as computing the transitive closure of a DAG (see, for example, Figure 19.23). The most significant difference between the two is that the transitive closure algorithm could ignore down edges (such as 1-2 in this example) because they go to vertices known to be reachable, while the shortest-paths algorithm has to check whether paths associated with down edges are shorter than known paths. If we were to ignore 1-2 in this example, we would miss the shortest paths 0-1-2 and 1-2.
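The code for Program 21.6 is not reproduced in this section. The sort-then-scan method that the text describes might be sketched as follows; this is an illustration under stated assumptions, not the book’s program. The names DagArc, Dag, topologicalOrder, PathForest, and multisourcePaths are hypothetical, the input is assumed to be acyclic, and the topological order comes from a source-queue sort written here for self-containment. One deliberate difference: non-source vertices start at minus or plus infinity instead of 0, so that the same function handles longest and shortest paths even with negative weights; for nonnegative weights the longest-path lengths agree with the all-zeros initialization described in the text.

```cpp
#include <limits>
#include <queue>
#include <vector>

// Hypothetical types for this sketch (not the book's Graph/Edge classes).
struct DagArc { int to; double wt; };
using Dag = std::vector<std::vector<DagArc>>;   // adjacency lists of a DAG

// Source-queue topological sort; assumes G has no cycles.
std::vector<int> topologicalOrder(const Dag& G) {
    std::vector<int> indeg(G.size(), 0), ts;
    for (const auto& adj : G)
        for (const DagArc& a : adj) indeg[a.to]++;
    std::queue<int> sources;
    for (int v = 0; v < static_cast<int>(G.size()); v++)
        if (indeg[v] == 0) sources.push(v);
    while (!sources.empty()) {
        int v = sources.front();
        sources.pop();
        ts.push_back(v);
        for (const DagArc& a : G[v])
            if (--indeg[a.to] == 0) sources.push(a.to);
    }
    return ts;
}

struct PathForest {
    std::vector<double> wt;   // wt[v]: best known path length from a source
    std::vector<int> pred;    // pred[v]: previous vertex on that path, or -1
};

// Multisource longest (longest == true) or shortest paths in a DAG.
PathForest multisourcePaths(const Dag& G, bool longest) {
    const double inf = std::numeric_limits<double>::infinity();
    const int V = static_cast<int>(G.size());
    std::vector<int> indeg(V, 0);
    for (const auto& adj : G)
        for (const DagArc& a : adj) indeg[a.to]++;
    PathForest F{std::vector<double>(V, longest ? -inf : inf),
                 std::vector<int>(V, -1)};
    for (int v = 0; v < V; v++)
        if (indeg[v] == 0) F.wt[v] = 0.0;       // sources start at 0
    for (int v : topologicalOrder(G))           // scan in topological order
        for (const DagArc& a : G[v]) {
            double cand = F.wt[v] + a.wt;       // path to a.to through v
            bool better = longest ? cand > F.wt[a.to] : cand < F.wt[a.to];
            if (better) { F.wt[a.to] = cand; F.pred[a.to] = v; }
        }
    return F;
}
```

Only the sense of the comparison in the relaxation step differs between the two problems, which mirrors the remark in the text about changing the comparison. Each edge is relaxed once, so the running time is linear.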

As we know from Chapter 19, the DAG abstraction is a general one that arises in many applications. For example, we see an application in Section 21.6 that seems unrelated to networks but that can be addressed directly with Program 21.6.

Next, we turn to the all-pairs shortest-paths problem for acyclic networks. As in Section 19.3, one method that we could use to solve this problem is to run a single-source algorithm for each vertex (see Exercise 21.65). The equally effective approach that we consider here is to use a single DFS with dynamic programming, just as we did for computing the transitive closure of DAGs in Section 19.5 (see Program 19.9). If we consider the vertices at the end of the recursive function, we are processing them in reverse topological order and can derive the shortest-path vector for each vertex from the shortest-path vectors for each adjacent vertex, simply by using each edge in a relaxation step.

Program 21.7 is an implementation along these lines. The operation of this program on a sample weighted DAG is illustrated in Figure 21.16. Beyond the generalization to include relaxation, there is one important difference between this computation and the transitive closure computation for DAGs: In Program 19.9, we had the choice of ignoring down edges in the DFS tree because they provide no new information about reachability; in Program 21.7, however, we need to consider all edges, because any edge might lead to a shorter path.

Program 21.7 All shortest paths in an acyclic network

This implementation of the interface in Program 21.2 for weighted DAGs is derived by adding appropriate relaxation operations to the dynamic-programming–based transitive closure function in Program 19.9.

Property 21.10 We can solve the all-pairs shortest-paths problem in acyclic networks with a single DFS in time proportional to VE.

Proof: This fact follows immediately from the strategy of solving the single-source problem for each vertex (see Exercise 21.65). We can also establish it by induction, from Program 21.7. After the recursive calls for a vertex v, we know that we have computed all shortest paths for each vertex on v’s adjacency list, so we can find shortest paths from v to each vertex by checking each of v’s edges. We do V relaxation steps for each edge, for a total of VE relaxation steps.

Thus, for acyclic networks, topological sorting allows us to avoid the cost of the priority queue in Dijkstra’s algorithm. Like Floyd’s algorithm, Program 21.7 also solves problems more general than those solved by Dijkstra’s algorithm, because, unlike Dijkstra’s (see Section 21.7), this algorithm works correctly even in the presence of negative edge weights. If we run the algorithm after negating all the weights in an acyclic network, it finds all longest paths, as depicted in Figure 21.17. Or, we can find longest paths by reversing the inequality test in the relaxation algorithm, as in Program 21.6.

Figure 21.17 All longest paths in an acyclic network

Our method for computing all shortest paths in acyclic networks works even if the weights are negative. Therefore, we can use it to compute longest paths, simply by first negating all the weights, as illustrated here for the network in Figure 21.16. The longest simple path in this network is 0-1-5-4-2-3, of weight 1.73.

The other algorithms for finding shortest paths in acyclic networks that are mentioned at the beginning of this section generalize the methods from Chapter 19 in a manner similar to the other algorithms that we have examined in this chapter. Developing implementations of them is a worthwhile way to cement your understanding of both DAGs and shortest paths (see Exercises 21.62 through 21.65). All the methods run in time proportional to VE in the worst case, with actual costs dependent on the structure of the DAG. In principle, we might do even better for certain sparse weighted DAGs (see Exercise 19.117).
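The code for Program 21.7 is likewise absent from this section. The single-DFS, dynamic-programming method described for it might look roughly like the sketch below, which is an assumption-laden illustration and not the book’s program. The names DArc, WDag, DagAllSP, kNoDagPath, and next are hypothetical; next returns the index of the next vertex on the path (or -1) where the book’s path function returns an edge pointer. The input must be acyclic; the sketch does not check for cycles.

```cpp
#include <limits>
#include <vector>

// Hypothetical types for this sketch (not the book's Graph/Edge classes).
struct DArc { int to; double wt; };
using WDag = std::vector<std::vector<DArc>>;    // adjacency lists of a DAG

const double kNoDagPath = std::numeric_limits<double>::infinity();

class DagAllSP {
    std::vector<std::vector<double>> d;  // d[s][t]: shortest s-t length
    std::vector<std::vector<int>> p;     // p[s][t]: vertex after s, or -1
    std::vector<bool> done;              // done[v]: row v has been started

    // Row s is completed after the rows of all vertices adjacent to s,
    // so rows are finished in reverse topological order.
    void dfsR(const WDag& G, int s) {
        const int V = static_cast<int>(G.size());
        done[s] = true;
        d[s][s] = 0.0;
        for (const DArc& e : G[s]) {
            int t = e.to;
            if (!done[t]) dfsR(G, t);
            // Relax every entry of row s through edge s-t (down edges too).
            for (int i = 0; i < V; i++)
                if (e.wt + d[t][i] < d[s][i]) {
                    d[s][i] = e.wt + d[t][i];
                    p[s][i] = t;
                }
        }
    }
public:
    explicit DagAllSP(const WDag& G)
        : d(G.size(), std::vector<double>(G.size(), kNoDagPath)),
          p(G.size(), std::vector<int>(G.size(), -1)),
          done(G.size(), false) {
        for (int v = 0; v < static_cast<int>(G.size()); v++)
            if (!done[v]) dfsR(G, v);
    }
    double dist(int s, int t) const { return d[s][t]; }
    int next(int s, int t) const { return p[s][t]; }
};
```

Each edge triggers one pass over a row of V entries, which gives the VE relaxation steps counted in the proof of Property 21.10. Negative edge weights need no special handling, in line with the remark that this method works where Dijkstra’s algorithm does not.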

Exercises

• 21.54 Give the solutions to the multisource shortest- and longest-paths problems for the network defined in Exercise 21.1, with the directions of edges 2-3 and 1-0 reversed.

• 21.55 Modify Program 21.6 such that it solves the multisource shortest-paths problem for acyclic networks.

• 21.56 Implement a class with the same interface as Program 21.6 that is derived from the source-queue–based topological-sorting code of Program 19.8, performing the relaxation operations for each vertex just after that vertex is removed from the source queue.

• 21.57 Define an ADT for the relaxation operation, provide implementations, and modify Program 21.6 to use your ADT such that you can use Program 21.6 to solve the multisource shortest-paths problem, the multisource longest-paths problem, and other problems, just by changing the relaxation implementation.

21.58 Use your generic implementation from Exercise 21.57 to implement a class with member functions that return the length of the longest paths from any source to any other vertex in a DAG, the length of the shortest such path, and the number of vertices reachable via paths whose lengths fall within a given range.

• 21.59 Define properties of relaxation such that you can modify the proof of Property 21.9 to apply an abstract version of Program 21.6 (such as the one described in Exercise 21.57).

• 21.60 Show, in the style of Figure 21.16, the computation of the all-pairs shortest-paths matrices for the network defined in Exercise 21.54 by Program 21.7.

• 21.61 Give an upper bound on the number of edge weights accessed by Program 21.7, as a function of basic structural properties of the network. Write a program to compute this function, and use it to estimate the accuracy of the VE bound, for various acyclic networks (add weights as appropriate to the models in Chapter 19).

• 21.62 Write a DFS-based solution to the multisource shortest-paths problem for acyclic networks. Does your solution work correctly in the presence of negative edge weights? Explain your answer.

• 21.63 Extend your solution to Exercise 21.62 to provide an implementation of the all-pairs shortest-paths ADT interface for acyclic networks that builds the all-paths and all-distances matrices in time proportional to VE.

21.64 Show, in the style of Figure 21.9, the computation of all shortest paths of the network defined in Exercise 21.54 using the DFS-based method of Exercise 21.63.

• 21.65 Modify Program 21.6 such that it solves the single-source shortest-paths problem in acyclic networks, then use it to develop an implementation of the all-pairs shortest-paths ADT interface for acyclic networks that builds the all-paths and all-distances matrices in time proportional to VE.

• 21.66 Work Exercise 21.61 for the DFS-based (Exercise 21.63) and for the topological-sort–based (Exercise 21.65) implementations of the all-pairs shortest-paths ADT. What inferences can you draw about the comparative costs of the three methods?

21.67 Run empirical tests, in the style of Table 20.2, to compare the three class implementations for the all-pairs shortest-paths problem described in this section (see Program 21.7, Exercise 21.63, and Exercise 21.65), for various acyclic networks (add weights as appropriate to the models in Chapter 19).

21.5 Euclidean Networks

In applications where networks model maps, our primary interest is often in finding the best route from one place to another. In this section, we examine a strategy for this problem: a fast algorithm for the source–sink shortest-path problem in Euclidean networks, which are networks whose vertices are points in the plane and whose edge weights are defined by the geometric distances between the points.

These networks satisfy two important properties that do not necessarily hold for general edge weights. First, the distances satisfy the triangle inequality: The distance from s to d is never greater than the distance from s to x plus the distance from x to d. Second, vertex positions give a lower bound on path length: No path from s to d will be shorter than the distance from s to d. The algorithm for the source–sink shortest-paths problem that we examine in this section takes advantage of these two properties to improve performance.

Often, Euclidean networks are also symmetric: Edges run in both directions. As mentioned at the beginning of the chapter, such networks arise immediately if, for example, we interpret the adjacency-matrix or adjacency-lists representation of an undirected weighted Euclidean graph (see Section 20.7) as a weighted digraph (network). When we draw an undirected Euclidean network, we assume this interpretation to avoid proliferation of arrowheads in the drawings.

The basic idea is straightforward: Priority-first search provides us with a general mechanism to search for paths in graphs. With Dijkstra’s algorithm, we examine paths in order of their distance from the start vertex. This ordering ensures that, when we reach the sink, we have examined all paths in the graph that are shorter, none of which took us to the sink. But in a Euclidean graph, we have additional information: If we are looking for a path from a source s to a sink d and we encounter a third vertex v, then we know that not only do we have to take the path that we have found from s to v, but also the best that we could possibly do in traveling from v to d is first to take an edge v-w and then to find a path whose length is the straight-line distance from w to d (see Figure 21.18). With priority-first search, we can easily take into account this extra information to improve performance. We use the standard algorithm, but we use the sum of the following three quantities as the priority of each edge v-w: the length of the known path from s to v, the weight of the edge v-w, and the distance from w to d. If we always pick the edge for which this number is smallest, then, when we reach d, we are still assured that there is no shorter path in the graph from s to d. Furthermore, in typical networks we reach this conclusion after doing far less work than we would were we using Dijkstra’s algorithm.

Figure 21.18 Edge relaxation (Euclidean)

In a Euclidean graph, we can take distances to the destination into account in the relaxation operation as we compute shortest paths. In this example, we could deduce that the path depicted from s to v plus v-w cannot lead to a shorter path from s to d than the one already found because the length of any such path must be at least the length of the path from s to v plus the length of v-w plus the straight-line distance from w to d, which is greater than the length of the known path from s to d. Tests like this one can significantly reduce the number of paths that we have to consider.

To implement this approach, we use a standard PFS implementation of Dijkstra’s algorithm (Program 21.1, since Euclidean graphs are normally sparse, but also see Exercise 21.73) with two changes: First, instead of initializing wt[s] at the beginning of the search to 0.0, we set it to the quantity distance(s, d), where distance is a function that returns the distance between two vertices. Second, we define the priority P to be the function (wt[v] + e->wt() + distance(w, d) - distance(v, d)) instead of the function (wt[v] + e->wt()) that we used in Program 21.1 (recall that v and w are local variables that are set to the values e->v() and e->w(), respectively). These changes, to which we refer as the Euclidean heuristic, maintain the invariant that the quantity wt[v] - distance(v, d) is the length of the shortest path through the network from s to v, for every tree vertex v (and therefore wt[v] is a lower bound on the length of the shortest possible path through v from s to d). We compute wt[w] by adding to this quantity the edge weight (the distance to w) plus the distance from w to the sink d.
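The two changes just described can be sketched in a self-contained form. The code below is an illustration under stated assumptions, not a modification of Program 21.1: the types Point, Road, and EuclidNet and the functions straightLine and euclideanShortestPath are hypothetical names introduced here, every edge weight is assumed to be at least the straight-line distance between its endpoints, and std::priority_queue with lazy deletion stands in for the book’s PFS priority queue. The optional examined argument is an addition for experimentation; it reports how many vertices were added to the tree.

```cpp
#include <cmath>
#include <functional>
#include <limits>
#include <queue>
#include <utility>
#include <vector>

// Hypothetical types for this sketch (not the book's Graph/Edge classes).
struct Point { double x, y; };
struct Road { int to; double wt; };
struct EuclidNet {
    std::vector<Point> pos;                 // vertex coordinates
    std::vector<std::vector<Road>> adj;     // adjacency lists
};

// Straight-line distance between vertices a and b.
double straightLine(const EuclidNet& G, int a, int b) {
    return std::hypot(G.pos[a].x - G.pos[b].x, G.pos[a].y - G.pos[b].y);
}

// Length of a shortest path from s to d, or infinity if there is none.
double euclideanShortestPath(const EuclidNet& G, int s, int d,
                             int* examined = nullptr) {
    const double inf = std::numeric_limits<double>::infinity();
    const int V = static_cast<int>(G.adj.size());
    // wt[v] = (path length from s to v) + (straight-line distance v to d)
    std::vector<double> wt(V, inf);
    std::vector<bool> inTree(V, false);
    using Item = std::pair<double, int>;    // (priority, vertex)
    std::priority_queue<Item, std::vector<Item>, std::greater<Item>> pq;
    wt[s] = straightLine(G, s, d);          // first change: not 0.0
    pq.push({wt[s], s});
    int count = 0;
    while (!pq.empty()) {
        int v = pq.top().second;
        pq.pop();
        if (inTree[v]) continue;            // stale queue entry
        inTree[v] = true;
        count++;
        if (v == d) break;                  // source-sink: stop at the sink
        for (const Road& e : G.adj[v]) {
            int w = e.to;
            // Second change: priority includes the distance remaining to d.
            double P = wt[v] + e.wt + straightLine(G, w, d)
                             - straightLine(G, v, d);
            if (!inTree[w] && P < wt[w]) { wt[w] = P; pq.push({P, w}); }
        }
    }
    if (examined) *examined = count;
    // straightLine(G, d, d) is 0, so wt[d] is the path length itself.
    return inTree[d] ? wt[d] : inf;
}
```

On a small network where the sink lies in one direction and other vertices lie in the opposite direction, the search can reach the sink without adding the far-away vertices to the tree, which plain Dijkstra would add whenever they are closer to the source than the sink is.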

Figure 21.19 Shortest path in a Euclidean graph

When we direct the shortest-path search towards the destination vertex, we can restrict the search to vertices within a relatively small ellipse around the path, as illustrated in these three examples, which show SPT subtrees from the examples in Figure 21.12.

Property 21.11 Priority-first search with the Euclidean heuristic solves the source–sink shortest-paths problem in Euclidean graphs.

Proof: The proof of Property 21.2 applies: At the time that we add a vertex x to the tree, the addition of the distance from x to d to the priority does not affect the reasoning that the tree path from s to x is a shortest path in the graph from s to x, since the same quantity is added to the length of all paths to x. When d is added to the tree, we know that no other path from s to d is shorter than the tree path, because any such path must consist of a tree path followed by an edge to some vertex w that is not on the tree, followed by a path from w to d (whose length cannot be shorter than the distance from w to d); and, by construction, we know that the length of the path from s to w plus the distance from w to d is no smaller than the length of the tree path from s to d.

In Section 21.6, we discuss another simple way to implement the Euclidean heuristic. First, we make a pass through the graph to change the weight of each edge: For each edge v-w, we add the quantity distance(w, d) - distance(v, d). Then, we run a standard shortest-path algorithm, starting at s (with wt[s] initialized to distance(s, d)) and stopping when we reach d. This method is computationally equivalent to the method that we have described (which essentially computes the same weights on the fly) and is a specific example of a basic operation known as reweighting a network. Reweighting plays an essential role in solving the shortest-paths problems with negative weights; we discuss it in detail in Section 21.6.

The Euclidean heuristic affects the performance but not the correctness of Dijkstra’s algorithm for the source–sink shortest-paths computation. As discussed in the proof of Property 21.2, using the standard algorithm to solve the source–sink problem amounts to building an SPT that has all vertices closer to the start than the sink d. With the Euclidean heuristic, the SPT contains just the vertices whose path from s plus distance to d is smaller than the length of the shortest path from s to d. We expect this tree to be substantially smaller for many applications because the heuristic prunes a substantial number of long paths. The precise savings is dependent on the structure of the graph and the geometry of the vertices. Figure 21.19 shows the operation of the Euclidean heuristic on our sample graph, where the savings are substantial. We refer to the method as a heuristic because there is no guarantee that there will be any savings at all: It could always be the case that the only path from source to sink is a long one that wanders arbitrarily far from the source before heading back to the sink (see Exercise 21.80).
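The equivalence between the on-the-fly priority and the reweighted network can be checked with a short calculation. The notation here is introduced only for this sketch: w(u, v) is the original weight of edge u-v, w'(u, v) is its reweighted value, and dist is the straight-line distance. Along any path from s through v_1, ..., v_k, the added terms telescope, so every path from s to a given vertex is shifted by the same amount; and, assuming that each edge weight is at least the straight-line distance between its endpoints, the triangle inequality keeps the new weights nonnegative, as Dijkstra's algorithm requires.

```latex
% Reweighting used by the Euclidean heuristic (v_0 = s)
w'(u,v) = w(u,v) + \mathrm{dist}(v,d) - \mathrm{dist}(u,d)

% Length of the path s = v_0, v_1, \ldots, v_k in the reweighted network,
% including the initial value dist(s,d) assigned to wt[s]
\mathrm{dist}(s,d) + \sum_{i=1}^{k} w'(v_{i-1},v_i)
  = \sum_{i=1}^{k} w(v_{i-1},v_i) + \mathrm{dist}(v_k,d)

% Nonnegativity, from the triangle inequality and w(u,v) \ge \mathrm{dist}(u,v)
\mathrm{dist}(u,d) \le \mathrm{dist}(u,v) + \mathrm{dist}(v,d)
  \le w(u,v) + \mathrm{dist}(v,d)
  \quad\Longrightarrow\quad w'(u,v) \ge 0
```

Taking v_k = d in the second line gives exactly the original path length, since dist(d, d) = 0.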

Figure 21.20 Euclidean heuristic cost bounds

When we direct the shortest-path search towards the destination vertex, we can restrict the search to vertices within an ellipse around the path, as compared to the circle centered at s that is required by Dijkstra’s algorithm. The radius of the circle and the shape of the ellipse are determined by the length of the shortest path.

Figure 21.20 illustrates the basic underlying geometry that describes the intuition behind the Euclidean heuristic: If the shortest-path length from s to d is z, then vertices examined by the algorithm fall roughly within the ellipse defined as the locus of points x for which the distance from s to x plus the distance from x to d is equal to z. For typical Euclidean graphs, we expect the number of vertices in this ellipse to be far smaller than the number of vertices in the circle of radius z that is centered at the source (those that would be examined by Dijkstra’s algorithm).

Precise analysis of the savings is a difficult analytic problem and depends on models of both random point sets and random graphs (see reference section). For typical situations, we expect that, if the standard algorithm examines X vertices in computing a source–sink shortest path, the Euclidean heuristic will cut the cost to be proportional to √X, which leads to an expected running time proportional to V for dense graphs and proportional to √V for sparse graphs. This example illustrates that the difficulty of developing an appropriate model or analyzing associated algorithms should not dissuade us from taking advantage of the substantial savings that are available in many applications, particularly when the implementation (add a term to the priority) is trivial.

The proof of Property 21.11 applies for any function that gives a lower bound on the distance from each vertex to d. Might there be other functions that will cause the algorithm to examine even fewer vertices than the Euclidean heuristic? This question has been studied in a general setting that applies to a broad class of combinatorial search algorithms. Indeed, the Euclidean heuristic is a specific instance of an algorithm called A* (pronounced “ay-star”). This theory tells us that using the best available lower-bound function is optimal; stated another way, the better the bound function, the more efficient the search. In this case, the optimality of A* tells us that the Euclidean heuristic will certainly examine no more vertices than Dijkstra’s algorithm (which is A* with a lower bound of 0). The analytic results just described give more precise information for specific random network models.

We can also use properties of Euclidean networks to help build efficient implementations of the abstract shortest-paths ADT, trading time for space more effectively than we can for general networks (see Exercises 21.48 through 21.50). Such algorithms are important in applications such as map processing, where networks are huge and sparse. For example, suppose that we want to develop a navigation system based on shortest paths for a map with millions of roads. We perhaps can store the map itself in a small onboard computer, but the distances and paths matrices are much too large to be stored (see Exercises 21.39 and 21.40); therefore, the all-paths algorithms of Section 21.3 are not effective. Dijkstra’s algorithm also may not give sufficiently short response times for huge maps. Exercises 21.77 through 21.78 explore strategies whereby we can invest a reasonable amount of preprocessing and space to provide fast responses to source–sink shortest-paths queries.

Exercises

• 21.68 Find a large Euclidean graph online: perhaps a map with an underlying table of locations and distances between them, telephone connections with costs, or airline routes and rates.

21.69 Using the strategies described in Exercises 17.71 through 17.73, write programs that generate random Euclidean graphs by connecting vertices arranged in a √V-by-√V grid.

• 21.70 Show that the partial SPT computed by the Euclidean heuristic is independent of the value that we use to initialize wt[s]. Explain how to compute the shortest-path lengths from the initial value.

• 21.71 Show, in the style of Figure 21.10, what is the result when you use the Euclidean heuristic to compute a shortest path from 0 to 6 in the network defined in Exercise 21.1.

21.72 Describe what happens if the function distance(s, t), used for the Euclidean heuristic, returns the actual shortest-path length from s to t for all pairs of vertices.

21.73 Develop a class implementation for shortest paths in dense Euclidean graphs that is based upon a graph representation that supports the edge function and an implementation of Dijkstra’s algorithm (Program 20.6, with an appropriate priority function).

21.74 Run empirical studies to test the effectiveness of the Euclidean shortest-path heuristic, for various Euclidean networks (see Exercises 21.9, 21.68, 21.69, and 21.80). For each graph, generate V/10 random pairs of vertices, and print a table that shows the average distance between the vertices, the average length of the shortest path between the vertices, the average ratio of the number of vertices examined with the Euclidean heuristic to the number of vertices examined with Dijkstra’s algorithm, and the average ratio of the area of the ellipse associated with the Euclidean heuristic to the area of the circle associated with Dijkstra’s algorithm.

21.75 Develop a class implementation for the source–sink shortest-paths problem in Euclidean graphs that is based on the bidirectional search described in Exercise 21.35.

• 21.76 Use a geometric interpretation to provide an estimate of the ratio of the number of vertices in the SPT produced by Dijkstra’s algorithm for the source–sink problem to the number of vertices in the SPTs produced in the two-way version described in Exercise 21.75.

21.77 Develop a class implementation for shortest paths in Euclidean graphs that performs the following preprocessing step in the constructor: Divide the map region into a W-by-W grid, and then use Floyd’s all-pairs shortest-paths algorithm to compute a W²-by-W² matrix, where row i and column j contain the length of a shortest path connecting any vertex in grid square i to any vertex in grid square j. Then, use these shortest-path lengths as lower bounds to improve the Euclidean heuristic. Experiment with a few different values of W such that you expect a small constant number of vertices per grid square.

21.78 Develop an implementation of the all-pairs shortest-paths ADT for Euclidean graphs that combines the ideas in Exercises 21.75 and 21.77.

21.79 Run empirical studies to compare the effectiveness of the heuristics described in Exercises 21.75 through 21.78, for various Euclidean networks (see Exercises 21.9, 21.68, 21.69, and 21.80).

21.80 Expand your empirical studies to include Euclidean graphs that are derived by removal of all vertices and edges from a circle of radius r in the center, for r = 0.1, 0.2, 0.3, and 0.4. (These graphs provide a severe test of the Euclidean heuristic.)

21.81 Give a direct implementation of Floyd’s algorithm for an implementation of the network ADT for implicit Euclidean graphs defined by N points in the plane with edges that connect points within d of each other. Do not explicitly represent the graph; rather, given two vertices, compute their distance to determine whether an edge exists and, if one does, what its length is.

21.82 Develop an implementation for the scenario described in Exercise 21.81 that builds a neighbor graph and then uses Dijkstra’s algorithm from each vertex (see Program 21.1).

21.83 Run empirical studies to compare the time and space needed by the algorithms in Exercises 21.81 and 21.82, for d = 0.1, 0.2, 0.3, and 0.4.

• 21.84 Write a client program that does dynamic graphical animations of the Euclidean heuristic. Your program should produce images like Figure 21.19 (see Exercise 21.38). Test your program on various Euclidean networks (see Exercises 21.9, 21.68, 21.69, and 21.80).

21.6 Reduction

It turns out that shortest-paths problems, particularly the general case where negative weights are allowed (the topic of Section 21.7), represent a general mathematical model that we can use to solve a variety of other problems that seem unrelated to graph processing. This model is the first among several such general models that we encounter. As we move to more difficult problems and increasingly general models, one of the challenges that we face is to characterize precisely relationships among various problems. Given a new problem, we ask whether we can solve it easily by transforming it to a problem that we know how to solve. If we place restrictions on the problem, will we be able to solve it more easily? To help answer such questions, we digress briefly in this section to discuss the technical language that we use to describe these types of relationships among problems.

Definition 21.3 We say that a problem A reduces to another problem B if we can use an algorithm that solves B to develop an algorithm that solves A, in a total amount of time that is, in the worst case, no more than a constant times the worst-case running time of the algorithm that solves B. We say that two problems are equivalent if they reduce to each other.

We postpone until Part 8 a rigorous definition of what it means to “use” one algorithm to “develop” another. For most applications, we are content with the following simple approach. We show that A reduces to B by demonstrating that we can solve any instance of A in three steps:

• Transform it to an instance of B.
• Solve that instance of B.
• Transform the solution of B to be a solution of A.

As long as we can perform the transformations (and solve B) efficiently, we can solve A efficiently. To illustrate this proof technique, we consider two examples.

Property 21.12 The transitive closure problem reduces to the all-pairs shortest-paths problem with nonnegative weights.

Figure 21.21 Transitive closure reduction

Given a digraph (left), we can transform its adjacency matrix (with self-loops) into an adjacency matrix representing a network by assigning an arbitrary weight to each edge (left matrix). As usual, blank entries in the matrix represent a sentinel value that indicates the absence of an edge. Given the all-pairs shortest-paths-lengths matrix of that network (center matrix), the transitive closure of the digraph (right matrix) is simply the matrix formed by substituting 0 for each sentinel and 1 for all other entries.
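The transformation depicted in Figure 21.21 can be sketched in a few lines. This is an illustration under stated assumptions, not the book's library code: the Matrix and BoolMatrix aliases and the function names are hypothetical, infinity serves as the sentinel, and a plain implementation of Floyd's algorithm stands in for whatever all-pairs shortest-paths library function is available. The three steps of the reduction appear in order inside transitiveClosure.

```cpp
#include <cstddef>
#include <limits>
#include <vector>

using Matrix = std::vector<std::vector<double>>;
using BoolMatrix = std::vector<std::vector<bool>>;

// Sentinel weight that marks the absence of an edge.
const double SENTINEL = std::numeric_limits<double>::infinity();

// Stand-in for a library routine that solves problem B
// (all-pairs shortest paths): Floyd's algorithm on a weight matrix.
Matrix allPairsShortestPaths(Matrix d) {
    const std::size_t V = d.size();
    for (std::size_t k = 0; k < V; ++k)
        for (std::size_t i = 0; i < V; ++i)
            for (std::size_t j = 0; j < V; ++j)
                if (d[i][k] + d[k][j] < d[i][j])
                    d[i][j] = d[i][k] + d[k][j];
    return d;
}

// Problem A (transitive closure), solved by reduction to problem B.
BoolMatrix transitiveClosure(const BoolMatrix& adj) {
    const std::size_t V = adj.size();
    // Step 1: transform the digraph into a network, adding self-loops and
    // giving every edge the same arbitrary weight.
    Matrix w(V, std::vector<double>(V, SENTINEL));
    for (std::size_t i = 0; i < V; ++i)
        for (std::size_t j = 0; j < V; ++j)
            if (i == j || adj[i][j]) w[i][j] = 0.1;
    // Step 2: solve the all-pairs shortest-paths instance.
    const Matrix d = allPairsShortestPaths(w);
    // Step 3: transform the solution back: an entry that is still the
    // sentinel means "no path"; every other entry means "path exists".
    BoolMatrix tc(V, std::vector<bool>(V, false));
    for (std::size_t i = 0; i < V; ++i)
        for (std::size_t j = 0; j < V; ++j)
            tc[i][j] = (d[i][j] != SENTINEL);
    return tc;
}
```

Both transformations take time proportional to V², which does not dominate the cost of solving the all-pairs problem, so the sketch is consistent with Definition 21.3.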

Proof: We have already pointed out the direct relationship between Warshall’s algorithm and Floyd’s algorithm. Another way to consider that relationship, in the present context, is to imagine that we need to compute the transitive closure of digraphs using a library function that computes all shortest paths in networks. To do so, we add self-loops if they are not present in the digraph; then, we build a network directly from the adjacency matrix of the digraph, with an arbitrary weight (say 0.1) corresponding to each 1 and the sentinel weight corresponding to each 0. Then, we call the all-pairs shortest-paths function. Next, we can easily compute the transitive closure from the all-pairs shortest-paths matrix that the function computes: Given any two vertices u and v, there is a path from u to v in the digraph if and only if the length of the path from u to v in the network is nonzero (see Figure 21.21).

This property is a formal statement that the transitive closure problem is no more difficult than the all-pairs shortest-paths problem. Since we happen to know algorithms for transitive closure that are even faster than the algorithms that we know for all-pairs shortest-paths problems, this information is no surprise. Reduction is more interesting when we use it to establish a relationship between problems that we do not know how to solve, or between such problems and other problems that we can solve.

Property 21.13 In networks with no constraints on edge weights, the longest-path and shortest-path problems (single-source or all-pairs) are equivalent.

Proof: Given a shortest-path problem, negate all the weights. A longest path (a path with the highest weight) in the modified network is a shortest path in the original network. An identical argument shows that the longest-path problem reduces to the shortest-path problem.

This proof is trivial, but this property also illustrates that care is justified in stating and proving reductions, because it is easy to take reductions for granted and thus to be misled. For example, it is decidedly not true that the longest-path and shortest-path problems are equivalent in networks with nonnegative weights.

At the beginning of this chapter, we outlined an argument that shows that the problem of finding shortest paths in undirected weighted graphs reduces to the problem of finding shortest paths in networks, so we can use our algorithms for networks to solve shortest-paths problems in undirected weighted graphs. Two further points about this reduction are worth contemplating in the present context. First, the converse does not hold: Knowing how to solve shortest-paths problems in undirected weighted graphs does not help us to solve them in networks.
Second, we saw a flaw in the argument: If edge weights could be negative, the reduction gives networks with negative cycles, and we do not know how to find shortest paths in such networks. Even though the reduction fails, it turns out to be still possible to find shortest paths in undirected weighted graphs with no negative cycles with an unexpectedly complicated algorithm (see reference section). Since this problem does not reduce to the directed version, this algorithm does not help us to solve the shortest-path problem in general networks.

The concept of reduction essentially describes the process of using one ADT to implement another, as is done routinely by modern systems programmers. If two problems are equivalent, we know that if we can solve either of them efficiently, we can solve the other efficiently. We often find simple one-to-one correspondences, such as the one in Property 21.13, that show two problems to be equivalent. In this case, we have not yet discussed how to solve either problem, but it is useful to know that if we could find an efficient solution to one of them, we could use that solution to solve the other one. We saw another example in Chapter 17: When faced with the problem of determining whether or not a graph has an odd cycle, we noted that the problem is equivalent to determining whether or not the graph is two-colorable.

Reduction has two primary applications in the design and analysis of algorithms. First, it helps us to classify problems according to their difficulty at an appropriate abstract level without necessarily developing and analyzing full implementations. Second, we often do reductions to establish lower bounds on the difficulty of solving various problems, to help indicate when to stop looking for better algorithms. We have seen examples of these uses in Sections 19.3 and 20.7; we see others later in this section.

Beyond these direct practical uses, the concept of reduction also has widespread and profound implications for the theory of computation; these implications are important for us to understand as we tackle increasingly difficult problems. We discuss this topic briefly at the end of this section and consider it in full formal detail in Part 8.

The constraint that the cost of the transformations should not dominate is a natural one and often applies. In many cases, however, we might choose to use reduction even when the cost of the transformations does dominate. One of the most important uses of reduction is to provide efficient solutions to problems that might otherwise seem intractable by performing a transformation to a well-understood problem that we know how to solve efficiently. Reducing A to B, even if computing the transformations is much more expensive than is solving B, may give us a much more efficient algorithm for solving A than we could otherwise devise. There are many other possibilities. Perhaps we are interested in expected cost rather than the worst case.
Perhaps we need to solve two problems B and C to solve A. Perhaps we need to solve multiple instances of B. We leave further discussion of such variations until Part 8, because all the examples that we consider before then are of the simple type just discussed.

In the particular case where we solve a problem A by simplifying another problem B, we know that A reduces to B, but not necessarily vice versa. For example, selection reduces to sorting because we can find the kth smallest element in a file by sorting the file and then indexing (or scanning) to the kth position, but this fact certainly does not imply that sorting reduces to selection.

In the present context, the shortest-paths problem for weighted DAGs and the shortest-paths problem for networks with positive weights both reduce to the general shortest-paths problem. This use of reduction corresponds to the intuitive notion of one problem being more general than another. Any sorting algorithm solves any selection problem, and, if we can solve the shortest-paths problem in general networks, we certainly can use that solution for networks with various restrictions; but the converse is not necessarily true.
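The selection example is small enough to write out in full. In this sketch the function name kthSmallest is hypothetical, std::sort stands in for "any sorting algorithm," and k is taken to be 0-based (so k = 0 asks for the smallest element), which is an assumption of the sketch and not a convention from the text.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Selection (problem A) solved by reduction to sorting (problem B).
// k is 0-based and must be less than a.size().
int kthSmallest(std::vector<int> a, std::size_t k) {
    // Steps 1 and 2: the selection instance is already a sorting instance
    // (the same sequence of keys), so the transformation is trivial;
    // solve it with any sorting algorithm.
    std::sort(a.begin(), a.end());
    // Step 3: transform the sorted sequence into the answer by indexing.
    return a[k];
}
```

For example, kthSmallest({5, 1, 4, 2, 3}, 2) returns 3. Nothing in this construction runs in the other direction: a routine that returns only the kth smallest element does not, by itself, put a file in order.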

Figure 21.22 Job scheduling

In this network, vertices represent jobs to be completed (with weights indicating the amount of time required) and edges represent precedence relationships between them. For example, the edges from 7 to 8 and 3 mean that job 7 must be finished before job 8 or job 3 can be started. What is the minimum amount of time required to complete all the jobs?

This use of reduction is helpful, but the concept becomes more useful when we use it to gain information about the relationships between problems in different domains. For example, consider the following problems, which seem at first blush to be far removed from graph processing. Through reduction, we can develop specific relationships between these problems and the shortest-paths problem.

Job scheduling A large set of jobs, of varying durations, needs to be performed. We can be working on any number of jobs at a given time, but a set of precedence relationships specifies, for a set of pairs of jobs, that the first must be completed before the second can be started. What is the minimum amount of time required to complete all the jobs while satisfying all the precedence constraints? Specifically, given a set of jobs (with durations) and a set of precedence constraints, schedule the jobs (find a start time for each) so as to achieve this minimum.

Figure 21.22 depicts an example instance of the job-scheduling problem. It uses a natural network representation, which we use in a moment as the basis for a reduction. This version of the problem is perhaps the simplest of literally hundreds of versions that have been studied: versions that involve other job characteristics and other constraints, such as the assignment of personnel or other resources to the jobs, other costs associated with specific jobs, deadlines, and so forth. In this context, the version that we have described is commonly called precedence-constrained scheduling with unlimited parallelism; we use the term job scheduling as shorthand.

To help us to develop an algorithm that solves the job-scheduling problem, we consider the following problem, which is widely applicable in its own right.

Difference constraints Assign nonnegative values to a set of variables x_0 through x_n that minimize the value of x_n while satisfying a set of difference constraints on the variables, each of which specifies that the difference between two of the variables must be greater than or equal to a given constant.

Figure 21.23 depicts an example instance of this problem. It is a purely abstract mathematical formulation that can serve as the basis for solving numerous practical problems (see reference section).

The difference-constraint problem is a special case of a much more general problem where we allow general linear combinations of the variables in the equations.

Figure 21.23 Difference constraints

Finding an assignment of nonnegative values to the variables that minimizes the value of x_10 subject to this set of inequalities is equivalent to the job-scheduling problem instance illustrated in Figure 21.22. For example, the equation x_8 ≥ x_7 + .32 means that job 8 cannot start until job 7 is completed.

Linear programming Assign nonnegative values to a set of variables x_0 through x_n that minimize the value of a specified linear combination of the variables, subject to a set of constraints on the variables, each of which specifies that a given linear combination of the variables must be greater than or equal to a given constant.

Linear programming is a widely used general approach to solving a broad class of optimization problems that we will not consider in detail until Part 8. Clearly, the difference-constraints problem reduces to linear programming, as do many other problems. For the moment, our interest is in the relationships among the difference-constraints, job-scheduling, and shortest-paths problems.

Property 21.14 The job-scheduling problem reduces to the difference-constraints problem.

Proof: Add a dummy job and a precedence constraint for each job saying that the job must finish before the dummy job starts. Given a job-scheduling problem, define a system of difference equations where each job i corresponds to a variable x_i, and the constraint that j cannot start until i finishes corresponds to the equation x_j ≥ x_i + c_i, where c_i is the length of job i. The solution to the difference-constraints problem gives precisely a solution to the job-scheduling problem, with the value of each variable specifying the start time of the corresponding job.

Figure 21.23 illustrates the system of difference equations created by this reduction for the job-scheduling problem in Figure 21.22. The practical significance of this reduction is that we can use any algorithm that solves difference-constraints problems to solve job-scheduling problems.

It is instructive to consider whether we can use this construction in the opposite way: Given a job-scheduling algorithm, can we use it to solve difference-constraints problems? The answer to this question is that the correspondence in the proof of Property 21.14 does not help us to show that the difference-constraints problem reduces to the job-scheduling problem, because the systems of difference equations that we get from job-scheduling problems have a property that does not necessarily hold in every difference-constraints problem. Specifically, if two equations have the same second variable, then they have the same constant. Therefore, an algorithm for job scheduling does not immediately give a direct way to solve a system of difference equations that contains two equations x_i − x_j ≥ a and x_k − x_j ≥ b, where a ≠ b. When proving reductions, we need to be aware of situations like this: A proof that A reduces to B must show that we can use an algorithm for solving B to solve any instance of A.

By construction, the constants in the difference-constraints problems produced by the construction in the proof of Property 21.14 are always nonnegative. This fact turns out to be significant.

Property 21.15 The difference-constraints problem with positive constants is equivalent to the single-source longest-paths problem in an acyclic network.

Proof: Given a system of difference equations, build a network where each variable x_i corresponds to a vertex i and each equation x_j − x_i ≥ c corresponds to an edge i-j of weight c. For example, assigning to each edge in the digraph of Figure 21.22 the weight of its source vertex gives the network corresponding to the set of difference equations in Figure 21.23. Add a dummy vertex to the network, with a zero-weight edge to every other vertex. If the network has a cycle, the system of difference equations has no solution (because the positive weights imply that the values of the variables corresponding to each vertex strictly increase as we move along a path, and, therefore, a cycle would imply that some variable is greater than itself), so report that fact. Otherwise, the network has no cycle, so solve the single-source longest-paths problem from the dummy vertex. There exists a longest path for every vertex because the network is acyclic (see Section 21.4). Assign to each variable the length of the longest path to the corresponding vertex in the network from the dummy vertex. For each variable, this path is evidence that its value satisfies the constraints and that no smaller value does so.

Unlike the proof of Property 21.14, this proof does extend to show that the two problems are equivalent because the construction works in both directions. We have no constraint that two equations with the same second variable in the equation must have the same constants, and no constraint that edges leaving any given vertex in the network must have the same weight. Given any acyclic network with positive weights, the same correspondence gives a system of difference constraints with positive constants whose solution directly yields a

Program 21.8 Job scheduling

This implementation reads a list of jobs with lengths followed by a list of precedence constraints from standard input, then prints on standard output a list of job starting times that satisfy the constraints. It solves the job-scheduling problem by reducing it to the longest-paths problem for acyclic networks, using Properties 21.14 and 21.15 and Program 21.6.

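#include <cstdlib>
#include <iostream>
#include <vector>
#include "GRAPHbasic.cc"
#include "GRAPHio.cc"
#include "LPTdag.cc"
using namespace std;

typedef WeightedEdge EDGE;
typedef DenseGRAPH<EDGE> GRAPH;

int main(int argc, char *argv[])
  { // N, the number of jobs, is the first command-line argument
    int s, t, N = atoi(argv[1]);
    vector<double> duration(N);
    GRAPH G(N, true);
    // read the length of each job
    for (int i = 0; i < N; i++)
      cin >> duration[i];
    // each constraint "s before t" becomes an edge s-t
    // whose weight is the length of job s
    while (cin >> s >> t)
      G.insert(new EDGE(s, t, duration[s]));
    // the longest-path lengths are the job start times
    LPTdag<GRAPH, EDGE> lpt(G);
    for (int i = 0; i < N; i++)
      cout << i << " " << lpt.dist(i) << endl;
  }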

solution to the single-source longest-paths problem in the network. Details of this proof are left as an exercise (see Exercise 21.90).

The network in Figure 21.22 depicts this correspondence for our sample problem, and Figure 21.15 shows the computation of the longest paths in the network, using Program 21.6 (the dummy start vertex is implicit in the implementation). The schedule that is computed in this way is shown in Figure 21.24.
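For readers without the book's graph classes at hand, the same chain of reductions can be sketched with the standard library alone. The code below is not Program 21.6 or Program 21.8: the function name schedule and its parameter layout are hypothetical, it computes longest paths by processing vertices in a topological order (maintained with a queue of jobs whose predecessors are all finished), and the dummy start vertex is implicit in initializing every start time to 0. As an addition that the programs in the text leave to an exercise, it returns an empty vector when the precedence constraints contain a cycle.

```cpp
#include <algorithm>
#include <cstddef>
#include <queue>
#include <utility>
#include <vector>

// Earliest start time for each job, given job durations and a list of
// precedence constraints (first job must finish before second starts).
// Returns an empty vector if the constraints contain a cycle.
std::vector<double> schedule(const std::vector<double>& duration,
                             const std::vector<std::pair<int, int>>& before) {
    const int N = static_cast<int>(duration.size());
    std::vector<std::vector<int>> succ(N);   // edge s-t for "s before t"
    std::vector<int> indegree(N, 0);
    for (const auto& c : before) {
        succ[c.first].push_back(c.second);
        ++indegree[c.second];
    }
    // Longest path from the implicit dummy vertex: start at 0 everywhere.
    std::vector<double> start(N, 0.0);
    std::queue<int> ready;                   // jobs with no pending predecessor
    for (int v = 0; v < N; ++v)
        if (indegree[v] == 0) ready.push(v);
    int processed = 0;
    while (!ready.empty()) {
        int v = ready.front();
        ready.pop();
        ++processed;
        for (int w : succ[v]) {
            // Edge v-w has the weight of its source vertex, duration[v]:
            // the difference constraint x_w >= x_v + duration[v].
            start[w] = std::max(start[w], start[v] + duration[v]);
            if (--indegree[w] == 0) ready.push(w);
        }
    }
    if (processed != N) return {};           // a cycle remains
    return start;
}
```

With the hypothetical durations {3, 2, 4, 1} and constraints 0 before 1, 0 before 2, 1 before 3, and 2 before 3, the start times are 0, 3, 3, and 7: job 3 must wait for the longer of its two predecessors.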

Figure 21.24 Job schedule

This figure illustrates the solution to the job-scheduling problem of Figure 21.22, derived from the correspondence between longest paths in weighted DAGs and job schedules. The longest path lengths in the wt array that is computed by the longest paths algorithm in Program 21.6 (see Figure 21.15) are precisely the required job start times (top, right column). We start jobs 0 and 5 at time 0, jobs 1, 7, and 9 at time .41, jobs 4 and 6 at time .70, and so forth.

Program 21.8 is an implementation that shows the application of this theory in a practical setting. It transforms any instance of the job-scheduling problem into an instance of the longest-path problem in acyclic networks, then uses Program 21.6 to solve it.

We have been implicitly assuming that a solution exists for any instance of the job-scheduling problem; however, if there is a cycle in the set of precedence constraints, then there is no way to schedule the jobs to meet them. Before looking for longest paths, we should check for this condition by checking whether the corresponding network has a cycle (see Exercise 21.100). Such a situation is typical, and a specific technical term is normally used to describe it.

Definition 21.4 A problem instance that admits no solution is said to be infeasible.

In other words, for job-scheduling problems, the question of determining whether a job-scheduling problem instance is feasible reduces to the problem of determining whether a digraph is acyclic. As we move to ever-more-complicated problems, the question of feasibility becomes an ever-more-important (and ever-more-difficult!) part of our computational burden.

We have now considered three interrelated problems. We might have shown directly that the job-scheduling problem reduces to the single-source longest-paths problem in acyclic networks, but we have also shown that we can solve any difference-constraints problem (with positive constants) in a similar manner (see Exercise 21.94), as well as any other problem that reduces to a difference-constraints problem or a job-scheduling problem. We could, alternatively, develop an algorithm to solve the difference-constraints problem and use that algorithm to solve the other problems, but we have not shown that a solution to the job-scheduling problem would give us a way to solve the others.

These examples illustrate the use of reduction to broaden the applicability of proven implementations. Indeed, modern systems programming emphasizes the need to reuse software by developing new interfaces and using existing software resources to build implementations. This important process, which is sometimes referred to as library programming, is a practical realization of the idea of reduction.

Library programming is extremely important in practice, but it represents only part of the story of the implications of reduction. To illustrate this point, we consider the following version of the job-scheduling problem.

Job scheduling with deadlines Allow an additional type of constraint in the job-scheduling problem, to specify that a job must begin before a specified amount of time has elapsed, relative to another job. (Conventional deadlines are relative to the start job.) Such constraints are commonly needed in time-critical manufacturing processes and in many other applications, and they can make the job-scheduling problem considerably more difficult to solve.

Suppose that we need to add a constraint to our example of Figures 21.22 through 21.24 that job 2 must start earlier than a certain number c of time units after job 4 starts.
If c is greater than .53, then the schedule that we have computed fits the bill, since it says to start job 2 at time 1.23, which is .53 after the start time of job 4 (which starts at .70). If c is less than .53, we can shift the start time of 4 later to meet the constraint. If job 4 were a long job, this change could increase the finish time of the whole schedule. Worse, if there are other constraints on job 4, we may not be able to shift its start time. Indeed, we may find ourselves with constraints that no schedule can meet: For instance, we could not satisfy a constraint in our example that job 2 must start earlier than d time units after the start of job 6 for d less than .53 because the constraints that 2 must follow 8 and 8 must follow 6 imply that 2 must start later than .53 time units after the start of 6.

If we add both of the two constraints described in the previous paragraph to the example, then both of them affect the time that 4 can be scheduled, the finish time of the whole schedule, and whether a feasible schedule exists, depending on the values of c and d. Adding more constraints of this type multiplies the possibilities and turns an easy problem into a difficult one. Therefore, we are justified in seeking the approach of reducing the problem to a known problem.

Property 21.16 The job-scheduling-with-deadlines problem reduces to the shortest-paths problem (with negative weights allowed).

Proof: Convert precedence constraints to inequalities using the same reduction described in Property 21.14. For any deadline constraint, add an inequality x_i − x_j ≤ d_j, or, equivalently x_j − x_i ≥ −d_j, where d_j is a positive constant. Convert the set of inequalities to a network using the same reduction described in Property 21.15. Negate all the weights. By the same construction given in the proof of Property 21.15, any shortest-path tree rooted at 0 in the network corresponds to a schedule.

This reduction takes us to the realm of shortest paths with negative weights. It says that if we can find an efficient solution to the shortest-paths problem with negative weights, then we can find an efficient solution to the job-scheduling problem with deadlines. (Again, the correspondence in the proof of Property 21.16 does not establish the converse; see Exercise 21.91.)

Adding deadlines to the job-scheduling problem corresponds to allowing negative constants in the difference-constraints problem and negative weights in the shortest-paths problem.
(This change also requires that we modify the difference-constraints problem to properly handle the analog of negative cycles in the shortest-paths problem.) These more general versions of these problems are more difficult to solve than the versions that we first considered, but they are also likely to be more useful as more general models. A plausible approach to solving all of them would seem to be to seek an efficient solution to the shortest-paths problem with negative weights.

Unfortunately, there is a fundamental difficulty with this approach, and it illustrates the other part of the story in the use of reduction to assess the relative difficulty of problems. We have been using reduction in a positive sense, to expand the applicability of solutions to general problems; but it also applies in a negative sense, to show the limits on such expansion.
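The construction in the proof of Property 21.16 can be tried out on small instances. The sketch below is an assumption-laden illustration, not code from the text: the Constraint type and the function solveConstraints are hypothetical, and the shortest-paths step uses repeated edge relaxation in the style of the Bellman-Ford algorithm, a standard method for networks with negative weights that this section does not itself describe. Each constraint x[to] − x[from] ≥ c becomes an edge from-to of weight −c, the zero-weight edges from the dummy vertex are implicit in initializing every distance to 0, and a network that keeps changing after enough passes has a negative cycle, which means that the instance is infeasible.

```cpp
#include <cstddef>
#include <vector>

// One difference constraint: x[to] - x[from] >= c (c may be negative).
struct Constraint { int from, to; double c; };

// Tries to find the smallest nonnegative values satisfying all constraints.
// Returns false if the constraints are infeasible; otherwise fills x.
bool solveConstraints(int n, const std::vector<Constraint>& cs,
                      std::vector<double>& x) {
    // Shortest-path lengths from an implicit dummy vertex that has a
    // zero-weight edge to every vertex, in the network with negated weights.
    std::vector<double> dist(n, 0.0);
    for (int pass = 0; pass <= n; ++pass) {
        bool changed = false;
        for (const Constraint& k : cs) {
            double candidate = dist[k.from] - k.c;   // edge weight is -c
            if (candidate < dist[k.to]) {
                dist[k.to] = candidate;
                changed = true;
            }
        }
        if (!changed) {
            // Negate the shortest-path lengths to recover the values.
            x.assign(n, 0.0);
            for (int v = 0; v < n; ++v) x[v] = -dist[v];
            return true;
        }
    }
    return false;   // still changing after n + 1 passes: negative cycle
}
```

In this encoding, a precedence constraint "job i (of length c_i) before job j" is Constraint{i, j, c_i}, and a deadline "job j must start no more than d after job i starts" is Constraint{j, i, -d}. For instance, with jobs 0 and 1 of lengths 2 and 3, precedences 0 before 1 and 1 before 2, and a deadline requiring job 2 to start within 4 time units of the start of job 0, the function reports infeasibility, because the precedences alone force job 2 to start at least 5 time units after job 0.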

The difficulty is that the general shortest-paths problem is too hard to solve. We see next how the concept of reduction helps us to make this statement with precision and conviction. In Section 17.8, we discussed a set of problems, known as the NP-hard problems, that we consider to be intractable because all known algorithms for solving them require exponential time in the worst case. We show here that the general shortest-paths problem is NP-hard.

As mentioned briefly in Section 17.8 and discussed in detail in Part 8, we generally take the fact that a problem is NP-hard to mean not just that no efficient algorithm is known that is guaranteed to solve the problem but also that we have little hope of finding one. In this context, we use the term efficient to refer to algorithms whose running time is bounded by some polynomial function of the size of the input, in the worst case. We assume that the discovery of an efficient algorithm to solve any NP-hard problem would be a stunning research breakthrough. The concept of NP-hardness is important in identifying problems that are difficult to solve, because it is often easy to prove that a problem is NP-hard, using the following technique.

Property 21.17 A problem is NP-hard if there is an efficient reduction to it from any NP-hard problem.

This property depends on the precise meaning of an efficient reduction from one problem A to another problem B. We defer such definitions to Part 8 (two different definitions are commonly used). For the moment, we simply use the term to cover the case where we have efficient algorithms both to transform an instance of A to an instance of B and to transform a solution of B to a solution of A.

Now, suppose that we have an efficient reduction from an NP-hard problem A to a given problem B.
The proof is by contradiction: If we have an efficient algorithm for B, then we could use it to solve any instance of A in polynomial time, by reduction (transform the given instance of A to an instance of B, solve that problem, then transform the solution). But no known algorithm can make such a guarantee for A (because A is NP-hard), so the assumption that there exists a polynomial-time algorithm for B is incorrect: B is also NP-hard.

This technique is extremely important because people have used it to show a huge number of problems to be NP-hard, giving us a broad variety of problems from which to choose when we want to develop a proof that a new problem is NP-hard. For example, we encountered one of the classic NP-hard problems in Section 17.7. The Hamilton-path problem, which asks whether there is a simple path containing all the vertices in a given graph, was one of the first problems

shown to be NP-hard (see reference section). It is easy to formulate as a shortest-paths problem, so Property 21.17 implies that the shortest-paths problem itself is NP-hard:

Property 21.18 In networks with edge weights that could be negative, shortest-paths problems are NP-hard.

Proof: Our proof consists of reducing the Hamilton-path problem to the shortest-paths problem. That is, we show that we could use any algorithm that can find shortest paths in networks with negative edge weights to solve the Hamilton-path problem. Given an undirected graph, we build a network with edges in both directions corresponding to each edge in the graph and with all edges having weight −1. The shortest (simple) path starting at any vertex in this network is of length 1 − V if and only if the graph has a Hamilton path. Note that this network is replete with negative cycles. Not only does every cycle in the graph correspond to a negative cycle in the network, but also every edge in the graph corresponds to a cycle of weight −2 in the network.

The implication of this construction is that the shortest-paths problem is NP-hard, because if we could develop an efficient algorithm for the shortest-paths problem in networks, then we would have an efficient algorithm for the Hamilton-path problem in graphs.

One response to the discovery that a given problem is NP-hard is to seek versions of that problem that we can solve. For shortest-paths problems, we are caught between having a host of efficient algorithms for acyclic networks or for networks in which edge weights are nonnegative and having no good solution for networks that could have cycles and negative weights. Are there other kinds of networks that we can address? That is the subject of Section 21.7. There, for example, we see that the job-scheduling-with-deadlines problem reduces to a version of the shortest-paths problem that we can solve efficiently. This situation is typical: As we address ever-more-difficult computational problems, we find ourselves working to identify the versions of those problems that we can expect to solve.

As these examples illustrate, reduction is a simple technique that is helpful
in algorithm design, and we use it frequently. Either we can solve a new problem by proving that it reduces to a problem that we know how to solve, or we can prove that the new problem will be difficult by proving that a problem that we know to be difficult reduces to the problem in question.

Table 21.3 gives us a more detailed look at the various implications of reduction results among the four general problem classes that we discussed in Chapter 17.
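The construction in the proof of Property 21.18 can be sketched in a few lines. This is only an illustration under stated assumptions: the names Network, shortestSimpleFrom, and hasHamiltonPath are hypothetical, and an exhaustive search over simple paths stands in for the shortest-simple-path solver that the proof presumes. That search takes exponential time, which is consistent with the point of the reduction.

```cpp
#include <algorithm>
#include <utility>
#include <vector>

// Network built from an undirected graph: each graph edge becomes two
// directed edges of weight -1, as in the proof of Property 21.18.
struct Network {
    int V;
    std::vector<std::vector<std::pair<int, int>>> adj;  // (target, weight)
    explicit Network(int v) : V(v), adj(v) {}
    void addUndirected(int a, int b) {
        adj[a].push_back({b, -1});
        adj[b].push_back({a, -1});
    }
};

// Stand-in for a shortest-simple-path solver (an assumption of this
// sketch): exhaustive search over all simple paths that start at v.
// Returns the lowest total weight found; the empty path has weight 0.
int shortestSimpleFrom(const Network& g, int v, std::vector<bool>& onPath) {
    onPath[v] = true;
    int best = 0;
    for (const auto& e : g.adj[v])
        if (!onPath[e.first])
            best = std::min(best, e.second + shortestSimpleFrom(g, e.first, onPath));
    onPath[v] = false;
    return best;
}

// The graph has a Hamilton path if and only if the shortest simple path
// from some vertex has weight 1 - V (it then uses V - 1 edges and
// therefore visits every vertex).
bool hasHamiltonPath(const Network& g) {
    std::vector<bool> onPath(g.V, false);
    for (int s = 0; s < g.V; ++s)
        if (shortestSimpleFrom(g, s, onPath) == 1 - g.V) return true;
    return false;
}
```

For a four-vertex path 0-1-2-3 the check succeeds; for a four-vertex star (one center joined to three leaves) it fails, because no simple path there has more than two edges.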

Note that there are several cases where a reduction provides no new information; for example, although selection reduces to sorting and the problem of finding longest paths in acyclic networks reduces to the problem of finding shortest paths in general networks, these facts shed no new light on the relative difficulty of the problems. In other cases, the reduction may or may not provide new information; in still other cases, the implications of a reduction are truly profound. To develop these concepts, we need a precise and formal description of reduction, as we discuss in detail in Part 8; here, we summarize informally the most important uses of reduction in practice, with examples that we have already seen.

Table 21.3 Reduction implications

This table summarizes some implications of reducing a problem A to another problem B, with examples that we have discussed in this section. The profound implications of cases 9 and 10 are so far-reaching that we generally assume that it is not possible to prove such reductions (see Part 8). Reduction is most useful in cases 1, 6, 11, and 16, to learn a new algorithm for A or prove a lower bound on B; in cases 13-15, to learn new algorithms for A; and in case 12, to learn the difficulty of B.

Upper bounds If we have an efficient algorithm for a problem B and can prove that A reduces to B, then we have an efficient algorithm for A. There may exist some other better algorithm for A, but B's performance is an upper bound on the best that we can do for A. For example, our proof that job scheduling reduces to longest paths in acyclic networks makes our algorithm for the latter an efficient algorithm for the former.

Lower bounds If we know that any algorithm for problem A has certain resource requirements and we can prove that A reduces to B, then we know that B has at least those same resource requirements, because a better algorithm for B would imply the existence of a better algorithm for A (as long as the cost of the reduction is lower than the cost of B). That is, A's performance is a lower bound on the best that we can do for B. For example, we used this technique in Section 19.3 to show that computing the transitive closure is as difficult as Boolean matrix multiplication, and we used it in Section 20.7 to show that computing the Euclidean MST is as difficult as sorting.

Intractability In particular, we can prove a problem to be intractable by showing that an intractable problem reduces to it. For example, Property 21.18 shows that the shortest-paths problem is intractable because the Hamilton-path problem reduces to the shortest-paths problem.

Beyond these general implications, it is clear that more detailed information about the performance of specific algorithms to solve specific problems can be directly relevant to other problems that reduce to the first ones. When we find an upper bound, we can analyze the associated algorithm, run empirical studies, and so forth to determine whether it represents a better solution to the problem. When we develop a good general-purpose algorithm, we can invest in developing and testing a good implementation and then develop associated ADTs that expand its applicability.

We use reduction as a basic tool in this and the next chapter. We emphasize the general relevance of the problems that we consider, and the general applicability of the algorithms that solve them, by reducing other problems to them. It is also important to be aware of a hierarchy among increasingly general problem-formulation models. For example, linear programming is a general formulation that is important not just because many problems reduce to it but also because it is not known to be NP-hard. In other words, there is no known way to reduce the general shortest-paths problem (or any other NP-hard problem) to linear programming. We discuss such issues in Part 8.

Not all problems can be solved, but good general models have been devised that are suitable for broad classes of problems that we do know how to solve.

Shortest paths in networks is our first example of such a model. As we move to ever-more-general problem domains, we enter the field of operations research (OR), the study of mathematical methods of decision making, where developing and studying such models is central. One key challenge in OR is to find the model that is most appropriate for solving a problem and to fit the problem to the model. This activity is sometimes known as mathematical programming (a name given to it before the advent of computers and the new use of the word "programming"). Reduction is a modern concept that is in the same spirit as mathematical programming and is the basis for our understanding of the cost of computation in a broad variety of applications.

Exercises

• 21.85 Use the reduction of Property 21.12 to develop a transitive closure implementation (with the same interface as Programs 19.3 and 19.4) that uses the all-pairs shortest-paths ADT of Section 21.3.

21.86 Show that the problem of computing the number of strong components in a digraph reduces to the all-pairs shortest-paths problem with nonnegative weights.

21.87 Give the difference-constraints and shortest-paths problems that correspond (according to the constructions of Properties 21.14 and 21.15) to the job-scheduling problem, where jobs 0 to 7 have lengths

.4 .2 .3 .4 .2 .5 .1

and constraints

5-1 4-6 6-0 3-2 6-1 6-2,

respectively.

• 21.88 Give a solution to the job-scheduling problem of Exercise 21.87.

• 21.89 Suppose that the jobs in Exercise 21.87 also have the constraints that job 1 must start before job 6 ends, and job 2 must start before job 4 ends. Give the shortest-paths problem to which this problem reduces, using the construction described in the proof of Property 21.16.

21.90 Show that the all-pairs longest-paths problem in acyclic networks with positive weights reduces to the difference-constraints problem with positive constants.

• 21.91 Explain why the correspondence in the proof of Property 21.16 does not extend to show that the shortest-paths problem reduces to the job-scheduling-with-deadlines problem.

21.92 Extend Program 21.8 to use symbolic names instead of integers to refer to jobs (see Program 17.10).

21.93 Design an ADT interface that provides clients with the ability to pose and solve difference-constraints problems.

21.94 Write a class that implements your interface from Exercise 21.93, basing your solution to the difference-constraints problem on a reduction to the shortest-paths problem in acyclic networks.

21.95 Provide an implementation for a class that solves the single-source shortest-paths problem in acyclic networks with negative weights, which is based on a reduction to the difference-constraints problem and uses your interface from Exercise 21.93.

• 21.96 Your solution to the shortest-paths problem in acyclic networks for Exercise 21.95 assumes the existence of an implementation that solves the difference-constraints problem. What happens if you use the implementation from Exercise 21.94, which assumes the existence of an implementation for the shortest-paths problem in acyclic networks?

• 21.97 Prove the equivalence of any two NP-hard problems (that is, choose two problems and prove that they reduce to each other).

•• 21.98 Give an explicit construction that reduces the shortest-paths problem in networks with integer weights to the Hamilton-path problem.

• 21.99 Use reduction to implement a class that uses a network ADT that solves the single-source shortest-paths problem to solve the following problem: Given a digraph, a vertex-indexed vector of positive weights, and a start vertex v, find the paths from v to each other vertex such that the sum of the weights of the vertices on the path is minimized.

• 21.100 Program 21.8 does not check whether the job-scheduling problem that it takes as input is infeasible (has a cycle). Characterize the schedules that it prints out for infeasible problems.

21.101 Design an ADT interface that gives clients the ability to pose and solve job-scheduling problems. Write a class that implements your interface, basing your solution to the job-scheduling problem on a reduction to the shortest-paths problem in acyclic networks, as in Program 21.8.

• 21.102 Add a function to your class from Exercise 21.101 (and provide an implementation) that prints out a longest path in the schedule. (Such a path is known as a critical path.)

21.103 Write a client for your interface from Exercise 21.101 that outputs a PostScript program that draws the schedule in the style of Figure 21.24 (see Section 4.3).

• 21.104 Develop a model for generating job-scheduling problems. Use this model to test your implementations of Exercises 21.101 and 21.103 for a reasonable set of problem sizes.

21.105 Write a class that implements your interface from Exercise 21.101, basing your solution to the job-scheduling problem on a reduction to the difference-constraints problem.

• 21.106 A PERT (performance-evaluation-review-technique) chart is a network that represents a job-scheduling problem, with edges representing jobs, as described in Figure 21.25. Write a class that implements your job-scheduling interface of Exercise 21.101 that is based on PERT charts.

21.107 How many vertices are there in a PERT chart for a job-scheduling problem with V jobs and E constraints?

21.108 Write programs to convert between the edge-based job-scheduling representation (PERT chart) discussed in Exercise 21.106 and the vertex-based representation used in the text (see Figure 21.22).

Figure 21.25 A PERT chart

A PERT chart is a network representation for job-scheduling problems where we represent jobs by edges. The network at the top is a representation of the job-scheduling problem depicted in Figure 21.22, where jobs 0 through 9 in Figure 21.22 are represented by edges 0-1, 1-2, 2-3, 4-3, 5-3, 0-3, 5-4, 1-4, 4-2, and 1-5, respectively, here. The critical path in the schedule is the longest path in the network.

21.7 Negative Weights

We now turn to the challenge of coping with negative weights in shortest-paths problems. Perhaps negative edge weights seem unlikely, given our focus through most of this chapter on intuitive examples, where weights represent distances or costs; however, we also saw in Section 21.6 that negative edge weights arise in a natural way when we reduce other problems to shortest-paths problems. Negative weights are not merely a mathematical curiosity; on the contrary, they significantly extend the applicability of the shortest-paths problems as a model for solving other problems. This potential utility is our motivation to search for efficient algorithms to solve network problems that involve negative weights.

Figure 21.26 is a small example that illustrates the effects of introducing negative weights on a network's shortest paths. Perhaps the most important effect is that when negative weights are present, low-weight shortest paths tend to have more edges than higher-weight paths. For positive weights, our emphasis was on looking for shortcuts; but when negative weights are present, we seek detours that use as many edges with negative weights as we can find. This effect turns our intuition in seeking "short" paths into a liability in understanding the algorithms, so we need to suppress that line of intuition and consider the problem on a basic abstract level.

Figure 21.26 A sample network with negative edges

This sample network is the same as the network depicted in Figure 21.1, except that the edges 3-5 and 5-1 are negative. Naturally, this change dramatically affects the shortest-paths structure, as we can easily see by comparing the distance and path matrices at the right with their counterparts in Figure 21.9. For example, the shortest path from 0 to 1 in this network is 0-5-1, which is of length 0; and the shortest path from 2 to 1 is 2-3-5-1, which is of length −.17.

The relationship shown in the proof of Property 21.18 between shortest paths in networks and Hamilton paths in graphs ties in with our observation that finding paths of low weight (which we have been calling "short") is tantamount to finding paths with a high number of edges (which we might consider to be "long"). With negative weights, we are looking for long paths rather than short paths.

The first idea that suggests itself to remedy the situation is to find the smallest (most negative) edge weight, then to add the absolute value of that number to all the edge weights to transform the network into one with no negative weights. This naive approach does not work at all, because shortest paths in the new network bear little relation to shortest paths in the old one. For example, in the network illustrated in Figure 21.26, the shortest path from 4 to 2 is 4-3-5-1-2. If we add .38 to all the edge weights in the graph to make them all nonnegative, the weight of this four-edge path grows from .20 to 1.72 (that is, .20 + 4 × .38). But the weight of 4-2 grows from .32 to just .70, so that edge becomes the shortest path from 4 to 2. The more edges a path has, the more it is penalized by this transformation; that result, from the observation in the previous paragraph, is precisely the opposite of what we need. Even though this naive idea does not work, the goal of transforming the network into an equivalent one with no negative weights but the same shortest paths is worthy; at the end of the section, we consider an algorithm that achieves this goal.

Our shortest-paths algorithms to this point have all placed one of two restrictions on the shortest-paths problem so that they can offer an efficient solution: They either disallow cycles or disallow negative weights. Is there a less stringent restriction that we could impose on networks that contain both cycles and negative weights that would still lead to tractable shortest-paths problems? We touched on an answer to this question at the beginning of the chapter, when we had to add the restriction that paths be simple so that the problem would make sense if there were negative cycles.
Should we perhaps restrict attention to networks that have no such cycles?

Shortest paths in networks with no negative cycles Given a network that may have negative edge weights but does not have any negative-weight cycles, solve one of the following problems: Find a shortest path connecting two given vertices (shortest-path problem), find shortest paths from a given vertex to all the other vertices (single-source problem), or find shortest paths connecting all pairs of vertices (all-pairs problem).

The proof of Property 21.18 leaves the door open for the possibility of efficient algorithms for solving this problem because it breaks down if we disallow negative cycles. To solve the Hamilton-path problem, we would need to be able to solve shortest-paths problems in networks that have huge numbers of negative cycles.

Moreover, many practical problems reduce precisely to the problem of finding shortest paths in networks that contain no negative cycles. We have already seen one such example.

Property 21.19 The job-scheduling-with-deadlines problem reduces to the shortest-paths problem in networks that contain no negative cycles.

Proof: The argument that we used in the proof of Property 21.15 shows that the construction in the proof of Property 21.16 leads to networks that contain no negative cycles. From the job-scheduling problem, we construct a difference-constraints problem with variables that correspond to job start times; from the difference-constraints problem, we construct a network. We negate all the weights to convert from a longest-paths problem to a shortest-paths problem, a transformation that corresponds to reversing the sense of all the inequalities. Any simple path from i to j in the network corresponds to a sequence of inequalities involving the variables. The existence of the path implies, by collapsing these inequalities, that $x_i - x_j \le w_{ij}$, where $w_{ij}$ is the sum of the weights on the path from i to j. A negative cycle corresponds to 0 on the left side of this inequality and a negative value on the right, so the existence of such a cycle is a contradiction.

Figure 21.27 Arbitrage

The table at the top specifies conversion factors from one currency to another. For example, the second entry in the top row says that $1 buys 1.631 units of currency P. Converting $1000 to currency P and back again would yield $1000 × (1.631) × (0.613) = $999, a loss of $1. But converting $1000 to currency P then to currency Y and back again yields $1000 × (1.631) × (0.411) × (1.495) = $1002, a .2% arbitrage opportunity. If we take the negative of the logarithm of all the numbers in the table (bottom), we can consider it to be the adjacency matrix for a complete network with edge weights that could be positive or negative. In this network, nodes correspond to currencies, edges to conversions, and paths to sequences of conversions. The conversion just described corresponds to the cycle $-P-Y-$ in the graph, which has weight −0.489 + 0.890 − 0.402, or about −.002 when computed from the unrounded logarithms: a negative cycle. The best arbitrage opportunity is the shortest cycle in the graph.

As we noted when we first discussed the job-scheduling problem in Section 21.6, this statement implicitly assumes that our job-scheduling problems are feasible (have a solution). In practice, we would not make such an assumption, and part of our computational burden would be determining whether or not a job-scheduling-with-deadlines problem is feasible. In the construction in the proof of Property 21.19, a negative cycle in the network implies an infeasible problem, so this task corresponds to the following problem.

Negative cycle detection Does a given network have a negative cycle? If it does, find one such cycle.

On the one hand, this problem is not necessarily easy (a simple cycle-checking algorithm for digraphs does not apply); on the other hand, it is not necessarily difficult (the reduction of Property 21.18 from the Hamilton-path problem does not apply). Our first challenge will be to develop an algorithm for this task.

In the job-scheduling-with-deadlines application, negative cycles correspond to error conditions that are presumably rare, but for which we need to check. We might even develop algorithms that remove edges to break the negative cycle and iterate until there are none. In other applications, detecting negative cycles is the prime objective, as in the following example.

Arbitrage Many newspapers print tables showing conversion rates among the world's currencies (see, for example, Figure 21.27). We can view such tables as adjacency-matrix representations of complete networks. An edge s-t with weight x means that we can convert 1 unit of currency s into x units of currency t. Paths in the network specify multistep conversions. For example, if there is also an edge t-w with weight y, then the path s-t-w represents a way to convert 1 unit of currency s into xy units of currency w. We might expect xy to be equal to the weight of s-w in all cases, but such tables represent a complex dynamic system where such consistency cannot be guaranteed. If we find a case where xy is smaller than the weight of s-w, then we may be able to outsmart the system. Suppose that the weight of w-s is z and xyz > 1; then the cycle s-t-w-s gives a way to convert 1 unit of currency s into more than 1 unit (xyz) of currency s. That is, we can make a 100(xyz − 1) percent profit by converting from s to t to w back to s. This situation is an example of an arbitrage opportunity that would allow us to make unlimited profits were it not for forces outside the model, such as limitations on the size of transactions. To convert this problem to a shortest-paths problem, we take the logarithm of all the numbers so that path weights correspond to adding edge weights instead of multiplying them, and then we take the negative to invert the comparison. Then the edge weights might be negative or positive, and a shortest path from s to t gives a best way of converting from currency s to currency t. The lowest-weight cycle is the best arbitrage opportunity, but any negative cycle is of interest.
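The logarithm transformation just described can be checked numerically. The following is a minimal sketch, not code from the text: the function names cycleWeight, isArbitrage, and percentProfit are hypothetical, and the rates used in the note after the code are the ones quoted in the caption of Figure 21.27.

```cpp
#include <cmath>
#include <vector>

// Weight of a conversion cycle after the transformation: each rate r
// becomes the edge weight -log(r), so multiplying rates along a path
// corresponds to adding edge weights.
double cycleWeight(const std::vector<double>& rates) {
    double w = 0.0;
    for (double r : rates) w += -std::log(r);
    return w;
}

// A cycle is an arbitrage opportunity exactly when its weight is
// negative, that is, when the product of its rates exceeds 1.
bool isArbitrage(const std::vector<double>& rates) {
    return cycleWeight(rates) < 0.0;
}

// Percent profit 100(xyz - 1), recovered from the cycle weight.
double percentProfit(const std::vector<double>& rates) {
    return 100.0 * (std::exp(-cycleWeight(rates)) - 1.0);
}
```

With the rates 1.631, 0.411, and 1.495 for the cycle $-P-Y-$, the weight is slightly negative and the profit is a little over .2 percent; with 1.631 and 0.613 for the round trip $-P-$, the weight is positive, so that cycle is not an arbitrage opportunity.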

Figure 21.28 Failure of Dijkstra's algorithm (negative weights)

In this example, Dijkstra's algorithm decides that 4-2 is the shortest path from 4 to 2 (of length .32) and misses the shorter path 4-3-5-1-2 (of length .20).

Can we detect negative cycles in a network or find shortest paths in networks that contain no negative cycles? The existence of efficient algorithms to solve these problems does not contradict the NP-hardness of the general problem that we proved in Property 21.18, because no reduction from the Hamilton-path problem to either problem is known. Specifically, the reduction of Property 21.18 says that what we cannot do is to craft an algorithm that can guarantee to find efficiently the lowest-weight path in any given network when negative edge weights are allowed. That problem statement is too general. But we can solve the restricted versions of the problem just mentioned, albeit not as easily as we can the other restricted versions of the problem (positive weights and acyclic networks) that we studied earlier in this chapter.

In general, as we noted in Section 21.2, Dijkstra's algorithm does not work in the presence of negative weights, even when we restrict attention to networks that contain no negative cycles. Figure 21.28 illustrates this fact. The fundamental difficulty is that the algorithm depends on examining paths in increasing order of their length. The proof that the algorithm is correct (see Property 21.2) assumes that adding an edge to a path makes that path longer.

Floyd's algorithm makes no such assumption and is effective even when edge weights may be negative.
If there are no negative cycles, it computes shortest paths; remarkably enough, if there are negative cycles, it detects at least one of them.

Property 21.20 Floyd's algorithm solves the negative-cycle–detection problem and the all-pairs shortest-paths problem in networks that contain no negative cycles, in time proportional to $V^3$.

Proof: The proof of Property 21.8 does not depend on whether or not edge weights are negative; however, we need to interpret the results differently when negative edge weights are present. Each entry in the matrix is evidence of the algorithm having discovered a path of that length; in particular, any negative entry on the diagonal of the distances matrix is evidence of the presence of at least one negative cycle. In the presence of negative cycles, we cannot directly infer any further information, because the paths that the algorithm implicitly tests are not necessarily simple: Some may involve one or more trips around one or more negative cycles. However, if there are no negative cycles, then the paths that are computed by the algorithm are simple, because any path with a cycle

would imply the existence of a path that connects the same two points which has fewer edges and is not of higher weight (the same path with the cycle removed).

The proof of Property 21.20 does not give specific information about how to find a specific negative cycle from the distances and paths matrices computed by Floyd's algorithm. We leave that task for an exercise (see Exercise 21.122).

Floyd's algorithm solves the all-pairs shortest-paths problem for graphs that contain no negative cycles. Given the failure of Dijkstra's algorithm in networks that contain weights that could be negative, we could also use Floyd's algorithm to solve the all-pairs problem for sparse networks that contain no negative cycles, in time proportional to $V^3$. If we have a single-source problem in such networks, then we can use this $V^3$ solution to the all-pairs problem that, although it amounts to overkill, is the best that we have yet seen for the single-source problem. Can we develop faster algorithms for these problems, ones that achieve the running times that we achieve with Dijkstra's algorithm when edge weights are positive (E lg V for single-source shortest paths and V E lg V for all-pairs shortest paths)? We can answer this question in the affirmative for the all-pairs problem and also can bring down the worst-case cost to V E for the single-source problem, but breaking the V E barrier for the general single-source shortest-paths problem is a longstanding open problem.

The following approach, developed by R. Bellman and L. Ford in the late 1950s, provides a simple and effective basis for attacking single-source shortest-paths problems in networks that contain no negative cycles. To compute shortest paths from a vertex s, we maintain (as usual) a vertex-indexed vector wt such that wt[t] contains the shortest-path length from s to t. We initialize wt[s] to 0 and all other

Figure 21.29 Floyd's algorithm (negative weights)

This sequence shows the construction of the all-shortest-paths matrices for a digraph with negative weights, using Floyd's algorithm. The first step is the same as depicted in Figure 21.14. Then the negative edge 5-1 comes into play in the second step, where the paths 5-1-2 and 5-1-4 are discovered. The algorithm involves precisely the same sequence of relaxation steps for any edge weights, but the outcome differs.

wt entries to a large sentinel value, then compute shortest paths as follows:

    Considering the network's edges in any order, relax along each edge. Make V such passes.

We use the term Bellman–Ford algorithm to refer to the generic method of making V passes through the edges, considering the edges in any order. Certain authors use the term to describe a more general method (see Exercise 21.130).

For example, in a graph represented with adjacency lists, we could implement the Bellman–Ford algorithm to find the shortest paths from a start vertex s by initializing the wt entries to a value larger than any path length and the spt entries to null pointers, then proceeding as follows:

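    wt[s] = 0;
    for (i = 0; i < G->V(); i++)
      for (v = 0; v < G->V(); v++)
        {
          // skip vertices that have not yet been reached from s
          if (v != s && spt[v] == 0) continue;
          // iterator over the edges leaving v; this declaration is
          // missing from the listing and its exact form is assumed
          typename Graph::adjIterator A(*G, v);
          for (Edge* e = A.beg(); !A.end(); e = A.nxt())
            if (wt[e->w()] > wt[v] + e->wt())   // relax along e
              { wt[e->w()] = wt[v] + e->wt(); spt[e->w()] = e; }
        }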

This code exhibits the simplicity of the basic method. It is not used in practice, however, because simple modifications yield implementations that are more efficient for most graphs, as we soon see.

Property 21.21 With the Bellman–Ford algorithm, we can solve the single-source shortest-paths problem in networks that contain no negative cycles in time proportional to V E.

Proof: We make V passes through all E edges, so the total time is proportional to V E. To show that the computation achieves the desired result, we show by induction on i that, after the ith pass, wt[v] is no greater than the length of the shortest path from s to v that contains i or fewer edges, for all vertices v. The claim is certainly true if i is 0. Assuming the claim to be true for i, there are two cases for each given vertex v: Among the paths from s to v with i+1 or fewer edges, there may or may not be a shortest path with i+1 edges. If the shortest of the paths with i+1 or fewer edges from s to v has i or fewer edges,

Program 21.9 Bellman–Ford algorithm

This implementation of the Bellman–Ford algorithm maintains a FIFO queue of all vertices for which relaxing along an outgoing edge could be effective. We take a vertex off the queue and relax along all of its edges. If any of them leads to a shorter path to some vertex, we put that on the queue. The sentinel value G->V separates the current batch of vertices (which changed on the last iteration) from the next batch (which change on this iteration) and allows us to stop after G->V passes.

then wt[v] will not change and will remain valid. Otherwise, there is a path from s to v with i+1 edges that is shorter than any path from s to v with i or fewer edges. That path must consist of a path with i edges from s to some vertex w plus the edge w-v. By the induction hypothesis, wt[w] is an upper bound on the shortest distance from s to w, and the (i+1)st pass checks whether each edge constitutes the final edge in a new shortest path to that edge's destination. In particular, it checks the edge w-v.

After V−1 iterations, then, wt[v] is a lower bound on the length of any shortest path with V−1 or fewer edges from s to v, for all vertices v. We can stop after V−1 iterations because any path with V or more edges must have a (positive- or zero-cost) cycle and we could find a path with V−1 or fewer edges that is the same length or shorter by removing the cycle. Since wt[v] is the length of some path from s to v, it is also an upper bound on the shortest-path length, and therefore must be equal to the shortest-path length.

Although we did not consider it explicitly, the same proof shows that the spt vector contains pointers to the edges in the shortest-paths tree rooted at s.

For typical graphs, examining every edge on every pass is wasteful. Indeed, we can easily determine a priori that numerous edges are not going to lead to a successful relaxation in any given pass. In fact, the only edges that could lead to a change are those emanating from a vertex whose value changed on the previous pass.

Program 21.9 is a straightforward implementation where we use a FIFO queue to hold these edges so that they are the only ones examined on each pass. Figure 21.30 shows an example of this algorithm in operation.

Program 21.9 is effective for solving the single-source shortest-paths problem in networks that arise in practice, but its worst-case performance is still proportional to V E. For dense graphs, the running time is not better than for Floyd's algorithm, which finds all shortest paths, rather than just those from a single source. For sparse graphs, the implementation of the Bellman–Ford algorithm in Program 21.9 is up to a factor of V faster than Floyd's algorithm but is nearly a factor of V slower than the worst-case running time that we can achieve with Dijkstra's algorithm for networks with no negative-weight edges (see Table 19.2).

Other variations of the Bellman–Ford algorithm have been studied, some of which are faster for the single-source problem than the FIFO-queue version in Program 21.9, but all take time proportional to at least V E in the worst case (see, for example, Exercise 21.132). The basic Bellman–Ford algorithm was developed decades ago; and, despite the dramatic strides in performance that we have seen for many other graph problems, we have not yet seen algorithms with better worst-case performance for networks with negative weights.

The Bellman–Ford algorithm is also a more efficient method than Floyd's algorithm for detecting whether a network has negative cycles.

Figure 21.30 Bellman–Ford algorithm (with negative weights)

This figure shows the result of using the Bellman–Ford algorithm to find the shortest paths from vertex 4 in the network depicted in Figure 21.26. The algorithm operates in passes, where we examine all edges emanating from all vertices on a FIFO queue. The contents of the queue are shown below each graph drawing, with the shaded entries representing the contents of the queue for the previous pass. When we find an edge that can reduce the length of a path from 4 to its destination, we do a relaxation operation that puts the destination vertex on the queue and the edge on the SPT. The gray edges in the graph drawings comprise the SPT after each stage, which is also shown in oriented form in the center (all edges pointing down). We begin with an empty SPT and 4 on the queue (top). In the second pass, we relax along 4-2 and 4-3, leaving 2 and 3 on the queue. In the third pass, we examine but do not relax along 2-3 and then relax along 3-0 and 3-5, leaving 0 and 5 on the queue. In the fourth pass, we relax along 5-1 and then examine but do not relax along 1-0 and 1-5, leaving 1 on the queue. In the last pass (bottom), we relax along 1-2. The algorithm initially operates like BFS, but, unlike all of our other graph search methods, it might change tree edges, as in the last step.

Property 21.22 With the Bellman–Ford algorithm, we can solve the negative-cycle–detection problem in time proportional to VE.

Proof: The basic induction in the proof of Property 21.21 is valid even in the presence of negative cycles. If we run a Vth iteration of the algorithm and any relaxation step succeeds, then we have found a shortest path with V edges that connects s to some vertex in the network. Any such path must have a cycle (connecting some vertex w to itself) and that cycle must be negative, by the inductive hypothesis, since the path from s to the second occurrence of w must be shorter than the path from s to the first occurrence of w for w to be included on the path the second time. The cycle will also be present in the tree; thus, we could also detect cycles by periodically checking the spt edges (see Exercise 21.134).

This argument holds for only those vertices that are in the same strongly connected component as the source s. To detect negative cycles in general, we can either compute the strongly connected components and initialize weights for one vertex in each component to 0 (see Exercise 21.126) or add a dummy vertex with edges to every other vertex (see Exercise 21.127).

To conclude this section, we consider the all-pairs shortest-paths problem. Can we do better than Floyd’s algorithm, which runs in time proportional to V^3?

Using the Bellman–Ford algorithm to solve the all-pairs problem by solving the single-source problem at each vertex exposes us to a worst-case running time that is proportional to V^2 E. We do not consider this solution in more detail because there is a way to guarantee that we can solve the all-pairs problem in time proportional to VE log V. It is based on an idea that we considered at the beginning of this section: transforming the network into a network that has only nonnegative weights and that has the same shortest-paths structure.

In fact, we have a great deal of flexibility in transforming any network to another one with different edge weights but the same shortest paths. Suppose that a vertex-indexed vector wt contains an arbitrary assignment of weights to the vertices of a network G. With these weights, we define the operation of reweighting the graph as follows:

• To reweight an edge, add to that edge’s weight the difference between the weights of the edge’s source and destination.
• To reweight a network, reweight all of that network’s edges.

For example, the following straightforward code reweights a network, using our standard conventions:

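for (v = 0; v < G->V(); v++)
  { // The two iterator lines below were lost in extraction. They are
    // reconstructed on the assumption that the code follows the book's
    // usual adjacency-iterator conventions.
    typename Graph::adjIterator A(*G, v);
    for (Edge* e = A.beg(); !A.end(); e = A.nxt())
      e->wt() = e->wt() + wt[v] - wt[e->w()];
  }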

This operation is a simple linear-time process that is well-defined for all networks, regardless of the weights. Remarkably, the shortest paths in the transformed network are the same as the shortest paths in the original network.

Property 21.23 Reweighting a network does not affect its shortest paths.

Proof: Given any two vertices s and t, reweighting changes the weight of any path from s to t, precisely by adding the difference between the weights of s and t. This fact is easy to prove by induction on the length of the path. The weight of every path from s to t is changed by the same amount when we reweight the network, long paths and short paths alike. In particular, this fact implies immediately that the shortest-path length between any two vertices in the transformed network is the same as the shortest-path length between them in the original network.

Since paths between different pairs of vertices are reweighted differently, reweighting could affect questions that involve comparing shortest-path lengths (for example, computing the network’s diameter). In such cases, we need to invert the reweighting after completing the shortest-paths computation but before using the result.

Reweighting is no help in networks with negative cycles: The operation does not change the weight of any cycle, so we cannot remove negative cycles by reweighting. But for networks with no negative cycles, we can seek to discover a set of vertex weights such that reweighting leads to edge weights that are nonnegative, no matter what the original edge weights. With nonnegative edge weights, we can then solve the all-pairs shortest-paths problem with the all-pairs version of Dijkstra’s algorithm. For example, Figure 21.31 gives such an example for our sample network, and Figure 21.32 shows the shortest-paths computation with Dijkstra’s algorithm on the transformed network with no negative edges. The following property shows that we can always find such a set of weights.
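The induction behind Property 21.23 can be written out as a telescoping sum. In the sketch below, c(v, w) for the original weight of edge v-w and c'(v, w) for its reweighted value are notation introduced here for illustration; wt is the vertex-indexed vector from the text.

```latex
% Reweighting rule for a single edge v-w:
%   c'(v,w) = c(v,w) + wt[v] - wt[w]
% For any path s = v_0, v_1, ..., v_k = t the vertex weights cancel in pairs:
\begin{align*}
\sum_{i=1}^{k} c'(v_{i-1}, v_i)
  &= \sum_{i=1}^{k} \bigl( c(v_{i-1}, v_i) + wt[v_{i-1}] - wt[v_i] \bigr) \\
  &= \sum_{i=1}^{k} c(v_{i-1}, v_i) \;+\; wt[s] - wt[t].
\end{align*}
% The correction wt[s] - wt[t] depends only on the endpoints, so every path
% from s to t changes by the same amount. For a cycle, s = t and the
% correction is 0, which is why reweighting cannot remove a negative cycle.
```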

Figure 21.31 Reweighting a network

Given any assignment of weights to vertices (top), we can reweight all of the edges in a network by adding to each edge’s weight the difference of the weights of its source and destination vertices. Reweighting does not affect the shortest paths because it makes the same change to the weights of all paths connecting each pair of vertices. For example, consider the path 0-5-4-2-3: Its weight is .29 + .21 + .32 + .50 = 1.32; its weight in the reweighted network is 1.12 + .19 + .12 + .34 = 1.77; these weights differ by .45 = .81 - .36, the difference of the weights of 0 and 3; and the weights of all paths between 0 and 3 change by this same amount.

Property 21.24 In any network with no negative cycles, pick any vertex s and assign to each vertex v a weight equal to the length of a shortest path to v from s. Reweighting the network with these vertex weights yields nonnegative edge weights for each edge that connects vertices reachable from s.

Proof: Given any edge v-w, the weight of v is the length of a shortest path to v, and the weight of w is the length of a shortest path to w. If v-w is the final edge on a shortest path to w, then the difference between the weight of w and the weight of v is precisely the weight of v-w. In other words, reweighting the edge will give a weight of 0. If the shortest path to w does not go through v, then the weight of v plus the weight of v-w must be greater than or equal to the weight of w. In other words, reweighting the edge will give a nonnegative weight.

Just as we did when we used the Bellman–Ford algorithm to detect negative cycles, we have two ways to proceed to make every edge weight nonnegative in an arbitrary network with no negative cycles. Either we can begin with a source from each strongly connected component, or we can add a dummy vertex with an edge of length 0 to every network vertex. In either case, the result is a shortest-paths spanning forest that we can use to assign weights to every vertex (weight of the path from the root to the vertex in its SPT).

For example, the weight values chosen in Figure 21.31 are precisely the lengths of shortest paths from 4, so the edges in the shortest-paths tree rooted at 4 have weight 0 in the reweighted network.

In summary, we can solve the all-pairs shortest-paths problem in networks that contain negative edge weights but no negative cycles by proceeding as follows:

• Apply the Bellman-Ford algorithm to find a shortest-paths forest in the original network.
• If the algorithm detects a negative cycle, report that fact and terminate.
• Reweight the network from the forest.
• Apply the all-pairs version of Dijkstra’s algorithm to the reweighted network.
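The four steps above can be sketched compactly as follows. This is an illustration under stated assumptions, not the book's implementation: the Arc record and the function name are hypothetical, the dummy-vertex option is simulated by starting every vertex weight at 0, a simple pass-based Bellman–Ford stands in for Program 21.9, and a standard-library priority queue stands in for the book's Dijkstra code. With floating-point weights, rounding could leave a reweighted edge very slightly negative; the sketch ignores that.

```cpp
#include <cassert>
#include <functional>
#include <limits>
#include <queue>
#include <utility>
#include <vector>

struct Arc { int v, w; double wt; };   // hypothetical edge record: v -> w

// Returns false if the network has a negative cycle. Otherwise fills
// dist[s][t] with the shortest-path length from s to t in the ORIGINAL
// network (infinity where t is unreachable from s) and returns true.
bool allPairsByReweighting(int V, std::vector<Arc> arcs,
                           std::vector<std::vector<double>>& dist)
{
    assert(V >= 0);
    const double inf = std::numeric_limits<double>::infinity();

    // Steps 1 and 2: Bellman-Ford from a dummy vertex with 0-length edges
    // to every vertex (all weights start at 0); a successful relaxation
    // on pass V means there is a negative cycle.
    std::vector<double> h(V, 0.0);
    for (int pass = 1; pass <= V; pass++) {
        bool changed = false;
        for (const Arc& a : arcs)
            if (h[a.v] + a.wt < h[a.w]) { h[a.w] = h[a.v] + a.wt; changed = true; }
        if (!changed) break;
        if (pass == V) return false;
    }

    // Step 3: reweight every edge; the results are nonnegative.
    std::vector<std::vector<Arc>> adj(V);
    for (Arc& a : arcs) {
        a.wt = a.wt + h[a.v] - h[a.w];
        adj[a.v].push_back(a);
    }

    // Step 4: Dijkstra's algorithm from each source in the reweighted network.
    typedef std::pair<double, int> Item;            // (distance, vertex)
    dist.assign(V, std::vector<double>(V, inf));
    for (int s = 0; s < V; s++) {
        std::vector<double>& d = dist[s];
        std::priority_queue<Item, std::vector<Item>, std::greater<Item>> pq;
        d[s] = 0.0; pq.push(Item(0.0, s));
        while (!pq.empty()) {
            Item top = pq.top(); pq.pop();
            int v = top.second;
            if (top.first > d[v]) continue;         // stale queue entry
            for (const Arc& a : adj[v])
                if (d[v] + a.wt < d[a.w]) {
                    d[a.w] = d[v] + a.wt;
                    pq.push(Item(d[a.w], a.w));
                }
        }
        // Undo the reweighting: subtract the source weight, add the
        // destination weight.
        for (int t = 0; t < V; t++)
            if (d[t] < inf) d[t] = d[t] - h[s] + h[t];
    }
    return true;
}
```

The arcs vector is taken by value so that the caller's edge weights are left untouched; a version that reweights in place would have to restore the weights afterward.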

Figure 21.32 All shortest paths in a reweighted network

These diagrams depict the SPTs for each vertex in the reverse of the reweighted network from Figure 21.31, as could be computed with Dijkstra’s algorithm to give us shortest paths in the original network in Figure 21.26. The paths are the same as for the network before reweighting, so, as in Figure 21.9, the st vectors in these diagrams are the columns of the paths matrix in Figure 21.26. The wt vectors in this diagram correspond to the columns in the distances matrix, but we have to undo the reweighting for each entry by subtracting the weight of the source vertex and adding the weight of the final vertex in the path (see Figure 21.31). For example, from the third row from the bottom here we can see that the shortest path from 0 to 3 is 0-5-1-4-3 in both networks, and its length is 1.13 in the reweighted network shown here. Consulting Figure 21.31, we can calculate its length in the original network by subtracting the weight of 0 and adding the weight of 3 to get the result 1.13 - .81 + .36 = .68, the entry in row 0 and column 3 of the distances matrix in Figure 21.26. All shortest paths to 4 in this network are of length 0 because we used those paths for reweighting.

After this computation, the paths matrix gives shortest paths in both networks, and the distances matrix gives the path lengths in the reweighted network. This series of steps is sometimes known as Johnson’s algorithm (see reference section).

Property 21.25 With Johnson’s algorithm, we can solve the all-pairs shortest-paths problem in networks that contain no negative cycles in time proportional to VE log_d V, where d = 2 if E < 2V, and d = E/V otherwise.

Proof: See Properties 21.22 through 21.24 and the summary in the previous paragraph. The worst-case bound on the running time is immediate from Properties 21.7 and 21.22.

To implement Johnson’s algorithm, we combine the implementation of Program 21.9, the reweighting code that we gave just before Property 21.23, and the all-pairs shortest-paths implementation of Dijkstra’s algorithm in Program 21.4 (or, for dense graphs, Program 20.6). As noted in the proof of Property 21.22, we have to make appropriate modifications to the Bellman–Ford algorithm for networks that are not strongly connected (see Exercises 21.135 through 21.137). To complete the implementation of the all-pairs shortest-paths interface, we can either compute the true path lengths by subtracting the weight of the start vertex and adding the weight of the destination vertex (undoing the reweighting operation for the paths) when copying the two vectors into the distances and paths matrices in Dijkstra’s algorithm, or we can put that computation in GRAPHdist in the ADT implementation.

For networks with no negative weights, the problem of detecting cycles is easier to solve than is the problem of computing shortest paths from a single source to all other vertices; the latter problem is easier to solve than is the problem of computing shortest paths connecting all pairs of vertices. These facts match our intuition. By contrast, the analogous facts for networks that contain negative weights are counterintuitive: The algorithms that we have discussed in this section show that, for networks that have negative weights, the best known algorithms for these three problems have similar worst-case performance characteristics. For example, it is nearly as difficult, in the worst case, to determine whether a network has a single negative cycle as it is to find all shortest paths in a network of the same size that has no negative cycles.

Exercises

• 21.109 Modify your random-network generators from Exercises 21.6 and 21.7 to generate weights between a and b (where a and b are both between −1 and 1), by rescaling.

• 21.110 Modify your random-network generators from Exercises 21.6 and 21.7 to generate negative weights by negating a fixed percentage (whose value is supplied by the client) of the edge weights.

• 21.111 Develop client programs that use your generators from Exercises 21.109 and 21.110 to produce networks that have a large percentage of negative weights but have at most a few negative cycles, for as large a range of values of V and E as possible.

21.112 Find a currency-conversion table online or in a newspaper. Use it to build an arbitrage table. Note: Avoid tables that are derived (calculated) from a few values and that therefore do not give sufficiently accurate conversion information to be interesting. Extra credit: Make a killing in the money-exchange market!

• 21.113 Build a sequence of arbitrage tables using the source for conversion that you found for Exercise 21.112 (any source publishes different tables periodically). Find all the arbitrage opportunities that you can in the tables, and try to find patterns among them. For example, do opportunities persist day after day, or are they fixed quickly after they arise?

21.114 Develop a model for generating random arbitrage problems. Your goal is to generate tables that are as similar as possible to the tables that you used in Exercise 21.113.

21.115 Develop a model for generating random job-scheduling problems that include deadlines. Your goal is to generate nontrivial problems that are likely to be feasible.

21.116 Modify your interface and implementations from Exercise 21.101 to give clients the ability to pose and solve job-scheduling problems that include deadlines, using a reduction to the shortest-paths problem.

• 21.117 Explain why the following argument is invalid: The shortest-paths problem reduces to the difference-constraints problem by the construction used in the proof of Property 21.15, and the difference-constraints problem reduces trivially to linear programming, so, by Property 21.17, linear programming is NP-hard.

21.118 Does the shortest-paths problem in networks with no negative cycles reduce to the job-scheduling problem with deadlines? (Are the two problems equivalent?) Prove your answer.

• 21.119 Find the lowest-weight cycle (best arbitrage opportunity) in the example shown in Figure 21.27.

• 21.120 Prove that finding the lowest-weight cycle in a network that may have negative edge weights is NP-hard.

• 21.121 Show that Dijkstra’s algorithm does work correctly for a network in which edges that leave the source are the only edges with negative weights.

• 21.122 Develop a class based on Floyd’s algorithm that provides clients with the capability to test networks for the existence of negative cycles.

21.123 Show, in the style of Figure 21.29, the computation of all shortest paths of the network defined in Exercise 21.1, with the weights on edges 5-1 and 4-2 negated, using Floyd’s algorithm.

• 21.124 Is Floyd’s algorithm optimal for complete networks (networks with V^2 edges)? Prove your answer.

21.125 Show, in the style of Figures 21.30 through 21.32, the computation of all shortest paths of the network defined in Exercise 21.1, with the weights on edges 5-1 and 4-2 negated, using the Bellman–Ford algorithm.

• 21.126 Develop a class based on the Bellman–Ford algorithm that provides clients with the capability to test networks for the existence of negative cycles, using the method of starting with a source in each strongly connected component.

• 21.127 Develop a class based on the Bellman–Ford algorithm that provides clients with the capability to test networks for the existence of negative cycles, using a dummy vertex with edges to all the network vertices.

21.128 Give a family of graphs for which Program 21.9 takes time proportional to VE to find negative cycles.

• 21.129 Show the schedule that is computed by Program 21.9 for the job-scheduling-with-deadlines problem in Exercise 21.89.

21.130 Prove that the following generic algorithm solves the single-source shortest-paths problem: “Relax any edge; continue until there are no edges that can be relaxed.”

21.131 Modify the implementation of the Bellman–Ford algorithm in Program 21.9 to use a randomized queue rather than a FIFO queue. (The result of Exercise 21.130 proves that this method is correct.)

• 21.132 Modify the implementation of the Bellman–Ford algorithm in Program 21.9 to use a deque rather than a FIFO queue such that edges are put onto the deque according to the following rule: If the edge has previously been on the deque, put it at the beginning (as in a stack); if it is being encountered for the first time, put it at the end (as in a queue).

21.133 Run empirical studies to compare the performance of the implementations in Exercises 21.131 and 21.132 with Program 21.9, for various general networks (see Exercises 21.109 through 21.111).

• 21.134 Modify the implementation of the Bellman–Ford algorithm in Program 21.9 to implement a function that returns the index of any vertex on any negative cycle or -1 if the network has no negative cycle. When a negative cycle is present, the function should also leave the spt vector such that following links in the vector in the normal way (starting with the return value) traces through the cycle.

• 21.135 Modify the implementation of the Bellman–Ford algorithm in Program 21.9 to set vertex weights as required for Johnson’s algorithm, using the following method. Each time that the queue empties, scan the spt vector to find a vertex whose weight is not yet set and rerun the algorithm with that vertex as source (to set the weights for all vertices in the same strong component as the new source), continuing until all strong components have been processed.

• 21.136 Develop an implementation of the all-pairs shortest-paths ADT interface for sparse networks (based on Johnson’s algorithm) by making appropriate modifications to Programs 21.9 and 21.4.

21.137 Develop an implementation of the all-pairs shortest-paths ADT interface for dense networks (based on Johnson’s algorithm) (see Exercises 21.136 and 21.43). Run empirical studies to compare your implementation with Floyd’s algorithm (Program 21.5), for various general networks (see Exercises 21.109 through 21.111).

• 21.138 Add a member function to your solution to Exercise 21.137 that allows a client to decrease the cost of an edge. Return a flag that indicates whether that action creates a negative cycle. If it does not, update the paths and distances matrices to reflect any new shortest paths. Your function should take time proportional to V^2.

• 21.139 Extend your solution to Exercise 21.138 with member functions that allow clients to insert and delete edges.

• 21.140 Develop an algorithm that breaks the VE barrier for the single-source shortest-paths problem in general networks, for the special case where the weights are known to be bounded in absolute value by a constant.

21.8 Perspective

Table 21.4 summarizes the algorithms that we have discussed in this chapter and gives their worst-case performance characteristics. These algorithms are broadly applicable because, as discussed in Section 21.6, shortest-paths problems are related to a large number of other problems in a specific technical sense that directly leads to efficient algorithms for solving the entire class, or at least indicates such algorithms exist.

Table 21.4 Costs of shortest-paths algorithms

This table summarizes the cost (worst-case running time) of various shortest-paths algorithms considered in this chapter. The worst-case bounds marked as conservative may not be useful in predicting performance on real networks, particularly the Bellman–Ford algorithm, which typically runs in linear time.

The general problem of finding shortest paths in networks where edge weights could be negative is intractable. Shortest-paths problems are a good illustration of the fine line that often separates intractable problems from easy ones, since we have numerous algorithms to solve the various versions of the problem when we restrict the networks to have positive edge weights or to be acyclic, or even when we restrict to subproblems where there are negative edge weights but no negative cycles. Several of the algorithms are optimal or nearly so, although there are significant gaps between the best known lower bound and the best known algorithm for the single-source problem in networks that contain no negative cycles and for the all-pairs problem in networks that contain nonnegative weights.

The algorithms are all based on a small number of abstract operations and can be cast in a general setting. Specifically, the only operations that we perform on edge weights are addition and comparison: any setting in which these operations make sense can serve as the platform for shortest-paths algorithms. As we have noted, this point of view unifies our algorithms for computing the transitive closure of digraphs with our algorithms for finding shortest paths in networks. The difficulty presented by negative edge weights corresponds to a monotonicity property on these abstract operations: If we can ensure that the sum of two weights is never less than either of the weights, then we can use the algorithms in Sections 21.2 through 21.4; if we cannot make such a guarantee, we have to use the algorithms from Section 21.7. Encapsulating these considerations in an ADT is easily done and expands the utility of the algorithms.

Shortest-paths problems put us at a crossroad, between elementary graph-processing algorithms and problems that we cannot solve. They are the first of several other classes of problems with a similar character that we consider, including network-flow problems and linear programming. As in shortest paths, there is a fine line between easy and intractable problems in those areas. Not only are numerous efficient algorithms available when various restrictions are appropriate, but also there are numerous opportunities where better algorithms have yet to be invented and numerous occasions where we are faced with the certainty of NP-hard problems.

Many such problems were studied in detail as OR problems before the advent of computers or computer algorithms. Historically, OR has focused on general mathematical and algorithmic models, whereas computer science has focused on specific algorithmic solutions and basic abstractions that can both admit efficient implementations and help to build general solutions. As models from OR and basic algorithmic abstractions from computer science have both been applied to develop implementations on computers that can solve huge practical problems, the line between OR and computer science has blurred in some areas: For example, researchers in both fields seek efficient solutions to problems such as shortest-paths problems. As we address more difficult problems, we draw on classical methods from both fields of research.

CHAPTER TWENTY-TWO
Network Flow

GRAPHS, DIGRAPHS, AND networks are just mathematical abstractions, but they are useful in practice because they help us to solve numerous important problems. In this chapter, we extend the network problem-solving model to encompass a dynamic situation where we imagine material flowing through the network, with different costs attached to different routes. These extensions allow us to tackle a surprisingly broad variety of problems with a long list of applications.

We see that these problems and applications can be handled within a few natural models that we can relate to one another through reduction. There are several different ways, all of which are technically equivalent, to formulate the basic problems. To implement algorithms that solve them all, we settle on two specific problems, develop efficient algorithms to solve them, then develop algorithms that solve other problems by finding reductions to the known problems.

In real life, we do not always have the freedom of choice that this idealized scenario suggests, because not all pairs of reduction relationships between these problems have been proved, and because few optimal algorithms for solving any of the problems are known. Perhaps no efficient direct solution to a given problem has yet been invented, and perhaps no efficient reduction that directly relates a given pair of problems has yet been devised. The network-flow formulation that we cover in this chapter has been successful not only because simple reductions to it are easy to define for many problems, but also because numerous efficient algorithms for solving the basic network-flow problems have been devised.

Figure 22.1 Distribution problem

In this instance of the distribution problem, we have three supply vertices (0 through 2), four distribution points (3 through 6), three demand vertices (7 through 9), and twelve channels. Each supply vertex has a rate of production; each demand vertex a rate of consumption; and each channel a maximum capacity and a cost per unit distributed. The problem is to minimize costs while distributing material through the channels (without exceeding capacity anywhere) such that the total rate of material leaving each supply vertex equals its rate of production; the total rate at which material arrives at each demand vertex equals its rate of consumption; and the total rate at which material arrives at each distribution point equals the total rate at which material leaves.

The following examples illustrate the range of problems that we can handle with network-flow models, algorithms, and implementations. They fall into general categories known as distribution problems, matching problems, and cut problems, each of which we examine in turn. We indicate several different related problems, rather than lay out specific details in these examples. Later in the chapter, when we undertake to develop and implement algorithms, we give rigorous descriptions of many of the problems mentioned here.

In distribution problems, we are concerned with moving objects from one place to another within a network. Whether we are distributing hamburger and chicken to fast-food outlets or toys and clothes to discount stores along highways throughout the country, or software to computers or bits to display screens along communications networks throughout the world, the essential problems are the same. Distribution problems typify the challenges that we face in managing a large and complex operation. Algorithms to solve them are broadly applicable and are critical in numerous applications.

Merchandise distribution: A company has factories, where goods are produced; distribution centers, where the goods are stored temporarily; and retail outlets, where the goods are sold. The company must distribute the goods from factories through distribution centers to retail outlets on a regular basis, using distribution channels that have varying capacities and unit distribution costs. Is it possible to get the goods from the warehouses to the retail outlets such that supply meets demand everywhere? What is the least-cost way to do so? Figure 22.1 illustrates a distribution problem.

Figure 22.2 illustrates the transportation problem, a special case of the merchandise-distribution problem where we eliminate the distribution centers and the capacities on the channels. This version is important in its own right and is significant (as we see in Section 22.7) not just because of important direct applications but also because it turns out not to be a “special case” at all; indeed, it is equivalent in difficulty to the general version of the problem.

Communications: A communications network has a set of requests to transmit messages between servers that are connected by channels (abstract wires) that are capable of transferring information at varying rates. What is the maximum rate at which information can be transferred between two specified servers in the network? If there are costs associated with the channels, what is the cheapest way to send the information at a given rate that is less than the maximum?

Figure 22.2 Transportation problem

The transportation problem is like the distribution problem, but with no channel-capacity restrictions and no distribution points. In this instance, we have five supply vertices (0 through 4), five demand vertices (5 through 9), and twelve channels. The problem is to find the lowest-cost way to distribute material through the channels such that supply exactly meets demand everywhere. Specifically, we require an assignment of weights (distribution rates) to the edges such that the sum of weights on outgoing edges equals the supply at each supply vertex; the sum of weights on ingoing edges equals the demand at each demand vertex; and the total cost (sum of weight times cost for all edges) is minimized over all such assignments.
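The constraints in this caption are easy to state as a check on a proposed assignment. The sketch below only verifies feasibility and computes the total cost; it does not search for the minimum. The Channel record, the function name, and the vertex-indexed supply and demand vectors are hypothetical choices made for illustration.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical record for one channel: material moves from -> to at the
// given rate, at the given cost per unit distributed.
struct Channel { int from, to; double rate, cost; };

// supply[v] and demand[v] are indexed by vertex (0 where not applicable).
// Returns true if the rates are nonnegative and meet every supply and every
// demand exactly (within a small tolerance). totalCost receives the sum of
// rate times cost over all channels.
bool isFeasibleTransport(const std::vector<double>& supply,
                         const std::vector<double>& demand,
                         const std::vector<Channel>& channels,
                         double& totalCost)
{
    assert(supply.size() == demand.size());
    const double eps = 1e-9;
    std::vector<double> out(supply.size(), 0.0), in(supply.size(), 0.0);
    totalCost = 0.0;
    for (const Channel& c : channels) {
        if (c.rate < 0.0) return false;        // rates cannot be negative
        out[c.from] += c.rate;                 // material leaving c.from
        in[c.to] += c.rate;                    // material arriving at c.to
        totalCost += c.rate * c.cost;
    }
    for (std::size_t v = 0; v < supply.size(); v++)
        if (std::fabs(out[v] - supply[v]) > eps ||
            std::fabs(in[v] - demand[v]) > eps)
            return false;
    return true;
}
```

A solver for the transportation problem would search over all assignments that pass this check for one of minimum totalCost.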

Traffic flow: A city government needs to formulate a plan for evacuating people from the city in an emergency. What is the minimum amount of time that it would take to evacuate the city, if we suppose that we can control traffic flow so as to realize the minimum? Traffic planners also might formulate questions like this when deciding which new roads, bridges, or tunnels might alleviate rush-hour or vacation-weekend traffic problems.

In matching problems, the network represents the possible ways to connect pairs of vertices, and our goal is to choose among the connections (according to a specified criterion) without including any vertex twice. In other words, the chosen set of edges defines a way to pair vertices with one another. We might be matching students to colleges, applicants to jobs, courses to available hours for a school, or members of Congress to committee assignments. In each of these situations, we might imagine a variety of criteria defining the characteristics of the matches sought.

Job placement: A job-placement service arranges interviews for a set of students with a set of companies; these interviews result in a set of job offers. Assuming that an interview followed by a job offer represents mutual interest in the student taking a job at the company, it is in everyone’s best interests to maximize the number of job placements. Figure 22.3 is an example illustrating that this task can be complicated.

Minimum-distance point matching: Given two sets of N points, find the set of N line segments, each with one endpoint from each of the point sets, with minimum total length. One application of this purely geometric problem is in radar tracking systems. Each sweep of the radar gives a set of points that represent planes. We assume that the planes are kept sufficiently well spaced that solving this problem allows us to associate each plane’s position on one sweep to its position on the next, thus giving us the paths of all the planes. Other data-sampling applications can be cast in this framework.

In cut problems, such as the one illustrated in Figure 22.4, we remove edges to cut networks into two or more pieces. Cut problems are directly related to fundamental questions of graph connectivity that we first examined in Chapter 18. In this chapter, we discuss a central theorem that demonstrates a surprising connection between cut and flow problems, substantially expanding the reach of network-flow algorithms.

Figure 22.3 Job placement

Suppose that we have six students, each needing jobs, and six companies, each needing to hire a student. These two lists (one sorted by student, the other sorted by company) give a list of job offers, which indicate mutual interest in matching students and jobs. Is there some way to match students to jobs so that every job is filled and every student gets a job? If not, what is the maximum number of jobs that can be filled?

Network reliability: A simplified model considers a telephone network as consisting of a set of wires that connect telephones through switches such that there is the possibility of a switched path through trunk lines connecting any two given telephones. What is the maximum number of trunk lines that could be cut without any pair of switches being disconnected?

Cutting supply lines: A country at war moves supplies from depots to troops along an interconnected highway system. An enemy can cut off the troops from the supplies by bombing roads, with the number of bombs required to destroy a road proportional to that road’s width. What is the minimum number of bombs that the enemy must drop to ensure that no troops can get supplies?

Each of the applications just cited immediately suggests numerous related questions, and there are still other related models, such as the job-scheduling problems that we considered in Chapter 21. We consider further examples throughout this chapter, yet still treat only a small fraction of the important, directly related practical problems.

The network-flow model that we consider in this chapter is important not just because it provides us with two simply stated problems to which many of the practical problems reduce but also because we have efficient algorithms for solving the two problems. This breadth of applicability has led to the development of numerous algorithms and implementations. The solutions that we consider illustrate the tension between our quest for implementations of general applicability and our quest for efficient solutions to specific problems. The study of network-flow algorithms is fascinating because it brings us tantalizingly close to compact and elegant implementations that achieve both goals.

We consider two particular problems within the network-flow model: the maxflow problem and the mincost flow problem. We see specific relationships among these problem-solving models, the shortest-path model of Chapter 21, the linear-programming (LP) model of Part 8, and numerous specific problem models including some of those just discussed.

Figure 22.4 Cutting supply lines

This diagram represents the roads connecting an army’s supply depot at the top to the troops at the bottom. The black dots represent an enemy bombing plan that would separate troops from supplies. The enemy’s goal is to minimize the cost of bombing (perhaps assuming that the cost of cutting an edge is proportional to its width), and the army’s goal is to design its road network to maximize the enemy’s minimum cost. The same model is useful in improving the reliability of communications networks and many other applications.

At first blush, many of these problems might seem to be completely different from network-flow problems. Determining a given problem’s relationship to known problems is often the most important step in developing a solution to that problem. Moreover, this step is often significant because, as is usual with graph algorithms, we must understand the fine line between trivial and intractable problems before we attempt to develop implementations. The infrastructure of problems and the relationships among the problems that we consider in this chapter provides a helpful context for addressing such issues.

In the rough categorization that we began with in Chapter 17, the algorithms that we examine in this chapter demonstrate that network-flow problems are “easy,” because we have straightforward implementations that are guaranteed to run in time proportional to a polynomial in the size of the network. Other implementations, although not guaranteed to run in polynomial time in the worst case, are compact and elegant and have been proved to solve a broad variety of other practical problems, such as the ones discussed here. We consider them in detail because of their utility. Researchers still seek faster algorithms, in order to enable huge applications and to save costs in critical ones. Ideal optimal algorithms that are guaranteed to be as fast as possible are yet to be discovered for network-flow problems.

On the one hand, some of the problems that we reduce to network-flow problems are known to be easier to solve with specialized algorithms. In principle, we might consider implementing and improving these specialized algorithms. Although that approach is productive in some situations, efficient algorithms for solving many of the problems (other than through reduction to network flow) are not known. Even when specialized algorithms are known, developing implementations that can outperform good network-flow codes can be a significant challenge. Moreover, researchers are still improving network-flow algorithms, and the possibility remains that a good network-flow algorithm might outperform known specialized methods for a given practical problem.

On the other hand, network-flow problems are special cases of the even more general LP problems that we discuss in Part 8. Although we could (and people often do) use an algorithm that solves LP problems to solve network-flow problems, the network-flow algorithms that we consider are simpler and more efficient than are those that solve LP problems. But researchers are still improving LP solvers, and the possibility remains that a good algorithm for LP problems might, when used for practical network-flow problems, outperform all the algorithms that we consider in this chapter.

The classical solutions to network-flow problems are closely related to the other graph algorithms that we have been examining, and we can write surprisingly concise programs that solve them, using the algorithmic tools we have developed. As we have seen in many other situations, good algorithms and data structures can achieve substantial reductions in running times. Development of better implementations of classical generic algorithms is still being studied, and new approaches continue to be discovered.

In Section 22.1 we consider basic properties of flow networks, where we interpret a network’s edge weights as capacities and consider properties of flows, which are a second set of edge weights that satisfy certain natural constraints. Next, we consider the maxflow problem, which is to compute a flow that is best in a specific technical sense. In Sections 22.2 and 22.3, we consider two approaches to solving the maxflow problem, and examine a variety of implementations. Many of the algorithms and data structures that we have considered are directly relevant to the development of efficient solutions of the maxflow problem. We do not yet have the best possible algorithms to solve the maxflow problem, but we consider specific useful implementations. In Section 22.4, to illustrate the reach of the maxflow problem, we consider different formulations, as well as other reductions involving other problems.

Maxflow algorithms and implementations prepare us to discuss an even more important and general model known as the mincost-flow problem, where we assign costs (another set of edge weights) and define flow costs, then look for a solution to the maxflow problem that is of minimal cost. We consider a classic generic solution to the mincost-flow problem known as the cycle-canceling algorithm; then, in Section 22.6, we give a particular implementation of the cycle-canceling algorithm known as the network simplex algorithm. In Section 22.7, we discuss reductions to the mincost-flow problem that encompass, among others, all the applications that we just outlined.

Network-flow algorithms are an appropriate topic to conclude this book for several reasons. They represent a payoff on our investment in learning basic algorithmic tools such as linked lists, priority queues, and general graph-search methods. The graph-processing classes that we have studied lead immediately to compact and efficient class implementations for network-flow problems. These implementations take us to a new level of problem-solving power and are immediately useful in numerous practical applications. Furthermore, studying their applicability and understanding their limitations sets the context for our examination of better algorithms and harder problems, the undertaking of Part 8.

Figure 22.5 Network flow

A flow network is a weighted network where we interpret edge weights as capacities (top). Our objective is to compute a second set of edge weights, bounded by the capacities, which we call the flow. The bottom drawing illustrates our conventions for drawing flow networks. Each edge’s width is proportional to its capacity; the amount of flow in each edge is shaded in gray; the flow is always directed down the page from a single source at the top to a single sink at the bottom; and intersections (such as 1-4 and 2-3 in this example) do not represent vertices unless labeled as such. Except for the source and the sink, flow in is equal to flow out at every vertex: For example, vertex 2 has 2 units of flow coming in (from 0) and 2 units of flow going out (1 unit to 3 and 1 unit to 4).

22.1 Flow Networks

To describe network-flow algorithms, we begin with an idealized physical model in which several of the basic concepts are intuitive. Specifically, we imagine a collection of interconnected oil pipes of varying sizes, with switches controlling the direction of flow at junctions, as in the example illustrated in Figure 22.5. We suppose further that the network has a single source (say, an oil field) and a single sink (say, a large refinery) to which all the pipes ultimately connect. At each vertex, the flowing oil reaches an equilibrium where the amount of oil flowing in is equal to the amount flowing out. We measure both flow and pipe capacity in the same units (say, gallons per second).

If every switch has the property that the total capacity of the ingoing pipes is equal to the total capacity of the outgoing pipes, then there is no problem to solve: We simply fill all pipes to full capacity. Otherwise, not all pipes are full, but oil flows through the network, controlled by switch settings at the junctions, such that the amount of oil flowing into each junction is equal to the amount of oil flowing out. But this local equilibrium at the junctions implies an equilibrium in the network as a whole: We prove in Property 22.1 that the amount of oil flowing into the sink is equal to the amount flowing out of the source. Moreover, as illustrated in Figure 22.6, the switch settings at the junctions of this amount of flow from source to sink have nontrivial effects on the flow through the network. Given these facts, we are interested in the following question: What switch settings will maximize the amount of oil flowing from source to sink?

We can model this situation directly with a network (a weighted digraph, as defined in Chapter 21) that has a single source and a single sink. The edges in the network correspond to the oil pipes, the vertices correspond to the junctions with switches that control how much oil goes into each outgoing edge, and the weights on the edges correspond to the capacity of the pipes. We assume that the edges are directed, specifying that oil can flow in only one direction in each pipe. Each pipe has a certain amount of flow, which is less than or equal to its capacity, and every vertex satisfies the equilibrium condition that the flow in is equal to the flow out.

Figure 22.6 Controlling flow in a network

We might initialize the flow in this network by opening the switches along the path 0-1-3-5, which can handle 2 units of flow (top), and by opening switches along the path 0-2-4-5 to get another 1 unit of flow in the network (center). Asterisks indicate full edges. Since 0-1, 2-4, and 3-5 are full, there is no direct way to get more flow from 0 to 5, but if we change the switch at 1 to redirect enough flow to fill 1-4, we open up enough capacity in 3-5 to allow us to add flow on 0-2-3-5, giving a maxflow for this network (bottom).

This flow-network abstraction is a useful problem-solving model that applies directly to a variety of applications and indirectly to still more. We sometimes appeal to the idea of oil flowing through pipes for intuitive support of basic ideas, but our discussion applies equally well to goods moving through distribution channels and to numerous other situations.

The flow model directly applies to a distribution scenario: We interpret the flow values as rates of flow, so that a flow network describes the flow of goods in a manner precisely analogous to the flow of oil. For example, we can interpret the flow in Figure 22.5 as specifying that we should be sending two items per time unit from 0 to 1 and from 0 to 2, one item per time unit from 1 to 3 and from 1 to 4, and so forth.

Another way to interpret the flow model for a distribution scenario is to interpret flow values as amounts of goods so that a flow network describes a one-time transfer of goods. For example, we can interpret the flow in Figure 22.5 as describing the transfer of four items from 0 to 5 in the following three-step process: First, send two items from 0 to 1 and two items from 0 to 2, leaving two items at each of those vertices. Second, send one item each from 1 to 3, 1 to 4, 2 to 3, and 2 to 4, leaving two items each at 3 and 4. Third, complete the transfer by sending two items from 3 to 5 and two items from 4 to 5.

As with our use of distance in shortest-paths algorithms, we are free to abandon any physical intuition when convenient because all the definitions, properties, and algorithms that we consider are based entirely on an abstract model that does not necessarily obey physical laws. Indeed, a prime reason for our interest in the network-flow model is that it allows us to solve numerous other problems through reduction, as we see in Sections 22.4 and 22.6. Because of this broad applicability, it is worthwhile to consider precise statements of the terms and concepts that we have just informally introduced.

Definition 22.1 We refer to a network with a designated source s and a designated sink t as an st-network.

We use the modifier “designated” here to mean that s does not necessarily have to be a source (vertex with no incoming edges) and t does not necessarily have to be a sink (vertex with no outgoing edges), but that we nonetheless treat them as such, because our discussion (and our algorithms) will ignore edges directed into s and edges directed out of t. To avoid confusion, we use networks with a single source and a single sink in examples; we consider more general situations in Section 22.4. We refer to s and t as “the source” and “the sink,” respectively, in the st-network because those are the roles that they play in the network. We also refer to the other vertices in the network as the internal vertices.

Definition 22.2 A flow network is an st-network with positive edge weights, which we refer to as capacities. A flow in a flow network is a set of nonnegative edge weights, which we refer to as edge flows, satisfying the conditions that no edge’s flow is greater than that edge’s capacity and that the total flow into each internal vertex is equal to the total flow out of that vertex.

We refer to the total flow into a vertex (the sum of the flows on its incoming edges) as the vertex’s inflow and the total flow out of a vertex (the sum of the flows on its outgoing edges) as the vertex’s outflow. By convention, we set the flow on edges into the source and edges out of the sink to zero, and in Property 22.1 we prove that the source’s outflow is always equal to the sink’s inflow, which we refer to as the network’s value. With these definitions, the formal statement of our basic problem is straightforward.

Maximum flow Given an st-network, find a flow such that no other flow from s to t has larger value. For brevity, we refer to such a flow as a maxflow and the problem of finding one in a network as the maxflow problem. In some applications, we might be content to know just the maxflow value, but we generally want to know a flow (edge flow values) that achieves that value.

Variations on the problem immediately come to mind. Can we allow multiple sources and sinks? Should we be able to handle networks with no sources or sinks? Can we allow flow in either direction in the edges? Can we have capacity restrictions for the vertices instead of or in addition to the restrictions for the edges? As is typical with graph algorithms, separating restrictions that are trivial to handle from those that have profound implications can be a challenge. We investigate this challenge and give examples of reducing to maxflow a variety of problems that seem different in character, after we consider algorithms to solve the basic problem, in Sections 22.2 and 22.3.

Figure 22.7 Flow equilibrium

This diagram illustrates the preservation of flow equilibrium when we merge sets of vertices. The two smaller figures represent any two disjoint sets of vertices, and the letters represent flow in sets of edges as indicated: A is the amount of flow into the set on the left from outside the set on the right, x is the amount of flow into the set on the left from the set on the right, and so forth. Now, if we have flow equilibrium in the two sets, then we must have

A + x = B + y

for the set on the left and

C + y = D + x

for the set on the right. Adding these two equations and canceling the x + y terms, we conclude that

A + C = B + D,

or inflow is equal to outflow for the union of the two sets.

The characteristic property of flows is the local equilibrium condition that inflow be equal to outflow at each internal vertex. There is no such constraint on capacities; indeed, the imbalance between total capacity of incoming edges and total capacity of outgoing edges is what characterizes the maxflow problem. The equilibrium constraint has to hold at each and every internal vertex, and it turns out that this local property determines global movement through the network, as well. Although this idea is intuitive, it needs to be proved.

Property 22.1 Any st-flow has the property that outflow from s is equal to the inflow to t.

Proof: (We use the term st-flow to mean “flow in an st-network.”) Augment the network with an edge from a dummy vertex into s, with flow and capacity equal to the outflow from s, and with an edge from t to another dummy vertex, with flow and capacity equal to the inflow to t. Then, we can prove a more general property by induction: Inflow is equal to outflow for any set of vertices (not including the dummy vertices).

This property is true for any single vertex, by local equilibrium. Now, assume that it is true for a given set of vertices S and that we add a single vertex v to make the set S′ = S ∪ {v}. To compute inflow and outflow for S′, note that each edge from v to some vertex in S reduces outflow (from v) by the same amount as it reduces inflow (to S); each edge to v from some vertex in S reduces inflow (to v) by the same amount as it reduces outflow (from S); and all other edges provide inflow or outflow for S′ if and only if they do so for S or v. Thus, inflow and outflow are equal for S′, and the value of the flow is equal to the sum of the values of the flows of v and S minus the sum of the flows on the edges connecting v to a vertex in S (in either direction).

Applying this property to the set of all the network’s vertices, we find that the source’s inflow from its associated dummy vertex (which is equal to the source’s outflow) is equal to the sink’s outflow to its associated dummy vertex (which is equal to the sink’s inflow).

Figure 22.8 Cycle flow representation

This figure demonstrates that the circulation at left decomposes into the four cycles 1-3-5-4-1, 0-1-3-5-4-2-0, 1-3-5-4-2-1, 3-5-4-3, with weights 2, 1, 1, and 3, respectively. Each cycle’s edges appear in its respective column, and summing each edge’s weight from each cycle in which it appears (across its respective row) gives its weight in the circulation.

Corollary The value of the flow for the union of two sets of vertices is equal to the sum of the values of the flows for the two sets minus the sum of the weights of the edges that connect a vertex in one to a vertex in the other.

Proof: The proof just given for a set S and a vertex v still works if we replace v by a set T (which is disjoint from S) in the proof. An example of this property is illustrated in Figure 22.7.

We can dispense with the dummy vertices in the proof of Property 22.1, augment any flow network with an edge from t to s with flow and capacity equal to the network’s value, and know that inflow is equal to outflow for any set of nodes in the augmented network. Such a flow is called a circulation, and this construction demonstrates that the maxflow problem reduces to the problem of finding a circulation that maximizes the flow along a given edge. This formulation simplifies our discussion in some situations. For example, it leads to an interesting alternate representation of flows as a set of cycles, as illustrated in Figure 22.8.

Given a set of cycles and a flow value for each cycle, it is easy to compute the corresponding circulation by following through each cycle and adding the indicated flow value to each edge. The converse property is more surprising: We can find a set of cycles (with a flow value for each) that is equivalent to any given circulation.

Property 22.2 (Flow decomposition theorem) Any circulation can be represented as flow along a set of at most E directed cycles.

Proof: A simple algorithm establishes this result. Iterate the following process as long as there is any edge that has flow: Starting with any edge that has flow, follow any edge leaving that edge’s destination vertex that has flow and continue until encountering a vertex that has already been visited (a cycle has been detected). Go back around the cycle to find an edge with minimal flow; then reduce the flow on every edge in the cycle by that amount. Each iteration of this process reduces the flow on at least one edge to 0, so there are at most E cycles.
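The process described in the proof of Property 22.2 translates almost directly into code. The following is a minimal sketch under stated assumptions, not an implementation from this book: it uses a bare edge list rather than the GRAPH class, and the names Edge, Cycle, decompose, sumCycles, and exampleCirculation are hypothetical, introduced only for illustration. The sample data is the circulation obtained by adding up the four cycles listed in the caption of Figure 22.8.

```cpp
#include <algorithm>
#include <cstddef>
#include <stdexcept>
#include <vector>

struct Edge { int v, w, flow; };            // directed edge v-w carrying flow

struct Cycle {
    std::vector<int> edgeIds;               // indices into the edge list
    int amount;                             // flow carried around the cycle
};

// Decompose a circulation on V vertices into directed cycles.
// Works on a copy of the edges, so the caller's data is unchanged.
std::vector<Cycle> decompose(std::vector<Edge> edges, int V) {
    std::vector<Cycle> cycles;
    const int E = static_cast<int>(edges.size());
    for (;;) {
        int start = -1;                     // any edge that still has flow
        for (int i = 0; i < E; ++i)
            if (edges[i].flow > 0) { start = i; break; }
        if (start < 0) break;               // no flow left: done

        std::vector<int> path;              // edges followed so far
        std::vector<int> posOf(V, -1);      // position in path of the edge leaving x
        int x = edges[start].v;
        while (posOf[x] < 0) {              // stop when x is seen a second time
            int next = -1;
            for (int i = 0; i < E; ++i)
                if (edges[i].v == x && edges[i].flow > 0) { next = i; break; }
            if (next < 0)                   // equilibrium violated at x
                throw std::invalid_argument("input is not a circulation");
            posOf[x] = static_cast<int>(path.size());
            path.push_back(next);
            x = edges[next].w;
        }
        Cycle c;                            // the cycle is the tail of the path
        c.edgeIds.assign(path.begin() + posOf[x], path.end());
        c.amount = edges[c.edgeIds[0]].flow;
        for (int id : c.edgeIds) c.amount = std::min(c.amount, edges[id].flow);
        for (int id : c.edgeIds) edges[id].flow -= c.amount;   // empties >= 1 edge
        cycles.push_back(c);
    }
    return cycles;
}

// Add the cycles back up: total[i] is the flow they put on edge i.
std::vector<int> sumCycles(const std::vector<Cycle>& cycles, std::size_t E) {
    std::vector<int> total(E, 0);
    for (const Cycle& c : cycles)
        for (int id : c.edgeIds) total[id] += c.amount;
    return total;
}

// Sum of the cycles 1-3-5-4-1 (2), 0-1-3-5-4-2-0 (1), 1-3-5-4-2-1 (1), 3-5-4-3 (3).
std::vector<Edge> exampleCirculation() {
    return { {0,1,1}, {1,3,4}, {3,5,7}, {5,4,7}, {4,1,2},
             {4,2,2}, {2,0,1}, {2,1,1}, {4,3,3} };
}
```

Calling decompose(exampleCirculation(), 6) on this input happens to recover four cycles with weights 2, 1, 1, and 3, but a different rule for choosing the next edge could produce a different, equally valid set of cycles. The sketch scans the whole edge list at each step for simplicity; an adjacency-lists representation would avoid that cost.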

Figure 22.9 Cycle flow decomposition process

To decompose any circulation into a set of cycles, we iterate the following process: Follow any path until encountering a node for the second time, then find the minimum weight on the indicated cycle, then subtract that weight from each edge on the cycle and remove any edge whose weight becomes 0. For example, the first iteration is to follow the path 0-1-3-5-4-1 to find the cycle 1-3-5-4-1, then subtract 1 from the weights of each of the edges on the cycle, which causes us to remove 4-1 because its weight becomes 0. In the second iteration, we remove 0-1 and 2-0; in the third iteration, we remove 1-3, 4-2, and 2-1; and in the fourth iteration, we remove 3-5, 5-4, and 4-3.

Figure 22.9 illustrates the process described in the proof. For st-flows, applying this property to the circulation created by the addition of an edge from t to s gives the result that any st-flow can be represented as flow along a set of at most E directed paths, each of which is either a path from s to t or a cycle.

Corollary Any st-network has a maxflow such that the subgraph induced by nonzero flow values is acyclic.

Proof: Cycles that do not contain t-s do not contribute to the value of the flow, so we can change the flow to 0 along any such cycle without changing the value of the flow.

Corollary Any st-network has a maxflow that can be represented as flow along a set of at most E directed paths from s to t.

Proof: Immediate.

This representation provides a useful insight into the nature of flows that is helpful in the design and analysis of maxflow algorithms.

On the one hand, we might consider a more general formulation of the maxflow problem where we allow for multiple sources and sinks. Doing so would allow our algorithms to be used for a broader range of applications. On the other hand, we might consider special cases, such as restricting attention to acyclic networks. Doing so might make the problem easier to solve. In fact, as we see in Section 22.4, these variants are equivalent in difficulty to the version that we are considering. Therefore, in the first case, we can adapt our algorithms and implementations to the broader range of applications; in the second case, we cannot expect an easier solution. In our figures, we use acyclic networks because the examples are easier to understand when they have an implicit flow direction (down the page), but our implementations allow networks with cycles.

To implement maxflow algorithms, we use the GRAPH class of Chapter 20, but with pointers to a more sophisticated EDGE class. Instead of the single weight that we used in Chapters 20 and 21, we use pcap and pflow private data members (with cap() and flow() public member functions that return their values) for capacity and flow, respectively. Even though networks are directed graphs, our algorithms need to traverse edges in both directions, so we use the undirected graph representation from Chapter 20 and the member function from to distinguish u-v from v-u.

This approach allows us to separate the abstraction needed by our algorithms (edges going in both directions) from the client’s concrete data structure and leaves a simple goal for our algorithms: Assign values to the flow data members in the client’s edges that maximize flow through the network. Indeed, a critical component of our implementations involves a changing network abstraction that is dependent on flow values and implemented with EDGE member functions. We will consider an EDGE implementation (Program 22.2) in Section 22.2.

Since flow networks are typically sparse, we use an adjacency-lists-based GRAPH representation like the SparseMultiGRAPH implementation of Program 20.5. More important, typical flow networks may have multiple edges (of varying capacities) connecting two given vertices. This situation requires no special treatment with SparseMultiGRAPH, but with an adjacency-matrix–based representation, clients have to collapse such edges into a single edge.

Program 22.1 Flow check and value computation

A call to flow(G, v) computes the difference between v’s ingoing and outgoing flows in G. A call to flow(G, s, t) checks the network flow values from the source (s) to the sink (t), returning 0 if ingoing flow is not equal to outgoing flow at some internal node or if some flow value is negative; the flow value otherwise.

In the network representations of Chapters 20 and 21, we used the convention that weights are real numbers between 0 and 1. In this chapter, we assume that the weights (capacities and flows) are all m-bit integers (between 0 and 2^m − 1). We do so for two primary reasons. First, we frequently need to test for equality among linear combinations of weights, and doing so can be inconvenient in floating-point representations. Second, the running times of our algorithms can depend on the relative values of the weights, and the parameter M = 2^m gives us a convenient way to bound weight values. For example, the ratio of the largest weight to the smallest nonzero weight is less than M. The use of integer weights is but one of many possible alternatives (see, for example, Exercise 20.8) that we could choose to address these problems.

We sometimes refer to edges as having infinite capacity, or, equivalently, as being uncapacitated. That might mean that we do not compare flow against capacity for such edges, or we might use a sentinel value that is guaranteed to be larger than any flow value.
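To make the preceding description concrete, here is a minimal sketch of an edge type with integer capacity and flow. It is not the EDGE class of Program 22.2, which we consider in Section 22.2: only the names pcap, pflow, cap(), flow(), and from() come from the text, and everything else (the class name FlowEdge, the constant kInfiniteCap, and the members other(), unused(), uncapacitated(), and setFlow()) is a hypothetical choice made for illustration. The sketch takes the sentinel approach to uncapacitated edges.

```cpp
#include <cassert>
#include <limits>

// Assumed sentinel for "infinite" capacity: larger than any flow value
// that an int-valued flow can take in this sketch.
constexpr int kInfiniteCap = std::numeric_limits<int>::max();

// Hypothetical flow-network edge directed from pv to pw.
class FlowEdge {
    int pv, pw;        // endpoints: the edge is pv-pw
    int pcap, pflow;   // capacity and current flow
public:
    FlowEdge(int v, int w, int cap) : pv(v), pw(w), pcap(cap), pflow(0) {
        assert(cap >= 0);
    }
    int v() const { return pv; }
    int w() const { return pw; }
    int cap() const { return pcap; }
    int flow() const { return pflow; }

    // True if the edge leaves x; this is how an algorithm that walks
    // edges in both directions tells u-v from v-u.
    bool from(int x) const { return pv == x; }

    // The endpoint that is not x.
    int other(int x) const { return from(x) ? pw : pv; }

    bool uncapacitated() const { return pcap == kInfiniteCap; }

    // Unused capacity; an uncapacitated edge always reports the sentinel.
    int unused() const { return uncapacitated() ? kInfiniteCap : pcap - pflow; }

    // Set the flow, enforcing 0 <= flow <= capacity.
    void setFlow(int f) {
        assert(f >= 0 && f <= pcap);
        pflow = f;
    }
};
```

For example, after FlowEdge e(1, 3, 3) and e.setFlow(2), the call e.unused() yields 1, while an edge constructed with capacity kInfiniteCap reports kInfiniteCap no matter how much flow it carries.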

Figure 22.10 Flow network for exercises

This flow network is the subject of several exercises throughout the chapter.

Program 22.1 is a client function that checks whether a flow satisfies the equilibrium condition at every node and returns that flow’s value if the flow does. Typically, we might include a call to this function as the final action of a maxflow algorithm. Despite our confidence as mathematicians in Property 22.1, our paranoia as programmers dictates that we also check that the flow out of the source is equal to the flow into the sink. It might also be prudent to check that no edge’s flow exceeds that edge’s capacity and that the data structures are internally consistent (see Exercise 22.12).
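The checks just described can be sketched in a few lines. What follows is not the code of Program 22.1: it is a simplified stand-in that works on a plain edge list, and the names Edge, netOutflow, checkFlow, and exampleFlow are hypothetical. Two further assumptions are worth stating. First, the sketch returns -1 (rather than 0) for an invalid flow, so that a valid flow of value 0 can be told apart from a failed check. Second, the capacities in exampleFlow are assumed values chosen to be consistent with the descriptions of Figures 22.5 and 22.6; the figures themselves are not reproduced here.

```cpp
#include <vector>

struct Edge { int v, w, cap, flow; };   // directed edge v-w

// Outflow minus inflow at vertex x.
int netOutflow(const std::vector<Edge>& edges, int x) {
    int net = 0;
    for (const Edge& e : edges) {
        if (e.v == x) net += e.flow;    // e leaves x
        if (e.w == x) net -= e.flow;    // e enters x
    }
    return net;
}

// Returns the value of the flow, or -1 if the edge flows do not form
// a valid st-flow on a network with vertices 0..V-1.
int checkFlow(const std::vector<Edge>& edges, int V, int s, int t) {
    for (const Edge& e : edges)         // 0 <= flow <= capacity on every edge
        if (e.flow < 0 || e.flow > e.cap) return -1;
    for (int x = 0; x < V; ++x)         // equilibrium at every internal vertex
        if (x != s && x != t && netOutflow(edges, x) != 0) return -1;
    int value = netOutflow(edges, s);
    if (value < 0) return -1;
    // Property 22.1 says this cannot fail; we check anyway.
    if (value + netOutflow(edges, t) != 0) return -1;
    return value;
}

// A flow of value 4 on a 6-vertex network (capacities are assumed values).
std::vector<Edge> exampleFlow() {
    return { {0,1,2,2}, {0,2,3,2}, {1,3,3,1}, {1,4,1,1},
             {2,3,1,1}, {2,4,1,1}, {3,5,2,2}, {4,5,3,2} };
}
```

With these values, checkFlow(exampleFlow(), 6, 0, 5) returns 4. Raising the flow in 1-3 from 1 to 2 without changing any other edge breaks equilibrium at vertices 1 and 3, and the check then returns -1.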

Exercises

• 22.1 Find two different maxflows in the flow network shown in Figure 22.10.

22.2 Under our assumption that capacities are positive integers less than M, what is the maximum possible flow value for any st-network with V vertices and E edges? Give two answers, depending on whether or not parallel edges are allowed.

• 22.3 Give an algorithm to solve the maxflow problem for the case that the network forms a tree if the sink is removed.

• 22.4 Give a family of networks with E edges having circulations where the process described in the proof of Property 22.2 produces E cycles.

22.5 Write an EDGE class that represents capacities and flows as real numbers between 0 and 1 that are expressed with d digits to the right of the decimal point, where d is a fixed constant.

• 22.6 Write a program that builds a flow network by reading edges (pairs of integers between 0 and V − 1) with integer capacities from standard input. Assume that the capacity upper bound M is less than 2^20.

22.7 Extend your solution to Exercise 22.6 to use symbolic names instead of integers to refer to vertices (see Program 17.10).

• 22.8 Find a large network online that you can use as a vehicle for testing flow algorithms on realistic data. Possibilities include transportation networks (road, rail, or air), communications networks (telephone or computer connections), or distribution networks. If capacities are not available, devise a reasonable model to add them. Write a program that uses the interface of Program 22.2 to implement flow networks from your data, perhaps using your solution to Exercise 22.7. If warranted, develop additional private functions to clean up the data, as described in Exercises 17.33–35.

22.9 Write a random-network generator for sparse networks with capacities between 0 and 2^20, based on Program 17.7. Use a separate class for capacities and develop two implementations: one that generates uniformly distributed capacities and another that generates capacities according to a Gaussian distribution. Implement client programs that generate random networks for both weight distributions with a well-chosen set of values of V and E, so that you can use them to run empirical tests on graphs drawn from various distributions of edge weights.

22.10 Write a random-network generator for dense networks with capacities between 0 and 2^20, based on Program 17.8 and edge-capacity generators as described in Exercise 22.9. Write client programs to generate random networks for both weight distributions with a well-chosen set of values of V and E, so that you can use them to run empirical tests on graphs drawn from these models.

• 22.11 Write a program that generates V random points in the plane, then builds a flow network with edges (in both directions) connecting all pairs of points within a given distance d of each other (see Program 3.20), setting each edge’s capacity using one of the random models described in Exercise 22.9. Determine how to set d so that the expected number of edges is E.

• 22.12 Modify Program 22.1 to also check that flow is less than capacity for all edges.

• 22.13 Find all the maxflows in the network depicted in Figure 22.11. Give cycle representations for each of them.

22.14 Write a function that reads values and cycles (one per line, in the format illustrated in Figure 22.8) and builds a network having the corresponding flow.

22.15 Write a client function that finds the cycle representation of a network’s flow using the method described in the proof of Property 22.2 and prints values and cycles (one per line, in the format illustrated in Figure 22.8).

• 22.16 Write a function that removes cycles from a network’s st-flow.

• 22.17 Write a program that assigns integer flows to each edge in any given digraph that contains no sinks and no sources such that the digraph is a flow network that is a circulation.

• 22.18 Suppose that a flow represents goods to be transferred by trucks between cities, with the flow on edge u-v representing the amount to be taken from city u to v in a given day. Write a client function that prints out daily orders for truckers, telling them how much and where to pick up and how much and where to drop off. Assume that there are no limits on the supply of truckers and that nothing leaves a given distribution point until everything has arrived.

Figure 22.11 Flow network with cycle

This flow network is like the one depicted in Figure 22.10, but with the direction of two of the edges reversed, so there are two cycles. It is also the subject of several exercises throughout the chapter.

22.2 Augmenting-Path Maxflow Algorithms

An effective approach to solving maxflow problems was developed by L. R. Ford and D. R. Fulkerson in 1962. It is a generic method for increasing flows incrementally along paths from source to sink that serves as the basis for a family of algorithms. It is known as the Ford–Fulkerson method in the classical literature; the more descriptive term augmenting-path method is also widely used.

Consider any directed path (not necessarily a simple one) from source to sink through an st-network. Let x be the minimum of the unused capacities of the edges on the path. We can increase the network’s flow value by at least x, by increasing the flow in all edges on the path by that amount. Iterating this action, we get a first attempt at computing flow in a network: Find another path, increase the flow along that path, and continue until all paths from source to sink have at least one full edge (so that we can no longer increase flow in this way). This algorithm will compute the maxflow in some cases, but will fall short in other cases. Figure 22.6 illustrates a case where it fails.

To improve the algorithm such that it always finds a maxflow, we consider a more general way to increase the flow, along any path from source to sink through the network’s underlying undirected graph. The edges on any such path are either forward edges, which go with the flow (when we traverse the path from source to sink, we traverse the edge from its source vertex to its destination vertex) or backward edges, which go against the flow (when we traverse the path from source to sink, we traverse the edge from its destination vertex to its source vertex). Now, for any path with no full forward edges and no empty backward edges, we can increase the amount of flow in the network by increasing flow in forward edges and decreasing flow in backward edges. The amount by which the flow can be increased is limited by the minimum of the unused capacities in the forward edges and the flows in the backward edges. Figure 22.12 depicts an example. In the new flow, at least one of the forward edges along the path becomes full or at least one of the backward edges along the path becomes empty.

Figure 22.12 Augmenting flow along a path

This sequence shows the process of increasing flow in a network along a path of forward and backward edges. Starting with the flow depicted at the left and reading from left to right, we increase the flow in 0-2 and then 2-3 (additional flow is shown in black). Then we decrease the flow in 1-3 (shown in white) and divert it to 1-4 and then 4-5, resulting in the flow at the right.

Figure 22.13 Augmenting-path sequences

In these three examples, we augment a flow along different sequences of augmenting paths until no augmenting path can be found. The flow that results in each case is a maximum flow. The key classical theorem in the theory of network flows states that we get a maximum flow in any network, no matter what sequence of paths we use (see Property 22.5).

The process just sketched is the basis for the classical Ford–Fulkerson maxflow algorithm (augmenting-path method). We summarize it as follows:

Start with zero flow everywhere. Increase the flow along any path from source to sink with no full forward edges or empty backward edges, continuing until there are no such paths in the network.
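The summary above leaves open how each path is found. As one illustration, the sketch below uses breadth-first search to find a path with no full forward edges and no empty backward edges; that choice, like the names Edge, residual, maxflow, and exampleNetwork, is an assumption made for this example and is only one member of the family of algorithms that the generic method covers. The capacities in exampleNetwork are assumed values chosen to be consistent with the descriptions of Figures 22.5 and 22.6.

```cpp
#include <algorithm>
#include <limits>
#include <queue>
#include <vector>

struct Edge { int v, w, cap, flow; };       // directed edge v-w

// Amount by which flow can change on e when we traverse it out of x:
// unused capacity for a forward edge, current flow for a backward edge.
int residual(const Edge& e, int x) {
    return e.v == x ? e.cap - e.flow : e.flow;
}

// Generic augmenting-path method with BFS path selection (an assumed choice).
// Overwrites the flow fields and returns the value of the resulting flow.
int maxflow(std::vector<Edge>& edges, int V, int s, int t) {
    std::vector<std::vector<int>> adj(V);   // each edge is listed at both ends
    for (int i = 0; i < static_cast<int>(edges.size()); ++i) {
        edges[i].flow = 0;                  // start with zero flow everywhere
        adj[edges[i].v].push_back(i);
        adj[edges[i].w].push_back(i);
    }
    int value = 0;
    for (;;) {
        std::vector<int> via(V, -1);        // edge used to reach each vertex
        std::vector<int> pred(V, -1);       // vertex we came from
        std::vector<bool> seen(V, false);
        std::queue<int> q;
        q.push(s);
        seen[s] = true;
        while (!q.empty() && !seen[t]) {
            int x = q.front();
            q.pop();
            for (int id : adj[x]) {
                const Edge& e = edges[id];
                int y = (e.v == x) ? e.w : e.v;
                if (!seen[y] && residual(e, x) > 0) {
                    seen[y] = true;
                    via[y] = id;
                    pred[y] = x;
                    q.push(y);
                }
            }
        }
        if (!seen[t]) break;                // no augmenting path remains

        int d = std::numeric_limits<int>::max();
        for (int y = t; y != s; y = pred[y])        // bottleneck on the path
            d = std::min(d, residual(edges[via[y]], pred[y]));
        for (int y = t; y != s; y = pred[y]) {
            Edge& e = edges[via[y]];
            if (e.v == pred[y]) e.flow += d;        // forward edge: add flow
            else                e.flow -= d;        // backward edge: remove flow
        }
        value += d;
    }
    return value;
}

// Six-vertex network with source 0 and sink 5 (capacities are assumed values).
std::vector<Edge> exampleNetwork() {
    return { {0,1,2,0}, {0,2,3,0}, {1,3,3,0}, {1,4,1,0},
             {2,3,1,0}, {2,4,1,0}, {3,5,2,0}, {4,5,3,0} };
}
```

On this network the sketch augments along 0-1-3-5 (2 units), then 0-2-4-5 (1 unit), and finally along 0-2-3-1-4-5 (1 unit), where 1-3 is traversed as a backward edge and loses 1 unit of flow, for a total value of 4. The third augmentation is the kind of step that the forward-only first attempt cannot make.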

Remarkably, this method always finds a maxflow, no matter how we choose the paths. Like the MST method discussed in Section 20.1 and the Bellman–Ford shortest-paths method discussed in Section 21.7, it is a generic algorithm that is useful because it establishes the correctness of a whole family of more specific algorithms. We are free to use any method whatever to choose the path.

Figure 22.13 illustrates several different sequences of augmenting paths that all lead to a maxflow for a sample network. Later in this section, we examine several algorithms that compute sequences of augmenting paths, all of which lead to a maxflow. The algorithms differ in the number of augmenting paths they compute, the lengths of the paths, and the costs of finding each path, but they all implement the Ford–Fulkerson algorithm and find a maxflow.

To show that any flow computed by any implementation of the Ford–Fulkerson algorithm indeed has maximal value, we show that this fact is equivalent to a key fact known as the maxflow–mincut theorem. Understanding this theorem is a crucial step in understanding network-flow algorithms. As suggested by its name, the theorem is based on a direct relationship between flows and cuts in networks, so we begin by defining terms that relate to cuts.

Recall from Section 20.1 that a cut in a graph is a partition of the vertices into two disjoint sets, and a crossing edge is an edge that connects a vertex in one set to a vertex in the other set. For flow networks, we refine these definitions as follows (see Figure 22.14).

Definition 22.3 An st-cut is a cut that places vertex s in one of its sets and vertex t in the other.

Each crossing edge corresponding to an st-cut is either an st-edge that goes from a vertex in the set containing s to a vertex in the set containing t, or a ts-edge that goes in the other direction. We sometimes refer to the set of crossing edges as a cut set. The capacity of an st-cut in a flow network is the sum of the capacities of that cut’s st-edges, and the flow across an st-cut is the difference between the sum of the flows in that cut’s st-edges and the sum of the flows in that cut’s ts-edges.

Removing a cut set divides a connected graph into two connected components, leaving no path connecting any vertex in one to any vertex in the other. Removing all the edges in an st-cut of a network leaves no path connecting s to t in the underlying undirected graph, but adding any one of them back could create such a path.

Cuts are the appropriate abstraction for the application mentioned at the beginning of the chapter where a flow network describes the movement of supplies from a depot to the troops of an army. To cut off supplies completely and in the most economical manner, an enemy might solve the following problem.

Figure 22.14 st-cut terminology

An st-network has one source s and one sink t. An st-cut is a partition of the vertices into a set containing s (white) and another set containing t (black). The edges that connect a vertex in one set with a vertex in the other (highlighted in gray) are known as a cut set. A forward edge goes from a vertex in the set containing s to a vertex in the set containing t; a backward edge goes the other way. There are four forward edges and two backward edges in the cut set shown here.

MinimumcutGivenanst-network, findanst-cutsuch that thecapacityofnoothercut issmaller.Forbrevity,werefer tosuchacutasamincut, and to the

problemoffindingoneinanetworkasthemincutproblem.The mincut problem is a generalization of the connectivity problems that wediscussedbriefly inSection18.6.Weanalyzespecific relationships indetail inSection22.4.The statement of themincut problem includesnomentionof flows, and thesedefinitionsmight seem to digress from our discussion of the augmenting-pathalgorithm.Onthesurface,computingamincut(asetofedges)seemseasierthancomputing a maxflow (an assignment of weights to all the edges). On thecontrary,thekeyfactofthischapteristhatthemaxflowandmincutproblemsareintimatelyrelated.Theaugmenting-pathmethoditself, inconjunctionwith twofactsaboutflowsandcuts,providesaproof.Property22.3Foranyst-flow,theflowacrosseachst-cutisequaltothevalueoftheflow.Proof: This property is an immediate consequence of the generalization ofProperty22.1thatwediscussedintheassociatedproof(seeFigure22.7).Addanedge t-swith flow equal to the value of the flow such that inflow is equal tooutflowforanysetofvertices.Then,foranyst-cutwhereCs is thevertexsetcontainingsandCtisthevertexsetcontainingt,theinflowtoCsistheinflowto s (the value of the flow) plus the sum of the flows in the backward edgesacrossthecut;andtheoutflowfromCs isthesumoftheflowsintheforwardedges across the cut. Setting these twoquantities equal establishes the desiredresult.Property22.4Nost-flow’svaluecanexceedthecapacityofanyst-cut.Proof:Theflowacrossacutcertainlycannotexceedthatcut’scapacity,sothisresultisimmediatefromProperty22.3.In other words, cuts represent bottlenecks in networks. In our militaryapplication,anenemythatisnotabletocutoffarmytroops

Figure 22.15 All st-cuts

This list gives, for all the st-cuts of the network at left, the vertices in the set containing s, the vertices in the set containing t, forward edges, backward edges, and capacity (sum of capacities of the forward edges). For any flow, the flow across all the cuts (flow in forward edges minus flow in backward edges) is the same. For example, for the flow in the network at left, the flow across the cut separating 0 1 3 and 2 4 5 is 2 + 1 + 2 (the flow in 0-2, 1-4, and 3-5, respectively) minus 1 (the flow in 2-3), or 4. This calculation also results in the value 4 for every other cut in the network, and the flow is a maximum flow because its value is equal to the capacity of the minimum cut (see Property 22.5). There are two minimum cuts in this network.
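The cut arithmetic described in this caption can be checked mechanically. The following is a minimal sketch, not code from the book: the names FlowEdge, CutSummary, summarizeCut, and sampleNetwork are hypothetical, and the capacities in sampleNetwork are assumed for illustration (chosen to agree with the flows quoted in the caption, since the figure itself is not reproduced here).

```cpp
#include <cassert>
#include <vector>

// One directed edge v->w with its capacity and current flow.
struct FlowEdge { int v, w, cap, flow; };

// Flow across a cut and the capacity of that cut.
struct CutSummary { int flowAcross; int capacity; };

// inS[x] is true when vertex x lies on the source side of the cut.
CutSummary summarizeCut(const std::vector<FlowEdge>& edges,
                        const std::vector<bool>& inS)
{
    CutSummary result{0, 0};
    for (const FlowEdge& e : edges) {
        assert(0 <= e.flow && e.flow <= e.cap);
        if (inS[e.v] && !inS[e.w]) {          // forward edge: counts toward both
            result.flowAcross += e.flow;
            result.capacity += e.cap;
        } else if (!inS[e.v] && inS[e.w]) {   // backward edge: subtract its flow
            result.flowAcross -= e.flow;
        }
    }
    return result;
}

// Hypothetical six-vertex network (source 0, sink 5) carrying a flow of value 4.
// Capacities are assumptions made for this sketch.
std::vector<FlowEdge> sampleNetwork()
{
    return { {0, 1, 2, 2}, {0, 2, 3, 2}, {1, 3, 3, 1}, {1, 4, 1, 1},
             {2, 3, 1, 1}, {2, 4, 1, 1}, {3, 5, 2, 2}, {4, 5, 3, 2} };
}
```

For the cut separating 0 1 3 from 2 4 5, summarizeCut adds the flows in 0-2, 1-4, and 3-5 and subtracts the flow in 2-3, as in the caption; applying it to any other st-cut of the same flow gives the same flow value, which is what Property 22.3 asserts.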

completely from their supplies could still be sure that supply flow is restricted to at most the capacity of any given cut. We certainly might imagine that the cost of making a cut is proportional to its capacity in this application, thus motivating the invading army to find a solution to the mincut problem. More important, these facts also imply, in particular, that no flow can have value higher than the capacity of any minimum cut.

Property 22.5 (Maxflow–mincut theorem) The maximum value among all st-flows in a network is equal to the minimum capacity among all st-cuts.

Proof: It suffices to exhibit a flow and a cut such that the value of the flow is equal to the capacity of the cut. The flow has to be a maxflow because no other flow value can exceed the capacity of the cut and the cut has to be a minimum cut because no other cut capacity can be lower than the value of the flow (by Property 22.4). The Ford–Fulkerson algorithm gives precisely such a flow and cut: When the algorithm terminates, identify the first full forward or empty backward edge on every path from s to t in the graph. Let C_s be the set of all vertices that can be reached from s with an undirected path that does not contain a full forward or empty backward edge, and let C_t be the remaining vertices. Then, t must be in C_t, so (C_s, C_t) is an st-cut, whose cut set consists entirely of full forward or empty backward edges. The flow across this cut is equal to the cut’s capacity (since forward edges are full and the backward edges are empty) and also to the value of the network flow (by Property 22.3).

This proof also establishes explicitly that the Ford–Fulkerson algorithm finds a maxflow. No matter what method we choose to find an augmenting path, and no matter what paths we find, we always end up with a cut whose flow is equal to its capacity, and therefore also is equal to the value of the network’s flow, which therefore must be a maxflow.

Another implication of the correctness of the Ford–Fulkerson algorithm is that, for any flow network with integer capacities, there exists a maxflow solution where the flows are all integers. Each augmenting path increases the flow by a positive integer (the minimum of the unused capacities in the forward edges and the flows in the backward edges, all of which are always positive integers). This fact justifies our decision to restrict our attention to integer capacities and flows. It is possible to design a maxflow with noninteger flows, even when capacities are all integers (see Exercise 22.23), but we do not need to consider such flows. This restriction is important: Generalizing to allow capacities and flows that are real numbers can lead to unpleasant anomalous situations.
For example, the Ford–Fulkerson algorithm might lead to an infinite sequence of augmenting paths that does not even converge to the maxflow value (see reference section).

The generic Ford–Fulkerson algorithm does not specify any particular method for finding an augmenting path. Perhaps the most natural way to proceed is to use the generalized graph-search strategy of Section 18.8. To this end, we begin with the following definition.

Definition 22.4 Given a flow network and a flow, the residual network for the flow has the same vertices as the original and one or two edges in the residual network for each edge in the original, defined as follows: For each edge v-w in the original, let f be the flow and c the capacity. If f is positive, include an edge w-v in the residual with capacity f; and if f is less than c, include an edge v-w in the residual with capacity c-f.

If v-w is empty (f is equal to 0), there is a single corresponding edge v-w with capacity c in the residual; if v-w is full (f is equal to c), there is a single corresponding edge w-v with capacity f in the residual; and if v-w is neither empty nor full, both v-w and w-v are in the residual with their respective capacities.

Program 22.2 defines the EDGE class that we use to implement the residual network abstraction with class function members. With such

Figure 22.16 Residual networks (augmenting paths)

Finding augmenting paths in a flow network is equivalent to finding directed paths in the residual network that is defined by the flow. For each edge in the flow network, we create an edge in each direction in the residual network: one in the direction of the flow with weight equal to the unused capacity, and one in the opposite direction with weight equal to the flow. We do not include edges of weight 0 in either case. Initially (top), the residual network is the same as the flow network with weights equal to capacities. When we augment along the path 0-1-3-5 (second from top), we fill edges 0-1 and 3-5 to capacity so that they switch direction in the residual network, we reduce the weight of 1-3 to correspond to the remaining flow, and we add the edge 3-1 of weight 2. Similarly, when we augment along the path 0-2-4-5, we fill 2-4 to capacity so that it switches direction, and we have edges in either direction between 0 and 2 and between 4 and 5 to represent flow and unused capacity. After we augment along 0-2-3-1-4-5 (bottom), no directed paths from source to sink remain in the residual network, so there are no augmenting paths.
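Definition 22.4 can be turned into code almost word for word. The following minimal sketch builds an explicit residual edge list from a flow; the names FlowEdge, ResidualEdge, residualNetwork, and residualCap are hypothetical and are used only for this illustration (the implementations in this section keep the residual network implicit instead of building it).

```cpp
#include <cassert>
#include <vector>

// Edge v->w of the flow network, with capacity and current flow.
struct FlowEdge { int v, w, cap, flow; };

// Edge from->to of the residual network, with its residual capacity.
struct ResidualEdge { int from, to, cap; };

// Builds the residual network for the given flow, following Definition 22.4:
// positive flow f yields w-v with capacity f; f < c yields v-w with capacity c-f.
std::vector<ResidualEdge> residualNetwork(const std::vector<FlowEdge>& edges)
{
    std::vector<ResidualEdge> residual;
    for (const FlowEdge& e : edges) {
        assert(0 <= e.flow && e.flow <= e.cap);
        if (e.flow > 0)     residual.push_back({e.w, e.v, e.flow});
        if (e.flow < e.cap) residual.push_back({e.v, e.w, e.cap - e.flow});
    }
    return residual;
}

// Capacity of the first residual edge from->to, or 0 if there is none.
int residualCap(const std::vector<ResidualEdge>& residual, int from, int to)
{
    for (const ResidualEdge& r : residual)
        if (r.from == from && r.to == to) return r.cap;
    return 0;
}
```

With assumed capacities 0-1: 2, 1-3: 3, and 3-5: 2, pushing 2 units along 0-1-3-5 leaves residual edges 1-0 and 5-3 of capacity 2 (the full edges switch direction), 1-3 of capacity 1, and a new edge 3-1 of capacity 2, matching the second step described in the caption.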

Program 22.2 Flow-network edges

To implement flow networks, we use the undirected GRAPH class from Chapter 20 to manipulate pointers to edges that implement this interface. The edges are directed, but the member functions implement the residual network abstraction, which encompasses both orientations of each edge (see text).

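class EDGE
{ int pv, pw, pcap, pflow;   // endpoints v-w, capacity, current flow
public:
  EDGE(int v, int w, int cap) :
    pv(v), pw(w), pcap(cap), pflow(0) { }
  int v() const { return pv; }
  int w() const { return pw; }
  int cap() const { return pcap; }
  int flow() const { return pflow; }
  // true if this edge is directed out of v
  bool from(int v) const
    { return pv == v; }
  // the endpoint of this edge that is not v
  int other(int v) const
    { return from(v) ? pw : pv; }
  // residual capacity toward v
  int capRto(int v) const
    { return from(v) ? pflow : pcap - pflow; }
  // push d units of flow toward v in the residual network
  void addflowRto(int v, int d)
    { pflow += from(v) ? -d : d; }
};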

an implementation, we continue to work exclusively with pointers to client edges. Our algorithms work with the residual network, but they are actually examining capacities and changing flow (through edge pointers) in the client’s edges. The member functions from and other allow us to process edges in either orientation: e.other(v) returns the endpoint of e that is not v. The member functions capRto and addflowRto implement the residual network: If e is a pointer to an edge v-w with capacity c and flow f, then e->capRto(w) is c-f and e->capRto(v) is f; e->addflowRto(w, d) adds d to the flow; and e->addflowRto(v, d) subtracts d from the flow.

Residual networks allow us to use any generalized graph search (see Section 18.8) to find an augmenting path, since any path from

Program 22.3 Augmenting-paths maxflow implementation

This class implements the generic augmenting-paths (Ford–Fulkerson) maxflow algorithm. It uses PFS to find a path from source to sink in the residual network (see Program 22.4), then adds as much flow as possible to that path, repeating the process until there is no such path. Constructing an object of this class sets the flow values in the given network’s edges such that the flow from source to sink is maximal. The st vector holds the PFS spanning tree, with st[v] containing a pointer to the edge that connects v to its parent. The ST function returns the parent of its argument vertex. The augment function uses ST to traverse the path to find its capacity and then augment flow.
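The description above can be sketched as follows. This is a minimal illustration rather than the book's listing: it is self-contained instead of building on the GRAPH class, it stores edge indices rather than edge pointers in st, and it searches with a plain FIFO queue (breadth-first search) in place of PFS. The names Edge, MaxFlowSketch, addEdge, run, and sourceSide are hypothetical; other, capRto, and addflowRto mirror the EDGE interface of Program 22.2.

```cpp
#include <algorithm>
#include <cassert>
#include <climits>
#include <queue>
#include <vector>

// Directed edge v->w viewed through the residual-network interface.
struct Edge {
    int v, w, cap, flow;
    int other(int x) const { return x == v ? w : v; }
    int capRto(int x) const { return x == v ? flow : cap - flow; }
    void addflowRto(int x, int d) { flow += (x == v) ? -d : d; }
};

class MaxFlowSketch {
    std::vector<Edge> edges;
    std::vector<std::vector<int>> adj;  // adj[x]: indices of edges touching x
    std::vector<int> st;                // st[x]: index of tree edge into x, or -1
    int s, t;

    // BFS in the residual network; true if the sink was reached.
    bool search() {
        std::fill(st.begin(), st.end(), -1);
        std::vector<bool> seen(adj.size(), false);
        std::queue<int> fringe;
        fringe.push(s);
        seen[s] = true;
        while (!fringe.empty()) {
            int x = fringe.front();
            fringe.pop();
            for (int i : adj[x]) {
                int y = edges[i].other(x);
                if (!seen[y] && edges[i].capRto(y) > 0) {
                    seen[y] = true;
                    st[y] = i;
                    if (y == t) return true;
                    fringe.push(y);
                }
            }
        }
        return false;
    }
    int parent(int x) const { return edges[st[x]].other(x); }

    // Finds the bottleneck on the tree path s..t, then pushes that much flow.
    int augment() {
        int d = INT_MAX;
        for (int x = t; x != s; x = parent(x))
            d = std::min(d, edges[st[x]].capRto(x));
        for (int x = t; x != s; x = parent(x))
            edges[st[x]].addflowRto(x, d);
        return d;
    }

public:
    MaxFlowSketch(int V, int source, int sink)
        : adj(V), st(V, -1), s(source), t(sink) { assert(s != t); }

    void addEdge(int v, int w, int cap) {
        assert(v != w && cap >= 0);
        edges.push_back({v, w, cap, 0});
        int i = static_cast<int>(edges.size()) - 1;
        adj[v].push_back(i);
        adj[w].push_back(i);
    }

    // Augments until no path remains; returns the value of the flow.
    int run() {
        int value = 0;
        while (search()) value += augment();
        return value;
    }

    // Call after run(): vertices still reachable from s in the residual network,
    // that is, the source side of a minimum cut (see the proof of Property 22.5).
    std::vector<bool> sourceSide() {
        search();
        std::vector<bool> side(adj.size(), false);
        for (std::size_t x = 0; x < adj.size(); ++x)
            side[x] = (static_cast<int>(x) == s) || st[x] != -1;
        return side;
    }
};
```

Replacing the queue in search with a different fringe discipline changes which augmenting paths are found but not the final flow value, which is the point of the generic formulation.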

source to sink in the residual network corresponds directly to an augmenting path in the original network. Increasing the flow along the path implies making changes in the residual network: For example, at least one edge on the path becomes full or empty, so at least one edge in the residual network changes direction or disappears (but our use of an

Program 22.4 PFS for augmenting-paths implementation

This priority-first search implementation is derived from the one that we used for Dijkstra’s algorithm (Program 21.1) by changing it to use integer weights, to process edges in the residual network, and to stop when it reaches the sink or return false if there is no path from source to sink. The given definition of the priority P leads to the maximum-capacity augmenting path (negative values are kept on the priority queue so as to adhere to the interface of Program 20.10); other definitions of P yield various different maxflow algorithms.
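A priority-first search for the maximum-capacity augmenting path can be sketched as follows. This is an illustration under stated assumptions, not the book's program: the residual network is given as a dense matrix res (a hypothetical representation chosen to keep the sketch short), the function name widestPath is invented here, and std::priority_queue is used directly. Because std::priority_queue returns the largest key first, no negation of priorities is needed, unlike the minimum-oriented queue interface that the caption mentions.

```cpp
#include <algorithm>
#include <cassert>
#include <climits>
#include <queue>
#include <utility>
#include <vector>

// res[x][y] is the residual capacity from x to y (0 if there is no such edge).
// Returns the bottleneck capacity of a maximum-capacity path from s to t,
// or 0 if t is unreachable. parent[y] receives the tree predecessor of y.
int widestPath(const std::vector<std::vector<int>>& res, int s, int t,
               std::vector<int>& parent)
{
    assert(s != t);
    const int V = static_cast<int>(res.size());
    std::vector<int> width(V, 0);        // best bottleneck known so far
    std::vector<bool> done(V, false);    // true once a vertex leaves the fringe
    parent.assign(V, -1);

    std::priority_queue<std::pair<int, int>> fringe;  // (bottleneck, vertex)
    width[s] = INT_MAX;
    fringe.push({INT_MAX, s});

    while (!fringe.empty()) {
        int x = fringe.top().second;
        fringe.pop();
        if (done[x]) continue;           // stale entry left by a later update
        done[x] = true;
        if (x == t) return width[t];     // stop as soon as the sink is reached
        for (int y = 0; y < V; ++y) {
            int w = std::min(width[x], res[x][y]);
            if (!done[y] && w > width[y]) {
                width[y] = w;
                parent[y] = x;
                fringe.push({w, y});
            }
        }
    }
    return 0;
}
```

The priority of a fringe vertex is the largest amount of flow that can be pushed to it along the tree path, so the first time the sink leaves the fringe its priority is the capacity of the best augmenting path.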

abstract residual network means that we just check for positive capacity and do not need to actually insert and delete edges). Figure 22.16 shows a sequence of augmenting paths and the corresponding residual networks for an example.

Program 22.3 is a priority-queue–based implementation that encompasses all these possibilities, using the slightly modified version of our PFS graph-search implementation from Program 21.1 that is shown in Program 22.4. This implementation allows us to choose among several different classical implementations of the Ford–Fulkerson algorithm, simply by setting priorities so as to implement various data structures for the fringe.

As discussed in Section 21.2, using a priority queue to implement a stack, queue, or randomized queue for the fringe data structure incurs an extra factor of lg V in the cost of fringe operations. Since we could avoid this cost by using a generalized-queue ADT in an implementation like Program 18.10 with direct implementations, we assume when analyzing the algorithms that the costs of fringe operations are constant in these cases. By using the single implementation in Program 22.3, we emphasize the direct relationships among various Ford–Fulkerson implementations.

Although it is general, Program 22.3 does not encompass all implementations of the Ford–Fulkerson algorithm (see, for example, Exercises 22.36 and 22.38). Researchers continue to develop new ways to implement the algorithm. But the family of algorithms encompassed by Program 22.3 is widely used, gives us a basis for understanding computation of maxflows, and introduces us to straightforward implementations that perform well on practical networks.

As we soon see, these basic algorithmic tools get us simple (and useful, for many applications) solutions to the network-flow problem. A complete analysis establishing which specific method is best is a complex task, however, because their running times depend on

• The number of augmenting paths needed to find a maxflow
• The time needed to find each augmenting path

These quantities can vary widely, depending on the network being processed and on the graph-search strategy (fringe data structure).

Perhaps the simplest Ford–Fulkerson implementation uses the shortest augmenting path (as measured by the number of edges on the path, not flow or capacity). This method was suggested by Edmonds and Karp in 1972. To implement it, we use a queue for the fringe, either by using the value of an increasing counter for P or by using a queue ADT instead of a priority-queue ADT in Program 22.3. In this case, the search for an augmenting path amounts to breadth-first search (BFS) in the residual network, precisely as described in Sections 18.8 and 21.2. Figure 22.17 shows this implementation of the Ford–Fulkerson method in operation on a sample network. For brevity, we refer to this method as the shortest-augmenting-path maxflow algorithm. As is evident

Figure 22.17 Shortest augmenting paths

This sequence illustrates how the shortest-augmenting-path implementation of the Ford–Fulkerson method finds a maximum flow in a sample network. Path lengths increase as the algorithm progresses: The first four paths in the top row are of length 3; the last path in the top row and all of the paths in the second row are of length 4; the first two paths in the bottom row are of length 5; and the process finishes with two paths of length 7 that each have a backward edge.

from the figure, the lengths of the augmenting paths form a nondecreasing sequence. Our analysis of this method, in Property 22.7, proves that this property is characteristic.

Another Ford–Fulkerson implementation suggested by Edmonds and Karp is the following: Augment along the path that increases the flow by the largest amount. The priority value P that is used in Program 22.3 implements this method. This priority makes the algorithm choose edges from the fringe to give the maximum amount of flow that can be pushed through a forward edge or diverted from a backward edge. For brevity, we refer to this method as the maximum-capacity–augmenting-path maxflow algorithm. Figure 22.18 illustrates the algorithm on the same flow network as that in Figure 22.17.

These are but two examples (ones we can analyze!) of Ford–Fulkerson implementations. At the end of this section, we consider others. Before doing so, we consider the task of analyzing augmenting-path methods in order to learn their properties and, ultimately, to decide which one will have the best performance.

Figure 22.18 Maximum-capacity augmenting paths

This sequence illustrates how the maximum-capacity–augmenting-path implementation of the Ford–Fulkerson method finds a maxflow in a sample network. Path capacities decrease as the algorithm progresses, but their lengths may increase or decrease. The method needs only nine augmenting paths to compute the same maxflow as the one depicted in Figure 22.17.

In trying to choose among the family of algorithms represented by Program 22.3, we are in a familiar situation. Should we focus on worst-case performance guarantees, or do those represent a mathematical fiction that may not relate to networks that we encounter in practice? This question is particularly relevant in this context, because the classical worst-case performance bounds that we can establish are much higher than the actual performance results that we see for typical graphs.

Many factors further complicate the situation. For example, the worst-case running time for several versions depends not just on V and E, but also on the values of the edge capacities in the network. Developing a maxflow algorithm with fast guaranteed performance has been a tantalizing problem for several decades, and numerous methods have been proposed. Evaluating all these methods for all the types of networks that are likely to be encountered in practice, with sufficient precision to allow us to choose among them, is not as clear-cut as is the same task for other situations that we have studied, such as typical practical applications of sorting or searching algorithms.

Keeping these difficulties in mind, we now consider the classical results about the worst-case performance of the Ford–Fulkerson method: One general bound and two specific bounds, one for each of the two augmenting-path algorithms that we have examined. These results serve more to give us insight into characteristics of the algorithms than to allow us to predict performance to a sufficient degree of accuracy

Figure 22.19 Two scenarios for the Ford–Fulkerson algorithm

This network illustrates that the number of iterations used by the Ford–Fulkerson algorithm depends on the capacities of the edges in the network and the sequence of paths chosen by the implementation. It consists of four edges of capacity X and one of capacity 1. The scenario depicted at the top shows that an implementation that alternates between using 0-1-2-3 and 0-2-1-3 as augmenting paths (for example, one that prefers long paths) would require X pairs of iterations like the two pairs shown, each pair incrementing the total flow by 2. The scenario depicted at the bottom shows that an implementation that chooses 0-1-3 and then 0-2-3 as augmenting paths (for example, one that prefers short paths) finds the maximum flow in just two iterations.
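The two scenarios in this caption can be replayed in a few lines. The sketch below is a hypothetical simulation, not an implementation from the book: it stores residual capacities in a 4-by-4 matrix, assumes that the capacity-1 edge is 1-2 (consistent with the paths named in the caption), and simply cycles through a fixed list of candidate paths, counting how many augmentations succeed. The names Matrix, figureNetwork, augmentAlong, and countIterations are invented for the illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <climits>
#include <cstddef>
#include <vector>

using Matrix = std::vector<std::vector<long long>>;

// Residual capacities for a network with four edges of capacity X
// and one edge (assumed here to be 1-2) of capacity 1.
Matrix figureNetwork(long long X)
{
    Matrix r(4, std::vector<long long>(4, 0));
    r[0][1] = X; r[0][2] = X; r[1][2] = 1; r[1][3] = X; r[2][3] = X;
    return r;
}

// Pushes as much flow as possible along the given vertex path.
// Returns the amount pushed (0 if some residual edge on the path is missing).
long long augmentAlong(Matrix& r, const std::vector<int>& path)
{
    assert(path.size() >= 2);
    long long d = LLONG_MAX;
    for (std::size_t i = 0; i + 1 < path.size(); ++i)
        d = std::min(d, r[path[i]][path[i + 1]]);
    if (d == 0) return 0;
    for (std::size_t i = 0; i + 1 < path.size(); ++i) {
        r[path[i]][path[i + 1]] -= d;   // use up capacity in the path direction
        r[path[i + 1]][path[i]] += d;   // and make it available in reverse
    }
    return d;
}

// Cycles through the candidate paths until none of them can carry more flow;
// returns the number of successful augmentations.
long long countIterations(long long X, const std::vector<std::vector<int>>& paths)
{
    Matrix r = figureNetwork(X);
    long long iterations = 0;
    bool progress = true;
    while (progress) {
        progress = false;
        for (const std::vector<int>& p : paths)
            if (augmentAlong(r, p) > 0) { ++iterations; progress = true; }
    }
    return iterations;
}
```

Calling countIterations(X, {{0, 1, 2, 3}, {0, 2, 1, 3}}) performs 2X augmentations of one unit each, whereas countIterations(X, {{0, 1, 3}, {0, 2, 3}}) performs two, each of X units.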

for meaningful comparison. We discuss empirical comparisons of the methods at the end of the section.

If edge capacities are, say, 32-bit integers, the scenario depicted at the top would be billions of times slower than the scenario depicted at the bottom.

Property 22.6 Let M be the maximum edge capacity in the network. The number of augmenting paths needed by any implementation of the Ford–Fulkerson algorithm is at most equal to VM.

Proof: Any cut has at most V edges, of capacity M, for a total capacity of VM. Every augmenting path increases the flow through every cut by at least 1, so the algorithm must terminate after VM passes, since all cuts must be filled to capacity after that many augmentations.

As discussed below, such a bound is of little use in typical situations because M can be a very large number. Worse, it is easy to describe situations where the number of iterations is proportional to the maximum edge capacity. For example, suppose that we use a longest augmenting-path algorithm (perhaps based on the intuition that the longer the path, the more flow we put on the network’s edges). Since we are counting iterations, we ignore, for the moment, the cost of computing such a path. The (classical) example shown in Figure 22.19 shows a network for which the number of iterations of a longest augmenting-path algorithm is equal to the maximum edge capacity. This example tells us that we must undertake a more detailed scrutiny to know whether other specific implementations use substantially fewer iterations than are indicated by Property 22.6.

For sparse networks and networks with small integer capacities, Property 22.6 does give a useful upper bound on the running time of any Ford–Fulkerson implementation.

Corollary The time required to find a maxflow is $O(VEM)$, which is $O(V^2 M)$ for sparse networks.

Proof: Immediate from the basic result that generalized graph search is linear in the size of the graph representation (Property 18.12). As mentioned, we need an extra lg V factor if we are using a priority-queue fringe implementation.

The proof actually establishes that the factor of M can be replaced by the ratio between the largest and smallest nonzero capacities in the network (see Exercise 22.25). When this ratio is low, the bound tells us that any Ford–Fulkerson implementation will find a maxflow in time proportional to the time required to (for example) solve the all-shortest-paths problem, in the worst case. There are many situations where the capacities are indeed low and the factor of M is of no concern. We will see an example in Section 22.4.

When M is large, the VEM worst-case bound is high; but it is pessimistic, as we obtained it by multiplying together worst-case bounds that derive from contrived examples. Actual costs on practical networks are typically much lower.

From a theoretical standpoint, our first goal is to discover, using the rough subjective categorizations of Section 17.8, whether or not the maximum-flow problem for networks with large integer weights is tractable (solvable by a polynomial-time algorithm). The bounds just derived do not resolve this question, because the maximum weight $M = 2^m$ could grow exponentially with V and E.

From a practical standpoint, we seek better performance guarantees. To pick a typical practical example, suppose that we use 32-bit integers (m = 32) to represent edge weights. In a graph with hundreds of vertices and thousands of edges, the corollary to Property 22.6 says that we might have to perform hundreds of trillions of operations in an augmenting-path algorithm. If we are dealing with millions of vertices, this point is moot, not only because we will not have weights as large as $2^{1000000}$, but also because $V^3$ and $VE$ are so large as to make the bound meaningless. We are interested both in finding a polynomial bound to resolve the tractability question and in finding better bounds that are relevant for situations that we might encounter in practice.

Property 22.6 is general: It applies to any Ford–Fulkerson implementation at all. The generic nature of the algorithm leaves us with a substantial amount of flexibility to consider a number of simple implementations in seeking to improve performance. We expect that specific implementations might be subject to better worst-case bounds.
Indeed, that is one of our primary reasons for considering them in the first place! Now, as we have seen, implementing and using a large class of these implementations is trivial: We just substitute different generalized-queue implementations or priority definitions in Program 22.3. Analyzing differences in worst-case behavior is more challenging, as indicated by the classical results that we consider next for the two basic augmenting-path implementations that we have considered.

First, we analyze the shortest-augmenting-path algorithm. This method is not subject to the problem illustrated in Figure 22.19. Indeed, we can use it to replace the factor of M in the worst-case running time with VE/2, thus establishing that the network-flow problem is tractable. We might even classify it as being easy (solvable in polynomial time on practical cases by a simple, if clever, implementation).

Property 22.7 The number of augmenting paths needed in the shortest-augmenting-path implementation of the Ford–Fulkerson algorithm is at most VE/2.

Proof: First, as is apparent from the example in Figure 22.17, no augmenting path is shorter than a previous one. To establish this fact, we show by contradiction that a stronger property holds: No augmenting path can decrease the length of the shortest path from the source s to any vertex in the residual network. Suppose that some augmenting path does so, and that v is the first such vertex on the path. There are two cases to consider: Either no vertex on the new shorter path from s to v appears anywhere on the augmenting path or some vertex w on the new shorter path from s to v appears somewhere between v and t on the augmenting path. Both situations contradict the minimality of the augmenting path.

Now, by construction, every augmenting path has at least one critical edge: an edge that is deleted from the residual network because it corresponds either to a forward edge that becomes filled to capacity or a backward edge that is emptied. Suppose that an edge u-v is a critical edge for an augmenting path P of length d. The next augmenting path for which it is a critical edge has to be of length at least d+2, because that path has to go from s to v, then along v-u, then from u to t. The first segment is of length at least 1 greater than the distance from s to u in P, and the final segment is of length at least 1 greater than the distance from v to t in P, so the path is of length at least 2 greater than P.

Since augmenting paths are of length at most V, these facts imply that each edge can be the critical edge on at most V/2 augmenting paths, so the total number of augmenting paths is at most EV/2.

Corollary The time required to find a maxflow in a sparse network is $O(V^3)$.

Proof: The time required to find an augmenting path is $O(E)$, so the total time is $O(VE^2)$. The stated bound follows immediately.

The quantity $V^3$ is sufficiently high that it does not provide a guarantee of good performance on huge networks. But that fact should not preclude us from using the algorithm on a huge network, because it is a worst-case performance result that may not be useful for predicting performance in a practical application. For example, as just mentioned, the maximum capacity M (or the maximum ratio between capacities) might be much less than V, so the corollary to Property 22.6 would provide a better bound. Indeed, in the best case, the number of augmenting paths needed by the Ford–Fulkerson method is the smaller of the outdegree of s or the indegree of t, which again might be far smaller than V. Given this range between best- and worst-case performance, comparing augmenting-path algorithms solely on the basis of worst-case bounds is not wise.

Still, other implementations that are nearly as simple as the shortest-augmenting-path method might admit better bounds or be preferred in practice (or both). For example, the maximum-augmenting-path algorithm used far fewer paths to find a maxflow than did the shortest-augmenting-path algorithm in the example illustrated in Figures 22.17 and 22.18. We now turn to the worst-case analysis of that algorithm.

First, just as for Prim’s algorithm and for Dijkstra’s algorithm (see Sections 20.6 and 21.2), we can implement the priority queue such that the algorithm takes time proportional to $V^2$ (for dense graphs) or $(E + V) \log V$ (for sparse graphs) per iteration in the worst case, although these estimates are pessimistic because the algorithm stops

Figure 22.20 Stack-based augmenting-path search

This illustrates the result of using a stack for the generalized queue in our implementation of the Ford–Fulkerson method, so that the path search operates like DFS. In this case, the method does about as well as BFS, but its somewhat erratic behavior is rather sensitive to the network representation and has not been analyzed.

when it reaches the sink. We also have seen that we can do slightly better with advanced data structures. The more important and more challenging question is how many augmenting paths are needed.

Property 22.8 The number of augmenting paths needed in the maximal-augmenting-path implementation of the Ford–Fulkerson algorithm is at most 2E lg M.

Proof: Given a network, let F be its maxflow value. Let v be the value of the flow at some point during the algorithm as we begin to look for an augmenting path. Applying Property 22.2 to the residual network, we can decompose the flow into at most E directed paths that sum to F − v, so the flow in at least one of the paths is at least (F − v)/E. Now, either we find the maxflow sometime before doing another 2E augmenting paths or the value of the augmenting path after that sequence of 2E paths is less than (F − v)/(2E), which is less than one-half of the value of the maximum before that sequence of 2E paths. That is, in the worst case, we need a sequence of 2E paths to

Figure 22.21 Randomized augmenting-path search

This sequence shows the result of using a randomized queue for the fringe data structure in the augmenting-path search in the Ford–Fulkerson method. In this example, we happen upon the short high-capacity path and therefore need relatively few augmenting paths. While predicting the performance characteristics of this method is a challenging task, it performs well in many situations.

decrease the path value by a factor of 2. The first path value is at most M, which we need to decrease by a factor of 2 at most lg M times, so we have a total of at most lg M sequences of 2E paths.

Corollary The time required to find a maxflow in a sparse network is $O(V^2 \lg M \lg V)$.

Proof: Immediate from the use of a heap-based priority-queue implementation, as for Properties 20.7 and 21.5.

For values of M and V that are typically encountered in practice, this bound is significantly lower than the $O(V^3)$ bound of the corollary to Property 22.7. In many practical situations, the maximum-augmenting-path algorithm uses significantly fewer iterations than does the shortest-augmenting-path algorithm, at the cost of a slightly higher bound on the work to find each path.

There are many other variations to consider, as reflected by the extensive literature on maxflow algorithms. Algorithms with better worst-case bounds continue to be discovered, and no nontrivial lower bound has been proved; that is, the possibility of a simple linear-time algorithm remains. Although they are important from a theoretical standpoint, many of the algorithms are primarily designed to lower the worst-case bounds for dense graphs, so they do not offer substantially better performance than the maximum-augmenting-path algorithm for the kinds of sparse networks that we encounter in practice. Still, there remain many options to explore in pursuit of better practical maxflow algorithms. We briefly consider two more augmenting-path algorithms next; in Section 22.3, we consider another family of algorithms.

One easy augmenting-path algorithm is to use the value of a decreasing counter for P or a stack implementation for the generalized queue in Program 22.3, making the search for augmenting paths like depth-first search. Figure 22.20 shows the flow computed for our small example by this algorithm. The intuition is that the method is fast, is easy to implement, and appears to put flow throughout the network. As we will see, its performance varies remarkably, from extremely poor on some networks to reasonable on others.

Another alternative is to use a randomized-queue implementation for the generalized queue so that the search for augmenting paths is a randomized search. Figure 22.21 shows the flow computed for our small example by this algorithm. This method is also fast and easy to implement; in addition, as we noted in Section 18.8, it may embody good features of both breadth-first and depth-first search. Randomization is a powerful tool in algorithm design, and this problem represents a reasonable situation in which to consider using it.

We conclude this section by looking more closely at the methods that we have examined in order to see the difficulty of comparing them or attempting to predict performance for practical applications.

As a start in understanding the quantitative differences that we might expect, we use two flow-network models that are based on the Euclidean graph model that we have been using to compare other graph algorithms. Both models use a graph drawn from V points in the plane with random coordinates between 0 and 1 with edges connecting any two points within a fixed distance of each other. They differ in the assignment of capacities to the edges.

The first model simply assigns the same constant value to each capacity. As discussed in Section 22.4, this type of network-flow problem is known to be easier than the general problem. For our Euclidean graphs, flows are limited by the outdegree of the source and the indegree of the sink, so the algorithms each need only a few augmenting paths. But the paths differ substantially for the various algorithms, as we soon see.

The second model assigns random weights from some fixed range of values to the capacities. This model generates the type of networks that people typically envision when thinking about the problem, and

Figure 22.22 Random flow networks

This figure depicts maxflow computations on our random Euclidean graph, with two different capacity models. On the left, all edges are assigned unit capacities; on the right, edges are assigned random capacities. The source is near the middle at the top and the sink near the middle at the bottom. Illustrated top to bottom are the flows computed by the shortest-path, maximum-capacity, stack-based, and randomized algorithms, respectively. Since the vertices are not of high degree and the capacities are small integers, there are many different flows that achieve the maximum for these examples.

The indegree of the sink is 6, so all the algorithms find the flow in the unit-capacity model on the left with six augmenting paths. The methods find augmenting paths that differ dramatically in character for the random-weight model on the right. In particular, the stack-based method finds long paths of low weight and even produces a flow with a disconnected cycle.
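The two capacity models can be generated with a short routine. The sketch below is an assumption-laden illustration, not the generator used for the figure: the names Point, CapEdge, EuclideanNetwork, and makeEuclideanNetwork are hypothetical, each edge is directed from the lower-numbered to the higher-numbered vertex (the text does not say how edges are oriented), and capacities are drawn uniformly from 1 to maxCap, so maxCap equal to 1 gives the unit-capacity model.

```cpp
#include <cassert>
#include <random>
#include <vector>

struct Point { double x, y; };
struct CapEdge { int v, w, cap; };

struct EuclideanNetwork {
    std::vector<Point> points;
    std::vector<CapEdge> edges;
};

// V random points in the unit square; an edge joins every pair of points
// that lie within the given radius of each other.
EuclideanNetwork makeEuclideanNetwork(int V, double radius, int maxCap,
                                      unsigned seed)
{
    assert(V >= 0 && radius >= 0.0 && maxCap >= 1);
    std::mt19937 gen(seed);
    std::uniform_real_distribution<double> coord(0.0, 1.0);
    std::uniform_int_distribution<int> capacity(1, maxCap);

    EuclideanNetwork net;
    for (int i = 0; i < V; ++i) {
        double x = coord(gen);
        double y = coord(gen);
        net.points.push_back({x, y});
    }
    for (int v = 0; v < V; ++v)
        for (int w = v + 1; w < V; ++w) {
            double dx = net.points[v].x - net.points[w].x;
            double dy = net.points[v].y - net.points[w].y;
            if (dx * dx + dy * dy <= radius * radius)
                net.edges.push_back({v, w, capacity(gen)});  // assumed v -> w
        }
    return net;
}
```

The quadratic pair loop is adequate for networks of a few hundred vertices; a grid of cells of side radius would avoid examining distant pairs on larger inputs.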

Table 22.1 Empirical study of augmenting-path algorithms

This table shows performance parameters for various augmenting-path network-flow algorithms for our sample Euclidean neighbor network (with random capacities with maxflow value 286) and with unit capacities (with maxflow value 6). The maximum-capacity algorithm outperforms the others for both types of networks. The random search algorithm finds augmenting paths that are not much longer than the shortest, and examines fewer nodes. The stack-based algorithm performs very well for random weights but, though it has very long paths, is competitive for unit weights.

the performance of the various algorithms on such networks is certainly instructive.

Both of these models are illustrated in Figure 22.22, along with the flows that are computed by the four methods on the two networks. Perhaps the most noticeable characteristic of these examples is that the flows themselves are different in character. All have the same value, but the networks have many maxflows, and the different algorithms make different choices when computing them. This situation is typical in practice. We might try to impose other conditions on the flows that we want to compute, but such changes to the problem can make it more difficult. The mincost flow problem that we consider in Sections 22.5 through 22.7 is one way to formalize such situations.

Table 22.1 gives more detailed quantitative results for use of the four methods to compute the flows in Figure 22.22. An augmenting-path

Figure 22.23 Maximum-capacity augmenting paths (larger example)

This figure depicts the augmenting paths computed by the maximum-capacity algorithm for the Euclidean network with random weights that is shown in Figure 22.22, along with the edges in the graph-search spanning tree (in gray). The resulting flow is shown at the bottom right.

algorithm’s performance depends not just on the number of augmenting paths but also on the lengths of such paths and on the cost of finding them. In particular, the running time is proportional to the number of edges examined in the inner loop of Program 22.3. As usual, this number might vary widely, even for a given graph, depending on properties of the representation; but we can still characterize different algorithms. For example, Figures 22.23 and 22.24 show the search trees for the maximum-capacity and shortest-path algorithms, respectively. These examples help support the general conclusion that the shortest-path method expends more effort to find augmenting paths with less flow than the maximum-capacity algorithm, thus helping to explain why the latter is preferred.

Perhaps the most important lesson that we can learn from studying particular networks in detail in this way is that the gap between the upper bounds of Properties 22.6 through 22.8 and the actual number of augmenting paths that the algorithms need for a given application might be enormous. For example, the flow network illustrated in Figure 22.23 has 177 vertices and 2000 edges of capacity less than 100, so the value of the quantity 2E lg M in Property 22.8 is more than 25,000; but the maximum-capacity algorithm finds the maxflow with only seven augmenting paths. Similarly, the value of the quantity VE/2 in Property 22.7 for this network is 177,000, but the shortest-path algorithm needs only 37 paths.

As we have already indicated, the relatively low node degree and the locality of the connections partially explain these differences between theoretical and actual performance in this case. We can prove more accurate performance bounds that account for such details; but such disparities are the rule, not the exception, in flow-network models and in practical networks. On the one hand, we might take these results to indicate that these networks are not sufficiently general to represent the networks that we encounter in practice; on the other hand, perhaps the worst-case analysis is more removed from practice than these kinds of networks.

Large gaps like this certainly provide strong motivation for researchers seeking to lower the worst-case bounds. There are many other possible implementations of augmenting-path algorithms to consider that might lead to better worst-case performance or better practical performance than the methods that we have considered (see Exercises 22.56 through 22.60). Numerous methods that are more sophisticated and have been shown to have improved worst-case performance can be found in the research literature (see reference section).

Another important complication follows when we consider the numerous other problems that reduce to the maxflow problem. When such reductions apply, the flow networks that result may have some special structure that some particular algorithm may be able to exploit for improved performance. For example, in Section 22.8 we will examine a reduction that gives flow networks with unit capacities on all edges.

Even when we restrict attention just to augmenting-path algorithms, we see that the study of maxflow algorithms is both an art and a science. The art lies in picking the strategy that is most effective for a given practical situation; the science lies in understanding the essential nature of the problem. Are there new data structures and algorithms that can solve the maxflow problem in linear time, or can we prove that none exist? In Section 22.3, we see that no augmenting-path algorithm can have linear worst-case performance, and we examine a different generic family of algorithms that might.
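The two bound values quoted for the 177-vertex, 2000-edge network can be checked with a few lines of arithmetic. The following is a minimal sketch, not code from the book: the function names are illustrative, and M = 100 is an assumption, since the text says only that the capacities are less than 100.

```cpp
#include <cassert>
#include <cmath>

// Bound of Property 22.8 on the number of augmenting paths used by the
// maximum-capacity algorithm: 2 E lg M (M is the capacity bound).
double maxCapacityPathBound(int E, int M) {
    assert(E >= 0 && M >= 1);
    return 2.0 * E * std::log2(static_cast<double>(M));
}

// Bound of Property 22.7 on the number of augmenting paths used by the
// shortest-path algorithm: V E / 2.
long long shortestPathBound(long long V, long long E) {
    assert(V >= 0 && E >= 0);
    return V * E / 2;
}
```

With V = 177, E = 2000, and M = 100, the first function gives a value a little above 26,000 and the second gives 177,000, against the 7 and 37 paths that the two algorithms actually use.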

Figure 22.24 Shortest augmenting paths (larger example)

This figure depicts the augmenting paths computed by the shortest-paths algorithm for the Euclidean network with random weights that is shown in Figure 22.22, along with the edges in the graph-search spanning tree (in gray). In this case, this algorithm is much slower than the maximum-capacity algorithm depicted in Figure 22.23 both because it requires a large number of augmenting paths (the paths shown are just the first 12 out of a total of 37) and because the spanning trees are larger (usually containing nearly all of the vertices).

Exercises

22.19 Show, in the style of Figure 22.13, as many different sequences of augmenting paths as you can find for the flow network shown in Figure 22.10.

22.20 Show, in the style of Figure 22.15, all the cuts for the flow network shown in Figure 22.10, their cut sets, and their capacities.

• 22.21 Find a minimum cut in the flow network shown in Figure 22.11.

• 22.22 Suppose that capacities are in equilibrium in a flow network (for every internal node, the total capacity of incoming edges is equal to the total capacity of outgoing edges). Does the Ford–Fulkerson algorithm ever use a backward edge? Prove that it does or give a counterexample.

22.23 Give a maxflow for the flow network shown in Figure 22.5 with at least one flow that is not an integer.

• 22.24 Develop an implementation of the Ford–Fulkerson algorithm that uses a generalized queue instead of a priority queue (see Section 18.8).

• 22.25 Prove that the number of augmenting paths needed by any implementation of the Ford–Fulkerson algorithm is no more than V times the smallest integer larger than the ratio of the largest edge capacity to the smallest edge capacity.

22.26 Prove a linear-time lower bound for the maxflow problem: Show that, for any values of V and E, any maxflow algorithm might have to examine every edge in some network with V vertices and E edges.

• 22.27 Give a network like Figure 22.19 for which the shortest-augmenting-path algorithm has the worst-case behavior that is illustrated.

22.28 Give an adjacency-lists representation of the network in Figure 22.19 for which our implementation of the stack-based search (Program 22.3, using a stack for the generalized queue) has the worst-case behavior that is illustrated.

22.29 Show, in the style of Figure 22.16, the flow and residual networks after each augmenting path when we use the shortest-augmenting-path algorithm to find a maxflow in the flow network shown in Figure 22.10. Also include the graph-search trees for each augmenting path. When more than one path is possible, show the one that is chosen by the implementations given in this section.

22.30 Do Exercise 22.29 for the maximum-capacity–augmenting-path algorithm.

22.31 Do Exercise 22.29 for the stack-based–augmenting-path algorithm.

• 22.32 Exhibit a family of networks for which the maximum-augmenting-path algorithm needs 2E lg M augmenting paths.

• 22.33 Can you arrange the edges such that our implementations take time proportional to E to find each path in your example in Exercise 22.32? If necessary, modify your example to achieve this goal. Describe the adjacency-lists representation that is constructed for your example. Explain how the worst case is achieved.

22.34 Run empirical studies to determine the number of augmenting paths and the ratio of the running time to V for each of the four algorithms described in this section, for various networks (see Exercises 22.7–12).

22.35 Develop and test an implementation of the augmenting-path method that uses the source–sink shortest-path heuristic for Euclidean networks of Section 21.5.

22.36 Develop and test an implementation of the augmenting-path method that is based on alternately growing search trees rooted at the source and at the sink (see Exercises 21.35 and 21.75).

• 22.37 The implementation of Program 22.3 stops the graph search when it finds the first augmenting path from source to sink, augments, then starts the search all over again. Alternatively, it could go on with the search and find another path, continuing until all vertices have been marked. Develop and test this second approach.

22.38 Develop and test implementations of the augmenting-path method that use paths that are not simple.

• 22.39 Give a sequence of simple augmenting paths that produces a flow with a cycle in the network depicted in Figure 22.11.

• 22.40 Give an example showing that not all maxflows can be the result of starting with an empty network and augmenting along a sequence of simple paths from source to sink.

22.41 [Gabow] Develop a maxflow implementation that uses m = lg M phases, where the ith phase solves the maxflow problem using the leading i bits of the capacities. Start with zero flow everywhere; then, after the first phase, initialize the flow by doubling the flow found during the previous phase. Run empirical studies for various networks (see Exercises 22.7–12) to compare this implementation to the basic methods.

• 22.42 Prove that the running time of the algorithm described in Exercise 22.41 is O(VE lg M).

22.43 Experiment with hybrid methods that use one augmenting-path method at the beginning, then switch to a different augmenting path to finish up (part of your task is to decide what are appropriate criteria for when to switch). Run empirical studies for various networks (see Exercises 22.7–12) to compare these to the basic methods, studying methods that perform better than others in more detail.

22.44 Experiment with hybrid methods that alternate between two or more different augmenting-path methods. Run empirical studies for various networks (see Exercises 22.7–12) to compare these to the basic methods, studying variations that perform better than others in more detail.

• 22.45 Experiment with hybrid methods that choose randomly among two or more different augmenting-path methods. Run empirical studies for various networks (see Exercises 22.7–12) to compare these to the basic methods, studying variations that perform better than others in more detail.

• 22.46 Write a flow-network client function that, given an integer c, finds an edge for which increasing the capacity of that edge by c increases the maxflow by the maximum amount. Your function may assume that the client has already computed a maximum flow with MAXFLOW.

•• 22.47 Suppose that you are given a mincut for a network. Does this information make it easier to compute a maxflow? Develop an algorithm that uses a given mincut to speed up substantially the search for maximum-capacity augmenting paths.

• 22.48 Write a client program that does dynamic graphical animations of augmenting-path algorithms. Your program should produce images like Figure 22.17 and the other figures in this section (see Exercises 17.55–59). Test your implementation for the Euclidean networks among Exercises 22.7–12.

22.3 Preflow-Push Maxflow Algorithms

In this section, we consider another approach to solving the maxflow problem. Using a generic method known as the preflow-push method, we incrementally move flow along the outgoing edges of vertices that have more inflow than outflow. The preflow-push approach was developed by A. Goldberg and R. E. Tarjan in 1986 on the basis of various earlier algorithms. It is widely used because of its simplicity, flexibility, and efficiency.

As defined in Section 22.1, a flow must satisfy the equilibrium conditions that the outflow from the source is equal to the inflow to the sink and that inflow is equal to the outflow at each of the internal nodes. We refer to such a flow as a feasible flow. An augmenting-path algorithm always maintains a feasible flow: It increases the flow along augmenting paths until a maxflow is achieved. By contrast, the preflow-push algorithms that we consider in this section maintain maxflows that are not feasible because some vertices have more inflow than outflow: They push flow through such vertices until a feasible flow is achieved (no such vertices remain).

Definition 22.5 In a flow network, a preflow is a set of positive edge flows satisfying the conditions that the flow on each edge is no greater than that edge’s capacity and that inflow is no smaller than outflow for every internal vertex. An active vertex is an internal vertex whose inflow is larger than its outflow (by convention, the source and sink are never active).

Figure 22.25 Preflow-push example

In the preflow-push algorithm, we maintain a list of the active nodes that have more incoming than outgoing flow (shown below each network). One version of the algorithm is a loop that chooses an active node from the list and pushes flow along outgoing edges until it is no longer active, perhaps creating other active nodes in the process. In this example, we push flow along 0-1, which makes 1 active. Next, we push flow along 1-2 and 1-3, which makes 1 inactive but 2 and 3 both active. Then, we push flow along 2-4, which makes 2 inactive. But 3-4 does not have sufficient capacity for us to push flow along it to make 3 inactive, so we also push flow back along 3-1 to do so, which makes 1 active. Then we can push the flow along 1-2 and then 2-4, which makes all nodes inactive and leaves a maxflow.

We refer to the difference between an active vertex’s inflow and outflow as that vertex’s excess. To change the set of active vertices, we choose one and push its excess along an outgoing edge, or, if there is insufficient capacity to do so, push the excess back along an incoming edge. If the push equalizes the vertex’s inflow and outflow, the vertex becomes inactive; the flow pushed to another vertex may activate that vertex. The preflow-push method provides a systematic way to push excess out of active vertices repeatedly so that the process terminates in a maxflow, with no active vertices. We keep active vertices on a generalized queue. As for the augmenting-path method, this decision gives a generic algorithm that encompasses a whole family of more specific algorithms.

Figure 22.25 is a small example that illustrates the basic operations used in preflow-push algorithms, in terms of the metaphor that we have been using, where we imagine that flow can go only down the page. Either we push excess flow out of an active vertex down an outgoing edge or we imagine that the vertex temporarily moves up so we can push excess flow back down an

incoming edge.

Figure 22.26 is an example that illustrates why the preflow-push approach might be preferred to the augmenting-paths approach. In this network, any augmenting-path method successively puts a tiny amount of flow through a long path over and over again, slowly filling up the edges on the path until finally the maxflow is reached. By contrast, the preflow-push method fills up the edges on the long path as it traverses that path for the first time, then distributes that flow directly to the sink without traversing the long path again.

As we did in augmenting-path algorithms, we use the residual network (see Definition 22.4) to keep track of the edges that we might push flow through. Every edge in the residual network represents a potential place to push flow. If a residual-network edge is in the same direction as the corresponding edge in the flow network, we increase the flow; if it is in the opposite direction, we decrease the flow. If the increase fills the edge or the decrease empties the edge, the corresponding edge disappears from the residual network. For preflow-push algorithms, we use an additional mechanism to help decide which of the edges in the residual network can help us to eliminate active vertices.

Definition 22.6 A height function for a given flow in a flow network is a set of nonnegative vertex weights h(0) … h(V−1) such that h(t) = 0 for the sink t and h(u) ≤ h(v) + 1 for every edge u-v in the residual network for the flow. An eligible edge is an edge u-v in the residual network with h(u) = h(v) + 1.
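Definitions 22.5 and 22.6 translate almost directly into small checking functions. The sketch below is illustrative only: the FlowEdge record and all function names are assumptions of this example, not the book's flow-network ADT. It computes each vertex's excess, lists the active vertices, derives the residual edges from the edge flows, tests whether a vertex-indexed vector is a valid height function, and lists the eligible edges.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Hypothetical edge record: directed edge v-w with a capacity and a flow.
struct FlowEdge { int v, w, cap, flow; };

// Excess of every vertex: inflow minus outflow.
std::vector<int> excessOf(int V, const std::vector<FlowEdge>& edges) {
    std::vector<int> ex(V, 0);
    for (const FlowEdge& e : edges) {
        assert(0 <= e.flow && e.flow <= e.cap);  // capacity condition of a preflow
        ex[e.w] += e.flow;
        ex[e.v] -= e.flow;
    }
    return ex;
}

// Active vertices: internal vertices (neither s nor t) with positive excess.
std::vector<int> activeVertices(int V, const std::vector<FlowEdge>& edges,
                                int s, int t) {
    std::vector<int> ex = excessOf(V, edges), active;
    for (int u = 0; u < V; ++u)
        if (u != s && u != t && ex[u] > 0) active.push_back(u);
    return active;
}

// Residual edges u-v: forward where the edge is not full,
// backward where the edge carries flow.
std::vector<std::pair<int, int>> residualEdges(const std::vector<FlowEdge>& edges) {
    std::vector<std::pair<int, int>> res;
    for (const FlowEdge& e : edges) {
        if (e.flow < e.cap) res.push_back({e.v, e.w});
        if (e.flow > 0) res.push_back({e.w, e.v});
    }
    return res;
}

// Definition 22.6: nonnegative heights, h[t] == 0, and
// h[u] <= h[v] + 1 for every residual edge u-v.
bool isValidHeight(const std::vector<int>& h,
                   const std::vector<FlowEdge>& edges, int t) {
    for (int x : h)
        if (x < 0) return false;
    if (h[t] != 0) return false;
    for (const auto& e : residualEdges(edges))
        if (h[e.first] > h[e.second] + 1) return false;
    return true;
}

// Eligible edges: residual edges u-v with h[u] == h[v] + 1.
std::vector<std::pair<int, int>> eligibleEdges(const std::vector<int>& h,
                                               const std::vector<FlowEdge>& edges) {
    std::vector<std::pair<int, int>> out;
    for (const auto& e : residualEdges(edges))
        if (h[e.first] == h[e.second] + 1) out.push_back(e);
    return out;
}
```

For a three-vertex path 0-1-2 whose first edge is full and whose second edge is empty, vertex 1 is the only active vertex, and with heights 3, 1, 0 the only eligible edge is 1-2.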

Figure 22.26 Bad case for the Ford–Fulkerson algorithm

This network represents a family of networks with V vertices for which any augmenting-path algorithm requires V/2 paths of length V/2 (since every augmenting path has to include the long vertical path), for a total running time proportional to V^2. Preflow-push algorithms find maxflows in such networks in linear time.

A trivial height function, for which there are no eligible edges, is h(0) = h(1) = … = h(V−1) = 0. If we set h(s) = 1, then any edge that emanates from the source and has flow corresponds to an eligible edge in the residual network.

We define a more interesting height function by assigning to each vertex the latter’s shortest-path distance to the sink (its distance to the root in any BFS tree for the reverse of the network rooted at t, as illustrated in Figure 22.27). This height function is valid because h(t) = 0, and, for any pair of vertices u and v connected by an edge u-v, any shortest path to t starting with u-v is of length h(v) + 1; so the shortest-path length from u to t, or h(u), must be less than or equal to that value. This function plays a special role because it puts each vertex at the maximum possible height. Working backward, we see that t has to be at height 0; the only vertices that could be at height 1 are those with an edge directed to t in the residual network; the only vertices that could be at height 2 are those that have edges directed to vertices that could be at height 1, and so forth.

Property 22.9 For any flow and associated height function, a vertex’s height is no larger than the length of the shortest path from that vertex to the sink in the residual network.

Proof: For any given vertex u, let d be the shortest-path length from u to t, and let u = u_1, u_2, …, u_d = t be a shortest path. Then every edge u_i-u_(i+1) on the path is in the residual network, so h(u_i) ≤ h(u_(i+1)) + 1; chaining these inequalities along the path gives h(u) ≤ h(t) + d = d.

Figure 22.27 Initial height function

The tree at the right is a BFS tree rooted at 5 for the reverse of our sample network (left). The vertex-indexed array h gives the distance from each vertex to the root and is a valid height function: For every edge u-v in the network, h[u] is less than or equal to h[v] + 1.
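The shortest-path height function can be computed with one breadth-first search from the sink in the reverse network. The following is a minimal sketch under stated assumptions: the network is given as plain adjacency lists (adj[u] holds the heads of the edges leaving u), the function name bfsHeights is illustrative, and vertices that cannot reach the sink are given height V, a choice made for this example only.

```cpp
#include <cassert>
#include <queue>
#include <vector>

// Returns h with h[u] = number of edges on a shortest path from u to t.
// Vertices with no path to t keep the value V (an assumption of this sketch).
std::vector<int> bfsHeights(const std::vector<std::vector<int>>& adj, int t) {
    const int V = static_cast<int>(adj.size());
    assert(0 <= t && t < V);

    // Build the reverse network: rev[w] lists every u with an edge u-w.
    std::vector<std::vector<int>> rev(V);
    for (int u = 0; u < V; ++u)
        for (int w : adj[u]) rev[w].push_back(u);

    std::vector<int> h(V, V);  // V marks "not yet reached"
    std::queue<int> q;
    h[t] = 0;
    q.push(t);
    while (!q.empty()) {
        int v = q.front();
        q.pop();
        for (int u : rev[v])
            if (h[u] == V) {   // first visit gives the shortest distance
                h[u] = h[v] + 1;
                q.push(u);
            }
    }
    return h;
}
```

Because BFS visits vertices in order of their distance from the root, the first time a vertex is reached fixes its height, and every edge u-v of the network then satisfies h[u] ≤ h[v] + 1.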

The intuition behind height functions is the following: When an active node’s height is less than the height of the source, it is possible that there is some way to push flow from that node down to the sink; when an active node’s height exceeds the height of the source, we know that that node’s excess needs to be pushed back to the source. To establish this latter fact, we reorient our view of Property 22.9, where we thought about the length of the shortest path to the sink as an upper bound on the height; instead, we think of the height as a lower bound on the shortest-path length:

Corollary If a vertex’s height is greater than V, then there is no path from that vertex to the sink in the residual network.

Proof: If there is a path from the vertex to the sink, the implication of Property 22.9 would be that the path’s length is greater than V, but that cannot be true because the network has only V vertices.

Now that we understand these basic mechanisms, the generic preflow-push algorithm is simple to describe. We start with any height function and assign zero flow to all edges except those connected to the source, which we fill to capacity. Then, we repeat the following step until no active vertices remain:

Choose an active vertex. Push flow through some eligible edge leaving that vertex (if any). If there are no such edges, increment the vertex’s height.

We do not specify what the initial height function is, how to choose the active vertex, how to choose the eligible edge, or how much flow to push. We refer to this generic method as the edge-based preflow-push algorithm.

The algorithm depends on the height function to identify eligible edges. We also use the height function both to prove that the algorithm computes a maxflow and to analyze performance. Therefore, it is critical to ensure that the height function remains valid throughout the execution of the algorithm.

Property 22.10 The edge-based–preflow-push algorithm preserves the validity of the height function.

Proof: We increment h(u) only if there are no edges u-v with h(u) = h(v) + 1. That is, h(u) < h(v) + 1 for all edges u-v before incrementing h(u), so h(u) ≤ h(v) + 1 afterward. For any incoming edges w-u, incrementing h(u) certainly preserves the inequality h(w) ≤ h(u) + 1. Incrementing h(u) does not affect inequalities corresponding to any other edge, and we never increment h(t) (or h(s)). Together, these observations imply the stated result.

All the excess flow emanates from the source. Informally, the generic preflow-push algorithm tries to push the excess flow to the sink; if it cannot do so, it eventually pushes the excess flow back to the source. It behaves in this manner because nodes with excess always stay connected to the source in the residual network.

Property 22.11 While the preflow-push algorithm is in execution on a flow network, there exists a (directed) path in that flow network’s residual network from each active vertex to the source, and there are no (directed) paths from source to sink in the residual network.

Proof: By induction. Initially, the only flow is in the edges leaving the source, which are filled to capacity, so the destination vertices of those edges are the only active vertices. Since the edges are filled to capacity, there is an edge in the residual network from each of those vertices to the source and there are no edges in the residual network leaving the source. Thus, the stated property holds for the initial flow.

The source is reachable from every active vertex because the only way to add to the set of active vertices is to push flow from an active vertex down an eligible edge. This operation leaves an edge in the residual network from the receiving vertex back to the active vertex, from which the source is reachable, by the inductive hypothesis.

Initially, no other nodes are reachable from the source in the residual network. The first time that another node u becomes reachable from the source is when flow is pushed back along u-s (thus causing s-u to be added to the residual network). But this can happen only when h(u) is greater than h(s), which can happen only after h(u) has been incremented, because there are no edges in the residual network to vertices with lower height. The same argument shows that all nodes reachable from the source have greater height. But the sink’s height is always 0, so it cannot be reachable from the source.

Corollary During the preflow-push algorithm, vertex heights are always less than 2V.

Proof: We need to consider only active vertices, since the height of each inactive vertex is either the same as or 1 greater than it was the last time that the vertex was active. By the same argument as in the proof of Property 22.9, the path from a given active vertex to the source implies that that vertex’s height is at most V−2 greater than the height of the source (t cannot be on the path). The height of the source never changes, and it is initially no greater than V. Thus, active vertices are of height at most 2V−2, and no vertex has height 2V or greater.

The generic preflow-push algorithm is simple to state and implement. Less clear, perhaps, is why it computes a maxflow. The height function is the key to establishing that it does so.

Property 22.12 The preflow-push algorithm computes a maxflow.

Proof: First, we have to show that the algorithm terminates. There must come a point where there are no active vertices. Once we push all the excess flow out of a vertex, that vertex cannot become active again until some of that flow is pushed back; and that pushback can happen only if the height of the vertex is increased. If we have a sequence of active vertices of unbounded length, some vertex has to appear an unbounded number of times; and it can do so only if its height grows without bound, contradicting the corollary to Property 22.11.

When there are no active vertices, the flow is feasible. Since, by Property 22.11, there is also no path from source to sink in the residual network, the flow is a maxflow, by the same argument as that in the proof of Property 22.5.

It is possible to refine the proof that the algorithm terminates to give an O(V^2 E) bound on its worst-case running time. We leave the details to exercises (see Exercises 22.66 through 22.67), in favor of the simpler proof in Property 22.13, which applies to a less general version of the algorithm. Specifically, the implementations that we consider are based on the following more specialized instructions for the iteration:

Choose an active vertex. Increase the flow along an eligible edge leaving that vertex (filling it if possible), continuing until the vertex becomes inactive or no eligible edges remain. In the latter case, increment the vertex’s height.

That is, once we have chosen a vertex, we push out of it as much flow as possible. If we get to the point that the vertex still has excess flow but no eligible edges remain, we increment the vertex’s height. We refer to this generic method as the vertex-based preflow-push algorithm. It is a special case of the edge-based generic algorithm, where we keep choosing the same active vertex until it becomes inactive or we have used up all the eligible edges leaving it. The correctness proof in Property 22.12 applies to any implementation of the edge-based generic algorithm, so it immediately implies that the vertex-based algorithm computes a maxflow.

Program 22.5 is an implementation of the vertex-based generic algorithm that uses a generalized queue for the active vertices. It is a direct implementation of the method just described and represents a family of algorithms that differ only in their initial height function (see, for example, Exercise 22.52) and in their generalized-queue ADT implementation. This implementation assumes that the generalized queue disallows duplicate vertices; alternatively, we could add code to Program 22.5 to avoid enqueueing duplicates (see Exercises 22.61 and 22.62).

Perhaps the simplest data structure to use for active vertices is a FIFO queue. Figure 22.28 shows the operation of the algorithm on a sample network. As illustrated in the figure, it is convenient to break up the sequence of active vertices chosen into a sequence of phases, where a phase is defined to be the contents of the queue after all the vertices that were on the queue in the previous phase have been processed. Doing so helps us to bound the total running time of the algorithm.

Property 22.13 The worst-case running time of the FIFO queue implementation of the preflow-push algorithm is proportional to V^2 E.

Proof: We bound the number of phases using a potential function. This argument is a simple example of a powerful technique in the

Figure 22.28 Residual networks (FIFO preflow-push)

This figure shows the flow networks (left) and the residual networks (right) for each phase of the FIFO preflow-push algorithm operating on our sample network. Queue contents are shown below the flow networks and distance labels below the residual networks. In the initial phase, we push flow through 0-1 and 0-2, thus making 1 and 2 active. In the second phase, we push flow from these two vertices to 3 and 4, which makes them active and 1 inactive (2 remains active and its distance label is incremented). In the third phase, we push flow through 3 and 4 to 5, which makes them inactive (2 still remains active and its distance label is again incremented). In the fourth phase, 2 is the only active node, and the edge 2-0 is admissible because of the distance-label increments, and one unit of flow is pushed back along 2-0 to complete the computation.

Program 22.5 Preflow-push maxflow implementation

This implementation of the generic vertex-based–preflow-push maxflow algorithm uses a generalized queue that disallows duplicates for active nodes. The wt vector contains each vertex’s excess flow and therefore implicitly defines the set of active vertices. By convention, s is initially active but never goes back on the queue and t is never active. The main loop chooses an active vertex v, then pushes flow through each of its eligible edges (adding vertices that receive the flow to the active list, if necessary), until either v becomes inactive or all its edges have been considered. In the latter case, v’s height is incremented and it goes back onto the queue.
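The vertex-based method with a FIFO queue can be sketched compactly on a capacity matrix. The code below is not the book's Program 22.5 and makes several simplifying assumptions that should be read as such: the network is a dense matrix cap[u][v] without self-loops, the initial height function is zero everywhere except h(s) = V, the edges leaving the source are filled directly rather than by treating s as an active vertex, and the function name preflowPushFIFO is illustrative.

```cpp
#include <algorithm>
#include <cassert>
#include <queue>
#include <vector>

// Vertex-based preflow-push with a FIFO queue of active vertices.
// cap[u][v] is the capacity of edge u-v (0 if absent).
// Returns the value of a maxflow from s to t.
int preflowPushFIFO(const std::vector<std::vector<int>>& cap, int s, int t) {
    const int V = static_cast<int>(cap.size());
    assert(0 <= s && s < V && 0 <= t && t < V && s != t);

    std::vector<std::vector<int>> res = cap;  // residual capacities
    std::vector<int> h(V, 0), excess(V, 0);
    std::vector<bool> queued(V, false);       // disallow duplicates on the queue
    std::queue<int> active;

    h[s] = V;  // assumption of this sketch: source starts at height V
    for (int v = 0; v < V; ++v) {
        if (v == s || cap[s][v] == 0) continue;
        int f = cap[s][v];                    // fill source edges to capacity
        res[s][v] -= f;
        res[v][s] += f;
        excess[v] += f;
        if (v != t && !queued[v]) { queued[v] = true; active.push(v); }
    }

    while (!active.empty()) {
        int u = active.front();
        active.pop();
        queued[u] = false;
        // Push along eligible edges until u is inactive or none remain.
        for (int v = 0; v < V && excess[u] > 0; ++v) {
            if (res[u][v] > 0 && h[u] == h[v] + 1) {
                int f = std::min(excess[u], res[u][v]);
                res[u][v] -= f;
                res[v][u] += f;
                excess[u] -= f;
                excess[v] += f;
                if (v != s && v != t && !queued[v]) {
                    queued[v] = true;
                    active.push(v);
                }
            }
        }
        if (excess[u] > 0) {  // still active: raise u and put it back
            ++h[u];
            queued[u] = true;
            active.push(u);
        }
    }
    return excess[t];  // all excess that reached the sink
}
```

The matrix representation keeps the sketch short at the cost of examining V potential edges per vertex; an adjacency-lists representation, as used throughout the chapter, avoids that cost on sparse networks. Replacing the std::queue with another container changes the order in which active vertices are chosen without affecting correctness.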

analysis of algorithms and data structures that we examine in more detail in Part 8.

Define the quantity ϕ to be 0 if there are no active vertices and to be the maximum height of the active vertices otherwise, then consider the effect of each phase on the value of ϕ. Let h_0(s) be the initial height of the source. At the beginning, ϕ = h_0(s); at the end, ϕ = 0.

First, we note that the number of phases where the height of some vertex increases is no more than 2V^2 − h_0(s), since each of the V vertex heights can be increased to a value of at most 2V, by the corollary to Property 22.11. Since ϕ can increase only if the height of some vertex increases, the number of phases where ϕ increases is no more than 2V^2 − h_0(s).

If, however, no vertex’s height is incremented during a phase, then ϕ must decrease by at least 1, since the effect of the phase was to push all excess flow from each active vertex to vertices that have smaller height.

Together, these facts imply that the number of phases must be less than 4V^2: The value of ϕ is h_0(s) at the beginning and can be incremented at most 2V^2 − h_0(s) times and therefore can be decremented at most 2V^2 times. The worst case for each phase is that all vertices are on the queue and all of their edges are examined, leading to the stated bound on the total running time.

This bound is tight. Figure 22.29 illustrates a family of flow networks for which the number of phases used by the preflow-push algorithm is proportional to V^2.

Because our implementations maintain an implicit representation of the residual network, they examine edges leaving a vertex even when those edges are not in the residual network (to test whether or not they are there). It is possible to show that we can reduce the bound in Property 22.13 from V^2 E to V^3 for an implementation that eliminates this cost by maintaining an explicit representation of the residual network. Although the theoretical bound is the lowest that we have seen for the maxflow problem, this change may not be worth the trouble, particularly for the sparse graphs that we see in practice (see Exercises 22.63 through 22.65).

Again, these worst-case bounds tend to be pessimistic and thus not necessarily useful for predicting performance on real networks (though the gap is not as excessive as we found for augmenting-path

Figure 22.29 FIFO preflow-push worst case

This network represents a family of networks with V vertices such that the total running time of the preflow-push algorithm is proportional to V^2. It consists of unit-capacity edges emanating from the source (vertex 0) and horizontal edges of capacity v−2 running from left to right towards the sink (vertex 10). In the initial phase of the preflow-push algorithm (top), we push one unit of flow out each edge from the source, making all the vertices active except the source and the sink. In a standard adjacency-lists representation, they appear on the FIFO queue of active vertices in reverse order, as shown below the network. In the second phase (center), we push one unit of flow from 9 to 10, making 9 inactive (temporarily); then we push one unit of flow from 8 to 9, making 8 inactive (temporarily) and making 9 active; then we push one unit of flow from 7 to 8, making 7 inactive (temporarily) and making 8 active; and so forth. Only 1 is left inactive. In the third phase (bottom), we go through a similar process to make 2 inactive, and the same process continues for V−2 phases.

algorithms). For example, the FIFO algorithm finds the flow in the network illustrated in Figure 22.30 in 15 phases, whereas the bound in the proof of Property 22.13 says only that it must do so in fewer than 182.

To improve performance, we might try using a stack, a randomized queue, or any other generalized queue in Program 22.5. One approach that has proved to do well in practice is to implement the generalized queue such that GQget returns the highest active vertex. We refer to this method as the highest-vertex–preflow-push maxflow algorithm. We can implement this strategy with a priority queue, although it is also possible to take advantage of the particular properties of heights to implement the generalized-queue operations in constant time. A worst-case time bound of V^2 √E (which is V^(5/2) for sparse graphs) has been proved for this algorithm (see reference section); as usual, this bound is pessimistic. Many other preflow-push variants have been proposed, several of which reduce the worst-case time bound to be close to VE (see reference section).

Table 22.2 shows performance results for preflow-push algorithms corresponding to those for augmenting-path algorithms in Table

Figure 22.30 Preflow-push algorithm (FIFO)

This sequence illustrates how the FIFO implementation of the preflow-push method finds a maximum flow in a sample network. It proceeds in phases: First it pushes as much flow as it can from the source along the edges leaving the source (top left). Then, it pushes flow from each of those nodes, continuing until all nodes are in equilibrium.

Table 22.2 Empirical study of preflow-push algorithms

This table shows performance parameters (number of vertices expanded and number of adjacency-list nodes touched) for various preflow-push network-flow algorithms for our sample Euclidean neighbor network (with random capacities with maxflow value 286) and with unit capacities (with maxflow value 6). Differences among the methods are minimal for both types of networks. For random capacities, the number of edges examined is about the same as for the random augmenting-path algorithm (see Table 22.1). For unit capacities, the augmenting-path algorithms examine substantially fewer edges for these networks.

22.1, for the two network models discussed in Section 22.2. These experiments show much less performance variation for the various preflow-push methods than we observed for augmenting-path methods.

There are many options to explore in developing preflow-push implementations. We have already discussed three major choices:

• Edge-based versus vertex-based generic algorithm
• Generalized queue implementation
• Initial assignment of heights

There are several other possibilities to consider and many options to try for each, leading to a multitude of different algorithms to study (see, for example, Exercises 22.56 through 22.60). The dependence of an algorithm’s performance on characteristics of the input network further multiplies the possibilities.

The two generic algorithms that we have discussed (augmenting-path and preflow-push) are among the most important from an extensive research literature on maxflow algorithms. The quest for better maxflow algorithms is still a potentially fruitful area for further research. Researchers are motivated to develop and study new algorithms and implementations by the reality of faster algorithms for practical problems and by the possibility that a simple linear algorithm exists for the maxflow problem. Until one is discovered, we can work confidently with the algorithms and implementations that we have discussed; numerous studies have shown them to be effective for a broad range of practical maxflow problems.
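Of the choices listed in this section, the generalized-queue implementation is the easiest to vary. As one illustration, the highest-vertex strategy can be sketched with a standard priority queue. The class below is hypothetical: its name and its put/get interface are assumptions of this example, not the book's generalized-queue ADT, and it takes the vertex's height as an explicit argument to put.

```cpp
#include <cassert>
#include <queue>
#include <utility>
#include <vector>

// Generalized queue whose get() returns the highest active vertex.
// Duplicate vertices are ignored while they are on the queue.
class HighestVertexQueue {
public:
    explicit HighestVertexQueue(int V) : onQueue_(V, false) {}

    bool empty() const { return pq_.empty(); }

    // Adds vertex v, keyed by its current height.
    void put(int v, int height) {
        assert(0 <= v && v < static_cast<int>(onQueue_.size()));
        if (onQueue_[v]) return;
        onQueue_[v] = true;
        pq_.push({height, v});
    }

    // Removes and returns the vertex with the largest recorded height.
    int get() {
        assert(!pq_.empty());
        int v = pq_.top().second;
        pq_.pop();
        onQueue_[v] = false;
        return v;
    }

private:
    std::priority_queue<std::pair<int, int>> pq_;  // max-heap on (height, vertex)
    std::vector<bool> onQueue_;
};
```

Recording the height at insertion time is safe in the vertex-based algorithm, because a vertex's height is incremented only after the vertex has been removed from the queue for processing, so the recorded key never goes stale while the vertex is waiting.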

Exercises

• 22.49 Describe the operation of the preflow-push algorithm in a network whose capacities are in equilibrium.

22.50 Use the concepts described in this section (height functions, eligible edges, and pushing of flow through edges) to describe augmenting-path maxflow algorithms.

22.51 Show, in the style of Figure 22.28, the flow and residual networks after each phase when you use the FIFO preflow-push algorithm to find a maxflow in the flow network shown in Figure 22.10.

• 22.52 Implement the initheights() function for Program 22.5, using breadth-first search from the sink.

22.53 Do Exercise 22.51 for the highest-vertex–preflow-push algorithm.

• 22.54 Modify Program 22.5 to implement the highest-vertex–preflow-push algorithm, by implementing the generalized queue as a priority queue. Run empirical tests so that you can add a line to Table 22.2 for this variant of the algorithm.

22.55 Plot the number of active vertices and the number of vertices and edges in the residual network as the FIFO preflow-push algorithm proceeds, for specific instances of various networks (see Exercises 22.7–12).

• 22.56 Implement the generic edge-based–preflow-push algorithm, using a generalized queue of eligible edges. Run empirical studies for various networks (see Exercises 22.7–12) to compare these to the basic methods, studying generalized-queue implementations that perform better than others in more detail.

22.57 Modify Program 22.5 to recalculate the vertex heights periodically to be shortest-path lengths to the sink in the residual network.

• 22.58 Evaluate the idea of pushing excess flow out of vertices by spreading it evenly among the outgoing edges, rather than perhaps filling some and leaving others empty.

22.59 Run empirical tests to determine whether the shortest-paths computation for the initial height function is justified in Program 22.5 by comparing its performance as given for various networks (see Exercises 22.7–12) with its performance when the vertex heights are just initialized to zero.

• 22.60 Experiment with hybrid methods involving combinations of the ideas above. Run empirical studies for various networks (see Exercises 22.7–12) to compare these to the basic methods, studying variations that perform better than others in more detail.

22.61 Modify the implementation of Program 22.5 such that it explicitly disallows duplicate vertices on the generalized queue. Run empirical tests for various networks (see Exercises 22.7–12) to determine the effect of your modification on actual running times.

22.62 What effect does allowing duplicate vertices on the generalized queue have on the worst-case running-time bound of Property 22.13?

22.63 Modify the implementation of Program 22.5 to maintain an explicit representation of the residual network.

• 22.64 Sharpen the bound in Property 22.13 to O(V^3) for the implementation of Exercise 22.63. Hint: Prove separate bounds on the number of pushes that correspond to deletion of edges in the residual network and on the number of pushes that do not result in full or empty edges.

22.65 Run empirical studies for various networks (see Exercises 22.7–12) to determine the effect of using an explicit representation of the residual network (see Exercise 22.63) on actual running times.

22.66 For the edge-based generic preflow-push algorithm, prove that the number of pushes that correspond to deleting an edge in the residual network is less than 2VE. Assume that the implementation keeps an explicit representation of the residual network.

• 22.67 For the edge-based generic preflow-push algorithm, prove that the number of pushes that do not correspond to deleting an edge in the residual network is less than 4V^2(V + E). Hint: Use the sum of the heights of the active vertices as a potential function.

•22.68RunempiricalstudiestodeterminetheactualnumberofedgesexaminedandtheratiooftherunningtimetoVforseveralversionsofthepreflow-pushalgorithm for various networks (see Exercises 22.7–12). Consider variousalgorithmsdescribedinthetextandinthepreviousexercises,andconcentrateonthosethatperformthebestonhugesparsenetworks.CompareyourresultswithyourresultfromExercise22.34.

• 22.69 Write a client program that does dynamic graphical animations of preflow-push algorithms. Your program should produce images like Figure 22.30 and the other figures in this section (see Exercise 22.48). Test your implementation for the Euclidean networks among Exercises 22.7–12.

Figure 22.31 Reduction from multiple sources and sinks

The network at the top has three sources (0, 1, and 2) and two sinks (5 and 6). To find a flow that maximizes the total flow out of the sources and into the sinks, we find a maxflow in the st-network illustrated at the bottom. This network is a copy of the original network, with the addition of a new source 7 and a new sink 8. There is an edge from 7 to each original-network source with capacity equal to the sum of the capacities of that source’s outgoing edges, and an edge from each original-network sink to 8 with capacity equal to the sum of the capacities of that sink’s incoming edges.
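The construction in this caption can be sketched in a few lines. The following is a minimal illustration only, assuming a plain edge-list representation; the names CapEdge and toSTNetwork are hypothetical and are not the book's GRAPH or EDGE classes. A vertex is treated as a source if it has outgoing edges only, and as a sink if it has incoming edges only.

```cpp
#include <cassert>
#include <vector>

// Hypothetical edge record (not the book's EDGE class): directed edge v->w.
struct CapEdge { int v, w, cap; };

// Builds an st-network from a network on vertices 0..V-1.
// The new source is vertex V and the new sink is vertex V+1.
std::vector<CapEdge> toSTNetwork(int V, const std::vector<CapEdge>& edges)
{
    std::vector<int> inDeg(V, 0), outDeg(V, 0), inCap(V, 0), outCap(V, 0);
    for (const CapEdge& e : edges) {
        assert(0 <= e.v && e.v < V && 0 <= e.w && e.w < V);
        outDeg[e.v]++; outCap[e.v] += e.cap;
        inDeg[e.w]++;  inCap[e.w]  += e.cap;
    }
    const int s = V, t = V + 1;
    std::vector<CapEdge> result(edges);   // keep a copy of every original edge
    for (int v = 0; v < V; v++) {
        // Edge from s to each source, capacity = that source's total outflow capacity.
        if (inDeg[v] == 0 && outDeg[v] > 0) result.push_back({s, v, outCap[v]});
        // Edge from each sink to t, capacity = that sink's total inflow capacity.
        if (outDeg[v] == 0 && inDeg[v] > 0) result.push_back({v, t, inCap[v]});
    }
    return result;
}
```

Any st-maxflow routine can then be run on the returned edge list with source V and sink V+1; the flows on the copied edges give the flow in the original network.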

22.4 Maxflow Reductions

In this section, we consider a number of reductions to the maxflow problem in order to demonstrate that the maxflow algorithms of Sections 22.2 and 22.3 are important in a broad context. We can remove various restrictions on the network and solve other flow problems; we can solve other network- and graph-processing problems; and we can solve problems that are not network problems at all. This section is devoted to examples of such uses, to establish maxflow as a general problem-solving model.

We also consider relationships between the maxflow problem and problems that are more difficult so as to set the context for considering those problems later on. In particular, we note that the maxflow problem is a special case of the mincost-flow problem that is the subject of Sections 22.5 and 22.6, and we describe how to formulate maxflow problems as LP problems, which we will address in Part 8. Mincost flow and LP represent problem-solving models that are more general than the maxflow model. Although we normally can solve maxflow problems more easily with the specialized algorithms of Sections 22.2 and 22.3 than with algorithms that solve these more general problems, it is important to be cognizant of the relationships among problem-solving models as we progress to more powerful ones.

We use the term standard maxflow problem to refer to the version of the problem that we have been considering (maxflow in edge-capacitated st-networks). This usage is solely for easy reference in this section. Indeed, we begin by considering reductions that show the restrictions in the standard problem to be essentially immaterial, because several other flow problems reduce to or are equivalent to the standard problem. We could adopt any of the equivalent problems as the “standard” problem. A simple example of such a problem, already noted as a consequence of Property 22.1, is that of finding a circulation in a network that maximizes the flow in a specified edge. Next, we consider other ways to pose the problem, in each case noting its relationship to the standard problem.

Maxflow in general networks Find the flow in a network that maximizes the total outflow from its sources (and therefore the total inflow to its sinks). By convention, define the flow to be zero if there are no sources or no sinks.

Property 22.14 The maxflow problem for general networks is equivalent to the maxflow problem for st-networks.

Proof: A maxflow algorithm for general networks will clearly work for st-networks, so we just need to establish that the general problem reduces to the st-network problem. To do so, first find the sources and sinks (using, for example, the method that we used to initialize the queue in Program 19.8), and return zero if there are none of either. Then, add a dummy source vertex s and edges from s to each source in the network (with each such edge’s capacity set to that edge’s destination vertex’s outflow) and a dummy sink vertex t and edges from each sink in the network to t (with each such edge’s capacity set to that edge’s source vertex’s inflow). Figure 22.31 illustrates this reduction. Any maxflow in the st-network corresponds directly to a maxflow in the original network.

Vertex-capacity constraints Given a flow network, find a maxflow satisfying additional constraints specifying that the flow through each vertex must not exceed some fixed capacity.

Property 22.15 The maxflow problem for flow networks with capacity constraints on vertices is equivalent to the standard maxflow problem.

Proof: Again, we could use any algorithm that solves the capacity-constraint problem to solve a standard problem (by setting capacity constraints at each vertex to be larger than its inflow or outflow), so we need only to show a reduction to the standard problem. Given a flow network with capacity constraints, construct a standard flow network with two vertices u and u* corresponding to each original vertex u, with all incoming edges to the original vertex going to u, all outgoing edges coming from u*, and an edge u-u* of capacity equal to the vertex capacity. This construction is illustrated in Figure 22.32. The flows in the edges of the form u*-v in any maxflow for the transformed network give a maxflow for the original network that must satisfy the vertex-capacity constraints because of the edges of the form u-u*.

Figure 22.32 Removing vertex capacities

To solve the problem of finding a maxflow in the network at the top such that flow through each vertex does not exceed the capacity bound given in the vertex-indexed array capV, we build the standard network at the bottom: Associate a new vertex u* (where u* denotes u+V) with each vertex u, add an edge u-u* whose capacity is the capacity of u, and include an edge u*-v for each edge u-v. Each u-u* pair is encircled in the diagram. Any flow in the bottom network corresponds directly to a flow in the top network that satisfies the vertex-capacity constraints.
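As a rough sketch of the vertex-splitting step described in this caption, the function below produces the edge list of the standard network. It assumes an edge-list representation; CapEdge and splitVertices are hypothetical names, and only capV comes from the caption.

```cpp
#include <cassert>
#include <vector>

// Hypothetical edge record (not the book's EDGE class): directed edge v->w.
struct CapEdge { int v, w, cap; };

// Removes vertex capacities: vertex u keeps the incoming edges, and its
// partner u* = u + V takes over the outgoing edges.
std::vector<CapEdge> splitVertices(int V, const std::vector<CapEdge>& edges,
                                   const std::vector<int>& capV)
{
    assert(static_cast<int>(capV.size()) == V);
    std::vector<CapEdge> result;
    for (int u = 0; u < V; u++)
        result.push_back({u, u + V, capV[u]});        // edge u-u* carries u's capacity
    for (const CapEdge& e : edges) {
        assert(0 <= e.v && e.v < V && 0 <= e.w && e.w < V);
        result.push_back({e.v + V, e.w, e.cap});      // edge u-v becomes u*-v
    }
    return result;
}
```

In the transformed network the source is s (it receives no incoming edges) and the sink is t* = t + V, so that the capacity of t also limits the flow that reaches it.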

Allowing multiple sinks and sources or adding capacity constraints seems to generalize the maxflow problem; the interest of Properties 22.14 and 22.15 is that these problems are actually no more difficult than the standard problem. Next, we consider a version of the problem that might seem to be easier to solve.

Acyclic networks Find a maxflow in an acyclic network. Does the presence of cycles in a flow network make more difficult the task of computing a maxflow?

We have seen many examples of digraph-processing problems that are much more difficult to solve in the presence of cycles. Perhaps the most prominent example is the shortest-paths problem in weighted digraphs with edge weights that could be negative (see Section 21.7), which is simple to solve in linear time if there are no cycles but NP-complete if cycles are allowed. But the maxflow problem, remarkably, is no easier for acyclic networks.

Property 22.16 The maxflow problem for acyclic networks is equivalent to the standard maxflow problem.

Proof: Again, we need only to show that the standard problem reduces to the acyclic problem. Given any network with V vertices and E edges, we construct a network with 2V+2 vertices and E+3V edges that is not just acyclic but has a simple structure.

Let u* denote u+V, and build a bipartite digraph consisting of two vertices u and u* corresponding to each vertex u in the original network and one edge u-v* corresponding to each edge u-v in the original network with the same capacity. Now, add to the bipartite digraph a source s and a sink t and, for each vertex u in the original graph, an edge s-u and an edge u*-t, both of capacity equal to the sum of the capacities of u’s outgoing edges in the original network. Also, let X be the sum of the capacities of the edges in the original network, and add edges from u to u*, with capacity X+1. This construction is illustrated in Figure 22.33.

Figure 22.33 Reduction to acyclic network

Each vertex u in the top network corresponds to two vertices u and u* (where u* denotes u+V) in the bottom network and each edge u-v in the top network corresponds to an edge u-v* in the bottom network. Additionally, the bottom network has uncapacitated edges u-u*, a source s with an edge to each unstarred vertex, and a sink t with an edge from each starred vertex. The shaded and unshaded vertices (and edges which connect shaded to unshaded) illustrate the direct relationship among cuts in the two networks (see text).
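The construction in the proof of Property 22.16 can be written down directly. The sketch below is illustrative only and assumes an edge-list representation; CapEdge and toAcyclic are hypothetical names. It numbers the new source 2V and the new sink 2V+1, and gives each u-u* edge the capacity X+1 used in the proof.

```cpp
#include <cassert>
#include <vector>

// Hypothetical edge record (not the book's EDGE class): directed edge v->w.
struct CapEdge { int v, w, cap; };

// Builds the bipartite acyclic network on 2V+2 vertices with E+3V edges.
// Vertex u* is numbered u + V, the source is 2V, and the sink is 2V+1.
std::vector<CapEdge> toAcyclic(int V, const std::vector<CapEdge>& edges)
{
    std::vector<int> outCap(V, 0);
    int X = 0;                                     // sum of all edge capacities
    std::vector<CapEdge> result;
    for (const CapEdge& e : edges) {
        assert(0 <= e.v && e.v < V && 0 <= e.w && e.w < V);
        outCap[e.v] += e.cap;
        X += e.cap;
        result.push_back({e.v, e.w + V, e.cap});   // edge u-v becomes u-v*
    }
    const int s = 2 * V, t = 2 * V + 1;
    for (int u = 0; u < V; u++) {
        result.push_back({s, u, outCap[u]});       // s-u
        result.push_back({u + V, t, outCap[u]});   // u*-t
        result.push_back({u, u + V, X + 1});       // u-u*, effectively uncapacitated
    }
    return result;
}
```

Every edge runs from s to an unstarred vertex, from an unstarred vertex to a starred vertex, or from a starred vertex to t, so the result has no cycles even when the input does.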

To show that any maxflow in the original network corresponds to a maxflow in the transformed network, we consider cuts rather than flows. Given any st-cut of size c in the original network, we show how to construct an st-cut of size c+X in the transformed network; and, given any minimal st-cut of size c+X in the transformed network, we show how to construct an st-cut of size c in the original network. Thus, given a minimal cut in the transformed network, the corresponding cut in the original network is minimal. Moreover, our construction gives a flow whose value is equal to the minimal-cut capacity, so it is a maxflow.

Given any cut of the original network that separates the source from the sink, let S be the source’s vertex set and T the sink’s vertex set. Construct a cut of the transformed network by putting vertices in S in a set with s and vertices in T in a set with t and putting u and u* on the same side of the cut for all u, as illustrated in Figure 22.33. For every u, either s-u or u*-t is in the cut set, and u-v* is in the cut set if and only if u-v is in the cut set of the original network; so the total capacity of the cut is equal to the capacity of the cut in the original network plus X.

Given any minimal st-cut of the transformed network, let S* be s’s vertex set and T* t’s vertex set. Our goal is to construct a cut of the same capacity with u and u* both in the same cut vertex set for all u so that the correspondence of the previous paragraph gives a cut in the original network, completing the proof. First, if u is in S* and u* in T*, then u-u* must be a crossing edge, which is a contradiction: u-u* cannot be in any minimal cut, because a cut consisting of all the edges corresponding to the edges in the original graph is of lower cost. Second, if u is in T* and u* is in S*, then s-u must be in the cut, because that is the only edge connecting s to u. But we can create a cut of equal cost by substituting all the edges directed out of u for s-u, moving u to S*.

Given any flow in the transformed network of value c+X, we simply assign the same flow value to each corresponding edge in the original network to get a flow with value c. The cut transformation at the end of the previous paragraph does not affect this assignment, because it manipulates edges with flow value zero.

The result of the reduction not only is an acyclic network, but also has a simple bipartite structure. The reduction says that we could, if we wished, adopt these simpler networks, rather than general networks, as our standard. It would seem that perhaps this special structure would lead to faster maxflow algorithms. But the reduction shows that we could use any algorithm that we found for these special acyclic networks to solve maxflow problems in general networks, at only modest extra cost. Indeed, the classical maxflow algorithms exploit the flexibility of the general network model: Both the augmenting-path and preflow-push approaches that we have considered use the concept of a residual network, which involves introducing cycles into the network.

Figure 22.34 Reduction from undirected networks

To solve a maxflow problem in an undirected network, we can consider it to be a directed network with edges in each direction. Note that the residual network for such a network has four edges corresponding to each edge in the undirected network.
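A small sketch of this reduction, under the assumption of an edge-list representation: toDirected doubles each undirected edge, and cancelOpposing nets out opposing flows on a pair of directed edges so that the result can be read as a single flow, with a direction, on the undirected edge. Both function names and the CapEdge record are hypothetical.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Hypothetical edge record (not the book's EDGE class).
struct CapEdge { int v, w, cap; };

// Replaces each undirected edge v-w by the two directed edges v->w and w->v,
// both with the capacity of the undirected edge.
std::vector<CapEdge> toDirected(const std::vector<CapEdge>& undirected)
{
    std::vector<CapEdge> result;
    for (const CapEdge& e : undirected) {
        assert(e.cap >= 0);
        result.push_back({e.v, e.w, e.cap});
        result.push_back({e.w, e.v, e.cap});
    }
    return result;
}

// Given flow f on u->v and flow g on v->u, returns the pair
// (flow from u to v, flow from v to u) with the smaller flow cancelled.
std::pair<int, int> cancelOpposing(int f, int g)
{
    assert(f >= 0 && g >= 0);
    return f >= g ? std::make_pair(f - g, 0) : std::make_pair(0, g - f);
}
```

Cancelling does not change the net flow into or out of either endpoint, so the value of the flow is preserved.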

When we have a maxflow problem for an acyclic network, we typically use the standard algorithm for general networks to solve it. The construction of Property 22.16 is elaborate, and it illustrates that reduction proofs can require care, if not ingenuity. Such proofs are important because not all versions of the maxflow problem are equivalent to the standard problem, and we need to know the extent of the applicability of our algorithms. Researchers continue to explore this topic because reductions relating various natural problems have not yet been established, as illustrated by the following example.

Maxflow in undirected networks An undirected flow network is a weighted graph with integer edge weights that we interpret to be capacities. A circulation in such a network is an assignment of weights and directions to the edges satisfying the conditions that the flow on each edge is no greater than that edge’s capacity and that the total flow into each vertex is equal to the total flow out of that vertex. The undirected maxflow problem is to find a circulation that maximizes the flow in a specified direction in a specified edge (that is, from some vertex s to some other vertex t). This problem perhaps corresponds more naturally than the standard problem to our liquid-flowing-through-pipes model: It corresponds to allowing liquid to flow through a pipe in either direction.

Property 22.17 The maxflow problem for undirected st-networks reduces to the maxflow problem for st-networks.

Proof: Given an undirected network, construct a directed network with the same vertices and two directed edges corresponding to each edge, one in each direction, both with the capacity of the undirected edge. Any flow in the original network certainly corresponds to a flow with the same value in the transformed network. The converse is also true: If u-v has flow f and v-u flow g in the directed network, then we can put the flow f−g in u-v in the undirected network if f≥g; g−f in v-u otherwise. Thus, any maxflow in the directed network is a maxflow in the undirected network: The construction gives a flow, and any flow with a higher value in the directed network would correspond to some flow with a higher value in the undirected network; but no such flow exists.

This proof does not establish that the problem for undirected networks is equivalent to the standard problem. That is, it leaves open the possibility that finding maxflows in undirected networks is easier than finding maxflows in standard networks (see Exercise 22.81).

Figure 22.35 Feasible flow

In a feasible-flow problem, we specify supply and demand constraints at the vertices in addition to the capacity constraints on the edges. We seek any flow for which outflow equals supply plus inflow at supply vertices and inflow equals outflow plus demand at demand vertices. Three solutions to the feasible-flow problem at left are shown on the right.

In summary, we can handle networks with multiple sinks and sources, undirected networks, networks with capacity constraints on vertices, and many other types of networks (see, for example, Exercise 22.79) with the maxflow algorithms for st-networks in the previous two sections. In fact, Property 22.16 says that we could solve all these problems with even an algorithm that works for only acyclic networks.

Next, we consider a problem that is not an explicit maxflow problem but that we can reduce to the maxflow problem and therefore solve with maxflow algorithms. It is one way to formalize a basic version of the merchandise-distribution problem described at the beginning of this chapter.

Feasible flow Suppose that a weight is assigned to each vertex in a flow network, and is to be interpreted as supply (if positive) or demand (if negative), with the sum of the vertex weights equal to zero. Define a flow to be feasible if the difference between each vertex’s outflow and inflow is equal to that vertex’s weight (supply if positive and demand if negative). Given such a network, determine whether or not a feasible flow exists. Figure 22.35 illustrates a feasible-flow problem.

Supply vertices correspond to warehouses in the merchandise-distribution problem; demand vertices correspond to retail outlets; and edges correspond to roads on the trucking routes. The feasible-flow problem answers the basic question of whether it is possible to find a way to ship the goods such that supply meets demand everywhere.

Program 22.6 Feasible flow via reduction to maxflow

This class solves the feasible-flow problem by reduction to maxflow, using the construction illustrated in Figure 22.36. The constructor takes as argument a network and a vertex-indexed vector sd such that sd[i] represents, if it is positive, the supply at vertex i and, if it is negative, the demand at vertex i.

As indicated in Figure 22.36, the constructor makes a new graph with the same edges but with two extra vertices s and t, with edges from s to the supply nodes and from the demand nodes to t. Then it finds a maxflow and checks whether all the extra edges are filled to capacity.
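The steps just described can be sketched as follows. This is not the book's class: it is a self-contained illustration that assumes a dense capacity matrix and a simple shortest-augmenting-path maxflow in place of the book's GRAPH and MAXFLOW classes. The names Edge, maxflow, and feasible are hypothetical; only sd follows the description above.

```cpp
#include <algorithm>
#include <cassert>
#include <climits>
#include <queue>
#include <vector>

typedef std::vector<std::vector<int> > Matrix;

// Shortest-augmenting-path maxflow on a residual-capacity matrix (a stand-in
// for any st-maxflow routine). On return, cap holds the residual capacities.
int maxflow(Matrix& cap, int s, int t)
{
    const int n = static_cast<int>(cap.size());
    int total = 0;
    for (;;) {
        std::vector<int> parent(n, -1);
        parent[s] = s;
        std::queue<int> q;
        q.push(s);
        while (!q.empty() && parent[t] == -1) {        // BFS in the residual network
            int v = q.front(); q.pop();
            for (int w = 0; w < n; w++)
                if (parent[w] == -1 && cap[v][w] > 0) { parent[w] = v; q.push(w); }
        }
        if (parent[t] == -1) return total;             // no augmenting path remains
        int d = INT_MAX;
        for (int w = t; w != s; w = parent[w]) d = std::min(d, cap[parent[w]][w]);
        for (int w = t; w != s; w = parent[w]) { cap[parent[w]][w] -= d; cap[w][parent[w]] += d; }
        total += d;
    }
}

struct Edge { int v, w, cap; };    // hypothetical edge record

// Returns true if the network on vertices 0..V-1 with supplies (sd[i] > 0)
// and demands (sd[i] < 0) has a feasible flow.
bool feasible(int V, const std::vector<Edge>& edges, const std::vector<int>& sd)
{
    assert(static_cast<int>(sd.size()) == V);
    const int s = V, t = V + 1;                        // the two extra vertices
    Matrix cap(V + 2, std::vector<int>(V + 2, 0));
    for (const Edge& e : edges) cap[e.v][e.w] += e.cap;
    int supply = 0, demand = 0;
    for (int i = 0; i < V; i++) {
        if (sd[i] > 0) { cap[s][i] = sd[i];  supply += sd[i]; }
        if (sd[i] < 0) { cap[i][t] = -sd[i]; demand -= sd[i]; }
    }
    if (supply != demand) return false;                // weights must sum to zero
    return maxflow(cap, s, t) == supply;               // all extra edges are full
}
```

Because total supply equals total demand, a flow of value equal to the supply fills every edge out of s and every edge into t, which is the check described above.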

Figure 22.36 Reduction from feasible flow

This network is a standard network constructed from the feasible-flow problem in Figure 22.35 by adding edges from a new source vertex to the supply vertices (each with capacity equal to the amount of the supply) and edges to a new sink vertex from the demand vertices (each with capacity equal to the amount of the demand). The network in Figure 22.35 has a feasible flow if and only if this network has a flow (a maxflow) that fills all the edges from the source and all the edges to the sink.

Property 22.18 The feasible-flow problem reduces to the maxflow problem.

Proof: Given a feasible-flow problem, construct a network with the same vertices and edges but with no weights on the vertices. Instead, add a source vertex s that has an edge to each supply vertex with weight equal to that vertex’s supply and a sink vertex t that has an edge from each demand vertex with weight equal to the negation of that vertex’s demand (so that the edge weight is positive). Solve the maxflow problem on this network. The original network has a feasible flow if and only if all the edges out of the source and all the edges into the sink are filled to capacity in this flow. Figure 22.36 illustrates an example of this reduction.

Developing classes that implement reductions of the type that we have been considering can be a challenging software-engineering task, primarily because the objects that we are manipulating are represented with complicated data structures. To reduce another problem to a standard maxflow problem, should we create a new network? Some of the problems require extra data, such as vertex capacities or supply and demand, so creating a standard network without these data might be justified. Our use of edge pointers plays a critical role here: If we copy a network’s edges and then compute a maxflow, what are we to do with the result? Transferring the computed flow (a weight on each edge) from one network to another when both are represented with adjacency lists is not a trivial computation. With edge pointers, the new network has copies of pointers, not edges, so we can transfer flow assignments right through to the client’s network. Program 22.6 is an implementation that illustrates some of these considerations in a class for solving feasible-flow problems using the reduction of Property 22.18.

A canonical example of a flow problem that we cannot handle with the maxflow model, and that is the subject of Sections 22.5 and 22.6, is an extension of the feasible-flow problem. We add a second set of edge weights that we interpret as costs, define flow costs in terms of these weights, and ask for a feasible flow of minimal cost. This model formalizes the general merchandise-distribution problem. We are interested not just in whether it is possible to move the goods but also in what is the lowest-cost way to move them.

All the problems that we have considered so far in this section have the same basic goal (computing a flow in a flow network), so it is perhaps not surprising that we can handle them with a flow-network problem-solving model. As we saw with the maxflow–mincut theorem, we can use maxflow algorithms to solve graph-processing problems that seem to have little to do with flows. We now turn to examples of this kind.

Maximum-cardinality bipartite matching Given a bipartite graph, find a set of edges of maximum cardinality such that each vertex is connected to at most one other vertex.

For brevity, we refer to this problem simply as the bipartite-matching problem except in contexts where we need to distinguish it from similar problems. It formalizes the job-placement problem discussed at the beginning of this chapter. Vertices correspond to individuals and employers; edges correspond to a “mutual interest in the job” relation. A solution to the bipartite-matching problem maximizes total employment. Figure 22.37 illustrates the bipartite graph that models the example problem in Figure 22.3.

Figure 22.37 Bipartite matching

This instance of the bipartite-matching problem formalizes the job-placement example depicted in Figure 22.3. Finding the best way to match the students to the jobs in that example is equivalent to finding the maximum number of vertex-disjoint edges in this bipartite graph.

It is an instructive exercise to think about finding a direct solution to the bipartite-matching problem, without using the graph model. For example, the problem amounts to the following combinatorial puzzle: “Find the largest subset of a set of pairs of integers (drawn from disjoint sets) with the property that no two pairs have the same integer.” The example depicted in Figure 22.37 corresponds to solving this puzzle on the pairs 0-6, 0-7, 0-8, 1-6, and so forth. The problem seems straightforward at first, but, as was true of the Hamilton-path problem that we considered in Section 17.7 (and many other problems), a naive approach that chooses pairs in some systematic way until finding a contradiction might require exponential time. That is, there are far too many subsets of the pairs for us to try all possibilities; a solution to the problem must be clever enough to examine only a few of them. Solving specific matching puzzles like the one just given or developing algorithms that can solve efficiently any such puzzle are nontrivial tasks that help to demonstrate the power and utility of the network-flow model, which provides a reasonable way to do bipartite matching.

Property 22.19 The bipartite-matching problem reduces to the maxflow problem.

Proof: Given a bipartite-matching problem, construct an instance of the maxflow problem by directing all edges from one set to the other, adding a source vertex with edges directed to all the members of one set in the bipartite graph, and adding a sink vertex with edges directed from all the members of the other set. To make the resulting digraph a network, assign each edge capacity 1. Figure 22.38 illustrates this construction.

Now, any solution to the maxflow problem for this network provides a solution to the corresponding bipartite-matching problem. The matching corresponds exactly to those edges between vertices in the two sets that are filled to capacity by the maxflow algorithm. First, the network flow always gives a legal matching: Since each vertex has an edge of capacity one either coming in (from the source) or going out (to the sink), at most one unit of flow can go through each vertex, implying in turn that each vertex will be included at most once in the matching. Second, no matching can have more edges, since any such matching would lead directly to a better flow than that produced by the maxflow algorithm.

Figure 22.38 Reduction from bipartite matching

To find a maximum matching in a bipartite graph (top), we construct an st-network (bottom) by directing all the edges from the top row to the bottom row, adding a new source with edges to each vertex on the top row, adding a new sink with edges from each vertex on the bottom row, and assigning capacity 1 to all edges. In any flow, at most one outgoing edge from each vertex on the top row can be filled and at most one incoming edge to each vertex on the bottom row can be filled, so a solution to the maxflow problem on this network gives a maximum matching for the bipartite graph.

For example, in Figure 22.38, an augmenting-path maxflow algorithm might use the paths s-0-6-t, s-1-7-t, s-2-8-t, s-4-9-t, s-5-10-t, and s-3-6-0-7-1-11-t to compute the matching 0-7, 1-11, 2-8, 3-6, 4-9, and 5-10. Thus, there is a way to match all the students to jobs in Figure 22.3.

Program 22.7 is a client program that reads a bipartite-matching problem from standard input and uses the reduction described in this proof to solve it. What is the running time of this program for huge networks? Certainly, the running time depends on the maxflow algorithm and implementation that we use. Also, we need to take into account that the networks that we build have a special structure (unit-capacity bipartite flow networks): not only do the running times of the various maxflow algorithms that we have considered not necessarily approach their worst-case bounds, but also we can substantially reduce the bounds. For example, the first bound that we considered, for the generic augmenting-path algorithm, provides a quick answer.

Corollary The time required to find a maximum-cardinality matching in a bipartite graph is O(VE).

Proof: Immediate from Property 22.6.

Program 22.7 Bipartite matching via reduction to maxflow

This client reads a bipartite matching problem with V + V vertices and E edges from standard input, then constructs a flow network corresponding to the bipartite matching problem, finds the maximum flow in the network, and uses the solution to print a maximum bipartite matching.
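A minimal sketch of such a client's core follows. It is not the book's program: it assumes a dense capacity matrix and a shortest-augmenting-path maxflow rather than the book's GRAPH and MAXFLOW classes, takes the pairs as a function argument instead of reading standard input, and uses the hypothetical names maxflow and bipartiteMatching. Vertices 0..V-1 form one set and V..2V-1 the other.

```cpp
#include <algorithm>
#include <cassert>
#include <climits>
#include <queue>
#include <utility>
#include <vector>

typedef std::vector<std::vector<int> > Matrix;
typedef std::vector<std::pair<int, int> > Pairs;

// Shortest-augmenting-path maxflow on a residual-capacity matrix (a stand-in
// for any st-maxflow routine). On return, cap holds the residual capacities.
int maxflow(Matrix& cap, int s, int t)
{
    const int n = static_cast<int>(cap.size());
    int total = 0;
    for (;;) {
        std::vector<int> parent(n, -1);
        parent[s] = s;
        std::queue<int> q;
        q.push(s);
        while (!q.empty() && parent[t] == -1) {
            int v = q.front(); q.pop();
            for (int w = 0; w < n; w++)
                if (parent[w] == -1 && cap[v][w] > 0) { parent[w] = v; q.push(w); }
        }
        if (parent[t] == -1) return total;
        int d = INT_MAX;
        for (int w = t; w != s; w = parent[w]) d = std::min(d, cap[parent[w]][w]);
        for (int w = t; w != s; w = parent[w]) { cap[parent[w]][w] -= d; cap[w][parent[w]] += d; }
        total += d;
    }
}

// Returns a maximum matching for the bipartite graph whose edges are the
// given (left, right) pairs, with left in 0..V-1 and right in V..2V-1.
Pairs bipartiteMatching(int V, const Pairs& pairs)
{
    const int s = 2 * V, t = 2 * V + 1;
    Matrix cap(2 * V + 2, std::vector<int>(2 * V + 2, 0));
    for (const std::pair<int, int>& p : pairs) {
        assert(0 <= p.first && p.first < V && V <= p.second && p.second < 2 * V);
        cap[p.first][p.second] = 1;                    // edge directed left to right
    }
    for (int u = 0; u < V; u++) { cap[s][u] = 1; cap[u + V][t] = 1; }
    const Matrix orig = cap;                           // remember which edges exist
    maxflow(cap, s, t);
    Pairs matching;
    for (int u = 0; u < V; u++)
        for (int w = V; w < 2 * V; w++)
            if (orig[u][w] == 1 && cap[u][w] == 0)     // edge filled to capacity
                matching.push_back(std::make_pair(u, w));
    return matching;
}
```

The matched pairs are exactly the left-to-right edges that the flow fills, as in the proof of Property 22.19. The matrix costs space proportional to V^2, so this form suits small examples only.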

The operation of augmenting-path algorithms on unit-capacity bipartite networks is simple to describe. Each augmenting path fills one edge from the source and one edge into the sink. These edges are never used as back edges, so there are at most V augmenting paths. The VE upper bound holds for any algorithm that finds augmenting paths in time proportional to E.

Table 22.3 shows performance results for solving random bipartite-matching problems using various augmenting-path algorithms. It is clear from this table that actual running times for this problem are closer to the VE worst case than to the optimal (linear) time. It is possible, with judicious choice and tuning of the maxflow implementation, to speed up this method by a factor of √V (see Exercises 22.91 and 22.92).

Table 22.3 Empirical study for bipartite matching

This table shows performance parameters (number of vertices expanded and number of adjacency-list nodes touched) when various augmenting-path maxflow algorithms are used to compute a maximum bipartite matching for graphs with 2000 pairs of vertices and 500 edges (top) and 4000 edges (bottom). For this problem, depth-first search is the most effective method.

This problem is representative of a situation that we face more frequently as we examine new problems and more general problem-solving models, demonstrating the effectiveness of reduction as a practical problem-solving tool. If we can find a reduction to a known general model such as the maxflow problem, we generally view that as a major step toward developing a practical solution, because it at least indicates not only that the problem is tractable, but also that we have numerous efficient algorithms for solving the problem. In many situations, it is reasonable to use an existing maxflow class to solve the problem and move on to the next problem. If performance remains a critical issue, we can study the relative performance of various maxflow algorithms or implementations, or we can use their behavior as the starting point to develop a better, special-purpose algorithm. The general problem-solving model provides both an upper bound that we can choose either to live with or to strive to improve, and a host of implementations that have proved effective on a variety of other problems.

Next, we discuss problems relating to connectivity in graphs. Before considering the use of maxflow algorithms to solve connectivity problems, we examine the use of the maxflow–mincut theorem to take care of a piece of unfinished

business from Chapter 18: The proofs of the basic theorems relating to paths and cuts in undirected graphs. These proofs are further testimony to the fundamental importance of the maxflow–mincut theorem.

Property 22.20 (Menger’s Theorem) The minimum number of edges whose removal disconnects two vertices in a digraph is equal to the maximum number of edge-disjoint paths between the two vertices.

Proof: Given a digraph, define a flow network with the same vertices and edges with all edge capacities defined to be 1. By Property 22.2, we can represent any st-flow as a set of edge-disjoint paths from s to t, with the number of such paths equal to the value of the flow. The capacity of any st-cut is equal to that cut’s cardinality. Given these facts, the maxflow–mincut theorem implies the stated result.

The corresponding results for undirected graphs, and for vertex connectivity for digraphs and for undirected graphs, involve reductions similar to those considered here and are left for exercises (see Exercises 22.94 through 22.96).

Now we turn to algorithmic implications of the direct connection between flows and connectivity that is established by the maxflow–mincut theorem. Property 22.5 is perhaps the most important algorithmic implication (the mincut problem reduces to the maxflow problem), but the converse is not known to be true (see Exercise 22.47). Intuitively, it seems as though knowing a mincut should make easier the task of finding a maxflow, but no one has been able to demonstrate how. This basic example highlights the need to proceed with care when working with reductions among problems.

Still, we can also use maxflow algorithms to handle numerous connectivity problems. For example, they help solve the first nontrivial graph-processing problems that we encountered, in Chapter 18.

Edge connectivity What is the minimum number of edges that need to be removed to separate a given graph into two pieces? Find a set of edges of minimal cardinality that does this separation.

Vertex connectivity What is the minimum number of vertices that need to be removed to separate a given graph into two pieces? Find a set of vertices of minimal cardinality that does this separation.

These problems also are relevant for digraphs, so there are a total of four problems to consider. As with Menger’s theorem, we consider one of them in detail (edge connectivity in undirected graphs) and leave the others for exercises.

Property 22.21 The time required to determine the edge connectivity of an undirected graph is O(E^2).

Proof: We can compute the minimum size of any cut that separates two given vertices by computing the maxflow in the st-network formed from the graph by assigning unit capacity to each edge. The edge connectivity is equal to the minimum of these values over all pairs of vertices.

We do not need to do the computation for all pairs of vertices, however. Let s* be a vertex of minimal degree in the graph. Note that the degree of s* can be no greater than 2E/V. Consider any minimum cut of the graph. By definition, the number of edges in the cut set is equal to the graph’s edge connectivity. The vertex s* appears in one of the cut’s vertex sets, and the other set must have some vertex t, so the size of any minimal cut separating s* and t must be equal to the graph’s edge connectivity. Therefore, if we solve V−1 maxflow problems (using s* as the source and each other vertex as the sink), the minimum flow value found is the edge connectivity of the network.

Now, any augmenting-path maxflow algorithm with s* as the source uses at most 2E/V paths; so, if we use any method that takes at most E steps to find an augmenting path, we have a total of at most (V−1)(2E/V)E steps to find the edge connectivity and that implies the stated result.

This method, unlike all the other examples of this section, is not a direct reduction of one problem to another, but it does give a practical algorithm for computing edge connectivity. Again, with a careful maxflow implementation tuned to this specific problem, we can improve performance: it is possible to solve the problem in time proportional to VE (see reference section). The proof of Property 22.21 is an example of the more general concept of efficient (polynomial-time) reductions that we first encountered in Section 21.7 and that plays an essential role in the theory of algorithms discussed in Part 8. Such a reduction both proves the problem to be tractable and provides an algorithm for solving it, which are significant first steps in coping with a new combinatorial problem.

We conclude this section by considering a strictly mathematical formulation of the maxflow problem, using linear programming (LP) (see Section 21.6). This exercise is useful because it helps us to see relationships to other problems that can be so formulated.

The formulation is straightforward: We consider a system of inequalities that involve one variable corresponding to each edge, two inequalities corresponding to each edge, and one equation corresponding to each vertex. The value of the variable is the edge flow, the inequalities specify that the edge flow must be between 0 and the edge’s capacity, and the equations specify that the total flow on the edges that go into each vertex must be equal to the total flow on the edges that go out of that vertex.

Figure 22.39 LP formulation of a maxflow problem

This linear program is equivalent to the maxflow problem for the sample network of Figure 22.5. There is one inequality for each edge (which specifies that flow cannot exceed capacity) and one equality for each vertex (which specifies that flow in must equal flow out). We use a dummy edge from sink to source to capture the flow in the network, as described in the discussion following Property 22.2.
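In general form, the linear program described in this caption can be written as below. The notation is chosen here for illustration and is not taken from the figure: x_{uv} is the flow in edge u-v, c_{uv} is that edge's capacity, and the dummy edge t-s is assumed to have a capacity large enough that it never constrains the flow.

```latex
\begin{align*}
\text{maximize}\quad   & x_{ts} \\
\text{subject to}\quad & 0 \le x_{uv} \le c_{uv}
                         && \text{for each edge } u\text{-}v, \\
                       & \sum_{u} x_{uv} \;=\; \sum_{w} x_{vw}
                         && \text{for each vertex } v.
\end{align*}
```

The sums range over the edges into and out of v, including the dummy edge, so the flow-conservation equations hold at s and t as well; maximizing x_{ts} then maximizes the value of the st-flow.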

Figure 22.39 illustrates an example of this construction. Any maxflow problem can be cast as an LP problem in this way. LP is a versatile approach to solving combinatorial problems, and a great number of the problems that we study can be formulated as linear programs. The fact that maxflow problems are easier to solve than LP problems may be explained by the fact that the constraints in the LP formulation of maxflow problems have a specific structure not necessarily found in all LP problems.

Even though the LP problem is much more difficult in general than are specific problems such as the maxflow problem, there are powerful algorithms that can solve LP problems efficiently. The worst-case running time of these algorithms certainly exceeds the worst-case running time of the specific algorithms that we have been considering, but an immense amount of practical experience over the past several decades has shown them to be effective for solving problems of the type that arise in practice.

The construction illustrated in Figure 22.39 indicates a proof that the maxflow problem reduces to the LP problem, unless we insist that flow values be integers. When we examine LP in detail in Part 8, we describe a way to overcome the difficulty that the LP formulation does not carry the constraint that the results have integer values.

This context gives us a precise mathematical framework that we can use to address ever more general problems and to create ever more powerful algorithms to solve those problems. The maxflow problem is easy to solve and also is versatile in its own right, as indicated by the examples in this section. Next, we examine a harder problem (still easier than LP) that encompasses a still-wider class of practical problems. We discuss ramifications of building solutions to these increasingly general problem-solving models at the end of this chapter, setting the stage for a full treatment in Part 8.

Exercises

• 22.70 Define a class for finding a circulation with maximal flow in a specified edge. Provide an implementation that uses MAXFLOW.

22.71 Define a class for finding a maxflow in a network with no constraint on the number of sources or sinks. Provide an implementation that uses MAXFLOW.

22.72 Define a class for finding a maxflow in an undirected network. Provide an implementation that uses MAXFLOW.

22.73 Define a class for finding a maxflow in a network with capacity constraints on vertices. Provide an implementation that uses MAXFLOW.

• 22.74 Develop a class for feasible-flow problems that includes member functions allowing clients to set supply–demand values and to check that flow values are properly related at each vertex.

22.75 Do Exercise 22.18 for the case that each distribution point has limited capacity (that is, there is a limit on the amount of goods that can be stored there at any given time).

• 22.76 Show that the maxflow problem reduces to the feasible-flow problem (so that the two problems are therefore equivalent).

22.77 Find a feasible flow for the flow network shown in Figure 22.10, given the additional constraints that 0, 2, and 3 are supply vertices with weight 4, and that 1, 4, and 5 are supply vertices with weights 1, 3, and 5, respectively.

• 22.78 Write a program that takes as input a sports league’s schedule and current standings and determines whether a given team is eliminated. Assume that there are no ties. Hint: Reduce to a feasible-flow problem with one source node that has a supply value equal to the total number of games remaining to play in the season, sink nodes that correspond to each pair of teams having a demand value equal to the number of remaining games between that pair, and distribution nodes that correspond to each team. Edges should connect the supply node to each team’s distribution node (of capacity equal to the number of games that team would have to win to beat X if X were to win all its remaining games), and there should be an (uncapacitated) edge connecting each team’s distribution node to each of the demand nodes involving that team.

• 22.79 Prove that the maxflow problem for networks with lower bounds on edges reduces to the standard maxflow problem.

• 22.80 Prove that, for networks with lower bounds on edges, the problem of finding a minimal flow (that respects the bounds) reduces to the maxflow problem (see Exercise 22.79).

••• 22.81 Prove that the maxflow problem for st-networks reduces to the maxflow problem for undirected networks, or find a maxflow algorithm for undirected networks that has a worst-case running time substantially better than those of the algorithms in Sections 22.2 and 22.3.

• 22.82 Find all the matchings with five edges for the bipartite graph in Figure 22.37.

22.83 Extend Program 22.7 to use symbolic names instead of integers to refer to vertices (see Program 17.10).

• 22.84 Prove that the bipartite-matching problem is equivalent to the problem of finding maxflows in networks where all edges are of unit capacity.

22.85 We might interpret the example in Figure 22.3 as describing student preferences for jobs and employer preferences for students, the two of which may not be mutual. Does the reduction described in the text apply to the directed bipartite-matching problem that results from this interpretation, where edges in the bipartite graph are directed (in either direction) from one set to the other? Prove that it does or provide a counterexample.

• 22.86 Construct a family of bipartite-matching problems where the average length of the augmenting paths used by any augmenting-path algorithm to solve the corresponding maxflow problem is proportional to E.

22.87 Show, in the style of Figure 22.28, the operation of the FIFO preflow-push network-flow algorithm on the bipartite-matching network shown in Figure 22.38.

• 22.88 Extend Table 22.3 to include various preflow-push algorithms.

•• 22.89 Suppose that the two sets in a bipartite-matching problem are of size S and T, with S << T. Give as sharp a bound as you can for the worst-case running time to solve this problem, for the reduction of Property 22.19 and the maximal-augmenting-path implementation of the Ford–Fulkerson algorithm (see Property 22.8).

• 22.90 Answer Exercise 22.89 for the FIFO-queue implementation of the preflow-push algorithm (see Property 22.13).

22.91 Extend Table 22.3 to include implementations that use the all-augmenting-paths approach described in Exercise 22.37.

•• 22.92 Prove that the running time of the method described in Exercise 22.91 is O(√V E) for BFS.

22.93 Do empirical studies to plot the expected number of edges in a maximal matching in random bipartite graphs with V + V vertices and E edges, for a reasonable set of values for V and sufficient values of E to plot a smooth curve that goes from zero to V.

• 22.94 Prove Menger’s theorem (Property 22.20) for undirected graphs.

• 22.95 Prove that the minimum number of vertices whose removal disconnects two vertices in a digraph is equal to the maximum number of vertex-disjoint paths between the two vertices. Hint: Use a vertex-splitting transformation, similar to the one illustrated in Figure 22.32.

• 22.96 Extend your proof for Exercise 22.95 to apply to undirected graphs.

22.97 Implement an edge-connectivity class for the graph ADT of Chapter 17 whose constructor uses the algorithm described in this section to support a public member function that returns the graph’s connectivity.

22.98 Extend your solution to Exercise 22.97 to put in a user-supplied vector a minimal set of edges that separates the graph. How big a vector should the user expect?

• 22.99 Develop an algorithm for computing the edge connectivity of digraphs (the minimal number of edges whose removal leaves a digraph that is not strongly connected). Implement a class based on your algorithm for the digraph ADT of Chapter 19.

• 22.100 Develop algorithms based on your solutions to Exercises 22.95 and 22.96 for computing the vertex connectivity of digraphs and undirected graphs. Implement classes based on your algorithms for the digraph ADT of Chapter 19 and the graph ADT of Chapter 17, respectively (see Exercises 22.97 and 22.98).

22.101 Describe how to find the vertex connectivity of a digraph by solving V lg V unit-capacity maxflow problems. Hint: Use Menger’s theorem and binary search.

• 22.102 Run empirical studies based on your solution to Exercise 22.97 to determine edge connectivity of various graphs (see Exercises 17.63–76).

• 22.103 Give an LP formulation for the problem of finding a maxflow in the flow network shown in Figure 22.10.

• 22.104 Formulate as an LP problem the bipartite-matching problem in Figure 22.37.

22.5 Mincost Flows

It is not unusual for there to be numerous solutions to a given maxflow problem. This fact leads to the question of whether we might wish to impose some additional criteria for choosing one of them. For example, there are clearly many solutions to the unit-capacity flow problems shown in Figure 22.22; perhaps we would prefer the one that uses the fewest edges or the one with the shortest paths, or perhaps we would like to know whether there exists one comprising disjoint paths. Such problems are more difficult than the standard maxflow problem; they fall into a more general model known as the mincost flow problem, which is the subject of this section.

As with the maxflow problem, there are numerous equivalent ways to pose the mincost flow problem. We consider one standard formulation in detail in this section, then consider various reductions in Section 22.7.

Specifically, we use the mincost maxflow model: We define an edge type that includes integer costs, use the edge costs to define a flow cost in a natural way, then ask for a maximal flow of minimal cost. As we learn, not only do we have efficient and effective algorithms for this problem, but also the problem-solving model is of broad applicability.

Definition 22.8 The flow cost of an edge in a flow network with edge costs is the product of that edge’s flow and cost. The cost of a flow is the sum of the flow costs of that flow’s edges.

We continue to assume that capacities are positive integers less than M. We also assume edge costs to be nonnegative integers less than C. (Disallowing negative costs is primarily a matter of convenience, as discussed in Section 22.7.) As before, we assign names to these upper-bound values because the running times of some algorithms depend on the latter. With these basic assumptions, the problem that we wish to solve is trivial to define.

Mincost maxflow Given a flow network with edge costs, find a maxflow such that no other maxflow has lower cost.

Figure 22.40 illustrates different maxflows in a flow network with costs, including a mincost maxflow. Certainly, the computational burden of minimizing cost is no less challenging than the burden of maximizing flow with which we were concerned in Sections 22.2

Figure 22.40 Maxflows in flow networks with costs

These flows all have the same (maximal) value, but their costs (the sum of the products of edge flows and edge costs) differ. The maxflow in the center has minimal cost (no maxflow has lower cost).

and 22.3. Indeed, costs add an extra dimension that presents significant new challenges. Even so, we can shoulder this burden with a generic algorithm that is similar to the augmenting-path algorithm for the maxflow problem.

Numerous other problems reduce to or are equivalent to the mincost maxflow problem. For example, the following formulation is of interest because it encompasses the merchandise-distribution problem that we considered at the beginning of the chapter.

Mincost feasible flow Recall that we define a flow in a network with vertex weights (supply if positive, demand if negative) to be feasible if the sum of the vertex weights is zero and the difference between each vertex’s outflow and inflow is equal to that vertex’s weight. Given such a network, find a feasible flow of minimal cost.

To describe the network model for the mincost–feasible-flow problem, we use the term distribution network for brevity to mean “capacitated flow network with edge costs and supply or demand weights on vertices.”

In the merchandise-distribution application, supply vertices correspond to warehouses, demand vertices to retail outlets, edges to trucking routes, supply or demand values to the amount of material to be shipped or received, and edge capacities to the number and capacity of the trucks available for the various routes. A natural way to interpret each edge’s cost is as the cost of moving a unit of flow through that edge (the cost of sending a unit of material in a truck along the corresponding route). Given a flow, an edge’s flow cost is the part of the cost of moving the flow through the network that we can attribute to that edge. Given an amount of material that is to be shipped along a given edge, we can compute the cost of shipping it by multiplying the cost per unit by the amount. Doing this computation for each edge

Program 22.8 Computing flow cost

This function, which might be added to Program 22.1, returns the cost of a network’s flow. It sums cost times flow for all positive-capacity edges, allowing for uncapacitated edges to be used as dummy edges.

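static int cost(Graph &G)
  { int x = 0;
    for (int v = 0; v < G.V(); v++)
      {
        // Visit each edge incident on v, using the adjacency iterator
        // of the graph ADT (as in Program 22.1).
        typename Graph::adjIterator A(G, v);
        for (Edge* e = A.beg(); !A.end(); e = A.nx())
          // Count each edge once (from its tail) and skip dummy edges,
          // which carry a cost of C or more.
          if (e->from(v) && e->costRto(e->w()) < C)
            x += e->flow()*e->costRto(e->w());
      }
    return x;
  }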

and adding the results together gives us the total shipping cost, which we would like to minimize.

Property 22.22 The mincost–feasible-flow problem and the mincost maxflow problems are equivalent.

Proof: Immediate, by the same correspondence as Property 22.18 (see also Exercise 22.76).

Because of this equivalence and because the mincost–feasible-flow problem directly models merchandise-distribution problems and many other applications, we use the term mincost flow to refer to both problems in contexts where we could refer to either. We consider other reductions in Section 22.7.

To implement edge costs in flow networks, we add an integer pcost private data member to the EDGE class from Section 22.1 and a member function cost() to return its value to clients. Program 22.8 is a client function that computes the cost of a flow in a graph built with pointers to such edges. As when we work with maxflows, it is also prudent to implement a function to check that inflow and outflow values are properly related at each vertex and that the data structures are consistent (see Exercise 22.12).

The first step in developing algorithms to solve the mincost flow problem is to extend the definition of residual networks to include costs on the edges.

Definition 22.9 Given a flow in a flow network with edge costs, the residual network for the flow has the same vertices as the original and one or two edges in the residual network for each edge in the original, defined as follows: For each edge u-v in the original, let f be the flow, c the capacity, and x the cost. If f is positive, include an edge v-u in the residual with capacity f and cost -x; if f is less than c, include an edge u-v in the residual with capacity c-f and cost x.

This definition is nearly identical to Definition 22.4, but the difference is crucial. Edges in the residual network that represent backward edges have negative cost. To implement this convention, we use the following member function in the edge class:

int costRto(int v)
  { return from(v) ? -pcost : pcost; }
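To see the sign convention in isolation, the following is a minimal sketch of an edge type that carries a cost. The class name CostEdge and every member other than pcost and costRto are assumptions made for this illustration, loosely modeled on the EDGE class of Section 22.1; it is not the book's class.

```cpp
// Hypothetical stand-in for the EDGE class, reduced to what the
// residual-cost convention needs. Only pcost and costRto come from
// the text; the remaining names are assumptions for this sketch.
class CostEdge {
    int pv, pw;       // the edge runs from pv to pw
    int pcap, pflow;  // capacity and current flow
    int pcost;        // cost per unit of flow
public:
    CostEdge(int v, int w, int cap, int cost, int flow = 0)
        : pv(v), pw(w), pcap(cap), pflow(flow), pcost(cost) {}
    int v() const { return pv; }
    int w() const { return pw; }
    bool from(int x) const { return pv == x; }
    // Capacity of the residual edge that points to x.
    int capRto(int x) const { return from(x) ? pflow : pcap - pflow; }
    // Cost of the residual edge that points to x: going backward
    // (toward the edge's own tail) removes flow, so the cost is negated.
    int costRto(int x) const { return from(x) ? -pcost : pcost; }
};
```

For an edge 0-1 with capacity 5, cost 3, and flow 2, the residual edge 0-1 has capacity 3 and cost 3, and the residual edge 1-0 has capacity 2 and cost -3, as Definition 22.9 requires.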

Traversing backward edges corresponds to removing flow in the corresponding edge in the original network, so the cost has to be reduced accordingly. Because of the negative edge costs, these networks can have negative-cost cycles. The concept of negative cycles, which seemed artificial when we first considered it in the context of shortest-paths algorithms, plays a critical role in mincost flow algorithms, as we now see. We consider two algorithms, both based on the following optimality condition.

Property 22.23 A maxflow is a mincost maxflow if and only if its residual network contains no negative-cost (directed) cycle.

Proof: Suppose that we have a mincost maxflow whose residual network has a negative-cost cycle. Let x be the capacity of a minimal-capacity edge in the cycle. Augment the flow by adding x to edges in the flow corresponding to positive-cost edges in the residual network (forward edges) and subtracting x from edges corresponding to negative-cost edges in the residual network (backward edges). These changes do not affect the difference between inflow and outflow at any vertex, so the flow remains a maxflow, but they change the network’s cost by x times the cost of the cycle, which is negative, thereby contradicting the assertion that the cost of the original flow was minimal.

To prove the converse, suppose that we have a maxflow F with no negative-cost cycles whose cost is not minimal, and consider any mincost maxflow M. By an argument identical to the flow-decomposition theorem (Property 22.2), we can find at most E directed cycles such that adding those cycles to the flow F gives the flow M. But, since F has no negative cycles, this operation cannot lower the cost of F, a contradiction. In other words, we should be able to convert F to M by augmenting along cycles, but we cannot do so because we have no negative-cost cycles to use to lower the flow cost.

This property leads immediately to a simple generic algorithm for solving the mincost flow problem, called the cycle-canceling algorithm.

Find a maxflow. Augment the flow along any negative-cost cycle in the residual network, continuing until none remain.

This method brings together machinery that we have developed over this chapter and the previous one to provide effective algorithms for solving the wide class of problems that fit into the mincost flow model. Like several other generic methods that we have seen, it admits several different implementations, since the methods for finding the initial maxflow and for finding the negative-cost cycles are not specified. Figure 22.41 shows an example mincost maxflow computation that uses cycle canceling.

Since we have already developed algorithms for computing a maxflow and for finding negative cycles, we immediately have the implementation of the cycle-canceling algorithm given in Program 22.9. We use any maxflow implementation to find the initial maxflow and the Bellman–Ford algorithm to find negative cycles (see Exercise 22.108). To these two implementations, we need to add only a loop to augment flow along the cycles.

We can eliminate the initial maxflow computation in the cycle-canceling algorithm by adding a dummy edge from source to sink and assigning to it a cost that is higher than the cost of any source–sink path in the network (for example, VC) and a flow that is higher than the maxflow (for example, higher than the source’s outflow). With this initial setup, cycle canceling moves as much flow as possible out of the dummy edge, so the resulting flow is a maxflow. A mincost flow computation using this technique is illustrated in Figure 22.42. In the figure, we use an initial flow equal to the maxflow to make plain that the algorithm is simply computing another flow of the same value but

Figure 22.41 Residual networks (cycle canceling)

Each of the flows depicted here is a maxflow for the flow network depicted at the top, but only the one at the bottom is a mincost maxflow. To find it, we start with any maxflow and augment flow around negative cycles. The initial maxflow (second from top) has a cost of 22, which is not a mincost maxflow because the residual network (shown at right) has three negative cycles. In this example, we augment along 4-1-0-2-4 to get a maxflow of cost 21 (third from top), which still has one negative cycle. Augmenting along that cycle gives a mincost flow (bottom). Note that augmenting along 3-2-4-1-3 in the first step would have brought us to the mincost flow in one step.

Program 22.9 Cycle canceling

This class solves the mincost maxflow problem by canceling negative-cost cycles. It uses MAXFLOW to find a maxflow and a private member function negcyc (see Exercise 22.108) to find negative cycles. While a negative cycle exists, this code finds one, computes the maximum amount of flow to push through it, and does so. The augment function is the same as in Program 22.3, which was coded (with some foresight!) to work properly when the path is a cycle.

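template <class Graph, class Edge> class MINCOST
{ const Graph &G; int s, t;
  vector<int> wt;
  vector<Edge *> st;
  int ST(int v) const;
  void augment(int, int);
  int negcyc(int);
  int negcyc();
public:
  MINCOST(const Graph &G, int s, int t) : G(G),
    s(s), t(t), wt(G.V()), st(G.V())
  {
    // Start from any maxflow, then cancel negative cycles until
    // negcyc() reports that none remain.
    MAXFLOW<Graph, Edge>(G, s, t);
    for (int x = negcyc(); x != -1; x = negcyc())
      { augment(x, x); }
  }
};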

lower cost (in general, we do not know the flow value, so there is some flow left in the dummy edge at the end, which we ignore). As is evident from the figure, some augmenting cycles include the dummy edge and increase flow in the network; others do not include the dummy edge and reduce cost. Eventually, we reach a maxflow; at that point, all the augmenting cycles reduce cost without changing the value of the flow, as when we started with a maxflow.
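The dummy-edge variant of cycle canceling can be sketched in self-contained form as follows. This is an illustration under stated assumptions, not Program 22.9: it stores the residual network as paired arcs instead of using the graph ADT, it represents the dummy edge as a residual arc from sink to source with cost -(V*maxCost + 1), and the names Arc, CycleCancelSketch, addEdge, and solve are hypothetical. Negative cycles are found with a plain Bellman–Ford scan that starts every distance at 0.

```cpp
#include <utility>
#include <vector>

// Residual arcs are stored in pairs: arc i and arc i^1 are reverses of
// each other, so the flow on a forward arc is its partner's capacity.
struct Arc { int from, to, cap, cost; };

class CycleCancelSketch {
    int V;
    std::vector<Arc> arcs;

    // Bellman-Ford with all distances started at 0 (a virtual source).
    // Returns the arc indices of some negative-cost cycle in the
    // residual network, or an empty vector if there is none.
    std::vector<int> negativeCycle() const {
        std::vector<long long> dist(V, 0);
        std::vector<int> pred(V, -1);
        int x = -1;
        for (int pass = 0; pass < V; ++pass) {
            x = -1;
            for (int i = 0; i < (int)arcs.size(); ++i) {
                const Arc& a = arcs[i];
                if (a.cap > 0 && dist[a.from] + a.cost < dist[a.to]) {
                    dist[a.to] = dist[a.from] + a.cost;
                    pred[a.to] = i;
                    x = a.to;
                }
            }
            if (x == -1) break;               // a full pass with no change
        }
        if (x == -1) return {};
        for (int i = 0; i < V; ++i) x = arcs[pred[x]].from;  // step onto the cycle
        std::vector<int> cycle;
        int v = x;
        do { cycle.push_back(pred[v]); v = arcs[pred[v]].from; } while (v != x);
        return cycle;
    }
public:
    explicit CycleCancelSketch(int V) : V(V) {}

    void addEdge(int u, int v, int cap, int cost) {
        arcs.push_back({u, v, cap, cost});
        arcs.push_back({v, u, 0, -cost});
    }

    // Returns {flow value, flow cost} of a mincost maxflow from s to t.
    // maxCost must be at least as large as every edge cost.
    std::pair<int, long long> solve(int s, int t, int maxCost) {
        int bound = 0;                        // source outflow bounds the maxflow
        for (const Arc& a : arcs) if (a.from == s) bound += a.cap;
        int dummy = (int)arcs.size();
        addEdge(t, s, bound, -(V * maxCost + 1));
        for (std::vector<int> c = negativeCycle(); !c.empty(); c = negativeCycle()) {
            int d = arcs[c[0]].cap;           // bottleneck capacity on the cycle
            for (int i : c) if (arcs[i].cap < d) d = arcs[i].cap;
            for (int i : c) { arcs[i].cap -= d; arcs[i ^ 1].cap += d; }
        }
        long long cost = 0;                   // the dummy edge is ignored here
        for (int i = 0; i < dummy; i += 2)
            cost += (long long)arcs[i ^ 1].cap * arcs[i].cost;
        return {arcs[dummy ^ 1].cap, cost};
    }
};
```

A client builds the network with addEdge and calls solve(s, t, maxCost). Because every cycle through the dummy arc is negative as long as a source–sink path remains in the residual network, the loop first drives the flow up to a maxflow and then continues to lower its cost, in whatever order the cycle search happens to find cycles.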

Technically, using a dummy-flow initialization is neither more nor less generic than using a maxflow initialization for cycle canceling. The former does encompass all augmenting-path maxflow algorithms, but not all maxflows can be computed with an augmenting-path algorithm

Figure 22.42 Cycle canceling without initial maxflow

This sequence illustrates the computation of a mincost maxflow from an initially empty flow with the cycle-canceling algorithm by using a dummy edge from sink to source in the residual network with infinite capacity and infinite negative cost. The dummy edge makes any augmenting path from 0 to 5 a negative cycle (but we ignore it when augmenting and computing the cost of the flow). Augmenting along such a path increases the flow, as in augmenting-path algorithms (top three rows). When there are no cycles involving the dummy edge, there are no paths from source to sink in the residual network, so we have a maxflow (third from top). At that point, augmenting along a negative cycle decreases the cost without changing the flow value (bottom). In this example, we compute a maxflow, then decrease its cost; but that need not be the case. For example, the algorithm might have augmented along the negative cycle 1-4-5-3-1 instead of 0-1-4-5-0 in the second step. Since every augmentation either increases the flow or reduces the cost, we always wind up with a mincost maxflow.

(see Exercise 22.40). On the one hand, by using this technique, we may be giving up the benefits of a sophisticated maxflow algorithm; on the other hand, we may be better off reducing costs during the process of building a maxflow. In practice, dummy-flow initialization is widely used because it is so simple to implement.

As for maxflows, the existence of this generic algorithm guarantees that every mincost flow problem (with capacities and costs that are integers) has a solution where flows are all integers; and the algorithm computes such a solution (see Exercise 22.107). Given this fact, it is easy to establish an upper bound on the amount of time that any cycle-canceling algorithm will require.

Property 22.24 The number of augmenting cycles needed in the generic cycle-canceling algorithm is less than ECM.

Proof: In the worst case, each edge in the initial maxflow has capacity M, cost C, and is filled. Each cycle reduces this cost by at least 1.

Corollary The time required to solve the mincost flow problem in a sparse network is O(V³CM).

Proof: Immediate by multiplying the worst-case number of augmenting cycles by the worst-case cost of the Bellman–Ford algorithm for finding them (see Property 21.22).

Like that of augmenting-path methods, this running time is extremely pessimistic, as it assumes not only that we have a worst-case situation where we need to use a huge number of cycles to minimize cost, but also that we have another worst-case situation where we have to examine a huge number of edges to find each cycle. In many practical situations, we use relatively few cycles that are relatively easy to find, and the cycle-canceling algorithm is effective.

It is possible to develop a strategy that finds negative-cost cycles and ensures that the number of negative-cost cycles used is less than VE (see reference section). This result is significant because it establishes the fact that the mincost flow problem is tractable (as are all the problems that reduce to it). However, practitioners typically use implementations that admit a bad worst case (in theory) but use substantially fewer iterations on the problems that arise in practice than predicted by the worst-case bounds.

The mincost flow problem represents the most general problem-solving model that we have yet examined, so it is perhaps surprising that we can solve it with such a simple implementation. Because of the importance of the model, numerous other implementations of the cycle-canceling method and numerous other different methods have been developed and studied in detail. Program 22.9 is a remarkably simple and effective starting point, but it suffers from two defects that can potentially lead to poor performance. First, each time that we seek a negative cycle, we start from scratch. Can we save intermediate information during the search for one negative cycle that can help us find the next? Second, Program 22.9 just takes the first negative cycle that the Bellman–Ford algorithm finds. Can we direct the search towards negative cycles with particular properties? In Section 22.6, we consider an improved implementation, still generic, that represents a response to both of these questions.

Exercises

22.105 Expand your class for feasible flows from Exercise 22.74 to include costs. Use MINCOST to solve the mincost–feasible-flow problem.

• 22.106 Given a flow network whose edges are not all maximal capacity and cost, give an upper bound better than ECM on the cost of a maxflow.

22.107 Prove that, if all capacities and costs are integers, then the mincost flow problem has a solution where all flow values are integers.

22.108 Implement the negcyc() function for Program 22.9, using the Bellman–Ford algorithm (see Exercise 21.134).

• 22.109 Modify Program 22.9 to initialize with flow in a dummy edge instead of computing a flow.

• 22.110 Give all possible sequences of augmenting cycles that might have been depicted in Figure 22.41.

• 22.111 Give all possible sequences of augmenting cycles that might have been depicted in Figure 22.42.

22.112 Show, in the style of Figure 22.41, the flow and residual networks after each augmentation when you use the cycle-canceling implementation of Program 22.9 to find a mincost flow in the flow network shown in Figure 22.10, with cost 2 assigned to 0-2 and 0-3; cost 3 assigned to 2-5 and 3-5; cost 4 assigned to 1-4; and cost 1 assigned to all of the other edges. Assume that the maxflow is computed with the shortest-augmenting-path algorithm.

22.113 Answer Exercise 22.112, but assume that the program is modified to start with a maxflow in a dummy edge from source to sink, as in Figure 22.42.

22.114 Extend your solutions to Exercises 22.6 and 22.7 to handle costs in flow networks.

22.115 Extend your solutions to Exercises 22.9 through 22.11 to include costs in the networks. Take each edge’s cost to be roughly proportional to the Euclidean distance between the vertices that the edge connects.

22.6 Network Simplex Algorithm

The running time of the cycle-canceling algorithm is based on not just the number of negative-cost cycles that the algorithm uses to reduce the flow cost but also the time that the algorithm uses to find each of the cycles. In this section, we consider a basic approach that both dramatically decreases the cost of identifying negative cycles and admits methods for reducing the number of iterations. This implementation of the cycle-canceling algorithm is known as the network simplex algorithm. It is based on maintaining a tree data structure and reweighting costs such that negative cycles can be identified quickly.

To describe the network simplex algorithm, we begin by noting that, with respect to any flow, each network edge u-v is in one of three states (see Figure 22.43):

Figure 22.43 Edge classification

With respect to any flow, every edge is either empty, full, or partial (neither empty nor full). In this flow, edge 1-4 is empty; edges 0-2, 2-3, 2-4, 3-5, and 4-5 are full; and edges 0-1 and 1-3 are partial. Our conventions in figures give two ways to identify an edge’s state: In the flow column, 0 entries are empty edges; starred entries are full edges; and entries that are neither 0 nor starred are partial edges. In the residual network (bottom), empty edges appear only in the left column; full edges appear only in the right column; and partial edges appear in both columns.

• Empty, so flow can be pushed from only u to v
• Full, so flow can be pushed from only v to u
• Partial (neither empty nor full), so flow can be pushed either way
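The three states listed above can be computed directly from an edge's flow and capacity. The sketch below is a small illustration; the names EdgeState, classify, hasResidualForward, and hasResidualBackward are hypothetical and do not appear in the book's code.

```cpp
// The three states of an edge u-v with respect to a flow.
enum class EdgeState { Empty, Full, Partial };

// Classify an edge, assuming 0 <= flow <= cap and cap > 0.
inline EdgeState classify(int flow, int cap) {
    if (flow == 0) return EdgeState::Empty;
    if (flow == cap) return EdgeState::Full;
    return EdgeState::Partial;
}

// Residual edge u-v exists unless the original edge is full.
inline bool hasResidualForward(EdgeState s) { return s != EdgeState::Full; }

// Residual edge v-u exists unless the original edge is empty.
inline bool hasResidualBackward(EdgeState s) { return s != EdgeState::Empty; }
```

The two predicates restate which residual edges each state contributes: empty edges only u-v, full edges only v-u, and partial edges both.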

This classification is familiar from our use of residual networks throughout this chapter. If u-v is an empty edge, then u-v is in the residual network, but v-u is not; if u-v is a full edge, then v-u is in the residual network, but u-v is not; if u-v is a partial edge, then both u-v and v-u are in the residual network.

Definition 22.10 Given a maxflow with no cycle of partial edges, a feasible spanning tree for the maxflow is any spanning tree of the network that contains all the flow’s partial edges.

In this context, we ignore edge directions in the spanning tree. That is, any set of V − 1 directed edges that connects the network’s V vertices (ignoring edge directions) constitutes a spanning tree, and a spanning tree is feasible if all nontree edges are either full or empty.

The first step in the network simplex algorithm is to build a spanning tree. One way to build it is to compute a maxflow, to break

Figure 22.44 Maxflow spanning tree

Given any maxflow (top), we can construct a maxflow that has a spanning tree such that no nontree edges are partial by the two-step process illustrated in this example. First, we break cycles of partial edges; in this case, we break the cycle 0-2-4-1-0 by pushing 1 unit of flow along it. We can always fill or empty at least one edge in this way; in this case, we empty 1-4 and fill both 0-2 and 2-4 (center). Second, we add empty or full edges to the set of partial edges to make a spanning tree; in this case, we add 0-2, 1-4 and 3-5 (bottom).

cycles of partial edges by augmenting along each cycle to fill or empty one of its edges, then to add full or empty edges to the remaining partial edges to make a spanning tree. An example of this process is illustrated in Figure 22.44. Another option is to start with the maxflow in a dummy edge from source to sink. Then, this edge is the only possible partial edge, and we can build a spanning tree for the flow with any graph search. An example of such a spanning tree is illustrated in Figure 22.45.

Now, adding any nontree edge to a spanning tree creates a cycle. The basic mechanism behind the network simplex algorithm is a set of vertex weights that allows immediate identification of the edges that, when added to the tree, create negative-cost cycles in the residual network. We refer to these vertex weights as potentials and use the notation ϕ(v) to refer to the potential associated with v. Depending on context, we refer to potentials as a function defined on the vertices, or as a set of integer weights with the implicit assumption that one is assigned to each vertex, or as a vertex-indexed vector (since we always store them that way in implementations).

Definition 22.11 Given a flow in a flow network with edge costs, let c(u, v) denote the cost of u-v in the flow’s residual network. For any potential function ϕ, the reduced cost of an edge u-v in the residual network with respect to ϕ, which we denote by c*(u, v), is defined to be the value c(u, v) − (ϕ(u) − ϕ(v)).

In other words, the reduced cost of every edge is the difference between that edge’s actual cost and the difference of the potentials of the edge’s vertices. In the merchandise distribution application, we can see the intuition behind node potentials: If we interpret the potential ϕ(u) as the cost of buying a unit of material at node u, the full cost c(u, v) + ϕ(u) − ϕ(v) is the cost of buying at u, shipping to v and selling at v.

We maintain a vertex-indexed vector phi for the vertex potentials and compute the reduced cost of an edge v-w by subtracting from the edge’s cost the value (phi[v] - phi[w]). That is, there is no need to store the reduced edge costs anywhere, because it is so easy to compute them.

In the network simplex algorithm, we use feasible spanning trees to define vertex potentials such that reduced edge costs with respect to those potentials give direct information about negative-cost cycles.
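The computation just described, subtracting (phi[v] - phi[w]) from the edge's cost, can be written as a one-line helper. The function name reducedCost and its signature are assumptions made for this sketch; phi is the vertex-indexed vector of potentials from the text.

```cpp
#include <vector>

// Reduced cost of the residual edge v-w whose actual cost is `cost`,
// following Definition 22.11: cost - (phi[v] - phi[w]).
inline int reducedCost(int cost, const std::vector<int>& phi, int v, int w) {
    return cost - (phi[v] - phi[w]);
}
```

For example, with phi[0] = 0 and phi[1] = -3, an edge 0-1 of cost 3 has reduced cost 3 - (0 - (-3)) = 0, which is the value that tree edges are required to have.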

Figure 22.45 Spanning tree for dummy maxflow

If we start with flow on a dummy edge from source to sink, then that is the only possible partial edge, so we can use any spanning tree of the remaining nodes to build a spanning tree for the flow. In the example, the edges 0-5, 0-1, 0-2, 1-3, and 1-4 comprise a spanning tree for the initial maxflow. All of the nontree edges are empty.

Specifically, we maintain a feasible spanning tree throughout the execution of the algorithm, and we set the values of the vertex potentials such that all tree edges have reduced cost zero.

Property 22.25 We say that a set of vertex potentials is valid with respect to a spanning tree if all tree edges have zero reduced cost. All valid vertex potentials for any given spanning tree imply the same reduced costs for each network edge.

Proof: Given two different potential functions ϕ and ϕ′ that are both valid with respect to a given spanning tree, we show that they differ by an additive constant: that ϕ(u) = ϕ′(u) + Δ for all u and some constant Δ. Then, ϕ(u) − ϕ(v) = ϕ′(u) − ϕ′(v) for all u and v, implying that all reduced costs are the same for the two potential functions.

For any two vertices u and v that are connected by a tree edge, we must have ϕ(v) = ϕ(u) − c(u, v), by the following argument. If u-v is a tree edge, then ϕ(v) must be equal to ϕ(u) − c(u, v) to make the reduced cost c(u, v) − ϕ(u) + ϕ(v) equal to zero; if v-u is a tree edge, then ϕ(v) must be equal to ϕ(u) + c(v, u) = ϕ(u) − c(u, v) to make the reduced cost c(v, u) − ϕ(v) + ϕ(u) equal to zero. The same argument holds for ϕ′, so we must also have ϕ′(v) = ϕ′(u) − c(u, v).

Subtracting, we find that ϕ(v) − ϕ′(v) = ϕ(u) − ϕ′(u) for any u and v connected by a tree edge. Denoting this difference by Δ for any vertex and applying the equality along the edges of any search tree of the spanning tree immediately gives the desired result that ϕ(u) = ϕ′(u) + Δ for all u.

Another way to imagine the process of defining a set of valid vertex potentials is that we start by fixing one value, then compute the values for all vertices connected to that vertex by tree edges, then compute them for all vertices connected to those vertices, and so forth. No matter where we start the process, the potential difference between any two vertices is the same, determined by the structure of the tree. Figure 22.46 depicts an example. We consider the details of the task of computing potentials after we examine the relationship between reduced costs on nontree edges and negative-cost cycles.

Property 22.26 We say that a nontree edge is eligible if the cycle that it creates with tree edges is a negative-cost cycle in the residual network. An edge is eligible if and only if it is a full edge with positive reduced cost or an empty edge with negative reduced cost.

Figure 22.46 Vertex potentials

Vertex potentials are determined by the structure of the spanning tree and by an initial assignment of a potential value to any vertex. At left is a set of edges that comprise a spanning tree of the ten vertices 0 through 9. In the center is a representation of that tree with 5 at the root, vertices connected to 5 one level lower, and so forth. When we assign the root the potential value 0, there is a unique assignment of potentials to the other nodes that make the difference between the potentials of each edge’s vertices equal to its cost. At right is a different representation of the same tree with 0 at the root. The potentials that we get by assigning 0 the value 0 differ from those in the center by a constant offset. All our computations use the difference between two potentials: This difference is the same for any pair of potentials no matter what vertex we start with (and no matter what value we assign it), so our choice of starting vertex and value is immaterial.
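The process that the figure depicts, fixing the root's potential at 0 and working outward along tree edges, can be sketched for a tree stored with parent links. This is an illustration only: the function name potentials and the two input vectors are assumptions, with parent[root] == root marking the root and costFromParent[v] holding the residual cost c(parent[v], v) of traversing the tree edge toward v. Each step applies the relation ϕ(v) = ϕ(u) − c(u, v) for vertices u and v joined by a tree edge.

```cpp
#include <vector>

// Compute valid potentials for a spanning tree given by parent links.
// parent[v] is v's parent (parent[root] == root); costFromParent[v] is
// the residual cost of traversing the tree edge from parent[v] to v.
// Both vectors are hypothetical inputs for this sketch.
std::vector<int> potentials(const std::vector<int>& parent,
                            const std::vector<int>& costFromParent) {
    int V = (int)parent.size();
    std::vector<int> phi(V, 0);
    std::vector<bool> known(V, false);
    std::vector<int> path;
    for (int v = 0; v < V; ++v) {
        // Climb until reaching the root or a vertex already computed.
        int x = v;
        path.clear();
        while (!known[x] && parent[x] != x) { path.push_back(x); x = parent[x]; }
        if (!known[x]) { phi[x] = 0; known[x] = true; }  // x is the root
        // Walk back down, making each tree edge's reduced cost zero.
        for (int i = (int)path.size() - 1; i >= 0; --i) {
            int u = path[i];
            phi[u] = phi[parent[u]] - costFromParent[u];
            known[u] = true;
        }
    }
    return phi;
}
```

Each vertex is placed on a path at most once, so the total work is proportional to V. Starting from a different root or a different root value would shift every potential by the same constant, which leaves all reduced costs unchanged.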

Proof: Suppose that the edge u-v creates the cycle t1-t2-t3-…-td-t1 with tree edges t1-t2, t2-t3, …, where v is t1 and u is td. The reduced cost definitions of each edge imply the following:

    c(u, v) = c*(u, v) + ϕ(u) − ϕ(t1)
    c(t1, t2) = ϕ(t1) − ϕ(t2)
    c(t2, t3) = ϕ(t2) − ϕ(t3)
    ...
    c(td−1, u) = ϕ(td−1) − ϕ(u)

The left-hand side of the sum of these equations gives the total cost of the cycle, and the right-hand side collapses to c*(u, v). In other words, the edge’s reduced cost gives the cycle cost, so only the edges described can give a negative-cost cycle.

Property 22.27 If we have a flow and a feasible spanning tree with no eligible edges, the flow is a mincost flow.

Proof: If there are no eligible edges, then there are no negative-cost cycles in the residual network, so the optimality condition of Property 22.23 implies that the flow is a mincost flow.

An equivalent statement is that if we have a flow and a set of vertex potentials such that reduced costs of tree edges are all zero, reduced costs of full nontree edges are all nonpositive, and reduced costs of empty nontree edges are all nonnegative, then the flow is a mincost flow.

If we have eligible edges, we can choose one and augment along the cycle that it creates with tree edges to get a lower-cost flow. As we did with the cycle-canceling implementation in Section 22.5, we go through the cycle to find the maximum amount of flow that we can push, then go through the cycle again to push that amount, which fills or empties at least one edge. If that is the eligible edge that we used to create the cycle, it becomes ineligible (its reduced cost stays the same, but it switches from full to empty or empty to full). Otherwise, it becomes partial. By adding it to the tree and removing a full edge or an empty edge from the cycle, we maintain the invariant that no nontree edges are partial and that the tree is a feasible spanning tree. Again, we consider the mechanics of this computation later in this section.

In summary, feasible spanning trees give us vertex potentials, which give us reduced costs, which give us eligible edges, which give us negative-cost cycles. Augmenting along a negative-cost cycle reduces the flow cost and also implies changes in the tree structure. Changes in the tree structure imply changes in the vertex potentials; changes in the vertex potentials imply changes in reduced edge costs; and changes in reduced costs imply changes in the set of eligible edges. After making all these changes, we can pick another eligible edge and start the process again. This generic implementation of the cycle-canceling algorithm for solving the mincost flow problem is called the network simplex algorithm.

Build a feasible spanning tree and maintain vertex potentials such that all tree edges have zero reduced cost. Add an eligible edge to the tree, augment the flow along the cycle that it makes with tree edges, and remove from the tree an edge that is filled or emptied, continuing until no eligible edges remain.
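The eligibility test at the heart of each iteration follows directly from Property 22.26. The sketch below is one way to express it, under assumptions: the record FlowEdge and the functions edgeReducedCost, eligible, and firstEligible are hypothetical names, and taking the first eligible edge is only the simplest of the possible selection strategies, which the generic algorithm leaves open.

```cpp
#include <vector>

// Hypothetical record for an original network edge v-w.
struct FlowEdge { int v, w, cap, flow, cost; };

// Reduced cost of edge v-w with respect to the potentials phi.
inline int edgeReducedCost(const FlowEdge& e, const std::vector<int>& phi) {
    return e.cost - (phi[e.v] - phi[e.w]);
}

// Property 22.26: a nontree edge is eligible if it is full with positive
// reduced cost or empty with negative reduced cost.
inline bool eligible(const FlowEdge& e, const std::vector<int>& phi) {
    int rc = edgeReducedCost(e, phi);
    if (e.flow == e.cap) return rc > 0;  // full edge
    if (e.flow == 0) return rc < 0;      // empty edge
    return false;                        // partial edges are tree edges
}

// Index of the first eligible edge among the nontree edges, or -1 if
// none is eligible (in which case the flow is a mincost flow).
inline int firstEligible(const std::vector<FlowEdge>& nontree,
                         const std::vector<int>& phi) {
    for (int i = 0; i < (int)nontree.size(); ++i)
        if (eligible(nontree[i], phi)) return i;
    return -1;
}
```

A return value of -1 from firstEligible corresponds to the termination condition of the algorithm stated above.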

Thisimplementationisagenericonebecausetheinitialchoiceofspanningtree,themethodofmaintainingvertexpotentials,andthemethodofchoosingeligibleedgesarenotspecified.Thestrategyforchoosingeligibleedgesdeterminesthenumberofiterations,whichtradesoffagainstthedifferingcostsofimplementingvariousstrategiesandrecalculatingthevertexpotentials.Property22.28Ifthegenericnetworksimplexalgorithmterminates,itcomputesamincostflow.Proof:Ifthealgorithmterminates,itdoessobecausetherearenonegative-costcyclesintheresidualnetwork,sobyProperty22.23themaxflowisofmincost.Theconditionthatthealgorithmmightnotterminatederivesfromthepossibilitythataugmentingalongacyclemight filloremptymultipleedges, thus leavingedgesinthetreethroughwhichnoflowcanbepushed.Ifwecannotpushflow,wecannot reduce thecost, andwecouldget caught inan infinite loopaddingandremovingedgestomakeafixedsequenceofspanningtrees.Severalwaystoavoidthisproblemhavebeendevised;wediscussthemlaterinthissectionafterwelookinmoredetailatimplementations.The first choice thatwe face in developing an implementation of the networksimplexalgorithmiswhatrepresentationtousefor thespanningtree.Wehave

three primary computational tasks that involve the tree:

• Computing the vertex potentials
• Augmenting along the cycle (and identifying an empty or a full edge on it)
• Inserting a new edge and removing an edge on the cycle formed

Each of these tasks is an interesting exercise in data structure and algorithm design. There are several data structures and numerous algorithms that we might consider, with varied performance characteristics. We begin by considering perhaps the simplest available data structure, which we first encountered in Chapter 1 (!): the parent-link tree representation. After we examine algorithms and implementations that are based on the parent-link representation for the tasks just listed and describe their use in the context of the network simplex algorithm, we discuss alternative data structures and algorithms.

As we did in several other implementations in this chapter, beginning with the augmenting-path maxflow implementation, we keep links into the network representation, rather than simple indices in the tree representation, to allow us to have access to the flow values that need to be changed without losing access to the vertex name.

Program 22.10 is an implementation that assigns vertex potentials in time proportional to V. It is based on the following idea, also illustrated in Figure 22.47. We start with any vertex and recursively compute the potentials of its ancestors, following parent links up to the root, to which, by convention, we assign potential 0. Then, we

Program 22.10 Vertex potential calculation

The recursive function phiR follows parent links up the tree until finding one whose potential is valid (by convention the potential of the root is always valid), then computes potentials for vertices on the path on the way back down as the last actions of the recursive invocations. It marks each vertex whose potential it computes by setting its mark entry to the current value of valid.

int phiR(int v)
{
  if (mark[v] == valid) return phi[v];
  phi[v] = phiR(ST(v)) - st[v]->costRto(v);
  mark[v] = valid;
  return phi[v];
}

pick another vertex and use parent links to compute recursively the potentials of its ancestors. The recursion terminates when we reach an ancestor whose potential is known; then, on the way out of the recursion, we travel back down the path, computing each node’s potential from its parent. We continue this

process until we have computed all potential values. Once we have traveled along a path, we do not revisit any of its edges, so this process runs in time proportional to V.

Given two nodes in a tree, their least common ancestor (LCA) is the root of the smallest subtree that contains them both. The cycle that we form by adding an edge connecting two nodes consists of that edge plus the edges on the two paths from the two nodes to their LCA. The augmenting cycle formed by adding v-w goes through v-w to w, up the tree to the LCA of v and w (say r), then down the tree to v, so we have to consider edges in opposite orientation in the two paths.

As before, we augment along the cycle by traversing the paths once to find the maximum amount of flow that we can push through their edges, then traversing the two paths again, pushing flow through them. We do not need to consider edges in order around the cycle; it suffices to consider them all (in either direction). Accordingly, we can simply follow paths from each of the two nodes to their LCA. To augment along the cycle formed by the addition of v-w, we push flow from v to w; from v along the path to the LCA r; and from w along

Figure 22.47 Computing potentials through parent links

We start at 0, follow the path to the root, set pt[5] to 0, then work down the path, first setting pt[6] to make pt[6] - pt[5] equal to the cost of 6-5, then setting pt[3] to make pt[3] - pt[6] equal to the cost of 3-6, and so forth (left). Then we start at 1 and follow parent links until hitting a vertex whose potential is known (6 in this case) and work down the path to compute potentials on 9 and 1 (center). When we start at 2, we can compute its potential from its parent (right); when we start at 3, we see that its potential is already known, and so forth. In this example, when we try each vertex after 1, we either find that its potential is already done or we can compute the value from its parent. We never retrace an edge, no matter what the tree structure, so the total running time is linear.
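The computation traced in Figure 22.47 can be sketched in a self-contained form. This is only an illustration under stated assumptions, not the book's class: the names PotentialTree, parent, linkCost, and invalidateAll are hypothetical, the root is taken to be its own parent, and linkCost[v] is assumed to be the cost of a tree edge directed from v to its parent (a tree edge directed the other way would contribute its negated cost).

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Hypothetical stand-alone parent-link tree (not the book's class).
// parent[v] is the parent of v; the root is its own parent.
// linkCost[v] is the cost of the tree edge directed from v to parent[v].
struct PotentialTree {
    std::vector<int> parent, linkCost, phi, mark;
    int valid = 1;  // mark[v] == valid means phi[v] is up to date

    PotentialTree(std::vector<int> p, std::vector<int> c)
        : parent(std::move(p)), linkCost(std::move(c)),
          phi(parent.size(), 0), mark(parent.size(), 0) {
        assert(parent.size() == linkCost.size());
    }

    // Climb parent links until a vertex with a valid potential is found,
    // then fill in potentials on the way back out of the recursion so that
    // phi[v] - phi[parent[v]] equals the cost of the tree edge.
    int phiR(int v) {
        if (parent[v] == v) { phi[v] = 0; mark[v] = valid; }  // root: potential 0
        if (mark[v] == valid) return phi[v];
        phi[v] = phiR(parent[v]) + linkCost[v];
        mark[v] = valid;
        return phi[v];
    }

    // After the tree changes, one increment invalidates every potential.
    void invalidateAll() { ++valid; }
};
```

Calling phiR for one vertex fills in potentials only along that vertex's path to the root; vertices off the path stay unmarked until they are requested, which is the behavior that the lazy strategy of Program 22.15 relies on.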

the path to r, but in reverse direction for each edge. Program 22.11 is an

implementation of this idea, in the form of a function that augments a cycle and also returns an edge that is emptied or filled by the augmentation.

The implementation in Program 22.11 uses a simple technique to avoid paying the cost of initializing all the marks each time that we call it. We maintain the marks as global variables, initialized to zero. Each time that we seek an LCA, we increment a global counter and mark vertices by setting their corresponding entry in a vertex-indexed vector to that counter. After initialization, this technique allows us to perform the computation in time proportional to the length of the cycle. In typical problems, we might augment along a large number of small cycles, so the time saved can be substantial. As we shall see, the same technique is useful in saving time in other parts of the implementation.

Our third tree-manipulation task is to substitute an edge u-v for another edge in the cycle that it creates with tree edges. Program 22.12 is an implementation of a function that accomplishes this task for the parent-link representation. Again, the LCA of u and v is important, because the edge to be removed is either on the path from u to the LCA or on the path from v to the LCA. Removing an edge detaches all its descendants from the tree, but we can repair the damage by reversing the links between u-v and the removed edge, as illustrated in Figure 22.48.

These three implementations support the basic operations underlying the network simplex algorithm: We can choose an eligible edge by examining reduced costs and flows; we can use the parent-link representation of the spanning tree to augment along the negative cycle

Program 22.11 Augmenting along a cycle

To find the least common ancestor of two vertices, we mark nodes while moving up the tree from them in synchrony. The LCA is the root if it is the only marked node seen; otherwise the LCA is the first marked node seen that is not the root. To augment, we use a function similar to the one in Program 22.3 that preserves tree paths in st and returns an edge that was emptied or filled by the augmentation (see text).
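The LCA search described in this caption, together with the counter-based marking that avoids reinitializing the marks, might be sketched as follows. The struct LcaFinder and its members are hypothetical names for a stand-alone parent array; the sketch covers only the LCA step and leaves out the augmentation itself.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Hypothetical stand-alone LCA finder for a parent-link tree.
// parent[root] == root; mark and stamp implement the "increment a counter
// instead of clearing the marks" technique described in the text.
struct LcaFinder {
    std::vector<int> parent, mark;
    int root;
    int stamp = 0;

    LcaFinder(std::vector<int> p, int r)
        : parent(std::move(p)), mark(parent.size(), 0), root(r) {
        assert(parent[root] == root);
    }

    // Move up from v and w in synchrony, marking vertices with the current
    // stamp. The first marked non-root vertex that either walk reaches is the
    // LCA; if both walks reach the root without that happening, it is the root.
    int lca(int v, int w) {
        mark[v] = ++stamp;
        mark[w] = stamp;
        while (v != w) {
            if (v != root) v = parent[v];
            if (v != root && mark[v] == stamp) return v;
            mark[v] = stamp;
            if (w != root) w = parent[w];
            if (w != root && mark[w] == stamp) return w;
            mark[w] = stamp;
        }
        return v;
    }
};
```

Because each call uses a fresh stamp, marks left over from earlier calls never match, so the cost of a call is proportional to the lengths of the two paths rather than to V.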

Figure 22.48 Spanning tree substitution

This example illustrates the basic tree-manipulation operation in the network simplex algorithm for the parent-link representation. At left is a sample tree with links all pointing upwards, as indicated by the parent-link structure ST. (In our code, the function ST computes the parent of the given vertex from the edge pointer in the vertex-indexed vector st.) Adding the edge 1-2 creates a cycle with the paths from 1 and 2 to their LCA, 11. If we then delete one of those edges, say 0-3, the structure remains a tree. To update the parent-link array to reflect the change, we switch the directions of all the links from 2 up to 3 (center). The tree at right is the same tree with node positions changed so that links all point up, as indicated by the parent-link array that represents the tree (bottom right).

formed with tree edges and the chosen eligible edge; and we can update the tree and recalculate potentials. These operations are illustrated for an example flow network in Figures 22.49 and 22.50.

Figure 22.49 illustrates initialization of the data structures using a dummy edge with the maxflow on it, as in Figure 22.42. Shown there are an initial feasible spanning tree with its parent-link representation, the corresponding vertex potentials, the reduced costs for the nontree edges, and the initial set of eligible edges. Also, rather than computing the maxflow value in the implementation, we use the outflow from the source, which is guaranteed to be no less than the maxflow value; we use the maxflow value here to make the operation of the algorithm easier to trace.

Figure 22.50 illustrates the changes in the data structures for each of a sequence of eligible edges and augmentations around negative-cost cycles. The sequence does not reflect any particular method for choosing eligible edges; it represents the choices that make the augmenting paths the same as depicted in Figure 22.42. These figures show all vertex potentials and all reduced costs after each cycle augmentation, even though many of these numbers are implicitly defined and are not necessarily computed explicitly by typical implementations. The purpose of these two figures is to illustrate the overall progress of the algorithm and the state of the data structures as the algorithm moves

Program 22.12 Spanning tree substitution

The function update adds an edge to the spanning tree and removes an edge on the cycle thus created. The edge to be removed is on the path from one of the two vertices on the edge added to their LCA. This implementation uses the function onpath to find the edge removed and the function reverse to reverse the edges on the path between it and the edge added.
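The link-reversal step at the heart of this substitution can be sketched on a bare parent array. The function substitute and its parameters are hypothetical: u-v is the edge added, and x names the removed tree edge as the link from x to its parent. The sketch assumes that x lies on the tree path from u toward the root and that v lies outside the subtree rooted at x; it does not reproduce the book's update, onpath, or reverse functions.

```cpp
#include <cassert>
#include <vector>

// Hypothetical sketch of spanning-tree substitution on a parent array.
// Adds the edge u-v and removes the tree edge from x to its old parent,
// where x is on the path from u toward the root. The links on the path
// u .. x are reversed so that every vertex still has a path to the root.
void substitute(std::vector<int>& parent, int u, int v, int x) {
    int prev = v;  // new parent for the vertex being relinked
    int cur = u;
    while (true) {
        int next = parent[cur];  // old parent, saved before overwriting
        parent[cur] = prev;
        if (cur == x) break;     // the link from x to its old parent is now gone
        assert(next != cur);     // reached the root: x was not on the path
        prev = cur;
        cur = next;
    }
}
```

For example, with parent links 1→0, 2→1, 3→2, 4→0 and root 0, adding 3-4 and removing 1-0 leaves the links 3→4, 2→3, 1→2, 4→0, which is again a tree rooted at 0.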

from one feasible spanning tree to another simply by adding an eligible edge and removing a tree edge on the cycle that is formed.

One critical fact that is illustrated in the example in Figure 22.50 is that the algorithm might not even terminate, because full or empty edges on the spanning tree can stop us from pushing flow along the negative cycle that we identify. That is, we can identify an eligible edge and the negative cycle that it makes with spanning tree edges, but the maximum amount of flow that we can push along the cycle may be 0. In this case, we still substitute the eligible edge for an edge on the

Figure 22.49 Network simplex initialization

To initialize the data structures for the network simplex algorithm, we start with zero flow on all edges (left), then add a dummy edge 0-5 from source to sink with flow no less than the maxflow value (for clarity, we use a value equal to the maxflow value here). The cost value 9 for the dummy edge is greater than the cost of any cycle in the network; in the implementation, we use the value CV. The dummy edge is not shown in the flow network, but it is included in the residual network (center).

We initialize the spanning tree with the sink at the root, the source as its only child, and a search tree of the graph induced by the remaining nodes in the residual network. The implementation uses the parent-edge representation of the tree in the array st and the related parent-vertex function ST; our figures depict this representation and two others: the rooted representation shown on the right and the set of shaded edges in the residual network.

The vertex potentials are in the pt array and are computed from the tree structure so as to make the difference of a tree edge’s vertex potentials equal to its cost.

The column labeled costR in the center gives the reduced costs for nontree edges, which are computed for each edge by adding the difference of its vertex

potentials to its cost. Reduced costs for tree edges are zero and left blank. Empty edges with negative reduced cost and full edges with positive reduced cost (eligible edges) are marked with asterisks.

cycle, but we make no progress in reducing the cost of the flow. To ensure that the algorithm terminates we need to prove that we cannot end up in an endless sequence of zero-flow augmentations.

If there is more than one full or empty edge on the augmenting cycle, the substitution algorithm in Program 22.12 always deletes from the tree the one closest to the LCA of the eligible edge’s two vertices. Fortunately, it has been shown that this particular strategy for choosing the edge to remove from the cycle suffices to ensure that the algorithm terminates (see reference section).

The final important choice that we face in developing an implementation of the network simplex algorithm is a strategy for identifying eligible edges and choosing one to add to the tree. Should we maintain a data structure containing eligible edges? If so, how sophisticated a data structure is appropriate? The answer to these questions depends somewhat on the application and the dynamics of solving particular instances of the problem. If the total number of eligible edges is small, then it is worthwhile to maintain a separate data structure; if most edges are eligible most of the time, it is not. Maintaining a separate data structure could spare us the expense of searching for eligible edges, but also could require costly update computations. What criteria should we use to pick from among the eligible edges? Again, there are many that we might adopt. We consider examples in our implementations, then discuss alternatives. For example, Program 22.13 illustrates a function that finds an eligible edge of minimal reduced cost: No other edge gives a cycle for which augmenting along the cycle would lead to a greater reduction in the total cost.

Program 22.14 is a full implementation of the network simplex algorithm that uses the strategy of choosing the eligible edge giving a negative cycle whose cost is highest in absolute value. The implementation

Figure 22.50 Residual networks and spanning trees (network simplex)

Each row in this figure corresponds to an iteration of the network simplex algorithm following the initialization depicted in Figure 22.49: On each iteration, it chooses an eligible edge, augments along a cycle, and updates the data structures as follows: First, the flow is augmented, including implied changes in the residual network. Second, the tree structure ST is changed by adding an eligible edge and deleting an edge on the cycle that it makes with tree edges. Third, the table of potentials phi is updated to reflect the changes in the tree structure. Fourth, the reduced costs of the nontree edges (column marked costR in the center) are updated to reflect the potential changes, and these values are used to identify empty edges with negative reduced cost and full edges with positive reduced costs as eligible edges (marked with asterisks on reduced costs). Implementations need not actually make all these computations (they only need to compute potential changes and reduced costs sufficient to identify an eligible edge), but we include all the numbers here to provide a full illustration of the algorithm.

The final augmentation for this example is degenerate. It does not increase the flow, but it leaves no eligible edges, which guarantees that the flow is a mincost maxflow.

Program 22.13 Eligible edge search

This function finds the eligible edge of lowest reduced cost. It is a straightforward implementation that traverses all the edges in the network.
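A full scan of this kind can be sketched independently of the book's graph classes. The type FlowEdge and the functions reducedCost and bestEligible are hypothetical; the sketch assumes positive capacities and the convention that the reduced cost of u-v is its cost minus the difference phi[u] - phi[v]. It treats an edge as eligible if it is empty with negative reduced cost or full with positive reduced cost, and returns the index of the eligible edge whose reduced cost is largest in absolute value (or -1 if none is eligible).

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical flat edge record (not the book's EDGE class).
struct FlowEdge { int from, to, cap, flow, cost; };

// Reduced cost of u-v: cost minus the difference of the vertex potentials.
int reducedCost(const FlowEdge& e, const std::vector<int>& phi) {
    return e.cost - (phi[e.from] - phi[e.to]);
}

// Scan every edge; return the index of the eligible edge with the largest
// reduced cost in absolute value, or -1 if no edge is eligible.
int bestEligible(const std::vector<FlowEdge>& edges, const std::vector<int>& phi) {
    int best = -1, bestGain = 0;
    for (std::size_t i = 0; i < edges.size(); ++i) {
        const FlowEdge& e = edges[i];
        assert(e.cap > 0 && e.flow >= 0 && e.flow <= e.cap);
        const int rc = reducedCost(e, phi);
        const bool eligible = (e.flow == 0 && rc < 0) ||      // empty, negative
                              (e.flow == e.cap && rc > 0);    // full, positive
        const int gain = rc < 0 ? -rc : rc;
        if (eligible && gain > bestGain) {
            best = static_cast<int>(i);
            bestGain = gain;
        }
    }
    return best;
}
```

The scan costs time proportional to E on every call, which is the per-iteration cost that taking the first eligible edge instead is meant to avoid.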

uses the tree-manipulation functions and the eligible-edge search of Programs 22.10 through 22.13, but the comments that we made regarding our first cycle-canceling implementation (Program 22.9) apply: It is remarkable that such a simple piece of code is sufficiently powerful to provide useful solutions in the context of a general problem-solving model with the reach of the mincost flow problem.

The worst-case performance bound for Program 22.14 is at least a factor of V lower than that for the cycle-canceling implementation in Program 22.9, because

the time per iteration is just E (to find the eligible edge) rather than VE (to find a negative cycle). Although we might suspect that using the maximum augmentation will result in fewer augmentations than just taking the first negative cycle provided by the Bellman–Ford algorithm, that suspicion has not been proved valid. Specific bounds on the number of augmenting cycles used are difficult to develop, and, as usual, these bounds are far higher than the numbers that we see in practice. As mentioned earlier, there are

Program 22.14 Network simplex (basic implementation)

This class uses the network simplex algorithm to solve the mincost flow problem. It uses a standard DFS function dfsR to build an initial tree (see Exercise 22.117), then it enters a loop where it uses the functions in Programs 22.10 through 22.13 to compute all the vertex potentials, examine all the edges to find the one that creates the lowest-cost negative cycle, and augment along that cycle.

theoretical results demonstrating that certain strategies can guarantee that the number of augmenting cycles is bounded by a polynomial in the number of edges, but practical implementations typically admit an exponential worst case.

In light of these considerations, there are many options to consider in pursuit of improved performance. For example, Program 22.15 is another implementation of the network simplex algorithm. The straightforward implementation in Program 22.14 always takes time proportional to V to revalidate the tree potentials and always takes time proportional to E to find the eligible edge with the largest reduced cost. The implementation in Program 22.15 is designed to eliminate both of these costs in typical networks.

First, even if choosing the maximum edge leads to the fewest number of iterations, expending the effort of examining every edge to find the maximum

edge may not be worthwhile. We could do numerous augmentations along short cycles in the time that it takes to scan all the edges. Accordingly, it is worthwhile to consider the strategy of using any eligible edge, rather than taking the time to find a particular one. In the worst case, we might have to examine all the edges or a substantial fraction of them to find an eligible edge, but we typically expect to need to examine relatively few edges to find an eligible one. One approach is to start from the beginning each time; another is to pick a random starting point (see Exercise 22.126). This use of randomness also makes an artificially long sequence of augmenting paths unlikely.

Second, we adopt a lazy approach to computing potentials. Rather than compute all the potentials in the vertex-indexed vector phi, then refer to them when we need them, we call the function phiR to get each potential value; it travels up the tree to find a valid potential, then computes the necessary potentials on that path. To implement this approach, we simply change the function that defines the cost to use the function call phiR(u), instead of the array access phi[u]. In the worst case, we calculate all the potentials in the same way as before; but if we examine only a few eligible edges, then we calculate only those potentials that we need to identify them.

Such changes do not affect the worst-case performance of the algorithm, but they certainly speed it up in practical applications. Several other ideas for improving the performance of the network

Program 22.15 Network simplex (improved implementation)

Replacing references to phi by calls to phiR in the function R and replacing the for loop in the constructor in Program 22.14 by this code gives a network simplex implementation that saves time on each iteration by calculating potentials only when needed, and by taking the first eligible edge that it finds.

int old = 0;
for (valid = 1; valid != old; )
  {
    old = valid;


{



          if (e->capRto(e->other(v)) > 0)
            if (e->capRto(v) == 0)
              { update(augment(e), e); valid++; }
      }
  }

simplex algorithm are explored in the exercises (see Exercises 22.126 through 22.130); those represent only a small sample of the ones that have been proposed.

As we have emphasized throughout this book, the task of analyzing and comparing graph algorithms is complex. With the network simplex algorithm, the task is further complicated by the variety of different implementation approaches and the broad array of types of applications that we might encounter (see Section 22.5). Which implementation is best? Should we compare implementations based on the worst-case performance bounds that we can prove? How accurately can we quantify performance differences of various implementations, for specific applications? Should we use multiple implementations, each tailored to specific applications?

Readers are encouraged to gain computational experience with various network simplex implementations and to address some of these questions by running empirical studies of the kind that we have emphasized throughout this book. When seeking to solve mincost flow problems, we are faced with the familiar fundamental challenges, but the experience that we have gained in tackling increasingly difficult problems throughout this book provides ample background to develop efficient implementations that can effectively solve a broad variety of important practical problems. Some such studies are described in the exercises at the end of this and the next section, but these exercises should be viewed as a starting point. Each reader can craft a new empirical study that sheds light on some particular implementation/application pair of interest.

The potential to improve performance dramatically for critical applications through proper deployment of classic data structures and algorithms (or development of new ones) for basic tasks makes the study of network-simplex implementations a fruitful research area, and there is a large literature on network simplex implementations. In the past, progress in this research has been crucial, because it helps reduce the huge cost of solving network simplex problems. People tend to rely on carefully crafted libraries to attack these problems, and that is still appropriate in many situations. However, it is difficult for such libraries to keep up with recent research and to adapt to the variety of problems that arise in new applications. With the speed and size of modern computers, accessible implementations like Programs 22.12 and 22.13 can be starting points for the development of effective problem-solving tools for numerous applications.

Exercises

• 22.116 Give a maxflow with associated feasible spanning tree for the flow network shown in Figure 22.10.

• 22.117 Implement the dfsR function for Program 22.14.

• 22.118 Implement a function that removes cycles of partial edges from a given network’s flow and builds a feasible spanning tree for the resulting flow, as illustrated in Figure 22.44. Package your function so that it could be used to build the initial tree in Program 22.14 or Program 22.15.

22.119 In the example in Figure 22.46, show the effect of reversing the direction of the edge connecting 6 and 5 on the potential tables.

• 22.120 Construct a flow network and exhibit a sequence of augmenting edges such that the generic network simplex algorithm does not terminate.

• 22.121 Show, in the style of Figure 22.47, the process of computing potentials for the tree rooted at 0 in Figure 22.46.

22.122 Show, in the style of Figure 22.50, the process of computing a mincost maxflow in the flow network shown in Figure 22.10, starting with the basic maxflow and associated basic spanning tree that you found in Exercise 22.116.

• 22.123 Suppose that all nontree edges are empty. Write a function that computes the flows in the tree edges, putting the flow in the edge connecting v and its parent in the tree in the vth entry of a vector flow.

• 22.124 Do Exercise 22.123 for the case where some nontree edges may be full.

22.125 Use Program 22.12 as the basis for an MST algorithm. Run empirical tests comparing your implementation with the three basic MST algorithms described in Chapter 20 (see Exercise 20.66).

22.126 Describe how to modify Program 22.15 such that it starts the scan for an eligible edge at a random edge rather than at the beginning each time.

22.127 Modify your solution to Exercise 22.126 such that each time it searches for an eligible edge it starts where it left off in the previous search.

22.128 Modify the private member functions in this section to maintain a triply linked tree structure that includes each node’s parent, leftmost child, and right sibling (see Section 5.4). Your functions to augment along a cycle and substitute an eligible edge for a tree edge should take time proportional to the length of the augmenting cycle, and your function to compute potentials should take time proportional to the size of the smaller of the two subtrees created when the tree edge is deleted.

• 22.129 Modify the private member functions in this section to maintain, in addition to the basic parent-edge tree vector, two other vertex-indexed vectors: one containing each vertex’s distance to the root, the other containing each vertex’s successor in a DFS. Your functions to augment along a cycle and substitute an eligible edge for a tree edge should take time proportional to the length of the augmenting cycle, and your function to compute potentials should take time proportional to the size of the smaller of the two subtrees created when the tree edge is deleted.

• 22.130 Explore the idea of maintaining a generalized queue of eligible edges. Consider various generalized-queue implementations and various improvements to avoid excessive edge-cost calculations, such as restricting attention to a subset of the eligible edges so as to limit the size of the queue or possibly allowing some ineligible edges to remain on the queue.

• 22.131 Run empirical studies to determine the number of iterations, the number of vertex potentials calculated, and the ratio of the running time to E for several versions of the network simplex algorithm, for various networks (see Exercises 22.7–12). Consider various algorithms described in the text and in the previous exercises, and concentrate on those that perform the best on huge sparse networks.

• 22.132 Write a client program that does dynamic graphical animations of network simplex algorithms. Your program should produce images like those in Figure 22.50 and the other figures in this section (see Exercise 22.48). Test your implementation for the Euclidean networks among Exercises 22.7–12.

Figure 22.51 Reduction from shortest paths

Finding a single-source–shortest-paths tree in the network at the top is equivalent to solving the mincost maxflow problem in the flow network at the bottom.

22.7 Mincost Flow Reductions

Mincost flow is a general problem-solving model that can encompass a variety of useful practical problems. In this section, we substantiate this claim by proving specific reductions from a variety of problems to mincost flow.

The mincost flow problem is obviously more general than the maxflow problem, since any mincost maxflow is an acceptable solution to the maxflow problem. Specifically, if we assign to the dummy edge a cost 1 and to all other edges a cost 0 in the construction of Figure 22.42, any mincost maxflow minimizes the flow in the dummy edge and therefore maximizes the flow in the original network. Therefore, all the problems discussed in Section 22.4 that reduce to the maxflow problem also reduce to the mincost flow problem. This set of problems includes bipartite matching, feasible flow, and mincut, among numerous others.

More interesting, we can examine properties of our algorithms for the mincost flow problem to develop new generic algorithms for the maxflow problem. We have already noted that the generic cycle-canceling algorithm for the mincost maxflow problem gives a generic augmenting-path algorithm for the maxflow problem. In particular, this approach leads to an implementation that finds augmenting paths without having to search the network (see Exercises 22.133 and 22.134). On the other hand, the algorithm might produce zero-flow augmenting paths, so its performance characteristics are difficult to evaluate (see reference section).

The mincost flow problem is also more general than the shortest-paths problem, by the following simple reduction.

Property 22.29 The single-source–shortest-paths problem (in networks with no negative cycles) reduces to the mincost–feasible-flow problem.

Proof: Given a single-source–shortest-paths problem (a network and a source vertex s), build a flow network with the same vertices, edges, and edge costs; and give each edge unlimited capacity. Add a new source vertex with an edge to s that has cost zero and capacity V-1 and a new sink vertex with edges from each of the other vertices with costs zero and capacities 1. This construction is illustrated in Figure 22.51.

Solve the mincost–feasible-flow problem on this network. If necessary, remove

cycles in the solution to produce a spanning-tree solution. This spanning tree corresponds directly to a shortest-paths spanning tree of the original network. Detailed proof of this fact is left as an exercise (see Exercise 22.138).

Thus, all the problems discussed in Section 21.6 that reduce to the single-source–shortest-paths problem also reduce to the mincost flow problem. This set of problems includes job scheduling with deadlines and difference constraints, among numerous others.

As we found when studying maxflow problems, it is worthwhile to consider the details of the operation of the network simplex algorithm when solving a shortest-paths problem using the reduction of Property 22.29. In this case, the algorithm maintains a spanning tree rooted at the source, much like the search-based algorithms that we considered in Chapter 21, but the node potentials and reduced costs provide increased flexibility in developing methods to choose the next edge to add to the tree.

We do not generally exploit the fact that the mincost flow problem is a proper generalization of both the maxflow and the shortest-paths problems, because we have specialized algorithms with better performance guarantees for both problems. If such implementations are not available, however, a good implementation of the network simplex algorithm is likely to produce quick solutions to particular instances of both problems. Of course, we must avoid reduction loops when using or building network-processing systems that take advantage of such reductions. For example, the cycle-canceling implementation in Program 22.9 uses both maxflow and shortest paths to solve the mincost flow problem (see Exercise 21.96).

Next, we consider equivalent network models. First, we show that assuming that costs are nonnegative is not restrictive, as we can transform networks with negative costs into networks without them.

Property 22.30 In mincost flow problems, we can assume, without loss of generality, that edge costs are nonnegative.

Proof: We prove this fact for feasible mincost flows in distribution networks. The same result is true for mincost maxflows because of the equivalence of the two problems proved in Property 22.22 (see Exercises 22.143 and 22.144).

Given a distribution network, we replace any edge u-v that has cost x < 0 and capacity c by an edge v-u of the same capacity that has cost −x (a positive number). Furthermore, we can decrement the supply-demand value of u by c and increment the supply-demand value of v by c. This operation corresponds to

pushing c units of flow from u to v and adjusting the network accordingly.

For negative-cost edges, if a solution to the mincost flow problem for the transformed network puts flow f in the edge v-u, we put flow c − f in u-v in the original network; for positive-cost edges, the transformed network has the same flow as in the original. This flow assignment preserves the supply or demand constraint at all the vertices.

The flow in v-u in the transformed network contributes −fx to the cost and the flow in u-v in the original network contributes cx − fx to the cost. The first term in this expression does not depend on the flow, so the cost of any flow in the transformed network is equal to the cost of the corresponding flow in the original network minus the sum of the products of the capacities and costs of all the negative-cost edges (that sum is a negative quantity). Any minimal-cost flow in the transformed network therefore corresponds to a minimal-cost flow in the original network.

This reduction shows that we can restrict attention to positive costs, but we generally do not bother to do so in practice because our implementations in Sections 22.5 and 22.6 work exclusively with residual networks and handle negative costs with no difficulty. 
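The edge-reversal transformation used in the proof of Property 22.30 can be sketched as follows. The types DistEdge and DistNetwork and the functions removeNegativeCosts, toOriginalFlow, and flowCost are hypothetical names introduced for illustration. Costs are kept as signed integers, so for a reversed edge with original cost x < 0 and capacity c, the cost of a flow in the transformed network exceeds the cost of the corresponding original flow by the flow-independent amount −cx.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical minimal distribution network (not the book's classes).
struct DistEdge { int from, to, cap, cost; };
struct DistNetwork {
    std::vector<int> supply;       // positive: supply, negative: demand
    std::vector<DistEdge> edges;
};

// Replace each negative-cost edge u-v by v-u with the negated cost, and move
// cap units of supply from u to v, as if the edge had been filled in advance.
DistNetwork removeNegativeCosts(DistNetwork net) {
    for (DistEdge& e : net.edges) {
        assert(e.cap >= 0);
        if (e.cost < 0) {
            net.supply[e.from] -= e.cap;
            net.supply[e.to] += e.cap;
            std::swap(e.from, e.to);
            e.cost = -e.cost;
        }
    }
    return net;
}

// Map a flow on the transformed network back to the original network:
// flow f on a reversed edge becomes cap - f on the original edge.
std::vector<int> toOriginalFlow(const DistNetwork& original,
                                const std::vector<int>& flow) {
    std::vector<int> result(flow);
    for (std::size_t i = 0; i < original.edges.size(); ++i)
        if (original.edges[i].cost < 0)
            result[i] = original.edges[i].cap - flow[i];
    return result;
}

// Total cost of a flow given as one value per edge.
long flowCost(const DistNetwork& net, const std::vector<int>& flow) {
    long total = 0;
    for (std::size_t i = 0; i < net.edges.size(); ++i)
        total += static_cast<long>(flow[i]) * net.edges[i].cost;
    return total;
}
```

Because the two costs differ by a constant, a flow that minimizes one minimizes the other, which is all that the proof requires.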
It is important to have some lower bound on costs in some contexts, but that bound does not need to be zero (see Exercise 22.145).

Next, we show, as we did for the maxflow problem, that we could, if we wanted, restrict attention to acyclic networks. Moreover, we can also assume that edges are uncapacitated (there is no upper bound on the amount of flow in the edges). Combining these two variations leads to the following classic formulation of the mincost flow problem.

Transportation Solve the mincost flow problem for a bipartite distribution network where all edges are directed from a supply vertex to a demand vertex and have unlimited capacity.

As discussed at the beginning of this chapter (see Figure 22.2), the usual way to think of this problem is as modeling the distribution of goods from warehouses (supply vertices) to retail outlets (demand vertices) along distribution channels (edges) at a certain cost per unit amount of goods.

Property 22.31 The transportation problem is equivalent to the mincost flow problem.

Proof: Given a transportation problem, we can solve it by assigning a capacity for each edge higher than the supply or demand values of the vertices that it

connects and solving the resulting mincost–feasible-flow problem on the resulting distribution network. Therefore, we need only to establish a reduction from the standard problem to the transportation problem.

For variety, we describe a new transformation, which is linear only for sparse networks. A construction similar to the one that we used in the proof of Property 22.16 establishes the result for nonsparse networks (see Exercise 22.148).

Given a standard distribution network with V vertices and E edges, build a transportation network with V supply vertices, E demand vertices and 2E edges, as follows. For each vertex in the original network, include a vertex in the bipartite network with supply or demand value set to the original value plus the sum of the capacities of the incoming edges (the value that the flow correspondence below requires at that vertex); and for each edge u-v in the original network with capacity c, include a vertex in the bipartite network with supply or demand value -c (we use the notation [u-v] to refer to this vertex). For each edge u-v in the original network, include two edges in the bipartite network: one from u to [u-v] with the same cost, and one from v to [u-v] with cost 0.

The following one-to-one correspondence preserves costs between flows in the two networks: An edge u-v has flow value f in the original network if and only if edge u-[u-v] has flow value f, and edge v-[u-v] has flow value c-f in the bipartite network (those two flows must sum to c because of the supply–demand constraint at vertex [u-v]). Thus, any mincost flow in one network corresponds to a mincost flow in the other.

Since we have not considered direct algorithms for solving the transportation problem, this reduction is of academic interest only. To use it, we would have to convert the resulting problem back to a (different) mincost flow problem, using the simple reduction mentioned at the beginning of the proof of Property 22.31. Perhaps such networks admit more efficient solutions in practice; perhaps they do not. 
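The construction in this proof can be sketched and checked on a small instance. All names here (CapEdge, TransportNetwork, toTransportation, liftFlow, isFeasible) are hypothetical. One assumption deserves emphasis: the sketch sets the value of each original vertex v to its original value plus the capacities of the edges entering v, because under the stated correspondence v sends c − f along v-[u-v] for each incoming edge u-v and f along each of its own outgoing edges, so its net outflow is its original value plus the total incoming capacity.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical types for the sketch (not the book's classes).
struct CapEdge { int from, to, cap, cost; };    // capacitated edge u-v
struct TransportEdge { int from, to, cost; };   // uncapacitated edge
struct TransportNetwork {
    std::vector<int> value;                     // supply (> 0) or demand (< 0)
    std::vector<TransportEdge> edges;
};

// Build the bipartite network: one supply vertex per original vertex and
// one demand vertex [u-v] (numbered V + i) per original edge i.
TransportNetwork toTransportation(const std::vector<int>& supply,
                                  const std::vector<CapEdge>& edges) {
    const int V = static_cast<int>(supply.size());
    TransportNetwork t;
    t.value = supply;
    for (const CapEdge& e : edges) {
        assert(e.cap >= 0);
        t.value[e.to] += e.cap;                 // assumption: incoming capacities
    }
    for (std::size_t i = 0; i < edges.size(); ++i) {
        const CapEdge& e = edges[i];
        const int mid = V + static_cast<int>(i);
        t.value.push_back(-e.cap);              // [u-v] demands c units
        t.edges.push_back({e.from, mid, e.cost});
        t.edges.push_back({e.to, mid, 0});
    }
    return t;
}

// Flow f on u-v becomes f on u-[u-v] and cap - f on v-[u-v].
std::vector<int> liftFlow(const std::vector<CapEdge>& edges,
                          const std::vector<int>& flow) {
    std::vector<int> lifted;
    for (std::size_t i = 0; i < edges.size(); ++i) {
        lifted.push_back(flow[i]);
        lifted.push_back(edges[i].cap - flow[i]);
    }
    return lifted;
}

// A flow is feasible if it is nonnegative and each vertex's outflow minus
// inflow equals its supply-demand value.
bool isFeasible(const TransportNetwork& t, const std::vector<int>& flow) {
    std::vector<int> net(t.value.size(), 0);
    for (std::size_t i = 0; i < t.edges.size(); ++i) {
        if (flow[i] < 0) return false;
        net[t.edges[i].from] += flow[i];
        net[t.edges[i].to] -= flow[i];
    }
    return net == t.value;
}
```

Since the edges from v to [u-v] have cost 0, the lifted flow has the same cost as the original one, which is the cost-preserving correspondence that the proof uses.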
The point of studying the equivalence between the transportation problem and the mincost flow problem is to understand that removing capacities and restricting attention to bipartite networks would seem to simplify the mincost flow problem substantially; however, that is not the case.

We need to consider another classical problem in this context. It generalizes the bipartite-matching problem that is discussed in detail in Section 22.4. Like that problem, it is deceptively simple.

Assignment Given a weighted bipartite graph, find a set of edges of minimum total weight such that each vertex is connected to exactly one other vertex.

For example, we might generalize our job-placement problem to include a way

foreachcompanytoquantifyitsdesireforeachapplicant(saybyassigninganintegertoeachapplicant,withlowerintegersgoingtothebetterapplicants)andfor each applicant to quantify his or her desire for each company. Then, asolutiontotheassignmentproblemwouldprovideareasonablewaytotaketheserelativepreferencesintoaccount.Property22.32Theassignmentproblemreducestothemincostflowproblem.Proof:Thisresultcanbeestablishedviaasimplereductiontothetransportationproblem.Givenanassignmentproblem,constructatransportationproblemwiththe samevertices and edges,with all vertices in one of the sets designated assupply vertices with value 1 and all vertices in the other set designated asdemandverticeswithvalue1.Assigncapacity1toeachedge,andassignacostcorrespondingtothatedge’sweightintheassignmentproblem.Anysolutiontothis instanceof the transportationproblemissimplyasetofedgesofminimaltotal cost that each connect a supply vertex to a demand vertex and thereforecorrespondsdirectlytoasolutiontotheoriginalassignmentproblem.Reducing this instance of the transportation problem to the mincostmaxflowproblem gives a construction that is essentially equivalent to the constructionthatweusedtoreducethebipartite-matchingproblemtothemaxflowproblem(seeExercise22.158).Thisrelationshipisnotknowntobeanequivalence.Thereisnoknownwaytoreduceageneralmincostflowproblem to theassignmentproblem. 
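As a rough illustration of Property 22.32, the following sketch builds the unit-supply, unit-demand network described in the proof and solves it with a small successive-shortest-paths mincost-maxflow routine. The routine is a stand-in chosen for brevity, not the network simplex algorithm of Section 22.6. The names MinCostFlow and assignmentCost, the dummy source and sink, and the assumption that the bipartite graph is complete and given as a weight matrix are all hypothetical choices made for this example.

```cpp
#include <algorithm>
#include <cassert>
#include <climits>
#include <utility>
#include <vector>

// Minimal mincost-maxflow solver: repeatedly augments along a cheapest
// residual path found by Bellman-Ford (a stand-in, not network simplex).
struct MinCostFlow {
    struct Edge { int to, cap, cost; };
    std::vector<Edge> edges;             // edge i and its reverse i^1 are stored as a pair
    std::vector<std::vector<int>> adj;   // adj[u]: indices of the edges leaving u
    explicit MinCostFlow(int n) : adj(n) {}

    void addEdge(int u, int v, int cap, int cost) {
        adj[u].push_back((int)edges.size()); edges.push_back({v, cap, cost});
        adj[v].push_back((int)edges.size()); edges.push_back({u, 0, -cost});
    }

    // Returns {flow value, total cost} of a mincost maxflow from s to t.
    std::pair<int, int> solve(int s, int t) {
        int flow = 0, cost = 0, n = (int)adj.size();
        for (;;) {
            std::vector<int> dist(n, INT_MAX), via(n, -1);
            dist[s] = 0;
            for (int pass = 0; pass < n; ++pass)          // Bellman-Ford passes
                for (int u = 0; u < n; ++u) {
                    if (dist[u] == INT_MAX) continue;
                    for (int id : adj[u]) {
                        const Edge& e = edges[id];
                        if (e.cap > 0 && dist[u] + e.cost < dist[e.to]) {
                            dist[e.to] = dist[u] + e.cost;
                            via[e.to] = id;
                        }
                    }
                }
            if (dist[t] == INT_MAX) break;                // no augmenting path remains
            int push = INT_MAX;
            for (int v = t; v != s; v = edges[via[v] ^ 1].to)
                push = std::min(push, edges[via[v]].cap);
            for (int v = t; v != s; v = edges[via[v] ^ 1].to) {
                edges[via[v]].cap -= push;
                edges[via[v] ^ 1].cap += push;
            }
            flow += push;
            cost += push * dist[t];
        }
        return std::make_pair(flow, cost);
    }
};

// weight[i][j]: weight of the edge between left vertex i and right vertex j.
// Returns the minimum total weight of an assignment, or -1 if none exists.
int assignmentCost(const std::vector<std::vector<int>>& weight) {
    int n = (int)weight.size();
    int s = 2 * n, t = 2 * n + 1;                 // dummy source and sink
    MinCostFlow net(2 * n + 2);
    for (int i = 0; i < n; ++i) {
        net.addEdge(s, i, 1, 0);                  // supply of 1 at each left vertex
        net.addEdge(n + i, t, 1, 0);              // demand of 1 at each right vertex
        for (int j = 0; j < n; ++j)
            net.addEdge(i, n + j, 1, weight[i][j]);   // capacity 1, cost = weight
    }
    std::pair<int, int> r = net.solve(s, t);
    return r.first == n ? r.second : -1;
}
```

Calling assignmentCost with the rows {4, 1, 3}, {2, 0, 5}, {3, 2, 2} pairs left vertex 0 with right vertex 1, left 1 with right 0, and left 2 with right 2, for a total weight of 5. Representing the graph as a matrix keeps the sketch short; a sparse instance would add only the edges that are present.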
Indeed, like the single-source–shortest-paths problem and the maxflow problem, the assignment problem seems to be easier than the mincost-flow problem in the sense that algorithms that solve it are known that have better asymptotic performance than the best known algorithms for the mincost-flow problem. Still, the network simplex algorithm is sufficiently well refined that a good implementation of it is a reasonable choice for solving assignment problems. Moreover, as with maxflow and shortest paths, we can tailor the network simplex algorithm to get improved performance for the assignment problem (see reference section).

Our next reduction to the mincost-flow problem brings us back to a basic problem related to paths in graphs like the ones that we first considered in Section 17.7. As in the Euler-path problem, we want a path that includes all the edges in a graph. Recognizing that not all graphs have such a path, we relax the restriction that edges must appear only once.

Mail carrier Given a network (weighted digraph), find a cyclic path of minimal weight that includes each edge at least once (see Figure 22.52). Recall that our basic definitions in Chapter 17 make the distinction between cyclic paths (which may revisit vertices and edges) and cycles (which consist of distinct vertices, except the first and the final, which are the same).

The solution to this problem would describe the best route for a mail carrier (who has to cover all the streets on her route) to follow. A solution to this problem might also describe the route that a snowplow should take during a snowstorm, and there are many similar applications.

The mail carrier's problem is an extension of the Euler-tour problem that we discussed in Section 17.7: The solution to Exercise 17.92 is a simple test for whether a digraph has an Euler tour, and Program 17.14 is an effective way to find an Euler tour for such digraphs. That tour solves the mail carrier's problem because it includes each edge exactly once; no path could have lower weight. The problem becomes more difficult when indegrees and outdegrees are not necessarily

Figure 22.52 Mail carrier's problem

Finding the shortest path that includes each edge at least once is a challenge even for this small network, but the problem can be solved efficiently through reduction to the mincost-flow problem.

equal. In the general case, some edges must be traversed more than once: The problem is to minimize the total weight of all the multiply traversed edges.

Property 22.33 The mail carrier's problem reduces to the mincost-flow problem.

Proof: Given an instance of the mail carrier's problem (a weighted digraph), define a distribution network on the same vertices and edges, with all the vertex supply or demand values set to 0, edge costs set to the weights of the corresponding edges, and no upper bounds on edge capacities, but with a lower bound of 1 on the flow in every edge. We interpret a flow value f in an edge u-v as saying that the mail carrier needs to traverse u-v a total of f times.

Find a mincost flow for this network by using the transformation of Exercise 22.146 to remove the lower bound on edge capacities. The flow-decomposition theorem says that we can express the flow as a set of cycles, so we can build a cyclic path from this flow in the same way that we built an Euler tour in an Eulerian graph: We traverse any cycle, taking a detour to traverse another cycle whenever we encounter a node that is on another cycle.

A careful look at the mail carrier's problem illustrates yet again the fine line between trivial and intractable in graph algorithms. Suppose that we consider the two-way version of the problem where the network is undirected, and the mail carrier must travel in both directions along each edge. Then, as we noted in Section 18.5, depth-first search (or any graph search) will provide an immediate solution. If, however, it suffices to traverse each undirected edge in either direction, then a solution is significantly more difficult to formulate than is the simple reduction to mincost flow that we just examined, but the problem is still tractable. If some edges are directed and others undirected, the problem becomes NP-hard (see reference section).

These are only a few of the scores of practical problems that have been formulated as mincost-flow problems. The mincost-flow problem is even more versatile than the maxflow problem or shortest-paths problems, and the network simplex algorithm effectively solves all problems encompassed by the model.

As we did when we studied maxflow, we can examine how any mincost-flow problem can be cast as an LP problem, as illustrated in

Figure 22.53 LP formulation of a mincost-maxflow problem

This linear program is equivalent to the mincost-maxflow problem for the sample network of Figure 22.40. The edge inequalities and vertex equalities are the same as in Figure 22.39, but the objective is different. The variable c represents the total cost, which is a linear combination of the other variables. In this case, $c = -9x_{50} + 3x_{01} + x_{02} + x_{13} + x_{14} + 4x_{23} + 2x_{24} + 2x_{35} + x_{45}$.

Figure 22.53. The formulation is a straightforward extension of the maxflow formulation: We add equations that set a dummy variable to be equal to the cost of the flow, then set the objective so as to minimize that variable. LP models allow addition of arbitrary (linear) constraints. Some constraints may lead to problems that still may be equivalent to mincost-flow problems, but others do not. That is, many problems do not reduce to mincost-flow problems: In particular, LP encompasses a much broader set of problems. The mincost-flow problem is a next step toward that general problem-solving model, which we consider in Part 8.

There are other models that are even more general than the LP model; but LP has the additional virtue that, while LP problems are in general more difficult than mincost-flow problems, effective and efficient algorithms have been invented to solve them. Indeed, perhaps the most important such algorithm is known as the simplex method: The network simplex method is a specialized version of the simplex method that applies to the subset of LP problems that correspond to mincost-flow problems, and understanding the network simplex algorithm is a helpful first step in understanding the simplex algorithm.
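The general shape of the LP formulation described above can be written out as follows. This is a sketch under assumed notation that does not appear in the text: $x_{uv}$ is the flow in edge u-v, $\mathrm{cap}(u\text{-}v)$ and $\mathrm{cost}(u\text{-}v)$ are that edge's capacity and cost, and the dummy edge from the sink back to the source (the variable $x_{50}$ in Figure 22.53) is treated like any other edge.

```latex
\begin{aligned}
\text{minimize}\quad   & c \\
\text{subject to}\quad & c = \sum_{u\text{-}v} \mathrm{cost}(u\text{-}v)\, x_{uv}, \\
                       & 0 \le x_{uv} \le \mathrm{cap}(u\text{-}v) \quad \text{for each edge } u\text{-}v, \\
                       & \sum_{u} x_{uv} = \sum_{w} x_{vw} \quad \text{for each vertex } v.
\end{aligned}
```

The first constraint is the equation that sets the dummy variable c equal to the cost of the flow; the remaining constraints are the edge inequalities and vertex equalities carried over from the maxflow formulation.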

Exercises

• 22.133 Show that, when the network simplex algorithm is computing a maxflow, the spanning tree is the union of t-s, a tree containing s, and a tree containing t.

22.134 Develop a maxflow implementation based on Exercise 22.133. Choose an eligible edge at random.

22.135 Show, in the style of Figure 22.50, the process of computing a maxflow in the flow network shown in Figure 22.10 using the reduction described in the text and the network simplex implementation of Program 22.14.

22.136 Show, in the style of Figure 22.50, the process of finding shortest paths from 0 in the flow network shown in Figure 22.10 using the reduction described in the text and the network simplex implementation of Program 22.14.

• 22.137 Prove that all edges in the spanning tree described in the proof of Property 22.29 are on paths directed from the source to leaves.

• 22.138 Prove that the spanning tree described in the proof of Property 22.29 corresponds to a shortest-paths tree in the original network.

22.139 Suppose that you use the network simplex algorithm to solve a problem created by a reduction from the single-source–shortest-paths problem as described in the proof of Property 22.29. (i) Prove that the algorithm never uses a zero-cost augmenting path. (ii) Show that the edge that leaves the cycle is always the parent of the destination vertex of the edge that is added to the cycle. (iii) As a consequence of Exercise 22.138, the network simplex algorithm does not need to maintain edge flows. Provide a full implementation that takes advantage of this observation. Choose the new tree edge at random.

22.140 Suppose that we assign a positive cost to each edge in a network. Prove that the problem of finding a single-source–shortest-paths tree of minimal cost reduces to the mincost-maxflow problem.

22.141 Suppose that we modify the job-scheduling-with-deadlines problem in Section 21.6 to stipulate that jobs can miss their deadlines, but that they incur a fixed positive cost if they do. Show that this modified problem reduces to the mincost-maxflow problem.

22.142 Implement a class that finds mincost maxflows in distribution networks with negative costs. Use your solution to Exercise 22.105 (which assumes that costs are all nonnegative).

• 22.143 Suppose that the costs of 0-2 and 1-3 in Figure 22.40 are -1, instead of 1. Show how to find a mincost maxflow by transforming the network to a network with positive costs and finding a mincost maxflow of the new network.

22.144 Implement a class that finds mincost maxflows in networks with negative costs. Use MINCOST (which assumes that costs are all nonnegative).

• 22.145 Do the implementations in Sections 22.5 and 22.6 depend in a fundamental way on costs being nonnegative? If they do, explain how; if they do not, explain what fixes (if any) are required to make them work properly for networks with negative costs, or explain why no such fixes are possible.

22.146 Extend your feasible-flow ADT from Exercise 22.74 to include lower bounds on the capacities of edges. Implement a class that computes a mincost maxflow that respects these bounds (if one exists).

22.147 Give the result of using the reduction in the text to reduce the flow network described in Exercise 22.112 to the transportation problem.

• 22.148 Show that the mincost-maxflow problem reduces to the transportation problem with just V extra vertices and edges by using a construction similar to the one used in the proof of Property 22.16.

• 22.149 Implement a class for the transportation problem that is based on the simple reduction to the mincost-flow problem given in the proof of Property 22.30.

• 22.150 Develop a class implementation for the mincost-flow problem that is based on the reduction to the transportation problem described in the proof of Property 22.31.

• 22.151 Develop a class implementation for the mincost-flow problem that is based on the reduction to the transportation problem described in Exercise 22.148.

22.152 Write a program to generate random instances of the transportation problem, then use them as the basis for empirical tests on various algorithms and implementations to solve that problem.

22.153 Find a large instance of the transportation problem online.

22.154 Run empirical studies to compare the two different methods of reducing arbitrary mincost-flow problems to the transportation problem that are discussed in the proof of Property 22.31.

22.155 Write a program to generate random instances of the assignment problem, then use them as the basis for empirical tests on various algorithms and implementations to solve that problem.

22.156 Find a large instance of the assignment problem online.

22.157 The job-placement problem described in the text favors the employers (their total weights are maximized). Formulate a version of the problem such that applicants also express their wishes. Explain how to solve your version.

22.158 Do empirical studies to compare the performance of the two network simplex implementations in Section 22.6 for solving random instances of the assignment problem (see Exercise 22.155) with V vertices and E edges, for a reasonable set of values for V and E.

22.159 The mail carrier's problem clearly has no solution for networks that are not strongly connected (the mail carrier can visit only those vertices that are in the strong component where she starts), but that fact is not mentioned in the reduction of Property 22.33. What happens when we use the reduction on a network that is not strongly connected?

22.160 Run empirical studies for various weighted graphs (see Exercises 21.4–8) to determine average length of the mail carrier's path.

22.161 Give a direct proof that the single-source–shortest-paths problem reduces to the assignment problem.

22.162 Describe how to formulate an arbitrary assignment problem as an LP problem.

• 22.163 Do Exercise 22.18 for the case where the cost value associated with each edge is -1 (so you minimize unused space in the trucks).

• 22.164 Devise a cost model for Exercise 22.18 such that the solution is a maxflow that takes a minimal number of days.

22.8 Perspective

Our study of graph algorithms appropriately culminates in the study of network-flow algorithms for four reasons. First, the network-flow model validates the practical utility of the graph abstraction in countless applications. Second, the maxflow and mincost-flow algorithms that we have examined are natural extensions of graph algorithms that we studied for simpler problems. Third, the implementations exemplify the important role of fundamental algorithms and data structures in achieving good performance. Fourth, the maxflow and mincost-flow models illustrate the utility of the approach of developing increasingly general problem-solving models and using them to solve broad classes of problems. Our ability to develop efficient algorithms that solve these problems leaves the door open for us to develop more general models and to seek algorithms that solve those problems.

Before considering these issues in further detail, we develop further context by listing important problems that we have not covered in this chapter, even though they are closely related to familiar problems.

Maximum matching In a graph with edge weights, find a subset of edges in which no vertex appears more than once and whose total weight is such that no other such set of edges has a higher total weight. We can reduce the maximum-cardinality matching problem in unweighted graphs immediately to this problem, by setting all edge weights to 1.

The assignment problem and maximum-cardinality bipartite-matching problems reduce to maximum matching for general graphs. On the other hand, maximum matching does not reduce to mincost flow, so the algorithms that we have considered do not apply. The problem is tractable, although the computational burden of solving it for huge graphs remains significant. Treating the many techniques that have been tried for matching on general graphs would fill an entire volume: The problem is one of those studied most extensively in graph theory. We have drawn the line in this book at mincost flow, but we revisit maximum matching in Part 8.

Multicommodity flow Suppose that we need to compute a second flow such that the sum of an edge's two flows is limited by that edge's capacity, both flows are in equilibrium, and the total cost is minimized. This change models the presence of two different types of material in the merchandise-distribution problem; for example, should we put more hamburger or more potatoes in the truck bound for the fast-food restaurant? This change also makes the problem much more difficult and requires more advanced algorithms than those considered here; for example, no analogue to the maxflow–mincut theorem is known to hold for the general case. Formulating the problem as an LP problem is a straightforward extension of the example shown in Figure 22.53, so the problem is tractable (because LP is tractable).

Convex and nonlinear costs The simple cost functions that we have been considering are linear combinations of variables, and our algorithms for solving them depend in an essential way on the simple mathematical structure underlying these functions. Many applications call for more complicated functions. For example, when we minimize distances, we are led to sums of squares of costs. Such problems cannot be formulated as LP problems, so they require problem-solving models that are even more powerful. Many such problems are not tractable.

Scheduling We have presented a few scheduling problems as examples. They are barely representative of the hundreds of different scheduling problems that have been posed. The research literature is replete with the study of relationships among these problems and the development of algorithms and implementations to solve the problems (see reference section). Indeed, we might have chosen to use scheduling rather than network-flow algorithms to develop the idea for defining general problem-solving models and implementing reductions to solve particular problems (the same might be said of matching). Many scheduling problems reduce to the mincost-flow model.

The scope of combinatorial computing is vast indeed, and the study of problems of this sort is certain to occupy researchers for many years to come. We revisit many of these problems in Part 8, in the context of coping with intractability.

We have presented only a fraction of the studied algorithms that solve maxflow and mincost-flow problems. As indicated in the exercises throughout this chapter, combining the many options available for different parts of various generic algorithms leads to a large number of different algorithms. Algorithms and data structures for basic computational tasks play a significant role in the efficacy of many of these approaches; indeed, some of the important general-purpose algorithms that we have studied were developed in the quest for efficient implementations of network-flow algorithms. This topic is still being studied by many researchers. The development of better algorithms for network-flow problems certainly depends on intelligent use of basic algorithms and data structures.

The broad reach of network-flow algorithms and our extensive use of reductions to extend this reach make this section an appropriate place to consider some implications of the concept of reduction.
For a large class of combinatorial algorithms, these problems represent a watershed in our studies of algorithms, where we stand between the study of efficient algorithms for particular problems and the study of general problem-solving models. There are important forces pulling in both directions.

We are drawn to develop as general a model as possible, because the more general the model, the more problems it encompasses, thereby increasing the usefulness of an efficient algorithm that can solve any problem that reduces to the model. Developing such an algorithm may be a significant, if not impossible, challenge. Even if we do not have an algorithm that is guaranteed to be reasonably efficient, we typically have good algorithms that perform well for specific classes of problems that are of interest. Specific analytic results are often elusive, but we often have persuasive empirical evidence. Indeed, practitioners typically will try the most general model available (or one that has a well-developed solution package) and will look no further if the model works in reasonable time. However, we certainly should strive to avoid using overly general models that lead us to spend excessive amounts of time solving problems for which more specialized models can be effective.

We are also drawn to seek better algorithms for important specific problems, particularly for huge problems or huge numbers of instances of smaller problems where computational resources are a critical bottleneck. As we have seen for numerous examples throughout this book and in Parts 1 through 4, we often can find a clever algorithm that can reduce resource costs by factors of hundreds or thousands or more, which is extremely significant if we are measuring costs in hours or dollars. The general outlook described in Chapter 2, which we have used successfully in so many domains, remains extremely valuable in such situations, and we can look forward to the development of clever algorithms throughout the spectrum of graph algorithms and combinatorial algorithms. Perhaps the most important drawback to depending too heavily on a specialized algorithm is that often a small change to the model will invalidate the algorithm. When we use an overly general model and an algorithm that gets our problem solved, we are less vulnerable to this defect.

Software libraries that encompass many of the algorithms that we have addressed may be found in many programming environments. Such libraries certainly are important resources to consider for specific problems. However, libraries may be difficult to use, obsolete, or poorly matched to the problem at hand. Experienced programmers know the importance of considering the trade-off between taking advantage of a library resource and becoming overly dependent on that resource (if not subject to premature obsolescence). Some of the implementations that we have considered are efficient, simple to develop, and broad in scope. Adapting and tuning such implementations to address problems at hand can be the proper approach in many situations.

The tension between theoretical studies that are restricted to what we can prove and empirical studies that are relevant to only the problems at hand becomes ever more pronounced as the difficulty of the problems that we address increases. The theory provides the guidance that we need to gain a foothold on the problem, and practical experience provides the guidance that we need to develop implementations. Moreover, experience with practical problems suggests new directions for the theory, perpetuating the cycle that expands the class of practical problems that we can solve.

Ultimately, whichever approach we pursue, the goal is the same: We want a broad spectrum of problem-solving models, effective algorithms for solving problems within those models, and efficient implementations of those algorithms that can handle practical problems. The development of increasingly general problem-solving models (such as the shortest-paths, maxflow, and mincost-flow problems) and of increasingly powerful generic algorithms (such as the Bellman–Ford algorithm for the shortest-paths problem, the augmenting-path algorithm for the maxflow problem, and the network simplex algorithm for the mincost-maxflow problem) brought us a long way toward the goal. Much of this work was done in the 1950s and 1960s. The subsequent emergence of fundamental data structures (Parts 1 through 4) and of algorithms that provide effective implementations of these generic methods (this book) has been an essential force leading to our current ability to solve such a large class of huge problems.

References for Part Five

The algorithms textbooks listed below cover most of the basic graph-processing algorithms in Chapters 17 through 21. These books are basic references that give careful treatments of fundamental and advanced graph algorithms, with extensive references to the recent literature. The book by Even and the monograph by Tarjan are devoted to thorough coverage of many of the same topics that we have discussed. Tarjan's original paper on the application of depth-first search to solve strong connectivity and other problems merits further study. The source-queue topological sort implementation in Chapter 19 is from Knuth's book. Original references for some of the other specific algorithms that we have covered are listed below.

The algorithms for minimum spanning trees in dense graphs in Chapter 20 are quite old, but the original papers by Dijkstra, Prim, and Kruskal still make interesting reading. The survey by Graham and Hell provides a thorough and entertaining history of the problem. The paper by Chazelle is the state of the art in the quest for a linear MST algorithm.

The book by Ahuja, Magnanti, and Orlin is a comprehensive treatment of network-flow algorithms (and shortest-paths algorithms). Further information on nearly every topic covered in Chapters 21 and 22 may be found in that book. Another source for further material is the classic book by Papadimitriou and Steiglitz. Though most of that book is about much more advanced topics, it carefully treats many of the algorithms that we have discussed. Both books have extensive and detailed information about source material from the research literature. The classic work by Ford and Fulkerson is still worthy of study, as it introduced many of the fundamental concepts.

We have briefly introduced numerous advanced topics from (the yet to be published) Part 8, including reducibility, intractability, and linear programming, among several others. This reference list is focused on the material that we cover in detail and cannot do justice to these advanced topics. The algorithms texts treat many of them, and the book by Papadimitriou and Steiglitz provides a thorough introduction. There are numerous other books and a vast research literature on these topics.

R. K. Ahuja, T. L. Magnanti, and J. B. Orlin, Network Flows: Theory, Algorithms, and Applications, Prentice Hall, 1993.

B. Chazelle, “A minimum spanning tree algorithm with inverse-Ackermann type complexity,” Journal of the ACM, 47 (2000).

T. H. Cormen, C. L. Leiserson, and R. L. Rivest, Introduction to Algorithms, MIT Press and McGraw-Hill, 1990.

E. W. Dijkstra, “A note on two problems in connexion with graphs,” Numerische Mathematik, 1 (1959).

P. Erdös and A. Renyi, “On the evolution of random graphs,” Magyar Tud. Akad. Mat. Kutato Int Kozl, 5 (1960).

S. Even, Graph Algorithms, Computer Science Press, 1979.

L. R. Ford and D. R. Fulkerson, Flows in Networks, Princeton University Press, 1962.

H. N. Gabow, “Path-based depth-first search for strong and biconnected components,” Information Processing Letters, 74 (2000).

R. L. Graham and P. Hell, “On the history of the minimum spanning tree problem,” Annals of the History of Computing, 7 (1985).

D. B. Johnson, “Efficient shortest path algorithms,” Journal of the ACM, 24 (1977).

D. E. Knuth, The Art of Computer Programming. Volume 1: Fundamental Algorithms, third edition, Addison-Wesley, 1997.

J. R. Kruskal Jr., “On the shortest spanning subtree of a graph and the traveling salesman problem,” Proceedings AMS, 7, 1 (1956).

K. Mehlhorn, Data Structures and Algorithms 2: NP-Completeness and Graph Algorithms, Springer-Verlag, 1984.

C. H. Papadimitriou and K. Steiglitz, Combinatorial Optimization: Algorithms and Complexity, Prentice-Hall, 1982.

R. C. Prim, “Shortest connection networks and some generalizations,” Bell System Technical Journal, 36 (1957).

R. E. Tarjan, “Depth-first search and linear graph algorithms,” SIAM Journal on Computing, 1, 2 (1972).

R. E. Tarjan, Data Structures and Network Algorithms, Society for Industrial and Applied Mathematics, Philadelphia, PA, 1983.

Index

Abstracttransitiveclosure,174–175,216–220Activevertex,411Acyclicgraph.seeDigraph;Directedacyclicgraph(DAG)Acyclicnetwork,313–321,334–335

maxflow,427–429Adjacency-listsrepresentation,31–35

DFS,90,95,102digraphs,36–37,153,155,179edgeconnectivity,115–116find/removeedge,40flownetworks,379performance,33–35,37–40,145–146removingvertices,41standardadjacency-listsDFS,97,161transitiveclosure,174weightedgraphs/networks,37,230,235–238,278

Adjacency-matrixrepresentation,25–30DFS,90,95,101digraphs,36–37,153,154–155,169–172,176,179flownetworks,379linear-timealgorithm,59performance,29–30,37–40,144–146removingvertices,41standardadjacency-matrixDFS,97,161weightedgraphs/networks,37,230,233–238,278

Adjacentvertices,9ADT,graph.seeGraphADTAirlineroute,283All-pairsshortestpath,281,304–311

acyclicnetworks,318–320BFS,127–129

negativeweights,356–360pathrelaxation,288–290andtransitiveclosure,175,328–329

Arbitrage,348–349Arbitraryweight,229Arc,8,14Articulationpoint,117–118Assignmentproblem,74,476–477A*algorithm,325Augmenting-pathmethod,382–407

cyclecanceling,451longestaugmentingpath,396maximum-capacity,394,399–401,405–406networksimplexalgorithm,460–461performance,393,395–406,434–436andpreflow-push,411randomflownetworks,402–404randomized,402shortestaugmentingpath,393–394,398–399,405–407stack-based,400

Backedge,99,113–115,161–165,202–203Backlink,99–100Bellman-Fordalgorithm,350–356,358–360,447BFStree,124–129BinaryDAG,188–190Binarydecisiondiagram(BDD),189Binarytree,188–190Bioconnectivity,117–119Bipartitegraph,13,110–111Bipartitematching,73–74,433–436,476,482Booleanmatrixmultiplication,169–172,176–177Boruvka’salgorithm,244,263–267Breadth-firstsearch(BFS),81–82,121–130

andDFS,127–134

forest,125fringesize,137–138PFS,253,302

Bridge,113–117BridgesofKönigsbergproblem,62–64Call,program,5Capacity,flow,372,375st-cut,385vertex-capacityconstraints,426

changepriorityoperation,253Circuit,4Circulation,flownetwork,377–378,429Class

augmenting-paths,391cyclecanceling,449DFS,96flownetworks,379,432networksimplexalgorithm,466–467weightedgraphs,233–236seealsoGraphADT

Clique,12,75Colorabilityproblem,75

two-coloring,110–111Communicationsnetwork,367–369Complement,12Completegraph,12Computerscience,365Connection,3Connectivity,11–12

bioconnectivity,117–119digraphs,158edge,438–439equivalencerelations,184generalconnectivity,74,119

k-connectedgraph,119maxflow,437–438random,141–144simpleconnectivity,72,106–109st-connectivity,119strongconnectivity,72,156–158,205vertexconnectivity,119,438

Constructoperation,262Constraint,186Cost,227,372,443

convex/nonlinear,483edge,474flow,443negative,446–447reduced,454–457seealsoPerformance;Weightedgraph

Criticaledge,398–399Crossedge,161–162,202,204Crossingedge,241,385Cut,241

mincut,386–387property,240–241,245set,385st-cut,427–428,385–387vertex,117–118

Cutproblem,369–370,385–386Cycle,10–11

DFS,105–106directed,14,155,162–165even,74,224flownetworks,377–378,427MSTs,240–244negative,279–280,348–350,446–454odd,110

property,242–243Cycle-cancelingalgorithm,372,447–452seealsoNetworksimplexalgorithm

Cyclicpath,10,156,477–478DAG.seeDirectedacyclicgraph(DAG)deBruijngraph,53Decreasekeyoperation,255,268–269,297Degrees-of-separationgraph,53Delauneytriangulation,274Densegraph,13,29–30Depth-firstsearch(DFS),81–82,87–91

andBFS,127–134classes,96cycledetection,105–106digraphs,160–168fringesize,137–138PFS,253runningtime,95–96simpleconnectivity,106–109simplepath,57–59,106spanningforest,93,110standardadjacencyrepresentations,97,161strongcomponentsalgorithms,207–214topologicalsorting,193–195,201–204transitiveclosure,177–179treelinks,99–100andTrémauxexploration,87–90two-colorability,110–111two-wayEulertour,66,109andunion-find,108–109vertexsearch,109–110seealsoDFStree

Destinationvertex,14DFSforest,98–103,124–125

DAGs,194digraphs,161–162,164spanningforest,110

DFStree,90,98–103bridges,113–117diagraphs,161–162

d-heap,269,271–272Diameter,network,304Differenceconstraints,332–336,338,347Digraph,14–15,149–224

adjacency-listsrepresentation,36–37,153,155,179adjacency-matrixrepresentation,36–37,153,154–155,169–172,176,179connectivity,158,438decomposing,166defined,152DFSin,160–168directedcycledetection,162–166,187directedpath,152,155edgedirection,149–150evencycles,74grid,152isomorphism,150map,154random,220andrelations,182–183reverse,154–155runningtime,221–222single-sourcereachability,166–167strongcomponents,157–158,205–214stronglyconnected,72,156–158,205transitiveclosure,169–180,216–220andundirectedgraphs,153,155–162,165–167,179–180uniconnected,224weighted,277–278

Dijkstra’salgorithm,256,293–302acyclicnetworks,320all-pairsshortestpaths,305–307Euclideannetworks,323–326negativeweights,349,354–360andPrim’salgorithm,294–296

Directedacyclicgraph(DAG),14,150,186–204binary,188–190defined,155detection,162–165,187dominators,223kernel,157,216–220longestpath,199partialorders,184–185schedulingproblem,186–187sink/source,195–199strongcomponents,157topologicalsorting,187,191–199transitiveclosure,201–204weighted,313–321

Directedcycle,14,155,162–165,187Directedgraph.seeDigraph;Directedacyclicgraph(DAG)Directedpath,72,152,155,224Disjointpaths,11,117Distancesmatrix,289,306,360Distributionnetwork,444–445Distributionproblem,368,430,444–445,475Dominator,223Downlink,99–100,161–162,202–204Dummyedge,447–451,454–455,472Dummyvertex,376–377Dynamicalgorithm,21–22,50Dynamicreachability,223Edge,7–10

back,99,113–115,161–165,202–203backward/forward,383–384class,379connectivity,438–439costs,474critical,398–399cross,161–162,202,204crossing,241,385directed,14disjoint,11,117,437down,99–100,161–162,202–204dummy,447–451,454–455,472eligible,412,455–457,462–468flows,375

incident,9markingvertices,93parallel,8,18,27,34,47,237,278pointers,232,235–237,277–278,432random,47relaxation,286–287,292,323separation,112tree,99,113–115,161–162,202uncapacitated,380vector-of-edges,24,37

Edgedatatype,17,37,231–232,379,390Edge-basedpreflow-pushalgorithm,413–414Edge-connectedgraph,113,117,438–439Edge-separablegraph,112Edmonds,J.,393Eligibleedge,412,455–457,462–468Enumeration,graph,150–151Equalweights,228–229Equivalenceclass,183

Equivalencerelation,183Equivalentproblems,328Erdos,P.,144Euclideangraph,10

flownetworks,402–404MSTs,274–276neighborgraphs,49networks,322–326

Euclideanheuristic,324–326Eulerpath,62–68

directed,224mailcarrier,477–478two-wayEulertour,109

Evencycle,74,224Excess,vertex,411Existenceproblem,75–76Fatinterface,39Feasibleflow,410,430–432,444Feasiblespanningtree,453–455Feedbackvertexset,224Fibonaccicomputation,188,268FIFOqueue,123–125,132–138

PFS,302preflow-push,416–420topologicalsort,197–199

Find edge operation, 40–41
Flow cost, 443
Flow network, 15, 367–486
    augmenting-flow method, 382–407
    backward/forward edges, 383–384
    capacities, 372, 375, 385, 426
    circulation, 377–378, 429
    cut problems, 369–370, 385–386
    defined, 375
    distribution problems, 368, 430, 444–445, 475
    equilibrium, 376–377
    flows, 372, 375
    inflow/outflow, 375–377
    matching problems, 369, 433–436, 476, 482
    maxflow-mincut theorem, 385–387
    model, 373–374
    network simplex algorithm, 372, 453–470, 479
    preflow-push method, 410–423
    random, 402–404
    reductions, 367, 406, 425–440
    representations, 379
    residual, 388–391, 412, 417, 446, 453
    st-network, 375–377, 429–430
    value, 375–377
    see also Maxflow problem; Mincost flow problem; Network

Floyd’s algorithm, 176, 290, 304, 307–311
    negative weights, 349–351, 354

Ford, L. R., 382
Ford-Fulkerson method. see Augmenting-path method
Forest, 12
    BFS, 125
    Boruvka’s algorithm, 264
    DFS, 98–103, 109, 124–125, 161–162, 164, 194
    Kruskal’s algorithm, 258
    spanning, 109
    see also DFS tree; Tree

Four-color theorem, 76
Fringe, 131–138, 296
    Prim’s algorithm, 251–256
    priority queue, 393

Fulkerson, D. R., 382
Function call graph, 50–52
Gabow’s algorithm, 206, 211–214
General connectivity, 74, 119
Generalized queue, 416, 418
Geometric algorithm, 275–276
Goldberg, A., 410
Graph, 3–73
    applications, 4–5
    bipartite, 13, 110–111
    complete, 12
    connected, 11–12
    de Bruijn, 53
    defined, 7
    degrees-of-separation, 53
    dense/sparse, 13, 29–30
    directed/undirected, 14–15
    edge-connected/edge-separable, 112–113, 117
    Euclidean, 10
    function call, 50–52
    interval, 53
    isomorphic, 10, 29
    multigraph, 8
    neighbor, 49
    planar, 9, 73
    random, 47–48
    simple, 8
    static, 21–22
    subgraph, 9
    transaction, 49–50
    see also Digraph; Directed acyclic graph (DAG); Undirected graph; Weighted graph

Graph ADT, 16–24, 39
    adjacency-lists representation, 31–35
    adjacency-matrix representation, 25–30
    all-pairs shortest paths, 304–305
    connectivity interface, 21–22
    constructor, 18
    equivalence relations, 184
    graph-search functions, 91–97
    iterator, 18–20
    show function, 20
    symbol table, 50
    vertex degrees, 38
    weighted graphs, 230–239
    see also Class

Graph search, 81–147
    ADT functions, 91–97
    algorithm analysis, 140–146
    biconnectivity, 117–119
    generalized, 131–138
    Hamilton tour, 60
    maze exploration, 82–86
    priority-first search, 251–256, 296–302, 323, 392–393
    randomized, 136–138
    separability, 112–117
    simple path, 57–59, 61, 106
    see also Breadth-first search (BFS); Depth-first search (DFS)

Graph-processing problem, 70–79
    client, 19, 23
    degree of difficulty, 71, 77–79
    existence, 75–76
    intractable, 75
    NP-hard, 75, 77, 339–340
    tractable, 72

Grid digraph, 152
Hamilton path, 59–62, 224, 339–340
Height function, 412–415
Highest-vertex preflow-push, 420
Hypertext, 4
Immediate dominator, 223
Incident edge, 9
Indegree, 14, 153–154, 197
Independent set problem, 75
Indexing function, 51
Induced subgraph, 9
Infeasible problem, 336
Inflow, 375–377
Inheritance, 39
Initial height function, 413
Integer weights, 379–380
Interface, graph ADT, 16–24, 39
Intersection, 82
Interval graph, 53
Intractable problem, 75, 342
Irreflexive relation, 183
Isomorphic graphs, 10, 29, 77, 150
Item, 3
Jarnik’s algorithm, 256
Job placement, 369, 476
Job scheduling, 4, 150, 186–187
    negative cycles, 348–350
    shortest paths, 332–338, 344
    see also Topological sorting

Johnson, D., 268
Johnson’s algorithm, 360
Karp, R., 206, 393
k-connected graph, 119
Kernel DAG, 157, 216–220
k-neighbor graph, 49
Königsberg, bridges of, 62–64
Kosaraju’s algorithm, 206–208
Kruskal’s algorithm, 243–244, 258–263, 268
Kuratowski’s theorem, 73
Least common ancestor (LCA), 459–460
Length, path, 11
Library programming, 336, 485
LIFO stack, 132–134, 138
Linear programming (LP), 333, 342–343, 365
    maxflow, 439–440
    mincost flow, 425, 479
    multicommodity flow, 483
    network flow algorithms, 371–372

Linear quantity, 59
Link, 4, 8
    back, 99
    DFS tree, 99–100
    down, 99–100, 161–162, 202–204
    parent, 99–100, 165

Locality property, 48
Longest paths, 75, 199, 315–320
    augmenting paths, 396
    difference constraints, 334–335
    and shortest path, 329–330

Mail carrier problem, 74, 224, 477–478
Map, 4, 154, 326
Matching, 5, 73
    bipartite, 74, 433–436, 476
    maximum, 482
    maximum-cardinality, 433, 482
    minimum-distance point matching, 369

Mathematical programming, 343
Maxflow problem, 372, 375–378, 382–440
    acyclic networks, 427–429
    augmenting-path method, 382–407
    bipartite matching, 433–436
    capacity constraints, 426
    connectivity, 437–438
    feasible flow, 430–432
    general networks, 425–426
    linear programming, 439–440
    and mincost flow problem, 425, 472–473
    mincost maxflow, 443–447
    preflow-push method, 410–423
    reductions, 370, 406, 425–440
    running time, 434–436
    spanning tree, 454–455
    standard, 425–426
    undirected networks, 429–430

Maxflow-mincut theorem, 385–387, 437
Maximal connected subgraph, 11–12
Maximum-capacity augmenting path, 394, 399–401, 405–406
Maximum-cardinality matching, 433, 482
Maze, 82–86
Menger’s theorem, 119, 437
Merchandise distribution, 368, 430, 444–445, 475
Mincost flow problem, 372, 443–479
    assignment problem, 476–477
    cycle-canceling algorithm, 372, 447–452
    edge costs, 474
    eligible edges, 455–457
    feasible flow, 444–445, 472–473
    flow cost, 443
    LP formulation, 425, 479
    mail carrier, 477–478
    and maxflow problem, 425, 472–473
    mincost maxflow, 443–447
    reductions, 472–479
    running time, 451
    single-source shortest paths, 472–473
    transportation, 475–476
    see also Network simplex algorithm
Mincut, 386–387

Minimum spanning tree (MST), 72, 227–276
    Boruvka’s algorithm, 244, 263–267
    cut and cycle properties, 240–244
    defined, 228
    equal weights, 228–229
    Euclidean, 274–276
    Kruskal’s algorithm, 243–244, 258–263, 268
    performance, 268–273, 269
    Prim’s algorithm, 243, 247–256, 263, 269
    representations, 237–239
    weighted-graph ADTs, 230–239

Minimum-distance point matching, 369
Modular arithmetic, 183–184
Multicommodity flow, 482–483
Multigraph, 8
Multisource paths, 314–318
Negative cost, 446–447
Negative cycle, 279–280, 348–350, 446–454
Negative weight, 345–365
    arbitrage, 348–349
    Bellman-Ford algorithm, 350–356, 358
    Dijkstra’s algorithm, 349, 354–360
    Floyd’s algorithm, 349–351, 354

Neighbor graph, 49, 144
Network, 5, 15, 277–284
    acyclic, 313–321, 334–335, 427–429
    adjacency representations, 37
    communications, 368–369
    distribution, 444–445
    reliability, 370
    residual, 388–391, 412, 417, 446, 453
    reweighting, 324, 357–360
    st-network, 375–378, 429–430
    telephone, 370
    undirected, 429–430
    weighted diameter, 304
    see also Flow network; Shortest path

Network simplex algorithm, 372, 453–470
    assignment problem, 476–477
    eligible edges, 455–457, 462–468
    feasible spanning tree, 453–455
    implementations, 466–469
    initialization, 464
    parent-link representation, 458–464
    performance, 466–470
    shortest paths, 472–473
    simplex method, 479
    vertex potentials, 454–456, 468
    see also Mincost flow problem

Node, 8
Nonlinear cost, 483
NP-hard problem, 75, 77, 339–340
Online algorithm, 22
Operations research (OR), 343, 365
Optimization, 76
Ordered pair, 14, 182–183
Outdegree, 14, 153–154
Outflow, 375–378
Parallel edges, 8, 18, 27
    adjacency-lists representation, 34
    networks, 278
    random graphs, 47
    weighted graphs, 237

Parent-link representation
    BFS tree, 127
    cycle detection, 165
    DFS tree, 99–100
    network simplex algorithm, 458–464
Partial order, 184–185
Passage, 82
Path, 10–11, 56–58
    cyclic, 10, 156, 477–478
    directed, 72, 152, 155, 224
    disjoint, 11, 117
    Euler, 62–64, 109
    Hamilton, 59–62, 224, 339–340
    mail carrier, 477–478
    relaxation, 286, 288–290, 292
    simple, 57–59, 61, 106, 279
    weight, 277
    see also Longest path; Shortest path

Paths matrix, 289–290, 306, 360
Performance, 6, 77–79
    abstraction, 42–43
    adjacency-lists representation, 33–35, 37–40, 145–146
    adjacency-matrix representation, 29–30, 37–40, 144–146
    augmenting-path method, 393, 395–406, 434–436
    cycle canceling, 452
    dense/sparse graphs, 13
    DFS forests, 103
    Dijkstra’s algorithm, 297–300, 324–326
    equivalent problems, 330–331
    Kruskal’s algorithm, 260–262
    MST algorithms, 268–273
    network simplex algorithm, 466–470
    path search, 61–62, 64–65
    PFS, 255–256, 297–300
    preflow-push, 419–423
    preprocessing time, 106–108, 127, 216–217, 221–222
    random graphs, 54
    shortest-paths algorithms, 363–365
    static graphs, 42
    transitive closure, 179–180, 221–222
    union-find algorithm, 145
    vector-of-edges, 37–38
    worst-case, 43–44
    see also Running time

PERT chart, 150, 345
Planar graph, 9, 73
Planarity problem, 73
Pointer, edge, 232, 235–237, 277–278, 432
Polynomial-time reduction, 439
Postorder numbering, 101, 162, 193
Potential function, 416–419, 454–456, 468
Precedence constraint, 186, 332
Preflow, 410–411
Preflow-push method, 410–423
    edge-based, 413–414
    highest-vertex, 420
    performance, 419–423
    vertex-based, 416

Preorder numbering, 98–99, 101, 162
Preprocessing
    DFS, 106–108
    shortest paths, 127, 304–305
    transitive closure, 216–217, 221–222

Prim’s algorithm, 243, 247–256, 263
    and Dijkstra’s algorithm, 294–296
    running time, 269

Priority queue, 136, 268–269
    augmenting-path method, 392–393
    Dijkstra’s algorithm, 296–302
    Kruskal’s algorithm, 261–263
    multiway heap implementation, 272

Priority-first search (PFS), 323
    augmenting-path method, 392–393
    Prim’s algorithm, 251–256

Program structure, 5
Pushdown stack, 123
Quicksort, 260, 262, 268
Radar tracking system, 369
Radix sort, 260, 268
Random flow network, 402–404
Random graph, 47–48, 141–144
Randomized queue, 137, 402
Reachability, 152, 158, 166–167
    DAGs, 201–204
    digraphs, 205
    dynamic, 223
    see also Transitive closure

Reduced cost, 454–457
Reduction, 177, 283, 484
    difference constraints, 333–336, 338
    equivalent problems, 328
    flow networks, 367
    implications, 341
    job scheduling, 332–338
    linear programming, 333, 342–343
    maxflow, 370, 406, 425–440
    mincost flow, 472–479
    polynomial-time, 439
    shortest paths, 328–343
    transitive, 223
    upper/lower bounds, 342

Reflexive relation, 183
Relation, 182–183
Relaxation, 285–292, 323
Remove edge operation, 34, 40–41
Remove the minimum operation, 251, 268–271
Renyi, A., 144
Residual network, 388–391, 453
    mincost flow, 446
    preflow-push, 412, 417

Reverse topological sorting, 193–195
Reweighting, 324, 357–360
Road map, 282–283
Running time, 141
    augmenting-path method, 393, 405
    Bellman-Ford algorithm, 354–356
    Boruvka’s algorithm, 265
    DFS, 95–96
    digraphs, 221–222
    Dijkstra’s algorithm, 296, 299, 311
    Floyd’s algorithm, 309–311
    Kruskal’s algorithm, 260–262
    maxflow, 434–436
    mincost flow, 451
    MST algorithms, 269–270
    NP-hard problems, 339
    path search, 61
    preflow-push, 416–419
    Prim’s algorithm, 269
    random graph connectivity, 144
    shortest path algorithms, 301
    strong components in digraphs, 205–206
    transitive closure, 172–174, 177, 180, 221–222
    weighted graphs, 235
    see also Performance

Scheduling problem, 4, 150, 483
    DAGs, 186–187
    precedence-constrained, 332–338
    see also Topological sorting
Search. see Graph search

Selection, 331
Self-loop, 8, 18, 28
    digraphs, 152
    networks, 278
    relations, 183

Sentinel weight, 235, 278, 287
Separability, 112–117
Separation vertex, 117–118
Set-inclusion DAG, 184
Shortest path, 176, 277–365
    acyclic networks, 313–321
    augmenting paths, 393–394, 398–399, 405–407
    Bellman-Ford algorithm, 350–356, 358–360
    BFS, 121–123
    defined, 279
    Dijkstra’s algorithm, 293–302, 305–307, 349, 354–360
    Euclidean networks, 322–326
    Floyd’s algorithm, 304, 307–311, 349–351, 354–356
    and longest path, 329–330
    multisource, 314–318
    negative weights, 345–361
    network simplex algorithm, 472–473
    NP-hard problems, 339–340
    performance, 301, 363–365
    reduction, 328–343
    relaxation, 285–292
    shortest paths tree (SPT), 279, 287–288, 294
    source-sink, 281, 294, 322–326
    terminology, 286, 313
    see also All-pairs shortest path; Network; Single-source shortest path

Simple connectivity, 72, 106–109
Simple graph, 8
Simple path, 10, 57–59, 61
    DFS, 106
    networks, 279
Simplex method, 479
Single-source longest paths, 334–335
Single-source shortest paths, 72, 127, 281
    acyclic networks, 314–315
    Dijkstra’s algorithm, 293–294
    and mincost flow problem, 472–473
    negative weights, 350–354
    reachability, 166–167

Sink, 154, 195–197, 281, 375–378
Software library, 485
Sollin, G., 267
Sorting, 331
Source vertex, 14, 154, 195–197, 281, 375–378
Source-sink shortest path, 281, 294, 322–326
Spanning forest, 12, 93, 110
Spanning tree, 12, 72
    feasible, 453–455
    maxflow, 454–455
    network simplex algorithm, 457–464
    see also Minimum spanning tree

Sparse graph, 13, 29
st-connectivity, 119
st-cut, 385–387, 427–428
st-network, 375–378, 429–430
Stack, LIFO, 132–134, 138
Stack-based augmenting-path method, 400, 402
Static graph, 18–19, 42, 50
STL list, 31–33
Strong components, 157–158, 205–214
    transitive closure, 217–219
Strong connectivity, 72, 156–158, 205
Subgraph, 9
    forbidden, 73
    maximal connected, 11–12
Subset inclusion, 184
Supply lines problem, 370, 386–387
Symbol table
    find edge, 41
    graph-building function, 50–52
    indexing function, 51

Symmetric relation, 183
Tarjan, R. E., 410
Tarjan’s algorithm, 73, 206, 210–213
Telephone network, 370
Topological sorting, 150, 187, 191–199
    DFS, 193–195, 201–204
    multisource shortest-paths, 314–318
    relabeling/rearrangement, 191–192
    reverse, 193–195

Total order, 185
Tour, 10
    Euler, 62–64, 109
    Hamilton, 59–62
    mail carrier problem, 74
    traveling salesperson problem, 76

Tractable problem, 72
Traffic flow, 369
Transaction, 5
Transaction graph, 49–50
Transitive closure, 72, 169–180
    abstract, 174–175, 216–220
    and all-pairs shortest paths, 175, 328–329
    Boolean matrix multiplication, 169–172, 176–177
    DAGs, 201–204
    DFS-based, 177–179
    performance, 172–174, 177, 179–180, 221–222
    of a relation, 183
    Warshall’s algorithm, 172–175
Transitive reduction, 223
Transitive relation, 183
Transportation problem, 368, 475–476
Traveling salesperson problem, 76
Tree, 12, 72
    BFS, 124–129
    binary, 188–190
    and DAGs, 188–190
    DFS, 90, 98–103, 113–117, 161–162
    edge, 99, 113–115, 161–162, 202
    preorder tree traversal, 101
    shortest-path tree (SPT), 279, 287–288, 294
    spanning, 453–455, 457–464
    see also Minimum spanning tree (MST)

Tree link, 99–100
Trémaux exploration, 83–86, 87–90, 109
Two-coloring, 110–111
Two-way Euler tour, 66, 109
Uncapacitated edge, 381
Undirected graph, 14–15, 36
    and digraphs, 153, 155–162, 165–167, 179–180
    networks, 429–430
    reachability, 205
    underlying, 15

Uniconnected digraph, 224
Union, 12
Union-find algorithm, 22, 43
    Boruvka’s algorithm, 264–265
    and DFS, 108–109
    Kruskal’s algorithm, 263
    performance, 145

Vector, vertex-indexed, 37, 96
Vector-of-edges representation, 24, 37
Vertex, 7–10
    active, 411
    adjacent, 9
    connectivity, 119, 438
    cut, 117–118
    degree of, 9
    destination, 14
    disjoint, 11
    dummy, 376–377
    excess, 411
    fringe, 253
    height, 412–415
    indegree/outdegree, 14
    inflow/outflow, 375–378
    marking, 92–93
    ordered pair, 14
    potentials, 416–419, 454–456, 468
    printing, 19–20
    reachable, 152, 158
    removing, 41
    search, 109–110
    separation, 117–118
    sink/source, 14, 154, 195–197, 281, 375–378

Vertex-based preflow-push algorithm, 416
Vertex-indexed vector, 37, 96
V-vertex graph, 7
Warshall’s algorithm, 172–175, 289–290, 307–309
Weighted diameter, 304
Weighted graph, 15, 227–229
    adjacency-matrix representation, 37
    ADTs, 230–239
    arbitrary weights, 228
    bipartite matching, 74
    digraphs, 277–278
    edge weights, 234–235
    equal weights, 228–229
    integer weights, 379–380
    negative weights, 345–365
    path weight, 277
    reweighting, 324, 357–360
    sentinel weight, 235, 278
    see also Minimum spanning tree (MST); Shortest path
Whitney’s theorem, 119
World Wide Web, 4
