Getting Started with TensorFlow

Table of Contents

Getting Started with TensorFlow
Credits
About the Author
About the Reviewer
www.PacktPub.com
    eBooks, discount offers, and more
    Why subscribe?
Preface
    What this book covers
    What you need for this book
    Who this book is for
    Conventions
    Reader feedback
    Customer support
        Downloading the example code
        Downloading the color images of this book
        Errata
        Piracy
        Questions
1. TensorFlow – Basic Concepts
    Machine learning and deep learning basics
        Supervised learning
        Unsupervised learning
        Deep learning
    TensorFlow – A general overview
    Python basics
        Syntax
        Data types
        Strings
        Control flow
        Functions
        Classes
        Exceptions
        Importing a library
    Installing TensorFlow
        Installing on Mac or Linux distributions
        Installing on Windows
        Installation from source
        Testing your TensorFlow installation
    First working session
    Data Flow Graphs
    TensorFlow programming model
    How to use TensorBoard
    Summary
2. Doing Math with TensorFlow
    The tensor data structure
        One-dimensional tensors
        Two-dimensional tensors
            Tensor handling
        Three-dimensional tensors
        Handling tensors with TensorFlow
            Prepare the input data
    Complex numbers and fractals
        Prepare the data for Mandelbrot's set
        Build and execute the Data Flow Graph for Mandelbrot's set
        Visualize the result for Mandelbrot's set
        Prepare the data for Julia's set
        Build and execute the Data Flow Graph for Julia's set
        Visualize the result
    Computing gradients
    Random numbers
        Uniform distribution
        Normal distribution
        Generating random numbers with seeds
        Monte Carlo's method
    Solving partial differential equations
        Initial condition
        Model building
        Graph execution
        Computational function used
    Summary
3. Starting with Machine Learning
    The linear regression algorithm
        Data model
        Cost functions and gradient descent
        Testing the model
    The MNIST dataset
        Downloading and preparing the data
    Classifiers
        The nearest neighbor algorithm
            Building the training set
            Cost function and optimization
            Testing and algorithm evaluation
    Data clustering
        The k-means algorithm
            Building the training set
            Cost functions and optimization
            Testing and algorithm evaluation
    Summary
4. Introducing Neural Networks
    What are artificial neural networks?
        Neural network architectures
    Single Layer Perceptron
        The logistic regression
        TensorFlow implementation
            Building the model
            Launch the session
            Test evaluation
            Source code
    Multi Layer Perceptron
        Multi Layer Perceptron classification
            Build the model
            Launch the session
            Source code
        Multi Layer Perceptron function approximation
            Build the model
            Launch the session
    Summary
5. Deep Learning
    Deep learning techniques
    Convolutional neural networks
        CNN architecture
        TensorFlow implementation of a CNN
            Initialization step
            First convolutional layer
            Second convolutional layer
            Densely connected layer
            Readout layer
            Testing and training the model
            Launching the session
            Source code
    Recurrent neural networks
        RNN architecture
        LSTM networks
        NLP with TensorFlow
            Download the data
            Building the model
            Running the code
    Summary
6. GPU Programming and Serving with TensorFlow
    GPU programming
    TensorFlow Serving
        How to install TensorFlow Serving
            Bazel
            gRPC
            TensorFlow Serving dependencies
            Install Serving
        How to use TensorFlow Serving
            Training and exporting the TensorFlow model
            Running a session
            Loading and exporting a TensorFlow model
            Test the server
    Summary
Getting Started with TensorFlow

Copyright © 2016 Packt Publishing

All rights reserved. No part of this book may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, without the prior written permission of the publisher, except in the case of brief quotations embedded in critical articles or reviews.

Every effort has been made in the preparation of this book to ensure the accuracy of the information presented. However, the information contained in this book is sold without warranty, either express or implied. Neither the author, nor Packt Publishing, nor its dealers and distributors will be held liable for any damages caused or alleged to be caused directly or indirectly by this book.

Packt Publishing has endeavored to provide trademark information about all of the companies and products mentioned in this book by the appropriate use of capitals. However, Packt Publishing cannot guarantee the accuracy of this information.

First published: July 2016

Production reference: 1190716

Published by Packt Publishing Ltd.
Livery Place
35 Livery Street
Birmingham
B3 2PB, UK.

ISBN 978-1-78646-857-4

www.packtpub.com
Credits

Author: Giancarlo Zaccone
Reviewer: Jayani Withanawasam
Commissioning Editor: Veena Pagare
Acquisition Editor: Vinay Argekar
Content Development Editor: Sumeet Sawant
Technical Editor: Deepti Tuscano
Copy Editor: Alpha Singh
Project Coordinator: Shweta H Birwatkar
Proofreader: Safis Editing
Indexer: Mariammal Chettiyar
Production Coordinator: Nilesh Mohite
Cover Work: Nilesh Mohite
About the Author

Giancarlo Zaccone has more than 10 years of experience managing research projects in both the scientific and industrial domains. He worked as a researcher at the C.N.R., the National Research Council, where he was involved in projects related to parallel numerical computing and scientific visualization.

Currently, he is a senior software engineer at a consulting company, developing and maintaining software systems for space and defence applications.

Giancarlo holds a master's degree in physics from the Federico II of Naples and a 2nd-level postgraduate master's course in scientific computing from La Sapienza of Rome.

He has already been a Packt author for the following book: Python Parallel Programming Cookbook.

You can contact him at https://it.linkedin.com/in/giancarlozaccone
About the Reviewer

Jayani Withanawasam is a senior software engineer in the Zaizi Asia Research and Development team. She is the author of the book Apache Mahout Essentials, on scalable machine learning. She was a summit speaker at Alfresco Summit 2014, London; her talk was about applications of machine learning techniques in smart enterprise content management (ECM) solutions. She presented her research "Content Extraction and Context Inference based Information Retrieval" at the Women in Machine Learning (WiML) 2015 workshop, which was co-located with the Neural Information Processing Systems (NIPS) 2015 conference in Montreal, Canada.

Jayani is currently pursuing an MSc in Artificial Intelligence at the University of Moratuwa, Sri Lanka. She has strong research interests in machine learning and computer vision.

You can contact her at https://lk.linkedin.com/in/jayaniwithanawasam
www.PacktPub.com

eBooks, discount offers, and more

Did you know that Packt offers eBook versions of every book published, with PDF and ePub files available? You can upgrade to the eBook version at www.PacktPub.com and, as a print book customer, you are entitled to a discount on the eBook copy. Get in touch with us at customercare@packtpub.com for more details.

At www.PacktPub.com, you can also read a collection of free technical articles, sign up for a range of free newsletters and receive exclusive discounts and offers on Packt books and eBooks.

https://www2.packtpub.com/books/subscription/packtlib

Do you need instant solutions to your IT questions? PacktLib is Packt's online digital book library. Here, you can search, access, and read Packt's entire library of books.

Why subscribe?

Fully searchable across every book published by Packt
Copy and paste, print, and bookmark content
On demand and accessible via a web browser
Preface

TensorFlow is an open source software library used to implement machine learning and deep learning systems.

Behind these two names hide a series of powerful algorithms that share a common challenge: allowing a computer to learn how to automatically recognize complex patterns and make the smartest decisions possible.

Machine learning algorithms are supervised or unsupervised; simplifying as much as possible, we can say that the biggest difference is that in supervised learning the programmer instructs the computer how to do something, whereas in unsupervised learning the computer learns all by itself.

Deep learning is instead a newer area of machine learning research, introduced with the objective of moving machine learning closer to artificial intelligence goals. This means that deep learning algorithms try to operate like the human brain.

With the aim of conducting research in these fascinating areas, the Google team developed TensorFlow, which is the subject of this book.

To introduce TensorFlow's programming features, we have used the Python programming language. Python is fun and easy to use; it is a true general-purpose language and is quickly becoming a must-have tool in the arsenal of any self-respecting programmer.

It is not the aim of this book to completely describe all TensorFlow objects and methods; instead, we will introduce the important system concepts and lead you up the learning curve as fast and efficiently as we can. Each chapter of the book presents a different aspect of TensorFlow, accompanied by several programming examples that reflect typical issues of machine and deep learning.

Although it is large and complex, TensorFlow is designed to be easy to use once you learn about its basic design and programming methodology.

The purpose of Getting Started with TensorFlow is to help you do just that.

Enjoy reading!
What this book covers

Chapter 1, TensorFlow – Basic Concepts, contains general information on the structure of TensorFlow and the issues for which it was developed. It also provides basic programming guidelines for the Python language and a first TensorFlow working session after the installation procedure. The chapter ends with a description of TensorBoard, a powerful tool for optimization and debugging.

Chapter 2, Doing Math with TensorFlow, describes TensorFlow's mathematical processing capabilities. It covers programming examples ranging from basic algebra up to partial differential equations. The basic data structure in TensorFlow, the tensor, is also explained.

Chapter 3, Starting with Machine Learning, introduces some machine learning models. We start by implementing the linear regression algorithm, which is concerned with modeling relationships between data. The main focus of the chapter is on solving two basic problems in machine learning: classification, that is, how to assign each new input to one of the possible given categories; and data clustering, which is the task of grouping a set of objects in such a way that objects in the same group are more similar to each other than to those in other groups.

Chapter 4, Introducing Neural Networks, provides a quick and detailed introduction to neural networks. These are mathematical models that represent the interconnection between elements, the artificial neurons. They are mathematical constructs that to some extent mimic the properties of living neurons. Neural networks build the foundation on which the architecture of deep learning algorithms rests. Two basic types of neural nets are then implemented: the Single Layer Perceptron and the Multi Layer Perceptron, for classification problems.

Chapter 5, Deep Learning, gives an overview of deep learning algorithms. Only in recent years has deep learning collected a large number of results considered unthinkable a few years ago. We'll show how to implement two fundamental deep learning architectures, convolutional neural networks (CNN) and recurrent neural networks (RNN), for image recognition and speech translation problems respectively.

Chapter 6, GPU Programming and Serving with TensorFlow, shows the TensorFlow facilities for GPU computing and introduces TensorFlow Serving, a high-performance open source serving system for machine learning models, designed for production environments and optimized for TensorFlow.
What you need for this book

All the examples have been implemented using Python version 2.7 on an Ubuntu Linux 64-bit machine, with the TensorFlow library version 0.7.1.

You will also need the following Python modules (preferably the latest versions):

Pip
Bazel
Matplotlib
NumPy
Pandas
Who this book is for

The reader should have a basic knowledge of programming and math concepts and, at the same time, want to be introduced to the topics of machine and deep learning. After reading this book, you will be able to master TensorFlow's features to build powerful applications.
Conventions

In this book, you will find a number of text styles that distinguish between different kinds of information. Here are some examples of these styles and an explanation of their meaning.

Code words in text, database table names, folder names, filenames, file extensions, pathnames, dummy URLs, user input, and Twitter handles are shown as follows: "The instructions for flow control are if, for, and while."

Any command-line input or output is written as follows:

>>> myvar = 3
>>> myvar += 2
>>> myvar
5
>>> myvar -= 1
>>> myvar
4

New terms and important words are shown in bold. Words that you see on the screen, for example, in menus or dialog boxes, appear in the text like this: "The shortcuts in this book are based on the Mac OS X 10.5+ scheme."

Note

Warnings or important notes appear in a box like this.

Tip

Tips and tricks appear like this.
Reader feedback

Feedback from our readers is always welcome. Let us know what you think about this book - what you liked or disliked. Reader feedback is important for us as it helps us develop titles that you will really get the most out of. To send us general feedback, simply e-mail [email protected], and mention the book's title in the subject of your message. If there is a topic that you have expertise in and you are interested in either writing or contributing to a book, see our author guide at www.packtpub.com/authors.
Customer support

Now that you are the proud owner of a Packt book, we have a number of things to help you to get the most from your purchase.

Downloading the example code

You can download the example code files for this book from your account at http://www.packtpub.com. If you purchased this book elsewhere, you can visit http://www.packtpub.com/support and register to have the files e-mailed directly to you.

You can download the code files by following these steps:

1. Log in or register to our website using your e-mail address and password.
2. Hover the mouse pointer on the SUPPORT tab at the top.
3. Click on Code Downloads & Errata.
4. Enter the name of the book in the Search box.
5. Select the book for which you're looking to download the code files.
6. Choose from the drop-down menu where you purchased this book from.
7. Click on Code Download.

Once the file is downloaded, please make sure that you unzip or extract the folder using the latest version of:

WinRAR / 7-Zip for Windows
Zipeg / iZip / UnRarX for Mac
7-Zip / PeaZip for Linux

The code bundle for the book is also hosted on GitHub at https://github.com/PacktPublishing/Getting-Started-with-TensorFlow. We also have other code bundles from our rich catalog of books and videos available at https://github.com/PacktPublishing/. Check them out!

Downloading the color images of this book

We also provide you with a PDF file that has color images of the screenshots/diagrams used in this book. The color images will help you better understand the changes in the output. You can download this file from http://www.packtpub.com/sites/default/files/downloads/GettingStartedwithTensorFlow_ColorImages.pdf

Errata

Although we have taken every care to ensure the accuracy of our content, mistakes do happen. If you find a mistake in one of our books - maybe a mistake in the text or the code - we would be grateful if you could report this to us. By doing so, you can save other readers from frustration and help us improve subsequent versions of this book. If you find any errata, please report them by visiting http://www.packtpub.com/submit-errata, selecting your book, clicking on the Errata Submission Form link, and entering the details of your errata. Once your errata are verified, your submission will be accepted and the errata will be uploaded to our website or added to any list of existing errata under the Errata section of that title.

To view the previously submitted errata, go to https://www.packtpub.com/books/content/support and enter the name of the book in the search field. The required information will appear under the Errata section.

Piracy

Piracy of copyrighted material on the Internet is an ongoing problem across all media. At Packt, we take the protection of our copyright and licenses very seriously. If you come across any illegal copies of our works in any form on the Internet, please provide us with the location address or website name immediately so that we can pursue a remedy.

Please contact us at copyright@packtpub.com with a link to the suspected pirated material.

We appreciate your help in protecting our authors and our ability to bring you valuable content.

Questions

If you have a problem with any aspect of this book, you can contact us at [email protected], and we will do our best to address the problem.
Chapter 1. TensorFlow – Basic Concepts

In this chapter, we'll cover the following topics:

Machine learning and deep learning basics
TensorFlow – A general overview
Python basics
Installing TensorFlow
First working session
Data Flow Graphs
TensorFlow programming model
How to use TensorBoard
Machine learning and deep learning basics

Machine learning is a branch of artificial intelligence, and more specifically of computer science, which deals with the study of systems and algorithms that can learn from data, synthesizing new knowledge from it.

The word learn intuitively suggests that a system based on machine learning may, on the basis of the observation of previously processed data, improve its knowledge in order to achieve better results in the future, or provide output closer to the desired output for that particular system.

The ability of a program or a system based on machine learning to improve its performance in a particular task, thanks to past experience, is strongly linked to its ability to recognize patterns in the data. This theme, called pattern recognition, is therefore of vital importance and of increasing interest in the context of artificial intelligence; it is the basis of all machine learning techniques.

The training of a machine learning system can be done in different ways:
Supervised learning
Unsupervised learning

Supervised learning

Supervised learning is the most common form of machine learning. With supervised learning, a set of examples, the training set, is submitted as input to the system during the training phase, where each example is labeled with the respective desired output value. For example, let's consider a classification problem, where the system must attribute some experimental observations to one of the N different classes already known. In this problem, the training set is presented as a sequence of pairs of the type {(X1, Y1), ....., (Xn, Yn)} where Xi are the input vectors (feature vectors) and Yi represents the desired class for the corresponding input vector. Most supervised learning algorithms share one characteristic: the training is performed by the minimization of a particular loss function (cost function), which represents the output error with respect to the desired output.

The cost function most used for this type of training computes the mean squared error between the desired output and the one supplied by the system. After training, the accuracy of the model is measured on a set of examples disjoint from the training set, the so-called validation set.

Supervised learning workflow

In this phase, the model's generalization capability is verified: we test whether the output is correct for an input not used during the training phase.
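The loss-minimization idea above can be sketched in a few lines of plain Python (a toy illustration, not code from the book): we fit a one-parameter model y = w*x to a small labeled training set by gradient descent on the mean squared error. The data points and the learning rate are invented for the example.

```python
# Toy supervised learning: fit y = w * x to labeled pairs (x_i, y_i)
# by gradient descent on the mean squared error (the cost function).
train_set = [(1.0, 2.1), (2.0, 3.9), (3.0, 6.2)]  # (input, desired output)

def mse(w):
    # mean squared error between desired and produced outputs
    return sum((y - w * x) ** 2 for x, y in train_set) / len(train_set)

w = 0.0                        # initial guess for the model parameter
learning_rate = 0.01
for _ in range(1000):
    # analytic gradient of the MSE with respect to w
    grad = sum(-2 * x * (y - w * x) for x, y in train_set) / len(train_set)
    w -= learning_rate * grad  # move against the gradient

print(round(w, 2))             # prints 2.04, the least-squares slope
```

After training, the learned w could then be checked on a separate validation set, exactly as the text describes.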
Unsupervised learning

In unsupervised learning, the training examples provided to the system are not labeled with the class they belong to. The system, therefore, develops and organizes the data, looking for common characteristics among them, and changing its organization based on its internal knowledge.

Unsupervised learning algorithms are particularly used in clustering problems, in which a number of input examples are present, you do not know the class a priori, and you do not even know what the possible classes are, or how numerous they are. This is a clear case where you cannot use supervised learning, because you do not know a priori the number of classes.

Unsupervised learning workflow
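To make the clustering idea concrete, here is a minimal sketch in plain Python (not from the book) of the k-means algorithm on unlabeled one-dimensional points; the data values and the two-cluster choice are invented for the example.

```python
# Toy unsupervised learning: group unlabeled 1-D points into two
# clusters with k-means (alternating assignment and update steps).
points = [1.0, 1.2, 0.8, 8.0, 8.3, 7.9]  # no class labels are given
centroids = [points[0], points[3]]       # naive initialization

for _ in range(10):
    # assignment step: attach each point to its nearest centroid
    clusters = [[], []]
    for p in points:
        nearest = min((0, 1), key=lambda i: abs(p - centroids[i]))
        clusters[nearest].append(p)
    # update step: move each centroid to the mean of its cluster
    centroids = [sum(c) / len(c) for c in clusters]

print([round(c, 2) for c in centroids])  # prints [1.0, 8.07]
```

The algorithm discovers the two groups purely from the structure of the data, without ever being told what the classes are.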
Deep learning

Deep learning techniques represent a remarkable step forward taken by machine learning in recent decades, having provided results never seen before in many applications, such as image and speech recognition or Natural Language Processing (NLP). There are several reasons that led to deep learning being developed and placed at the center of the field of machine learning only in recent decades. One reason, perhaps the main one, is surely represented by progress in hardware, with the availability of new processors, such as graphics processing units (GPUs), which have greatly reduced the time needed for training networks, lowering it by a factor of 10 or 20. Another reason is certainly the ever more numerous datasets on which to train a system, needed to train architectures of a certain depth and with a high dimensionality for the input data.

Deep learning workflow

Deep learning is based on the way the human brain processes information and learns, responding to external stimuli. It consists of a machine learning model at several levels of representation, in which the deeper levels take as input the outputs of the previous levels, transforming them and always abstracting more. Each level in this hypothetical model corresponds to a different area of the cerebral cortex: when the brain receives images, it processes them through various stages such as edge detection and form perception, that is, from a primitive representation level to the most complex. For example, in an image classification problem, each block gradually extracts the features, at various levels of abstraction, taking as input data already processed by means of filtering operations.
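The layered processing just described can be caricatured in plain Python (an invented illustration, not code from the book): each "level" consumes the output of the previous one and produces a slightly more abstract representation of the raw input.

```python
# Toy layered representation: level 2 feeds on the output of level 1,
# abstracting the raw input a little more at each stage.
raw_signal = [0.2, 0.8, 0.8, 0.2]  # a primitive input representation

def detect_edges(signal):
    # level 1 ("edge detection"): differences between neighbours
    return [b - a for a, b in zip(signal, signal[1:])]

def perceive_form(edge_map):
    # level 2 ("form perception"): summarize the edges into one feature
    return sum(abs(e) for e in edge_map)

representation = perceive_form(detect_edges(raw_signal))
print(round(representation, 2))  # prints 1.2
```

A real deep network stacks many such transformations and learns them from data instead of hand-coding them, but the flow of information from level to level is the same.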
TensorFlow – A general overview

TensorFlow (https://www.tensorflow.org/) is a software library, developed by the Google Brain Team within Google's Machine Intelligence research organization, for the purposes of conducting machine learning and deep neural network research. TensorFlow combines computational algebra with compilation optimization techniques, making easy the calculation of many mathematical expressions where the problem is the time required to perform the computation.

The main features include:

Defining, optimizing, and efficiently calculating mathematical expressions involving multi-dimensional arrays (tensors).
Programming support for deep neural networks and machine learning techniques.
Transparent use of GPU computing, automating the management and optimization of the memory and of the data used. You can write the same code and run it either on CPUs or GPUs. More specifically, TensorFlow will figure out which parts of the computation should be moved to the GPU.
High scalability of computation across machines and huge datasets.

TensorFlow homepage

TensorFlow is available with Python and C++ support, and we shall use Python 2.7 for learning, as indeed the Python API is better supported and much easier to learn. The Python installation depends on your system; the download page (https://www.python.org/downloads/) contains all the information needed for its installation. In the next section, we explain very briefly the main features of the Python language, with some programming examples.
Python basics

Python is a strongly typed and dynamic language (data types are enforced, but it is not necessary to declare them explicitly), case-sensitive (var and VAR are two different variables), and object-oriented (everything in Python is an object).
Syntax

In Python, a line terminator is not required, and blocks are specified with indentation. Indent to begin a block and remove indentation to conclude it; that's all. Instructions that require an indented block end with a colon (:). Comments begin with the hash sign (#) and are single-line. Strings on multiple lines are used for multi-line comments. Assignments are accomplished with the equal sign (=). For equality tests, we use the double equal (==) symbol. You can increase and decrease a value by using += and -= followed by the addend. This works with many data types, including strings. You can assign and use multiple variables on the same line.

Following are some examples:

>>> myvar = 3
>>> myvar += 2
>>> myvar
5
>>> myvar -= 1
>>> myvar
4

"""This is a comment"""

>>> mystring = "Hello"
>>> mystring += " world."
>>> print mystring
Hello world.

The following code swaps two variables in one line:

>>> myvar, mystring = mystring, myvar
Data types

The most significant structures in Python are lists, tuples, and dictionaries. Sets have been integrated into Python since version 2.5 (for previous versions, they are available in the sets library). Lists are similar to one-dimensional arrays, but you can create lists that contain other lists. Dictionaries are arrays that contain pairs of keys and values (hash tables), and tuples are immutable mono-dimensional objects. In Python, arrays can be of any type, so you can mix integers, strings, and so on in your lists/dictionaries and tuples. The index of the first object in any type of array is always zero. Negative indices are allowed and count from the end of the array; -1 is the last element. Variables can refer to functions.

>>> example = [1, ["list1", "list2"], ("one", "tuple")]
>>> mylist = ["Element 1", 2, 3.14]
>>> mylist[0]
"Element 1"
>>> mylist[-1]
3.14
>>> mydict = {"Key 1": "Val 1", 2: 3, "pi": 3.14}
>>> mydict["pi"]
3.14
>>> mytuple = (1, 2, 3)
>>> myfunc = len
>>> print myfunc(mylist)
3

You can get an array range using a colon (:). Not specifying the starting index of the range implies the first element; not indicating the final index implies the last element. Negative indices count from the last element (-1 is the last element). Then run the following command:

>>> mylist = ["first element", 2, 3.14]
>>> print mylist[:]
['first element', 2, 3.1400000000000001]
>>> print mylist[0:2]
['first element', 2]
>>> print mylist[-3:-1]
['first element', 2]
>>> print mylist[1:]
[2, 3.14]
Strings

Python strings are indicated either with a single quotation mark (') or a double one ("), and a string delimited with one kind of quote may contain the other kind ("He said 'hello'." is valid). Strings of multiple lines are enclosed in triple (double or single) quotes ("""). Python supports Unicode; just use the syntax u"This is a unicode string". To insert values into a string, use the % operator (modulo) and a tuple. Each %s is replaced by a tuple element, from left to right, and you are also permitted to use a dictionary for the replacements.

>>> print "Name: %s\nNumber: %s\nString: %s" % (myclass.name, 3, 3 * "-")
Name: Poromenos
Number: 3
String: ---

strString = """this is a string
on multiple lines."""

>>> print "This %(verb)s a %(name)s." % {"name": "test", "verb": "is"}
This is a test.
Control flow

The instructions for flow control are if, for, and while. There is no switch control flow; in its place, we use if. The for control flow is used to enumerate the members of a list. To get a list of numbers, you use range(number).

rangelist = range(10)
>>> print rangelist
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

Let's check if number is one of the numbers in the tuple:

for number in rangelist:
    if number in (3, 4, 7, 9):
        # "break" ends the for instruction without the else clause
        break
    else:
        # "continue" continues with the next iteration of the loop
        continue
else:
    # this is an optional "else"
    # executed only if the loop is not interrupted with "break"
    pass  # it does nothing

if rangelist[1] == 2:
    print "the second element (lists are 0-based) is 2"
elif rangelist[1] == 3:
    print "the second element is 3"
else:
    print "I don't know"

while rangelist[1] == 1:
    pass
Functions

Functions are declared with the keyword def. Any optional arguments must be declared after those that are mandatory and must have a value assigned. When calling functions using named arguments, you must also pass the value. Functions can return a tuple (tuple unpacking enables the return of multiple values). Lambda functions are in-line. Parameters are passed by reference, but immutable types (tuples, integers, strings, and so on) cannot be changed in the function. This happens because only the position of the element in memory is passed, and assigning another object to the variable results in the loss of the earlier object reference.

For example:

# equal to a def f(x): return x + 1
funzionevar = lambda x: x + 1
>>> print funzionevar(1)
2

def passing_example(my_list, my_int):
    my_list.append("new element")
    my_int = 4
    return my_list, my_int

>>> input_my_list = [1, 2, 3]
>>> input_my_int = 10
>>> print passing_example(input_my_list, input_my_int)
([1, 2, 3, 'new element'], 10)
>>> input_my_list
[1, 2, 3, 'new element']
>>> input_my_int
10
Classes

Python supports multiple inheritance of classes. Variables and private methods are declared by convention (it is not a rule of the language) by preceding them with two underscores (__). We can assign attributes (properties) to arbitrary instances of a class.

The following is an example:

class Myclass:
    common = 10
    def __init__(self):
        self.myvariable = 3
    def myfunc(self, arg1, arg2):
        return self.myvariable

# We create an instance of the class
>>> instance = Myclass()
>>> instance.myfunc(1, 2)
3

# This variable is shared by all instances
>>> instance2 = Myclass()
>>> instance.common
10
>>> instance2.common
10

# Note here how we use the class name
# instead of the instance.
>>> Myclass.common = 30
>>> instance.common
30
>>> instance2.common
30

# This does not update the variable in the class;
# instead, it assigns a new object to the variable
# of the first instance.
>>> instance.common = 10
>>> instance.common
10
>>> instance2.common
30
>>> Myclass.common = 50

# The value is not changed because "common" is now an instance variable.
>>> instance.common
10
>>> instance2.common
50

# This class inherits from Myclass. Multiple inheritance
# is declared like this:
# class AnotherClass(Myclass1, Myclass2, MyclassN)
class AnotherClass(Myclass):
    # The argument "self" is automatically passed
    # and makes reference to the instance of the class, so you can
    # set instance variables as above, but from within the class.
    def __init__(self, arg1):
        self.myvariable = 3
        print arg1

>>> instance = AnotherClass("hello")
hello
>>> instance.myfunc(1, 2)
3

# This class does not have a member (property) test, but
# we can add one to an instance whenever we want. Note
# that test will be a member of only that one instance.
>>> instance.test = 10
>>> instance.test
10
Exceptions

Exceptions in Python are handled with try-except [exception_name] blocks:

def my_func():
    try:
        # Division by zero causes an exception
        10 / 0
    except ZeroDivisionError:
        print "Oops, error"
    else:
        # no exception, let's proceed
        pass
    finally:
        # This code is executed when the
        # try..except block has finished and all exceptions
        # were handled, even if a new exception
        # is raised directly in the block.
        print "finish"

>>> my_func()
Oops, error
finish
Importing a library

External libraries are imported with import [library name]. You can also use the form from [library name] import [function name] to import individual functions. Here's an example:

import random
from time import clock

randomint = random.randint(1, 100)
>>> print randomint
64
Installing TensorFlow

The TensorFlow Python API supports Python 2.7 and Python 3.3+. The GPU version (Linux only) requires the Cuda Toolkit >= 7.0 and cuDNN >= v2.

When working in a Python environment, it is recommended you use virtualenv. It will isolate your Python configuration for different projects; using virtualenv will not overwrite existing versions of Python packages required by TensorFlow.
Installing on Mac or Linux distributions

The following are the steps to install TensorFlow on Mac and Linux systems:

1. First install pip and virtualenv (optional) if they are not already installed:

For Ubuntu/Linux 64-bit:

$ sudo apt-get install python-pip python-dev python-virtualenv

For Mac OS X:

$ sudo easy_install pip
$ sudo pip install --upgrade virtualenv

2. Then you can create a virtual environment. The following command creates a virtual environment in the ~/tensorflow directory:

$ virtualenv --system-site-packages ~/tensorflow

3. The next step is to activate the virtual environment as follows:

$ source ~/tensorflow/bin/activate.csh
(tensorflow)$

4. Henceforth, the name of the environment we're working in precedes the command line. Once activated, pip is used to install TensorFlow within it.

For Ubuntu/Linux 64-bit, CPU:

(tensorflow)$ pip install --upgrade https://storage.googleapis.com/tensorflow/linux/cpu/tensorflow-0.5.0-cp27-none-linux_x86_64.whl

For Mac OS X, CPU:

(tensorflow)$ pip install --upgrade https://storage.googleapis.com/tensorflow/mac/tensorflow-0.5.0-py2-none-any.whl

If you want to use your GPU card with TensorFlow, then install another package. I recommend you visit the official documentation to see if your GPU meets the specifications required to support TensorFlow.

Note

To enable your GPU with TensorFlow, you can refer to https://www.tensorflow.org/versions/r0.9/get_started/os_setup.html#optional-linux-enable-gpu-support for a complete description.

Finally, when you've finished, you must deactivate the virtual environment:

(tensorflow)$ deactivate

Note

Given the introductory nature of this book, I suggest the reader visit the download and setup TensorFlow page at https://www.tensorflow.org/versions/r0.7/get_started/os_setup.html#download-and-setup to find more information about other ways to install TensorFlow.
Installing on Windows

If you can't get a Linux-based system, you can install Ubuntu on a virtual machine; just use a free application called VirtualBox, which lets you create a virtual PC on Windows and install Ubuntu in it. This way, you can try the operating system without creating partitions or dealing with cumbersome procedures.

Note

After installing VirtualBox, you can install Ubuntu (www.ubuntu.com) and then follow the installation for Linux machines to install TensorFlow.
Installation from source

It may happen that the pip installation causes problems, particularly when using the visualization tool TensorBoard (see https://github.com/tensorflow/tensorflow/issues/530). To fix this problem, I suggest you build and install TensorFlow starting from the source files, through the following steps:

1. Clone the TensorFlow repository:

git clone --recurse-submodules https://github.com/tensorflow/tensorflow

2. Install Bazel (dependencies and installer), following the instructions at http://bazel.io/docs/install.html.

3. Run the Bazel installer:

chmod +x bazel-version-installer-os.sh
./bazel-version-installer-os.sh --user

4. Install the Python dependencies:

sudo apt-get install python-numpy swig python-dev

5. Configure your installation (GPU or no GPU?) in the downloaded TensorFlow repository:

./configure

6. Create your own TensorFlow pip package using bazel:

bazel build -c opt //tensorflow/tools/pip_package:build_pip_package

7. To build with GPU support, use bazel build -c opt --config=cuda followed again by:

//tensorflow/tools/pip_package:build_pip_package

8. Finally, install TensorFlow; the name of the .whl file will depend on your platform:

pip install /tmp/tensorflow_pkg/tensorflow-0.7.1-py2-none-linux_x86_64.whl

9. Good luck!

Note

Please refer to https://www.tensorflow.org/versions/r0.7/get_started/os_setup.html#installation-for-linux for further information.
Testing your TensorFlow installation

Open a terminal and type the following lines of code:

>>> import tensorflow as tf
>>> hello = tf.constant("Hello TensorFlow!")
>>> sess = tf.Session()

To verify your installation, just type:

>>> print(sess.run(hello))

You should have the following output:

Hello TensorFlow!
>>>
FirstworkingsessionFinallyitistimetomovefromtheorytopractice.IwillusethePython2.7IDEtowritealltheexamples.TogetaninitialideaofhowtouseTensorFlow,openthePythoneditorandwritethefollowinglinesofcode:
x=1
y=x+9
print(y)
importtensorflowastf
x=tf.constant(1,name='x')
y=tf.Variable(x+9,name='y')
print(y)
As you can easily understand, in the first three lines the constant x, set equal to 1, is added to 9 to set the new value of the variable y; the end result of the variable y is then printed on the screen.

In the last four lines, we have translated the first three lines according to the TensorFlow library.

If we run the program, we have the following output:

10
<tensorflow.python.ops.variables.Variable object at 0x7f30ccbf9190>
The TensorFlow translation of the first three lines of the program example produces a different result. Let's analyze them:

1. The following statement should never be missed if you want to use the TensorFlow library. It tells us that we are importing the library and calling it tf:

import tensorflow as tf

2. We create a constant value called x, with a value equal to one:

x = tf.constant(1, name='x')

3. Then we create a variable called y. This variable is defined with the simple equation y = x + 9:

y = tf.Variable(x + 9, name='y')

4. Finally, print out the result:

print(y)
So how do we explain the different result? The difference lies in the variable definition. In fact, the variable y doesn't represent the current value of x + 9; instead it means: when the variable y is computed, take the value of the constant x and add 9 to it. This is the reason why the value of y has never been carried out. In the next section, I'll try to fix it.
So we open the Python IDE and enter the following lines:

import tensorflow as tf
x = tf.constant(1, name='x')
y = tf.Variable(x + 9, name='y')
model = tf.initialize_all_variables()
with tf.Session() as session:
    session.run(model)
    print(session.run(y))

Running the preceding code, the output result is finally as follows:
10
We have removed the print instruction, but we have initialized the model variables:

model = tf.initialize_all_variables()

And, most importantly, we have created a session for computing values. In the next step, we run the model created previously, and finally run just the variable y and print out its current value:

with tf.Session() as session:
    session.run(model)
    print(session.run(y))

This is the magic trick that permits the correct result. In this fundamental step, the execution graph, called the Data Flow Graph, is created in the session, with all the dependencies between the variables. The y variable depends on the variable x, whose value is transformed by adding 9 to it. The value is not computed until the session is executed.
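The same deferred-execution idea can be sketched in plain Python, without TensorFlow: a node records how to compute its value, and nothing is evaluated until an explicit run call. The Constant and AddNine classes below are illustrative inventions for this sketch, not part of the TensorFlow API:

```python
# Minimal sketch of deferred (graph-style) execution: building a node only
# records the computation; calling run() actually evaluates the expression.
class Constant:
    def __init__(self, value):
        self.value = value
    def run(self):
        return self.value

class AddNine:
    def __init__(self, source):
        self.source = source      # dependency, like y depending on x
    def run(self):
        # evaluated only now, just as session.run(y) triggers evaluation
        return self.source.run() + 9

x = Constant(1)
y = AddNine(x)
print(y)        # prints an object description, not 10 -- like the Variable above
print(y.run())  # prints 10, like session.run(y)
```

This mirrors why printing the TensorFlow variable directly shows an object, while running it inside a session yields 10.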
This last example introduced another important feature in TensorFlow, the Data Flow Graph.
Data Flow Graphs

A machine learning application is the result of the repeated computation of complex mathematical expressions. In TensorFlow, a computation is described using a Data Flow Graph, where each node in the graph represents the instance of a mathematical operation (multiply, add, divide, and so on), and each edge is a multi-dimensional data set (tensor) on which the operations are performed.

TensorFlow supports these constructs and these operators. Let's see in detail how nodes and edges are managed by TensorFlow:
Node: In TensorFlow, each node represents the instantiation of an operation. Each operation has zero or more inputs and zero or more outputs.
Edges: In TensorFlow, there are two types of edge:

Normal edges: They are carriers of data structures (tensors), where an output of one operation (from one node) becomes the input for another operation.
Special edges: These edges are not data carriers between the output of a node (operator) and the input of another node. A special edge indicates a control dependency between two nodes. Let's suppose we have two nodes, A and B, and a special edge connecting A to B; it means that B will start its operation only when the operation in A ends. Special edges are used in the Data Flow Graph to set the happens-before relationship between operations on the tensors.
Let's explore some features of the Data Flow Graph in greater detail:

Operation: This represents an abstract computation, such as adding or multiplying matrices. An operation manages tensors. It can be polymorphic: the same operation can manipulate different tensor element types. For example, the addition of two int32 tensors, the addition of two float tensors, and so on.
Kernel: This represents the concrete implementation of an operation. A kernel defines the implementation of the operation on a particular device. For example, an add matrix operation can have a CPU implementation and a GPU one.
Session: When the client program has to establish communication with the TensorFlow runtime system, a session must be created. As soon as the session is created for a client, an initial graph is created, and it is empty. It has two fundamental methods:

session.extend: In a computation, the user can extend the execution graph, requesting to add more operations (nodes) and edges (data).
session.run: Using TensorFlow, sessions are created with some graphs, and these full graphs are executed to get some outputs; or sometimes, subgraphs are executed thousands/millions of times using run invocations. Basically, the method runs the execution graph to provide outputs.
Features in the Data Flow Graph
TensorFlow programming model

By adopting a Data Flow Graph as the execution model, you separate the data flow design (graph building) from its execution (on CPUs, GPU cards, or a combination), using a single programming interface that hides all the complexities. It also defines what the programming model should be like in TensorFlow.

Let's consider the simple problem of multiplying two integers, namely a and b.

The following are the steps required for this simple problem:
1. Define and initialize the variables. Each variable should define the state of a current execution. After importing the TensorFlow module in Python:

import tensorflow as tf

2. We define the variables a and b involved in the computation. These are defined via a more basic structure, called the placeholder:

a = tf.placeholder("int32")
b = tf.placeholder("int32")

3. A placeholder allows us to create our operations and to build our computation graph without needing the data.
4. Then we use these variables as inputs for TensorFlow's function mul:

y = tf.mul(a, b)

This function will return the result of the multiplication of the input integers a and b.

5. Manage the execution flow; this means that we must build a session:

sess = tf.Session()

6. Visualize the results. We run our model on the variables a and b, feeding data into the Data Flow Graph through the placeholders previously defined:

print sess.run(y, feed_dict={a: 2, b: 5})
How to use TensorBoard

TensorBoard is a visualization tool, devoted to analyzing the Data Flow Graph and also to better understanding machine learning models. It can graphically view different types of statistics about the parameters and the details of any part of a computation graph. It often happens that a graph of computation can be very complex. A deep neural network can have up to 36,000 nodes. For this reason, TensorBoard collapses nodes into high-level blocks, highlighting the groups with identical structures. Doing so allows a better analysis of the graph, focusing only on the core sections of the computation graph. Also, the visualization process is interactive; the user can pan, zoom, and expand the nodes to display the details.
The following figure shows a neural network model with TensorBoard:

A TensorBoard visualization example

TensorBoard's algorithms collapse nodes into high-level blocks and highlight groups with the same structures, while also separating out high-degree nodes. The visualization tool is also interactive: the users can pan, zoom in, expand, and collapse the nodes.

TensorBoard is equally useful in the development and tuning of a machine learning model. For this reason, TensorFlow lets you insert so-called summary operations into the graph. These summary operations monitor changing values (during the execution of a computation) written to a log file. TensorBoard is then configured to watch this log file with summary information and display how this information changes over time.
Let's consider a basic example to understand the usage of TensorBoard. We have the following example:

import tensorflow as tf

a = tf.constant(10, name="a")
b = tf.constant(90, name="b")
y = tf.Variable(a + b * 2, name="y")
model = tf.initialize_all_variables()

with tf.Session() as session:
    merged = tf.merge_all_summaries()
    writer = tf.train.SummaryWriter("/tmp/tensorflowlogs", session.graph)
    session.run(model)
    print(session.run(y))

That gives the following result:

190
Let's look into the session management. The first instruction to consider is as follows:

merged = tf.merge_all_summaries()

This instruction merges all the summaries collected in the default graph.

Then we create a SummaryWriter. It will write all the summaries (in this case, the execution graph) obtained from the code's execution into the /tmp/tensorflowlogs directory:

writer = tf.train.SummaryWriter("/tmp/tensorflowlogs", session.graph)

Finally, we run the model and so build the Data Flow Graph:

session.run(model)
print(session.run(y))

The use of TensorBoard is very simple. Let's open a terminal and enter the following:

$ tensorboard --logdir=/tmp/tensorflowlogs

A message such as the following should appear:

Starting TensorBoard on port 6006
Then, by opening a web browser, we should see the Data Flow Graph with auxiliary nodes:

Data Flow Graph display with TensorBoard

Now we will be able to explore the Data Flow Graph:

Explore the Data Flow Graph display with TensorBoard

TensorBoard uses special icons for constants and summary nodes. To summarize, we report in the next figure the table of node symbols displayed:

Node symbols in TensorBoard
Summary

In this chapter, we introduced the main topics: machine learning and deep learning. While machine learning explores the study and construction of algorithms that can learn from, and make predictions on, data, deep learning is based precisely on the way the human brain processes information and learns, responding to external stimuli.

In this vast scientific research and practical application area, we can firmly place the TensorFlow software library, developed by Google's research group for artificial intelligence (the Google Brain Project) and released as open source software on November 9, 2015.

After electing the Python programming language as the development tool for our examples and applications, we saw how to install and compile the library, and then carried out a first working session. This allowed us to introduce the execution model of TensorFlow and the Data Flow Graph. It led us to define what our programming model should be.

The chapter ended with an example of how to use an important tool for debugging machine learning applications: TensorBoard.

In the next chapter, we will continue our journey into the TensorFlow library, with the intention of showing its versatility. Starting from the fundamental concept, tensors, we will see how to use the library for purely math applications.
Chapter 2. Doing Math with TensorFlow

In this chapter, we will cover the following topics:

The tensor data structure
Handling tensors with TensorFlow
Complex numbers and fractals
Computing gradients
Random numbers
Solving partial differential equations
The tensor data structure

Tensors are the basic data structures in TensorFlow. As we have already said, they represent the connecting edges in a Data Flow Graph. A tensor simply identifies a multidimensional array or list.

It can be identified by three parameters: rank, shape, and type:

Rank: Each tensor is described by a unit of dimensionality called rank. It identifies the number of dimensions of the tensor. For this reason, the rank is also known as the order or n-dimensions of a tensor (for example, a rank 2 tensor is a matrix and a rank 1 tensor is a vector).
Shape: The shape of a tensor is the number of rows and columns it has.
Type: It is the data type assigned to the tensor's elements.

Now that we are familiar with this fundamental data structure, to build a tensor we can:

Build an n-dimensional array; for example, by using the NumPy library
Convert the n-dimensional array into a TensorFlow tensor

Once we obtain the tensor, we can handle it using the TensorFlow operators. The following figure provides a visual explanation of the concepts introduced:

Visualization of multidimensional tensors
One-dimensional tensors

To build a one-dimensional tensor, we use the NumPy array(s) command, where s is a Python list:

>>> import numpy as np
>>> tensor_1d = np.array([1.3, 1, 4.0, 23.99])

Unlike in a Python list, the commas between the elements are not shown when the array is printed:

>>> print tensor_1d
[  1.3    1.     4.    23.99]

The indexing is the same as for Python lists. The first element has position 0, the third element has position 2, and so on:

>>> print tensor_1d[0]
1.3
>>> print tensor_1d[2]
4.0

Finally, you can view the basic attributes of the tensor. The rank of the tensor:

>>> tensor_1d.ndim
1

The tuple of the tensor's dimensions is as follows:

>>> tensor_1d.shape
(4L,)

The tensor's shape has just four values in a row.

The data type in the tensor:

>>> tensor_1d.dtype
dtype('float64')
Now, let's see how to convert a NumPy array into a TensorFlow tensor:

import tensorflow as tf

The TensorFlow function tf.convert_to_tensor converts Python objects of various types to tensor objects. It accepts tensor objects, NumPy arrays, Python lists, and Python scalars:

tf_tensor = tf.convert_to_tensor(tensor_1d, dtype=tf.float64)

Running the session, we can visualize the tensor and its elements as follows:

with tf.Session() as sess:
    print sess.run(tf_tensor)
    print sess.run(tf_tensor[0])
    print sess.run(tf_tensor[2])

That gives the following results:

>>>
[  1.3    1.     4.    23.99]
1.3
4.0
>>>
Two-dimensional tensors

To create a two-dimensional tensor or matrix, we again use array(s), but s will be a sequence of arrays:

>>> import numpy as np
>>> tensor_2d = np.array([(1, 2, 3, 4), (4, 5, 6, 7), (8, 9, 10, 11), (12, 13, 14, 15)])
>>> print tensor_2d
[[ 1  2  3  4]
 [ 4  5  6  7]
 [ 8  9 10 11]
 [12 13 14 15]]
>>>

A value in tensor_2d is identified by the expression tensor_2d[row, col], where row is the row position and col is the column position:

>>> tensor_2d[3][3]
15

You can also use the slice operator : to extract a submatrix:

>>> tensor_2d[0:2, 0:2]
array([[1, 2],
       [4, 5]])

In this case, we extracted a 2×2 submatrix containing rows 0 and 1, and columns 0 and 1 of tensor_2d. TensorFlow has its own slice operator. In the next subsection, we will see how to use it.
Tensor handling

Let's see how we can apply some slightly more complex operations to these data structures. Consider the following code:

1. Import the libraries:

import tensorflow as tf
import numpy as np

2. Let's build two integer arrays. These represent two 3×3 matrices:

matrix1 = np.array([(2, 2, 2), (2, 2, 2), (2, 2, 2)], dtype='int32')
matrix2 = np.array([(1, 1, 1), (1, 1, 1), (1, 1, 1)], dtype='int32')

3. Visualize them:

print "matrix1 ="
print matrix1
print "matrix2 ="
print matrix2

4. To use these matrices in our TensorFlow environment, they must be transformed into a tensor data structure:

matrix1 = tf.constant(matrix1)
matrix2 = tf.constant(matrix2)

5. We used the TensorFlow constant operator to perform the transformation.
6. The matrices are ready to be manipulated with TensorFlow operators. In this case, we calculate a matrix multiplication and a matrix sum:

matrix_product = tf.matmul(matrix1, matrix2)
matrix_sum = tf.add(matrix1, matrix2)
7. The following matrix will be used to compute a matrix determinant:

matrix_3 = np.array([(2, 7, 2), (1, 4, 2), (9, 0, 2)], dtype='float32')
print "matrix3 ="
print matrix_3

matrix_det = tf.matrix_determinant(matrix_3)

8. It's time to create our graph and run the session, with the tensors and operators created:

with tf.Session() as sess:
    result1 = sess.run(matrix_product)
    result2 = sess.run(matrix_sum)
    result3 = sess.run(matrix_det)

9. The results will be printed out by running the following commands:

print "matrix1 * matrix2 ="
print result1
print "matrix1 + matrix2 ="
print result2
print "matrix3 determinant result ="
print result3

The following figure shows the results after running the code:
TensorFlow provides numerous math operations on tensors. The following table summarizes them:

TensorFlow operator   Description
tf.add                Returns the sum
tf.sub                Returns the subtraction
tf.mul                Returns the multiplication
tf.div                Returns the division
tf.mod                Returns the modulus
tf.abs                Returns the absolute value
tf.neg                Returns the negative value
tf.sign               Returns the sign
tf.inv                Returns the inverse
tf.square             Returns the square
tf.round              Returns the nearest integer
tf.sqrt               Returns the square root
tf.pow                Returns the power
tf.exp                Returns the exponential
tf.log                Returns the logarithm
tf.maximum            Returns the maximum
tf.minimum            Returns the minimum
tf.cos                Returns the cosine
tf.sin                Returns the sine
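Most of these operators work elementwise on tensors. As a quick illustrative sketch, the following uses the NumPy analogues of a few entries in the table (np.add, np.multiply, and so on stand in for the TensorFlow calls, so the snippet runs without a session):

```python
import numpy as np

# NumPy analogues of a few elementwise operators from the table:
# tf.add -> np.add, tf.mul -> np.multiply, tf.sqrt -> np.sqrt, tf.maximum -> np.maximum
a = np.array([1.0, 4.0, 9.0])
b = np.array([2.0, 2.0, 2.0])

print(np.add(a, b))       # elementwise sum, like tf.add
print(np.multiply(a, b))  # elementwise product, like tf.mul
print(np.sqrt(a))         # elementwise square root, like tf.sqrt -> [1. 2. 3.]
print(np.maximum(a, b))   # elementwise maximum, like tf.maximum
```

The TensorFlow versions behave the same way, except that they build graph nodes that are evaluated inside a session.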
Three-dimensional tensors

The following commands build a three-dimensional tensor:

>>> import numpy as np
>>> tensor_3d = np.array([[[1, 2], [3, 4]], [[5, 6], [7, 8]]])
>>> print tensor_3d
[[[1 2]
  [3 4]]

 [[5 6]
  [7 8]]]
>>>

The three-dimensional tensor created is a 2×2×2 array:

>>> tensor_3d.shape
(2L, 2L, 2L)

To retrieve an element from a three-dimensional tensor, we use an expression of the following form:

tensor_3d[plane, row, col]

Following these settings:

Matrix 3×3 representation

So the four elements in the first plane are all identified by the value of the variable plane equal to zero:

>>> tensor_3d[0, 0, 0]
1
>>> tensor_3d[0, 0, 1]
2
>>> tensor_3d[0, 1, 0]
3
>>> tensor_3d[0, 1, 1]
4

Three-dimensional tensors allow us to introduce the next topic, linked to the manipulation of images, but more generally they introduce us to simple transformations on tensors.
Handling tensors with TensorFlow

TensorFlow is designed to handle tensors of all sizes, and provides operators that can be used to manipulate them. In this example, in order to see some array manipulations, we are going to work with a digital image. As you probably know, a color digital image is an M×N×3 size matrix (a third-order tensor), whose components correspond to the red, green, and blue components of the image (RGB space). This means that each feature in the rectangular box of the RGB image is specified by three coordinates, i, j, and k.

The RGB tensor

The first thing I want to show you is how to load an image, and then how to extract a sub-image from the original, using the TensorFlow slice operator.
Prepare the input data

Using the imread command in matplotlib, we import a digital image in a standard color format (JPG, BMP, TIF):

import matplotlib.image as mp_image
filename = "packt.jpeg"
input_image = mp_image.imread(filename)

We can then see the rank and the shape of the tensor:

print 'input dim = {}'.format(input_image.ndim)
print 'input shape = {}'.format(input_image.shape)

The output shows that the shape is (80, 144, 3). This means the image is 80 pixels high, 144 pixels wide, and 3 colors deep.

Finally, using matplotlib, it is possible to visualize the imported image:

import matplotlib.pyplot as plt
plt.imshow(input_image)
plt.show()

The starting image
In this example, the slice is a bidimensional segment of the starting image, where each pixel has its RGB components, so we need a placeholder to store all the values of the slice:

import tensorflow as tf
my_image = tf.placeholder("uint8", [None, None, 3])

For the last dimension, we'll need only three values. Then we use the TensorFlow operator slice to create a sub-image:

slice = tf.slice(my_image, [10, 0, 0], [16, -1, -1])

The last step is to build a TensorFlow working session:

with tf.Session() as session:
    result = session.run(slice, feed_dict={my_image: input_image})
    print(result.shape)

plt.imshow(result)
plt.show()

The resulting image is then as follows:

The input image after the slice
In this next example, we will perform a geometric transformation of the input image, using the transpose operator:

import tensorflow as tf

We associate the input image with a variable we call x:

x = tf.Variable(input_image, name='x')

We then initialize our model:

model = tf.initialize_all_variables()

Next, we build up the session within which we run our code:

with tf.Session() as session:

To perform the transpose of our matrix, we use the transpose function of TensorFlow. This method performs a swap between axes 0 and 1 of the input matrix, while the z axis is left unchanged:

    x = tf.transpose(x, perm=[1, 0, 2])
    session.run(model)
    result = session.run(x)

plt.imshow(result)
plt.show()

The result is the following:

The transposed image
Complex numbers and fractals

First of all, let's look at how Python handles complex numbers. It is a simple matter. For example, to set x = 5 + 4j in Python, we write the following:

>>> x = 5. + 4j

This means that x is equal to 5 + 4j.

At the same time, you can write the following:

>>> x = complex(5, 4)
>>> x
(5+4j)

We also note the following:

Python uses j to denote √-1, instead of the i used in math. If you put a number before the j, Python will consider it an imaginary number; otherwise, it is a variable. This means that if you want to write the imaginary number i, you must write 1j rather than j.

To get the real and imaginary parts of a Python complex number, you can use the following:

>>> x.real
5.0
>>> x.imag
4.0
>>>
We turn now to our problem, namely how to display fractals with TensorFlow. The Mandelbrot set is one of the most famous fractals. A fractal is a geometric object whose structure is repeated at different scales. Fractals are very common in nature; an example is the coast of Great Britain.

The Mandelbrot set is defined as the set of complex numbers c for which the following succession remains bounded:

Z(n+1) = Z(n)² + c, where Z(0) = 0

The set is named after Benoît Mandelbrot, a Polish-born mathematician famous for his work on fractals. However, he was able to give a shape or graphic representation to the Mandelbrot set only with the help of computer programming. In 1985, he published in Scientific American the first algorithm to compute the Mandelbrot set. The algorithm (for each complex point c) is as follows:

1. Z has an initial value equal to 0: Z(0) = 0.
2. Choose the complex number c as the current point. In the Cartesian plane, the abscissa axis (horizontal line) represents the real part, while the axis of ordinates (vertical line) represents the imaginary part of c.
3. Iterate Z(n+1) = Z(n)² + c, stopping when the magnitude of Z(n) grows larger than a maximum radius.
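The algorithm above can be sketched directly in NumPy before translating it to TensorFlow. This is an illustrative stand-alone version: the grid bounds and escape radius mirror the TensorFlow example that follows, while the (coarse) grid step and the clamping trick are choices made here just to keep the sketch fast and overflow-free:

```python
import numpy as np

# NumPy sketch of the Mandelbrot iteration Z(n+1) = Z(n)**2 + c, Z(0) = 0.
# ns counts, per grid point, how many iterations stayed below the escape radius.
Y, X = np.mgrid[-1.3:1.3:0.05, -2:1:0.05]   # coarse grid for illustration
c = X + 1j * Y
zs = np.zeros_like(c)
ns = np.zeros(c.shape, dtype=np.float32)

for _ in range(200):
    zs = zs * zs + c
    not_diverged = np.abs(zs) < 4            # same stop condition as the book uses
    ns += not_diverged
    zs = np.where(not_diverged, zs, 4)       # clamp diverged points to avoid overflow

# Points with ns == 200 never diverged within 200 steps: they belong to the set.
print(int((ns == 200).sum()) > 0)  # prints True (e.g. c = 0 never diverges)
```

Plotting ns with plt.imshow gives the familiar Mandelbrot picture, exactly as the TensorFlow version below does with ns.eval().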
Now let's see, through simple steps, how we can translate the algorithm mentioned earlier using TensorFlow.

Prepare the data for Mandelbrot's set

Import the necessary libraries for our example:

import tensorflow as tf
import numpy as np
import matplotlib.pyplot as plt

We build a complex grid that will contain our Mandelbrot set. The region of the complex plane is between -2 and +1 on the real axis and between -1.3j and +1.3j on the imaginary axis. Each pixel location in the grid will represent a different complex value, c:

Y, X = np.mgrid[-1.3:1.3:0.005, -2:1:0.005]
Z = X + 1j * Y
c = tf.constant(Z.astype(np.complex64))

Then we define the data structures, that is, the TensorFlow tensors, that contain all the data to be included in the calculation. We define two variables. The first is the one on which we will make our iteration. It has the same dimensions as the complex grid, but it is declared as a variable, that is, its values will change in the course of the calculation:

zs = tf.Variable(c)

The next variable is initialized to zero. It also has the same size as the variable zs:

ns = tf.Variable(tf.zeros_like(c, tf.float32))
Build and execute the Data Flow Graph for Mandelbrot's set

Instead of introducing a session, we instantiate an InteractiveSession():

sess = tf.InteractiveSession()

It requires, as we shall see, the Tensor.eval() and Operation.run() methods. Then we initialize all the variables involved through the run() method:

tf.initialize_all_variables().run()

Start the iteration:

zs_ = zs * zs + c

Define the stop condition of the iteration:

not_diverged = tf.complex_abs(zs_) < 4

Then we use the group operator, which groups multiple operations:

step = tf.group(zs.assign(zs_), \
                ns.assign_add(tf.cast(not_diverged, tf.float32)))

The first operation is the iteration step Z(n+1) = Z(n)² + c to create a new value.

The second operation adds 1 to the corresponding element of ns for every point that has not yet diverged. When this op finishes, all ops in its input have finished. This operator has no output.

Then we run the operator for two hundred steps:

for i in range(200): step.run()
Visualize the result for Mandelbrot's set

The result will be in the tensor ns. Using matplotlib, let's visualize it:

plt.imshow(ns.eval())
plt.show()

The Mandelbrot set

Of course, the Mandelbrot set is not the only fractal we can visualize. Julia sets are fractals named after Gaston Maurice Julia for his work in this field. Their building process is very similar to that used for the Mandelbrot set.
Prepare the data for Julia's set

Let's define the output complex plane. It is between -2 and +2 on the real axis and between -2j and +2j on the imaginary axis:

Y, X = np.mgrid[-2:2:0.005, -2:2:0.005]

And the current point location:

Z = X + 1j * Y

The definition of the Julia set requires redefining Z as a constant tensor:

Z = tf.constant(Z.astype("complex64"))

Thus the input tensors supporting our calculation are as follows:

zs = tf.Variable(Z)
ns = tf.Variable(tf.zeros_like(Z, "float32"))

Build and execute the Data Flow Graph for Julia's set

As in the previous example, we create our own interactive session:

sess = tf.InteractiveSession()

We then initialize the input tensors:

tf.initialize_all_variables().run()

To compute the new values of the Julia set, we will use the iterative formula Z(n+1) = Z(n)² - c, where the constant c will be equal to the imaginary number 0.75j:

c = complex(0.0, 0.75)
zs_ = zs * zs - c

The grouping operator and the stop condition of the iteration will be the same as in the Mandelbrot computation:

not_diverged = tf.complex_abs(zs_) < 4
step = tf.group(zs.assign(zs_), \
                ns.assign_add(tf.cast(not_diverged, "float32")))

Finally, we run the operator for two hundred steps:

for i in range(200): step.run()
Visualize the result

To visualize the result, run the following commands:

plt.imshow(ns.eval())
plt.show()

The Julia set
Computing gradients

TensorFlow has functions to solve other, more complex tasks. For example, we will use a mathematical operator that calculates the derivative of y with respect to its parameter x. For this purpose, we use the tf.gradients() function.

Let us consider the math function y = 2x². We want to compute the gradient dy/dx at x = 1. The following is the code to compute this gradient:

1. First, import the TensorFlow library:

import tensorflow as tf

2. The x variable is the independent variable of the function:

x = tf.placeholder(tf.float32)

3. Let's build the function:

y = 2 * x * x

4. Finally, we call the tf.gradients() function with y and x as arguments:

var_grad = tf.gradients(y, x)

5. To evaluate the gradient, we must build a session:

with tf.Session() as session:

6. The gradient will be evaluated at x = 1:

    var_grad_val = session.run(var_grad, feed_dict={x: 1})

7. The var_grad_val value is the feed result, to be printed:

    print(var_grad_val)

8. That gives the following result:

>>>
[4.0]
>>>
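As a quick sanity check of the analytic result, the same derivative can be approximated without TensorFlow using a central finite difference (a stand-alone numerical sketch; the helper names below are inventions for this example, not part of any library):

```python
# Central finite-difference approximation of dy/dx for y = 2x**2.
# The exact derivative is dy/dx = 4x, so at x = 1 we expect 4.
def y(x):
    return 2 * x * x

def numerical_gradient(f, x, h=1e-5):
    # symmetric difference quotient: (f(x+h) - f(x-h)) / (2h)
    return (f(x + h) - f(x - h)) / (2 * h)

print(numerical_gradient(y, 1.0))  # approximately 4.0, matching tf.gradients
```

For a quadratic function, the central difference is exact up to floating-point error, which is why it reproduces the [4.0] computed by tf.gradients.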
Random numbers

The generation of random numbers is essential in machine learning and in training algorithms. When random numbers are generated by a computer, they are produced by a pseudo-random number generator (PRNG). The term pseudo comes from the fact that the computer is a deterministic machine running instructions that can only simulate randomness. Despite this logical limitation, computers are very efficient at generating random numbers. TensorFlow provides operators to create random tensors with different distributions.
Uniform distribution

Generally, when we need to work with random numbers, we want values that occur with the same frequency, uniformly distributed. The TensorFlow operator random_uniform provides values between minval and maxval, all with the same probability. Its signature is as follows:

random_uniform(shape, minval, maxval, dtype, seed, name)

We import the TensorFlow library and matplotlib to display the results:

import tensorflow as tf
import matplotlib.pyplot as plt

The uniform variable is a one-dimensional tensor of 100 elements, containing values ranging from 0 to 1, distributed with the same probability:

uniform = tf.random_uniform([100], minval=0, maxval=1, dtype=tf.float32)

Let's define the session:

sess = tf.Session()

In our session, we evaluate the tensor uniform, using the eval() operator:

with tf.Session() as session:
    print uniform.eval()
    plt.hist(uniform.eval(), normed=True)
    plt.show()

As you can see, all intermediate values between 0 and 1 have approximately the same frequency. This behavior is called a uniform distribution. The result of the execution is therefore as follows:

Uniform distribution
Normal distribution

In some specific cases, you may instead need to generate random numbers that cluster around a central value. In this case, we use the normal distribution of random numbers, also called the Gaussian distribution, in which values close to the mean (here, 0) have the highest probability of being drawn, while draws far from the mean, toward the margins of the range, have a very low chance of occurring; the standard deviation controls how spread out the draws are. The following is the implementation with TensorFlow:

import tensorflow as tf
import matplotlib.pyplot as plt

norm = tf.random_normal([100], mean=0, stddev=2)
with tf.Session() as session:
    plt.hist(norm.eval(), normed=True)
    plt.show()

We created a 1d tensor of shape [100] consisting of random normal values, with mean equal to 0 and standard deviation equal to 2, using the operator tf.random_normal. The following is the result:

Normal distribution
Generating random numbers with seeds

We recall that our sequence is pseudo-random, because the values are calculated using a deterministic algorithm and probability plays no real role. The seed is just a starting point for the sequence: if you start from the same seed, you will end up with the same sequence. This is very useful, for example, for debugging your code: when you are searching for an error in a program, you must be able to reproduce the problem, because otherwise every run would be different.

Consider the following example, where we have two uniform distributions:

uniform_with_seed = tf.random_uniform([1], seed=1)
uniform_without_seed = tf.random_uniform([1])

In the first uniform distribution, we started with the seed = 1. This means that, repeatedly evaluating the two distributions, the first uniform distribution will always generate the same sequence of values:
print("FirstRun")
withtf.Session()asfirst_session:
print("uniformwith(seed=1)={}"\
.format(first_session.run(uniform_with_seed)))
print("uniformwith(seed=1)={}"\
.format(first_session.run(uniform_with_seed)))
print("uniformwithoutseed={}"\
.format(first_session.run(uniform_without_seed)))
print("uniformwithoutseed={}"\
.format(first_session.run(uniform_without_seed)))
print("SecondRun")
withtf.Session()assecond_session:
print("uniformwith(seed=1)={}\
.format(second_session.run(uniform_with_seed)))
print("uniformwith(seed=1)={}\
.format(second_session.run(uniform_with_seed)))
print("uniformwithoutseed={}"\
.format(second_session.run(uniform_without_seed)))
print("uniformwithoutseed={}"\
.format(second_session.run(uniform_without_seed)))
As you can see from the end result, the uniform distribution with seed = 1 always gives the same sequence across runs:

>>>
First Run
uniform with (seed = 1) = [ 0.23903739]
uniform with (seed = 1) = [ 0.22267115]
uniform without seed = [ 0.92157185]
uniform without seed = [ 0.43226039]
Second Run
uniform with (seed = 1) = [ 0.23903739]
uniform with (seed = 1) = [ 0.22267115]
uniform without seed = [ 0.50188708]
uniform without seed = [ 0.21324408]
>>>
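The same reproducibility property can be demonstrated with NumPy's PRNG. This is a stand-alone sketch using numpy.random rather than TensorFlow's seeding API, but the principle is identical: same seed, same sequence:

```python
import numpy as np

# Two generators seeded identically produce identical sequences;
# a generator with a different seed produces a different one.
rng_a = np.random.RandomState(seed=1)
rng_b = np.random.RandomState(seed=1)
rng_c = np.random.RandomState(seed=2)

seq_a = rng_a.uniform(size=4)
seq_b = rng_b.uniform(size=4)
seq_c = rng_c.uniform(size=4)

print(np.array_equal(seq_a, seq_b))   # prints True: same seed, same sequence
print(np.array_equal(seq_a, seq_c))   # prints False: different seed
```

This is exactly the behavior exploited above when debugging: fixing the seed makes every run reproducible.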
Monte Carlo's method

We end the section on random numbers with a brief note about the Monte Carlo method. It is a numerical probabilistic method widely used in high-performance scientific computing. In our example, we will calculate the value of π:

import tensorflow as tf
import matplotlib.pyplot as plt

trials = 100
hits = 0

Generate pseudo-random points inside the square [-1, 1] × [-1, 1], using the random_uniform function:

x = tf.random_uniform([1], minval=-1, maxval=1, dtype=tf.float32)
y = tf.random_uniform([1], minval=-1, maxval=1, dtype=tf.float32)
pi = []

Start the session:

sess = tf.Session()

Inside the session, we calculate the value of π: the area of the unit circle is π and that of the square is 4, so the fraction of generated points that fall inside the circle x² + y² < 1 converges (very slowly) to π/4; multiplying by 4 gives the estimate of π:

with sess.as_default():
    for i in range(1, trials):
        for j in range(1, trials):
            if x.eval()**2 + y.eval()**2 < 1:
                hits = hits + 1
        pi.append((4 * float(hits) / i) / trials)

plt.plot(pi)
plt.show()

The figure shows the convergence of the estimate to the π value as the number of trials grows.
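The same estimate can be computed much faster in vectorized NumPy, which also makes the area argument explicit. This is an illustrative stand-alone sketch; the sample count and the fixed seed are choices made here, not part of the book's code:

```python
import numpy as np

# Vectorized Monte Carlo estimate of pi: draw N points uniformly in
# [-1, 1] x [-1, 1]; the fraction landing inside the unit circle tends to pi/4.
rng = np.random.RandomState(seed=0)   # fixed seed for reproducibility
n_samples = 100000
x = rng.uniform(-1, 1, n_samples)
y = rng.uniform(-1, 1, n_samples)

inside = (x**2 + y**2) < 1            # boolean mask of points inside the circle
pi_estimate = 4.0 * inside.mean()
print(pi_estimate)                    # close to 3.14159
```

With 100,000 samples the standard error of the estimate is roughly 0.005, which is why the book's loop-based version, with far fewer samples, converges so slowly.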
Solving partial differential equations

A partial differential equation (PDE) is a differential equation involving partial derivatives of an unknown function of several independent variables. PDEs are commonly used to formulate and solve major physical problems in various fields, from quantum mechanics to financial markets. In this section, we take the example from https://www.TensorFlow.org/versions/r0.8/tutorials/pdes/index.html, showing the use of TensorFlow in a two-dimensional PDE solution that models the surface of a square pond with a few raindrops landing on it. The effect will be to produce two-dimensional waves on the pond itself. We won't concentrate on the computational aspects of the problem, as this is beyond the scope of this book; instead, we will focus on using TensorFlow to define the problem.

The starting point is to import the fundamental libraries:

import tensorflow as tf
import numpy as np
import matplotlib.pyplot as plt
Initial condition

First, we have to define the dimensions of the problem. Let's imagine that our pond is a 500×500 square:

N = 500

The following two-dimensional tensor is the pond at time t = 0, that is, the initial condition of our problem:

u_init = np.zeros([N, N], dtype=np.float32)

We have 40 random raindrops on it:

for n in range(40):
    a, b = np.random.randint(0, N, 2)
    u_init[a, b] = np.random.uniform()

np.random.randint(0, N, 2) is a NumPy function that returns two random integers between 0 and N.

Using matplotlib, we can show the initial square pond:

plt.imshow(u_init)
plt.show()

Zooming in on the pond in its initial condition: the colored dots represent the fallen raindrops

Then we define the following tensor:

ut_init = np.zeros([N, N], dtype=np.float32)

It describes the temporal evolution of the pond. At the final time, it will contain the final state of the pond.
Model building

We must define some fundamental parameters (using TensorFlow placeholders), starting with the time step of the simulation:

eps = tf.placeholder(tf.float32, shape=())

We must also define a physical parameter of the model, namely the damping coefficient:

damping = tf.placeholder(tf.float32, shape=())

Then we redefine our starting tensors as TensorFlow variables, since their values will change over the course of the simulation:

U = tf.Variable(u_init)
Ut = tf.Variable(ut_init)

Finally, we build our PDE model. It represents the evolution in time of the pond after the raindrops have fallen:

U_ = U + eps * Ut
Ut_ = Ut + eps * (laplace(U) - damping * Ut)

As you can see, we introduced the laplace(U) function to resolve the PDE (it will be described in the last part of this section).

Using the TensorFlow group operator, we define how our pond should evolve in time:

step = tf.group(
    U.assign(U_),
    Ut.assign(Ut_))

Let's recall that the group operator groups multiple operations as a single one.
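The two update equations can be sketched in plain NumPy to see what one step does: compute a discrete Laplacian of U, then advance U and Ut by the time step eps. This stand-alone illustrative version uses the same 9-point kernel as the book's laplace function, but with periodic boundaries (np.roll) instead of the zero padding that depthwise_conv2d with SAME padding gives, and a small grid and a single raindrop just to keep it quick:

```python
import numpy as np

def laplace_np(u):
    # 9-point discrete Laplacian, matching the kernel used in the book:
    # [[0.5, 1.0, 0.5], [1.0, -6.0, 1.0], [0.5, 1.0, 0.5]]
    # (periodic boundaries via np.roll; the TF version zero-pads instead)
    diag = (np.roll(np.roll(u, 1, 0), 1, 1) + np.roll(np.roll(u, 1, 0), -1, 1)
            + np.roll(np.roll(u, -1, 0), 1, 1) + np.roll(np.roll(u, -1, 0), -1, 1))
    cross = np.roll(u, 1, 0) + np.roll(u, -1, 0) + np.roll(u, 1, 1) + np.roll(u, -1, 1)
    return 0.5 * diag + cross - 6.0 * u

eps, damping = 0.03, 0.04
U = np.zeros((50, 50), dtype=np.float32)
Ut = np.zeros_like(U)
U[25, 25] = 1.0                      # a single raindrop

# Same simultaneous update rule as the TensorFlow graph:
# U_ = U + eps*Ut ; Ut_ = Ut + eps*(laplace(U) - damping*Ut)
for _ in range(100):
    U, Ut = U + eps * Ut, Ut + eps * (laplace_np(U) - damping * Ut)

print(U.shape)  # → (50, 50): the wave has spread outward from the drop
```

Each pass of the loop corresponds to one step.run() call in the TensorFlow version below.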
Graph execution

In our session, we will see the evolution in time of the pond over 1000 steps, where each time step is equal to 0.03 s, while the damping coefficient is set equal to 0.04.

Let's initialize the TensorFlow variables:

tf.initialize_all_variables().run()

Then we run the simulation:

for i in range(1000):
    step.run({eps: 0.03, damping: 0.04})
    if i % 50 == 0:
        clear_output()
        plt.imshow(U.eval())
        plt.show()

The clear_output function comes from IPython.display. Every 50 steps, the simulation result will be displayed as follows:

The pond after 400 simulation steps
Computational function used

Let's now look at the laplace(U) function and the ancillary functions used:

def make_kernel(a):
    a = np.asarray(a)
    a = a.reshape(list(a.shape) + [1, 1])
    return tf.constant(a, dtype=1)

def simple_conv(x, k):
    x = tf.expand_dims(tf.expand_dims(x, 0), -1)
    y = tf.nn.depthwise_conv2d(x, k, [1, 1, 1, 1], padding='SAME')
    return y[0, :, :, 0]

def laplace(x):
    laplace_k = make_kernel([[0.5, 1.0, 0.5],
                             [1.0, -6., 1.0],
                             [0.5, 1.0, 0.5]])
    return simple_conv(x, laplace_k)

These functions describe the physics of the model, that is, how the wave is created and propagates in the pond. I will not go into the details of these functions, the understanding of which is beyond the scope of this book.
The following figure shows the waves on the pond after the raindrops have fallen.

Zooming in on the pond

Summary

In this chapter, we looked at some of the mathematical potential of TensorFlow. Starting from the fundamental definition of a tensor, the basic data structure for any type of computation, we saw through examples how to handle these data structures using TensorFlow's math operators. Using complex numbers, we explored the world of fractals. Then we introduced the concept of random numbers; these are in fact used in machine learning for model development and testing. Finally, the chapter ended with an example of defining and solving a mathematical problem using partial differential equations.

In the next chapter, we'll finally start to see TensorFlow in action in the field for which it was developed: machine learning, solving complex problems such as classification and data clustering.
Chapter 3. Starting with Machine Learning
In this chapter, we will cover the following topics:
Linear regression
The MNIST dataset
Classifiers
The nearest neighbor algorithm
Data clustering
The k-means algorithm
The linear regression algorithm
In this section, we begin our exploration of machine learning techniques with the linear regression algorithm. Our goal is to build a model by which to predict the values of a dependent variable from the values of one or more independent variables.
The relationship between these two variables is linear; that is, if y is the dependent variable and x the independent one, then the linear relationship between the two variables will look like this: y = Ax + b.
The linear regression algorithm adapts to a great variety of situations; for its versatility, it is used extensively in the field of applied sciences, for example, biology and economics.
Furthermore, the implementation of this algorithm allows us to introduce, in a totally clear and understandable way, the two important concepts of machine learning: the cost function and the gradient descent algorithm.
Data model
The first crucial step is to build our data model. We mentioned earlier that the relationship between our variables is linear, that is: y = Ax + b, where A and b are constants. To test our algorithm, we need data points in a two-dimensional space.
We start by importing the Python library NumPy:
import numpy as np
Then we define the number of points we want to draw:
number_of_points = 500
We initialize the following two lists:
x_point = []
y_point = []
These lists will contain the generated points.
We then set the two constants that will appear in the linear relation of y with x:
a = 0.22
b = 0.78
Via NumPy's random.normal function, we generate 500 random points around the regression equation y = 0.22x + 0.78:
for i in range(number_of_points):
    x = np.random.normal(0.0, 0.5)
    y = a*x + b + np.random.normal(0.0, 0.1)
    x_point.append([x])
    y_point.append([y])
Finally, view the generated points with matplotlib:
import matplotlib.pyplot as plt
plt.plot(x_point, y_point, 'o', label='Input Data')
plt.legend()
plt.show()
Linear regression: the data model
Cost functions and gradient descent
The machine learning algorithm that we want to implement with TensorFlow must predict values of y as a function of x data, according to our data model. The linear regression algorithm will determine the values of the constants A and b (fixed for our data model), which are then the true unknowns of the problem.
The first step is to import the tensorflow library:
import tensorflow as tf
Then define the A and b unknowns, using the TensorFlow tf.Variable:
A = tf.Variable(tf.random_uniform([1], -1.0, 1.0))
The unknown factor A was initialized with a random value between -1 and 1, while the variable b is initially set to zero:
b = tf.Variable(tf.zeros([1]))
So we write the linear relationship that binds y to x:
y = A * x_point + b
Now we introduce the cost function: it takes the pair of values A and b to be determined and returns a value that estimates how good those parameters are. In this example, our cost function is the mean squared error:
cost_function = tf.reduce_mean(tf.square(y - y_point))
It provides an estimate of the variability of the measures, or more precisely, of the dispersion of values around the average value; a small value of this function corresponds to a better estimate of the unknown parameters A and b.
To minimize cost_function, we use an optimization algorithm based on gradient descent. Given a mathematical function of several variables, gradient descent allows us to find a local minimum of this function. The technique is as follows:
Evaluate, at an arbitrary first point of the function's domain, the function itself and its gradient. The gradient indicates the direction in which the function tends to a minimum. Select a second point in the direction indicated by the gradient. If the function at this second point has a value lower than the value calculated at the first point, the descent can continue.
You can refer to the following figure for a visual explanation of the algorithm:
The gradient descent algorithm
We also remark that gradient descent finds only a local minimum of the function; however, it can also be used in the search for a global minimum, by randomly choosing a new start point once a local minimum has been found and repeating the process many times. If the number of minima of the function is limited and there is a very high number of attempts, then there is a good chance that sooner or later the global minimum will be identified.
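The procedure just described can be sketched in plain NumPy for our linear data model; the starting point, learning rate, and iteration count below are illustrative choices, not the book's TensorFlow code:

```python
import numpy as np

# Generate data around y = 0.22x + 0.78, as in the data model section
rng = np.random.default_rng(0)
x = rng.normal(0.0, 0.5, 500)
y = 0.22 * x + 0.78 + rng.normal(0.0, 0.1, 500)

A, b = 0.0, 0.0          # arbitrary first point in the (A, b) domain
learning_rate = 0.5
for _ in range(100):
    error = A * x + b - y
    # Gradient of the mean squared error with respect to A and b
    grad_A = 2.0 * np.mean(error * x)
    grad_b = 2.0 * np.mean(error)
    # Move to the next point in the direction opposite to the gradient
    A -= learning_rate * grad_A
    b -= learning_rate * grad_b

print(A, b)  # both approach the true values 0.22 and 0.78
```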
Using TensorFlow, the application of this algorithm is very simple. The instructions are as follows:
optimizer = tf.train.GradientDescentOptimizer(0.5)
Here, 0.5 is the learning rate of the algorithm.
The learning rate determines how fast or slow we move towards the optimal weights. If it is very large, we skip the optimal solution, and if it is too small, we need too many iterations to converge to the best values.
An intermediate value (0.5) is provided, but it must be tuned in order to improve the performance of the entire procedure.
We define train as the result of applying the optimizer to the cost_function, through its minimize function:
train = optimizer.minimize(cost_function)
Testing the model
Now we can test the gradient descent algorithm on the data model created earlier. As usual, we have to initialize all the variables:
model = tf.initialize_all_variables()
So we build our iteration (20 computation steps), allowing us to determine the best values of A and b, which define the line that best fits the data model. Instantiate the evaluation graph:
with tf.Session() as session:
We perform the simulation on our model:
    session.run(model)
    for step in range(0, 21):
For each iteration, we execute the optimization step:
        session.run(train)
Every five steps, we print our pattern of dots:
        if (step % 5) == 0:
            plt.plot(x_point, y_point, 'o',
                     label='step = {}'
                     .format(step))
And the straight lines are obtained with the following command:
            plt.plot(x_point,
                     session.run(A) *
                     x_point +
                     session.run(b))
plt.legend()
plt.show()
The following figure shows the convergence of the implemented algorithm:
Linear regression: start of computation (step = 0)
After just five steps, we can already see (in the next figure) a substantial improvement in the fit of the line:
Linear regression: situation after 5 computation steps
The following (and final) figure shows the definitive result after 20 steps. We can see the efficiency of the algorithm used, with the straight line running perfectly across the cloud of points.
Linear regression: final result
Finally, to further our understanding, we report the complete code:
import numpy as np
import matplotlib.pyplot as plt
import tensorflow as tf

number_of_points = 200
x_point = []
y_point = []
a = 0.22
b = 0.78
for i in range(number_of_points):
    x = np.random.normal(0.0, 0.5)
    y = a*x + b + np.random.normal(0.0, 0.1)
    x_point.append([x])
    y_point.append([y])
plt.plot(x_point, y_point, 'o', label='Input Data')
plt.legend()
plt.show()
A = tf.Variable(tf.random_uniform([1], -1.0, 1.0))
B = tf.Variable(tf.zeros([1]))
y = A * x_point + B
cost_function = tf.reduce_mean(tf.square(y - y_point))
optimizer = tf.train.GradientDescentOptimizer(0.5)
train = optimizer.minimize(cost_function)
model = tf.initialize_all_variables()
with tf.Session() as session:
    session.run(model)
    for step in range(0, 21):
        session.run(train)
        if (step % 5) == 0:
            plt.plot(x_point, y_point, 'o',
                     label='step = {}'
                     .format(step))
            plt.plot(x_point,
                     session.run(A) *
                     x_point +
                     session.run(B))
plt.legend()
plt.show()
The MNIST dataset
The MNIST dataset (available at http://yann.lecun.com/exdb/mnist/) is widely used for training and testing in the field of machine learning, and we will use it in the examples of this book. It contains black and white images of handwritten digits from 0 to 9.
The dataset is divided into two groups: 60,000 images to train the model and an additional 10,000 to test it. The original black and white images were normalized to fit into a box of size 28×28 pixels and centered by calculating the center of mass of the pixels. The following figure represents how the digits are represented in the MNIST dataset:
MNIST digit sampling
Each MNIST data point is an array of numbers describing how dark each pixel is. For example, for the following digit (the digit 1), we could have:
Pixel representation of the digit 1
Downloading and preparing the data
The following code imports the MNIST data files that we are going to classify. I am using a script from Google that can be downloaded from:
https://github.com/tensorflow/tensorflow/blob/r0.7/tensorflow/examples/tutorials/mnist/input_data.py
It must be run in the same folder where the files are located.
Now we will show how to load and display the data:
import input_data
import numpy as np
import matplotlib.pyplot as plt
Using input_data, we load the datasets:
mnist_images = input_data.read_data_sets\
               ("MNIST_data/",\
                one_hot=False)
train.next_batch(10) returns the first 10 images:
pixels, real_values = mnist_images.train.next_batch(10)
This also returns two lists: the matrix of the pixels loaded and the list that contains the real values loaded:
print "list of values loaded ", real_values
example_to_visualize = 5
print "element N° " + str(example_to_visualize + 1)\
      + " of the list plotted"
>>
Extracting MNIST_data/train-labels-idx1-ubyte.gz
Extracting MNIST_data/t10k-images-idx3-ubyte.gz
Extracting MNIST_data/t10k-labels-idx1-ubyte.gz
list of values loaded [7 3 4 6 1 8 1 0 9 8]
element N° 6 of the list plotted
>>
To display an element, we can use matplotlib, as follows:
image = pixels[example_to_visualize,:]
image = np.reshape(image, [28, 28])
plt.imshow(image)
plt.show()
Here is the result:
An MNIST image of the number eight
Classifiers
In the context of machine learning, the term classification identifies an algorithmic procedure that assigns each new input datum (instance) to one of the possible categories (classes). If we consider only two classes, we talk about binary classification; otherwise we have a multi-class classification.
Classification falls into the supervised learning category, which permits us to classify new instances based on the so-called training set. The basic steps to follow to resolve a supervised classification problem are as follows:
1. Build the training examples in order to represent the actual context and application on which to accomplish the classification.
2. Choose the classifier and the corresponding algorithm implementation.
3. Train the algorithm on the training set and set any control parameters through validation.
4. Evaluate the accuracy and performance of the classifier by applying a set of new instances (the test set).
The nearest neighbor algorithm
The K-nearest neighbor (KNN) algorithm is a supervised learning method for both classification and regression. It is a system that assigns the class of the tested sample according to its distance from the objects stored in memory.
The distance, d, is defined as the Euclidean distance between two points:

d(x, y) = sqrt((x1 - y1)^2 + (x2 - y2)^2 + ... + (xn - yn)^2)

Here n is the dimension of the space. The advantage of this method of classification is the ability to classify objects whose classes are not linearly separable. It is a stable classifier, given that small perturbations of the training data do not significantly affect the results obtained. The most obvious disadvantage, however, is that it does not provide a true mathematical model; instead, every new classification is carried out by adding the new datum to all the initial instances and repeating the calculation procedure for the selected K value.
Moreover, it requires a fairly high amount of data to make realistic predictions and is sensitive to the noise of the analyzed data.
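Before moving to MNIST, the nearest neighbor idea can be shown with a toy plain-NumPy sketch; the stored points, labels, and test point below are invented for illustration:

```python
import numpy as np

# Objects stored in memory, with their class labels
train = np.array([[0.0, 0.0], [1.0, 1.0], [5.0, 5.0]])
labels = np.array([0, 0, 1])
test = np.array([4.5, 4.8])

# Euclidean distance d from the test sample to every stored object
distances = np.sqrt(np.sum((train - test) ** 2, axis=1))

# The class assigned is that of the closest stored object
nearest = np.argmin(distances)
print(labels[nearest])  # 1: the test point is closest to [5.0, 5.0]
```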
In the next example, we will implement the KNN algorithm using the MNIST dataset.
Building the training set
Let's start with the import of the libraries needed for the simulation:
import numpy as np
import tensorflow as tf
import input_data
To construct the data model for the training set, use the input_data.read_data_sets function, introduced earlier:
mnist = input_data.read_data_sets("/tmp/data/", one_hot=True)
In our example, the training phase will consist of 100 MNIST images:
train_pixels, train_list_values = mnist.train.next_batch(100)
While we test our algorithm on 10 images:
test_pixels, test_list_of_values = mnist.test.next_batch(10)
Finally, we define the tensors train_pixel_tensor and test_pixel_tensor that we use to construct our classifier:
train_pixel_tensor = tf.placeholder\
                     ("float", [None, 784])
test_pixel_tensor = tf.placeholder\
                    ("float", [784])
Cost function and optimization
The cost function is represented by the distance in terms of pixels:
distance = tf.reduce_sum\
           (tf.abs\
            (tf.add(train_pixel_tensor,\
                    tf.neg(test_pixel_tensor))),\
            reduction_indices=1)
The tf.reduce_sum function computes the sum of elements across the dimensions of a tensor. For example (from the TensorFlow online manual):
# 'x' is [[1, 1, 1]
#         [1, 1, 1]]
tf.reduce_sum(x) ==> 6
tf.reduce_sum(x, 0) ==> [2, 2, 2]
tf.reduce_sum(x, 1) ==> [3, 3]
tf.reduce_sum(x, 1, keep_dims=True) ==> [[3], [3]]
tf.reduce_sum(x, [0, 1]) ==> 6
Finally, to minimize the distance function, we use arg_min, which returns the index with the smallest distance (the nearest neighbor):
pred = tf.arg_min(distance, 0)
Testing and algorithm evaluation
Accuracy is a parameter that helps us to compute the final result of the classifier:
accuracy = 0
Initialize the variables:
init = tf.initialize_all_variables()
Start the simulation:
with tf.Session() as sess:
    sess.run(init)
    for i in range(len(test_list_of_values)):
Then we evaluate the nearest neighbor index, using the pred function defined earlier:
        nn_index = sess.run(pred,\
                            feed_dict={train_pixel_tensor: train_pixels,\
                                       test_pixel_tensor: test_pixels[i, :]})
Finally, we find the nearest neighbor class label and compare it to its true label:
        print "Test N° ", i, "Predicted Class: ",\
              np.argmax(train_list_values[nn_index]),\
              "True Class: ", np.argmax(test_list_of_values[i])
        if np.argmax(train_list_values[nn_index])\
           == np.argmax(test_list_of_values[i]):
Then we evaluate and report the accuracy of the classifier:
            accuracy += 1./len(test_pixels)
    print "Result = ", accuracy
As we can see, almost every element of the test set is correctly classified. The result of the simulation shows the predicted class next to the real class, and finally the total accuracy of the simulation is reported:
>>>
Extracting /tmp/data/train-labels-idx1-ubyte.gz
Extracting /tmp/data/t10k-images-idx3-ubyte.gz
Extracting /tmp/data/t10k-labels-idx1-ubyte.gz
Test N° 0 Predicted Class: 7 True Class: 7
Test N° 1 Predicted Class: 2 True Class: 2
Test N° 2 Predicted Class: 1 True Class: 1
Test N° 3 Predicted Class: 0 True Class: 0
Test N° 4 Predicted Class: 4 True Class: 4
Test N° 5 Predicted Class: 1 True Class: 1
Test N° 6 Predicted Class: 4 True Class: 4
Test N° 7 Predicted Class: 9 True Class: 9
Test N° 8 Predicted Class: 6 True Class: 5
Test N° 9 Predicted Class: 9 True Class: 9
Result = 0.9
>>>
The result is not 100% accurate; the reason lies in a wrong evaluation of test no. 8: instead of 5, the classifier predicted 6.
Finally, we report the complete code for the KNN classification:
import numpy as np
import tensorflow as tf
import input_data
# Build the Training Set
mnist = input_data.read_data_sets("/tmp/data/", one_hot=True)
train_pixels, train_list_values = mnist.train.next_batch(100)
test_pixels, test_list_of_values = mnist.test.next_batch(10)
train_pixel_tensor = tf.placeholder\
                     ("float", [None, 784])
test_pixel_tensor = tf.placeholder\
                    ("float", [784])
# Cost Function and distance optimization
distance = tf.reduce_sum\
           (tf.abs\
            (tf.add(train_pixel_tensor,\
                    tf.neg(test_pixel_tensor))),\
            reduction_indices=1)
pred = tf.arg_min(distance, 0)
# Testing and algorithm evaluation
accuracy = 0.
init = tf.initialize_all_variables()
with tf.Session() as sess:
    sess.run(init)
    for i in range(len(test_list_of_values)):
        nn_index = sess.run(pred,\
                            feed_dict={train_pixel_tensor: train_pixels,\
                                       test_pixel_tensor: test_pixels[i, :]})
        print "Test N° ", i, "Predicted Class: ",\
              np.argmax(train_list_values[nn_index]),\
              "True Class: ", np.argmax(test_list_of_values[i])
        if np.argmax(train_list_values[nn_index])\
           == np.argmax(test_list_of_values[i]):
            accuracy += 1./len(test_pixels)
    print "Result = ", accuracy
Data clustering
A clustering problem consists in the selection and grouping of homogeneous items from a set of initial data. To solve this problem, we must:
Identify a resemblance measure between elements
Find out if there are subsets of elements that are similar according to the chosen measure
The algorithm determines which elements form a cluster and what degree of similarity unites them within the cluster.
Clustering algorithms fall into the unsupervised methods, because we do not assume any prior information on the structures and characteristics of the clusters.
The k-means algorithm
One of the most common and simple clustering algorithms is k-means, which subdivides groups of objects into k partitions on the basis of their attributes. Each cluster is identified by a point called the centroid, the average of its points.
The algorithm follows an iterative procedure:
1. Randomly select K points as the initial centroids.
2. Repeat:
3. Form K clusters by assigning all points to the closest centroid.
4. Recompute the centroid of each cluster.
5. Until the centroids don't change.
The popularity of k-means comes from its convergence speed and its ease of implementation. In terms of the quality of the solutions, the algorithm does not guarantee reaching the global optimum; the quality of the final solution depends largely on the initial set of clusters and may, in practice, be much worse than the global optimum. Since the algorithm is extremely fast, you can apply it several times and choose the most satisfying among the solutions produced. Another disadvantage of the algorithm is that it requires you to choose the number of clusters (k) to find.
If the data is not naturally partitioned, you will end up getting strange results. Furthermore, the algorithm works well only when identifiable spherical clusters are present in the data.
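The iterative procedure listed above can be sketched in a few lines of plain NumPy before we turn to TensorFlow; the data, the value of k, and the iteration cap are illustrative choices:

```python
import numpy as np

rng = np.random.default_rng(1)
points = rng.normal(size=(200, 2))   # toy two-dimensional data
k = 2

# Step 1: randomly select k points as the initial centroids
centroids = points[rng.choice(len(points), k, replace=False)]

for _ in range(10):
    # Step 3: assign every point to its closest centroid
    distances = np.linalg.norm(points[:, None, :] - centroids[None, :, :],
                               axis=2)
    assignments = np.argmin(distances, axis=1)
    # Step 4: recompute each centroid as the mean of its cluster
    # (keeping the old centroid if a cluster happens to be empty)
    new_centroids = np.array([points[assignments == j].mean(axis=0)
                              if np.any(assignments == j) else centroids[j]
                              for j in range(k)])
    # Step 5: stop when the centroids no longer change
    if np.allclose(new_centroids, centroids):
        break
    centroids = new_centroids
```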
Let us now see how to implement k-means using the TensorFlow library.
Building the training set
Import all the libraries necessary for our simulation:
import matplotlib.pyplot as plt
import numpy as np
import tensorflow as tf
import pandas as pd
Note
Pandas is an open source library providing easy-to-use data structures and data analysis tools for the Python programming language. To install it, type the following command:
sudo pip install pandas
We must define the parameters of our problem. The total number of points that we want to cluster is 1000:
num_vectors = 1000
The number of partitions we want to obtain:
num_clusters = 4
We set the number of computational steps of the k-means algorithm:
num_steps = 100
We initialize the input data structures:
x_values = []
y_values = []
vector_values = []
The training set is a random set of points, which is why we use the random.normal NumPy function, allowing us to build the x_values and y_values vectors:
for i in xrange(num_vectors):
    if np.random.random() > 0.5:
        x_values.append(np.random.normal(0.4, 0.7))
        y_values.append(np.random.normal(0.2, 0.8))
    else:
        x_values.append(np.random.normal(0.6, 0.4))
        y_values.append(np.random.normal(0.8, 0.5))
We use the Python zip function to obtain the complete list of vector_values:
vector_values = zip(x_values, y_values)
Then vector_values is converted into a constant, usable by TensorFlow:
vectors = tf.constant(vector_values)
We can see our training set for the clustering algorithm with the following commands:
plt.plot(x_values, y_values, 'o', label='Input Data')
plt.legend()
plt.show()
The training set for k-means
After randomly building the training set, we have to generate the (k = 4) initial centroids, determining their indices using tf.random_shuffle:
n_samples = tf.shape(vector_values)[0]
random_indices = tf.random_shuffle(tf.range(0, n_samples))
By adopting this procedure, we are able to determine four random indices:
begin = [0,]
size = [num_clusters,]
size[0] = num_clusters
These are the indices of our initial centroids:
centroid_indices = tf.slice(random_indices, begin, size)
centroids = tf.Variable(tf.gather\
                        (vector_values, centroid_indices))
Cost functions and optimization
The cost function we want to minimize for this problem is again the Euclidean distance between two points:

d(p, q) = sqrt((p1 - q1)^2 + (p2 - q2)^2)
In order to manage the previously defined tensors, vectors and centroids, we use the TensorFlow function expand_dims, which automatically expands the size of the two arguments:
expanded_vectors = tf.expand_dims(vectors, 0)
expanded_centroids = tf.expand_dims(centroids, 1)
This function allows you to standardize the shapes of the two tensors, in order to evaluate their difference with the tf.sub method:
vectors_subtration = tf.sub(expanded_vectors, expanded_centroids)
Finally, we build the euclidean_distances cost function, using the tf.reduce_sum function, which computes the sum of elements across the dimensions of a tensor, while the tf.square function computes the square of the vectors_subtration tensor element-wise:
euclidean_distances = tf.reduce_sum(tf.square\
                                    (vectors_subtration), 2)
assignments = tf.to_int32(tf.argmin(euclidean_distances, 0))
Here assignments is the index with the smallest distance across the euclidean_distances tensor. Let us now turn to the optimization phase, the purpose of which is to improve the choice of centroids, on which the construction of the clusters depends. We partition the vectors (our training set) into num_clusters tensors, using the indices from assignments.
The following code takes the nearest indices for each sample, and grabs them out as separate groups, using tf.dynamic_partition:
partitions = tf.dynamic_partition\
             (vectors, assignments, num_clusters)
Finally, we update the centroids, using tf.reduce_mean on each single group to find its average, forming the new centroid:
update_centroids = tf.concat(0,\
                             [tf.expand_dims\
                              (tf.reduce_mean(partition, 0), 0)\
                              for partition in partitions])
To form the update_centroids tensor, we use tf.concat to concatenate the single centroids into one tensor.
Testing and algorithm evaluation
It's time to test and evaluate the algorithm. The first procedure is to initialize all the variables and instantiate the evaluation graph:
init_op = tf.initialize_all_variables()
sess = tf.Session()
sess.run(init_op)
Now we start the computation:
for step in xrange(num_steps):
    _, centroid_values, assignment_values =\
       sess.run([update_centroids,\
                 centroids,\
                 assignments])
To display the result, we implement the following function:
display_partition(x_values, y_values, assignment_values)
This takes the x_values and y_values vectors of the training set, together with the assignment_values vector, and draws the clusters.
The code for this visualization function is as follows:
def display_partition(x_values, y_values, assignment_values):
    labels = []
    colors = ["red", "blue", "green", "yellow"]
    for i in xrange(len(assignment_values)):
        labels.append(colors[(assignment_values[i])])
    color = labels
    df = pd.DataFrame\
         (dict(x=x_values, y=y_values, color=labels))
    fig, ax = plt.subplots()
    ax.scatter(df['x'], df['y'], c=df['color'])
    plt.show()
It associates a color with each cluster by means of the following data structure:
colors = ["red", "blue", "green", "yellow"]
It then draws them through the scatter function of matplotlib:
ax.scatter(df['x'], df['y'], c=df['color'])
Let's display the result:
Final result of the k-means algorithm
Here is the complete code of the k-means algorithm:
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import tensorflow as tf

def display_partition(x_values, y_values, assignment_values):
    labels = []
    colors = ["red", "blue", "green", "yellow"]
    for i in xrange(len(assignment_values)):
        labels.append(colors[(assignment_values[i])])
    color = labels
    df = pd.DataFrame\
         (dict(x=x_values, y=y_values, color=labels))
    fig, ax = plt.subplots()
    ax.scatter(df['x'], df['y'], c=df['color'])
    plt.show()

num_vectors = 2000
num_clusters = 4
n_samples_per_cluster = 500
num_steps = 1000
x_values = []
y_values = []
vector_values = []
# CREATE RANDOM DATA
for i in xrange(num_vectors):
    if np.random.random() > 0.5:
        x_values.append(np.random.normal(0.4, 0.7))
        y_values.append(np.random.normal(0.2, 0.8))
    else:
        x_values.append(np.random.normal(0.6, 0.4))
        y_values.append(np.random.normal(0.8, 0.5))
vector_values = zip(x_values, y_values)
vectors = tf.constant(vector_values)
n_samples = tf.shape(vector_values)[0]
random_indices = tf.random_shuffle(tf.range(0, n_samples))
begin = [0,]
size = [num_clusters,]
size[0] = num_clusters
centroid_indices = tf.slice(random_indices, begin, size)
centroids = tf.Variable(tf.gather(vector_values, centroid_indices))
expanded_vectors = tf.expand_dims(vectors, 0)
expanded_centroids = tf.expand_dims(centroids, 1)
vectors_subtration = tf.sub(expanded_vectors, expanded_centroids)
euclidean_distances = \
    tf.reduce_sum(tf.square(vectors_subtration), 2)
assignments = tf.to_int32(tf.argmin(euclidean_distances, 0))
# tf.dynamic_partition example (from the TensorFlow manual):
# partitions = [0, 0, 1, 1, 0]
# num_partitions = 2
# data = [10, 20, 30, 40, 50]
# outputs[0] = [10, 20, 50]
# outputs[1] = [30, 40]
partitions = tf.dynamic_partition(vectors, assignments, num_clusters)
update_centroids = tf.concat(0, [tf.expand_dims
                                 (tf.reduce_mean(partition, 0), 0)\
                                 for partition in partitions])
init_op = tf.initialize_all_variables()
sess = tf.Session()
sess.run(init_op)
for step in xrange(num_steps):
    _, centroid_values, assignment_values =\
       sess.run([update_centroids,\
                 centroids,\
                 assignments])
display_partition(x_values, y_values, assignment_values)
plt.plot(x_values, y_values, 'o', label='Input Data')
plt.legend()
plt.show()
Summary
In this chapter, we began to explore the potential of TensorFlow for some typical problems in machine learning. With the linear regression algorithm, the important concepts of cost function and optimization using gradient descent were explained. We then described the MNIST dataset of handwritten digits. We also implemented a multi-class classifier using the nearest neighbor algorithm, which falls into the supervised learning category. The chapter then concluded with an example of unsupervised learning, implementing the k-means algorithm to solve a data clustering problem.
In the next chapter, we will introduce neural networks. These are mathematical models that represent the interconnection between elements defined as artificial neurons, namely mathematical constructs that mimic the properties of living neurons.
We'll also implement some neural network learning models using TensorFlow.
Chapter 4. Introducing Neural Networks
In this chapter, we will cover the following topics:
What are neural networks?
Single Layer Perceptron
Logistic regression
Multi Layer Perceptron
Multi Layer Perceptron classification
Multi Layer Perceptron function approximation
What are artificial neural networks?
An artificial neural network (ANN) is an information processing system whose operating mechanism is inspired by biological neural circuits. Thanks to their characteristics, neural networks are the protagonists of a real revolution in machine learning systems and, more specifically, in the context of artificial intelligence. An ANN possesses many simple processing units, variously connected to each other according to various architectures. If we look at the schema of an ANN reported later, it can be seen that the hidden units communicate with the external layer, both in input and output, while the input and output units communicate only with the hidden layer of the network.
Each unit or node simulates the role of the neuron in biological neural networks. Each node, called an artificial neuron, has a very simple operation: it becomes active if the total quantity of signal that it receives exceeds its activation threshold, defined by the so-called activation function. If a node becomes active, it emits a signal that is transmitted along the transmission channels up to the other units to which it is connected. Each connection point acts as a filter that converts the message into an inhibitory or excitatory signal, increasing or decreasing its intensity according to its individual characteristics. The connection points simulate biological synapses and have the fundamental function of weighing the intensity of the transmitted signals, multiplying them by the weights whose values depend on the connection itself.
ANN schematic diagram
Neural network architectures
The way the nodes are connected, the total number of layers (that is, the levels of nodes between input and output), and the number of neurons per layer all define the architecture of a neural network. For example, in multilayer networks (we introduce these in the second part of this chapter), one can identify the artificial neurons of the layers such that:
Each neuron is connected with all those of the next layer
There are no connections between neurons belonging to the same layer
The number of layers and of neurons per layer depends on the problem to be solved
Now we start our exploration of neural network models, introducing the most simple neural network model: the Single Layer Perceptron, or the so-called Rosenblatt's Perceptron.
Single Layer Perceptron
The Single Layer Perceptron was the first neural network model, proposed in 1958 by Frank Rosenblatt. In this model, the content of the local memory of the neuron consists of a vector of weights, W = (w1, w2, ..., wn). The computation is performed as a sum over the input vector X = (x1, x2, ..., xn), each element of which is multiplied by the corresponding element of the vector of weights; then the value provided in output (that is, the weighted sum) will be the input of an activation function. This function returns 1 if the result is greater than a certain threshold; otherwise it returns -1. In the following figure, the activation function is the so-called sign function:
sign(x) = +1 if x > 0; −1 otherwise
It is possible to use other activation functions, preferably non-linear (such as the sigmoid function, which we will see in the next section). The learning procedure of the net is iterative: for each learning cycle (called an epoch), it slightly modifies the synaptic weights using a selected set called the training set. At each cycle, the weights must be modified to minimize a cost function, which is specific to the problem under consideration. Finally, when the perceptron has been trained on the training set, it will be tested on other inputs (the test set) in order to verify its capacity for generalization.
Schema of Rosenblatt's Perceptron
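The forward step of the perceptron, the weighted sum followed by the sign activation, can be sketched as follows; the weight and input vectors are made up, and the learning rule itself is not shown:

```python
import numpy as np

def sign(x):
    # The sign activation function described above
    return 1 if x > 0 else -1

W = np.array([0.5, -0.3, 0.8])   # local memory: the weight vector
X = np.array([1.0, 2.0, 0.5])    # input vector

# Weighted sum of the inputs, then thresholding
output = sign(np.dot(W, X))
print(output)  # 1, since 0.5*1.0 - 0.3*2.0 + 0.8*0.5 = 0.3 > 0
```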
Let us now see how to implement a single layer neural network for an image classification problem using TensorFlow.
The logistic regression
This algorithm has nothing to do with the canonical linear regression we saw in Chapter 3, Starting with Machine Learning; rather, it is an algorithm that allows us to solve supervised classification problems. In fact, to estimate the dependent variable, we now make use of the so-called logistic function, or sigmoid. It is precisely because of this feature that we call this algorithm logistic regression. The sigmoid function has the following pattern:
Sigmoid function
As we can see, the dependent variable takes values strictly between 0 and 1, which is precisely what serves us. In the case of logistic regression, we want our function to tell us the probability of an element belonging to a particular class. We recall again that supervised learning by the neural network is configured as an iterative process of optimization of the weights; these are then modified on the basis of the network's performance on the training set. Indeed, the aim is to minimize the loss function, which indicates the degree to which the behavior of the network deviates from the desired one. The performance of the network is then verified on a test set, consisting of images other than the trained ones.
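The sigmoid shown in the figure above can be written in a couple of lines; the sample inputs are arbitrary:

```python
import numpy as np

def sigmoid(x):
    # The logistic function: its output lies strictly between 0 and 1,
    # so it can be read as a probability of class membership
    return 1.0 / (1.0 + np.exp(-x))

for value in (-5.0, 0.0, 5.0):
    print(value, sigmoid(value))
```

sigmoid(0) is exactly 0.5, and large positive or negative inputs saturate towards 1 or 0.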
The basic steps of the training that we're going to implement are as follows:
The weights are initialized with random values at the beginning of the training.
For each element of the training set, the error is calculated, that is, the difference between the desired output and the actual output. This error is used to adjust the weights.
The process is repeated, resubmitting to the network, in a random order, all the examples of the training set, until the error made on the entire training set falls below a certain threshold, or until the maximum number of iterations is reached.
Let us now see in detail how to implement logistic regression with TensorFlow. The problem we want to solve is to classify images from the MNIST dataset, which, as explained in Chapter 3, Starting with Machine Learning, is a database of handwritten digits.
TensorFlow implementation
The TensorFlow implementation requires the following steps:
1. First of all, we have to import all the necessary libraries:
import input_data
import tensorflow as tf
import matplotlib.pyplot as plt
2. We use the input_data.read function, introduced in Chapter 3, Starting with Machine Learning, in the MNIST dataset section, to upload the images for our problem:
mnist = input_data.read_data_sets("/tmp/data/", one_hot=True)
3. Then we set the total number of epochs for the training phase:
training_epochs = 25
4. We must also define other parameters that are necessary to build the model:
learning_rate = 0.01
batch_size = 100
display_step = 1
5. Now we move on to the construction of the model.
Building the model
Define x as the input tensor; it represents an MNIST data image of 28×28 = 784 pixels:
x = tf.placeholder("float", [None, 784])
We recall that our problem consists of assigning a probability value to each of the possible classes of membership (the digits from 0 to 9). At the end of this calculation, we will use a probability distribution, which tells us how confident we are of our prediction.
So the output we're going to get will be an output tensor with 10 probabilities, each one corresponding to a digit (of course, the sum of the probabilities must be one):
y = tf.placeholder("float", [None, 10])
To assign probabilities to each image, we will use the so-called softmax activation function.
The softmax function is specified in two main steps:
Calculate the evidence that a certain image belongs to a particular class
Convert the evidence into probabilities of belonging to each of the 10 possible classes
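The two steps can be sketched numerically with NumPy; the evidence scores below are invented for a single hypothetical image:

```python
import numpy as np

# Step 1 (result): hypothetical evidence scores for the 10 digit classes
evidence = np.array([1.2, 0.3, -0.5, 2.0, 0.1, -1.0, 0.4, 0.0, 1.5, -0.2])

# Step 2: convert the evidence into probabilities (the softmax function)
exp_evidence = np.exp(evidence)
probabilities = exp_evidence / exp_evidence.sum()

print(probabilities.sum())     # sums to one (up to rounding)
print(probabilities.argmax())  # 3: the class with the largest evidence
```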
To evaluate the evidence, we first define the weights input tensor as W:
W = tf.Variable(tf.zeros([784, 10]))
For a given image, we can evaluate the evidence for each class i by simply multiplying the tensor W by the input tensor x. Using TensorFlow, we should have something like the following:
evidence = tf.matmul(x, W)
In general, models include an extra parameter representing the bias, which indicates a certain degree of uncertainty. In our case, the final formula for the evidence is as follows:
evidence = tf.matmul(x, W) + b
This means that for every i (from 0 to 9) we have a matrix Wi of 784 elements (28×28), where each element j of the matrix is multiplied by the corresponding component j of the input image (of 784 components), and then the corresponding bias element bi is added.
So to define the evidence, we must define the following tensor of biases:
b = tf.Variable(tf.zeros([10]))
The second step is to finally use the softmax function to obtain the output vector of probabilities, namely activation:
activation = tf.nn.softmax(tf.matmul(x, W) + b)
TensorFlow's tf.nn.softmax function provides a probability-based output from the input evidence tensor. Once we have implemented the model, we can specify the necessary code to find the weights W and biases b of the network through the iterative training algorithm. In each iteration, the training algorithm takes the training data, applies the neural network, and compares the result with the expected one.
Note
TensorFlow provides many other activation functions. See https://www.tensorflow.org/versions/r0.8/api_docs/index.html for better references.
Inordertotrainourmodelandknowwhenwehaveagoodone,wemustdefinehowtodefinetheaccuracyofourmodel.OurgoalistotrytogetvaluesofparametersWandbthatminimizethevalueofthemetricthatindicateshowbadthemodelis.
Differentmetricscalculateddegreeoferrorbetweenthedesiredoutputandthetrainingdataoutputs.AcommonmeasureoferroristhemeansquarederrorortheSquaredEuclideanDistance.However,therearesomeresearchfindingsthatsuggesttouseothermetricstoaneuralnetworklikethis.
In this example, we use the so-called cross-entropy error function. It is defined as follows:
cross_entropy = y * tf.log(activation)
In order to minimize cross_entropy, we can use the following combination of tf.reduce_mean and tf.reduce_sum to build the cost function:
cost = tf.reduce_mean\
       (-tf.reduce_sum\
        (cross_entropy, reduction_indices=1))
Then we must minimize it using the gradient descent optimization algorithm:
optimizer = tf.train.GradientDescentOptimizer\
            (learning_rate).minimize(cost)
Just a few lines of code to build a neural net model!
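The cost built by the reduce_sum/reduce_mean combination above is easy to verify by hand: for each example, the cross-entropy is -Σj yj log(activationj) over the classes, and the cost is the mean over the batch. A plain-Python sketch with toy labels and probabilities (illustrative values, not the book's MNIST data):

```python
import math

def cross_entropy_cost(labels, activations):
    # labels: one-hot vectors; activations: predicted probability vectors
    per_example = []
    for y_vec, a_vec in zip(labels, activations):
        # -sum(y * log(a)) over the classes (reduction_indices=1)
        per_example.append(-sum(y * math.log(a) for y, a in zip(y_vec, a_vec)))
    # tf.reduce_mean: average over the batch
    return sum(per_example) / len(per_example)

labels = [[0, 1], [1, 0]]
activations = [[0.2, 0.8], [0.9, 0.1]]
cost = cross_entropy_cost(labels, activations)
print(round(cost, 4))  # -> 0.1643
```

Because the labels are one-hot, only the log probability assigned to the true class contributes, so confident correct predictions drive the cost toward zero.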
Launch the session
It's time to build the session and launch our neural net model.
We fix the following lists to visualize the training session:
avg_set = []
epoch_set = []
Then we initialize the TensorFlow variables:
init = tf.initialize_all_variables()
Start the session:
with tf.Session() as sess:
    sess.run(init)
As explained, each epoch is a training cycle:
for epoch in range(training_epochs):
    avg_cost = 0.
    total_batch = int(mnist.train.num_examples/batch_size)
Then we loop over all the batches:
for i in range(total_batch):
    batch_xs, batch_ys = \
              mnist.train.next_batch(batch_size)
Fit the training using the batch data:
sess.run(optimizer, feed_dict={x: batch_xs, y: batch_ys})
Compute the average loss by running the cost function with the given image values (x) and the real output (y):
avg_cost += sess.run\
            (cost, feed_dict={x: batch_xs,\
                              y: batch_ys})/total_batch
During computation, we display a log per epoch step:
if epoch % display_step == 0:
    print "Epoch:",\
          '%04d' % (epoch+1),\
          "cost=", "{:.9f}".format(avg_cost)
print "Training phase finished"
Let's get the accuracy of our model. It is correct if the index with the highest y value is the same as in the real digit vector; the mean of correct_prediction gives us the accuracy. We need to run the accuracy function with our test set (mnist.test).
We use the keys images and labels for x and y:
correct_prediction = tf.equal\
                     (tf.argmax(activation, 1),\
                      tf.argmax(y, 1))
accuracy = tf.reduce_mean\
           (tf.cast(correct_prediction, "float"))
print "MODEL accuracy:", accuracy.eval({x: mnist.test.images,\
                                        y: mnist.test.labels})
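The argmax-equality test above can be reproduced numerically: count how often the index of the largest predicted probability matches the index of the 1 in the one-hot label, cast to float, and average. A plain-Python sketch with illustrative values, mirroring what tf.equal, tf.cast, and tf.reduce_mean compute:

```python
def argmax(v):
    # Index of the largest value, like tf.argmax along one row
    return v.index(max(v))

predictions = [[0.1, 0.7, 0.2],   # predicted class 1
               [0.8, 0.1, 0.1],   # predicted class 0
               [0.3, 0.3, 0.4],   # predicted class 2
               [0.6, 0.2, 0.2]]   # predicted class 0
labels = [[0, 1, 0],              # true class 1 -> correct
          [0, 0, 1],              # true class 2 -> wrong
          [0, 0, 1],              # true class 2 -> correct
          [1, 0, 0]]              # true class 0 -> correct

correct = [argmax(p) == argmax(y) for p, y in zip(predictions, labels)]
# Cast the booleans to float and average them
accuracy = sum(correct) / float(len(correct))
print(accuracy)  # -> 0.75
```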
Test evaluation
We previously showed the training phase, and for each epoch we have printed the relative cost function:
Python 2.7.10 (default, Oct 14 2015, 16:09:02) [GCC 5.2.1 20151010]
on linux2 Type "copyright", "credits" or "license()" for more
information. >>> ======================= RESTART
============================
>>>
Extracting /tmp/data/train-images-idx3-ubyte.gz
Extracting /tmp/data/train-labels-idx1-ubyte.gz
Extracting /tmp/data/t10k-images-idx3-ubyte.gz
Extracting /tmp/data/t10k-labels-idx1-ubyte.gz
Epoch: 0001 cost= 1.174406662
Epoch: 0002 cost= 0.661956009
Epoch: 0003 cost= 0.550468774
Epoch: 0004 cost= 0.496588717
Epoch: 0005 cost= 0.463674555
Epoch: 0006 cost= 0.440907706
Epoch: 0007 cost= 0.423837747
Epoch: 0008 cost= 0.410590841
Epoch: 0009 cost= 0.399881751
Epoch: 0010 cost= 0.390916621
Epoch: 0011 cost= 0.383320325
Epoch: 0012 cost= 0.376767031
Epoch: 0013 cost= 0.371007620
Epoch: 0014 cost= 0.365922904
Epoch: 0015 cost= 0.361327561
Epoch: 0016 cost= 0.357258660
Epoch: 0017 cost= 0.353508228
Epoch: 0018 cost= 0.350164634
Epoch: 0019 cost= 0.347015593
Epoch: 0020 cost= 0.344140861
Epoch: 0021 cost= 0.341420144
Epoch: 0022 cost= 0.338980592
Epoch: 0023 cost= 0.336655581
Epoch: 0024 cost= 0.334488012
Epoch: 0025 cost= 0.332488823
Training phase finished
As you can see, during the training phase the cost function is minimized. At the end of the test, we show how accurate the implemented model is:
Model Accuracy: 0.9475
>>>
Finally, using the following lines of code, we can visualize the training phase of the net:
plt.plot(epoch_set, avg_set, 'o',\
         label='Logistic Regression Training phase')
plt.ylabel('cost')
plt.xlabel('epoch')
plt.legend()
plt.show()
Training phase in logistic regression
Source code
# Import MNIST data
import input_data
mnist = input_data.read_data_sets("/tmp/data/", one_hot=True)
import tensorflow as tf
import matplotlib.pyplot as plt
# Parameters
learning_rate = 0.01
training_epochs = 25
batch_size = 100
display_step = 1
# tf Graph Input
x = tf.placeholder("float", [None, 784])
y = tf.placeholder("float", [None, 10])
# Create model
# Set model weights
W = tf.Variable(tf.zeros([784, 10]))
b = tf.Variable(tf.zeros([10]))
# Construct model
activation = tf.nn.softmax(tf.matmul(x, W) + b)
# Minimize error using cross entropy
cross_entropy = y * tf.log(activation)
cost = tf.reduce_mean\
       (-tf.reduce_sum\
        (cross_entropy, reduction_indices=1))
optimizer = tf.train.\
            GradientDescentOptimizer(learning_rate).minimize(cost)
# Plot settings
avg_set = []
epoch_set = []
# Initializing the variables
init = tf.initialize_all_variables()
# Launch the graph
with tf.Session() as sess:
    sess.run(init)
    # Training cycle
    for epoch in range(training_epochs):
        avg_cost = 0.
        total_batch = int(mnist.train.num_examples/batch_size)
        # Loop over all batches
        for i in range(total_batch):
            batch_xs, batch_ys = \
                      mnist.train.next_batch(batch_size)
            # Fit training using batch data
            sess.run(optimizer,\
                     feed_dict={x: batch_xs, y: batch_ys})
            # Compute average loss
            avg_cost += sess.run(cost, feed_dict=\
                                 {x: batch_xs,\
                                  y: batch_ys})/total_batch
        # Display logs per epoch step
        if epoch % display_step == 0:
            print "Epoch:", '%04d' % (epoch+1),\
                  "cost=", "{:.9f}".format(avg_cost)
        avg_set.append(avg_cost)
        epoch_set.append(epoch+1)
    print "Training phase finished"
    plt.plot(epoch_set, avg_set, 'o',\
             label='Logistic Regression Training phase')
    plt.ylabel('cost')
    plt.xlabel('epoch')
    plt.legend()
    plt.show()
    # Test model
    correct_prediction = tf.equal\
                         (tf.argmax(activation, 1),\
                          tf.argmax(y, 1))
    # Calculate accuracy
    accuracy = tf.reduce_mean(tf.cast(correct_prediction, "float"))
    print "Model accuracy:", accuracy.eval({x: mnist.test.images,\
                                            y: mnist.test.labels})
Multi Layer Perceptron
A more complex and efficient architecture is that of the Multi Layer Perceptron (MLP). It is substantially formed from multiple layers of perceptrons, and therefore by the presence of at least one hidden layer, that is, one connected neither to the inputs nor to the outputs of the network:
The MLP architecture
A network of this type is typically trained using supervised learning, according to the principles outlined in the previous paragraph. In particular, a typical learning algorithm for MLP networks is the so-called backpropagation algorithm.
Note
The backpropagation algorithm is a learning algorithm for neural networks. It compares the output value of the system with the desired value. On the basis of the difference thus calculated (namely, the error), the algorithm modifies the synaptic weights of the neural network, progressively converging the set of output values toward the desired ones.
It is important to note that in MLP networks, although you don't know the desired outputs of the neurons of the hidden layers of the network, it is always possible to apply a supervised learning method based on the minimization of an error function via the application of gradient-descent techniques.
In the following example, we show the implementation with MLP for an image classification problem (MNIST).
Multi Layer Perceptron classification
Import the necessary libraries:
import input_data
import tensorflow as tf
import matplotlib.pyplot as plt
Load the images to classify:
mnist = input_data.read_data_sets("/tmp/data/", one_hot=True)
Fix some parameters for the MLP model:
Learning rate of the net:
learning_rate = 0.001
The epochs:
training_epochs = 20
The number of images to classify:
batch_size = 100
display_step = 1
The number of neurons for the first layer:
n_hidden_1 = 256
The number of neurons for the second layer:
n_hidden_2 = 256
The size of the input (each image has 784 pixels):
n_input = 784 # MNIST data input (img shape: 28*28)
The size of the output classes:
n_classes = 10
It should therefore be noted that while for a given application the input and output sizes are perfectly defined, there are no strict criteria for how to define the number of hidden layers and the number of neurons for each layer.
Every choice must be based on experience of similar applications, as in our case:
When increasing the number of hidden layers, we must also increase the size of the training set, and increase the number of connections to be updated during the learning phase. This results in an increase in the training time. Also, if there are too many neurons in the hidden layer, not only are there more weights to be updated, but the network also has a tendency to learn too much from the training examples set, resulting in a poor generalization ability. But if the hidden neurons are too few, the network is not able to learn even with the training set.
Build the model
The input layer is the x tensor [1 × 784], which represents the image to classify:
x = tf.placeholder("float", [None, n_input])
The output tensor y is equal to the number of classes:
y = tf.placeholder("float", [None, n_classes])
In the middle, we have two hidden layers. The first layer is constituted by the h tensor of weights, whose size is [784 × 256], where 256 is the total number of nodes of the layer:
h = tf.Variable(tf.random_normal([n_input, n_hidden_1]))
For layer 1, we therefore have to define the respective biases tensor:
bias_layer_1 = tf.Variable(tf.random_normal([n_hidden_1]))
Each neuron receives the pixels of the input image to be classified, combined with the hij weight connections and added to the respective values of the biases tensor:
layer_1 = tf.nn.sigmoid(tf.add(tf.matmul(x, h), bias_layer_1))
It sends its output to the neurons of the next layer through the activation function. It must be said that the functions can differ from one neuron to another, but in practice we adopt a common one for all the neurons, typically of the sigmoidal type. Sometimes the output neurons are equipped with a linear activation function. It is interesting to note that the activation functions of the neurons in the hidden layers cannot be linear because, in this case, the MLP network would be equivalent to a network with two layers and therefore no longer of the MLP type. The second layer must perform the same steps as the first.
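The per-neuron computation of the sigmoid layer above, sigmoid of the weighted inputs plus a bias, can be sketched in plain Python. The weights and inputs here are toy values for illustration, not the randomly initialized tensors of the model:

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def dense_sigmoid_layer(x, weights, biases):
    # weights[j][i]: connection from input j to neuron i, as in tf.matmul(x, h)
    out = []
    for i in range(len(biases)):
        z = sum(x[j] * weights[j][i] for j in range(len(x))) + biases[i]
        out.append(sigmoid(z))
    return out

# Two inputs feeding three hidden neurons (toy values)
x = [1.0, -1.0]
weights = [[0.5, -0.5, 1.0],
           [0.5,  0.5, 0.0]]
biases = [0.0, 0.0, -1.0]
out = dense_sigmoid_layer(x, weights, biases)
print([round(v, 3) for v in out])  # -> [0.5, 0.269, 0.5]
```

Each output lies in (0, 1), which is what makes the layer non-linear and prevents the two-layer collapse mentioned above.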
The second intermediate layer is represented by the shape of the weights tensor [256 × 256]:
w = tf.Variable(tf.random_normal([n_hidden_1, n_hidden_2]))
With the tensor of biases:
bias_layer_2 = tf.Variable(tf.random_normal([n_hidden_2]))
Each neuron in this second layer receives inputs from the neurons of layer 1, combined with the weight Wij connections and added to the respective biases of layer 2:
layer_2 = tf.nn.sigmoid(tf.add(tf.matmul(layer_1, w), bias_layer_2))
It sends its output to the next layer, namely the output layer:
output = tf.Variable(tf.random_normal([n_hidden_2, n_classes]))
bias_output = tf.Variable(tf.random_normal([n_classes]))
output_layer = tf.matmul(layer_2, output) + bias_output
The output layer receives as input the n stimuli (256) coming from layer 2, which are converted into the respective class probabilities for each digit.
As for the logistic regression, we then define the cost function:
cost = tf.reduce_mean\
       (tf.nn.softmax_cross_entropy_with_logits\
        (output_layer, y))
The TensorFlow function tf.nn.softmax_cross_entropy_with_logits computes the cost for a softmax layer. It is only used during training. The logits are the unnormalized log probabilities output by the model (the values output before the softmax normalization is applied to them).
The corresponding optimizer that minimizes the cost function is:
optimizer = tf.train.AdamOptimizer\
            (learning_rate=learning_rate).minimize(cost)
tf.train.AdamOptimizer uses Kingma and Ba's Adam algorithm to control the learning rate. Adam offers several advantages over the simple tf.train.GradientDescentOptimizer. In fact, it uses a larger effective step size, and the algorithm will converge to this step size without fine tuning.
A simple tf.train.GradientDescentOptimizer could equally be used in your MLP, but would require more hyperparameter tuning before it could converge as quickly.
Note
TensorFlow provides the optimizer base class to compute gradients for a loss and apply gradients to variables. This class defines the API to add ops to train a model. You never use this class directly, but instead instantiate one of its subclasses. See https://www.tensorflow.org/versions/r0.8/api_docs/python/train.html#Optimizer to see the optimizers implemented.
Launch the session
The following are the steps to launch the session:
1. Plot the settings:
avg_set = []
epoch_set = []
2. Initialize the variables:
init = tf.initialize_all_variables()
3. Launch the graph:
with tf.Session() as sess:
    sess.run(init)
4. Define the training cycle:
for epoch in range(training_epochs):
    avg_cost = 0.
    total_batch = int(mnist.train.num_examples/batch_size)
5. Loop over all the batches (of 100 images each):
for i in range(total_batch):
    batch_xs, batch_ys = \
              mnist.train.next_batch(batch_size)
6. Fit training using the batch data:
sess.run(optimizer, feed_dict={x: batch_xs, y: batch_ys})
7. Compute the average loss:
avg_cost += sess.run(cost, feed_dict={x: batch_xs,\
                                      y: batch_ys})/total_batch
Display logs per epoch step:
if epoch % display_step == 0:
    print "Epoch:", '%04d' % (epoch+1),\
          "cost=", "{:.9f}".format(avg_cost)
avg_set.append(avg_cost)
epoch_set.append(epoch+1)
print "Training phase finished"
8. With these lines of code, we plot the training phase:
plt.plot(epoch_set, avg_set, 'o', label='MLP Training phase')
plt.ylabel('cost')
plt.xlabel('epoch')
plt.legend()
plt.show()
9. Finally, we can test the MLP model:
correct_prediction = tf.equal(tf.argmax(output_layer, 1),\
                              tf.argmax(y, 1))
evaluating its accuracy:
accuracy = tf.reduce_mean(tf.cast(correct_prediction,
                                  "float"))
print "Model Accuracy:", accuracy.eval({x:
                                        mnist.test.images,\
                                        y: mnist.test.labels})
10. Here is the output result after 20 epochs:
Python 2.7.10 (default, Oct 14 2015, 16:09:02) [GCC 5.2.1
20151010] on linux2 Type "copyright", "credits" or "license()" for
more information.
>>> ========================== RESTART
==============================
>>>
Successfully downloaded train-images-idx3-ubyte.gz 9912422 bytes.
Extracting /tmp/data/train-images-idx3-ubyte.gz
Successfully downloaded train-labels-idx1-ubyte.gz 28881 bytes.
Extracting /tmp/data/train-labels-idx1-ubyte.gz
Successfully downloaded t10k-images-idx3-ubyte.gz 1648877 bytes.
Extracting /tmp/data/t10k-images-idx3-ubyte.gz
Successfully downloaded t10k-labels-idx1-ubyte.gz 4542 bytes.
Extracting /tmp/data/t10k-labels-idx1-ubyte.gz
Epoch: 0001 cost= 1.723947845
Epoch: 0002 cost= 0.539266024
Epoch: 0003 cost= 0.362600502
Epoch: 0004 cost= 0.266637279
Epoch: 0005 cost= 0.205345784
Epoch: 0006 cost= 0.159139332
Epoch: 0007 cost= 0.125232637
Epoch: 0008 cost= 0.098572041
Epoch: 0009 cost= 0.077509963
Epoch: 0010 cost= 0.061127526
Epoch: 0011 cost= 0.048033808
Epoch: 0012 cost= 0.037297983
Epoch: 0013 cost= 0.028884999
Epoch: 0014 cost= 0.022818390
Epoch: 0015 cost= 0.017447586
Epoch: 0016 cost= 0.013652348
Epoch: 0017 cost= 0.010417282
Epoch: 0018 cost= 0.008079228
Epoch: 0019 cost= 0.006203546
Epoch: 0020 cost= 0.004961207
Training phase finished
Model Accuracy: 0.9775
>>>
We show the training phase in the following figure:
Training phase in Multi Layer Perceptron
Source code
# Import MNIST data
import input_data
mnist = input_data.read_data_sets("/tmp/data/", one_hot=True)
import tensorflow as tf
import matplotlib.pyplot as plt
# Parameters
learning_rate = 0.001
training_epochs = 20
batch_size = 100
display_step = 1
# Network Parameters
n_hidden_1 = 256 # 1st layer num features
n_hidden_2 = 256 # 2nd layer num features
n_input = 784 # MNIST data input (img shape: 28*28)
n_classes = 10 # MNIST total classes (0-9 digits)
# tf Graph input
x = tf.placeholder("float", [None, n_input])
y = tf.placeholder("float", [None, n_classes])
# weights layer 1
h = tf.Variable(tf.random_normal([n_input, n_hidden_1]))
# bias layer 1
bias_layer_1 = tf.Variable(tf.random_normal([n_hidden_1]))
# layer 1
layer_1 = tf.nn.sigmoid(tf.add(tf.matmul(x, h), bias_layer_1))
# weights layer 2
w = tf.Variable(tf.random_normal([n_hidden_1, n_hidden_2]))
# bias layer 2
bias_layer_2 = tf.Variable(tf.random_normal([n_hidden_2]))
# layer 2
layer_2 = tf.nn.sigmoid(tf.add(tf.matmul(layer_1, w), bias_layer_2))
# weights output layer
output = tf.Variable(tf.random_normal([n_hidden_2, n_classes]))
# bias output layer
bias_output = tf.Variable(tf.random_normal([n_classes]))
# output layer
output_layer = tf.matmul(layer_2, output) + bias_output
# cost function
cost = tf.reduce_mean\
       (tf.nn.softmax_cross_entropy_with_logits(output_layer, y))
# optimizer
optimizer = tf.train.AdamOptimizer\
            (learning_rate=learning_rate).minimize(cost)
# Plot settings
avg_set = []
epoch_set = []
# Initializing the variables
init = tf.initialize_all_variables()
# Launch the graph
with tf.Session() as sess:
    sess.run(init)
    # Training cycle
    for epoch in range(training_epochs):
        avg_cost = 0.
        total_batch = int(mnist.train.num_examples/batch_size)
        # Loop over all batches
        for i in range(total_batch):
            batch_xs, batch_ys = mnist.train.next_batch(batch_size)
            # Fit training using batch data
            sess.run(optimizer, feed_dict={x: batch_xs, y: batch_ys})
            # Compute average loss
            avg_cost += sess.run(cost,\
                                 feed_dict={x: batch_xs,\
                                            y: batch_ys})/total_batch
        # Display logs per epoch step
        if epoch % display_step == 0:
            print "Epoch:", '%04d' % (epoch+1),\
                  "cost=", "{:.9f}".format(avg_cost)
        avg_set.append(avg_cost)
        epoch_set.append(epoch+1)
    print "Training phase finished"
    plt.plot(epoch_set, avg_set, 'o', label='MLP Training phase')
    plt.ylabel('cost')
    plt.xlabel('epoch')
    plt.legend()
    plt.show()
    # Test model
    correct_prediction = tf.equal(tf.argmax(output_layer, 1),\
                                  tf.argmax(y, 1))
    # Calculate accuracy
    accuracy = tf.reduce_mean(tf.cast(correct_prediction, "float"))
    print "Model Accuracy:", accuracy.eval({x: mnist.test.images,\
                                            y: mnist.test.labels})
Multi Layer Perceptron function approximation
In the following example, we implement an MLP network that will be able to learn the trend of an arbitrary function f(x). In the training phase the network will have to learn from a known set of points, that is, x and f(x), while in the test phase the network will deduce the values of f(x) only from the x values.
This very simple network will be built with a single hidden layer.
Import the necessary libraries:
import tensorflow as tf
import numpy as np
import math, random
import matplotlib.pyplot as plt
We build the data model. The function to be learned will follow the trend of the cosine function, evaluated for 1000 points, to which we add a very little random error (noise) to reproduce a real case:
NUM_points = 1000
np.random.seed(NUM_points)
function_to_learn = lambda x: np.cos(x) +\
                    0.1*np.random.randn(*x.shape)
Our MLP network will be formed by a hidden layer of 10 neurons:
layer_1_neurons = 10
The network learns 100 points at a time, for a total of 1500 learning cycles (epochs):
batch_size = 100
NUM_EPOCHS = 1500
Finally, we construct the training set and the test set. The all_x tensor contains all the points:
all_x = np.float32(np.random.uniform\
                   (-2*math.pi, 2*math.pi,\
                    (1, NUM_points))).T
np.random.shuffle(all_x)
train_size = int(900)
The first 900 points are in the training set:
x_training = all_x[:train_size]
y_training = function_to_learn(x_training)
The last 100 will be in the validation set:
x_validation = all_x[train_size:]
y_validation = function_to_learn(x_validation)
Using matplotlib, we display these sets:
plt.figure(1)
plt.scatter(x_training, y_training, c='blue', label='train')
plt.scatter(x_validation, y_validation, c='red', label='validation')
plt.legend()
plt.show()
Training and validation set
Build the model
First, we create the placeholders for the input tensor (X) and the output tensor (Y):
X = tf.placeholder(tf.float32, [None, 1], name="X")
Y = tf.placeholder(tf.float32, [None, 1], name="Y")
Then we build the hidden layer of [1 x 10] dimensions:
w_h = tf.Variable(tf.random_uniform([1, layer_1_neurons],\
                                    minval=-1, maxval=1,\
                                    dtype=tf.float32))
b_h = tf.Variable(tf.zeros([1, layer_1_neurons],\
                           dtype=tf.float32))
It receives the input value from the X input tensor, combined with the w_hij weight connections and added to the respective biases of layer 1:
h = tf.nn.sigmoid(tf.matmul(X, w_h) + b_h)
The output layer is a [10 x 1] tensor:
w_o = tf.Variable(tf.random_uniform([layer_1_neurons, 1],\
                                    minval=-1, maxval=1,\
                                    dtype=tf.float32))
b_o = tf.Variable(tf.zeros([1, 1], dtype=tf.float32))
Each neuron in this second layer receives inputs from the neurons of layer 1, combined with the w_oij weight connections and added together with the respective biases of the output layer:
model = tf.matmul(h, w_o) + b_o
We then define our optimizer for the newly defined model:
train_op = tf.train.AdamOptimizer().minimize\
           (tf.nn.l2_loss(model - Y))
We also note that in this case, the cost function adopted is the following:
tf.nn.l2_loss(model - Y)
The tf.nn.l2_loss function is a TensorFlow function that computes half the L2 norm of a tensor, without the sqrt; that is, the output of the preceding expression is as follows:
output = sum((model - Y) ** 2) / 2
The tf.nn.l2_loss function can be a viable cost function for our example.
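That formula is easy to verify by hand. Below is a plain-Python stand-in for tf.nn.l2_loss applied to the residuals, on a small illustrative vector (toy values, not the network's outputs):

```python
def l2_loss(t):
    # Half the sum of squares, matching sum(t ** 2) / 2 above
    return sum(v * v for v in t) / 2.0

model_out = [1.0, 2.0, 0.5]   # hypothetical model outputs
targets   = [0.0, 2.0, 1.5]   # hypothetical target values
residuals = [m - y for m, y in zip(model_out, targets)]
print(l2_loss(residuals))  # -> 1.0  (squares 1 + 0 + 1, halved)
```

The factor of 1/2 is a convenience: it cancels when the loss is differentiated, leaving a gradient that is simply the residual.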
Launch the session
Let's build the evaluation graph:
sess = tf.Session()
sess.run(tf.initialize_all_variables())
Now we can launch the learning session:
errors = []
for i in range(NUM_EPOCHS):
    for start, end in zip(range(0, len(x_training), batch_size),\
                          range(batch_size,\
                                len(x_training), batch_size)):
        sess.run(train_op, feed_dict={X: x_training[start:end],\
                                      Y: y_training[start:end]})
    cost = sess.run(tf.nn.l2_loss(model - y_validation),\
                    feed_dict={X: x_validation})
    errors.append(cost)
    if i % 100 == 0: print "epoch %d, cost = %g" % (i, cost)
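The zip(range(...), range(...)) idiom in the inner loop pairs each batch's start index with its end index. A standalone sketch of how it slices the 900 training points into batches of 100:

```python
batch_size = 100
n_points = 900  # size of the training set in this example

starts = range(0, n_points, batch_size)        # 0, 100, ..., 800
ends = range(batch_size, n_points, batch_size)  # 100, 200, ..., 800
batches = list(zip(starts, ends))

print(batches[0])    # -> (0, 100)
print(batches[-1])   # -> (700, 800)
print(len(batches))  # -> 8
```

Note that because the second range also stops before 900, this pairing yields eight batches and the final slice (points 800 to 900) is never fed to the optimizer; with 900 points that costs one batch per epoch.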
Running this network for 1400 epochs, we'll see the error progressively reducing and eventually converging:
Python 2.7.10 (default, Oct 14 2015, 16:09:02) [GCC 5.2.1 20151010]
on linux2 Type "copyright", "credits" or "license()" for more
information.
>>> ======================= RESTART ============================
>>>
epoch 0, cost = 55.9286
epoch 100, cost = 22.0084
epoch 200, cost = 18.033
epoch 300, cost = 14.0481
epoch 400, cost = 9.74721
epoch 500, cost = 5.83419
epoch 600, cost = 3.05434
epoch 700, cost = 1.53706
epoch 800, cost = 0.91719
epoch 900, cost = 0.726675
epoch 1000, cost = 0.668316
epoch 1100, cost = 0.633737
epoch 1200, cost = 0.608306
epoch 1300, cost = 0.590429
epoch 1400, cost = 0.574602
>>>
The following lines of code allow us to display how the cost changes over the running epochs:
plt.plot(errors, label='MLP Function Approximation')
plt.xlabel('epochs')
plt.ylabel('cost')
plt.legend()
plt.show()
Training phase in Multi Layer Perceptron
Summary
In this chapter, we introduced artificial neural networks. An artificial neuron is a mathematical model that to some extent mimics the properties of a living neuron. Each neuron of the network has a very simple operation, which consists of becoming active if the total amount of signal that it receives exceeds its activation threshold. The learning process is typically supervised: the neural net uses a training set to infer the relationship between the input and the corresponding output, while the learning algorithm modifies the weights of the net in order to minimize a cost function that represents the forecast error relating to the training set. If the training is successful, the neural net will be able to make forecasts even where the output is not known a priori. In this chapter we implemented, using TensorFlow, some examples involving neural networks. We have seen neural nets used to solve classification and regression problems, such as the logistic regression algorithm in a classification problem using Rosenblatt's Perceptron. At the end of the chapter, we introduced the Multi Layer Perceptron architecture, which we have seen in action first in the implementation of an image classifier, then as a simulator of mathematical functions.
In the next chapter, we finally introduce deep learning models; we will examine and implement more complex neural network architectures, such as the convolutional neural network and the recurrent neural network.
Chapter 5. Deep Learning
In this chapter, we will cover the following topics:
Deep learning techniques
Convolutional neural network (CNN)
CNN architecture
TensorFlow implementation of a CNN
Recurrent neural network (RNN)
RNN architecture
Natural Language Processing with TensorFlow
Deep learning techniques
Deep learning techniques are a crucial step forward taken by machine learning researchers in recent decades, having provided successful results never seen before in many applications, such as image recognition and speech recognition.
There are several reasons that led to deep learning being developed and placed at the center of attention in the scope of machine learning. One of these reasons is represented by the progress in hardware, with the availability of new processors, such as graphics processing units (GPUs), which have greatly reduced the time needed for training networks, lowering it by 10 to 20 times.
Another reason is certainly the increasing ease of finding ever more numerous datasets on which to train a system, needed to train architectures of a certain depth and with high dimensionality of the input data. Deep learning consists of a set of methods that allow a system to obtain a hierarchical representation of the data on multiple levels. This is achieved by combining simple non-linear units, each of which transforms the representation at its own level, starting from the input level, into a representation at a higher, slightly more abstract level. With a sufficient number of these transformations, considerably complex input-output functions can be learned.
With reference to a classification problem, for example, the highest levels of representation highlight the aspects of the input data that are relevant for the classification, suppressing the ones that have no effect on the classification purposes.
Hierarchical feature extraction in an image classification system
The preceding scheme describes the features of the image classification system (a face recognizer): each block gradually extracts the features of the input image, processing data already pre-processed by the previous blocks, extracting increasingly complex features of the input image, and thus building the hierarchical data representation that characterizes a deep learning-based system.
A possible representation of the features of the hierarchy could be as follows:
pixel --> edge --> texture --> motif --> part --> object
In a text recognition problem, however, the hierarchical representation can be structured as follows:
character --> word --> word group --> clause --> sentence --> story
A deep learning architecture is, therefore, a multi-level architecture consisting of simple units, all subject to training, many of which carry out non-linear transformations. Each unit transforms its input to improve its ability to select and amplify only the aspects relevant for classification purposes, and its invariance, namely its propensity to ignore the irrelevant and negligible aspects.
With multiple levels of non-linear transformations, therefore, with a depth approximately between 5 and 20 levels, a deep learning system can learn and implement extremely intricate and complex functions, simultaneously very sensitive to the smallest relevant details and extremely insensitive and indifferent to large variations of irrelevant aspects of the input data, which can be, in the case of object recognition: the image's background, brightness, or the position of the represented object.
The following sections will illustrate, with the aid of TensorFlow, two important types of deep neural networks: the convolutional neural networks (CNNs), mainly addressed to classification problems, and then the recurrent neural networks (RNNs), targeting Natural Language Processing (NLP) issues.
Convolutional neural networks
Convolutional neural networks (CNNs) are a particular type of deep-learning-oriented neural network that have achieved excellent results in many practical applications, in particular object recognition in images.
In fact, CNNs are designed to process data represented in the form of multiple arrays, for example, color images, representable by means of three two-dimensional arrays containing the pixels' color intensities. The substantial difference between CNNs and ordinary neural networks is that the former operate directly on the images, while the latter operate on features extracted from them. The input of a CNN, therefore, unlike that of an ordinary neural network, will be two-dimensional, and the features will be the pixels of the input image.
The CNN is the dominant approach for almost all recognition problems. The spectacular performance offered by networks of this type has in fact prompted the biggest technology companies, such as Google and Facebook, to invest in research and development projects for networks of this kind, and to develop and distribute image recognition products based on CNNs.
CNN architecture
CNNs use three basic ideas: local receptive fields, convolution, and pooling.
In convolutional networks, we consider the input as something similar to what is shown in the following figure:
Input neurons
One of the concepts behind CNNs is local connectivity. CNNs, in fact, utilize spatial correlations that may exist within the input data. Each neuron of the first subsequent layer connects to only some of the input neurons. This region is called the local receptive field. In the following figure, it is represented by the black 5x5 square that converges to a hidden neuron:
From input to hidden neurons
The hidden neuron, of course, will only process the input data inside its receptive field, not registering changes outside of it. However, it is easy to see that, by superimposing several locally connected layers, moving up through the levels you will have units that process more and more global data compared to the input, in accordance with the basic principle of deep learning: bringing the representation to an ever-growing level of abstraction.
Note
The reason for the local connectivity resides in the fact that in data in array form, such as images, the values are often highly correlated, forming distinct groups of data that can be easily identified.
Each connection learns a weight (so there will be 5x5 = 25 of them), while the hidden neuron also learns an associated total bias; then we are going to connect the regions to individual neurons by performing a shift each time, as in the following figures:
The convolution operation
This operation is called convolution. Doing so, if we have an image of 28x28 inputs and 5x5 regions, we will get 24x24 neurons in the hidden layer. We said that each neuron has a bias and 5x5 weights connected to the region: we will use these weights and biases for all 24x24 neurons. This means that all the neurons in the first hidden layer will recognize the same features, just placed differently in the input image. For this reason, the map of connections from the input layer to the hidden feature map is called shared weights, and the bias is called shared bias, since they are in fact shared.
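The 28x28-inputs-with-5x5-regions arithmetic generalizes: with no padding and a stride of 1, an n x n input convolved with a k x k receptive field yields (n - k + 1) x (n - k + 1) hidden neurons. A quick sketch of this sizing rule:

```python
def conv_output_size(input_size, field_size, stride=1):
    # Valid (no-padding) convolution: number of positions the
    # receptive field can occupy along one axis
    return (input_size - field_size) // stride + 1

print(conv_output_size(28, 5))  # -> 24, as in the example above
print(conv_output_size(28, 5, 2))  # with stride 2 the map shrinks further
```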
Obviously, we need to recognize an image with more than one map of features, so a complete convolutional layer is made from multiple feature maps.
Multiple feature maps
In the preceding figure, we see three feature maps; of course, their number can increase in practice, and you can get to use convolutional layers with even 20 or 40 feature maps. A great advantage of the sharing of weights and biases is the significant reduction of the parameters involved in a convolutional network. Considering our example, for each feature map we need 25 weights (5x5) and a (shared) bias; that is, 26 parameters in total. Assuming we have 20 feature maps, we will have 520 parameters to be defined. With a fully connected network, with 784 input neurons and, for example, 30 hidden layer neurons, we need 784x30 weights plus 30 biases, reaching a total of 23,550 parameters.
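The parameter counts quoted above can be reproduced directly: 5x5 shared weights plus one shared bias per feature map, versus one weight per input-hidden pair plus one bias per hidden neuron in the fully connected case:

```python
# Convolutional layer: each feature map shares 5x5 weights and 1 bias
field = 5
params_per_map = field * field + 1            # 26
feature_maps = 20
conv_params = feature_maps * params_per_map

# Fully connected layer: 784 inputs to 30 hidden neurons, plus 30 biases
fc_params = 784 * 30 + 30

print(conv_params)  # -> 520
print(fc_params)    # -> 23550
```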
The difference is evident. The convolutional networks also use pooling layers, which are layers positioned immediately after the convolutional layers; these simplify the output information of the previous (convolutional) layer. A pooling layer takes the input feature maps coming out of the convolutional layer and prepares a condensed feature map. For example, we can say that each of its units could summarize a 2x2 region of neurons of the previous layer.
This technique is called pooling and can be summarized with the following scheme:
The pooling operation helps to simplify the information from one layer to the next
Obviously, we usually have more feature maps, and we apply the max pooling to each of them separately.
From the input layer to the second hidden layer
So we have three feature maps of size 24x24 for the first hidden layer, and the second hidden layer will be of size 12x12, since we are assuming that every unit summarizes a 2x2 region.
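The 24x24-to-12x12 reduction follows from summarizing non-overlapping 2x2 regions. A sketch of max pooling on one small 4x4 feature map (toy values, not MNIST activations), which halves each dimension just as 24 becomes 12:

```python
def max_pool_2x2(fmap):
    # Summarize each non-overlapping 2x2 region by its maximum
    n = len(fmap)
    return [[max(fmap[i][j], fmap[i][j+1],
                 fmap[i+1][j], fmap[i+1][j+1])
             for j in range(0, n, 2)]
            for i in range(0, n, 2)]

fmap = [[1, 3, 2, 0],
        [4, 2, 1, 1],
        [0, 1, 5, 6],
        [2, 2, 7, 1]]
pooled = max_pool_2x2(fmap)
print(pooled)  # -> [[4, 2], [2, 7]]
```

Each pooled unit reports only whether (and how strongly) a feature was found somewhere in its 2x2 region, discarding its exact position.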
Combining these three ideas, we form a complete convolutional network. Its architecture can be displayed as follows:
A CNN architectural schema
Let's summarize: there are the 28x28 input neurons, followed by a convolutional layer with a 5x5 local receptive field and 3 feature maps. We obtain as a result a hidden layer of 3x24x24 neurons. Then 2x2 max pooling is applied to the 3 feature maps, giving a hidden layer of 3x12x12. The last layer is fully connected: it connects all the neurons of the max-pooling layer to all 10 output neurons, useful to recognize the corresponding output.
This network will then be trained by gradient descent and the backpropagation algorithm.
TensorFlow implementation of a CNN
In the following example, we will see the CNN in action on a problem of image classification. We want to show the process of building a CNN network: what the steps to execute are, what reasoning needs to be done to run a proper dimensioning of the entire network, and of course how to implement it with TensorFlow.
Initialization step
1. Load and prepare the MNIST data:
import tensorflow as tf
import input_data
mnist = input_data.read_data_sets("/tmp/data/", one_hot=True)
2. Define all the CNN parameters:
learning_rate = 0.001
training_iters = 100000
batch_size = 128
display_step = 10
3. MNIST data input (each image is a 28x28 array of pixels):
n_input = 784
4. The MNIST total classes (0-9 digits):
n_classes = 10
5. To reduce overfitting, we apply the dropout technique. This term refers to dropping out units (hidden, input, and output) in a neural network. Deciding which neurons to eliminate is random; one way is to apply a probability, as we shall see in our code. For this reason, we define the following parameter (to be tuned):
dropout = 0.75
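The dropout mechanism can be sketched in plain Python: each unit's output is kept with probability keep_prob and zeroed otherwise, and kept values are scaled by 1/keep_prob so the expected total activation is unchanged. This is a toy illustration of the idea, not tf.nn.dropout itself:

```python
import random

def dropout(values, keep_prob, rng):
    # Zero each value with probability 1 - keep_prob; scale survivors
    # by 1/keep_prob so the expected sum stays the same
    return [v / keep_prob if rng.random() < keep_prob else 0.0
            for v in values]

rng = random.Random(0)  # seeded for reproducibility
out = dropout([1.0] * 1000, 0.75, rng)
kept = sum(1 for v in out if v != 0.0)
print(kept)  # roughly 750 of the 1000 units survive
```

At test time no units are dropped, which is why keep_prob is fed as 1.0 during evaluation.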
6. Define the placeholders for the input graph. The x placeholder contains the MNIST data input (exactly 784 pixels):
x = tf.placeholder(tf.float32, [None, n_input])
7. Then we reshape the input images to a 4D tensor, using the TensorFlow reshape operator:
_X = tf.reshape(x, shape=[-1, 28, 28, 1])
The second and third dimensions correspond to the width and height of the image, while the last dimension is the total number of color channels (in our case, 1).
So we can display our input image as a two-dimensional tensor, of size 28x28:
The input tensor for our problem
The output tensor will contain the output probability for each digit to classify:
y = tf.placeholder(tf.float32, [None, n_classes])
First convolutional layer
Each neuron of the hidden layer is connected to a small subset of the input tensor, of dimension 5x5. This implies that the hidden layer will have a 24x24 size. We also define and initialize the tensors of shared weights and shared bias:
wc1 = tf.Variable(tf.random_normal([5, 5, 1, 32]))
bc1 = tf.Variable(tf.random_normal([32]))
Recall that in order to recognize an image, we need more than one map of features. The number is just the number of feature maps we are considering for this first layer. In our case, the convolutional layer is composed of 32 feature maps.
The next step is the construction of the first convolution layer, conv1:
conv1 = conv2d(_X, wc1, bc1)
Here, conv2d is the following function:
def conv2d(img, w, b):
    return tf.nn.relu(tf.nn.bias_add\
                      (tf.nn.conv2d(img, w,\
                                    strides=[1, 1, 1, 1],\
                                    padding='SAME'), b))
For this purpose, we used the TensorFlow tf.nn.conv2d function. It computes a 2D convolution from the input tensor and the tensor of shared weights. The result of this operation is then added to the bc1 bias matrix, while tf.nn.relu is the ReLU (Rectified Linear Unit) function, the usual activation function in the hidden layers of a deep neural network.
We apply this activation function to the return value of the convolution function. The padding value is 'SAME', which indicates that the output tensor will have the same size as the input tensor.
One way to represent the convolutional layer, namely conv1, is as follows:
The first hidden layer
After the convolution operation, we apply the pooling step, which simplifies the output information of the previously created convolutional layer.
In our example, we take a 2x2 region of the convolution layer and summarize the information at each point in the pooling layer:
conv1 = max_pool(conv1, k=2)
Here, for the pooling operation, we have implemented the following function:
def max_pool(img, k):
    return tf.nn.max_pool(img,
                          ksize=[1, k, k, 1],
                          strides=[1, k, k, 1],
                          padding='SAME')
The tf.nn.max_pool function performs max pooling on the input. Of course, we apply max pooling for each convolutional layer, and there will be many layers of pooling and convolution. At the end of this pooling phase, we'll have a 14x14x32 convolutional hidden layer.
The next figure shows the CNN layers after the pooling and convolution operations:
The CNN after the first convolution and pooling operations
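The layer sizes quoted in this chapter can be checked with a short sketch. Assuming stride-1 'SAME' convolutions (output size = ceil(input/stride)) and 2x2 'SAME' max pooling with stride 2:

```python
import math

def same_conv(size, stride=1):
    # 'SAME' padding keeps the spatial size at ceil(in / stride).
    return math.ceil(size / stride)

def pool(size, k=2):
    # 2x2 'SAME' max pool with stride k halves each dimension.
    return math.ceil(size / k)

h = 28                   # MNIST input is 28x28
h = pool(same_conv(h))   # first conv + pool -> 14 (14x14x32 maps)
print(h)
h2 = pool(same_conv(h))  # second conv + pool -> 7 (7x7x64 maps)
print(h2)                # matches the 7*7*64 densely connected input
```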
The last operation reduces overfitting by applying the tf.nn.dropout TensorFlow operator on the convolutional layer. To do this, we create a placeholder for the probability (keep_prob) that a neuron's output is kept during dropout:
keep_prob = tf.placeholder(tf.float32)
conv1 = tf.nn.dropout(conv1, keep_prob)
Second convolutional layer
For the second hidden layer, we must apply the same operations as in the first layer, and so we define and initialize the tensors of shared weights and shared bias:
wc2 = tf.Variable(tf.random_normal([5, 5, 32, 64]))
bc2 = tf.Variable(tf.random_normal([64]))
As you can see, this second hidden layer will have 64 features for a 5x5 window, while the number of input channels is given by the first convolutional layer. We next apply a second layer to the conv1 tensor, but this time we apply 64 sets of 5x5 filters, each to the 32 conv1 feature maps:
conv2 = conv2d(conv1, wc2, bc2)
It gives us 64 14x14 arrays, which we reduce with max pooling to 64 7x7 arrays:
conv2 = max_pool(conv2, k=2)
Finally, we again use the dropout operation:
conv2 = tf.nn.dropout(conv2, keep_prob)
The resulting layer is a 7x7x64 tensor: the 14x14 maps from the first pooling stage keep their size under the stride-1 'SAME' convolution and are then halved by the 2x2 pooling.
Building the second hidden layer
Densely connected layer
In this step, we build a densely connected layer that we use to process the entire image. The weight and bias tensors are as follows:
wd1 = tf.Variable(tf.random_normal([7*7*64, 1024]))
bd1 = tf.Variable(tf.random_normal([1024]))
As you can see, this layer will be formed by 1024 neurons.
Then we reshape the tensor from the second convolutional layer into a batch of vectors:
dense1 = tf.reshape(conv2, [-1, wd1.get_shape().as_list()[0]])
We multiply this tensor by the weight matrix, wd1, add the bias tensor, bd1, and apply a ReLU operation:
dense1 = tf.nn.relu(tf.add(tf.matmul(dense1, wd1), bd1))
We complete this layer by again using the dropout operator:
dense1 = tf.nn.dropout(dense1, keep_prob)
Readout layer
The last layer defines the tensors wout and bout:
wout = tf.Variable(tf.random_normal([1024, n_classes]))
bout = tf.Variable(tf.random_normal([n_classes]))
Before applying the softmax function, we must calculate the evidence that the image belongs to a certain class:
pred = tf.add(tf.matmul(dense1, wout), bout)
Testing and training the model
The evidence must be converted into probabilities for each of the 10 possible classes (the method is identical to what we saw in Chapter 4, Introducing Neural Networks). So we define the cost function, which evaluates the quality of our model, by applying the softmax function:
cost = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(pred, y))
We optimize this function using the TensorFlow AdamOptimizer function:
optimizer = tf.train.AdamOptimizer(learning_rate=learning_rate).minimize(cost)
The following tensors will serve in the evaluation phase of the model:
correct_pred = tf.equal(tf.argmax(pred, 1), tf.argmax(y, 1))
accuracy = tf.reduce_mean(tf.cast(correct_pred, tf.float32))
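What the correct_pred and accuracy tensors compute can be sketched with NumPy on made-up data: take the argmax over class scores, compare it with the argmax of the one-hot labels, and average the resulting 0/1 correctness values.

```python
import numpy as np

# Three illustrative predictions over three classes (made-up data).
pred = np.array([[0.1, 0.8, 0.1],    # predicted class 1
                 [0.6, 0.3, 0.1],    # predicted class 0
                 [0.2, 0.2, 0.6]])   # predicted class 2
labels = np.array([[0, 1, 0],        # true class 1 (correct)
                   [0, 1, 0],        # true class 1 (wrong)
                   [0, 0, 1]])       # true class 2 (correct)

correct = np.argmax(pred, 1) == np.argmax(labels, 1)
accuracy = np.mean(correct.astype(np.float32))
print(accuracy)   # 2 of 3 correct
```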
Launching the session
Initialize the variables:
init = tf.initialize_all_variables()
Build the evaluation graph:
with tf.Session() as sess:
    sess.run(init)
    step = 1
Let's train the net until training_iters:
    while step * batch_size < training_iters:
        batch_xs, batch_ys = mnist.train.next_batch(batch_size)
Fit the training using the batch data:
        sess.run(optimizer, feed_dict={x: batch_xs,
                                       y: batch_ys,
                                       keep_prob: dropout})
        if step % display_step == 0:
Calculate the accuracy:
            acc = sess.run(accuracy, feed_dict={x: batch_xs,
                                                y: batch_ys,
                                                keep_prob: 1.})
Calculate the loss:
            loss = sess.run(cost, feed_dict={x: batch_xs,
                                             y: batch_ys,
                                             keep_prob: 1.})
            print "Iter " + str(step * batch_size) + \
                  ", Minibatch Loss= " + \
                  "{:.6f}".format(loss) + \
                  ", Training Accuracy= " + \
                  "{:.5f}".format(acc)
        step += 1
    print "Optimization Finished!"
We print the accuracy for the first 256 MNIST test images:
    print "Testing Accuracy:", \
        sess.run(accuracy,
                 feed_dict={x: mnist.test.images[:256],
                            y: mnist.test.labels[:256],
                            keep_prob: 1.})
Running the code, we have the following output:
Extracting /tmp/data/train-images-idx3-ubyte.gz
Extracting /tmp/data/train-labels-idx1-ubyte.gz
Extracting /tmp/data/t10k-images-idx3-ubyte.gz
Extracting /tmp/data/t10k-labels-idx1-ubyte.gz
Iter 1280, Minibatch Loss= 27900.769531, Training Accuracy= 0.17188
Iter 2560, Minibatch Loss= 17168.949219, Training Accuracy= 0.21094
Iter 3840, Minibatch Loss= 15000.724609, Training Accuracy= 0.41406
Iter 5120, Minibatch Loss= 8000.896484, Training Accuracy= 0.49219
Iter 6400, Minibatch Loss= 4587.275391, Training Accuracy= 0.61719
Iter 7680, Minibatch Loss= 5949.988281, Training Accuracy= 0.69531
Iter 8960, Minibatch Loss= 4932.690430, Training Accuracy= 0.70312
Iter 10240, Minibatch Loss= 5066.223633, Training Accuracy= 0.70312
...................
Iter 81920, Minibatch Loss= 442.895020, Training Accuracy= 0.93750
Iter 83200, Minibatch Loss= 273.936676, Training Accuracy= 0.93750
Iter 84480, Minibatch Loss= 1169.810303, Training Accuracy= 0.89062
Iter 85760, Minibatch Loss= 737.561157, Training Accuracy= 0.90625
Iter 87040, Minibatch Loss= 583.576965, Training Accuracy= 0.89844
Iter 88320, Minibatch Loss= 375.274475, Training Accuracy= 0.93750
Iter 89600, Minibatch Loss= 183.815613, Training Accuracy= 0.94531
Iter 90880, Minibatch Loss= 410.157867, Training Accuracy= 0.89844
Iter 92160, Minibatch Loss= 895.187683, Training Accuracy= 0.84375
Iter 93440, Minibatch Loss= 819.893555, Training Accuracy= 0.89062
Iter 94720, Minibatch Loss= 460.179779, Training Accuracy= 0.90625
Iter 96000, Minibatch Loss= 514.344482, Training Accuracy= 0.87500
Iter 97280, Minibatch Loss= 507.836975, Training Accuracy= 0.89844
Iter 98560, Minibatch Loss= 353.565735, Training Accuracy= 0.92188
Iter 99840, Minibatch Loss= 195.138626, Training Accuracy= 0.93750
Optimization Finished!
Testing Accuracy: 0.921875
It provides an accuracy of about 92%. Obviously, this does not represent the state of the art, because the purpose of the example is just to show how to build a CNN. The model can be further refined to give better results.
Source code
# Import MNIST data
import input_data
mnist = input_data.read_data_sets("/tmp/data/", one_hot=True)
import tensorflow as tf
# Parameters
learning_rate = 0.001
training_iters = 100000
batch_size = 128
display_step = 10
# Network Parameters
n_input = 784  # MNIST data input (img shape: 28*28)
n_classes = 10  # MNIST total classes (0-9 digits)
dropout = 0.75  # Dropout, probability to keep units
# tf Graph input
x = tf.placeholder(tf.float32, [None, n_input])
y = tf.placeholder(tf.float32, [None, n_classes])
# dropout (keep probability)
keep_prob = tf.placeholder(tf.float32)
# Create model
def conv2d(img, w, b):
    return tf.nn.relu(tf.nn.bias_add(
        tf.nn.conv2d(img, w,
                     strides=[1, 1, 1, 1],
                     padding='SAME'), b))
def max_pool(img, k):
    return tf.nn.max_pool(img,
                          ksize=[1, k, k, 1],
                          strides=[1, k, k, 1],
                          padding='SAME')
# Store layers weight & bias
# 5x5 conv, 1 input, 32 outputs
wc1 = tf.Variable(tf.random_normal([5, 5, 1, 32]))
bc1 = tf.Variable(tf.random_normal([32]))
# 5x5 conv, 32 inputs, 64 outputs
wc2 = tf.Variable(tf.random_normal([5, 5, 32, 64]))
bc2 = tf.Variable(tf.random_normal([64]))
# fully connected, 7*7*64 inputs, 1024 outputs
wd1 = tf.Variable(tf.random_normal([7*7*64, 1024]))
# 1024 inputs, 10 outputs (class prediction)
wout = tf.Variable(tf.random_normal([1024, n_classes]))
bd1 = tf.Variable(tf.random_normal([1024]))
bout = tf.Variable(tf.random_normal([n_classes]))
# Construct model
_X = tf.reshape(x, shape=[-1, 28, 28, 1])
# Convolution Layer
conv1 = conv2d(_X, wc1, bc1)
# Max Pooling (down-sampling)
conv1 = max_pool(conv1, k=2)
# Apply Dropout
conv1 = tf.nn.dropout(conv1, keep_prob)
# Convolution Layer
conv2 = conv2d(conv1, wc2, bc2)
# Max Pooling (down-sampling)
conv2 = max_pool(conv2, k=2)
# Apply Dropout
conv2 = tf.nn.dropout(conv2, keep_prob)
# Fully connected layer
# Reshape conv2 output to fit dense layer input
dense1 = tf.reshape(conv2, [-1, wd1.get_shape().as_list()[0]])
# Relu activation
dense1 = tf.nn.relu(tf.add(tf.matmul(dense1, wd1), bd1))
# Apply Dropout
dense1 = tf.nn.dropout(dense1, keep_prob)
# Output, class prediction
pred = tf.add(tf.matmul(dense1, wout), bout)
# Define loss and optimizer
cost = tf.reduce_mean(
    tf.nn.softmax_cross_entropy_with_logits(pred, y))
optimizer = \
    tf.train.AdamOptimizer(
        learning_rate=learning_rate).minimize(cost)
# Evaluate model
correct_pred = tf.equal(tf.argmax(pred, 1), tf.argmax(y, 1))
accuracy = tf.reduce_mean(tf.cast(correct_pred, tf.float32))
# Initializing the variables
init = tf.initialize_all_variables()
# Launch the graph
with tf.Session() as sess:
    sess.run(init)
    step = 1
    # Keep training until reach max iterations
    while step * batch_size < training_iters:
        batch_xs, batch_ys = mnist.train.next_batch(batch_size)
        # Fit training using batch data
        sess.run(optimizer, feed_dict={x: batch_xs,
                                       y: batch_ys,
                                       keep_prob: dropout})
        if step % display_step == 0:
            # Calculate batch accuracy
            acc = sess.run(accuracy, feed_dict={x: batch_xs,
                                                y: batch_ys,
                                                keep_prob: 1.})
            # Calculate batch loss
            loss = sess.run(cost, feed_dict={x: batch_xs,
                                             y: batch_ys,
                                             keep_prob: 1.})
            print "Iter " + str(step * batch_size) + \
                  ", Minibatch Loss= " + \
                  "{:.6f}".format(loss) + \
                  ", Training Accuracy= " + \
                  "{:.5f}".format(acc)
        step += 1
    print "Optimization Finished!"
    # Calculate accuracy for 256 mnist test images
    print "Testing Accuracy:", \
        sess.run(accuracy,
                 feed_dict={x: mnist.test.images[:256],
                            y: mnist.test.labels[:256],
                            keep_prob: 1.})
Recurrent neural networks
Another deep learning-oriented architecture is that of the so-called recurrent neural networks (RNNs). The basic idea of RNNs is to make use of the sequential nature of the input. In feedforward neural networks, we typically assume that each input and output is independent of all the others. For many types of problems, however, this assumption does not hold. For example, if you want to predict the next word of a phrase, it is certainly important to know the words that precede it. These neural nets are called recurrent because they perform the same computation for every element of a sequence of inputs, and the output for each element depends, in addition to the current input, on all previous computations.
RNN architecture
RNNs process a sequential input one item at a time, maintaining a sort of updated state vector that contains information about all past elements of the sequence. In general, an RNN has a shape of the following type:
RNN architecture schema
The preceding figure shows an RNN together with its unfolded version, which makes the network structure explicit for the whole sequence of inputs, at each instant of time. It becomes clear that, differently from the typical multi-level neural networks, which use different parameters at each level, an RNN always uses the same parameters, denominated U, V, and W (see the previous figure). Furthermore, an RNN performs the same computation at each instant, on the successive elements of the same input sequence. Sharing the same parameters strongly reduces the number of parameters that the network must learn during the training phase, thus also improving the training time.
It is also evident how you can train networks of this type: in fact, because the parameters are shared across instants of time, the gradient calculated for each output depends not only on the current computation but also on the previous ones. For example, to calculate the gradient at time t = 4, it is necessary to backpropagate the gradient through the three previous instants of time and then sum the gradients thus obtained. Also, the entire input sequence is typically considered to be a single element of the training set.
However, the training of this type of network suffers from the so-called vanishing/exploding gradient problem: the gradients, computed and backpropagated, tend to increase or decrease at each instant of time and then, after a certain number of instants of time, diverge to infinity or converge to zero.
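The effect can be sketched numerically. Backpropagation through time repeatedly multiplies the gradient by (roughly) the recurrent weight matrix, so over T steps the gradient norm scales like the T-th power of the matrix norm. The matrices below are toy placeholders, not a trained network:

```python
import numpy as np

g_norms = {}
for scale in (0.5, 1.5):        # contracting vs expanding recurrent matrix
    W = scale * np.eye(4)       # toy recurrent weight matrix
    g = np.ones(4)              # gradient arriving at the last time step
    for _ in range(20):         # 20 steps of backpropagation through time
        g = W.T @ g             # one multiplication per time step
    g_norms[scale] = np.linalg.norm(g)
    print(scale, g_norms[scale])   # tiny for 0.5 (vanishing), huge for 1.5
```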
Let us now examine how an RNN operates. Xt is the network input at instant t, which could be, for example, a vector that represents a word of a sentence, while St is the state vector of the net. It can be considered a sort of memory of the system, which contains information on all the previous elements of the input sequence. The state vector at instant t is evaluated starting from the current input (time t) and the state evaluated at the previous instant (time t-1), through the U and W parameters:
St = f(U*Xt + W*St-1)
The function f is a nonlinear function such as the rectified linear unit (ReLU), while Ot is the output at instant t, calculated using the parameter V.
The output will depend on the type of problem for which the network is used. For example, if you want to predict the next word of a sentence, it could be a probability vector over each word in the vocabulary of the system.
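The recurrence above can be written out as a minimal NumPy sketch. The sizes (input dimension 3, state dimension 4, output dimension 2) and the random weights are illustrative only; the names U, W, and V follow the figure, and tanh stands in for the nonlinearity f:

```python
import numpy as np

rng = np.random.default_rng(0)
U = rng.standard_normal((4, 3)) * 0.1   # input-to-state weights
W = rng.standard_normal((4, 4)) * 0.1   # state-to-state weights
V = rng.standard_normal((2, 4)) * 0.1   # state-to-output weights

S = np.zeros(4)                          # initial state S_0
for x_t in rng.standard_normal((5, 3)):  # a sequence of 5 inputs
    S = np.tanh(U @ x_t + W @ S)         # S_t = f(U x_t + W S_{t-1})
O = V @ S                                # output at the last instant
print(S.shape, O.shape)
```

Note that the same U, W, and V are reused at every step: this is the parameter sharing discussed above.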
LSTM networks
Long Short-Term Memory (LSTM) networks are an extension of the basic RNN architecture. The main idea is to improve the network by providing it with an explicit memory. LSTM networks, in fact, despite not having an essentially different architecture from RNNs, are equipped with special hidden units, called memory cells, whose behavior is to remember the previous input for a long time.
A LSTM unit
The LSTM unit has three gates and four input weights, xt (from the data to the input and the three gates), while ht is the output of the unit.
An LSTM block contains gates that determine whether an input is significant enough to be saved. This block is formed by four units:
Input gate: Allows the value to enter the structure
Forget gate: Eliminates the values contained in the structure
Output gate: Determines when the unit will output the values trapped in the structure
Cell: Enables or disables the memory cell
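A single step of the standard LSTM formulation (a sketch on made-up sizes and random placeholder weights, not the book's TensorFlow code) shows how the three gates and the candidate update combine into the memory cell c_t and the unit output h_t:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)
n_in, n_hid = 3, 4
Wx = rng.standard_normal((4 * n_hid, n_in)) * 0.1   # four stacked weight blocks
Wh = rng.standard_normal((4 * n_hid, n_hid)) * 0.1
b = np.zeros(4 * n_hid)

x_t = rng.standard_normal(n_in)                     # current input
h_prev, c_prev = np.zeros(n_hid), np.zeros(n_hid)   # previous output/cell

z = Wx @ x_t + Wh @ h_prev + b
i, f, o, g = np.split(z, 4)                          # input/forget/output gates, candidate
c_t = sigmoid(f) * c_prev + sigmoid(i) * np.tanh(g)  # gated memory cell update
h_t = sigmoid(o) * np.tanh(c_t)                      # gated unit output
print(h_t.shape)
```

The forget gate is what lets the cell retain (or discard) information over many steps, which is how LSTMs mitigate the vanishing gradient problem described earlier.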
In the next example, we will see a TensorFlow implementation of an LSTM network in a language processing problem.
NLP with TensorFlow
RNNs have proved to have excellent performance in problems such as predicting the next character in a text or, similarly, predicting the next word in a sentence. However, they are also used for more complex problems, such as machine translation. In this case, the network takes as input a sequence of words in a source language and outputs the corresponding sequence of words in a target language. Finally, another application of great importance in which RNNs are widely used is speech recognition. In the following, we will develop a computational model that can predict the next word in a text based on the sequence of the preceding words. To measure the accuracy of the model, we will use the Penn Tree Bank (PTB) dataset, which is the benchmark used to measure the precision of these models.
This example refers to the files that you find in the /rnn/ptb directory of your TensorFlow distribution. It comprises the following two files:
ptb_word_lm.py: The code to train a language model on the PTB dataset
reader.py: The code to read the dataset
Unlike previous examples, we will present only the pseudocode of the implemented procedure, in order to understand the main ideas behind the construction of the model without getting bogged down in unnecessary implementation details. The source code is quite long, and a line-by-line explanation of the code would be too cumbersome.
Note
See https://www.tensorflow.org/versions/r0.8/tutorials/recurrent/index.html for other references.
Download the data
You can download the data from the web page http://www.fit.vutbr.cz/~imikolov/rnnlm/simple-examples.tgz and then extract the data folder. The dataset is preprocessed and contains 10,000 different words, including the end-of-sentence marker and a special symbol (<unk>) for rare words. In reader.py, we convert all of them to unique integer identifiers to make it easy for the neural network to process.
To extract a .tgz file with tar, you need to use the following:
tar -xvzf /path/to/yourfile.tgz
Building the model
This model implements an RNN architecture using LSTM. In fact, it extends the RNN architecture by including storage units that allow saving information regarding long-term temporal dependencies.
The TensorFlow library allows you to create an LSTM through the following command:
lstm = rnn_cell.BasicLSTMCell(size)
Here, size is the number of units in the LSTM. The LSTM memory is initialized to zero:
state = tf.zeros([batch_size, lstm.state_size])
In the course of the computation, after each word examined the state value is updated with the output value. The following is the pseudocode list of the implemented steps:
loss = 0.0
for current_batch_of_words in words_in_dataset:
    output, state = lstm(current_batch_of_words, state)
The output is then used to make a prediction of the next word:
    logits = tf.matmul(output, softmax_w) + softmax_b
    probabilities = tf.nn.softmax(logits)
    loss += loss_function(probabilities, target_words)
The loss function minimizes the average negative log probability of the target words; it is the TensorFlow function:
tf.nn.seq2seq.sequence_loss_by_example
It computes the average per-word perplexity; this value measures the accuracy of the model (lower values correspond to better performance) and will be monitored throughout the training process.
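Per-word perplexity is simply the exponential of the average negative log probability the model assigns to each target word; a model that always assigned probability 1 would have perplexity 1. A sketch on made-up probabilities:

```python
import numpy as np

# p(word_t | history) as assigned by the model to each target word
# in a toy 4-word sequence (illustrative values only).
target_probs = np.array([0.2, 0.1, 0.5, 0.25])

# perplexity = exp(mean negative log probability); lower is better.
perplexity = np.exp(-np.mean(np.log(target_probs)))
print(perplexity)
```

This is why the training log below reports steadily decreasing perplexity values as the model improves.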
Running the code
The implemented model supports three types of configurations: small, medium, and large. The difference between them lies in the size of the LSTMs and the set of hyperparameters used for training. The larger the model, the better the results it should get. The small model should be able to reach a perplexity below 120 on the test set and the large one below 80, though it might take several hours to train.
To execute the model, simply type the following:
python ptb_word_lm.py --data_path=/tmp/simple-examples/data/ --model small
In /tmp/simple-examples/data/, you must have downloaded the data from the PTB dataset.
The following list shows the run after 8 hours of training (13 epochs for a small configuration):
Epoch: 1 Learning rate: 1.000
0.004 perplexity: 5263.762 speed: 391 wps
0.104 perplexity: 837.607 speed: 429 wps
0.204 perplexity: 617.207 speed: 442 wps
0.304 perplexity: 498.160 speed: 438 wps
0.404 perplexity: 430.516 speed: 436 wps
0.504 perplexity: 386.339 speed: 427 wps
0.604 perplexity: 348.393 speed: 431 wps
0.703 perplexity: 322.351 speed: 432 wps
0.803 perplexity: 301.630 speed: 431 wps
0.903 perplexity: 282.417 speed: 434 wps
Epoch: 1 Train Perplexity: 268.124
Epoch: 1 Valid Perplexity: 180.210
Epoch: 2 Learning rate: 1.000
0.004 perplexity: 209.082 speed: 448 wps
0.104 perplexity: 150.589 speed: 437 wps
0.204 perplexity: 157.965 speed: 436 wps
0.304 perplexity: 152.896 speed: 453 wps
0.404 perplexity: 150.299 speed: 458 wps
0.504 perplexity: 147.984 speed: 462 wps
0.604 perplexity: 143.367 speed: 462 wps
0.703 perplexity: 141.246 speed: 446 wps
0.803 perplexity: 139.299 speed: 436 wps
0.903 perplexity: 135.632 speed: 435 wps
Epoch: 2 Train Perplexity: 133.576
Epoch: 2 Valid Perplexity: 143.072
............................................................
Epoch: 12 Learning rate: 0.008
0.004 perplexity: 57.011 speed: 347 wps
0.104 perplexity: 41.305 speed: 356 wps
0.204 perplexity: 45.136 speed: 356 wps
0.304 perplexity: 43.386 speed: 357 wps
0.404 perplexity: 42.624 speed: 358 wps
0.504 perplexity: 41.980 speed: 358 wps
0.604 perplexity: 40.549 speed: 357 wps
0.703 perplexity: 39.943 speed: 357 wps
0.803 perplexity: 39.287 speed: 358 wps
0.903 perplexity: 37.949 speed: 359 wps
Epoch: 12 Train Perplexity: 37.125
Epoch: 12 Valid Perplexity: 123.571
Epoch: 13 Learning rate: 0.004
0.004 perplexity: 56.576 speed: 365 wps
0.104 perplexity: 40.989 speed: 358 wps
0.204 perplexity: 44.809 speed: 358 wps
0.304 perplexity: 43.082 speed: 356 wps
0.404 perplexity: 42.332 speed: 356 wps
0.504 perplexity: 41.694 speed: 356 wps
0.604 perplexity: 40.275 speed: 357 wps
0.703 perplexity: 39.673 speed: 356 wps
0.803 perplexity: 39.021 speed: 356 wps
0.903 perplexity: 37.690 speed: 356 wps
Epoch: 13 Train Perplexity: 36.869
Epoch: 13 Valid Perplexity: 123.358
Test Perplexity: 117.171
As you can see, the perplexity becomes lower after each epoch.
Summary
In this chapter, we gave an overview of deep learning techniques, examining two of the deep learning architectures in use, CNNs and RNNs. Through the TensorFlow library, we developed a convolutional neural network architecture for an image classification problem. The last part of the chapter was devoted to RNNs, where we described TensorFlow's tutorial for RNNs, in which an LSTM network is built to predict the next word in an English sentence.
The next chapter shows the TensorFlow facilities for GPU computing and introduces TensorFlow Serving, a high-performance, open source serving system for machine learning models, designed for production environments and optimized for TensorFlow.
Chapter 6. GPU Programming and Serving with TensorFlow
In this chapter, we will cover the following topics:
GPU programming
TensorFlow Serving:
How to install TensorFlow Serving
How to use TensorFlow Serving
How to load and export a TensorFlow model
GPU programming
In Chapter 5, Deep Learning, where we trained a recurrent neural network (RNN) for an NLP application, we saw that deep learning applications can be computationally intensive. However, you can reduce the training time by using parallel programming techniques through a graphics processing unit (GPU). In fact, the computational resources of modern graphics units enable them to execute code portions in parallel, ensuring high performance.
The GPU programming model is a programming strategy that consists of offloading work from the CPU to a GPU to accelerate the execution of a variety of applications. The range of applications of this strategy is very large and is growing day by day; GPUs are currently able to reduce the execution time of applications across different platforms, from cars to mobile phones, and from tablets to drones and robots.
The following diagram shows how the GPU programming model works. In the application, there are calls that tell the CPU to hand specific parts of the code over to the GPU and let it run them to get higher execution speed. The speedup for such parts comes from the GPU architecture: a GPU has many Streaming Multiprocessors (SMs), each with many computational cores. These cores are capable of performing ALU and other operations following the Single Instruction Multiple Thread (SIMT) model, which reduces the execution time drastically.
In the GPU programming model, some pieces of code are executed sequentially on the CPU, while other parts are executed in parallel by the GPU
TensorFlow possesses capabilities that let you take advantage of this programming model (if you have an NVIDIA GPU); the package version that supports GPUs requires the Cuda Toolkit 7.0 and cuDNN 6.5 v2.
Note
For the installation of the Cuda environment, we suggest referring to the Cuda installation page: http://docs.nvidia.com/cuda/cuda-getting-started-guide-for-linux/#axzz49w1XvzNj
TensorFlow refers to these devices in the following way:
/cpu:0: To reference the server CPU
/gpu:0: The server's GPU, if there is only one
/gpu:1: The second GPU of the server, and so on
To find out which devices are assigned to our operations and tensors, we need to create the session with the log_device_placement option set to True.
Consider the following example.
We create a computational graph; a and b will be two matrices:
a = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[2, 3], name='a')
b = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[3, 2], name='b')
In c we put the matrix multiplication of these two input tensors:
c = tf.matmul(a, b)
Then we build a session with log_device_placement set to True:
sess = tf.Session(config=tf.ConfigProto(log_device_placement=True))
Finally, we launch the session:
print sess.run(c)
You should see the following output:
Device mapping:
/job:localhost/replica:0/task:0/gpu:0 -> device: 0, name: Tesla K40c, pci bus id: 0000:05:00.0
b: /job:localhost/replica:0/task:0/gpu:0
a: /job:localhost/replica:0/task:0/gpu:0
MatMul: /job:localhost/replica:0/task:0/gpu:0
[[ 22.  28.]
 [ 49.  64.]]
If you would like a particular operation to run on a device of your choice instead of what's automatically selected for you, you can use tf.device to create a device context, so that all the operations within that context will have the same device assignment.
Let's create the same computational graph using the tf.device instruction:
with tf.device('/cpu:0'):
    a = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[2, 3], name='a')
    b = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[3, 2], name='b')
c = tf.matmul(a, b)
Again, we build the session graph and launch it:
sess = tf.Session(config=tf.ConfigProto(log_device_placement=True))
print sess.run(c)
You will see that now a and b are assigned to cpu:0:
Device mapping:
/job:localhost/replica:0/task:0/gpu:0 -> device: 0, name: Tesla K40c, pci bus id: 0000:05:00.0
b: /job:localhost/replica:0/task:0/cpu:0
a: /job:localhost/replica:0/task:0/cpu:0
MatMul: /job:localhost/replica:0/task:0/gpu:0
[[ 22.  28.]
 [ 49.  64.]]
If you have more than one GPU, you can select one explicitly with tf.device; setting allow_soft_placement to True in the configuration options when creating the session lets TensorFlow fall back automatically to a supported device if the requested one is not available.
TensorFlow Serving
Serving is a TensorFlow package that has been developed to take machine learning models into production systems. It means that a developer can use TensorFlow Serving's APIs to build a server to serve the implemented model.
The served model will be able to make inferences and predictions on the data presented by its clients each time, allowing the model to be improved.
To communicate with the serving system, the clients use a high-performance open source remote procedure call (RPC) interface developed by Google, called gRPC.
The typical pipeline (see the following figure) is that training data is fed to the learner, which outputs a model. After being validated, it is ready to be deployed to the TensorFlow Serving system. It is quite common to launch and iterate on our model over time, as new data becomes available or as you improve the model.
TensorFlow Serving pipeline
How to install TensorFlow Serving
To compile and use TensorFlow Serving, you need to set up some prerequisites.
Bazel
TensorFlow Serving requires Bazel 0.2.0 (http://www.bazel.io/) or higher. Download bazel-0.2.0-installer-linux-x86_64.sh.
Note
Bazel is a tool that automates software builds and tests. Supported build tasks include running compilers and linkers to produce executable programs and libraries, and assembling deployable packages.
Run the following commands:
chmod +x bazel-0.2.0-installer-linux-x86_64.sh
./bazel-0.2.0-installer-linux-x86_64.sh --user
Finally, set up your environment by exporting this in your ~/.bashrc file:
export PATH="$PATH:$HOME/bin"
gRPC
Our tutorials use gRPC (0.13 or higher) as the RPC framework.
Note
You can find other references at https://github.com/grpc.
TensorFlow Serving dependencies
To install the TensorFlow Serving dependencies, execute the following:
sudo apt-get update && sudo apt-get install -y \
        build-essential \
        curl \
        git \
        libfreetype6-dev \
        libpng12-dev \
        libzmq3-dev \
        pkg-config \
        python-dev \
        python-numpy \
        python-pip \
        software-properties-common \
        swig \
        zip \
        zlib1g-dev
Then configure TensorFlow by running the following commands:
cd tensorflow
./configure
cd ..
Install Serving
Use Git to clone the repository:
git clone --recurse-submodules https://github.com/tensorflow/serving
cd serving
The --recurse-submodules option is required to fetch TensorFlow, gRPC, and the other libraries that TensorFlow Serving depends on. To build TensorFlow Serving, you must use Bazel:
bazel build tensorflow_serving/...
The binaries will be placed in the bazel-bin directory, and can be run using a command such as the following:
bazel-bin/tensorflow_serving/example/mnist_inference
Finally, you can test the installation by executing the following command:
bazel test tensorflow_serving/...
How to use TensorFlow Serving
In this tutorial, we will show how to export a trained TensorFlow model and build a server to serve the exported model. The implemented model is a Softmax Regression model for handwritten image classification (MNIST data).
The code consists of two parts:
A Python file (mnist_export.py) that trains and exports the model
A C++ file (mnist_inference.cc) that loads the exported model and runs a gRPC service to serve it
In the following sections, we report the basic steps to use TensorFlow Serving. For other references, you can view https://tensorflow.github.io/serving/serving_basic.
Training and exporting the TensorFlow model
As you can see in mnist_export.py, the training is done the same way as in the MNIST beginners tutorial, available at the following link:
https://www.tensorflow.org/versions/r0.9/tutorials/mnist/beginners/index.html
The TensorFlow graph is launched in the TensorFlow session sess, with the input tensor (image) as x and the output tensor (Softmax score) as y. Then we use the TensorFlow Serving exporter to export the model; it builds a snapshot of the trained model so that it can be loaded later for inference. Let's now see the main functions used to export a trained model.
Import the exporter to serialize the model:
from tensorflow_serving.session_bundle import exporter
Then you must define saver, using the TensorFlow function tf.train.Saver with the sharded parameter equal to True:
saver = tf.train.Saver(sharded=True)
saver is used to serialize graph variable values to the model export so that they can be properly restored later.
The next step is to define model_exporter:
model_exporter = exporter.Exporter(saver)
signature = exporter.classification_signature(input_tensor=x, scores_tensor=y)
model_exporter.init(sess.graph.as_graph_def(),
                    default_graph_signature=signature)
model_exporter takes the following two arguments:
sess.graph.as_graph_def() is the protobuf of the graph. Exporting will serialize the protobuf to the model export so that the TensorFlow graph can be properly restored later.
default_graph_signature=signature specifies a model export signature. The signature specifies what type of model is being exported, and the input/output tensors to bind to when running inference. In this case, you use exporter.classification_signature to specify that the model is a classification model.
Finally, we create our export:
model_exporter.export(export_path, tf.constant(FLAGS.export_version), sess)
model_exporter.export takes the following arguments:
export_path is the path of the export directory. Export will create the directory if it does not exist.
tf.constant(FLAGS.export_version) is a tensor that specifies the version of the model. You should specify a larger integer value when exporting a newer version of the same model. Each version will be exported to a different sub-directory under the given path.
sess is the TensorFlow session that holds the trained model you are exporting.
Running a session
To export the model, first clear the export directory:
$> rm -rf /tmp/mnist_model
Then, using Bazel, build the mnist_export example:
$> bazel build //tensorflow_serving/example:mnist_export
Finally, you can run the following example:
$> bazel-bin/tensorflow_serving/example/mnist_export /tmp/mnist_model
Training model...
Done training!
Exporting trained model to /tmp/mnist_model
Done exporting!
Looking in the export directory, we should have a sub-directory for each exported version of the model:
$> ls /tmp/mnist_model
00000001
The corresponding sub-directory has the default value of 1, because we specified tf.constant(FLAGS.export_version) as the model version earlier, and FLAGS.export_version has the default value of 1.
Each version sub-directory contains the following files:
export.meta is the serialized tensorflow::MetaGraphDef of the model. It includes the graph definition of the model, as well as metadata of the model, such as signatures.
export-?????-of-????? are the files that hold the serialized variables of the graph.
$> ls /tmp/mnist_model/00000001
checkpoint export-00000-of-00001 export.meta
Loading and exporting a TensorFlow model
The C++ code for loading the exported TensorFlow model is in the main() function in mnist_inference.cc. Here we report an excerpt; we do not consider the parameters for batching. If you want to adjust the maximum batch size, timeout threshold, or the number of background threads used for batched inference, you can do so by setting more values in BatchingParameters:
int main(int argc, char** argv)
{
  SessionBundleConfig session_bundle_config;
  ... // Here batching parameters
  std::unique_ptr<SessionBundleFactory> bundle_factory;
  TF_QCHECK_OK(
      SessionBundleFactory::Create(session_bundle_config,
                                   &bundle_factory));
  std::unique_ptr<SessionBundle> bundle(new SessionBundle);
  TF_QCHECK_OK(bundle_factory->CreateSessionBundle(bundle_path,
                                                   &bundle));
  ......
  RunServer(FLAGS_port, std::move(bundle));
  return 0;
}
SessionBundle is a component of TensorFlow Serving. Let's consider the include file SessionBundle.h:
struct SessionBundle {
  std::unique_ptr<tensorflow::Session> session;
  tensorflow::MetaGraphDef meta_graph_def;
};
The session parameter is a TensorFlow session that has the original graph, with the necessary variables properly restored.
SessionBundleFactory::CreateSessionBundle() loads the exported TensorFlow model from bundle_path and creates a SessionBundle object for running inference with the model.
RunServer brings up a gRPC server that exports a single Classify() API.
Each inference request will be processed in the following steps:
1. Verify the input. The server expects exactly one MNIST-format image for each inference request.
2. Transform the input into an inference input tensor and create an output tensor placeholder.
3. Run the inference.
To run an inference, you must type the following commands:
$> bazel build //tensorflow_serving/example:mnist_inference
$> bazel-bin/tensorflow_serving/example/mnist_inference --port=9000 /tmp/mnist_model/00000001
Test the server
To test the server, we use the mnist_client.py utility (https://github.com/tensorflow/serving/blob/master/tensorflow_serving/example/mnist_client.py).
This client downloads MNIST test data, sends it as requests to the server, and calculates the inference error rate.
To run it, type the following commands:
$> bazel build //tensorflow_serving/example:mnist_client
$> bazel-bin/tensorflow_serving/example/mnist_client --num_tests=1000 --server=localhost:9000
Inference error rate: 10.5%
The result confirms that the server loads and runs the trained model successfully. In fact, a 10.5% inference error rate on 1,000 images gives us about 89.5% accuracy, close to the roughly 91% expected from a trained Softmax model.
Summary
We described two important features of TensorFlow in this chapter. The first was the possibility of using the programming model known as GPU computing, with which it becomes possible to speed up code (for example, the training phase of a neural network). The second part of the chapter was devoted to describing the TensorFlow Serving framework. It is a high-performance, open source serving system for machine learning models, designed for production environments and optimized for TensorFlow. This powerful framework can run multiple large-scale models that change over time, based on real-world data, enabling a more efficient use of GPU resources and allowing developers to improve their own machine learning models.