Deep Learning With Keras: Beginner’s Guide To Deep Learning With Keras


Transcript of Deep Learning With Keras: Beginner’s Guide To Deep Learning With Keras

Page 1: Deep Learning With Keras: Beginner’s Guide To Deep Learning With Keras

Deep Learning with Keras

Beginner's Guide To Deep Learning With Keras

By Frank Millstein


WHAT IS IN THE BOOK?

INTRODUCTION

HOW DEEP LEARNING IS DIFFERENT FROM MACHINE LEARNING
DEEPER INTO DEEP LEARNING

CHAPTER 1: A FIRST LOOK AT NEURAL NETWORKS

CONVOLUTIONAL NEURAL NETWORK
RECURRENT NEURAL NETWORK
RNN SEQUENCE TO SEQUENCE MODEL
AUTOENCODERS
REINFORCEMENT DEEP LEARNING
GENERATIVE ADVERSARIAL NETWORK

CHAPTER 2: GETTING STARTED WITH KERAS

BUILDING DEEP LEARNING MODELS WITH KERAS

CHAPTER 3: MULTI-LAYER PERCEPTRON NETWORK MODELS

MODEL LAYERS
MODEL COMPILATION
MODEL TRAINING
MODEL PREDICTION

CHAPTER 4: ACTIVATION FUNCTIONS FOR NEURAL NETWORKS

SIGMOID ACTIVATION FUNCTION
TANH ACTIVATION FUNCTION
RELU ACTIVATION FUNCTION

CHAPTER 5: MNIST HANDWRITTEN RECOGNITION

CHAPTER 6: NEURAL NETWORK MODELS FOR MULTI-CLASS CLASSIFICATION PROBLEMS

ONE-HOT ENCODING
DEFINING NEURAL NETWORK MODELS WITH SCIKIT-LEARN
EVALUATING MODELS WITH K-FOLD CROSS VALIDATION

CHAPTER 7: RECURRENT NEURAL NETWORKS

SEQUENCE CLASSIFICATION WITH LSTM RECURRENT NEURAL NETWORKS
WORD EMBEDDING
APPLYING DROPOUT
NATURAL LANGUAGE PROCESSING WITH RECURRENT NEURAL NETWORKS

LAST WORDS


Copyright © 2018 by Frank Millstein - All rights reserved.

This document is geared towards providing exact and reliable information in regards to the topic and issue covered. The publication is sold with the idea that the publisher is not required to render accounting, officially permitted, or otherwise, qualified services. If advice is necessary, legal or professional, a practiced individual in the profession should be ordered. From a Declaration of Principles which was accepted and approved equally by a Committee of the American Bar Association and a Committee of Publishers and Associations.

In no way is it legal to reproduce, duplicate, or transmit any part of this document by either electronic means or in printed format. Recording of this publication is strictly prohibited, and any storage of this document is not allowed unless with written permission from the publisher. All rights reserved.

The information provided herein is stated to be truthful and consistent, in that any liability, in terms of inattention or otherwise, by any usage or abuse of any policies, processes, or directions contained within is the solitary and utter responsibility of the recipient reader. Under no circumstances will any legal responsibility or blame be held against the publisher for any reparation, damages, or monetary loss due to the information herein, either directly or indirectly. Respective authors own all copyrights not held by the publisher.

The information herein is offered for informational purposes solely and is universal as so. The presentation of the information is without contract or any type of guarantee assurance.

The trademarks that are used are without any consent, and the publication of the trademark is without permission or backing by the trademark owner. All trademarks and brands within this book are for clarifying purposes only and are owned by the owners themselves, not affiliated with this document.


INTRODUCTION

Neural networks and deep learning are increasingly important concepts in computer science, with major tech companies like Google making amazing strides. Over the years, you may have heard words like backpropagation, neural networks, and deep learning tossed around a lot. As we hear them more and more often, it is little wonder these terms have seized your curiosity.

Deep learning is an important area of active research in computer science today. If you are involved in this scientific area, I am sure you have come across these terms at least once. Deep learning and neural networks may be intimidating concepts, but since they are increasingly popular these days, the topic is most definitely worth your attention.

Google and other large global tech companies are making great strides with deep learning projects, like the Google Brain project and Google's acquisition of DeepMind. Moreover, many deep learning methods are now beating traditional machine learning methods on every single metric.

HOW DEEP LEARNING IS DIFFERENT FROM MACHINE LEARNING

Before going further into this subject, we must take a step back so you can learn more about the broader field of machine learning. Very often, we encounter problems for which it is hard to write a computer program. For instance, if you want to program your computer to recognize handwritten digits, you can try to devise a collection of rules to distinguish every individual digit. A zero, say, is one closed loop - but what if the loop is not perfectly closed? Or what if the top right of the loop closes where the top left of the loop starts?

Issues like this happen routinely, as a zero may be very difficult to distinguish from a six algorithmically. You could establish some kind of cutoff, but you will have problems deciding on that cutoff in the first place. It quickly becomes very complicated to compile a list of rules and heuristics that will accurately classify handwritten digits.

Many more kinds of problems fall into this category, such as comprehending speech, recognizing objects, and understanding concepts. We have trouble writing computer programs for them because we do not know how the human brain solves them, and even when we have a relatively good idea of how to do it, the program may be very complicated.


Therefore, instead of writing such a program by hand, you can develop an algorithm that lets your computer look at thousands of examples together with the correct answers. The computer can then use the experience it has gained to solve the same problem in numerous other situations. Our main goal is to teach computers to solve problems by example, in much the same way you would teach a child to distinguish a dog from a cat.

Deep learning was first theorized back in the early 1980s and became one of the main paradigms of broader machine learning. Over the past few decades, computer scientists have successfully developed a wide range of different algorithms that try to allow computers to learn to solve problems through examples. Thanks to a flurry of modern technological advancements and research, deep learning is on the rise, since it has proven to be extremely good at teaching computers to do what the human brain can do naturally and effortlessly.

One of the main challenges with traditional machine learning models is a process named feature extraction. More specifically, the programmer must tell the computer what kinds of features and information it should be looking for when trying to make a choice or decision.

Feeding the algorithm raw data in fact rarely works, so this process of feature extraction is one of the critical parts of the traditional machine learning workflow. Moreover, this places a massive burden on the programmer, as the effectiveness of the algorithm relies mainly on the programmer's insight. For more complex problems, such as handwriting recognition or object recognition, this is one of the main challenges.


Fortunately, deep learning methods let us circumvent these challenges regarding feature extraction. This is mainly because deep learning algorithms are capable of learning to focus on the right, informative features by themselves, while requiring very little guidance from the programmer. This makes deep learning an amazingly powerful tool for machine learning.

Machine learning uses our computers to run predictive models, which are capable of learning from already existing data to forecast future outcomes, behaviors, and trends. Deep learning, in turn, is an important subfield of machine learning in which the models are inspired by how the human brain works. These models are expressed mathematically, and the number of parameters defining them can range from several thousand to millions. In deep learning models, the features are learned automatically.

Moreover, deep learning is one of the main keys enabling the artificial intelligence powered technologies being developed around the globe every day. In the following sections of the book, you are going to learn how to build complex models that help machines solve distinct real-world problems with human-like intelligence. You will learn how to build these models, and derive many insights from them, using Keras running on your Linux machine.

The book provides the level of detail needed for data scientists and engineers to develop a greater intuitive understanding of the main concepts of deep learning. You will also learn powerful motifs that can be used in building numerous deep learning models, and much more.


Machine learning and deep learning have one thing in common: they are both related to artificial intelligence. Artificial intelligence concerns computer systems that mimic or replicate human intelligence, while the broader field of machine learning allows machines to learn largely on their own. Deep learning, in turn, concerns computer algorithms that attempt to model high-level abstractions contained in data in order to determine high-level meaning.

For instance, if artificial intelligence is used to recognize emotions in pictures, then a machine learning system would take in hundreds or thousands of pictures of human faces, while deep learning would help that system recognize countless patterns in the human faces and the emotions they share.

This is a very simple explanation of the three; in reality it is more complex. Deep learning is by far the most confusing, as it works with neural networks, data, and math. Machine learning more broadly analyzes and crunches numbers and data, learns from them, and uses that information to make innumerable predictions, truth statements, and determinations depending on the scenario.

In this case, the machine is being trained, or is training itself, to perform tasks correctly after learning from the numbers and data it has previously analyzed. Machine learning models therefore build their own solutions and logic. Machine learning can be done with several algorithms, like the random forests and decision trees used by Netflix, for instance, to suggest movies to its customers based on their star ratings.

Another common machine learning model is linear regression, which predicts continuous outcomes with a limitless range of possible values, like figuring out how much money you can sell your car for based on the current market. Other machine learning models include logistic regression, which predicts the value of categorical outcomes based on a limited number of possible values.

Classification and naive Bayes models are additional machine learning models. Machine learning classification puts data into distinct groups, like sorting emails or filing documents, while naive Bayes covers a family of algorithms that all share a common principle: every feature is classified independently of the other features. The list could go on, as there are many other machine learning models. There are also two broad types of machine learning: supervised and unsupervised.

Supervised learning models require a human to input both the data and the solutions, and the machine then figures out on its own the relationship between the two. Unsupervised machine learning models, on the other hand, involve putting in unlabeled data and numbers for a specific situation and asking the machine to find a relationship and solution by itself. Machine learning thus eliminates the need for someone to constantly analyze or code data in order to produce logic and a solution.

DEEPER INTO DEEP LEARNING

The biggest difference between machine learning and deep learning is that deep learning crunches more data. If you have only a little data to analyze, machine learning is the way to go; if you have a lot of data to analyze, deep learning is your solution. Deep learning models are extremely powerful, and they need a lot of data to give you the best possible outcome or solution. Deep learning models also need more powerful machines, while machine learning models generally do not.

More powerful machines are required because deep learning models do more computationally demanding things, such as the large matrix multiplications that call for a GPU, or graphics processing unit. Deep learning models also try to learn high-level features, so in the case of facial recognition a deep learning model can take in an image quite close to its raw form, while a machine learning model needs a reduced, hand-crafted representation of it. Another powerful deep learning trait is forming end-to-end solutions instead of breaking problems and solutions down into parts.

Deep learning is one of the most powerful tools used by major global tech companies, although it takes a long time to process data and find correct solutions. Just keep in mind that it may be challenging at the very beginning, but you will get there eventually. Fortunately, you have this book to start off your deep learning journey.

CHAPTER 1: A FIRST LOOK AT NEURAL NETWORKS

In recent years, neural networks, or more specifically deep neural networks, have won numerous contests in machine learning and pattern recognition. Deep learners are mainly distinguished by the depth of their paths - the chains of possibly learnable causal relationships between actions and effects.

Deep learning algorithms, in very simple words, are deep, large artificial neural nets. An NN, or neural network, can be presented as a directed acyclic graph in which an initial input layer takes in signal vectors, followed by one or more hidden layers that process the outputs of the previous layers.

In fact, the main concept behind neural networks can be traced back half a century. There is more talk about the idea today because we have a lot more data and significantly more powerful computers, which were not available decades ago.

A deep neural network has many more layers, and many more nodes in every layer, which results in exponentially more parameters to tune. When we do not have enough data, we are not able to learn those important parameters efficiently. In addition, without powerful machines or computers, learning would be insufficient as well as too slow.


When it comes to small datasets, traditional machine learning algorithms such as random forests, regression, GBM, SVM, and statistical learning do an amazing job. However, when the data scale goes up to a large amount of information, large, deep neural networks quickly outperform the traditional ones.

This happens primarily because, compared to a traditional machine learning algorithm, a deep neural network model has a far wider range of parameters and the capability of learning more complex nonlinear relationships. Therefore, we expect a deep neural network model to automatically pick the most important and helpful features on its own, without too much manual engineering.

As already mentioned, deep learning is a form of machine learning which uses a model of computing that is very much inspired by the structure of the human brain; hence these models are called neural networks. The basic unit of any neural network is the neuron. Every neuron has a specific set of inputs, and every input has a specific weight. Neurons compute functions of these weighted inputs. For instance, a linear neuron takes a linear combination of its weighted inputs, while a sigmoidal neuron feeds the weighted sum of its inputs into a logistic function.

The logistic function always returns a value between 0 and 1. When the weighted sum is very negative, the return value is close to 0; when the weighted sum is large and positive, the return value is close to 1. Mathematically, the logistic function is a convenient choice, since it has nice-looking derivatives that make the learning process simpler.
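As a quick illustration, the behavior of such a sigmoidal neuron can be sketched in a few lines of NumPy. This is a toy example, not Keras code; the weights and inputs are made up purely to show how the logistic function squashes the weighted sum.

```python
import numpy as np

def sigmoid(z):
    # Logistic function: maps any real number into the interval (0, 1)
    return 1.0 / (1.0 + np.exp(-z))

def sigmoid_neuron(inputs, weights, bias=0.0):
    # Weighted sum of the inputs fed through the logistic function
    return sigmoid(np.dot(weights, inputs) + bias)

# A large positive weighted sum gives an output close to 1,
# a large negative weighted sum gives an output close to 0.
high = sigmoid_neuron(np.array([1.0, 1.0]), np.array([3.0, 3.0]))
low = sigmoid_neuron(np.array([1.0, 1.0]), np.array([-3.0, -3.0]))
print(high, low)
```

At a weighted sum of exactly zero, the neuron sits halfway, returning 0.5.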


However, whatever function the neuron uses, the value computed is immediately transmitted onward to the other connected neurons as the neuron's output. In practice, sigmoidal neurons are used more often than linear neurons, since they enable far more versatile deep learning models.

A deep neural network arises when you start connecting neurons to each other, to the input data, and eventually to the output nodes that correspond to the network's answer to the learning problem. To make this structure easier to visualize and train, weights are attached to the links that connect each layer of neurons to the next.

Very similar to how neurons are organized in layers in the brain, neurons in deep neural nets are typically organized in layers. Neurons in the bottom layers are the ones that receive signals from the inputs, while neurons in the top layers are connected, through their outputs, to the answer. Usually there are no connections between neurons in the same layer, as more complex connectivity between neurons requires more involved mathematical analysis.

When there are no connections leading from a neuron in a higher layer back to neurons in lower layers, we call the network a feed-forward neural network. Opposed to these are recurrent neural networks, which are much more complicated to train and analyze. Now we will go through several of the most commonly used deep neural networks.

CONVOLUTIONAL NEURAL NETWORK

Convolutional neural networks, also known as CNNs, are one of the most commonly used types of feed-forward neural networks, in which the connectivity pattern between neurons is based on the organization of the visual cortex. In the visual cortex, V1, the primary visual cortex, does edge detection on the raw visual input obtained from the retina. Then V2, the secondary visual cortex, receives the edge features from the primary visual cortex and extracts simple visual properties such as spatial frequency, orientation, and color.

The visual area V4, another visual cortex region, mainly handles more complicated attributes of fine-grained objects. All those processed visual features then flow into the final unit, the inferior temporal gyrus, or IT, for further object recognition. The shortcut between the V1 and V4 layers, in fact, inspired a certain type of convolutional neural network with connections between non-adjacent layers, named the residual net. Residual nets contain residual blocks that allow the inputs of one layer to be passed readily to layers coming later.

Therefore, convolutional neural networks are commonly used for edge detection, extracting simple visual properties such as spatial frequency, orientation, and color, detecting object features of intermediate complexity, and object recognition.

Convolution is a mathematical term referring to an operation between matrices. Convolutional layers generally have a small matrix named the filter, or kernel. As the filter slides, or convolves, across the matrix of input image values, it computes the element-wise multiplication of the values contained in the kernel matrix with the original image values underneath, and sums the result.

Therefore, specifically designed filters or kernels are capable of processing images for very specific purposes, such as image sharpening, blurring, and edge detection, efficiently and rapidly.
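This sliding multiply-and-sum can be sketched in plain NumPy. The naive loop below is for illustration only (real frameworks use heavily optimized implementations), and the small edge-detection kernel is a made-up example:

```python
import numpy as np

def convolve2d(image, kernel):
    # Slide the kernel across the image; at each position take the
    # element-wise product with the underlying patch and sum it up.
    kh, kw = kernel.shape
    oh = image.shape[0] - kh + 1
    ow = image.shape[1] - kw + 1
    out = np.zeros((oh, ow))
    for i in range(oh):
        for j in range(ow):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

# A simple vertical edge-detection kernel applied to a 4x4 image whose
# left half is dark (0) and right half is bright (1)
image = np.array([[0, 0, 1, 1],
                  [0, 0, 1, 1],
                  [0, 0, 1, 1],
                  [0, 0, 1, 1]], dtype=float)
kernel = np.array([[-1, 1]], dtype=float)
response = convolve2d(image, kernel)
print(response)  # responds only where pixel values jump, i.e. at the edge
```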

RECURRENT NEURAL NETWORK

A neural network sequence model is commonly designed to transform an input sequence into an output sequence that lives in a different domain. Another common type of deep neural network, the RNN, or recurrent neural network, is greatly suitable for these purposes, having shown amazing improvements on problems like speech recognition, handwriting recognition, and machine translation.

An RNN model is born with an amazing capability for processing long sequential data and tackling complex tasks with context spread over time. The recurrent model processes one element of the sequence at a time. After each computation, the newly updated unit state is passed down to the next time step to facilitate the computation of the next element.
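The update just described - each element combined with the previous unit state - can be sketched in plain NumPy. The sizes and random weights below are purely illustrative, not the book's code:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical tiny recurrent cell; sizes are chosen only for illustration
input_size, hidden_size = 3, 4
W_x = rng.normal(size=(hidden_size, input_size)) * 0.1   # input weights
W_h = rng.normal(size=(hidden_size, hidden_size)) * 0.1  # recurrent weights
b = np.zeros(hidden_size)

def rnn_step(x_t, h_prev):
    # One time step: combine the current element with the previous unit state
    return np.tanh(W_x @ x_t + W_h @ h_prev + b)

# Process a 5-element sequence one step at a time; the state carries context
sequence = rng.normal(size=(5, input_size))
h = np.zeros(hidden_size)
for x_t in sequence:
    h = rnn_step(x_t, h)
print(h)  # the final state summarizes the whole sequence
```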

Imagine the case where a recurrent neural network model reads all the articles on Wikipedia, character by character. Simple perceptron neurons that linearly combine the current input element with the last unit state, on the other hand, typically lose such long-term dependencies.

For instance, we can start a sentence with "Susan is working at...". Then, after a whole paragraph, we want to start the next sentence with "He" or "She" correctly. If the recurrent model forgets the character's name we used, we can never know which. To resolve this issue, engineers have created a special deep neuron with a more complicated internal structure designed to memorize long-term context, named the LSTM, or long short-term memory.


LSTM models are smart enough to learn long-term context. These models can learn how long they should memorize old information, when to forget it, when to use newly updated data, and when to combine the new input with the old memory. Using the power of LSTM cells in an RNN, you can build a character-based RNN model that learns the relationships between characters to form words and sentences, without any previous knowledge of English vocabulary. Such an RNN model can, in fact, achieve very good performance even without a large set of training data.

RNN SEQUENCE TO SEQUENCE MODEL

The common sequence-to-sequence model is very often seen as an extended version of the recurrent neural model, but its application field is more distinguishable. Like recurrent neural networks, sequence-to-sequence models operate on sequential data, but they are commonly used to develop personal assistants or chatbots by generating meaningful responses to numerous input questions.

A common sequence-to-sequence model consists of two recurrent neural networks: an encoder and a decoder. The encoder learns everything about the contextual information from the various input words. Then the encoder hands this knowledge down to the decoder through a specific context vector, also known as the thought vector. Eventually, the decoder consumes the context vector and generates the correct response.

AUTOENCODERS

Autoencoders differ from the previous deep learning models in that they are used for unsupervised deep learning. Autoencoders are designed mainly to learn a low-dimensional representation of a high-dimensional dataset, very similar to what PCA, or principal component analysis, does. The autoencoder model learns an approximation function that reproduces its input data.

These models are constrained by a bottleneck layer situated in the middle, containing a very small number of nodes. With this very limited number of nodes, the models have very limited capacity, so they are forced to form a specific, very efficient encoding of the data - the low-dimensional code we are after.

You can use autoencoder models to compress your documents on a variety of topics. There are some limitations to these models, as they come with a bottleneck layer that contains only a few neurons.

However, comparing an autoencoder against PCA when reducing documents down to two dimensions, the autoencoder model demonstrates a much better outcome. With the help of these models, you can do very efficient data compression to speed up the overall process of information retrieval, for both images and documents.
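The bottleneck idea can be illustrated with plain matrix shapes. The weights below are random, untrained stand-ins (a real autoencoder learns its encoder and decoder weights by minimizing the reconstruction error between the input and the output); the point here is only the dimensionality squeeze:

```python
import numpy as np

rng = np.random.default_rng(1)

# 100 samples of 20-dimensional data forced through a 2-node bottleneck.
# W_enc / W_dec are hypothetical, untrained weights used for illustration.
X = rng.normal(size=(100, 20))
W_enc = rng.normal(size=(20, 2)) * 0.1   # encoder: 20 dims -> 2-dim code
W_dec = rng.normal(size=(2, 20)) * 0.1   # decoder: 2-dim code -> 20 dims

codes = X @ W_enc      # the low-dimensional representation
X_hat = codes @ W_dec  # reconstruction attempted from the code alone
print(codes.shape, X_hat.shape)  # (100, 2) (100, 20)
```

Because every reconstruction must pass through the 2-dimensional code, training forces the network to keep only the most informative structure of the data.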

REINFORCEMENT DEEP LEARNING

Reinforcement learning is one of the secrets behind many successful AI projects of the past few years. Reinforcement learning is a subfield of machine learning that allows software agents and machines to automatically determine the optimal behavior within a given context, with the main goal of maximizing long-term performance as measured by a given metric.

Most reinforcement learning projects start with a specific supervised learning process, training a fast rollout policy as well as a policy network, mainly relying on manually curated training data. The reinforcement learning policy network then improves as it plays against more and more versions of the previous policy network. Therefore, with more and more self-generated data, it gets stronger and stronger without requiring any additional external training data.

GENERATIVE ADVERSARIAL NETWORK

Another commonly used type of deep neural network is the generative adversarial network, or GAN. This is a type of deep generative algorithm. A GAN has the power of creating new examples after going through and learning from real data. A common GAN consists of two models competing against each other in a zero-sum game.

The generative adversarial network setup mainly consists of real-world examples, a generator, generated fake samples, a discriminator, and fine-tuning training. The discriminator learns to tell the fake data apart from the true data by comparing their distributions. GANs were initially proposed to generate meaningful images after learning from real-world photos.

The GAN model proposed in the original GAN paper was composed of two independent models: the discriminator and the generator. The generator produced fake images and sent the output to the discriminator model. The discriminator then worked in a manner very similar to a judge, since it was fully optimized to identify the fake photos from the real ones.

At the same time, the generator model was trying hard to cheat the discriminator, while the judge was trying very hard not to be cheated by the generator. This very interesting zero-sum game occurring between the two models motivated both of them to further improve their functionalities and develop their designated skills.


After learning about these deep neural network models, you probably wonder how you can implement them and use them to solve real deep learning problems. Fortunately, there are many open-source libraries and toolkits you can use for building your own deep learning models. TensorFlow is arguably the most popular one and has attracted a lot of attention. In terms of popularity, Theano follows TensorFlow very closely. Those two are among the best numerical platforms in Python, and they provide the basis for innumerable deep learning projects.

Both are very powerful libraries and can be used for different tasks when creating deep learning models. Another powerful tool in the Python ecosystem is Keras, which we are going to use in this book. Keras is an amazingly powerful high-level neural network API with the astonishing ability to run on top of Theano, TensorFlow, or CNTK. It is written in Python and was developed with a focus on enabling fast and efficient deep learning experimentation.

CHAPTER 2: GETTING STARTED WITH KERAS

We are going to use Keras because it allows fast and easy prototyping, supports both recurrent and convolutional networks as well as combinations of the two, and runs seamlessly on both GPU and CPU. Designed to enable fast deep learning modeling and experimentation with neural networks, it focuses on being modular, minimal, and extensible.

Therefore, with Keras you can build a wide range of different deep learning models which run on top of TensorFlow or Theano effortlessly and efficiently. Keras is a free, open-source neural network library, so you will find and install it easily. The core data structure of Keras is a model, which is a way of organizing multiple layers. Before you delve deeper into Keras, you must, of course, install it. Be aware that this popular deep learning framework uses either Theano or TensorFlow behind the scenes instead of providing all the functionality by itself.

Keras is very simple to install if you have been working in a SciPy and Python environment. Make sure you already have an installation of TensorFlow or Theano on your system before you install Keras. Keras can then be very easily installed from PyPI, and the installation verified by printing the installed version:

sudo pip install keras

python -c "import keras; print(keras.__version__)"


1.1.0

sudo pip install --upgrade keras

Using the same method with the --upgrade flag, you can upgrade your version of Keras. Assuming you have installed both TensorFlow and Theano, you are able to configure the backend used by Keras. The best way is by editing, or adding, the Keras configuration file in your home directory.

~/.keras/keras.json

{
    "image_dim_ordering": "tf",
    "epsilon": 1e-07,
    "floatx": "float32",
    "backend": "tensorflow"
}

In this configuration file, you can change the backend property from tensorflow to theano. Keras will then use the new configuration the next time you run it. In addition, you can easily check the configured backend as follows.

python -c "from keras import backend; print(backend._BACKEND)"

Using TensorFlow backend.

tensorflow

In addition, you can specify which backend you want Keras to use through the KERAS_BACKEND environment variable on your command line, as shown below.


KERAS_BACKEND=theano python -c "from keras import backend; print(backend._BACKEND)"

Running this command, you get the output illustrated below.

Using Theano backend.

theano

BUILDING DEEP LEARNING MODELS WITH KERAS

The focus of Keras is the model. The main kind of model built in Keras is called a Sequential model, containing a linear stack of multiple layers. You create a Sequential model and gradually add layers to it, in the order of the computation you want performed. Once you define your model, you must compile it, so that it makes use of the underlying framework to optimize the computation to be performed on your deep learning model.

In this step, you must specify the loss function and the optimizer to be used. Once you compile your model, you must fit the model to the data, one batch of data at a time. This is where the bulk of the computation occurs. Once you train your model, you can use it to make predictions on new data.

In summary, the construction of deep learning models in Keras can be summed up as: define your model, compile your model, fit your model, and make predictions. To define it, you create a sequence and add multiple layers. Once done, you compile the model by specifying the optimizer and the loss function. Then you fit the model by executing it on the data. Finally, you make predictions on new data.
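These four steps can be sketched end to end on synthetic data. The sketch below assumes the standalone keras package and Keras 2-style argument names (Keras 1.x spelled epochs as nb_epoch); the layer sizes, data, and hyperparameters are illustrative only, not the book's example:

```python
import numpy as np
from keras.models import Sequential
from keras.layers import Dense

# 1. Define the model: a linear stack of layers
model = Sequential()
model.add(Dense(4, input_dim=3, activation='relu'))
model.add(Dense(1, activation='sigmoid'))

# 2. Compile: choose a loss function and an optimizer
model.compile(loss='binary_crossentropy', optimizer='adam',
              metrics=['accuracy'])

# 3. Fit: run the computation on the data, batch by batch
X = np.random.rand(32, 3)
y = (X.sum(axis=1) > 1.5).astype(int)
model.fit(X, y, epochs=2, batch_size=8, verbose=0)

# 4. Predict: apply the trained model to new data
preds = model.predict(np.random.rand(5, 3), verbose=0)
print(preds.shape)  # one sigmoid output per new sample
```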

As already mentioned, Keras is amazingly powerful and easy to use for developing and evaluating varied deep learning models. It wraps the efficient numerical computation libraries like TensorFlow and Theano and allows you to define and properly train your neural network models in just a few lines of code.

In the following, you are going to learn how to create your first network model in Python using Keras. Before you begin, make sure you have Python 2 or a newer version installed and configured. You also need NumPy and SciPy installed and configured, and, of course, you need Keras with TensorFlow or Theano installed and configured. Once you have these up and running, create a new file named as follows.

keras_first_network.py

In the following sections, you will learn how to load data, define your model, compile the model, fit the model, evaluate the model, and tie it all together to work on your future models.

Whenever we work with deep learning models that use a stochastic process, such as random numbers, it is a very good idea to set the random number seed. This way you will be able to run the same code over and over and get the same result. This is also very useful when you need to demonstrate a result, compare different models on the same source, or debug a part of your code. To initialize the random number generator, use the following script.

from keras.models import Sequential

from keras.layers import Dense

import numpy


numpy.random.seed(7)

Once done, you can load your data. In this example, we are going to use the very popular Pima Indians onset of diabetes dataset, a standard machine learning dataset from the UCI Machine Learning repository. It describes patient medical record data for Pima Indians and whether they had an onset of diabetes within five years. This is a binary classification problem, and all of the input variables are numerical, which makes the dataset really easy to use directly with a neural network in Keras. To use it, download the dataset and place it in your working directory, the same directory as your previously created Python file.

Now we continue with building your model with Keras. You can load the file directly using the NumPy loadtxt function. There are eight input variables, and one output variable in the last column. Once you load the data, you can split the dataset into the input variables, denoted as X, and the output variable, denoted as Y.

dataset = numpy.loadtxt("pima-indians-diabetes.csv", delimiter=",")

X = dataset[:, 0:8]

Y = dataset[:, 8]
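You can sanity-check this slicing on a few rows typed in directly. The rows below follow the dataset's 9-column layout (eight inputs plus one class label); treat the specific numbers as illustrative stand-ins for the CSV contents:

```python
import numpy as np

# Three sample rows in the 9-column layout: 8 input columns + 1 label column
rows = np.array([
    [6, 148, 72, 35, 0, 33.6, 0.627, 50, 1],
    [1,  85, 66, 29, 0, 26.6, 0.351, 31, 0],
    [8, 183, 64,  0, 0, 23.3, 0.672, 32, 1],
])

X = rows[:, 0:8]  # columns 0..7: the eight input variables
Y = rows[:, 8]    # column 8: the class label
print(X.shape, Y.shape)  # (3, 8) (3,)
```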

Make sure you have initialized your random number generator to ensure your results are reproducible, and that you have properly loaded your data. Now you must define your neural network model.

As already mentioned, models in Keras are defined as a specific sequence of multiple layers. You create a Sequential model and then add one layer at a time until you are satisfied with your network topology. The first thing you must do is ensure the input layer has the proper number of inputs. You can specify this when you create the first layer, using the input_dim argument. Make sure you set it to 8 for the eight input variables.

Now you probably wonder: how do you know the right number of layers and their types? This is a hard question. There are some heuristics you can use, but the best network structure is found through a process of trial and error. You need a network large enough to capture the core structure of the problem.

Further, we are going to use a fully-connected neural network structure with three layers. Fully-connected layers are defined using the Dense class. You specify the number of neurons in the layer as the first argument, the weight initialization method with the init argument, and the activation function with the activation argument.

In the following case, we initialize the network weights to small random numbers generated from a uniform distribution, here between 0 and 0.05, as this is the default uniform weight initialization in Keras. There is also another alternative, named normal, which draws small random numbers from a Gaussian distribution.

In our example, we are going to use the relu, or rectifier, activation function on the first two layers and the sigmoid function in the output layer. It used to be common to use the tanh and sigmoid activation functions for all layers.

These days, however, better performance is achieved using the rectifier activation function in the hidden layers, together with a sigmoid function on the output layer to ensure your network output is between zero and one. This makes it easy to map the output to a class prediction with a default threshold of 0.5. You can then piece everything together by adding layers. Your first layer has twelve neurons and expects eight input variables, your second hidden layer has eight neurons, and your output layer has one neuron to predict the class.

model=Sequential()

model.add(Dense(12,input_dim=8,activation='relu'))

model.add(Dense(8,activation='relu'))

model.add(Dense(1,activation='sigmoid'))

The next step is to compile your model. Compiling uses the efficient numerical libraries of the backend, such as TensorFlow or Theano.

When compiling, the backend automatically chooses the best possible way to represent your network for training and making predictions, running on your hardware, GPU or CPU, and sometimes even distributed.

When compiling your model, you must first specify some additional properties required for training your neural network. Be cognizant that training your network means finding the best possible set of weights for making the right predictions for your specific problem.

To evaluate a set of weights, you must first specify the loss function. You must also specify the optimizer used to search through different weights for your network, as well as any optional metrics you would like to collect and report during model training. In this specific case, you will use logarithmic loss, which for binary classification problems is defined in Keras as binary_crossentropy.

We will use the efficient gradient descent algorithm adam, because it is a very sensible default. Since this is a classification problem, we will also collect and report the classification accuracy as a metric.

model.compile(loss='binary_crossentropy',optimizer='adam',metrics=['accuracy'])

Once you are done compiling your model, the next step is to fit it. Now is the right time to execute your model with the data: you can train, or fit, your neural network on the loaded data by calling the fit function on your model.

Consider that the training process will run for a fixed number of iterations through the dataset, called epochs, which you specify with the epochs argument. You can also set the number of instances that are evaluated before the weights are updated; this is called the batch size, set using the batch_size argument. For this specific problem, we will run a small number of iterations, 150, and use a small batch size of 10. As mentioned before, these values can be chosen experimentally, by trial and error. During this step, the computation runs on your GPU or CPU.
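To make the epochs and batch-size arithmetic concrete, here is a quick sanity calculation in plain Python. The 768-row figure is the size of this dataset, and the helper function is only an illustration of the bookkeeping, not a Keras API:

```python
import math

def updates_per_run(n_samples, batch_size, epochs):
    # Keras also processes a final partial batch, so use ceiling division.
    batches_per_epoch = math.ceil(n_samples / batch_size)
    return batches_per_epoch * epochs

# 768 rows, batch size 10, 150 epochs
per_epoch = math.ceil(768 / 10)              # 77 weight updates per epoch
total = updates_per_run(768, 10, 150)        # 11550 weight updates in total
print(per_epoch, total)
```

So a smaller batch size means more frequent, noisier weight updates, while more epochs multiply the total work linearly.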

model.fit(X,Y,epochs=150,batch_size=10)

Once you are done fitting your model, the next step is to evaluate it. Up to this point, you have trained your neural network on the entire dataset, so you can evaluate its performance on that same dataset. This gives you an idea of how well you have modeled the dataset, also known as the training accuracy.

However, it gives you no idea of how well your model may perform on new data. We have done this mainly for simplicity, but the ideal path is to separate your data into train and test datasets for the training and evaluation of your model.
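A minimal sketch of such a split, using plain NumPy; the 80/20 ratio and the helper name are illustrative choices, not part of the book's example:

```python
import numpy as np

def train_test_split(X, Y, test_fraction=0.2, seed=7):
    # Shuffle indices reproducibly, then slice off the test portion.
    rng = np.random.RandomState(seed)
    idx = rng.permutation(len(X))
    n_test = int(len(X) * test_fraction)
    test_idx, train_idx = idx[:n_test], idx[n_test:]
    return X[train_idx], Y[train_idx], X[test_idx], Y[test_idx]

# Tiny stand-in data: 10 samples with 2 features each
X = np.arange(20).reshape(10, 2).astype('float32')
Y = np.array([0, 1] * 5)

X_tr, Y_tr, X_te, Y_te = train_test_split(X, Y)
print(len(X_tr), len(X_te))  # 8 2
```

You would then fit on the train portion and call evaluate on the held-out test portion.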

You can evaluate your model on your training dataset using the evaluate function on your model, passing the same input and output you used to train it.

This generates a prediction for every input and output pair and collects scores, including the average loss and any other important metrics you have configured, such as accuracy.


scores=model.evaluate(X,Y)

print("\n%s:%.2f%%"%(model.metrics_names[1],scores[1]*100))

Once you tie everything together, you get your first neural network model created in Keras. Your complete code will look as follows.

from keras.models import Sequential

from keras.layers import Dense

import numpy

numpy.random.seed(7)

dataset = numpy.loadtxt("pima-indians-diabetes.csv", delimiter=",")

X=dataset[:,0:8]

Y=dataset[:,8]

model=Sequential()

model.add(Dense(12,input_dim=8,activation='relu'))

model.add(Dense(8,activation='relu'))

model.add(Dense(1,activation='sigmoid'))

model.compile(loss='binary_crossentropy',optimizer='adam',metrics=['accuracy'])

model.fit(X,Y,epochs=150,batch_size=10)

scores=model.evaluate(X,Y)

print("\n%s:%.2f%%"%(model.metrics_names[1],scores[1]*100))

Running this, you should see a message for each of the 150 epochs printing the loss and accuracy, followed by the final evaluation of the trained model on the training dataset. You should get output like the following.


...

Epoch 145/150

768/768 [==============================] - 0s - loss: 0.5105 - acc: 0.7396

Epoch 146/150

768/768 [==============================] - 0s - loss: 0.4900 - acc: 0.7591

Epoch 147/150

768/768 [==============================] - 0s - loss: 0.4939 - acc: 0.7565

Epoch 148/150

768/768 [==============================] - 0s - loss: 0.4766 - acc: 0.7773

Epoch 149/150

768/768 [==============================] - 0s - loss: 0.4883 - acc: 0.7591

Epoch 150/150

768/768 [==============================] - 0s - loss: 0.4827 - acc: 0.7656

32/768 [>.............................] - ETA: 0s

acc: 78.26%

You can use this neural network model for making predictions as well, although you will have to adapt the example above just a bit. Making predictions is very easy once you call the model's predict function.

In this case, you are using a sigmoid activation function on your output layer, so you get predictions in the range between zero and one. You can quickly convert them into binary predictions for your classification task by just rounding them.
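Since the sigmoid output lies between zero and one, rounding is simply a threshold at 0.5. A tiny NumPy illustration with made-up probabilities:

```python
import numpy as np

# Hypothetical sigmoid outputs, one per sample, as predict would return them.
probs = np.array([[0.12], [0.88], [0.51], [0.49]])

# Rounding maps each probability to the nearer of 0 and 1.
labels = [int(round(float(p[0]))) for p in probs]
print(labels)  # [0, 1, 1, 0]
```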

To run predictions for every record contained in your training data, run the code shown below.

from keras.models import Sequential

from keras.layers import Dense

import numpy

seed=7

numpy.random.seed(seed)

dataset = numpy.loadtxt("pima-indians-diabetes.csv", delimiter=",")

X=dataset[:,0:8]

Y=dataset[:,8]

model=Sequential()

model.add(Dense(12,input_dim=8,init='uniform',activation='relu'))

model.add(Dense(8,init='uniform',activation='relu'))

model.add(Dense(1,init='uniform',activation='sigmoid'))

model.compile(loss='binary_crossentropy',optimizer='adam',metrics=['accuracy'])

model.fit(X,Y,epochs=150,batch_size=10,verbose=2)

predictions=model.predict(X)

rounded = [round(x[0]) for x in predictions]

print(rounded)

Running this model prints the predictions for every input pattern. You can use these predictions directly in an application, if required.

Now that you know how to create a neural network in Keras, we are going to move on to more complex deep learning tasks that you can efficiently execute using the powerful Keras Python library.


CHAPTER 3: MULTI-LAYER PERCEPTRON NETWORK MODELS

The powerful Keras Python library for deep learning focuses on creating models as a sequence of layers. In this section of the book, you are going to learn how to use these simple components to create multi-layer perceptron network models with Keras.

As already mentioned in the book, the simplest model you can create is defined in the Sequential class, a linear stack of layers. You can create a Sequential model and define all of its layers at once, as seen below.

from keras.models import Sequential

model=Sequential(...)

However, a more useful idiom is to create your Sequential model first and then add the layers one by one, as illustrated below.

from keras.models import Sequential

model=Sequential()

model.add(...)

model.add(...)

model.add(...)


Once done, you must define the model inputs. Be mindful that the first layer in your neural network must specify the shape of the input: the total number of input attributes, defined by the input_dim argument, which expects an integer. For instance, you can readily define your input in terms of eight inputs for a Dense layer as follows.

Dense(16,input_dim=8)


MODEL LAYERS

Next come the layers of your model. Remember that layers of different kinds usually have several properties in common, especially their activation functions and their weight initialization. The initialization used for a certain layer is specified with the init argument.

Some of the most commonly used weight initialization arguments are uniform, normal and zero. With uniform initialization, weights are set to small, uniformly random values between 0 and 0.05. With normal initialization, weights are set to small Gaussian random values, with zero mean and a standard deviation of 0.05. With zero, all weights are set to zero.
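What these initializers draw can be sketched with plain NumPy; this is a rough stand-in for Keras's internal routines, shown only to make the ranges concrete:

```python
import numpy as np

rng = np.random.RandomState(7)
shape = (8, 12)  # e.g. weights between an 8-input layer and 12 neurons

uniform_w = rng.uniform(0.0, 0.05, size=shape)  # 'uniform': small values in [0, 0.05)
normal_w = rng.normal(0.0, 0.05, size=shape)    # 'normal': zero mean, 0.05 std dev
zero_w = np.zeros(shape)                        # 'zero': every weight is zero

print(uniform_w.min() >= 0.0, uniform_w.max() < 0.05, zero_w.sum() == 0.0)
```

Small, non-identical starting weights are what lets gradient descent break symmetry between neurons.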

Keras supports many standard neuron activation functions as well, such as rectifier, sigmoid, tanh and softmax. You will ordinarily specify the activation function used by a layer with the activation argument, which takes a string value. Alternatively, you can create an Activation object and add it to your model directly to apply an activation function to the output of the preceding layer.

There is a wide range of core layer types in Keras for standard neural networks. Some of the most useful and routinely used are Dense, Dropout and Merge layers. Dense is a fully-connected layer, used most often in multi-layer perceptron models. Dropout layers apply dropout to the model by setting a fraction of the inputs to zero, to reduce the very common issue of overfitting. Merge layers combine the inputs from several Keras models into a single model.
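The effect of a Dropout layer during training can be sketched in NumPy. This sketch uses the common "inverted dropout" rescaling, and the 0.2 rate is just an example, not a recommendation:

```python
import numpy as np

def dropout(x, rate, rng):
    # Keep each unit with probability (1 - rate); rescale the survivors
    # so the expected sum of activations stays unchanged.
    keep = (rng.uniform(size=x.shape) >= rate).astype(x.dtype)
    return x * keep / (1.0 - rate)

rng = np.random.RandomState(0)
x = np.ones(1000, dtype='float64')
y = dropout(x, 0.2, rng)

dropped = float((y == 0).mean())
print(round(dropped, 2))  # roughly 0.2
```

At test time dropout is disabled, which is why the rescaling during training matters.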


MODEL COMPILATION

As you already know, once you have defined your model, you must compile it. Compilation creates the highly efficient structure used by the underlying backend, TensorFlow or Theano, to execute your model during training. You compile your model using the compile function, which accepts three important attributes: the loss function, the optimizer, and the metrics.

model.compile(optimizer=,loss=,metrics=)

When it comes to model optimizers, the optimizer is the search technique used to update the weights of your model. You can create an optimizer object and pass it to the compile function via the optimizer argument. This allows you to effortlessly configure the optimization process with its own arguments, such as the learning rate.

sgd=SGD(...)

model.compile(optimizer=sgd)

You can also use the default parameters of an optimizer by simply specifying the name of the optimizer in the optimizer argument, as shown below.

model.compile(optimizer='sgd')


Some frequently used gradient descent optimizers include SGD, RMSprop and Adam. SGD is stochastic gradient descent, with support for momentum. RMSprop is an adaptive learning rate optimization method, and Adam, short for Adaptive Moment Estimation, also uses adaptive learning rates.

Once you have chosen the optimizer, you must move on to the model loss function, also called the objective function: the evaluation of the model used by the optimizer to navigate the weight space.

You can quickly specify the name of the loss function to pass to the compile function. Some of the most commonly used loss function arguments are mse for mean squared error, categorical_crossentropy for multi-class logarithmic loss, and binary_crossentropy for binary logarithmic loss.
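The binary logarithmic loss can be written out by hand. A small NumPy version with illustrative labels and probabilities:

```python
import numpy as np

def binary_crossentropy(y_true, y_pred, eps=1e-7):
    # Mean negative log-likelihood of the true labels under the predicted
    # probabilities; eps guards against taking log(0).
    y_pred = np.clip(y_pred, eps, 1.0 - eps)
    return float(np.mean(-(y_true * np.log(y_pred)
                           + (1.0 - y_true) * np.log(1.0 - y_pred))))

y_true = np.array([1.0, 0.0, 1.0])   # made-up labels
y_pred = np.array([0.9, 0.1, 0.8])   # made-up predicted probabilities
print(round(binary_crossentropy(y_true, y_pred), 4))
```

Confident, correct predictions contribute a loss near zero, while confident wrong ones are penalized heavily.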

With the loss function chosen, you move on to metrics. Metrics are evaluated during the process of model training; the most commonly used metric is accuracy.


MODEL TRAINING

Your neural network model is trained on NumPy arrays using the fit function, as seen below.

model.fit(X,y,epochs=,batch_size=)

Model training specifies both the number of epochs to train and the batch size. As already mentioned in the book, epochs are the total number of times your model is exposed to the training dataset, while the batch size is the total number of training instances shown to your model before a weight update is performed.

As mentioned, for model training you are going to use the fit function, which also allows a basic evaluation to be performed on your model during training. You can handily set the validation_split value to hold back a certain fraction of your training dataset for validation, evaluated each epoch, or provide a validation_data tuple of (X, y) data to evaluate. Moreover, fitting your model returns a history object with the metrics and details calculated for the model each epoch. This is commonly used for graphing your model's overall performance.


MODEL PREDICTION

Once you are done training your neural network model, you can use it to make predictions on your test data or on other new data. There is a range of different output types you can calculate from your trained model, each calculated using a different function you call on the model.

For instance, you can use the evaluate function to calculate the loss values for your input data, or the predict function to generate the network output for your input data. You can use the predict_proba function to generate class probabilities for your input data, or the predict_classes function to generate class outputs. On classification problems, use predict_classes to make predictions for new data instances or for test data.

Once you are happy with your model and its properties, you can finalize it. If you need a summary of your model, you can readily display one by calling the summary function as follows.

model.summary()

You also have the option to retrieve your model configuration using the get_config function as follows.

model.get_config()


Finally, you have the option to create an image of your neural network model structure, as seen below.

from keras.utils.vis_utils import plot_model

plot_model(model, to_file='model.png')

In this section of the book, you discovered the Keras API, which you can use to create a wide range of deep learning and artificial neural network models. You learned how to construct a multi-layer neural network model and add layers, including their activation functions and weight initialization; how to compile your model with an optimizer, loss function and metrics; and how to fit your model, with its batch size and epochs, as well as how to make predictions and summarize your model.


CHAPTER 4: ACTIVATION FUNCTIONS FOR NEURAL NETWORKS

In this section of the book, we are going to give more attention to the most regularly used activation functions in neural networks. In this example, we are going to use MNIST, a set of approximately 70,000 images of handwritten digits, where each image is grayscale and 28 x 28 pixels in size. We are going to tackle this problem using a fully connected neural network with several different activation functions.

Our input shape will be (70000, 784), while our output shape will be (70000, 10). We use a fully connected neural network with one hidden layer. There are 784 neurons in the input layer, one for every pixel in the images, 512 neurons in the hidden layer, and 10 neurons in the output layer, one for every digit. Using Keras, we can utilize a different activation function for every layer, so we must decide which activation function to use in the hidden layer and which in the output layer. There are many different activation functions, but the most often used are relu, tanh and sigmoid. First, we will build a basic sequential model without any activation function on the hidden layer.

model = Sequential()

model.add(Dense(512, input_shape=(784,)))

model.add(Dense(10, activation='softmax'))

As already mentioned, there are 784 neurons in the input layer, 512 in the hidden layer, and 10 in the output layer. Before training the model, you can look at the network structure and parameters using the model summary function, as illustrated below.

Layers (input ==> output)
--------------------------
dense_1 (None, 784) ==> (None, 512)
dense_2 (None, 512) ==> (None, 10)

Summary

Layer (type)                 Output Shape              Param #
=================================================================
dense_1 (Dense)              (None, 512)               401920
_________________________________________________________________
output (Dense)               (None, 10)                5130
=================================================================
Total params: 407,050
Trainable params: 407,050
Non-trainable params: 0
_________________________________________________________________
None

Once you are sure about the structure of your model, train it for five epochs.
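The parameter counts in a summary like this are easy to verify by hand: a Dense layer has inputs times outputs weights, plus one bias per output neuron. A quick check:

```python
def dense_params(n_in, n_out):
    # weights (n_in * n_out) plus one bias per output neuron
    return n_in * n_out + n_out

hidden = dense_params(784, 512)
output = dense_params(512, 10)
print(hidden, output, hidden + output)  # 401920 5130 407050
```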

Train on 60000 samples, validate on 10000 samples
Epoch 1/5
60000/60000 [==============================] - 3s - loss: 0.3813 - acc: 0.8901 - val_loss: 0.2985 - val_acc: 0.9178
Epoch 2/5
60000/60000 [==============================] - 3s - loss: 0.3100 - acc: 0.9132 - val_loss: 0.2977 - val_acc: 0.9196
Epoch 3/5
60000/60000 [==============================] - 3s - loss: 0.2965 - acc: 0.9172 - val_loss: 0.2955 - val_acc: 0.9186
Epoch 4/5
60000/60000 [==============================] - 3s - loss: 0.2873 - acc: 0.9209 - val_loss: 0.2857 - val_acc: 0.9245
Epoch 5/5
60000/60000 [==============================] - 3s - loss: 0.2829 - acc: 0.9214 - val_loss: 0.2982 - val_acc: 0.9185

Test loss: 0.299
Test accuracy: 0.918


As you can see, our result of 91.8% on MNIST is quite bad. If you plot the losses, you will see that the validation loss is far from improving, and it will not improve even after a hundred epochs.

Therefore, we must try something different to make our neural network model learn better and work smarter. We can achieve this by using one of the most customarily used activation functions, the sigmoid activation function.


SIGMOID ACTIVATION FUNCTION

To improve our neural network model, we will use the sigmoid activation function, which squashes its input into the (0, 1) interval.

model = Sequential()

model.add(Dense(512, activation='sigmoid', input_shape=(784,)))

model.add(Dense(10, activation='softmax'))

You will see that the structure of your neural network remains the same; you have only changed the activation function of the dense layer. Try it again for five epochs.

Train on 60000 samples, validate on 10000 samples
Epoch 1/5
60000/60000 [==============================] - 3s - loss: 0.4224 - acc: 0.8864 - val_loss: 0.2617 - val_acc: 0.9237
Epoch 2/5
60000/60000 [==============================] - 3s - loss: 0.2359 - acc: 0.9310 - val_loss: 0.1989 - val_acc: 0.9409
Epoch 3/5
60000/60000 [==============================] - 3s - loss: 0.1785 - acc: 0.9477 - val_loss: 0.1501 - val_acc: 0.9550
Epoch 4/5
60000/60000 [==============================] - 3s - loss: 0.1379 - acc: 0.9598 - val_loss: 0.1272 - val_acc: 0.9629
Epoch 5/5
60000/60000 [==============================] - 3s - loss: 0.1116 - acc: 0.9673 - val_loss: 0.1131 - val_acc: 0.9668

Test loss: 0.113
Test accuracy: 0.967

This looks much better. But consider what happens without an activation function: even after stacking many layers, you get only a linear combination of the input with the weights and the bias, which is very similar to a neural network without any hidden layers. You can add some more layers, without activations, just to see what will occur, as shown below.


model = Sequential()

model.add(Dense(512, input_shape=(784,)))

for i in range(5):
    model.add(Dense(512))

model.add(Dense(10, activation='softmax'))

When you do this, your neural network model looks as indicated below.

dense_1 (None, 784) ==> (None, 512)
dense_2 (None, 512) ==> (None, 512)
dense_3 (None, 512) ==> (None, 512)
dense_4 (None, 512) ==> (None, 512)
dense_5 (None, 512) ==> (None, 512)
dense_6 (None, 512) ==> (None, 10)

_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
dense_1 (Dense)              (None, 512)               401920
_________________________________________________________________
dense_2 (Dense)              (None, 512)               262656
_________________________________________________________________
dense_3 (Dense)              (None, 512)               262656
_________________________________________________________________
dense_4 (Dense)              (None, 512)               262656
_________________________________________________________________
dense_5 (Dense)              (None, 512)               262656
_________________________________________________________________
dense_6 (Dense)              (None, 10)                5130
=================================================================
Total params: 1,720,330
Trainable params: 1,720,330
Non-trainable params: 0
_________________________________________________________________
None

You get results for five epochs as follows.

Train on 60000 samples, validate on 10000 samples


Epoch 1/5
60000/60000 [==============================] - 17s - loss: 1.3217 - acc: 0.7310 - val_loss: 0.7553 - val_acc: 0.7928
Epoch 2/5
60000/60000 [==============================] - 16s - loss: 0.5304 - acc: 0.8425 - val_loss: 0.4121 - val_acc: 0.8787
Epoch 3/5
60000/60000 [==============================] - 15s - loss: 0.4325 - acc: 0.8724 - val_loss: 0.3683 - val_acc: 0.9005
Epoch 4/5
60000/60000 [==============================] - 16s - loss: 0.3936 - acc: 0.8852 - val_loss: 0.3638 - val_acc: 0.8953
Epoch 5/5
60000/60000 [==============================] - 16s - loss: 0.3712 - acc: 0.8945 - val_loss: 0.4163 - val_acc: 0.8767

Test loss: 0.416
Test accuracy: 0.877

This is quite bad. You can see that your neural network model is simply unable to learn what you want. This happened because without nonlinearity, your network is just a basic linear classifier, incapable of capturing any nonlinear relationships.

Sigmoid, on the other hand, is a nonlinear function, so its output cannot be represented as a linear combination of the input. That is exactly what brings nonlinearity to your neural network, allowing it to learn nonlinear relationships. Now train the five-hidden-layer model again, this time using sigmoid activations.

Train on 60000 samples, validate on 10000 samples
Epoch 1/5
60000/60000 [==============================] - 16s - loss: 0.8012 - acc: 0.7228 - val_loss: 0.3798 - val_acc: 0.8949
Epoch 2/5
60000/60000 [==============================] - 15s - loss: 0.3078 - acc: 0.9131 - val_loss: 0.2642 - val_acc: 0.9264
Epoch 3/5
60000/60000 [==============================] - 15s - loss: 0.2031 - acc: 0.9419 - val_loss: 0.2095 - val_acc: 0.9408
Epoch 4/5
60000/60000 [==============================] - 15s - loss: 0.1545 - acc: 0.9544 - val_loss: 0.2434 - val_acc: 0.9282
Epoch 5/5
60000/60000 [==============================] - 15s - loss: 0.1236 - acc: 0.9633 - val_loss: 0.1504 - val_acc: 0.9548

Test loss: 0.15
Test accuracy: 0.955

This is much better. In this case, you are probably over-fitting, but you can see that you got a great boost in your model's performance just by using an activation function. Sigmoid activation functions have many phenomenal properties: they are differentiable and nonlinear, and the (0, 1) range means the output can be read as a probability, which is a nice property.

However, this approach has its drawbacks. When you use backpropagation, you must propagate the derivative of the error from the output back to the initial weights: you want to pass your regression or classification error in the final output value back through your whole neural network.

Therefore, you must differentiate through your layers to update all the weights. However, with sigmoid there is an issue with the derivative: its maximum value is quite small, just 0.25. This means you can only pass a small fraction of your error back to the earlier layers, which may cause your model to learn slowly, needing more epochs and more data. To solve this problem, you can use the tanh function.
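The 0.25 figure is easy to verify numerically, since the sigmoid derivative is sigmoid(x) * (1 - sigmoid(x)):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Evaluate the derivative on a dense grid and find its peak.
x = np.linspace(-10, 10, 2001)
grad = sigmoid(x) * (1.0 - sigmoid(x))  # derivative of sigmoid
print(round(float(grad.max()), 4))  # 0.25, reached at x = 0
```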


TANH ACTIVATION FUNCTION

The tanh activation function, just like sigmoid, is differentiable and nonlinear. Tanh gives output in the (-1, 1) range, which is not as convenient as the (0, 1) range for an output layer but is fine for neural network hidden layers. Tanh also has a larger maximum derivative, 1 compared to sigmoid's 0.25, which helps with our issue here, as we can pass more of the error back through the network.
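The same numerical check for tanh, whose derivative is 1 - tanh(x)^2, shows a maximum of 1, four times sigmoid's 0.25:

```python
import numpy as np

# Evaluate the tanh derivative on a dense grid and find its peak.
x = np.linspace(-10, 10, 2001)
grad = 1.0 - np.tanh(x) ** 2  # derivative of tanh
print(round(float(grad.max()), 4))  # 1.0, reached at x = 0
```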

To use the tanh activation function, you must change the activation attribute of your dense layer.

model = Sequential()

model.add(Dense(512, activation='tanh', input_shape=(784,)))

model.add(Dense(10, activation='softmax'))

Again, you can see that the structure of your neural network is the same. Now, train for five epochs.

Train on 60000 samples, validate on 10000 samples
Epoch 1/5
60000/60000 [==============================] - 5s - loss: 0.3333 - acc: 0.9006 - val_loss: 0.2106 - val_acc: 0.9383
Epoch 2/5
60000/60000 [==============================] - 3s - loss: 0.1754 - acc: 0.9489 - val_loss: 0.1485 - val_acc: 0.9567
Epoch 3/5
60000/60000 [==============================] - 3s - loss: 0.1165 - acc: 0.9657 - val_loss: 0.1082 - val_acc: 0.9670
Epoch 4/5
60000/60000 [==============================] - 3s - loss: 0.0843 - acc: 0.9750 - val_loss: 0.0920 - val_acc: 0.9717
Epoch 5/5
60000/60000 [==============================] - 3s - loss: 0.0653 - acc: 0.9806 - val_loss: 0.0730 - val_acc: 0.9782

Test loss: 0.073
Test accuracy: 0.978


You can see that you improved your test accuracy by more than one percent just by using a different activation function. Now you probably wonder: can you do better? Fortunately, you can, thanks to the relu activation function.


RELU ACTIVATION FUNCTION

The range of the relu activation function is 0 to infinity. However, unlike tanh and sigmoid, relu is not differentiable at zero, though there are practical workarounds for this.

The best thing about the relu activation function is its gradient, which is equal to one for all positive inputs, so you can pass the maximum amount of the error back during backpropagation. Now, train your model and see the results.
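As a quick numerical check of that gradient claim (using the usual convention of a zero subgradient at the origin):

```python
import numpy as np

def relu_grad(x):
    # The subgradient of max(0, x): 0 where x <= 0, 1 where x > 0.
    return (x > 0).astype('float64')

x = np.array([-2.0, -0.5, 0.5, 3.0])
print(relu_grad(x))  # [0. 0. 1. 1.]
```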

Train on 60000 samples, validate on 10000 samples
Epoch 1/5
60000/60000 [==============================] - 5s - loss: 0.2553 - acc: 0.9263 - val_loss: 0.1505 - val_acc: 0.9516
Epoch 2/5
60000/60000 [==============================] - 3s - loss: 0.1041 - acc: 0.9693 - val_loss: 0.0920 - val_acc: 0.9719
Epoch 3/5
60000/60000 [==============================] - 3s - loss: 0.0690 - acc: 0.9790 - val_loss: 0.0833 - val_acc: 0.9744
Epoch 4/5
60000/60000 [==============================] - 4s - loss: 0.0493 - acc: 0.9844 - val_loss: 0.0715 - val_acc: 0.9781
Epoch 5/5
60000/60000 [==============================] - 3s - loss: 0.0376 - acc: 0.9885 - val_loss: 0.0645 - val_acc: 0.9823

Test loss: 0.064
Test accuracy: 0.982

Now you got the best result, 98.2%. This is quite amazing, and you achieved it with only a single hidden layer.


It is very important to say that there is no single best activation function: one may be better in some cases while another is better in other instances. But as these experiments show, the choice of activation function does affect what your neural network can learn and how fast it learns.


CHAPTER 5: MNIST HANDWRITTEN RECOGNITION

In this section of the book, we are going to build a simple neural network in Keras and train it on a GPU-enabled server. This model will be able to recognize handwritten digits, thanks to the MNIST dataset. As you already know, MNIST contains 70,000 images: 10,000 for testing and 60,000 for training. All images are 28 x 28 pixels and centered, which reduces preprocessing time.

To start, you must set up your environment with Keras, using Theano or TensorFlow as the backend. In this example, we are going to install the TensorFlow and Keras packages as shown below.


conda install -qy -c anaconda tensorflow-gpu h5py

pip install keras

Once that is done, move on to the imports, which are quite standard: NumPy for array handling, matplotlib for plotting, an environment variable to keep the TensorFlow backend quiet, and the Keras imports themselves.

import numpy as np

import matplotlib

matplotlib.use('agg')

import matplotlib.pyplot as plt

import os

os.environ['TF_CPP_MIN_LOG_LEVEL'] = '3'

from keras.datasets import mnist

from keras.models import Sequential, load_model

from keras.layers.core import Dense, Dropout, Activation

from keras.utils import np_utils

With that done, you can load the dataset and start building your neural network. The next step is to prepare the MNIST data. Load it using the very handy mnist.load_data function, which splits MNIST into train and test sets.

(X_train, y_train), (X_test, y_test) = mnist.load_data()

The next step is to inspect a few examples. Take into consideration that MNIST contains only grayscale images; more advanced datasets use RGB, that is, three color channels.

fig=plt.figure()

foriinrange(9):

plt.subplot(3,3,i+1)

plt.tight_layout()

plt.imshow(X_train[i],cmap='gray',interpolation='none')

plt.title("Class{}".format(y_train[i]))

plt.xticks([])

plt.yticks([])

fig

Next, you prepare to train your model to classify the images. To do this, you must unroll the width x height pixel format into one long vector, your input vector. First, graph the distribution of your pixel values as follows.

fig=plt.figure()

plt.subplot(2,1,1)

plt.imshow(X_train[0],cmap='gray',interpolation='none')

plt.title("Class{}".format(y_train[0]))

plt.xticks([])

plt.yticks([])

plt.subplot(2,1,2)

plt.hist(X_train[0].reshape(784))

plt.title("PixelValueDistribution")

fig


Just as expected, you get pixel values ranging from zero to 255. The background majority is close to zero, while the pixels closer to 255 represent the digits. To speed up model training, you should normalize the input data. Normalizing also reduces the chance of your model getting stuck in local optima, as you are using stochastic gradient descent to find the optimal weights for your neural network.
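The normalization itself is a one-liner; a small stand-in array of illustrative pixel intensities shows the effect:

```python
import numpy as np

# Made-up pixel intensities spanning the 0-255 range.
pixels = np.array([0, 51, 127, 255], dtype='float32')
pixels /= 255  # scale from [0, 255] into [0, 1]
print(pixels.min(), pixels.max())  # 0.0 1.0
```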

The next step is reshaping your inputs into a single vector and normalizing the pixel values to be between zero and one. First, print the shapes before reshaping and normalizing.

print("X_train shape", X_train.shape)

print("y_train shape", y_train.shape)

print("X_test shape", X_test.shape)

print("y_test shape", y_test.shape)

After that, you must build your input vectors from the 28 x 28 pixels as seen below.

X_train=X_train.reshape(60000,784)

X_test=X_test.reshape(10000,784)

X_train=X_train.astype('float32')

X_test=X_test.astype('float32')

The next step is to normalize the data to boost your model training.

X_train/=255

X_test/=255


The following step is to print your final input shape, which is ready for training.


print("Train matrix shape", X_train.shape)

print("Test matrix shape", X_test.shape)

('X_train shape', (60000, 28, 28))

('y_train shape', (60000,))

('X_test shape', (10000, 28, 28))

('y_test shape', (10000,))

('Train matrix shape', (60000, 784))

('Test matrix shape', (10000, 784))

As you can see, y in this dataset holds integer class values from zero to nine. Check the class distribution before training.

print(np.unique(y_train, return_counts=True))

(array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9], dtype=uint8), array([5923, 6742, 5958, 6131, 5842, 5421, 5918, 6265, 5851, 5949]))

The next step is to encode your categories, the digits from zero to nine, using one-hot encoding. The result is a vector with a length equal to the number of categories, which is all zeros except for a one in the position corresponding to the class.

n_classes = 10
print("Shape before one-hot encoding:", y_train.shape)
Y_train = np_utils.to_categorical(y_train, n_classes)
Y_test = np_utils.to_categorical(y_test, n_classes)
print("Shape after one-hot encoding:", Y_train.shape)


The next step is to turn to Keras to build your neural network. At this point, your pixel vector serves as the input. There are two hidden 512-node layers, which gives the model enough complexity for recognizing digits. Because this is multi-class classification, we must add another fully-connected layer for the ten different output classes. We are going to use the Sequential model, and the first step is to stack layers using the add method.

When you add the first layer in a Keras Sequential model, you must specify the input shape so Keras can create the proper matrices; the shape of all other layers is inferred by Keras automatically. To introduce nonlinearities into the network and take it beyond the capabilities of a basic perceptron, you must add activation functions to your hidden layers.

The differentiation needed for model training via backpropagation happens behind the scenes. You should add dropout layers as the best way of preventing model over-fitting. You will use the softmax activation, the standard for multi-class targets, on the output layer. When building your model, the first step is to create a linear stack of layers.

model = Sequential()
model.add(Dense(512, input_shape=(784,)))
model.add(Activation('relu'))
model.add(Dropout(0.2))
model.add(Dense(512))
model.add(Activation('relu'))
model.add(Dropout(0.2))
model.add(Dense(10))
model.add(Activation('softmax'))

Following that, you must compile your model using the compile method. In this step, you must specify your objective, or loss function. In this example, we are going to use categorical cross-entropy, but you can use other loss functions.

When it comes to optimizers, in this example we are going to use the Adam optimizer with default settings. You can also instantiate your optimizer and set its parameters before calling compile. You must choose which metrics you want to evaluate during model training and testing; these metrics can be displayed during both stages if you like. To compile your model, do as follows.

model.compile(loss='categorical_crossentropy', metrics=['accuracy'], optimizer='adam')

Once you compile your model, you can move on to model training. You must specify how many times you want to iterate over the training set (the epochs) and how many samples to use per weight update (the batch size).

Keep in mind that the bigger the batch, the more stable the stochastic gradient descent updates become; however, be aware of GPU memory limitations. In this example, we are going with a batch size of 128 and 8 epochs.


To handle the training process correctly, you should graph the learning curves for your model, looking at the model accuracy and loss. Before you continue, you should save your model; once training is completed, you get to work with the trained model and finally evaluate its performance. To save the metrics, run as shown below.

history = model.fit(X_train, Y_train,
          batch_size=128, epochs=8,
          verbose=2,
          validation_data=(X_test, Y_test))

Then, you must save your model.

save_dir = "/results/"
model_name = 'keras_mnist.h5'
model_path = os.path.join(save_dir, model_name)
model.save(model_path)
print('Saved trained model at %s' % model_path)


The next step is to plot the metrics.

fig = plt.figure()
plt.subplot(2, 1, 1)
plt.plot(history.history['acc'])
plt.plot(history.history['val_acc'])
plt.title('model accuracy')
plt.ylabel('accuracy')
plt.xlabel('epoch')
plt.legend(['train', 'test'], loc='lower right')
plt.subplot(2, 1, 2)
plt.plot(history.history['loss'])
plt.plot(history.history['val_loss'])
plt.title('model loss')
plt.ylabel('loss')
plt.xlabel('epoch')
plt.legend(['train', 'test'], loc='upper right')
plt.tight_layout()
fig

You will notice that the loss on the training set decreases rapidly over the first two epochs, which means your neural model is learning to classify handwritten digits quite fast. The loss on the test set does not decrease as fast, but it stays within range of the training loss, which means your model is capable of generalizing well to unseen data.


The following step is to evaluate the model's performance on the given test set. To assess the model, use the model.evaluate method, which computes every metric defined during compilation. In this example, the model accuracy is computed on the 10,000 test images using the weights from our saved model.

mnist_model = load_model(model_path)
loss_and_metrics = mnist_model.evaluate(X_test, Y_test, verbose=2)
print("Test Loss", loss_and_metrics[0])
print("Test Accuracy", loss_and_metrics[1])

('Test Loss', 0.06264158328680787)
('Test Accuracy', 0.98299999999999998)

The model accuracy looks quite good. However, you should look at nine examples each of correctly and incorrectly classified digits. The first step is to load the model and create predictions on your test set.

mnist_model = load_model(model_path)
predicted_classes = mnist_model.predict_classes(X_test)

Then, see what you predicted correctly and incorrectly.

correct_indices = np.nonzero(predicted_classes == y_test)[0]
incorrect_indices = np.nonzero(predicted_classes != y_test)[0]
print()
print(len(correct_indices), "classified correctly")
print(len(incorrect_indices), "classified incorrectly")

The following step is to adapt the figure size to accommodate eighteen subplots, as follows.

plt.rcParams['figure.figsize'] = (7, 14)
figure_evaluation = plt.figure()

Then, you must plot nine correct and nine incorrect predictions.

for i, correct in enumerate(correct_indices[:9]):
    plt.subplot(6, 3, i + 1)
    plt.imshow(X_test[correct].reshape(28, 28), cmap='gray', interpolation='none')
    plt.title("Predicted {}, Class {}".format(predicted_classes[correct], y_test[correct]))
    plt.xticks([])
    plt.yticks([])

for i, incorrect in enumerate(incorrect_indices[:9]):
    plt.subplot(6, 3, i + 10)
    plt.imshow(X_test[incorrect].reshape(28, 28), cmap='gray', interpolation='none')
    plt.title("Predicted {}, Class {}".format(predicted_classes[incorrect], y_test[incorrect]))
    plt.xticks([])
    plt.yticks([])

figure_evaluation

9696/10000 [============================>.] - ETA: 0s


(9830, 'classified correctly')
(170, 'classified incorrectly')

As you can see, these incorrect predictions are quite forgivable, as in some cases the digit is hard even for a human reader to recognize. In this section of the book, we used Keras with its TensorFlow backend on a GPU-enabled server to train our neural network to recognize handwritten digits in just under 20 seconds of overall training time.


CHAPTER 6: NEURAL NETWORK MODELS FOR MULTI-CLASS CLASSIFICATION PROBLEMS

As you already know, Keras is a highly powerful Python library for deep learning, which wraps the efficient numerical libraries TensorFlow and Theano. In this section of the book, you are going to learn how to use Keras to develop and evaluate neural network models for assorted multi-class classification problems.

After that, you will know how to load data from CSV into Keras, how to prepare multi-class classification data for further modeling with your neural networks, and how to evaluate your Keras neural network models using scikit-learn.

In this specific example, we are going to look at one of the standard machine learning problems, the iris flowers dataset. This problem is well studied, so it is a great example to practice on with neural networks: all four input variables are numeric and share the same scale, measured in centimeters. Each instance describes the measured properties of a flower, and the output variable is a specific iris species.

This is a standard multi-class classification problem, meaning there are more than two classes to predict; here, there are three flower species.


This is a great illustration to use when you want to practice neural network models in Keras, because the three class values require specialized handling. Since the iris flower dataset is a common and well-studied problem, expect model accuracy somewhere between 95% and 97%. To start, download the iris flower dataset from the Machine Learning repository and place it in your working directory. The next step is to import all needed classes and functions, including data loading with pandas, data preparation, and model evaluation from scikit-learn.

import numpy
import pandas
from keras.models import Sequential
from keras.layers import Dense
from keras.wrappers.scikit_learn import KerasClassifier
from keras.utils import np_utils
from sklearn.model_selection import cross_val_score
from sklearn.model_selection import KFold
from sklearn.preprocessing import LabelEncoder
from sklearn.pipeline import Pipeline

The next step is to initialize the random number generator to a constant value of seven. This is important, as you want to ensure that the results you get from this neural network model can be reproduced.

This step ensures that the stochastic process of model training can be reproduced. Therefore, your next step is to fix the random seed for reproducibility, as seen below.


seed = 7
numpy.random.seed(seed)

Once fixed, you must load the dataset. Because the output variable contains strings, the best way is to load the data using pandas. You can then split the attributes into input variables, X, and output variables, Y.

dataframe = pandas.read_csv("iris.csv", header=None)
dataset = dataframe.values
X = dataset[:, 0:4].astype(float)
Y = dataset[:, 4]


ONE-HOT ENCODING

As already mentioned, the output variable in this example contains three string values, so you must encode it. When you model multi-class classification problems with neural networks, the best practice is to reshape the output attribute from a vector of class values into a matrix with a Boolean for each class value, indicating whether a given instance has that class value. This is called creating dummy variables, or one-hot encoding of categorical variables.

For instance, in this problem there are three class values: Iris-setosa, Iris-versicolor and Iris-virginica.

Iris-setosa

Iris-versicolor

Iris-virginica

You can turn this into a one-hot encoded binary matrix for every data instance, which would look as shown below.

Iris-setosa, Iris-versicolor, Iris-virginica
1, 0, 0
0, 1, 0
0, 0, 1

You can do this by first encoding the strings as integers with the scikit-learn class LabelEncoder. Once completed, you can convert the vector of integers to a one-hot encoding using the to_categorical function in Keras, as demonstrated below.


encoder = LabelEncoder()
encoder.fit(Y)
encoded_Y = encoder.transform(Y)
dummy_y = np_utils.to_categorical(encoded_Y)


DEFINING NEURAL NETWORK MODELS WITH SCIKIT-LEARN

Once done with one-hot encoding, you must define your neural network model. The Keras library comes with wrapper classes that allow you to use the neural network models you create in Keras from scikit-learn. In particular, the KerasClassifier class can be used as a scikit-learn estimator.

KerasClassifier takes the name of a function as an argument. This function must return your constructed neural network model, ready for training. Next, we are going to create such a function for the iris classification problem.

When you run the code, you will create a simple, fully-connected neural network containing one hidden layer with eight neurons. The hidden layer uses a rectified linear activation function, which is good practice. Since we used one-hot encoding on the iris dataset, the output layer must create three output values, one for each class; the output with the largest value is taken as the class predicted by the model. The network topology is as follows.

4 inputs -> [8 hidden nodes] -> 3 outputs

One thing should be noted: we use a softmax activation function in the output layer to ensure the output values of the model are in the range of zero to one, so they may be used as predicted probabilities.
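That property is easy to verify directly: a softmax over any raw scores yields values in (0, 1) that sum to one. A minimal NumPy sketch (the scores below are made-up values, not model output):

```python
import numpy as np

def softmax(z):
    # subtract the max before exponentiating for numerical stability
    e = np.exp(z - np.max(z))
    return e / e.sum()

# hypothetical raw scores for the three iris classes
scores = np.array([2.0, 1.0, 0.1])
probs = softmax(scores)

print(probs)           # every value lies strictly between 0 and 1
print(probs.sum())     # the values sum to 1, so they act as probabilities
print(probs.argmax())  # the largest value marks the predicted class
```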


Finally, we must use the efficient Adam gradient descent optimization algorithm together with a logarithmic loss function, specified as categorical_crossentropy in Keras. The next step is to define your baseline model, then create and compile it as below.

def baseline_model():
    model = Sequential()
    model.add(Dense(8, input_dim=4, activation='relu'))
    model.add(Dense(3, activation='softmax'))
    model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
    return model

After that, you can finally create your KerasClassifier for use in scikit-learn. You can also pass arguments when constructing the KerasClassifier; they are passed on to the fit function used for training the neural network model. Here, we pass the number of epochs as 200 and the batch size as 5. Bear in mind that debugging output is turned off, as verbose is set to zero.

estimator = KerasClassifier(build_fn=baseline_model, epochs=200, batch_size=5, verbose=0)


EVALUATING MODELS WITH K-FOLD CROSS VALIDATION

Once finished with the previous step, you must evaluate your neural network model on your training data. scikit-learn has excellent capability for evaluating models using several different techniques, and the best way to evaluate neural network models is k-fold cross validation.

Using k-fold cross validation, you can evaluate the model on your dataset with a ten-fold cross validation object, or kfold. The evaluation of the model takes only about ten seconds.
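The splitter itself just partitions row indices into ten folds; a minimal sketch of how it divides a small dataset (the 30-row array here is a toy stand-in for the iris data):

```python
import numpy as np
from sklearn.model_selection import KFold

X_demo = np.arange(30).reshape(30, 1)  # 30 toy samples
kfold = KFold(n_splits=10, shuffle=True, random_state=7)

# each of the ten folds holds out a distinct tenth of the rows
fold_sizes = [len(test_idx) for _, test_idx in kfold.split(X_demo)]
print(fold_sizes)  # ten held-out folds of three samples each
```

A model is trained and scored once per fold, so the final result averages ten evaluations.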

When finished, the evaluation returns an object that describes the results of the ten constructed models, one for each split of the dataset, as shown below.

kfold = KFold(n_splits=10, shuffle=True, random_state=seed)
results = cross_val_score(estimator, X, dummy_y, cv=kfold)
print("Baseline: %.2f%% (%.2f%%)" % (results.mean()*100, results.std()*100))

Once completed, you will see the results summarized as both the mean and standard deviation of the model's accuracy on the dataset we used.

This is a very reasonable estimate of the performance of your neural network model on unseen data. It is well within the realm of known results for this specific problem, as you get accuracy as seen below.

Accuracy: 97.33% (4.42%)


CHAPTER 7: RECURRENT NEURAL NETWORKS

In this last section of the book, you are going to learn how to create recurrent neural networks in Keras. Recurrent neural networks are a class of neural network models that exploit the sequential nature of their input. Such inputs can be speech, text, time series, and anything else where the occurrence of an element in a sequence depends on the elements that appeared before it.

An RNN model can be thought of as a graph of recurrent neural network cells, where every cell performs the same operation on each element of the sequence.
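That repeated cell update can be written out directly. A minimal NumPy sketch of a basic recurrent cell (the weights are random placeholders, not a trained model):

```python
import numpy as np

rng = np.random.default_rng(0)
n_in, n_hidden, seq_len = 4, 5, 3

W = rng.normal(size=(n_hidden, n_in))      # input-to-hidden weights
U = rng.normal(size=(n_hidden, n_hidden))  # hidden-to-hidden (recurrent) weights
b = np.zeros(n_hidden)

xs = rng.normal(size=(seq_len, n_in))      # a toy input sequence
h = np.zeros(n_hidden)                     # initial hidden state

# the same cell operation is applied at every element of the sequence
for x in xs:
    h = np.tanh(W @ x + U @ h + b)

print(h.shape)  # the final state summarizes the whole sequence
```

Because the state h is fed back in at every step, the final state depends on the entire sequence, which is exactly the memory the chapter describes.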

Recurrent neural networks are very flexible, so they have been used to solve diverse problems like language modeling, speech recognition, sentiment analysis, machine translation and image captioning, to name a few.

Recurrent neural networks can be readily adapted to many kinds of problems just by rearranging the way the cells are arranged in the graph. In this section of the book, you are going to learn more about LSTM (long short-term memory) and GRU (gated recurrent unit) models, their strengths and their limitations.

Both GRU and LSTM are drop-in replacements for the basic recurrent neural network cell, so just by replacing the basic cell with one of these two variants you can get a major performance boost in your network.


While GRU and LSTM are not the only variants, they have proven to be the most effective for solving most sequence problems.


SEQUENCE CLASSIFICATION WITH LSTM RECURRENT NEURAL NETWORKS

Sequence classification is a common predictive modeling problem in which you have a sequence of inputs over time or space, and your task is to predict a category for that sequence. A powerful type of neural network created to handle problems like this is the LSTM recurrent neural network.

Long short-term memory is a type of recurrent neural network commonly used in deep learning because large architectures built from it can be successfully trained. In this section, you are going to learn sequence classification in Keras using LSTM recurrent neural networks.

What makes this problem difficult is that the sequences can vary in length, they may contain a very large vocabulary of input symbols, and they may require the model to learn the long-term context and dependencies between symbols in the input sentence.

The problem we are going to solve is the IMDB movie sentiment classification problem. Each movie review on IMDB is a variable-length sequence of words, and the sentiment of each review must be classified. We will use the IMDB dataset, which contains 25,000 movie reviews, both good and bad, split between training and testing. The task is to determine whether a movie review carries negative or positive sentiment.


Keras comes with built-in access to the IMDB dataset; to load it, use the imdb.load_data function. Once loaded, you can use it for your deep learning models. The words have been replaced by integers that indicate the frequency rank of each word in the dataset, so the sentences in each movie review are sequences of integers.


WORD EMBEDDING

Our first move is to map each movie review into a real vector domain. This is a very popular technique for text called word embedding, in which words are encoded as real-valued vectors in a high-dimensional space, such that similarity between words in terms of meaning translates to closeness in that vector space.
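Under the hood, an embedding layer is just a lookup table: each integer word index selects one row of a trainable weight matrix. A minimal NumPy sketch (the vocabulary size and vector length are toy values, and the matrix is random rather than learned):

```python
import numpy as np

rng = np.random.default_rng(42)
vocab_size, embed_dim = 10, 4

# trainable weight matrix: one row (vector) per word in the vocabulary
embedding_matrix = rng.normal(size=(vocab_size, embed_dim))

# a "review" encoded as word indices, as in the IMDB dataset
review = np.array([3, 7, 1, 3])

# the layer's output is plain row selection: one vector per word
vectors = embedding_matrix[review]
print(vectors.shape)
```

Repeated words pick out identical rows, and training nudges those rows so that words with similar meanings end up close together in the vector space.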

Keras makes this convenient, as it provides a highly effective way of converting positive-integer word representations into word embeddings with its Embedding layer. We are going to map each word onto a 32-length real-valued vector. In addition, we are going to limit the total number of words we model to the 5,000 most frequent words, zeroing out the rest.

Moreover, we are going to constrain each movie review to five hundred words, truncating longer reviews and padding shorter reviews with zero values. The first step is to prepare and model the data; once done, you are ready to create your LSTM model, which will classify the sentiment of movie reviews.

Your first step is to quickly develop a basic LSTM for this IMDB problem. Start by importing the functions and classes required for this model. Then, initialize the random number generator to a constant value to make sure you can effortlessly reproduce your results.


import numpy
from keras.datasets import imdb
from keras.models import Sequential
from keras.layers import Dense
from keras.layers import LSTM
from keras.layers.embeddings import Embedding
from keras.preprocessing import sequence
numpy.random.seed(7)

Once done, you must load the IMDB dataset, constraining it to the top five thousand words, and split it into train and test sets.

top_words = 5000
(X_train, y_train), (X_test, y_test) = imdb.load_data(num_words=top_words)

The following step is truncating and padding the input sequences so they are all the same length. The model will learn that the zero values carry no information; same-length vectors are simply required for the computation.

max_review_length = 500
X_train = sequence.pad_sequences(X_train, maxlen=max_review_length)
X_test = sequence.pad_sequences(X_test, maxlen=max_review_length)

Once completed, you must define, compile and finally fit your LSTM model.


The first layer is the embedding layer, which uses 32-length vectors to represent each word. The following layer is the LSTM layer with one hundred smart neurons, or memory units. Finally, you must use a dense output layer containing a single neuron with a sigmoid activation function to make zero-or-one predictions for the two classes (good or bad reviews) in the problem.

Since this is a binary classification problem, you must use log loss as the loss function, together with the efficient Adam optimizer. The model is fit for only three epochs, because it quickly over-fits the problem. To space out weight updates, you will use a large batch size of sixty-four movie reviews.

embedding_vecor_length = 32
model = Sequential()
model.add(Embedding(top_words, embedding_vecor_length, input_length=max_review_length))
model.add(LSTM(100))
model.add(Dense(1, activation='sigmoid'))
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
print(model.summary())
model.fit(X_train, y_train, validation_data=(X_test, y_test), epochs=3, batch_size=64)

The next step is to estimate the performance of your model on unseen movie reviews, as follows.

scores = model.evaluate(X_test, y_test, verbose=0)
print("Accuracy: %.2f%%" % (scores[1]*100))


Running this code, you will get output as indicated below.

Epoch 1/3
16750/16750 [==============================] - 107s - loss: 0.5570 - acc: 0.7149
Epoch 2/3
16750/16750 [==============================] - 107s - loss: 0.3530 - acc: 0.8577
Epoch 3/3
16750/16750 [==============================] - 107s - loss: 0.2559 - acc: 0.9019
Accuracy: 86.79%


APPLYING DROPOUT

This very simple LSTM model with little tuning achieves great results on the IMDB problem. Use this model as a template you can apply to your own LSTM-based sequence classification problems.

Recurrent neural networks like LSTMs frequently suffer from over-fitting, which you can combat by applying the Keras Dropout layer between layers. Simply add new Dropout layers between the Embedding and LSTM layers and between the LSTM and Dense output layers, as follows.

from keras.layers import Dropout

model = Sequential()
model.add(Embedding(top_words, embedding_vecor_length, input_length=max_review_length))
model.add(Dropout(0.2))
model.add(LSTM(100))
model.add(Dropout(0.2))
model.add(Dense(1, activation='sigmoid'))

Running this, you will get the following result.

Epoch 1/3
16750/16750 [==============================] - 112s - loss: 0.6623 - acc: 0.5935
Epoch 2/3
16750/16750 [==============================] - 113s - loss: 0.5159 - acc: 0.7484
Epoch 3/3
16750/16750 [==============================] - 113s - loss: 0.4502 - acc: 0.7981
Accuracy: 82.82%


As you can see, the dropout layers have an impact on training, with a lower final accuracy and a slower trend in convergence. The LSTM model could probably use several more epochs of training for better skill. Dropout can also be applied precisely to the recurrent connections of the LSTM's memory units.

Keras provides this capability on the LSTM layer itself: the dropout parameter configures input dropout, and recurrent_dropout configures dropout on the recurrent connections. You can modify the code to add dropout to the input and the recurrent connections, as illustrated below.

model = Sequential()
model.add(Embedding(top_words, embedding_vecor_length, input_length=max_review_length))
model.add(LSTM(100, dropout=0.2, recurrent_dropout=0.2))
model.add(Dense(1, activation='sigmoid'))

You will see that this LSTM-specific dropout has a more pronounced effect on the convergence of the network than the layer-wise dropout. Dropout is a very powerful technique for combating over-fitting in your LSTM models; make sure you try both methods, though you may get better results with this gate-specific dropout method.


NATURAL LANGUAGE PROCESSING WITH RECURRENT NEURAL NETWORKS

In this section of the book, we are going to solve a natural language processing problem using recurrent neural networks in Keras. The problem aims to extract the meaning of speech utterances. We are going to break it into the solvable practical task of understanding the speaker in a limited context: here, we want to identify the intent of a speaker asking for information about flights.

We are going to use the Airline Travel Information System (ATIS) dataset, collected by DARPA back in the early 90s. The dataset consists of spoken queries about flights. ATIS contains 4,978 sentences and 56,590 words across the train and test sets, and the number of classes is 128. Our approach here is to use recurrent neural networks with word embeddings.

As you already know, word embedding maps words to vectors in a high-dimensional space. When learned well, an embedding captures syntactic and semantic information about the words in this space. The embedding space will be learned by the model you define later.

For this problem, convolutional layers can do a great job of pooling local information, but they are not capable of capturing the truly sequential structure of the data, so we are going to use recurrent neural networks, which help us handle the consecutive information in natural language.


A recurrent neural network model has a memory that stores a summary of the sequence the model has seen so far.

This means you can use recurrent neural networks to solve complex word tagging problems like POS (part of speech) tagging, or slot filling, as in this problem.
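Slot filling is usually framed with IOB (inside/outside/beginning) tags, one tag per word. A toy illustration (this query and its tags are invented for illustration, not drawn from the dataset):

```python
# hypothetical flight query with one IOB slot label per word
words = ["show", "flights", "from", "boston", "to", "denver"]
labels = ["O", "O", "O", "B-fromloc.city_name", "O", "B-toloc.city_name"]

# every word gets exactly one tag; slots are recovered from the B-/I- spans
slots = {lab.split("-", 1)[1]: word
         for word, lab in zip(words, labels) if lab != "O"}
print(slots)
```

The tagger's job is to predict such a label sequence, which is why the network must emit one output per time step rather than one output per sentence.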

For this problem, you must pass the sequence of word embeddings as the input to your recurrent neural network.

As we are going to use the IOB representation for the labels, we need to calculate the scores of the model accordingly; you will run the code shown below for score calculation. Before that, however, you must download the corresponding ATIS files.

git clone https://github.com/chsasank/ATIS.keras.git
cd ATIS.keras

I recommend you use a Jupyter Notebook. After that, load your data using the data.load.atisfull function; it will download the data the first time you run it.

Labels and words are encoded as indexes into the dataset vocabulary, and the vocabularies are stored in words2idx and labels2idx.


import numpy as np
import data.load

train_set, valid_set, dicts = data.load.atisfull()
w2idx, labels2idx = dicts['words2idx'], dicts['labels2idx']

train_x, _, train_label = train_set
val_x, _, val_label = valid_set

The next step is to create index-to-word and index-to-label dicts, as seen below.

idx2w = {w2idx[k]: k for k in w2idx}
idx2la = {labels2idx[k]: k for k in labels2idx}

Then, prepare the words and labels for the conlleval script as follows.

words_train = [list(map(lambda x: idx2w[x], w)) for w in train_x]
labels_train = [list(map(lambda x: idx2la[x], y)) for y in train_label]
words_val = [list(map(lambda x: idx2w[x], w)) for w in val_x]
labels_val = [list(map(lambda x: idx2la[x], y)) for y in val_label]
n_classes = len(idx2la)
n_vocab = len(idx2w)

The next step is to print an example sentence and its label.

print("Example sentence: {}".format(words_train[0]))
print("Encoded form: {}".format(train_x[0]))
print()
print("Its label: {}".format(labels_train[0]))
print("Encoded form: {}".format(train_label[0]))

This is what you get.

Example sentence: [...]
Encoded form: [...]

Its label: [...]
Encoded form: [...]

The next step is to define your Keras model. Keras comes with a built-in Embedding layer you can use for word embeddings; it expects integer indices as input.

You must also use the TimeDistributed wrapper to pass the output of the recurrent layer at each time step to a fully connected layer. If you do not perform this step, only the output at the final time step is passed to the next layer.

from keras.models import Sequential
from keras.layers.embeddings import Embedding
from keras.layers.recurrent import SimpleRNN
from keras.layers.core import Dense, Dropout
from keras.layers.wrappers import TimeDistributed
from keras.layers import Convolution1D

model = Sequential()
model.add(Embedding(n_vocab, 100))
model.add(Dropout(0.25))
model.add(SimpleRNN(100, return_sequences=True))
model.add(TimeDistributed(Dense(n_classes, activation='softmax')))
model.compile('rmsprop', 'categorical_crossentropy')

The next step is to train your model. You will pass every sentence as a batch to the model. You cannot use model.fit, as it expects all sentences to be the same size; therefore, you are going to use model.train_on_batch.

import progressbar

n_epochs = 30

for i in range(n_epochs):
    print("Training epoch {}".format(i))
    bar = progressbar.ProgressBar(max_value=len(train_x))
    for n_batch, sent in bar(enumerate(train_x)):
        label = train_label[n_batch]

Then, you must make the labels one-hot. When that step is finished, you must make your model view each sentence as a batch, continuing the loop body as indicated below.

        label = np.eye(n_classes)[label][np.newaxis, :]
        sent = sent[np.newaxis, :]
        model.train_on_batch(sent, label)

To measure the accuracy of your model, you are going to use model.predict_on_batch together with the conlleval function from metrics.accuracy.

from metrics.accuracy import conlleval

labels_pred_val = []
bar = progressbar.ProgressBar(max_value=len(val_x))

for n_batch, sent in bar(enumerate(val_x)):
    label = val_label[n_batch]
    label = np.eye(n_classes)[label][np.newaxis, :]
    sent = sent[np.newaxis, :]
    pred = model.predict_on_batch(sent)
    pred = np.argmax(pred, -1)[0]
    labels_pred_val.append(pred)

labels_pred_val = [list(map(lambda x: idx2la[x], y)) \
    for y in labels_pred_val]
con_dict = conlleval(labels_pred_val, labels_val,
    words_val, 'measure.txt')
print('Precision = {}, Recall = {}, F1 = {}'.format(
    con_dict['p'], con_dict['r'], con_dict['f1']))

With this model, you should get an F1 score of around ninety-two. One drawback of this model is that there is no lookahead. You can handily add it by placing a convolutional layer after the word embeddings and before the recurrent layers, as follows.

from keras.layers.recurrent import GRU

model = Sequential()
model.add(Embedding(n_vocab, 100))
model.add(Convolution1D(128, 5, border_mode='same', activation='relu'))
model.add(Dropout(0.25))
model.add(GRU(100, return_sequences=True))
model.add(TimeDistributed(Dense(n_classes, activation='softmax')))
model.compile('rmsprop', 'categorical_crossentropy')

With this greatly improved model, you should get an F1 score of around ninety-four. To improve your model even further, you can learn word embeddings from other corpora like Wikipedia, or try other recurrent network variants like GRU and LSTM, which allow more experimentation.


LAST WORDS

Deep learning is a new area of the broader machine learning field that was introduced with the main goal of moving machine learning closer to artificial intelligence, one of its original goals. If you want to break deeper into artificial intelligence, you first need to focus on deep learning and its powers. Deep learning is arguably one of the most highly sought tech skills.

This book will help you become good at deep learning basics and start your deep learning journey properly. Now that you are done with the reading, you know a lot about neural network models, how to build them, and how to solve different deep learning problems like natural language processing and speech recognition. You can therefore focus on more advanced deep learning problems in the future.

In the book, you surveyed several neural network models and their applications to real-world problems. You can use this knowledge to solve your own deep learning tasks as you build your own neural network models with Keras. One thing is for sure: you should take advantage of the knowledge you gained through the book and focus on more complex deep learning problems.

Deep learning is the only field of AI that went viral, and its future looks very bright, so you should not stop here. Focus on improving your skills and gaining more knowledge. Machine learning already plays a massive part in your everyday life, and deep learning is not far away from becoming a larger part of modern society as well.

Machine learning was just the beginning, as more and more tech companies like Microsoft, Google and Facebook spend millions on deep learning and advanced neural network research, and computers get smarter every day.

However, deep learning is not about self-aware machines. It is about how ingenious neural network models and code are giving machines the ability to do things we previously thought impossible. Therefore, deep learning does concern our future. Let the book be your guide into this world, but do not stop here; take a step further by learning something new every day.