Deep Learning With Keras: Beginner’s Guide To Deep Learning With Keras
By Frank Millstein
WHAT IS IN THE BOOK?
INTRODUCTION
HOW DEEP LEARNING IS DIFFERENT FROM MACHINE LEARNING
DEEPER INTO DEEP LEARNING
CHAPTER 1: A FIRST LOOK AT NEURAL NETWORKS
CONVOLUTIONAL NEURAL NETWORK
RECURRENT NEURAL NETWORK
RNN SEQUENCE TO SEQUENCE MODEL
AUTOENCODERS
REINFORCEMENT DEEP LEARNING
GENERATIVE ADVERSARIAL NETWORK
CHAPTER 2: GETTING STARTED WITH KERAS
BUILDING DEEP LEARNING MODELS WITH KERAS
CHAPTER 3: MULTI-LAYER PERCEPTRON NETWORK MODELS
MODEL LAYERS
MODEL COMPILATION
MODEL TRAINING
MODEL PREDICTION
CHAPTER 4: ACTIVATION FUNCTIONS FOR NEURAL NETWORKS
SIGMOID ACTIVATION FUNCTION
TANH ACTIVATION FUNCTION
RELU ACTIVATION FUNCTION
CHAPTER 5: MNIST HANDWRITTEN RECOGNITION
CHAPTER 6: NEURAL NETWORK MODELS FOR MULTI-CLASS CLASSIFICATION PROBLEMS
ONE-HOT ENCODING
DEFINING NEURAL NETWORK MODELS WITH SCIKIT-LEARN
EVALUATING MODELS WITH K-FOLD CROSS VALIDATION
CHAPTER 7: RECURRENT NEURAL NETWORKS
SEQUENCE CLASSIFICATION WITH LSTM RECURRENT NEURAL NETWORKS
WORD EMBEDDING
APPLYING DROPOUT
NATURAL LANGUAGE PROCESSING WITH RECURRENT NEURAL NETWORKS
LAST WORDS
Copyright © 2018 by Frank Millstein - All rights reserved.
This document is geared towards providing exact and reliable information in regards to the topic and issue covered. The publication is sold with the idea that the publisher is not required to render accounting, officially permitted, or otherwise, qualified services. If advice is necessary, legal or professional, a practiced individual in the profession should be ordered.

From a Declaration of Principles which was accepted and approved equally by a Committee of the American Bar Association and a Committee of Publishers and Associations.

In no way is it legal to reproduce, duplicate, or transmit any part of this document by either electronic means or in printed format. Recording of this publication is strictly prohibited, and any storage of this document is not allowed unless with written permission from the publisher. All rights reserved.

The information provided herein is stated to be truthful and consistent, in that any liability, in terms of inattention or otherwise, by any usage or abuse of any policies, processes, or directions contained within is the solitary and utter responsibility of the recipient reader. Under no circumstances will any legal responsibility or blame be held against the publisher for any reparation, damages, or monetary loss due to the information herein, either directly or indirectly. Respective authors own all copyrights not held by the publisher.

The information herein is offered for informational purposes solely and is universal as so. The presentation of the information is without contract or any type of guarantee assurance.

The trademarks that are used are without any consent, and the publication of the trademark is without permission or backing by the trademark owner. All trademarks and brands within this book are for clarifying purposes only and are owned by the owners themselves, not affiliated with this document.
INTRODUCTION
Neural networks and deep learning are increasingly important studies and concepts in computer science, with amazing strides being made by major tech companies like Google. Over the years, you may have heard words like backpropagation, neural networks, and deep learning tossed around a lot. As we hear them more often, there is little wonder why these terms have seized your curiosity.

Deep learning is an important area of active research today in the field of computer science. If you are involved in this scientific area, I am sure you have come across these terms at least once. Deep learning and neural networks may be an intimidating concept, but since the topic is increasingly popular these days, it is most definitely worth your attention.
Google and other large global tech companies are making great strides with deep learning projects, like the Google Brain project and Google's acquisition DeepMind. Moreover, many deep learning methods are beating the traditional machine learning methods on every single metric.
HOW DEEP LEARNING IS DIFFERENT FROM MACHINE LEARNING
Before going further into this subject, we must take a step back so you get to learn more about the broader field of machine learning. Very often, we encounter problems for which it is hard to write a computer program. For instance, if you want to program your computer to recognize handwritten digits, you can try to devise a collection of rules to distinguish every individual digit. In this case, a zero is one closed loop, but what if you did not perfectly close this loop? On the other hand, what if the top right of your loop closes at the point where the top left of your loop starts?

Issues like this happen routinely, as a zero may be very difficult to distinguish from a six algorithmically. You could establish a kind of cutoff, but you will have problems deciding on the cutoff in the first place. Therefore, it quickly becomes very complicated to compile a list of guesses and rules that will accurately classify your handwritten digits.
There are many more kinds of issues that fall into this category, such as comprehending speech, recognizing objects, and understanding concepts. We have trouble writing computer programs for these tasks, as we do not know how they are done by the human brain. And even when you have a relatively good idea of how to do this, the program may be very complicated.

Therefore, instead of writing a program, you can try to develop an algorithm which your computer can use to look at thousands of examples with correct answers. The computer can then use the experience it has gained to solve the same problem in numerous other situations. Our main goal with this subject is to teach our computers to solve problems by example, in much the same way you can teach your child to distinguish a dog from a cat.
Deep learning was first theorized back in the early 1980s and was one of the main paradigms for performing broader machine learning. Over the past few decades, computer scientists have successfully developed a wide range of different algorithms which try to allow computers to learn to solve problems through examples. Because of the flurry of modern technological advancements and modern research, deep learning is on the rise, since it has proven to be extremely good when it comes to teaching our computers to do what the human brain can do naturally and effortlessly.
One of the main challenges with traditional machine learning models is a process named feature extraction. More specifically, the programmer must tell the computer what kind of features and information it should be looking for when trying to make a choice or decision.

Feeding the algorithm raw data in fact rarely works, so this process of feature extraction is one of the critical parts of the traditional machine learning workflow. Moreover, this places a massive burden on the programmer, as the effectiveness of the algorithm relies mainly on the insight of the programmer. For more complex issues, such as handwriting recognition or object recognition, this is one of the main challenges.

Fortunately, we have deep learning methods by which we can surely circumvent these challenges regarding feature extraction. This is mainly because deep learning algorithms are capable of learning to focus only on the right, informative features by themselves, while at the same time requiring very little guidance from the programmer. This makes deep learning an amazingly powerful tool for machine learning.
Machine learning uses our computers to run predictive models, which are capable of learning from already existing data to forecast future outcomes, behaviors, and trends. Deep learning, in turn, is an important subfield of machine learning in which the algorithms or models are inspired by how the human brain works. These deep learning models are expressed mathematically, and the number of parameters that define a model can be in the order of several thousand to millions. In deep learning models, everything is learned automatically.

Moreover, deep learning is one of the main keys enabling the artificial-intelligence-powered technologies that are being developed around the globe every day. In the following sections of the book, you are going to learn how to build complex models which help machines solve distinct real-world issues with human-like intelligence. You will learn how to build and derive many insights from these models using Keras running on your Linux machine.

The book, in fact, provides the level of detail needed for data scientists and engineers to develop a greater intuitive understanding of the main concepts of deep learning. You will also learn powerful motifs that can be used in building numerous deep learning models and much more.
Machine learning and deep learning have one thing in common: they are both related to artificial intelligence. Artificial intelligence regards computer systems which mimic or replicate human intelligence, while the broader field of machine learning allows machines to learn entirely on their own. Deep learning, in turn, regards the many computer algorithms which attempt to model high-level abstractions contained in data in order to determine high-level meaning.

For instance, if artificial intelligence is used to recognize emotions in pictures, then machine learning models would input hundreds or thousands of pictures of human faces into the system, while deep learning will help that system recognize countless patterns in the human faces and the emotions they share.

This is a very simple explanation of the three; the reality is more complex. Deep learning is by far the most confusing, as it works with neural networks, data, and math. Machine learning, by contrast, analyzes and crunches numbers and data, learns from it, and uses that information to make innumerable predictions, truth statements, and determinations depending on the scenario.
In this case, the machine is being trained, or it is training itself, on how to perform tasks correctly after learning from the numbers and data it has previously analyzed. Therefore, machine learning models build their own solutions and logic. Machine learning can be done with several algorithms, like the random forests and decision trees used by Netflix, for instance, to suggest movies to its customers based on their star ratings.

Another common machine learning model is linear regression, which predicts the value of a numerical outcome with limitless possible results, like figuring out how much money you can sell your car for based on the current market flow. Other machine learning models include logistic regression, which predicts the value of categorical outcomes based on a limited number of possible values.

Classification and naive Bayes models additionally count as machine learning. Machine learning classification puts data into distinct groups, like sorting emails or filing documents, while naive Bayes covers a family of algorithms which all share the common principle that every feature is classified independently of the other features. This may go on, as there are many other machine learning models. There are also two broad types of machine learning: supervised and unsupervised.

Supervised learning models require a human to input both the data and the solution, and then allow the machine to figure out on its own the relationship between the two. Unsupervised machine learning models, on the other hand, involve putting in raw data and numbers for a specific situation and asking the machine or computer to find a relationship and solution. Therefore, machine learning eliminates the need for someone to constantly analyze or code data to arrive at logic and a solution.
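The supervised/unsupervised split above can be illustrated with a toy sketch in plain NumPy. All the numbers are made up; the `np.polyfit` line fit stands in for a supervised model, and the crude mean cutoff stands in for an unsupervised clustering step:

```python
import numpy as np

# Supervised: both the inputs and the known answers are provided,
# and the model figures out the relationship between the two.
hours = np.array([1, 2, 3, 4, 5], dtype=float)
scores = np.array([2, 4, 6, 8, 10], dtype=float)  # labels supplied by a human
slope, intercept = np.polyfit(hours, scores, 1)   # learns scores = 2 * hours
print(slope)

# Unsupervised: only raw data goes in; the structure (here, two obvious
# clusters) must be discovered without any labels at all.
points = np.array([0.1, 0.2, 0.15, 9.8, 10.1, 9.9])
cutoff = points.mean()          # a crude split into two groups
print(points < cutoff)          # the two groups emerge without labels
```

A real unsupervised model would use a proper clustering algorithm rather than a mean cutoff, but the division of labor is the same: no answers are supplied, only data.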
DEEPER INTO DEEP LEARNING
The biggest difference between machine learning and deep learning is that deep learning crunches much more data. If you have only a little data to analyze, machine learning is the way to go; if you have a lot of data to analyze, deep learning is your solution. Deep learning models are extremely powerful, and they need a lot of data to give you the best possible outcome or solution. Deep learning models also need more powerful machines, while machine learning models do not.

More powerful machines are required for deep learning, as deep learning models do more complicated things, such as the matrix multiplications that call for a GPU, or graphics processing unit. Deep learning models also try to learn high-level features, so in the case of facial recognition, a deep learning model will get an image quite close to the raw version, while a machine learning model will get a blurry image. Another powerful deep learning trait is forming end-to-end solutions instead of breaking problems and solutions down into parts.
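Those matrix multiplications are easy to see in miniature: the forward pass of a single fully-connected layer is essentially one matrix product. The shapes below are invented for illustration:

```python
import numpy as np

# A dense layer's forward pass is essentially one matrix multiplication:
# a batch of inputs (one sample per row) times a weight matrix, plus a bias.
batch = np.random.rand(32, 8)     # 32 samples, 8 input features each
weights = np.random.rand(8, 12)   # a layer with 12 neurons
bias = np.zeros(12)

activations = batch @ weights + bias
print(activations.shape)          # one 12-value output row per sample
```

A deep network repeats this for every layer on every batch, millions of times during training, which is why GPUs, built for exactly this operation, matter so much.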
Deep learning is one of the most powerful tools used by major global tech companies, although it takes a long time to process data and find correct solutions. Just keep in mind, it may be challenging at the very beginning, but you will get there eventually. Fortunately, you have this book to start off your deep learning journey.
CHAPTER 1: A FIRST LOOK AT NEURAL NETWORKS
In recent years, neural networks, or more specifically deep neural networks, have won numerous contests in machine learning and pattern recognition. Deep learners are mainly distinguished by the depth of their paths, the chains of possibly learnable causal relationships between actions and effects.

Deep learning algorithms, in very simple words, are deep, large artificial neural nets. An NN, or neural network, can easily be presented as a directed acyclic graph in which the initial input layer takes in signal vectors, and one or more hidden layers then process the outputs of the previous layers.
In fact, the main concept behind neural networks can be traced back half a century. There is more talk about the idea today because we have a lot more data and significantly more powerful computers, which were not available decades ago.
A deep neural network has many more layers and many more nodes in every layer, which results in exponentially more parameters to tune. When we do not have enough data, we are not able to learn those parameters efficiently. In addition, without powerful machines or computers, learning would be insufficient as well as too slow.

When it comes to small datasets, traditional machine learning algorithms such as random forests, regression, GBM, SVM, and statistical learning do an amazing job. However, when the data scale goes up to a large amount of information, large, deep neural networks quickly outperform the traditional ones.

This happens primarily because, compared to a traditional machine learning algorithm, a deep neural network model has a wider range of parameters and is capable of learning more complex nonlinear functions. Therefore, we expect a deep neural network model to automatically pick the most important and helpful features on its own, without too much manual engineering.
As already mentioned, deep learning is one of the main forms of machine learning, using a model of computing that is very much inspired by the structure of the human brain; hence these models are called neural networks. The basic unit of any neural network is the neuron. Every neuron has a specific set of inputs, and every input is given a specific weight. Neurons have the power of computing functions based on these weighted inputs. For instance, a linear neuron takes a linear combination of its weighted inputs, while a sigmoidal neuron feeds the weighted sum of its inputs into a logistic function.

The logistic function always returns a value between 0 and 1. When the weighted sum is very negative, the return value is close to 0. On the other hand, when the weighted sum is large and positive, the return value is close to 1. Mathematically, the logistic function is a logical choice, since it has nice-looking derivatives that make the learning process simpler.
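The behavior just described can be sketched in a few lines of NumPy; the input values and weights below are arbitrary, chosen only to illustrate a single sigmoidal neuron:

```python
import numpy as np

def logistic(z):
    # The logistic (sigmoid) function maps any real number into (0, 1).
    return 1.0 / (1.0 + np.exp(-z))

# A sigmoidal neuron: the weighted sum of its inputs is fed
# through the logistic function to produce the neuron's output.
inputs = np.array([0.5, -1.0, 2.0])
weights = np.array([0.8, 0.2, 0.4])
output = logistic(np.dot(weights, inputs))

print(logistic(-10.0))  # a large negative weighted sum: close to 0
print(logistic(10.0))   # a large positive weighted sum: close to 1
print(output)
```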
Whatever function the neuron uses, the value it computes is immediately transmitted to the other connected neurons as the neuron's output. In practice, sigmoidal neurons are used more often than linear neurons, since they enable much more versatile deep learning models.
A deep neural network arises when you start connecting neurons to each other, to your input data, and eventually to the outputs that correspond to your network's answer to the learning problem. To make this structure easier to visualize, think of the weights as attached to the links that connect the initial layer to the other layers of the network.

Very similar to how neurons are organized in layers in the brain, neurons in deep neural nets are typically organized in layers. Neurons situated in the bottom layers are those that receive signals from the inputs, while neurons situated in the top layers are connected to the answer through their outputs. Usually, there are no connections between neurons in the same layer, as more complex connectivity between neurons requires more mathematical analysis.

When there are no connections leading from a neuron in a higher layer back to neurons in lower layers, we call the network a feed-forward neural network. Opposed to these are recurrent neural networks, which are much more complicated to train and analyze. Now, we will go through several of the most commonly used deep neural networks.
CONVOLUTIONAL NEURAL NETWORK
Convolutional neural networks, also known as CNNs, are one of the most commonly used types of feed-forward neural networks, in which the connectivity pattern between neurons is based on the organization of the visual cortex system. In that system, V1, the primary visual cortex, does edge detection on the raw visual input obtained from the retina. Then V2, the secondary visual cortex, receives the edge features from the primary visual cortex and extracts simple visual properties such as spatial frequency, orientation, and color.

The visual area V4, another visual cortex, mainly handles more complicated attributes of fine-grained objects. All of these processed visual features then flow into the final unit, the inferior temporal gyrus, or IT, for further object recognition. The shortcut between the V1 layer and the V4 layer, in fact, inspired a certain type of convolutional neural network with connections between non-adjacent layers, named a residual net. Residual nets contain residual blocks that allow the inputs of one layer to be passed readily to later layers.

Therefore, convolutional neural networks are commonly used for edge detection, for extracting simple visual properties such as spatial frequency, orientation, and color, for detecting object features of intermediate complexity, and for object recognition.
Convolution is a term commonly used in mathematics, referring to an operation between matrices. Convolutional layers generally have a small matrix named a filter or kernel. As the filter or kernel is sliding, or convolving, across the matrix of input image values, it is, at the same time, computing the element-wise multiplication of the values contained in the kernel matrix with the original image values, and summing the results.

Therefore, specifically designed filters or kernels are capable of processing images for very specific purposes, such as image sharpening, blurring, and edge detection, efficiently and rapidly.
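The sliding-kernel operation described above can be written out directly in NumPy. The tiny image and the vertical edge-detection kernel below are made up for illustration (and, as in most deep learning libraries, what is computed is technically cross-correlation, without flipping the kernel):

```python
import numpy as np

def convolve2d(image, kernel):
    # Slide the kernel across the image; at each position, take the
    # element-wise product with the covered patch and sum it
    # ("valid" convolution: no padding, so the output shrinks).
    kh, kw = kernel.shape
    ih, iw = image.shape
    out = np.zeros((ih - kh + 1, iw - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

# A classic vertical edge-detection kernel, applied to a tiny image
# whose left half is dark (0) and right half is bright (1).
image = np.array([[0, 0, 1, 1]] * 4, dtype=float)
kernel = np.array([[1, 0, -1],
                   [1, 0, -1],
                   [1, 0, -1]], dtype=float)

print(convolve2d(image, kernel))  # strongly nonzero where the edge is
```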
RECURRENT NEURAL NETWORK
A neural network sequence model is commonly designed to transform input sequences into output sequences which live in a different domain. Another common type of deep neural network, the RNN, or recurrent neural network, is greatly suitable for these purposes, as RNNs have shown amazing improvements on problems like speech recognition, handwriting recognition, and machine translation.

An RNN model is born with an amazing capability of processing long sequential data and tackling very complex tasks with context spread over a period of time. The recurrent model, in fact, processes a single element of the sequence at a time. After each computation, the newly updated unit state is passed down to the next time step to facilitate the computation of the next element.

Imagine the case when a recurrent neural network model reads all articles on Wikipedia, character by character. Simple perceptron neurons, on the other hand, which linearly combine the current input element with the last unit state, typically lose these long-term dependencies.
For instance, we can start a sentence with Susan is working at… Then, after a whole paragraph, we want to start our next sentence with He or She correctly. If the recurrent neural model forgets the character's name we used, we can never know which to pick. To resolve this issue, engineers have created a special deep neuron with a more complicated internal structure designed to memorize long-term context, named LSTM, or long short-term memory.

LSTM models are smart enough to learn long-term context. These models can learn for how long they should memorize old information, when to forget information, when to use newly updated data, and when to combine the new input with old memory. Using the power of LSTM and RNN cells, you can build a character-based RNN model that will be able to learn the specific relationships between characters that form words and sentences, without any previous knowledge of English vocabulary. Such an RNN model, in fact, can achieve very good performance even without a large set of training data.
RNN SEQUENCE TO SEQUENCE MODEL
The common sequence-to-sequence model is very often seen as an extended version of the recurrent neural model, but its application field is more distinguishable. The same as recurrent neural networks, sequence-to-sequence models operate on sequential data, but they are commonly used to develop personal assistants or chatbots by generating meaningful responses to numerous input questions.

A common sequence-to-sequence model consists of two recurrent neural networks: an encoder and a decoder. The encoder learns everything about the contextual information obtained from the various input words. Then the encoder hands this knowledge down to the decoder through a specific context vector, also known as a thought vector. Eventually, the decoder consumes these context vectors and generates correct responses.
AUTOENCODERS
Autoencoders are different from the previous deep learning models, as they are used only for unsupervised deep learning. Autoencoders are designed mainly to learn a low-dimensional representation of a high-dimensional dataset, very similar to what PCA, or principal components analysis, does. The autoencoder model is capable of learning approximation functions that reproduce the input data.

What restricts these models is a bottleneck layer situated in the middle, containing a very small number of nodes. With this very limited number of nodes, these models come with very limited capacity, so they are forced to form a specific, very efficient encoding of the data, which is the low-dimensional code we obtain.

You can use autoencoder models to compress your documents on a variety of topics, within the limitation that these models come with a bottleneck layer that contains only a few neurons.

However, when you use both an autoencoder and PCA to reduce your documents to two dimensions, the autoencoder model will demonstrate a much better outcome. With the help of these models, you can do very efficient data compression to speed up the overall process of information retrieval, for both images and documents.
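The PCA baseline that the autoencoder is being compared against can be sketched in a few lines of NumPy via the SVD. The random "documents" below are placeholder data; a real autoencoder would learn a nonlinear version of the same encode/decode round trip:

```python
import numpy as np

# PCA-style reduction to two dimensions: the linear analogue of an
# autoencoder's bottleneck encoding.
rng = np.random.default_rng(0)
docs = rng.random((100, 50))              # 100 "documents", 50 features each
centered = docs - docs.mean(axis=0)

# Rows of Vt are the principal directions; keep only the first two.
U, s, Vt = np.linalg.svd(centered, full_matrices=False)
codes = centered @ Vt[:2].T               # the low-dimensional 2-D codes

reconstruction = codes @ Vt[:2]           # decode back to 50 dimensions
print(codes.shape)                        # each document compressed to 2 numbers
```

The bottleneck here is the hard cut to two components; an autoencoder replaces the two matrix products with learned nonlinear encoder and decoder networks, which is why it can demonstrate a better outcome on the same two-dimensional budget.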
REINFORCEMENT DEEP LEARNING
Reinforcement learning is one of the secrets behind many successful AI projects done in the past. Reinforcement learning is a subfield of machine learning that allows software agents and machines to automatically determine the optimal behavior within a given context, with the main goal of maximizing long-term performance as measured by a given metric.

Most reinforcement learning projects start with a specific supervised learning process that trains a fast rollout policy as well as a policy network, mainly relying on manually curated training data. The reinforcement learning policy network is then improved by playing against earlier versions of itself. Therefore, with more and more data obtained, it gets stronger and stronger without requiring any additional external training data.
GENERATIVE ADVERSARIAL NETWORK
Another commonly used type of deep neural network is the generative adversarial network, or GAN. This is a type of deep generative algorithm. A GAN has the power of creating new examples after going through and learning from some real data. A common type of GAN consists of two models that compete against each other in a zero-sum game.

The generative adversarial network setup mainly contains real-world examples, a generator, generated fake samples, a discriminator, and fine-tune training. These models can tell the fake data apart from the true data by its data distribution. GANs were initially proposed to generate meaningful images after learning from real-world photos.

The GAN model proposed in the original GAN paper was composed of two independent models: the discriminator and the generator. In this case, the generator produced fake images and sent the output to the discriminator model. The discriminator then worked in a manner very similar to a judge, since it was fully optimized to identify the fake photos from the real ones.

At the same time, the generator model was trying hard to cheat the discriminator, while the judge was trying very hard not to be cheated by the generator. This was a very interesting zero-sum game occurring between these GAN models, which motivated both models to further improve their functionalities and develop their designed skills.
After learning about these deep neural network models, you probably wonder how you can implement them and use them in real problem solving with deep learning. Fortunately, there are many open-source libraries and toolkits you can use for building your own deep learning models. TensorFlow is arguably one of the most popular, and has attracted a lot of attention. In terms of popularity, Theano follows TensorFlow very closely. Those two are the best numerical platforms in Python, providing the basis for innumerable deep learning projects.

Both are very powerful libraries and can be used for the different tasks involved in creating deep learning models. Another powerful tool in Python's library ecosystem is Keras, which we are going to use in this book. Keras is an amazingly powerful high-level neural network API, with the astonishing ability to run on top of Theano, TensorFlow, or CNTK. It was written in Python and developed with a focus on enabling fast and efficient deep learning experimentation.
CHAPTER 2: GETTING STARTED WITH KERAS
We are going to use Keras as it allows fast and easy prototyping, supports both recurrent and convolutional networks as well as combinations of the two, and runs seamlessly on GPU and CPU. Designed to enable fast deep learning modeling and experimentation with neural networks, it focuses on being modular, minimal, and extensible.

Therefore, with Keras you can build a wide range of different deep learning models which run on top of TensorFlow or Theano effortlessly and efficiently. Keras is a free, open-source neural network library, so you will find and install it easily. The core data structure of Keras is a model, which is a way of organizing multiple layers. Before you delve deeper into Keras, you must install it, of course. Be aware that this popular deep learning framework uses either Theano or TensorFlow behind the scenes instead of providing all the functionality by itself.

Keras is very simple to install if you have been working in a SciPy and Python environment. Make sure you have an installation of TensorFlow or Theano on your system already before you install Keras. Keras can be very easily installed using PyPI:
sudo pip install keras
python -c "import keras; print(keras.__version__)"
1.1.0
sudo pip install --upgrade keras
Using the same method, you can upgrade your version of Keras. Assuming you have installed both TensorFlow and Theano, you are able to configure the backend used by Keras. The best way is by editing or adding the Keras configuration file in your home directory:
~/.keras/keras.json
{
    "image_dim_ordering": "tf",
    "epsilon": 1e-07,
    "floatx": "float32",
    "backend": "tensorflow"
}
In this configuration file, you can change the backend property from tensorflow to theano. Keras will use your new configuration the next time you run it. In addition, you can easily check the configured Keras backend as follows:
python -c "from keras import backend; print(backend._BACKEND)"
Using TensorFlow backend.
tensorflow
In addition, you can specify which backend you want Keras to use on the command line, as shown below:
KERAS_BACKEND=theano python -c "from keras import backend; print(backend._BACKEND)"
Running this script, you get the output illustrated below:
Using Theano backend.
theano
BUILDING DEEP LEARNING MODELS WITH KERAS
The focus of Keras is the model. The main kind of model built in Keras is called Sequential, and it contains a linear stack of multiple layers. Therefore, you create a Sequential model and gradually add layers to it in the order you want the computation to be performed. Once you define your model, you must compile it, so that it makes use of the specific underlying framework to optimize the entire computation to be performed on your deep learning model.

In this case, you must specify the loss function and the optimizer which will be used. Once you compile your model, the model must be fit to the data, one batch of data at a time. In fact, this is where all the computation occurs. Once you train your model, you can use it to make new predictions on your data.

In summary, the construction of deep learning models in Keras can be explained as defining your model, compiling your model, fitting your model, and making predictions. To define it, you create a Sequential model and add multiple layers. Once done, you compile your model by specifying the loss function and optimizer. Then, you fit your model by executing it against data. Finally, you make predictions on new data.
As already mentioned, Keras is amazingly powerful and easy to use for developing and evaluating varied deep learning models. It wraps the efficient numerical computation libraries like TensorFlow and Theano and allows you to define and properly train your neural network models in several lines of code.
In the following, you are going to learn how to create your first network model in Python using Keras. Before you begin, make sure you have Python 2 or a newer version installed and configured. You also need NumPy and SciPy installed and configured, and, of course, you need to have Keras and TensorFlow or Theano installed and configured. Once you have these up and running, create a new file as follows:
keras_first_network.py
In the following sections, you will learn how to load data, define your model, compile the model, fit the model, evaluate your model, and tie it all together to work on your future models.
Whenever we work with deep learning models which use a stochastic process, like random numbers, it is a very good idea to set the random number seed. This way you will be able to run the same code over and over and get the same result. This is also very useful when you need to demonstrate a result, compare different models using the same source of randomness, or debug a part of your code. For the initialization of the random number generator, use the following script:
from keras.models import Sequential
from keras.layers import Dense
import numpy
numpy.random.seed(7)
Once done, you can load your data. To do so in this example, we are going to use the very popular Pima Indians onset of diabetes dataset, a standard machine learning dataset from the UCI Machine Learning repository. It describes patient medical data for Pima Indians and the onset of diabetes within five years. This is a binary classification problem, and all the input variables are numerical, which makes it really easy to use this dataset directly with a neural network in Keras. To use it, download the dataset and place it in your working directory, which is the same directory as your previously created Python file.

Now we continue with building your model with Keras. You can load the file directly using the specific NumPy function. There are eight input variables and one output variable in the last column. Once you load the data, you can split your dataset into the input variables, denoted as X, and the output variable, denoted as Y.
dataset = numpy.loadtxt("pima-indians-diabetes.csv", delimiter=",")
X = dataset[:, 0:8]
Y = dataset[:, 8]
Make sure you have initialized your random number generator, to ensure your results are reproducible, and properly loaded your data. Now, you must define your neural network model.
As already mentioned, models in Keras are defined as a specific sequence of multiple layers. To define your model, you create a Sequential model and then add one layer at a time until you are satisfied with your network topology. The first thing you must do is ensure your input layer has the proper number of inputs. You can specify this when you create your first layer, using the input_dim argument. Make sure you set it to eight for the eight input variables.
Now you probably wonder: how do you know the right number of layers and their types? Well, this is a complex question. There are some heuristics you can use, but the best network structure is found through a process of trial and error. You will need a network large enough to capture the core structure of the problem.
Further on, we are going to see a fully-connected neural network structure containing three layers. Take into consideration that fully-connected layers are defined using the Dense class. You can specify the number of neurons contained in the layer as your first argument, the weight initialization method with the init argument, and your activation function with the activation argument.

In the following case, we are going to initialize the network weights to small random numbers generated from a uniform distribution, in this case between 0 and 0.05, as this is the default uniform weight initialization in Keras. There is also another alternative, named normal, for small random numbers generated from a Gaussian distribution.
In our example, we are going to use the relu, or rectifier, activation function on the first two layers, and the sigmoid function in the output layer. It used to be common to use the tanh and sigmoid activation functions for all layers.

However, these days this is not the case, as better performance is achieved with the rectifier activation function, combined with a sigmoid function on the output layer to ensure your network output is between zero and one. This way it is very easy to map the output to classes with a default threshold. Then, you can piece everything together by adding layers. Your first layer will have twelve neurons and expects eight input variables. Your second, hidden layer will have eight neurons, while your output layer will have one neuron to predict the class.
model = Sequential()
model.add(Dense(12, input_dim=8, activation='relu'))
model.add(Dense(8, activation='relu'))
model.add(Dense(1, activation='sigmoid'))
The next step is to compile your model. To compile your neural network model, Keras uses an efficient numerical library, called a backend, like TensorFlow or Theano.

When compiling, the backend automatically chooses the best possible way of representing your network for making predictions and training, and decides how that will run on your hardware, like a GPU or CPU, and sometimes even distributed.
When compiling your model, you must first specify some additional properties which are required for training your neural network. Be cognizant that training your network means finding the best possible set of weights for making the right predictions for your specific problem.
To evaluate a set of weights, you must first specify the loss function. You must also specify the optimizer, which is used to search through different weights for your network, as well as any optional metrics you would like to collect and report during model training. In this specific case, you will use logarithmic loss, which is defined as binary_crossentropy for binary classification problems.

You will use the gradient descent algorithm adam, simply because it is a very efficient default. Since this is a classification problem, you will also collect and report the classification accuracy metric.
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
Once you are done compiling your model, the next step is fitting it. Once you define and compile your model, it is ready for efficient computation, so now is the right time to execute the model on the data. You can train or fit your neural network model on your loaded data by calling the fit function on your model.
Consider that the training process will run for a fixed number of iterations through your dataset, called epochs, which you specify using the epochs argument. You can also set the number of instances that are evaluated before the weights are updated. This is called the batch size, and you set it using the batch_size argument. For this specific problem, you will run a small number of iterations, 150, and use a small batch size of ten. As mentioned before, these values can be chosen experimentally through trial and error. During this step, the computation occurs on your GPU or CPU.
model.fit(X, Y, epochs=150, batch_size=10)
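As a quick back-of-the-envelope check on those numbers (a sketch for illustration, assuming the 768 rows of the Pima Indians diabetes dataset used above): with a batch size of ten, each epoch performs about 77 weight updates, so 150 epochs perform roughly 11,550 in total.

```python
import math

n_samples = 768   # rows in the Pima Indians diabetes dataset
batch_size = 10
epochs = 150

# The last batch of each epoch is partial (8 rows), hence the ceiling.
updates_per_epoch = math.ceil(n_samples / batch_size)
total_updates = updates_per_epoch * epochs

print(updates_per_epoch)  # 77
print(total_updates)      # 11550
```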
Once you are done fitting your model, the next step is to evaluate it. Up to this point, you have trained your neural network on the entire dataset, so you can now evaluate its performance on that same dataset. This will give you an idea of how well you have modeled the dataset, also known as the train accuracy.
However, it tells you nothing about how well the model may perform on new data. You have done this mainly for simplicity, but the ideal path is to separate your data into train and test datasets for training and evaluating your model.
You can evaluate your model on your training dataset using the evaluate function on your model, passing the same input and output you used to train it.
This will generate a prediction for every input and output pair and collect scores, including the average loss and any other metrics you have configured, such as accuracy.
scores = model.evaluate(X, Y)
print("\n%s: %.2f%%" % (model.metrics_names[1], scores[1]*100))
Once you tie everything together, you get the first neural network model you have created in Keras. The complete code looks as follows.
from keras.models import Sequential
from keras.layers import Dense
import numpy
numpy.random.seed(7)
dataset = numpy.loadtxt("pima-indians-diabetes.csv", delimiter=",")
X = dataset[:,0:8]
Y = dataset[:,8]
model = Sequential()
model.add(Dense(12, input_dim=8, activation='relu'))
model.add(Dense(8, activation='relu'))
model.add(Dense(1, activation='sigmoid'))
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
model.fit(X, Y, epochs=150, batch_size=10)
scores = model.evaluate(X, Y)
print("\n%s: %.2f%%" % (model.metrics_names[1], scores[1]*100))
Running this, you should see a message for each of the 150 epochs that prints the loss and accuracy, followed by the final evaluation of the trained model on the training dataset. You should get messages like the following.
...
Epoch 145/150
768/768 [==============================] - 0s - loss: 0.5105 - acc: 0.7396
Epoch 146/150
768/768 [==============================] - 0s - loss: 0.4900 - acc: 0.7591
Epoch 147/150
768/768 [==============================] - 0s - loss: 0.4939 - acc: 0.7565
Epoch 148/150
768/768 [==============================] - 0s - loss: 0.4766 - acc: 0.7773
Epoch 149/150
768/768 [==============================] - 0s - loss: 0.4883 - acc: 0.7591
Epoch 150/150
768/768 [==============================] - 0s - loss: 0.4827 - acc: 0.7656
32/768 [>.............................] - ETA: 0s
acc: 78.26%
You can use the neural network model you have just created for making predictions as well. However, you must adapt the example above just a bit to use it for making predictions. Making predictions is very easy once you call the model's predict function.
In this case, because you used a sigmoid activation function on your output layer, you get predictions in the range between zero and one. In addition, you can quickly convert them into binary predictions for your classification task by simply rounding them.
To run predictions for every record contained in your training data, run the code shown below.
from keras.models import Sequential
from keras.layers import Dense
import numpy
seed = 7
numpy.random.seed(seed)
dataset = numpy.loadtxt("pima-indians-diabetes.csv", delimiter=",")
X = dataset[:,0:8]
Y = dataset[:,8]
model = Sequential()
model.add(Dense(12, input_dim=8, kernel_initializer='uniform', activation='relu'))
model.add(Dense(8, kernel_initializer='uniform', activation='relu'))
model.add(Dense(1, kernel_initializer='uniform', activation='sigmoid'))
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
model.fit(X, Y, epochs=150, batch_size=10, verbose=2)
predictions = model.predict(X)
rounded = [round(x[0]) for x in predictions]
print(rounded)
Running this neural network model will print the predictions for every input pattern. You can use these predictions directly in an application, if required.
Now that you know how to create a neural network in Keras, we are going to move on to more complex deep learning tasks that you can efficiently execute using the powerful Keras Python library.
CHAPTER 3: MULTI-LAYER PERCEPTRON NETWORK MODELS
The powerful Keras Python library for deep learning focuses mainly on the creation of models as a sequence of multiple layers. In the following section of the book, you are going to learn how to use simple components to create multi-layer perceptron network models in Keras.
As already mentioned, the simplest model you can create is defined by the Sequential class, which is a linear stack of multiple layers. You can create a sequential model and define all included layers at once, as seen below.
from keras.models import Sequential
model = Sequential(...)
However, a more useful idiom is to first create your sequential model and then add the layers in the order of computation, as illustrated below.
from keras.models import Sequential
model = Sequential()
model.add(...)
model.add(...)
model.add(...)
Once done, you must define the model inputs. Be mindful that the first layer in your neural network model must specify the shape of the input. This is the number of input attributes, as defined by the input_dim argument, which expects an integer. For instance, you can readily define the input in terms of eight inputs for a Dense layer as follows.
Dense(16, input_dim=8)
MODEL LAYERS
Once done, you must define the layers of your neural model. Remember that layers of different kinds usually have several properties in common, especially their activation functions and their weight initialization functions. For your models, the weight initialization used by a given layer is specified via the kernel_initializer argument (called init in older versions of Keras).
Some of the most commonly used weight initializations include uniform, normal and zero. With uniform weight initialization, weights are initialized to small, uniformly random values between 0 and 0.05. With normal weight initialization, weights are initialized to small Gaussian random values with zero mean and a standard deviation of 0.05. The last type is zero, where all weights are set to zero.
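To make those three schemes concrete, here is a small numpy sketch of the distributions just described (an illustration only, not the Keras internals):

```python
import numpy as np

rng = np.random.RandomState(7)
shape = (8, 12)  # e.g. weights between an 8-input layer and 12 neurons

# 'uniform': small uniformly random values in [0, 0.05]
w_uniform = rng.uniform(low=0.0, high=0.05, size=shape)

# 'normal': small Gaussian values with zero mean and stddev 0.05
w_normal = rng.normal(loc=0.0, scale=0.05, size=shape)

# 'zero': every weight starts at exactly zero
w_zero = np.zeros(shape)

print(w_uniform.min() >= 0.0 and w_uniform.max() <= 0.05)  # True
print(w_zero.sum() == 0.0)                                 # True
```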
Keras supports many standard neuron activation functions as well, such as rectifier, sigmoid, tanh and softmax. You will ordinarily specify the activation function used by a specific layer in the activation argument, which takes a string value. Alternatively, you can create an Activation object and add it directly to your model, in which case the activation function is applied to the output of the preceding layer.
There is a wide range of core layers in Keras used for standard neural networks. Some of the most useful and routinely used core layer types include Dense, Dropout and Merge layers. Dense is a fully-connected layer, used most often in multi-layer perceptron models. Dropout core layers apply dropout to the model by setting a fraction of inputs to zero, to reduce the very common issue of overfitting. Merge core layers combine the inputs from several Keras models into a single Keras model.
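The mechanics of what a Dropout layer does during training can be sketched in plain numpy (a simplified illustration using the common "inverted dropout" convention, not the actual Keras implementation):

```python
import numpy as np

def dropout(activations, rate, rng):
    """Randomly zero a fraction `rate` of activations and rescale the
    survivors by 1/(1 - rate), so the expected output is unchanged."""
    keep_prob = 1.0 - rate
    mask = rng.uniform(size=activations.shape) < keep_prob
    return activations * mask / keep_prob

rng = np.random.RandomState(7)
x = np.ones(1000)               # pretend these are 1000 neuron outputs
y = dropout(x, rate=0.2, rng=rng)

# Every output is either 0 (dropped) or 1/0.8 = 1.25 (kept and rescaled).
print(sorted(set(y.tolist())))  # [0.0, 1.25]
```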
MODEL COMPILATION
As you already know, once you are done defining your model, you must compile it. Model compilation creates the highly efficient structure that will be used by the underlying backend, TensorFlow or Theano, to execute your neural network model during training. You compile your neural network model using the compile function. It accepts three important attributes of your model: the loss function, the model optimizer, and metrics.
model.compile(optimizer=..., loss=..., metrics=...)
When it comes to model optimizers, the optimizer is the search technique used to update the weights in your neural network model. You can create an optimizer object and pass it to the compile function using the optimizer argument. This allows you to configure the optimization process with its own arguments, such as a specific learning rate.
sgd = SGD(...)
model.compile(optimizer=sgd)
You can also use the default parameters of an optimizer by simply passing the name of the optimizer to the optimizer argument, as shown below.
model.compile(optimizer='sgd')
Some frequently used gradient descent optimizers include SGD, RMSprop and Adam. SGD is stochastic gradient descent, with good support for momentum. RMSprop is an adaptive learning rate optimization method, while Adam, short for Adaptive Moment Estimation, also uses adaptive learning rates.
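The update rule these optimizers build on can itself be sketched in a few lines of numpy. This is plain SGD with momentum, the textbook formula rather than the keras.optimizers source, and the gradient values here are hypothetical:

```python
import numpy as np

def sgd_momentum_step(w, grad, velocity, lr=0.01, momentum=0.9):
    """One SGD-with-momentum update: the velocity accumulates a decaying
    average of past gradients, and the weights move along it."""
    velocity = momentum * velocity - lr * grad
    return w + velocity, velocity

w = np.array([1.0, -2.0])
v = np.zeros_like(w)
grad = np.array([0.5, -0.5])  # pretend gradient of the loss w.r.t. w

w, v = sgd_momentum_step(w, grad, v)
print(w.tolist())  # [0.995, -1.995]
```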
Once you have chosen an optimizer, you must specify the model loss function, also called the objective function. It is the evaluation of the model that the optimizer uses to navigate the weight space.
You can simply pass the name of your loss function to the compile function. Some of the most commonly used loss function arguments include mean_squared_error (MSE), categorical_crossentropy for multi-class logarithmic loss, and binary_crossentropy for binary logarithmic loss.
Once you have chosen your loss function, you move on to metrics. Metrics are evaluated during model training. Consider that only one metric is supported at this time, and that is accuracy.
MODEL TRAINING
Your neural network model is trained on NumPy arrays, using the fit function as seen below.
model.fit(X, y, epochs=..., batch_size=...)
Model training specifies both the number of epochs to train for and the batch size. As already mentioned, epochs are the total number of times the model is exposed to the training dataset, while the batch size is the number of training instances shown to the model before a weight update is performed.
The fit function also allows a basic evaluation to be performed during training. You can set the validation_split value to hold back a fraction of your training dataset for validation, evaluated each epoch, or you can provide a validation_data tuple of X and y data to evaluate. Moreover, fitting your model returns a history object with the metrics and details calculated for the model at each epoch. This is commonly used for graphing your model's performance.
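The idea behind holding back a validation fraction can be sketched in numpy (a simplified illustration: the held-out fraction is sliced from the end of the arrays, and the split value here is hypothetical):

```python
import numpy as np

def validation_split(X, y, split=0.2):
    """Hold back the last `split` fraction of the data for validation,
    mirroring the idea behind the validation_split argument."""
    n_val = int(len(X) * split)
    n_train = len(X) - n_val
    return (X[:n_train], y[:n_train]), (X[n_train:], y[n_train:])

X = np.arange(20).reshape(10, 2)   # 10 toy samples, 2 features each
y = np.arange(10)

(X_tr, y_tr), (X_val, y_val) = validation_split(X, y, split=0.2)
print(len(X_tr), len(X_val))  # 8 2
```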
MODEL PREDICTION
Once you are done training your neural network model, you can use it to make predictions on your test data or on new data. There is a range of different output types you can calculate from your trained model, each calculated with a different function called on the model.
For instance, you can use model.evaluate to calculate the loss for your input data, or model.predict to generate the network output for your input data. You can use model.predict_proba to generate class probabilities for your input data, or model.predict_classes to generate class outputs for your input data. On classification problems, you use predict_classes to make crisp class predictions for new data instances or test data.
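For a binary sigmoid output, the relationship between predict and predict_classes can be sketched without Keras at all (the probability values below are hypothetical, purely for illustration):

```python
import numpy as np

# Pretend model.predict returned these sigmoid probabilities, one per sample.
probabilities = np.array([[0.91], [0.12], [0.55], [0.43]])

# predict_classes on a binary model is then just thresholding at 0.5.
classes = (probabilities > 0.5).astype(int)

print(classes.ravel().tolist())  # [1, 0, 1, 0]
```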
Once you are happy with your model and its properties, you can finalize it. If you need a summary of your model, you can readily display one by calling the summary function as follows.
model.summary()
You also have the option to retrieve the model configuration using the get_config function as follows.
model.get_config()
Finally, you have the option to create an image of your neural network model structure as seen below.
from keras.utils.vis_utils import plot_model
plot_model(model, to_file='model.png')
In this section of the book, you discovered the Keras API, which you can use to create innumerable deep learning and artificial neural network models. You have learned how to construct a multi-layer neural network model and how to add multiple layers, including activations and weight initialization. You have also learned how to compile your neural network model using several optimizers, along with metrics and loss functions. Now, you know how to fit your models, including batch size and epochs, as well as how to make predictions and summarize your model.
CHAPTER 4: ACTIVATION FUNCTIONS FOR NEURAL NETWORKS
In this section of the book, we are going to give more attention to the most regularly used activation functions in neural networks. In this example, we are going to use MNIST. MNIST is a set of approximately 70,000 photos of miscellaneous handwritten digits, where each photo is black and white and 28x28 pixels in size. We are going to solve this problem using a fully connected neural network with several different activation functions.
Our input data will have shape (70000, 784), while our output shape will be (70000, 10). We use a fully connected neural network model with one hidden layer. There are 784 neurons in the input layer, one for every pixel in the photos, and there are 512 neurons in the hidden layer. In the output layer, there are 10 neurons, one for every digit. Using Keras, we can use a different activation function for every layer in our neural network model. This means that in this case, we must decide which activation function should be used in the output layer and which should be used in the hidden layer. There are many different activation functions, but the most often used are relu, tanh and sigmoid. Firstly, we will build a basic sequential model without any activation function in the hidden layer.
model = Sequential()
model.add(Dense(512, input_shape=(784,)))
model.add(Dense(10, activation='softmax'))
As already mentioned, there are 784 neurons in the input, 512 in the hidden layer, and 10 neurons in the output layer. Before you train your model, you can look at your neural network structure and parameters using the model summary function, as illustrated below.
Layers (input ==> output)
--------------------------
dense_1 (None, 784) ==> (None, 512)
dense_2 (None, 512) ==> (None, 10)

Summary
Layer (type)                 Output Shape              Param #
=================================================================
dense_1 (Dense)              (None, 512)               401920
_________________________________________________________________
output (Dense)               (None, 10)                5130
=================================================================
Total params: 407,050
Trainable params: 407,050
Non-trainable params: 0
_________________________________________________________________
None
Once you are sure about the structure of your model, train it for five epochs.
Train on 60000 samples, validate on 10000 samples
Epoch 1/5
60000/60000 [==============================] - 3s - loss: 0.3813 - acc: 0.8901 - val_loss: 0.2985 - val_acc: 0.9178
Epoch 2/5
60000/60000 [==============================] - 3s - loss: 0.3100 - acc: 0.9132 - val_loss: 0.2977 - val_acc: 0.9196
Epoch 3/5
60000/60000 [==============================] - 3s - loss: 0.2965 - acc: 0.9172 - val_loss: 0.2955 - val_acc: 0.9186
Epoch 4/5
60000/60000 [==============================] - 3s - loss: 0.2873 - acc: 0.9209 - val_loss: 0.2857 - val_acc: 0.9245
Epoch 5/5
60000/60000 [==============================] - 3s - loss: 0.2829 - acc: 0.9214 - val_loss: 0.2982 - val_acc: 0.9185
Test loss: 0.299
Test accuracy: 0.918
As you can see, our result of 91.8% on MNIST is quite bad. When you plot the losses, you will see that the validation loss is far from improving, and it will not improve even after a hundred epochs.
Therefore, we must try something different. We need techniques to make our neural network learn better and work smarter. We can achieve this by using one of the most customarily used activation functions, the sigmoid activation function.
SIGMOID ACTIVATION FUNCTION
To improve our neural network model, we will use the sigmoid activation function. It squashes its input into the (0, 1) interval.
model = Sequential()
model.add(Dense(512, activation='sigmoid', input_shape=(784,)))
model.add(Dense(10, activation='softmax'))
You will see that the structure of your neural network remains the same, as you have only changed the activation function of your dense layer. Train the same way for five epochs.
Train on 60000 samples, validate on 10000 samples
Epoch 1/5
60000/60000 [==============================] - 3s - loss: 0.4224 - acc: 0.8864 - val_loss: 0.2617 - val_acc: 0.9237
Epoch 2/5
60000/60000 [==============================] - 3s - loss: 0.2359 - acc: 0.9310 - val_loss: 0.1989 - val_acc: 0.9409
Epoch 3/5
60000/60000 [==============================] - 3s - loss: 0.1785 - acc: 0.9477 - val_loss: 0.1501 - val_acc: 0.9550
Epoch 4/5
60000/60000 [==============================] - 3s - loss: 0.1379 - acc: 0.9598 - val_loss: 0.1272 - val_acc: 0.9629
Epoch 5/5
60000/60000 [==============================] - 3s - loss: 0.1116 - acc: 0.9673 - val_loss: 0.1131 - val_acc: 0.9668
Test loss: 0.113
Test accuracy: 0.967
This looks much better. Now consider what happens without any activation function: even after stacking many layers, you still get just a linear combination of your input with the weights and bias, which is very similar to a neural network without any hidden layers. You can add some more layers, without activations, just to see what occurs, as shown below.
model = Sequential()
model.add(Dense(512, input_shape=(784,)))
for i in range(5):
    model.add(Dense(512))
model.add(Dense(10, activation='softmax'))
When you do this, your neural network model looks as indicated below.
dense_1 (None, 784) ==> (None, 512)
dense_2 (None, 512) ==> (None, 512)
dense_3 (None, 512) ==> (None, 512)
dense_4 (None, 512) ==> (None, 512)
dense_5 (None, 512) ==> (None, 512)
dense_6 (None, 512) ==> (None, 512)
dense_7 (None, 512) ==> (None, 10)
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
dense_1 (Dense)              (None, 512)               401920
_________________________________________________________________
dense_2 (Dense)              (None, 512)               262656
_________________________________________________________________
dense_3 (Dense)              (None, 512)               262656
_________________________________________________________________
dense_4 (Dense)              (None, 512)               262656
_________________________________________________________________
dense_5 (Dense)              (None, 512)               262656
_________________________________________________________________
dense_6 (Dense)              (None, 512)               262656
_________________________________________________________________
dense_7 (Dense)              (None, 10)                5130
=================================================================
Total params: 1,720,330
Trainable params: 1,720,330
Non-trainable params: 0
_________________________________________________________________
None
You get results for five epochs as follows.
Train on 60000 samples, validate on 10000 samples
Epoch 1/5
60000/60000 [==============================] - 17s - loss: 1.3217 - acc: 0.7310 - val_loss: 0.7553 - val_acc: 0.7928
Epoch 2/5
60000/60000 [==============================] - 16s - loss: 0.5304 - acc: 0.8425 - val_loss: 0.4121 - val_acc: 0.8787
Epoch 3/5
60000/60000 [==============================] - 15s - loss: 0.4325 - acc: 0.8724 - val_loss: 0.3683 - val_acc: 0.9005
Epoch 4/5
60000/60000 [==============================] - 16s - loss: 0.3936 - acc: 0.8852 - val_loss: 0.3638 - val_acc: 0.8953
Epoch 5/5
60000/60000 [==============================] - 16s - loss: 0.3712 - acc: 0.8945 - val_loss: 0.4163 - val_acc: 0.8767
Test loss: 0.416
Test accuracy: 0.877
This is quite bad. You can see that your neural network model is simply unable to learn what you want. This happened because without nonlinearity, your neural network is just a basic linear classifier, incapable of acquiring any nonlinear relationships.
The sigmoid, on the other hand, is a nonlinear function, so we cannot represent it as a linear combination of our input. That is exactly what brings nonlinearity to your neural network model, so it can learn nonlinear relationships. Now, train the same deeper model using sigmoid activations.
Train on 60000 samples, validate on 10000 samples
Epoch 1/5
60000/60000 [==============================] - 16s - loss: 0.8012 - acc: 0.7228 - val_loss: 0.3798 - val_acc: 0.8949
Epoch 2/5
60000/60000 [==============================] - 15s - loss: 0.3078 - acc: 0.9131 - val_loss: 0.2642 - val_acc: 0.9264
Epoch 3/5
60000/60000 [==============================] - 15s - loss: 0.2031 - acc: 0.9419 - val_loss: 0.2095 - val_acc: 0.9408
Epoch 4/5
60000/60000 [==============================] - 15s - loss: 0.1545 - acc: 0.9544 - val_loss: 0.2434 - val_acc: 0.9282
Epoch 5/5
60000/60000 [==============================] - 15s - loss: 0.1236 - acc: 0.9633 - val_loss: 0.1504 - val_acc: 0.9548
Test loss: 0.15
Test accuracy: 0.955
This is much better. In this case, you are probably overfitting, but you can see that you got a great boost in your model's performance just by using an activation function. Sigmoid activation functions are great, as they have many phenomenal properties, like differentiability and nonlinearity, and their (0, 1) range makes the outputs nicely interpretable as probabilities.
However, this approach has its drawbacks. For instance, when you use backpropagation, you must propagate the derivative from the output back to the initial weights. You want to pass your regression or classification error at the final output value back through your whole neural network.
Therefore, you must differentiate through your layers as well as update all the weights. However, with sigmoid, there is an issue with the derivative: its maximum value is quite small, just 0.25. This means you can only pass a small fraction of your error to the previous neural network layers. This issue may cause your model to learn slowly, so it needs more epochs and data. To solve this problem, you can use the tanh function.
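You can verify that 0.25 figure numerically. A quick numpy check of the sigmoid derivative, which is s(x)·(1 − s(x)):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

x = np.linspace(-10, 10, 10001)
derivative = sigmoid(x) * (1.0 - sigmoid(x))  # d/dx sigmoid(x)

# The derivative peaks at x = 0, where sigmoid(0) = 0.5, so the
# maximum is 0.5 * (1 - 0.5) = 0.25 -- the backpropagation bottleneck.
print(round(derivative.max(), 4))  # 0.25
```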
TANH ACTIVATION FUNCTION
The tanh activation function, just like sigmoid, is differentiable and nonlinear. Tanh gives output in the (-1, 1) range, which is not as convenient as the (0, 1) range for probabilities, but is perfectly fine for neural network hidden layers. Tanh also has a larger maximum derivative (1 rather than 0.25), which is good for our issue here, as we can pass more of the error backwards than was possible with sigmoid.
To use the tanh activation function, you must change the activation attribute of your dense layer.
model = Sequential()
model.add(Dense(512, activation='tanh', input_shape=(784,)))
model.add(Dense(10, activation='softmax'))
Again, you can see that the structure of your neural network is the same. Now, train for five epochs.
Train on 60000 samples, validate on 10000 samples
Epoch 1/5
60000/60000 [==============================] - 5s - loss: 0.3333 - acc: 0.9006 - val_loss: 0.2106 - val_acc: 0.9383
Epoch 2/5
60000/60000 [==============================] - 3s - loss: 0.1754 - acc: 0.9489 - val_loss: 0.1485 - val_acc: 0.9567
Epoch 3/5
60000/60000 [==============================] - 3s - loss: 0.1165 - acc: 0.9657 - val_loss: 0.1082 - val_acc: 0.9670
Epoch 4/5
60000/60000 [==============================] - 3s - loss: 0.0843 - acc: 0.9750 - val_loss: 0.0920 - val_acc: 0.9717
Epoch 5/5
60000/60000 [==============================] - 3s - loss: 0.0653 - acc: 0.9806 - val_loss: 0.0730 - val_acc: 0.9782
Test loss: 0.073
Test accuracy: 0.978
You can see that you improved your test accuracy by more than one percent just by using a different activation function. Now, you probably wonder, can you do better? Fortunately, you can, thanks to the relu activation function.
RELU ACTIVATION FUNCTION
The range of the relu activation function is 0 to infinity. However, unlike the tanh and sigmoid functions, relu is not differentiable at zero, although there are practical solutions to this.
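The function itself is tiny. A numpy sketch of relu and its gradient, using the common convention of assigning gradient zero at exactly x = 0:

```python
import numpy as np

def relu(x):
    """max(0, x), applied elementwise."""
    return np.maximum(0.0, x)

def relu_grad(x):
    """1 for positive inputs, 0 otherwise -- the kink at x = 0 is
    conventionally assigned gradient 0."""
    return (x > 0).astype(float)

x = np.array([-2.0, -0.5, 0.0, 0.5, 2.0])
print(relu(x).tolist())       # [0.0, 0.0, 0.0, 0.5, 2.0]
print(relu_grad(x).tolist())  # [0.0, 0.0, 0.0, 1.0, 1.0]
```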
The best thing about the relu activation function is its gradient, which is equal to one for all positive inputs, so you can pass the maximum amount of the error through during backpropagation. Now, train your model and see the results.
Train on 60000 samples, validate on 10000 samples
Epoch 1/5
60000/60000 [==============================] - 5s - loss: 0.2553 - acc: 0.9263 - val_loss: 0.1505 - val_acc: 0.9516
Epoch 2/5
60000/60000 [==============================] - 3s - loss: 0.1041 - acc: 0.9693 - val_loss: 0.0920 - val_acc: 0.9719
Epoch 3/5
60000/60000 [==============================] - 3s - loss: 0.0690 - acc: 0.9790 - val_loss: 0.0833 - val_acc: 0.9744
Epoch 4/5
60000/60000 [==============================] - 4s - loss: 0.0493 - acc: 0.9844 - val_loss: 0.0715 - val_acc: 0.9781
Epoch 5/5
60000/60000 [==============================] - 3s - loss: 0.0376 - acc: 0.9885 - val_loss: 0.0645 - val_acc: 0.9823
Test loss: 0.064
Test accuracy: 0.982
Now, you got the best result, 98.2%, which is quite amazing given that you did not add any extra hidden layers.
It is very important to say that there is no single best activation function. One may be better in some cases while another is better in other instances. Another important thing to note, as these experiments show, is that the choice of activation function very much affects what your neural network can learn and how fast.
CHAPTER 5: MNIST HANDWRITTEN RECOGNITION
In this section of the book, we are going to build a simple neural network in Keras and train it on a GPU-enabled server. This model will be able to recognize handwritten digits thanks to the MNIST dataset. As you already know, MNIST contains 70,000 images, 10,000 for testing and 60,000 for training. All images are 28x28 pixels, centered to reduce preprocessing time.
To start, you must set up your environment with Keras, using Theano or TensorFlow as the backend. In this example, we are going to use the TensorFlow and Keras packages, installed as shown below.
conda install -qy -c anaconda tensorflow-gpu h5py
pip install keras
Once done, you must add the imports as follows. These imports are quite standard: array handling with NumPy and plotting with matplotlib, followed by the Keras classes themselves. You will also keep the TensorFlow backend quiet by setting a logging environment variable.
import numpy as np
import matplotlib
matplotlib.use('agg')
import matplotlib.pyplot as plt
import os
os.environ['TF_CPP_MIN_LOG_LEVEL'] = '3'
from keras.datasets import mnist
from keras.models import Sequential, load_model
from keras.layers.core import Dense, Dropout, Activation
from keras.utils import np_utils
After that is done, you can have Keras load the dataset and build your neural network on it. The following step is to prepare the dataset we are going to use, MNIST. You load the dataset using this very handy function, which splits MNIST into train and test sets.
(X_train, y_train), (X_test, y_test) = mnist.load_data()
The next step is to inspect several examples. Take into consideration that MNIST contains only grayscale images; for more advanced datasets we would use RGB, or three color channels.
fig = plt.figure()
for i in range(9):
    plt.subplot(3, 3, i+1)
    plt.tight_layout()
    plt.imshow(X_train[i], cmap='gray', interpolation='none')
    plt.title("Class {}".format(y_train[i]))
    plt.xticks([])
    plt.yticks([])
fig
Next, you prepare to train your model to classify images. To do this, you must unroll the width x height pixel format into one long vector, your input vector. First, graph the distribution of your pixel values as follows.
fig = plt.figure()
plt.subplot(2, 1, 1)
plt.imshow(X_train[0], cmap='gray', interpolation='none')
plt.title("Class {}".format(y_train[0]))
plt.xticks([])
plt.yticks([])
plt.subplot(2, 1, 2)
plt.hist(X_train[0].reshape(784))
plt.title("Pixel Value Distribution")
fig
Just as expected, the pixel values range from zero to 255. The background majority is close to zero, while the pixels closer to 255 represent the MNIST digits. To speed up model training, you should normalize the input data. By normalizing your input data, you also reduce the chance of getting stuck in local optima, since you are using stochastic gradient descent to find the optimal weights for your neural network.
The next step is reshaping your inputs to a single vector and normalizing the pixel values to be between zero and one. Before you reshape and normalize, print the shapes.
print("X_train shape", X_train.shape)
print("y_train shape", y_train.shape)
print("X_test shape", X_test.shape)
print("y_test shape", y_test.shape)
After that, you must build your input vectors from the 28x28 pixels as seen below.
X_train = X_train.reshape(60000, 784)
X_test = X_test.reshape(10000, 784)
X_train = X_train.astype('float32')
X_test = X_test.astype('float32')
The next step is to normalize the data to boost model training.
X_train /= 255
X_test /= 255
The following step is to print your final input shape, which is ready for training.
print("Train matrix shape", X_train.shape)
print("Test matrix shape", X_test.shape)
('X_train shape', (60000, 28, 28))
('y_train shape', (60000,))
('X_test shape', (10000, 28, 28))
('y_test shape', (10000,))
('Train matrix shape', (60000, 784))
('Test matrix shape', (10000, 784))
As you can see, y in this training set holds integer values from zero to nine. You can confirm this before using it for model training.
print(np.unique(y_train, return_counts=True))
(array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9], dtype=uint8), array([5923, 6742, 5958, 6131, 5842, 5421, 5918, 6265, 5851, 5949]))
The next step is to encode your categories, the digits from zero to nine, using one-hot encoding. The result is a vector with a length equal to your number of categories, which is all zeros except for a one in the position corresponding to the class.
n_classes = 10
print("", y_train.shape)
Y_train = np_utils.to_categorical(y_train, n_classes)
Y_test = np_utils.to_categorical(y_test, n_classes)
print(":", Y_train.shape)
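Under the hood, one-hot encoding like to_categorical's can be sketched with plain numpy, using np.eye to supply the identity rows (an illustration of the encoding, not the np_utils implementation):

```python
import numpy as np

def one_hot(labels, n_classes):
    """Row i of the identity matrix is the one-hot vector for class i."""
    return np.eye(n_classes)[labels]

labels = np.array([3, 0, 9])
encoded = one_hot(labels, n_classes=10)

print(encoded.shape)        # (3, 10)
print(encoded[0].tolist())  # [0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]
```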
The next step is to turn to Keras to build your neural network. At this point, your pixel vector serves as the input. There are two hidden 512-node layers, which gives the model enough complexity to recognize digits. Because this is multi-class classification, we must add a final fully-connected layer for the ten different output classes. We are going to use the sequential model, stacking layers with the add function.
When you add the first layer to a Keras sequential model, you must specify the input shape so Keras can create the proper matrices; the shapes of the other layers are inferred by Keras automatically. To introduce nonlinearities into your network and take it beyond the capabilities of a basic perceptron, you must add activation functions to your hidden layers.
The differentiation needed for training with backpropagation happens behind the scenes. You will add dropout as the best way of preventing model overfitting, and you will use the softmax activation, the standard for multi-class targets, on the output. When building your model, the first step is to build a linear stack of layers.
model = Sequential()
model.add(Dense(512, input_shape=(784,)))
model.add(Activation('relu'))
model.add(Dropout(0.2))
model.add(Dense(512))
model.add(Activation('relu'))
model.add(Dropout(0.2))
model.add(Dense(10))
model.add(Activation('softmax'))
Following that, you must compile your model using the compile function. In this step, you must specify your objective, or loss, function. In this example, we are going to use categorical cross entropy, but you can use other loss functions.
When it comes to the optimizer, in this example we are going to use the Adam optimizer with default settings. You could also instantiate an optimizer and set its parameters just before calling compile. Finally, you must choose which metrics you want to evaluate during model training and testing; these metrics can be displayed during both stages if you like. To compile your model, do as follows.
model.compile(loss='categorical_crossentropy', metrics=['accuracy'], optimizer='adam')
Once you compile your model, you can move on to model training. You must specify how many times to iterate over the training set, the epochs, and how many samples are used per weight update, the batch size.
Keep in mind that the bigger the batch, the more stable the stochastic gradient descent updates become. However, be aware of GPU memory limitations. In this example, we are going with a batch size of 128 and 8 epochs.
To monitor your model training process properly, you should graph the learning curve of your model, looking at the model accuracy and loss. Before you continue, you should also save your model. Once completed, you get to work with your trained model and finally evaluate its performance. To save the training metrics, run as shown below.
history = model.fit(X_train, Y_train,
                    batch_size=128, epochs=8,
                    verbose=2,
                    validation_data=(X_test, Y_test))
Then, you must save your model.
save_dir = "/results/"
model_name = 'keras_mnist.h5'
model_path = os.path.join(save_dir, model_name)
model.save(model_path)
print('Saved trained model at %s' % model_path)
The next step is to plot the metrics.
fig = plt.figure()
plt.subplot(2, 1, 1)
plt.plot(history.history['acc'])
plt.plot(history.history['val_acc'])
plt.title('model accuracy')
plt.ylabel('accuracy')
plt.xlabel('epoch')
plt.legend(['train', 'test'], loc='lower right')
plt.subplot(2, 1, 2)
plt.plot(history.history['loss'])
plt.plot(history.history['val_loss'])
plt.title('model loss')
plt.ylabel('loss')
plt.xlabel('epoch')
plt.legend(['train', 'test'], loc='upper right')
plt.tight_layout()
fig
You will notice that the loss on your training set decreases rapidly over the first two epochs, which means your model is learning to classify handwritten digits quite fast. On the test set, the loss does not decrease as fast, but it stays within range of the training loss, which means your model is able to generalize well to unseen data.
The following step is to evaluate your model's performance on the given test set. To assess your model, use the model's evaluate function, which computes every metric defined during model compilation. In this example, the model accuracy is computed on the 10,000 test images, using the model weights stored in our saved model.
mnist_model = load_model(model_path)
loss_and_metrics = mnist_model.evaluate(X_test, Y_test, verbose=2)
print("Test Loss", loss_and_metrics[0])
print("Test Accuracy", loss_and_metrics[1])
('Test Loss', 0.06264158328680787)
('Test Accuracy', 0.98299999999999998)
The model accuracy you get looks quite good. However, you should also look at nine examples each of correctly and incorrectly classified digits. The first step is to load the model and create predictions on your test set.
mnist_model = load_model(model_path)
predicted_classes = mnist_model.predict_classes(X_test)
Then, see what you predicted correctly and incorrectly.
correct_indices = np.nonzero(predicted_classes == y_test)[0]
incorrect_indices = np.nonzero(predicted_classes != y_test)[0]
print()
print(len(correct_indices), "classified correctly")
print(len(incorrect_indices), "classified incorrectly")
The following step is to adapt the figure size to accommodate eighteen subplots as follows.
plt.rcParams['figure.figsize'] = (7, 14)
figure_evaluation = plt.figure()
Then,youmustplotninecorrectandnineincorrectpredictions.
for i, correct in enumerate(correct_indices[:9]):
    plt.subplot(6, 3, i + 1)
    plt.imshow(X_test[correct].reshape(28, 28), cmap='gray', interpolation='none')
    plt.title("Predicted {}, Class {}".format(predicted_classes[correct], y_test[correct]))
    plt.xticks([])
    plt.yticks([])
for i, incorrect in enumerate(incorrect_indices[:9]):
    plt.subplot(6, 3, i + 10)
    plt.imshow(X_test[incorrect].reshape(28, 28), cmap='gray', interpolation='none')
    plt.title("Predicted {}, Class {}".format(predicted_classes[incorrect], y_test[incorrect]))
    plt.xticks([])
    plt.yticks([])
figure_evaluation
9696/10000 [============================>.] - ETA: 0s
(9830, 'classified correctly')
(170, 'classified incorrectly')
As you can see, these incorrect predictions are quite forgivable, as in some cases the digit is hard even for a human reader to recognize. In this section of the book, we used Keras with its TensorFlow backend on a GPU-enabled server to train our neural network to recognize handwritten digits in just under 20 seconds of overall training time.
CHAPTER 6: NEURAL NETWORK MODELS FOR MULTI-CLASS CLASSIFICATION PROBLEMS
As you already know, Keras is a highly powerful Python library for deep learning which wraps the efficient numerical libraries TensorFlow and Theano. In this section of the book, you are going to learn how to use Keras to develop and evaluate neural network models for assorted multi-class classification problems.
After that, you will know how to load data from CSV into Keras, how to prepare multi-class classification data for further modeling with your neural networks, and how to evaluate your Keras neural network models using scikit-learn.
In this specific example, we are going to work with one of the standard machine learning problems, the iris flowers dataset. This problem is well studied, so it is a great example to use when you want to practice with neural networks: all four input variables are numeric and share the same scale, measured in centimeters. In addition, every instance describes the measured properties of a flower, and the output variable is a specific iris species.
This is a standard multi-class classification problem. This means there are more than two classes to predict; in fact, there are three flower species.
This is a good illustration to use when you want to practice with neural network models in Keras, because these three class values require specialized handling. Since the iris flower dataset is a very common and well-studied problem, expect a model accuracy somewhere between 95% and 97%. To start, download the iris flower dataset from the UCI Machine Learning Repository. Once downloaded, place it in your working directory. The next step is to import all the classes and functions you need. This includes data loading with pandas and data preparation, as well as model evaluation from scikit-learn.
import numpy
import pandas
from keras.models import Sequential
from keras.layers import Dense
from keras.wrappers.scikit_learn import KerasClassifier
from keras.utils import np_utils
from sklearn.model_selection import cross_val_score
from sklearn.model_selection import KFold
from sklearn.preprocessing import LabelEncoder
from sklearn.pipeline import Pipeline
The next step is to initialize the random number generator to a constant value of seven. This is very important, as you want to ensure that the results you get from this neural network model can in fact be achieved again. This step ensures that the stochastic process of model training can be reproduced. Therefore, your next step is to fix the random seed for reproducibility, as seen below.
seed=7
numpy.random.seed(seed)
Once fixed, you must load the dataset. Because the output variable contains strings, the easiest way is to load the data using pandas. You can then split the attributes into input variables X and output variables Y.
dataframe = pandas.read_csv("iris.csv", header=None)
dataset = dataframe.values
X = dataset[:, 0:4].astype(float)
Y = dataset[:, 4]
ONE-HOT ENCODING
As already mentioned, in this example the output variable contains three string values, so you must encode it. When you model multi-class classification problems with neural networks, the best practice is to reshape the output attribute from a vector containing one value per instance into a matrix that has a Boolean for every class value for each instance. This is called creating dummy variables, or one-hot encoding of categorical variables.
For instance, in this problem there are three class values: Iris-setosa, Iris-versicolor and Iris-virginica.
Iris-setosa
Iris-versicolor
Iris-virginica
In this case, you can turn this into a one-hot encoded binary matrix for every data instance, which would look as shown below.
Iris-setosa, Iris-versicolor, Iris-virginica
1, 0, 0
0, 1, 0
0, 0, 1
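As a quick sanity check of the matrix above, here is a minimal NumPy sketch (an illustration, not the Keras utility used later in this chapter) that builds the same one-hot rows from integer class indices:

```python
import numpy as np

# Integer-encoded classes: 0 = Iris-setosa, 1 = Iris-versicolor, 2 = Iris-virginica
classes = np.array([0, 1, 2])

# np.eye(3) is the 3x3 identity matrix; indexing it by the class index
# turns each integer label into the corresponding one-hot row
one_hot = np.eye(3)[classes]
print(one_hot)
```

Each row has exactly one 1, in the column for its class, which is precisely the matrix shown above.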
You can do this by first encoding the strings as integers with the scikit-learn class LabelEncoder. Once completed, you can convert the integer vector to a one-hot encoding using the to_categorical function in Keras, as demonstrated below.
encoder=LabelEncoder()
encoder.fit(Y)
encoded_Y=encoder.transform(Y)
dummy_y = np_utils.to_categorical(encoded_Y)
DEFINING NEURAL NETWORK MODELS WITH SCIKIT-LEARN
Once done with one-hot encoding, you must define your neural network model. The Keras library comes with wrapper classes that allow you to use the neural network models you create in Keras within scikit-learn. In particular, there is a KerasClassifier class that can be used as a scikit-learn estimator.
The KerasClassifier takes the name of a function as an argument. This function must return your neural network model, ready for training. Next, we are going to create such a function for this iris classification problem.
Once you run the code, you will create a simple, fully connected neural network containing one hidden layer with eight neurons. This hidden layer uses a rectified linear activation function, which is good practice. Since we already one-hot encoded the iris dataset, the output layer must create three output values, one for each class. The output with the largest value then becomes the class predicted by your neural network model, as follows.
4 inputs -> [8 hidden nodes] -> 3 outputs
One thing should be noted. We used a softmax activation function in the output layer to ensure that the output values of the model are in the range of zero to one, so they may be used as predicted probabilities.
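For intuition, here is a minimal NumPy sketch of what softmax does to three raw output scores (the scores are hypothetical values chosen for illustration, not taken from the model):

```python
import numpy as np

def softmax(z):
    # Subtract the max for numerical stability, then normalize the exponentials
    e = np.exp(z - z.max())
    return e / e.sum()

scores = np.array([2.0, 1.0, 0.1])  # hypothetical raw outputs for the 3 classes
probs = softmax(scores)
print(probs)        # each value lies strictly between 0 and 1
print(probs.sum())  # the three values sum to 1
```

The largest raw score yields the largest probability, which is why the class with the biggest output value becomes the prediction.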
Finally, we must use the efficient Adam gradient descent optimization algorithm alongside a logarithmic loss function, represented by the categorical_crossentropy argument in Keras. Therefore, the next step is to define your baseline model. Once defined, you must create and compile it as below.
def baseline_model():
    model = Sequential()
    model.add(Dense(8, input_dim=4, activation='relu'))
    model.add(Dense(3, activation='softmax'))
    model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
    return model
After that, you can finally create your KerasClassifier for use in scikit-learn. You can also pass arguments during the construction of the KerasClassifier; these are passed on to the fit function used internally to train your neural network model. Here, we pass the number of epochs as 200 and the batch size as 5 to use during model training. Bear in mind that debugging output is also turned off, as we set verbose to zero.
estimator = KerasClassifier(build_fn=baseline_model, epochs=200, batch_size=5, verbose=0)
EVALUATING MODELS WITH K-FOLD CROSS VALIDATION
Once finished with the previous step, you must evaluate your neural network model on the training data. The powerful scikit-learn library has excellent capabilities for evaluating models using several different techniques. A gold standard for evaluating neural network models is k-fold cross validation.
Using k-fold cross validation, you can evaluate your model on the dataset; here we use a ten-fold split defined with the KFold class. The evaluation process will take about ten seconds. When finished, cross_val_score returns an object describing the evaluation of the ten models constructed, one for each split of the dataset, as shown below.
kfold = KFold(n_splits=10, shuffle=True, random_state=seed)
results = cross_val_score(estimator, X, dummy_y, cv=kfold)
print("Baseline: %.2f%% (%.2f%%)" % (results.mean() * 100, results.std() * 100))
Once completed, you will see the results summarized as both the mean and the standard deviation of your model's accuracy on the dataset.
This is a very reasonable estimate of the performance of your model on unseen data. It is well within the realm of known results for this specific problem, as you get accuracy as seen below.
Accuracy: 97.33% (4.42%)
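For intuition about what the ten-fold split actually does, here is a hand-rolled sketch of the partitioning (without shuffling, for clarity; this illustrates the idea rather than reproducing scikit-learn's KFold):

```python
import numpy as np

n_samples, n_splits = 150, 10          # the iris dataset has 150 rows; we use 10 folds
indices = np.arange(n_samples)
folds = np.array_split(indices, n_splits)

for i, test_idx in enumerate(folds[:2]):   # inspect the first two folds
    # Each fold becomes the test set once; the other nine folds form the train set
    train_idx = np.concatenate([f for j, f in enumerate(folds) if j != i])
    print("fold", i, "train:", len(train_idx), "test:", len(test_idx))
# fold 0 train: 135 test: 15
# fold 1 train: 135 test: 15
```

A model is trained and scored once per fold, and the ten accuracies are what cross_val_score averages.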
CHAPTER 7: RECURRENT NEURAL NETWORKS
In this last section of the book, you are going to learn how to create recurrent neural networks in Keras. Recurrent neural networks are a class of neural network models which exploit the sequential nature of their input. Such inputs can be speech, text, time series, and anything else where the occurrence of an element in the sequence depends on the elements that appeared before it.
An RNN model can be thought of as a graph of recurrent neural network cells, where every cell performs the same operation on each element in the sequence.
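To make that concrete, here is a minimal NumPy sketch (an illustration with hypothetical sizes, not the book's code) of a single RNN cell applying the same weights at every step while carrying a hidden state forward:

```python
import numpy as np

rng = np.random.default_rng(0)
W_x = rng.normal(size=(4, 3))   # input-to-hidden weights (hypothetical sizes)
W_h = rng.normal(size=(4, 4))   # hidden-to-hidden weights
h = np.zeros(4)                 # hidden state: the cell's "memory"

sequence = rng.normal(size=(5, 3))  # 5 timesteps, 3 features each
for x_t in sequence:
    # The same operation is applied to every element of the sequence;
    # h summarizes everything seen so far
    h = np.tanh(W_x @ x_t + W_h @ h)

print(h.shape)  # (4,)
```

The final h depends on the whole sequence, which is what lets the network exploit order.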
Recurrent neural networks are very flexible, so they have been used to solve diverse problems like language modeling, speech recognition, sentiment analysis, machine translation and image captioning, to name a few.
Recurrent neural networks can be readily adapted to many kinds of problems just by rearranging the way the cells are situated in the graph. In this section of the book, you are going to learn more about LSTM (long short-term memory) and GRU (gated recurrent unit) models, including their powers and their limitations.
Both GRU and LSTM are drop-in replacements for the basic recurrent neural network cell, so just by replacing the basic cell with one of these two variants you can get a major performance boost in your network.
While GRU and LSTM are not the only variants, they have proven to be the most effective for solving most sequence problems.
SEQUENCE CLASSIFICATION WITH LSTM RECURRENT NEURAL NETWORKS
Sequence classification is a common predictive modeling problem in which you have a sequence of inputs placed over time or space, and your task is to predict a specific category for that sequence. A powerful type of neural network created to handle problems like this is the LSTM recurrent neural network.
The long short-term memory network is a type of recurrent neural network commonly used in deep learning because large architectures based on it can be trained successfully. In this section, you are going to learn sequence classification in Keras using LSTM recurrent neural networks.
What makes this problem difficult is that the sequences can vary in length, they may contain a very large vocabulary of input symbols, and they may require your model to learn the long-term context of the dependencies between symbols in the input sequence.
The problem we are going to solve is the IMDB movie sentiment classification problem. Each movie review on IMDB is a variable-length sequence of words, and the sentiment of each review must be classified. We will use the IMDB dataset, which contains 25,000 movie reviews, good and bad, split between training and testing. The task is to determine whether a given movie review has a negative or positive sentiment.
Keras comes with built-in access to the IMDB dataset. To load it, use the imdb.load_data function. Once loaded, you can use it for your deep learning models. The words have been replaced by integers indicating the ordered frequency of each word in the dataset, so the sentences in each movie review are comprised of a sequence of integers.
WORD EMBEDDING
Our first move is to map each movie review into a real vector domain. This is a very popular technique for text, known as word embedding. In this technique, words are encoded as real-valued vectors in a high-dimensional space, in which similarity between words, in terms of their meaning, translates to closeness in that vector space.
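Mechanically, an embedding layer is just a lookup table from integer word indices to rows of a weight matrix. A minimal NumPy sketch of the idea (with small hypothetical sizes, separate from the Keras Embedding layer used below):

```python
import numpy as np

vocab_size, embed_dim = 10, 4               # hypothetical tiny vocabulary
rng = np.random.default_rng(1)
embedding_matrix = rng.normal(size=(vocab_size, embed_dim))

review = np.array([3, 7, 7, 1])             # a "review" as integer word indices
vectors = embedding_matrix[review]          # row lookup: one vector per word

print(vectors.shape)  # (4, 4): 4 words, each mapped to a 4-dimensional vector
```

Note that the repeated word index 7 maps to the same vector both times; during training, those rows are adjusted so that similar words end up close together.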
Keras is good for this, as it provides a highly effective and convenient way of converting positive integer word representations into word embeddings through its Embedding layer. We are going to map each word onto a 32-length real-valued vector. In addition, we are going to limit the total number of words we model to the five thousand most frequent words, and zero out the rest.
Moreover, we are going to constrain each movie review to five hundred words, truncating longer reviews and padding shorter reviews with zero values. The first step is to prepare and model the data. Once done, you are ready to create your LSTM model, which will classify the sentiment of movie reviews.
Your first step is to quickly develop a basic LSTM for this IMDB problem. Start by importing the functions and classes required for this model. Then initialize the random number generator to a constant value to make sure you can effortlessly reproduce your results.
import numpy
from keras.datasets import imdb
from keras.models import Sequential
from keras.layers import Dense
from keras.layers import LSTM
from keras.layers.embeddings import Embedding
from keras.preprocessing import sequence
numpy.random.seed(7)
Once done, you must load the IMDB dataset, constraining it to the top five thousand words. You must also split the dataset into train and test sets.
top_words = 5000
(X_train, y_train), (X_test, y_test) = imdb.load_data(num_words=top_words)
The following step is truncating and padding the input sequences so they are all the same length. The model will learn that the zero values carry no information; same-length vectors are simply required for the computation.
max_review_length = 500
X_train = sequence.pad_sequences(X_train, maxlen=max_review_length)
X_test = sequence.pad_sequences(X_test, maxlen=max_review_length)
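Under the hood, pad_sequences simply normalizes lengths. A minimal pure-Python sketch of the idea, pre-padding with zeros and truncating from the front, which mirrors the Keras defaults:

```python
def pad_or_truncate(seq, maxlen):
    # Keep the last maxlen elements (truncate from the front),
    # then left-pad with zeros up to maxlen
    seq = seq[-maxlen:]
    return [0] * (maxlen - len(seq)) + seq

print(pad_or_truncate([5, 8, 3], 5))           # [0, 0, 5, 8, 3]
print(pad_or_truncate([1, 2, 3, 4, 5, 6], 5))  # [2, 3, 4, 5, 6]
```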
Once completed, you must define, compile and finally fit your LSTM model.
The first layer is the embedding layer, which uses 32-length vectors to represent each word. The following layer is the LSTM layer with one hundred smart, or memory, units. Finally, you must use a dense output layer with a single neuron and a sigmoid activation function to make zero-or-one predictions for the two classes in the problem, good and bad reviews.
Since this is a binary classification problem, you must use log loss as your loss function, in combination with the efficient Adam optimizer. The model is fit for only three epochs, because it quickly overfits this problem. To space out weight updates, you will use a large batch size of 64 movie reviews.
embedding_vecor_length = 32
model = Sequential()
model.add(Embedding(top_words, embedding_vecor_length, input_length=max_review_length))
model.add(LSTM(100))
model.add(Dense(1, activation='sigmoid'))
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
print(model.summary())
model.fit(X_train, y_train, validation_data=(X_test, y_test), epochs=3, batch_size=64)
The next step is to estimate the performance of the model on unseen movie reviews, as follows.
scores = model.evaluate(X_test, y_test, verbose=0)
print("Accuracy: %.2f%%" % (scores[1] * 100))
Running this code, you will get output as indicated below.
Epoch 1/3
16750/16750 [==============================] - 107s - loss: 0.5570 - acc: 0.7149
Epoch 2/3
16750/16750 [==============================] - 107s - loss: 0.3530 - acc: 0.8577
Epoch 3/3
16750/16750 [==============================] - 107s - loss: 0.2559 - acc: 0.9019
Accuracy: 86.79%
APPLYING DROPOUT
This very simple LSTM model with little tuning achieves great results on the IMDB problem. Use it as a template you can adapt to LSTM networks for your own sequence classification problems.
Recurrent neural networks like LSTMs frequently suffer from overfitting, which you can combat by applying the Keras Dropout layer between layers. Simply add new Dropout layers between the Embedding and LSTM layers and between the LSTM and Dense output layers, as follows.
from keras.layers import Dropout
model = Sequential()
model.add(Embedding(top_words, embedding_vecor_length, input_length=max_review_length))
model.add(Dropout(0.2))
model.add(LSTM(100))
model.add(Dropout(0.2))
model.add(Dense(1, activation='sigmoid'))
Running this, you will get the following result.
Epoch 1/3
16750/16750 [==============================] - 112s - loss: 0.6623 - acc: 0.5935
Epoch 2/3
16750/16750 [==============================] - 113s - loss: 0.5159 - acc: 0.7484
Epoch 3/3
16750/16750 [==============================] - 113s - loss: 0.4502 - acc: 0.7981
Accuracy: 82.82%
As you can see, the dropout layers have an impact on training, giving a slightly lower final accuracy and a slower trend in convergence. This LSTM model could probably use several more epochs of training for better skill. Dropout can also be applied to the input and recurrent connections of the memory units of the LSTM separately and precisely.
Keras provides this capability directly on the LSTM layer: you can use the dropout and recurrent_dropout parameters to configure the input dropout and the recurrent dropout. You can modify the code to add dropout to the recurrent connections and to the input as illustrated below.
model = Sequential()
model.add(Embedding(top_words, embedding_vecor_length, input_length=max_review_length))
model.add(LSTM(100, dropout=0.2, recurrent_dropout=0.2))
model.add(Dense(1, activation='sigmoid'))
You will see that this LSTM-specific dropout has a more pronounced effect on the convergence of the network than the layer-wise dropout. Dropout is a very powerful technique you should use for combating overfitting in your LSTM models, so make sure you try both methods, even though you may get better results with this gate-specific dropout method.
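For intuition about what a dropout rate of 0.2 actually does, here is a minimal NumPy sketch of inverted dropout at training time (an illustration of the mechanism, not the Keras implementation):

```python
import numpy as np

rng = np.random.default_rng(2)

def dropout(activations, rate):
    # Zero each activation with probability `rate`, then scale the
    # survivors by 1/(1 - rate) so the expected value is unchanged
    mask = rng.random(activations.shape) >= rate
    return activations * mask / (1.0 - rate)

acts = np.ones(10)
dropped = dropout(acts, 0.2)
print(dropped)  # each value is either 0.0 (dropped) or 1.25 (survived, rescaled)
```

Randomly silencing units this way prevents the network from relying too heavily on any single activation, which is the regularizing effect you see in the training curves above.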
NATURAL LANGUAGE PROCESSING WITH RECURRENT NEURAL NETWORKS
In this section of the book, we are going to solve a natural language processing problem using recurrent neural networks in Keras. The broad goal of natural language processing is to extract the meaning of speech utterances. We are going to break this down into the practical, solvable problem of understanding the speaker in a limited context: identifying the intent of a speaker asking for information about flights.
We are going to use the Airline Travel Information System (ATIS) dataset, collected by DARPA back in the early 90s. The dataset consists of spoken queries about numerous flights. ATIS contains 4,978 sentences and 56,590 words across the train and test sets, and the number of classes is 128. Our approach is to use recurrent neural networks with word embedding.
As you already know, word embedding maps words to vectors in a high-dimensional space. When learned well, the embedding captures the syntactic and semantic information of the words in this space. The embedding space will be learned by the model you define later.
For this problem, a convolutional layer can do a great job of pooling local information, but it cannot capture the sequential nature of the data, so we are going to use recurrent neural networks, which are built to handle consecutive information such as natural language.
A recurrent neural network model has a memory that stores a summary of the sequence the model has seen so far.
This means you can use recurrent neural networks to solve complex word tagging problems like POS (part of speech) tagging or slot filling, as in this problem.
For this problem, you must pass the sequence of word embeddings as the input to your recurrent neural network.
As you are going to use the IOB representation for your labels, it is necessary to calculate the scores of your model accordingly. You will run the code shown below for score calculation. Prior to that, however, you must download the corresponding ATIS files.
git clone https://github.com/chsasank/ATIS.keras.git
cd ATIS.keras
I recommend you use Jupyter Notebook. After that, you must load your data using the data.load.atisfull function. Keras will download the data the first time you run it.
Labels and words are encoded as indexes into the dataset vocabulary, and the vocabularies are stored in words2idx and labels2idx.
import numpy as np
import data.load

train_set, valid_set, dicts = data.load.atisfull()
w2idx, labels2idx = dicts['words2idx'], dicts['labels2idx']

train_x, _, train_label = train_set
val_x, _, val_label = valid_set
The next step is to create index-to-word and index-to-label dicts, as seen below.
idx2w = {w2idx[k]: k for k in w2idx}
idx2la = {labels2idx[k]: k for k in labels2idx}
Then, prepare the words and labels for the conlleval script, as follows.
words_train = [list(map(lambda x: idx2w[x], w)) for w in train_x]
labels_train = [list(map(lambda x: idx2la[x], y)) for y in train_label]
words_val = [list(map(lambda x: idx2w[x], w)) for w in val_x]
labels_val = [list(map(lambda x: idx2la[x], y)) for y in val_label]
n_classes = len(idx2la)
n_vocab = len(idx2w)
The next step is to print an example sentence and its label.
print("Example sentence: {}".format(words_train[0]))
print("Encoded form: {}".format(train_x[0]))
print()
print("Its label: {}".format(labels_train[0]))
print("Encoded form: {}".format(train_label[0]))
This is what you get.
Example sentence: [...]
Encoded form: [232 542 502 196 208 77 62 10 35 40 58 234 137 62 11 234 481 321]
Its label: [...]
Encoded form: [126 126 126 126 126 48 126 35 99 126 126 126 78 126 14 126 126 12]
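The labels use the IOB (inside-outside-beginning) tagging scheme. As a tiny, hypothetical illustration of how IOB tags mark slots (tokens and slot names invented for clarity, not taken from the ATIS vocabulary):

```python
# Each token gets a tag: B- begins a slot, I- continues it, O is outside any slot
tokens = ["flights", "from", "new", "york", "to", "boston"]
tags = ["O", "O", "B-fromloc", "I-fromloc", "O", "B-toloc"]

# Recover the slots by grouping consecutive B-/I- runs
slots, current = {}, None
for tok, tag in zip(tokens, tags):
    if tag.startswith("B-"):
        current = tag[2:]
        slots[current] = [tok]
    elif tag.startswith("I-") and current == tag[2:]:
        slots[current].append(tok)
    else:
        current = None
print(slots)  # {'fromloc': ['new', 'york'], 'toloc': ['boston']}
```

The conlleval script scores exactly this kind of grouping, which is why whole-slot precision and recall, not per-token accuracy, are reported later.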
The next step is to define your Keras model. Keras comes with a built-in Embedding layer you can use for word embeddings; it expects integer indices.
You must also wrap the output layer in TimeDistributed, which passes the output of your recurrent neural network at each time step to a fully connected layer. If you skip this step, only the output of the final time step will be passed to the next layer.
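Shape-wise, TimeDistributed applies the same dense weights independently at every time step. A minimal NumPy sketch of that idea (with hypothetical sizes, separate from the Keras layer):

```python
import numpy as np

rng = np.random.default_rng(3)
timesteps, hidden, n_classes = 6, 100, 127       # hypothetical sizes
rnn_out = rng.normal(size=(timesteps, hidden))   # one hidden vector per step
W = rng.normal(size=(hidden, n_classes))         # a single shared weight matrix

# The same W is applied at every time step, yielding one score vector per step
per_step_scores = rnn_out @ W
print(per_step_scores.shape)  # (6, 127)
```

This is what produces one label prediction per word, as slot filling requires.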
from keras.models import Sequential
from keras.layers.embeddings import Embedding
from keras.layers.recurrent import SimpleRNN
from keras.layers.core import Dense, Dropout
from keras.layers.wrappers import TimeDistributed
from keras.layers import Convolution1D
model = Sequential()
model.add(Embedding(n_vocab, 100))
model.add(Dropout(0.25))
model.add(SimpleRNN(100, return_sequences=True))
model.add(TimeDistributed(Dense(n_classes, activation='softmax')))
model.compile('rmsprop', 'categorical_crossentropy')
The next step is to train your model. You will pass every sentence as a batch to the model. You cannot use model.fit here, as it expects all sentences to be the same size; therefore, you are going to use model.train_on_batch instead.
import progressbar
n_epochs = 30
for i in range(n_epochs):
    print("Training epoch {}".format(i))
    bar = progressbar.ProgressBar(max_value=len(train_x))
    for n_batch, sent in bar(enumerate(train_x)):
        label = train_label[n_batch]
Then, you must make the labels one-hot. When that step is finished, you feed each sentence to the model as a batch of one, as indicated below.
        label = np.eye(n_classes)[label][np.newaxis, :]
        sent = sent[np.newaxis, :]
        model.train_on_batch(sent, label)
To measure the accuracy of your model, you are going to use model.predict_on_batch together with the conlleval function from metrics.accuracy.
from metrics.accuracy import conlleval

labels_pred_val = []
bar = progressbar.ProgressBar(max_value=len(val_x))
for n_batch, sent in bar(enumerate(val_x)):
    label = val_label[n_batch]
    label = np.eye(n_classes)[label][np.newaxis, :]
    sent = sent[np.newaxis, :]
    pred = model.predict_on_batch(sent)
    pred = np.argmax(pred, -1)[0]
    labels_pred_val.append(pred)

labels_pred_val = [list(map(lambda x: idx2la[x], y)) \
                   for y in labels_pred_val]
con_dict = conlleval(labels_pred_val, labels_val,
                     words_val, 'measure.txt')
print('Precision = {}, Recall = {}, F1 = {}'.format(
    con_dict['p'], con_dict['r'], con_dict['f1']))
With this model, you should get an F1 score of around ninety-two. One drawback of this model is that there is no lookahead: the model cannot use upcoming words when tagging the current one. You can handily implement lookahead by adding a convolutional layer just after the word embeddings and before the recurrent layers, as follows.
from keras.layers.recurrent import GRU

model = Sequential()
model.add(Embedding(n_vocab, 100))
model.add(Convolution1D(128, 5, border_mode='same', activation='relu'))
model.add(Dropout(0.25))
model.add(GRU(100, return_sequences=True))
model.add(TimeDistributed(Dense(n_classes, activation='softmax')))
model.compile('rmsprop', 'categorical_crossentropy')
With this improved model, you should get an F1 score of around ninety-four. To improve your model even further, you can try word embeddings trained on other corpora, such as Wikipedia, and experiment with other recurrent neural network variants like GRU and LSTM.
LAST WORDS
Deep learning is a newer area of the broader machine learning field, introduced with the main goal of moving machine learning closer to artificial intelligence, which was one of its original goals. If you want to break deeper into artificial intelligence, you first need to focus on deep learning and its powers. Deep learning is arguably one of the most highly sought technical skills.
This book will help you become good at deep learning basics and start your deep learning journey properly. Having finished the reading, you know a lot about neural network models, how to build them, and how to solve different deep learning problems like natural language processing and speech recognition. Therefore, you can focus on more advanced deep learning problems in the future.
In the book, you surveyed several neural network models and their applications to real-world problems. You can use this knowledge to solve your own deep learning tasks as you build your own neural network models using Keras. One thing is for sure: you should take advantage of the knowledge you gained through the book and focus on more complex deep learning problems.
Deep learning is the field of AI that went viral, and its future looks very bright. Therefore, you should not stop here. Focus on improving your skills and gaining more knowledge. Machine learning already plays a massive part in your everyday life, and deep learning is not far away from becoming a larger part of modern society as well.
Machine learning was just the beginning, as more and more tech companies like Microsoft, Google and Facebook spend millions on deep learning and advanced neural network research while computers get smarter every day.
However, deep learning is not about self-aware machines. It is about how ingenious neural network models and code are giving machines the ability to do things we previously thought impossible. Therefore, deep learning does concern our future. Let this book be your guide into that world, but do not stop here; take a step further by learning something new every day.