Using Machine Learning for Network Capacity Management
Speaker: Taghrid SamakHost: Lori Pollock
CRA-W Undergraduate Town HallNovember 9th, 2017
Speaker & Moderator
Taghrid SamakTaghridSamakholdsadoctoratedegreeincomputersciencefromDePaulUniversity,aBScandMScincomputersciencefromAlexandriaUniversityinEgypt,andiscurrentlypursuingherJurisDoctoratedegreeattheUniversityofSanFrancisco. AtGoogle,Taghridappliesstatisticalmodelingfordiversenetworkapplicationsfromcapacityplanningtowirelessnetworks.Previously,sheworkedatLawrenceBerkeleyNationalLaboratorywhereherresearchfocusedonapplyingdataanalysisandmachinelearningtoenablecross-disciplinescientificdiscovery.
Taghridisco-founderandsteeringcommitteememberoftheArabWomeninComputingorganizationandvolunteersasamentorforvariouswomenincomputingorganizations.
Lori PollockDr.LoriPollockisaProfessorinComputerandInformationSciencesatUniversityofDelaware.Hercurrent researchfocusesonprogramanalysisforbuildingbettersoftwaremaintenancetools,softwaretesting,energy-efficientsoftwareandcomputerscienceeducation.Dr.PollockisanACMDistinguishedScientistandwasawardedtheUniversityofDelaware’sExcellenceinTeachingAwardandtheE.A.TrabantAwardforWomen’sEquity.
About MeBackground• OriginallyfromAlexandria,Egypt• BScandMScComputerScience,AlexandriaUniversity,Egypt• PhDinComputerScience,DePaulUniversity• JDStudent,UniversityofSanFranciscoSchoolofLaw
Career• Currently: Sr.DataAnalyst@Google,CorporateNetworking
• ResearchScientists@LawrenceBerkeleyNationalLab(dataanalysisforBiology,Physics,Systems,…)
• ResearchIntern@BellLabs• ResearchAssistant@DePaulUniversity• TeachingAssistant@AlexandriaUniversity
“A Day in the Life….of a Data Scientist”
Validation, Verification, Exploration, …
Modeling, Learning, Optimizations, …
Business Intelligence, Reporting, …
Coding, Analysis, Communication
Data
ActionKnowledge
Using Machine Learning for Network Capacity Management
Taghrid SamakSenior Data Analyst
Google Corporate Networking
Agenda
• Background– Capacityplanninginenterprisenetworks– Machinelearning
• UsageforecastforGoogle’senterprisenetwork– Data– Model- knowledge– Results- action
Networking Overview
Home Network Office Network
Images from:https://www.lucidchart.com/pages/examples/network-diagram
Network Capacity Management
• Ensuringsufficientresourcesonthenetworktosatisfyperformancerequirements
• Homenetwork– choosingsubscriptionfromtheInternetServiceProvider
• EnterpriseNetwork– interconnectedofficenetworks– officedesignoptimizations– managingbandwidthinsideandoutsideoftheenterprise
Capacity Management Points
Wide Area NetworkWAN
- Multiple Offices- Offices to Internet
Local Area NetworkLAN
- Within Offices
Enterprise Network Data Analysis
• Data– Trafficpassingthroughthenetworkateachlevelfrom
accesslayertoWAN
• Knowledge→Action– Whichusersorapplicationsareusingthenetwork?– Canweoptimizethenetworkdesign?– Whendoweneedcapacityupgradeordowngradefora
specificoffice?– Canwepredictperformanceproblems?
“Fieldofstudythatgivescomputerstheabilitytolearnwithoutbeingexplicitlyprogrammed”
- wikipediadefinition
“Howcanwebuildcomputersystemsthatautomaticallyimprovewithexperience,andwhatarethefundamentallawsthatgovernalllearningprocesses?”
- TomM.Mitchell,CMU
What is Machine Learning?
The Data
• Tableofobservations/samples/objects/…• Eachobservationshasdimensions/attributes/
features/fields/variables/measurements/…
Feature 1 Feature 2 Feature 3 … Feature k
Observation 1
Observation 2
…
Observation n
Typical Workflow
• DataCollection• Preprocessing(filtering,scaling,sampling,…)• FeatureExtraction• DimensionalityReduction• Learningthemodel(buildknowledge)
– Training– Testing– Validation
• Usethemodel– predictions– actions
The Learning Process
• Canwefindafunctionthataccuratelyfitsthedata?
• Supervisedlearning– thefunctionpredictsafeatureofinterestaccuratelyfor
thetrainingdataandnewsamples• Unsupervisedlearning
– thefunctioncreatesthecorrectpattern/groupsofdata
• What’sneeded– Data– Hypothesisspaceofpotentialfunctions– Optimizationfunctiontominimizetheerror
Supervised Learning
- Independentvariables(X):dimensions/features/…- Dependentvariables(Y):classifications/labels/…- LearningY’sfromvaluesofX’s- f:X→Y- Usuallyonlyone“y” Labels
x1 x2 x3 y1 y2
Observation 1
Observation 2
…
Observation n
Unsupervised Learning
- Nolabels- Learning“patterns”fromvaluesofX’s
x1 x2 x3 ... xk
Observation 1
Observation 2
…
Observation n
The Learning Process - Data Flow
Raw Data
Model 1
Model k
Cross Validation
Model*
Model EvaluationValidation Data
Training Data
Split 1
Split 2
Split k
Training1
Testing 1
Training k
Testing k
Google Offices WAN Capacity Forecast
• Forecastingnetworkusageforeachofficebasedonhistoricaldata– Whendoweneedcircuitcapacityupgradeordowngrade
foraspecificoffice?– Howtomodelusagechangesforchangingheadcounts?
Taghrid Samak, Mark Miklic, “WAN Capacity Forecasting for Large Enterprises.” IEEE/IFIP International Workshop on Analytics for Network and Service Management, AnNet 2016
Data
• Historicalinbound/outboundbandwidthutilizationforeachoffice- SNMP
• Historicalandforecastheadcount- HR• BW=f(hc,T)
• Approximately400Ksamplesperoffice
Modeling Process Per Office
Forecast Headcounts
Historical Utilization
Historical Headcount
Data Preparation
Model 1… n
Modeling
Model Parameters
Model 1… n
Model Selection Forecast
- Alignment- Interpolation- Smoothing
- Regression- Non-negative least square - Model order
- Cross validation
- Minimize error for the most recent data
Before we start
• Whatworksforyoumightnotworkforeveryone• Findyourownpaceandbalance
• Hereissomeanecdotaladvice:)
Taghrid’s Extracurriculars
• Asagraduatestudent– UPEHonorSociety,DePaulChapterPresident– ACM-w,DePaulChapterTreasurer– EgyptianStudentAssociationNationalExecutive
Committee• Asaprofessional
– Programcommitteememberfortechnicalconferences– ArabWomeninComputingCo-founder(arabwic.org)– MentorforUSDeptofStateTechwomenProgram
(techwomen.org)– InternaldiversityeffortswithinGoogle– USFLawSchoolDean’sMeritScholarship
Extracurricular Activities
• Activitiesoutsideofthe“normal”realmofstudy/work
• Asastudent– Normal→schoolwork,parttimejob– Extracurricular→studentorganizations,sports,non-
profit,…
• Asaprofessional– Normal→fulltimejob– Extracurricular→sports,non-profit,mentoring,…
Extracurricular Activities
• Whichactivityisrightforme?– Followyourpassion– Commitandfollowthrough
• Personalbenefits– Helpingcausesyoucareabout– Buildingfriendships
• Professionalbenefits– Buildingresumeandnetwork– Learningnewskills– Gettingexperience
Time Management Strategies - prep work
• Clearandfocusedgoals/tasks– Yourto-dolist– Dynamicandflexible– Learntosay“no”
• Prioritize– Byvalue– Bytimeneeded– Bydeadline
Top Related