Handling Data Skew Adaptively In Spark Using Dynamic Repartitioning
Uploaded by spark-summit
Transcript of Handling Data Skew Adaptively In Spark Using Dynamic Repartitioning
Introduction
• Hungarian Academy of Sciences, Institute for Computer Science and Control (MTA SZTAKI)
• Research institute with strong industry ties
• Big Data projects using Spark, Flink, Cassandra, Hadoop, etc.
• Multiple telco use cases lately, with challenging data volume and distribution
Agenda
• Our data-skew story
• Problem definition & aims
• Dynamic Repartitioning
  • Architecture
  • Component breakdown
  • Repartitioning mechanism
  • Benchmark results
• Visualization
• Conclusion
Motivation
• We developed an application aggregating telco data that tested well on toy data
• When deployed against the real dataset, the application seemed healthy
• However, it could become surprisingly slow or even crash
• What went wrong?
Our data-skew story
• We have use cases where map-side combine is not an option: groupBy, join
• 80% of the traffic is generated by 20% of the communication towers
• Most of our data follows this 80–20 rule
The problem
• Default hashing is not going to distribute the data uniformly
• Some unknown partition(s) on the reducer side will contain a lot of records
• Slow tasks will appear
• The data distribution is not known in advance
• "Concept drifts" are common

(Figure: at the stage boundary & shuffle, even partitioning of skewed data produces slow tasks.)
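The effect on this slide can be illustrated with a small, Spark-free Python sketch (the key names and counts are made up): because every record of a key hashes to the same partition, one heavy key inflates a single reducer-side partition, and hence a single task.

```python
from collections import Counter
import zlib  # crc32 gives a deterministic hash, unlike Python's seeded hash()

num_partitions = 4
# 80/20-style skew: one "heavy" tower produces most of the records
records = ["towerA"] * 80 + [f"tower{i}" for i in range(1, 21)]

# Default-style hash partitioning: partition = hash(key) mod numPartitions
sizes = Counter(zlib.crc32(key.encode()) % num_partitions for key in records)

# One partition holds at least the 80 "towerA" records, far above the
# uniform share of 25 out of 100
print(sorted(sizes.values(), reverse=True))
```

Whatever partition `towerA` lands in becomes the slow task, no matter how many partitions are configured.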
Aim
Generally, make Spark a data-aware distributed data-processing framework
• Collect information about the data characteristics on-the-fly
• Partition the data as uniformly as possible on-the-fly
• Handle arbitrary data distributions
• Mitigate the problem of slow tasks
• Be lightweight and efficient
• Should not require user guidance

spark.repartitioning = true
Architecture
(Diagram: each slave computes an approximate local key distribution & statistics (1), which are collected to the master (2); the master merges them into a global key distribution & statistics (3) and redistributes a new hash function to the slaves (4).)
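A minimal sketch of the master-side merge in steps 2–3 (the function and key names are hypothetical, not the talk's actual API): the global key distribution is simply the per-key sum of the workers' local histograms.

```python
from collections import Counter

def merge_local_histograms(local_histograms):
    """Master-side merge: sum the per-key counts reported by the workers."""
    merged = Counter()
    for hist in local_histograms:
        merged.update(hist)
    return merged

# Two workers report approximate local key distributions
worker1 = Counter({"towerA": 50, "towerB": 5})
worker2 = Counter({"towerA": 45, "towerC": 3})

global_hist = merge_local_histograms([worker1, worker2])
print(global_hist.most_common(1))  # [('towerA', 95)] - the globally heavy key
```

The global histogram is what the master then uses to construct the new hash function in step 4.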
Driver perspective
• RepartitioningTrackerMaster is part of SparkEnv
• Listens to job & stage submissions
• Holds a variety of repartitioning strategies for each job & stage
• Decides when & how to (re)partition
Executor perspective
• RepartitioningTrackerWorker is part of SparkEnv
• Duties:
  • Stores ScannerStrategies (Scanner included) received from the RTM
  • Instantiates and binds Scanners to TaskMetrics (where data characteristics are collected)
  • Defines an interface for Scanners to send DataCharacteristics back to the RTM
Scalable sampling
• Key distributions are approximated with a strategy that is
  • not sensitive to early or late concept drifts,
  • lightweight and efficient,
  • scalable by using a back-off strategy

(Figure: key-frequency counters sampled at rate s_i; when the counters outgrow the bound T_B, they are truncated and the sampling rate backs off from s_i to s_i/b, then to s_{i+1}/b.)
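A hedged Python sketch of the back-off sampling strategy above (class and parameter names are assumptions, and the tiny bound is for illustration only): records are counted at the current sampling rate; whenever the counter map outgrows its size bound, the smallest counters are truncated and the sampling rate is divided by the back-off factor b, so the sampler stays lightweight on heavy-tailed key streams.

```python
import random

class BackoffSampler:
    """Approximate key-frequency sampler with truncation and rate back-off."""

    def __init__(self, initial_rate=1.0, backoff=2.0, max_keys=4, seed=0):
        self.rate = initial_rate   # current sampling rate s_i
        self.backoff = backoff     # back-off factor b
        self.max_keys = max_keys   # size bound on the counter map
        self.counters = {}
        self.rng = random.Random(seed)

    def record(self, key):
        if self.rng.random() < self.rate:
            self.counters[key] = self.counters.get(key, 0) + 1
            if len(self.counters) > self.max_keys:
                self._back_off()

    def _back_off(self):
        # Keep only the heaviest keys and sample more sparsely from now on.
        kept = sorted(self.counters.items(), key=lambda kv: -kv[1])[: self.max_keys]
        self.counters = dict(kept)
        self.rate /= self.backoff

sampler = BackoffSampler()
for key in ["a"] * 90 + list("bcdefghij"):
    sampler.record(key)
print(sampler.counters)  # the heavy key "a" survives every truncation
```

Light keys come and go from the map, but a genuinely heavy key is never truncated, which is exactly what the repartitioning decision needs.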
Complexity-sensitive sampling
• Reducer run-time can correlate strongly with the computational complexity of the values for a given key
• Calculating object size is costly on the JVM and in some cases shows little correlation with computational complexity (of the function on the next stage)
• Solution:
  • If the user object is Weightable, use complexity-sensitive sampling
  • When increasing the counter for a specific value, consider its complexity
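The idea above can be sketched in a few lines of Python (the `weight` function is a hypothetical stand-in for a Weightable's complexity measure): instead of incrementing a key's counter by 1 per record, it is increased by the value's complexity, so keys with expensive values look proportionally heavier.

```python
from collections import Counter

def weight(value):
    # Assumed complexity measure: here, the number of elements the next
    # stage would have to process for this value.
    return len(value)

hist = Counter()
stream = [("k1", [1, 2, 3, 4, 5, 6, 7, 8]), ("k2", [1]), ("k2", [2])]
for key, value in stream:
    hist[key] += weight(value)  # complexity-sensitive, not count-sensitive

# k1 outweighs k2 despite having fewer records
print(hist)  # Counter({'k1': 8, 'k2': 2})
```

A plain record count would rank `k2` above `k1` here, even though `k1` is the key that will actually make its reducer slow.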
Scalable sampling in numbers
• In Spark, it has been implemented with Accumulators
  • Not the most efficient option, but we wanted minimal impact on the existing code
• Optimized with micro-benchmarking
• Main factors:
  • Sampling-strategy aggressiveness (initial sampling ratio, back-off factor, etc.)
  • Complexity of the current (mapper) stage
• Impact on the current stage's runtime:
  • When used throughout the execution of the whole stage, it adds 5–15% to runtime
  • After repartitioning, we cut out the sampler's code path; in practice it then adds 0.2–1.5% to runtime
Main new task-level metrics
• TaskMetrics
  • RepartitioningInfo – new hash function, repartitioning strategy
• ShuffleReadMetrics
  • (Weightable)DataCharacteristics – used for testing the correctness of the repartitioning & for execution-visualization tools
• ShuffleWriteMetrics
  • (Weightable)DataCharacteristics – scanned periodically by Scanners based on the ScannerStrategy
  • insertionTime
  • repartitioningTime
  • inMemory
  • ofSpills
Scanner
• Instantiated for each task before the executor starts it
• Different implementations: Throughput, Timed, Histogram
• The ScannerStrategy defines:
  • when to send to the RTM,
  • the histogram-compaction level.
Decision to repartition
• The RepartitioningTrackerMaster can use different decision strategies, based on:
  • the number of local histograms needed,
  • the global histogram's distance from the uniform distribution,
  • preferences in the construction of the new hash function.
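One of the decision inputs named above, the global histogram's distance from the uniform distribution, can be sketched as follows (the distance measure and the threshold are illustrative assumptions, not the talk's exact strategy): repartition only when some key's share exceeds the uniform per-partition share by more than a threshold.

```python
def distance_from_uniform(hist, num_partitions):
    """Excess of the heaviest key's share over the uniform 1/p share."""
    total = sum(hist.values())
    uniform_share = 1.0 / num_partitions
    return max(count / total for count in hist.values()) - uniform_share

def should_repartition(hist, num_partitions, threshold=0.1):
    return distance_from_uniform(hist, num_partitions) > threshold

skewed = {"a": 80, "b": 10, "c": 10}      # one key holds 80% of the records
balanced = {"a": 34, "b": 33, "c": 33}    # close to uniform

print(should_repartition(skewed, 4))      # True
print(should_repartition(balanced, 4))    # False
```

Gating the decision this way keeps the mechanism from repartitioning (and paying the repartitioning overhead) on workloads that are already well balanced.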
Construction of the new hash function

(Figure: the new hash function is built from three parts. cut_1 is a single-key cut: the top prominent keys are hashed to dedicated partitions. cut_p is a probabilistic cut: a key is hashed to a partly filled partition with probability proportional to the remaining space needed to reach the level l. cut_u is a uniform distribution over the remaining partitions.)
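A hedged Python reconstruction of this construction (the probabilistic cut is omitted for brevity, and all names are assumptions rather than the talk's actual code): the prominent keys each get a dedicated partition, and everything else is hashed uniformly over the remaining ones.

```python
def build_partitioner(heavy_keys, num_partitions):
    # Single-key cut: each prominent key is isolated in its own partition.
    isolated = {key: i for i, key in enumerate(heavy_keys)}
    # Uniform cut: the remaining partitions serve all other keys.
    rest = list(range(len(heavy_keys), num_partitions))

    def partition(key):
        if key in isolated:
            return isolated[key]
        return rest[hash(key) % len(rest)]

    return partition

partition = build_partitioner(["towerA", "towerB"], num_partitions=8)
print(partition("towerA"))                 # 0: isolated in its own partition
print(partition("towerX") in range(2, 8))  # True: light keys avoid 0 and 1
```

Isolating the heavy keys caps the size of the biggest partition, which is exactly the quantity the benchmark slides below measure.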
New hash function in numbers
• More complex than a hashCode
• We need to evaluate it for every record
• Micro-benchmark (for example, on Strings):
  • Number of partitions: 512
  • HashPartitioner: average time to hash a record is 90.033 ns
  • KeyIsolatorPartitioner: average time to hash a record is 121.933 ns
• In practice it adds negligible overhead, under 1%
Repartitioning
• Reorganizes the previous naive hashing
• Usually happens in-memory
• In practice it adds an additional 1–8% overhead (usually at the lower end), based on:
  • the complexity of the mapper,
  • the length of the scanning process.
More numbers (groupBy)
MusicTimeseries – groupBy on tags from a listenings-stream

(Chart: size of the biggest partition vs. number of partitions, for naive hashing and Dynamic Repartitioning. Dynamic Repartitioning cuts the biggest partition from 134M to 83M records, a 38% reduction, and reaches a 64% reduction against the over-partitioning overhead & blind-scheduling case (92M), with an observed 3% map-side overhead.)
More numbers (join)
• Joining tracks with tags
• The tracks dataset is skewed
• Number of partitions is set to 33
• Naive:
  • size of the biggest partition = 4.14M
  • reducer stage runtime = 251 seconds
• Dynamic Repartitioning:
  • size of the biggest partition = 2.01M
  • reducer stage runtime = 124 seconds
  • heavy map, only 0.9% overhead
Spark REST API
• New metrics are available through the REST API
• Added new queries to the REST API, for example: "what happened in the last 3 seconds?"
• BlockFetches are collected into ShuffleReadMetrics
Execution visualization of Spark jobs
Repartitioning in the visualization
(Visualization: heavy keys create heavy partitions, i.e. slow tasks; after repartitioning, the heavy key is isolated and the size of the biggest partition is minimized.)
Conclusion
• Our Dynamic Repartitioning can handle data skew dynamically, on-the-fly, on any workload and with arbitrary key distributions
• With very little overhead, data skew can be handled in a natural & general way
• Visualizations can help developers better understand issues and bottlenecks of certain workloads
• Making Spark data-aware pays off