Running Map-Reduce Under Condor
Condor Project, Computer Sciences Department, University of Wisconsin-Madison
Running Map-Reduce Under Condor
www.cs.wisc.edu/Condor
Cast of thousands
› Mihai Pop
› Michael Schatz
› Dan Sommer
    › University of Maryland Center for Computational Biology
› Faisal Khan, Ken Hahn (UW)
› David Schwartz, LMCG
In 2003…
http://labs.google.com/papers/gfs.html
http://labs.google.com/papers/mapreduce.html
Shortly thereafter…
Two main Hadoop parts
For more detail: CondorWeek 2009 talk by Dhruba Borthakur
http://www.cs.wisc.edu/condor/CondorWeek2009/condor_presentations/borthakur-hadoop_univ_research.ppt
HDFS overview
› Making a POSIX distributed file system go fast is easy…
HDFS overview
› …if you get rid of the POSIX part
› Remove
    › Random access
    › Support for small files
    › Authentication
    › In-kernel support
HDFS overview
› Add in
    › Data replication (key for distributed systems)
    › Command line utilities
HDFS Architecture
HDFS Condor Integration
› HDFS daemons run under the Condor master
    › Management/control
› Added HAD support for namenode
› Added host-based security
Condor HDFS: II
File transfer support
transfer_input_files = hdfs://…
Spool in HDFS
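As a rough illustration of the file transfer support mentioned above, the sketch below renders a submit description whose inputs are fetched from HDFS. The helper name `render_submit`, the executable name, and the namenode address are hypothetical placeholders, not values from the talk; the submit commands themselves (`universe`, `executable`, `should_transfer_files`, `when_to_transfer_output`, `transfer_input_files`, `queue`) are standard Condor submit commands.

```python
from urllib.parse import urlparse


def render_submit(executable, input_urls):
    """Render a vanilla-universe submit description with hdfs:// inputs.

    Only the scheme is checked here; whether the pool can actually fetch
    the URL depends on how the site configured HDFS support.
    """
    for url in input_urls:
        if urlparse(url).scheme != "hdfs":
            raise ValueError(f"not an hdfs:// URL: {url}")
    lines = [
        "universe = vanilla",
        f"executable = {executable}",
        "should_transfer_files = YES",
        "when_to_transfer_output = ON_EXIT",
        f"transfer_input_files = {', '.join(input_urls)}",
        "queue",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical namenode host, port, and path, for illustration only.
    print(render_submit("assemble.sh",
                        ["hdfs://namenode.example.org:9000/data/reads.fa"]))
```

The scheme check is there mainly to catch typos in the URL prefix before the job is submitted.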
Map Reduce
Shell hacker's map reduce
› grep tag input | sort | uniq -c | grep
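The pipeline above maps loosely onto the three MapReduce phases: the first `grep` acts as the map step, `sort` as the shuffle, and `uniq -c` as the reduce. A minimal Python sketch of that correspondence follows; the function names are made up for illustration, plain substring matching stands in for grep's regular expressions, and the final `grep` (whose pattern is not given) is left out.

```python
from collections import defaultdict


def map_phase(lines, tag):
    """Map: emit (line, 1) for every line containing the tag ('grep tag input')."""
    for line in lines:
        if tag in line:  # substring match, like grep -F rather than a regex
            yield line, 1


def shuffle_phase(pairs):
    """Shuffle: bring identical keys together, in sorted order ('sort')."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return dict(sorted(groups.items()))


def reduce_phase(groups):
    """Reduce: collapse each group of identical lines to a count ('uniq -c')."""
    return {key: sum(values) for key, values in groups.items()}


def shell_style_map_reduce(lines, tag):
    """Chain the three phases the way the shell pipes chain the commands."""
    return reduce_phase(shuffle_phase(map_phase(lines, tag)))


if __name__ == "__main__":
    sample = ["tag a", "other", "tag b", "tag a"]
    print(shell_style_map_reduce(sample, "tag"))
```

In a real MapReduce run the map and reduce functions execute on many machines and the shuffle moves data between them; here everything runs in one process.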
MapReduce lingo for the native Condor speaker
› Task tracker → startd/starter
› Job tracker → condor_schedd
Map Reduce under Condor
› Zeroth law of software engineering
› Job tracker/task tracker must be managed!
    › Otherwise very bad things happen
Hadoop on Demand w/Condor
Map Reduce as overlay
› Parallel Universe job
› Starts job tracker on rank 0
› Task trackers everywhere else
› Open question:
    › Run more small jobs, or fewer bigger?
› One job tracker per user (i.e. per job)
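The overlay layout above (job tracker on rank 0, task trackers on every other node) could be expressed in a per-node wrapper along these lines. This is a sketch under assumptions rather than the scripts used in the talk: it assumes the parallel universe exposes the node rank in the `_CONDOR_PROCNO` environment variable, and that the Hadoop launcher accepts `jobtracker` and `tasktracker` subcommands, as Hadoop releases of that era did. The function names are hypothetical, and the command is only built, not executed.

```python
import os


def role_for_rank(rank):
    """Rank 0 hosts the job tracker; every other rank hosts a task tracker."""
    if rank < 0:
        raise ValueError("rank must be non-negative")
    return "jobtracker" if rank == 0 else "tasktracker"


def daemon_command(rank, hadoop_bin="hadoop"):
    """Build (but do not run) the command line for this node's daemon."""
    return [hadoop_bin, role_for_rank(rank)]


def rank_from_environment(env=None):
    """Read this node's rank; _CONDOR_PROCNO is an assumed variable name."""
    env = os.environ if env is None else env
    return int(env.get("_CONDOR_PROCNO", "0"))


if __name__ == "__main__":
    for rank in range(3):
        print(rank, daemon_command(rank))
```

Because each parallel universe job carries its own rank 0, this layout gives one job tracker per job, matching the last bullet above.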
On to real science…
› David Schwartz, matchmaker
› Mihai Pop
Contrail – MR genome assembly
http://sourceforge.net/apps/mediawiki/contrail-bio/index.php
Genome assembly
DNA
3 billion base pairs
Sequencing machines can read only short fragments (reads) at a time
Already done this?
High throughput sequencers
Contrail: Scalable Genome Assembly with MapReduce
› Genome: African male NA18507 (Bentley et al., 2008)
› Input: 3.5B 36bp reads, 210bp insert (SRA000271)
› Preprocessor: Quality-Aware Error Correction

Stage             N        Max         N50
Initial           >10 B    27 bp       27 bp
Compressed        >1 B     303 bp      <100 bp
Error Correction  5.0 M    14,007 bp   650 bp
Resolve Repeats   4.2 M    20,594 bp   923 bp
Cloud Surfing     In progress
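For readers unfamiliar with the N50 figures quoted above: N50 is the contig length such that contigs of that length or longer account for at least half of the total assembled bases. The sketch below computes it, along with the raw read coverage implied by the input size and the 3 billion base pair genome mentioned earlier. The function names are illustrative and are not taken from Contrail.

```python
def n50(contig_lengths):
    """Return the length L where contigs of length >= L cover half the assembly."""
    lengths = sorted(contig_lengths, reverse=True)
    if not lengths:
        raise ValueError("need at least one contig length")
    half_total = sum(lengths) / 2
    running = 0
    for length in lengths:
        running += length
        if running >= half_total:
            return length


def coverage(num_reads, read_length, genome_size):
    """Average number of times each base is covered by the raw reads."""
    return num_reads * read_length / genome_size


if __name__ == "__main__":
    print(n50([2, 2, 2, 3, 3, 4, 8, 8]))
    # 3.5 billion 36 bp reads over a 3 billion bp genome.
    print(coverage(3_500_000_000, 36, 3_000_000_000))
```

With the numbers from the slides, the coverage works out to 42x, which is why so many short reads can still be stitched into longer contigs.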
Running it under Condor
› Used CHTC B-240 cluster
› ~100 machines
    › 8-way Nehalem CPU
    › 12 GB total
    › 1 disk partition dedicated to HDFS
    › HDFS running under the Condor master
Running it on Condor
› Used the MapReduce PU overlay
› Started with fruit flies
› …
› And it crashed
› Zeroth law of software engineering
    › Version mismatch
› Debugging…
Debugging
› After a couple of debugging rounds
› Fruit fly sequenced!!
    › On to humans!
Cardinality
› How many slots per task tracker?
    › Task tracker, like schedd multi-slots
› One machine
    › 8 cores
    › 1 disk
    › 1 memory system
› How many mappers per slot?
More MR under Condor
› More debugging, NPEs
› Updated MR again
› Some performance regressions
› One power outage
› 12 weeks later…
Success!
Conclusions
› Job trackers must be managed!
    › Glide-in is more than Condor on batch
› Hadoop – more than just MapReduce
› HDFS – good partner for Condor
› All this stuff is moving fast