7/29/2019 Pallet Big Data - JClouds Meetup 2013
1/26
JClouds Meetup Feb 2013
Toni Batchelli -- co-founder -- PalletOps.com
Pallet Big Data
Saturday, March 16, 13
2/26
Programmable
Infrastructure
The Cloud!
Flexible
Powerful
Dynamic
... and then what?
3/26
... and then what?
Configure all the things!
configure the servers
configure the local systems
configure the distributed systems
Configuration Managers:
build a configuration database
wait for nodes to pull config
4/26
Programmatic
Infrastructure
In a programmatic infrastructure, the
systems are provisioned and configured by
running a program (*)
jclouds takes care of the provisioning part
Pallet takes care of the configuration part
(*) as opposed to configuring a server to coordinate the config, or
using templates
5/26
Why programs?
With a program you can do many things:
Run it anywhere
Keep it in GitHub
Parametrize it
Have it run by another program
Make it a library
Extend it
etc
6/26
e.g. Hadoop Clusters
7/26
JobTracker
NameNode
TaskTracker
DataNode
Caution: Major oversimplification in progress!
8/26
Master: JobTracker, NameNode
Slave: TaskTracker, DataNode
Caution: Major oversimplification in progress!
9/26
Master: JobTracker, NameNode
Slave (x3): TaskTracker, DataNode
Caution: Major oversimplification in progress!
10/26
Master: JobTracker, NameNode
Slave (x6): TaskTracker, DataNode
Caution: Major oversimplification in progress!
11/26
JobTracker
NameNode
Slave (x6): TaskTracker, DataNode
NameNode
Caution: Major oversimplification in progress!
12/26
Java
Hadoop
TaskTracker
JobTracker
DataNode
NameNode
13/26
Java
Hadoop
JobTracker
TaskTracker
DataNode
NameNode
.jar
14/26
Java
Hadoop
JobTracker
TaskTracker
DataNode
NameNode
.jar
Master Node
Slave Node
15/26
Java
Hadoop
JobTracker
TaskTracker
DataNode
NameNode
.jar
Master Node
Slave Node
Hadoop Cluster
16/26
JobTracker
NameNode
Slave (x6): TaskTracker, DataNode
NameNode
SSH (x6)
Caution: Major oversimplification in progress!
17/26
function: authorize-node(node, group)
  (public-key, private-key) = gen-key(node)
  for target-node in nodes(group) do
    auth-key(public-key, target-node)
  done

function: auth-key(key, node)
  when-not ~/.ssh do
    create-dir(~/.ssh)
  done
  when-not ~/.ssh/authorized_keys do
    create-file(~/.ssh/authorized_keys)
  done
  append-to-file(~/.ssh/authorized_keys, key)
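The auth-key step can be tried locally. What follows is a minimal Python sketch, not Pallet's implementation: temporary directories stand in for the home directories of remote nodes, the key is a placeholder string rather than the output of gen-key, and the function names simply mirror the pseudocode. It also adds one thing the pseudocode does not do, which is skipping a key that is already present so the program can be re-run safely.

```python
import tempfile
from pathlib import Path
from typing import Iterable


def auth_key(key: str, home: Path) -> None:
    """Append a public key to <home>/.ssh/authorized_keys, creating what is missing."""
    ssh_dir = home / ".ssh"
    # Mirrors `when-not ... create-dir`: create the directory only if it is absent.
    ssh_dir.mkdir(mode=0o700, exist_ok=True)
    keys_file = ssh_dir / "authorized_keys"
    # Mirrors `when-not ... create-file`.
    if not keys_file.exists():
        keys_file.touch(mode=0o600)
    # Addition to the pseudocode: skip keys already listed, so re-runs are idempotent.
    existing = keys_file.read_text().splitlines()
    if key not in existing:
        with keys_file.open("a") as f:
            f.write(key + "\n")


def authorize_node(public_key: str, group_homes: Iterable[Path]) -> None:
    """Authorize one node's public key on every node of a group."""
    for home in group_homes:
        auth_key(public_key, home)


# Hypothetical stand-ins for three remote home directories; a real run would go over SSH.
root = Path(tempfile.mkdtemp())
homes = [root / f"slave{i}" for i in range(3)]
for h in homes:
    h.mkdir()

fake_key = "ssh-rsa AAAA-not-a-real-key master@example"
authorize_node(fake_key, homes)
authorize_node(fake_key, homes)  # the second run changes nothing
```

Running the authorization twice leaves exactly one copy of the key in each file, which is the property that makes this kind of configuration program safe to repeat.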
18/26
function: build-cluster(infra, slave-count, RAM)
  slave-spec = build-slave-spec(RAM)
  master-spec = build-master-spec(RAM)
  slaves = procure(infra, slave-spec, slave-count)
  master = procure(infra, master-spec, 1)
  master.configure()
  for slave in slaves do
    slave.configure()
  done
  authorize-node(master, slaves)
  ...

ec2c = build-cluster(ec2, 100, 8GB)
rsc = build-cluster(rackspace, 100, 16GB)
vbc = build-cluster(virtualbox, 3, 2GB)
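To make the point about parametrization concrete, here is a runnable Python sketch of build-cluster. Everything provider-related is an assumption made for illustration: FakeInfra is a hypothetical in-memory stand-in for ec2, rackspace, or virtualbox, Node.configure only flips a flag, and authorize_node records a placeholder string instead of exchanging real keys. The role names on the specs are borrowed from the Hadoop daemons named on the earlier slides.

```python
from dataclasses import dataclass, field
from itertools import count


@dataclass
class Node:
    name: str
    ram_gb: int
    roles: tuple
    configured: bool = False
    authorized_keys: list = field(default_factory=list)

    def configure(self) -> None:
        # A real implementation would install Java, Hadoop, and so on.
        self.configured = True


class FakeInfra:
    """Hypothetical in-memory provider; not a jclouds or Pallet API."""

    def __init__(self, name: str) -> None:
        self.name = name
        self._ids = count(1)

    def procure(self, spec: dict, n: int) -> list:
        return [
            Node(f"{self.name}-{spec['kind']}-{next(self._ids)}", spec["ram_gb"], spec["roles"])
            for _ in range(n)
        ]


def build_slave_spec(ram_gb: int) -> dict:
    return {"kind": "slave", "ram_gb": ram_gb, "roles": ("datanode", "tasktracker")}


def build_master_spec(ram_gb: int) -> dict:
    return {"kind": "master", "ram_gb": ram_gb, "roles": ("namenode", "jobtracker")}


def authorize_node(node: Node, group: list) -> None:
    # Placeholder for the SSH key exchange; only records who was authorized where.
    for target in group:
        target.authorized_keys.append(f"pubkey-of-{node.name}")


def build_cluster(infra: FakeInfra, slave_count: int, ram_gb: int) -> dict:
    slaves = infra.procure(build_slave_spec(ram_gb), slave_count)
    (master,) = infra.procure(build_master_spec(ram_gb), 1)
    master.configure()
    for slave in slaves:
        slave.configure()
    authorize_node(master, slaves)
    return {"master": master, "slaves": slaves}


# Same program, different target and size: only the arguments change.
vbc = build_cluster(FakeInfra("virtualbox"), 3, 2)
ec2c = build_cluster(FakeInfra("ec2"), 100, 8)
```

Because the provider is just an argument, the three-node local cluster and the hundred-node cloud cluster come out of the same function, which is what lets the small one serve as a rehearsal for the large one.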
19/26
e.g. Pallet Big Data
20/26
Pallet Big Data
We decided we'd build something useful
with all this power: liberating Amazon EMR
users :)
Build Hadoop clusters anywhere and everywhere
Use your preferred Hadoop distro and version
Build your own workflows
I just saved a bunch of $$$ by switching to
21/26
{:cluster-prefix "hc1"
 :groups {:master {:node-spec {:hardware {:hardware-id "m1.medium"}}
                   :count 1
                   :roles #{:namenode :jobtracker}}
          :slave {:node-spec {:hardware {:hardware-id "m1.medium"}}
                  :count 2
                  :roles #{:datanode :tasktracker}}}
 :node-spec {:image {:os-family :ubuntu
                     :os-version-matches "12.04"
                     :os-64-bit true}}
 :hadoop-settings {:dist :cloudera}}
a Hadoop Cluster
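One way to read the cluster map above is as input to a small expansion step that yields one record per node to start. The Python sketch below is an illustration only and does not reflect how Pallet processes the map: the dict is a hand transcription of the Clojure data, the node naming scheme (prefix, group, index) is invented, and the rule that a group's node-spec is layered over the cluster-wide node-spec is an assumption.

```python
cluster_spec = {
    "cluster-prefix": "hc1",
    "groups": {
        "master": {
            "node-spec": {"hardware": {"hardware-id": "m1.medium"}},
            "count": 1,
            "roles": {"namenode", "jobtracker"},
        },
        "slave": {
            "node-spec": {"hardware": {"hardware-id": "m1.medium"}},
            "count": 2,
            "roles": {"datanode", "tasktracker"},
        },
    },
    "node-spec": {
        "image": {"os-family": "ubuntu", "os-version-matches": "12.04", "os-64-bit": True}
    },
    "hadoop-settings": {"dist": "cloudera"},
}


def expand(spec: dict) -> list:
    """Turn the declarative spec into one record per node to be started."""
    nodes = []
    for group, cfg in spec["groups"].items():
        # Assumed merge rule: a shallow merge, group settings over cluster-wide ones.
        node_spec = {**spec.get("node-spec", {}), **cfg.get("node-spec", {})}
        for i in range(cfg["count"]):
            nodes.append(
                {
                    # Hypothetical naming scheme, chosen for readability.
                    "name": f"{spec['cluster-prefix']}-{group}-{i}",
                    "roles": sorted(cfg["roles"]),
                    "node-spec": node_spec,
                }
            )
    return nodes


for node in expand(cluster_spec):
    print(node["name"], node["roles"], node["node-spec"]["hardware"]["hardware-id"])
```

Changing the slave count from 2 to 200 changes the data, not the program, which is the appeal of describing the cluster declaratively.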
22/26
{:steps [{:script-file "bootstrap/setup.sh"}
         {:script "/bin/start-daemon"}
         {:jar {:remote-file "//usr/.../image-parse.jar"}
          :main "parse"
          :input "s3n://sources/satellite-data"
          :output "hdfs://parsed-sat-img"}
         {:jar {:remote-file "//usr/.../outline-detection.jar"}
          :main "detect"
          :input "hdfs://parsed-sat-img"
          :output "s3n://results/weather-data"}]
 :on-completion :terminate-cluster}
a Hadoop workflow
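The workflow map chains steps through their inputs and outputs: the second jar reads what the first one wrote to HDFS. A hedged Python sketch of how a runner might walk such a map follows. The action names ("upload-and-run", "run", "hadoop-jar") are invented for illustration, the jar paths are copied as they appear on the slide, and the validation rule (an hdfs:// input must have been produced by an earlier step, while s3n:// inputs are treated as external) is an assumption, not documented Pallet behavior.

```python
workflow = {
    "steps": [
        {"script-file": "bootstrap/setup.sh"},
        {"script": "/bin/start-daemon"},
        {"jar": {"remote-file": "//usr/.../image-parse.jar"}, "main": "parse",
         "input": "s3n://sources/satellite-data", "output": "hdfs://parsed-sat-img"},
        {"jar": {"remote-file": "//usr/.../outline-detection.jar"}, "main": "detect",
         "input": "hdfs://parsed-sat-img", "output": "s3n://results/weather-data"},
    ],
    "on-completion": "terminate-cluster",
}


def plan(wf: dict) -> list:
    """Return the ordered actions, checking that HDFS inputs were produced earlier."""
    produced = set()
    actions = []
    for step in wf["steps"]:
        if "script-file" in step:
            actions.append(("upload-and-run", step["script-file"]))
        elif "script" in step:
            actions.append(("run", step["script"]))
        elif "jar" in step:
            src = step["input"]
            # Assumed rule: cluster-local HDFS data must come from a previous step.
            if src.startswith("hdfs://") and src not in produced:
                raise ValueError(f"{step['main']}: input {src} is never produced")
            produced.add(step["output"])
            actions.append(("hadoop-jar", step["main"], src, step["output"]))
        else:
            raise ValueError(f"unknown step type: {sorted(step)}")
    # The completion hook runs last, after every step has been planned.
    actions.append(("on-completion", wf["on-completion"]))
    return actions


for action in plan(workflow):
    print(action)
```

Checking the chain before any node is started is cheap, and it catches a misspelled intermediate path before the cluster has cost anything.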
23/26
$ bin/hadoop start
$ bin/hadoop job job_spec.clj
$ bin/hadoop destroy
run hadoop, run!
24/26
25/26
PalletOps
26/26
backlog
Feature parity with Amazon EMR
Server Rack support
Extended workflows
Central Management service
Interested in giving it a try?
mailto:[email protected]