Page 2 © Hortonworks Inc. 2011 – 2014. All Rights Reserved
Disclaimer
This document may contain product features and technology directions that are under development or may be under development in the future.
Technical feasibility, market demand, user feedback, and the Apache Software Foundation community development process can all affect timing and final delivery.
This document’s description of these features and technology directions does not represent a contractual commitment from Hortonworks to deliver these features in any generally available product.
Product features and technology directions are subject to change, and must not be included in contracts, purchase orders, or sales agreements of any kind.
Page 3
Agenda
• Hadoop Security
• Kerberos
• Authorization and Auditing with Ranger
• Gateway Security with Knox
• Encryption
Page 4
• Wire encryption in Hadoop
• Native and partner encryption
• Centralized audit reporting w/ Apache Ranger
• Fine grain access control with Apache Ranger
Security today in Hadoop with HDP/PHD
Authentication: Who am I / prove it?
• Kerberos
• API security with Apache Knox
Authorization: What can I do?
Audit: What did I do?
Data Protection: Can data be encrypted at rest and over the wire?
Centralized Security Administration
[Diagram: HDP/PHD stack under Enterprise Services: Security]
Page 5
Security needs are changing
Administration: Central management & consistent security
Authentication: Authenticate users and systems
Authorization: Provision access to data
Audit: Maintain a record of data access
Data Protection: Protect data at rest and in motion
• YARN unlocks the data lake
• Multi-tenant: Multiple applications for data access
• Different kinds of data
• Changing and complex compliance environment
2014: 65% of clusters host multiple workloads
Fall 2013: Largely silo'd deployments with single-workload clusters
Page 6
Typical Flow – Hive Access through Beeline client
[Diagram: Beeline Client → HiveServer 2 → HDFS]
Page 7
Typical Flow – Authenticate through Kerberos
[Diagram: Beeline Client ↔ KDC; Beeline Client → HiveServer 2 → HDFS]
• Client requests a TGT from the KDC and receives it
• Client decrypts it with the password hash
• Client sends the TGT and receives a Hive Service Ticket
• Client uses the Hive Service Ticket to submit the query
• Hive gets a Namenode (NN) service ticket
• Hive creates MapReduce jobs using the NN Service Ticket
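A minimal command-line sketch of this flow; the hostnames, realm, principal, and table name below are illustrative assumptions, not values from the deck:

```shell
# Obtain a TGT from the KDC (the kinit prompt covers the TGT request,
# receipt, and decryption with the password hash).
kinit alice@EXAMPLE.COM

# Beeline then obtains the Hive Service Ticket transparently when it
# connects with the HiveServer2 principal in the JDBC URL.
beeline -u "jdbc:hive2://hive-host:10000/default;principal=hive/hive-host@EXAMPLE.COM" \
        -e "SELECT COUNT(*) FROM sample_table;"
```

From there HiveServer2 uses its own credentials to get the NameNode service ticket and launch the job; the end user never handles those tickets.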
Page 8
Typical Flow – Add Authorization through Ranger (XA Secure)
[Diagram: Beeline Client → HiveServer 2 (with Ranger) → HDFS; KDC issues tickets]
• Client gets a service ticket for Hive
• Client uses the Hive ST to submit the query
• Ranger authorizes the request at HiveServer 2
• Hive gets a Namenode (NN) service ticket
• Hive creates MapReduce jobs using the NN ST
Page 9
Typical Flow – Firewall, Route through Knox Gateway
[Diagram: Beeline Client → Apache Knox → HiveServer 2 (with Ranger) → HDFS; KDC issues tickets]
• Client sends the original request with user id/password to Knox
• Knox gets a service ticket for Hive
• Knox runs as proxy user, using the Hive ST to submit the query
• Hive gets a Namenode (NN) service ticket
• Hive creates MapReduce jobs using the NN ST
• Client gets the query result
Page 10
Typical Flow – Add Wire and File Encryption
[Diagram: Beeline Client →(SSL)→ Apache Knox →(SSL)→ HiveServer 2 (with Ranger) →(SSL/SASL)→ HDFS; KDC issues tickets]
• Client sends the original request with user id/password to Knox over SSL
• Knox gets a service ticket for Hive
• Knox runs as proxy user, using the Hive ST to submit the query
• Hive gets a Namenode (NN) service ticket
• Hive creates MapReduce jobs using the NN ST
• Client gets the query result
Page 11
Security Features
PHD/HDP Security
Authentication
• Kerberos support: ✔
• Perimeter security for services and REST APIs: ✔
Authorization
• Fine-grained access control: HDFS, HBase, Hive, Storm and Knox
• Role-based access control: ✔
• Column level: ✔
• Permission support: Create, Drop, Index, Lock, User
Auditing
• Resource access auditing: Extensive auditing
• Policy auditing: ✔
Page 12
HDP/PHD Security w/ Ranger
Data Protection
• Wire encryption: ✔
• Volume encryption: TDE
• File/column encryption: HDFS TDE & partners
Reporting
• Global view of policies and audit data: ✔
Manage
• User/group mapping: ✔
• Global policy manager, Web UI: ✔
• Delegated administration: ✔
Page 13
Partner Integration
Security Integrations:
• Ranger plugins: centralize authorization/audit of 3rd-party software in the Ranger UI
• Via a custom Log4j appender, audit events can be streamed to INFA infrastructure
• Knox: route partner APIs through Knox after validating compatibility
• Provide SSO capability to end users
Page 15
Kerberos in the field
Kerberos is no longer "too complex"; adoption is growing.
• Ambari helps automate and manage Kerberos integration with the cluster
Use Active Directory or a combined Kerberos/Active Directory setup:
• Active Directory is seen most commonly in the field
• Many start with a separate MIT KDC and later grow into the AD KDC
Knox should be considered for API/perimeter security:
• Removes the need for Kerberos for end users
• Enables integration with different authentication standards
• Single location to manage security for REST APIs & HTTP-based services
• Tip: deploy it in the DMZ
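For the AD-backed setup described above, the client side only needs a small Kerberos configuration. A minimal sketch, with realm and host names as illustrative assumptions:

```shell
# Illustrative /etc/krb5.conf for an AD-backed realm; all names below
# are placeholders, not values from this deck.
cat > /etc/krb5.conf <<'EOF'
[libdefaults]
  default_realm = AD.EXAMPLE.COM

[realms]
  AD.EXAMPLE.COM = {
    kdc = ad-dc.example.com
    admin_server = ad-dc.example.com
  }
EOF
```

In a mixed MIT KDC/AD deployment, cluster service principals live in the MIT realm and a cross-realm trust lets AD users authenticate against it.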
Page 22
Authorization and Auditing – Apache Ranger
Page 23
Authorization and Audit
Authorization: Fine-grained access control
• HDFS – Folder, File
• Hive – Database, Table, Column
• HBase – Table, Column Family, Column
• Storm, Knox and more
Audit: Extensive user access auditing in HDFS, Hive and HBase
• IP Address
• Resource type/ resource
• Timestamp
• Access granted or denied
• Control access into the system
• Flexibility in defining policies
Page 24
Central Security Administration
Apache Ranger
• Delivers a 'single pane of glass' for the security administrator
• Centralizes administration of security policy
• Ensures consistent coverage across the entire Hadoop stack
Page 25
Setup Authorization Policies
• File-level access control, flexible definition
• Control permissions
Page 28
Authorization and Auditing w/ Ranger
[Diagram: Enterprise users access Hadoop components (HDFS, HBase, Hive Server2, Knox, Storm), each carrying an embedded Ranger Plugin. The Ranger Administration Portal comprises the Ranger Policy Server and Ranger Audit Server; audit data is stored in an RDBMS and HDFS. A Governance Integration API connects legacy tools & data governance. Knox and Storm plugins are HDP 2.2 additions; more components are planned for 2015. Enterprise Services: Security]
Page 29
Installation Steps
• Install PHD 3.0
• Install Apache Ranger (https://tinyurl.com/mlgs3jy)
 – Install Policy Manager
 – Install User Sync
 – Install Ranger Plugins
• Start Policy Manager
 – service ranger-admin start
• Verify: http://<host>:6080/ (default login admin/admin)
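The start-and-verify steps can be scripted; the health-check here is a sketch (it simply expects an HTTP 200 from the portal, and assumes the default admin/admin credentials have not yet been changed):

```shell
# Start the Ranger policy manager service.
service ranger-admin start

# Probe the admin UI on its default port; a 200 confirms it is up.
# admin/admin is the out-of-the-box login and should be changed promptly.
curl -s -o /dev/null -w '%{http_code}\n' -u admin:admin http://localhost:6080/
```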
Page 30
Ranger Plugins
• HDFS
• HIVE
• KNOX
• STORM
• HBASE
Steps to Enable plugins
1. Start the Policy Manager
2. Create the Plugin repository in the Policy Manager
3. Install the plugin
 • Edit install.properties
 • Execute ./enable-<plugin>.sh
4. Restart the plugin's service (e.g. HDFS, Hive, etc.)
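Steps 3 and 4 can be sketched for the HDFS plugin as below. The install path, policy-manager host, and repository name are assumptions; `POLICY_MGR_URL` and `REPOSITORY_NAME` are the key fields in `install.properties`:

```shell
# Ranger 0.4-era tarball layout; adjust the path for your install.
cd /usr/lib/ranger-hdfs-plugin

# Point the plugin at the Policy Manager and at the repository
# created in step 2 (names here are illustrative).
sed -i 's|^POLICY_MGR_URL=.*|POLICY_MGR_URL=http://ranger-host:6080|' install.properties
sed -i 's|^REPOSITORY_NAME=.*|REPOSITORY_NAME=hadoopdev|' install.properties

# Wire the plugin into the service's classpath and config.
./enable-hdfs-plugin.sh

# Step 4: restart the service the plugin hooks into (NameNode for HDFS).
su -l hdfs -c "/usr/lib/hadoop/sbin/hadoop-daemon.sh stop namenode"
su -l hdfs -c "/usr/lib/hadoop/sbin/hadoop-daemon.sh start namenode"
```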
Page 31
Ranger Console
• The Repository Manager tab
• The Policy Manager tab
• The User/Group tab
• The Analytics tab
• The Audit tab
Page 32
Repository Manager
• Add New Repository
• Edit Repository
• Delete Repository
Page 34
REST API Security through Knox – Securely share the Hadoop cluster
Page 35
Share Data Lake with everyone - Securely
• Simplifies access: Extends Hadoop's REST/HTTP services by encapsulating Kerberos within the cluster.
• Enhances security: Exposes Hadoop’s REST/HTTP services without revealing network details, providing SSL out of the box.
• Centralized control: Enforces REST API security centrally, routing requests to multiple Hadoop clusters.
• Enterprise integration: Supports LDAP, Active Directory, SSO, SAML and other authentication systems.
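What "encapsulating Kerberos" looks like from the client side: no `kinit`, just HTTP basic auth over SSL to the gateway. The host, credentials (Knox's demo LDAP user), and the `default` topology are illustrative assumptions:

```shell
# List /tmp via WebHDFS through Knox. The client never touches Kerberos;
# Knox authenticates the user (LDAP here) and holds the cluster credentials.
curl -ik -u guest:guest-password \
  'https://knox-host:8443/gateway/default/webhdfs/v1/tmp?op=LISTSTATUS'
```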
Page 36
Apache Knox
Knox can be used with both unsecured Hadoop clusters, and Kerberos secured clusters. In an enterprise solution that employs Kerberos secured clusters, the Apache Knox Gateway provides an enterprise security solution that:
• Integrates well with enterprise identity management solutions
• Protects the details of the Hadoop cluster deployment (hosts and ports are hidden from end users)
• Simplifies the number of services with which a client needs to interact
Page 37
Extend Hadoop API reach with Knox
[Diagram: Business users and data operators reach the Hadoop cluster through a load balancer and Knox via JDBC/ODBC and REST/HTTP; the application tier (Apps A–N) makes RPC calls; Hadoop admins/operators use SSH through a bastion node; data ingest/ETL uses Falcon, Oozie, Sqoop, Flume]
Page 39
Why Knox?
Simplified Access
• Kerberos encapsulation
• Extends API reach
• Single access point
• Multi-cluster support
• Single SSL certificate
Centralized Control
• Central REST API auditing
• Service-level authorization
• Alternative to SSH "edge node"
Enterprise Integration
• LDAP integration
• Active Directory integration
• SSO integration
• Apache Shiro extensibility
• Custom extensibility
Enhanced Security
• Protect network details
• SSL for non-SSL services
• WebApp vulnerability filter
Page 40
Hadoop REST API with Knox
Service | Direct URL | Knox URL
WebHDFS | http://namenode-host:50070/webhdfs | https://knox-host:8443/webhdfs
WebHCat | http://webhcat-host:50111/templeton | https://knox-host:8443/templeton
Oozie | http://ooziehost:11000/oozie | https://knox-host:8443/oozie
HBase | http://hbasehost:60080 | https://knox-host:8443/hbase
Hive | http://hivehost:10001/cliservice | https://knox-host:8443/hive
YARN | http://yarn-host:yarn-port/ws | https://knox-host:8443/resourcemanager

• Masters could be on many different hosts
• One host, one port
• Consistent paths
• SSL config at one host
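The mapping above can be sketched as a tiny helper. Note the table's Knox URLs are simplified: a deployed gateway URL also carries Knox's context path and the topology name (e.g. `/gateway/default/webhdfs`). The host and the `default` topology here are assumptions:

```shell
# Build the gateway URL for a service: same host, port, and scheme for
# every service, with only the trailing path changing.
knox_url() {
  local service="$1"
  local host="${KNOX_HOST:-knox-host}"
  printf 'https://%s:8443/gateway/default/%s' "$host" "$service"
}

knox_url webhdfs
```

This is exactly why one SSL certificate and one firewall rule suffice for all services.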
Page 41
Hadoop REST API Security: Drill-Down
[Diagram: A REST client passes through a firewall into the DMZ, where a load balancer fronts Knox Gateway instances. Knox authenticates users against an enterprise identity provider via LDAP/AD, then forwards HTTP through a second firewall to the masters of Hadoop clusters 1 and 2 (RM, NN, WebHCat, Oozie, HS2, HBase) and their slaves (DN, NM). An edge node with Hadoop CLIs talks RPC directly.]
Page 42
Knox – features in PHD
• Use Ambari for install/start/stop/configuration
• Knox support for HDFS HA
• Support for YARN REST API
• Support for SSL to Hadoop cluster services (WebHDFS, HBase, Hive & Oozie)
• Integration with Ranger for Knox service-level authorization
• Knox Management REST API
Page 43
Installation
• Installed via Ambari
 – This can also be done manually
 – Start the embedded LDAP
• There are good examples in the Apache docs with Groovy scripts
 – https://knox.apache.org/books/knox-0-4-0/knox-0-4-0.html
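For a manual install, starting the embedded LDAP and the gateway follows the Knox 0.4 distribution layout; the install path is an assumption (Ambari manages all of this otherwise):

```shell
cd /usr/lib/knox
bin/ldap.sh start      # embedded ApacheDS LDAP with demo users (for testing)
bin/gateway.sh start   # the Knox gateway itself
```

The embedded LDAP is intended for evaluation only; production deployments point Knox at the enterprise LDAP/AD instead.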
Page 44
Data Protection – Wire and data-at-rest encryption
Page 45
Data Protection
HDP allows you to apply data protection policies at different layers across the Hadoop stack:

Layer | What? | How?
Storage and Access | Encrypt data while it is at rest | Partners, HDFS TDE (Tech Preview), HBase encryption, OS-level encryption
Transmission | Encrypt data as it moves | Supported from HDP 2.1
Page 49
HDFS Transparent Data Encryption (TDE) in 2.2
• Data encryption at a higher level than the OS, while remaining native and transparent to Hadoop
• End-to-end: data is encrypted and decrypted by the clients
• Encryption/decryption uses the usual HDFS functions from the client
 – No need to change user application code
 – No need to store data encryption keys on HDFS itself
 – HDFS itself never handles unencrypted data or unencrypted keys
• Data is effectively encrypted at rest; since it is decrypted on the client side, it is also encrypted on the wire while being transmitted
• HDFS file encryption/decryption is transparent to its clients
 – Users can read/write files to/from an encryption zone as long as they have permission to access it
• Depends on installing a Key Management Server (KMS)
Page 54
HDFS Transparent Data Encryption (TDE) – Steps
• Install and run KMS on top of HDP 2.2
• Change HDFS params via Ambari
• Create an encryption key
 – hadoop key create key1 -size 256
 – hadoop key list -metadata
• Create an encryption zone using the key
 – hdfs dfs -mkdir /zone1
 – hdfs crypto -createZone -keyName key1 -path /zone1
 – hdfs crypto -listZones
• More info: http://hortonworks.com/kb/hdfs-transparent-data-encryption/
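The steps above can be extended into an end-to-end check of the zone's transparency. This sketch assumes a configured KMS, admin privileges for zone creation, and the caller having access to the key; file and key names are illustrative:

```shell
# Create the key and the encryption zone (as on the slide).
hadoop key create key1 -size 256
hdfs dfs -mkdir /zone1
hdfs crypto -createZone -keyName key1 -path /zone1
hdfs crypto -listZones

# Writes and reads are transparent: the client encrypts on put
# and decrypts on cat, with no application changes.
echo "secret" > plain.txt
hdfs dfs -put plain.txt /zone1/
hdfs dfs -cat /zone1/plain.txt

# The raw namespace (superuser only) exposes the stored ciphertext,
# demonstrating that data is actually encrypted at rest.
hdfs dfs -cat /.reserved/raw/zone1/plain.txt
```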