© 2015, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Keith Blizard, Bob Tordella
October 2015
Self-service Cloud Services
How J&J Is Managing AWS at Scale
for Enterprise Workloads
ARC305
What to Expect from the Session
- Review Enterprise Challenges & Incorporate Cloud Capabilities
- Provide an approach for enabling Enterprise Controls
- Example Architecture & Implementations
- Example Patterns (HPC & Workspaces)
- Lessons Learned
J&J is a Global Health Care Leader
More than 270 Operating Companies in 60 Countries, with 126,000 employees
Selling Products in more than 175 Countries
The world’s sixth-largest consumer health, pharmaceuticals, and biologics company
The world’s largest medical devices and diagnostics business
Big Company, Big Challenges
Current State of Enterprise IT
Thousands of Systems
Complex IT Ops
Limited Financial Impact
Cloud Strategy Offers Agility
Cloud Patterns & Acceleration
Automated IT Cost Transparency
Transformation to a Flexible Hybrid Cloud Strategy
Virtual Private Cloud (VPCx)
Provides a complete infrastructure platform through Amazon Web Services, integrated with J&J processes and policies
N. America Region, Europe Region, AP Region
On-Premise Cloud (OPCx)
Provides a highly flexible reference architecture (built on the VMware stack) to deliver ‘on-demand’ VMs inside our Enterprise Data Centers or Co-location facilities in each region
N. America DC, Europe DC, AP DC
Compliance, Data Protection, Operation Transparency, Speed + Agility
Virtual Private Cloud (VPCx) Vision
Empower the business by providing an integrated, scalable, secure self-service cloud IT platform that
enables agility, enforces policy, and accelerates best practices
Enable Agility
• Self Service
• Rapid Provisioning
• Capacity Mgmt.
• Full stack Availability
Ensure Policy
• AD Integration
• J&J AMIs
• Enterprise Logging
• Backup & Retention
• Firewall & Security Rules
Accelerate Best Practice
• Monitoring & Alerts
• VM Scheduling
• Encryption
• Software Config. Mgmt.
Enterprise Control without the Bottleneck
Preventative Controls
Detective Controls
Core principles for security,
compliance & management
Enforce Least Privilege Approach
Log Everything
J&J Identity & Group
Management
J&J Network Extension
Enforce our Images
Account Isolation
Xbot / Management Architecture
Accounts: xbot, Big Data Account, Workspaces Account, HPC Account
Diagram labels: AWS Services, VPCx, Help, Assurance, Monitor, DB, xbot, Admin, AD, Console, Billing, AWS Console
Users: Project Owners, VPCx Administrators
• Centralized policy enforcement through xbot
• Application Accounts are completely isolated from one another
• Controls are executed through both Assurance and Enforcement tests, run every 10 minutes
• Tickets are created when a value drifts from its allowable values
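The split between assurance tests (report only) and enforcement tests (report and correct) can be sketched as below. This is a minimal illustration, not xbot's actual code: the `Control` class, the setting names, and the ticket strings are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Set


@dataclass
class Control:
    """One policy check. Every name here is hypothetical."""
    setting: str
    allowed: Set[str]
    enforce: bool = False   # True: correct the drift; False: only report it
    default: str = ""       # value written back when enforcing


def run_controls(observed: Dict[str, str], controls: List[Control]) -> List[str]:
    """Run every control once and return one ticket message per drifted value."""
    tickets = []
    for control in controls:
        value = observed.get(control.setting)
        if value in control.allowed:
            continue
        tickets.append(f"drift: {control.setting}={value!r}")
        if control.enforce:
            # Enforcement test: reset the value to the approved default
            observed[control.setting] = control.default
    return tickets


observed = {"require_ssl": "false", "environment": "sandbox"}
controls = [
    Control("require_ssl", {"true"}, enforce=True, default="true"),
    Control("environment", {"production", "development"}),
]
tickets = run_controls(observed, controls)
print(tickets)
```

A scheduler would call `run_controls` for each account on the 10-minute cycle described above.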
Enterprise Control - Queue Management & Automation
Work Queue: Work Items
API Execution @ Each Account: List, Info, Delete, Update, Setup, Admin, Login
Metadata: Project Details, Allowable Cloud Objects, Chargeback, Acceptable Values
Ex: HPC Account
Ticket System
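A work queue that dispatches items to per-account actions might look like the sketch below. The action names come from the slide; the item layout, the handler table, and the "rejected" result are assumptions made for illustration.

```python
from collections import deque

# Action names come from the slide; everything else is hypothetical.
ACTIONS = {"list", "info", "delete", "update", "setup", "admin", "login"}


def process_queue(queue, handlers):
    """Drain the queue, dispatching each work item to the handler for its action."""
    results = []
    while queue:
        item = queue.popleft()
        account, action = item["account"], item["action"]
        if action not in ACTIONS or action not in handlers:
            # Unknown or unhandled actions are never executed
            results.append((account, action, "rejected"))
            continue
        results.append((account, action, handlers[action](item)))
    return results


handlers = {"info": lambda item: "metadata for " + item["account"]}
queue = deque([
    {"account": "hpc", "action": "info"},
    {"account": "hpc", "action": "reboot"},  # not an allowed action
])
results = process_queue(queue, handlers)
print(results)
```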
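import binascii

# Fetch the image records approved for this project
image_objs = project.get_ec2_images(project_info['Id'], region, image_ids=image_id)
images = []
for img in image_objs:
    # Decode the stored quoted-printable image payload
    unserialized_obj = binascii.a2b_qp(img['image'])
    images.append(img)

# Record the tags of instance i under this project key
instance_info[key][i.id]['Name'] = i.tags.get('Name', '')
instance_info[key][i.id]['Env'] = i.tags.get('Environment', '')
instance_info[key][i.id]['Hostname'] = i.tags.get('Hostname', '')
instance_info[key][i.id]['ImageId'] = i.tags.get('ami-id', '')

# Open a ticket when the image is not an allowable value
# (allowable_values and the ticket arguments are assumed; the slide elides them)
if instance_info[key][i.id]['ImageId'] not in allowable_values:
    error_name = 'instance-value-error'
    error_value = instance_info[key][i.id]
    create_support_ticket(error_name, error_value)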
Sample Control – Only Allowing Approved Images
Amazon DynamoDB – Project Metadata
Amazon DynamoDB – Project Level Exceptions
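One way project metadata and project-level exceptions could combine is sketched below, with the two DynamoDB tables modeled as plain dictionaries. The attribute names, the AMI IDs, and the rule that exceptions extend (rather than replace) the baseline are assumptions; the deck does not show the table schema.

```python
def allowable_values(project_metadata, project_exceptions, setting):
    """Return the set of values a project may use for one setting."""
    baseline = set(project_metadata.get(setting, []))
    granted = set(project_exceptions.get(setting, []))
    # Assumption: an exception adds to the baseline instead of replacing it
    return baseline | granted


metadata = {"image_id": ["ami-11111111", "ami-22222222"]}   # hypothetical IDs
exceptions = {"image_id": ["ami-99999999"]}                 # hypothetical ID
allowed = allowable_values(metadata, exceptions, "image_id")
print(sorted(allowed))
```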
CLI – Automation – Member Info
User-level information and access list
CLI – Automation – Project Info
Project lists, including account-code and friendly name
CLI – Automation – Project Info
Project metadata; project-level service listing
CLI – Automation – Adding Services
Adding a new service for this project
CLI – Automation – Project Info
New service added with corresponding IAM roles and policies
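When a service is added to a project, a matching IAM policy has to be generated. The sketch below builds a policy document in the standard IAM JSON format; the helper name and the choice of actions are hypothetical, and the deck does not show the policies xbot actually creates.

```python
import json


def service_policy(service_prefix, actions=("*",)):
    """Build an IAM policy document allowing the given actions on one service."""
    return {
        "Version": "2012-10-17",
        "Statement": [{
            "Effect": "Allow",
            "Action": [f"{service_prefix}:{action}" for action in actions],
            "Resource": "*",
        }],
    }


policy = service_policy("redshift", actions=("DescribeClusters", "CreateCluster"))
print(json.dumps(policy, indent=2))
```

Scoping `Resource` more tightly than `"*"` would fit the least-privilege principle stated earlier in the deck.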
AWS Account & Infrastructure Layer Control
Xbot Account: Xbot Administration
Payer Account (Consolidated Billing)
App AWS Account (001): Core, Project, Services, Users, Alarms, HPC
App AWS Account (002): Core, Project, Services, Users, Alarms, HPC
Scalable to 1000s of accounts
Operating System & Database Layer Control
Xbot Account
App AWS Account (001)
Operating System: EC2
Database: RDS, Amazon Redshift
Managing Amazon Redshift Controls
Encrypt Sensitive Data
Work Queue: Work Items
Account Metadata (Ex: HPC Account)
Ticket System
xbot checks 100s of accounts every 10 min for new instances; enforces policy
AD Security Group Sync
KMS
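Sweeping hundreds of accounts inside a 10-minute window suggests checking them concurrently. The sketch below uses a thread pool for that; `check_account` is a hypothetical placeholder, and the deck does not say how xbot actually schedules its checks.

```python
from concurrent.futures import ThreadPoolExecutor


def check_account(account_id):
    """Placeholder for the per-account policy check (hypothetical)."""
    return account_id, "ok"


def sweep(account_ids, workers=16):
    """Check every account concurrently and return {account_id: status}."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return dict(pool.map(check_account, account_ids))


accounts = [f"{n:03d}" for n in range(1, 201)]   # "001" .. "200"
status = sweep(accounts)
print(len(status), status["001"])
```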
Sample Control ― Managing Redshift
audit policy requires:
# rotate_master_passwords=1hour
# apply_cw_metrics=95%CPUutil>60mins;85%DiskUsed>60mins;HealthStatus<1=10mins
# require_ssl=True
# enable_user_activity_logging=True; bucket_name=RegionalS3LogBucket
# backup_retention_period=35days
# modify_cluster(master_user_password=newpassword)
# publicly_accessible=False
# add_tags='Environment';'Production'
# rotate_user_passwords=90days
# sync_users=(conn.rscluster)
## add users, set groups, revoke public schema
## drop users, move schema ownership
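The audit policy above can be expressed as data and compared with a cluster's observed settings. In this sketch the policy values come from the slide, while the dictionary layout and the `audit_cluster` helper are assumptions.

```python
# Required values taken from the slide; the dictionary layout is an assumption.
POLICY = {
    "publicly_accessible": False,
    "require_ssl": True,
    "enable_user_activity_logging": True,
    "backup_retention_period": 35,   # days
}


def audit_cluster(cluster):
    """Return {setting: (actual, required)} for each setting that violates POLICY."""
    return {
        setting: (cluster.get(setting), required)
        for setting, required in POLICY.items()
        if cluster.get(setting) != required
    }


cluster = {
    "publicly_accessible": True,
    "require_ssl": True,
    "enable_user_activity_logging": False,
    "backup_retention_period": 1,
}
violations = audit_cluster(cluster)
print(violations)
```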
User federates into the account
User creates a cluster
Cluster created
Within 10 minutes, xbot takes over the Master User
Master User password is reset by xbot every hour
Master User takes over and abstracts itself by syncing with the AD Security Groups tied to that AWS account
Begins to build a Profile / Group
Grants various permissions to the group and associates DBAs
Revokes access to the Public Schema to ensure least privilege
Xbot detects the new cluster and applies CloudWatch Alarms
Xbot enables logging & sets the maximum backup retention
Xbot updates the Parameter Group for SSL & User Activity Logging
Xbot resets the parameter group within 10 minutes to enforce policy
Xbot notifies users of the changes to their environment
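The hourly master password reset needs a fresh random password each cycle. A minimal sketch, assuming the Redshift master password rules as documented by AWS (8 to 64 characters, at least one uppercase letter, one lowercase letter, and one digit, and none of `'`, `"`, `\`, `/`, `@`, or space); the function name and the symbol set are hypothetical.

```python
import secrets
import string

# Characters Redshift does not accept in a master password
FORBIDDEN = set("'\"\\/@ ")
ALPHABET = [c for c in string.ascii_letters + string.digits + "!#$%^&*-_=+"
            if c not in FORBIDDEN]


def new_master_password(length=32):
    """Return a random password that satisfies the assumed Redshift rules."""
    if not 8 <= length <= 64:
        raise ValueError("Redshift master passwords must be 8 to 64 characters")
    while True:
        candidate = "".join(secrets.choice(ALPHABET) for _ in range(length))
        # Retry until all three required character classes are present
        if (any(c.isupper() for c in candidate)
                and any(c.islower() for c in candidate)
                and any(c.isdigit() for c in candidate)):
            return candidate


password = new_master_password()
print(len(password))
```

A rotation job would then pass the result to the `modify_cluster(master_user_password=...)` call named in the audit policy.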
Enterprise Log Management
Log sources: Elastic Load Balancing, S3, Amazon Redshift, Data Pipeline, EMR, CloudFront, CloudTrail, Config, EC2, RDS
Destination: Regional S3 Logging Bucket
No API action to send DB user activity logs to S3
Queries logs out of DB
Temp location for log movement
Copies to S3 bucket
Rotates logs every week
AWS Components Orchestrated
Categories: Compute, Storage & Content Delivery, Database, Networking, Management Tools, Enterprise Applications
Services: EC2, Elastic Load Balancing, S3, EBS, Amazon Glacier, RDS, Amazon Redshift, DynamoDB, Amazon Kinesis, Data Pipeline, EMR, VPC, Direct Connect, Auto Scaling, CloudFront, ElastiCache, CloudFormation, CloudWatch, CloudTrail, IAM, SES, SNS, CloudSearch, SQS, SWF, WorkSpaces, WorkDocs, Directory Service, Trusted Advisor, Config
Python (boto)
Common Architecture Pattern for Big Data or HPC
Connected VPC: us-east-1 (10.X.X.X/25)
us-east-1a: 10.X.X.0/27
us-east-1b: 10.X.X.32/27
VPC Peering
Amazon S3
Win/Lin EC2
DynamoDB
Disconnected VPC for EMR: us-east-1 (10.X.X.X/19), IGW
us-east-1a: 10.X.0.X/21
us-east-1b: 10.X.7.X/21
us-east-1c: 10.X.15.X/20
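The subnet sizing can be checked with Python's `ipaddress` module. The addresses below are stand-ins for the masked 10.X.X.X ranges (an assumption), and the split of the /19 is just one layout consistent with the prefix lengths on the slide.

```python
import ipaddress

# 10.0.0.0/25 stands in for the masked Connected VPC range (assumption)
connected = ipaddress.ip_network("10.0.0.0/25")
connected_subnets = list(connected.subnets(new_prefix=27))   # four /27 blocks

# 10.1.0.0/19 stands in for the masked Disconnected VPC range (assumption)
emr = ipaddress.ip_network("10.1.0.0/19")
lower_half, upper_half = emr.subnets(new_prefix=20)          # two /20 halves
emr_subnets = list(lower_half.subnets(new_prefix=21)) + [upper_half]

for net in connected_subnets[:2] + emr_subnets:
    print(net, net.num_addresses)
```

Two /21 subnets plus one /20 exactly fill the /19, while the /25 has room for four /27 subnets and the slide uses two of them.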
Burst High Performance Computing (HPC) workloads
in Private Address Space in same Account
Take advantage of multiple
subnets / AZs for Spot
Instance Pricing
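Spreading across subnets in several AZs lets a job launch wherever Spot capacity is cheapest at that moment. A minimal sketch of that choice; the prices are illustrative numbers, not real quotes, and in practice they would come from the EC2 Spot price history API.

```python
def cheapest_zone(spot_prices):
    """Return (zone, price) for the lowest current Spot price."""
    zone = min(spot_prices, key=spot_prices.get)
    return zone, spot_prices[zone]


# Illustrative prices in USD per hour (hypothetical)
prices = {"us-east-1a": 0.31, "us-east-1b": 0.27, "us-east-1c": 0.35}
best = cheapest_zone(prices)
print(best)
```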
Common Use Cases
• Statistical Analysis on large data sets; e.g.
Genomic Sequencing
• Transformations of large complex data sets for
Advanced Analytics (Sales & Supply Chain)
• Machine Learning engines on unstructured or
non-relatable data
Large volumes of Structured & Unstructured Data
Direct Connect
VGW
On-Premise Internal Data Sources
Admins
OIA
J&J DCs
JJNET
MFA
SCCM Site & DP
J&J Resources
J&J Facility
Zero Client
ELB
Workspaces Account
Infra Comp Account
Core Infra Account
Zero Client Account
Teradici Connection Manager
Workspaces Architecture Patterns
Comments
• Global implementation across NA, EMEA and AP
• Infrastructure components living within AWS for scale,
performance and management
• J&J Network extended into AWS
Tradeoffs / Lessons Learned
- A DevOps approach to cloud is strongly recommended. Focus on the velocity of new capabilities & operational improvements
- Security engagement and partnership are critical
- Identify and design your Cloud Principles, and remain diligent with them
- Early evaluation with CMP: focus has been too much on IaaS & provisioning only
- Partnerships with 3rd parties are crucial (Log Management, Web Application Firewall, Utilization & Spend)
- Training of Enterprise IT users is critical
Key Takeaways
- Lean into PaaS services
- Bring the agility of the cloud to your end users through self-service
- Automate your enterprise controls
- Unleash the power of the cloud for small to large patterns