Data Storage for the Long Haul: Compliance and Archive
![Page 1: Data Storage for the Long Haul: Compliance and Archive](https://reader034.fdocuments.us/reader034/viewer/2022051709/587125a11a28abe4448b60c1/html5/thumbnails/1.jpg)
© 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Henry Zhang, Senior Product Manager, Amazon Glacier
August 11, 2016
Data Storage for the Long Haul:
Compliance and Archive
![Page 2: Data Storage for the Long Haul: Compliance and Archive](https://reader034.fdocuments.us/reader034/viewer/2022051709/587125a11a28abe4448b60c1/html5/thumbnails/2.jpg)
AWS storage maturity
• File: Amazon EFS
• Block: Amazon Elastic Block Store, Amazon EC2 Instance Store
• Object: Amazon S3, Amazon Glacier
• Data Transfer: AWS Direct Connect, AWS Snowball, ISV Connectors, Amazon Kinesis Firehose, Amazon S3 Transfer Acceleration, AWS Storage Gateway
![Page 3: Data Storage for the Long Haul: Compliance and Archive](https://reader034.fdocuments.us/reader034/viewer/2022051709/587125a11a28abe4448b60c1/html5/thumbnails/3.jpg)
Audio archives–SoundCloud
• World’s leading social sound platform
• Audio files transcoded and stored in multiple formats
• Stores petabytes (PBs) of data
• Transcoded files served from S3
• Originals moved to Amazon Glacier for long-term retention
![Page 4: Data Storage for the Long Haul: Compliance and Archive](https://reader034.fdocuments.us/reader034/viewer/2022051709/587125a11a28abe4448b60c1/html5/thumbnails/4.jpg)
Video archives
• Media distribution backbone (Ve.nue platform)
• Over-The-Top (OTT) broadcast service
• PBs of media assets
• Assets to be archived and retained for decades
![Page 5: Data Storage for the Long Haul: Compliance and Archive](https://reader034.fdocuments.us/reader034/viewer/2022051709/587125a11a28abe4448b60c1/html5/thumbnails/5.jpg)
Patient data–Philips Healthcare
• HealthSuite digital platform powered by AWS
• 15 petabytes of patient data
• Archived for decades (beyond the lifetime of patients)
• Uses AWS HIPAA-eligible services in the BAA
![Page 6: Data Storage for the Long Haul: Compliance and Archive](https://reader034.fdocuments.us/reader034/viewer/2022051709/587125a11a28abe4448b60c1/html5/thumbnails/6.jpg)
Public sector–King County
• Most populous county in Washington state
• Replaced tape solution for backup from 17 agencies
• Meets compliance requirement
• Saved $1MM in first year; no more tape refresh or
management churn
![Page 7: Data Storage for the Long Haul: Compliance and Archive](https://reader034.fdocuments.us/reader034/viewer/2022051709/587125a11a28abe4448b60c1/html5/thumbnails/7.jpg)
Archive:
Data retained for the long term,
for compliance or potential
future reference
Data archiving needs are growing everywhere
• Media assets, 4K, 8K
• Health care/life sciences
• Financial services
• Regulated industries
• Oil and gas/geospatial
• Digital preservation
• Long-term backups
• Logs
![Page 8: Data Storage for the Long Haul: Compliance and Archive](https://reader034.fdocuments.us/reader034/viewer/2022051709/587125a11a28abe4448b60c1/html5/thumbnails/8.jpg)
Traditional archiving approaches
• Storage arrays/disk arrays
• Tape silos/tape libraries
• Tape drives (LTO-X/DLT/etc.)
• Virtual tape libraries (VTLs)
• Tape out/vaulting
• Specialized software and
personnel
![Page 9: Data Storage for the Long Haul: Compliance and Archive](https://reader034.fdocuments.us/reader034/viewer/2022051709/587125a11a28abe4448b60c1/html5/thumbnails/9.jpg)
How can AWS help with your archival?
Metered usage:
Pay as you go
No capital investment
No commitment
No risky capacity planning
Avoid risks of physical
media handling
Control your
geographic locality for
performance and
compliance
![Page 10: Data Storage for the Long Haul: Compliance and Archive](https://reader034.fdocuments.us/reader034/viewer/2022051709/587125a11a28abe4448b60c1/html5/thumbnails/10.jpg)
Archive Options–Storage Tiers and Data Lifecycle
![Page 11: Data Storage for the Long Haul: Compliance and Archive](https://reader034.fdocuments.us/reader034/viewer/2022051709/587125a11a28abe4448b60c1/html5/thumbnails/11.jpg)
Object storage options
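| | S3 Standard | S3 Standard - Infrequent Access | Amazon Glacier |
| --- | --- | --- | --- |
| Data type | Active data | Infrequently accessed data | Archive data |
| Access latency | Milliseconds | Milliseconds | 3-5 hours |
| Price | $0.03/GB/mo. | $0.0125/GB/mo. | $0.007/GB/mo. |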
![Page 12: Data Storage for the Long Haul: Compliance and Archive](https://reader034.fdocuments.us/reader034/viewer/2022051709/587125a11a28abe4448b60c1/html5/thumbnails/12.jpg)
A closer look: S3-IA and Amazon Glacier
S3-IA
• Same durability and throughput as S3 Standard
• Instant access
• $0.01/GB on each data retrieval
Amazon Glacier
• Same 11 9s durability as S3 Standard
• 3-5 hour data retrieval latency
• Suitable for cold archive such as offsite tapes
![Page 13: Data Storage for the Long Haul: Compliance and Archive](https://reader034.fdocuments.us/reader034/viewer/2022051709/587125a11a28abe4448b60c1/html5/thumbnails/13.jpg)
Data lifecycle management
- Transition Standard to Standard-IA
- Transition Standard-IA to Amazon Glacier
- Expiration lifecycle policy
- Versioning support
[Chart: data access frequency over time, from T to T + 365 days]
![Page 14: Data Storage for the Long Haul: Compliance and Archive](https://reader034.fdocuments.us/reader034/viewer/2022051709/587125a11a28abe4448b60c1/html5/thumbnails/14.jpg)
Set up lifecycle policy
![Page 15: Data Storage for the Long Haul: Compliance and Archive](https://reader034.fdocuments.us/reader034/viewer/2022051709/587125a11a28abe4448b60c1/html5/thumbnails/15.jpg)
Transition older videos to Standard-IA
![Page 16: Data Storage for the Long Haul: Compliance and Archive](https://reader034.fdocuments.us/reader034/viewer/2022051709/587125a11a28abe4448b60c1/html5/thumbnails/16.jpg)
Archive to S3-IA after 30 days
Lifecycle policy
Standard Storage->Standard-IA
<LifecycleConfiguration>
  <Rule>
    <ID>sample-rule</ID>
    <Prefix>documents/</Prefix>
    <Status>Enabled</Status>
    <!-- Move objects to Standard-IA 30 days after creation -->
    <Transition>
      <Days>30</Days>
      <StorageClass>STANDARD_IA</StorageClass>
    </Transition>
    <!-- Archive objects to Amazon Glacier 365 days after creation -->
    <Transition>
      <Days>365</Days>
      <StorageClass>GLACIER</StorageClass>
    </Transition>
  </Rule>
</LifecycleConfiguration>
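The same rule can also be written as a Python dictionary in the shape that boto3's `put_bucket_lifecycle_configuration` call accepts. This is a minimal sketch, not part of the talk: the helper name `build_lifecycle_config` and the bucket name in the closing comment are illustrative assumptions. Note that the API value for the storage class is `STANDARD_IA`, with an underscore.

```python
def build_lifecycle_config(prefix, ia_days=30, glacier_days=365, rule_id="sample-rule"):
    """Return a lifecycle configuration dict mirroring the XML policy on the slide."""
    # S3 does not accept Standard-IA transitions earlier than 30 days after creation.
    if ia_days < 30:
        raise ValueError("Standard-IA transition needs at least 30 days")
    # Assumption for this sketch: the Glacier step always follows the Standard-IA step.
    if glacier_days <= ia_days:
        raise ValueError("Glacier transition must come after the Standard-IA transition")
    return {
        "Rules": [
            {
                "ID": rule_id,
                "Filter": {"Prefix": prefix},
                "Status": "Enabled",
                "Transitions": [
                    {"Days": ia_days, "StorageClass": "STANDARD_IA"},
                    {"Days": glacier_days, "StorageClass": "GLACIER"},
                ],
            }
        ]
    }


config = build_lifecycle_config("documents/")

# Applying it needs boto3 and AWS credentials (the bucket name is hypothetical):
#   import boto3
#   boto3.client("s3").put_bucket_lifecycle_configuration(
#       Bucket="example-bucket", LifecycleConfiguration=config)
```

Building the dictionary separately from the API call keeps the rule easy to review and unit test before it touches a bucket.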
![Page 17: Data Storage for the Long Haul: Compliance and Archive](https://reader034.fdocuments.us/reader034/viewer/2022051709/587125a11a28abe4448b60c1/html5/thumbnails/17.jpg)
Archive to Amazon Glacier after 365 days
Lifecycle policy
Standard Storage->Standard-IA
<LifecycleConfiguration>
  <Rule>
    <ID>sample-rule</ID>
    <Prefix>documents/</Prefix>
    <Status>Enabled</Status>
    <!-- Move objects to Standard-IA 30 days after creation -->
    <Transition>
      <Days>30</Days>
      <StorageClass>STANDARD_IA</StorageClass>
    </Transition>
    <!-- Archive objects to Amazon Glacier 365 days after creation -->
    <Transition>
      <Days>365</Days>
      <StorageClass>GLACIER</StorageClass>
    </Transition>
  </Rule>
</LifecycleConfiguration>
Standard-IA Storage->Amazon Glacier
![Page 18: Data Storage for the Long Haul: Compliance and Archive](https://reader034.fdocuments.us/reader034/viewer/2022051709/587125a11a28abe4448b60c1/html5/thumbnails/18.jpg)
Save money on storage
Standard-IA: 58% saving over S3 Standard
Amazon Glacier: 44% saving over S3 Standard-IA
* Assumes the highest public pricing tier
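The two percentages can be recomputed from the per-GB prices on the "Object storage options" slide. The sketch below is only a check of that arithmetic; the function name `saving_percent` is made up for illustration, and the prices are the 2016 list prices quoted in this deck.

```python
# Per-GB monthly list prices quoted earlier in the deck (2016, highest public tier).
PRICE_PER_GB_MONTH = {
    "S3 Standard": 0.03,
    "S3 Standard-IA": 0.0125,
    "Amazon Glacier": 0.007,
}


def saving_percent(cheaper, baseline):
    """Percentage saved by storing data in `cheaper` instead of `baseline`."""
    prices = PRICE_PER_GB_MONTH
    return (1 - prices[cheaper] / prices[baseline]) * 100


ia_vs_standard = saving_percent("S3 Standard-IA", "S3 Standard")
glacier_vs_ia = saving_percent("Amazon Glacier", "S3 Standard-IA")
glacier_vs_standard = saving_percent("Amazon Glacier", "S3 Standard")

print(f"Standard-IA vs Standard: {ia_vs_standard:.0f}%")
print(f"Glacier vs Standard-IA:  {glacier_vs_ia:.0f}%")
print(f"Glacier vs Standard:     {glacier_vs_standard:.0f}%")
```

Rounded to whole percentages, the first two results match the 58% and 44% on the slide. These figures cover storage only; retrieval and request charges are not included.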
![Page 19: Data Storage for the Long Haul: Compliance and Archive](https://reader034.fdocuments.us/reader034/viewer/2022051709/587125a11a28abe4448b60c1/html5/thumbnails/19.jpg)
Example backup software integration
• Commvault–Native integration with
S3 and Amazon Glacier
• Deduplication and encryption
• Single-console management
Amazon S3 Amazon Glacier
![Page 20: Data Storage for the Long Haul: Compliance and Archive](https://reader034.fdocuments.us/reader034/viewer/2022051709/587125a11a28abe4448b60c1/html5/thumbnails/20.jpg)
Compliance Use Case 1–Regulatory Retention
![Page 21: Data Storage for the Long Haul: Compliance and Archive](https://reader034.fdocuments.us/reader034/viewer/2022051709/587125a11a28abe4448b60c1/html5/thumbnails/21.jpg)
Compliance storage with Vault Lock
Amazon Glacier Vault Lock allows you to easily set compliance controls on individual vaults and enforce them via a lockable policy
• Time-based retention
• Multi-factor authentication (MFA)
• Controls govern all records in a vault
• Immutable policy
• Two-step locking
![Page 22: Data Storage for the Long Haul: Compliance and Archive](https://reader034.fdocuments.us/reader034/viewer/2022051709/587125a11a28abe4448b60c1/html5/thumbnails/22.jpg)
Vault Lock for compliance storage
• Non-overwrite, non-erasable records
• Time-based retention with “ArchiveAgeInDays” control
• Policy lockdown (strong governance)
• Legal hold with vault-level tags
• Configure optional designated third-party access and grant
temporary access
![Page 23: Data Storage for the Long Haul: Compliance and Archive](https://reader034.fdocuments.us/reader034/viewer/2022051709/587125a11a28abe4448b60c1/html5/thumbnails/23.jpg)
Amazon Glacier received a third-party assessment
from Cohasset Associates on how Amazon Glacier
with Vault Lock can be used to meet the requirements
of SEC Rule 17a-4(f) and CFTC 1.31(b)-(c).
![Page 24: Data Storage for the Long Haul: Compliance and Archive](https://reader034.fdocuments.us/reader034/viewer/2022051709/587125a11a28abe4448b60c1/html5/thumbnails/24.jpg)
Example control: 1-year record retention
![Page 25: Data Storage for the Long Haul: Compliance and Archive](https://reader034.fdocuments.us/reader034/viewer/2022051709/587125a11a28abe4448b60c1/html5/thumbnails/25.jpg)
Example control: 1-year record retention
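The policy shown on this slide is not part of the transcript. As a stand-in, here is a minimal sketch of a one-year retention control that follows the pattern in the Vault Lock documentation: deny `glacier:DeleteArchive` while `glacier:ArchiveAgeInDays` is below the retention period. The function name, account ID, region, and vault name are illustrative assumptions, and this may differ from the exact policy the presenter showed.

```python
import json


def retention_policy(vault_arn, retention_days=365):
    """Vault Lock policy that denies archive deletion until archives reach `retention_days`."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "deny-based-on-archive-age",
                "Principal": "*",
                "Effect": "Deny",
                "Action": "glacier:DeleteArchive",
                "Resource": [vault_arn],
                "Condition": {
                    # Archives younger than the retention period cannot be deleted.
                    "NumericLessThan": {"glacier:ArchiveAgeInDays": str(retention_days)}
                },
            }
        ],
    }


# Hypothetical account ID, region, and vault name.
policy = retention_policy("arn:aws:glacier:us-west-2:123456789012:vaults/examplevault")
print(json.dumps(policy, indent=2))
```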
![Page 26: Data Storage for the Long Haul: Compliance and Archive](https://reader034.fdocuments.us/reader034/viewer/2022051709/587125a11a28abe4448b60c1/html5/thumbnails/26.jpg)
Vault Lock: Two-step locking
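Two-step locking means the policy is first attached in an in-progress state so it can be tested, and only becomes immutable after a separate completion call. The sketch below walks through that flow using the method names of the boto3 Glacier client (`initiate_vault_lock`, `complete_vault_lock`, `abort_vault_lock`). The `lock_vault` helper, the `approve` callback, and the `DryRunGlacier` stand-in are assumptions made for illustration so that the example runs without AWS access.

```python
import json


def lock_vault(glacier, vault_name, policy, approve):
    """Run the two-step Vault Lock flow against a boto3-style Glacier client.

    Step 1 attaches the policy in an in-progress state and returns a lock ID.
    Step 2 makes the policy immutable, but only if `approve(lock_id)` returns True;
    otherwise the in-progress lock is aborted and the policy is removed.
    """
    response = glacier.initiate_vault_lock(
        accountId="-",  # "-" means the account that owns the credentials
        vaultName=vault_name,
        policy={"Policy": json.dumps(policy)},
    )
    lock_id = response["lockId"]
    if approve(lock_id):
        glacier.complete_vault_lock(accountId="-", vaultName=vault_name, lockId=lock_id)
        return "Locked"
    glacier.abort_vault_lock(accountId="-", vaultName=vault_name)
    return "Aborted"


class DryRunGlacier:
    """Hypothetical stand-in that only records calls, so the flow runs offline."""

    def __init__(self):
        self.calls = []

    def initiate_vault_lock(self, **kwargs):
        self.calls.append("initiate_vault_lock")
        return {"lockId": "example-lock-id"}

    def complete_vault_lock(self, **kwargs):
        self.calls.append("complete_vault_lock")

    def abort_vault_lock(self, **kwargs):
        self.calls.append("abort_vault_lock")


client = DryRunGlacier()
state = lock_vault(
    client,
    "examplevault",
    {"Version": "2012-10-17", "Statement": []},
    approve=lambda lock_id: True,
)
print(state, client.calls)
```

With a real client (`boto3.client("glacier")`), the lock ID returned by the first step expires after 24 hours, which is the window for validating the policy before completing or aborting the lock.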
![Page 27: Data Storage for the Long Haul: Compliance and Archive](https://reader034.fdocuments.us/reader034/viewer/2022051709/587125a11a28abe4448b60c1/html5/thumbnails/27.jpg)
Legal hold with vault-level tags
![Page 28: Data Storage for the Long Haul: Compliance and Archive](https://reader034.fdocuments.us/reader034/viewer/2022051709/587125a11a28abe4448b60c1/html5/thumbnails/28.jpg)
Example control: Legal hold
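The policy from this slide is also missing from the transcript. A legal hold control can be sketched in the style of the Vault Lock documentation: deny `glacier:DeleteArchive` while the vault carries a `LegalHold` tag set to `true`. The helper names and the vault ARN below are illustrative assumptions, not the presenter's exact policy.

```python
def legal_hold_policy(vault_arn):
    """Vault Lock policy that blocks archive deletion while the LegalHold tag is set."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "deny-based-on-legal-hold-tag",
                "Principal": "*",
                "Effect": "Deny",
                "Action": "glacier:DeleteArchive",
                "Resource": [vault_arn],
                "Condition": {
                    # Deny when the vault tag LegalHold is "true" or present but empty.
                    "StringLike": {"glacier:ResourceTag/LegalHold": ["true", ""]}
                },
            }
        ],
    }


def legal_hold_tags(enabled):
    """Tag set to apply to the vault when placing or releasing a hold."""
    return {"LegalHold": "true" if enabled else "false"}


# Hypothetical account ID, region, and vault name.
hold_policy = legal_hold_policy("arn:aws:glacier:us-west-2:123456789012:vaults/examplevault")
place_hold = legal_hold_tags(True)

# With boto3 and credentials, the tag could be applied like this:
#   boto3.client("glacier").add_tags_to_vault(vaultName="examplevault", Tags=place_hold)
```

Because the hold is driven by a vault tag rather than by the locked policy text, it can be placed and released after the policy itself has become immutable.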
![Page 29: Data Storage for the Long Haul: Compliance and Archive](https://reader034.fdocuments.us/reader034/viewer/2022051709/587125a11a28abe4448b60c1/html5/thumbnails/29.jpg)
Rich Sutton, VP of Engineering
Digital Risk, Social Media Security, and Compliance
Proofpoint SocialPatrol Archive
Amazon Glacier and Vault Lock Use Case
![Page 30: Data Storage for the Long Haul: Compliance and Archive](https://reader034.fdocuments.us/reader034/viewer/2022051709/587125a11a28abe4448b60c1/html5/thumbnails/30.jpg)
Proofpoint
• Cloud-based security and compliance for the enterprise:
threat research, email, mobile, social, digital risk
• Founded 2002, public in 2012
• $350M annual revenue, $3B market cap
• Huge AWS user
![Page 31: Data Storage for the Long Haul: Compliance and Archive](https://reader034.fdocuments.us/reader034/viewer/2022051709/587125a11a28abe4448b60c1/html5/thumbnails/31.jpg)
Proofpoint SocialPatrol
Policy controls and enforcement for social
• Combats fraudulent brand impersonation
• Moderates content at scale
• Ensures compliance in publishing
• Integrates with social APIs
• 150+ classifiers using NLP and ML
• Text, links, images, metadata
• Ingesting >1M social posts per day
• Built in AWS
![Page 32: Data Storage for the Long Haul: Compliance and Archive](https://reader034.fdocuments.us/reader034/viewer/2022051709/587125a11a28abe4448b60c1/html5/thumbnails/32.jpg)
Proofpoint SocialPatrol
How it works:
[Diagram labels: Social; PFPT in AWS (policy engine, MySQL/C*/Solr); Enterprise Archive]
“Awesome. Help me with retention by integrating with my existing email archive.”
![Page 33: Data Storage for the Long Haul: Compliance and Archive](https://reader034.fdocuments.us/reader034/viewer/2022051709/587125a11a28abe4448b60c1/html5/thumbnails/33.jpg)
Proofpoint SocialPatrol archiving integration
Imperfect …
• Social != Email
• Every archive is different
• Requires internal collaboration
![Page 34: Data Storage for the Long Haul: Compliance and Archive](https://reader034.fdocuments.us/reader034/viewer/2022051709/587125a11a28abe4448b60c1/html5/thumbnails/34.jpg)
Proofpoint SocialPatrol Archive
SEC Rule 17a-4(f)-compliant archive, purpose-built for
social, enabled by Amazon Glacier and Vault Lock
[Diagram labels: Social; PFPT in AWS (policy engine, MySQL/C*/Solr); Amazon Glacier & Vault Lock]
![Page 35: Data Storage for the Long Haul: Compliance and Archive](https://reader034.fdocuments.us/reader034/viewer/2022051709/587125a11a28abe4448b60c1/html5/thumbnails/35.jpg)
Proofpoint SocialPatrol Archive
The customer specifies the retention period in Proofpoint
Social:
![Page 36: Data Storage for the Long Haul: Compliance and Archive](https://reader034.fdocuments.us/reader034/viewer/2022051709/587125a11a28abe4448b60c1/html5/thumbnails/36.jpg)
Proofpoint SocialPatrol Archive
Via AWS API we create a vault for that customer:
![Page 37: Data Storage for the Long Haul: Compliance and Archive](https://reader034.fdocuments.us/reader034/viewer/2022051709/587125a11a28abe4448b60c1/html5/thumbnails/37.jpg)
Proofpoint SocialPatrol Archive
Via AWS API,
we lock the vault,
and specify policy
to observe a
legal hold via a tag.
![Page 38: Data Storage for the Long Haul: Compliance and Archive](https://reader034.fdocuments.us/reader034/viewer/2022051709/587125a11a28abe4448b60c1/html5/thumbnails/38.jpg)
Proofpoint SocialPatrol Archive
As social content flows in, we record its purge date and
surface that to the user. Each piece of social content is an
archive in the vault.
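Recording a purge date comes down to adding the customer's retention period to the ingestion date. The sketch below shows only that date arithmetic. It is not Proofpoint's code: the function names and the `legal_hold` flag are hypothetical.

```python
from datetime import date, timedelta


def purge_date(ingested_on, retention_days):
    """Earliest date on which an archive may be deleted."""
    return ingested_on + timedelta(days=retention_days)


def is_purgeable(ingested_on, retention_days, today, legal_hold=False):
    """True once the retention period has passed and no legal hold is in place."""
    if legal_hold:
        return False
    return today >= purge_date(ingested_on, retention_days)


ingested = date(2016, 8, 11)
print(purge_date(ingested, 365))                      # date to surface to the user
print(is_purgeable(ingested, 365, date(2017, 1, 1)))  # still inside the retention period
```

Keeping the purge date in the application's own database means it can be shown to the user without querying Amazon Glacier.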
![Page 39: Data Storage for the Long Haul: Compliance and Archive](https://reader034.fdocuments.us/reader034/viewer/2022051709/587125a11a28abe4448b60c1/html5/thumbnails/39.jpg)
Proofpoint SocialPatrol Archive
Search UI uses
the copy of the data
we already had.
As archives expire,
we purge them.
![Page 40: Data Storage for the Long Haul: Compliance and Archive](https://reader034.fdocuments.us/reader034/viewer/2022051709/587125a11a28abe4448b60c1/html5/thumbnails/40.jpg)
Proofpoint SocialPatrol Archive
• Legal hold can be put in place by Proofpoint Support
• Data can be exported from Amazon Glacier by
Proofpoint Support when necessary
• Amazon Glacier with Vault Lock allowed us to build a
product that complies with SEC Rule 17a-4(f) and CFTC
Rule 1.31(b)-(c)
What would it have cost for us to build a WORM data store,
get it certified, and scale it … ?
![Page 41: Data Storage for the Long Haul: Compliance and Archive](https://reader034.fdocuments.us/reader034/viewer/2022051709/587125a11a28abe4448b60c1/html5/thumbnails/41.jpg)
Compliance Use Case 2–Auditing and Alerts
![Page 42: Data Storage for the Long Haul: Compliance and Archive](https://reader034.fdocuments.us/reader034/viewer/2022051709/587125a11a28abe4448b60c1/html5/thumbnails/42.jpg)
Audit logging with AWS CloudTrail
• S3 and Amazon Glacier can log API
calls for audit via CloudTrail
• Enable CloudTrail in the AWS console
and designate your log bucket
• S3 logs bucket-level activities; object
activities supported via event notification
• Amazon Glacier logs all API calls for
vault and archives
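Once CloudTrail delivers log files to the designated bucket, an audit job can filter them for Amazon Glacier activity. CloudTrail log files are JSON documents with a top-level `Records` list. The sketch below assumes a log file has already been downloaded and decompressed; the function name and the sample records are made up for illustration.

```python
import json


def glacier_events(log_text, event_names=None):
    """Return (time, event name, caller ARN) for Glacier API calls in a CloudTrail log."""
    selected = []
    for record in json.loads(log_text).get("Records", []):
        if record.get("eventSource") != "glacier.amazonaws.com":
            continue
        if event_names and record.get("eventName") not in event_names:
            continue
        caller = record.get("userIdentity", {}).get("arn")
        selected.append((record.get("eventTime"), record.get("eventName"), caller))
    return selected


# Made-up sample in the CloudTrail record layout (real records carry many more fields).
SAMPLE_LOG = json.dumps({
    "Records": [
        {
            "eventTime": "2016-08-11T17:00:00Z",
            "eventSource": "glacier.amazonaws.com",
            "eventName": "DeleteArchive",
            "userIdentity": {"arn": "arn:aws:iam::123456789012:user/example-user"},
        },
        {
            "eventTime": "2016-08-11T17:05:00Z",
            "eventSource": "s3.amazonaws.com",
            "eventName": "CreateBucket",
            "userIdentity": {"arn": "arn:aws:iam::123456789012:user/example-user"},
        },
    ]
})

deletions = glacier_events(SAMPLE_LOG, event_names={"DeleteArchive", "DeleteVault"})
print(deletions)
```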
![Page 43: Data Storage for the Long Haul: Compliance and Archive](https://reader034.fdocuments.us/reader034/viewer/2022051709/587125a11a28abe4448b60c1/html5/thumbnails/43.jpg)
Access policy for a storage container
• Control access to a storage container in a single location
– S3 bucket or Amazon Glacier vault access policy
– Grant/revoke access to internal business units/teams
– “Marketing_Vault” has a distinct access policy from “DevOps_Vault”
• Easily manage cross-account access for your business partner
– Simply add a section for your business partner in the same policy
– Cross-account activities (API calls) also show up in CloudTrail logs
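A vault access policy with one section for an internal team and one for a business partner could look like the sketch below. The action names are real Amazon Glacier actions, but which actions to grant is a choice made here for illustration, and the ARNs, account IDs, and function name are hypothetical.

```python
import json


def vault_access_policy(vault_arn, team_role_arn, partner_account_id):
    """Access policy with separate statements for a team and a partner account."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "team-upload-and-retrieve",
                "Effect": "Allow",
                "Principal": {"AWS": [team_role_arn]},
                "Action": [
                    "glacier:UploadArchive",
                    "glacier:InitiateJob",
                    "glacier:GetJobOutput",
                ],
                "Resource": [vault_arn],
            },
            {
                # Cross-account section: the partner can retrieve but not upload.
                "Sid": "partner-retrieve-only",
                "Effect": "Allow",
                "Principal": {"AWS": [f"arn:aws:iam::{partner_account_id}:root"]},
                "Action": ["glacier:InitiateJob", "glacier:GetJobOutput"],
                "Resource": [vault_arn],
            },
        ],
    }


access_policy = vault_access_policy(
    "arn:aws:glacier:us-west-2:123456789012:vaults/Marketing_Vault",
    "arn:aws:iam::123456789012:role/marketing-team",
    "999999999999",
)
policy_document = json.dumps(access_policy)

# With boto3 and credentials, the policy could be attached like this:
#   boto3.client("glacier").set_vault_access_policy(
#       vaultName="Marketing_Vault", policy={"Policy": policy_document})
```

Revoking the partner's access later is a matter of removing the second statement and setting the policy again.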
![Page 44: Data Storage for the Long Haul: Compliance and Archive](https://reader034.fdocuments.us/reader034/viewer/2022051709/587125a11a28abe4448b60c1/html5/thumbnails/44.jpg)
S3 event notifications
Events are delivered to an Amazon SNS topic, an Amazon SQS queue, or an AWS Lambda function
• Notifications when objects are created (via PUT, POST, Copy, or Multipart Upload) or deleted (DELETE)
• Filtering on prefixes and suffixes for all types of notifications
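A notification configuration that routes object events to a Lambda function, filtered by prefix and suffix, can be sketched as a dictionary in the shape that boto3's `put_bucket_notification_configuration` accepts. The function name, the Lambda ARN, and the bucket name are illustrative assumptions.

```python
def notification_config(lambda_arn, prefix, suffix):
    """Send create and delete events for matching keys to one Lambda function."""
    return {
        "LambdaFunctionConfigurations": [
            {
                "LambdaFunctionArn": lambda_arn,
                "Events": ["s3:ObjectCreated:*", "s3:ObjectRemoved:*"],
                "Filter": {
                    "Key": {
                        "FilterRules": [
                            {"Name": "prefix", "Value": prefix},
                            {"Name": "suffix", "Value": suffix},
                        ]
                    }
                },
            }
        ]
    }


notifications = notification_config(
    "arn:aws:lambda:us-west-2:123456789012:function:example-audit",  # hypothetical ARN
    prefix="documents/",
    suffix=".pdf",
)

# With boto3 and credentials (the bucket name is hypothetical):
#   boto3.client("s3").put_bucket_notification_configuration(
#       Bucket="example-bucket", NotificationConfiguration=notifications)
```

Swapping `LambdaFunctionConfigurations` for `TopicConfigurations` or `QueueConfigurations` (with `TopicArn` or `QueueArn`) targets SNS or SQS instead.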
![Page 45: Data Storage for the Long Haul: Compliance and Archive](https://reader034.fdocuments.us/reader034/viewer/2022051709/587125a11a28abe4448b60c1/html5/thumbnails/45.jpg)
Request specific notifications
Request notifications on specific PUT APIs
s3:ObjectCreated:*
s3:ObjectCreated:Put
s3:ObjectCreated:Post
s3:ObjectCreated:Copy
s3:ObjectCreated:CompleteMultipartUpload
Request notifications on specific DELETE APIs
s3:ObjectRemoved:*
s3:ObjectRemoved:Delete
s3:ObjectRemoved:DeleteMarkerCreated
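On the receiving side, a Lambda function (or any SNS/SQS consumer) sees these event types in the `eventName` field of each record. The sketch below parses an event in the S3 notification message layout; the function name and the sample event are made up, and only a few of the fields a real message carries are shown.

```python
from urllib.parse import unquote_plus


def summarize_s3_event(event):
    """Return (action, detail, bucket, key) for each record in an S3 notification."""
    summary = []
    for record in event.get("Records", []):
        # In the delivered message the name has no "s3:" prefix, e.g. "ObjectCreated:Put".
        category, _, detail = record["eventName"].partition(":")
        if category == "ObjectCreated":
            action = "created"
        elif category == "ObjectRemoved":
            action = "removed"
        else:
            action = "other"
        bucket = record["s3"]["bucket"]["name"]
        # Object keys arrive URL-encoded, with spaces sent as "+".
        key = unquote_plus(record["s3"]["object"]["key"])
        summary.append((action, detail, bucket, key))
    return summary


SAMPLE_EVENT = {
    "Records": [
        {
            "eventName": "ObjectCreated:Put",
            "s3": {
                "bucket": {"name": "example-bucket"},
                "object": {"key": "documents/annual+report.pdf"},
            },
        },
        {
            "eventName": "ObjectRemoved:DeleteMarkerCreated",
            "s3": {
                "bucket": {"name": "example-bucket"},
                "object": {"key": "documents/old.pdf"},
            },
        },
    ]
}

events = summarize_s3_event(SAMPLE_EVENT)
print(events)
```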
![Page 46: Data Storage for the Long Haul: Compliance and Archive](https://reader034.fdocuments.us/reader034/viewer/2022051709/587125a11a28abe4448b60c1/html5/thumbnails/46.jpg)
Compliance Use Case 3–Geographic Redundancy
![Page 47: Data Storage for the Long Haul: Compliance and Archive](https://reader034.fdocuments.us/reader034/viewer/2022051709/587125a11a28abe4448b60c1/html5/thumbnails/47.jpg)
S3 cross-region replication
Automated, fast, and reliable asynchronous replication of data across AWS regions
• Compliance: store data hundreds of miles apart
• Lower latency: distribute data to regional customers
• Secure: remote replicas managed by separate AWS accounts
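A replication rule can be sketched as a dictionary in the shape that boto3's `put_bucket_replication` accepts, using the prefix-based rule format. Both buckets need versioning enabled before replication can be configured. The function name, IAM role ARN, and bucket names below are hypothetical.

```python
def replication_config(role_arn, destination_bucket, prefix="", storage_class="STANDARD_IA"):
    """Replicate objects under `prefix` to another bucket, stored in `storage_class`."""
    return {
        "Role": role_arn,  # IAM role that S3 assumes to copy objects
        "Rules": [
            {
                "ID": "replicate-to-" + destination_bucket,
                "Prefix": prefix,  # an empty prefix matches every object
                "Status": "Enabled",
                "Destination": {
                    "Bucket": "arn:aws:s3:::" + destination_bucket,
                    "StorageClass": storage_class,
                },
            }
        ],
    }


replication = replication_config(
    "arn:aws:iam::123456789012:role/example-replication-role",
    "example-replica-bucket",
    prefix="documents/",
)

# With boto3 and credentials (the source bucket name is hypothetical):
#   boto3.client("s3").put_bucket_replication(
#       Bucket="example-source-bucket", ReplicationConfiguration=replication)
```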
![Page 48: Data Storage for the Long Haul: Compliance and Archive](https://reader034.fdocuments.us/reader034/viewer/2022051709/587125a11a28abe4448b60c1/html5/thumbnails/48.jpg)
Cross-region replication: Details
Cost
• Usual charges for storage, requests, and inter-region data transfer for the replicated copy of data
• Replicate into Standard-IA or Amazon Glacier
Replication status
• HEAD operation on a source object to determine replication status
• Replicated objects will not be re-replicated
• Use S3 COPY to replicate existing objects
Delete operation
• DELETE without object version ID: delete marker replicated
• DELETE specific object version ID: delete marker NOT replicated
Access control
• Object ACL updates are replicated
• Objects with Amazon-managed encryption key replicated
• AWS KMS encryption not replicated
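The replication status check can be sketched as a small interpreter for the response of a HEAD request. With boto3, `head_object` returns a dictionary that includes `ReplicationStatus` when a replication rule covers the object. The function name and the short labels it returns are assumptions for illustration, and both "COMPLETE" and "COMPLETED" are accepted defensively because the spelling differs between sources.

```python
def replication_state(head_response):
    """Map the ReplicationStatus of a HEAD response to a short label."""
    status = head_response.get("ReplicationStatus")
    if status is None:
        # No rule covers the object, or it predates the rule (use S3 COPY for those).
        return "not-replicated"
    labels = {
        "PENDING": "pending",
        "COMPLETED": "done",
        "COMPLETE": "done",
        "FAILED": "failed",
        "REPLICA": "is-replica",  # the object is itself a replicated copy
    }
    return labels.get(status, "unknown")


# With boto3 and credentials (bucket and key are hypothetical):
#   response = boto3.client("s3").head_object(Bucket="example-source-bucket", Key="A/vid1")
#   print(replication_state(response))

print(replication_state({"ReplicationStatus": "PENDING"}))
print(replication_state({"ContentLength": 1024}))
```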
![Page 49: Data Storage for the Long Haul: Compliance and Archive](https://reader034.fdocuments.us/reader034/viewer/2022051709/587125a11a28abe4448b60c1/html5/thumbnails/49.jpg)
Versioning with cross-region replication
[Diagram: source bucket A (key A/vid1) and destination bucket B (key B/vid1); versions Vid1-v1 through Vid1-v4 appear in both buckets]
![Page 50: Data Storage for the Long Haul: Compliance and Archive](https://reader034.fdocuments.us/reader034/viewer/2022051709/587125a11a28abe4448b60c1/html5/thumbnails/50.jpg)
Cross-region replication with lifecycle archiving
S3
Bucket A
Amazon Glacier
S3
Bucket B
![Page 51: Data Storage for the Long Haul: Compliance and Archive](https://reader034.fdocuments.us/reader034/viewer/2022051709/587125a11a28abe4448b60c1/html5/thumbnails/51.jpg)
Data ingestion into AWS storage services
Snowball
• Accelerate PBs with AWS-provided appliances
• NEW 80 TB model
Storage Gateway
• Instant hybrid cloud
• Up to 120 MB/s cloud upload rate (4x improvement)
Firehose
• Ingest data streams directly into AWS data stores
Direct Connect
• COLO to AWS
ISV Connectors
• Commvault
• Veritas
• And others
NEW S3 Transfer Acceleration
• Accelerate object transfer up to 300% using AWS’s private network
![Page 52: Data Storage for the Long Haul: Compliance and Archive](https://reader034.fdocuments.us/reader034/viewer/2022051709/587125a11a28abe4448b60c1/html5/thumbnails/52.jpg)
What is Snowball? Petabyte-scale data transport
• E-ink shipping label
• Ruggedized case: “8.5G Impact”
• All data encrypted end-to-end
• 50 TB or 80 TB
• 10 G network
• Rain & dust resistant
• Tamper-resistant case & electronics
![Page 53: Data Storage for the Long Haul: Compliance and Archive](https://reader034.fdocuments.us/reader034/viewer/2022051709/587125a11a28abe4448b60c1/html5/thumbnails/53.jpg)
Pricing
| Dimension | Price |
| --- | --- |
| Usage Charge per Job | $250.00 |
| Extra Day Charge (First 10 days* are free) | $15.00 |
| Data Transfer In | $0.00/GB |
| Data Transfer Out | $0.02/GB |
| Shipping** | Varies |
| Amazon S3 Charges | Standard storage and request fees apply |

* Starts one day after the appliance is delivered to you. The first day the appliance is received at your site and the last day the appliance is shipped out are also free and not included in the 10-day free usage time.
** Shipping charges are based on your shipment destination and the shipping option (e.g., overnight, 2-day) you choose.
Transfer 1 PB with 13 devices
in parallel in 1 week!
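The device count follows from dividing the data set by the capacity of one appliance, and the per-job cost follows from the pricing table above. The sketch below assumes the 80 TB model, takes 1 PB as 1,000 TB, and leaves out shipping and Amazon S3 charges; the function names are made up for illustration.

```python
import math

USAGE_CHARGE_PER_JOB = 250.00   # USD, from the pricing table
EXTRA_DAY_CHARGE = 15.00        # USD per day beyond the free period
FREE_DAYS = 10
TRANSFER_OUT_PER_GB = 0.02      # USD; data transfer in is free


def devices_needed(total_tb, device_tb=80):
    """Number of appliances needed to hold `total_tb` of data."""
    return math.ceil(total_tb / device_tb)


def job_cost(billable_days, transfer_out_gb=0.0):
    """Cost of one Snowball job, excluding shipping and Amazon S3 charges.

    `billable_days` excludes the delivery day and the ship-out day, which are free.
    """
    extra_days = max(0, billable_days - FREE_DAYS)
    return (USAGE_CHARGE_PER_JOB
            + extra_days * EXTRA_DAY_CHARGE
            + transfer_out_gb * TRANSFER_OUT_PER_GB)


devices = devices_needed(1000)        # 1 PB expressed in TB
total = devices * job_cost(7)         # every device returned within a week
print(devices, total)
```

Thirteen 80 TB devices cover 1 PB, which matches the slide. Returning all of them within the free period keeps the cost at the per-job usage charge for each device.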
![Page 54: Data Storage for the Long Haul: Compliance and Archive](https://reader034.fdocuments.us/reader034/viewer/2022051709/587125a11a28abe4448b60c1/html5/thumbnails/54.jpg)
Remember to complete
your evaluations!
![Page 55: Data Storage for the Long Haul: Compliance and Archive](https://reader034.fdocuments.us/reader034/viewer/2022051709/587125a11a28abe4448b60c1/html5/thumbnails/55.jpg)
Thank you!