Data Day Health IT - Data Architecture

Healthcare Considerations for Modern Data Architectures Pitfalls, Challenges and Best Practices Data Day Health 2017 Presented by: Toby Owen, VP Product Development


Page 1: Data Day Health IT - Data Architecture

Healthcare Considerations for Modern Data Architectures Pitfalls, Challenges and Best Practices Data Day Health 2017

Presented by: Toby Owen, VP Product Development

Page 2: Data Day Health IT - Data Architecture

OnRamp
- Industry leading high security and hybrid hosting provider
- Operates multiple enterprise class data centers located in Austin, Texas and Raleigh, North Carolina
- SSAE 16 SOC II and SOC 3 Audited, PCI and HIPAA compliant company
- Specializes in helping organizations meet their rigorous compliance requirements and keep their data safe

Toby Owen
- Vice President, Product Development, OnRamp
- 20 year IT veteran with operations and engineering background
- Security, IT ops at scale, hybrid cloud, compliant workload hosting

Page 3: Data Day Health IT - Data Architecture
Page 4: Data Day Health IT - Data Architecture

AGENDA
GOAL: Designing an app for Healthcare… that’s compliant!

• Data Stores
• App Design
• Where to Run It
• Dev Lifecycle
• Takeaways
• Q & A

Page 5: Data Day Health IT - Data Architecture

Refresher on (or intro to) databases
CAP theorem

• C = Consistency
• A = Availability
• P = Partition Tolerance

Page 6: Data Day Health IT - Data Architecture

Database Reference Guide – at a glance

*Adapted from http://blog.nahurst.com/visual-guide-to-nosql-systems

Page 7: Data Day Health IT - Data Architecture

Why do we care?
• Scaling vertically versus horizontally
- Costs of scaling up can grow exponentially
- Costs of scaling horizontally grow roughly linearly
- Vertical scaling has hard limits; the horizontal scale limit is effectively “indefinite”
• Data sources are increasingly distributed
• Horizontal scaling provides better geo-resiliency at the same time
• Not all data needs strict ACID compliance

More arguments favor distributed data stores

Page 8: Data Day Health IT - Data Architecture

RDBMS and ACID
• Definition: Atomicity, Consistency, Isolation, Durability
• Favors Consistency over Availability
• Examples
- MSSQL
- MySQL
- Postgres
- Greenplum
- VoltDB

Page 9: Data Day Health IT - Data Architecture

Are scalability and ACID a false tradeoff?
• Scalability and ACID are difficult to satisfy at the same time
• Not all data requires strict ACID compliance
• Relational can be a bottleneck
- Simpler models might simplify operations – easier and more efficient
• New relational DBs can be very fast AND scalable
• Many NoSQL DBs are adding features to look more like RDBMS
• Take-away: understand your data (shape and use case) and pick the right solution

Page 10: Data Day Health IT - Data Architecture

NoSQL and BASE
• NoSQL Definition
- SOME of the following: non-relational, distributed, open-source, horizontally scalable, schema free, easy replication support, simple API
• BASE Definition: Basically Available, Soft state, Eventual consistency
- All data reads will eventually yield the same result
• Favors Availability over Consistency
• Let’s focus some time here exploring NoSQL databases/datastores
- Considerations based on scalability, encryption and key management
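To make "eventual consistency" concrete, here is a minimal Python sketch, not taken from the slides. It assumes two replicas that accept writes independently and reconcile with a last-write-wins rule. The `Replica` class, the key name and the integer timestamps are all hypothetical, and real datastores use more careful conflict resolution.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple


@dataclass
class Replica:
    """One node's local copy of the data: key -> (timestamp, value)."""
    name: str
    store: Dict[str, Tuple[int, str]] = field(default_factory=dict)

    def write(self, key: str, value: str, timestamp: int) -> None:
        # Accept the write locally without coordinating with peers
        # (the "Basically Available" part). Newest timestamp wins;
        # ties keep the existing entry, a simplification for this sketch.
        current = self.store.get(key)
        if current is None or timestamp > current[0]:
            self.store[key] = (timestamp, value)

    def read(self, key: str) -> Optional[str]:
        entry = self.store.get(key)
        return None if entry is None else entry[1]

    def sync_from(self, other: "Replica") -> None:
        # Anti-entropy pass: pull every entry from a peer, keep the newest.
        for key, (timestamp, value) in other.store.items():
            self.write(key, value, timestamp)


a, b = Replica("a"), Replica("b")
a.write("patient:42:status", "admitted", timestamp=1)
b.write("patient:42:status", "discharged", timestamp=2)

# Soft state: before the replicas exchange data, reads disagree.
stale_read = a.read("patient:42:status")
fresh_read = b.read("patient:42:status")

# Eventual consistency: after one exchange in each direction they converge.
a.sync_from(b)
b.sync_from(a)
converged = a.read("patient:42:status") == b.read("patient:42:status")
```

The window between the two writes and the sync is where a reader can see stale data. Whether that window is acceptable depends on the data in question.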

Page 11: Data Day Health IT - Data Architecture

MongoDB
• Document oriented database (JSON). Considered “semi-structured” data
• Scalability - built in via automatic sharding (range, hash, zone)
- EA FIFA game (250+ servers), Yandex (10s of billions of objects, TBs of data, growing at 10MM file uploads/day)
• Security – encryption in-transit
- SSL/TLS client support (data in-transit)
- MongoDB Enterprise Advanced supports FIPS 140-2
- Atlas (Mongo-aaS on Amazon) does NOT support FIPS mode
• Security – encryption at-rest
- App level, external filesystem, disk level, or natively (encrypted storage engine). Native supports FIPS 140-2
• Security – key management
- Each DB has a separate key
- Can be integrated with external KMS
- Supports key rotation without downtime (via rolling restarts of replica set)
- Native encryption is only available via Enterprise Advanced version!
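The difference between range and hash sharding can be sketched in a few lines of Python. This illustrates the two routing ideas only and is not MongoDB's actual implementation. The shard names, the split points and the choice of SHA-256 are assumptions made for the example.

```python
import bisect
import hashlib

SHARDS = ["shard0", "shard1", "shard2"]  # hypothetical shard names


def hashed_shard(key: str) -> str:
    """Hash sharding: spreads keys evenly but scatters neighbouring keys."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return SHARDS[int.from_bytes(digest[:8], "big") % len(SHARDS)]


# Range sharding: each shard owns a contiguous slice of the key space.
# These are the exclusive upper bounds of the first two shards; the last
# shard takes everything else. The split points are hypothetical.
RANGE_BOUNDS = ["h", "p"]


def ranged_shard(key: str) -> str:
    """Range sharding: keeps neighbouring keys together for range scans."""
    return SHARDS[bisect.bisect_right(RANGE_BOUNDS, key)]


# Surnames starting a-g land on shard0, h-o on shard1, p-z on shard2.
range_placement = {name: ranged_shard(name) for name in ("adams", "hall", "smith")}
hash_placement = {name: hashed_shard(name) for name in ("adams", "hall", "smith")}
```

Range sharding suits queries over contiguous keys but can create hot spots when writes cluster. Hashing evens out the load at the cost of scattering range queries across shards.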

Page 12: Data Day Health IT - Data Architecture

Cassandra
• Row-oriented
• Scalability – peer-to-peer distributed system, data across all nodes
- Each node contains commit log, exchanges data across cluster every second
- All writes are automatically partitioned and replicated throughout cluster
- Apple (75,000 nodes, 10PB); Netflix (2,500 nodes, 420TB, 1 trillion requests/day)
• Security – encryption in-transit
- Supports TLS/SSL, separate configs for client-server and server-server
- FIPS compliance supported
• Security – encryption at-rest
- Open-source Cassandra relies on filesystem encryption
- DataStax (commercial version) supports at-rest encryption
• Security – key management
- Open-source Cassandra relies on filesystem encryption’s key management tools (can be complex)
- DataStax (commercial version) has native KMIP support
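As a rough illustration of writes being "automatically partitioned and replicated", the sketch below places nodes on a hash ring and picks replicas by walking clockwise. It shows the general consistent-hashing idea under simplifying assumptions: one token per node, SHA-256 as a stand-in hash, and hypothetical node names. It is not Cassandra's own partitioner or replication strategy.

```python
import bisect
import hashlib
from typing import List


def token(value: str) -> int:
    # Stand-in hash for the sketch; not the partitioner Cassandra uses.
    return int.from_bytes(hashlib.sha256(value.encode("utf-8")).digest()[:8], "big")


class Ring:
    """Peer-to-peer ring: every node can compute where any key lives."""

    def __init__(self, nodes: List[str], replication_factor: int = 3) -> None:
        if replication_factor > len(nodes):
            raise ValueError("replication factor exceeds node count")
        self.rf = replication_factor
        # One token per node, sorted so the ring can be walked clockwise.
        self.entries = sorted((token(node), node) for node in nodes)
        self.tokens = [t for t, _ in self.entries]

    def replicas(self, partition_key: str) -> List[str]:
        # The first node at or after the key's token owns the key; the
        # next rf - 1 nodes clockwise hold the additional copies.
        size = len(self.entries)
        start = bisect.bisect_left(self.tokens, token(partition_key)) % size
        return [self.entries[(start + i) % size][1] for i in range(self.rf)]


nodes = ["node1", "node2", "node3", "node4", "node5"]  # hypothetical names
ring = Ring(nodes, replication_factor=3)
owners = ring.replicas("patient:42")
```

Because placement is a pure function of the key and the ring, no central coordinator is needed to route a write.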

Page 13: Data Day Health IT - Data Architecture

Hadoop
• Not really a database – distributed filesystem (HDFS) plus application interface (MapReduce)
• Scalability – designed for large file distribution across 100s and 1000s of servers, streaming access and large data sets
- (compute cheaper to move than data)
- Facebook (21PB, 2000 machines), Spotify (1300 nodes, 42PB storage, 20TB a day ingested, 200TB a day generated by Hadoop)
• Security – encryption in-transit
- HDFS supports transparent encryption
• Security – encryption at-rest
- Supported by HDFS, application, database, or disk-level
- Lots of options for commercial support and tools to simplify management
• Security – key management
- Natively supports its own KMS
- Again, more commercial options exist to simplify
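For readers new to MapReduce, the shape of a job can be sketched in plain Python. This is a single-process toy that mirrors the map, shuffle and reduce phases; it does not use any Hadoop API, and the function names and sample input are made up for illustration.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple


def map_phase(block: str) -> Iterator[Tuple[str, int]]:
    # In a cluster this would run on the node that already stores the
    # block, which is the "move compute to the data" idea.
    for word in block.lower().split():
        yield word, 1


def shuffle(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    # Group all values emitted for the same key.
    grouped: Dict[str, List[int]] = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(grouped: Dict[str, List[int]]) -> Dict[str, int]:
    # Collapse each key's values into a single result.
    return {key: sum(values) for key, values in grouped.items()}


def word_count(blocks: Iterable[str]) -> Dict[str, int]:
    pairs = (pair for block in blocks for pair in map_phase(block))
    return reduce_phase(shuffle(pairs))


counts = word_count(["admit discharge admit", "discharge transfer"])
```

Each phase only needs its own slice of the data, which is what lets the same pattern spread across hundreds or thousands of servers.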

Page 14: Data Day Health IT - Data Architecture

LOTS of others
• Key Value
- Redis
- DynamoDB
• Document Oriented
- CouchDB
- DocumentDB
• Time Series
• Graph
• + 225 more! (nosql-database.org for basic info and comparisons)

Page 15: Data Day Health IT - Data Architecture

So you’ve chosen your datastore(s)
Now what?

Application architecture!

Page 16: Data Day Health IT - Data Architecture

Application design
SOME Considerations for HIPAA and HITECH
• HITECH – each app zone requires firewall isolation
- Web, app, database
• Key Management
- Key Management System (KMS)
- Hardware Security Module (HSM)
- Keys database
- Key splitting – for transferring clear-text cipher keys
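One simple way to picture key splitting for transfer is an all-or-nothing XOR split, where each custodian carries one share and every share is needed to rebuild the key. The sketch below is an illustration under that assumption. The function names are hypothetical, and this is not a procedure prescribed by HIPAA, HITECH or any specific KMS.

```python
import secrets
from functools import reduce
from typing import List


def xor_bytes(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def split_key(key: bytes, parts: int = 2) -> List[bytes]:
    """Split a key into `parts` shares; all of them are required to rebuild it."""
    if parts < 2:
        raise ValueError("need at least two shares")
    # All but the last share are pure random bytes.
    shares = [secrets.token_bytes(len(key)) for _ in range(parts - 1)]
    # The last share is the key XORed with every random share.
    shares.append(reduce(xor_bytes, shares, key))
    return shares


def join_key(shares: List[bytes]) -> bytes:
    # XORing all shares cancels the random parts and leaves the key.
    return reduce(xor_bytes, shares)


data_key = secrets.token_bytes(32)  # e.g. a 256-bit cipher key
custodian_shares = split_key(data_key, parts=3)
rebuilt = join_key(custodian_shares)
```

Any strict subset of the shares is indistinguishable from random bytes, so no single channel or person ever handles the clear-text key.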

Page 17: Data Day Health IT - Data Architecture

Reference Architecture

Page 18: Data Day Health IT - Data Architecture

And more
• Many other security considerations around compliant application architecture
- Shared storage resources and shared IaaS

Supporting encryption at-rest may not be enough to achieve HIPAA or HITRUST compliance.

- Verifiable (compliant) destruction of data in a shared environment
- Encryption keys need to be managed in accordance with shared secrets or ‘key splitting’ schemes (e.g. Shamir’s secret sharing)
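Shamir's secret sharing differs from a simple all-or-nothing split because any `threshold` of the `count` shares is enough to recover the secret. The following is a compact teaching sketch over a prime field. The field size, function names and sample secret are assumptions made for the example, and a vetted library should be preferred for production key handling.

```python
import secrets
from typing import List, Tuple

PRIME = 2**127 - 1  # a Mersenne prime; the secret must be smaller than this


def make_shares(secret: int, threshold: int, count: int) -> List[Tuple[int, int]]:
    """Return `count` points on a random polynomial whose constant term is the secret."""
    if not 0 <= secret < PRIME:
        raise ValueError("secret out of range")
    if not 2 <= threshold <= count:
        raise ValueError("need 2 <= threshold <= count")
    # Degree threshold-1 polynomial: secret + c1*x + c2*x^2 + ...
    coeffs = [secret] + [secrets.randbelow(PRIME) for _ in range(threshold - 1)]

    def evaluate(x: int) -> int:
        result = 0
        for coeff in reversed(coeffs):  # Horner's rule
            result = (result * x + coeff) % PRIME
        return result

    return [(x, evaluate(x)) for x in range(1, count + 1)]


def recover_secret(shares: List[Tuple[int, int]]) -> int:
    """Lagrange interpolation at x = 0 recovers the constant term."""
    secret = 0
    for i, (xi, yi) in enumerate(shares):
        num, den = 1, 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                num = (num * -xj) % PRIME
                den = (den * (xi - xj)) % PRIME
        # Modular inverse via Fermat's little theorem (PRIME is prime).
        secret = (secret + yi * num * pow(den, PRIME - 2, PRIME)) % PRIME
    return secret


shares = make_shares(secret=123456789, threshold=3, count=5)
recovered = recover_secret(shares[:3])
```

With a threshold of 3 out of 5, two custodians can be unavailable without blocking recovery, while any two shares alone reveal nothing about the key.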

Page 19: Data Day Health IT - Data Architecture

Next?
We’ve chosen the right datastores…
We’ve designed our application to support HITRUST or HIPAA…

Where will the app run?

Page 20: Data Day Health IT - Data Architecture

Hybrid is the likely reality
• Consuming 3rd party data sources
• Capabilities of each data or app component provider
• BAA with each provider
• Peril of failing to plan

Page 21: Data Day Health IT - Data Architecture

How to keep all this compliant?
• Lots to consider to get it right
• Start at the beginning – your development lifecycle
• Automate everything
• Dev/Test/Staging/Production should all account for secure design
• Use Containers?
• Maybe get some help
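"Automate everything" could, for instance, include a pipeline step that fails the build when an environment is missing a required control. The sketch below is a hypothetical example. The control names, the environment list and the config shape are invented for illustration, and passing such a check is not evidence of HIPAA or HITRUST compliance by itself.

```python
from typing import Dict, List

# Hypothetical control names; a real list would come from your own policy.
REQUIRED_CONTROLS = ("encryption_at_rest", "tls_in_transit", "baa_signed")
ENVIRONMENTS = ("dev", "test", "staging", "production")


def find_violations(configs: Dict[str, Dict[str, bool]]) -> List[str]:
    """Return one message per missing control, across every environment."""
    violations: List[str] = []
    for env in ENVIRONMENTS:
        settings = configs.get(env)
        if settings is None:
            violations.append(f"{env}: no configuration found")
            continue
        for control in REQUIRED_CONTROLS:
            # A control that is absent counts the same as one that is off.
            if not settings.get(control, False):
                violations.append(f"{env}: {control} is not enabled")
    return violations


all_on = {control: True for control in REQUIRED_CONTROLS}
configs = {env: dict(all_on) for env in ENVIRONMENTS}
configs["staging"]["tls_in_transit"] = False  # simulate a drifted environment

problems = find_violations(configs)
```

Running the same check against dev, test, staging and production keeps the lower environments from drifting away from the production security design.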

Page 22: Data Day Health IT - Data Architecture

Key Takeaways
• Distributed data is becoming the new norm
• Data is different – data usage should dictate data technology
- (no one-size-fits-all)
• Application Architecture is key to achieving compliance
• Must consider all locations where app is running
• Consider compliance in all phases of app development (starting with design)
• Automation in development pipeline is key to building-in and maintaining compliance throughout app lifecycle
• Final consideration – are you now a service provider?

Page 23: Data Day Health IT - Data Architecture
Page 24: Data Day Health IT - Data Architecture

Toby Owen
VP, Product Development
[email protected]
@tobydowen
linkedin.com/in/tobyowen

Page 25: Data Day Health IT - Data Architecture

Resources
• Databases and scaling:
- http://stackoverflow.com/questions/12215002/why-are-relational-databases-having-scalability-issues
- http://blog.nahurst.com/visual-guide-to-nosql-systems
- http://nosql-database.org/
• MongoDB
- https://www.mongodb.com/mongodb-architecture
- https://webassets.mongodb.com/_com_assets/collateral/MongoDB_Security_Architecture_WP.pdf
• Cassandra
- http://cassandra.apache.org/doc/latest/operating/security.html?highlight=encryption
- http://stackoverflow.com/questions/32584253/how-to-use-cassandra-with-tde-transparent-data-encryption
- http://dba.stackexchange.com/questions/6909/cassandra-encryption-at-rest
- http://www.datastax.com/products/datastax-enterprise
• Hadoop
- https://hadoop.apache.org/docs/r2.7.2/hadoop-project-dist/hadoop-hdfs/TransparentEncryption.html
- Hadoop at Scale: Spotify http://cdn.oreillystatic.com/en/assets/1/event/118/The%20Evolution%20of%20Hadoop%20at%20Spotify-%20Through%20Failures%20and%20Pain%20Presentation.pdf
• Key management
- https://en.wikipedia.org/wiki/Shamir%27s_Secret_Sharing