SplunkLive! London 2016 - HSCIC / NHS Digital / Spine 2
Copyright © 2016 Splunk Inc.
Splunk for Spine 2
Ramen Sen
Lead Systems Engineer
Health and Social Care Information Centre
2
About HSCIC
The national provider of information, data & IT systems for commissioners, analysts & clinicians in health and social care.
HSCIC is an executive non-departmental public body, sponsored by the Department of Health.
3
About NHS SPINE CORE
The Spine supports the NHS in the exchange of information across national and local NHS systems. It connects clinicians and patients to essential national services including:
Electronic Prescription Service
Summary Care Record
Child Protection - Information Sharing
4
NHS SPINE CORE in numbers…
Handles 6 billion messages every year
Connects over 28,000 health care IT systems in 21,000 organisations
Holds over 500 million records and documents
Peak daily volume is ~42 million transactions
Indexes over a billion events a day
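To put the headline figures in per-second terms, the arithmetic can be sketched as below. This is only a back-of-envelope conversion of the numbers on the slide: it assumes load is spread evenly over a 24-hour day and a 365-day year, which real traffic will not be, and the function names are illustrative.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400
DAYS_PER_YEAR = 365             # assumption: non-leap year


def per_second(daily_count: float) -> float:
    """Average rate per second for a daily total, assuming an even spread."""
    return daily_count / SECONDS_PER_DAY


def per_day(yearly_count: float) -> float:
    """Average daily total for a yearly total, assuming an even spread."""
    return yearly_count / DAYS_PER_YEAR


# Figures taken from the slide.
indexed_events_per_day = 1_000_000_000
peak_transactions_per_day = 42_000_000
messages_per_year = 6_000_000_000

avg_events_per_second = per_second(indexed_events_per_day)
avg_peak_tx_per_second = per_second(peak_transactions_per_day)
avg_messages_per_day = per_day(messages_per_year)

if __name__ == "__main__":
    print(f"Indexed events: ~{avg_events_per_second:,.0f} per second")
    print(f"Peak-day transactions: ~{avg_peak_tx_per_second:,.0f} per second")
    print(f"Messages: ~{avg_messages_per_day:,.0f} per day on average")
```

On these assumptions, a billion indexed events a day averages out to roughly 11,600 events per second, and a 42-million-transaction peak day to just under 500 transactions per second.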
5
How We Got Started
2003/2004: Contracts awarded for Spine 1
July 2011: Began developing Spine 2 in-house; text-log index and analysis tool chosen (Splunk)
July 2013: Spine 2 went into external testing with health software suppliers
August 2014: Successful transition to Spine 2
Today: Indexing over a billion events a day
6
Requirements for event indexing and search:
Index bespoke application logs as well as product logs (NGINX, Riak etc.).
Time-based reporting with both matrix and chart output.
Support for inexpert users
• Form-based, user-driven generation of reports and dashboards
• Google-style query language for power users
7
Requirements for event indexing and search:
Support for expert users
• Includes transaction linking & API support for app building.
Scalability: on the order of 400 GB a day, with on the order of 1,000 reports every hour.
Horizontal scalability on commodity hardware.
Security (authentication and authorization control).
8
Platform Performance Monitoring Architecture
[Architecture diagram. Labels: Live and Reference environments, each with sides A and B; 200 servers; search heads; indexers; non-sensitive, non-patient data; access to security & audit logs; role-based authentication; 2-factor authentication.]
9
We Have a Lot of Use Cases…
Reporting for business programs
SLA/performance reporting
24/7 operational monitoring
Performance/scalability monitoring
Non functional test monitoring
Deployment monitoring
Incident investigation
Trend analysis
H/W, OS and process monitoring
Security monitoring
Audit logging
10
So I’m Going to Focus on Three…
Reporting for business programs
SLA/performance reporting
24/7 operational monitoring
Performance/scalability monitoring
Non functional test monitoring
Deployment monitoring
Incident investigation
Trend analysis
H/W, OS and process monitoring
Security monitoring
Audit logging
11
24/7 Operational Monitoring
12
Performance monitoring
14
Incident Investigation - Business
Identify and resolve external incidents where a message doesn’t go through (e.g. GP record transfer or electronic prescription).
15
Incident Investigation - Business
Identify and resolve external incidents where a message has a processing problem.
Top Tips
17
Find the ‘transaction in a haystack’ with unique transaction IDs
For all Spine application log points we log:
• Log Level
• Log Reference
• Process
• Internal ID
This allows us to trace a single message journey through the entire system, across all hosts.
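The idea of tracing one message by its ID can be sketched in a few lines of Python. This is a minimal illustration, not the Spine implementation or Splunk's own search: the `key=value` line format, the `ts` and `host` fields, and every name and sample value below are hypothetical. Only the four logged fields (log level, log reference, process, internal ID) come from the slide.

```python
from typing import Iterable, List, NamedTuple


class LogEvent(NamedTuple):
    timestamp: str    # assumed ISO-8601 UTC, so string order matches time order
    host: str         # assumed: the host that wrote the log line
    level: str        # log level
    reference: str    # log reference
    process: str      # process that emitted the log point
    internal_id: str  # unique ID carried by the message through the system


def parse_line(line: str) -> LogEvent:
    """Parse one line of a hypothetical space-separated key=value log format."""
    fields = dict(pair.split("=", 1) for pair in line.split())
    return LogEvent(
        fields["ts"], fields["host"], fields["level"],
        fields["ref"], fields["process"], fields["id"],
    )


def trace(lines: Iterable[str], internal_id: str) -> List[LogEvent]:
    """Return every event for one internal ID, across all hosts, in time order."""
    events = (parse_line(line) for line in lines)
    matching = [event for event in events if event.internal_id == internal_id]
    return sorted(matching, key=lambda event: event.timestamp)


# Made-up sample lines, deliberately out of order and from different hosts.
SAMPLE_LINES = [
    "ts=2016-05-11T09:00:02Z host=app02 level=INFO ref=REF0002 process=router id=abc123",
    "ts=2016-05-11T09:00:01Z host=web01 level=INFO ref=REF0001 process=gateway id=abc123",
    "ts=2016-05-11T09:00:01Z host=web01 level=INFO ref=REF0001 process=gateway id=zzz999",
    "ts=2016-05-11T09:00:03Z host=db03 level=ERROR ref=REF0003 process=store id=abc123",
]

if __name__ == "__main__":
    for event in trace(SAMPLE_LINES, "abc123"):
        print(event.timestamp, event.host, event.process, event.level, event.reference)
```

Because every log point carries the same internal ID, filtering on that one field is enough to reassemble the journey regardless of which host or process wrote each line.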
Thank you!
18