The Ultimate Logging Architecture
You know you WANT it!
Michele Leroux [email protected]
@michelebusta
http://solliance.net
http://michelebusta.com
The Hello World of Logging
1992
Hello World!
Logging Today
2014
Web Browsers
Mobile Apps
Client Apps
Why do we log?
• Troubleshooting visibility
• Security audits, review, early detection
• Post incident forensics
• Track change history
• Insights into user activity
• Reporting and analysis
What to log?
• Event Logs, e.g.
– Application Events
– Windows Logs
– IIS Logs
– Trace Output
• Audit Logs, e.g.
– Login Attempts
– Unauthorized / Authorized Access
– Password Resets
• Activity Logs, e.g.
– Session Trace
– Purchase Flow
– Report Generation
– Feature Access
• History Logs, e.g.
– Change history for any critical system records
• Live Streaming / Analytics
Make Logging EASY
Implement a Log Helper
ILogger
Logger
TraceDebug()
TraceInformation()
TraceWarning()
TraceError()
Throw()
Logger.Current.TraceInformation();
Logger.Current.Throw(ex);
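A minimal sketch of such a helper, in Python for brevity. The method names follow the slide in snake_case; the in-memory `entries` list and the behavior of `throw()` (record the exception, then re-raise it) are assumptions, not details from the talk.

```python
from datetime import datetime, timezone


class Logger:
    """Hypothetical log helper; method names mirror the slide."""

    current = None  # shared instance, standing in for Logger.Current

    def __init__(self):
        # Assumption: a list stands in for the real log destination.
        self.entries = []

    def _write(self, level, message):
        try:
            self.entries.append({
                "time": datetime.now(timezone.utc).isoformat(),
                "level": level,
                "message": message,
            })
        except Exception:
            # A failure inside the logger should never break the caller.
            pass

    def trace_debug(self, message):
        self._write("DEBUG", message)

    def trace_information(self, message):
        self._write("INFO", message)

    def trace_warning(self, message):
        self._write("WARNING", message)

    def trace_error(self, message):
        self._write("ERROR", message)

    def throw(self, ex):
        """Record the exception, then re-raise it to the caller."""
        self._write("ERROR", repr(ex))
        raise ex


Logger.current = Logger()
```

Call sites then stay one-liners, for example `Logger.current.trace_information("order placed")`.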
Failure is NOT an option.
Event Logging
Just Do It
• Whatever is built in
• Whatever you know best
• Just do it
Encapsulate the Mechanism
ILogger
Logger
ELMAH / SLAB
Azure Diagnostics
log4j / log4net
Elasticsearch
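One way to encapsulate the mechanism is a facade that forwards to a swappable backend. The sketch below is in Python; `StdlibSink` and `MemorySink` are hypothetical names, and Python's built-in `logging` module stands in for whichever framework sits behind the helper.

```python
import logging


class StdlibSink:
    """Backend that forwards to Python's built-in logging module."""

    def __init__(self, name="app"):
        self._log = logging.getLogger(name)

    def emit(self, level, message):
        # getattr maps "INFO" / "ERROR" to the numeric logging levels.
        self._log.log(getattr(logging, level), message)


class MemorySink:
    """Backend that keeps entries in a list (handy for tests)."""

    def __init__(self):
        self.entries = []

    def emit(self, level, message):
        self.entries.append((level, message))


class Logger:
    """Facade: call sites depend on this class, never on the backend."""

    def __init__(self, sink):
        self._sink = sink

    def trace_information(self, message):
        self._sink.emit("INFO", message)

    def trace_error(self, message):
        self._sink.emit("ERROR", message)
```

Swapping `MemorySink()` for `StdlibSink()` changes where the entries go without touching any call site.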
Audit Logging
Logs and Compliance
• Contain no user credentials
• No PII, PHI or identifiable user data
• Retention period (1 year is a good baseline)
• A structured archival process
• Alert if log reaches capacity
• Authorized access
• Protection from modification (write-only)
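The first two points can be partly enforced in code by scrubbing entries before they are written. A minimal Python sketch; the key list and the email pattern are illustrative assumptions, not a complete PII/PHI filter.

```python
import re

# Assumption: an example list of field names to mask outright.
SENSITIVE_KEYS = {"password", "token", "ssn", "email"}
# Assumption: a deliberately simple email pattern for free-form text.
EMAIL_PATTERN = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")


def scrub(entry):
    """Return a copy of entry with sensitive fields and email addresses masked."""
    clean = {}
    for key, value in entry.items():
        if key.lower() in SENSITIVE_KEYS:
            clean[key] = "[REDACTED]"
        elif isinstance(value, str):
            clean[key] = EMAIL_PATTERN.sub("[REDACTED]", value)
        else:
            clean[key] = value
    return clean
```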
Implement an Audit Helper
ILogger
Logger
Tracexxx()
Throw()
AuditLogger.Current.Write();
AuditLogger.Current.Throw(ex);
IAuditLogger
AuditLogger
Write()
Throw()
Event Logs Audit Logs
Logger.Current.TraceInformation();
Logger.Current.Throw(ex);
Azure Blobs
DocumentDB
Benefits of NoSQL
• Log details tend to evolve
– Schema-less storage is best
– Re-indexing may be necessary
• Co-location with mainline databases
– Adds complexity and overhead (potentially)
– Does not allow a separate “evolution” team around telemetry and analysis
Audit Log Use Cases
• Every login attempt (success or failure)
• Excessive login attempts and lockouts
• Blocking/blacklisting users, IP addresses, access ports
• Every logout
• Every modification to user table, including permissions
• All configuration changes
• Attempts to access restricted resources, APIs from unexpected paths
• All access to PII / PHI in an individually identifiable way
Audit Log Fields
• Date/time of event
• Machine name/instance
• Process ID
• User ID (possibly encrypted) / Session ID
• Type of event
• Success or failure of the event (if applicable)
• Seriousness of the event violation (if applicable)
• Message (free form)
• Stack Trace (if applicable)
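These fields map naturally onto a single record type. A Python sketch, assuming JSON as the storage format; the class and attribute names are hypothetical.

```python
import json
import os
import socket
from dataclasses import asdict, dataclass, field
from datetime import datetime, timezone
from typing import Optional


@dataclass
class AuditRecord:
    event_type: str
    user_id: str                        # possibly encrypted before it gets here
    session_id: str
    message: str                        # free form
    success: Optional[bool] = None      # if applicable
    severity: Optional[str] = None      # if applicable
    stack_trace: Optional[str] = None   # if applicable
    timestamp: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat())
    machine: str = field(default_factory=socket.gethostname)
    process_id: int = field(default_factory=os.getpid)

    def to_json(self):
        """Serialize the record as one JSON document."""
        return json.dumps(asdict(self))
```

Optional fields default to `None`, so the schema can grow without breaking existing callers.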
History and Activity Logging
History Logs
• Changes made to key tables
• Describes
– Who changed the record?
– From which application?
– Which fields changed?
• Need the ability to surface this to applications
– Sometimes to users
– Always to operations to solve problems
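A history entry that answers those three questions can be built by diffing the record before and after the change. A Python sketch; the function name and the dictionary layout are assumptions.

```python
from datetime import datetime, timezone


def history_entry(table, key, before, after, user, application):
    """Build a history record listing only the fields that changed."""
    changes = {
        name: {"old": before.get(name), "new": after.get(name)}
        for name in sorted(set(before) | set(after))
        if before.get(name) != after.get(name)
    }
    return {
        "table": table,
        "key": key,
        "changed_by": user,          # who changed the record
        "application": application,  # from which application
        "changed_at": datetime.now(timezone.utc).isoformat(),
        "changes": changes,          # which fields changed
    }
```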
Implement a History Log Helper
IHistoryLogger
HistoryLogger
HistoryLogger.Current.Write();
History Logs
DocumentDB
Users
Orders
Claims
…
Wrap History in the DAL
History Logs
OrdersDal
UsersDal
ContentDal
Relational DB
Users
Orders
Claims
Content
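Wrapping history in the DAL means every write path records its own change, so callers cannot forget. A Python sketch with hypothetical classes; a dict stands in for the relational Orders table and a list for the history store.

```python
class HistoryLogger:
    """Hypothetical history helper; a list stands in for the document store."""

    def __init__(self):
        self.entries = []

    def write(self, entry):
        self.entries.append(entry)


class OrdersDal:
    """Hypothetical DAL; a dict stands in for the relational Orders table."""

    def __init__(self, history, application):
        self._rows = {}
        self._history = history
        self._application = application

    def get(self, order_id):
        return dict(self._rows.get(order_id, {}))

    def save(self, order_id, values, user):
        before = self._rows.get(order_id, {})
        after = {**before, **values}
        changed = sorted(k for k in after if before.get(k) != after[k])
        self._rows[order_id] = after
        if changed:
            # The DAL writes history itself, so no caller can skip it.
            self._history.write({
                "table": "Orders",
                "key": order_id,
                "changed_by": user,
                "application": self._application,
                "fields": changed,
            })
```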
What happened with my order?
History Logs
OrdersDal
UsersDal
ContentDal
Relational DB
Users
Orders
Claims
Content
Activity Logs
• Not specific to code execution and troubleshooting, diagnostics
• Specific to the application, user activity
• COULD be informative to users as well
– History of recent activity in the site
– Reports they requested, downloads, other…
• Provides insights to the business regarding user activity, trends and patterns
– Non-critical analysis
Implement an Activity Log Helper
IActivityLogger
ActivityLogger
ActivityLogger.Current.UserDownload();
ActivityLogger.Current.ReportRequest();
ActivityLogger.Current.PurchaseOrder();
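A Python sketch of an activity helper with one method per business activity. The slide shows the calls without arguments, so every parameter below is an assumption, as is the `recent()` lookup for surfacing activity back to the user.

```python
from datetime import datetime, timezone


class ActivityLogger:
    """Hypothetical activity helper; a list stands in for the document store."""

    def __init__(self):
        self.events = []

    def _record(self, activity, user_id, **details):
        self.events.append({
            "time": datetime.now(timezone.utc).isoformat(),
            "activity": activity,
            "user_id": user_id,
            "details": details,
        })

    def user_download(self, user_id, file_name):
        self._record("user_download", user_id, file_name=file_name)

    def report_request(self, user_id, report_name):
        self._record("report_request", user_id, report_name=report_name)

    def purchase_order(self, user_id, order_id):
        self._record("purchase_order", user_id, order_id=order_id)

    def recent(self, user_id, limit=10):
        """Most recent activity first, e.g. for a 'recent activity' page."""
        mine = [e for e in self.events if e["user_id"] == user_id]
        return mine[-limit:][::-1]
```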
Activity Logs
DocumentDB
What happened with my order?
History Logs
OrdersDal
Relational DB
Orders
Activity Logs
Automate Logging Where Possible
• View controllers
• API controllers
• Authorization hooks
• Outbound calls
• Data Access layers
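Automating logging at these hook points usually means wrapping the call rather than editing each method. A Python sketch using a decorator; the `calls` list stands in for the real logger, and `get_order` is a made-up example.

```python
import functools
import time

calls = []  # assumption: a list stands in for the real logger


def logged(func):
    """Log name, outcome and duration of every call to func."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        started = time.perf_counter()
        outcome = "ok"
        try:
            return func(*args, **kwargs)
        except Exception as ex:
            outcome = "error: " + type(ex).__name__
            raise  # the caller still sees the original exception
        finally:
            calls.append({
                "name": func.__name__,
                "outcome": outcome,
                "seconds": time.perf_counter() - started,
            })
    return wrapper


@logged
def get_order(order_id):
    if order_id < 0:
        raise ValueError("bad id")
    return {"id": order_id}
```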
To Queue or NOT to Queue
Event Logs Audit Logs Activity Logs History Logs
Loggers
Client and Server Logging
WebBrowsers
MobileApps
ClientApps
Mobile API Client API Log API Client API Log API
What can I queue?
Event Logs Audit Logs Activity Logs History Logs
Loggers
ETW
DocDB
ETW Goal
Event Logs Audit Logs Activity Logs History Logs
Loggers
ETW
History Publisher
Activity Publisher
Audit Publisher
Events Publisher
Stream Analytics
ALERTS
Queued Logging
• Considerations
– Timestamps matter
– Correlation across nodes matters (to a point)
– Guaranteed exactly-once, in-order delivery doesn’t exist
– Async is good (mostly)
• That said
– Priority matters (hot, warm, default)
– Simplicity matters
– Throughput matters
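A Python sketch of a queued logger that reflects these points: entries are timestamped when enqueued, carry a correlation ID, and drain asynchronously in priority order. The class, the numeric priority mapping and the `sink` callable are assumptions.

```python
import itertools
import queue
import threading
import uuid
from datetime import datetime, timezone

PRIORITY = {"hot": 0, "warm": 1, "default": 2}  # lower number drains first


class QueuedLogger:
    def __init__(self, sink):
        self._queue = queue.PriorityQueue()
        self._order = itertools.count()  # tie-breaker: FIFO within a priority
        self._sink = sink                # any callable that persists one entry
        threading.Thread(target=self._drain, daemon=True).start()

    def write(self, message, priority="default", correlation_id=None):
        entry = {
            # Stamped when the event happened, not when it is finally stored.
            "timestamp": datetime.now(timezone.utc).isoformat(),
            "correlation_id": correlation_id or str(uuid.uuid4()),
            "message": message,
        }
        self._queue.put((PRIORITY[priority], next(self._order), entry))

    def _drain(self):
        while True:
            _, _, entry = self._queue.get()
            try:
                self._sink(entry)
            except Exception:
                pass  # a failing sink must not kill the worker thread
            finally:
                self._queue.task_done()

    def flush(self):
        """Block until everything queued so far has been handed to the sink."""
        self._queue.join()
```

Because the worker drains while callers are still writing, ordering across priorities is best effort, which is one reason the timestamp is captured at enqueue time.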
Troubleshooting Is Important!
Problem Statement
• We need immediate access to what the HECK is going on when there is a problem
• Sometimes I use (in order):
– Google Analytics
– Event Logs (Azure Website)
– Table Storage queries (STRIKE THAT, USELESS)
– Blob storage CSVs (good enough, not realtime)
Elasticsearch Architecture
Elasticsearch
Logger AuditLogger HistoryLogger ActivityLogger
Kibana Visualization
LogStash
LogStash
Elasticsearch
Identity Server
Web Server / IIS / Event Logs
CPU / Memory Perf Counters
Blob CSVs …
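When loggers write to Elasticsearch directly, batching entries into a bulk request keeps socket overhead down. The Python sketch below only builds the newline-delimited JSON body that Elasticsearch's `_bulk` endpoint accepts; the index name is hypothetical, and sending the request is left out.

```python
import json


def bulk_payload(index, entries):
    """Build an NDJSON body for Elasticsearch's _bulk endpoint."""
    lines = []
    for entry in entries:
        # Each document is preceded by an action line naming the target index.
        lines.append(json.dumps({"index": {"_index": index}}))
        lines.append(json.dumps(entry))
    # The bulk body must end with a newline.
    return "\n".join(lines) + "\n"
```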
Archives, Aggregation and Analytics
ARCHIVE
Elasticsearch
Audit Logs
Activity Logs
History Logs
HDInsight
PowerShell
Spin up, analyze, spin down
Ingest
Blob Storage
Event Logs
OR, just…
What you’re looking for is…
• Manageable implementation
• Ability to “evolve” log content
• Reduce IO / socket overhead (monitor this)
• Prioritization
• Real-time analytics, troubleshooting
• Accessibility for UI lookups (history, activity)
• Archival and mass analysis
References
• Conference resources:
– http://michelebusta.com
• Contact me:
– @michelebusta
• Founder, CIO of Solliance
– http://solliance.net