Modern Radiology for Distributed Systems
Dietrich Featherston
@d2fn
Thursday, October 11, 12
This is a talk about monitoring
But not just any kind of monitoring
Non-invasive monitoring
non-invasive monitoring
measures taken to describe the state of a system with minimal changes to the system being monitored
[Chart: Insight vs. Invasiveness, with Radiographic Imagery plotted]
preventative care
measures taken to prevent diseases or injuries rather than curing them or treating their symptoms
Non-invasive monitoring techniques focus primarily on host-based metrics
Why is this a problem?
Because applications are distributed
Information emitted about nodes in the network: n
Information emitted about edges in the network: n²
(n = network size)
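To put rough numbers on the gap between node-level and edge-level information, here is a minimal sketch. It assumes one metric stream per host and one per ordered pair of distinct hosts; the class and method names are illustrative and not from the talk, and n(n-1) stands in for the slide's n² order of growth.

```java
final class TelemetryScale {
    // Host-based monitoring: one stream per node.
    static long nodeStreams(long n) {
        return n;
    }

    // Edge-based monitoring: one stream per ordered (source, destination)
    // pair of distinct nodes, which grows on the order of n squared.
    static long edgeStreams(long n) {
        return n * (n - 1);
    }

    public static void main(String[] args) {
        for (long n : new long[] {10, 100, 1000}) {
            System.out.printf("n=%d nodes=%d edges=%d%n",
                    n, nodeStreams(n), edgeStreams(n));
        }
    }
}
```

At 1,000 hosts the host view has 1,000 streams while the edge view has 999,000, which is why host-based metrics alone describe less and less of the system as it grows.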
We analyze cell structure because we can’t envision the whole organism
We react to disease and injury because we lack preventative care
We lack preventative care for applications because our non-invasive monitoring techniques are growing less and less meaningful
Radiology is useful in illuminating non-invasive monitoring of distributed systems
Context is everything
How do we use context?
Context
Your Big Dumb Data
!!!
Human brain + med school
Radiographic Imagery
Diagnoses
Signal Processing
VLA Output
E.T.
Network Data
Application Behavior
Application Topology
Signal Processing
Expert Brain
dimensions (11): epoch seconds, epoch minutes, epoch hours, node id, source ip, source port, dest ip, dest port, interface, country, network/asn
measurements (8): egress packets, egress octets, ingress packets, ingress octets, retransmits, errors, app-rtt, handshake-rtt
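The three epoch dimensions suggest the same timestamp kept at several resolutions, so that measurements can be rolled up per second, per minute, or per hour. The slides do not say how these values are derived; the sketch below assumes plain truncation of epoch seconds, and the class and method names are hypothetical.

```java
final class EpochBuckets {
    // Assumption: "epoch minutes" is the number of whole minutes since the epoch.
    static long toEpochMinutes(long epochSeconds) {
        return Math.floorDiv(epochSeconds, 60L);
    }

    // Assumption: "epoch hours" is the number of whole hours since the epoch.
    static long toEpochHours(long epochSeconds) {
        return Math.floorDiv(epochSeconds, 3600L);
    }

    public static void main(String[] args) {
        long sample = 7325L; // 2 hours, 2 minutes, 5 seconds after the epoch
        System.out.println(toEpochMinutes(sample)); // 122
        System.out.println(toEpochHours(sample));   // 2
    }
}
```

Samples that share a bucket value can then have their measurements (packets, octets, retransmits, and so on) summed to produce a coarser time series.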
Case Study #1
GC-Death of a distributed JVM application
Case Study #2
Symptoms:
- Latent Riak handoff
- Cluster throughput bottoming out
busy_dist_port
+zdbbl 8192
Case Study #3
Bringing a dead Riak node back online
Case Study #4
Retransmits 10% of total network throughput
var put: HttpPut = null
try {
  // ... put data
} catch {
  case e: Exception =>
    // ... handle exception
} finally {
  if (put != null) {
    put.abort()
  }
}
abort
public void abort()
Description copied from interface: HttpUriRequest
Aborts execution of the request.
Source: http://hc.apache.org/httpcomponents-client-ga/httpclient/apidocs/org/apache/http/client/methods/HttpRequestBase.html#abort()
THANKS
public void abort() {
    ClientConnectionRequest localRequest;
    ConnectionReleaseTrigger localTrigger;

    this.abortLock.lock();
    try {
        if (this.aborted) {
            return;
        }
        this.aborted = true;

        localRequest = connRequest;
        localTrigger = releaseTrigger;
    } finally {
        this.abortLock.unlock();
    }

    // Trigger the callbacks outside of the lock, to prevent
    // deadlocks in the scenario where the callbacks have
    // their own locks that may be used while calling
    // setReleaseTrigger or setConnectionRequest.
    if (localRequest != null) {
        localRequest.abortRequest();
    }
    if (localTrigger != null) {
        try {
            localTrigger.abortConnection();
        } catch (IOException ex) {
            // ignore
        }
    }
}
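The abort() source above ends in abortConnection(), so one reading of this case study is that calling abort() in a finally block gives up the connection after every request, successful or not. The slides do not spell out the mechanism, so the following is only a toy model of that reading. ToyPool and its methods are hypothetical and are not the HttpClient API.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Toy model of a connection pool; not the HttpClient API. */
final class ToyPool {
    private final Deque<Integer> idle = new ArrayDeque<>();
    private int opened = 0; // connections opened over the pool's lifetime

    /** Reuse an idle connection if one exists, otherwise open a new one. */
    int lease() {
        Integer conn = idle.poll();
        if (conn != null) {
            return conn;
        }
        opened++;
        return opened;
    }

    /** Normal completion: hand the connection back for reuse. */
    void release(int conn) {
        idle.push(conn);
    }

    /** Abort: the connection is torn down and never returns to the pool. */
    void abort(int conn) {
        // intentionally dropped
    }

    int opened() {
        return opened;
    }

    /** Runs sequential requests and reports how many connections were opened. */
    static int connectionsOpened(int requests, boolean alwaysAbort) {
        ToyPool pool = new ToyPool();
        for (int i = 0; i < requests; i++) {
            int conn = pool.lease();
            try {
                // ... send the request on conn
            } finally {
                if (alwaysAbort) {
                    pool.abort(conn);   // the pattern from the slide
                } else {
                    pool.release(conn); // return the connection when done
                }
            }
        }
        return pool.opened();
    }

    public static void main(String[] args) {
        System.out.println("abort in finally: " + connectionsOpened(1000, true));
        System.out.println("release when done: " + connectionsOpened(1000, false));
    }
}
```

In this model, 1,000 sequential requests open 1,000 connections when every request is aborted and a single connection when each one is released. That kind of connection churn would be visible in network data even though each individual request appears to succeed.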
augmented intelligence precedes artificial intelligence
1895
Wilhelm Röntgen discovers X-rays
First medical use of X-rays in human imaging takes place one month later
1905
First English text on chest radiography
1920
Society of Radiographers formed
Recognition of radiology as a formal medical discipline was a cultural problem, not a technology problem
http://www.bshr.org.uk/page13.html
If you want to talk about the query language used to ask questions of the network data we collect at Boundary, talk to me afterward or hit me up on Twitter.
@d2fn
github.com/dietrichf
Find 45 minutes of total traffic seen on meters 1, 2, 226, & 301, starting 18 hours ago, broken down by peer ip; retain the top 10 by the ratio of retransmits to packets

get volume_1s_meter_ip [
  meter in {1, 2, 226, 301};
  epochMillis from -18h for 45m;
]
categorize
  sum(ingress) as ingress,
  sum(egress) as egress,
  sum(ingressPackets + egressPackets) as packets,
  sum(retransmits) as retransmits,
  mean(appRttUsec/1000) as appRttMs
by epochMillis, ip
retain top 10 per epochMillis on retransmits/packets
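For readers unfamiliar with the query language, the categorize and retain steps can be sketched over in-memory data. This is not Boundary's query engine: the Sample type, the method names, and the demo values are all hypothetical, and the sketch simplifies the query to a single time bucket rather than a top 10 per epochMillis.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class TopRetransmitters {
    /** One observation for a peer ip in a single time bucket (hypothetical shape). */
    static final class Sample {
        final String ip;
        final long packets;
        final long retransmits;

        Sample(String ip, long packets, long retransmits) {
            this.ip = ip;
            this.packets = packets;
            this.retransmits = retransmits;
        }
    }

    // totals[0] = packets, totals[1] = retransmits; zero packets yields ratio 0.
    static double ratio(long[] totals) {
        return totals[0] == 0 ? 0.0 : (double) totals[1] / totals[0];
    }

    /** Sums per ip ("categorize ... by ip"), then keeps the n highest ratios ("retain top n"). */
    static List<String> topByRetransmitRatio(List<Sample> samples, int n) {
        Map<String, long[]> totals = new HashMap<>();
        for (Sample s : samples) {
            long[] t = totals.computeIfAbsent(s.ip, k -> new long[2]);
            t[0] += s.packets;
            t[1] += s.retransmits;
        }
        List<String> ips = new ArrayList<>(totals.keySet());
        Comparator<String> byRatio =
                Comparator.comparingDouble(ip -> ratio(totals.get(ip)));
        // Highest ratio first; ties broken by ip so the result is deterministic.
        ips.sort(byRatio.reversed().thenComparing(Comparator.naturalOrder()));
        return new ArrayList<>(ips.subList(0, Math.min(n, ips.size())));
    }

    static List<Sample> demoSamples() {
        return Arrays.asList(
                new Sample("10.0.0.1", 1000, 100),
                new Sample("10.0.0.2", 1000, 5),
                new Sample("10.0.0.1", 1000, 50),
                new Sample("10.0.0.3", 200, 40));
    }

    public static void main(String[] args) {
        System.out.println(topByRetransmitRatio(demoSamples(), 2));
    }
}
```

With the demo values, 10.0.0.3 has a ratio of 40/200 = 0.2 and 10.0.0.1 has 150/2000 = 0.075, so those two peers are retained in that order.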