NetCheck: Network Diagnoses from Blackbox Traces · 2019. 12. 30. · NetCheck: Network Diagnoses...
![Page 1: NetCheck: Network Diagnoses from Blackbox Traces · 2019. 12. 30. · NetCheck: Network Diagnoses from Blackbox Traces Yanyan Zhuang *^, Eleni Gessiou*, Fraida Fund*, Steven Portzer@,](https://reader035.fdocuments.us/reader035/viewer/2022062610/610ef52121eff43cab4739f2/html5/thumbnails/1.jpg)
NetCheck: Network Diagnoses from Blackbox Traces
Yanyan Zhuang*^, Eleni Gessiou*, Fraida Fund*, Steven Portzer@, Monzur Muhammad^,
Ivan Beschastnikh^, Justin Cappos*
(*) New York University, (^) University of British Columbia, (@) University of Washington
![Page 2: NetCheck: Network Diagnoses from Blackbox Traces · 2019. 12. 30. · NetCheck: Network Diagnoses from Blackbox Traces Yanyan Zhuang *^, Eleni Gessiou*, Fraida Fund*, Steven Portzer@,](https://reader035.fdocuments.us/reader035/viewer/2022062610/610ef52121eff43cab4739f2/html5/thumbnails/2.jpg)
Goal
• Find bugs in networked applications
• Large, complex, unknown applications
• Large, complex, unknown networks
• Understandable output / fix
2
![Page 3: NetCheck: Network Diagnoses from Blackbox Traces · 2019. 12. 30. · NetCheck: Network Diagnoses from Blackbox Traces Yanyan Zhuang *^, Eleni Gessiou*, Fraida Fund*, Steven Portzer@,](https://reader035.fdocuments.us/reader035/viewer/2022062610/610ef52121eff43cab4739f2/html5/thumbnails/3.jpg)
Motivation Apache Server
Chrome Client
3
![Page 4: NetCheck: Network Diagnoses from Blackbox Traces · 2019. 12. 30. · NetCheck: Network Diagnoses from Blackbox Traces Yanyan Zhuang *^, Eleni Gessiou*, Fraida Fund*, Steven Portzer@,](https://reader035.fdocuments.us/reader035/viewer/2022062610/610ef52121eff43cab4739f2/html5/thumbnails/4.jpg)
Motivation Apache Server
Chrome Client probing ping
4
![Page 5: NetCheck: Network Diagnoses from Blackbox Traces · 2019. 12. 30. · NetCheck: Network Diagnoses from Blackbox Traces Yanyan Zhuang *^, Eleni Gessiou*, Fraida Fund*, Steven Portzer@,](https://reader035.fdocuments.us/reader035/viewer/2022062610/610ef52121eff43cab4739f2/html5/thumbnails/5.jpg)
Motivation Apache Server
Chrome Client
probing ping
Different traffic (ICMP)
Often different result
5
![Page 6: NetCheck: Network Diagnoses from Blackbox Traces · 2019. 12. 30. · NetCheck: Network Diagnoses from Blackbox Traces Yanyan Zhuang *^, Eleni Gessiou*, Fraida Fund*, Steven Portzer@,](https://reader035.fdocuments.us/reader035/viewer/2022062610/610ef52121eff43cab4739f2/html5/thumbnails/6.jpg)
Motivation Apache Server
Chrome Client
6
![Page 7: NetCheck: Network Diagnoses from Blackbox Traces · 2019. 12. 30. · NetCheck: Network Diagnoses from Blackbox Traces Yanyan Zhuang *^, Eleni Gessiou*, Fraida Fund*, Steven Portzer@,](https://reader035.fdocuments.us/reader035/viewer/2022062610/610ef52121eff43cab4739f2/html5/thumbnails/7.jpg)
Motivation Apache Server
Chrome Client
packet capture
7
![Page 8: NetCheck: Network Diagnoses from Blackbox Traces · 2019. 12. 30. · NetCheck: Network Diagnoses from Blackbox Traces Yanyan Zhuang *^, Eleni Gessiou*, Fraida Fund*, Steven Portzer@,](https://reader035.fdocuments.us/reader035/viewer/2022062610/610ef52121eff43cab4739f2/html5/thumbnails/8.jpg)
Motivation Apache Server
Chrome Client
packet capture
Requires detailed protocol / app knowledge
8
![Page 9: NetCheck: Network Diagnoses from Blackbox Traces · 2019. 12. 30. · NetCheck: Network Diagnoses from Blackbox Traces Yanyan Zhuang *^, Eleni Gessiou*, Fraida Fund*, Steven Portzer@,](https://reader035.fdocuments.us/reader035/viewer/2022062610/610ef52121eff43cab4739f2/html5/thumbnails/9.jpg)
Motivation Apache Server
Chrome Client
9
![Page 10: NetCheck: Network Diagnoses from Blackbox Traces · 2019. 12. 30. · NetCheck: Network Diagnoses from Blackbox Traces Yanyan Zhuang *^, Eleni Gessiou*, Fraida Fund*, Steven Portzer@,](https://reader035.fdocuments.us/reader035/viewer/2022062610/610ef52121eff43cab4739f2/html5/thumbnails/10.jpg)
Motivation Apache Server
Chrome Client
Model apps: Magpie, Xtrace, Pip...
Model
Model
Model
10
![Page 11: NetCheck: Network Diagnoses from Blackbox Traces · 2019. 12. 30. · NetCheck: Network Diagnoses from Blackbox Traces Yanyan Zhuang *^, Eleni Gessiou*, Fraida Fund*, Steven Portzer@,](https://reader035.fdocuments.us/reader035/viewer/2022062610/610ef52121eff43cab4739f2/html5/thumbnails/11.jpg)
Motivation Apache Server
Chrome Client
Model apps: Magpie, Xtrace, Pip...
Model
Model
Need a model per application
11
![Page 12: NetCheck: Network Diagnoses from Blackbox Traces · 2019. 12. 30. · NetCheck: Network Diagnoses from Blackbox Traces Yanyan Zhuang *^, Eleni Gessiou*, Fraida Fund*, Steven Portzer@,](https://reader035.fdocuments.us/reader035/viewer/2022062610/610ef52121eff43cab4739f2/html5/thumbnails/12.jpg)
Motivation Apache Server
Chrome Client
12
![Page 13: NetCheck: Network Diagnoses from Blackbox Traces · 2019. 12. 30. · NetCheck: Network Diagnoses from Blackbox Traces Yanyan Zhuang *^, Eleni Gessiou*, Fraida Fund*, Steven Portzer@,](https://reader035.fdocuments.us/reader035/viewer/2022062610/610ef52121eff43cab4739f2/html5/thumbnails/13.jpg)
Motivation
Chrome Client
Network Config Analysis
Model & Config
Model & Config
Model & Config
Model & Config
13
Header Space Analysis, etc.
Apache Server
![Page 14: NetCheck: Network Diagnoses from Blackbox Traces · 2019. 12. 30. · NetCheck: Network Diagnoses from Blackbox Traces Yanyan Zhuang *^, Eleni Gessiou*, Fraida Fund*, Steven Portzer@,](https://reader035.fdocuments.us/reader035/viewer/2022062610/610ef52121eff43cab4739f2/html5/thumbnails/14.jpg)
Motivation Apache Server
Chrome Client
Network Config Analysis
Model & Config
Model & Config
Model & Config
Model & Config
Need detailed network knowledge HW + config
14
![Page 15: NetCheck: Network Diagnoses from Blackbox Traces · 2019. 12. 30. · NetCheck: Network Diagnoses from Blackbox Traces Yanyan Zhuang *^, Eleni Gessiou*, Fraida Fund*, Steven Portzer@,](https://reader035.fdocuments.us/reader035/viewer/2022062610/610ef52121eff43cab4739f2/html5/thumbnails/15.jpg)
Motivation Apache Server
Chrome Client ?
15
![Page 16: NetCheck: Network Diagnoses from Blackbox Traces · 2019. 12. 30. · NetCheck: Network Diagnoses from Blackbox Traces Yanyan Zhuang *^, Eleni Gessiou*, Fraida Fund*, Steven Portzer@,](https://reader035.fdocuments.us/reader035/viewer/2022062610/610ef52121eff43cab4739f2/html5/thumbnails/16.jpg)
NetCheck Apache Server
Chrome Client
programmer
programmer
16
![Page 17: NetCheck: Network Diagnoses from Blackbox Traces · 2019. 12. 30. · NetCheck: Network Diagnoses from Blackbox Traces Yanyan Zhuang *^, Eleni Gessiou*, Fraida Fund*, Steven Portzer@,](https://reader035.fdocuments.us/reader035/viewer/2022062610/610ef52121eff43cab4739f2/html5/thumbnails/17.jpg)
NetCheck Apache Server
Chrome Client
programmer
programmer
17
![Page 18: NetCheck: Network Diagnoses from Blackbox Traces · 2019. 12. 30. · NetCheck: Network Diagnoses from Blackbox Traces Yanyan Zhuang *^, Eleni Gessiou*, Fraida Fund*, Steven Portzer@,](https://reader035.fdocuments.us/reader035/viewer/2022062610/610ef52121eff43cab4739f2/html5/thumbnails/18.jpg)
NetCheck Apache Server
Chrome Client
Model Programmer’s Understanding
Deutsch’s Fallacies
programmer
programmer
18
![Page 19: NetCheck: Network Diagnoses from Blackbox Traces · 2019. 12. 30. · NetCheck: Network Diagnoses from Blackbox Traces Yanyan Zhuang *^, Eleni Gessiou*, Fraida Fund*, Steven Portzer@,](https://reader035.fdocuments.us/reader035/viewer/2022062610/610ef52121eff43cab4739f2/html5/thumbnails/19.jpg)
Outline
• Motivation
• NetCheck Overview
• Trace Ordering
• Network Model
• Fault Classification
• Results / Conclusion
19
![Page 20: NetCheck: Network Diagnoses from Blackbox Traces · 2019. 12. 30. · NetCheck: Network Diagnoses from Blackbox Traces Yanyan Zhuang *^, Eleni Gessiou*, Fraida Fund*, Steven Portzer@,](https://reader035.fdocuments.us/reader035/viewer/2022062610/610ef52121eff43cab4739f2/html5/thumbnails/20.jpg)
NetCheck overview
Application
Fail
Traces
NetCheck
Likely Faults
20
![Page 21: NetCheck: Network Diagnoses from Blackbox Traces · 2019. 12. 30. · NetCheck: Network Diagnoses from Blackbox Traces Yanyan Zhuang *^, Eleni Gessiou*, Fraida Fund*, Steven Portzer@,](https://reader035.fdocuments.us/reader035/viewer/2022062610/610ef52121eff43cab4739f2/html5/thumbnails/21.jpg)
NetCheck overview
Application
Traces
NetCheck
Likely Faults
ktrace strace
21
Fail
![Page 22: NetCheck: Network Diagnoses from Blackbox Traces · 2019. 12. 30. · NetCheck: Network Diagnoses from Blackbox Traces Yanyan Zhuang *^, Eleni Gessiou*, Fraida Fund*, Steven Portzer@,](https://reader035.fdocuments.us/reader035/viewer/2022062610/610ef52121eff43cab4739f2/html5/thumbnails/22.jpg)
NetCheck overview
Application
Traces
NetCheck
Likely Faults
Input: Host Traces
NetCheck components: Ordering Algorithm, Network Model, Diagnoses Engine
Labels between components: syscall, simulation result, simulation state, errors
Output: Diagnosis
22
![Page 23: NetCheck: Network Diagnoses from Blackbox Traces · 2019. 12. 30. · NetCheck: Network Diagnoses from Blackbox Traces Yanyan Zhuang *^, Eleni Gessiou*, Fraida Fund*, Steven Portzer@,](https://reader035.fdocuments.us/reader035/viewer/2022062610/610ef52121eff43cab4739f2/html5/thumbnails/23.jpg)
NetCheck overview
Application
Traces
NetCheck
Likely Faults
Network Configuration Issues
Traffic Statistics
Problem Detected
23
![Page 24: NetCheck: Network Diagnoses from Blackbox Traces · 2019. 12. 30. · NetCheck: Network Diagnoses from Blackbox Traces Yanyan Zhuang *^, Eleni Gessiou*, Fraida Fund*, Steven Portzer@,](https://reader035.fdocuments.us/reader035/viewer/2022062610/610ef52121eff43cab4739f2/html5/thumbnails/24.jpg)
Outline
• Motivation
• NetCheck Overview
• Trace Ordering
• Network Model
• Fault Classification
• Results / Conclusion
24
Traces (a) Trace Ordering
![Page 25: NetCheck: Network Diagnoses from Blackbox Traces · 2019. 12. 30. · NetCheck: Network Diagnoses from Blackbox Traces Yanyan Zhuang *^, Eleni Gessiou*, Fraida Fund*, Steven Portzer@,](https://reader035.fdocuments.us/reader035/viewer/2022062610/610ef52121eff43cab4739f2/html5/thumbnails/25.jpg)
Traces
Series of locally ordered system calls
Don’t want to modify apps or use a global clock
Gathered by strace, ktrace, systrace, truss, etc.
Call arguments and “return values”

socket() = 3
bind(3, …) = 0
listen(3, 1) = 0
accept(3, …) = 4
recv(4, "HTTP", …) = 4
close(4) = 0
25
Call arguments
Return values
Return buffer
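The simplified trace lines on this slide can be split into a call name, its arguments, and its return value. The following is a minimal sketch, assuming the slide's `name(args) = ret` notation rather than real strace or ktrace output; `Syscall`, `parse_call`, and the regular expression are hypothetical names introduced only for illustration.

```python
import re
from typing import NamedTuple, Optional


class Syscall(NamedTuple):
    name: str             # e.g. "recv"
    args: str             # raw argument text, left unparsed
    ret: int              # return value
    errno: Optional[str]  # e.g. "ECONNREFUSED", if the line carries one


# Assumed line shape: name(args) = ret[, ERRNO]  (the slide notation only)
_LINE = re.compile(r'^(\w+)\((.*)\)\s*=\s*(-?\d+)(?:,?\s+([A-Z0-9]+))?\s*$')


def parse_call(line: str) -> Syscall:
    """Parse one simplified trace line into its parts."""
    match = _LINE.match(line.strip())
    if match is None:
        raise ValueError(f"unrecognized trace line: {line!r}")
    name, args, ret, errno = match.groups()
    return Syscall(name, args, int(ret), errno)
```

For example, `parse_call('recv(4, "HTTP", ...) = 4')` yields the name `recv`, the raw argument text, and the return value 4. Real tracer output carries more detail (flags, error descriptions), which this sketch does not handle.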
![Page 26: NetCheck: Network Diagnoses from Blackbox Traces · 2019. 12. 30. · NetCheck: Network Diagnoses from Blackbox Traces Yanyan Zhuang *^, Eleni Gessiou*, Fraida Fund*, Steven Portzer@,](https://reader035.fdocuments.us/reader035/viewer/2022062610/610ef52121eff43cab4739f2/html5/thumbnails/26.jpg)
What we see is this:

Node A                          Node B
1. socket() = 3                 1. socket() = 3
2. bind(3, ...) = 0             2. connect(3, ...) = 0
3. listen(3, 1) = 0             3. send(3, "Hello", ...) = 5
4. accept(3, ...) = 4           4. close(3) = 0
5. recv(4, "Hello", ...) = 5
6. close(4) = 0

- one trace per host
- local order but no global order
Q: how do we reconstruct what really happened?
26
![Page 27: NetCheck: Network Diagnoses from Blackbox Traces · 2019. 12. 30. · NetCheck: Network Diagnoses from Blackbox Traces Yanyan Zhuang *^, Eleni Gessiou*, Fraida Fund*, Steven Portzer@,](https://reader035.fdocuments.us/reader035/viewer/2022062610/610ef52121eff43cab4739f2/html5/thumbnails/27.jpg)
What we want is this: the ground truth

A1. socket() = 3
B1. socket() = 3
A2. bind(3, ...) = 0
A3. listen(3, 1) = 0
B2. connect(3, ...) = 0
A4. accept(3, ...) = 4
B3. send(3, "Hello", ...) = 5
A5. recv(4, "Hello", ...) = 5
B4. close(3) = 0
A6. close(4) = 0
A B
27
![Page 28: NetCheck: Network Diagnoses from Blackbox Traces · 2019. 12. 30. · NetCheck: Network Diagnoses from Blackbox Traces Yanyan Zhuang *^, Eleni Gessiou*, Fraida Fund*, Steven Portzer@,](https://reader035.fdocuments.us/reader035/viewer/2022062610/610ef52121eff43cab4739f2/html5/thumbnails/28.jpg)
What we want is this: the ground truth

A1. socket() = 3
B1. socket() = 3
A2. bind(3, ...) = 0
A3. listen(3, 1) = 0
B2. connect(3, ...) = 0
A4. accept(3, ...) = 4
B3. send(3, "Hello", ...) = 5
A5. recv(4, "Hello", ...) = 5
B4. close(3) = 0
A6. close(4) = 0

Goal: find an equivalent interleaving
A B
28
![Page 29: NetCheck: Network Diagnoses from Blackbox Traces · 2019. 12. 30. · NetCheck: Network Diagnoses from Blackbox Traces Yanyan Zhuang *^, Eleni Gessiou*, Fraida Fund*, Steven Portzer@,](https://reader035.fdocuments.us/reader035/viewer/2022062610/610ef52121eff43cab4739f2/html5/thumbnails/29.jpg)
Observation 1: Order Equivalence

Node A                          Node B
1. socket() = 3                 1. socket() = 3
2. bind(3, ...) = 0             2. connect(3, ...) = 0
3. listen(3, 1) = 0             3. send(3, "Hello", ...) = 5
4. accept(3, ...) = 4           4. close(3) = 0
5. recv(4, "Hello", ...) = 5
6. close(4) = 0

- one trace per host
- local order but no global order
Q: how do we reconstruct what really happened?
The socket() calls are not visible to the other side
Some orders are equivalent!
29
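To make the equivalence point concrete, the sketch below enumerates every interleaving of the two example traces that preserves each host's local order, then groups interleavings that differ only in where the socket() calls fall. It is an illustration only: treating socket() as the sole call invisible to the remote side is an assumption taken from this slide, and `interleavings` and `visible` are hypothetical helper names.

```python
from itertools import combinations
from math import comb

A = ["socket", "bind", "listen", "accept", "recv", "close"]
B = ["socket", "connect", "send", "close"]


def interleavings(a, b):
    """Yield every merge of a and b that preserves each trace's local order."""
    n = len(a) + len(b)
    for b_slots in combinations(range(n), len(b)):
        slots = set(b_slots)
        ia = iter(("A", call) for call in a)
        ib = iter(("B", call) for call in b)
        yield tuple(next(ib) if i in slots else next(ia) for i in range(n))


def visible(order):
    """Drop calls the other host cannot observe (assumed: only socket())."""
    return tuple(call for call in order if call[1] != "socket")


all_orders = list(interleavings(A, B))
classes = {visible(order) for order in all_orders}

print(len(all_orders), comb(len(A) + len(B), len(B)))  # candidate orders
print(len(classes))                                    # orders up to equivalence
```

The first count is the binomial coefficient C(10, 4); ignoring the position of socket() alone already collapses many candidates into the same class, which is why the algorithm only needs to find one member of the right class.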
![Page 30: NetCheck: Network Diagnoses from Blackbox Traces · 2019. 12. 30. · NetCheck: Network Diagnoses from Blackbox Traces Yanyan Zhuang *^, Eleni Gessiou*, Fraida Fund*, Steven Portzer@,](https://reader035.fdocuments.us/reader035/viewer/2022062610/610ef52121eff43cab4739f2/html5/thumbnails/30.jpg)
Observation 2: Return Values Guide Ordering

Node A                          Node B
1. socket() = 3                 1. socket() = 3
2. bind(3, ...) = 0             2. connect(3, ...) = 0
3. listen(3, 1) = 0             3. send(3, "Hello", ...) = 5
4. accept(3, ...) = 4           4. close(3) = 0
5. recv(4, "Hello", ...) = 5
6. close(4) = 0

- one trace per host
- local order but no global order
Q: how do we reconstruct what really happened?
30
![Page 31: NetCheck: Network Diagnoses from Blackbox Traces · 2019. 12. 30. · NetCheck: Network Diagnoses from Blackbox Traces Yanyan Zhuang *^, Eleni Gessiou*, Fraida Fund*, Steven Portzer@,](https://reader035.fdocuments.us/reader035/viewer/2022062610/610ef52121eff43cab4739f2/html5/thumbnails/31.jpg)
Return values guide ordering

One valid ordering: all syscalls returned successfully.
A2. bind(3, ...) = 0
A3. listen(3, 1) = 0
B2. connect(3, ...) = 0

A second valid ordering: connect failed with ECONNREFUSED.
A2. bind(3, ...) = 0
B2. connect(3, ...) = -1, ECONNREFUSED
A3. listen(3, 1) = 0

A call’s return value may-depend-on a remote call’s action
Result indicates order of calls
31
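The two orderings above can be reproduced by a small rule that places the client's connect relative to the server's listen based only on the traced return value. This is a sketch under assumptions: the tuple layout and the function name `place_connect` are hypothetical, and for a refused connect it picks just one of the consistent positions (immediately before listen).

```python
from typing import List, Optional, Tuple

Call = Tuple[str, str, int, Optional[str]]  # (host, name, return value, errno)


def place_connect(server: List[Call], connect: Call) -> List[Call]:
    """Insert the client's connect into the server trace, guided by its result."""
    _, _, ret, errno = connect
    listen_idx = next(i for i, call in enumerate(server) if call[1] == "listen")
    if ret == 0:
        pos = listen_idx + 1  # success implies a listening socket already existed
    elif errno == "ECONNREFUSED":
        pos = listen_idx      # refusal is consistent with no listener yet
    else:
        raise ValueError("this return value does not constrain the order here")
    return server[:pos] + [connect] + server[pos:]


server_trace = [("A", "bind", 0, None), ("A", "listen", 0, None)]
ok = place_connect(server_trace, ("B", "connect", 0, None))
refused = place_connect(server_trace, ("B", "connect", -1, "ECONNREFUSED"))
print([call[1] for call in ok])
print([call[1] for call in refused])
```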
![Page 32: NetCheck: Network Diagnoses from Blackbox Traces · 2019. 12. 30. · NetCheck: Network Diagnoses from Blackbox Traces Yanyan Zhuang *^, Eleni Gessiou*, Fraida Fund*, Steven Portzer@,](https://reader035.fdocuments.us/reader035/viewer/2022062610/610ef52121eff43cab4739f2/html5/thumbnails/32.jpg)
Deciding call order
[Figure: graph of the full set of may-depend-on relations among socket, bind, listen, accept, connect, getsockopt, setsockopt, getsockname, getpeername, poll, select, recv, recvfrom, recvmsg, read, send, sendto, sendmsg, write, writev, sendfile, close, shutdown]
32
![Page 33: NetCheck: Network Diagnoses from Blackbox Traces · 2019. 12. 30. · NetCheck: Network Diagnoses from Blackbox Traces Yanyan Zhuang *^, Eleni Gessiou*, Fraida Fund*, Steven Portzer@,](https://reader035.fdocuments.us/reader035/viewer/2022062610/610ef52121eff43cab4739f2/html5/thumbnails/33.jpg)
Ordering Algorithm
33
Algorithm process
Input traces:
  A: socket, bind, listen, accept, recv
  B: socket, connect, send
Output ordering: (none yet)
![Page 34: NetCheck: Network Diagnoses from Blackbox Traces · 2019. 12. 30. · NetCheck: Network Diagnoses from Blackbox Traces Yanyan Zhuang *^, Eleni Gessiou*, Fraida Fund*, Steven Portzer@,](https://reader035.fdocuments.us/reader035/viewer/2022062610/610ef52121eff43cab4739f2/html5/thumbnails/34.jpg)
Ordering Algorithm
34
Try socket on host A: accepted
Algorithm process
Input traces:
  A: socket, bind, listen, accept, recv
  B: socket, connect, send
Output ordering: socket (A)
![Page 35: NetCheck: Network Diagnoses from Blackbox Traces · 2019. 12. 30. · NetCheck: Network Diagnoses from Blackbox Traces Yanyan Zhuang *^, Eleni Gessiou*, Fraida Fund*, Steven Portzer@,](https://reader035.fdocuments.us/reader035/viewer/2022062610/610ef52121eff43cab4739f2/html5/thumbnails/35.jpg)
Ordering Algorithm
35
Try connect on host B: rejected
Algorithm process
Input traces:
  A: listen, accept, recv
  B: connect, send
Output ordering: socket (A), socket (B), bind (A)
![Page 36: NetCheck: Network Diagnoses from Blackbox Traces · 2019. 12. 30. · NetCheck: Network Diagnoses from Blackbox Traces Yanyan Zhuang *^, Eleni Gessiou*, Fraida Fund*, Steven Portzer@,](https://reader035.fdocuments.us/reader035/viewer/2022062610/610ef52121eff43cab4739f2/html5/thumbnails/36.jpg)
Ordering Algorithm
36
Try listen on host A: accepted
Algorithm process
Input traces:
  A: listen, accept, recv
  B: connect, send
Output ordering: socket (A), socket (B), bind (A), listen (A)
![Page 37: NetCheck: Network Diagnoses from Blackbox Traces · 2019. 12. 30. · NetCheck: Network Diagnoses from Blackbox Traces Yanyan Zhuang *^, Eleni Gessiou*, Fraida Fund*, Steven Portzer@,](https://reader035.fdocuments.us/reader035/viewer/2022062610/610ef52121eff43cab4739f2/html5/thumbnails/37.jpg)
Ordering Algorithm
37
Try recv on host A: rejected
Algorithm process
Input traces:
  A: recv
  B: send
Output ordering: socket (A), socket (B), bind (A), listen (A), connect (B), accept (A)
TCP BUFFER: “”
recv data: “Hola!”
![Page 38: NetCheck: Network Diagnoses from Blackbox Traces · 2019. 12. 30. · NetCheck: Network Diagnoses from Blackbox Traces Yanyan Zhuang *^, Eleni Gessiou*, Fraida Fund*, Steven Portzer@,](https://reader035.fdocuments.us/reader035/viewer/2022062610/610ef52121eff43cab4739f2/html5/thumbnails/38.jpg)
Ordering Algorithm
38
Try send on host B: accepted
Algorithm process
Input traces:
  A: recv
  B: send
Output ordering: socket (A), socket (B), bind (A), listen (A), connect (B), accept (A), send (B)
TCP BUFFER: “”
recv data: “Hola!”
![Page 39: NetCheck: Network Diagnoses from Blackbox Traces · 2019. 12. 30. · NetCheck: Network Diagnoses from Blackbox Traces Yanyan Zhuang *^, Eleni Gessiou*, Fraida Fund*, Steven Portzer@,](https://reader035.fdocuments.us/reader035/viewer/2022062610/610ef52121eff43cab4739f2/html5/thumbnails/39.jpg)
Ordering Algorithm
39
Try send on host B: accepted
Algorithm process
Input traces:
  A: recv
Output ordering: socket (A), socket (B), bind (A), listen (A), connect (B), accept (A), send (B)
TCP BUFFER: “Hello”
recv data: “Hola!”
![Page 40: NetCheck: Network Diagnoses from Blackbox Traces · 2019. 12. 30. · NetCheck: Network Diagnoses from Blackbox Traces Yanyan Zhuang *^, Eleni Gessiou*, Fraida Fund*, Steven Portzer@,](https://reader035.fdocuments.us/reader035/viewer/2022062610/610ef52121eff43cab4739f2/html5/thumbnails/40.jpg)
Ordering Algorithm
40
Try recv on host A: Fatal Error
Algorithm process
Input traces:
  A: recv
Output ordering: socket (A), socket (B), bind (A), listen (A), connect (B), accept (A), send (B)
TCP BUFFER: “Hello”
recv data: “Hola!”
![Page 41: NetCheck: Network Diagnoses from Blackbox Traces · 2019. 12. 30. · NetCheck: Network Diagnoses from Blackbox Traces Yanyan Zhuang *^, Eleni Gessiou*, Fraida Fund*, Steven Portzer@,](https://reader035.fdocuments.us/reader035/viewer/2022062610/610ef52121eff43cab4739f2/html5/thumbnails/41.jpg)
Outline
• Motivation
• NetCheck Overview
• Trace Ordering
• Network Model
• Fault Classification
• Results / Conclusion
41
Model
Accept
Reject
Fatal Error
![Page 42: NetCheck: Network Diagnoses from Blackbox Traces · 2019. 12. 30. · NetCheck: Network Diagnoses from Blackbox Traces Yanyan Zhuang *^, Eleni Gessiou*, Fraida Fund*, Steven Portzer@,](https://reader035.fdocuments.us/reader035/viewer/2022062610/610ef52121eff43cab4739f2/html5/thumbnails/42.jpg)
Network Model
● Simulates invocation of a syscall
  ○ datagrams sent/lost
  ○ reordering / duplication is notable
  ○ track pending connections
  ○ buffer lengths and contents
  ○ send -> put data into buffer
  ○ recv -> pop data from buffer
● Simulation outcome
  ○ Accept → can process (correct buffer)
  ○ Reject → wrong order (incomplete buffer)
  ○ Permanent reject → abnormal behavior (incorrect buffer)
Model
Accept
Reject
Fatal Error
42
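The three simulation outcomes can be illustrated with a toy byte-stream buffer. This is a minimal sketch, not NetCheck's model: `TcpBufferModel` and `Outcome` are hypothetical names, only one direction of one connection is modeled, and the prefix checks are an assumed reading of "correct", "incomplete", and "incorrect" buffer.

```python
from enum import Enum


class Outcome(Enum):
    ACCEPT = "accept"      # can process (correct buffer)
    REJECT = "reject"      # wrong order (incomplete buffer)
    FATAL = "fatal error"  # abnormal behavior (incorrect buffer)


class TcpBufferModel:
    """Toy one-directional byte stream between a sender and a receiver."""

    def __init__(self) -> None:
        self.buffer = b""

    def send(self, data: bytes) -> Outcome:
        # send -> put data into buffer
        self.buffer += data
        return Outcome.ACCEPT

    def recv(self, data: bytes) -> Outcome:
        # recv -> pop data from buffer, if the traced data is at its front
        if self.buffer.startswith(data):
            self.buffer = self.buffer[len(data):]
            return Outcome.ACCEPT
        if data.startswith(self.buffer):
            return Outcome.REJECT  # the rest may still arrive: retry later
        return Outcome.FATAL       # no later send can make this match


model = TcpBufferModel()
early = model.recv(b"Hola!")     # nothing sent yet
model.send(b"Hello")
mismatch = model.recv(b"Hola!")  # buffer holds different bytes
match = model.recv(b"Hello")
print(early, mismatch, match)
```

With an empty buffer the recv is only rejected (it may succeed after a later send), while a buffer holding different bytes can never satisfy it, which is the distinction between Reject and Fatal Error.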
![Page 43: NetCheck: Network Diagnoses from Blackbox Traces · 2019. 12. 30. · NetCheck: Network Diagnoses from Blackbox Traces Yanyan Zhuang *^, Eleni Gessiou*, Fraida Fund*, Steven Portzer@,](https://reader035.fdocuments.us/reader035/viewer/2022062610/610ef52121eff43cab4739f2/html5/thumbnails/43.jpg)
Network Model
● Simulates invocation of a syscall
● Capture programmer assumptions
● Assumes a simplified network view
  • Assume transitive connectivity
  • Little, random loss
  • No middle boxes
  • Assume uniform platform
  • Flag OS differences
43
![Page 44: NetCheck: Network Diagnoses from Blackbox Traces · 2019. 12. 30. · NetCheck: Network Diagnoses from Blackbox Traces Yanyan Zhuang *^, Eleni Gessiou*, Fraida Fund*, Steven Portzer@,](https://reader035.fdocuments.us/reader035/viewer/2022062610/610ef52121eff43cab4739f2/html5/thumbnails/44.jpg)
● Blackbox Tracing mechanism
How Model Return Values Impact Trace Ordering
Trace Ordering: running time is linear in (total trace length) × (number of traces)
44
Input: Host Traces
NetCheck components: Ordering Algorithm, Network Model, Diagnoses Engine
Labels between components: syscall, simulation result, simulation state, errors
Output: Diagnosis
![Page 45: NetCheck: Network Diagnoses from Blackbox Traces · 2019. 12. 30. · NetCheck: Network Diagnoses from Blackbox Traces Yanyan Zhuang *^, Eleni Gessiou*, Fraida Fund*, Steven Portzer@,](https://reader035.fdocuments.us/reader035/viewer/2022062610/610ef52121eff43cab4739f2/html5/thumbnails/45.jpg)
Outline
• Motivation
• NetCheck Overview
• Trace Ordering
• Network Model
• Fault Classification
• Results / Conclusion
45
(c) Fault Classifier
Output
![Page 46: NetCheck: Network Diagnoses from Blackbox Traces · 2019. 12. 30. · NetCheck: Network Diagnoses from Blackbox Traces Yanyan Zhuang *^, Eleni Gessiou*, Fraida Fund*, Steven Portzer@,](https://reader035.fdocuments.us/reader035/viewer/2022062610/610ef52121eff43cab4739f2/html5/thumbnails/46.jpg)
Fault Classifier
● Goal: Decide what to output
● Problem: Show relevant information
● Fault classifier: global (rather than local) view
  ○ uncovers high-level patterns by extracting low-level features
  ○ Examples: middleboxes, non-transitive connectivity, MTU, mobility, network disconnection
  ○ All look like loss, but have different patterns in the context of other flows
46
![Page 47: NetCheck: Network Diagnoses from Blackbox Traces · 2019. 12. 30. · NetCheck: Network Diagnoses from Blackbox Traces Yanyan Zhuang *^, Eleni Gessiou*, Fraida Fund*, Steven Portzer@,](https://reader035.fdocuments.us/reader035/viewer/2022062610/610ef52121eff43cab4739f2/html5/thumbnails/47.jpg)
Fault Classifier
● Options to show different levels of detail
● Network admins / developers
  ● detailed info
● End users
  ● Classification
  ● Recommendations
Network Configuration Issues
Traffic Statistics
Problem Detected
47
![Page 48: NetCheck: Network Diagnoses from Blackbox Traces · 2019. 12. 30. · NetCheck: Network Diagnoses from Blackbox Traces Yanyan Zhuang *^, Eleni Gessiou*, Fraida Fund*, Steven Portzer@,](https://reader035.fdocuments.us/reader035/viewer/2022062610/610ef52121eff43cab4739f2/html5/thumbnails/48.jpg)
Outline
• Motivation
• NetCheck Overview
• Trace Ordering
• Network Model
• Fault Classification
• Results / Conclusion
48
![Page 49: NetCheck: Network Diagnoses from Blackbox Traces · 2019. 12. 30. · NetCheck: Network Diagnoses from Blackbox Traces Yanyan Zhuang *^, Eleni Gessiou*, Fraida Fund*, Steven Portzer@,](https://reader035.fdocuments.us/reader035/viewer/2022062610/610ef52121eff43cab4739f2/html5/thumbnails/49.jpg)
Evaluation: Production Application Bugs
● Reproduce reported bugs from bug trackers (Python, Apache, Ruby, Firefox, etc.)
  ○ A total of 71 bugs
  ○ Grouped into 23 categories
    ■ Virtualization incurred/portability bugs
    ■ SO_REUSEADDR behaves differently across OSes
    ■ accept inherit O_NONBLOCK
    ■ …
  ○ Correct analysis of >95% bugs
49
![Page 50: NetCheck: Network Diagnoses from Blackbox Traces · 2019. 12. 30. · NetCheck: Network Diagnoses from Blackbox Traces Yanyan Zhuang *^, Eleni Gessiou*, Fraida Fund*, Steven Portzer@,](https://reader035.fdocuments.us/reader035/viewer/2022062610/610ef52121eff43cab4739f2/html5/thumbnails/50.jpg)
Evaluation: Observed Network Faults
● Twenty faults observed in practice on a live network
  ○ MTU bug
    ■ Intermediary device
  ○ Port forward
    ■ Traffic sent to non-relevant addresses
  ○ Provide supplemental info
    ■ packet loss
    ■ buffers being closed with data in
  ○ 90% of cases correctly detected
50
![Page 51: NetCheck: Network Diagnoses from Blackbox Traces · 2019. 12. 30. · NetCheck: Network Diagnoses from Blackbox Traces Yanyan Zhuang *^, Eleni Gessiou*, Fraida Fund*, Steven Portzer@,](https://reader035.fdocuments.us/reader035/viewer/2022062610/610ef52121eff43cab4739f2/html5/thumbnails/51.jpg)
General Findings in Practice
● Middle boxes
  ○ Multiple unaccepted connections
    ■ client behind NAT in FTP
● TCP/UDP
  ■ non-transitive connectivity in VLC
● Complex failures
  ○ VirtualBox send data larger than buffer size
  ○ Pidgin returned IP different from bind
  ○ Skype NAT + close socket from a different thread
● Used on Seattle Testbed seattle.poly.edu
51
![Page 52: NetCheck: Network Diagnoses from Blackbox Traces · 2019. 12. 30. · NetCheck: Network Diagnoses from Blackbox Traces Yanyan Zhuang *^, Eleni Gessiou*, Fraida Fund*, Steven Portzer@,](https://reader035.fdocuments.us/reader035/viewer/2022062610/610ef52121eff43cab4739f2/html5/thumbnails/52.jpg)
NetCheck Performance Overhead
52
[Figure: NetCheck performance overhead for Firefox, Skype, Telnet, SSH, and VLC]
![Page 53: NetCheck: Network Diagnoses from Blackbox Traces · 2019. 12. 30. · NetCheck: Network Diagnoses from Blackbox Traces Yanyan Zhuang *^, Eleni Gessiou*, Fraida Fund*, Steven Portzer@,](https://reader035.fdocuments.us/reader035/viewer/2022062610/610ef52121eff43cab4739f2/html5/thumbnails/53.jpg)
Conclusion
Built and evaluated NetCheck, a tool to diagnose network failures in complex apps
● Key insights:
  ○ model the programmer’s misconceptions
  ○ relation between calls → reconstruct order
● NetCheck is effective
  ○ Everyday applications & networks
  ○ Real network / application bugs
  ○ No per-network knowledge
  ○ No per-application knowledge
Try it here: https://netcheck.poly.edu/
53
![Page 54: NetCheck: Network Diagnoses from Blackbox Traces · 2019. 12. 30. · NetCheck: Network Diagnoses from Blackbox Traces Yanyan Zhuang *^, Eleni Gessiou*, Fraida Fund*, Steven Portzer@,](https://reader035.fdocuments.us/reader035/viewer/2022062610/610ef52121eff43cab4739f2/html5/thumbnails/54.jpg)
Backup slides.
54
![Page 55: NetCheck: Network Diagnoses from Blackbox Traces · 2019. 12. 30. · NetCheck: Network Diagnoses from Blackbox Traces Yanyan Zhuang *^, Eleni Gessiou*, Fraida Fund*, Steven Portzer@,](https://reader035.fdocuments.us/reader035/viewer/2022062610/610ef52121eff43cab4739f2/html5/thumbnails/55.jpg)
What is NetCheck?
○ No app- or network-specific knowledge
○ No modification to apps/infrastructure
○ No synchronized global clock
● Blackbox Tracing mechanism (e.g., strace)
  ○ Reconstruct a plausible total ordering of syscall traces from multiple hosts
  ○ Uses simulation and captured state to identify network related issues
  ○ Map low-level issues to higher-level characterizations of failure
55
![Page 56: NetCheck: Network Diagnoses from Blackbox Traces · 2019. 12. 30. · NetCheck: Network Diagnoses from Blackbox Traces Yanyan Zhuang *^, Eleni Gessiou*, Fraida Fund*, Steven Portzer@,](https://reader035.fdocuments.us/reader035/viewer/2022062610/610ef52121eff43cab4739f2/html5/thumbnails/56.jpg)
● Blackbox Tracing mechanism
Diagnosis Model
Trace Ordering
Application-Agnostic Model
Collating Fault
Classifier
Call dependency
Traces
56
![Page 57: NetCheck: Network Diagnoses from Blackbox Traces · 2019. 12. 30. · NetCheck: Network Diagnoses from Blackbox Traces Yanyan Zhuang *^, Eleni Gessiou*, Fraida Fund*, Steven Portzer@,](https://reader035.fdocuments.us/reader035/viewer/2022062610/610ef52121eff43cab4739f2/html5/thumbnails/57.jpg)
● Blackbox Tracing mechanism
Diagnosis Model
Trace Ordering
Application-Agnostic Model
Collating Fault
Classifier
Call dependency
accept/reject/FE
Traces
57
![Page 58: NetCheck: Network Diagnoses from Blackbox Traces · 2019. 12. 30. · NetCheck: Network Diagnoses from Blackbox Traces Yanyan Zhuang *^, Eleni Gessiou*, Fraida Fund*, Steven Portzer@,](https://reader035.fdocuments.us/reader035/viewer/2022062610/610ef52121eff43cab4739f2/html5/thumbnails/58.jpg)
● Blackbox Tracing mechanism
Diagnosis Model
Trace Ordering
Application-Agnostic Model
Collating Fault
Classifier
Call dependency
accept/reject/FE
reject → reorder
Traces
Trace Ordering: linear running time
58
![Page 59: NetCheck: Network Diagnoses from Blackbox Traces · 2019. 12. 30. · NetCheck: Network Diagnoses from Blackbox Traces Yanyan Zhuang *^, Eleni Gessiou*, Fraida Fund*, Steven Portzer@,](https://reader035.fdocuments.us/reader035/viewer/2022062610/610ef52121eff43cab4739f2/html5/thumbnails/59.jpg)
Pseudocode and Analysis

push trace t_0 in stack s_0, …, trace t_{n-1} in stack s_{n-1}
while (s_0, …, s_{n-1}) not empty:                   # O(L)
    q = peek_stack(s_0, …, s_{n-1}); q.sort(priority)
    while True:                                      # Best case: O(1), worst case: O(n)
        if q empty: raise FatalError
        i_j = q.dequeue()
        outcome = model_simulate(i_j)
        if outcome == ACCEPT:
            ordered_trace.push(s_j.pop()); break
        elif outcome == REJECT: pass
        elif outcome == FatalError: raise FatalError

Overall: best case O(L), worst case O(n*L)
59
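The pseudocode on this slide can be turned into a small runnable sketch. Everything below is illustrative rather than NetCheck's implementation: `ToyModel` covers only one TCP connection and a handful of calls, and because the slide does not define `priority`, candidates are simply tried in host order (an assumption).

```python
from collections import deque
from enum import Enum
from typing import Dict, List, Optional, Tuple

Call = Tuple[str, str, Optional[bytes]]  # (host, syscall name, payload or None)


class Outcome(Enum):
    ACCEPT = "accept"
    REJECT = "reject"
    FATAL = "fatal error"


class FatalError(Exception):
    """Raised when no interleaving can explain the traces."""


class ToyModel:
    """Hypothetical model of one TCP connection; state changes only on ACCEPT."""

    def __init__(self) -> None:
        self.listening = False
        self.pending = False      # connect simulated, accept not yet
        self.established = False  # connect simulated at some point
        self.buffer = b""         # bytes sent but not yet received

    def simulate(self, call: Call) -> Outcome:
        _, name, data = call
        if name in ("socket", "bind", "close"):
            return Outcome.ACCEPT
        if name == "listen":
            self.listening = True
            return Outcome.ACCEPT
        if name == "connect":  # a successful connect needs a listener
            if not self.listening:
                return Outcome.REJECT
            self.pending = self.established = True
            return Outcome.ACCEPT
        if name == "accept":
            if not self.pending:
                return Outcome.REJECT
            self.pending = False
            return Outcome.ACCEPT
        payload = data or b""
        if name == "send":
            if not self.established:
                return Outcome.REJECT
            self.buffer += payload
            return Outcome.ACCEPT
        if name == "recv":
            if self.buffer.startswith(payload):
                self.buffer = self.buffer[len(payload):]
                return Outcome.ACCEPT
            if payload.startswith(self.buffer):
                return Outcome.REJECT  # data not (fully) sent yet
            return Outcome.FATAL       # buffer can never match
        return Outcome.FATAL           # call not covered by this toy model


def order_traces(traces: Dict[str, List[Call]]) -> List[Call]:
    stacks = {host: deque(calls) for host, calls in traces.items()}
    model = ToyModel()
    ordered: List[Call] = []
    while any(stacks.values()):
        # Candidate calls: the head of every non-empty trace, in host order.
        for call in [s[0] for s in stacks.values() if s]:
            outcome = model.simulate(call)
            if outcome is Outcome.ACCEPT:
                ordered.append(stacks[call[0]].popleft())
                break
            if outcome is Outcome.FATAL:
                raise FatalError(call)
        else:  # every candidate was rejected
            raise FatalError("no candidate call can be accepted")
    return ordered
```

Each accepted call removes one entry from a trace, so the outer loop runs once per traced call (L iterations), and each iteration tries at most one head per trace (n candidates), matching the O(L) best case and O(n*L) worst case on the slide. Because the candidate order here is an assumption, the resulting interleaving can differ from the one in the slide walkthrough while still being equivalent.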
![Page 60: NetCheck: Network Diagnoses from Blackbox Traces · 2019. 12. 30. · NetCheck: Network Diagnoses from Blackbox Traces Yanyan Zhuang *^, Eleni Gessiou*, Fraida Fund*, Steven Portzer@,](https://reader035.fdocuments.us/reader035/viewer/2022062610/610ef52121eff43cab4739f2/html5/thumbnails/60.jpg)
Pseudocode and Analysis

push trace t_0 in list s_0, …, trace t_{n-1} in list s_{n-1}
while (s_0, …, s_{n-1}) not empty:
    q = peek_stack(s_0, …, s_{n-1}); q.sort(priority)
    while True:
        if q empty: raise FatalError
        i_j = q.dequeue()
        outcome = model_simulate(i_j)
        if outcome == ACCEPT:
            ordered_trace.push(s_j.pop()); break
        elif outcome == REJECT: continue
        elif outcome == FatalError: raise FatalError

Accept → Traverse
Reject → Backtrack
60
![Page 61: NetCheck: Network Diagnoses from Blackbox Traces · 2019. 12. 30. · NetCheck: Network Diagnoses from Blackbox Traces Yanyan Zhuang *^, Eleni Gessiou*, Fraida Fund*, Steven Portzer@,](https://reader035.fdocuments.us/reader035/viewer/2022062610/610ef52121eff43cab4739f2/html5/thumbnails/61.jpg)
NetCheck input

Node A                          Node B
1. socket() = 3                 1. socket() = 3
2. bind(3, ...) = 0             2. connect(3, ...) = 0
3. listen(3, 1) = 0             3. send(3, "Hello", ...) = 5
4. accept(3, ...) = 4           4. close(3) = 0
5. recv(4, "Hello", ...) = 5
6. close(4) = 0

Syscall
61
![Page 62: NetCheck: Network Diagnoses from Blackbox Traces · 2019. 12. 30. · NetCheck: Network Diagnoses from Blackbox Traces Yanyan Zhuang *^, Eleni Gessiou*, Fraida Fund*, Steven Portzer@,](https://reader035.fdocuments.us/reader035/viewer/2022062610/610ef52121eff43cab4739f2/html5/thumbnails/62.jpg)
NetCheck input

Node A                          Node B
1. socket() = 3                 1. socket() = 3
2. bind(3, ...) = 0             2. connect(3, ...) = 0
3. listen(3, 1) = 0             3. send(3, "Hello", ...) = 5
4. accept(3, ...) = 4           4. close(3) = 0
5. recv(4, "Hello", ...) = 5
6. close(4) = 0

Syscall
62
![Page 63: NetCheck: Network Diagnoses from Blackbox Traces · 2019. 12. 30. · NetCheck: Network Diagnoses from Blackbox Traces Yanyan Zhuang *^, Eleni Gessiou*, Fraida Fund*, Steven Portzer@,](https://reader035.fdocuments.us/reader035/viewer/2022062610/610ef52121eff43cab4739f2/html5/thumbnails/63.jpg)
connect depends on listen

Order 1
A1 bind(3, ...) = 0
A2 listen(3, 5) = 0
B1 connect(3, ...) = 0

Order 2
A1 bind(3, ...) = 0
B1 connect(3, ...) = -1 ECONNREFUSED
A2 listen(3, 5) = 0

Order 3
B1 connect(3, ...) = -1 ECONNREFUSED
A1 bind(3, ...) = 0
A2 listen(3, 5) = 0
63
![Page 64: NetCheck: Network Diagnoses from Blackbox Traces · 2019. 12. 30. · NetCheck: Network Diagnoses from Blackbox Traces Yanyan Zhuang *^, Eleni Gessiou*, Fraida Fund*, Steven Portzer@,](https://reader035.fdocuments.us/reader035/viewer/2022062610/610ef52121eff43cab4739f2/html5/thumbnails/64.jpg)
Example Rules
● Middle boxes
  ○ Multiple unaccepted connections ⇒ client behind NAT in FTP
  ○ Missing connect on accepted connections → server behind NAT or port forwarding
  ○ Multiple connect non-standard failure → firewall filtering connections
  ○ Multiple connect to listening address get refused
  ○ Multiple non-blocking connect failure
  ○ Traffic sent to non-relevant addresses → NAT or 3rd party proxy/traffic forwarding
64
![Page 65: NetCheck: Network Diagnoses from Blackbox Traces · 2019. 12. 30. · NetCheck: Network Diagnoses from Blackbox Traces Yanyan Zhuang *^, Eleni Gessiou*, Fraida Fund*, Steven Portzer@,](https://reader035.fdocuments.us/reader035/viewer/2022062610/610ef52121eff43cab4739f2/html5/thumbnails/65.jpg)
Example fault classifier rules
● Middle boxes
  ○ Multiple unaccepted connections ⇒ client behind NAT in FTP
  ○ Missing connect on accepted connections → server behind NAT or port forwarding
  ○ Traffic sent to non-relevant addresses → NAT or 3rd party proxy/traffic forwarding
● TCP
  ○ select/poll timeout
  ○ send data after connection closed
65
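Rules of this kind can be pictured as a table that maps low-level features to higher-level diagnoses. The sketch below is only an illustration: the feature names, thresholds ("multiple" read as at least 2), and the `classify` function are assumptions; only the diagnosis texts come from the slide.

```python
from typing import Dict, List, Tuple

# (hypothetical feature name, assumed minimum count, diagnosis text from the slide)
RULES: List[Tuple[str, int, str]] = [
    ("unaccepted_connections", 2, "client behind NAT"),
    ("accepts_without_connect", 1, "server behind NAT or port forwarding"),
    ("traffic_to_nonrelevant_addresses", 1,
     "NAT or 3rd party proxy/traffic forwarding"),
]


def classify(features: Dict[str, int]) -> List[str]:
    """Return every diagnosis whose feature count reaches its threshold."""
    return [diagnosis
            for name, threshold, diagnosis in RULES
            if features.get(name, 0) >= threshold]


print(classify({"unaccepted_connections": 3}))
print(classify({"unaccepted_connections": 1, "accepts_without_connect": 1}))
```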
![Page 66: NetCheck: Network Diagnoses from Blackbox Traces · 2019. 12. 30. · NetCheck: Network Diagnoses from Blackbox Traces Yanyan Zhuang *^, Eleni Gessiou*, Fraida Fund*, Steven Portzer@,](https://reader035.fdocuments.us/reader035/viewer/2022062610/610ef52121eff43cab4739f2/html5/thumbnails/66.jpg)
Example rules (cont.)
● UDP
  ○ datagram sent/lost per connection
  ○ high datagram loss rate ⇒ non-transitive connectivity in VLC
● Misc
  ○ apps send data larger than default OS buffer size ⇒ bug report from VirtualBox bug tracker
  ○ returned IP different from bind ⇒ simultaneous net disconnect/reconnect in Pidgin
  ○ Skype attempted to close socket from a different thread
66
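The UDP rule relies on per-connection datagram counts. A minimal sketch of that bookkeeping follows, assuming the sent and received counts per (source, destination) pair have already been extracted from the traces; the function names and the 0.5 threshold are hypothetical.

```python
from typing import Dict, List, Tuple

Flow = Tuple[str, str]  # (source host, destination host)


def loss_rates(sent: Dict[Flow, int], received: Dict[Flow, int]) -> Dict[Flow, float]:
    """Fraction of datagrams lost on every flow that sent at least one."""
    return {flow: 1 - received.get(flow, 0) / count
            for flow, count in sent.items() if count > 0}


def lossy_flows(rates: Dict[Flow, float], threshold: float = 0.5) -> List[Flow]:
    """Flows whose loss rate reaches the (assumed) threshold."""
    return sorted(flow for flow, rate in rates.items() if rate >= threshold)


sent = {("A", "B"): 10, ("B", "C"): 10, ("A", "C"): 10}
received = {("A", "B"): 10, ("B", "C"): 9, ("A", "C"): 0}
rates = loss_rates(sent, received)
print(lossy_flows(rates))
```

In this made-up example A reaches B and B reaches C, yet everything from A to C is lost, which is the kind of pattern the slide associates with non-transitive connectivity.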
![Page 67: NetCheck: Network Diagnoses from Blackbox Traces · 2019. 12. 30. · NetCheck: Network Diagnoses from Blackbox Traces Yanyan Zhuang *^, Eleni Gessiou*, Fraida Fund*, Steven Portzer@,](https://reader035.fdocuments.us/reader035/viewer/2022062610/610ef52121eff43cab4739f2/html5/thumbnails/67.jpg)
Evaluation: Everyday Applications
● FTP
  ○ All reverse connections from server lost
    ■ Client behind NAT
● Pidgin
  ○ getsockname returns different IP
    ■ Client poor connection results in IP changes
● Skype
  ○ Poor call quality, msg drop
    ■ Network delay, NAT
    ■ Skype closes socket from different thread
● VLC
  ○ Packet loss
    ■ Non-transitive connectivity issue
67