New TC SW datapath: a performance analysis
vger.kernel.org/lpc_net2018_talks/TC_SW_datapath_a... · 2018. 11. 5.
Paolo Abeni, Davide Caratti, Eelco Chaudron, Marcelo Ricardo Leitner - Red Hat
LPC, Vancouver 2018
TC SW datapath: a performance analysis
Outline
● A history of 2 datapaths
● The testing scenario
● Performance analysis, recent and current status
● Will eBPF save the world?
Why do we need 2 in-kernel OVS datapaths?
● “Old” kernel OVS datapath
○ First “fast” OVS datapath implementation
○ Feature-complete
● TC S/W:
○ Created to allow for H/W offload
○ Considered slower
○ Lacks some features - conntrack
The PVP test scenario
[Figure: PVP test topology, with labels DUT, NIC, OVS, Loopback VM, Virtio NIC, testpmd, Traffic Generator]
Let’s see the numbers
How about scaling?
Why are we so slow? Can perf tell us?
Topmost perf offenders for vhost (1 queue)
 6.68% _raw_spin_lock
 5.51% tun_get_user
 5.20% vhost_get_vq_desc
 5.17% masked_flow_lookup
 4.82% ixgbe_xmit_frame_ring
 3.80% translate_desc
 3.41% __skb_flow_dissect
 3.34% tun_do_read
 3.33% iov_iter_advance
 2.83% __slab_free
 2.55% _copy_to_iter
 2.46% page_frag_free
Topmost perf offenders for vhost (16 queues)
13.76% tun_do_read
 8.08% __slab_free
 7.79% vhost_get_vq_desc
 7.26% _copy_to_iter
 7.23% page_frag_free
 5.95% __check_object_size
 5.86% handle_rx
 4.26% vhost_net_buf_peek
 3.69% translate_desc
 3.42% iov_iter_advance
 2.84% skb_release_head_state
 2.65% tun_recvmsg
Not entirely obvious...
More help from perf
vhost forwarding is asymmetric! The fix is simple: apply the same limits to both handle_rx() and handle_tx(). Implemented in the 4.18 release cycle.
Call-graph accounting for vhost (16 queues)
100.00% vhost-<pid>
   |
   |--87.85%--ret_from_fork
   |          kthread
   |          vhost_worker
   |          |
   |          |--79.25%--handle_rx
   |          |          |
   |          |          |--55.45%--tun_recvmsg
   |          |          [...]
   |          |
   |           --8.60%--handle_tx
   |                     |
   |                     |--7.42%--tun_sendmsg
   |                     [...]
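To see why unequal limits skew the worker's time, here is a minimal toy model in Python. It is only a sketch: it assumes a single worker that alternates between an rx and a tx handler, both always backlogged, and each handler stops after a per-invocation packet limit. The function name and the limit values are hypothetical and are not taken from the vhost code.

```python
def worker_share(rx_limit, tx_limit, total_work=100_000):
    """Toy model: one worker alternates rx and tx handlers.

    Both directions are assumed to be permanently backlogged, so each
    handler invocation always consumes its full per-invocation limit
    (hypothetical values, not the real vhost constants).
    Returns the number of packets handled per direction.
    """
    handled = {"rx": 0, "tx": 0}
    remaining = total_work
    while remaining > 0:
        for direction, limit in (("rx", rx_limit), ("tx", tx_limit)):
            batch = min(limit, remaining)
            handled[direction] += batch
            remaining -= batch
    return handled


if __name__ == "__main__":
    for rx_limit, tx_limit in ((512, 64), (64, 64)):
        handled = worker_share(rx_limit, tx_limit)
        total = handled["rx"] + handled["tx"]
        print(f"rx_limit={rx_limit} tx_limit={tx_limit} "
              f"rx share={handled['rx'] / total:.1%}")
```

With a larger rx limit the rx handler takes most of the worker's budget, which is the same shape as the handle_rx versus handle_tx split in the call graph. With equal limits the two directions get roughly equal shares.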
Did we improve?
Can we do any better?
Topmost perf offenders for vhost on Linux 4.18
4.18 with OVS backend
 7.50% masked_flow_lookup
 5.02% ixgbe_xmit_frame_ring
 4.20% vhost_get_vq_desc
 3.89% iov_iter_advance
 3.18% translate_desc
 2.92% pfifo_fast_dequeue
 2.83% tun_build_skb.isra.57
 2.81% tun_get_user
 2.09% __dev_queue_xmit
 1.99% handle_tx
 1.93% _copy_to_iter
With TC backend
 5.08% ixgbe_xmit_frame_ring
 4.63% vhost_get_vq_desc
 4.37% skb_release_data
 3.86% translate_desc
 3.00% iov_iter_advance
 2.94% tun_get_user
 2.89% __skb_flow_dissect
 2.76% memcmp
 2.54% pfifo_fast_dequeue
 2.17% rhashtable_jhash2
Again, not entirely obvious… let’s look towards the bottom...
 [...]
 0.71% skb_clone
Killing bad clones
● In the TC S/W datapath, packets are forwarded via the TC act_mirred action
- It clones the skb and returns a control action. The caller then acts on the original skb accordingly
- The TC S/W datapath uses DROP as the control action, so we can avoid the clone and forward the original skb directly
- Implemented in the 4.19 release cycle
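The clone-avoidance idea can be sketched with a small userspace model in Python. This is not the kernel code: `Packet`, `Stats`, `mirred`, `classify` and the `CONSUMED` marker are hypothetical names chosen for illustration, and the model only counts how many clones and frees each variant performs.

```python
from dataclasses import dataclass, field

# Control actions of the toy model. CONSUMED is a hypothetical marker
# meaning "the action took ownership of the original packet".
DROP, PIPE, CONSUMED = "drop", "pipe", "consumed"


@dataclass
class Packet:
    data: bytes

    def clone(self):
        return Packet(self.data)


@dataclass
class Stats:
    clones: int = 0
    frees: int = 0
    sent: list = field(default_factory=list)


def mirred(pkt, control, stats, reuse_original):
    """Forward pkt; clone only when the caller still needs the original."""
    if reuse_original and control == DROP:
        # The caller would drop the original anyway: forward it directly.
        stats.sent.append(pkt)
        return CONSUMED
    stats.clones += 1
    stats.sent.append(pkt.clone())
    return control


def classify(pkt, control, stats, reuse_original):
    """Caller side: act on the original packet according to the verdict."""
    verdict = mirred(pkt, control, stats, reuse_original)
    if verdict == DROP:
        stats.frees += 1  # the original is freed after its clone was sent
    return verdict
```

In this model, forwarding with DROP as the control action costs one clone plus one free per packet when `reuse_original` is false, and neither when it is true. Any other control action still needs the clone, because the caller keeps using the original.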
The current status
Things intentionally omitted - so far
● With a more complex ruleset, TC will hit a greater retpoline overhead
○ “listification” is hard to apply here and will not help with many flows
● Some specific TC actions do not scale well (per-action spin_lock)
○ Removal is a WIP - thanks to Davide Caratti
● We could almost double the throughput using 2 vhost threads per virtio_net queue (rx and tx)
○ That is almost equivalent to using multiple virtio_net queues
● Still far away from line rate and carrier-grade requirements (15x), less far from bypass solutions (3x)
Will eBPF save us?
● OVS support for XDP is under development. Is that a game changer? Let’s perf it - almost.
○ Use a simple XDP program that parses each ingress packet up to L3 and forwards it using a user-configured map
○ Nowhere near a complete solution, but hopefully an upper bound on what we should expect from ovs-XDP
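The logic of such a test program can be modelled in a few lines of userspace Python. This is only a sketch of the parse-and-lookup flow, not the XDP program used for the measurements: the map is assumed to be keyed on the destination IPv4 address and to hold an egress port number, and `forward`, `build_frame` and the verdict labels are hypothetical names.

```python
import ipaddress
import struct

ETH_HLEN = 14          # Ethernet header length
ETH_P_IP = 0x0800      # EtherType for IPv4
IPV4_MIN_HLEN = 20     # IPv4 header without options
# Verdict labels modelled on XDP return codes (plain strings here).
XDP_PASS, XDP_REDIRECT = "pass", "redirect"


def forward(frame: bytes, fwd_map: dict):
    """Parse a frame up to L3 and pick an egress port from fwd_map.

    fwd_map is assumed to map destination IPv4Address -> port number.
    Anything that cannot be parsed or matched is passed up the stack.
    """
    if len(frame) < ETH_HLEN + IPV4_MIN_HLEN:
        return XDP_PASS, None
    (ethertype,) = struct.unpack_from("!H", frame, 12)
    if ethertype != ETH_P_IP:
        return XDP_PASS, None
    if frame[ETH_HLEN] >> 4 != 4:  # IP version nibble
        return XDP_PASS, None
    dst = ipaddress.IPv4Address(frame[ETH_HLEN + 16:ETH_HLEN + 20])
    port = fwd_map.get(dst)
    if port is None:
        return XDP_PASS, None
    return XDP_REDIRECT, port


def build_frame(src: str, dst: str, ethertype: int = ETH_P_IP) -> bytes:
    """Build a minimal Ethernet + IPv4 header pair for experiments."""
    eth = bytes(6) + bytes(6) + struct.pack("!H", ethertype)
    ip = struct.pack(
        "!BBHHHBBH4s4s",
        0x45, 0, IPV4_MIN_HLEN, 0, 0, 64, 17, 0,
        ipaddress.IPv4Address(src).packed,
        ipaddress.IPv4Address(dst).packed,
    )
    return eth + ip
```

A caller would fill `fwd_map` once, the way a user would configure the map, and then call `forward()` for every received frame. Everything beyond the L3 lookup (flow tables, actions, tunnels) is left out, which is why the numbers from such a program can only be read as an optimistic bound.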
Will eBPF save us? [II]
A glance at the future
● XDP eBPF backend for OVS is not there yet
○ And the upcoming AF_XDP is possibly more interesting from a performance PoV
● UDP GRO for forwarded packets may someday land in the kernel datapath.
○ Will help only in scenarios using a limited number of flows.
THANK YOU