Transcript of a presentation by:
Ed Warnicke, Cisco
Tomasz Zawadzki, Intel
Agenda
SPDK iSCSI target overview
FD.io and VPP
SPDK iSCSI VPP integration
Q&A
Notices & Disclaimers
Intel technologies’ features and benefits depend on system configuration and may require enabled hardware, software or service activation. Performance varies depending on system configuration.
No computer system can be absolutely secure.
Tests document performance of components on a particular test, in specific systems. Differences in hardware, software, or configuration will affect actual performance. For more complete information about performance and benchmark results, visit http://www.intel.com/benchmarks .
Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Performance tests, such as SYSmark and MobileMark, are measured using specific computer systems, components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated purchases, including the performance of that product when combined with other products. For more complete information visit http://www.intel.com/benchmarks .
Benchmark results were obtained prior to implementation of recent software patches and firmware updates intended to address exploits referred to as "Spectre" and "Meltdown." Implementation of these updates may make these results inapplicable to your device or system.
Intel® Advanced Vector Extensions (Intel® AVX)* provides higher throughput to certain processor operations. Due to varying processor power characteristics, utilizing AVX instructions may cause a) some parts to operate at less than the rated frequency and b) some parts with Intel® Turbo Boost Technology 2.0 to not achieve any or maximum turbo frequencies. Performance varies depending on hardware, software, and system configuration and you can learn more at http://www.intel.com/go/turbo.
Intel's compilers may or may not optimize to the same degree for non-Intel microprocessors for optimizations that are not unique to Intel microprocessors. These optimizations include SSE2, SSE3, and SSSE3 instruction sets and other optimizations. Intel does not guarantee the availability, functionality, or effectiveness of any optimization on microprocessors not manufactured by Intel. Microprocessor-dependent optimizations in this product are intended for use with Intel microprocessors. Certain optimizations not specific to Intel microarchitecture are reserved for Intel microprocessors. Please refer to the applicable product User and Reference Guides for more information regarding the specific instruction sets covered by this notice.
Cost reduction scenarios described are intended as examples of how a given Intel-based product, in the specified circumstances and configurations, may affect future costs and provide cost savings. Circumstances will vary. Intel does not guarantee any costs or cost reduction.
Intel does not control or audit third-party benchmark data or the web sites referenced in this document. You should visit the referenced web site and confirm whether referenced data are accurate.
© 2018 Intel Corporation. Intel, the Intel logo, and Intel Xeon are trademarks of Intel Corporation in the U.S. and/or other countries. *Other names and brands may be claimed as property of others.
Intel® Builders 5
Moving to Userspace
Alternate solutions (RDMA) are making strides
TCP/IP transport has been present for much longer
There are still use cases for TCP/IP
Even the NVMe-oF transport will use it in the future
SPDK iSCSI target overview
Using POSIX sockets for the data path negates the benefits of userspace storage services by:
• Having syscalls go to the kernel and back
• Adding back interrupts
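The syscall cost is easy to see in a conventional receive path. Below is a minimal sketch (hypothetical, not SPDK code) using a Unix-domain socketpair to stand in for an iSCSI connection; every read() in the loop is a user/kernel transition of exactly the kind a userspace stack tries to avoid:

```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Read exactly len bytes from a POSIX socket.
   Each read() below is a syscall -- a user/kernel crossing. */
static ssize_t read_exact(int fd, void *buf, size_t len)
{
    size_t got = 0;
    while (got < len) {
        ssize_t n = read(fd, (char *)buf + got, len - got); /* syscall */
        if (n <= 0)
            return -1;
        got += (size_t)n;
    }
    return (ssize_t)got;
}

/* Demo: the peer writes a 16-byte "PDU header"; reading it back costs
   at least one kernel crossing. Returns 0 on success. */
int demo(void)
{
    int sv[2];
    char out[16];
    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) != 0)
        return -1;
    if (write(sv[1], "iscsi-pdu-header", 16) != 16)
        return -1;
    if (read_exact(sv[0], out, 16) != 16)
        return -1;
    close(sv[0]);
    close(sv[1]);
    return memcmp(out, "iscsi-pdu-header", 16) == 0 ? 0 : -1;
}
```

With a kernel stack, every PDU header and data segment read repeats this crossing; with a userspace stack the data is already mapped into the process.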
[Diagram: the SPDK userspace stack (iSCSI target, Block Device Abstraction, NVMe Driver) crossing into the kernel-space network stack (L4 TCP, L2/L3 MAC/IP, NIC Driver) via POSIX sockets]
FD.io: The Universal Dataplane
• Project at Linux Foundation
• Multi-party
• Multi-project
• Software Dataplane
• High throughput
• Low Latency
• Feature Rich
• Resource Efficient
• Bare Metal/VM/Container
• Multiplatform
FD.io Foundation 8
FD.io Scope:
• Network IO – NIC/vNIC <-> cores/threads
• Packet Processing – Classify/Transform/Prioritize/Forward/Terminate
• Dataplane Management Agents – Control Plane
[Diagram: layered stack – Dataplane Management Agent over Packet Processing over Network IO, running on Bare Metal/VM/Container]
VPP – Vector Packet Processing
Compute-Optimized Software Network Platform
Packet processing software platform:
• High performance
• Linux user space
• Runs on compute CPUs – and “knows” how to run on them well
Shipping at volume in server & embedded products
VPP – How does it work?
Compute-Optimized Software Network Platform
1. Packet processing is decomposed into a directed graph of nodes …
2. … packets move through the graph nodes in a vector (e.g., Packet 0 … Packet 10) …
3. … graph nodes are optimized to fit inside the instruction cache …
4. … packets are pre-fetched into the data cache.
Makes use of modern Intel® Xeon® processor micro-architectures: the instruction cache and data cache stay hot, minimizing memory latency and usage.
[Diagram: the VPP packet processing graph – input nodes (dpdk-input, vhost-user-input, af-packet-input) feed ethernet-input, which fans out to protocol nodes (ip4-input, ip6-input, l2-input, arp-input, mpls-input, lldp-input, cdp-input, ...-no-checksum), then lookup and rewrite nodes (ip4-lookup, ip4-lookup-multicast, ip4-load-balance, ip4-rewrite-transit, ip4-midchain, mpls-policy-encap), ending at interface-output]
* Each graph node implements a “micro-NF”, a “micro-NetworkFunction” processing packets.
VPP Architecture: Packet Processing
[Diagram: a vector of n packets (0, 1, 2, 3, … n) enters via input graph nodes (dpdk-input, vhost-user-input, af-packet-input) and flows through the packet processing graph: ethernet-input, then ip4-input / ip6-input / arp-input / mpls-input, then ip4-lookup / ip6-lookup, then ip4-rewrite / ip4-local / ip6-rewrite / ip6-local, …]
VPP Architecture: Splitting the Vector
[Diagram: the same packet processing graph; as packets take different paths through the graph nodes, the vector of n packets is split across nodes]
VPP Architecture: Plugins
[Diagram: the same packet processing graph extended with plugin nodes custom-1, custom-2, custom-3 and a hardware plugin node hw-accel-input]
Plugins (e.g. /usr/lib/vpp_plugins/foo.so) are first-class citizens that can:
• Add graph nodes
• Add APIs
• Rearrange the graph
A hardware plugin can skip software nodes where the work is already done by hardware.
Plugins can be built independently of the VPP source tree.
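The plugin model can be sketched with a tiny node registry (hypothetical and far simpler than the real vlib node machinery): nodes are registered by name along with the name of the next node, so a plugin loaded at runtime can insert its own nodes or rearrange the graph without touching core code.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical sketch of a named-node graph registry. */
#define MAX_NODES 8

typedef int (*node_fn)(int pkt);

typedef struct {
    const char *name;
    const char *next;   /* name of the next node, NULL = end of path */
    node_fn     fn;
} graph_node;

static graph_node registry[MAX_NODES];
static int n_nodes;

/* Core and plugins alike register nodes through the same call. */
int register_node(const char *name, const char *next, node_fn fn)
{
    if (n_nodes >= MAX_NODES)
        return -1;
    registry[n_nodes++] = (graph_node){ name, next, fn };
    return 0;
}

/* Walk the graph from a starting node, applying each node in turn. */
int run_graph(const char *start, int pkt)
{
    const char *cur = start;
    while (cur) {
        int i;
        for (i = 0; i < n_nodes; i++)
            if (strcmp(registry[i].name, cur) == 0)
                break;
        if (i == n_nodes)
            return -1;          /* unknown node */
        pkt = registry[i].fn(pkt);
        cur = registry[i].next;
    }
    return pkt;
}

/* A core node and a plugin node (names are illustrative only). */
static int ethernet_input(int pkt) { return pkt + 1; }
static int custom_1(int pkt)       { return pkt * 2; }
```

Because nodes are looked up by name, a plugin that registers "custom-1" as the successor of "ethernet-input" reroutes the graph without the core ever knowing the plugin exists.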
K8s Networking Microservice: Contiv-VPP
Motivation: container networking.
[Diagram: a K8s Master and multiple Nodes; on each Node, the Kubelet uses a CNI plugin to wire Pods over veth and tapv2 interfaces into a vswitch CNF Pod running VPP with a Contiv-VPP Agent; the Nodes are interconnected over an IPv4/IPv6/SRv6 network]
FD.io Mini-Summit at KubeCon Europe 2018
Today:
[Diagram: two processes (PID 1234, PID 4321) call send()/recv() through glibc into the kernel, where data traverses a FIFO, TCP, IP (routing), and the device driver]
Why not this?
[Diagram: the same two processes exchange data through shared-memory FIFOs managed by a VPP session layer, with TCP, IP, and DPDK all running in userspace inside VPP]
VPP Host Stack
[Diagram: an App exchanges data with VPP through rx/tx FIFOs in a shared memory segment; control goes over a binary API to the VPP session layer, which sits on top of TCP and IP/DPDK]
VPP Host Stack: SVM FIFOs
• Allocated within shared memory segments, with or without file backing (ssvm/memfd)
• Fixed position and size
• Lock-free enqueue/dequeue, but atomic size increment
• Option to dequeue/peek data
• Support for out-of-order data enqueues
[Diagram: App and VPP session layer sharing rx/tx FIFOs in a shared memory segment]
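The enqueue/dequeue discipline can be sketched as a single-producer/single-consumer ring (hypothetical and heavily simplified; the real SVM FIFO lives in a shared memory segment and also handles out-of-order segments). Head and tail are each owned by one side; only the fill level is shared, and it is updated atomically:

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical sketch of a lock-free SPSC byte FIFO. */
#define FIFO_SIZE 64

typedef struct {
    char          data[FIFO_SIZE];
    size_t        head;   /* written only by the producer */
    size_t        tail;   /* written only by the consumer */
    atomic_size_t used;   /* the one shared counter: atomic increment */
} svm_fifo;

/* Producer side: copy in up to len bytes, then publish them. */
size_t fifo_enqueue(svm_fifo *f, const char *buf, size_t len)
{
    size_t free_bytes = FIFO_SIZE - atomic_load(&f->used);
    if (len > free_bytes)
        len = free_bytes;
    for (size_t i = 0; i < len; i++)
        f->data[(f->head + i) % FIFO_SIZE] = buf[i];
    f->head = (f->head + len) % FIFO_SIZE;
    atomic_fetch_add(&f->used, len);   /* make bytes visible to consumer */
    return len;
}

/* Consumer side: copy out up to len bytes, then release the space. */
size_t fifo_dequeue(svm_fifo *f, char *buf, size_t len)
{
    size_t avail = atomic_load(&f->used);
    if (len > avail)
        len = avail;
    for (size_t i = 0; i < len; i++)
        buf[i] = f->data[(f->tail + i) % FIFO_SIZE];
    f->tail = (f->tail + len) % FIFO_SIZE;
    atomic_fetch_sub(&f->used, len);
    return len;
}
```

No lock is ever taken: each side mutates only its own index, and the atomic `used` counter is the sole point of synchronization between the App and VPP.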
VPP Host Stack: TCP
• Clean-slate implementation
• “Complete” state machine implementation
• Connection management and flow control (window management)
• Timers and retransmission, fast retransmit, SACK
• NewReno congestion control, SACK-based fast recovery
• Checksum offloading
• Linux compatibility tested with the IWL TCP protocol tester
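The NewReno behavior listed above can be sketched in a few lines (hypothetical, in units of MSS; the real VPP TCP tracks bytes, SACK scoreboards, and timers): slow start grows the window by one MSS per ACK, congestion avoidance by one MSS per round trip, and a fast retransmit halves the window rather than collapsing it.

```c
#include <assert.h>

/* Hypothetical sketch of NewReno congestion window updates. */
typedef struct {
    unsigned cwnd;      /* congestion window, in MSS */
    unsigned ssthresh;  /* slow-start threshold, in MSS */
    unsigned acked;     /* ACK counter for congestion avoidance */
} tcp_cc;

void on_ack(tcp_cc *cc)
{
    if (cc->cwnd < cc->ssthresh) {
        cc->cwnd += 1;              /* slow start: +1 MSS per ACK */
    } else if (++cc->acked >= cc->cwnd) {
        cc->cwnd += 1;              /* congestion avoidance: +1 per RTT */
        cc->acked = 0;
    }
}

void on_fast_retransmit(tcp_cc *cc)
{
    /* NewReno: halve the window (floor of 2 MSS), don't reset to 1. */
    cc->ssthresh = cc->cwnd / 2 > 2 ? cc->cwnd / 2 : 2;
    cc->cwnd = cc->ssthresh;
}
```

The SACK-based fast recovery in the real stack refines the retransmit decision but leaves this window arithmetic intact.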
SPDK w/ VPP Host Stack: More network options
• SCTP, UDP, TLS
• IPv4, IPv6
• Bridging/Routing
• MPLSoX, SRv6
• VXLAN{-GPE}, Geneve, GRE
• Much, much more
Future: Storage – Unified Storage/Networking Graph
• A unified storage/networking graph allows hyper-efficient processing of blocks to packets and packets to blocks
• Avoids copies
• Avoids cache misses
• Utilizes other VPP performance tricks
• Most storage IO is connected to network IO
• Can be extended with additional protocols like RoCEv2
[Diagram: the VPP packet processing graph extended with storage nodes – spdk-input feeding block processing, with iSCSI/SPDK and RoCEv2 nodes attached to tcp-output]
iSCSI target architecture extension
[Diagram: on the left, the current architecture – the SPDK userspace stack (iSCSI target, Block Device Abstraction, NVMe Driver) crossing into the kernel-space network stack (L4 TCP, L2/L3 MAC/IP, NIC Driver) via POSIX sockets. On the right, the extended architecture – SPDK storage services connected through the VPP API to VPP userspace network services (TCP host stack, VPP graph nodes, DPDK NIC driver)]
iSCSI target architecture with VPP
The SPDK iSCSI target uses the VPP Communications Library (VCL):
• No kernel syscalls from top to bottom
• Better CPU utilization
• Extensive VPP networking capabilities available
[Diagram: two userspace processes connected by shared memory – the SPDK process (iSCSI target, Block Device Abstraction, NVMe Driver) providing storage services uses the VCL API to reach the VPP process (TCP host stack, VPP graph nodes, DPDK NIC driver) providing network services]
Net framework abstraction
• The iSCSI target is not aware of the socket types used
• All net framework types can be used at the same time
• POSIX sockets are still available
• VPP support is optional at compile time
• Enables usage in other libraries in the future (such as the NVMe-oF target)
[Diagram: the SPDK iSCSI target and the (planned) NVMe-oF target sit on a net framework layer that selects between kernel POSIX sockets and the VPP API]
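The abstraction can be sketched as an ops table (hypothetical names; the real SPDK net framework interface differs): the target resolves a backend by name and calls through function pointers, never knowing whether POSIX sockets or VCL sit underneath.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical sketch of a net-framework ops table. */
typedef struct {
    const char *name;
    int (*listen)(int port);
    int (*send)(const char *buf, int len);
} net_framework;

/* Stub backends; real ones would call socket()/listen() or the VCL
   equivalents. The "vpp" entry would be present only when SPDK is
   built with VPP support (shown unconditionally here). */
static int posix_listen(int port) { (void)port; return 0; }
static int posix_send(const char *b, int l) { (void)b; return l; }
static int vcl_listen(int port)   { (void)port; return 0; }
static int vcl_send(const char *b, int l)   { (void)b; return l; }

static const net_framework frameworks[] = {
    { "posix", posix_listen, posix_send },
    { "vpp",   vcl_listen,   vcl_send   },
};

/* The target picks a backend by name; several can coexist. */
const net_framework *find_framework(const char *name)
{
    for (size_t i = 0; i < sizeof frameworks / sizeof frameworks[0]; i++)
        if (strcmp(frameworks[i].name, name) == 0)
            return &frameworks[i];
    return NULL;
}
```

Because the iSCSI code only ever touches the ops table, adding a new transport (or reusing the table from an NVMe-oF target) means adding one array entry, not changing the target.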
VPP integration
Key steps for running the SPDK iSCSI target with VPP:
1. Build SPDK with VPP support
2. Run the VPP process
3. Configure interfaces using the vppctl utility
4. Start the SPDK iSCSI target, which can now utilize VPP interfaces
All configuration steps can be found in the iSCSI target documentation on spdk.io:
http://www.spdk.io/doc/iscsi.html
What about performance data?
Backup