
Ed Warnicke, Cisco

Tomasz Zawadzki, Intel

Agenda

SPDK iSCSI target overview

FD.io and VPP

SPDK iSCSI VPP integration

Q&A


Notices & Disclaimers

Intel technologies’ features and benefits depend on system configuration and may require enabled hardware, software or service activation. Performance varies depending on system configuration.

No computer system can be absolutely secure.

Tests document performance of components on a particular test, in specific systems. Differences in hardware, software, or configuration will affect actual performance. For more complete information about performance and benchmark results, visit http://www.intel.com/benchmarks .

Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Performance tests, such as SYSmark and MobileMark, are measured using specific computer systems, components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated purchases, including the performance of that product when combined with other products. For more complete information visit http://www.intel.com/benchmarks .

Benchmark results were obtained prior to implementation of recent software patches and firmware updates intended to address exploits referred to as "Spectre" and "Meltdown." Implementation of these updates may make these results inapplicable to your device or system.

Intel® Advanced Vector Extensions (Intel® AVX)* provides higher throughput to certain processor operations. Due to varying processor power characteristics, utilizing AVX instructions may cause a) some parts to operate at less than the rated frequency and b) some parts with Intel® Turbo Boost Technology 2.0 to not achieve any or maximum turbo frequencies. Performance varies depending on hardware, software, and system configuration and you can learn more at http://www.intel.com/go/turbo.

Intel's compilers may or may not optimize to the same degree for non-Intel microprocessors for optimizations that are not unique to Intel microprocessors. These optimizations include SSE2, SSE3, and SSSE3 instruction sets and other optimizations. Intel does not guarantee the availability, functionality, or effectiveness of any optimization on microprocessors not manufactured by Intel. Microprocessor-dependent optimizations in this product are intended for use with Intel microprocessors. Certain optimizations not specific to Intel microarchitecture are reserved for Intel microprocessors. Please refer to the applicable product User and Reference Guides for more information regarding the specific instruction sets covered by this notice.

Cost reduction scenarios described are intended as examples of how a given Intel-based product, in the specified circumstances and configurations, may affect future costs and provide cost savings. Circumstances will vary. Intel does not guarantee any costs or cost reduction.

Intel does not control or audit third-party benchmark data or the web sites referenced in this document. You should visit the referenced web site and confirm whether referenced data are accurate.

© 2018 Intel Corporation. Intel, the Intel logo, and Intel Xeon are trademarks of Intel Corporation in the U.S. and/or other countries. *Other names and brands may be claimed as property of others.



Moving to Userspace

Alternate solutions (RDMA) are making great strides

TCP/IP transport has been present for much longer

There are still use cases for TCP/IP

Even the NVMe-oF transport will use it in the future


SPDK iSCSI target overview

Using POSIX sockets for the data path negates the benefits of userspace storage services by:

• Having syscalls go to the kernel and back (illustrated below)

• Adding back interrupts

[Diagram: today's SPDK iSCSI data path: the userspace SPDK components (iSCSI target, Block Device Abstraction, NVMe Driver) reach the network through POSIX sockets into the kernel-space L4 TCP stack, L2/L3 MAC/IP layers, and NIC driver]
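To make that overhead concrete, here is a minimal sketch in plain POSIX C (not SPDK code) of the receive/send loop such a target runs on kernel sockets; the function name and buffer size are illustrative. Every call marked below is a user/kernel round trip, and data arrival is signaled through interrupts and context switches.

    #include <sys/types.h>
    #include <sys/socket.h>
    #include <unistd.h>

    /* Illustrative only: serve one iSCSI connection over kernel sockets. */
    static void serve_connection(int conn_fd)
    {
        char buf[8192];

        for (;;) {
            /* syscall: user -> kernel -> user transition on every read */
            ssize_t n = recv(conn_fd, buf, sizeof(buf), 0);
            if (n <= 0)
                break;

            /* ... hand the PDU to the userspace iSCSI / block layers ... */

            /* syscall: another kernel round trip to send the response */
            if (send(conn_fd, buf, (size_t)n, 0) < 0)
                break;
        }
        close(conn_fd);
    }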


FD.io: The Universal Dataplane

• Project at Linux Foundation
  • Multi-party
  • Multi-project

• Software Dataplane
  • High throughput
  • Low Latency
  • Feature Rich
  • Resource Efficient
  • Bare Metal/VM/Container
  • Multiplatform


FD.io Scope:

• Network IO - NIC/vNIC <-> cores/threads

• Packet Processing - Classify/Transform/Prioritize/Forward/Terminate

• Dataplane Management Agents - ControlPlane

[Diagram: layered stack of Dataplane Management Agent on top of Packet Processing on top of Network IO, all running on Bare Metal/VM/Container]

VPP – Vector Packet Processing

Compute Optimized SW Network Platform / Packet Processing Software Platform

• High performance

• Linux user space

• Runs on compute CPUs, and “knows” how to run on them well

Shipping at volume in server & embedded products



VPP – How does it work?

Compute Optimized SW Network Platform

1. Packet processing is decomposed into a directed graph of nodes …

2. … packets move through graph nodes in vectors …

3. … graph nodes are optimized to fit inside the instruction cache …

4. … packets are pre-fetched into the data cache.

Makes use of modern Intel® Xeon® Processor micro-architectures: instruction cache & data cache stay hot, memory latency and usage are minimized.

[Diagram: a vector of packets (Packet 0 … Packet 10) flowing through graph nodes on the microprocessor, with the node code resident in the instruction cache and packet data pre-fetched into the data cache]
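The following is a conceptual sketch in plain C of how one such graph node processes a whole vector before handing it on; it is not the real VPP node API, and the packet struct and classify_ethertype() helper are hypothetical. The point is that the per-node code stays small enough to remain in the instruction cache while the next packet's data is prefetched.

    #include <stddef.h>
    #include <stdint.h>

    /* Hypothetical packet descriptor, for illustration only. */
    struct packet {
        uint8_t *data;
        uint32_t len;
        uint16_t next_node;   /* index of the node this packet visits next */
    };

    /* Toy classifier (hypothetical helper): IPv4 goes to node 1, the rest to node 0. */
    static uint16_t classify_ethertype(const struct packet *p)
    {
        uint16_t ethertype = (uint16_t)((p->data[12] << 8) | p->data[13]);
        return ethertype == 0x0800 ? 1 : 0;
    }

    /* One graph node: do a small amount of work on every packet in the vector. */
    static void ethernet_input_node(struct packet **vec, size_t n_packets)
    {
        for (size_t i = 0; i < n_packets; i++) {
            /* (4) pre-fetch the following packet's data so it is hot when reached */
            if (i + 1 < n_packets)
                __builtin_prefetch(vec[i + 1]->data);

            /* (2)/(3) this node's work is small enough to stay in the i-cache */
            vec[i]->next_node = classify_ethertype(vec[i]);
        }
        /* the whole vector is then handed to the next node(s) in the graph */
    }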

[Diagram: the VPP packet processing graph: input nodes (dpdk-input, af-packet-input, vhost-user-input) feed ethernet-input, which dispatches to protocol nodes such as ip4-input, ip6-input, l2-input, arp-input, mpls-input, lldp-input, cdp-input, ...-no-checksum, then ip4-lookup, ip4-lookup-multicast, ip4-load-balance, mpls-policy-encap, ip4-rewrite-transit, ip4-midchain, and finally interface-output]

* Each graph node implements a “micro-NF”, a “micro-NetworkFunction” processing packets.

VPP Architecture: Packet Processing

[Diagram: a vector of n packets (0, 1, 2, 3, … n) enters the packet processing graph at an input node (dpdk-input, vhost-user-input, af-packet-input) and moves as a vector through successive graph nodes: ethernet-input, then ip4-input / ip6-input / arp-input / mpls-input, then ip4-lookup / ip6-lookup, then ip4-rewrite / ip4-local / ip6-rewrite / ip6-local]

VPP Architecture: Splitting the Vector

[Diagram: the same packet processing graph; when packets in the vector need different handling, the vector is split and the resulting sub-vectors follow different paths through the graph nodes]

VPP Architecture: Plugins

[Diagram: the packet processing graph extended with plugin-supplied nodes custom-1, custom-2, custom-3 and a hardware plugin node hw-accel-input]

Plugins (e.g. /usr/lib/vpp_plugins/foo.so) are first class citizens that can:

• Add graph nodes

• Add API

• Rearrange the graph

A hardware plugin (hw-accel-input) can skip software nodes where the work is already done by hardware.

Plugins can be built independently of the VPP source tree.
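As a rough illustration of the idea (plain C, not the actual VPP plugin API; every name below is hypothetical), a plugin loaded from a shared object only needs a way to register its nodes into the same graph table the built-in nodes use:

    #include <stddef.h>

    struct packet;
    typedef void (*node_fn_t)(struct packet **vec, size_t n_packets);

    struct graph_node {
        const char *name;
        node_fn_t   fn;
    };

    #define MAX_NODES 256
    static struct graph_node graph[MAX_NODES];
    static size_t n_nodes;

    /* Called by built-in code and by plugins (e.g. from foo.so's init hook):
     * once registered, a plugin node is dispatched exactly like any other. */
    int register_graph_node(const char *name, node_fn_t fn)
    {
        if (n_nodes == MAX_NODES)
            return -1;
        graph[n_nodes].name = name;
        graph[n_nodes].fn = fn;
        return (int)n_nodes++;
    }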


K8s Networking Microservice: Contiv-VPP

[Diagram: two Kubernetes nodes; on each node, the Kubelet uses the Contiv-VPP CNI plugin, and a vswitch CNF Pod runs the VPP Agent and VPP; application Pods attach to VPP through veth or tapv2 interfaces; both nodes connect to the K8s Master and to each other over an IPv4/IPv6/SRv6 network]

Motivation: Container networking


[Diagram: today, two processes (PID 1234 and PID 4321) exchange data through glibc send()/recv() calls into the kernel stack on each side: FIFO, TCP, IP (routing), device. “Why not this?”: the same two processes exchange data through VPP's userspace session layer instead: per-process FIFOs, TCP, IP and DPDK, with no kernel in the data path]

VPP Host Stack

[Diagram: an application attaches to VPP through the binary API; inside VPP, a session layer sits on top of TCP and IP/DPDK; application and VPP exchange data through rx/tx FIFOs in a shared-memory (shm) segment]

VPP Host Stack: SVM FIFOs

• Allocated within shared memory segments with or without file backing (ssvm/memfd)

• Fixed position and size

• Lock-free enqueue/dequeue but atomic size increment (see the sketch below)

• Option to dequeue/peek data

• Support for out-of-order data enqueues

[Diagram: the same session-layer picture, with the app and VPP sharing rx/tx FIFOs in a shm segment]
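A minimal sketch of that idea in C11 (not VPP's actual svm_fifo implementation; the struct, size, and function names are illustrative): a single-producer/single-consumer byte FIFO placed in shared memory, where head and tail are each owned by one side and only the fill level is updated atomically.

    #include <stdatomic.h>
    #include <stdint.h>

    #define FIFO_SIZE 4096            /* fixed size */

    struct svm_fifo_sketch {
        _Atomic uint32_t cursize;     /* bytes currently in the FIFO */
        uint32_t head;                /* written only by the consumer */
        uint32_t tail;                /* written only by the producer */
        uint8_t  data[FIFO_SIZE];
    };

    /* producer side (e.g. VPP TCP enqueuing received data) */
    static uint32_t fifo_enqueue(struct svm_fifo_sketch *f, const uint8_t *buf, uint32_t len)
    {
        uint32_t free_bytes = FIFO_SIZE - atomic_load(&f->cursize);
        if (len > free_bytes)
            len = free_bytes;
        for (uint32_t i = 0; i < len; i++)
            f->data[(f->tail + i) % FIFO_SIZE] = buf[i];
        f->tail = (f->tail + len) % FIFO_SIZE;
        atomic_fetch_add(&f->cursize, len);   /* the only shared atomic update */
        return len;
    }

    /* consumer side (e.g. the application reading from its rx FIFO) */
    static uint32_t fifo_dequeue(struct svm_fifo_sketch *f, uint8_t *buf, uint32_t len)
    {
        uint32_t avail = atomic_load(&f->cursize);
        if (len > avail)
            len = avail;
        for (uint32_t i = 0; i < len; i++)
            buf[i] = f->data[(f->head + i) % FIFO_SIZE];
        f->head = (f->head + len) % FIFO_SIZE;
        atomic_fetch_sub(&f->cursize, len);
        return len;
    }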

VPP Host Stack: TCP

• Clean-slate implementation

• “Complete” state machine implementation

• Connection management and flow control (window management)

• Timers and retransmission, fast retransmit, SACK

• NewReno congestion control, SACK-based fast recovery

• Checksum offloading

• Linux compatibility tested with the IWL TCP protocol tester

[Diagram: the TCP implementation sits under the session layer inside VPP, on top of IP/DPDK, with shared-memory rx/tx FIFOs toward the app]

SPDK w/ VPP Host Stack: More network options

• SCTP, UDP, TLS

• IPv4, IPv6

• Bridging/Routing

• MPLSoX, SRv6

• VXLAN{-GPE}, Geneve, GRE

• Much, much more

[Diagram: iSCSI/SPDK as the application on top of the VPP host stack (session layer, TCP, IP/DPDK)]

Future: Storage - Unified Storage/Networking Graph

- A unified storage/networking graph allows hyper-efficient processing of blocks to packets and packets to blocks
  - Avoid copies
  - Avoid cache misses
  - Utilize other VPP performance tricks

- Most storage IO is connected to network IO

- Can extend with additional protocols like ROCEv2

[Diagram: the VPP packet processing graph extended with storage nodes (spdk-input, block processing, iSCSI, ROCEv2) feeding tcp-output and interface-output alongside the existing network nodes]



iSCSI target architecture Extension

[Diagram: left, today's path: SPDK userspace components (NVMe Driver, Block Device Abstraction, iSCSI target) use POSIX sockets into the kernel-space L4 TCP stack, L2/L3 MAC/IP layers, and NIC driver. Right, the extended path: the SPDK iSCSI target calls the VPP API into VPP's userspace TCP host stack, VPP graph nodes, and DPDK NIC driver, so both storage services and network services stay in userspace]


iSCSI target architecture with VPP

The SPDK iSCSI target uses the VPP Communications Library (VCL):

• No kernel syscalls from top to bottom

• Better CPU utilization

• Extensive VPP networking capabilities available

[Diagram: two userspace processes connected through shared memory; the SPDK process (NVMe Driver, Block Device Abstraction, iSCSI target) calls the VCL API, while the VPP process provides the TCP host stack, VPP graph nodes, and DPDK NIC driver]


Net framework abstraction

iSCSI target is not aware of the socket types used

All net framework types can be used at the same time

POSIX sockets are still available

VPP support is optional at compile time

Enables usage in other libraries in the future (such as the NVMe-oF target)

[Diagram: the SPDK iSCSI target and the planned NVMe-oF target sit on top of the net framework abstraction, which selects either kernel POSIX sockets or the VPP API underneath]
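In code terms, the target programs against SPDK's generic socket layer and never names a backend. The sketch below uses the spdk_sock calls from spdk/sock.h as they existed around the time of this talk; exact signatures may differ between SPDK releases, and the function name and buffer size are illustrative.

    #include "spdk/sock.h"

    /* Accept and read from connections without knowing whether the kernel
     * (POSIX) or VPP net framework implementation is active underneath. */
    static void poll_new_connections(struct spdk_sock *listen_sock)
    {
        struct spdk_sock *conn;
        char buf[8192];

        while ((conn = spdk_sock_accept(listen_sock)) != NULL) {
            ssize_t n = spdk_sock_recv(conn, buf, sizeof(buf));
            if (n > 0) {
                /* ... hand the iSCSI PDU up to the target ... */
            }
            spdk_sock_close(&conn);
        }
    }

    /* Listening socket for an iSCSI portal; the active net framework decides
     * which stack actually serves it, e.g.:
     *   struct spdk_sock *lsock = spdk_sock_listen("10.0.0.1", 3260);
     */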


VPP integration

Key steps for running the SPDK iSCSI target with VPP:

1. Build SPDK with VPP support

2. Run the VPP process

3. Configure interfaces using the vppctl utility

4. Start the SPDK iSCSI target, which can now utilize VPP interfaces

All configuration steps can be found in the spdk.io iSCSI target documentation (a hedged example is sketched below):

http://www.spdk.io/doc/iscsi.html
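For orientation only, the steps above might look roughly like the following; the configure flag, interface name, IP address, and binary path are assumptions that depend on the SPDK/VPP versions and hardware in use, so follow the documentation above for the authoritative procedure.

    # 1. Build SPDK with VPP support (VPP built and installed beforehand)
    ./configure --with-vpp=/path/to/vpp/install
    make

    # 2. Run the VPP process with its startup configuration
    vpp -c /etc/vpp/startup.conf

    # 3. Bring up and address the interface the iSCSI target will use
    vppctl set interface state GigabitEthernet0/8/0 up
    vppctl set interface ip address GigabitEthernet0/8/0 10.0.0.1/24

    # 4. Start the SPDK iSCSI target; it can now bind portals to 10.0.0.1
    ./app/iscsi_tgt/iscsi_tgt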


What about performance data?


Backup