Table of Contents
- Configuration
- 80-vpp.conf
- startup.conf
- Configurable VPP Threading Modes
- Running Examples
- Running VPP in non-DPDK mode
- Running VPP in DPDK mode
- Router and Switch for namespaces
Configuration
- VPP configuration manual: https://fd.io/docs/vpp/master/gettingstarted/users/configuring/index.html
VPP has two important configuration files:
- The VPP sysctl configuration file: 80-vpp.conf
- The VPP startup configuration file: startup.conf
80-vpp.conf
During VPP installation, 80-vpp.conf needs to be copied, manually or automatically, to /etc/sysctl.d/80-vpp.conf. With the kernel parameters in place, the lifecycle of the VPP daemon can then be managed with systemctl:
$ systemctl start vpp && systemctl enable vpp && systemctl status vpp
The initial content of 80-vpp.conf is shown below. It mainly sets the number of HugePages that the VPP daemon needs to occupy.
# Number of 2MB hugepages desired
vm.nr_hugepages=1024
# Must be greater than or equal to (2 * vm.nr_hugepages).
vm.max_map_count=3096
# All groups allowed to access hugepages
vm.hugetlb_shm_group=0
# Shared Memory Max must be greater or equal to the total size of hugepages.
# For 2MB pages, TotalHugepageSize = vm.nr_hugepages * 2 * 1024 * 1024
# If the existing kernel.shmmax setting (cat /proc/sys/kernel/shmmax)
# is greater than the calculated TotalHugepageSize then set this parameter
# to current shmmax value.
kernel.shmmax=2147483648
As shown above, by default VPP sets the operating system's HugePages to 1024 pages of 2MB each, i.e. 2GB in total.
$ cat /proc/meminfo | grep Huge
AnonHugePages: 0 kB
ShmemHugePages: 0 kB
HugePages_Total: 1024
HugePages_Free: 982
HugePages_Rsvd: 0
HugePages_Surp: 0
Hugepagesize: 2048 kB
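The sizing rules stated in the comments of 80-vpp.conf can be sanity-checked with a little shell arithmetic. A sketch, using the default values shown above:

```shell
# Total hugepage memory implied by the defaults: 1024 pages of 2 MB each.
nr_hugepages=1024
total_bytes=$(( nr_hugepages * 2 * 1024 * 1024 ))
echo "TotalHugepageSize: $total_bytes bytes"        # 2147483648 bytes = 2 GB

# vm.max_map_count must be >= 2 * vm.nr_hugepages.
max_map_count=3096
[ "$max_map_count" -ge $(( 2 * nr_hugepages )) ] && echo "max_map_count: OK"

# kernel.shmmax must be >= TotalHugepageSize.
shmmax=2147483648
[ "$shmmax" -ge "$total_bytes" ] && echo "shmmax: OK"
```

If you raise vm.nr_hugepages, recompute the other two values accordingly.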
Note that starting VPP overwrites the operating system's existing HugePage settings. So when running VPP inside an OpenStack VM, make sure the VM is configured with HugePages enabled, and preferably keep the number of HugePages configured for VPP consistent with the number configured for the VM.
startup.conf
As VPP's main configuration file, startup.conf contains the following sections:
- unix: Unix-like operating-system-level configuration.
- api-trace: binary API tracing.
- api-segment: binary API segment settings.
- statseg: statistics segment.
- socksvr: API socket server.
- cpu: CPU-related configuration.
- memory: memory-related configuration.
- buffers: buffer-related configuration.
- node: graph-node-related configuration.
- dpdk: DPDK-related configuration.
- plugins: plugin-related configuration.
- l2fib: L2 FIB-related configuration.
- ipsec: IPSec-related configuration.
- logging: logging configuration.
unix {
# nodaemon
log /var/log/vpp/vpp.log
full-coredump
cli-listen /run/vpp/cli.sock
gid vpp
## run vpp in the interactive mode
# interactive
## do not use colors in terminal output
# nocolor
## do not display banner
# nobanner
}
api-trace {
## This stanza controls binary API tracing. Unless there is a very strong reason,
## please leave this feature enabled.
on
## Additional parameters:
##
## To set the number of binary API trace records in the circular buffer, configure nitems
##
## nitems <nnn>
##
## To save the api message table decode tables, configure a filename. Results in /tmp/<filename>
## Very handy for understanding api message changes between versions, identifying missing
## plugins, and so forth.
##
## save-api-table <filename>
}
api-segment {
gid vpp
}
socksvr {
default
}
# memory {
## Set the main heap size, default is 1G
# main-heap-size 2G
## Set the main heap page size. Default page size is OS default page
## which is in most cases 4K. if different page size is specified VPP
## will try to allocate main heap by using specified page size.
## special keyword 'default-hugepage' will use system default hugepage
## size
# main-heap-page-size 1G
#}
cpu {
## In the VPP there is one main thread and optionally the user can create worker(s)
## The main thread and worker thread(s) can be pinned to CPU core(s) manually or automatically
## Manual pinning of thread(s) to CPU core(s)
## Set logical CPU core where main thread runs, if main core is not set
## VPP will use core 1 if available
# main-core 1
## Set logical CPU core(s) where worker threads are running
# corelist-workers 2-3,18-19
## Automatic pinning of thread(s) to CPU core(s)
## Sets number of CPU core(s) to be skipped (1 ... N-1)
## Skipped CPU core(s) are not used for pinning main thread and working thread(s).
## The main thread is automatically pinned to the first available CPU core and worker(s)
## are pinned to next free CPU core(s) after core assigned to main thread
# skip-cores 4
## Specify a number of workers to be created
## Workers are pinned to N consecutive CPU cores while skipping "skip-cores" CPU core(s)
## and main thread's CPU core
# workers 2
## Set scheduling policy and priority of main and worker threads
## Scheduling policy options are: other (SCHED_OTHER), batch (SCHED_BATCH)
## idle (SCHED_IDLE), fifo (SCHED_FIFO), rr (SCHED_RR)
# scheduler-policy fifo
## Scheduling priority is used only for "real-time" policies (fifo and rr),
## and has to be in the range of priorities supported for a particular policy
# scheduler-priority 50
}
# buffers {
## Increase number of buffers allocated, needed only in scenarios with
## large number of interfaces and worker threads. Value is per numa node.
## Default is 16384 (8192 if running unprivileged)
# buffers-per-numa 128000
## Size of buffer data area
## Default is 2048
# default data-size 2048
## Size of the memory pages allocated for buffer data
## Default will try 'default-hugepage' then 'default'
## you can also pass a size in K/M/G e.g. '8M'
# page-size default-hugepage
# }
# dpdk {
## Change default settings for all interfaces
# dev default {
## Number of receive queues, enables RSS
## Default is 1
# num-rx-queues 3
## Number of transmit queues, Default is equal
## to number of worker threads or 1 if no worker threads
# num-tx-queues 3
## Number of descriptors in transmit and receive rings
## increasing or reducing number can impact performance
## Default is 1024 for both rx and tx
# num-rx-desc 512
# num-tx-desc 512
## VLAN strip offload mode for interface
## Default is off
# vlan-strip-offload on
## TCP Segment Offload
## Default is off
## To enable TSO, 'enable-tcp-udp-checksum' must be set
# tso on
## Devargs
## device specific init args
## Default is NULL
# devargs safe-mode-support=1,pipeline-mode-support=1
## rss-queues
## set valid rss steering queues
# rss-queues 0,2,5-7
# }
## Whitelist specific interface by specifying PCI address
# dev 0000:02:00.0
## Blacklist specific device type by specifying PCI vendor:device
## Whitelist entries take precedence
# blacklist 8086:10fb
## Set interface name
# dev 0000:02:00.1 {
# name eth0
# }
## Whitelist specific interface by specifying PCI address and in
## addition specify custom parameters for this interface
# dev 0000:02:00.1 {
# num-rx-queues 2
# }
## Change UIO driver used by VPP, Options are: igb_uio, vfio-pci,
## uio_pci_generic or auto (default)
# uio-driver vfio-pci
## Disable multi-segment buffers, improves performance but
## disables Jumbo MTU support
# no-multi-seg
## Change hugepages allocation per-socket, needed only if there is need for
## larger number of mbufs. Default is 256M on each detected CPU socket
# socket-mem 2048,2048
## Disables UDP / TCP TX checksum offload. Typically needed to use
## faster vector PMDs (together with no-multi-seg)
# no-tx-checksum-offload
## Enable UDP / TCP TX checksum offload
## This is the reversed option of 'no-tx-checksum-offload'
# enable-tcp-udp-checksum
# }
## node variant defaults
#node {
## specify the preferred default variant
# default { variant avx512 }
## specify the preferred variant, for a given node
# ip4-rewrite { variant avx2 }
#}
# plugins {
## Adjusting the plugin path depending on where the VPP plugins are
# path /ws/vpp/build-root/install-vpp-native/vpp/lib/vpp_plugins
## Disable all plugins by default and then selectively enable specific plugins
# plugin default { disable }
# plugin dpdk_plugin.so { enable }
# plugin acl_plugin.so { enable }
## Enable all plugins by default and then selectively disable specific plugins
# plugin dpdk_plugin.so { disable }
# plugin acl_plugin.so { disable }
# }
## Statistics Segment
# statseg {
# socket-name <filename>, name of the stats segment socket
# defaults to /run/vpp/stats.sock
# size <nnn>[KMG], size of the stats segment, defaults to 32mb
# page-size <nnn>, page size, ie. 2m, defaults to 4k
# per-node-counters on | off, defaults to none
# update-interval <f64-seconds>, sets the segment scrape / update interval
# }
## L2 FIB
# l2fib {
## l2fib hash table size.
# table-size 512M
## l2fib hash table number of buckets. Must be power of 2.
# num-buckets 524288
# }
## ipsec
# ipsec {
# ip4 {
## ipsec for ipv4 tunnel lookup hash number of buckets.
# num-buckets 524288
# }
# ip6 {
## ipsec for ipv6 tunnel lookup hash number of buckets.
# num-buckets 524288
# }
# }
# logging {
## set default logging level for logging buffer
## logging levels: emerg, alert, crit, error, warn, notice, info, debug, disabled
# default-log-level debug
## set default logging level for syslog or stderr output
# default-syslog-log-level info
## Set per-class configuration
# class dpdk/cryptodev { rate-limit 100 level debug syslog-level error }
# }
Configurable VPP Threading Modes
Single-threaded:
- Control and the Forwarding Engine both run in a single thread.
Multi-threaded with workers only:
- Control runs in the main thread (API, CLI).
- The Forwarding Engine is distributed across one or more worker threads.
Multi-threaded with IO and workers:
- Control runs in the main thread (API, CLI).
- IO threads handle input and dispatch packets to worker threads.
- Worker threads do the actual work, including interface TX.
- RSS is in use.
Multi-threaded with Main and IO on a single thread:
- Main and IO run in a single thread.
- Workers are distributed across different cores.
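As an illustration, a minimal cpu stanza for the "multi-threaded with workers only" mode might look like the following; the core numbers are only examples and should match the host's actual CPU topology:

```
cpu {
    main-core 1
    corelist-workers 2-3
}
```

With this stanza, the main thread (API, CLI) is pinned to core 1 and two worker threads run the forwarding graph on cores 2 and 3.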
Running Examples
Running VPP in non-DPDK mode
In non-DPDK mode, VPP uses Linux virtual network devices (e.g. a veth pair) as VPP host-interfaces.
- Edit the configuration for VPP1.
$ vi /etc/vpp/startup1.conf
unix {
nodaemon
cli-listen /run/vpp/cli-vpp1.sock
}
plugins {
plugin dpdk_plugin.so { disable }
}
- Start the process.
$ vpp -c /etc/vpp/startup1.conf
vpp[10475]: clib_elf_parse_file: open `/usr/bin/vp': No such file or director
vpp[10475]: vat-plug/load: vat_plugin_register: oddbuf plugin not loaded...
- Enter the CLI.
$ vppctl -s /run/vpp/cli-vpp1.sock
_______ _ _ _____ ___
__/ __/ _ \ (_)__ | | / / _ \/ _ \
_/ _// // / / / _ \ | |/ / ___/ ___/
/_/ /____(_)_/\___/ |___/_/ /_/
vpp# show version
vpp v21.01.0-5~g6bd1c77fd built by root on vpp-host at 2021-05-12T16:17:20
- Create a Linux veth pair; vpp1out will serve as the VPP host-interface, and vpp1host as the host's network interface.
$ sudo ip link add name vpp1out type veth peer name vpp1host
$ ip l
...
4: vpp1host@vpp1out: <BROADCAST,MULTICAST,M-DOWN> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
link/ether 22:5b:e3:6a:2c:85 brd ff:ff:ff:ff:ff:ff
5: vpp1out@vpp1host: <BROADCAST,MULTICAST,M-DOWN> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
link/ether 22:40:3a:b0:48:a2 brd ff:ff:ff:ff:ff:ff
- Turn up both ends:
$ sudo ip link set dev vpp1out up
$ sudo ip link set dev vpp1host up
- Assign an IP address
$ sudo ip addr add 10.10.1.1/24 dev vpp1host
$ ip addr show vpp1host
4: vpp1host@vpp1out: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
link/ether 22:5b:e3:6a:2c:85 brd ff:ff:ff:ff:ff:ff
inet 10.10.1.1/24 scope global vpp1host
valid_lft forever preferred_lft forever
inet6 fe80::205b:e3ff:fe6a:2c85/64 scope link
valid_lft forever preferred_lft forever
- Create a VPP host-interface attached to vpp1out.
vpp# create host-interface name vpp1out
host-vpp1out
vpp# show hardware
Name Idx Link Hardware
host-vpp1out 1 up host-vpp1out
Link speed: unknown
Ethernet address 02:fe:d6:f5:de:03
Linux PACKET socket interface
local0 0 down local0
Link speed: unknown
local
- Turn up the VPP host-interface:
vpp# set int state host-vpp1out up
vpp# show int
Name Idx State MTU (L3/IP4/IP6/MPLS) Counter Count
host-vpp1out 1 up 9000/0/0/0
local0
- Assign the IP address 10.10.1.2/24:
vpp# set int ip address host-vpp1out 10.10.1.2/24
vpp# show int addr
host-vpp1out (up):
L3 10.10.1.2/24
local0 (dn):
- Test connectivity.
vpp# ping 10.10.1.1
116 bytes from 10.10.1.1: icmp_seq=1 ttl=64 time=9.1423 ms
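If the ping does not succeed, VPP's packet tracer helps to check whether packets arrive from the veth at all. A sketch; af-packet-input is the input node used by host-interfaces:

```
vpp# trace add af-packet-input 10
vpp# ping 10.10.1.1
vpp# show trace
```

show trace prints, per captured packet, each graph node it traversed, which quickly pinpoints where traffic is dropped.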
Running VPP in DPDK mode
In DPDK mode, VPP uses a real host NIC as the VPP interface. My environment is an OpenStack VM; the vNIC is an OvS vTap backed by the virtio driver.
$ lspci
...
00:07.0 Ethernet controller: Red Hat, Inc. Virtio network device
Modify the startup.conf configuration:
dpdk {
dev 0000:00:07.0 {
num-rx-queues 1
}
}
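Depending on the environment, the NIC may first have to be unbound from its kernel driver and bound to a DPDK-compatible one (matching the uio-driver option shown in the sample startup.conf above). A sketch, assuming DPDK's usertools script dpdk-devbind.py is installed:

```
$ dpdk-devbind.py --status
$ dpdk-devbind.py --bind=vfio-pci 0000:00:07.0
```

The first command lists which driver each PCI device is currently bound to; the second hands 0000:00:07.0 over to vfio-pci so the DPDK plugin can claim it.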
After restarting VPP, check the interface state:
vpp# show interface addr
GigabitEthernet0/7/0 (dn):
local0 (dn):
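The NIC now shows up as GigabitEthernet0/7/0 but is still down (dn). Bringing it up and assigning an address works the same way as with the host-interface earlier; the address 10.10.2.2/24 here is only an example:

```
vpp# set interface state GigabitEthernet0/7/0 up
vpp# set interface ip address GigabitEthernet0/7/0 10.10.2.2/24
vpp# show interface addr
```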