
Build For Azure Ubuntu (mlx5)

Hanoh Haim edited this page Jun 3, 2020 · 9 revisions

The TRex package is built with the DPDK mlx5/tap drivers bound to CentOS kernel headers, a configuration that is no longer supported. It will not work with a different kernel (e.g. Ubuntu, which has a newer kernel). To support a different kernel API version, follow the steps below.

NOTE: TRex v2.80 and v2.81 do not work due to a DPDK 20.02 issue. v2.82 should work with DPDK 20.02 (see below), and v2.79 should work with DPDK 19.05.
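The version constraints above can be summarised as a small lookup. This is only a sketch: the function name `dpdk_for_trex` is hypothetical (it is not part of TRex), and it assumes TRex 2.x version strings of the form v2.NN, grouped the same way as the build sections on this page.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of TRex): print the DPDK release that this
# page pairs with a given TRex 2.x version.
dpdk_for_trex() {
    local ver="${1#[vV]}"            # accept "v2.79", "V2.80" or "2.82"
    case "$ver" in
        2.*) ;;                      # only TRex 2.x is covered here
        *) echo "unknown"; return 1 ;;
    esac
    local minor="${ver#2.}"
    minor="${minor%%.*}"             # drop any patch component
    case "$minor" in
        ''|*[!0-9]*) echo "unknown"; return 1 ;;
    esac
    if [ "$minor" -le 79 ]; then
        echo "19.05"                 # v2.79 and earlier
    elif [ "$minor" -ge 82 ]; then
        echo "20.02"                 # v2.82 and later
    else
        echo "unsupported"           # v2.80 and v2.81
        return 1
    fi
}
```

For example, `dpdk_for_trex v2.82` prints the DPDK release to download before following the matching build section.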

Note
Azure changes the distro and I/O model over time, so the following procedure (compiling from source) is more likely to work.

Build Ubuntu VM in Azure

az vm create --resource-group rgXYZ --name TrexUbuntuAN --image UbuntuLTS --size Standard_F16s_v2 --admin-username azureuser --admin-password trexTesting --nics ANAutoLinux2_Gi1_NIC ANAutoLinux1_Gi2_NIC ANAutoLinux2_Gi2_NIC
Note
Do not add AN (Accelerated Networking) on eth0/management; this cuts down on mapping confusion with MLX. We may want to add it in the future to reduce interrupts from eth0 that may end up on the TRex cores running at 100%.

Follow Azure DPDK setup steps

  • The Ubuntu Azure kernel provides the best network performance on Azure

sudo add-apt-repository ppa:canonical-server/dpdk-azure -y
sudo apt-get update
sudo apt-get upgrade
sudo apt-get dist-upgrade
sudo apt-get install -y librdmacm-dev librdmacm1 build-essential libnuma-dev libmnl-dev
lsb_release -a

No LSB modules are available.
Distributor ID: Ubuntu
Description:    Ubuntu 18.04.4 LTS
Release:        18.04
Codename:       bionic

Set up huge pages on reboot

sudo vi /etc/default/grub
# default_hugepagesz=1GB hugepagesz=1G hugepages=8 transparent_hugepage=never
# GRUB_CMDLINE_LINUX=" default_hugepagesz=1GB hugepagesz=1G hugepages=8 transparent_hugepage=never "


cat /etc/default/grub

    # If you change this file, run 'update-grub' afterwards to update
    # /boot/grub/grub.cfg.
    # For full documentation of the options in this file, see:
    #   info -f grub -n 'Simple configuration'



    GRUB_DEFAULT=0
    GRUB_TIMEOUT_STYLE=hidden
    GRUB_TIMEOUT=0
    GRUB_DISTRIBUTOR=`lsb_release -i -s 2> /dev/null || echo Debian`
    GRUB_CMDLINE_LINUX_DEFAULT="quiet splash"
    GRUB_CMDLINE_LINUX=" default_hugepagesz=1GB hugepagesz=1G hugepages=8 transparent_hugepage=never"

    # Uncomment to enable BadRAM filtering, modify to suit your needs
    # This works with Linux (no patch required) and with any kernel that obtains
    # the memory map information from GRUB (GNU Mach, kernel of FreeBSD ...)

    #GRUB_BADRAM="0x01234567,0xfefefefe,0x89abcdef,0xefefefef"
    # Uncomment to disable graphical terminal (grub-pc only)

    #GRUB_TERMINAL=console

    # The resolution used on graphical terminal
    # note that you can use only modes which your graphic card supports via VBE
    # you can see them in real GRUB with the command `vbeinfo'
    #GRUB_GFXMODE=640x480

    # Uncomment if you don't want GRUB to pass "root=UUID=xxx" parameter to Linux
    #GRUB_DISABLE_LINUX_UUID=true


    # Uncomment to disable generation of recovery mode menu entries
    #GRUB_DISABLE_RECOVERY="true"


    # Uncomment to get a beep at grub start
    #GRUB_INIT_TUNE="480 440 1"


sudo update-grub
sudo vi /etc/fstab
# nodev /mnt/huge hugetlbfs defaults 0 0


    cat /etc/fstab
    # CLOUD_IMG: This file was created/modified by the Cloud Image build process
    UUID=3214843a-f90f-4908-bb0a-4b80480810c1       /        ext4   defaults,discard        0 0
    UUID=22B8-1E20  /boot/efi       vfat    defaults,discard        0 0
    nodev /mnt/huge hugetlbfs defaults 0 0
    /dev/disk/cloud/azure_resource-part1    /mnt    auto    defaults,nofail,x-systemd.requires=cloud-init.service,comment=cloudconfig       0       2
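Instead of editing /etc/fstab by hand, the hugetlbfs entry can be appended only when it is missing, so that re-running the setup does not duplicate it. The helper below is a minimal sketch; the name `ensure_line` is hypothetical, and writing to the real file needs a root shell.

```shell
#!/usr/bin/env bash
# Append a line to a file only if that exact line is not already present.
# Assumes the file, if it exists, ends with a newline.
ensure_line() {
    local file="$1" line="$2"
    grep -qxF -- "$line" "$file" 2>/dev/null || printf '%s\n' "$line" >> "$file"
}

# Example (run as root for the real file):
#   ensure_line /etc/fstab 'nodev /mnt/huge hugetlbfs defaults 0 0'
```

The same helper could be used for the module names added to /etc/modules-load.d/modules.conf.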

Prevent renaming of dtapX

sudo vi /etc/systemd/network/10-dtap.link

# [Match]
# OriginalName=dtap*
# [Link]
# NamePolicy=kernel

cat /etc/systemd/network/10-dtap.link

    [Match]
    OriginalName=dtap*
    [Link]
    NamePolicy=kernel

Load Azure drivers on reboot

sudo vi /etc/modules-load.d/modules.conf
# ib_uverbs
# mlx4_ib
# mlx5_ib



    cat /etc/modules-load.d/modules.conf
    # /etc/modules: kernel modules to load at boot time.
    #
    # This file contains the names of kernel modules that should be loaded
    # at boot time, one per line. Lines beginning with "#" are ignored.
    ib_uverbs
    mlx4_ib
    mlx5_ib

Reboot

sudo reboot

After reboot

Validate huge pages and drivers loaded

cat /proc/meminfo | grep Huge
lsmod | grep ib_uverbs
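The two checks above can also be scripted so that a missing module or an empty huge page pool is easier to spot. This is a sketch under the assumption that the three modules listed in modules.conf are the ones required; the function names are hypothetical.

```shell
#!/usr/bin/env bash
# Print HugePages_Total from /proc/meminfo-style text on stdin.
hugepages_total() {
    awk '$1 == "HugePages_Total:" { print $2 }'
}

# Read lsmod-style text on stdin and print each required module that is
# missing. Prints nothing when all three are loaded.
missing_modules() {
    local loaded m
    loaded="$(awk 'NR > 1 { print $1 }')"   # skip the lsmod header row
    for m in ib_uverbs mlx4_ib mlx5_ib; do
        printf '%s\n' "$loaded" | grep -qx -- "$m" || echo "$m"
    done
}

# Usage on the VM:
#   hugepages_total < /proc/meminfo   # should print 8 if hugepages=8 was honoured
#   lsmod | missing_modules
```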

Follow Azure DPDK build - v2.79 and earlier

Build steps to get autoconf files

tar xJf dpdk-19.05.tar.xz
cd ~/dpdk-19.05
make config T=x86_64-native-linuxapp-gcc
sed -ri 's,(MLX._PMD=)n,\1y,' build/.config
make
Note
If the IGB_UIO build fails, add "-Wno-implicit-fallthrough" to dpdk-19.05/build/build/kernel/linux/igb_uio/Makefile
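The `sed` step in the build flips the MLX4 and MLX5 poll-mode drivers from `n` to `y` in the generated config and leaves every other option alone. The sketch below applies the same substitution to standard input so its effect can be inspected; the wrapper name `enable_mlx_pmds` is hypothetical, and the example lines in the usage comment are illustrative samples rather than a full `.config`.

```shell
#!/usr/bin/env bash
# Same substitution as in the build steps, applied to stdin instead of
# build/.config. -E is used in place of -r; GNU sed accepts both.
enable_mlx_pmds() {
    sed -E 's,(MLX._PMD=)n,\1y,'
}

# Usage:
#   printf 'CONFIG_RTE_LIBRTE_MLX5_PMD=n\n' | enable_mlx_pmds
#   enable_mlx_pmds < build/.config | grep MLX
```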

Validate DPDK with testpmd

cd ~/dpdk-19.05/build/app
sudo ./testpmd -w 0002:00:02.0 -w 0003:00:02.0 --vdev="net_vdev_netvsc0,iface=eth1" --vdev="net_vdev_netvsc1,iface=eth2" -- -i
# validate dtapX exist (no renaming)


# From another shell, with testpmd running, verify the hv_netvsc ethX and dtapX interfaces
ifconfig -a | grep flags
dtap0: flags=4419<UP,BROADCAST,RUNNING,PROMISC,MULTICAST>  mtu 1500
dtap1: flags=4419<UP,BROADCAST,RUNNING,PROMISC,MULTICAST>  mtu 1500
eth0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
eth1: flags=4419<UP,BROADCAST,RUNNING,PROMISC,MULTICAST>  mtu 1500
eth2: flags=4419<UP,BROADCAST,RUNNING,PROMISC,MULTICAST>  mtu 1500
lo: flags=73<UP,LOOPBACK,RUNNING>  mtu 65536
rename5: flags=6211<UP,BROADCAST,RUNNING,SLAVE,MULTICAST>  mtu 1500
rename6: flags=6211<UP,BROADCAST,RUNNING,SLAVE,MULTICAST>  mtu 1500
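To check mechanically that the dtapX interfaces kept their names, the interface list can be filtered for names of the form dtapN. This is a sketch that parses the same `ifconfig -a | grep flags` text shown above; the function name `dtap_ifaces` is hypothetical.

```shell
#!/usr/bin/env bash
# Read `ifconfig -a | grep flags` style text on stdin and print the names of
# interfaces that still carry a dtapN name (that is, were not renamed).
dtap_ifaces() {
    awk -F: '$1 ~ /^dtap[0-9]+$/ { print $1 }'
}

# Usage, with testpmd running in another shell:
#   ifconfig -a | grep flags | dtap_ifaces
```

With two netvsc vdevs, as in the testpmd command above, two names (dtap0 and dtap1) are expected.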

Copy autoconf files

From DPDK to use with TREX build

cd ~/dpdk-19.05/
find . -name "*autoconf.h"
./build/build/drivers/net/tap/tap_autoconf.h
./build/build/drivers/net/mlx4/mlx4_autoconf.h
./build/build/drivers/net/mlx5/mlx5_autoconf.h
cp ./build/build/drivers/net/tap/tap_autoconf.h ~/.
cp ./build/build/drivers/net/mlx4/mlx4_autoconf.h ~/.
cp ./build/build/drivers/net/mlx5/mlx5_autoconf.h ~/.
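The `find` and `cp` steps can be combined so that the copy does not depend on where a given DPDK release places each header (the mlx5 header lives under drivers/net in 19.05 and under drivers/common in 20.02). A minimal sketch, assuming the three header names listed above are distinct so a flat copy is safe; `collect_autoconf` is a hypothetical name.

```shell
#!/usr/bin/env bash
# Copy every *autoconf.h found under a DPDK build tree into one directory.
collect_autoconf() {
    local dpdk_dir="$1" dest="$2"
    mkdir -p "$dest"
    find "$dpdk_dir" -name '*autoconf.h' -exec cp {} "$dest"/ \;
}

# Example:
#   collect_autoconf ~/dpdk-19.05 ~
```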

Setup for build with TREX

cd ~
sudo apt-get install -y python3-distutils
sudo apt install zlib1g-dev
git clone https://github.com/cisco-system-traffic-generator/trex-core.git # better to take the latest

# reset head to v2.79 (TBD: debug issues with DPDK 20.02)
cd ~/trex-core
git reset --hard fe76d22b

# To compile DPDK (TAP, MLX4, MLX5) on the machine natively
# Copy in the autoconf files from DPDK build.
# In TAP dir (src/dpdk/drivers/net/tap) switch all #include <linux_tap/> to <linux/>
# Manually edit linux_dpdk/ws_main.py to set check_ofed to true and remove MLX4 define
# (see diffs below)
Copy the files
cd ~/trex-core
cp ~/tap_autoconf.h src/dpdk/drivers/net/tap/tap_autoconf.h
cp ~/mlx4_autoconf.h src/dpdk/drivers/net/mlx4/mlx4_autoconf.h
cp ~/mlx5_autoconf.h src/dpdk/drivers/net/mlx5/mlx5_autoconf.h
cd src/dpdk/drivers/net/tap
sed -i 's/linux_tap/linux/g' *
cd ~/trex-core
status
git status
modified:   linux_dpdk/ws_main.py
modified:   src/dpdk/drivers/net/mlx4/mlx4_autoconf.h
modified:   src/dpdk/drivers/net/mlx5/mlx5_autoconf.h
modified:   src/dpdk/drivers/net/tap/rte_eth_tap.c
modified:   src/dpdk/drivers/net/tap/rte_eth_tap.h
modified:   src/dpdk/drivers/net/tap/tap_autoconf.h
modified:   src/dpdk/drivers/net/tap/tap_netlink.h
modified:   src/dpdk/drivers/net/tap/tap_tcmsgs.h

Follow Azure DPDK build - v2.82 and later

Build steps to get autoconf files

tar xJf dpdk-20.02.tar.xz
cd ~/dpdk-20.02
make config T=x86_64-native-linuxapp-gcc
sed -ri 's,(MLX._PMD=)n,\1y,' build/.config
make
Note
If the IGB_UIO build fails, add "-Wno-implicit-fallthrough" to dpdk-20.02/build/build/kernel/linux/igb_uio/Makefile

Validate DPDK with testpmd

cd ~/dpdk-20.02/build/app
sudo ./testpmd -w 0002:00:02.0 -w 0003:00:02.0 --vdev="net_vdev_netvsc0,iface=eth1" --vdev="net_vdev_netvsc1,iface=eth2" -- -i
# validate dtapX exist (no renaming)


# From another shell, with testpmd running, verify the hv_netvsc ethX and dtapX interfaces
ifconfig -a | grep flags
dtap0: flags=4419<UP,BROADCAST,RUNNING,PROMISC,MULTICAST>  mtu 1500
dtap1: flags=4419<UP,BROADCAST,RUNNING,PROMISC,MULTICAST>  mtu 1500
eth0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
eth1: flags=4419<UP,BROADCAST,RUNNING,PROMISC,MULTICAST>  mtu 1500
eth2: flags=4419<UP,BROADCAST,RUNNING,PROMISC,MULTICAST>  mtu 1500
lo: flags=73<UP,LOOPBACK,RUNNING>  mtu 65536
rename5: flags=6211<UP,BROADCAST,RUNNING,SLAVE,MULTICAST>  mtu 1500
rename6: flags=6211<UP,BROADCAST,RUNNING,SLAVE,MULTICAST>  mtu 1500

Copy autoconf files

From DPDK to use with TREX build

cd ~/dpdk-20.02/
find . -name "*autoconf.h"
./build/build/drivers/net/mlx4/mlx4_autoconf.h
./build/build/drivers/net/tap/tap_autoconf.h
./build/build/drivers/common/mlx5/mlx5_autoconf.h

cp ./build/build/drivers/net/mlx4/mlx4_autoconf.h ~/.
cp ./build/build/drivers/net/tap/tap_autoconf.h ~/.
cp ./build/build/drivers/common/mlx5/mlx5_autoconf.h ~/.

Setup for build with TREX

cd ~
sudo apt-get install -y python3-distutils
sudo apt install zlib1g-dev
git clone https://github.com/cisco-system-traffic-generator/trex-core.git # better to take the latest

# To compile DPDK (TAP, MLX4, MLX5) on the machine natively
# Copy in the autoconf files from DPDK build.
# In TAP dir (src/dpdk/drivers/net/tap) switch all #include <linux_tap/> to <linux/>
# Manually edit linux_dpdk/ws_main.py to set check_ofed to true and remove MLX4 define
# (see diffs below)
Copy the files
cd ~/trex-core
cp ~/tap_autoconf.h src/dpdk/drivers/net/tap/tap_autoconf.h
cp ~/mlx4_autoconf.h src/dpdk/drivers/net/mlx4/mlx4_autoconf.h
cp ~/mlx5_autoconf.h src/dpdk/drivers/common/mlx5/mlx5_autoconf.h
cd src/dpdk/drivers/net/tap
sed -i 's/linux_tap/linux/g' *
cd ~/trex-core
status
git status
modified:   linux_dpdk/ws_main.py
modified:   src/dpdk/drivers/common/mlx5/mlx5_autoconf.h
modified:   src/dpdk/drivers/net/mlx4/mlx4_autoconf.h
modified:   src/dpdk/drivers/net/tap/rte_eth_tap.c
modified:   src/dpdk/drivers/net/tap/rte_eth_tap.h
modified:   src/dpdk/drivers/net/tap/tap_autoconf.h
modified:   src/dpdk/drivers/net/tap/tap_netlink.h
modified:   src/dpdk/drivers/net/tap/tap_tcmsgs.h
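The only difference between the two copy procedures is where mlx5_autoconf.h goes: src/dpdk/drivers/net/mlx5 for v2.79 and src/dpdk/drivers/common/mlx5 for v2.82. The sketch below picks the destination by checking which directory exists in the checkout. It assumes that a tree bundling DPDK 19.05 has no drivers/common/mlx5 directory; `mlx5_autoconf_dir` is a hypothetical name.

```shell
#!/usr/bin/env bash
# Print the directory (relative to the trex-core checkout) that should
# receive mlx5_autoconf.h: drivers/common/mlx5 when present, otherwise
# drivers/net/mlx5.
mlx5_autoconf_dir() {
    local trex_root="$1"
    if [ -d "$trex_root/src/dpdk/drivers/common/mlx5" ]; then
        echo "src/dpdk/drivers/common/mlx5"
    else
        echo "src/dpdk/drivers/net/mlx5"
    fi
}

# Example:
#   cd ~/trex-core
#   cp ~/mlx5_autoconf.h "$(mlx5_autoconf_dir .)/mlx5_autoconf.h"
```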

Follow Azure DPDK build - common

Diff
  diff --git a/linux_dpdk/ws_main.py b/linux_dpdk/ws_main.py
  index 7d685f61..82656fc4 100755
  --- a/linux_dpdk/ws_main.py
  +++ b/linux_dpdk/ws_main.py
  @@ -209,6 +209,7 @@ def check_ofed(ctx):
       ofed_ver= 42
       ofed_ver_show= '4.2'
  +    return True
       if not os.path.isfile(ofed_info):
           ctx.end_msg('not found', 'YELLOW')
           return False
  @@ -1478,8 +1479,6 @@ class build_option:
              flags += ['-DNDEBUG'];
          else:
              flags += ['-UNDEBUG'];
  -        if bld.env.OFED_OK:
  -            flags += ['-DHAVE_IBV_MLX4_WQE_LSO_SEG=1']
           return (flags)
       def get_common_flags (self):

Build TRex after modifications for native build on Ubuntu

cd linux_dpdk
./b configure
./b build

Create trex_cfg.yaml for the system

Typical setup routes

sudo route add -net 16.0.0.0 netmask 255.0.0.0 gw 10.90.130.202
sudo route add -net 48.0.0.0 netmask 255.0.0.0 gw 10.90.23.202
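The `route` command comes from net-tools, which may not be installed on every image. The same routes can be expressed with iproute2 as `ip route add NET/PREFIX via GW`. The sketch below converts the netmask form and prints the equivalent command without running it; both function names are hypothetical, and the netmask is assumed to be valid and contiguous.

```shell
#!/usr/bin/env bash
# Convert a dotted netmask to a prefix length by counting set bits,
# e.g. 255.0.0.0 gives 8. No validation is done.
netmask_to_prefix() {
    local IFS=. octet bits=0
    for octet in $1; do
        while [ "$octet" -gt 0 ]; do
            bits=$(( bits + (octet & 1) ))
            octet=$(( octet >> 1 ))
        done
    done
    echo "$bits"
}

# Print (do not run) the iproute2 form of
# `route add -net NET netmask MASK gw GW`.
ip_route_cmd() {
    echo "ip route add $1/$(netmask_to_prefix "$2") via $3"
}

# Example:
#   ip_route_cmd 16.0.0.0 255.0.0.0 10.90.130.202
```

The printed command still has to be run with sudo, like the `route` commands above.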

Turn off TSO, GRO and GSO

sudo ethtool -K eth1 tso off gro off gso off
sudo ethtool -K eth2 tso off gro off gso off
sudo ethtool -K rename5 tso off gro off gso off
sudo ethtool -K rename6 tso off gro off gso off

Run TREX

cd ~/trex-core/scripts
sudo ./t-rex-64 -i -c 1 -v 7 --no-ofed-check

Run tui/bench

stty cols 111 rows 45
cd ~/trex-core/scripts
./trex-console
trex> tui
tui> start -f stl/bench_azure.py -t vm=cached,size=1518 -m 230kpps --port 0 1 --force
tui> start -f stl/bench_azure.py -t vm=cached,size=imix -m 1mpps --port 0 1 --force
Note
The Azure profiles (imix_azure, bench_azure) are not standard IMIX: they send 72-byte UDP packets instead of the 64 bytes (+FCS) used in the standard, because 64B packets are not optimized in Azure.
TUI output
port       |         0         |         1         |       total
-----------+-------------------+-------------------+------------------
owner      |         azureuser |         azureuser |
link       |                UP |                UP |
state      |      TRANSMITTING |      TRANSMITTING |
speed      |           10 Gb/s |           10 Gb/s |
CPU util.  |            15.23% |            15.23% |
--         |                   |                   |
Tx bps L2  |         3.04 Gbps |         3.04 Gbps |         6.08 Gbps
Tx bps L1  |          3.2 Gbps |          3.2 Gbps |          6.4 Gbps
Tx pps     |            1 Mpps |            1 Mpps |            2 Mpps
Line Util. |               8 % |               8 % |
---        |                   |                   |
Rx bps     |         3.04 Gbps |         3.04 Gbps |         6.08 Gbps
Rx pps     |            1 Mpps |            1 Mpps |            2 Mpps
----       |                   |                   |
opackets   |          14529463 |          14529417 |          29058880
ipackets   |          14528604 |          14528705 |          29057309
obytes     |        5521195940 |        5521179078 |       11042375018
ibytes     |        5520871800 |        5520908626 |       11041780426
tx-pkts    |       14.53 Mpkts |       14.53 Mpkts |       29.06 Mpkts
rx-pkts    |       14.53 Mpkts |       14.53 Mpkts |       29.06 Mpkts
tx-bytes   |           5.52 GB |           5.52 GB |          11.04 GB
rx-bytes   |           5.52 GB |           5.52 GB |          11.04 GB
-----      |                   |                   |
oerrors    |                 0 |                 0 |                 0
ierrors    |                 0 |                 0 |                 0
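The L1 and L2 rates in the TUI output can be cross-checked with a little arithmetic. The sketch below assumes the usual 20 bytes of per-packet wire overhead (8-byte preamble plus 12-byte inter-frame gap), which is consistent with the figures shown for the IMIX run; the function names are hypothetical.

```shell
#!/usr/bin/env bash
# L1 rate = L2 rate + per-packet wire overhead (assumed 20 bytes per packet),
# all in bits per second.
l1_bps() {
    local l2_bps="$1" pps="$2"
    echo $(( l2_bps + pps * 20 * 8 ))
}

# Average L2 packet size in bytes from byte and packet counters
# (integer division).
avg_pkt_bytes() {
    local bytes="$1" pkts="$2"
    echo $(( bytes / pkts ))
}

# Port 0 figures from the table above:
#   l1_bps 3040000000 1000000          # L2 3.04 Gbps at 1 Mpps
#   avg_pkt_bytes 5521195940 14529463  # obytes / opackets
```

For port 0 this gives 3.2 Gbps at L1 and an average of 380 bytes per packet, matching the Tx bps L1 row and the 3.04 Gbps L2 rate at 1 Mpps.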

Run ndr scripts (new from v2.70)

NDR script
  ./ndr --stl --port 0 1 -v --profile stl/imix_azure.py  --force-map --pdr 0.1 --bi-dir