Merge tag 'net-next-6.3' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next

Pull networking updates from Jakub Kicinski:

"Core:

 - Add dedicated kmem_cache for typical/small skb->head, avoid having to access struct page at kfree time, and improve memory use.
 - Introduce sysctl to set default RPS configuration for new netdevs.
 - Define Netlink protocol specification format which can be used to describe messages used by each family and auto-generate parsers. Add tools for generating kernel data structures and uAPI headers.
 - Expose all net/core sysctls inside netns.
 - Remove 4s sleep in netpoll if carrier is instantly detected on boot.
 - Add configurable limit of MDB entries per port, and port-vlan.
 - Continue populating drop reasons throughout the stack.
 - Retire a handful of legacy Qdiscs and classifiers.

Protocols:

 - Support IPv4 big TCP (TSO frames larger than 64kB).
 - Add IP_LOCAL_PORT_RANGE socket option, to control local port range on socket by socket basis.
 - Track and report in procfs number of MPTCP sockets used.
 - Support mixing IPv4 and IPv6 flows in the in-kernel MPTCP path manager.
 - IPv6: don't check net.ipv6.route.max_size and rely on garbage collection to free memory (similarly to IPv4).
 - Support Penultimate Segment Pop (PSP) flavor in SRv6 (RFC8986).
 - ICMP: add per-rate limit counters.
 - Add support for user scanning requests in ieee802154.
 - Remove static WEP support.
 - Support minimal Wi-Fi 7 Extremely High Throughput (EHT) rate reporting.
 - WiFi 7 EHT channel puncturing support (client & AP).

BPF:

 - Add a rbtree data structure following the "next-gen data structure" precedent set by recently added linked list, that is, by using kfunc + kptr instead of adding a new BPF map type.
 - Expose XDP hints via kfuncs with initial support for RX hash and timestamp metadata.
 - Add BPF_F_NO_TUNNEL_KEY extension to bpf_skb_set_tunnel_key to better support decap on GRE tunnel devices not operating in collect metadata.
 - Improve x86 JIT's codegen for PROBE_MEM runtime error checks.
 - Remove the need for trace_printk_lock for bpf_trace_printk and bpf_trace_vprintk helpers.
 - Extend libbpf's bpf_tracing.h support for tracing arguments of kprobes/uprobes and syscall as a special case.
 - Significantly reduce the search time for module symbols by livepatch and BPF.
 - Enable cpumasks to be used as kptrs, which is useful for tracing programs tracking which tasks end up running on which CPUs in different time intervals.
 - Add support for BPF trampoline on s390x and riscv64.
 - Add capability to export the XDP features supported by the NIC.
 - Add __bpf_kfunc tag for marking kernel functions as kfuncs.
 - Add cgroup.memory=nobpf kernel parameter option to disable BPF memory accounting for container environments.

Netfilter:

 - Remove the CLUSTERIP target. It has been marked as obsolete for years, and we still have WARN splats wrt races of the out-of-band /proc interface installed by this target.
 - Add 'destroy' commands to nf_tables. They are identical to the existing 'delete' commands, but do not return an error if the referenced object (set, chain, rule...) did not exist.

Driver API:

 - Improve cpumask_local_spread() locality to help NICs set the right IRQ affinity on AMD platforms.
 - Separate C22 and C45 MDIO bus transactions more clearly.
 - Introduce new DCB table to control DSCP rewrite on egress.
 - Support configuration of Physical Layer Collision Avoidance (PLCA) Reconciliation Sublayer (RS) (802.3cg-2019). Modern version of shared medium Ethernet.
 - Support for MAC Merge layer (IEEE 802.3-2018 clause 99). Allowing preemption of low priority frames by high priority frames.
 - Add support for controlling MACSec offload using netlink SET.
 - Rework devlink instance refcounts to allow registration and de-registration under the instance lock. Split the code into multiple files, drop some of the unnecessarily granular locks and factor out common parts of netlink operation handling.
 - Add TX frame aggregation parameters (for USB drivers).
 - Add a new attr TCA_EXT_WARN_MSG to report TC (offload) warning messages with notifications for debug.
 - Allow offloading of UDP NEW connections via act_ct.
 - Add support for per action HW stats in TC.
 - Support hardware miss to TC action (continue processing in SW from a specific point in the action chain).
 - Warn if old Wireless Extension user space interface is used with modern cfg80211/mac80211 drivers. Do not support Wireless Extensions for Wi-Fi 7 devices at all. Everyone should switch to using nl80211 interface instead.
 - Improve the CAN bit timing configuration. Use extack to return error messages directly to user space, update the SJW handling, including the definition of a new default value that will benefit CAN-FD controllers, by increasing their oscillator tolerance.

New hardware / drivers:

 - Ethernet:
    - nVidia BlueField-3 support (control traffic driver)
    - Ethernet support for imx93 SoCs
    - Motorcomm yt8531 gigabit Ethernet PHY
    - onsemi NCN26000 10BASE-T1S PHY (with support for PLCA)
    - Microchip LAN8841 PHY (incl. cable diagnostics and PTP)
    - Amlogic gxl MDIO mux
 - WiFi:
    - RealTek RTL8188EU (rtl8xxxu)
    - Qualcomm Wi-Fi 7 devices (ath12k)
 - CAN:
    - Renesas R-Car V4H

Drivers:

 - Bluetooth:
    - Set Per Platform Antenna Gain (PPAG) for Intel controllers.
 - Ethernet NICs:
    - Intel (1G, igc):
       - support TSN / Qbv / packet scheduling features of i226 model
    - Intel (100G, ice):
       - use GNSS subsystem instead of TTY
       - multi-buffer XDP support
       - extend support for GPIO pins to E823 devices
    - nVidia/Mellanox:
       - update the shared buffer configuration on PFC commands
       - implement PTP adjphase function for HW offset control
       - TC support for Geneve and GRE with VF tunnel offload
       - more efficient crypto key management method
       - multi-port eswitch support
    - Netronome/Corigine:
       - add DCB IEEE support
       - support IPsec offloading for NFP3800
    - Freescale/NXP (enetc):
       - support XDP_REDIRECT for XDP non-linear buffers
       - improve reconfig, avoid link flap and waiting for idle
       - support MAC Merge layer
    - Other NICs:
       - sfc/ef100: add basic devlink support for ef100
       - ionic: rx_push mode operation (writing descriptors via MMIO)
       - bnxt: use the auxiliary bus abstraction for RDMA
       - r8169: disable ASPM and reset bus in case of tx timeout
       - cpsw: support QSGMII mode for J721e CPSW9G
       - cpts: support pulse-per-second output
       - ngbe: add an mdio bus driver
       - usbnet: optimize usbnet_bh() by avoiding unnecessary queuing
       - r8152: handle devices with FW with NCM support
       - amd-xgbe: support 10Mbps, 2.5GbE speeds and rx-adaptation
       - virtio-net: support multi buffer XDP
       - virtio/vsock: replace virtio_vsock_pkt with sk_buff
       - tsnep: XDP support
 - Ethernet high-speed switches:
    - nVidia/Mellanox (mlxsw):
       - add support for latency TLV (in FW control messages)
    - Microchip (sparx5):
       - separate explicit and implicit traffic forwarding rules, make the implicit rules always active
       - add support for egress DSCP rewrite
       - IS0 VCAP support (Ingress Classification)
       - IS2 VCAP filters (protos, L3 addrs, L4 ports, flags, ToS etc.)
       - ES2 VCAP support (Egress Access Control)
       - support for Per-Stream Filtering and Policing (802.1Q, 8.6.5.1)
 - Ethernet embedded switches:
    - Marvell (mv88e6xxx):
       - add MAB (port auth) offload support
       - enable PTP receive for mv88e6390
    - NXP (ocelot):
       - support MAC Merge layer
       - support for the vsc7512 internal copper phys
    - Microchip:
       - lan9303: convert to PHYLINK
       - lan966x: support TC flower filter statistics
       - lan937x: PTP support for KSZ9563/KSZ8563 and LAN937x
       - lan937x: support Credit Based Shaper configuration
       - ksz9477: support Energy Efficient Ethernet
    - other:
       - qca8k: convert to regmap read/write API, use bulk operations
       - rswitch: Improve TX timestamp accuracy
 - Intel WiFi (iwlwifi):
    - EHT (Wi-Fi 7) rate reporting
    - STEP equalizer support: transfer some STEP (connection to radio on platforms with integrated wifi) related parameters from the BIOS to the firmware.
 - Qualcomm 802.11ax WiFi (ath11k):
    - IPQ5018 support
    - Fine Timing Measurement (FTM) responder role support
    - channel 177 support
 - MediaTek WiFi (mt76):
    - per-PHY LED support
    - mt7996: EHT (Wi-Fi 7) support
    - Wireless Ethernet Dispatch (WED) reset support
    - switch to using page pool allocator
 - RealTek WiFi (rtw89):
    - support new version of Bluetooth co-existence
 - Mobile:
    - rmnet: support TX aggregation"

* tag 'net-next-6.3' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (1872 commits)
  page_pool: add a comment explaining the fragment counter usage
  net: ethtool: fix __ethtool_dev_mm_supported() implementation
  ethtool: pse-pd: Fix double word in comments
  xsk: add linux/vmalloc.h to xsk.c
  sefltests: netdevsim: wait for devlink instance after netns removal
  selftest: fib_tests: Always cleanup before exit
  net/mlx5e: Align IPsec ASO result memory to be as required by hardware
  net/mlx5e: TC, Set CT miss to the specific ct action instance
  net/mlx5e: Rename CHAIN_TO_REG to MAPPED_OBJ_TO_REG
  net/mlx5: Refactor tc miss handling to a single function
  net/mlx5: Kconfig: Make tc offload depend on tc skb extension
  net/sched: flower: Support hardware miss to tc action
  net/sched: flower: Move filter handle initialization earlier
  net/sched: cls_api: Support hardware miss to tc action
  net/sched: Rename user cookie and act cookie
  sfc: fix builds without CONFIG_RTC_LIB
  sfc: clean up some inconsistent indentings
  net/mlx4_en: Introduce flexible array to silence overflow warning
  net: lan966x: Fix possible deadlock inside PTP
  net/ulp: Remove redundant ->clone() test in inet_clone_ulp().
  ...
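
Among the protocol changes listed in the pull message, IP_LOCAL_PORT_RANGE is a per-socket option, so a short sketch may help show how it could be used from user space. The encoding below (lower bound in the low 16 bits, upper bound in the high 16 bits of a 32-bit value) follows the ip(7) manual page rather than anything stated in this commit message; the helper names `pack_local_port_range()` and `set_local_port_range()` are hypothetical, and the fallback definition of the option number is an assumption for builds against older libc headers.

```c
#include <stdint.h>
#include <netinet/in.h>
#include <sys/socket.h>

#ifndef IP_LOCAL_PORT_RANGE
/* Assumption: option number from the Linux uAPI header, for older libc headers. */
#define IP_LOCAL_PORT_RANGE 51
#endif

/* Hypothetical helper: pack an inclusive [lo, hi] port range into the
 * 32-bit option value (low 16 bits = lower bound, high 16 bits = upper bound).
 */
uint32_t pack_local_port_range(uint16_t lo, uint16_t hi)
{
	return ((uint32_t)hi << 16) | lo;
}

/* Hypothetical helper: restrict the local ports the kernel may pick for `fd`.
 * Returns 0 on success, -1 with errno set otherwise (kernels that predate
 * the option are expected to reject it).
 */
int set_local_port_range(int fd, uint16_t lo, uint16_t hi)
{
	uint32_t range = pack_local_port_range(lo, hi);

	return setsockopt(fd, IPPROTO_IP, IP_LOCAL_PORT_RANGE, &range, sizeof(range));
}
```

A caller would typically invoke `set_local_port_range(fd, 40000, 40100)` before `bind()` or `connect()` so that the automatic port selection stays within the chosen range.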
author: Linus Torvalds <torvalds@linux-foundation.org> 2023-02-21 18:24:12 -0800
committer: Linus Torvalds <torvalds@linux-foundation.org> 2023-02-21 18:24:12 -0800
commit: 5b7c4cabbb65f5c469464da6c5f614cbd7f730f2 (patch)
tree: cc5c2d0a898769fd59549594fedb3ee6f84e59a0 /tools/testing/selftests/bpf/prog_tests/xdp_bonding.c
download: linux-5b7c4cabbb65f5c469464da6c5f614cbd7f730f2.tar.gz
linux-5b7c4cabbb65f5c469464da6c5f614cbd7f730f2.zip
1 files changed, 578 insertions, 0 deletions
diff --git a/tools/testing/selftests/bpf/prog_tests/xdp_bonding.c b/tools/testing/selftests/bpf/prog_tests/xdp_bonding.c
new file mode 100644
index 000000000..5e3a26b15
--- /dev/null
+++ b/tools/testing/selftests/bpf/prog_tests/xdp_bonding.c
@@ -0,0 +1,578 @@
+// SPDX-License-Identifier: GPL-2.0
+
+/**
+ * Test XDP bonding support
+ *
+ * Sets up two bonded veth pairs between two fresh namespaces
+ * and verifies that XDP_TX programs loaded on a bond device
+ * are correctly loaded onto the slave devices and XDP_TX'd
+ * packets are balanced using bonding.
+ */
+
+#define _GNU_SOURCE
+#include <sched.h>
+#include <net/if.h>
+#include <linux/if_link.h>
+#include "test_progs.h"
+#include "network_helpers.h"
+#include <linux/if_bonding.h>
+#include <linux/limits.h>
+#include <linux/udp.h>
+
+#include "xdp_dummy.skel.h"
+#include "xdp_redirect_multi_kern.skel.h"
+#include "xdp_tx.skel.h"
+
+#define BOND1_MAC {0x00, 0x11, 0x22, 0x33, 0x44, 0x55}
+#define BOND1_MAC_STR "00:11:22:33:44:55"
+#define BOND2_MAC {0x00, 0x22, 0x33, 0x44, 0x55, 0x66}
+#define BOND2_MAC_STR "00:22:33:44:55:66"
+#define NPACKETS 100
+
+static int root_netns_fd = -1;
+
+static void restore_root_netns(void)
+{
+	ASSERT_OK(setns(root_netns_fd, CLONE_NEWNET), "restore_root_netns");
+}
+
+static int setns_by_name(char *name)
+{
+	int nsfd, err;
+	char nspath[PATH_MAX];
+
+	snprintf(nspath, sizeof(nspath), "%s/%s", "/var/run/netns", name);
+	nsfd = open(nspath, O_RDONLY | O_CLOEXEC);
+	if (nsfd < 0)
+		return -1;
+
+	err = setns(nsfd, CLONE_NEWNET);
+	close(nsfd);
+	return err;
+}
+
+static int get_rx_packets(const char *iface)
+{
+	FILE *f;
+	char line[512];
+	int iface_len = strlen(iface);
+
+	f = fopen("/proc/net/dev", "r");
+	if (!f)
+		return -1;
+
+	while (fgets(line, sizeof(line), f)) {
+		char *p = line;
+
+		while (*p == ' ')
+			p++; /* skip whitespace */
+		if (!strncmp(p, iface, iface_len)) {
+			p += iface_len;
+			if (*p++ != ':')
+				continue;
+			while (*p == ' ')
+				p++; /* skip whitespace */
+			while (*p && *p != ' ')
+				p++; /* skip rx bytes */
+			while (*p == ' ')
+				p++; /* skip whitespace */
+			fclose(f);
+			return atoi(p);
+		}
+	}
+	fclose(f);
+	return -1;
+}
+
+#define MAX_BPF_LINKS 8
+
+struct skeletons {
+	struct xdp_dummy *xdp_dummy;
+	struct xdp_tx *xdp_tx;
+	struct xdp_redirect_multi_kern *xdp_redirect_multi_kern;
+
+	int nlinks;
+	struct bpf_link *links[MAX_BPF_LINKS];
+};
+
+static int xdp_attach(struct skeletons *skeletons, struct bpf_program *prog, char *iface)
+{
+	struct bpf_link *link;
+	int ifindex;
+
+	ifindex = if_nametoindex(iface);
+	if (!ASSERT_GT(ifindex, 0, "get ifindex"))
+		return -1;
+
+	if (!ASSERT_LE(skeletons->nlinks+1, MAX_BPF_LINKS, "too many XDP programs attached"))
+		return -1;
+
+	link = bpf_program__attach_xdp(prog, ifindex);
+	if (!ASSERT_OK_PTR(link, "attach xdp program"))
+		return -1;
+
+	skeletons->links[skeletons->nlinks++] = link;
+	return 0;
+}
+
+enum {
+	BOND_ONE_NO_ATTACH = 0,
+	BOND_BOTH_AND_ATTACH,
+};
+
+static const char * const mode_names[] = {
+	[BOND_MODE_ROUNDROBIN]   = "balance-rr",
+	[BOND_MODE_ACTIVEBACKUP] = "active-backup",
+	[BOND_MODE_XOR]          = "balance-xor",
+	[BOND_MODE_BROADCAST]    = "broadcast",
+	[BOND_MODE_8023AD]       = "802.3ad",
+	[BOND_MODE_TLB]          = "balance-tlb",
+	[BOND_MODE_ALB]          = "balance-alb",
+};
+
+static const char * const xmit_policy_names[] = {
+	[BOND_XMIT_POLICY_LAYER2]       = "layer2",
+	[BOND_XMIT_POLICY_LAYER34]      = "layer3+4",
+	[BOND_XMIT_POLICY_LAYER23]      = "layer2+3",
+	[BOND_XMIT_POLICY_ENCAP23]      = "encap2+3",
+	[BOND_XMIT_POLICY_ENCAP34]      = "encap3+4",
+};
+
+static int bonding_setup(struct skeletons *skeletons, int mode, int xmit_policy,
+			 int bond_both_attach)
+{
+#define SYS(fmt, ...)						\
+	({							\
+		char cmd[1024];					\
+		snprintf(cmd, sizeof(cmd), fmt, ##__VA_ARGS__);	\
+		if (!ASSERT_OK(system(cmd), cmd))		\
+			return -1;				\
+	})
+
+	SYS("ip netns add ns_dst");
+	SYS("ip link add veth1_1 type veth peer name veth2_1 netns ns_dst");
+	SYS("ip link add veth1_2 type veth peer name veth2_2 netns ns_dst");
+
+	SYS("ip link add bond1 type bond mode %s xmit_hash_policy %s",
+	    mode_names[mode], xmit_policy_names[xmit_policy]);
+	SYS("ip link set bond1 up address " BOND1_MAC_STR " addrgenmode none");
+	SYS("ip -netns ns_dst link add bond2 type bond mode %s xmit_hash_policy %s",
+	    mode_names[mode], xmit_policy_names[xmit_policy]);
+	SYS("ip -netns ns_dst link set bond2 up address " BOND2_MAC_STR " addrgenmode none");
+
+	SYS("ip link set veth1_1 master bond1");
+	if (bond_both_attach == BOND_BOTH_AND_ATTACH) {
+		SYS("ip link set veth1_2 master bond1");
+	} else {
+		SYS("ip link set veth1_2 up addrgenmode none");
+
+		if (xdp_attach(skeletons, skeletons->xdp_dummy->progs.xdp_dummy_prog, "veth1_2"))
+			return -1;
+	}
+
+	SYS("ip -netns ns_dst link set veth2_1 master bond2");
+
+	if (bond_both_attach == BOND_BOTH_AND_ATTACH)
+		SYS("ip -netns ns_dst link set veth2_2 master bond2");
+	else
+		SYS("ip -netns ns_dst link set veth2_2 up addrgenmode none");
+
+	/* Load a dummy program on the sending side, as with veth the peer needs
+	 * to have an XDP program loaded as well.
+	 */
+	if (xdp_attach(skeletons, skeletons->xdp_dummy->progs.xdp_dummy_prog, "bond1"))
+		return -1;
+
+	if (bond_both_attach == BOND_BOTH_AND_ATTACH) {
+		if (!ASSERT_OK(setns_by_name("ns_dst"), "set netns to ns_dst"))
+			return -1;
+
+		if (xdp_attach(skeletons, skeletons->xdp_tx->progs.xdp_tx, "bond2"))
+			return -1;
+
+		restore_root_netns();
+	}
+
+	return 0;
+
+#undef SYS
+}
+
+static void bonding_cleanup(struct skeletons *skeletons)
+{
+	restore_root_netns();
+	while (skeletons->nlinks) {
+		skeletons->nlinks--;
+		bpf_link__destroy(skeletons->links[skeletons->nlinks]);
+	}
+	ASSERT_OK(system("ip link delete bond1"), "delete bond1");
+	ASSERT_OK(system("ip link delete veth1_1"), "delete veth1_1");
+	ASSERT_OK(system("ip link delete veth1_2"), "delete veth1_2");
+	ASSERT_OK(system("ip netns delete ns_dst"), "delete ns_dst");
+}
+
+static int send_udp_packets(int vary_dst_ip)
+{
+	struct ethhdr eh = {
+		.h_source = BOND1_MAC,
+		.h_dest = BOND2_MAC,
+		.h_proto = htons(ETH_P_IP),
+	};
+	struct iphdr iph = {};
+	struct udphdr uh = {};
+	uint8_t buf[128];
+	int i, s = -1;
+	int ifindex;
+
+	s = socket(AF_PACKET, SOCK_RAW, IPPROTO_RAW);
+	if (!ASSERT_GE(s, 0, "socket"))
+		goto err;
+
+	ifindex = if_nametoindex("bond1");
+	if (!ASSERT_GT(ifindex, 0, "get bond1 ifindex"))
+		goto err;
+
+	iph.ihl = 5;
+	iph.version = 4;
+	iph.tos = 16;
+	iph.id = 1;
+	iph.ttl = 64;
+	iph.protocol = IPPROTO_UDP;
+	iph.saddr = 1;
+	iph.daddr = 2;
+	iph.tot_len = htons(sizeof(buf) - ETH_HLEN);
+	iph.check = 0;
+
+	for (i = 1; i <= NPACKETS; i++) {
+		int n;
+		struct sockaddr_ll saddr_ll = {
+			.sll_ifindex = ifindex,
+			.sll_halen = ETH_ALEN,
+			.sll_addr = BOND2_MAC,
+		};
+
+		/* vary the UDP destination port for even distribution with roundrobin/xor modes */
+		uh.dest++;
+
+		if (vary_dst_ip)
+			iph.daddr++;
+
+		/* construct a packet */
+		memcpy(buf, &eh, sizeof(eh));
+		memcpy(buf + sizeof(eh), &iph, sizeof(iph));
+		memcpy(buf + sizeof(eh) + sizeof(iph), &uh, sizeof(uh));
+
+		n = sendto(s, buf, sizeof(buf), 0, (struct sockaddr *)&saddr_ll, sizeof(saddr_ll));
+		if (!ASSERT_EQ(n, sizeof(buf), "sendto"))
+			goto err;
+	}
+	close(s);
+	return 0;
+
+err:
+	if (s >= 0)
+		close(s);
+	return -1;
+}
+
+static void test_xdp_bonding_with_mode(struct skeletons *skeletons, int mode, int xmit_policy)
+{
+	int bond1_rx;
+
+	if (bonding_setup(skeletons, mode, xmit_policy, BOND_BOTH_AND_ATTACH))
+		goto out;
+
+	if (send_udp_packets(xmit_policy != BOND_XMIT_POLICY_LAYER34))
+		goto out;
+
+	bond1_rx = get_rx_packets("bond1");
+	ASSERT_EQ(bond1_rx, NPACKETS, "expected more received packets");
+
+	switch (mode) {
+	case BOND_MODE_ROUNDROBIN:
+	case BOND_MODE_XOR: {
+		int veth1_rx = get_rx_packets("veth1_1");
+		int veth2_rx = get_rx_packets("veth1_2");
+		int diff = abs(veth1_rx - veth2_rx);
+
+		ASSERT_GE(veth1_rx + veth2_rx, NPACKETS, "expected more packets");
+
+		switch (xmit_policy) {
+		case BOND_XMIT_POLICY_LAYER2:
+			ASSERT_GE(diff, NPACKETS,
+				  "expected packets on only one of the interfaces");
+			break;
+		case BOND_XMIT_POLICY_LAYER23:
+		case BOND_XMIT_POLICY_LAYER34:
+			ASSERT_LT(diff, NPACKETS/2,
+				  "expected even distribution of packets");
+			break;
+		default:
+			PRINT_FAIL("Unimplemented xmit_policy=%d\n", xmit_policy);
+			break;
+		}
+		break;
+	}
+	case BOND_MODE_ACTIVEBACKUP: {
+		int veth1_rx = get_rx_packets("veth1_1");
+		int veth2_rx = get_rx_packets("veth1_2");
+		int diff = abs(veth1_rx - veth2_rx);
+
+		ASSERT_GE(diff, NPACKETS,
+			  "expected packets on only one of the interfaces");
+		break;
+	}
+	default:
+		PRINT_FAIL("Unimplemented mode=%d\n", mode);
+		break;
+	}
+
+out:
+	bonding_cleanup(skeletons);
+}
+
+/* Test the broadcast redirection using xdp_redirect_map_multi_prog and adding
+ * all the interfaces to it and checking that broadcasting won't send the packet
+ * to either the ingress bond device (bond2) or its slave (veth2_1).
+ */
+static void test_xdp_bonding_redirect_multi(struct skeletons *skeletons)
+{
+	static const char * const ifaces[] = {"bond2", "veth2_1", "veth2_2"};
+	int veth1_1_rx, veth1_2_rx;
+	int err;
+
+	if (bonding_setup(skeletons, BOND_MODE_ROUNDROBIN, BOND_XMIT_POLICY_LAYER23,
+			  BOND_ONE_NO_ATTACH))
+		goto out;
+
+
+	if (!ASSERT_OK(setns_by_name("ns_dst"), "could not set netns to ns_dst"))
+		goto out;
+
+	/* populate the devmap with the relevant interfaces */
+	for (int i = 0; i < ARRAY_SIZE(ifaces); i++) {
+		int ifindex = if_nametoindex(ifaces[i]);
+		int map_fd = bpf_map__fd(skeletons->xdp_redirect_multi_kern->maps.map_all);
+
+		if (!ASSERT_GT(ifindex, 0, "could not get interface index"))
+			goto out;
+
+		err = bpf_map_update_elem(map_fd, &ifindex, &ifindex, 0);
+		if (!ASSERT_OK(err, "add interface to map_all"))
+			goto out;
+	}
+
+	if (xdp_attach(skeletons,
+		       skeletons->xdp_redirect_multi_kern->progs.xdp_redirect_map_multi_prog,
+		       "bond2"))
+		goto out;
+
+	restore_root_netns();
+
+	if (send_udp_packets(0 /* vary_dst_ip */))
+		goto out;
+
+	veth1_1_rx = get_rx_packets("veth1_1");
+	veth1_2_rx = get_rx_packets("veth1_2");
+
+	ASSERT_EQ(veth1_1_rx, 0, "expected no packets on veth1_1");
+	ASSERT_GE(veth1_2_rx, NPACKETS, "expected packets on veth1_2");
+
+out:
+	restore_root_netns();
+	bonding_cleanup(skeletons);
+}
+
+/* Test that XDP programs cannot be attached to both the bond master and slaves simultaneously */
+static void test_xdp_bonding_attach(struct skeletons *skeletons)
+{
+	struct bpf_link *link = NULL;
+	struct bpf_link *link2 = NULL;
+	int veth, bond, err;
+
+	if (!ASSERT_OK(system("ip link add veth type veth"), "add veth"))
+		goto out;
+	if (!ASSERT_OK(system("ip link add bond type bond"), "add bond"))
+		goto out;
+
+	veth = if_nametoindex("veth");
+	if (!ASSERT_GT(veth, 0, "if_nametoindex veth"))
+		goto out;
+	bond = if_nametoindex("bond");
+	if (!ASSERT_GT(bond, 0, "if_nametoindex bond"))
+		goto out;
+
+	/* enslaving with an XDP program loaded is allowed */
+	link = bpf_program__attach_xdp(skeletons->xdp_dummy->progs.xdp_dummy_prog, veth);
+	if (!ASSERT_OK_PTR(link, "attach program to veth"))
+		goto out;
+
+	err = system("ip link set veth master bond");
+	if (!ASSERT_OK(err, "set veth master"))
+		goto out;
+
+	bpf_link__destroy(link);
+	link = NULL;
+
+	/* attaching to slave when master has no program is allowed */
+	link = bpf_program__attach_xdp(skeletons->xdp_dummy->progs.xdp_dummy_prog, veth);
+	if (!ASSERT_OK_PTR(link, "attach program to slave when enslaved"))
+		goto out;
+
+	/* attaching to master not allowed when slave has program loaded */
+	link2 = bpf_program__attach_xdp(skeletons->xdp_dummy->progs.xdp_dummy_prog, bond);
+	if (!ASSERT_ERR_PTR(link2, "attach program to master when slave has program"))
+		goto out;
+
+	bpf_link__destroy(link);
+	link = NULL;
+
+	/* attaching XDP program to master allowed when slave has no program */
+	link = bpf_program__attach_xdp(skeletons->xdp_dummy->progs.xdp_dummy_prog, bond);
+	if (!ASSERT_OK_PTR(link, "attach program to master"))
+		goto out;
+
+	/* attaching to slave not allowed when master has program loaded */
+	link2 = bpf_program__attach_xdp(skeletons->xdp_dummy->progs.xdp_dummy_prog, veth);
+	if (!ASSERT_ERR_PTR(link2, "attach program to slave when master has program"))
+		goto out;
+
+	bpf_link__destroy(link);
+	link = NULL;
+
+	/* test program unwinding with a non-XDP slave */
+	if (!ASSERT_OK(system("ip link add vxlan type vxlan id 1 remote 1.2.3.4 dstport 0 dev lo"),
+		       "add vxlan"))
+		goto out;
+
+	err = system("ip link set vxlan master bond");
+	if (!ASSERT_OK(err, "set vxlan master"))
+		goto out;
+
+	/* attaching not allowed when one slave does not support XDP */
+	link = bpf_program__attach_xdp(skeletons->xdp_dummy->progs.xdp_dummy_prog, bond);
+	if (!ASSERT_ERR_PTR(link, "attach program to master when slave does not support XDP"))
+		goto out;
+
+out:
+	bpf_link__destroy(link);
+	bpf_link__destroy(link2);
+
+	system("ip link del veth");
+	system("ip link del bond");
+	system("ip link del vxlan");
+}
+
+/* Test with nested bonding devices to catch issue with negative jump label count */
+static void test_xdp_bonding_nested(struct skeletons *skeletons)
+{
+	struct bpf_link *link = NULL;
+	int bond, err;
+
+	if (!ASSERT_OK(system("ip link add bond type bond"), "add bond"))
+		goto out;
+
+	bond = if_nametoindex("bond");
+	if (!ASSERT_GT(bond, 0, "if_nametoindex bond"))
+		goto out;
+
+	if (!ASSERT_OK(system("ip link add bond_nest1 type bond"), "add bond_nest1"))
+		goto out;
+
+	err = system("ip link set bond_nest1 master bond");
+	if (!ASSERT_OK(err, "set bond_nest1 master"))
+		goto out;
+
+	if (!ASSERT_OK(system("ip link add bond_nest2 type bond"), "add bond_nest2"))
+		goto out;
+
+	err = system("ip link set bond_nest2 master bond_nest1");
+	if (!ASSERT_OK(err, "set bond_nest2 master"))
+		goto out;
+
+	link = bpf_program__attach_xdp(skeletons->xdp_dummy->progs.xdp_dummy_prog, bond);
+	ASSERT_OK_PTR(link, "attach program to master");
+
+out:
+	bpf_link__destroy(link);
+	system("ip link del bond");
+	system("ip link del bond_nest1");
+	system("ip link del bond_nest2");
+}
+
+static int libbpf_debug_print(enum libbpf_print_level level,
+			      const char *format, va_list args)
+{
+	if (level != LIBBPF_WARN)
+		vprintf(format, args);
+	return 0;
+}
+
+struct bond_test_case {
+	char *name;
+	int mode;
+	int xmit_policy;
+};
+
+static struct bond_test_case bond_test_cases[] = {
+	{ "xdp_bonding_roundrobin", BOND_MODE_ROUNDROBIN, BOND_XMIT_POLICY_LAYER23, },
+	{ "xdp_bonding_activebackup", BOND_MODE_ACTIVEBACKUP, BOND_XMIT_POLICY_LAYER23 },
+
+	{ "xdp_bonding_xor_layer2", BOND_MODE_XOR, BOND_XMIT_POLICY_LAYER2, },
+	{ "xdp_bonding_xor_layer23", BOND_MODE_XOR, BOND_XMIT_POLICY_LAYER23, },
+	{ "xdp_bonding_xor_layer34", BOND_MODE_XOR, BOND_XMIT_POLICY_LAYER34, },
+};
+
+void serial_test_xdp_bonding(void)
+{
+	libbpf_print_fn_t old_print_fn;
+	struct skeletons skeletons = {};
+	int i;
+
+	old_print_fn = libbpf_set_print(libbpf_debug_print);
+
+	root_netns_fd = open("/proc/self/ns/net", O_RDONLY);
+	if (!ASSERT_GE(root_netns_fd, 0, "open /proc/self/ns/net"))
+		goto out;
+
+	skeletons.xdp_dummy = xdp_dummy__open_and_load();
+	if (!ASSERT_OK_PTR(skeletons.xdp_dummy, "xdp_dummy__open_and_load"))
+		goto out;
+
+	skeletons.xdp_tx = xdp_tx__open_and_load();
+	if (!ASSERT_OK_PTR(skeletons.xdp_tx, "xdp_tx__open_and_load"))
+		goto out;
+
+	skeletons.xdp_redirect_multi_kern = xdp_redirect_multi_kern__open_and_load();
+	if (!ASSERT_OK_PTR(skeletons.xdp_redirect_multi_kern,
+			   "xdp_redirect_multi_kern__open_and_load"))
+		goto out;
+
+	if (test__start_subtest("xdp_bonding_attach"))
+		test_xdp_bonding_attach(&skeletons);
+
+	if (test__start_subtest("xdp_bonding_nested"))
+		test_xdp_bonding_nested(&skeletons);
+
+	for (i = 0; i < ARRAY_SIZE(bond_test_cases); i++) {
+		struct bond_test_case *test_case = &bond_test_cases[i];
+
+		if (test__start_subtest(test_case->name))
+			test_xdp_bonding_with_mode(
+				&skeletons,
+				test_case->mode,
+				test_case->xmit_policy);
+	}
+
+	if (test__start_subtest("xdp_bonding_redirect_multi"))
+		test_xdp_bonding_redirect_multi(&skeletons);
+
+out:
+	xdp_dummy__destroy(skeletons.xdp_dummy);
+	xdp_tx__destroy(skeletons.xdp_tx);
+	xdp_redirect_multi_kern__destroy(skeletons.xdp_redirect_multi_kern);
+
+	libbpf_set_print(old_print_fn);
+	if (root_netns_fd >= 0)
+		close(root_netns_fd);
+}
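
The balance checks in test_xdp_bonding_with_mode() come down to two steps: read the per-slave RX packet counters from /proc/net/dev, then compare their absolute difference against NPACKETS. The sketch below restates those two steps as pure functions so they can be tried without network namespaces. It is an illustration, not part of the patch: `parse_rx_packets()`, `classify_rx_spread()` and the `rx_spread` enum are hypothetical names, and the parser assumes the usual "name: rx_bytes rx_packets ..." column order of /proc/net/dev.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not part of the patch): return the RX packet counter
 * for `iface` from a single /proc/net/dev line, or -1 if the line describes
 * a different interface or cannot be parsed.
 */
long parse_rx_packets(const char *line, const char *iface)
{
	size_t iface_len = strlen(iface);
	unsigned long long rx_bytes, rx_packets;

	while (*line == ' ')
		line++;			/* skip leading padding */
	if (strncmp(line, iface, iface_len) != 0 || line[iface_len] != ':')
		return -1;		/* different name, or only a prefix match */

	/* the first two columns after "name:" are RX bytes and RX packets */
	if (sscanf(line + iface_len + 1, "%llu %llu", &rx_bytes, &rx_packets) != 2)
		return -1;
	return (long)rx_packets;
}

/* Hypothetical names for the outcomes the test asserts on. */
enum rx_spread { RX_SINGLE_SLAVE, RX_BALANCED, RX_UNDECIDED };

/* Mirror the thresholds used in the patch: a difference of at least
 * `npackets` means one slave saw all the traffic; a difference below
 * `npackets / 2` is accepted as an even distribution.
 */
enum rx_spread classify_rx_spread(long rx_a, long rx_b, long npackets)
{
	long diff = labs(rx_a - rx_b);

	if (diff >= npackets)
		return RX_SINGLE_SLAVE;
	if (diff < npackets / 2)
		return RX_BALANCED;
	return RX_UNDECIDED;
}
```

With NPACKETS set to 100, counters of 100 and 0 fall in the single-slave case expected for layer2 hashing and active-backup, while counters such as 52 and 48 fall in the balanced case expected for round-robin and the layer2+3 / layer3+4 policies.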