From 5b7c4cabbb65f5c469464da6c5f614cbd7f730f2 Mon Sep 17 00:00:00 2001 From: Linus Torvalds Date: Tue, 21 Feb 2023 18:24:12 -0800 Subject: Merge tag 'net-next-6.3' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next Pull networking updates from Jakub Kicinski: "Core: - Add dedicated kmem_cache for typical/small skb->head, avoid having to access struct page at kfree time, and improve memory use. - Introduce sysctl to set default RPS configuration for new netdevs. - Define Netlink protocol specification format which can be used to describe messages used by each family and auto-generate parsers. Add tools for generating kernel data structures and uAPI headers. - Expose all net/core sysctls inside netns. - Remove 4s sleep in netpoll if carrier is instantly detected on boot. - Add configurable limit of MDB entries per port, and port-vlan. - Continue populating drop reasons throughout the stack. - Retire a handful of legacy Qdiscs and classifiers. Protocols: - Support IPv4 big TCP (TSO frames larger than 64kB). - Add IP_LOCAL_PORT_RANGE socket option, to control local port range on socket by socket basis. - Track and report in procfs number of MPTCP sockets used. - Support mixing IPv4 and IPv6 flows in the in-kernel MPTCP path manager. - IPv6: don't check net.ipv6.route.max_size and rely on garbage collection to free memory (similarly to IPv4). - Support Penultimate Segment Pop (PSP) flavor in SRv6 (RFC8986). - ICMP: add per-rate limit counters. - Add support for user scanning requests in ieee802154. - Remove static WEP support. - Support minimal Wi-Fi 7 Extremely High Throughput (EHT) rate reporting. - WiFi 7 EHT channel puncturing support (client & AP). BPF: - Add a rbtree data structure following the "next-gen data structure" precedent set by recently added linked list, that is, by using kfunc + kptr instead of adding a new BPF map type. - Expose XDP hints via kfuncs with initial support for RX hash and timestamp metadata. - Add BPF_F_NO_TUNNEL_KEY extension to bpf_skb_set_tunnel_key to better support decap on GRE tunnel devices not operating in collect metadata. - Improve x86 JIT's codegen for PROBE_MEM runtime error checks. - Remove the need for trace_printk_lock for bpf_trace_printk and bpf_trace_vprintk helpers. - Extend libbpf's bpf_tracing.h support for tracing arguments of kprobes/uprobes and syscall as a special case. - Significantly reduce the search time for module symbols by livepatch and BPF. - Enable cpumasks to be used as kptrs, which is useful for tracing programs tracking which tasks end up running on which CPUs in different time intervals. - Add support for BPF trampoline on s390x and riscv64. - Add capability to export the XDP features supported by the NIC. - Add __bpf_kfunc tag for marking kernel functions as kfuncs. - Add cgroup.memory=nobpf kernel parameter option to disable BPF memory accounting for container environments. Netfilter: - Remove the CLUSTERIP target. It has been marked as obsolete for years, and we still have WARN splats wrt races of the out-of-band /proc interface installed by this target. - Add 'destroy' commands to nf_tables. They are identical to the existing 'delete' commands, but do not return an error if the referenced object (set, chain, rule...) did not exist. Driver API: - Improve cpumask_local_spread() locality to help NICs set the right IRQ affinity on AMD platforms. - Separate C22 and C45 MDIO bus transactions more clearly. - Introduce new DCB table to control DSCP rewrite on egress. - Support configuration of Physical Layer Collision Avoidance (PLCA) Reconciliation Sublayer (RS) (802.3cg-2019). Modern version of shared medium Ethernet. - Support for MAC Merge layer (IEEE 802.3-2018 clause 99). Allowing preemption of low priority frames by high priority frames. - Add support for controlling MACSec offload using netlink SET. - Rework devlink instance refcounts to allow registration and de-registration under the instance lock. Split the code into multiple files, drop some of the unnecessarily granular locks and factor out common parts of netlink operation handling. - Add TX frame aggregation parameters (for USB drivers). - Add a new attr TCA_EXT_WARN_MSG to report TC (offload) warning messages with notifications for debug. - Allow offloading of UDP NEW connections via act_ct. - Add support for per action HW stats in TC. - Support hardware miss to TC action (continue processing in SW from a specific point in the action chain). - Warn if old Wireless Extension user space interface is used with modern cfg80211/mac80211 drivers. Do not support Wireless Extensions for Wi-Fi 7 devices at all. Everyone should switch to using nl80211 interface instead. - Improve the CAN bit timing configuration. Use extack to return error messages directly to user space, update the SJW handling, including the definition of a new default value that will benefit CAN-FD controllers, by increasing their oscillator tolerance. New hardware / drivers: - Ethernet: - nVidia BlueField-3 support (control traffic driver) - Ethernet support for imx93 SoCs - Motorcomm yt8531 gigabit Ethernet PHY - onsemi NCN26000 10BASE-T1S PHY (with support for PLCA) - Microchip LAN8841 PHY (incl. cable diagnostics and PTP) - Amlogic gxl MDIO mux - WiFi: - RealTek RTL8188EU (rtl8xxxu) - Qualcomm Wi-Fi 7 devices (ath12k) - CAN: - Renesas R-Car V4H Drivers: - Bluetooth: - Set Per Platform Antenna Gain (PPAG) for Intel controllers. - Ethernet NICs: - Intel (1G, igc): - support TSN / Qbv / packet scheduling features of i226 model - Intel (100G, ice): - use GNSS subsystem instead of TTY - multi-buffer XDP support - extend support for GPIO pins to E823 devices - nVidia/Mellanox: - update the shared buffer configuration on PFC commands - implement PTP adjphase function for HW offset control - TC support for Geneve and GRE with VF tunnel offload - more efficient crypto key management method - multi-port eswitch support - Netronome/Corigine: - add DCB IEEE support - support IPsec offloading for NFP3800 - Freescale/NXP (enetc): - support XDP_REDIRECT for XDP non-linear buffers - improve reconfig, avoid link flap and waiting for idle - support MAC Merge layer - Other NICs: - sfc/ef100: add basic devlink support for ef100 - ionic: rx_push mode operation (writing descriptors via MMIO) - bnxt: use the auxiliary bus abstraction for RDMA - r8169: disable ASPM and reset bus in case of tx timeout - cpsw: support QSGMII mode for J721e CPSW9G - cpts: support pulse-per-second output - ngbe: add an mdio bus driver - usbnet: optimize usbnet_bh() by avoiding unnecessary queuing - r8152: handle devices with FW with NCM support - amd-xgbe: support 10Mbps, 2.5GbE speeds and rx-adaptation - virtio-net: support multi buffer XDP - virtio/vsock: replace virtio_vsock_pkt with sk_buff - tsnep: XDP support - Ethernet high-speed switches: - nVidia/Mellanox (mlxsw): - add support for latency TLV (in FW control messages) - Microchip (sparx5): - separate explicit and implicit traffic forwarding rules, make the implicit rules always active - add support for egress DSCP rewrite - IS0 VCAP support (Ingress Classification) - IS2 VCAP filters (protos, L3 addrs, L4 ports, flags, ToS etc.) - ES2 VCAP support (Egress Access Control) - support for Per-Stream Filtering and Policing (802.1Q, 8.6.5.1) - Ethernet embedded switches: - Marvell (mv88e6xxx): - add MAB (port auth) offload support - enable PTP receive for mv88e6390 - NXP (ocelot): - support MAC Merge layer - support for the the vsc7512 internal copper phys - Microchip: - lan9303: convert to PHYLINK - lan966x: support TC flower filter statistics - lan937x: PTP support for KSZ9563/KSZ8563 and LAN937x - lan937x: support Credit Based Shaper configuration - ksz9477: support Energy Efficient Ethernet - other: - qca8k: convert to regmap read/write API, use bulk operations - rswitch: Improve TX timestamp accuracy - Intel WiFi (iwlwifi): - EHT (Wi-Fi 7) rate reporting - STEP equalizer support: transfer some STEP (connection to radio on platforms with integrated wifi) related parameters from the BIOS to the firmware. - Qualcomm 802.11ax WiFi (ath11k): - IPQ5018 support - Fine Timing Measurement (FTM) responder role support - channel 177 support - MediaTek WiFi (mt76): - per-PHY LED support - mt7996: EHT (Wi-Fi 7) support - Wireless Ethernet Dispatch (WED) reset support - switch to using page pool allocator - RealTek WiFi (rtw89): - support new version of Bluetooth co-existance - Mobile: - rmnet: support TX aggregation" * tag 'net-next-6.3' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (1872 commits) page_pool: add a comment explaining the fragment counter usage net: ethtool: fix __ethtool_dev_mm_supported() implementation ethtool: pse-pd: Fix double word in comments xsk: add linux/vmalloc.h to xsk.c sefltests: netdevsim: wait for devlink instance after netns removal selftest: fib_tests: Always cleanup before exit net/mlx5e: Align IPsec ASO result memory to be as required by hardware net/mlx5e: TC, Set CT miss to the specific ct action instance net/mlx5e: Rename CHAIN_TO_REG to MAPPED_OBJ_TO_REG net/mlx5: Refactor tc miss handling to a single function net/mlx5: Kconfig: Make tc offload depend on tc skb extension net/sched: flower: Support hardware miss to tc action net/sched: flower: Move filter handle initialization earlier net/sched: cls_api: Support hardware miss to tc action net/sched: Rename user cookie and act cookie sfc: fix builds without CONFIG_RTC_LIB sfc: clean up some inconsistent indentings net/mlx4_en: Introduce flexible array to silence overflow warning net: lan966x: Fix possible deadlock inside PTP net/ulp: Remove redundant ->clone() test in inet_clone_ulp(). ... --- arch/s390/include/asm/processor.h | 342 ++++++++++++++++++++++++++++++++++++++ 1 file changed, 342 insertions(+) create mode 100644 arch/s390/include/asm/processor.h (limited to 'arch/s390/include/asm/processor.h') diff --git a/arch/s390/include/asm/processor.h b/arch/s390/include/asm/processor.h new file mode 100644 index 000000000..e98d96507 --- /dev/null +++ b/arch/s390/include/asm/processor.h @@ -0,0 +1,342 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +/* + * S390 version + * Copyright IBM Corp. 1999 + * Author(s): Hartmut Penner (hp@de.ibm.com), + * Martin Schwidefsky (schwidefsky@de.ibm.com) + * + * Derived from "include/asm-i386/processor.h" + * Copyright (C) 1994, Linus Torvalds + */ + +#ifndef __ASM_S390_PROCESSOR_H +#define __ASM_S390_PROCESSOR_H + +#include + +#define CIF_NOHZ_DELAY 2 /* delay HZ disable for a tick */ +#define CIF_FPU 3 /* restore FPU registers */ +#define CIF_ENABLED_WAIT 5 /* in enabled wait state */ +#define CIF_MCCK_GUEST 6 /* machine check happening in guest */ +#define CIF_DEDICATED_CPU 7 /* this CPU is dedicated */ + +#define _CIF_NOHZ_DELAY BIT(CIF_NOHZ_DELAY) +#define _CIF_FPU BIT(CIF_FPU) +#define _CIF_ENABLED_WAIT BIT(CIF_ENABLED_WAIT) +#define _CIF_MCCK_GUEST BIT(CIF_MCCK_GUEST) +#define _CIF_DEDICATED_CPU BIT(CIF_DEDICATED_CPU) + +#define RESTART_FLAG_CTLREGS _AC(1 << 0, U) + +#ifndef __ASSEMBLY__ + +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +typedef long (*sys_call_ptr_t)(struct pt_regs *regs); + +static __always_inline void set_cpu_flag(int flag) +{ + S390_lowcore.cpu_flags |= (1UL << flag); +} + +static __always_inline void clear_cpu_flag(int flag) +{ + S390_lowcore.cpu_flags &= ~(1UL << flag); +} + +static __always_inline bool test_cpu_flag(int flag) +{ + return S390_lowcore.cpu_flags & (1UL << flag); +} + +static __always_inline bool test_and_set_cpu_flag(int flag) +{ + if (test_cpu_flag(flag)) + return true; + set_cpu_flag(flag); + return false; +} + +static __always_inline bool test_and_clear_cpu_flag(int flag) +{ + if (!test_cpu_flag(flag)) + return false; + clear_cpu_flag(flag); + return true; +} + +/* + * Test CIF flag of another CPU. The caller needs to ensure that + * CPU hotplug can not happen, e.g. by disabling preemption. + */ +static __always_inline bool test_cpu_flag_of(int flag, int cpu) +{ + struct lowcore *lc = lowcore_ptr[cpu]; + + return lc->cpu_flags & (1UL << flag); +} + +#define arch_needs_cpu() test_cpu_flag(CIF_NOHZ_DELAY) + +static inline void get_cpu_id(struct cpuid *ptr) +{ + asm volatile("stidp %0" : "=Q" (*ptr)); +} + +void s390_adjust_jiffies(void); +void s390_update_cpu_mhz(void); +void cpu_detect_mhz_feature(void); + +extern const struct seq_operations cpuinfo_op; +extern void execve_tail(void); +extern void __bpon(void); +unsigned long vdso_size(void); + +/* + * User space process size: 2GB for 31 bit, 4TB or 8PT for 64 bit. + */ + +#define TASK_SIZE (test_thread_flag(TIF_31BIT) ? \ + _REGION3_SIZE : TASK_SIZE_MAX) +#define TASK_UNMAPPED_BASE (test_thread_flag(TIF_31BIT) ? \ + (_REGION3_SIZE >> 1) : (_REGION2_SIZE >> 1)) +#define TASK_SIZE_MAX (-PAGE_SIZE) + +#define VDSO_BASE (STACK_TOP + PAGE_SIZE) +#define VDSO_LIMIT (test_thread_flag(TIF_31BIT) ? _REGION3_SIZE : _REGION2_SIZE) +#define STACK_TOP (VDSO_LIMIT - vdso_size() - PAGE_SIZE) +#define STACK_TOP_MAX (_REGION2_SIZE - vdso_size() - PAGE_SIZE) + +#define HAVE_ARCH_PICK_MMAP_LAYOUT + +/* + * Thread structure + */ +struct thread_struct { + unsigned int acrs[NUM_ACRS]; + unsigned long ksp; /* kernel stack pointer */ + unsigned long user_timer; /* task cputime in user space */ + unsigned long guest_timer; /* task cputime in kvm guest */ + unsigned long system_timer; /* task cputime in kernel space */ + unsigned long hardirq_timer; /* task cputime in hardirq context */ + unsigned long softirq_timer; /* task cputime in softirq context */ + const sys_call_ptr_t *sys_call_table; /* system call table address */ + unsigned long gmap_addr; /* address of last gmap fault. */ + unsigned int gmap_write_flag; /* gmap fault write indication */ + unsigned int gmap_int_code; /* int code of last gmap fault */ + unsigned int gmap_pfault; /* signal of a pending guest pfault */ + + /* Per-thread information related to debugging */ + struct per_regs per_user; /* User specified PER registers */ + struct per_event per_event; /* Cause of the last PER trap */ + unsigned long per_flags; /* Flags to control debug behavior */ + unsigned int system_call; /* system call number in signal */ + unsigned long last_break; /* last breaking-event-address. */ + /* pfault_wait is used to block the process on a pfault event */ + unsigned long pfault_wait; + struct list_head list; + /* cpu runtime instrumentation */ + struct runtime_instr_cb *ri_cb; + struct gs_cb *gs_cb; /* Current guarded storage cb */ + struct gs_cb *gs_bc_cb; /* Broadcast guarded storage cb */ + struct pgm_tdb trap_tdb; /* Transaction abort diagnose block */ + /* + * Warning: 'fpu' is dynamically-sized. It *MUST* be at + * the end. + */ + struct fpu fpu; /* FP and VX register save area */ +}; + +/* Flag to disable transactions. */ +#define PER_FLAG_NO_TE 1UL +/* Flag to enable random transaction aborts. */ +#define PER_FLAG_TE_ABORT_RAND 2UL +/* Flag to specify random transaction abort mode: + * - abort each transaction at a random instruction before TEND if set. + * - abort random transactions at a random instruction if cleared. + */ +#define PER_FLAG_TE_ABORT_RAND_TEND 4UL + +typedef struct thread_struct thread_struct; + +#define ARCH_MIN_TASKALIGN 8 + +#define INIT_THREAD { \ + .ksp = sizeof(init_stack) + (unsigned long) &init_stack, \ + .fpu.regs = (void *) init_task.thread.fpu.fprs, \ + .last_break = 1, \ +} + +/* + * Do necessary setup to start up a new thread. + */ +#define start_thread(regs, new_psw, new_stackp) do { \ + regs->psw.mask = PSW_USER_BITS | PSW_MASK_EA | PSW_MASK_BA; \ + regs->psw.addr = new_psw; \ + regs->gprs[15] = new_stackp; \ + execve_tail(); \ +} while (0) + +#define start_thread31(regs, new_psw, new_stackp) do { \ + regs->psw.mask = PSW_USER_BITS | PSW_MASK_BA; \ + regs->psw.addr = new_psw; \ + regs->gprs[15] = new_stackp; \ + execve_tail(); \ +} while (0) + +/* Forward declaration, a strange C thing */ +struct task_struct; +struct mm_struct; +struct seq_file; +struct pt_regs; + +void show_registers(struct pt_regs *regs); +void show_cacheinfo(struct seq_file *m); + +/* Free guarded storage control block */ +void guarded_storage_release(struct task_struct *tsk); +void gs_load_bc_cb(struct pt_regs *regs); + +unsigned long __get_wchan(struct task_struct *p); +#define task_pt_regs(tsk) ((struct pt_regs *) \ + (task_stack_page(tsk) + THREAD_SIZE) - 1) +#define KSTK_EIP(tsk) (task_pt_regs(tsk)->psw.addr) +#define KSTK_ESP(tsk) (task_pt_regs(tsk)->gprs[15]) + +/* Has task runtime instrumentation enabled ? */ +#define is_ri_task(tsk) (!!(tsk)->thread.ri_cb) + +/* avoid using global register due to gcc bug in versions < 8.4 */ +#define current_stack_pointer (__current_stack_pointer()) + +static __always_inline unsigned long __current_stack_pointer(void) +{ + unsigned long sp; + + asm volatile("lgr %0,15" : "=d" (sp)); + return sp; +} + +static __always_inline unsigned short stap(void) +{ + unsigned short cpu_address; + + asm volatile("stap %0" : "=Q" (cpu_address)); + return cpu_address; +} + +#define cpu_relax() barrier() + +#define ECAG_CACHE_ATTRIBUTE 0 +#define ECAG_CPU_ATTRIBUTE 1 + +static inline unsigned long __ecag(unsigned int asi, unsigned char parm) +{ + unsigned long val; + + asm volatile("ecag %0,0,0(%1)" : "=d" (val) : "a" (asi << 8 | parm)); + return val; +} + +static inline void psw_set_key(unsigned int key) +{ + asm volatile("spka 0(%0)" : : "d" (key)); +} + +/* + * Set PSW to specified value. + */ +static inline void __load_psw(psw_t psw) +{ + asm volatile("lpswe %0" : : "Q" (psw) : "cc"); +} + +/* + * Set PSW mask to specified value, while leaving the + * PSW addr pointing to the next instruction. + */ +static __always_inline void __load_psw_mask(unsigned long mask) +{ + unsigned long addr; + psw_t psw; + + psw.mask = mask; + + asm volatile( + " larl %0,1f\n" + " stg %0,%1\n" + " lpswe %2\n" + "1:" + : "=&d" (addr), "=Q" (psw.addr) : "Q" (psw) : "memory", "cc"); +} + +/* + * Extract current PSW mask + */ +static inline unsigned long __extract_psw(void) +{ + unsigned int reg1, reg2; + + asm volatile("epsw %0,%1" : "=d" (reg1), "=a" (reg2)); + return (((unsigned long) reg1) << 32) | ((unsigned long) reg2); +} + +static inline void local_mcck_enable(void) +{ + __load_psw_mask(__extract_psw() | PSW_MASK_MCHECK); +} + +static inline void local_mcck_disable(void) +{ + __load_psw_mask(__extract_psw() & ~PSW_MASK_MCHECK); +} + +/* + * Rewind PSW instruction address by specified number of bytes. + */ +static inline unsigned long __rewind_psw(psw_t psw, unsigned long ilc) +{ + unsigned long mask; + + mask = (psw.mask & PSW_MASK_EA) ? -1UL : + (psw.mask & PSW_MASK_BA) ? (1UL << 31) - 1 : + (1UL << 24) - 1; + return (psw.addr - ilc) & mask; +} + +/* + * Function to drop a processor into disabled wait state + */ +static __always_inline void __noreturn disabled_wait(void) +{ + psw_t psw; + + psw.mask = PSW_MASK_BASE | PSW_MASK_WAIT | PSW_MASK_BA | PSW_MASK_EA; + psw.addr = _THIS_IP_; + __load_psw(psw); + while (1); +} + +#define ARCH_LOW_ADDRESS_LIMIT 0x7fffffffUL + +extern int s390_isolate_bp(void); +extern int s390_isolate_bp_guest(void); + +static __always_inline bool regs_irqs_disabled(struct pt_regs *regs) +{ + return arch_irqs_disabled_flags(regs->psw.mask); +} + +#endif /* __ASSEMBLY__ */ + +#endif /* __ASM_S390_PROCESSOR_H */ -- cgit v1.2.3