From 5b7c4cabbb65f5c469464da6c5f614cbd7f730f2 Mon Sep 17 00:00:00 2001
From: Linus Torvalds
Date: Tue, 21 Feb 2023 18:24:12 -0800
Subject: Merge tag 'net-next-6.3' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next

Pull networking updates from Jakub Kicinski:
 "Core:

   - Add dedicated kmem_cache for typical/small skb->head, avoid having
     to access struct page at kfree time, and improve memory use.
   - Introduce sysctl to set default RPS configuration for new netdevs.
   - Define Netlink protocol specification format which can be used to
     describe messages used by each family and auto-generate parsers.
     Add tools for generating kernel data structures and uAPI headers.
   - Expose all net/core sysctls inside netns.
   - Remove 4s sleep in netpoll if carrier is instantly detected on
     boot.
   - Add configurable limit of MDB entries per port, and port-vlan.
   - Continue populating drop reasons throughout the stack.
   - Retire a handful of legacy Qdiscs and classifiers.

  Protocols:

   - Support IPv4 big TCP (TSO frames larger than 64kB).
   - Add IP_LOCAL_PORT_RANGE socket option, to control local port range
     on socket by socket basis.
   - Track and report in procfs number of MPTCP sockets used.
   - Support mixing IPv4 and IPv6 flows in the in-kernel MPTCP path
     manager.
   - IPv6: don't check net.ipv6.route.max_size and rely on garbage
     collection to free memory (similarly to IPv4).
   - Support Penultimate Segment Pop (PSP) flavor in SRv6 (RFC8986).
   - ICMP: add per-rate limit counters.
   - Add support for user scanning requests in ieee802154.
   - Remove static WEP support.
   - Support minimal Wi-Fi 7 Extremely High Throughput (EHT) rate
     reporting.
   - WiFi 7 EHT channel puncturing support (client & AP).

  BPF:

   - Add a rbtree data structure following the "next-gen data structure"
     precedent set by recently added linked list, that is, by using
     kfunc + kptr instead of adding a new BPF map type.
   - Expose XDP hints via kfuncs with initial support for RX hash and
     timestamp metadata.
   - Add BPF_F_NO_TUNNEL_KEY extension to bpf_skb_set_tunnel_key to
     better support decap on GRE tunnel devices not operating in
     collect metadata.
   - Improve x86 JIT's codegen for PROBE_MEM runtime error checks.
   - Remove the need for trace_printk_lock for bpf_trace_printk and
     bpf_trace_vprintk helpers.
   - Extend libbpf's bpf_tracing.h support for tracing arguments of
     kprobes/uprobes and syscall as a special case.
   - Significantly reduce the search time for module symbols by
     livepatch and BPF.
   - Enable cpumasks to be used as kptrs, which is useful for tracing
     programs tracking which tasks end up running on which CPUs in
     different time intervals.
   - Add support for BPF trampoline on s390x and riscv64.
   - Add capability to export the XDP features supported by the NIC.
   - Add __bpf_kfunc tag for marking kernel functions as kfuncs.
   - Add cgroup.memory=nobpf kernel parameter option to disable BPF
     memory accounting for container environments.

  Netfilter:

   - Remove the CLUSTERIP target. It has been marked as obsolete for
     years, and we still have WARN splats wrt races of the out-of-band
     /proc interface installed by this target.
   - Add 'destroy' commands to nf_tables. They are identical to the
     existing 'delete' commands, but do not return an error if the
     referenced object (set, chain, rule...) did not exist.

  Driver API:

   - Improve cpumask_local_spread() locality to help NICs set the right
     IRQ affinity on AMD platforms.
   - Separate C22 and C45 MDIO bus transactions more clearly.
   - Introduce new DCB table to control DSCP rewrite on egress.
   - Support configuration of Physical Layer Collision Avoidance (PLCA)
     Reconciliation Sublayer (RS) (802.3cg-2019). Modern version of
     shared medium Ethernet.
   - Support for MAC Merge layer (IEEE 802.3-2018 clause 99). Allowing
     preemption of low priority frames by high priority frames.
   - Add support for controlling MACSec offload using netlink SET.
   - Rework devlink instance refcounts to allow registration and
     de-registration under the instance lock.
     Split the code into multiple files, drop some of the unnecessarily
     granular locks and factor out common parts of netlink operation
     handling.
   - Add TX frame aggregation parameters (for USB drivers).
   - Add a new attr TCA_EXT_WARN_MSG to report TC (offload) warning
     messages with notifications for debug.
   - Allow offloading of UDP NEW connections via act_ct.
   - Add support for per action HW stats in TC.
   - Support hardware miss to TC action (continue processing in SW from
     a specific point in the action chain).
   - Warn if old Wireless Extension user space interface is used with
     modern cfg80211/mac80211 drivers. Do not support Wireless
     Extensions for Wi-Fi 7 devices at all. Everyone should switch to
     using nl80211 interface instead.
   - Improve the CAN bit timing configuration. Use extack to return
     error messages directly to user space, update the SJW handling,
     including the definition of a new default value that will benefit
     CAN-FD controllers, by increasing their oscillator tolerance.

  New hardware / drivers:

   - Ethernet:
      - nVidia BlueField-3 support (control traffic driver)
      - Ethernet support for imx93 SoCs
      - Motorcomm yt8531 gigabit Ethernet PHY
      - onsemi NCN26000 10BASE-T1S PHY (with support for PLCA)
      - Microchip LAN8841 PHY (incl. cable diagnostics and PTP)
      - Amlogic gxl MDIO mux
   - WiFi:
      - RealTek RTL8188EU (rtl8xxxu)
      - Qualcomm Wi-Fi 7 devices (ath12k)
   - CAN:
      - Renesas R-Car V4H

  Drivers:

   - Bluetooth:
      - Set Per Platform Antenna Gain (PPAG) for Intel controllers.
   - Ethernet NICs:
      - Intel (1G, igc):
         - support TSN / Qbv / packet scheduling features of i226 model
      - Intel (100G, ice):
         - use GNSS subsystem instead of TTY
         - multi-buffer XDP support
         - extend support for GPIO pins to E823 devices
      - nVidia/Mellanox:
         - update the shared buffer configuration on PFC commands
         - implement PTP adjphase function for HW offset control
         - TC support for Geneve and GRE with VF tunnel offload
         - more efficient crypto key management method
         - multi-port eswitch support
      - Netronome/Corigine:
         - add DCB IEEE support
         - support IPsec offloading for NFP3800
      - Freescale/NXP (enetc):
         - support XDP_REDIRECT for XDP non-linear buffers
         - improve reconfig, avoid link flap and waiting for idle
         - support MAC Merge layer
      - Other NICs:
         - sfc/ef100: add basic devlink support for ef100
         - ionic: rx_push mode operation (writing descriptors via MMIO)
         - bnxt: use the auxiliary bus abstraction for RDMA
         - r8169: disable ASPM and reset bus in case of tx timeout
         - cpsw: support QSGMII mode for J721e CPSW9G
         - cpts: support pulse-per-second output
         - ngbe: add an mdio bus driver
         - usbnet: optimize usbnet_bh() by avoiding unnecessary queuing
         - r8152: handle devices with FW with NCM support
         - amd-xgbe: support 10Mbps, 2.5GbE speeds and rx-adaptation
         - virtio-net: support multi buffer XDP
         - virtio/vsock: replace virtio_vsock_pkt with sk_buff
         - tsnep: XDP support
   - Ethernet high-speed switches:
      - nVidia/Mellanox (mlxsw):
         - add support for latency TLV (in FW control messages)
      - Microchip (sparx5):
         - separate explicit and implicit traffic forwarding rules, make
           the implicit rules always active
         - add support for egress DSCP rewrite
         - IS0 VCAP support (Ingress Classification)
         - IS2 VCAP filters (protos, L3 addrs, L4 ports, flags, ToS
           etc.)
         - ES2 VCAP support (Egress Access Control)
         - support for Per-Stream Filtering and Policing (802.1Q,
           8.6.5.1)
   - Ethernet embedded switches:
      - Marvell (mv88e6xxx):
         - add MAB (port auth) offload support
         - enable PTP receive for mv88e6390
      - NXP (ocelot):
         - support MAC Merge layer
         - support for the vsc7512 internal copper phys
      - Microchip:
         - lan9303: convert to PHYLINK
         - lan966x: support TC flower filter statistics
         - lan937x: PTP support for KSZ9563/KSZ8563 and LAN937x
         - lan937x: support Credit Based Shaper configuration
         - ksz9477: support Energy Efficient Ethernet
      - other:
         - qca8k: convert to regmap read/write API, use bulk operations
         - rswitch: Improve TX timestamp accuracy
   - Intel WiFi (iwlwifi):
      - EHT (Wi-Fi 7) rate reporting
      - STEP equalizer support: transfer some STEP (connection to radio
        on platforms with integrated wifi) related parameters from the
        BIOS to the firmware.
   - Qualcomm 802.11ax WiFi (ath11k):
      - IPQ5018 support
      - Fine Timing Measurement (FTM) responder role support
      - channel 177 support
   - MediaTek WiFi (mt76):
      - per-PHY LED support
      - mt7996: EHT (Wi-Fi 7) support
      - Wireless Ethernet Dispatch (WED) reset support
      - switch to using page pool allocator
   - RealTek WiFi (rtw89):
      - support new version of Bluetooth co-existence
   - Mobile:
      - rmnet: support TX aggregation"

* tag 'net-next-6.3' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (1872 commits)
  page_pool: add a comment explaining the fragment counter usage
  net: ethtool: fix __ethtool_dev_mm_supported() implementation
  ethtool: pse-pd: Fix double word in comments
  xsk: add linux/vmalloc.h to xsk.c
  sefltests: netdevsim: wait for devlink instance after netns removal
  selftest: fib_tests: Always cleanup before exit
  net/mlx5e: Align IPsec ASO result memory to be as required by hardware
  net/mlx5e: TC, Set CT miss to the specific ct action instance
  net/mlx5e: Rename CHAIN_TO_REG to MAPPED_OBJ_TO_REG
  net/mlx5: Refactor tc miss handling to a single function
  net/mlx5: Kconfig: Make tc offload depend on tc skb extension
  net/sched: flower: Support hardware miss to tc action
  net/sched: flower: Move filter handle initialization earlier
  net/sched: cls_api: Support hardware miss to tc action
  net/sched: Rename user cookie and act cookie
  sfc: fix builds without CONFIG_RTC_LIB
  sfc: clean up some inconsistent indentings
  net/mlx4_en: Introduce flexible array to silence overflow warning
  net: lan966x: Fix possible deadlock inside PTP
  net/ulp: Remove redundant ->clone() test in inet_clone_ulp().
  ...
---
 drivers/net/ipa/ipa_power.c | 437 ++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 437 insertions(+)
 create mode 100644 drivers/net/ipa/ipa_power.c

(limited to 'drivers/net/ipa/ipa_power.c')

diff --git a/drivers/net/ipa/ipa_power.c b/drivers/net/ipa/ipa_power.c
new file mode 100644
index 000000000..921eecf3e
--- /dev/null
+++ b/drivers/net/ipa/ipa_power.c
@@ -0,0 +1,437 @@
+// SPDX-License-Identifier: GPL-2.0
+
+/* Copyright (c) 2012-2018, The Linux Foundation. All rights reserved.
+ * Copyright (C) 2018-2022 Linaro Ltd.
+ */
+
+#include
+#include
+#include
+#include
+#include
+#include
+
+#include "linux/soc/qcom/qcom_aoss.h"
+
+#include "ipa.h"
+#include "ipa_power.h"
+#include "ipa_endpoint.h"
+#include "ipa_modem.h"
+#include "ipa_data.h"
+
+/**
+ * DOC: IPA Power Management
+ *
+ * The IPA hardware is enabled when the IPA core clock and all the
+ * interconnects (buses) it depends on are enabled. Runtime power
+ * management is used to determine whether the core clock and
+ * interconnects are enabled, and if not in use to be suspended
+ * automatically.
+ *
+ * The core clock currently runs at a fixed clock rate when enabled,
+ * and all interconnects use a fixed average and peak bandwidth.
+ */
+
+#define IPA_AUTOSUSPEND_DELAY	500	/* milliseconds */
+
+/**
+ * enum ipa_power_flag - IPA power flags
+ * @IPA_POWER_FLAG_RESUMED:	Whether resume from suspend has been signaled
+ * @IPA_POWER_FLAG_SYSTEM:	Hardware is system (not runtime) suspended
+ * @IPA_POWER_FLAG_STOPPED:	Modem TX is disabled by ipa_start_xmit()
+ * @IPA_POWER_FLAG_STARTED:	Modem TX was enabled by ipa_runtime_resume()
+ * @IPA_POWER_FLAG_COUNT:	Number of defined power flags
+ */
+enum ipa_power_flag {
+	IPA_POWER_FLAG_RESUMED,
+	IPA_POWER_FLAG_SYSTEM,
+	IPA_POWER_FLAG_STOPPED,
+	IPA_POWER_FLAG_STARTED,
+	IPA_POWER_FLAG_COUNT,		/* Last; not a flag */
+};
+
+/**
+ * struct ipa_power - IPA power management information
+ * @dev:		IPA device pointer
+ * @core:		IPA core clock
+ * @qmp:		QMP handle for AOSS communication
+ * @spinlock:		Protects modem TX queue enable/disable
+ * @flags:		Boolean state flags
+ * @interconnect_count:	Number of elements in interconnect[]
+ * @interconnect:	Interconnect array
+ */
+struct ipa_power {
+	struct device *dev;
+	struct clk *core;
+	struct qmp *qmp;
+	spinlock_t spinlock;	/* used with STOPPED/STARTED power flags */
+	DECLARE_BITMAP(flags, IPA_POWER_FLAG_COUNT);
+	u32 interconnect_count;
+	struct icc_bulk_data interconnect[];
+};
+
+/* Initialize interconnects required for IPA operation */
+static int ipa_interconnect_init(struct ipa_power *power,
+				 const struct ipa_interconnect_data *data)
+{
+	struct icc_bulk_data *interconnect;
+	int ret;
+	u32 i;
+
+	/* Initialize our interconnect data array for bulk operations */
+	interconnect = &power->interconnect[0];
+	for (i = 0; i < power->interconnect_count; i++) {
+		/* interconnect->path is filled in by of_icc_bulk_get() */
+		interconnect->name = data->name;
+		interconnect->avg_bw = data->average_bandwidth;
+		interconnect->peak_bw = data->peak_bandwidth;
+		data++;
+		interconnect++;
+	}
+
+	ret = of_icc_bulk_get(power->dev, power->interconnect_count,
+			      power->interconnect);
+	if (ret)
+		return ret;
+
+	/* All interconnects are initially disabled */
+	icc_bulk_disable(power->interconnect_count, power->interconnect);
+
+	/* Set the bandwidth values to be used when enabled */
+	ret = icc_bulk_set_bw(power->interconnect_count, power->interconnect);
+	if (ret)
+		icc_bulk_put(power->interconnect_count, power->interconnect);
+
+	return ret;
+}
+
+/* Inverse of ipa_interconnect_init() */
+static void ipa_interconnect_exit(struct ipa_power *power)
+{
+	icc_bulk_put(power->interconnect_count, power->interconnect);
+}
+
+/* Enable IPA power, enabling interconnects and the core clock */
+static int ipa_power_enable(struct ipa *ipa)
+{
+	struct ipa_power *power = ipa->power;
+	int ret;
+
+	ret = icc_bulk_enable(power->interconnect_count, power->interconnect);
+	if (ret)
+		return ret;
+
+	ret = clk_prepare_enable(power->core);
+	if (ret) {
+		dev_err(power->dev, "error %d enabling core clock\n", ret);
+		icc_bulk_disable(power->interconnect_count,
+				 power->interconnect);
+	}
+
+	return ret;
+}
+
+/* Inverse of ipa_power_enable() */
+static void ipa_power_disable(struct ipa *ipa)
+{
+	struct ipa_power *power = ipa->power;
+
+	clk_disable_unprepare(power->core);
+
+	icc_bulk_disable(power->interconnect_count, power->interconnect);
+}
+
+static int ipa_runtime_suspend(struct device *dev)
+{
+	struct ipa *ipa = dev_get_drvdata(dev);
+
+	/* Endpoints aren't usable until setup is complete */
+	if (ipa->setup_complete) {
+		__clear_bit(IPA_POWER_FLAG_RESUMED, ipa->power->flags);
+		ipa_endpoint_suspend(ipa);
+		gsi_suspend(&ipa->gsi);
+	}
+
+	ipa_power_disable(ipa);
+
+	return 0;
+}
+
+static int ipa_runtime_resume(struct device *dev)
+{
+	struct ipa *ipa = dev_get_drvdata(dev);
+	int ret;
+
+	ret = ipa_power_enable(ipa);
+	if (WARN_ON(ret < 0))
+		return ret;
+
+	/* Endpoints aren't usable until setup is complete */
+	if (ipa->setup_complete) {
+		gsi_resume(&ipa->gsi);
+		ipa_endpoint_resume(ipa);
+	}
+
+	return 0;
+}
+
+static int ipa_suspend(struct device *dev)
+{
+	struct ipa *ipa = dev_get_drvdata(dev);
+
+	__set_bit(IPA_POWER_FLAG_SYSTEM, ipa->power->flags);
+
+	/* Increment the disable depth to ensure that the IRQ won't
+	 * be re-enabled until the matching _enable call in
+	 * ipa_resume(). We do this to ensure that the interrupt
+	 * handler won't run whilst PM runtime is disabled.
+	 *
+	 * Note that disabling the IRQ is NOT the same as disabling
+	 * irq wake. If wakeup is enabled for the IPA then the IRQ
+	 * will still cause the system to wake up, see irq_set_irq_wake().
+	 */
+	ipa_interrupt_irq_disable(ipa);
+
+	return pm_runtime_force_suspend(dev);
+}
+
+static int ipa_resume(struct device *dev)
+{
+	struct ipa *ipa = dev_get_drvdata(dev);
+	int ret;
+
+	ret = pm_runtime_force_resume(dev);
+
+	__clear_bit(IPA_POWER_FLAG_SYSTEM, ipa->power->flags);
+
+	/* Now that PM runtime is enabled again it's safe
+	 * to turn the IRQ back on and process any data
+	 * that was received during suspend.
+	 */
+	ipa_interrupt_irq_enable(ipa);
+
+	return ret;
+}
+
+/* Return the current IPA core clock rate */
+u32 ipa_core_clock_rate(struct ipa *ipa)
+{
+	return ipa->power ? (u32)clk_get_rate(ipa->power->core) : 0;
+}
+
+void ipa_power_suspend_handler(struct ipa *ipa, enum ipa_irq_id irq_id)
+{
+	/* To handle an IPA interrupt we will have resumed the hardware
+	 * just to handle the interrupt, so we're done. If we are in a
+	 * system suspend, trigger a system resume.
+	 */
+	if (!__test_and_set_bit(IPA_POWER_FLAG_RESUMED, ipa->power->flags))
+		if (test_bit(IPA_POWER_FLAG_SYSTEM, ipa->power->flags))
+			pm_wakeup_dev_event(&ipa->pdev->dev, 0, true);
+
+	/* Acknowledge/clear the suspend interrupt on all endpoints */
+	ipa_interrupt_suspend_clear_all(ipa->interrupt);
+}
+
+/* The next few functions coordinate stopping and starting the modem
+ * network device transmit queue.
+ *
+ * Transmit can be running concurrent with power resume, and there's a
+ * chance the resume completes before the transmit path stops the queue,
+ * leaving the queue in a stopped state. The next two functions are used
+ * to avoid this: ipa_power_modem_queue_stop() is used by ipa_start_xmit()
+ * to conditionally stop the TX queue; and ipa_power_modem_queue_wake()
+ * is used by ipa_runtime_resume() to conditionally restart it.
+ *
+ * Two flags and a spinlock are used. If the queue is stopped, the STOPPED
+ * power flag is set. And if the queue is started, the STARTED flag is set.
+ * The queue is only started on resume if the STOPPED flag is set. And the
+ * queue is only stopped in ipa_start_xmit() if the STARTED flag is *not*
+ * set. As a result, the queue remains operational if the two activities
+ * happen concurrently regardless of the order they complete. The spinlock
+ * ensures the flag and TX queue operations are done atomically.
+ *
+ * The first function stops the modem netdev transmit queue, but only if
+ * the STARTED flag is *not* set. That flag is cleared if it was set.
+ * If the queue is stopped, the STOPPED flag is set. This is called from
+ * ipa_start_xmit().
+ */
+void ipa_power_modem_queue_stop(struct ipa *ipa)
+{
+	struct ipa_power *power = ipa->power;
+	unsigned long flags;
+
+	spin_lock_irqsave(&power->spinlock, flags);
+
+	if (!__test_and_clear_bit(IPA_POWER_FLAG_STARTED, power->flags)) {
+		netif_stop_queue(ipa->modem_netdev);
+		__set_bit(IPA_POWER_FLAG_STOPPED, power->flags);
+	}
+
+	spin_unlock_irqrestore(&power->spinlock, flags);
+}
+
+/* This function starts the modem netdev transmit queue, but only if the
+ * STOPPED flag is set. That flag is cleared if it was set. If the queue
+ * was restarted, the STARTED flag is set; this allows ipa_start_xmit()
+ * to skip stopping the queue in the event of a race.
+ */
+void ipa_power_modem_queue_wake(struct ipa *ipa)
+{
+	struct ipa_power *power = ipa->power;
+	unsigned long flags;
+
+	spin_lock_irqsave(&power->spinlock, flags);
+
+	if (__test_and_clear_bit(IPA_POWER_FLAG_STOPPED, power->flags)) {
+		__set_bit(IPA_POWER_FLAG_STARTED, power->flags);
+		netif_wake_queue(ipa->modem_netdev);
+	}
+
+	spin_unlock_irqrestore(&power->spinlock, flags);
+}
+
+/* This function clears the STARTED flag once the TX queue is operating */
+void ipa_power_modem_queue_active(struct ipa *ipa)
+{
+	clear_bit(IPA_POWER_FLAG_STARTED, ipa->power->flags);
+}
+
+static int ipa_power_retention_init(struct ipa_power *power)
+{
+	struct qmp *qmp = qmp_get(power->dev);
+
+	if (IS_ERR(qmp)) {
+		if (PTR_ERR(qmp) == -EPROBE_DEFER)
+			return -EPROBE_DEFER;
+
+		/* We assume any other error means it's not defined/needed */
+		qmp = NULL;
+	}
+	power->qmp = qmp;
+
+	return 0;
+}
+
+static void ipa_power_retention_exit(struct ipa_power *power)
+{
+	qmp_put(power->qmp);
+	power->qmp = NULL;
+}
+
+/* Control register retention on power collapse */
+void ipa_power_retention(struct ipa *ipa, bool enable)
+{
+	static const char fmt[] = "{ class: bcm, res: ipa_pc, val: %c }";
+	struct ipa_power *power = ipa->power;
+	char buf[36];	/* Exactly enough for fmt[]; size a multiple of 4 */
+	int ret;
+
+	if (!power->qmp)
+		return;		/* Not needed on this platform */
+
+	(void)snprintf(buf, sizeof(buf), fmt, enable ? '1' : '0');
+
+	ret = qmp_send(power->qmp, buf, sizeof(buf));
+	if (ret)
+		dev_err(power->dev, "error %d sending QMP %sable request\n",
+			ret, enable ? "en" : "dis");
+}
+
+int ipa_power_setup(struct ipa *ipa)
+{
+	int ret;
+
+	ipa_interrupt_enable(ipa, IPA_IRQ_TX_SUSPEND);
+
+	ret = device_init_wakeup(&ipa->pdev->dev, true);
+	if (ret)
+		ipa_interrupt_disable(ipa, IPA_IRQ_TX_SUSPEND);
+
+	return ret;
+}
+
+void ipa_power_teardown(struct ipa *ipa)
+{
+	(void)device_init_wakeup(&ipa->pdev->dev, false);
+	ipa_interrupt_disable(ipa, IPA_IRQ_TX_SUSPEND);
+}
+
+/* Initialize IPA power management */
+struct ipa_power *
+ipa_power_init(struct device *dev, const struct ipa_power_data *data)
+{
+	struct ipa_power *power;
+	struct clk *clk;
+	size_t size;
+	int ret;
+
+	clk = clk_get(dev, "core");
+	if (IS_ERR(clk)) {
+		dev_err_probe(dev, PTR_ERR(clk), "error getting core clock\n");
+
+		return ERR_CAST(clk);
+	}
+
+	ret = clk_set_rate(clk, data->core_clock_rate);
+	if (ret) {
+		dev_err(dev, "error %d setting core clock rate to %u\n",
+			ret, data->core_clock_rate);
+		goto err_clk_put;
+	}
+
+	size = struct_size(power, interconnect, data->interconnect_count);
+	power = kzalloc(size, GFP_KERNEL);
+	if (!power) {
+		ret = -ENOMEM;
+		goto err_clk_put;
+	}
+	power->dev = dev;
+	power->core = clk;
+	spin_lock_init(&power->spinlock);
+	power->interconnect_count = data->interconnect_count;
+
+	ret = ipa_interconnect_init(power, data->interconnect_data);
+	if (ret)
+		goto err_kfree;
+
+	ret = ipa_power_retention_init(power);
+	if (ret)
+		goto err_interconnect_exit;
+
+	pm_runtime_set_autosuspend_delay(dev, IPA_AUTOSUSPEND_DELAY);
+	pm_runtime_use_autosuspend(dev);
+	pm_runtime_enable(dev);
+
+	return power;
+
+err_interconnect_exit:
+	ipa_interconnect_exit(power);
+err_kfree:
+	kfree(power);
+err_clk_put:
+	clk_put(clk);
+
+	return ERR_PTR(ret);
+}
+
+/* Inverse of ipa_power_init() */
+void ipa_power_exit(struct ipa_power *power)
+{
+	struct device *dev = power->dev;
+	struct clk *clk = power->core;
+
+	pm_runtime_disable(dev);
+	pm_runtime_dont_use_autosuspend(dev);
+	ipa_power_retention_exit(power);
+	ipa_interconnect_exit(power);
+	kfree(power);
+	clk_put(clk);
+}
+
+const struct dev_pm_ops ipa_pm_ops = {
+	.suspend		= ipa_suspend,
+	.resume			= ipa_resume,
+	.runtime_suspend	= ipa_runtime_suspend,
+	.runtime_resume		= ipa_runtime_resume,
+};
--
cgit v1.2.3
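
Note on the STOPPED/STARTED handshake: the flag logic in ipa_power_modem_queue_stop(), ipa_power_modem_queue_wake() and ipa_power_modem_queue_active() can be exercised outside the kernel. What follows is a minimal user-space sketch, not driver code. struct queue_model and the model_*() helpers are hypothetical names introduced only for illustration. Plain booleans stand in for the power flag bits and for the netdev TX queue state, and the spinlock is left out on the assumption that the model runs single-threaded.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the two power flags and the TX queue state */
struct queue_model {
	bool stopped_flag;	/* models IPA_POWER_FLAG_STOPPED */
	bool started_flag;	/* models IPA_POWER_FLAG_STARTED */
	bool queue_running;	/* models the modem netdev TX queue */
};

/* Models __test_and_clear_bit(): return the old value, then clear it */
static bool model_test_and_clear(bool *flag)
{
	bool old = *flag;

	*flag = false;

	return old;
}

/* Models ipa_power_modem_queue_stop(): stop unless STARTED was set */
static void model_queue_stop(struct queue_model *q)
{
	if (!model_test_and_clear(&q->started_flag)) {
		q->queue_running = false;
		q->stopped_flag = true;
	}
}

/* Models ipa_power_modem_queue_wake(): wake only if STOPPED was set */
static void model_queue_wake(struct queue_model *q)
{
	if (model_test_and_clear(&q->stopped_flag)) {
		q->started_flag = true;
		q->queue_running = true;
	}
}

/* Models ipa_power_modem_queue_active(): clear STARTED */
static void model_queue_active(struct queue_model *q)
{
	q->started_flag = false;
}

/* Replay one interleaving: stop, wake, then a late second stop request.
 * Returns true if the queue is still running afterwards.
 */
static bool model_stale_stop_demo(void)
{
	struct queue_model q = { .queue_running = true };

	model_queue_stop(&q);	/* transmit path finds no power yet */
	model_queue_wake(&q);	/* resume restarts the queue, sets STARTED */
	model_queue_stop(&q);	/* late stop consumes STARTED, queue untouched */

	return q.queue_running;
}
```

In this model the late stop request only clears the STARTED flag, so model_stale_stop_demo() returns true. Calling model_queue_active() before that late stop would clear STARTED first, and the stop would then take effect, which mirrors the "once the TX queue is operating" comment on ipa_power_modem_queue_active().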