aboutsummaryrefslogtreecommitdiff
path: root/Documentation/driver-api/firmware/fallback-mechanisms.rst
diff options
context:
space:
mode:
authorLibravatar Linus Torvalds <torvalds@linux-foundation.org>2023-02-21 18:24:12 -0800
committerLibravatar Linus Torvalds <torvalds@linux-foundation.org>2023-02-21 18:24:12 -0800
commit5b7c4cabbb65f5c469464da6c5f614cbd7f730f2 (patch)
treecc5c2d0a898769fd59549594fedb3ee6f84e59a0 /Documentation/driver-api/firmware/fallback-mechanisms.rst
downloadlinux-5b7c4cabbb65f5c469464da6c5f614cbd7f730f2.tar.gz
linux-5b7c4cabbb65f5c469464da6c5f614cbd7f730f2.zip
Merge tag 'net-next-6.3' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-nextgrafted
Pull networking updates from Jakub Kicinski: "Core: - Add dedicated kmem_cache for typical/small skb->head, avoid having to access struct page at kfree time, and improve memory use. - Introduce sysctl to set default RPS configuration for new netdevs. - Define Netlink protocol specification format which can be used to describe messages used by each family and auto-generate parsers. Add tools for generating kernel data structures and uAPI headers. - Expose all net/core sysctls inside netns. - Remove 4s sleep in netpoll if carrier is instantly detected on boot. - Add configurable limit of MDB entries per port, and port-vlan. - Continue populating drop reasons throughout the stack. - Retire a handful of legacy Qdiscs and classifiers. Protocols: - Support IPv4 big TCP (TSO frames larger than 64kB). - Add IP_LOCAL_PORT_RANGE socket option, to control local port range on socket by socket basis. - Track and report in procfs number of MPTCP sockets used. - Support mixing IPv4 and IPv6 flows in the in-kernel MPTCP path manager. - IPv6: don't check net.ipv6.route.max_size and rely on garbage collection to free memory (similarly to IPv4). - Support Penultimate Segment Pop (PSP) flavor in SRv6 (RFC8986). - ICMP: add per-rate limit counters. - Add support for user scanning requests in ieee802154. - Remove static WEP support. - Support minimal Wi-Fi 7 Extremely High Throughput (EHT) rate reporting. - WiFi 7 EHT channel puncturing support (client & AP). BPF: - Add a rbtree data structure following the "next-gen data structure" precedent set by recently added linked list, that is, by using kfunc + kptr instead of adding a new BPF map type. - Expose XDP hints via kfuncs with initial support for RX hash and timestamp metadata. - Add BPF_F_NO_TUNNEL_KEY extension to bpf_skb_set_tunnel_key to better support decap on GRE tunnel devices not operating in collect metadata. - Improve x86 JIT's codegen for PROBE_MEM runtime error checks. - Remove the need for trace_printk_lock for bpf_trace_printk and bpf_trace_vprintk helpers. - Extend libbpf's bpf_tracing.h support for tracing arguments of kprobes/uprobes and syscall as a special case. - Significantly reduce the search time for module symbols by livepatch and BPF. - Enable cpumasks to be used as kptrs, which is useful for tracing programs tracking which tasks end up running on which CPUs in different time intervals. - Add support for BPF trampoline on s390x and riscv64. - Add capability to export the XDP features supported by the NIC. - Add __bpf_kfunc tag for marking kernel functions as kfuncs. - Add cgroup.memory=nobpf kernel parameter option to disable BPF memory accounting for container environments. Netfilter: - Remove the CLUSTERIP target. It has been marked as obsolete for years, and we still have WARN splats wrt races of the out-of-band /proc interface installed by this target. - Add 'destroy' commands to nf_tables. They are identical to the existing 'delete' commands, but do not return an error if the referenced object (set, chain, rule...) did not exist. Driver API: - Improve cpumask_local_spread() locality to help NICs set the right IRQ affinity on AMD platforms. - Separate C22 and C45 MDIO bus transactions more clearly. - Introduce new DCB table to control DSCP rewrite on egress. - Support configuration of Physical Layer Collision Avoidance (PLCA) Reconciliation Sublayer (RS) (802.3cg-2019). Modern version of shared medium Ethernet. - Support for MAC Merge layer (IEEE 802.3-2018 clause 99). Allowing preemption of low priority frames by high priority frames. - Add support for controlling MACSec offload using netlink SET. - Rework devlink instance refcounts to allow registration and de-registration under the instance lock. Split the code into multiple files, drop some of the unnecessarily granular locks and factor out common parts of netlink operation handling. - Add TX frame aggregation parameters (for USB drivers). - Add a new attr TCA_EXT_WARN_MSG to report TC (offload) warning messages with notifications for debug. - Allow offloading of UDP NEW connections via act_ct. - Add support for per action HW stats in TC. - Support hardware miss to TC action (continue processing in SW from a specific point in the action chain). - Warn if old Wireless Extension user space interface is used with modern cfg80211/mac80211 drivers. Do not support Wireless Extensions for Wi-Fi 7 devices at all. Everyone should switch to using nl80211 interface instead. - Improve the CAN bit timing configuration. Use extack to return error messages directly to user space, update the SJW handling, including the definition of a new default value that will benefit CAN-FD controllers, by increasing their oscillator tolerance. New hardware / drivers: - Ethernet: - nVidia BlueField-3 support (control traffic driver) - Ethernet support for imx93 SoCs - Motorcomm yt8531 gigabit Ethernet PHY - onsemi NCN26000 10BASE-T1S PHY (with support for PLCA) - Microchip LAN8841 PHY (incl. cable diagnostics and PTP) - Amlogic gxl MDIO mux - WiFi: - RealTek RTL8188EU (rtl8xxxu) - Qualcomm Wi-Fi 7 devices (ath12k) - CAN: - Renesas R-Car V4H Drivers: - Bluetooth: - Set Per Platform Antenna Gain (PPAG) for Intel controllers. - Ethernet NICs: - Intel (1G, igc): - support TSN / Qbv / packet scheduling features of i226 model - Intel (100G, ice): - use GNSS subsystem instead of TTY - multi-buffer XDP support - extend support for GPIO pins to E823 devices - nVidia/Mellanox: - update the shared buffer configuration on PFC commands - implement PTP adjphase function for HW offset control - TC support for Geneve and GRE with VF tunnel offload - more efficient crypto key management method - multi-port eswitch support - Netronome/Corigine: - add DCB IEEE support - support IPsec offloading for NFP3800 - Freescale/NXP (enetc): - support XDP_REDIRECT for XDP non-linear buffers - improve reconfig, avoid link flap and waiting for idle - support MAC Merge layer - Other NICs: - sfc/ef100: add basic devlink support for ef100 - ionic: rx_push mode operation (writing descriptors via MMIO) - bnxt: use the auxiliary bus abstraction for RDMA - r8169: disable ASPM and reset bus in case of tx timeout - cpsw: support QSGMII mode for J721e CPSW9G - cpts: support pulse-per-second output - ngbe: add an mdio bus driver - usbnet: optimize usbnet_bh() by avoiding unnecessary queuing - r8152: handle devices with FW with NCM support - amd-xgbe: support 10Mbps, 2.5GbE speeds and rx-adaptation - virtio-net: support multi buffer XDP - virtio/vsock: replace virtio_vsock_pkt with sk_buff - tsnep: XDP support - Ethernet high-speed switches: - nVidia/Mellanox (mlxsw): - add support for latency TLV (in FW control messages) - Microchip (sparx5): - separate explicit and implicit traffic forwarding rules, make the implicit rules always active - add support for egress DSCP rewrite - IS0 VCAP support (Ingress Classification) - IS2 VCAP filters (protos, L3 addrs, L4 ports, flags, ToS etc.) - ES2 VCAP support (Egress Access Control) - support for Per-Stream Filtering and Policing (802.1Q, 8.6.5.1) - Ethernet embedded switches: - Marvell (mv88e6xxx): - add MAB (port auth) offload support - enable PTP receive for mv88e6390 - NXP (ocelot): - support MAC Merge layer - support for the the vsc7512 internal copper phys - Microchip: - lan9303: convert to PHYLINK - lan966x: support TC flower filter statistics - lan937x: PTP support for KSZ9563/KSZ8563 and LAN937x - lan937x: support Credit Based Shaper configuration - ksz9477: support Energy Efficient Ethernet - other: - qca8k: convert to regmap read/write API, use bulk operations - rswitch: Improve TX timestamp accuracy - Intel WiFi (iwlwifi): - EHT (Wi-Fi 7) rate reporting - STEP equalizer support: transfer some STEP (connection to radio on platforms with integrated wifi) related parameters from the BIOS to the firmware. - Qualcomm 802.11ax WiFi (ath11k): - IPQ5018 support - Fine Timing Measurement (FTM) responder role support - channel 177 support - MediaTek WiFi (mt76): - per-PHY LED support - mt7996: EHT (Wi-Fi 7) support - Wireless Ethernet Dispatch (WED) reset support - switch to using page pool allocator - RealTek WiFi (rtw89): - support new version of Bluetooth co-existance - Mobile: - rmnet: support TX aggregation" * tag 'net-next-6.3' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (1872 commits) page_pool: add a comment explaining the fragment counter usage net: ethtool: fix __ethtool_dev_mm_supported() implementation ethtool: pse-pd: Fix double word in comments xsk: add linux/vmalloc.h to xsk.c sefltests: netdevsim: wait for devlink instance after netns removal selftest: fib_tests: Always cleanup before exit net/mlx5e: Align IPsec ASO result memory to be as required by hardware net/mlx5e: TC, Set CT miss to the specific ct action instance net/mlx5e: Rename CHAIN_TO_REG to MAPPED_OBJ_TO_REG net/mlx5: Refactor tc miss handling to a single function net/mlx5: Kconfig: Make tc offload depend on tc skb extension net/sched: flower: Support hardware miss to tc action net/sched: flower: Move filter handle initialization earlier net/sched: cls_api: Support hardware miss to tc action net/sched: Rename user cookie and act cookie sfc: fix builds without CONFIG_RTC_LIB sfc: clean up some inconsistent indentings net/mlx4_en: Introduce flexible array to silence overflow warning net: lan966x: Fix possible deadlock inside PTP net/ulp: Remove redundant ->clone() test in inet_clone_ulp(). ...
Diffstat (limited to 'Documentation/driver-api/firmware/fallback-mechanisms.rst')
-rw-r--r--Documentation/driver-api/firmware/fallback-mechanisms.rst308
1 files changed, 308 insertions, 0 deletions
diff --git a/Documentation/driver-api/firmware/fallback-mechanisms.rst b/Documentation/driver-api/firmware/fallback-mechanisms.rst
new file mode 100644
index 000000000..5f04c3bcd
--- /dev/null
+++ b/Documentation/driver-api/firmware/fallback-mechanisms.rst
@@ -0,0 +1,308 @@
+===================
+Fallback mechanisms
+===================
+
+A fallback mechanism is supported to allow to overcome failures to do a direct
+filesystem lookup on the root filesystem or when the firmware simply cannot be
+installed for practical reasons on the root filesystem. The kernel
+configuration options related to supporting the firmware fallback mechanism are:
+
+ * CONFIG_FW_LOADER_USER_HELPER: enables building the firmware fallback
+ mechanism. Most distributions enable this option today. If enabled but
+ CONFIG_FW_LOADER_USER_HELPER_FALLBACK is disabled, only the custom fallback
+ mechanism is available and for the request_firmware_nowait() call.
+ * CONFIG_FW_LOADER_USER_HELPER_FALLBACK: force enables each request to
+ enable the kobject uevent fallback mechanism on all firmware API calls
+ except request_firmware_direct(). Most distributions disable this option
+ today. The call request_firmware_nowait() allows for one alternative
+ fallback mechanism: if this kconfig option is enabled and your second
+ argument to request_firmware_nowait(), uevent, is set to false you are
+ informing the kernel that you have a custom fallback mechanism and it will
+ manually load the firmware. Read below for more details.
+
+Note that this means when having this configuration:
+
+CONFIG_FW_LOADER_USER_HELPER=y
+CONFIG_FW_LOADER_USER_HELPER_FALLBACK=n
+
+the kobject uevent fallback mechanism will never take effect even
+for request_firmware_nowait() when uevent is set to true.
+
+Justifying the firmware fallback mechanism
+==========================================
+
+Direct filesystem lookups may fail for a variety of reasons. Known reasons for
+this are worth itemizing and documenting as it justifies the need for the
+fallback mechanism:
+
+* Race against access with the root filesystem upon bootup.
+
+* Races upon resume from suspend. This is resolved by the firmware cache, but
+ the firmware cache is only supported if you use uevents, and its not
+ supported for request_firmware_into_buf().
+
+* Firmware is not accessible through typical means:
+
+ * It cannot be installed into the root filesystem
+ * The firmware provides very unique device specific data tailored for
+ the unit gathered with local information. An example is calibration
+ data for WiFi chipsets for mobile devices. This calibration data is
+ not common to all units, but tailored per unit. Such information may
+ be installed on a separate flash partition other than where the root
+ filesystem is provided.
+
+Types of fallback mechanisms
+============================
+
+There are really two fallback mechanisms available using one shared sysfs
+interface as a loading facility:
+
+* Kobject uevent fallback mechanism
+* Custom fallback mechanism
+
+First lets document the shared sysfs loading facility.
+
+Firmware sysfs loading facility
+===============================
+
+In order to help device drivers upload firmware using a fallback mechanism
+the firmware infrastructure creates a sysfs interface to enable userspace
+to load and indicate when firmware is ready. The sysfs directory is created
+via fw_create_instance(). This call creates a new struct device named after
+the firmware requested, and establishes it in the device hierarchy by
+associating the device used to make the request as the device's parent.
+The sysfs directory's file attributes are defined and controlled through
+the new device's class (firmware_class) and group (fw_dev_attr_groups).
+This is actually where the original firmware_class module name came from,
+given that originally the only firmware loading mechanism available was the
+mechanism we now use as a fallback mechanism, which registers a struct class
+firmware_class. Because the attributes exposed are part of the module name, the
+module name firmware_class cannot be renamed in the future, to ensure backward
+compatibility with old userspace.
+
+To load firmware using the sysfs interface we expose a loading indicator,
+and a file upload firmware into:
+
+ * /sys/$DEVPATH/loading
+ * /sys/$DEVPATH/data
+
+To upload firmware you will echo 1 onto the loading file to indicate
+you are loading firmware. You then write the firmware into the data file,
+and you notify the kernel the firmware is ready by echo'ing 0 onto
+the loading file.
+
+The firmware device used to help load firmware using sysfs is only created if
+direct firmware loading fails and if the fallback mechanism is enabled for your
+firmware request, this is set up with :c:func:`firmware_fallback_sysfs`. It is
+important to re-iterate that no device is created if a direct filesystem lookup
+succeeded.
+
+Using::
+
+ echo 1 > /sys/$DEVPATH/loading
+
+Will clean any previous partial load at once and make the firmware API
+return an error. When loading firmware the firmware_class grows a buffer
+for the firmware in PAGE_SIZE increments to hold the image as it comes in.
+
+firmware_data_read() and firmware_loading_show() are just provided for the
+test_firmware driver for testing, they are not called in normal use or
+expected to be used regularly by userspace.
+
+firmware_fallback_sysfs
+-----------------------
+.. kernel-doc:: drivers/base/firmware_loader/fallback.c
+ :functions: firmware_fallback_sysfs
+
+Firmware kobject uevent fallback mechanism
+==========================================
+
+Since a device is created for the sysfs interface to help load firmware as a
+fallback mechanism userspace can be informed of the addition of the device by
+relying on kobject uevents. The addition of the device into the device
+hierarchy means the fallback mechanism for firmware loading has been initiated.
+For details of implementation refer to fw_load_sysfs_fallback(), in particular
+on the use of dev_set_uevent_suppress() and kobject_uevent().
+
+The kernel's kobject uevent mechanism is implemented in lib/kobject_uevent.c,
+it issues uevents to userspace. As a supplement to kobject uevents Linux
+distributions could also enable CONFIG_UEVENT_HELPER_PATH, which makes use of
+core kernel's usermode helper (UMH) functionality to call out to a userspace
+helper for kobject uevents. In practice though no standard distribution has
+ever used the CONFIG_UEVENT_HELPER_PATH. If CONFIG_UEVENT_HELPER_PATH is
+enabled this binary would be called each time kobject_uevent_env() gets called
+in the kernel for each kobject uevent triggered.
+
+Different implementations have been supported in userspace to take advantage of
+this fallback mechanism. When firmware loading was only possible using the
+sysfs mechanism the userspace component "hotplug" provided the functionality of
+monitoring for kobject events. Historically this was superseded be systemd's
+udev, however firmware loading support was removed from udev as of systemd
+commit be2ea723b1d0 ("udev: remove userspace firmware loading support")
+as of v217 on August, 2014. This means most Linux distributions today are
+not using or taking advantage of the firmware fallback mechanism provided
+by kobject uevents. This is specially exacerbated due to the fact that most
+distributions today disable CONFIG_FW_LOADER_USER_HELPER_FALLBACK.
+
+Refer to do_firmware_uevent() for details of the kobject event variables
+setup. The variables currently passed to userspace with a "kobject add"
+event are:
+
+* FIRMWARE=firmware name
+* TIMEOUT=timeout value
+* ASYNC=whether or not the API request was asynchronous
+
+By default DEVPATH is set by the internal kernel kobject infrastructure.
+Below is an example simple kobject uevent script::
+
+ # Both $DEVPATH and $FIRMWARE are already provided in the environment.
+ MY_FW_DIR=/lib/firmware/
+ echo 1 > /sys/$DEVPATH/loading
+ cat $MY_FW_DIR/$FIRMWARE > /sys/$DEVPATH/data
+ echo 0 > /sys/$DEVPATH/loading
+
+Firmware custom fallback mechanism
+==================================
+
+Users of the request_firmware_nowait() call have yet another option available
+at their disposal: rely on the sysfs fallback mechanism but request that no
+kobject uevents be issued to userspace. The original logic behind this
+was that utilities other than udev might be required to lookup firmware
+in non-traditional paths -- paths outside of the listing documented in the
+section 'Direct filesystem lookup'. This option is not available to any of
+the other API calls as uevents are always forced for them.
+
+Since uevents are only meaningful if the fallback mechanism is enabled
+in your kernel it would seem odd to enable uevents with kernels that do not
+have the fallback mechanism enabled in their kernels. Unfortunately we also
+rely on the uevent flag which can be disabled by request_firmware_nowait() to
+also setup the firmware cache for firmware requests. As documented above,
+the firmware cache is only set up if uevent is enabled for an API call.
+Although this can disable the firmware cache for request_firmware_nowait()
+calls, users of this API should not use it for the purposes of disabling
+the cache as that was not the original purpose of the flag. Not setting
+the uevent flag means you want to opt-in for the firmware fallback mechanism
+but you want to suppress kobject uevents, as you have a custom solution which
+will monitor for your device addition into the device hierarchy somehow and
+load firmware for you through a custom path.
+
+Firmware fallback timeout
+=========================
+
+The firmware fallback mechanism has a timeout. If firmware is not loaded
+onto the sysfs interface by the timeout value an error is sent to the
+driver. By default the timeout is set to 60 seconds if uevents are
+desirable, otherwise MAX_JIFFY_OFFSET is used (max timeout possible).
+The logic behind using MAX_JIFFY_OFFSET for non-uevents is that a custom
+solution will have as much time as it needs to load firmware.
+
+You can customize the firmware timeout by echo'ing your desired timeout into
+the following file:
+
+* /sys/class/firmware/timeout
+
+If you echo 0 into it means MAX_JIFFY_OFFSET will be used. The data type
+for the timeout is an int.
+
+EFI embedded firmware fallback mechanism
+========================================
+
+On some devices the system's EFI code / ROM may contain an embedded copy
+of firmware for some of the system's integrated peripheral devices and
+the peripheral's Linux device-driver needs to access this firmware.
+
+Device drivers which need such firmware can use the
+firmware_request_platform() function for this, note that this is a
+separate fallback mechanism from the other fallback mechanisms and
+this does not use the sysfs interface.
+
+A device driver which needs this can describe the firmware it needs
+using an efi_embedded_fw_desc struct:
+
+.. kernel-doc:: include/linux/efi_embedded_fw.h
+ :functions: efi_embedded_fw_desc
+
+The EFI embedded-fw code works by scanning all EFI_BOOT_SERVICES_CODE memory
+segments for an eight byte sequence matching prefix; if the prefix is found it
+then does a sha256 over length bytes and if that matches makes a copy of length
+bytes and adds that to its list with found firmwares.
+
+To avoid doing this somewhat expensive scan on all systems, dmi matching is
+used. Drivers are expected to export a dmi_system_id array, with each entries'
+driver_data pointing to an efi_embedded_fw_desc.
+
+To register this array with the efi-embedded-fw code, a driver needs to:
+
+1. Always be builtin to the kernel or store the dmi_system_id array in a
+ separate object file which always gets builtin.
+
+2. Add an extern declaration for the dmi_system_id array to
+ include/linux/efi_embedded_fw.h.
+
+3. Add the dmi_system_id array to the embedded_fw_table in
+ drivers/firmware/efi/embedded-firmware.c wrapped in a #ifdef testing that
+ the driver is being builtin.
+
+4. Add "select EFI_EMBEDDED_FIRMWARE if EFI_STUB" to its Kconfig entry.
+
+The firmware_request_platform() function will always first try to load firmware
+with the specified name directly from the disk, so the EFI embedded-fw can
+always be overridden by placing a file under /lib/firmware.
+
+Note that:
+
+1. The code scanning for EFI embedded-firmware runs near the end
+ of start_kernel(), just before calling rest_init(). For normal drivers and
+ subsystems using subsys_initcall() to register themselves this does not
+ matter. This means that code running earlier cannot use EFI
+ embedded-firmware.
+
+2. At the moment the EFI embedded-fw code assumes that firmwares always start at
+ an offset which is a multiple of 8 bytes, if this is not true for your case
+ send in a patch to fix this.
+
+3. At the moment the EFI embedded-fw code only works on x86 because other archs
+ free EFI_BOOT_SERVICES_CODE before the EFI embedded-fw code gets a chance to
+ scan it.
+
+4. The current brute-force scanning of EFI_BOOT_SERVICES_CODE is an ad-hoc
+ brute-force solution. There has been discussion to use the UEFI Platform
+ Initialization (PI) spec's Firmware Volume protocol. This has been rejected
+ because the FV Protocol relies on *internal* interfaces of the PI spec, and:
+ 1. The PI spec does not define peripheral firmware at all
+ 2. The internal interfaces of the PI spec do not guarantee any backward
+ compatibility. Any implementation details in FV may be subject to change,
+ and may vary system to system. Supporting the FV Protocol would be
+ difficult as it is purposely ambiguous.
+
+Example how to check for and extract embedded firmware
+------------------------------------------------------
+
+To check for, for example Silead touchscreen controller embedded firmware,
+do the following:
+
+1. Boot the system with efi=debug on the kernel commandline
+
+2. cp /sys/kernel/debug/efi/boot_services_code? to your home dir
+
+3. Open the boot_services_code? files in a hex-editor, search for the
+ magic prefix for Silead firmware: F0 00 00 00 02 00 00 00, this gives you
+ the beginning address of the firmware inside the boot_services_code? file.
+
+4. The firmware has a specific pattern, it starts with a 8 byte page-address,
+ typically F0 00 00 00 02 00 00 00 for the first page followed by 32-bit
+ word-address + 32-bit value pairs. With the word-address incrementing 4
+ bytes (1 word) for each pair until a page is complete. A complete page is
+ followed by a new page-address, followed by more word + value pairs. This
+ leads to a very distinct pattern. Scroll down until this pattern stops,
+ this gives you the end of the firmware inside the boot_services_code? file.
+
+5. "dd if=boot_services_code? of=firmware bs=1 skip=<begin-addr> count=<len>"
+ will extract the firmware for you. Inspect the firmware file in a
+ hexeditor to make sure you got the dd parameters correct.
+
+6. Copy it to /lib/firmware under the expected name to test it.
+
+7. If the extracted firmware works, you can use the found info to fill an
+ efi_embedded_fw_desc struct to describe it, run "sha256sum firmware"
+ to get the sha256sum to put in the sha256 field.