From 5b7c4cabbb65f5c469464da6c5f614cbd7f730f2 Mon Sep 17 00:00:00 2001 From: Linus Torvalds Date: Tue, 21 Feb 2023 18:24:12 -0800 Subject: Merge tag 'net-next-6.3' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next Pull networking updates from Jakub Kicinski: "Core: - Add dedicated kmem_cache for typical/small skb->head, avoid having to access struct page at kfree time, and improve memory use. - Introduce sysctl to set default RPS configuration for new netdevs. - Define Netlink protocol specification format which can be used to describe messages used by each family and auto-generate parsers. Add tools for generating kernel data structures and uAPI headers. - Expose all net/core sysctls inside netns. - Remove 4s sleep in netpoll if carrier is instantly detected on boot. - Add configurable limit of MDB entries per port, and port-vlan. - Continue populating drop reasons throughout the stack. - Retire a handful of legacy Qdiscs and classifiers. Protocols: - Support IPv4 big TCP (TSO frames larger than 64kB). - Add IP_LOCAL_PORT_RANGE socket option, to control local port range on socket by socket basis. - Track and report in procfs number of MPTCP sockets used. - Support mixing IPv4 and IPv6 flows in the in-kernel MPTCP path manager. - IPv6: don't check net.ipv6.route.max_size and rely on garbage collection to free memory (similarly to IPv4). - Support Penultimate Segment Pop (PSP) flavor in SRv6 (RFC8986). - ICMP: add per-rate limit counters. - Add support for user scanning requests in ieee802154. - Remove static WEP support. - Support minimal Wi-Fi 7 Extremely High Throughput (EHT) rate reporting. - WiFi 7 EHT channel puncturing support (client & AP). BPF: - Add a rbtree data structure following the "next-gen data structure" precedent set by recently added linked list, that is, by using kfunc + kptr instead of adding a new BPF map type. - Expose XDP hints via kfuncs with initial support for RX hash and timestamp metadata. - Add BPF_F_NO_TUNNEL_KEY extension to bpf_skb_set_tunnel_key to better support decap on GRE tunnel devices not operating in collect metadata. - Improve x86 JIT's codegen for PROBE_MEM runtime error checks. - Remove the need for trace_printk_lock for bpf_trace_printk and bpf_trace_vprintk helpers. - Extend libbpf's bpf_tracing.h support for tracing arguments of kprobes/uprobes and syscall as a special case. - Significantly reduce the search time for module symbols by livepatch and BPF. - Enable cpumasks to be used as kptrs, which is useful for tracing programs tracking which tasks end up running on which CPUs in different time intervals. - Add support for BPF trampoline on s390x and riscv64. - Add capability to export the XDP features supported by the NIC. - Add __bpf_kfunc tag for marking kernel functions as kfuncs. - Add cgroup.memory=nobpf kernel parameter option to disable BPF memory accounting for container environments. Netfilter: - Remove the CLUSTERIP target. It has been marked as obsolete for years, and we still have WARN splats wrt races of the out-of-band /proc interface installed by this target. - Add 'destroy' commands to nf_tables. They are identical to the existing 'delete' commands, but do not return an error if the referenced object (set, chain, rule...) did not exist. Driver API: - Improve cpumask_local_spread() locality to help NICs set the right IRQ affinity on AMD platforms. - Separate C22 and C45 MDIO bus transactions more clearly. - Introduce new DCB table to control DSCP rewrite on egress. - Support configuration of Physical Layer Collision Avoidance (PLCA) Reconciliation Sublayer (RS) (802.3cg-2019). Modern version of shared medium Ethernet. - Support for MAC Merge layer (IEEE 802.3-2018 clause 99). Allowing preemption of low priority frames by high priority frames. - Add support for controlling MACSec offload using netlink SET. - Rework devlink instance refcounts to allow registration and de-registration under the instance lock. Split the code into multiple files, drop some of the unnecessarily granular locks and factor out common parts of netlink operation handling. - Add TX frame aggregation parameters (for USB drivers). - Add a new attr TCA_EXT_WARN_MSG to report TC (offload) warning messages with notifications for debug. - Allow offloading of UDP NEW connections via act_ct. - Add support for per action HW stats in TC. - Support hardware miss to TC action (continue processing in SW from a specific point in the action chain). - Warn if old Wireless Extension user space interface is used with modern cfg80211/mac80211 drivers. Do not support Wireless Extensions for Wi-Fi 7 devices at all. Everyone should switch to using nl80211 interface instead. - Improve the CAN bit timing configuration. Use extack to return error messages directly to user space, update the SJW handling, including the definition of a new default value that will benefit CAN-FD controllers, by increasing their oscillator tolerance. New hardware / drivers: - Ethernet: - nVidia BlueField-3 support (control traffic driver) - Ethernet support for imx93 SoCs - Motorcomm yt8531 gigabit Ethernet PHY - onsemi NCN26000 10BASE-T1S PHY (with support for PLCA) - Microchip LAN8841 PHY (incl. cable diagnostics and PTP) - Amlogic gxl MDIO mux - WiFi: - RealTek RTL8188EU (rtl8xxxu) - Qualcomm Wi-Fi 7 devices (ath12k) - CAN: - Renesas R-Car V4H Drivers: - Bluetooth: - Set Per Platform Antenna Gain (PPAG) for Intel controllers. - Ethernet NICs: - Intel (1G, igc): - support TSN / Qbv / packet scheduling features of i226 model - Intel (100G, ice): - use GNSS subsystem instead of TTY - multi-buffer XDP support - extend support for GPIO pins to E823 devices - nVidia/Mellanox: - update the shared buffer configuration on PFC commands - implement PTP adjphase function for HW offset control - TC support for Geneve and GRE with VF tunnel offload - more efficient crypto key management method - multi-port eswitch support - Netronome/Corigine: - add DCB IEEE support - support IPsec offloading for NFP3800 - Freescale/NXP (enetc): - support XDP_REDIRECT for XDP non-linear buffers - improve reconfig, avoid link flap and waiting for idle - support MAC Merge layer - Other NICs: - sfc/ef100: add basic devlink support for ef100 - ionic: rx_push mode operation (writing descriptors via MMIO) - bnxt: use the auxiliary bus abstraction for RDMA - r8169: disable ASPM and reset bus in case of tx timeout - cpsw: support QSGMII mode for J721e CPSW9G - cpts: support pulse-per-second output - ngbe: add an mdio bus driver - usbnet: optimize usbnet_bh() by avoiding unnecessary queuing - r8152: handle devices with FW with NCM support - amd-xgbe: support 10Mbps, 2.5GbE speeds and rx-adaptation - virtio-net: support multi buffer XDP - virtio/vsock: replace virtio_vsock_pkt with sk_buff - tsnep: XDP support - Ethernet high-speed switches: - nVidia/Mellanox (mlxsw): - add support for latency TLV (in FW control messages) - Microchip (sparx5): - separate explicit and implicit traffic forwarding rules, make the implicit rules always active - add support for egress DSCP rewrite - IS0 VCAP support (Ingress Classification) - IS2 VCAP filters (protos, L3 addrs, L4 ports, flags, ToS etc.) - ES2 VCAP support (Egress Access Control) - support for Per-Stream Filtering and Policing (802.1Q, 8.6.5.1) - Ethernet embedded switches: - Marvell (mv88e6xxx): - add MAB (port auth) offload support - enable PTP receive for mv88e6390 - NXP (ocelot): - support MAC Merge layer - support for the the vsc7512 internal copper phys - Microchip: - lan9303: convert to PHYLINK - lan966x: support TC flower filter statistics - lan937x: PTP support for KSZ9563/KSZ8563 and LAN937x - lan937x: support Credit Based Shaper configuration - ksz9477: support Energy Efficient Ethernet - other: - qca8k: convert to regmap read/write API, use bulk operations - rswitch: Improve TX timestamp accuracy - Intel WiFi (iwlwifi): - EHT (Wi-Fi 7) rate reporting - STEP equalizer support: transfer some STEP (connection to radio on platforms with integrated wifi) related parameters from the BIOS to the firmware. - Qualcomm 802.11ax WiFi (ath11k): - IPQ5018 support - Fine Timing Measurement (FTM) responder role support - channel 177 support - MediaTek WiFi (mt76): - per-PHY LED support - mt7996: EHT (Wi-Fi 7) support - Wireless Ethernet Dispatch (WED) reset support - switch to using page pool allocator - RealTek WiFi (rtw89): - support new version of Bluetooth co-existance - Mobile: - rmnet: support TX aggregation" * tag 'net-next-6.3' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (1872 commits) page_pool: add a comment explaining the fragment counter usage net: ethtool: fix __ethtool_dev_mm_supported() implementation ethtool: pse-pd: Fix double word in comments xsk: add linux/vmalloc.h to xsk.c sefltests: netdevsim: wait for devlink instance after netns removal selftest: fib_tests: Always cleanup before exit net/mlx5e: Align IPsec ASO result memory to be as required by hardware net/mlx5e: TC, Set CT miss to the specific ct action instance net/mlx5e: Rename CHAIN_TO_REG to MAPPED_OBJ_TO_REG net/mlx5: Refactor tc miss handling to a single function net/mlx5: Kconfig: Make tc offload depend on tc skb extension net/sched: flower: Support hardware miss to tc action net/sched: flower: Move filter handle initialization earlier net/sched: cls_api: Support hardware miss to tc action net/sched: Rename user cookie and act cookie sfc: fix builds without CONFIG_RTC_LIB sfc: clean up some inconsistent indentings net/mlx4_en: Introduce flexible array to silence overflow warning net: lan966x: Fix possible deadlock inside PTP net/ulp: Remove redundant ->clone() test in inet_clone_ulp(). ... --- lib/scatterlist.c | 1097 +++++++++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 1097 insertions(+) create mode 100644 lib/scatterlist.c (limited to 'lib/scatterlist.c') diff --git a/lib/scatterlist.c b/lib/scatterlist.c new file mode 100644 index 000000000..8d7519a8f --- /dev/null +++ b/lib/scatterlist.c @@ -0,0 +1,1097 @@ +// SPDX-License-Identifier: GPL-2.0-only +/* + * Copyright (C) 2007 Jens Axboe + * + * Scatterlist handling helpers. + */ +#include +#include +#include +#include +#include + +/** + * sg_next - return the next scatterlist entry in a list + * @sg: The current sg entry + * + * Description: + * Usually the next entry will be @sg@ + 1, but if this sg element is part + * of a chained scatterlist, it could jump to the start of a new + * scatterlist array. + * + **/ +struct scatterlist *sg_next(struct scatterlist *sg) +{ + if (sg_is_last(sg)) + return NULL; + + sg++; + if (unlikely(sg_is_chain(sg))) + sg = sg_chain_ptr(sg); + + return sg; +} +EXPORT_SYMBOL(sg_next); + +/** + * sg_nents - return total count of entries in scatterlist + * @sg: The scatterlist + * + * Description: + * Allows to know how many entries are in sg, taking into account + * chaining as well + * + **/ +int sg_nents(struct scatterlist *sg) +{ + int nents; + for (nents = 0; sg; sg = sg_next(sg)) + nents++; + return nents; +} +EXPORT_SYMBOL(sg_nents); + +/** + * sg_nents_for_len - return total count of entries in scatterlist + * needed to satisfy the supplied length + * @sg: The scatterlist + * @len: The total required length + * + * Description: + * Determines the number of entries in sg that are required to meet + * the supplied length, taking into account chaining as well + * + * Returns: + * the number of sg entries needed, negative error on failure + * + **/ +int sg_nents_for_len(struct scatterlist *sg, u64 len) +{ + int nents; + u64 total; + + if (!len) + return 0; + + for (nents = 0, total = 0; sg; sg = sg_next(sg)) { + nents++; + total += sg->length; + if (total >= len) + return nents; + } + + return -EINVAL; +} +EXPORT_SYMBOL(sg_nents_for_len); + +/** + * sg_last - return the last scatterlist entry in a list + * @sgl: First entry in the scatterlist + * @nents: Number of entries in the scatterlist + * + * Description: + * Should only be used casually, it (currently) scans the entire list + * to get the last entry. + * + * Note that the @sgl@ pointer passed in need not be the first one, + * the important bit is that @nents@ denotes the number of entries that + * exist from @sgl@. + * + **/ +struct scatterlist *sg_last(struct scatterlist *sgl, unsigned int nents) +{ + struct scatterlist *sg, *ret = NULL; + unsigned int i; + + for_each_sg(sgl, sg, nents, i) + ret = sg; + + BUG_ON(!sg_is_last(ret)); + return ret; +} +EXPORT_SYMBOL(sg_last); + +/** + * sg_init_table - Initialize SG table + * @sgl: The SG table + * @nents: Number of entries in table + * + * Notes: + * If this is part of a chained sg table, sg_mark_end() should be + * used only on the last table part. + * + **/ +void sg_init_table(struct scatterlist *sgl, unsigned int nents) +{ + memset(sgl, 0, sizeof(*sgl) * nents); + sg_init_marker(sgl, nents); +} +EXPORT_SYMBOL(sg_init_table); + +/** + * sg_init_one - Initialize a single entry sg list + * @sg: SG entry + * @buf: Virtual address for IO + * @buflen: IO length + * + **/ +void sg_init_one(struct scatterlist *sg, const void *buf, unsigned int buflen) +{ + sg_init_table(sg, 1); + sg_set_buf(sg, buf, buflen); +} +EXPORT_SYMBOL(sg_init_one); + +/* + * The default behaviour of sg_alloc_table() is to use these kmalloc/kfree + * helpers. + */ +static struct scatterlist *sg_kmalloc(unsigned int nents, gfp_t gfp_mask) +{ + if (nents == SG_MAX_SINGLE_ALLOC) { + /* + * Kmemleak doesn't track page allocations as they are not + * commonly used (in a raw form) for kernel data structures. + * As we chain together a list of pages and then a normal + * kmalloc (tracked by kmemleak), in order to for that last + * allocation not to become decoupled (and thus a + * false-positive) we need to inform kmemleak of all the + * intermediate allocations. + */ + void *ptr = (void *) __get_free_page(gfp_mask); + kmemleak_alloc(ptr, PAGE_SIZE, 1, gfp_mask); + return ptr; + } else + return kmalloc_array(nents, sizeof(struct scatterlist), + gfp_mask); +} + +static void sg_kfree(struct scatterlist *sg, unsigned int nents) +{ + if (nents == SG_MAX_SINGLE_ALLOC) { + kmemleak_free(sg); + free_page((unsigned long) sg); + } else + kfree(sg); +} + +/** + * __sg_free_table - Free a previously mapped sg table + * @table: The sg table header to use + * @max_ents: The maximum number of entries per single scatterlist + * @nents_first_chunk: Number of entries int the (preallocated) first + * scatterlist chunk, 0 means no such preallocated first chunk + * @free_fn: Free function + * @num_ents: Number of entries in the table + * + * Description: + * Free an sg table previously allocated and setup with + * __sg_alloc_table(). The @max_ents value must be identical to + * that previously used with __sg_alloc_table(). + * + **/ +void __sg_free_table(struct sg_table *table, unsigned int max_ents, + unsigned int nents_first_chunk, sg_free_fn *free_fn, + unsigned int num_ents) +{ + struct scatterlist *sgl, *next; + unsigned curr_max_ents = nents_first_chunk ?: max_ents; + + if (unlikely(!table->sgl)) + return; + + sgl = table->sgl; + while (num_ents) { + unsigned int alloc_size = num_ents; + unsigned int sg_size; + + /* + * If we have more than max_ents segments left, + * then assign 'next' to the sg table after the current one. + * sg_size is then one less than alloc size, since the last + * element is the chain pointer. + */ + if (alloc_size > curr_max_ents) { + next = sg_chain_ptr(&sgl[curr_max_ents - 1]); + alloc_size = curr_max_ents; + sg_size = alloc_size - 1; + } else { + sg_size = alloc_size; + next = NULL; + } + + num_ents -= sg_size; + if (nents_first_chunk) + nents_first_chunk = 0; + else + free_fn(sgl, alloc_size); + sgl = next; + curr_max_ents = max_ents; + } + + table->sgl = NULL; +} +EXPORT_SYMBOL(__sg_free_table); + +/** + * sg_free_append_table - Free a previously allocated append sg table. + * @table: The mapped sg append table header + * + **/ +void sg_free_append_table(struct sg_append_table *table) +{ + __sg_free_table(&table->sgt, SG_MAX_SINGLE_ALLOC, 0, sg_kfree, + table->total_nents); +} +EXPORT_SYMBOL(sg_free_append_table); + + +/** + * sg_free_table - Free a previously allocated sg table + * @table: The mapped sg table header + * + **/ +void sg_free_table(struct sg_table *table) +{ + __sg_free_table(table, SG_MAX_SINGLE_ALLOC, 0, sg_kfree, + table->orig_nents); +} +EXPORT_SYMBOL(sg_free_table); + +/** + * __sg_alloc_table - Allocate and initialize an sg table with given allocator + * @table: The sg table header to use + * @nents: Number of entries in sg list + * @max_ents: The maximum number of entries the allocator returns per call + * @nents_first_chunk: Number of entries int the (preallocated) first + * scatterlist chunk, 0 means no such preallocated chunk provided by user + * @gfp_mask: GFP allocation mask + * @alloc_fn: Allocator to use + * + * Description: + * This function returns a @table @nents long. The allocator is + * defined to return scatterlist chunks of maximum size @max_ents. + * Thus if @nents is bigger than @max_ents, the scatterlists will be + * chained in units of @max_ents. + * + * Notes: + * If this function returns non-0 (eg failure), the caller must call + * __sg_free_table() to cleanup any leftover allocations. + * + **/ +int __sg_alloc_table(struct sg_table *table, unsigned int nents, + unsigned int max_ents, struct scatterlist *first_chunk, + unsigned int nents_first_chunk, gfp_t gfp_mask, + sg_alloc_fn *alloc_fn) +{ + struct scatterlist *sg, *prv; + unsigned int left; + unsigned curr_max_ents = nents_first_chunk ?: max_ents; + unsigned prv_max_ents; + + memset(table, 0, sizeof(*table)); + + if (nents == 0) + return -EINVAL; +#ifdef CONFIG_ARCH_NO_SG_CHAIN + if (WARN_ON_ONCE(nents > max_ents)) + return -EINVAL; +#endif + + left = nents; + prv = NULL; + do { + unsigned int sg_size, alloc_size = left; + + if (alloc_size > curr_max_ents) { + alloc_size = curr_max_ents; + sg_size = alloc_size - 1; + } else + sg_size = alloc_size; + + left -= sg_size; + + if (first_chunk) { + sg = first_chunk; + first_chunk = NULL; + } else { + sg = alloc_fn(alloc_size, gfp_mask); + } + if (unlikely(!sg)) { + /* + * Adjust entry count to reflect that the last + * entry of the previous table won't be used for + * linkage. Without this, sg_kfree() may get + * confused. + */ + if (prv) + table->nents = ++table->orig_nents; + + return -ENOMEM; + } + + sg_init_table(sg, alloc_size); + table->nents = table->orig_nents += sg_size; + + /* + * If this is the first mapping, assign the sg table header. + * If this is not the first mapping, chain previous part. + */ + if (prv) + sg_chain(prv, prv_max_ents, sg); + else + table->sgl = sg; + + /* + * If no more entries after this one, mark the end + */ + if (!left) + sg_mark_end(&sg[sg_size - 1]); + + prv = sg; + prv_max_ents = curr_max_ents; + curr_max_ents = max_ents; + } while (left); + + return 0; +} +EXPORT_SYMBOL(__sg_alloc_table); + +/** + * sg_alloc_table - Allocate and initialize an sg table + * @table: The sg table header to use + * @nents: Number of entries in sg list + * @gfp_mask: GFP allocation mask + * + * Description: + * Allocate and initialize an sg table. If @nents@ is larger than + * SG_MAX_SINGLE_ALLOC a chained sg table will be setup. + * + **/ +int sg_alloc_table(struct sg_table *table, unsigned int nents, gfp_t gfp_mask) +{ + int ret; + + ret = __sg_alloc_table(table, nents, SG_MAX_SINGLE_ALLOC, + NULL, 0, gfp_mask, sg_kmalloc); + if (unlikely(ret)) + sg_free_table(table); + return ret; +} +EXPORT_SYMBOL(sg_alloc_table); + +static struct scatterlist *get_next_sg(struct sg_append_table *table, + struct scatterlist *cur, + unsigned long needed_sges, + gfp_t gfp_mask) +{ + struct scatterlist *new_sg, *next_sg; + unsigned int alloc_size; + + if (cur) { + next_sg = sg_next(cur); + /* Check if last entry should be keeped for chainning */ + if (!sg_is_last(next_sg) || needed_sges == 1) + return next_sg; + } + + alloc_size = min_t(unsigned long, needed_sges, SG_MAX_SINGLE_ALLOC); + new_sg = sg_kmalloc(alloc_size, gfp_mask); + if (!new_sg) + return ERR_PTR(-ENOMEM); + sg_init_table(new_sg, alloc_size); + if (cur) { + table->total_nents += alloc_size - 1; + __sg_chain(next_sg, new_sg); + } else { + table->sgt.sgl = new_sg; + table->total_nents = alloc_size; + } + return new_sg; +} + +static bool pages_are_mergeable(struct page *a, struct page *b) +{ + if (page_to_pfn(a) != page_to_pfn(b) + 1) + return false; + if (!zone_device_pages_have_same_pgmap(a, b)) + return false; + return true; +} + +/** + * sg_alloc_append_table_from_pages - Allocate and initialize an append sg + * table from an array of pages + * @sgt_append: The sg append table to use + * @pages: Pointer to an array of page pointers + * @n_pages: Number of pages in the pages array + * @offset: Offset from start of the first page to the start of a buffer + * @size: Number of valid bytes in the buffer (after offset) + * @max_segment: Maximum size of a scatterlist element in bytes + * @left_pages: Left pages caller have to set after this call + * @gfp_mask: GFP allocation mask + * + * Description: + * In the first call it allocate and initialize an sg table from a list of + * pages, else reuse the scatterlist from sgt_append. Contiguous ranges of + * the pages are squashed into a single scatterlist entry up to the maximum + * size specified in @max_segment. A user may provide an offset at a start + * and a size of valid data in a buffer specified by the page array. The + * returned sg table is released by sg_free_append_table + * + * Returns: + * 0 on success, negative error on failure + * + * Notes: + * If this function returns non-0 (eg failure), the caller must call + * sg_free_append_table() to cleanup any leftover allocations. + * + * In the fist call, sgt_append must by initialized. + */ +int sg_alloc_append_table_from_pages(struct sg_append_table *sgt_append, + struct page **pages, unsigned int n_pages, unsigned int offset, + unsigned long size, unsigned int max_segment, + unsigned int left_pages, gfp_t gfp_mask) +{ + unsigned int chunks, cur_page, seg_len, i, prv_len = 0; + unsigned int added_nents = 0; + struct scatterlist *s = sgt_append->prv; + struct page *last_pg; + + /* + * The algorithm below requires max_segment to be aligned to PAGE_SIZE + * otherwise it can overshoot. + */ + max_segment = ALIGN_DOWN(max_segment, PAGE_SIZE); + if (WARN_ON(max_segment < PAGE_SIZE)) + return -EINVAL; + + if (IS_ENABLED(CONFIG_ARCH_NO_SG_CHAIN) && sgt_append->prv) + return -EOPNOTSUPP; + + if (sgt_append->prv) { + unsigned long next_pfn = (page_to_phys(sg_page(sgt_append->prv)) + + sgt_append->prv->offset + sgt_append->prv->length) / PAGE_SIZE; + + if (WARN_ON(offset)) + return -EINVAL; + + /* Merge contiguous pages into the last SG */ + prv_len = sgt_append->prv->length; + if (page_to_pfn(pages[0]) == next_pfn) { + last_pg = pfn_to_page(next_pfn - 1); + while (n_pages && pages_are_mergeable(pages[0], last_pg)) { + if (sgt_append->prv->length + PAGE_SIZE > max_segment) + break; + sgt_append->prv->length += PAGE_SIZE; + last_pg = pages[0]; + pages++; + n_pages--; + } + if (!n_pages) + goto out; + } + } + + /* compute number of contiguous chunks */ + chunks = 1; + seg_len = 0; + for (i = 1; i < n_pages; i++) { + seg_len += PAGE_SIZE; + if (seg_len >= max_segment || + !pages_are_mergeable(pages[i], pages[i - 1])) { + chunks++; + seg_len = 0; + } + } + + /* merging chunks and putting them into the scatterlist */ + cur_page = 0; + for (i = 0; i < chunks; i++) { + unsigned int j, chunk_size; + + /* look for the end of the current chunk */ + seg_len = 0; + for (j = cur_page + 1; j < n_pages; j++) { + seg_len += PAGE_SIZE; + if (seg_len >= max_segment || + !pages_are_mergeable(pages[j], pages[j - 1])) + break; + } + + /* Pass how many chunks might be left */ + s = get_next_sg(sgt_append, s, chunks - i + left_pages, + gfp_mask); + if (IS_ERR(s)) { + /* + * Adjust entry length to be as before function was + * called. + */ + if (sgt_append->prv) + sgt_append->prv->length = prv_len; + return PTR_ERR(s); + } + chunk_size = ((j - cur_page) << PAGE_SHIFT) - offset; + sg_set_page(s, pages[cur_page], + min_t(unsigned long, size, chunk_size), offset); + added_nents++; + size -= chunk_size; + offset = 0; + cur_page = j; + } + sgt_append->sgt.nents += added_nents; + sgt_append->sgt.orig_nents = sgt_append->sgt.nents; + sgt_append->prv = s; +out: + if (!left_pages) + sg_mark_end(s); + return 0; +} +EXPORT_SYMBOL(sg_alloc_append_table_from_pages); + +/** + * sg_alloc_table_from_pages_segment - Allocate and initialize an sg table from + * an array of pages and given maximum + * segment. + * @sgt: The sg table header to use + * @pages: Pointer to an array of page pointers + * @n_pages: Number of pages in the pages array + * @offset: Offset from start of the first page to the start of a buffer + * @size: Number of valid bytes in the buffer (after offset) + * @max_segment: Maximum size of a scatterlist element in bytes + * @gfp_mask: GFP allocation mask + * + * Description: + * Allocate and initialize an sg table from a list of pages. Contiguous + * ranges of the pages are squashed into a single scatterlist node up to the + * maximum size specified in @max_segment. A user may provide an offset at a + * start and a size of valid data in a buffer specified by the page array. + * + * The returned sg table is released by sg_free_table. + * + * Returns: + * 0 on success, negative error on failure + */ +int sg_alloc_table_from_pages_segment(struct sg_table *sgt, struct page **pages, + unsigned int n_pages, unsigned int offset, + unsigned long size, unsigned int max_segment, + gfp_t gfp_mask) +{ + struct sg_append_table append = {}; + int err; + + err = sg_alloc_append_table_from_pages(&append, pages, n_pages, offset, + size, max_segment, 0, gfp_mask); + if (err) { + sg_free_append_table(&append); + return err; + } + memcpy(sgt, &append.sgt, sizeof(*sgt)); + WARN_ON(append.total_nents != sgt->orig_nents); + return 0; +} +EXPORT_SYMBOL(sg_alloc_table_from_pages_segment); + +#ifdef CONFIG_SGL_ALLOC + +/** + * sgl_alloc_order - allocate a scatterlist and its pages + * @length: Length in bytes of the scatterlist. Must be at least one + * @order: Second argument for alloc_pages() + * @chainable: Whether or not to allocate an extra element in the scatterlist + * for scatterlist chaining purposes + * @gfp: Memory allocation flags + * @nent_p: [out] Number of entries in the scatterlist that have pages + * + * Returns: A pointer to an initialized scatterlist or %NULL upon failure. + */ +struct scatterlist *sgl_alloc_order(unsigned long long length, + unsigned int order, bool chainable, + gfp_t gfp, unsigned int *nent_p) +{ + struct scatterlist *sgl, *sg; + struct page *page; + unsigned int nent, nalloc; + u32 elem_len; + + nent = round_up(length, PAGE_SIZE << order) >> (PAGE_SHIFT + order); + /* Check for integer overflow */ + if (length > (nent << (PAGE_SHIFT + order))) + return NULL; + nalloc = nent; + if (chainable) { + /* Check for integer overflow */ + if (nalloc + 1 < nalloc) + return NULL; + nalloc++; + } + sgl = kmalloc_array(nalloc, sizeof(struct scatterlist), + gfp & ~GFP_DMA); + if (!sgl) + return NULL; + + sg_init_table(sgl, nalloc); + sg = sgl; + while (length) { + elem_len = min_t(u64, length, PAGE_SIZE << order); + page = alloc_pages(gfp, order); + if (!page) { + sgl_free_order(sgl, order); + return NULL; + } + + sg_set_page(sg, page, elem_len, 0); + length -= elem_len; + sg = sg_next(sg); + } + WARN_ONCE(length, "length = %lld\n", length); + if (nent_p) + *nent_p = nent; + return sgl; +} +EXPORT_SYMBOL(sgl_alloc_order); + +/** + * sgl_alloc - allocate a scatterlist and its pages + * @length: Length in bytes of the scatterlist + * @gfp: Memory allocation flags + * @nent_p: [out] Number of entries in the scatterlist + * + * Returns: A pointer to an initialized scatterlist or %NULL upon failure. + */ +struct scatterlist *sgl_alloc(unsigned long long length, gfp_t gfp, + unsigned int *nent_p) +{ + return sgl_alloc_order(length, 0, false, gfp, nent_p); +} +EXPORT_SYMBOL(sgl_alloc); + +/** + * sgl_free_n_order - free a scatterlist and its pages + * @sgl: Scatterlist with one or more elements + * @nents: Maximum number of elements to free + * @order: Second argument for __free_pages() + * + * Notes: + * - If several scatterlists have been chained and each chain element is + * freed separately then it's essential to set nents correctly to avoid that a + * page would get freed twice. + * - All pages in a chained scatterlist can be freed at once by setting @nents + * to a high number. + */ +void sgl_free_n_order(struct scatterlist *sgl, int nents, int order) +{ + struct scatterlist *sg; + struct page *page; + int i; + + for_each_sg(sgl, sg, nents, i) { + if (!sg) + break; + page = sg_page(sg); + if (page) + __free_pages(page, order); + } + kfree(sgl); +} +EXPORT_SYMBOL(sgl_free_n_order); + +/** + * sgl_free_order - free a scatterlist and its pages + * @sgl: Scatterlist with one or more elements + * @order: Second argument for __free_pages() + */ +void sgl_free_order(struct scatterlist *sgl, int order) +{ + sgl_free_n_order(sgl, INT_MAX, order); +} +EXPORT_SYMBOL(sgl_free_order); + +/** + * sgl_free - free a scatterlist and its pages + * @sgl: Scatterlist with one or more elements + */ +void sgl_free(struct scatterlist *sgl) +{ + sgl_free_order(sgl, 0); +} +EXPORT_SYMBOL(sgl_free); + +#endif /* CONFIG_SGL_ALLOC */ + +void __sg_page_iter_start(struct sg_page_iter *piter, + struct scatterlist *sglist, unsigned int nents, + unsigned long pgoffset) +{ + piter->__pg_advance = 0; + piter->__nents = nents; + + piter->sg = sglist; + piter->sg_pgoffset = pgoffset; +} +EXPORT_SYMBOL(__sg_page_iter_start); + +static int sg_page_count(struct scatterlist *sg) +{ + return PAGE_ALIGN(sg->offset + sg->length) >> PAGE_SHIFT; +} + +bool __sg_page_iter_next(struct sg_page_iter *piter) +{ + if (!piter->__nents || !piter->sg) + return false; + + piter->sg_pgoffset += piter->__pg_advance; + piter->__pg_advance = 1; + + while (piter->sg_pgoffset >= sg_page_count(piter->sg)) { + piter->sg_pgoffset -= sg_page_count(piter->sg); + piter->sg = sg_next(piter->sg); + if (!--piter->__nents || !piter->sg) + return false; + } + + return true; +} +EXPORT_SYMBOL(__sg_page_iter_next); + +static int sg_dma_page_count(struct scatterlist *sg) +{ + return PAGE_ALIGN(sg->offset + sg_dma_len(sg)) >> PAGE_SHIFT; +} + +bool __sg_page_iter_dma_next(struct sg_dma_page_iter *dma_iter) +{ + struct sg_page_iter *piter = &dma_iter->base; + + if (!piter->__nents || !piter->sg) + return false; + + piter->sg_pgoffset += piter->__pg_advance; + piter->__pg_advance = 1; + + while (piter->sg_pgoffset >= sg_dma_page_count(piter->sg)) { + piter->sg_pgoffset -= sg_dma_page_count(piter->sg); + piter->sg = sg_next(piter->sg); + if (!--piter->__nents || !piter->sg) + return false; + } + + return true; +} +EXPORT_SYMBOL(__sg_page_iter_dma_next); + +/** + * sg_miter_start - start mapping iteration over a sg list + * @miter: sg mapping iter to be started + * @sgl: sg list to iterate over + * @nents: number of sg entries + * + * Description: + * Starts mapping iterator @miter. + * + * Context: + * Don't care. + */ +void sg_miter_start(struct sg_mapping_iter *miter, struct scatterlist *sgl, + unsigned int nents, unsigned int flags) +{ + memset(miter, 0, sizeof(struct sg_mapping_iter)); + + __sg_page_iter_start(&miter->piter, sgl, nents, 0); + WARN_ON(!(flags & (SG_MITER_TO_SG | SG_MITER_FROM_SG))); + miter->__flags = flags; +} +EXPORT_SYMBOL(sg_miter_start); + +static bool sg_miter_get_next_page(struct sg_mapping_iter *miter) +{ + if (!miter->__remaining) { + struct scatterlist *sg; + + if (!__sg_page_iter_next(&miter->piter)) + return false; + + sg = miter->piter.sg; + + miter->__offset = miter->piter.sg_pgoffset ? 0 : sg->offset; + miter->piter.sg_pgoffset += miter->__offset >> PAGE_SHIFT; + miter->__offset &= PAGE_SIZE - 1; + miter->__remaining = sg->offset + sg->length - + (miter->piter.sg_pgoffset << PAGE_SHIFT) - + miter->__offset; + miter->__remaining = min_t(unsigned long, miter->__remaining, + PAGE_SIZE - miter->__offset); + } + + return true; +} + +/** + * sg_miter_skip - reposition mapping iterator + * @miter: sg mapping iter to be skipped + * @offset: number of bytes to plus the current location + * + * Description: + * Sets the offset of @miter to its current location plus @offset bytes. + * If mapping iterator @miter has been proceeded by sg_miter_next(), this + * stops @miter. + * + * Context: + * Don't care. + * + * Returns: + * true if @miter contains the valid mapping. false if end of sg + * list is reached. + */ +bool sg_miter_skip(struct sg_mapping_iter *miter, off_t offset) +{ + sg_miter_stop(miter); + + while (offset) { + off_t consumed; + + if (!sg_miter_get_next_page(miter)) + return false; + + consumed = min_t(off_t, offset, miter->__remaining); + miter->__offset += consumed; + miter->__remaining -= consumed; + offset -= consumed; + } + + return true; +} +EXPORT_SYMBOL(sg_miter_skip); + +/** + * sg_miter_next - proceed mapping iterator to the next mapping + * @miter: sg mapping iter to proceed + * + * Description: + * Proceeds @miter to the next mapping. @miter should have been started + * using sg_miter_start(). On successful return, @miter->page, + * @miter->addr and @miter->length point to the current mapping. + * + * Context: + * May sleep if !SG_MITER_ATOMIC. + * + * Returns: + * true if @miter contains the next mapping. false if end of sg + * list is reached. + */ +bool sg_miter_next(struct sg_mapping_iter *miter) +{ + sg_miter_stop(miter); + + /* + * Get to the next page if necessary. + * __remaining, __offset is adjusted by sg_miter_stop + */ + if (!sg_miter_get_next_page(miter)) + return false; + + miter->page = sg_page_iter_page(&miter->piter); + miter->consumed = miter->length = miter->__remaining; + + if (miter->__flags & SG_MITER_ATOMIC) + miter->addr = kmap_atomic(miter->page) + miter->__offset; + else + miter->addr = kmap(miter->page) + miter->__offset; + + return true; +} +EXPORT_SYMBOL(sg_miter_next); + +/** + * sg_miter_stop - stop mapping iteration + * @miter: sg mapping iter to be stopped + * + * Description: + * Stops mapping iterator @miter. @miter should have been started + * using sg_miter_start(). A stopped iteration can be resumed by + * calling sg_miter_next() on it. This is useful when resources (kmap) + * need to be released during iteration. + * + * Context: + * Don't care otherwise. + */ +void sg_miter_stop(struct sg_mapping_iter *miter) +{ + WARN_ON(miter->consumed > miter->length); + + /* drop resources from the last iteration */ + if (miter->addr) { + miter->__offset += miter->consumed; + miter->__remaining -= miter->consumed; + + if (miter->__flags & SG_MITER_TO_SG) + flush_dcache_page(miter->page); + + if (miter->__flags & SG_MITER_ATOMIC) { + WARN_ON_ONCE(!pagefault_disabled()); + kunmap_atomic(miter->addr); + } else + kunmap(miter->page); + + miter->page = NULL; + miter->addr = NULL; + miter->length = 0; + miter->consumed = 0; + } +} +EXPORT_SYMBOL(sg_miter_stop); + +/** + * sg_copy_buffer - Copy data between a linear buffer and an SG list + * @sgl: The SG list + * @nents: Number of SG entries + * @buf: Where to copy from + * @buflen: The number of bytes to copy + * @skip: Number of bytes to skip before copying + * @to_buffer: transfer direction (true == from an sg list to a + * buffer, false == from a buffer to an sg list) + * + * Returns the number of copied bytes. + * + **/ +size_t sg_copy_buffer(struct scatterlist *sgl, unsigned int nents, void *buf, + size_t buflen, off_t skip, bool to_buffer) +{ + unsigned int offset = 0; + struct sg_mapping_iter miter; + unsigned int sg_flags = SG_MITER_ATOMIC; + + if (to_buffer) + sg_flags |= SG_MITER_FROM_SG; + else + sg_flags |= SG_MITER_TO_SG; + + sg_miter_start(&miter, sgl, nents, sg_flags); + + if (!sg_miter_skip(&miter, skip)) + return 0; + + while ((offset < buflen) && sg_miter_next(&miter)) { + unsigned int len; + + len = min(miter.length, buflen - offset); + + if (to_buffer) + memcpy(buf + offset, miter.addr, len); + else + memcpy(miter.addr, buf + offset, len); + + offset += len; + } + + sg_miter_stop(&miter); + + return offset; +} +EXPORT_SYMBOL(sg_copy_buffer); + +/** + * sg_copy_from_buffer - Copy from a linear buffer to an SG list + * @sgl: The SG list + * @nents: Number of SG entries + * @buf: Where to copy from + * @buflen: The number of bytes to copy + * + * Returns the number of copied bytes. + * + **/ +size_t sg_copy_from_buffer(struct scatterlist *sgl, unsigned int nents, + const void *buf, size_t buflen) +{ + return sg_copy_buffer(sgl, nents, (void *)buf, buflen, 0, false); +} +EXPORT_SYMBOL(sg_copy_from_buffer); + +/** + * sg_copy_to_buffer - Copy from an SG list to a linear buffer + * @sgl: The SG list + * @nents: Number of SG entries + * @buf: Where to copy to + * @buflen: The number of bytes to copy + * + * Returns the number of copied bytes. + * + **/ +size_t sg_copy_to_buffer(struct scatterlist *sgl, unsigned int nents, + void *buf, size_t buflen) +{ + return sg_copy_buffer(sgl, nents, buf, buflen, 0, true); +} +EXPORT_SYMBOL(sg_copy_to_buffer); + +/** + * sg_pcopy_from_buffer - Copy from a linear buffer to an SG list + * @sgl: The SG list + * @nents: Number of SG entries + * @buf: Where to copy from + * @buflen: The number of bytes to copy + * @skip: Number of bytes to skip before copying + * + * Returns the number of copied bytes. + * + **/ +size_t sg_pcopy_from_buffer(struct scatterlist *sgl, unsigned int nents, + const void *buf, size_t buflen, off_t skip) +{ + return sg_copy_buffer(sgl, nents, (void *)buf, buflen, skip, false); +} +EXPORT_SYMBOL(sg_pcopy_from_buffer); + +/** + * sg_pcopy_to_buffer - Copy from an SG list to a linear buffer + * @sgl: The SG list + * @nents: Number of SG entries + * @buf: Where to copy to + * @buflen: The number of bytes to copy + * @skip: Number of bytes to skip before copying + * + * Returns the number of copied bytes. + * + **/ +size_t sg_pcopy_to_buffer(struct scatterlist *sgl, unsigned int nents, + void *buf, size_t buflen, off_t skip) +{ + return sg_copy_buffer(sgl, nents, buf, buflen, skip, true); +} +EXPORT_SYMBOL(sg_pcopy_to_buffer); + +/** + * sg_zero_buffer - Zero-out a part of a SG list + * @sgl: The SG list + * @nents: Number of SG entries + * @buflen: The number of bytes to zero out + * @skip: Number of bytes to skip before zeroing + * + * Returns the number of bytes zeroed. + **/ +size_t sg_zero_buffer(struct scatterlist *sgl, unsigned int nents, + size_t buflen, off_t skip) +{ + unsigned int offset = 0; + struct sg_mapping_iter miter; + unsigned int sg_flags = SG_MITER_ATOMIC | SG_MITER_TO_SG; + + sg_miter_start(&miter, sgl, nents, sg_flags); + + if (!sg_miter_skip(&miter, skip)) + return false; + + while (offset < buflen && sg_miter_next(&miter)) { + unsigned int len; + + len = min(miter.length, buflen - offset); + memset(miter.addr, 0, len); + + offset += len; + } + + sg_miter_stop(&miter); + return offset; +} +EXPORT_SYMBOL(sg_zero_buffer); -- cgit v1.2.3