diff options
author | 2023-02-21 18:24:12 -0800 | |
---|---|---|
committer | 2023-02-21 18:24:12 -0800 | |
commit | 5b7c4cabbb65f5c469464da6c5f614cbd7f730f2 (patch) | |
tree | cc5c2d0a898769fd59549594fedb3ee6f84e59a0 /lib/scatterlist.c | |
download | linux-5b7c4cabbb65f5c469464da6c5f614cbd7f730f2.tar.gz linux-5b7c4cabbb65f5c469464da6c5f614cbd7f730f2.zip |
Merge tag 'net-next-6.3' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-nextgrafted
Pull networking updates from Jakub Kicinski:
"Core:
- Add dedicated kmem_cache for typical/small skb->head, avoid having
to access struct page at kfree time, and improve memory use.
- Introduce sysctl to set default RPS configuration for new netdevs.
- Define Netlink protocol specification format which can be used to
describe messages used by each family and auto-generate parsers.
Add tools for generating kernel data structures and uAPI headers.
- Expose all net/core sysctls inside netns.
- Remove 4s sleep in netpoll if carrier is instantly detected on
boot.
- Add configurable limit of MDB entries per port, and port-vlan.
- Continue populating drop reasons throughout the stack.
- Retire a handful of legacy Qdiscs and classifiers.
Protocols:
- Support IPv4 big TCP (TSO frames larger than 64kB).
- Add IP_LOCAL_PORT_RANGE socket option, to control local port range
on socket by socket basis.
- Track and report in procfs number of MPTCP sockets used.
- Support mixing IPv4 and IPv6 flows in the in-kernel MPTCP path
manager.
- IPv6: don't check net.ipv6.route.max_size and rely on garbage
collection to free memory (similarly to IPv4).
- Support Penultimate Segment Pop (PSP) flavor in SRv6 (RFC8986).
- ICMP: add per-rate limit counters.
- Add support for user scanning requests in ieee802154.
- Remove static WEP support.
- Support minimal Wi-Fi 7 Extremely High Throughput (EHT) rate
reporting.
- WiFi 7 EHT channel puncturing support (client & AP).
BPF:
- Add a rbtree data structure following the "next-gen data structure"
precedent set by recently added linked list, that is, by using
kfunc + kptr instead of adding a new BPF map type.
- Expose XDP hints via kfuncs with initial support for RX hash and
timestamp metadata.
- Add BPF_F_NO_TUNNEL_KEY extension to bpf_skb_set_tunnel_key to
better support decap on GRE tunnel devices not operating in collect
metadata.
- Improve x86 JIT's codegen for PROBE_MEM runtime error checks.
- Remove the need for trace_printk_lock for bpf_trace_printk and
bpf_trace_vprintk helpers.
- Extend libbpf's bpf_tracing.h support for tracing arguments of
kprobes/uprobes and syscall as a special case.
- Significantly reduce the search time for module symbols by
livepatch and BPF.
- Enable cpumasks to be used as kptrs, which is useful for tracing
programs tracking which tasks end up running on which CPUs in
different time intervals.
- Add support for BPF trampoline on s390x and riscv64.
- Add capability to export the XDP features supported by the NIC.
- Add __bpf_kfunc tag for marking kernel functions as kfuncs.
- Add cgroup.memory=nobpf kernel parameter option to disable BPF
memory accounting for container environments.
Netfilter:
- Remove the CLUSTERIP target. It has been marked as obsolete for
years, and we still have WARN splats wrt races of the out-of-band
/proc interface installed by this target.
- Add 'destroy' commands to nf_tables. They are identical to the
existing 'delete' commands, but do not return an error if the
referenced object (set, chain, rule...) did not exist.
Driver API:
- Improve cpumask_local_spread() locality to help NICs set the right
IRQ affinity on AMD platforms.
- Separate C22 and C45 MDIO bus transactions more clearly.
- Introduce new DCB table to control DSCP rewrite on egress.
- Support configuration of Physical Layer Collision Avoidance (PLCA)
Reconciliation Sublayer (RS) (802.3cg-2019). Modern version of
shared medium Ethernet.
- Support for MAC Merge layer (IEEE 802.3-2018 clause 99). Allowing
preemption of low priority frames by high priority frames.
- Add support for controlling MACSec offload using netlink SET.
- Rework devlink instance refcounts to allow registration and
de-registration under the instance lock. Split the code into
multiple files, drop some of the unnecessarily granular locks and
factor out common parts of netlink operation handling.
- Add TX frame aggregation parameters (for USB drivers).
- Add a new attr TCA_EXT_WARN_MSG to report TC (offload) warning
messages with notifications for debug.
- Allow offloading of UDP NEW connections via act_ct.
- Add support for per action HW stats in TC.
- Support hardware miss to TC action (continue processing in SW from
a specific point in the action chain).
- Warn if old Wireless Extension user space interface is used with
modern cfg80211/mac80211 drivers. Do not support Wireless
Extensions for Wi-Fi 7 devices at all. Everyone should switch to
using nl80211 interface instead.
- Improve the CAN bit timing configuration. Use extack to return
error messages directly to user space, update the SJW handling,
including the definition of a new default value that will benefit
CAN-FD controllers, by increasing their oscillator tolerance.
New hardware / drivers:
- Ethernet:
- nVidia BlueField-3 support (control traffic driver)
- Ethernet support for imx93 SoCs
- Motorcomm yt8531 gigabit Ethernet PHY
- onsemi NCN26000 10BASE-T1S PHY (with support for PLCA)
- Microchip LAN8841 PHY (incl. cable diagnostics and PTP)
- Amlogic gxl MDIO mux
- WiFi:
- RealTek RTL8188EU (rtl8xxxu)
- Qualcomm Wi-Fi 7 devices (ath12k)
- CAN:
- Renesas R-Car V4H
Drivers:
- Bluetooth:
- Set Per Platform Antenna Gain (PPAG) for Intel controllers.
- Ethernet NICs:
- Intel (1G, igc):
- support TSN / Qbv / packet scheduling features of i226 model
- Intel (100G, ice):
- use GNSS subsystem instead of TTY
- multi-buffer XDP support
- extend support for GPIO pins to E823 devices
- nVidia/Mellanox:
- update the shared buffer configuration on PFC commands
- implement PTP adjphase function for HW offset control
- TC support for Geneve and GRE with VF tunnel offload
- more efficient crypto key management method
- multi-port eswitch support
- Netronome/Corigine:
- add DCB IEEE support
- support IPsec offloading for NFP3800
- Freescale/NXP (enetc):
- support XDP_REDIRECT for XDP non-linear buffers
- improve reconfig, avoid link flap and waiting for idle
- support MAC Merge layer
- Other NICs:
- sfc/ef100: add basic devlink support for ef100
- ionic: rx_push mode operation (writing descriptors via MMIO)
- bnxt: use the auxiliary bus abstraction for RDMA
- r8169: disable ASPM and reset bus in case of tx timeout
- cpsw: support QSGMII mode for J721e CPSW9G
- cpts: support pulse-per-second output
- ngbe: add an mdio bus driver
- usbnet: optimize usbnet_bh() by avoiding unnecessary queuing
- r8152: handle devices with FW with NCM support
- amd-xgbe: support 10Mbps, 2.5GbE speeds and rx-adaptation
- virtio-net: support multi buffer XDP
- virtio/vsock: replace virtio_vsock_pkt with sk_buff
- tsnep: XDP support
- Ethernet high-speed switches:
- nVidia/Mellanox (mlxsw):
- add support for latency TLV (in FW control messages)
- Microchip (sparx5):
- separate explicit and implicit traffic forwarding rules, make
the implicit rules always active
- add support for egress DSCP rewrite
- IS0 VCAP support (Ingress Classification)
- IS2 VCAP filters (protos, L3 addrs, L4 ports, flags, ToS
etc.)
- ES2 VCAP support (Egress Access Control)
- support for Per-Stream Filtering and Policing (802.1Q,
8.6.5.1)
- Ethernet embedded switches:
- Marvell (mv88e6xxx):
- add MAB (port auth) offload support
- enable PTP receive for mv88e6390
- NXP (ocelot):
- support MAC Merge layer
- support for the the vsc7512 internal copper phys
- Microchip:
- lan9303: convert to PHYLINK
- lan966x: support TC flower filter statistics
- lan937x: PTP support for KSZ9563/KSZ8563 and LAN937x
- lan937x: support Credit Based Shaper configuration
- ksz9477: support Energy Efficient Ethernet
- other:
- qca8k: convert to regmap read/write API, use bulk operations
- rswitch: Improve TX timestamp accuracy
- Intel WiFi (iwlwifi):
- EHT (Wi-Fi 7) rate reporting
- STEP equalizer support: transfer some STEP (connection to radio
on platforms with integrated wifi) related parameters from the
BIOS to the firmware.
- Qualcomm 802.11ax WiFi (ath11k):
- IPQ5018 support
- Fine Timing Measurement (FTM) responder role support
- channel 177 support
- MediaTek WiFi (mt76):
- per-PHY LED support
- mt7996: EHT (Wi-Fi 7) support
- Wireless Ethernet Dispatch (WED) reset support
- switch to using page pool allocator
- RealTek WiFi (rtw89):
- support new version of Bluetooth co-existance
- Mobile:
- rmnet: support TX aggregation"
* tag 'net-next-6.3' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (1872 commits)
page_pool: add a comment explaining the fragment counter usage
net: ethtool: fix __ethtool_dev_mm_supported() implementation
ethtool: pse-pd: Fix double word in comments
xsk: add linux/vmalloc.h to xsk.c
sefltests: netdevsim: wait for devlink instance after netns removal
selftest: fib_tests: Always cleanup before exit
net/mlx5e: Align IPsec ASO result memory to be as required by hardware
net/mlx5e: TC, Set CT miss to the specific ct action instance
net/mlx5e: Rename CHAIN_TO_REG to MAPPED_OBJ_TO_REG
net/mlx5: Refactor tc miss handling to a single function
net/mlx5: Kconfig: Make tc offload depend on tc skb extension
net/sched: flower: Support hardware miss to tc action
net/sched: flower: Move filter handle initialization earlier
net/sched: cls_api: Support hardware miss to tc action
net/sched: Rename user cookie and act cookie
sfc: fix builds without CONFIG_RTC_LIB
sfc: clean up some inconsistent indentings
net/mlx4_en: Introduce flexible array to silence overflow warning
net: lan966x: Fix possible deadlock inside PTP
net/ulp: Remove redundant ->clone() test in inet_clone_ulp().
...
Diffstat (limited to 'lib/scatterlist.c')
-rw-r--r-- | lib/scatterlist.c | 1097 |
1 files changed, 1097 insertions, 0 deletions
diff --git a/lib/scatterlist.c b/lib/scatterlist.c new file mode 100644 index 000000000..8d7519a8f --- /dev/null +++ b/lib/scatterlist.c @@ -0,0 +1,1097 @@ +// SPDX-License-Identifier: GPL-2.0-only +/* + * Copyright (C) 2007 Jens Axboe <jens.axboe@oracle.com> + * + * Scatterlist handling helpers. + */ +#include <linux/export.h> +#include <linux/slab.h> +#include <linux/scatterlist.h> +#include <linux/highmem.h> +#include <linux/kmemleak.h> + +/** + * sg_next - return the next scatterlist entry in a list + * @sg: The current sg entry + * + * Description: + * Usually the next entry will be @sg@ + 1, but if this sg element is part + * of a chained scatterlist, it could jump to the start of a new + * scatterlist array. + * + **/ +struct scatterlist *sg_next(struct scatterlist *sg) +{ + if (sg_is_last(sg)) + return NULL; + + sg++; + if (unlikely(sg_is_chain(sg))) + sg = sg_chain_ptr(sg); + + return sg; +} +EXPORT_SYMBOL(sg_next); + +/** + * sg_nents - return total count of entries in scatterlist + * @sg: The scatterlist + * + * Description: + * Allows to know how many entries are in sg, taking into account + * chaining as well + * + **/ +int sg_nents(struct scatterlist *sg) +{ + int nents; + for (nents = 0; sg; sg = sg_next(sg)) + nents++; + return nents; +} +EXPORT_SYMBOL(sg_nents); + +/** + * sg_nents_for_len - return total count of entries in scatterlist + * needed to satisfy the supplied length + * @sg: The scatterlist + * @len: The total required length + * + * Description: + * Determines the number of entries in sg that are required to meet + * the supplied length, taking into account chaining as well + * + * Returns: + * the number of sg entries needed, negative error on failure + * + **/ +int sg_nents_for_len(struct scatterlist *sg, u64 len) +{ + int nents; + u64 total; + + if (!len) + return 0; + + for (nents = 0, total = 0; sg; sg = sg_next(sg)) { + nents++; + total += sg->length; + if (total >= len) + return nents; + } + + return -EINVAL; +} +EXPORT_SYMBOL(sg_nents_for_len); + +/** + * sg_last - return the last scatterlist entry in a list + * @sgl: First entry in the scatterlist + * @nents: Number of entries in the scatterlist + * + * Description: + * Should only be used casually, it (currently) scans the entire list + * to get the last entry. + * + * Note that the @sgl@ pointer passed in need not be the first one, + * the important bit is that @nents@ denotes the number of entries that + * exist from @sgl@. + * + **/ +struct scatterlist *sg_last(struct scatterlist *sgl, unsigned int nents) +{ + struct scatterlist *sg, *ret = NULL; + unsigned int i; + + for_each_sg(sgl, sg, nents, i) + ret = sg; + + BUG_ON(!sg_is_last(ret)); + return ret; +} +EXPORT_SYMBOL(sg_last); + +/** + * sg_init_table - Initialize SG table + * @sgl: The SG table + * @nents: Number of entries in table + * + * Notes: + * If this is part of a chained sg table, sg_mark_end() should be + * used only on the last table part. + * + **/ +void sg_init_table(struct scatterlist *sgl, unsigned int nents) +{ + memset(sgl, 0, sizeof(*sgl) * nents); + sg_init_marker(sgl, nents); +} +EXPORT_SYMBOL(sg_init_table); + +/** + * sg_init_one - Initialize a single entry sg list + * @sg: SG entry + * @buf: Virtual address for IO + * @buflen: IO length + * + **/ +void sg_init_one(struct scatterlist *sg, const void *buf, unsigned int buflen) +{ + sg_init_table(sg, 1); + sg_set_buf(sg, buf, buflen); +} +EXPORT_SYMBOL(sg_init_one); + +/* + * The default behaviour of sg_alloc_table() is to use these kmalloc/kfree + * helpers. + */ +static struct scatterlist *sg_kmalloc(unsigned int nents, gfp_t gfp_mask) +{ + if (nents == SG_MAX_SINGLE_ALLOC) { + /* + * Kmemleak doesn't track page allocations as they are not + * commonly used (in a raw form) for kernel data structures. + * As we chain together a list of pages and then a normal + * kmalloc (tracked by kmemleak), in order to for that last + * allocation not to become decoupled (and thus a + * false-positive) we need to inform kmemleak of all the + * intermediate allocations. + */ + void *ptr = (void *) __get_free_page(gfp_mask); + kmemleak_alloc(ptr, PAGE_SIZE, 1, gfp_mask); + return ptr; + } else + return kmalloc_array(nents, sizeof(struct scatterlist), + gfp_mask); +} + +static void sg_kfree(struct scatterlist *sg, unsigned int nents) +{ + if (nents == SG_MAX_SINGLE_ALLOC) { + kmemleak_free(sg); + free_page((unsigned long) sg); + } else + kfree(sg); +} + +/** + * __sg_free_table - Free a previously mapped sg table + * @table: The sg table header to use + * @max_ents: The maximum number of entries per single scatterlist + * @nents_first_chunk: Number of entries int the (preallocated) first + * scatterlist chunk, 0 means no such preallocated first chunk + * @free_fn: Free function + * @num_ents: Number of entries in the table + * + * Description: + * Free an sg table previously allocated and setup with + * __sg_alloc_table(). The @max_ents value must be identical to + * that previously used with __sg_alloc_table(). + * + **/ +void __sg_free_table(struct sg_table *table, unsigned int max_ents, + unsigned int nents_first_chunk, sg_free_fn *free_fn, + unsigned int num_ents) +{ + struct scatterlist *sgl, *next; + unsigned curr_max_ents = nents_first_chunk ?: max_ents; + + if (unlikely(!table->sgl)) + return; + + sgl = table->sgl; + while (num_ents) { + unsigned int alloc_size = num_ents; + unsigned int sg_size; + + /* + * If we have more than max_ents segments left, + * then assign 'next' to the sg table after the current one. + * sg_size is then one less than alloc size, since the last + * element is the chain pointer. + */ + if (alloc_size > curr_max_ents) { + next = sg_chain_ptr(&sgl[curr_max_ents - 1]); + alloc_size = curr_max_ents; + sg_size = alloc_size - 1; + } else { + sg_size = alloc_size; + next = NULL; + } + + num_ents -= sg_size; + if (nents_first_chunk) + nents_first_chunk = 0; + else + free_fn(sgl, alloc_size); + sgl = next; + curr_max_ents = max_ents; + } + + table->sgl = NULL; +} +EXPORT_SYMBOL(__sg_free_table); + +/** + * sg_free_append_table - Free a previously allocated append sg table. + * @table: The mapped sg append table header + * + **/ +void sg_free_append_table(struct sg_append_table *table) +{ + __sg_free_table(&table->sgt, SG_MAX_SINGLE_ALLOC, 0, sg_kfree, + table->total_nents); +} +EXPORT_SYMBOL(sg_free_append_table); + + +/** + * sg_free_table - Free a previously allocated sg table + * @table: The mapped sg table header + * + **/ +void sg_free_table(struct sg_table *table) +{ + __sg_free_table(table, SG_MAX_SINGLE_ALLOC, 0, sg_kfree, + table->orig_nents); +} +EXPORT_SYMBOL(sg_free_table); + +/** + * __sg_alloc_table - Allocate and initialize an sg table with given allocator + * @table: The sg table header to use + * @nents: Number of entries in sg list + * @max_ents: The maximum number of entries the allocator returns per call + * @nents_first_chunk: Number of entries int the (preallocated) first + * scatterlist chunk, 0 means no such preallocated chunk provided by user + * @gfp_mask: GFP allocation mask + * @alloc_fn: Allocator to use + * + * Description: + * This function returns a @table @nents long. The allocator is + * defined to return scatterlist chunks of maximum size @max_ents. + * Thus if @nents is bigger than @max_ents, the scatterlists will be + * chained in units of @max_ents. + * + * Notes: + * If this function returns non-0 (eg failure), the caller must call + * __sg_free_table() to cleanup any leftover allocations. + * + **/ +int __sg_alloc_table(struct sg_table *table, unsigned int nents, + unsigned int max_ents, struct scatterlist *first_chunk, + unsigned int nents_first_chunk, gfp_t gfp_mask, + sg_alloc_fn *alloc_fn) +{ + struct scatterlist *sg, *prv; + unsigned int left; + unsigned curr_max_ents = nents_first_chunk ?: max_ents; + unsigned prv_max_ents; + + memset(table, 0, sizeof(*table)); + + if (nents == 0) + return -EINVAL; +#ifdef CONFIG_ARCH_NO_SG_CHAIN + if (WARN_ON_ONCE(nents > max_ents)) + return -EINVAL; +#endif + + left = nents; + prv = NULL; + do { + unsigned int sg_size, alloc_size = left; + + if (alloc_size > curr_max_ents) { + alloc_size = curr_max_ents; + sg_size = alloc_size - 1; + } else + sg_size = alloc_size; + + left -= sg_size; + + if (first_chunk) { + sg = first_chunk; + first_chunk = NULL; + } else { + sg = alloc_fn(alloc_size, gfp_mask); + } + if (unlikely(!sg)) { + /* + * Adjust entry count to reflect that the last + * entry of the previous table won't be used for + * linkage. Without this, sg_kfree() may get + * confused. + */ + if (prv) + table->nents = ++table->orig_nents; + + return -ENOMEM; + } + + sg_init_table(sg, alloc_size); + table->nents = table->orig_nents += sg_size; + + /* + * If this is the first mapping, assign the sg table header. + * If this is not the first mapping, chain previous part. + */ + if (prv) + sg_chain(prv, prv_max_ents, sg); + else + table->sgl = sg; + + /* + * If no more entries after this one, mark the end + */ + if (!left) + sg_mark_end(&sg[sg_size - 1]); + + prv = sg; + prv_max_ents = curr_max_ents; + curr_max_ents = max_ents; + } while (left); + + return 0; +} +EXPORT_SYMBOL(__sg_alloc_table); + +/** + * sg_alloc_table - Allocate and initialize an sg table + * @table: The sg table header to use + * @nents: Number of entries in sg list + * @gfp_mask: GFP allocation mask + * + * Description: + * Allocate and initialize an sg table. If @nents@ is larger than + * SG_MAX_SINGLE_ALLOC a chained sg table will be setup. + * + **/ +int sg_alloc_table(struct sg_table *table, unsigned int nents, gfp_t gfp_mask) +{ + int ret; + + ret = __sg_alloc_table(table, nents, SG_MAX_SINGLE_ALLOC, + NULL, 0, gfp_mask, sg_kmalloc); + if (unlikely(ret)) + sg_free_table(table); + return ret; +} +EXPORT_SYMBOL(sg_alloc_table); + +static struct scatterlist *get_next_sg(struct sg_append_table *table, + struct scatterlist *cur, + unsigned long needed_sges, + gfp_t gfp_mask) +{ + struct scatterlist *new_sg, *next_sg; + unsigned int alloc_size; + + if (cur) { + next_sg = sg_next(cur); + /* Check if last entry should be keeped for chainning */ + if (!sg_is_last(next_sg) || needed_sges == 1) + return next_sg; + } + + alloc_size = min_t(unsigned long, needed_sges, SG_MAX_SINGLE_ALLOC); + new_sg = sg_kmalloc(alloc_size, gfp_mask); + if (!new_sg) + return ERR_PTR(-ENOMEM); + sg_init_table(new_sg, alloc_size); + if (cur) { + table->total_nents += alloc_size - 1; + __sg_chain(next_sg, new_sg); + } else { + table->sgt.sgl = new_sg; + table->total_nents = alloc_size; + } + return new_sg; +} + +static bool pages_are_mergeable(struct page *a, struct page *b) +{ + if (page_to_pfn(a) != page_to_pfn(b) + 1) + return false; + if (!zone_device_pages_have_same_pgmap(a, b)) + return false; + return true; +} + +/** + * sg_alloc_append_table_from_pages - Allocate and initialize an append sg + * table from an array of pages + * @sgt_append: The sg append table to use + * @pages: Pointer to an array of page pointers + * @n_pages: Number of pages in the pages array + * @offset: Offset from start of the first page to the start of a buffer + * @size: Number of valid bytes in the buffer (after offset) + * @max_segment: Maximum size of a scatterlist element in bytes + * @left_pages: Left pages caller have to set after this call + * @gfp_mask: GFP allocation mask + * + * Description: + * In the first call it allocate and initialize an sg table from a list of + * pages, else reuse the scatterlist from sgt_append. Contiguous ranges of + * the pages are squashed into a single scatterlist entry up to the maximum + * size specified in @max_segment. A user may provide an offset at a start + * and a size of valid data in a buffer specified by the page array. The + * returned sg table is released by sg_free_append_table + * + * Returns: + * 0 on success, negative error on failure + * + * Notes: + * If this function returns non-0 (eg failure), the caller must call + * sg_free_append_table() to cleanup any leftover allocations. + * + * In the fist call, sgt_append must by initialized. + */ +int sg_alloc_append_table_from_pages(struct sg_append_table *sgt_append, + struct page **pages, unsigned int n_pages, unsigned int offset, + unsigned long size, unsigned int max_segment, + unsigned int left_pages, gfp_t gfp_mask) +{ + unsigned int chunks, cur_page, seg_len, i, prv_len = 0; + unsigned int added_nents = 0; + struct scatterlist *s = sgt_append->prv; + struct page *last_pg; + + /* + * The algorithm below requires max_segment to be aligned to PAGE_SIZE + * otherwise it can overshoot. + */ + max_segment = ALIGN_DOWN(max_segment, PAGE_SIZE); + if (WARN_ON(max_segment < PAGE_SIZE)) + return -EINVAL; + + if (IS_ENABLED(CONFIG_ARCH_NO_SG_CHAIN) && sgt_append->prv) + return -EOPNOTSUPP; + + if (sgt_append->prv) { + unsigned long next_pfn = (page_to_phys(sg_page(sgt_append->prv)) + + sgt_append->prv->offset + sgt_append->prv->length) / PAGE_SIZE; + + if (WARN_ON(offset)) + return -EINVAL; + + /* Merge contiguous pages into the last SG */ + prv_len = sgt_append->prv->length; + if (page_to_pfn(pages[0]) == next_pfn) { + last_pg = pfn_to_page(next_pfn - 1); + while (n_pages && pages_are_mergeable(pages[0], last_pg)) { + if (sgt_append->prv->length + PAGE_SIZE > max_segment) + break; + sgt_append->prv->length += PAGE_SIZE; + last_pg = pages[0]; + pages++; + n_pages--; + } + if (!n_pages) + goto out; + } + } + + /* compute number of contiguous chunks */ + chunks = 1; + seg_len = 0; + for (i = 1; i < n_pages; i++) { + seg_len += PAGE_SIZE; + if (seg_len >= max_segment || + !pages_are_mergeable(pages[i], pages[i - 1])) { + chunks++; + seg_len = 0; + } + } + + /* merging chunks and putting them into the scatterlist */ + cur_page = 0; + for (i = 0; i < chunks; i++) { + unsigned int j, chunk_size; + + /* look for the end of the current chunk */ + seg_len = 0; + for (j = cur_page + 1; j < n_pages; j++) { + seg_len += PAGE_SIZE; + if (seg_len >= max_segment || + !pages_are_mergeable(pages[j], pages[j - 1])) + break; + } + + /* Pass how many chunks might be left */ + s = get_next_sg(sgt_append, s, chunks - i + left_pages, + gfp_mask); + if (IS_ERR(s)) { + /* + * Adjust entry length to be as before function was + * called. + */ + if (sgt_append->prv) + sgt_append->prv->length = prv_len; + return PTR_ERR(s); + } + chunk_size = ((j - cur_page) << PAGE_SHIFT) - offset; + sg_set_page(s, pages[cur_page], + min_t(unsigned long, size, chunk_size), offset); + added_nents++; + size -= chunk_size; + offset = 0; + cur_page = j; + } + sgt_append->sgt.nents += added_nents; + sgt_append->sgt.orig_nents = sgt_append->sgt.nents; + sgt_append->prv = s; +out: + if (!left_pages) + sg_mark_end(s); + return 0; +} +EXPORT_SYMBOL(sg_alloc_append_table_from_pages); + +/** + * sg_alloc_table_from_pages_segment - Allocate and initialize an sg table from + * an array of pages and given maximum + * segment. + * @sgt: The sg table header to use + * @pages: Pointer to an array of page pointers + * @n_pages: Number of pages in the pages array + * @offset: Offset from start of the first page to the start of a buffer + * @size: Number of valid bytes in the buffer (after offset) + * @max_segment: Maximum size of a scatterlist element in bytes + * @gfp_mask: GFP allocation mask + * + * Description: + * Allocate and initialize an sg table from a list of pages. Contiguous + * ranges of the pages are squashed into a single scatterlist node up to the + * maximum size specified in @max_segment. A user may provide an offset at a + * start and a size of valid data in a buffer specified by the page array. + * + * The returned sg table is released by sg_free_table. + * + * Returns: + * 0 on success, negative error on failure + */ +int sg_alloc_table_from_pages_segment(struct sg_table *sgt, struct page **pages, + unsigned int n_pages, unsigned int offset, + unsigned long size, unsigned int max_segment, + gfp_t gfp_mask) +{ + struct sg_append_table append = {}; + int err; + + err = sg_alloc_append_table_from_pages(&append, pages, n_pages, offset, + size, max_segment, 0, gfp_mask); + if (err) { + sg_free_append_table(&append); + return err; + } + memcpy(sgt, &append.sgt, sizeof(*sgt)); + WARN_ON(append.total_nents != sgt->orig_nents); + return 0; +} +EXPORT_SYMBOL(sg_alloc_table_from_pages_segment); + +#ifdef CONFIG_SGL_ALLOC + +/** + * sgl_alloc_order - allocate a scatterlist and its pages + * @length: Length in bytes of the scatterlist. Must be at least one + * @order: Second argument for alloc_pages() + * @chainable: Whether or not to allocate an extra element in the scatterlist + * for scatterlist chaining purposes + * @gfp: Memory allocation flags + * @nent_p: [out] Number of entries in the scatterlist that have pages + * + * Returns: A pointer to an initialized scatterlist or %NULL upon failure. + */ +struct scatterlist *sgl_alloc_order(unsigned long long length, + unsigned int order, bool chainable, + gfp_t gfp, unsigned int *nent_p) +{ + struct scatterlist *sgl, *sg; + struct page *page; + unsigned int nent, nalloc; + u32 elem_len; + + nent = round_up(length, PAGE_SIZE << order) >> (PAGE_SHIFT + order); + /* Check for integer overflow */ + if (length > (nent << (PAGE_SHIFT + order))) + return NULL; + nalloc = nent; + if (chainable) { + /* Check for integer overflow */ + if (nalloc + 1 < nalloc) + return NULL; + nalloc++; + } + sgl = kmalloc_array(nalloc, sizeof(struct scatterlist), + gfp & ~GFP_DMA); + if (!sgl) + return NULL; + + sg_init_table(sgl, nalloc); + sg = sgl; + while (length) { + elem_len = min_t(u64, length, PAGE_SIZE << order); + page = alloc_pages(gfp, order); + if (!page) { + sgl_free_order(sgl, order); + return NULL; + } + + sg_set_page(sg, page, elem_len, 0); + length -= elem_len; + sg = sg_next(sg); + } + WARN_ONCE(length, "length = %lld\n", length); + if (nent_p) + *nent_p = nent; + return sgl; +} +EXPORT_SYMBOL(sgl_alloc_order); + +/** + * sgl_alloc - allocate a scatterlist and its pages + * @length: Length in bytes of the scatterlist + * @gfp: Memory allocation flags + * @nent_p: [out] Number of entries in the scatterlist + * + * Returns: A pointer to an initialized scatterlist or %NULL upon failure. + */ +struct scatterlist *sgl_alloc(unsigned long long length, gfp_t gfp, + unsigned int *nent_p) +{ + return sgl_alloc_order(length, 0, false, gfp, nent_p); +} +EXPORT_SYMBOL(sgl_alloc); + +/** + * sgl_free_n_order - free a scatterlist and its pages + * @sgl: Scatterlist with one or more elements + * @nents: Maximum number of elements to free + * @order: Second argument for __free_pages() + * + * Notes: + * - If several scatterlists have been chained and each chain element is + * freed separately then it's essential to set nents correctly to avoid that a + * page would get freed twice. + * - All pages in a chained scatterlist can be freed at once by setting @nents + * to a high number. + */ +void sgl_free_n_order(struct scatterlist *sgl, int nents, int order) +{ + struct scatterlist *sg; + struct page *page; + int i; + + for_each_sg(sgl, sg, nents, i) { + if (!sg) + break; + page = sg_page(sg); + if (page) + __free_pages(page, order); + } + kfree(sgl); +} +EXPORT_SYMBOL(sgl_free_n_order); + +/** + * sgl_free_order - free a scatterlist and its pages + * @sgl: Scatterlist with one or more elements + * @order: Second argument for __free_pages() + */ +void sgl_free_order(struct scatterlist *sgl, int order) +{ + sgl_free_n_order(sgl, INT_MAX, order); +} +EXPORT_SYMBOL(sgl_free_order); + +/** + * sgl_free - free a scatterlist and its pages + * @sgl: Scatterlist with one or more elements + */ +void sgl_free(struct scatterlist *sgl) +{ + sgl_free_order(sgl, 0); +} +EXPORT_SYMBOL(sgl_free); + +#endif /* CONFIG_SGL_ALLOC */ + +void __sg_page_iter_start(struct sg_page_iter *piter, + struct scatterlist *sglist, unsigned int nents, + unsigned long pgoffset) +{ + piter->__pg_advance = 0; + piter->__nents = nents; + + piter->sg = sglist; + piter->sg_pgoffset = pgoffset; +} +EXPORT_SYMBOL(__sg_page_iter_start); + +static int sg_page_count(struct scatterlist *sg) +{ + return PAGE_ALIGN(sg->offset + sg->length) >> PAGE_SHIFT; +} + +bool __sg_page_iter_next(struct sg_page_iter *piter) +{ + if (!piter->__nents || !piter->sg) + return false; + + piter->sg_pgoffset += piter->__pg_advance; + piter->__pg_advance = 1; + + while (piter->sg_pgoffset >= sg_page_count(piter->sg)) { + piter->sg_pgoffset -= sg_page_count(piter->sg); + piter->sg = sg_next(piter->sg); + if (!--piter->__nents || !piter->sg) + return false; + } + + return true; +} +EXPORT_SYMBOL(__sg_page_iter_next); + +static int sg_dma_page_count(struct scatterlist *sg) +{ + return PAGE_ALIGN(sg->offset + sg_dma_len(sg)) >> PAGE_SHIFT; +} + +bool __sg_page_iter_dma_next(struct sg_dma_page_iter *dma_iter) +{ + struct sg_page_iter *piter = &dma_iter->base; + + if (!piter->__nents || !piter->sg) + return false; + + piter->sg_pgoffset += piter->__pg_advance; + piter->__pg_advance = 1; + + while (piter->sg_pgoffset >= sg_dma_page_count(piter->sg)) { + piter->sg_pgoffset -= sg_dma_page_count(piter->sg); + piter->sg = sg_next(piter->sg); + if (!--piter->__nents || !piter->sg) + return false; + } + + return true; +} +EXPORT_SYMBOL(__sg_page_iter_dma_next); + +/** + * sg_miter_start - start mapping iteration over a sg list + * @miter: sg mapping iter to be started + * @sgl: sg list to iterate over + * @nents: number of sg entries + * + * Description: + * Starts mapping iterator @miter. + * + * Context: + * Don't care. + */ +void sg_miter_start(struct sg_mapping_iter *miter, struct scatterlist *sgl, + unsigned int nents, unsigned int flags) +{ + memset(miter, 0, sizeof(struct sg_mapping_iter)); + + __sg_page_iter_start(&miter->piter, sgl, nents, 0); + WARN_ON(!(flags & (SG_MITER_TO_SG | SG_MITER_FROM_SG))); + miter->__flags = flags; +} +EXPORT_SYMBOL(sg_miter_start); + +static bool sg_miter_get_next_page(struct sg_mapping_iter *miter) +{ + if (!miter->__remaining) { + struct scatterlist *sg; + + if (!__sg_page_iter_next(&miter->piter)) + return false; + + sg = miter->piter.sg; + + miter->__offset = miter->piter.sg_pgoffset ? 0 : sg->offset; + miter->piter.sg_pgoffset += miter->__offset >> PAGE_SHIFT; + miter->__offset &= PAGE_SIZE - 1; + miter->__remaining = sg->offset + sg->length - + (miter->piter.sg_pgoffset << PAGE_SHIFT) - + miter->__offset; + miter->__remaining = min_t(unsigned long, miter->__remaining, + PAGE_SIZE - miter->__offset); + } + + return true; +} + +/** + * sg_miter_skip - reposition mapping iterator + * @miter: sg mapping iter to be skipped + * @offset: number of bytes to plus the current location + * + * Description: + * Sets the offset of @miter to its current location plus @offset bytes. + * If mapping iterator @miter has been proceeded by sg_miter_next(), this + * stops @miter. + * + * Context: + * Don't care. + * + * Returns: + * true if @miter contains the valid mapping. false if end of sg + * list is reached. + */ +bool sg_miter_skip(struct sg_mapping_iter *miter, off_t offset) +{ + sg_miter_stop(miter); + + while (offset) { + off_t consumed; + + if (!sg_miter_get_next_page(miter)) + return false; + + consumed = min_t(off_t, offset, miter->__remaining); + miter->__offset += consumed; + miter->__remaining -= consumed; + offset -= consumed; + } + + return true; +} +EXPORT_SYMBOL(sg_miter_skip); + +/** + * sg_miter_next - proceed mapping iterator to the next mapping + * @miter: sg mapping iter to proceed + * + * Description: + * Proceeds @miter to the next mapping. @miter should have been started + * using sg_miter_start(). On successful return, @miter->page, + * @miter->addr and @miter->length point to the current mapping. + * + * Context: + * May sleep if !SG_MITER_ATOMIC. + * + * Returns: + * true if @miter contains the next mapping. false if end of sg + * list is reached. + */ +bool sg_miter_next(struct sg_mapping_iter *miter) +{ + sg_miter_stop(miter); + + /* + * Get to the next page if necessary. + * __remaining, __offset is adjusted by sg_miter_stop + */ + if (!sg_miter_get_next_page(miter)) + return false; + + miter->page = sg_page_iter_page(&miter->piter); + miter->consumed = miter->length = miter->__remaining; + + if (miter->__flags & SG_MITER_ATOMIC) + miter->addr = kmap_atomic(miter->page) + miter->__offset; + else + miter->addr = kmap(miter->page) + miter->__offset; + + return true; +} +EXPORT_SYMBOL(sg_miter_next); + +/** + * sg_miter_stop - stop mapping iteration + * @miter: sg mapping iter to be stopped + * + * Description: + * Stops mapping iterator @miter. @miter should have been started + * using sg_miter_start(). A stopped iteration can be resumed by + * calling sg_miter_next() on it. This is useful when resources (kmap) + * need to be released during iteration. + * + * Context: + * Don't care otherwise. + */ +void sg_miter_stop(struct sg_mapping_iter *miter) +{ + WARN_ON(miter->consumed > miter->length); + + /* drop resources from the last iteration */ + if (miter->addr) { + miter->__offset += miter->consumed; + miter->__remaining -= miter->consumed; + + if (miter->__flags & SG_MITER_TO_SG) + flush_dcache_page(miter->page); + + if (miter->__flags & SG_MITER_ATOMIC) { + WARN_ON_ONCE(!pagefault_disabled()); + kunmap_atomic(miter->addr); + } else + kunmap(miter->page); + + miter->page = NULL; + miter->addr = NULL; + miter->length = 0; + miter->consumed = 0; + } +} +EXPORT_SYMBOL(sg_miter_stop); + +/** + * sg_copy_buffer - Copy data between a linear buffer and an SG list + * @sgl: The SG list + * @nents: Number of SG entries + * @buf: Where to copy from + * @buflen: The number of bytes to copy + * @skip: Number of bytes to skip before copying + * @to_buffer: transfer direction (true == from an sg list to a + * buffer, false == from a buffer to an sg list) + * + * Returns the number of copied bytes. + * + **/ +size_t sg_copy_buffer(struct scatterlist *sgl, unsigned int nents, void *buf, + size_t buflen, off_t skip, bool to_buffer) +{ + unsigned int offset = 0; + struct sg_mapping_iter miter; + unsigned int sg_flags = SG_MITER_ATOMIC; + + if (to_buffer) + sg_flags |= SG_MITER_FROM_SG; + else + sg_flags |= SG_MITER_TO_SG; + + sg_miter_start(&miter, sgl, nents, sg_flags); + + if (!sg_miter_skip(&miter, skip)) + return 0; + + while ((offset < buflen) && sg_miter_next(&miter)) { + unsigned int len; + + len = min(miter.length, buflen - offset); + + if (to_buffer) + memcpy(buf + offset, miter.addr, len); + else + memcpy(miter.addr, buf + offset, len); + + offset += len; + } + + sg_miter_stop(&miter); + + return offset; +} +EXPORT_SYMBOL(sg_copy_buffer); + +/** + * sg_copy_from_buffer - Copy from a linear buffer to an SG list + * @sgl: The SG list + * @nents: Number of SG entries + * @buf: Where to copy from + * @buflen: The number of bytes to copy + * + * Returns the number of copied bytes. + * + **/ +size_t sg_copy_from_buffer(struct scatterlist *sgl, unsigned int nents, + const void *buf, size_t buflen) +{ + return sg_copy_buffer(sgl, nents, (void *)buf, buflen, 0, false); +} +EXPORT_SYMBOL(sg_copy_from_buffer); + +/** + * sg_copy_to_buffer - Copy from an SG list to a linear buffer + * @sgl: The SG list + * @nents: Number of SG entries + * @buf: Where to copy to + * @buflen: The number of bytes to copy + * + * Returns the number of copied bytes. + * + **/ +size_t sg_copy_to_buffer(struct scatterlist *sgl, unsigned int nents, + void *buf, size_t buflen) +{ + return sg_copy_buffer(sgl, nents, buf, buflen, 0, true); +} +EXPORT_SYMBOL(sg_copy_to_buffer); + +/** + * sg_pcopy_from_buffer - Copy from a linear buffer to an SG list + * @sgl: The SG list + * @nents: Number of SG entries + * @buf: Where to copy from + * @buflen: The number of bytes to copy + * @skip: Number of bytes to skip before copying + * + * Returns the number of copied bytes. + * + **/ +size_t sg_pcopy_from_buffer(struct scatterlist *sgl, unsigned int nents, + const void *buf, size_t buflen, off_t skip) +{ + return sg_copy_buffer(sgl, nents, (void *)buf, buflen, skip, false); +} +EXPORT_SYMBOL(sg_pcopy_from_buffer); + +/** + * sg_pcopy_to_buffer - Copy from an SG list to a linear buffer + * @sgl: The SG list + * @nents: Number of SG entries + * @buf: Where to copy to + * @buflen: The number of bytes to copy + * @skip: Number of bytes to skip before copying + * + * Returns the number of copied bytes. + * + **/ +size_t sg_pcopy_to_buffer(struct scatterlist *sgl, unsigned int nents, + void *buf, size_t buflen, off_t skip) +{ + return sg_copy_buffer(sgl, nents, buf, buflen, skip, true); +} +EXPORT_SYMBOL(sg_pcopy_to_buffer); + +/** + * sg_zero_buffer - Zero-out a part of a SG list + * @sgl: The SG list + * @nents: Number of SG entries + * @buflen: The number of bytes to zero out + * @skip: Number of bytes to skip before zeroing + * + * Returns the number of bytes zeroed. + **/ +size_t sg_zero_buffer(struct scatterlist *sgl, unsigned int nents, + size_t buflen, off_t skip) +{ + unsigned int offset = 0; + struct sg_mapping_iter miter; + unsigned int sg_flags = SG_MITER_ATOMIC | SG_MITER_TO_SG; + + sg_miter_start(&miter, sgl, nents, sg_flags); + + if (!sg_miter_skip(&miter, skip)) + return false; + + while (offset < buflen && sg_miter_next(&miter)) { + unsigned int len; + + len = min(miter.length, buflen - offset); + memset(miter.addr, 0, len); + + offset += len; + } + + sg_miter_stop(&miter); + return offset; +} +EXPORT_SYMBOL(sg_zero_buffer); |