diff options
author | 2023-02-21 18:24:12 -0800 | |
---|---|---|
committer | 2023-02-21 18:24:12 -0800 | |
commit | 5b7c4cabbb65f5c469464da6c5f614cbd7f730f2 (patch) | |
tree | cc5c2d0a898769fd59549594fedb3ee6f84e59a0 /drivers/nvdimm/label.c | |
download | linux-5b7c4cabbb65f5c469464da6c5f614cbd7f730f2.tar.gz linux-5b7c4cabbb65f5c469464da6c5f614cbd7f730f2.zip |
Merge tag 'net-next-6.3' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-nextgrafted
Pull networking updates from Jakub Kicinski:
"Core:
- Add dedicated kmem_cache for typical/small skb->head, avoid having
to access struct page at kfree time, and improve memory use.
- Introduce sysctl to set default RPS configuration for new netdevs.
- Define Netlink protocol specification format which can be used to
describe messages used by each family and auto-generate parsers.
Add tools for generating kernel data structures and uAPI headers.
- Expose all net/core sysctls inside netns.
- Remove 4s sleep in netpoll if carrier is instantly detected on
boot.
- Add configurable limit of MDB entries per port, and port-vlan.
- Continue populating drop reasons throughout the stack.
- Retire a handful of legacy Qdiscs and classifiers.
Protocols:
- Support IPv4 big TCP (TSO frames larger than 64kB).
- Add IP_LOCAL_PORT_RANGE socket option, to control local port range
on socket by socket basis.
- Track and report in procfs number of MPTCP sockets used.
- Support mixing IPv4 and IPv6 flows in the in-kernel MPTCP path
manager.
- IPv6: don't check net.ipv6.route.max_size and rely on garbage
collection to free memory (similarly to IPv4).
- Support Penultimate Segment Pop (PSP) flavor in SRv6 (RFC8986).
- ICMP: add per-rate limit counters.
- Add support for user scanning requests in ieee802154.
- Remove static WEP support.
- Support minimal Wi-Fi 7 Extremely High Throughput (EHT) rate
reporting.
- WiFi 7 EHT channel puncturing support (client & AP).
BPF:
- Add a rbtree data structure following the "next-gen data structure"
precedent set by recently added linked list, that is, by using
kfunc + kptr instead of adding a new BPF map type.
- Expose XDP hints via kfuncs with initial support for RX hash and
timestamp metadata.
- Add BPF_F_NO_TUNNEL_KEY extension to bpf_skb_set_tunnel_key to
better support decap on GRE tunnel devices not operating in collect
metadata.
- Improve x86 JIT's codegen for PROBE_MEM runtime error checks.
- Remove the need for trace_printk_lock for bpf_trace_printk and
bpf_trace_vprintk helpers.
- Extend libbpf's bpf_tracing.h support for tracing arguments of
kprobes/uprobes and syscall as a special case.
- Significantly reduce the search time for module symbols by
livepatch and BPF.
- Enable cpumasks to be used as kptrs, which is useful for tracing
programs tracking which tasks end up running on which CPUs in
different time intervals.
- Add support for BPF trampoline on s390x and riscv64.
- Add capability to export the XDP features supported by the NIC.
- Add __bpf_kfunc tag for marking kernel functions as kfuncs.
- Add cgroup.memory=nobpf kernel parameter option to disable BPF
memory accounting for container environments.
Netfilter:
- Remove the CLUSTERIP target. It has been marked as obsolete for
years, and we still have WARN splats wrt races of the out-of-band
/proc interface installed by this target.
- Add 'destroy' commands to nf_tables. They are identical to the
existing 'delete' commands, but do not return an error if the
referenced object (set, chain, rule...) did not exist.
Driver API:
- Improve cpumask_local_spread() locality to help NICs set the right
IRQ affinity on AMD platforms.
- Separate C22 and C45 MDIO bus transactions more clearly.
- Introduce new DCB table to control DSCP rewrite on egress.
- Support configuration of Physical Layer Collision Avoidance (PLCA)
Reconciliation Sublayer (RS) (802.3cg-2019). Modern version of
shared medium Ethernet.
- Support for MAC Merge layer (IEEE 802.3-2018 clause 99). Allowing
preemption of low priority frames by high priority frames.
- Add support for controlling MACSec offload using netlink SET.
- Rework devlink instance refcounts to allow registration and
de-registration under the instance lock. Split the code into
multiple files, drop some of the unnecessarily granular locks and
factor out common parts of netlink operation handling.
- Add TX frame aggregation parameters (for USB drivers).
- Add a new attr TCA_EXT_WARN_MSG to report TC (offload) warning
messages with notifications for debug.
- Allow offloading of UDP NEW connections via act_ct.
- Add support for per action HW stats in TC.
- Support hardware miss to TC action (continue processing in SW from
a specific point in the action chain).
- Warn if old Wireless Extension user space interface is used with
modern cfg80211/mac80211 drivers. Do not support Wireless
Extensions for Wi-Fi 7 devices at all. Everyone should switch to
using nl80211 interface instead.
- Improve the CAN bit timing configuration. Use extack to return
error messages directly to user space, update the SJW handling,
including the definition of a new default value that will benefit
CAN-FD controllers, by increasing their oscillator tolerance.
New hardware / drivers:
- Ethernet:
- nVidia BlueField-3 support (control traffic driver)
- Ethernet support for imx93 SoCs
- Motorcomm yt8531 gigabit Ethernet PHY
- onsemi NCN26000 10BASE-T1S PHY (with support for PLCA)
- Microchip LAN8841 PHY (incl. cable diagnostics and PTP)
- Amlogic gxl MDIO mux
- WiFi:
- RealTek RTL8188EU (rtl8xxxu)
- Qualcomm Wi-Fi 7 devices (ath12k)
- CAN:
- Renesas R-Car V4H
Drivers:
- Bluetooth:
- Set Per Platform Antenna Gain (PPAG) for Intel controllers.
- Ethernet NICs:
- Intel (1G, igc):
- support TSN / Qbv / packet scheduling features of i226 model
- Intel (100G, ice):
- use GNSS subsystem instead of TTY
- multi-buffer XDP support
- extend support for GPIO pins to E823 devices
- nVidia/Mellanox:
- update the shared buffer configuration on PFC commands
- implement PTP adjphase function for HW offset control
- TC support for Geneve and GRE with VF tunnel offload
- more efficient crypto key management method
- multi-port eswitch support
- Netronome/Corigine:
- add DCB IEEE support
- support IPsec offloading for NFP3800
- Freescale/NXP (enetc):
- support XDP_REDIRECT for XDP non-linear buffers
- improve reconfig, avoid link flap and waiting for idle
- support MAC Merge layer
- Other NICs:
- sfc/ef100: add basic devlink support for ef100
- ionic: rx_push mode operation (writing descriptors via MMIO)
- bnxt: use the auxiliary bus abstraction for RDMA
- r8169: disable ASPM and reset bus in case of tx timeout
- cpsw: support QSGMII mode for J721e CPSW9G
- cpts: support pulse-per-second output
- ngbe: add an mdio bus driver
- usbnet: optimize usbnet_bh() by avoiding unnecessary queuing
- r8152: handle devices with FW with NCM support
- amd-xgbe: support 10Mbps, 2.5GbE speeds and rx-adaptation
- virtio-net: support multi buffer XDP
- virtio/vsock: replace virtio_vsock_pkt with sk_buff
- tsnep: XDP support
- Ethernet high-speed switches:
- nVidia/Mellanox (mlxsw):
- add support for latency TLV (in FW control messages)
- Microchip (sparx5):
- separate explicit and implicit traffic forwarding rules, make
the implicit rules always active
- add support for egress DSCP rewrite
- IS0 VCAP support (Ingress Classification)
- IS2 VCAP filters (protos, L3 addrs, L4 ports, flags, ToS
etc.)
- ES2 VCAP support (Egress Access Control)
- support for Per-Stream Filtering and Policing (802.1Q,
8.6.5.1)
- Ethernet embedded switches:
- Marvell (mv88e6xxx):
- add MAB (port auth) offload support
- enable PTP receive for mv88e6390
- NXP (ocelot):
- support MAC Merge layer
- support for the the vsc7512 internal copper phys
- Microchip:
- lan9303: convert to PHYLINK
- lan966x: support TC flower filter statistics
- lan937x: PTP support for KSZ9563/KSZ8563 and LAN937x
- lan937x: support Credit Based Shaper configuration
- ksz9477: support Energy Efficient Ethernet
- other:
- qca8k: convert to regmap read/write API, use bulk operations
- rswitch: Improve TX timestamp accuracy
- Intel WiFi (iwlwifi):
- EHT (Wi-Fi 7) rate reporting
- STEP equalizer support: transfer some STEP (connection to radio
on platforms with integrated wifi) related parameters from the
BIOS to the firmware.
- Qualcomm 802.11ax WiFi (ath11k):
- IPQ5018 support
- Fine Timing Measurement (FTM) responder role support
- channel 177 support
- MediaTek WiFi (mt76):
- per-PHY LED support
- mt7996: EHT (Wi-Fi 7) support
- Wireless Ethernet Dispatch (WED) reset support
- switch to using page pool allocator
- RealTek WiFi (rtw89):
- support new version of Bluetooth co-existance
- Mobile:
- rmnet: support TX aggregation"
* tag 'net-next-6.3' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (1872 commits)
page_pool: add a comment explaining the fragment counter usage
net: ethtool: fix __ethtool_dev_mm_supported() implementation
ethtool: pse-pd: Fix double word in comments
xsk: add linux/vmalloc.h to xsk.c
sefltests: netdevsim: wait for devlink instance after netns removal
selftest: fib_tests: Always cleanup before exit
net/mlx5e: Align IPsec ASO result memory to be as required by hardware
net/mlx5e: TC, Set CT miss to the specific ct action instance
net/mlx5e: Rename CHAIN_TO_REG to MAPPED_OBJ_TO_REG
net/mlx5: Refactor tc miss handling to a single function
net/mlx5: Kconfig: Make tc offload depend on tc skb extension
net/sched: flower: Support hardware miss to tc action
net/sched: flower: Move filter handle initialization earlier
net/sched: cls_api: Support hardware miss to tc action
net/sched: Rename user cookie and act cookie
sfc: fix builds without CONFIG_RTC_LIB
sfc: clean up some inconsistent indentings
net/mlx4_en: Introduce flexible array to silence overflow warning
net: lan966x: Fix possible deadlock inside PTP
net/ulp: Remove redundant ->clone() test in inet_clone_ulp().
...
Diffstat (limited to 'drivers/nvdimm/label.c')
-rw-r--r-- | drivers/nvdimm/label.c | 1120 |
1 files changed, 1120 insertions, 0 deletions
diff --git a/drivers/nvdimm/label.c b/drivers/nvdimm/label.c new file mode 100644 index 000000000..082253a3a --- /dev/null +++ b/drivers/nvdimm/label.c @@ -0,0 +1,1120 @@ +// SPDX-License-Identifier: GPL-2.0-only +/* + * Copyright(c) 2013-2015 Intel Corporation. All rights reserved. + */ +#include <linux/device.h> +#include <linux/ndctl.h> +#include <linux/uuid.h> +#include <linux/slab.h> +#include <linux/io.h> +#include <linux/nd.h> +#include "nd-core.h" +#include "label.h" +#include "nd.h" + +static guid_t nvdimm_btt_guid; +static guid_t nvdimm_btt2_guid; +static guid_t nvdimm_pfn_guid; +static guid_t nvdimm_dax_guid; + +static uuid_t nvdimm_btt_uuid; +static uuid_t nvdimm_btt2_uuid; +static uuid_t nvdimm_pfn_uuid; +static uuid_t nvdimm_dax_uuid; + +static uuid_t cxl_region_uuid; +static uuid_t cxl_namespace_uuid; + +static const char NSINDEX_SIGNATURE[] = "NAMESPACE_INDEX\0"; + +static u32 best_seq(u32 a, u32 b) +{ + a &= NSINDEX_SEQ_MASK; + b &= NSINDEX_SEQ_MASK; + + if (a == 0 || a == b) + return b; + else if (b == 0) + return a; + else if (nd_inc_seq(a) == b) + return b; + else + return a; +} + +unsigned sizeof_namespace_label(struct nvdimm_drvdata *ndd) +{ + return ndd->nslabel_size; +} + +static size_t __sizeof_namespace_index(u32 nslot) +{ + return ALIGN(sizeof(struct nd_namespace_index) + DIV_ROUND_UP(nslot, 8), + NSINDEX_ALIGN); +} + +static int __nvdimm_num_label_slots(struct nvdimm_drvdata *ndd, + size_t index_size) +{ + return (ndd->nsarea.config_size - index_size * 2) / + sizeof_namespace_label(ndd); +} + +int nvdimm_num_label_slots(struct nvdimm_drvdata *ndd) +{ + u32 tmp_nslot, n; + + tmp_nslot = ndd->nsarea.config_size / sizeof_namespace_label(ndd); + n = __sizeof_namespace_index(tmp_nslot) / NSINDEX_ALIGN; + + return __nvdimm_num_label_slots(ndd, NSINDEX_ALIGN * n); +} + +size_t sizeof_namespace_index(struct nvdimm_drvdata *ndd) +{ + u32 nslot, space, size; + + /* + * Per UEFI 2.7, the minimum size of the Label Storage Area is large + * enough to hold 2 index blocks and 2 labels. The minimum index + * block size is 256 bytes. The label size is 128 for namespaces + * prior to version 1.2 and at minimum 256 for version 1.2 and later. + */ + nslot = nvdimm_num_label_slots(ndd); + space = ndd->nsarea.config_size - nslot * sizeof_namespace_label(ndd); + size = __sizeof_namespace_index(nslot) * 2; + if (size <= space && nslot >= 2) + return size / 2; + + dev_err(ndd->dev, "label area (%d) too small to host (%d byte) labels\n", + ndd->nsarea.config_size, sizeof_namespace_label(ndd)); + return 0; +} + +static int __nd_label_validate(struct nvdimm_drvdata *ndd) +{ + /* + * On media label format consists of two index blocks followed + * by an array of labels. None of these structures are ever + * updated in place. A sequence number tracks the current + * active index and the next one to write, while labels are + * written to free slots. + * + * +------------+ + * | | + * | nsindex0 | + * | | + * +------------+ + * | | + * | nsindex1 | + * | | + * +------------+ + * | label0 | + * +------------+ + * | label1 | + * +------------+ + * | | + * ....nslot... + * | | + * +------------+ + * | labelN | + * +------------+ + */ + struct nd_namespace_index *nsindex[] = { + to_namespace_index(ndd, 0), + to_namespace_index(ndd, 1), + }; + const int num_index = ARRAY_SIZE(nsindex); + struct device *dev = ndd->dev; + bool valid[2] = { 0 }; + int i, num_valid = 0; + u32 seq; + + for (i = 0; i < num_index; i++) { + u32 nslot; + u8 sig[NSINDEX_SIG_LEN]; + u64 sum_save, sum, size; + unsigned int version, labelsize; + + memcpy(sig, nsindex[i]->sig, NSINDEX_SIG_LEN); + if (memcmp(sig, NSINDEX_SIGNATURE, NSINDEX_SIG_LEN) != 0) { + dev_dbg(dev, "nsindex%d signature invalid\n", i); + continue; + } + + /* label sizes larger than 128 arrived with v1.2 */ + version = __le16_to_cpu(nsindex[i]->major) * 100 + + __le16_to_cpu(nsindex[i]->minor); + if (version >= 102) + labelsize = 1 << (7 + nsindex[i]->labelsize); + else + labelsize = 128; + + if (labelsize != sizeof_namespace_label(ndd)) { + dev_dbg(dev, "nsindex%d labelsize %d invalid\n", + i, nsindex[i]->labelsize); + continue; + } + + sum_save = __le64_to_cpu(nsindex[i]->checksum); + nsindex[i]->checksum = __cpu_to_le64(0); + sum = nd_fletcher64(nsindex[i], sizeof_namespace_index(ndd), 1); + nsindex[i]->checksum = __cpu_to_le64(sum_save); + if (sum != sum_save) { + dev_dbg(dev, "nsindex%d checksum invalid\n", i); + continue; + } + + seq = __le32_to_cpu(nsindex[i]->seq); + if ((seq & NSINDEX_SEQ_MASK) == 0) { + dev_dbg(dev, "nsindex%d sequence: %#x invalid\n", i, seq); + continue; + } + + /* sanity check the index against expected values */ + if (__le64_to_cpu(nsindex[i]->myoff) + != i * sizeof_namespace_index(ndd)) { + dev_dbg(dev, "nsindex%d myoff: %#llx invalid\n", + i, (unsigned long long) + __le64_to_cpu(nsindex[i]->myoff)); + continue; + } + if (__le64_to_cpu(nsindex[i]->otheroff) + != (!i) * sizeof_namespace_index(ndd)) { + dev_dbg(dev, "nsindex%d otheroff: %#llx invalid\n", + i, (unsigned long long) + __le64_to_cpu(nsindex[i]->otheroff)); + continue; + } + if (__le64_to_cpu(nsindex[i]->labeloff) + != 2 * sizeof_namespace_index(ndd)) { + dev_dbg(dev, "nsindex%d labeloff: %#llx invalid\n", + i, (unsigned long long) + __le64_to_cpu(nsindex[i]->labeloff)); + continue; + } + + size = __le64_to_cpu(nsindex[i]->mysize); + if (size > sizeof_namespace_index(ndd) + || size < sizeof(struct nd_namespace_index)) { + dev_dbg(dev, "nsindex%d mysize: %#llx invalid\n", i, size); + continue; + } + + nslot = __le32_to_cpu(nsindex[i]->nslot); + if (nslot * sizeof_namespace_label(ndd) + + 2 * sizeof_namespace_index(ndd) + > ndd->nsarea.config_size) { + dev_dbg(dev, "nsindex%d nslot: %u invalid, config_size: %#x\n", + i, nslot, ndd->nsarea.config_size); + continue; + } + valid[i] = true; + num_valid++; + } + + switch (num_valid) { + case 0: + break; + case 1: + for (i = 0; i < num_index; i++) + if (valid[i]) + return i; + /* can't have num_valid > 0 but valid[] = { false, false } */ + WARN_ON(1); + break; + default: + /* pick the best index... */ + seq = best_seq(__le32_to_cpu(nsindex[0]->seq), + __le32_to_cpu(nsindex[1]->seq)); + if (seq == (__le32_to_cpu(nsindex[1]->seq) & NSINDEX_SEQ_MASK)) + return 1; + else + return 0; + break; + } + + return -1; +} + +static int nd_label_validate(struct nvdimm_drvdata *ndd) +{ + /* + * In order to probe for and validate namespace index blocks we + * need to know the size of the labels, and we can't trust the + * size of the labels until we validate the index blocks. + * Resolve this dependency loop by probing for known label + * sizes, but default to v1.2 256-byte namespace labels if + * discovery fails. + */ + int label_size[] = { 128, 256 }; + int i, rc; + + for (i = 0; i < ARRAY_SIZE(label_size); i++) { + ndd->nslabel_size = label_size[i]; + rc = __nd_label_validate(ndd); + if (rc >= 0) + return rc; + } + + return -1; +} + +static void nd_label_copy(struct nvdimm_drvdata *ndd, + struct nd_namespace_index *dst, + struct nd_namespace_index *src) +{ + /* just exit if either destination or source is NULL */ + if (!dst || !src) + return; + + memcpy(dst, src, sizeof_namespace_index(ndd)); +} + +static struct nd_namespace_label *nd_label_base(struct nvdimm_drvdata *ndd) +{ + void *base = to_namespace_index(ndd, 0); + + return base + 2 * sizeof_namespace_index(ndd); +} + +static int to_slot(struct nvdimm_drvdata *ndd, + struct nd_namespace_label *nd_label) +{ + unsigned long label, base; + + label = (unsigned long) nd_label; + base = (unsigned long) nd_label_base(ndd); + + return (label - base) / sizeof_namespace_label(ndd); +} + +static struct nd_namespace_label *to_label(struct nvdimm_drvdata *ndd, int slot) +{ + unsigned long label, base; + + base = (unsigned long) nd_label_base(ndd); + label = base + sizeof_namespace_label(ndd) * slot; + + return (struct nd_namespace_label *) label; +} + +#define for_each_clear_bit_le(bit, addr, size) \ + for ((bit) = find_next_zero_bit_le((addr), (size), 0); \ + (bit) < (size); \ + (bit) = find_next_zero_bit_le((addr), (size), (bit) + 1)) + +/** + * preamble_index - common variable initialization for nd_label_* routines + * @ndd: dimm container for the relevant label set + * @idx: namespace_index index + * @nsindex_out: on return set to the currently active namespace index + * @free: on return set to the free label bitmap in the index + * @nslot: on return set to the number of slots in the label space + */ +static bool preamble_index(struct nvdimm_drvdata *ndd, int idx, + struct nd_namespace_index **nsindex_out, + unsigned long **free, u32 *nslot) +{ + struct nd_namespace_index *nsindex; + + nsindex = to_namespace_index(ndd, idx); + if (nsindex == NULL) + return false; + + *free = (unsigned long *) nsindex->free; + *nslot = __le32_to_cpu(nsindex->nslot); + *nsindex_out = nsindex; + + return true; +} + +char *nd_label_gen_id(struct nd_label_id *label_id, const uuid_t *uuid, + u32 flags) +{ + if (!label_id || !uuid) + return NULL; + snprintf(label_id->id, ND_LABEL_ID_SIZE, "pmem-%pUb", uuid); + return label_id->id; +} + +static bool preamble_current(struct nvdimm_drvdata *ndd, + struct nd_namespace_index **nsindex, + unsigned long **free, u32 *nslot) +{ + return preamble_index(ndd, ndd->ns_current, nsindex, + free, nslot); +} + +static bool preamble_next(struct nvdimm_drvdata *ndd, + struct nd_namespace_index **nsindex, + unsigned long **free, u32 *nslot) +{ + return preamble_index(ndd, ndd->ns_next, nsindex, + free, nslot); +} + +static bool nsl_validate_checksum(struct nvdimm_drvdata *ndd, + struct nd_namespace_label *nd_label) +{ + u64 sum, sum_save; + + if (!ndd->cxl && !efi_namespace_label_has(ndd, checksum)) + return true; + + sum_save = nsl_get_checksum(ndd, nd_label); + nsl_set_checksum(ndd, nd_label, 0); + sum = nd_fletcher64(nd_label, sizeof_namespace_label(ndd), 1); + nsl_set_checksum(ndd, nd_label, sum_save); + return sum == sum_save; +} + +static void nsl_calculate_checksum(struct nvdimm_drvdata *ndd, + struct nd_namespace_label *nd_label) +{ + u64 sum; + + if (!ndd->cxl && !efi_namespace_label_has(ndd, checksum)) + return; + nsl_set_checksum(ndd, nd_label, 0); + sum = nd_fletcher64(nd_label, sizeof_namespace_label(ndd), 1); + nsl_set_checksum(ndd, nd_label, sum); +} + +static bool slot_valid(struct nvdimm_drvdata *ndd, + struct nd_namespace_label *nd_label, u32 slot) +{ + bool valid; + + /* check that we are written where we expect to be written */ + if (slot != nsl_get_slot(ndd, nd_label)) + return false; + valid = nsl_validate_checksum(ndd, nd_label); + if (!valid) + dev_dbg(ndd->dev, "fail checksum. slot: %d\n", slot); + return valid; +} + +int nd_label_reserve_dpa(struct nvdimm_drvdata *ndd) +{ + struct nd_namespace_index *nsindex; + unsigned long *free; + u32 nslot, slot; + + if (!preamble_current(ndd, &nsindex, &free, &nslot)) + return 0; /* no label, nothing to reserve */ + + for_each_clear_bit_le(slot, free, nslot) { + struct nd_namespace_label *nd_label; + struct nd_region *nd_region = NULL; + struct nd_label_id label_id; + struct resource *res; + uuid_t label_uuid; + u32 flags; + + nd_label = to_label(ndd, slot); + + if (!slot_valid(ndd, nd_label, slot)) + continue; + + nsl_get_uuid(ndd, nd_label, &label_uuid); + flags = nsl_get_flags(ndd, nd_label); + nd_label_gen_id(&label_id, &label_uuid, flags); + res = nvdimm_allocate_dpa(ndd, &label_id, + nsl_get_dpa(ndd, nd_label), + nsl_get_rawsize(ndd, nd_label)); + nd_dbg_dpa(nd_region, ndd, res, "reserve\n"); + if (!res) + return -EBUSY; + } + + return 0; +} + +int nd_label_data_init(struct nvdimm_drvdata *ndd) +{ + size_t config_size, read_size, max_xfer, offset; + struct nd_namespace_index *nsindex; + unsigned int i; + int rc = 0; + u32 nslot; + + if (ndd->data) + return 0; + + if (ndd->nsarea.status || ndd->nsarea.max_xfer == 0) { + dev_dbg(ndd->dev, "failed to init config data area: (%u:%u)\n", + ndd->nsarea.max_xfer, ndd->nsarea.config_size); + return -ENXIO; + } + + /* + * We need to determine the maximum index area as this is the section + * we must read and validate before we can start processing labels. + * + * If the area is too small to contain the two indexes and 2 labels + * then we abort. + * + * Start at a label size of 128 as this should result in the largest + * possible namespace index size. + */ + ndd->nslabel_size = 128; + read_size = sizeof_namespace_index(ndd) * 2; + if (!read_size) + return -ENXIO; + + /* Allocate config data */ + config_size = ndd->nsarea.config_size; + ndd->data = kvzalloc(config_size, GFP_KERNEL); + if (!ndd->data) + return -ENOMEM; + + /* + * We want to guarantee as few reads as possible while conserving + * memory. To do that we figure out how much unused space will be left + * in the last read, divide that by the total number of reads it is + * going to take given our maximum transfer size, and then reduce our + * maximum transfer size based on that result. + */ + max_xfer = min_t(size_t, ndd->nsarea.max_xfer, config_size); + if (read_size < max_xfer) { + /* trim waste */ + max_xfer -= ((max_xfer - 1) - (config_size - 1) % max_xfer) / + DIV_ROUND_UP(config_size, max_xfer); + /* make certain we read indexes in exactly 1 read */ + if (max_xfer < read_size) + max_xfer = read_size; + } + + /* Make our initial read size a multiple of max_xfer size */ + read_size = min(DIV_ROUND_UP(read_size, max_xfer) * max_xfer, + config_size); + + /* Read the index data */ + rc = nvdimm_get_config_data(ndd, ndd->data, 0, read_size); + if (rc) + goto out_err; + + /* Validate index data, if not valid assume all labels are invalid */ + ndd->ns_current = nd_label_validate(ndd); + if (ndd->ns_current < 0) + return 0; + + /* Record our index values */ + ndd->ns_next = nd_label_next_nsindex(ndd->ns_current); + + /* Copy "current" index on top of the "next" index */ + nsindex = to_current_namespace_index(ndd); + nd_label_copy(ndd, to_next_namespace_index(ndd), nsindex); + + /* Determine starting offset for label data */ + offset = __le64_to_cpu(nsindex->labeloff); + nslot = __le32_to_cpu(nsindex->nslot); + + /* Loop through the free list pulling in any active labels */ + for (i = 0; i < nslot; i++, offset += ndd->nslabel_size) { + size_t label_read_size; + + /* zero out the unused labels */ + if (test_bit_le(i, nsindex->free)) { + memset(ndd->data + offset, 0, ndd->nslabel_size); + continue; + } + + /* if we already read past here then just continue */ + if (offset + ndd->nslabel_size <= read_size) + continue; + + /* if we haven't read in a while reset our read_size offset */ + if (read_size < offset) + read_size = offset; + + /* determine how much more will be read after this next call. */ + label_read_size = offset + ndd->nslabel_size - read_size; + label_read_size = DIV_ROUND_UP(label_read_size, max_xfer) * + max_xfer; + + /* truncate last read if needed */ + if (read_size + label_read_size > config_size) + label_read_size = config_size - read_size; + + /* Read the label data */ + rc = nvdimm_get_config_data(ndd, ndd->data + read_size, + read_size, label_read_size); + if (rc) + goto out_err; + + /* push read_size to next read offset */ + read_size += label_read_size; + } + + dev_dbg(ndd->dev, "len: %zu rc: %d\n", offset, rc); +out_err: + return rc; +} + +int nd_label_active_count(struct nvdimm_drvdata *ndd) +{ + struct nd_namespace_index *nsindex; + unsigned long *free; + u32 nslot, slot; + int count = 0; + + if (!preamble_current(ndd, &nsindex, &free, &nslot)) + return 0; + + for_each_clear_bit_le(slot, free, nslot) { + struct nd_namespace_label *nd_label; + + nd_label = to_label(ndd, slot); + + if (!slot_valid(ndd, nd_label, slot)) { + u32 label_slot = nsl_get_slot(ndd, nd_label); + u64 size = nsl_get_rawsize(ndd, nd_label); + u64 dpa = nsl_get_dpa(ndd, nd_label); + + dev_dbg(ndd->dev, + "slot%d invalid slot: %d dpa: %llx size: %llx\n", + slot, label_slot, dpa, size); + continue; + } + count++; + } + return count; +} + +struct nd_namespace_label *nd_label_active(struct nvdimm_drvdata *ndd, int n) +{ + struct nd_namespace_index *nsindex; + unsigned long *free; + u32 nslot, slot; + + if (!preamble_current(ndd, &nsindex, &free, &nslot)) + return NULL; + + for_each_clear_bit_le(slot, free, nslot) { + struct nd_namespace_label *nd_label; + + nd_label = to_label(ndd, slot); + if (!slot_valid(ndd, nd_label, slot)) + continue; + + if (n-- == 0) + return to_label(ndd, slot); + } + + return NULL; +} + +u32 nd_label_alloc_slot(struct nvdimm_drvdata *ndd) +{ + struct nd_namespace_index *nsindex; + unsigned long *free; + u32 nslot, slot; + + if (!preamble_next(ndd, &nsindex, &free, &nslot)) + return UINT_MAX; + + WARN_ON(!is_nvdimm_bus_locked(ndd->dev)); + + slot = find_next_bit_le(free, nslot, 0); + if (slot == nslot) + return UINT_MAX; + + clear_bit_le(slot, free); + + return slot; +} + +bool nd_label_free_slot(struct nvdimm_drvdata *ndd, u32 slot) +{ + struct nd_namespace_index *nsindex; + unsigned long *free; + u32 nslot; + + if (!preamble_next(ndd, &nsindex, &free, &nslot)) + return false; + + WARN_ON(!is_nvdimm_bus_locked(ndd->dev)); + + if (slot < nslot) + return !test_and_set_bit_le(slot, free); + return false; +} + +u32 nd_label_nfree(struct nvdimm_drvdata *ndd) +{ + struct nd_namespace_index *nsindex; + unsigned long *free; + u32 nslot; + + WARN_ON(!is_nvdimm_bus_locked(ndd->dev)); + + if (!preamble_next(ndd, &nsindex, &free, &nslot)) + return nvdimm_num_label_slots(ndd); + + return bitmap_weight(free, nslot); +} + +static int nd_label_write_index(struct nvdimm_drvdata *ndd, int index, u32 seq, + unsigned long flags) +{ + struct nd_namespace_index *nsindex; + unsigned long offset; + u64 checksum; + u32 nslot; + int rc; + + nsindex = to_namespace_index(ndd, index); + if (flags & ND_NSINDEX_INIT) + nslot = nvdimm_num_label_slots(ndd); + else + nslot = __le32_to_cpu(nsindex->nslot); + + memcpy(nsindex->sig, NSINDEX_SIGNATURE, NSINDEX_SIG_LEN); + memset(&nsindex->flags, 0, 3); + nsindex->labelsize = sizeof_namespace_label(ndd) >> 8; + nsindex->seq = __cpu_to_le32(seq); + offset = (unsigned long) nsindex + - (unsigned long) to_namespace_index(ndd, 0); + nsindex->myoff = __cpu_to_le64(offset); + nsindex->mysize = __cpu_to_le64(sizeof_namespace_index(ndd)); + offset = (unsigned long) to_namespace_index(ndd, + nd_label_next_nsindex(index)) + - (unsigned long) to_namespace_index(ndd, 0); + nsindex->otheroff = __cpu_to_le64(offset); + offset = (unsigned long) nd_label_base(ndd) + - (unsigned long) to_namespace_index(ndd, 0); + nsindex->labeloff = __cpu_to_le64(offset); + nsindex->nslot = __cpu_to_le32(nslot); + nsindex->major = __cpu_to_le16(1); + if (sizeof_namespace_label(ndd) < 256) + nsindex->minor = __cpu_to_le16(1); + else + nsindex->minor = __cpu_to_le16(2); + nsindex->checksum = __cpu_to_le64(0); + if (flags & ND_NSINDEX_INIT) { + unsigned long *free = (unsigned long *) nsindex->free; + u32 nfree = ALIGN(nslot, BITS_PER_LONG); + int last_bits, i; + + memset(nsindex->free, 0xff, nfree / 8); + for (i = 0, last_bits = nfree - nslot; i < last_bits; i++) + clear_bit_le(nslot + i, free); + } + checksum = nd_fletcher64(nsindex, sizeof_namespace_index(ndd), 1); + nsindex->checksum = __cpu_to_le64(checksum); + rc = nvdimm_set_config_data(ndd, __le64_to_cpu(nsindex->myoff), + nsindex, sizeof_namespace_index(ndd)); + if (rc < 0) + return rc; + + if (flags & ND_NSINDEX_INIT) + return 0; + + /* copy the index we just wrote to the new 'next' */ + WARN_ON(index != ndd->ns_next); + nd_label_copy(ndd, to_current_namespace_index(ndd), nsindex); + ndd->ns_current = nd_label_next_nsindex(ndd->ns_current); + ndd->ns_next = nd_label_next_nsindex(ndd->ns_next); + WARN_ON(ndd->ns_current == ndd->ns_next); + + return 0; +} + +static unsigned long nd_label_offset(struct nvdimm_drvdata *ndd, + struct nd_namespace_label *nd_label) +{ + return (unsigned long) nd_label + - (unsigned long) to_namespace_index(ndd, 0); +} + +static enum nvdimm_claim_class guid_to_nvdimm_cclass(guid_t *guid) +{ + if (guid_equal(guid, &nvdimm_btt_guid)) + return NVDIMM_CCLASS_BTT; + else if (guid_equal(guid, &nvdimm_btt2_guid)) + return NVDIMM_CCLASS_BTT2; + else if (guid_equal(guid, &nvdimm_pfn_guid)) + return NVDIMM_CCLASS_PFN; + else if (guid_equal(guid, &nvdimm_dax_guid)) + return NVDIMM_CCLASS_DAX; + else if (guid_equal(guid, &guid_null)) + return NVDIMM_CCLASS_NONE; + + return NVDIMM_CCLASS_UNKNOWN; +} + +/* CXL labels store UUIDs instead of GUIDs for the same data */ +static enum nvdimm_claim_class uuid_to_nvdimm_cclass(uuid_t *uuid) +{ + if (uuid_equal(uuid, &nvdimm_btt_uuid)) + return NVDIMM_CCLASS_BTT; + else if (uuid_equal(uuid, &nvdimm_btt2_uuid)) + return NVDIMM_CCLASS_BTT2; + else if (uuid_equal(uuid, &nvdimm_pfn_uuid)) + return NVDIMM_CCLASS_PFN; + else if (uuid_equal(uuid, &nvdimm_dax_uuid)) + return NVDIMM_CCLASS_DAX; + else if (uuid_equal(uuid, &uuid_null)) + return NVDIMM_CCLASS_NONE; + + return NVDIMM_CCLASS_UNKNOWN; +} + +static const guid_t *to_abstraction_guid(enum nvdimm_claim_class claim_class, + guid_t *target) +{ + if (claim_class == NVDIMM_CCLASS_BTT) + return &nvdimm_btt_guid; + else if (claim_class == NVDIMM_CCLASS_BTT2) + return &nvdimm_btt2_guid; + else if (claim_class == NVDIMM_CCLASS_PFN) + return &nvdimm_pfn_guid; + else if (claim_class == NVDIMM_CCLASS_DAX) + return &nvdimm_dax_guid; + else if (claim_class == NVDIMM_CCLASS_UNKNOWN) { + /* + * If we're modifying a namespace for which we don't + * know the claim_class, don't touch the existing guid. + */ + return target; + } else + return &guid_null; +} + +/* CXL labels store UUIDs instead of GUIDs for the same data */ +static const uuid_t *to_abstraction_uuid(enum nvdimm_claim_class claim_class, + uuid_t *target) +{ + if (claim_class == NVDIMM_CCLASS_BTT) + return &nvdimm_btt_uuid; + else if (claim_class == NVDIMM_CCLASS_BTT2) + return &nvdimm_btt2_uuid; + else if (claim_class == NVDIMM_CCLASS_PFN) + return &nvdimm_pfn_uuid; + else if (claim_class == NVDIMM_CCLASS_DAX) + return &nvdimm_dax_uuid; + else if (claim_class == NVDIMM_CCLASS_UNKNOWN) { + /* + * If we're modifying a namespace for which we don't + * know the claim_class, don't touch the existing uuid. + */ + return target; + } else + return &uuid_null; +} + +static void reap_victim(struct nd_mapping *nd_mapping, + struct nd_label_ent *victim) +{ + struct nvdimm_drvdata *ndd = to_ndd(nd_mapping); + u32 slot = to_slot(ndd, victim->label); + + dev_dbg(ndd->dev, "free: %d\n", slot); + nd_label_free_slot(ndd, slot); + victim->label = NULL; +} + +static void nsl_set_type_guid(struct nvdimm_drvdata *ndd, + struct nd_namespace_label *nd_label, guid_t *guid) +{ + if (efi_namespace_label_has(ndd, type_guid)) + guid_copy(&nd_label->efi.type_guid, guid); +} + +bool nsl_validate_type_guid(struct nvdimm_drvdata *ndd, + struct nd_namespace_label *nd_label, guid_t *guid) +{ + if (ndd->cxl || !efi_namespace_label_has(ndd, type_guid)) + return true; + if (!guid_equal(&nd_label->efi.type_guid, guid)) { + dev_dbg(ndd->dev, "expect type_guid %pUb got %pUb\n", guid, + &nd_label->efi.type_guid); + return false; + } + return true; +} + +static void nsl_set_claim_class(struct nvdimm_drvdata *ndd, + struct nd_namespace_label *nd_label, + enum nvdimm_claim_class claim_class) +{ + if (ndd->cxl) { + uuid_t uuid; + + import_uuid(&uuid, nd_label->cxl.abstraction_uuid); + export_uuid(nd_label->cxl.abstraction_uuid, + to_abstraction_uuid(claim_class, &uuid)); + return; + } + + if (!efi_namespace_label_has(ndd, abstraction_guid)) + return; + guid_copy(&nd_label->efi.abstraction_guid, + to_abstraction_guid(claim_class, + &nd_label->efi.abstraction_guid)); +} + +enum nvdimm_claim_class nsl_get_claim_class(struct nvdimm_drvdata *ndd, + struct nd_namespace_label *nd_label) +{ + if (ndd->cxl) { + uuid_t uuid; + + import_uuid(&uuid, nd_label->cxl.abstraction_uuid); + return uuid_to_nvdimm_cclass(&uuid); + } + if (!efi_namespace_label_has(ndd, abstraction_guid)) + return NVDIMM_CCLASS_NONE; + return guid_to_nvdimm_cclass(&nd_label->efi.abstraction_guid); +} + +static int __pmem_label_update(struct nd_region *nd_region, + struct nd_mapping *nd_mapping, struct nd_namespace_pmem *nspm, + int pos, unsigned long flags) +{ + struct nd_namespace_common *ndns = &nspm->nsio.common; + struct nd_interleave_set *nd_set = nd_region->nd_set; + struct nvdimm_drvdata *ndd = to_ndd(nd_mapping); + struct nd_namespace_label *nd_label; + struct nd_namespace_index *nsindex; + struct nd_label_ent *label_ent; + struct nd_label_id label_id; + struct resource *res; + unsigned long *free; + u32 nslot, slot; + size_t offset; + u64 cookie; + int rc; + + if (!preamble_next(ndd, &nsindex, &free, &nslot)) + return -ENXIO; + + cookie = nd_region_interleave_set_cookie(nd_region, nsindex); + nd_label_gen_id(&label_id, nspm->uuid, 0); + for_each_dpa_resource(ndd, res) + if (strcmp(res->name, label_id.id) == 0) + break; + + if (!res) { + WARN_ON_ONCE(1); + return -ENXIO; + } + + /* allocate and write the label to the staging (next) index */ + slot = nd_label_alloc_slot(ndd); + if (slot == UINT_MAX) + return -ENXIO; + dev_dbg(ndd->dev, "allocated: %d\n", slot); + + nd_label = to_label(ndd, slot); + memset(nd_label, 0, sizeof_namespace_label(ndd)); + nsl_set_uuid(ndd, nd_label, nspm->uuid); + nsl_set_name(ndd, nd_label, nspm->alt_name); + nsl_set_flags(ndd, nd_label, flags); + nsl_set_nlabel(ndd, nd_label, nd_region->ndr_mappings); + nsl_set_nrange(ndd, nd_label, 1); + nsl_set_position(ndd, nd_label, pos); + nsl_set_isetcookie(ndd, nd_label, cookie); + nsl_set_rawsize(ndd, nd_label, resource_size(res)); + nsl_set_lbasize(ndd, nd_label, nspm->lbasize); + nsl_set_dpa(ndd, nd_label, res->start); + nsl_set_slot(ndd, nd_label, slot); + nsl_set_type_guid(ndd, nd_label, &nd_set->type_guid); + nsl_set_claim_class(ndd, nd_label, ndns->claim_class); + nsl_calculate_checksum(ndd, nd_label); + nd_dbg_dpa(nd_region, ndd, res, "\n"); + + /* update label */ + offset = nd_label_offset(ndd, nd_label); + rc = nvdimm_set_config_data(ndd, offset, nd_label, + sizeof_namespace_label(ndd)); + if (rc < 0) + return rc; + + /* Garbage collect the previous label */ + mutex_lock(&nd_mapping->lock); + list_for_each_entry(label_ent, &nd_mapping->labels, list) { + if (!label_ent->label) + continue; + if (test_and_clear_bit(ND_LABEL_REAP, &label_ent->flags) || + nsl_uuid_equal(ndd, label_ent->label, nspm->uuid)) + reap_victim(nd_mapping, label_ent); + } + + /* update index */ + rc = nd_label_write_index(ndd, ndd->ns_next, + nd_inc_seq(__le32_to_cpu(nsindex->seq)), 0); + if (rc == 0) { + list_for_each_entry(label_ent, &nd_mapping->labels, list) + if (!label_ent->label) { + label_ent->label = nd_label; + nd_label = NULL; + break; + } + dev_WARN_ONCE(&nspm->nsio.common.dev, nd_label, + "failed to track label: %d\n", + to_slot(ndd, nd_label)); + if (nd_label) + rc = -ENXIO; + } + mutex_unlock(&nd_mapping->lock); + + return rc; +} + +static int init_labels(struct nd_mapping *nd_mapping, int num_labels) +{ + int i, old_num_labels = 0; + struct nd_label_ent *label_ent; + struct nd_namespace_index *nsindex; + struct nvdimm_drvdata *ndd = to_ndd(nd_mapping); + + mutex_lock(&nd_mapping->lock); + list_for_each_entry(label_ent, &nd_mapping->labels, list) + old_num_labels++; + mutex_unlock(&nd_mapping->lock); + + /* + * We need to preserve all the old labels for the mapping so + * they can be garbage collected after writing the new labels. + */ + for (i = old_num_labels; i < num_labels; i++) { + label_ent = kzalloc(sizeof(*label_ent), GFP_KERNEL); + if (!label_ent) + return -ENOMEM; + mutex_lock(&nd_mapping->lock); + list_add_tail(&label_ent->list, &nd_mapping->labels); + mutex_unlock(&nd_mapping->lock); + } + + if (ndd->ns_current == -1 || ndd->ns_next == -1) + /* pass */; + else + return max(num_labels, old_num_labels); + + nsindex = to_namespace_index(ndd, 0); + memset(nsindex, 0, ndd->nsarea.config_size); + for (i = 0; i < 2; i++) { + int rc = nd_label_write_index(ndd, i, 3 - i, ND_NSINDEX_INIT); + + if (rc) + return rc; + } + ndd->ns_next = 1; + ndd->ns_current = 0; + + return max(num_labels, old_num_labels); +} + +static int del_labels(struct nd_mapping *nd_mapping, uuid_t *uuid) +{ + struct nvdimm_drvdata *ndd = to_ndd(nd_mapping); + struct nd_label_ent *label_ent, *e; + struct nd_namespace_index *nsindex; + unsigned long *free; + LIST_HEAD(list); + u32 nslot, slot; + int active = 0; + + if (!uuid) + return 0; + + /* no index || no labels == nothing to delete */ + if (!preamble_next(ndd, &nsindex, &free, &nslot)) + return 0; + + mutex_lock(&nd_mapping->lock); + list_for_each_entry_safe(label_ent, e, &nd_mapping->labels, list) { + struct nd_namespace_label *nd_label = label_ent->label; + + if (!nd_label) + continue; + active++; + if (!nsl_uuid_equal(ndd, nd_label, uuid)) + continue; + active--; + slot = to_slot(ndd, nd_label); + nd_label_free_slot(ndd, slot); + dev_dbg(ndd->dev, "free: %d\n", slot); + list_move_tail(&label_ent->list, &list); + label_ent->label = NULL; + } + list_splice_tail_init(&list, &nd_mapping->labels); + + if (active == 0) { + nd_mapping_free_labels(nd_mapping); + dev_dbg(ndd->dev, "no more active labels\n"); + } + mutex_unlock(&nd_mapping->lock); + + return nd_label_write_index(ndd, ndd->ns_next, + nd_inc_seq(__le32_to_cpu(nsindex->seq)), 0); +} + +int nd_pmem_namespace_label_update(struct nd_region *nd_region, + struct nd_namespace_pmem *nspm, resource_size_t size) +{ + int i, rc; + + for (i = 0; i < nd_region->ndr_mappings; i++) { + struct nd_mapping *nd_mapping = &nd_region->mapping[i]; + struct nvdimm_drvdata *ndd = to_ndd(nd_mapping); + struct resource *res; + int count = 0; + + if (size == 0) { + rc = del_labels(nd_mapping, nspm->uuid); + if (rc) + return rc; + continue; + } + + for_each_dpa_resource(ndd, res) + if (strncmp(res->name, "pmem", 4) == 0) + count++; + WARN_ON_ONCE(!count); + + rc = init_labels(nd_mapping, count); + if (rc < 0) + return rc; + + rc = __pmem_label_update(nd_region, nd_mapping, nspm, i, + NSLABEL_FLAG_UPDATING); + if (rc) + return rc; + } + + if (size == 0) + return 0; + + /* Clear the UPDATING flag per UEFI 2.7 expectations */ + for (i = 0; i < nd_region->ndr_mappings; i++) { + struct nd_mapping *nd_mapping = &nd_region->mapping[i]; + + rc = __pmem_label_update(nd_region, nd_mapping, nspm, i, 0); + if (rc) + return rc; + } + + return 0; +} + +int __init nd_label_init(void) +{ + WARN_ON(guid_parse(NVDIMM_BTT_GUID, &nvdimm_btt_guid)); + WARN_ON(guid_parse(NVDIMM_BTT2_GUID, &nvdimm_btt2_guid)); + WARN_ON(guid_parse(NVDIMM_PFN_GUID, &nvdimm_pfn_guid)); + WARN_ON(guid_parse(NVDIMM_DAX_GUID, &nvdimm_dax_guid)); + + WARN_ON(uuid_parse(NVDIMM_BTT_GUID, &nvdimm_btt_uuid)); + WARN_ON(uuid_parse(NVDIMM_BTT2_GUID, &nvdimm_btt2_uuid)); + WARN_ON(uuid_parse(NVDIMM_PFN_GUID, &nvdimm_pfn_uuid)); + WARN_ON(uuid_parse(NVDIMM_DAX_GUID, &nvdimm_dax_uuid)); + + WARN_ON(uuid_parse(CXL_REGION_UUID, &cxl_region_uuid)); + WARN_ON(uuid_parse(CXL_NAMESPACE_UUID, &cxl_namespace_uuid)); + + return 0; +} |