author | 2023-02-21 18:24:12 -0800 | |
---|---|---|
committer | 2023-02-21 18:24:12 -0800 | |
commit | 5b7c4cabbb65f5c469464da6c5f614cbd7f730f2 (patch) | |
tree | cc5c2d0a898769fd59549594fedb3ee6f84e59a0 /tools/perf/util/probe-file.c | |
Merge tag 'net-next-6.3' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next
Pull networking updates from Jakub Kicinski:
"Core:
- Add dedicated kmem_cache for typical/small skb->head, avoid having
to access struct page at kfree time, and improve memory use.
- Introduce sysctl to set default RPS configuration for new netdevs.
- Define Netlink protocol specification format which can be used to
describe messages used by each family and auto-generate parsers.
Add tools for generating kernel data structures and uAPI headers.
- Expose all net/core sysctls inside netns.
- Remove 4s sleep in netpoll if carrier is instantly detected on
boot.
- Add configurable limit of MDB entries per port, and port-vlan.
- Continue populating drop reasons throughout the stack.
- Retire a handful of legacy Qdiscs and classifiers.
Protocols:
- Support IPv4 big TCP (TSO frames larger than 64kB).
- Add IP_LOCAL_PORT_RANGE socket option, to control local port range
on socket by socket basis.
- Track and report in procfs number of MPTCP sockets used.
- Support mixing IPv4 and IPv6 flows in the in-kernel MPTCP path
manager.
- IPv6: don't check net.ipv6.route.max_size and rely on garbage
collection to free memory (similarly to IPv4).
- Support Penultimate Segment Pop (PSP) flavor in SRv6 (RFC8986).
- ICMP: add per-rate limit counters.
- Add support for user scanning requests in ieee802154.
- Remove static WEP support.
- Support minimal Wi-Fi 7 Extremely High Throughput (EHT) rate
reporting.
- WiFi 7 EHT channel puncturing support (client & AP).
BPF:
- Add a rbtree data structure following the "next-gen data structure"
precedent set by recently added linked list, that is, by using
kfunc + kptr instead of adding a new BPF map type.
- Expose XDP hints via kfuncs with initial support for RX hash and
timestamp metadata.
- Add BPF_F_NO_TUNNEL_KEY extension to bpf_skb_set_tunnel_key to
better support decap on GRE tunnel devices not operating in collect
metadata.
- Improve x86 JIT's codegen for PROBE_MEM runtime error checks.
- Remove the need for trace_printk_lock for bpf_trace_printk and
bpf_trace_vprintk helpers.
- Extend libbpf's bpf_tracing.h support for tracing arguments of
kprobes/uprobes and syscall as a special case.
- Significantly reduce the search time for module symbols by
livepatch and BPF.
- Enable cpumasks to be used as kptrs, which is useful for tracing
programs tracking which tasks end up running on which CPUs in
different time intervals.
- Add support for BPF trampoline on s390x and riscv64.
- Add capability to export the XDP features supported by the NIC.
- Add __bpf_kfunc tag for marking kernel functions as kfuncs.
- Add cgroup.memory=nobpf kernel parameter option to disable BPF
memory accounting for container environments.
Netfilter:
- Remove the CLUSTERIP target. It has been marked as obsolete for
years, and we still have WARN splats wrt races of the out-of-band
/proc interface installed by this target.
- Add 'destroy' commands to nf_tables. They are identical to the
existing 'delete' commands, but do not return an error if the
referenced object (set, chain, rule...) did not exist.
Driver API:
- Improve cpumask_local_spread() locality to help NICs set the right
IRQ affinity on AMD platforms.
- Separate C22 and C45 MDIO bus transactions more clearly.
- Introduce new DCB table to control DSCP rewrite on egress.
- Support configuration of Physical Layer Collision Avoidance (PLCA)
Reconciliation Sublayer (RS) (802.3cg-2019). Modern version of
shared medium Ethernet.
- Support for MAC Merge layer (IEEE 802.3-2018 clause 99). Allowing
preemption of low priority frames by high priority frames.
- Add support for controlling MACSec offload using netlink SET.
- Rework devlink instance refcounts to allow registration and
de-registration under the instance lock. Split the code into
multiple files, drop some of the unnecessarily granular locks and
factor out common parts of netlink operation handling.
- Add TX frame aggregation parameters (for USB drivers).
- Add a new attr TCA_EXT_WARN_MSG to report TC (offload) warning
messages with notifications for debug.
- Allow offloading of UDP NEW connections via act_ct.
- Add support for per action HW stats in TC.
- Support hardware miss to TC action (continue processing in SW from
a specific point in the action chain).
- Warn if old Wireless Extension user space interface is used with
modern cfg80211/mac80211 drivers. Do not support Wireless
Extensions for Wi-Fi 7 devices at all. Everyone should switch to
using nl80211 interface instead.
- Improve the CAN bit timing configuration. Use extack to return
error messages directly to user space, update the SJW handling,
including the definition of a new default value that will benefit
CAN-FD controllers, by increasing their oscillator tolerance.
New hardware / drivers:
- Ethernet:
- nVidia BlueField-3 support (control traffic driver)
- Ethernet support for imx93 SoCs
- Motorcomm yt8531 gigabit Ethernet PHY
- onsemi NCN26000 10BASE-T1S PHY (with support for PLCA)
- Microchip LAN8841 PHY (incl. cable diagnostics and PTP)
- Amlogic gxl MDIO mux
- WiFi:
- RealTek RTL8188EU (rtl8xxxu)
- Qualcomm Wi-Fi 7 devices (ath12k)
- CAN:
- Renesas R-Car V4H
Drivers:
- Bluetooth:
- Set Per Platform Antenna Gain (PPAG) for Intel controllers.
- Ethernet NICs:
- Intel (1G, igc):
- support TSN / Qbv / packet scheduling features of i226 model
- Intel (100G, ice):
- use GNSS subsystem instead of TTY
- multi-buffer XDP support
- extend support for GPIO pins to E823 devices
- nVidia/Mellanox:
- update the shared buffer configuration on PFC commands
- implement PTP adjphase function for HW offset control
- TC support for Geneve and GRE with VF tunnel offload
- more efficient crypto key management method
- multi-port eswitch support
- Netronome/Corigine:
- add DCB IEEE support
- support IPsec offloading for NFP3800
- Freescale/NXP (enetc):
- support XDP_REDIRECT for XDP non-linear buffers
- improve reconfig, avoid link flap and waiting for idle
- support MAC Merge layer
- Other NICs:
- sfc/ef100: add basic devlink support for ef100
- ionic: rx_push mode operation (writing descriptors via MMIO)
- bnxt: use the auxiliary bus abstraction for RDMA
- r8169: disable ASPM and reset bus in case of tx timeout
- cpsw: support QSGMII mode for J721e CPSW9G
- cpts: support pulse-per-second output
- ngbe: add an mdio bus driver
- usbnet: optimize usbnet_bh() by avoiding unnecessary queuing
- r8152: handle devices with FW with NCM support
- amd-xgbe: support 10Mbps, 2.5GbE speeds and rx-adaptation
- virtio-net: support multi buffer XDP
- virtio/vsock: replace virtio_vsock_pkt with sk_buff
- tsnep: XDP support
- Ethernet high-speed switches:
- nVidia/Mellanox (mlxsw):
- add support for latency TLV (in FW control messages)
- Microchip (sparx5):
- separate explicit and implicit traffic forwarding rules, make
the implicit rules always active
- add support for egress DSCP rewrite
- IS0 VCAP support (Ingress Classification)
- IS2 VCAP filters (protos, L3 addrs, L4 ports, flags, ToS
etc.)
- ES2 VCAP support (Egress Access Control)
- support for Per-Stream Filtering and Policing (802.1Q,
8.6.5.1)
- Ethernet embedded switches:
- Marvell (mv88e6xxx):
- add MAB (port auth) offload support
- enable PTP receive for mv88e6390
- NXP (ocelot):
- support MAC Merge layer
- support for the vsc7512 internal copper phys
- Microchip:
- lan9303: convert to PHYLINK
- lan966x: support TC flower filter statistics
- lan937x: PTP support for KSZ9563/KSZ8563 and LAN937x
- lan937x: support Credit Based Shaper configuration
- ksz9477: support Energy Efficient Ethernet
- other:
- qca8k: convert to regmap read/write API, use bulk operations
- rswitch: Improve TX timestamp accuracy
- Intel WiFi (iwlwifi):
- EHT (Wi-Fi 7) rate reporting
- STEP equalizer support: transfer some STEP (connection to radio
on platforms with integrated wifi) related parameters from the
BIOS to the firmware.
- Qualcomm 802.11ax WiFi (ath11k):
- IPQ5018 support
- Fine Timing Measurement (FTM) responder role support
- channel 177 support
- MediaTek WiFi (mt76):
- per-PHY LED support
- mt7996: EHT (Wi-Fi 7) support
- Wireless Ethernet Dispatch (WED) reset support
- switch to using page pool allocator
- RealTek WiFi (rtw89):
- support new version of Bluetooth coexistence
- Mobile:
- rmnet: support TX aggregation"
* tag 'net-next-6.3' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (1872 commits)
page_pool: add a comment explaining the fragment counter usage
net: ethtool: fix __ethtool_dev_mm_supported() implementation
ethtool: pse-pd: Fix double word in comments
xsk: add linux/vmalloc.h to xsk.c
sefltests: netdevsim: wait for devlink instance after netns removal
selftest: fib_tests: Always cleanup before exit
net/mlx5e: Align IPsec ASO result memory to be as required by hardware
net/mlx5e: TC, Set CT miss to the specific ct action instance
net/mlx5e: Rename CHAIN_TO_REG to MAPPED_OBJ_TO_REG
net/mlx5: Refactor tc miss handling to a single function
net/mlx5: Kconfig: Make tc offload depend on tc skb extension
net/sched: flower: Support hardware miss to tc action
net/sched: flower: Move filter handle initialization earlier
net/sched: cls_api: Support hardware miss to tc action
net/sched: Rename user cookie and act cookie
sfc: fix builds without CONFIG_RTC_LIB
sfc: clean up some inconsistent indentings
net/mlx4_en: Introduce flexible array to silence overflow warning
net: lan966x: Fix possible deadlock inside PTP
net/ulp: Remove redundant ->clone() test in inet_clone_ulp().
...
Diffstat (limited to 'tools/perf/util/probe-file.c')
-rw-r--r-- | tools/perf/util/probe-file.c | 1200 |
1 files changed, 1200 insertions, 0 deletions
diff --git a/tools/perf/util/probe-file.c b/tools/perf/util/probe-file.c
new file mode 100644
index 000000000..3d50de321
--- /dev/null
+++ b/tools/perf/util/probe-file.c
@@ -0,0 +1,1200 @@
+// SPDX-License-Identifier: GPL-2.0-or-later
+/*
+ * probe-file.c : operate ftrace k/uprobe events files
+ *
+ * Written by Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
+ */
+#include <errno.h>
+#include <fcntl.h>
+#include <sys/stat.h>
+#include <sys/types.h>
+#include <sys/uio.h>
+#include <unistd.h>
+#include <linux/zalloc.h>
+#include "namespaces.h"
+#include "event.h"
+#include "strlist.h"
+#include "strfilter.h"
+#include "debug.h"
+#include "build-id.h"
+#include "dso.h"
+#include "color.h"
+#include "symbol.h"
+#include "strbuf.h"
+#include <api/fs/tracing_path.h>
+#include <api/fs/fs.h>
+#include "probe-event.h"
+#include "probe-file.h"
+#include "session.h"
+#include "perf_regs.h"
+#include "string2.h"
+
+/* 4096 - 2 ('\n' + '\0') */
+#define MAX_CMDLEN 4094
+
+static bool print_common_warning(int err, bool readwrite)
+{
+	if (err == -EACCES)
+		pr_warning("No permission to %s tracefs.\nPlease %s\n",
+			   readwrite ? "write" : "read",
+			   readwrite ? "run this command again with sudo." :
+				       "try 'sudo mount -o remount,mode=755 /sys/kernel/tracing/'");
+	else
+		return false;
+
+	return true;
+}
+
+static bool print_configure_probe_event(int kerr, int uerr)
+{
+	const char *config, *file;
+
+	if (kerr == -ENOENT && uerr == -ENOENT) {
+		file = "{k,u}probe_events";
+		config = "CONFIG_KPROBE_EVENTS=y and CONFIG_UPROBE_EVENTS=y";
+	} else if (kerr == -ENOENT) {
+		file = "kprobe_events";
+		config = "CONFIG_KPROBE_EVENTS=y";
+	} else if (uerr == -ENOENT) {
+		file = "uprobe_events";
+		config = "CONFIG_UPROBE_EVENTS=y";
+	} else
+		return false;
+
+	if (!debugfs__configured() && !tracefs__configured())
+		pr_warning("Debugfs or tracefs is not mounted\n"
+			   "Please try 'sudo mount -t tracefs nodev /sys/kernel/tracing/'\n");
+	else
+		pr_warning("%s/%s does not exist.\nPlease rebuild kernel with %s.\n",
+			   tracing_path_mount(), file, config);
+
+	return true;
+}
+
+static void print_open_warning(int err, bool uprobe, bool readwrite)
+{
+	char sbuf[STRERR_BUFSIZE];
+
+	if (print_common_warning(err, readwrite))
+		return;
+
+	if (print_configure_probe_event(uprobe ? 0 : err, uprobe ? err : 0))
+		return;
+
+	pr_warning("Failed to open %s/%cprobe_events: %s\n",
+		   tracing_path_mount(), uprobe ? 'u' : 'k',
+		   str_error_r(-err, sbuf, sizeof(sbuf)));
+}
+
+static void print_both_open_warning(int kerr, int uerr, bool readwrite)
+{
+	char sbuf[STRERR_BUFSIZE];
+
+	if (kerr == uerr && print_common_warning(kerr, readwrite))
+		return;
+
+	if (print_configure_probe_event(kerr, uerr))
+		return;
+
+	if (kerr < 0)
+		pr_warning("Failed to open %s/kprobe_events: %s.\n",
+			   tracing_path_mount(),
+			   str_error_r(-kerr, sbuf, sizeof(sbuf)));
+	if (uerr < 0)
+		pr_warning("Failed to open %s/uprobe_events: %s.\n",
+			   tracing_path_mount(),
+			   str_error_r(-uerr, sbuf, sizeof(sbuf)));
+}
+
+int open_trace_file(const char *trace_file, bool readwrite)
+{
+	char buf[PATH_MAX];
+	int ret;
+
+	ret = e_snprintf(buf, PATH_MAX, "%s/%s", tracing_path_mount(), trace_file);
+	if (ret >= 0) {
+		pr_debug("Opening %s write=%d\n", buf, readwrite);
+		if (readwrite && !probe_event_dry_run)
+			ret = open(buf, O_RDWR | O_APPEND, 0);
+		else
+			ret = open(buf, O_RDONLY, 0);
+
+		if (ret < 0)
+			ret = -errno;
+	}
+	return ret;
+}
+
+static int open_kprobe_events(bool readwrite)
+{
+	return open_trace_file("kprobe_events", readwrite);
+}
+
+static int open_uprobe_events(bool readwrite)
+{
+	return open_trace_file("uprobe_events", readwrite);
+}
+
+int probe_file__open(int flag)
+{
+	int fd;
+
+	if (flag & PF_FL_UPROBE)
+		fd = open_uprobe_events(flag & PF_FL_RW);
+	else
+		fd = open_kprobe_events(flag & PF_FL_RW);
+	if (fd < 0)
+		print_open_warning(fd, flag & PF_FL_UPROBE, flag & PF_FL_RW);
+
+	return fd;
+}
+
+int probe_file__open_both(int *kfd, int *ufd, int flag)
+{
+	if (!kfd || !ufd)
+		return -EINVAL;
+
+	*kfd = open_kprobe_events(flag & PF_FL_RW);
+	*ufd = open_uprobe_events(flag & PF_FL_RW);
+	if (*kfd < 0 && *ufd < 0) {
+		print_both_open_warning(*kfd, *ufd, flag & PF_FL_RW);
+		return *kfd;
+	}
+
+	return 0;
+}
+
+/* Get raw string list of current kprobe_events or uprobe_events */
+struct strlist *probe_file__get_rawlist(int fd)
+{
+	int ret, idx, fddup;
+	FILE *fp;
+	char buf[MAX_CMDLEN];
+	char *p;
+	struct strlist *sl;
+
+	if (fd < 0)
+		return NULL;
+
+	sl = strlist__new(NULL, NULL);
+	if (sl == NULL)
+		return NULL;
+
+	fddup = dup(fd);
+	if (fddup < 0)
+		goto out_free_sl;
+
+	fp = fdopen(fddup, "r");
+	if (!fp)
+		goto out_close_fddup;
+
+	while (!feof(fp)) {
+		p = fgets(buf, MAX_CMDLEN, fp);
+		if (!p)
+			break;
+
+		idx = strlen(p) - 1;
+		if (p[idx] == '\n')
+			p[idx] = '\0';
+		ret = strlist__add(sl, buf);
+		if (ret < 0) {
+			pr_debug("strlist__add failed (%d)\n", ret);
+			goto out_close_fp;
+		}
+	}
+	fclose(fp);
+
+	return sl;
+
+out_close_fp:
+	fclose(fp);
+	goto out_free_sl;
+out_close_fddup:
+	close(fddup);
+out_free_sl:
+	strlist__delete(sl);
+	return NULL;
+}
+
+static struct strlist *__probe_file__get_namelist(int fd, bool include_group)
+{
+	char buf[128];
+	struct strlist *sl, *rawlist;
+	struct str_node *ent;
+	struct probe_trace_event tev;
+	int ret = 0;
+
+	memset(&tev, 0, sizeof(tev));
+	rawlist = probe_file__get_rawlist(fd);
+	if (!rawlist)
+		return NULL;
+	sl = strlist__new(NULL, NULL);
+	strlist__for_each_entry(ent, rawlist) {
+		ret = parse_probe_trace_command(ent->s, &tev);
+		if (ret < 0)
+			break;
+		if (include_group) {
+			ret = e_snprintf(buf, 128, "%s:%s", tev.group,
+					 tev.event);
+			if (ret >= 0)
+				ret = strlist__add(sl, buf);
+		} else
+			ret = strlist__add(sl, tev.event);
+		clear_probe_trace_event(&tev);
+		/* Skip if there is same name multi-probe event in the list */
+		if (ret == -EEXIST)
+			ret = 0;
+		if (ret < 0)
+			break;
+	}
+	strlist__delete(rawlist);
+
+	if (ret < 0) {
+		strlist__delete(sl);
+		return NULL;
+	}
+	return sl;
+}
+
+/* Get current perf-probe event names */
+struct strlist *probe_file__get_namelist(int fd)
+{
+	return __probe_file__get_namelist(fd, false);
+}
+
+int probe_file__add_event(int fd, struct probe_trace_event *tev)
+{
+	int ret = 0;
+	char *buf = synthesize_probe_trace_command(tev);
+	char sbuf[STRERR_BUFSIZE];
+
+	if (!buf) {
+		pr_debug("Failed to synthesize probe trace event.\n");
+		return -EINVAL;
+	}
+
+	pr_debug("Writing event: %s\n", buf);
+	if (!probe_event_dry_run) {
+		if (write(fd, buf, strlen(buf)) < (int)strlen(buf)) {
+			ret = -errno;
+			pr_warning("Failed to write event: %s\n",
+				   str_error_r(errno, sbuf, sizeof(sbuf)));
+		}
+	}
+	free(buf);
+
+	return ret;
+}
+
+static int __del_trace_probe_event(int fd, struct str_node *ent)
+{
+	char *p;
+	char buf[128];
+	int ret;
+
+	/* Convert from perf-probe event to trace-probe event */
+	ret = e_snprintf(buf, 128, "-:%s", ent->s);
+	if (ret < 0)
+		goto error;
+
+	p = strchr(buf + 2, ':');
+	if (!p) {
+		pr_debug("Internal error: %s should have ':' but not.\n",
+			 ent->s);
+		ret = -ENOTSUP;
+		goto error;
+	}
+	*p = '/';
+
+	pr_debug("Writing event: %s\n", buf);
+	ret = write(fd, buf, strlen(buf));
+	if (ret < 0) {
+		ret = -errno;
+		goto error;
+	}
+
+	return 0;
+error:
+	pr_warning("Failed to delete event: %s\n",
+		   str_error_r(-ret, buf, sizeof(buf)));
+	return ret;
+}
+
+int probe_file__get_events(int fd, struct strfilter *filter,
+			   struct strlist *plist)
+{
+	struct strlist *namelist;
+	struct str_node *ent;
+	const char *p;
+	int ret = -ENOENT;
+
+	if (!plist)
+		return -EINVAL;
+
+	namelist = __probe_file__get_namelist(fd, true);
+	if (!namelist)
+		return -ENOENT;
+
+	strlist__for_each_entry(ent, namelist) {
+		p = strchr(ent->s, ':');
+		if ((p && strfilter__compare(filter, p + 1)) ||
+		    strfilter__compare(filter, ent->s)) {
+			ret = strlist__add(plist, ent->s);
+			if (ret == -ENOMEM) {
+				pr_err("strlist__add failed with -ENOMEM\n");
+				goto out;
+			}
+			ret = 0;
+		}
+	}
+out:
+	strlist__delete(namelist);
+
+	return ret;
+}
+
+int probe_file__del_strlist(int fd, struct strlist *namelist)
+{
+	int ret = 0;
+	struct str_node *ent;
+
+	strlist__for_each_entry(ent, namelist) {
+		ret = __del_trace_probe_event(fd, ent);
+		if (ret < 0)
+			break;
+	}
+	return ret;
+}
+
+int probe_file__del_events(int fd, struct strfilter *filter)
+{
+	struct strlist *namelist;
+	int ret;
+
+	namelist = strlist__new(NULL, NULL);
+	if (!namelist)
+		return -ENOMEM;
+
+	ret = probe_file__get_events(fd, filter, namelist);
+	if (ret < 0)
+		goto out;
+
+	ret = probe_file__del_strlist(fd, namelist);
+out:
+	strlist__delete(namelist);
+	return ret;
+}
+
+/* Caller must ensure to remove this entry from list */
+static void probe_cache_entry__delete(struct probe_cache_entry *entry)
+{
+	if (entry) {
+		BUG_ON(!list_empty(&entry->node));
+
+		strlist__delete(entry->tevlist);
+		clear_perf_probe_event(&entry->pev);
+		zfree(&entry->spev);
+		free(entry);
+	}
+}
+
+static struct probe_cache_entry *
+probe_cache_entry__new(struct perf_probe_event *pev)
+{
+	struct probe_cache_entry *entry = zalloc(sizeof(*entry));
+
+	if (entry) {
+		INIT_LIST_HEAD(&entry->node);
+		entry->tevlist = strlist__new(NULL, NULL);
+		if (!entry->tevlist)
+			zfree(&entry);
+		else if (pev) {
+			entry->spev = synthesize_perf_probe_command(pev);
+			if (!entry->spev ||
+			    perf_probe_event__copy(&entry->pev, pev) < 0) {
+				probe_cache_entry__delete(entry);
+				return NULL;
+			}
+		}
+	}
+
+	return entry;
+}
+
+int probe_cache_entry__get_event(struct probe_cache_entry *entry,
+				 struct probe_trace_event **tevs)
+{
+	struct probe_trace_event *tev;
+	struct str_node *node;
+	int ret, i;
+
+	ret = strlist__nr_entries(entry->tevlist);
+	if (ret > probe_conf.max_probes)
+		return -E2BIG;
+
+	*tevs = zalloc(ret * sizeof(*tev));
+	if (!*tevs)
+		return -ENOMEM;
+
+	i = 0;
+	strlist__for_each_entry(node, entry->tevlist) {
+		tev = &(*tevs)[i++];
+		ret = parse_probe_trace_command(node->s, tev);
+		if (ret < 0)
+			break;
+	}
+	return i;
+}
+
+/* For the kernel probe caches, pass target = NULL or DSO__NAME_KALLSYMS */
+static int probe_cache__open(struct probe_cache *pcache, const char *target,
+			     struct nsinfo *nsi)
+{
+	char cpath[PATH_MAX];
+	char sbuildid[SBUILD_ID_SIZE];
+	char *dir_name = NULL;
+	bool is_kallsyms = false;
+	int ret, fd;
+	struct nscookie nsc;
+
+	if (target && build_id_cache__cached(target)) {
+		/* This is a cached buildid */
+		strlcpy(sbuildid, target, SBUILD_ID_SIZE);
+		dir_name = build_id_cache__linkname(sbuildid, NULL, 0);
+		goto found;
+	}
+
+	if (!target || !strcmp(target, DSO__NAME_KALLSYMS)) {
+		target = DSO__NAME_KALLSYMS;
+		is_kallsyms = true;
+		ret = sysfs__sprintf_build_id("/", sbuildid);
+	} else {
+		nsinfo__mountns_enter(nsi, &nsc);
+		ret = filename__sprintf_build_id(target, sbuildid);
+		nsinfo__mountns_exit(&nsc);
+	}
+
+	if (ret < 0) {
+		pr_debug("Failed to get build-id from %s.\n", target);
+		return ret;
+	}
+
+	/* If we have no buildid cache, make it */
+	if (!build_id_cache__cached(sbuildid)) {
+		ret = build_id_cache__add_s(sbuildid, target, nsi,
+					    is_kallsyms, NULL);
+		if (ret < 0) {
+			pr_debug("Failed to add build-id cache: %s\n", target);
+			return ret;
+		}
+	}
+
+	dir_name = build_id_cache__cachedir(sbuildid, target, nsi, is_kallsyms,
+					    false);
+found:
+	if (!dir_name) {
+		pr_debug("Failed to get cache from %s\n", target);
+		return -ENOMEM;
+	}
+
+	snprintf(cpath, PATH_MAX, "%s/probes", dir_name);
+	fd = open(cpath, O_CREAT | O_RDWR, 0644);
+	if (fd < 0)
+		pr_debug("Failed to open cache(%d): %s\n", fd, cpath);
+	free(dir_name);
+	pcache->fd = fd;
+
+	return fd;
+}
+
+static int probe_cache__load(struct probe_cache *pcache)
+{
+	struct probe_cache_entry *entry = NULL;
+	char buf[MAX_CMDLEN], *p;
+	int ret = 0, fddup;
+	FILE *fp;
+
+	fddup = dup(pcache->fd);
+	if (fddup < 0)
+		return -errno;
+	fp = fdopen(fddup, "r");
+	if (!fp) {
+		close(fddup);
+		return -EINVAL;
+	}
+
+	while (!feof(fp)) {
+		if (!fgets(buf, MAX_CMDLEN, fp))
+			break;
+		p = strchr(buf, '\n');
+		if (p)
+			*p = '\0';
+		/* #perf_probe_event or %sdt_event */
+		if (buf[0] == '#' || buf[0] == '%') {
+			entry = probe_cache_entry__new(NULL);
+			if (!entry) {
+				ret = -ENOMEM;
+				goto out;
+			}
+			if (buf[0] == '%')
+				entry->sdt = true;
+			entry->spev = strdup(buf + 1);
+			if (entry->spev)
+				ret = parse_perf_probe_command(buf + 1,
+							       &entry->pev);
+			else
+				ret = -ENOMEM;
+			if (ret < 0) {
+				probe_cache_entry__delete(entry);
+				goto out;
+			}
+			list_add_tail(&entry->node, &pcache->entries);
+		} else {	/* trace_probe_event */
+			if (!entry) {
+				ret = -EINVAL;
+				goto out;
+			}
+			ret = strlist__add(entry->tevlist, buf);
+			if (ret == -ENOMEM) {
+				pr_err("strlist__add failed with -ENOMEM\n");
+				goto out;
+			}
+		}
+	}
+out:
+	fclose(fp);
+	return ret;
+}
+
+static struct probe_cache *probe_cache__alloc(void)
+{
+	struct probe_cache *pcache = zalloc(sizeof(*pcache));
+
+	if (pcache) {
+		INIT_LIST_HEAD(&pcache->entries);
+		pcache->fd = -EINVAL;
+	}
+	return pcache;
+}
+
+void probe_cache__purge(struct probe_cache *pcache)
+{
+	struct probe_cache_entry *entry, *n;
+
+	list_for_each_entry_safe(entry, n, &pcache->entries, node) {
+		list_del_init(&entry->node);
+		probe_cache_entry__delete(entry);
+	}
+}
+
+void probe_cache__delete(struct probe_cache *pcache)
+{
+	if (!pcache)
+		return;
+
+	probe_cache__purge(pcache);
+	if (pcache->fd > 0)
+		close(pcache->fd);
+	free(pcache);
+}
+
+struct probe_cache *probe_cache__new(const char *target, struct nsinfo *nsi)
+{
+	struct probe_cache *pcache = probe_cache__alloc();
+	int ret;
+
+	if (!pcache)
+		return NULL;
+
+	ret = probe_cache__open(pcache, target, nsi);
+	if (ret < 0) {
+		pr_debug("Cache open error: %d\n", ret);
+		goto out_err;
+	}
+
+	ret = probe_cache__load(pcache);
+	if (ret < 0) {
+		pr_debug("Cache read error: %d\n", ret);
+		goto out_err;
+	}
+
+	return pcache;
+
+out_err:
+	probe_cache__delete(pcache);
+	return NULL;
+}
+
+static bool streql(const char *a, const char *b)
+{
+	if (a == b)
+		return true;
+
+	if (!a || !b)
+		return false;
+
+	return !strcmp(a, b);
+}
+
+struct probe_cache_entry *
+probe_cache__find(struct probe_cache *pcache, struct perf_probe_event *pev)
+{
+	struct probe_cache_entry *entry = NULL;
+	char *cmd = synthesize_perf_probe_command(pev);
+
+	if (!cmd)
+		return NULL;
+
+	for_each_probe_cache_entry(entry, pcache) {
+		if (pev->sdt) {
+			if (entry->pev.event &&
+			    streql(entry->pev.event, pev->event) &&
+			    (!pev->group ||
+			     streql(entry->pev.group, pev->group)))
+				goto found;
+
+			continue;
+		}
+		/* Hit if same event name or same command-string */
+		if ((pev->event &&
+		     (streql(entry->pev.group, pev->group) &&
+		      streql(entry->pev.event, pev->event))) ||
+		    (!strcmp(entry->spev, cmd)))
+			goto found;
+	}
+	entry = NULL;
+
+found:
+	free(cmd);
+	return entry;
+}
+
+struct probe_cache_entry *
+probe_cache__find_by_name(struct probe_cache *pcache,
+			  const char *group, const char *event)
+{
+	struct probe_cache_entry *entry = NULL;
+
+	for_each_probe_cache_entry(entry, pcache) {
+		/* Hit if same event name or same command-string */
+		if (streql(entry->pev.group, group) &&
+		    streql(entry->pev.event, event))
+			goto found;
+	}
+	entry = NULL;
+
+found:
+	return entry;
+}
+
+int probe_cache__add_entry(struct probe_cache *pcache,
+			   struct perf_probe_event *pev,
+			   struct probe_trace_event *tevs, int ntevs)
+{
+	struct probe_cache_entry *entry = NULL;
+	char *command;
+	int i, ret = 0;
+
+	if (!pcache || !pev || !tevs || ntevs <= 0) {
+		ret = -EINVAL;
+		goto out_err;
+	}
+
+	/* Remove old cache entry */
+	entry = probe_cache__find(pcache, pev);
+	if (entry) {
+		list_del_init(&entry->node);
+		probe_cache_entry__delete(entry);
+	}
+
+	ret = -ENOMEM;
+	entry = probe_cache_entry__new(pev);
+	if (!entry)
+		goto out_err;
+
+	for (i = 0; i < ntevs; i++) {
+		if (!tevs[i].point.symbol)
+			continue;
+
+		command = synthesize_probe_trace_command(&tevs[i]);
+		if (!command)
+			goto out_err;
+		ret = strlist__add(entry->tevlist, command);
+		if (ret == -ENOMEM) {
+			pr_err("strlist__add failed with -ENOMEM\n");
+			goto out_err;
+		}
+
+		free(command);
+	}
+	list_add_tail(&entry->node, &pcache->entries);
+	pr_debug("Added probe cache: %d\n", ntevs);
+	return 0;
+
+out_err:
+	pr_debug("Failed to add probe caches\n");
+	probe_cache_entry__delete(entry);
+	return ret;
+}
+
+#ifdef HAVE_GELF_GETNOTE_SUPPORT
+static unsigned long long sdt_note__get_addr(struct sdt_note *note)
+{
+	return note->bit32 ?
+		(unsigned long long)note->addr.a32[SDT_NOTE_IDX_LOC] :
+		(unsigned long long)note->addr.a64[SDT_NOTE_IDX_LOC];
+}
+
+static unsigned long long sdt_note__get_ref_ctr_offset(struct sdt_note *note)
+{
+	return note->bit32 ?
+		(unsigned long long)note->addr.a32[SDT_NOTE_IDX_REFCTR] :
+		(unsigned long long)note->addr.a64[SDT_NOTE_IDX_REFCTR];
+}
+
+static const char * const type_to_suffix[] = {
+	":s64", "", "", "", ":s32", "", ":s16", ":s8",
+	"", ":u8", ":u16", "", ":u32", "", "", "", ":u64"
+};
+
+/*
+ * Isolate the string number and convert it into a decimal value;
+ * this will be an index to get suffix of the uprobe name (defining
+ * the type)
+ */
+static int sdt_arg_parse_size(char *n_ptr, const char **suffix)
+{
+	long type_idx;
+
+	type_idx = strtol(n_ptr, NULL, 10);
+	if (type_idx < -8 || type_idx > 8) {
+		pr_debug4("Failed to get a valid sdt type\n");
+		return -1;
+	}
+
+	*suffix = type_to_suffix[type_idx + 8];
+	return 0;
+}
+
+static int synthesize_sdt_probe_arg(struct strbuf *buf, int i, const char *arg)
+{
+	char *op, *desc = strdup(arg), *new_op = NULL;
+	const char *suffix = "";
+	int ret = -1;
+
+	if (desc == NULL) {
+		pr_debug4("Allocation error\n");
+		return ret;
+	}
+
+	/*
+	 * Argument is in N@OP format. N is size of the argument and OP is
+	 * the actual assembly operand. N can be omitted; in that case
+	 * argument is just OP(without @).
+	 */
+	op = strchr(desc, '@');
+	if (op) {
+		op[0] = '\0';
+		op++;
+
+		if (sdt_arg_parse_size(desc, &suffix))
+			goto error;
+	} else {
+		op = desc;
+	}
+
+	ret = arch_sdt_arg_parse_op(op, &new_op);
+
+	if (ret < 0)
+		goto error;
+
+	if (ret == SDT_ARG_VALID) {
+		ret = strbuf_addf(buf, " arg%d=%s%s", i + 1, new_op, suffix);
+		if (ret < 0)
+			goto error;
+	}
+
+	ret = 0;
+error:
+	free(desc);
+	free(new_op);
+	return ret;
+}
+
+static char *synthesize_sdt_probe_command(struct sdt_note *note,
+					  const char *pathname,
+					  const char *sdtgrp)
+{
+	struct strbuf buf;
+	char *ret = NULL;
+	int i, args_count, err;
+	unsigned long long ref_ctr_offset;
+	char *arg;
+	int arg_idx = 0;
+
+	if (strbuf_init(&buf, 32) < 0)
+		return NULL;
+
+	err = strbuf_addf(&buf, "p:%s/%s %s:0x%llx",
+			  sdtgrp, note->name, pathname,
+			  sdt_note__get_addr(note));
+
+	ref_ctr_offset = sdt_note__get_ref_ctr_offset(note);
+	if (ref_ctr_offset && err >= 0)
+		err = strbuf_addf(&buf, "(0x%llx)", ref_ctr_offset);
+
+	if (err < 0)
+		goto error;
+
+	if (!note->args)
+		goto out;
+
+	if (note->args) {
+		char **args = argv_split(note->args, &args_count);
+
+		if (args == NULL)
+			goto error;
+
+		for (i = 0; i < args_count; ) {
+			/*
+			 * FIXUP: Arm64 ELF section '.note.stapsdt' uses string
+			 * format "-4@[sp, NUM]" if a probe is to access data in
+			 * the stack, e.g. below is an example for the SDT
+			 * Arguments:
+			 *
+			 *   Arguments: -4@[sp, 12] -4@[sp, 8] -4@[sp, 4]
+			 *
+			 * Since the string introduces an extra space character
+			 * in the middle of square brackets, the argument is
+			 * divided into two items. Fixup for this case, if an
+			 * item contains sub string "[sp,", need to concatenate
+			 * the two items.
+			 */
+			if (strstr(args[i], "[sp,") && (i+1) < args_count) {
+				err = asprintf(&arg, "%s %s", args[i], args[i+1]);
+				i += 2;
+			} else {
+				err = asprintf(&arg, "%s", args[i]);
+				i += 1;
+			}
+
+			/* Failed to allocate memory */
+			if (err < 0) {
+				argv_free(args);
+				goto error;
+			}
+
+			if (synthesize_sdt_probe_arg(&buf, arg_idx, arg) < 0) {
+				free(arg);
+				argv_free(args);
+				goto error;
+			}
+
+			free(arg);
+			arg_idx++;
+		}
+
+		argv_free(args);
+	}
+
+out:
+	ret = strbuf_detach(&buf, NULL);
+error:
+	strbuf_release(&buf);
+	return ret;
+}
+
+int probe_cache__scan_sdt(struct probe_cache *pcache, const char *pathname)
+{
+	struct probe_cache_entry *entry = NULL;
+	struct list_head sdtlist;
+	struct sdt_note *note;
+	char *buf;
+	char sdtgrp[64];
+	int ret;
+
+	INIT_LIST_HEAD(&sdtlist);
+	ret = get_sdt_note_list(&sdtlist, pathname);
+	if (ret < 0) {
+		pr_debug4("Failed to get sdt note: %d\n", ret);
+		return ret;
+	}
+	list_for_each_entry(note, &sdtlist, note_list) {
+		ret = snprintf(sdtgrp, 64, "sdt_%s", note->provider);
+		if (ret < 0)
+			break;
+		/* Try to find same-name entry */
+		entry = probe_cache__find_by_name(pcache, sdtgrp, note->name);
+		if (!entry) {
+			entry = probe_cache_entry__new(NULL);
+			if (!entry) {
+				ret = -ENOMEM;
+				break;
+			}
+			entry->sdt = true;
+			ret = asprintf(&entry->spev, "%s:%s=%s", sdtgrp,
+				       note->name, note->name);
+			if (ret < 0)
+				break;
+			entry->pev.event = strdup(note->name);
+			entry->pev.group = strdup(sdtgrp);
+			list_add_tail(&entry->node, &pcache->entries);
+		}
+		buf = synthesize_sdt_probe_command(note, pathname, sdtgrp);
+		if (!buf) {
+			ret = -ENOMEM;
+			break;
+		}
+
+		ret = strlist__add(entry->tevlist, buf);
+
+		free(buf);
+		entry = NULL;
+
+		if (ret == -ENOMEM) {
+			pr_err("strlist__add failed with -ENOMEM\n");
+			break;
+		}
+	}
+	if (entry) {
+		list_del_init(&entry->node);
+		probe_cache_entry__delete(entry);
+	}
+	cleanup_sdt_note_list(&sdtlist);
+	return ret;
+}
+#endif
+
+static int probe_cache_entry__write(struct probe_cache_entry *entry, int fd)
+{
+	struct str_node *snode;
+	struct stat st;
+	struct iovec iov[3];
+	const char *prefix = entry->sdt ? "%" : "#";
+	int ret;
+	/* Save stat for rollback */
+	ret = fstat(fd, &st);
+	if (ret < 0)
+		return ret;
+
+	pr_debug("Writing cache: %s%s\n", prefix, entry->spev);
+	iov[0].iov_base = (void *)prefix; iov[0].iov_len = 1;
+	iov[1].iov_base = entry->spev; iov[1].iov_len = strlen(entry->spev);
+	iov[2].iov_base = (void *)"\n"; iov[2].iov_len = 1;
+	ret = writev(fd, iov, 3);
+	if (ret < (int)iov[1].iov_len + 2)
+		goto rollback;
+
+	strlist__for_each_entry(snode, entry->tevlist) {
+		iov[0].iov_base = (void *)snode->s;
+		iov[0].iov_len = strlen(snode->s);
+		iov[1].iov_base = (void *)"\n"; iov[1].iov_len = 1;
+		ret = writev(fd, iov, 2);
+		if (ret < (int)iov[0].iov_len + 1)
+			goto rollback;
+	}
+	return 0;
+
+rollback:
+	/* Rollback to avoid cache file corruption */
+	if (ret > 0)
+		ret = -1;
+	if (ftruncate(fd, st.st_size) < 0)
+		ret = -2;
+
+	return ret;
+}
+
+int probe_cache__commit(struct probe_cache *pcache)
+{
+	struct probe_cache_entry *entry;
+	int ret = 0;
+
+	/* TBD: if we do not update existing entries, skip it */
+	ret = lseek(pcache->fd, 0, SEEK_SET);
+	if (ret < 0)
+		goto out;
+
+	ret = ftruncate(pcache->fd, 0);
+	if (ret < 0)
+		goto out;
+
+	for_each_probe_cache_entry(entry, pcache) {
+		ret = probe_cache_entry__write(entry, pcache->fd);
+		pr_debug("Cache committed: %d\n", ret);
+		if (ret < 0)
+			break;
+	}
+out:
+	return ret;
+}
+
+static bool probe_cache_entry__compare(struct probe_cache_entry *entry,
+				       struct strfilter *filter)
+{
+	char buf[128], *ptr = entry->spev;
+
+	if (entry->pev.event) {
+		snprintf(buf, 128, "%s:%s", entry->pev.group, entry->pev.event);
+		ptr = buf;
+	}
+	return strfilter__compare(filter, ptr);
+}
+
+int probe_cache__filter_purge(struct probe_cache *pcache,
+			      struct strfilter *filter)
+{
+	struct probe_cache_entry *entry, *tmp;
+
+	list_for_each_entry_safe(entry, tmp, &pcache->entries, node) {
+		if (probe_cache_entry__compare(entry, filter)) {
+			pr_info("Removed cached event: %s\n", entry->spev);
+			list_del_init(&entry->node);
+			probe_cache_entry__delete(entry);
+		}
+	}
+	return 0;
+}
+
+static int probe_cache__show_entries(struct probe_cache *pcache,
+				     struct strfilter *filter)
+{
+	struct probe_cache_entry *entry;
+
+	for_each_probe_cache_entry(entry, pcache) {
+		if (probe_cache_entry__compare(entry, filter))
+			printf("%s\n", entry->spev);
+	}
+	return 0;
+}
+
+/* Show all cached probes */
+int probe_cache__show_all_caches(struct strfilter *filter)
+{
+	struct probe_cache *pcache;
+	struct strlist *bidlist;
+	struct str_node *nd;
+	char *buf = strfilter__string(filter);
+
+	pr_debug("list cache with filter: %s\n", buf);
+	free(buf);
+
+	bidlist = build_id_cache__list_all(true);
+	if (!bidlist) {
+		pr_debug("Failed to get buildids: %d\n", errno);
+		return -EINVAL;
+	}
+	strlist__for_each_entry(nd, bidlist) {
+		pcache = probe_cache__new(nd->s, NULL);
+		if (!pcache)
+			continue;
+		if (!list_empty(&pcache->entries)) {
+			buf = build_id_cache__origname(nd->s);
+			printf("%s (%s):\n", buf, nd->s);
+			free(buf);
+			probe_cache__show_entries(pcache, filter);
+		}
+		probe_cache__delete(pcache);
+	}
+	strlist__delete(bidlist);
+
+	return 0;
+}
+
+enum ftrace_readme {
+	FTRACE_README_PROBE_TYPE_X = 0,
+	FTRACE_README_KRETPROBE_OFFSET,
+	FTRACE_README_UPROBE_REF_CTR,
+	FTRACE_README_USER_ACCESS,
+	FTRACE_README_MULTIPROBE_EVENT,
+	FTRACE_README_IMMEDIATE_VALUE,
+	FTRACE_README_END,
+};
+
+static struct {
+	const char *pattern;
+	bool avail;
+} ftrace_readme_table[] = {
+#define DEFINE_TYPE(idx, pat)			\
+	[idx] = {.pattern = pat, .avail = false}
+	DEFINE_TYPE(FTRACE_README_PROBE_TYPE_X, "*type: * x8/16/32/64,*"),
+	DEFINE_TYPE(FTRACE_README_KRETPROBE_OFFSET, "*place (kretprobe): *"),
+	DEFINE_TYPE(FTRACE_README_UPROBE_REF_CTR, "*ref_ctr_offset*"),
+	DEFINE_TYPE(FTRACE_README_USER_ACCESS, "*u]<offset>*"),
+	DEFINE_TYPE(FTRACE_README_MULTIPROBE_EVENT, "*Create/append/*"),
+	DEFINE_TYPE(FTRACE_README_IMMEDIATE_VALUE, "*\\imm-value,*"),
+};
+
+static bool scan_ftrace_readme(enum ftrace_readme type)
+{
+	int fd;
+	FILE *fp;
+	char *buf = NULL;
+	size_t len = 0;
+	bool ret = false;
+	static bool scanned = false;
+
+	if (scanned)
+		goto result;
+
+	fd = open_trace_file("README", false);
+	if (fd < 0)
+		return ret;
+
+	fp = fdopen(fd, "r");
+	if (!fp) {
+		close(fd);
+		return ret;
+	}
+
+	while (getline(&buf, &len, fp) > 0)
+		for (enum ftrace_readme i = 0; i < FTRACE_README_END; i++)
+			if (!ftrace_readme_table[i].avail)
+				ftrace_readme_table[i].avail =
+					strglobmatch(buf, ftrace_readme_table[i].pattern);
+	scanned = true;
+
+	fclose(fp);
+	free(buf);
+
+result:
+	if (type >= FTRACE_README_END)
+		return false;
+
+	return ftrace_readme_table[type].avail;
+}
+
+bool probe_type_is_available(enum probe_type type)
+{
+	if (type >= PROBE_TYPE_END)
+		return false;
+	else if (type == PROBE_TYPE_X)
+		return scan_ftrace_readme(FTRACE_README_PROBE_TYPE_X);
+
+	return true;
+}
+
+bool kretprobe_offset_is_supported(void)
+{
+	return scan_ftrace_readme(FTRACE_README_KRETPROBE_OFFSET);
+}
+
+bool uprobe_ref_ctr_is_supported(void)
+{
+	return scan_ftrace_readme(FTRACE_README_UPROBE_REF_CTR);
+}
+
+bool user_access_is_supported(void)
+{
+	return scan_ftrace_readme(FTRACE_README_USER_ACCESS);
+}
+
+bool multiprobe_event_is_supported(void)
+{
+	return scan_ftrace_readme(FTRACE_README_MULTIPROBE_EVENT);
+}
+
+bool immediate_value_is_supported(void)
+{
+	return scan_ftrace_readme(FTRACE_README_IMMEDIATE_VALUE);
+}