diff options
author | 2023-02-21 18:24:12 -0800 | |
---|---|---|
committer | 2023-02-21 18:24:12 -0800 | |
commit | 5b7c4cabbb65f5c469464da6c5f614cbd7f730f2 (patch) | |
tree | cc5c2d0a898769fd59549594fedb3ee6f84e59a0 /tools/perf/util/annotate.h | |
download | linux-5b7c4cabbb65f5c469464da6c5f614cbd7f730f2.tar.gz linux-5b7c4cabbb65f5c469464da6c5f614cbd7f730f2.zip |
Merge tag 'net-next-6.3' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-nextgrafted
Pull networking updates from Jakub Kicinski:
"Core:
- Add dedicated kmem_cache for typical/small skb->head, avoid having
to access struct page at kfree time, and improve memory use.
- Introduce sysctl to set default RPS configuration for new netdevs.
- Define Netlink protocol specification format which can be used to
describe messages used by each family and auto-generate parsers.
Add tools for generating kernel data structures and uAPI headers.
- Expose all net/core sysctls inside netns.
- Remove 4s sleep in netpoll if carrier is instantly detected on
boot.
- Add configurable limit of MDB entries per port, and port-vlan.
- Continue populating drop reasons throughout the stack.
- Retire a handful of legacy Qdiscs and classifiers.
Protocols:
- Support IPv4 big TCP (TSO frames larger than 64kB).
- Add IP_LOCAL_PORT_RANGE socket option, to control local port range
on socket by socket basis.
- Track and report in procfs number of MPTCP sockets used.
- Support mixing IPv4 and IPv6 flows in the in-kernel MPTCP path
manager.
- IPv6: don't check net.ipv6.route.max_size and rely on garbage
collection to free memory (similarly to IPv4).
- Support Penultimate Segment Pop (PSP) flavor in SRv6 (RFC8986).
- ICMP: add per-rate limit counters.
- Add support for user scanning requests in ieee802154.
- Remove static WEP support.
- Support minimal Wi-Fi 7 Extremely High Throughput (EHT) rate
reporting.
- WiFi 7 EHT channel puncturing support (client & AP).
BPF:
- Add a rbtree data structure following the "next-gen data structure"
precedent set by recently added linked list, that is, by using
kfunc + kptr instead of adding a new BPF map type.
- Expose XDP hints via kfuncs with initial support for RX hash and
timestamp metadata.
- Add BPF_F_NO_TUNNEL_KEY extension to bpf_skb_set_tunnel_key to
better support decap on GRE tunnel devices not operating in collect
metadata.
- Improve x86 JIT's codegen for PROBE_MEM runtime error checks.
- Remove the need for trace_printk_lock for bpf_trace_printk and
bpf_trace_vprintk helpers.
- Extend libbpf's bpf_tracing.h support for tracing arguments of
kprobes/uprobes and syscall as a special case.
- Significantly reduce the search time for module symbols by
livepatch and BPF.
- Enable cpumasks to be used as kptrs, which is useful for tracing
programs tracking which tasks end up running on which CPUs in
different time intervals.
- Add support for BPF trampoline on s390x and riscv64.
- Add capability to export the XDP features supported by the NIC.
- Add __bpf_kfunc tag for marking kernel functions as kfuncs.
- Add cgroup.memory=nobpf kernel parameter option to disable BPF
memory accounting for container environments.
Netfilter:
- Remove the CLUSTERIP target. It has been marked as obsolete for
years, and we still have WARN splats wrt races of the out-of-band
/proc interface installed by this target.
- Add 'destroy' commands to nf_tables. They are identical to the
existing 'delete' commands, but do not return an error if the
referenced object (set, chain, rule...) did not exist.
Driver API:
- Improve cpumask_local_spread() locality to help NICs set the right
IRQ affinity on AMD platforms.
- Separate C22 and C45 MDIO bus transactions more clearly.
- Introduce new DCB table to control DSCP rewrite on egress.
- Support configuration of Physical Layer Collision Avoidance (PLCA)
Reconciliation Sublayer (RS) (802.3cg-2019). Modern version of
shared medium Ethernet.
- Support for MAC Merge layer (IEEE 802.3-2018 clause 99). Allowing
preemption of low priority frames by high priority frames.
- Add support for controlling MACSec offload using netlink SET.
- Rework devlink instance refcounts to allow registration and
de-registration under the instance lock. Split the code into
multiple files, drop some of the unnecessarily granular locks and
factor out common parts of netlink operation handling.
- Add TX frame aggregation parameters (for USB drivers).
- Add a new attr TCA_EXT_WARN_MSG to report TC (offload) warning
messages with notifications for debug.
- Allow offloading of UDP NEW connections via act_ct.
- Add support for per action HW stats in TC.
- Support hardware miss to TC action (continue processing in SW from
a specific point in the action chain).
- Warn if old Wireless Extension user space interface is used with
modern cfg80211/mac80211 drivers. Do not support Wireless
Extensions for Wi-Fi 7 devices at all. Everyone should switch to
using nl80211 interface instead.
- Improve the CAN bit timing configuration. Use extack to return
error messages directly to user space, update the SJW handling,
including the definition of a new default value that will benefit
CAN-FD controllers, by increasing their oscillator tolerance.
New hardware / drivers:
- Ethernet:
- nVidia BlueField-3 support (control traffic driver)
- Ethernet support for imx93 SoCs
- Motorcomm yt8531 gigabit Ethernet PHY
- onsemi NCN26000 10BASE-T1S PHY (with support for PLCA)
- Microchip LAN8841 PHY (incl. cable diagnostics and PTP)
- Amlogic gxl MDIO mux
- WiFi:
- RealTek RTL8188EU (rtl8xxxu)
- Qualcomm Wi-Fi 7 devices (ath12k)
- CAN:
- Renesas R-Car V4H
Drivers:
- Bluetooth:
- Set Per Platform Antenna Gain (PPAG) for Intel controllers.
- Ethernet NICs:
- Intel (1G, igc):
- support TSN / Qbv / packet scheduling features of i226 model
- Intel (100G, ice):
- use GNSS subsystem instead of TTY
- multi-buffer XDP support
- extend support for GPIO pins to E823 devices
- nVidia/Mellanox:
- update the shared buffer configuration on PFC commands
- implement PTP adjphase function for HW offset control
- TC support for Geneve and GRE with VF tunnel offload
- more efficient crypto key management method
- multi-port eswitch support
- Netronome/Corigine:
- add DCB IEEE support
- support IPsec offloading for NFP3800
- Freescale/NXP (enetc):
- support XDP_REDIRECT for XDP non-linear buffers
- improve reconfig, avoid link flap and waiting for idle
- support MAC Merge layer
- Other NICs:
- sfc/ef100: add basic devlink support for ef100
- ionic: rx_push mode operation (writing descriptors via MMIO)
- bnxt: use the auxiliary bus abstraction for RDMA
- r8169: disable ASPM and reset bus in case of tx timeout
- cpsw: support QSGMII mode for J721e CPSW9G
- cpts: support pulse-per-second output
- ngbe: add an mdio bus driver
- usbnet: optimize usbnet_bh() by avoiding unnecessary queuing
- r8152: handle devices with FW with NCM support
- amd-xgbe: support 10Mbps, 2.5GbE speeds and rx-adaptation
- virtio-net: support multi buffer XDP
- virtio/vsock: replace virtio_vsock_pkt with sk_buff
- tsnep: XDP support
- Ethernet high-speed switches:
- nVidia/Mellanox (mlxsw):
- add support for latency TLV (in FW control messages)
- Microchip (sparx5):
- separate explicit and implicit traffic forwarding rules, make
the implicit rules always active
- add support for egress DSCP rewrite
- IS0 VCAP support (Ingress Classification)
- IS2 VCAP filters (protos, L3 addrs, L4 ports, flags, ToS
etc.)
- ES2 VCAP support (Egress Access Control)
- support for Per-Stream Filtering and Policing (802.1Q,
8.6.5.1)
- Ethernet embedded switches:
- Marvell (mv88e6xxx):
- add MAB (port auth) offload support
- enable PTP receive for mv88e6390
- NXP (ocelot):
- support MAC Merge layer
- support for the the vsc7512 internal copper phys
- Microchip:
- lan9303: convert to PHYLINK
- lan966x: support TC flower filter statistics
- lan937x: PTP support for KSZ9563/KSZ8563 and LAN937x
- lan937x: support Credit Based Shaper configuration
- ksz9477: support Energy Efficient Ethernet
- other:
- qca8k: convert to regmap read/write API, use bulk operations
- rswitch: Improve TX timestamp accuracy
- Intel WiFi (iwlwifi):
- EHT (Wi-Fi 7) rate reporting
- STEP equalizer support: transfer some STEP (connection to radio
on platforms with integrated wifi) related parameters from the
BIOS to the firmware.
- Qualcomm 802.11ax WiFi (ath11k):
- IPQ5018 support
- Fine Timing Measurement (FTM) responder role support
- channel 177 support
- MediaTek WiFi (mt76):
- per-PHY LED support
- mt7996: EHT (Wi-Fi 7) support
- Wireless Ethernet Dispatch (WED) reset support
- switch to using page pool allocator
- RealTek WiFi (rtw89):
- support new version of Bluetooth co-existance
- Mobile:
- rmnet: support TX aggregation"
* tag 'net-next-6.3' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (1872 commits)
page_pool: add a comment explaining the fragment counter usage
net: ethtool: fix __ethtool_dev_mm_supported() implementation
ethtool: pse-pd: Fix double word in comments
xsk: add linux/vmalloc.h to xsk.c
sefltests: netdevsim: wait for devlink instance after netns removal
selftest: fib_tests: Always cleanup before exit
net/mlx5e: Align IPsec ASO result memory to be as required by hardware
net/mlx5e: TC, Set CT miss to the specific ct action instance
net/mlx5e: Rename CHAIN_TO_REG to MAPPED_OBJ_TO_REG
net/mlx5: Refactor tc miss handling to a single function
net/mlx5: Kconfig: Make tc offload depend on tc skb extension
net/sched: flower: Support hardware miss to tc action
net/sched: flower: Move filter handle initialization earlier
net/sched: cls_api: Support hardware miss to tc action
net/sched: Rename user cookie and act cookie
sfc: fix builds without CONFIG_RTC_LIB
sfc: clean up some inconsistent indentings
net/mlx4_en: Introduce flexible array to silence overflow warning
net: lan966x: Fix possible deadlock inside PTP
net/ulp: Remove redundant ->clone() test in inet_clone_ulp().
...
Diffstat (limited to 'tools/perf/util/annotate.h')
-rw-r--r-- | tools/perf/util/annotate.h | 428 |
1 files changed, 428 insertions, 0 deletions
diff --git a/tools/perf/util/annotate.h b/tools/perf/util/annotate.h new file mode 100644 index 000000000..8934072c3 --- /dev/null +++ b/tools/perf/util/annotate.h @@ -0,0 +1,428 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +#ifndef __PERF_ANNOTATE_H +#define __PERF_ANNOTATE_H + +#include <stdbool.h> +#include <stdint.h> +#include <stdio.h> +#include <linux/types.h> +#include <linux/list.h> +#include <linux/rbtree.h> +#include <asm/bug.h> +#include "symbol_conf.h" +#include "mutex.h" +#include "spark.h" + +struct hist_browser_timer; +struct hist_entry; +struct ins_ops; +struct map; +struct map_symbol; +struct addr_map_symbol; +struct option; +struct perf_sample; +struct evsel; +struct symbol; + +struct ins { + const char *name; + struct ins_ops *ops; +}; + +struct ins_operands { + char *raw; + char *raw_comment; + char *raw_func_start; + struct { + char *raw; + char *name; + struct symbol *sym; + u64 addr; + s64 offset; + bool offset_avail; + bool outside; + } target; + union { + struct { + char *raw; + char *name; + u64 addr; + } source; + struct { + struct ins ins; + struct ins_operands *ops; + } locked; + }; +}; + +struct arch; + +struct ins_ops { + void (*free)(struct ins_operands *ops); + int (*parse)(struct arch *arch, struct ins_operands *ops, struct map_symbol *ms); + int (*scnprintf)(struct ins *ins, char *bf, size_t size, + struct ins_operands *ops, int max_ins_name); +}; + +bool ins__is_jump(const struct ins *ins); +bool ins__is_call(const struct ins *ins); +bool ins__is_ret(const struct ins *ins); +bool ins__is_lock(const struct ins *ins); +int ins__scnprintf(struct ins *ins, char *bf, size_t size, struct ins_operands *ops, int max_ins_name); +bool ins__is_fused(struct arch *arch, const char *ins1, const char *ins2); + +#define ANNOTATION__IPC_WIDTH 6 +#define ANNOTATION__CYCLES_WIDTH 6 +#define ANNOTATION__MINMAX_CYCLES_WIDTH 19 +#define ANNOTATION__AVG_IPC_WIDTH 36 +#define ANNOTATION_DUMMY_LEN 256 + +struct annotation_options { + bool hide_src_code, + use_offset, + jump_arrows, + print_lines, + full_path, + show_linenr, + show_fileloc, + show_nr_jumps, + show_minmax_cycle, + show_asm_raw, + annotate_src, + full_addr; + u8 offset_level; + int min_pcnt; + int max_lines; + int context; + const char *objdump_path; + const char *disassembler_style; + const char *prefix; + const char *prefix_strip; + unsigned int percent_type; +}; + +enum { + ANNOTATION__OFFSET_JUMP_TARGETS = 1, + ANNOTATION__OFFSET_CALL, + ANNOTATION__MAX_OFFSET_LEVEL, +}; + +#define ANNOTATION__MIN_OFFSET_LEVEL ANNOTATION__OFFSET_JUMP_TARGETS + +extern struct annotation_options annotation__default_options; + +struct annotation; + +struct sym_hist_entry { + u64 nr_samples; + u64 period; +}; + +enum { + PERCENT_HITS_LOCAL, + PERCENT_HITS_GLOBAL, + PERCENT_PERIOD_LOCAL, + PERCENT_PERIOD_GLOBAL, + PERCENT_MAX, +}; + +struct annotation_data { + double percent[PERCENT_MAX]; + double percent_sum; + struct sym_hist_entry he; +}; + +struct annotation_line { + struct list_head node; + struct rb_node rb_node; + s64 offset; + char *line; + int line_nr; + char *fileloc; + int jump_sources; + float ipc; + u64 cycles; + u64 cycles_max; + u64 cycles_min; + char *path; + u32 idx; + int idx_asm; + int data_nr; + struct annotation_data data[]; +}; + +struct disasm_line { + struct ins ins; + struct ins_operands ops; + + /* This needs to be at the end. */ + struct annotation_line al; +}; + +static inline double annotation_data__percent(struct annotation_data *data, + unsigned int which) +{ + return which < PERCENT_MAX ? data->percent[which] : -1; +} + +static inline const char *percent_type_str(unsigned int type) +{ + static const char *str[PERCENT_MAX] = { + "local hits", + "global hits", + "local period", + "global period", + }; + + if (WARN_ON(type >= PERCENT_MAX)) + return "N/A"; + + return str[type]; +} + +static inline struct disasm_line *disasm_line(struct annotation_line *al) +{ + return al ? container_of(al, struct disasm_line, al) : NULL; +} + +/* + * Is this offset in the same function as the line it is used? + * asm functions jump to other functions, for instance. + */ +static inline bool disasm_line__has_local_offset(const struct disasm_line *dl) +{ + return dl->ops.target.offset_avail && !dl->ops.target.outside; +} + +/* + * Can we draw an arrow from the jump to its target, for instance? I.e. + * is the jump and its target in the same function? + */ +bool disasm_line__is_valid_local_jump(struct disasm_line *dl, struct symbol *sym); + +void disasm_line__free(struct disasm_line *dl); +struct annotation_line * +annotation_line__next(struct annotation_line *pos, struct list_head *head); + +struct annotation_write_ops { + bool first_line, current_entry, change_color; + int width; + void *obj; + int (*set_color)(void *obj, int color); + void (*set_percent_color)(void *obj, double percent, bool current); + int (*set_jumps_percent_color)(void *obj, int nr, bool current); + void (*printf)(void *obj, const char *fmt, ...); + void (*write_graph)(void *obj, int graph); +}; + +void annotation_line__write(struct annotation_line *al, struct annotation *notes, + struct annotation_write_ops *ops, + struct annotation_options *opts); + +int __annotation__scnprintf_samples_period(struct annotation *notes, + char *bf, size_t size, + struct evsel *evsel, + bool show_freq); + +int disasm_line__scnprintf(struct disasm_line *dl, char *bf, size_t size, bool raw, int max_ins_name); +size_t disasm__fprintf(struct list_head *head, FILE *fp); +void symbol__calc_percent(struct symbol *sym, struct evsel *evsel); + +struct sym_hist { + u64 nr_samples; + u64 period; + struct sym_hist_entry addr[]; +}; + +struct cyc_hist { + u64 start; + u64 cycles; + u64 cycles_aggr; + u64 cycles_max; + u64 cycles_min; + s64 cycles_spark[NUM_SPARKS]; + u32 num; + u32 num_aggr; + u8 have_start; + /* 1 byte padding */ + u16 reset; +}; + +/** struct annotated_source - symbols with hits have this attached as in sannotation + * + * @histograms: Array of addr hit histograms per event being monitored + * nr_histograms: This may not be the same as evsel->evlist->core.nr_entries if + * we have more than a group in a evlist, where we will want + * to see each group separately, that is why symbol__annotate2() + * sets src->nr_histograms to evsel->nr_members. + * @lines: If 'print_lines' is specified, per source code line percentages + * @source: source parsed from a disassembler like objdump -dS + * @cyc_hist: Average cycles per basic block + * + * lines is allocated, percentages calculated and all sorted by percentage + * when the annotation is about to be presented, so the percentages are for + * one of the entries in the histogram array, i.e. for the event/counter being + * presented. It is deallocated right after symbol__{tui,tty,etc}_annotate + * returns. + */ +struct annotated_source { + struct list_head source; + int nr_histograms; + size_t sizeof_sym_hist; + struct cyc_hist *cycles_hist; + struct sym_hist *histograms; +}; + +struct annotation { + struct mutex lock; + u64 max_coverage; + u64 start; + u64 hit_cycles; + u64 hit_insn; + unsigned int total_insn; + unsigned int cover_insn; + struct annotation_options *options; + struct annotation_line **offsets; + int nr_events; + int max_jump_sources; + int nr_entries; + int nr_asm_entries; + u16 max_line_len; + struct { + u8 addr; + u8 jumps; + u8 target; + u8 min_addr; + u8 max_addr; + u8 max_ins_name; + } widths; + bool have_cycles; + struct annotated_source *src; +}; + +void annotation__init(struct annotation *notes); +void annotation__exit(struct annotation *notes); + +static inline int annotation__cycles_width(struct annotation *notes) +{ + if (notes->have_cycles && notes->options->show_minmax_cycle) + return ANNOTATION__IPC_WIDTH + ANNOTATION__MINMAX_CYCLES_WIDTH; + + return notes->have_cycles ? ANNOTATION__IPC_WIDTH + ANNOTATION__CYCLES_WIDTH : 0; +} + +static inline int annotation__pcnt_width(struct annotation *notes) +{ + return (symbol_conf.show_total_period ? 12 : 7) * notes->nr_events; +} + +static inline bool annotation_line__filter(struct annotation_line *al, struct annotation *notes) +{ + return notes->options->hide_src_code && al->offset == -1; +} + +void annotation__set_offsets(struct annotation *notes, s64 size); +void annotation__compute_ipc(struct annotation *notes, size_t size); +void annotation__mark_jump_targets(struct annotation *notes, struct symbol *sym); +void annotation__update_column_widths(struct annotation *notes); +void annotation__init_column_widths(struct annotation *notes, struct symbol *sym); +void annotation__toggle_full_addr(struct annotation *notes, struct map_symbol *ms); + +static inline struct sym_hist *annotated_source__histogram(struct annotated_source *src, int idx) +{ + return ((void *)src->histograms) + (src->sizeof_sym_hist * idx); +} + +static inline struct sym_hist *annotation__histogram(struct annotation *notes, int idx) +{ + return annotated_source__histogram(notes->src, idx); +} + +static inline struct annotation *symbol__annotation(struct symbol *sym) +{ + return (void *)sym - symbol_conf.priv_size; +} + +int addr_map_symbol__inc_samples(struct addr_map_symbol *ams, struct perf_sample *sample, + struct evsel *evsel); + +int addr_map_symbol__account_cycles(struct addr_map_symbol *ams, + struct addr_map_symbol *start, + unsigned cycles); + +int hist_entry__inc_addr_samples(struct hist_entry *he, struct perf_sample *sample, + struct evsel *evsel, u64 addr); + +struct annotated_source *symbol__hists(struct symbol *sym, int nr_hists); +void symbol__annotate_zero_histograms(struct symbol *sym); + +int symbol__annotate(struct map_symbol *ms, + struct evsel *evsel, + struct annotation_options *options, + struct arch **parch); +int symbol__annotate2(struct map_symbol *ms, + struct evsel *evsel, + struct annotation_options *options, + struct arch **parch); + +enum symbol_disassemble_errno { + SYMBOL_ANNOTATE_ERRNO__SUCCESS = 0, + + /* + * Choose an arbitrary negative big number not to clash with standard + * errno since SUS requires the errno has distinct positive values. + * See 'Issue 6' in the link below. + * + * http://pubs.opengroup.org/onlinepubs/9699919799/basedefs/errno.h.html + */ + __SYMBOL_ANNOTATE_ERRNO__START = -10000, + + SYMBOL_ANNOTATE_ERRNO__NO_VMLINUX = __SYMBOL_ANNOTATE_ERRNO__START, + SYMBOL_ANNOTATE_ERRNO__NO_LIBOPCODES_FOR_BPF, + SYMBOL_ANNOTATE_ERRNO__ARCH_INIT_CPUID_PARSING, + SYMBOL_ANNOTATE_ERRNO__ARCH_INIT_REGEXP, + SYMBOL_ANNOTATE_ERRNO__BPF_INVALID_FILE, + SYMBOL_ANNOTATE_ERRNO__BPF_MISSING_BTF, + + __SYMBOL_ANNOTATE_ERRNO__END, +}; + +int symbol__strerror_disassemble(struct map_symbol *ms, int errnum, char *buf, size_t buflen); + +int symbol__annotate_printf(struct map_symbol *ms, struct evsel *evsel, + struct annotation_options *options); +void symbol__annotate_zero_histogram(struct symbol *sym, int evidx); +void symbol__annotate_decay_histogram(struct symbol *sym, int evidx); +void annotated_source__purge(struct annotated_source *as); + +int map_symbol__annotation_dump(struct map_symbol *ms, struct evsel *evsel, + struct annotation_options *opts); + +bool ui__has_annotation(void); + +int symbol__tty_annotate(struct map_symbol *ms, struct evsel *evsel, struct annotation_options *opts); + +int symbol__tty_annotate2(struct map_symbol *ms, struct evsel *evsel, struct annotation_options *opts); + +#ifdef HAVE_SLANG_SUPPORT +int symbol__tui_annotate(struct map_symbol *ms, struct evsel *evsel, + struct hist_browser_timer *hbt, + struct annotation_options *opts); +#else +static inline int symbol__tui_annotate(struct map_symbol *ms __maybe_unused, + struct evsel *evsel __maybe_unused, + struct hist_browser_timer *hbt __maybe_unused, + struct annotation_options *opts __maybe_unused) +{ + return 0; +} +#endif + +void annotation_config__init(struct annotation_options *opt); + +int annotate_parse_percent_type(const struct option *opt, const char *_str, + int unset); + +int annotate_check_args(struct annotation_options *args); + +#endif /* __PERF_ANNOTATE_H */ |