author     2023-02-21 18:24:12 -0800
committer  2023-02-21 18:24:12 -0800
commit     5b7c4cabbb65f5c469464da6c5f614cbd7f730f2 (patch)
tree       cc5c2d0a898769fd59549594fedb3ee6f84e59a0 /tools/perf/pmu-events/arch/x86/silvermont
Merge tag 'net-next-6.3' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next
Pull networking updates from Jakub Kicinski:
"Core:
- Add dedicated kmem_cache for typical/small skb->head, avoid having
to access struct page at kfree time, and improve memory use.
- Introduce a sysctl to set the default RPS configuration for new
netdevs (a usage sketch follows this list).
- Define Netlink protocol specification format which can be used to
describe messages used by each family and auto-generate parsers.
Add tools for generating kernel data structures and uAPI headers.
- Expose all net/core sysctls inside netns.
- Remove 4s sleep in netpoll if carrier is instantly detected on
boot.
- Add configurable limit of MDB entries per port, and port-vlan.
- Continue populating drop reasons throughout the stack.
- Retire a handful of legacy Qdiscs and classifiers.
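A minimal user-space sketch of the default-RPS knob above, assuming it
is exposed as /proc/sys/net/core/rps_default_mask and takes a hex CPU
mask like the per-queue rps_cpus files (the path and format are
assumptions, not confirmed by this log):

    /* Sketch: set a default RPS CPU mask applied to netdevs created later. */
    #include <stdio.h>
    #include <stdlib.h>

    int main(void)
    {
        /* Path assumed; adjust if the sysctl is named differently. */
        FILE *f = fopen("/proc/sys/net/core/rps_default_mask", "w");
        if (!f) {
            perror("rps_default_mask");
            return EXIT_FAILURE;
        }
        /* Hex cpumask: steer receive processing to CPUs 0-3 by default. */
        if (fprintf(f, "f\n") < 0)
            perror("write");
        fclose(f);
        return EXIT_SUCCESS;
    }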
Protocols:
- Support IPv4 big TCP (TSO frames larger than 64kB).
- Add the IP_LOCAL_PORT_RANGE socket option, to control the local port
range on a socket-by-socket basis (a usage sketch follows this list).
- Track and report in procfs number of MPTCP sockets used.
- Support mixing IPv4 and IPv6 flows in the in-kernel MPTCP path
manager.
- IPv6: don't check net.ipv6.route.max_size and rely on garbage
collection to free memory (similarly to IPv4).
- Support Penultimate Segment Pop (PSP) flavor in SRv6 (RFC8986).
- ICMP: add per-rate limit counters.
- Add support for user scanning requests in ieee802154.
- Remove static WEP support.
- Support minimal Wi-Fi 7 Extremely High Throughput (EHT) rate
reporting.
- WiFi 7 EHT channel puncturing support (client & AP).
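A minimal sketch of the IP_LOCAL_PORT_RANGE option above. The uapi
value (51) and the u32 encoding (low port in the lower 16 bits, high
port in the upper 16 bits) are stated here as assumptions; check
include/uapi/linux/in.h from this release before relying on them.

    #include <stdint.h>
    #include <stdio.h>
    #include <sys/socket.h>
    #include <netinet/in.h>

    #ifndef IP_LOCAL_PORT_RANGE
    #define IP_LOCAL_PORT_RANGE 51      /* assumed uapi value */
    #endif

    int main(void)
    {
        int fd = socket(AF_INET, SOCK_STREAM, 0);
        if (fd < 0) {
            perror("socket");
            return 1;
        }
        /* Restrict this socket's ephemeral ports to 40000-40100. */
        uint32_t range = 40000u | (40100u << 16);
        if (setsockopt(fd, IPPROTO_IP, IP_LOCAL_PORT_RANGE,
                       &range, sizeof(range)))
            perror("setsockopt(IP_LOCAL_PORT_RANGE)");
        /* bind()/connect() with port 0 now picks from that range. */
        return 0;
    }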
BPF:
- Add an rbtree data structure following the "next-gen data structure"
precedent set by the recently added linked list, that is, by using
kfunc + kptr instead of adding a new BPF map type.
- Expose XDP hints via kfuncs with initial support for RX hash and
timestamp metadata.
- Add a BPF_F_NO_TUNNEL_KEY extension to bpf_skb_set_tunnel_key to
better support decap on GRE tunnel devices not operating in
collect-metadata mode.
- Improve x86 JIT's codegen for PROBE_MEM runtime error checks.
- Remove the need for trace_printk_lock for bpf_trace_printk and
bpf_trace_vprintk helpers.
- Extend libbpf's bpf_tracing.h support for tracing arguments of
kprobes/uprobes and syscalls as a special case (a usage sketch
follows this list).
- Significantly reduce the search time for module symbols by
livepatch and BPF.
- Enable cpumasks to be used as kptrs, which is useful for tracing
programs tracking which tasks end up running on which CPUs in
different time intervals.
- Add support for BPF trampoline on s390x and riscv64.
- Add capability to export the XDP features supported by the NIC.
- Add __bpf_kfunc tag for marking kernel functions as kfuncs.
- Add cgroup.memory=nobpf kernel parameter option to disable BPF
memory accounting for container environments.
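As an illustration of the bpf_tracing.h syscall-argument support above,
a minimal sketch using the existing BPF_KSYSCALL macro and the
"ksyscall" auto-attach section; the program name and traced syscall are
arbitrary, and a recent libbpf plus a generated vmlinux.h are assumed.

    // SPDX-License-Identifier: GPL-2.0
    /* Build with: clang -O2 -g -target bpf -c trace_openat.bpf.c */
    #include "vmlinux.h"
    #include <bpf/bpf_helpers.h>
    #include <bpf/bpf_tracing.h>

    char LICENSE[] SEC("license") = "GPL";

    SEC("ksyscall/openat")
    int BPF_KSYSCALL(trace_openat, int dfd, const char *filename, int flags)
    {
        /* Arguments are fetched from the syscall wrapper's pt_regs. */
        bpf_printk("openat: dfd=%d flags=%d", dfd, flags);
        return 0;
    }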
Netfilter:
- Remove the CLUSTERIP target. It has been marked as obsolete for
years, and we still have WARN splats wrt races of the out-of-band
/proc interface installed by this target.
- Add 'destroy' commands to nf_tables. They are identical to the
existing 'delete' commands, but do not return an error if the
referenced object (set, chain, rule...) did not exist.
Driver API:
- Improve cpumask_local_spread() locality to help NICs set the right
IRQ affinity on AMD platforms.
- Separate C22 and C45 MDIO bus transactions more clearly.
- Introduce new DCB table to control DSCP rewrite on egress.
- Support configuration of the Physical Layer Collision Avoidance
(PLCA) Reconciliation Sublayer (RS) (802.3cg-2019), a modern form of
shared-medium Ethernet.
- Support for the MAC Merge layer (IEEE 802.3-2018 clause 99), allowing
preemption of low-priority frames by high-priority frames.
- Add support for controlling MACSec offload using netlink SET.
- Rework devlink instance refcounts to allow registration and
de-registration under the instance lock. Split the code into
multiple files, drop some of the unnecessarily granular locks and
factor out common parts of netlink operation handling.
- Add TX frame aggregation parameters (for USB drivers).
- Add a new attr TCA_EXT_WARN_MSG to report TC (offload) warning
messages with notifications for debug.
- Allow offloading of UDP NEW connections via act_ct.
- Add support for per action HW stats in TC.
- Support hardware miss to TC action (continue processing in SW from
a specific point in the action chain).
- Warn if old Wireless Extension user space interface is used with
modern cfg80211/mac80211 drivers. Do not support Wireless
Extensions for Wi-Fi 7 devices at all. Everyone should switch to
using nl80211 interface instead.
- Improve the CAN bit timing configuration. Use extack to return
error messages directly to user space, update the SJW handling,
including the definition of a new default value that will benefit
CAN-FD controllers, by increasing their oscillator tolerance.
New hardware / drivers:
- Ethernet:
- nVidia BlueField-3 support (control traffic driver)
- Ethernet support for imx93 SoCs
- Motorcomm yt8531 gigabit Ethernet PHY
- onsemi NCN26000 10BASE-T1S PHY (with support for PLCA)
- Microchip LAN8841 PHY (incl. cable diagnostics and PTP)
- Amlogic gxl MDIO mux
- WiFi:
- RealTek RTL8188EU (rtl8xxxu)
- Qualcomm Wi-Fi 7 devices (ath12k)
- CAN:
- Renesas R-Car V4H
Drivers:
- Bluetooth:
- Set Per Platform Antenna Gain (PPAG) for Intel controllers.
- Ethernet NICs:
- Intel (1G, igc):
- support TSN / Qbv / packet scheduling features of i226 model
- Intel (100G, ice):
- use GNSS subsystem instead of TTY
- multi-buffer XDP support
- extend support for GPIO pins to E823 devices
- nVidia/Mellanox:
- update the shared buffer configuration on PFC commands
- implement PTP adjphase function for HW offset control
- TC support for Geneve and GRE with VF tunnel offload
- more efficient crypto key management method
- multi-port eswitch support
- Netronome/Corigine:
- add DCB IEEE support
- support IPsec offloading for NFP3800
- Freescale/NXP (enetc):
- support XDP_REDIRECT for XDP non-linear buffers
- improve reconfig, avoid link flap and waiting for idle
- support MAC Merge layer
- Other NICs:
- sfc/ef100: add basic devlink support for ef100
- ionic: rx_push mode operation (writing descriptors via MMIO)
- bnxt: use the auxiliary bus abstraction for RDMA
- r8169: disable ASPM and reset bus in case of tx timeout
- cpsw: support QSGMII mode for J721e CPSW9G
- cpts: support pulse-per-second output
- ngbe: add an mdio bus driver
- usbnet: optimize usbnet_bh() by avoiding unnecessary queuing
- r8152: handle devices with FW with NCM support
- amd-xgbe: support 10Mbps, 2.5GbE speeds and rx-adaptation
- virtio-net: support multi buffer XDP
- virtio/vsock: replace virtio_vsock_pkt with sk_buff
- tsnep: XDP support
- Ethernet high-speed switches:
- nVidia/Mellanox (mlxsw):
- add support for latency TLV (in FW control messages)
- Microchip (sparx5):
- separate explicit and implicit traffic forwarding rules, make
the implicit rules always active
- add support for egress DSCP rewrite
- IS0 VCAP support (Ingress Classification)
- IS2 VCAP filters (protos, L3 addrs, L4 ports, flags, ToS
etc.)
- ES2 VCAP support (Egress Access Control)
- support for Per-Stream Filtering and Policing (802.1Q,
8.6.5.1)
- Ethernet embedded switches:
- Marvell (mv88e6xxx):
- add MAB (port auth) offload support
- enable PTP receive for mv88e6390
- NXP (ocelot):
- support MAC Merge layer
- support for the vsc7512 internal copper phys
- Microchip:
- lan9303: convert to PHYLINK
- lan966x: support TC flower filter statistics
- lan937x: PTP support for KSZ9563/KSZ8563 and LAN937x
- lan937x: support Credit Based Shaper configuration
- ksz9477: support Energy Efficient Ethernet
- other:
- qca8k: convert to regmap read/write API, use bulk operations
- rswitch: Improve TX timestamp accuracy
- Intel WiFi (iwlwifi):
- EHT (Wi-Fi 7) rate reporting
- STEP equalizer support: transfer some STEP (connection to radio
on platforms with integrated wifi) related parameters from the
BIOS to the firmware.
- Qualcomm 802.11ax WiFi (ath11k):
- IPQ5018 support
- Fine Timing Measurement (FTM) responder role support
- channel 177 support
- MediaTek WiFi (mt76):
- per-PHY LED support
- mt7996: EHT (Wi-Fi 7) support
- Wireless Ethernet Dispatch (WED) reset support
- switch to using page pool allocator
- RealTek WiFi (rtw89):
- support new version of Bluetooth co-existence
- Mobile:
- rmnet: support TX aggregation"
* tag 'net-next-6.3' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (1872 commits)
page_pool: add a comment explaining the fragment counter usage
net: ethtool: fix __ethtool_dev_mm_supported() implementation
ethtool: pse-pd: Fix double word in comments
xsk: add linux/vmalloc.h to xsk.c
sefltests: netdevsim: wait for devlink instance after netns removal
selftest: fib_tests: Always cleanup before exit
net/mlx5e: Align IPsec ASO result memory to be as required by hardware
net/mlx5e: TC, Set CT miss to the specific ct action instance
net/mlx5e: Rename CHAIN_TO_REG to MAPPED_OBJ_TO_REG
net/mlx5: Refactor tc miss handling to a single function
net/mlx5: Kconfig: Make tc offload depend on tc skb extension
net/sched: flower: Support hardware miss to tc action
net/sched: flower: Move filter handle initialization earlier
net/sched: cls_api: Support hardware miss to tc action
net/sched: Rename user cookie and act cookie
sfc: fix builds without CONFIG_RTC_LIB
sfc: clean up some inconsistent indentings
net/mlx4_en: Introduce flexible array to silence overflow warning
net: lan966x: Fix possible deadlock inside PTP
net/ulp: Remove redundant ->clone() test in inet_clone_ulp().
...
Diffstat (limited to 'tools/perf/pmu-events/arch/x86/silvermont')
7 files changed, 1124 insertions, 0 deletions
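The diff below adds Silvermont PMU event tables, including
OFFCORE_RESPONSE events that pair EventCode 0xB7/UMask 0x1 with a value
for the off-core response MSRs (0x1a6/0x1a7). As a hedged sketch (not
part of the patch), such an event can be counted from user space with
perf_event_open(), where attr.config1 carries the MSR value; the
MSRValue used here is taken from the DEMAND_DATA_RD.ANY_RESPONSE entry
in the JSON.

    #include <linux/perf_event.h>
    #include <sys/syscall.h>
    #include <sys/ioctl.h>
    #include <unistd.h>
    #include <stdint.h>
    #include <stdio.h>
    #include <string.h>

    int main(void)
    {
        struct perf_event_attr attr;

        memset(&attr, 0, sizeof(attr));
        attr.size = sizeof(attr);
        attr.type = PERF_TYPE_RAW;
        attr.config = 0x01B7;           /* umask 0x1 << 8 | event 0xB7 */
        attr.config1 = 0x0000010001ULL; /* OFFCORE_RESPONSE.DEMAND_DATA_RD.ANY_RESPONSE */
        attr.disabled = 1;
        attr.exclude_kernel = 1;

        int fd = syscall(SYS_perf_event_open, &attr, 0 /* this thread */,
                         -1 /* any cpu */, -1 /* no group */, 0);
        if (fd < 0) {
            perror("perf_event_open");
            return 1;
        }

        ioctl(fd, PERF_EVENT_IOC_ENABLE, 0);
        /* ... run the workload of interest here ... */
        ioctl(fd, PERF_EVENT_IOC_DISABLE, 0);

        uint64_t count;
        if (read(fd, &count, sizeof(count)) == sizeof(count))
            printf("demand data reads, any response: %llu\n",
                   (unsigned long long)count);
        close(fd);
        return 0;
    }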
diff --git a/tools/perf/pmu-events/arch/x86/silvermont/cache.json b/tools/perf/pmu-events/arch/x86/silvermont/cache.json new file mode 100644 index 000000000..818e0664a --- /dev/null +++ b/tools/perf/pmu-events/arch/x86/silvermont/cache.json @@ -0,0 +1,677 @@ +[ + { + "BriefDescription": "Counts the number of request that were not accepted into the L2Q because the L2Q is FULL.", + "EventCode": "0x31", + "EventName": "CORE_REJECT_L2Q.ALL", + "PublicDescription": "Counts the number of (demand and L1 prefetchers) core requests rejected by the L2Q due to a full or nearly full w condition which likely indicates back pressure from L2Q. It also counts requests that would have gone directly to the XQ, but are rejected due to a full or nearly full condition, indicating back pressure from the IDI link. The L2Q may also reject transactions from a core to insure fairness between cores, or to delay a core?s dirty eviction when the address conflicts incoming external snoops. (Note that L2 prefetcher requests that are dropped are not counted by this event.)", + "SampleAfterValue": "200003" + }, + { + "BriefDescription": "Cycles code-fetch stalled due to an outstanding ICache miss.", + "EventCode": "0x86", + "EventName": "FETCH_STALL.ICACHE_FILL_PENDING_CYCLES", + "PublicDescription": "Counts cycles that fetch is stalled due to an outstanding ICache miss. That is, the decoder queue is able to accept bytes, but the fetch unit is unable to provide bytes due to an ICache miss. Note: this event is not the same as the total number of cycles spent retrieving instruction cache lines from the memory hierarchy.\r\nCounts cycles that fetch is stalled due to any reason. That is, the decoder queue is able to accept bytes, but the fetch unit is unable to provide bytes. This will include cycles due to an ITLB miss, ICache miss and other events.", + "SampleAfterValue": "200003", + "UMask": "0x4" + }, + { + "BriefDescription": "Counts the number of request from the L2 that were not accepted into the XQ", + "EventCode": "0x30", + "EventName": "L2_REJECT_XQ.ALL", + "PublicDescription": "This event counts the number of demand and prefetch transactions that the L2 XQ rejects due to a full or near full condition which likely indicates back pressure from the IDI link. 
The XQ may reject transactions from the L2Q (non-cacheable requests), BBS (L2 misses) and WOB (L2 write-back victims).", + "SampleAfterValue": "200003" + }, + { + "BriefDescription": "L2 cache request misses", + "EventCode": "0x2E", + "EventName": "LONGEST_LAT_CACHE.MISS", + "PublicDescription": "This event counts the total number of L2 cache references and the number of L2 cache misses respectively.", + "SampleAfterValue": "200003", + "UMask": "0x41" + }, + { + "BriefDescription": "L2 cache requests from this core", + "EventCode": "0x2E", + "EventName": "LONGEST_LAT_CACHE.REFERENCE", + "PublicDescription": "This event counts requests originating from the core that references a cache line in the L2 cache.", + "SampleAfterValue": "200003", + "UMask": "0x4f" + }, + { + "BriefDescription": "All Loads", + "EventCode": "0x04", + "EventName": "MEM_UOPS_RETIRED.ALL_LOADS", + "PublicDescription": "This event counts the number of load ops retired.", + "SampleAfterValue": "200003", + "UMask": "0x40" + }, + { + "BriefDescription": "All Stores", + "EventCode": "0x04", + "EventName": "MEM_UOPS_RETIRED.ALL_STORES", + "PublicDescription": "This event counts the number of store ops retired.", + "SampleAfterValue": "200003", + "UMask": "0x80" + }, + { + "BriefDescription": "Cross core or cross module hitm", + "EventCode": "0x04", + "EventName": "MEM_UOPS_RETIRED.HITM", + "PEBS": "1", + "PublicDescription": "This event counts the number of load ops retired that got data from the other core or from the other module.", + "SampleAfterValue": "200003", + "UMask": "0x20" + }, + { + "BriefDescription": "Loads missed L1", + "EventCode": "0x04", + "EventName": "MEM_UOPS_RETIRED.L1_MISS_LOADS", + "PublicDescription": "This event counts the number of load ops retired that miss in L1 Data cache. 
Note that prefetch misses will not be counted.", + "SampleAfterValue": "200003", + "UMask": "0x1" + }, + { + "BriefDescription": "Loads hit L2", + "EventCode": "0x04", + "EventName": "MEM_UOPS_RETIRED.L2_HIT_LOADS", + "PEBS": "1", + "PublicDescription": "This event counts the number of load ops retired that hit in the L2.", + "SampleAfterValue": "200003", + "UMask": "0x2" + }, + { + "BriefDescription": "Loads missed L2", + "EventCode": "0x04", + "EventName": "MEM_UOPS_RETIRED.L2_MISS_LOADS", + "PEBS": "1", + "PublicDescription": "This event counts the number of load ops retired that miss in the L2.", + "SampleAfterValue": "100007", + "UMask": "0x4" + }, + { + "BriefDescription": "Loads missed UTLB", + "EventCode": "0x04", + "EventName": "MEM_UOPS_RETIRED.UTLB_MISS", + "PublicDescription": "This event counts the number of load ops retired that had UTLB miss.", + "SampleAfterValue": "200003", + "UMask": "0x10" + }, + { + "BriefDescription": "Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction", + "EventCode": "0xB7", + "EventName": "OFFCORE_RESPONSE", + "PublicDescription": "Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.", + "SampleAfterValue": "100007", + "UMask": "0x1" + }, + { + "BriefDescription": "Counts any code reads (demand & prefetch) that have any response type.", + "EventCode": "0xB7", + "EventName": "OFFCORE_RESPONSE.ANY_CODE_RD.ANY_RESPONSE", + "MSRIndex": "0x1a6,0x1a7", + "MSRValue": "0x0000010044", + "SampleAfterValue": "100007", + "UMask": "0x1" + }, + { + "BriefDescription": "Counts any code reads (demand & prefetch) that miss L2.", + "EventCode": "0xB7", + "EventName": "OFFCORE_RESPONSE.ANY_CODE_RD.L2_MISS.ANY", + "MSRIndex": "0x1a6,0x1a7", + "MSRValue": "0x1680000044", + "SampleAfterValue": "100007", + "UMask": "0x1" + }, + { + "BriefDescription": "Counts any code reads (demand & prefetch) that hit in the other module where modified copies were found in other core's L1 cache.", + "EventCode": "0xB7", + "EventName": "OFFCORE_RESPONSE.ANY_CODE_RD.L2_MISS.HITM_OTHER_CORE", + "MSRIndex": "0x1a6,0x1a7", + "MSRValue": "0x1000000044", + "SampleAfterValue": "100007", + "UMask": "0x1" + }, + { + "BriefDescription": "Counts any code reads (demand & prefetch) that miss L2 and the snoops to sibling cores hit in either E/S state and the line is not forwarded.", + "EventCode": "0xB7", + "EventName": "OFFCORE_RESPONSE.ANY_CODE_RD.L2_MISS.HIT_OTHER_CORE_NO_FWD", + "MSRIndex": "0x1a6,0x1a7", + "MSRValue": "0x0400000044", + "SampleAfterValue": "100007", + "UMask": "0x1" + }, + { + "BriefDescription": "Counts any code reads (demand & prefetch) that miss L2 with a snoop miss response.", + "EventCode": "0xB7", + "EventName": "OFFCORE_RESPONSE.ANY_CODE_RD.L2_MISS.SNOOP_MISS", + "MSRIndex": "0x1a6,0x1a7", + "MSRValue": "0x0200000044", + "SampleAfterValue": "100007", + "UMask": "0x1" + }, + { + "BriefDescription": "Counts any data read (demand & prefetch) that have any response type.", + "EventCode": "0xB7", + "EventName": "OFFCORE_RESPONSE.ANY_DATA_RD.ANY_RESPONSE", + "MSRIndex": "0x1a6,0x1a7", + "MSRValue": "0x0000013091", + "SampleAfterValue": "100007", + "UMask": "0x1" + }, + { + "BriefDescription": "Counts any data read (demand & prefetch) that miss 
L2.", + "EventCode": "0xB7", + "EventName": "OFFCORE_RESPONSE.ANY_DATA_RD.L2_MISS.ANY", + "MSRIndex": "0x1a6,0x1a7", + "MSRValue": "0x1680003091", + "SampleAfterValue": "100007", + "UMask": "0x1" + }, + { + "BriefDescription": "Counts any data read (demand & prefetch) that hit in the other module where modified copies were found in other core's L1 cache.", + "EventCode": "0xB7", + "EventName": "OFFCORE_RESPONSE.ANY_DATA_RD.L2_MISS.HITM_OTHER_CORE", + "MSRIndex": "0x1a6,0x1a7", + "MSRValue": "0x1000003091", + "SampleAfterValue": "100007", + "UMask": "0x1" + }, + { + "BriefDescription": "Counts any data read (demand & prefetch) that miss L2 and the snoops to sibling cores hit in either E/S state and the line is not forwarded.", + "EventCode": "0xB7", + "EventName": "OFFCORE_RESPONSE.ANY_DATA_RD.L2_MISS.HIT_OTHER_CORE_NO_FWD", + "MSRIndex": "0x1a6,0x1a7", + "MSRValue": "0x0400003091", + "SampleAfterValue": "100007", + "UMask": "0x1" + }, + { + "BriefDescription": "Counts any data read (demand & prefetch) that miss L2 with a snoop miss response.", + "EventCode": "0xB7", + "EventName": "OFFCORE_RESPONSE.ANY_DATA_RD.L2_MISS.SNOOP_MISS", + "MSRIndex": "0x1a6,0x1a7", + "MSRValue": "0x0200003091", + "SampleAfterValue": "100007", + "UMask": "0x1" + }, + { + "BriefDescription": "Counts any request that have any response type.", + "EventCode": "0xB7", + "EventName": "OFFCORE_RESPONSE.ANY_REQUEST.ANY_RESPONSE", + "MSRIndex": "0x1a6,0x1a7", + "MSRValue": "0x0000018008", + "SampleAfterValue": "100007", + "UMask": "0x1" + }, + { + "BriefDescription": "Counts any request that hit in the other module where modified copies were found in other core's L1 cache.", + "EventCode": "0xB7", + "EventName": "OFFCORE_RESPONSE.ANY_REQUEST.L2_MISS.HITM_OTHER_CORE", + "MSRIndex": "0x1a6,0x1a7", + "MSRValue": "0x1000008008", + "SampleAfterValue": "100007", + "UMask": "0x1" + }, + { + "BriefDescription": "Counts any request that miss L2 and the snoops to sibling cores hit in either E/S state and the line is not forwarded.", + "EventCode": "0xB7", + "EventName": "OFFCORE_RESPONSE.ANY_REQUEST.L2_MISS.HIT_OTHER_CORE_NO_FWD", + "MSRIndex": "0x1a6,0x1a7", + "MSRValue": "0x0400008008", + "SampleAfterValue": "100007", + "UMask": "0x1" + }, + { + "BriefDescription": "Counts any request that miss L2 with a snoop miss response.", + "EventCode": "0xB7", + "EventName": "OFFCORE_RESPONSE.ANY_REQUEST.L2_MISS.SNOOP_MISS", + "MSRIndex": "0x1a6,0x1a7", + "MSRValue": "0x0200008008", + "SampleAfterValue": "100007", + "UMask": "0x1" + }, + { + "BriefDescription": "Counts any rfo reads (demand & prefetch) that have any response type.", + "EventCode": "0xB7", + "EventName": "OFFCORE_RESPONSE.ANY_RFO.ANY_RESPONSE", + "MSRIndex": "0x1a6,0x1a7", + "MSRValue": "0x0000010022", + "SampleAfterValue": "100007", + "UMask": "0x1" + }, + { + "BriefDescription": "Counts any rfo reads (demand & prefetch) that miss L2.", + "EventCode": "0xB7", + "EventName": "OFFCORE_RESPONSE.ANY_RFO.L2_MISS.ANY", + "MSRIndex": "0x1a6,0x1a7", + "MSRValue": "0x1680000022", + "SampleAfterValue": "100007", + "UMask": "0x1" + }, + { + "BriefDescription": "Counts any rfo reads (demand & prefetch) that hit in the other module where modified copies were found in other core's L1 cache.", + "EventCode": "0xB7", + "EventName": "OFFCORE_RESPONSE.ANY_RFO.L2_MISS.HITM_OTHER_CORE", + "MSRIndex": "0x1a6,0x1a7", + "MSRValue": "0x1000000022", + "SampleAfterValue": "100007", + "UMask": "0x1" + }, + { + "BriefDescription": "Counts any rfo reads (demand & prefetch) that miss L2 and the snoops 
to sibling cores hit in either E/S state and the line is not forwarded.", + "EventCode": "0xB7", + "EventName": "OFFCORE_RESPONSE.ANY_RFO.L2_MISS.HIT_OTHER_CORE_NO_FWD", + "MSRIndex": "0x1a6,0x1a7", + "MSRValue": "0x0400000022", + "SampleAfterValue": "100007", + "UMask": "0x1" + }, + { + "BriefDescription": "Counts any rfo reads (demand & prefetch) that miss L2 with a snoop miss response.", + "EventCode": "0xB7", + "EventName": "OFFCORE_RESPONSE.ANY_RFO.L2_MISS.SNOOP_MISS", + "MSRIndex": "0x1a6,0x1a7", + "MSRValue": "0x0200000022", + "SampleAfterValue": "100007", + "UMask": "0x1" + }, + { + "BriefDescription": "Counts writeback (modified to exclusive) that miss L2.", + "EventCode": "0xB7", + "EventName": "OFFCORE_RESPONSE.COREWB.L2_MISS.ANY", + "MSRIndex": "0x1a6,0x1a7", + "MSRValue": "0x1680000008", + "SampleAfterValue": "100007", + "UMask": "0x1" + }, + { + "BriefDescription": "Counts writeback (modified to exclusive) that miss L2 with no details on snoop-related information.", + "EventCode": "0xB7", + "EventName": "OFFCORE_RESPONSE.COREWB.L2_MISS.NO_SNOOP_NEEDED", + "MSRIndex": "0x1a6,0x1a7", + "MSRValue": "0x0080000008", + "SampleAfterValue": "100007", + "UMask": "0x1" + }, + { + "BriefDescription": "Counts demand and DCU prefetch instruction cacheline that have any response type.", + "EventCode": "0xB7", + "EventName": "OFFCORE_RESPONSE.DEMAND_CODE_RD.ANY_RESPONSE", + "MSRIndex": "0x1a6,0x1a7", + "MSRValue": "0x0000010004", + "SampleAfterValue": "100007", + "UMask": "0x1" + }, + { + "BriefDescription": "Counts demand and DCU prefetch instruction cacheline that miss L2.", + "EventCode": "0xB7", + "EventName": "OFFCORE_RESPONSE.DEMAND_CODE_RD.L2_MISS.ANY", + "MSRIndex": "0x1a6,0x1a7", + "MSRValue": "0x1680000004", + "SampleAfterValue": "100007", + "UMask": "0x1" + }, + { + "BriefDescription": "Counts demand and DCU prefetch instruction cacheline that miss L2 and the snoops to sibling cores hit in either E/S state and the line is not forwarded.", + "EventCode": "0xB7", + "EventName": "OFFCORE_RESPONSE.DEMAND_CODE_RD.L2_MISS.HIT_OTHER_CORE_NO_FWD", + "MSRIndex": "0x1a6,0x1a7", + "MSRValue": "0x0400000004", + "SampleAfterValue": "100007", + "UMask": "0x1" + }, + { + "BriefDescription": "Counts demand and DCU prefetch instruction cacheline that miss L2 with a snoop miss response.", + "EventCode": "0xB7", + "EventName": "OFFCORE_RESPONSE.DEMAND_CODE_RD.L2_MISS.SNOOP_MISS", + "MSRIndex": "0x1a6,0x1a7", + "MSRValue": "0x0200000004", + "SampleAfterValue": "100007", + "UMask": "0x1" + }, + { + "BriefDescription": "Counts demand and DCU prefetch instruction cacheline that are are outstanding, per cycle, from the time of the L2 miss to when any response is received.", + "EventCode": "0xB7", + "EventName": "OFFCORE_RESPONSE.DEMAND_CODE_RD.OUTSTANDING", + "MSRIndex": "0x1a6", + "MSRValue": "0x4000000004", + "SampleAfterValue": "100007", + "UMask": "0x1" + }, + { + "BriefDescription": "Counts demand and DCU prefetch data read that have any response type.", + "EventCode": "0xB7", + "EventName": "OFFCORE_RESPONSE.DEMAND_DATA_RD.ANY_RESPONSE", + "MSRIndex": "0x1a6,0x1a7", + "MSRValue": "0x0000010001", + "SampleAfterValue": "100007", + "UMask": "0x1" + }, + { + "BriefDescription": "Counts demand and DCU prefetch data read that miss L2.", + "EventCode": "0xB7", + "EventName": "OFFCORE_RESPONSE.DEMAND_DATA_RD.L2_MISS.ANY", + "MSRIndex": "0x1a6,0x1a7", + "MSRValue": "0x1680000001", + "SampleAfterValue": "100007", + "UMask": "0x1" + }, + { + "BriefDescription": "Counts demand and DCU prefetch data read that 
hit in the other module where modified copies were found in other core's L1 cache.", + "EventCode": "0xB7", + "EventName": "OFFCORE_RESPONSE.DEMAND_DATA_RD.L2_MISS.HITM_OTHER_CORE", + "MSRIndex": "0x1a6,0x1a7", + "MSRValue": "0x1000000001", + "SampleAfterValue": "100007", + "UMask": "0x1" + }, + { + "BriefDescription": "Counts demand and DCU prefetch data read that miss L2 and the snoops to sibling cores hit in either E/S state and the line is not forwarded.", + "EventCode": "0xB7", + "EventName": "OFFCORE_RESPONSE.DEMAND_DATA_RD.L2_MISS.HIT_OTHER_CORE_NO_FWD", + "MSRIndex": "0x1a6,0x1a7", + "MSRValue": "0x0400000001", + "SampleAfterValue": "100007", + "UMask": "0x1" + }, + { + "BriefDescription": "Counts demand and DCU prefetch data read that miss L2 with a snoop miss response.", + "EventCode": "0xB7", + "EventName": "OFFCORE_RESPONSE.DEMAND_DATA_RD.L2_MISS.SNOOP_MISS", + "MSRIndex": "0x1a6,0x1a7", + "MSRValue": "0x0200000001", + "SampleAfterValue": "100007", + "UMask": "0x1" + }, + { + "BriefDescription": "Counts demand and DCU prefetch data read that are are outstanding, per cycle, from the time of the L2 miss to when any response is received.", + "EventCode": "0xB7", + "EventName": "OFFCORE_RESPONSE.DEMAND_DATA_RD.OUTSTANDING", + "MSRIndex": "0x1a6", + "MSRValue": "0x4000000001", + "SampleAfterValue": "100007", + "UMask": "0x1" + }, + { + "BriefDescription": "Counts demand and DCU prefetch RFOs that miss L2.", + "EventCode": "0xB7", + "EventName": "OFFCORE_RESPONSE.DEMAND_RFO.L2_MISS.ANY", + "MSRIndex": "0x1a6,0x1a7", + "MSRValue": "0x1680000002", + "SampleAfterValue": "100007", + "UMask": "0x1" + }, + { + "BriefDescription": "Counts demand and DCU prefetch RFOs that hit in the other module where modified copies were found in other core's L1 cache.", + "EventCode": "0xB7", + "EventName": "OFFCORE_RESPONSE.DEMAND_RFO.L2_MISS.HITM_OTHER_CORE", + "MSRIndex": "0x1a6,0x1a7", + "MSRValue": "0x1000000002", + "SampleAfterValue": "100007", + "UMask": "0x1" + }, + { + "BriefDescription": "Counts demand and DCU prefetch RFOs that miss L2 and the snoops to sibling cores hit in either E/S state and the line is not forwarded.", + "EventCode": "0xB7", + "EventName": "OFFCORE_RESPONSE.DEMAND_RFO.L2_MISS.HIT_OTHER_CORE_NO_FWD", + "MSRIndex": "0x1a6,0x1a7", + "MSRValue": "0x0400000002", + "SampleAfterValue": "100007", + "UMask": "0x1" + }, + { + "BriefDescription": "Counts demand and DCU prefetch RFOs that miss L2 with a snoop miss response.", + "EventCode": "0xB7", + "EventName": "OFFCORE_RESPONSE.DEMAND_RFO.L2_MISS.SNOOP_MISS", + "MSRIndex": "0x1a6,0x1a7", + "MSRValue": "0x0200000002", + "SampleAfterValue": "100007", + "UMask": "0x1" + }, + { + "BriefDescription": "Counts demand and DCU prefetch RFOs that are are outstanding, per cycle, from the time of the L2 miss to when any response is received.", + "EventCode": "0xB7", + "EventName": "OFFCORE_RESPONSE.DEMAND_RFO.OUTSTANDING", + "MSRIndex": "0x1a6", + "MSRValue": "0x4000000002", + "SampleAfterValue": "100007", + "UMask": "0x1" + }, + { + "BriefDescription": "Counts demand reads of partial cache lines (including UC and WC) that miss L2.", + "EventCode": "0xB7", + "EventName": "OFFCORE_RESPONSE.PARTIAL_READS.L2_MISS.ANY", + "MSRIndex": "0x1a6,0x1a7", + "MSRValue": "0x1680000080", + "SampleAfterValue": "100007", + "UMask": "0x1" + }, + { + "BriefDescription": "Countsof demand RFO requests to write to partial cache lines that miss L2.", + "EventCode": "0xB7", + "EventName": "OFFCORE_RESPONSE.PARTIAL_WRITES.L2_MISS.ANY", + "MSRIndex": "0x1a6,0x1a7", + 
"MSRValue": "0x1680000100", + "SampleAfterValue": "100007", + "UMask": "0x1" + }, + { + "BriefDescription": "Counts DCU hardware prefetcher data read that have any response type.", + "EventCode": "0xB7", + "EventName": "OFFCORE_RESPONSE.PF_L1_DATA_RD.ANY_RESPONSE", + "MSRIndex": "0x1a6,0x1a7", + "MSRValue": "0x0000012000", + "SampleAfterValue": "100007", + "UMask": "0x1" + }, + { + "BriefDescription": "Counts DCU hardware prefetcher data read that miss L2.", + "EventCode": "0xB7", + "EventName": "OFFCORE_RESPONSE.PF_L1_DATA_RD.L2_MISS.ANY", + "MSRIndex": "0x1a6,0x1a7", + "MSRValue": "0x1680002000", + "SampleAfterValue": "100007", + "UMask": "0x1" + }, + { + "BriefDescription": "Counts DCU hardware prefetcher data read that hit in the other module where modified copies were found in other core's L1 cache.", + "EventCode": "0xB7", + "EventName": "OFFCORE_RESPONSE.PF_L1_DATA_RD.L2_MISS.HITM_OTHER_CORE", + "MSRIndex": "0x1a6,0x1a7", + "MSRValue": "0x1000002000", + "SampleAfterValue": "100007", + "UMask": "0x1" + }, + { + "BriefDescription": "Counts DCU hardware prefetcher data read that miss L2 and the snoops to sibling cores hit in either E/S state and the line is not forwarded.", + "EventCode": "0xB7", + "EventName": "OFFCORE_RESPONSE.PF_L1_DATA_RD.L2_MISS.HIT_OTHER_CORE_NO_FWD", + "MSRIndex": "0x1a6,0x1a7", + "MSRValue": "0x0400002000", + "SampleAfterValue": "100007", + "UMask": "0x1" + }, + { + "BriefDescription": "Counts DCU hardware prefetcher data read that miss L2 with a snoop miss response.", + "EventCode": "0xB7", + "EventName": "OFFCORE_RESPONSE.PF_L1_DATA_RD.L2_MISS.SNOOP_MISS", + "MSRIndex": "0x1a6,0x1a7", + "MSRValue": "0x0200002000", + "SampleAfterValue": "100007", + "UMask": "0x1" + }, + { + "BriefDescription": "Counts code reads generated by L2 prefetchers that miss L2.", + "EventCode": "0xB7", + "EventName": "OFFCORE_RESPONSE.PF_L2_CODE_RD.L2_MISS.ANY", + "MSRIndex": "0x1a6,0x1a7", + "MSRValue": "0x1680000040", + "SampleAfterValue": "100007", + "UMask": "0x1" + }, + { + "BriefDescription": "Counts code reads generated by L2 prefetchers that miss L2 and the snoops to sibling cores hit in either E/S state and the line is not forwarded.", + "EventCode": "0xB7", + "EventName": "OFFCORE_RESPONSE.PF_L2_CODE_RD.L2_MISS.HIT_OTHER_CORE_NO_FWD", + "MSRIndex": "0x1a6,0x1a7", + "MSRValue": "0x0400000040", + "SampleAfterValue": "100007", + "UMask": "0x1" + }, + { + "BriefDescription": "Counts code reads generated by L2 prefetchers that miss L2 with a snoop miss response.", + "EventCode": "0xB7", + "EventName": "OFFCORE_RESPONSE.PF_L2_CODE_RD.L2_MISS.SNOOP_MISS", + "MSRIndex": "0x1a6,0x1a7", + "MSRValue": "0x0200000040", + "SampleAfterValue": "100007", + "UMask": "0x1" + }, + { + "BriefDescription": "Counts data cacheline reads generated by L2 prefetchers that miss L2.", + "EventCode": "0xB7", + "EventName": "OFFCORE_RESPONSE.PF_L2_DATA_RD.L2_MISS.ANY", + "MSRIndex": "0x1a6,0x1a7", + "MSRValue": "0x1680000010", + "SampleAfterValue": "100007", + "UMask": "0x1" + }, + { + "BriefDescription": "Counts data cacheline reads generated by L2 prefetchers that hit in the other module where modified copies were found in other core's L1 cache.", + "EventCode": "0xB7", + "EventName": "OFFCORE_RESPONSE.PF_L2_DATA_RD.L2_MISS.HITM_OTHER_CORE", + "MSRIndex": "0x1a6,0x1a7", + "MSRValue": "0x1000000010", + "SampleAfterValue": "100007", + "UMask": "0x1" + }, + { + "BriefDescription": "Counts data cacheline reads generated by L2 prefetchers that miss L2 and the snoops to sibling cores hit in either E/S state 
and the line is not forwarded.", + "EventCode": "0xB7", + "EventName": "OFFCORE_RESPONSE.PF_L2_DATA_RD.L2_MISS.HIT_OTHER_CORE_NO_FWD", + "MSRIndex": "0x1a6,0x1a7", + "MSRValue": "0x0400000010", + "SampleAfterValue": "100007", + "UMask": "0x1" + }, + { + "BriefDescription": "Counts data cacheline reads generated by L2 prefetchers that miss L2 with a snoop miss response.", + "EventCode": "0xB7", + "EventName": "OFFCORE_RESPONSE.PF_L2_DATA_RD.L2_MISS.SNOOP_MISS", + "MSRIndex": "0x1a6,0x1a7", + "MSRValue": "0x0200000010", + "SampleAfterValue": "100007", + "UMask": "0x1" + }, + { + "BriefDescription": "Counts RFO requests generated by L2 prefetchers that miss L2.", + "EventCode": "0xB7", + "EventName": "OFFCORE_RESPONSE.PF_L2_RFO.L2_MISS.ANY", + "MSRIndex": "0x1a6,0x1a7", + "MSRValue": "0x1680000020", + "SampleAfterValue": "100007", + "UMask": "0x1" + }, + { + "BriefDescription": "Counts RFO requests generated by L2 prefetchers that hit in the other module where modified copies were found in other core's L1 cache.", + "EventCode": "0xB7", + "EventName": "OFFCORE_RESPONSE.PF_L2_RFO.L2_MISS.HITM_OTHER_CORE", + "MSRIndex": "0x1a6,0x1a7", + "MSRValue": "0x1000000020", + "SampleAfterValue": "100007", + "UMask": "0x1" + }, + { + "BriefDescription": "Counts RFO requests generated by L2 prefetchers that miss L2 and the snoops to sibling cores hit in either E/S state and the line is not forwarded.", + "EventCode": "0xB7", + "EventName": "OFFCORE_RESPONSE.PF_L2_RFO.L2_MISS.HIT_OTHER_CORE_NO_FWD", + "MSRIndex": "0x1a6,0x1a7", + "MSRValue": "0x0400000020", + "SampleAfterValue": "100007", + "UMask": "0x1" + }, + { + "BriefDescription": "Counts RFO requests generated by L2 prefetchers that miss L2 with a snoop miss response.", + "EventCode": "0xB7", + "EventName": "OFFCORE_RESPONSE.PF_L2_RFO.L2_MISS.SNOOP_MISS", + "MSRIndex": "0x1a6,0x1a7", + "MSRValue": "0x0200000020", + "SampleAfterValue": "100007", + "UMask": "0x1" + }, + { + "BriefDescription": "Counts streaming store that miss L2.", + "EventCode": "0xB7", + "EventName": "OFFCORE_RESPONSE.STREAMING_STORES.L2_MISS.ANY", + "MSRIndex": "0x1a6,0x1a7", + "MSRValue": "0x1680004800", + "SampleAfterValue": "100007", + "UMask": "0x1" + }, + { + "BriefDescription": "Any reissued load uops", + "EventCode": "0x03", + "EventName": "REHABQ.ANY_LD", + "PublicDescription": "This event counts the number of load uops reissued from Rehabq.", + "SampleAfterValue": "200003", + "UMask": "0x40" + }, + { + "BriefDescription": "Any reissued store uops", + "EventCode": "0x03", + "EventName": "REHABQ.ANY_ST", + "PublicDescription": "This event counts the number of store uops reissued from Rehabq.", + "SampleAfterValue": "200003", + "UMask": "0x80" + }, + { + "BriefDescription": "Loads blocked due to store data not ready", + "EventCode": "0x03", + "EventName": "REHABQ.LD_BLOCK_STD_NOTREADY", + "PublicDescription": "This event counts the cases where a forward was technically possible, but did not occur because the store data was not available at the right time.", + "SampleAfterValue": "200003", + "UMask": "0x2" + }, + { + "BriefDescription": "Loads blocked due to store forward restriction", + "EventCode": "0x03", + "EventName": "REHABQ.LD_BLOCK_ST_FORWARD", + "PEBS": "1", + "PublicDescription": "This event counts the number of retired loads that were prohibited from receiving forwarded data from the store because of address mismatch.", + "SampleAfterValue": "200003", + "UMask": "0x1" + }, + { + "BriefDescription": "Load uops that split cache line boundary", + "EventCode": "0x03", + 
"EventName": "REHABQ.LD_SPLITS", + "PEBS": "1", + "PublicDescription": "This event counts the number of retire loads that experienced cache line boundary splits.", + "SampleAfterValue": "200003", + "UMask": "0x8" + }, + { + "BriefDescription": "Uops with lock semantics", + "EventCode": "0x03", + "EventName": "REHABQ.LOCK", + "PublicDescription": "This event counts the number of retired memory operations with lock semantics. These are either implicit locked instructions such as the XCHG instruction or instructions with an explicit LOCK prefix (0xF0).", + "SampleAfterValue": "200003", + "UMask": "0x10" + }, + { + "BriefDescription": "Store address buffer full", + "EventCode": "0x03", + "EventName": "REHABQ.STA_FULL", + "PublicDescription": "This event counts the number of retired stores that are delayed because there is not a store address buffer available.", + "SampleAfterValue": "200003", + "UMask": "0x20" + }, + { + "BriefDescription": "Store uops that split cache line boundary", + "EventCode": "0x03", + "EventName": "REHABQ.ST_SPLITS", + "PublicDescription": "This event counts the number of retire stores that experienced cache line boundary splits.", + "SampleAfterValue": "200003", + "UMask": "0x4" + } +] diff --git a/tools/perf/pmu-events/arch/x86/silvermont/floating-point.json b/tools/perf/pmu-events/arch/x86/silvermont/floating-point.json new file mode 100644 index 000000000..f2b1e8f08 --- /dev/null +++ b/tools/perf/pmu-events/arch/x86/silvermont/floating-point.json @@ -0,0 +1,10 @@ +[ + { + "BriefDescription": "Stalls due to FP assists", + "EventCode": "0xC3", + "EventName": "MACHINE_CLEARS.FP_ASSIST", + "PublicDescription": "This event counts the number of times that pipeline stalled due to FP operations needing assists.", + "SampleAfterValue": "200003", + "UMask": "0x4" + } +] diff --git a/tools/perf/pmu-events/arch/x86/silvermont/frontend.json b/tools/perf/pmu-events/arch/x86/silvermont/frontend.json new file mode 100644 index 000000000..c35da10f7 --- /dev/null +++ b/tools/perf/pmu-events/arch/x86/silvermont/frontend.json @@ -0,0 +1,66 @@ +[ + { + "BriefDescription": "Counts the number of baclears", + "EventCode": "0xE6", + "EventName": "BACLEARS.ALL", + "PublicDescription": "The BACLEARS event counts the number of times the front end is resteered, mainly when the Branch Prediction Unit cannot provide a correct prediction and this is corrected by the Branch Address Calculator at the front end. The BACLEARS.ANY event counts the number of baclears for any type of branch.", + "SampleAfterValue": "200003", + "UMask": "0x1" + }, + { + "BriefDescription": "Counts the number of JCC baclears", + "EventCode": "0xE6", + "EventName": "BACLEARS.COND", + "PublicDescription": "The BACLEARS event counts the number of times the front end is resteered, mainly when the Branch Prediction Unit cannot provide a correct prediction and this is corrected by the Branch Address Calculator at the front end. The BACLEARS.COND event counts the number of JCC (Jump on Condtional Code) baclears.", + "SampleAfterValue": "200003", + "UMask": "0x10" + }, + { + "BriefDescription": "Counts the number of RETURN baclears", + "EventCode": "0xE6", + "EventName": "BACLEARS.RETURN", + "PublicDescription": "The BACLEARS event counts the number of times the front end is resteered, mainly when the Branch Prediction Unit cannot provide a correct prediction and this is corrected by the Branch Address Calculator at the front end. 
The BACLEARS.RETURN event counts the number of RETURN baclears.", + "SampleAfterValue": "200003", + "UMask": "0x8" + }, + { + "BriefDescription": "Counts the number of times a decode restriction reduced the decode throughput due to wrong instruction length prediction", + "EventCode": "0xE9", + "EventName": "DECODE_RESTRICTION.PREDECODE_WRONG", + "PublicDescription": "Counts the number of times a decode restriction reduced the decode throughput due to wrong instruction length prediction.", + "SampleAfterValue": "200003", + "UMask": "0x1" + }, + { + "BriefDescription": "Instruction fetches", + "EventCode": "0x80", + "EventName": "ICACHE.ACCESSES", + "PublicDescription": "This event counts all instruction fetches, not including most uncacheable\r\nfetches.", + "SampleAfterValue": "200003", + "UMask": "0x3" + }, + { + "BriefDescription": "Instruction fetches from Icache", + "EventCode": "0x80", + "EventName": "ICACHE.HIT", + "PublicDescription": "This event counts all instruction fetches from the instruction cache.", + "SampleAfterValue": "200003", + "UMask": "0x1" + }, + { + "BriefDescription": "Icache miss", + "EventCode": "0x80", + "EventName": "ICACHE.MISSES", + "PublicDescription": "This event counts all instruction fetches that miss the Instruction cache or produce memory requests. This includes uncacheable fetches. An instruction fetch miss is counted only once and not once for every cycle it is outstanding.", + "SampleAfterValue": "200003", + "UMask": "0x2" + }, + { + "BriefDescription": "Counts the number of times entered into a ucode flow in the FEC. Includes inserted flows due to front-end detected faults or assists. Speculative count.", + "EventCode": "0xE7", + "EventName": "MS_DECODED.MS_ENTRY", + "PublicDescription": "Counts the number of times the MSROM starts a flow of UOPS. It does not count every time a UOP is read from the microcode ROM. The most common case that this counts is when a micro-coded instruction is encountered by the front end of the machine. Other cases include when an instruction encounters a fault, trap, or microcode assist of any sort. The event will count MSROM startups for UOPS that are speculative, and subsequently cleared by branch mispredict or machine clear. Background: UOPS are produced by two mechanisms. Either they are generated by hardware that decodes instructions into UOPS, or they are delivered by a ROM (called the MSROM) that holds UOPS associated with a specific instruction. MSROM UOPS might also be delivered in response to some condition such as a fault or other exceptional condition. 
This event is an excellent mechanism for detecting instructions that require the use of MSROM instructions.", + "SampleAfterValue": "200003", + "UMask": "0x1" + } +] diff --git a/tools/perf/pmu-events/arch/x86/silvermont/memory.json b/tools/perf/pmu-events/arch/x86/silvermont/memory.json new file mode 100644 index 000000000..15ea45187 --- /dev/null +++ b/tools/perf/pmu-events/arch/x86/silvermont/memory.json @@ -0,0 +1,10 @@ +[ + { + "BriefDescription": "Stalls due to Memory ordering", + "EventCode": "0xC3", + "EventName": "MACHINE_CLEARS.MEMORY_ORDERING", + "PublicDescription": "This event counts the number of times that pipeline was cleared due to memory ordering issues.", + "SampleAfterValue": "200003", + "UMask": "0x2" + } +] diff --git a/tools/perf/pmu-events/arch/x86/silvermont/other.json b/tools/perf/pmu-events/arch/x86/silvermont/other.json new file mode 100644 index 000000000..cff113adb --- /dev/null +++ b/tools/perf/pmu-events/arch/x86/silvermont/other.json @@ -0,0 +1,18 @@ +[ + { + "BriefDescription": "Cycles code-fetch stalled due to any reason.", + "EventCode": "0x86", + "EventName": "FETCH_STALL.ALL", + "PublicDescription": "Counts cycles that fetch is stalled due to any reason. That is, the decoder queue is able to accept bytes, but the fetch unit is unable to provide bytes. This will include cycles due to an ITLB miss, ICache miss and other events.", + "SampleAfterValue": "200003", + "UMask": "0x3f" + }, + { + "BriefDescription": "Cycles code-fetch stalled due to an outstanding ITLB miss.", + "EventCode": "0x86", + "EventName": "FETCH_STALL.ITLB_FILL_PENDING_CYCLES", + "PublicDescription": "Counts cycles that fetch is stalled due to an outstanding ITLB miss. That is, the decoder queue is able to accept bytes, but the fetch unit is unable to provide bytes due to an ITLB miss. Note: this event is not the same as page walk cycles to retrieve an instruction translation.", + "SampleAfterValue": "200003", + "UMask": "0x2" + } +] diff --git a/tools/perf/pmu-events/arch/x86/silvermont/pipeline.json b/tools/perf/pmu-events/arch/x86/silvermont/pipeline.json new file mode 100644 index 000000000..59f6116a7 --- /dev/null +++ b/tools/perf/pmu-events/arch/x86/silvermont/pipeline.json @@ -0,0 +1,281 @@ +[ + { + "BriefDescription": "Counts the number of branch instructions retired...", + "EventCode": "0xC4", + "EventName": "BR_INST_RETIRED.ALL_BRANCHES", + "PEBS": "1", + "PublicDescription": "ALL_BRANCHES counts the number of any branch instructions retired. Branch prediction predicts the branch target and enables the processor to begin executing instructions long before the branch true execution path is known. All branches utilize the branch prediction unit (BPU) for prediction. This unit predicts the target address not only based on the EIP of the branch but also based on the execution path through which execution reached this EIP. The BPU can efficiently predict the following branch types: conditional branches, direct calls and jumps, indirect calls and jumps, returns.", + "SampleAfterValue": "200003" + }, + { + "BriefDescription": "Counts the number of taken branch instructions retired", + "EventCode": "0xC4", + "EventName": "BR_INST_RETIRED.ALL_TAKEN_BRANCHES", + "PEBS": "2", + "PublicDescription": "ALL_TAKEN_BRANCHES counts the number of all taken branch instructions retired. Branch prediction predicts the branch target and enables the processor to begin executing instructions long before the branch true execution path is known. 
All branches utilize the branch prediction unit (BPU) for prediction. This unit predicts the target address not only based on the EIP of the branch but also based on the execution path through which execution reached this EIP. The BPU can efficiently predict the following branch types: conditional branches, direct calls and jumps, indirect calls and jumps, returns.", + "SampleAfterValue": "200003", + "UMask": "0x80" + }, + { + "BriefDescription": "Counts the number of near CALL branch instructions retired", + "EventCode": "0xC4", + "EventName": "BR_INST_RETIRED.CALL", + "PEBS": "1", + "PublicDescription": "CALL counts the number of near CALL branch instructions retired. Branch prediction predicts the branch target and enables the processor to begin executing instructions long before the branch true execution path is known. All branches utilize the branch prediction unit (BPU) for prediction. This unit predicts the target address not only based on the EIP of the branch but also based on the execution path through which execution reached this EIP. The BPU can efficiently predict the following branch types: conditional branches, direct calls and jumps, indirect calls and jumps, returns.", + "SampleAfterValue": "200003", + "UMask": "0xf9" + }, + { + "BriefDescription": "Counts the number of far branch instructions retired", + "EventCode": "0xC4", + "EventName": "BR_INST_RETIRED.FAR_BRANCH", + "PEBS": "1", + "PublicDescription": "FAR counts the number of far branch instructions retired. Branch prediction predicts the branch target and enables the processor to begin executing instructions long before the branch true execution path is known. All branches utilize the branch prediction unit (BPU) for prediction. This unit predicts the target address not only based on the EIP of the branch but also based on the execution path through which execution reached this EIP. The BPU can efficiently predict the following branch types: conditional branches, direct calls and jumps, indirect calls and jumps, returns.", + "SampleAfterValue": "200003", + "UMask": "0xbf" + }, + { + "BriefDescription": "Counts the number of near indirect CALL branch instructions retired", + "EventCode": "0xC4", + "EventName": "BR_INST_RETIRED.IND_CALL", + "PEBS": "1", + "PublicDescription": "IND_CALL counts the number of near indirect CALL branch instructions retired. Branch prediction predicts the branch target and enables the processor to begin executing instructions long before the branch true execution path is known. All branches utilize the branch prediction unit (BPU) for prediction. This unit predicts the target address not only based on the EIP of the branch but also based on the execution path through which execution reached this EIP. The BPU can efficiently predict the following branch types: conditional branches, direct calls and jumps, indirect calls and jumps, returns.", + "SampleAfterValue": "200003", + "UMask": "0xfb" + }, + { + "BriefDescription": "Counts the number of JCC branch instructions retired", + "EventCode": "0xC4", + "EventName": "BR_INST_RETIRED.JCC", + "PEBS": "1", + "PublicDescription": "JCC counts the number of conditional branch (JCC) instructions retired. Branch prediction predicts the branch target and enables the processor to begin executing instructions long before the branch true execution path is known. All branches utilize the branch prediction unit (BPU) for prediction. 
This unit predicts the target address not only based on the EIP of the branch but also based on the execution path through which execution reached this EIP. The BPU can efficiently predict the following branch types: conditional branches, direct calls and jumps, indirect calls and jumps, returns.", + "SampleAfterValue": "200003", + "UMask": "0x7e" + }, + { + "BriefDescription": "Counts the number of near indirect JMP and near indirect CALL branch instructions retired", + "EventCode": "0xC4", + "EventName": "BR_INST_RETIRED.NON_RETURN_IND", + "PEBS": "1", + "PublicDescription": "NON_RETURN_IND counts the number of near indirect JMP and near indirect CALL branch instructions retired. Branch prediction predicts the branch target and enables the processor to begin executing instructions long before the branch true execution path is known. All branches utilize the branch prediction unit (BPU) for prediction. This unit predicts the target address not only based on the EIP of the branch but also based on the execution path through which execution reached this EIP. The BPU can efficiently predict the following branch types: conditional branches, direct calls and jumps, indirect calls and jumps, returns.", + "SampleAfterValue": "200003", + "UMask": "0xeb" + }, + { + "BriefDescription": "Counts the number of near relative CALL branch instructions retired", + "EventCode": "0xC4", + "EventName": "BR_INST_RETIRED.REL_CALL", + "PEBS": "1", + "PublicDescription": "REL_CALL counts the number of near relative CALL branch instructions retired. Branch prediction predicts the branch target and enables the processor to begin executing instructions long before the branch true execution path is known. All branches utilize the branch prediction unit (BPU) for prediction. This unit predicts the target address not only based on the EIP of the branch but also based on the execution path through which execution reached this EIP. The BPU can efficiently predict the following branch types: conditional branches, direct calls and jumps, indirect calls and jumps, returns.", + "SampleAfterValue": "200003", + "UMask": "0xfd" + }, + { + "BriefDescription": "Counts the number of near RET branch instructions retired", + "EventCode": "0xC4", + "EventName": "BR_INST_RETIRED.RETURN", + "PEBS": "1", + "PublicDescription": "RETURN counts the number of near RET branch instructions retired. Branch prediction predicts the branch target and enables the processor to begin executing instructions long before the branch true execution path is known. All branches utilize the branch prediction unit (BPU) for prediction. This unit predicts the target address not only based on the EIP of the branch but also based on the execution path through which execution reached this EIP. The BPU can efficiently predict the following branch types: conditional branches, direct calls and jumps, indirect calls and jumps, returns.", + "SampleAfterValue": "200003", + "UMask": "0xf7" + }, + { + "BriefDescription": "Counts the number of taken JCC branch instructions retired", + "EventCode": "0xC4", + "EventName": "BR_INST_RETIRED.TAKEN_JCC", + "PEBS": "1", + "PublicDescription": "TAKEN_JCC counts the number of taken conditional branch (JCC) instructions retired. Branch prediction predicts the branch target and enables the processor to begin executing instructions long before the branch true execution path is known. All branches utilize the branch prediction unit (BPU) for prediction. 
This unit predicts the target address not only based on the EIP of the branch but also based on the execution path through which execution reached this EIP. The BPU can efficiently predict the following branch types: conditional branches, direct calls and jumps, indirect calls and jumps, returns.", + "SampleAfterValue": "200003", + "UMask": "0xfe" + }, + { + "BriefDescription": "Counts the number of mispredicted branch instructions retired", + "EventCode": "0xC5", + "EventName": "BR_MISP_RETIRED.ALL_BRANCHES", + "PEBS": "1", + "PublicDescription": "ALL_BRANCHES counts the number of any mispredicted branch instructions retired. This umask is an architecturally defined event. This event counts the number of retired branch instructions that were mispredicted by the processor, categorized by type. A branch misprediction occurs when the processor predicts that the branch would be taken, but it is not, or vice-versa. When the misprediction is discovered, all the instructions executed in the wrong (speculative) path must be discarded, and the processor must start fetching from the correct path.", + "SampleAfterValue": "200003" + }, + { + "BriefDescription": "Counts the number of mispredicted near indirect CALL branch instructions retired", + "EventCode": "0xC5", + "EventName": "BR_MISP_RETIRED.IND_CALL", + "PEBS": "1", + "PublicDescription": "IND_CALL counts the number of mispredicted near indirect CALL branch instructions retired. This event counts the number of retired branch instructions that were mispredicted by the processor, categorized by type. A branch misprediction occurs when the processor predicts that the branch would be taken, but it is not, or vice-versa. When the misprediction is discovered, all the instructions executed in the wrong (speculative) path must be discarded, and the processor must start fetching from the correct path.", + "SampleAfterValue": "200003", + "UMask": "0xfb" + }, + { + "BriefDescription": "Counts the number of mispredicted JCC branch instructions retired", + "EventCode": "0xC5", + "EventName": "BR_MISP_RETIRED.JCC", + "PEBS": "1", + "PublicDescription": "JCC counts the number of mispredicted conditional branches (JCC) instructions retired. This event counts the number of retired branch instructions that were mispredicted by the processor, categorized by type. A branch misprediction occurs when the processor predicts that the branch would be taken, but it is not, or vice-versa. When the misprediction is discovered, all the instructions executed in the wrong (speculative) path must be discarded, and the processor must start fetching from the correct path.", + "SampleAfterValue": "200003", + "UMask": "0x7e" + }, + { + "BriefDescription": "Counts the number of mispredicted near indirect JMP and near indirect CALL branch instructions retired", + "EventCode": "0xC5", + "EventName": "BR_MISP_RETIRED.NON_RETURN_IND", + "PEBS": "1", + "PublicDescription": "NON_RETURN_IND counts the number of mispredicted near indirect JMP and near indirect CALL branch instructions retired. This event counts the number of retired branch instructions that were mispredicted by the processor, categorized by type. A branch misprediction occurs when the processor predicts that the branch would be taken, but it is not, or vice-versa. 
When the misprediction is discovered, all the instructions executed in the wrong (speculative) path must be discarded, and the processor must start fetching from the correct path.", + "SampleAfterValue": "200003", + "UMask": "0xeb" + }, + { + "BriefDescription": "Counts the number of mispredicted near RET branch instructions retired", + "EventCode": "0xC5", + "EventName": "BR_MISP_RETIRED.RETURN", + "PEBS": "1", + "PublicDescription": "RETURN counts the number of mispredicted near RET branch instructions retired. This event counts the number of retired branch instructions that were mispredicted by the processor, categorized by type. A branch misprediction occurs when the processor predicts that the branch would be taken, but it is not, or vice-versa. When the misprediction is discovered, all the instructions executed in the wrong (speculative) path must be discarded, and the processor must start fetching from the correct path.", + "SampleAfterValue": "200003", + "UMask": "0xf7" + }, + { + "BriefDescription": "Counts the number of mispredicted taken JCC branch instructions retired", + "EventCode": "0xC5", + "EventName": "BR_MISP_RETIRED.TAKEN_JCC", + "PEBS": "1", + "PublicDescription": "TAKEN_JCC counts the number of mispredicted taken conditional branch (JCC) instructions retired. This event counts the number of retired branch instructions that were mispredicted by the processor, categorized by type. A branch misprediction occurs when the processor predicts that the branch would be taken, but it is not, or vice-versa. When the misprediction is discovered, all the instructions executed in the wrong (speculative) path must be discarded, and the processor must start fetching from the correct path.", + "SampleAfterValue": "200003", + "UMask": "0xfe" + }, + { + "BriefDescription": "Fixed Counter: Counts the number of unhalted core clock cycles", + "EventName": "CPU_CLK_UNHALTED.CORE", + "PublicDescription": "Counts the number of core cycles while the core is not in a halt state. The core enters the halt state when it is running the HLT instruction. This event is a component in many key event ratios. The core frequency may change from time to time. For this reason this event may have a changing ratio with regards to time. In systems with a constant core frequency, this event can give you a measurement of the elapsed time while the core was not in halt state by dividing the event count by the core frequency. This event is architecturally defined and is a designated fixed counter. CPU_CLK_UNHALTED.CORE and CPU_CLK_UNHALTED.CORE_P use the core frequency which may change from time to time. CPU_CLK_UNHALTE.REF_TSC and CPU_CLK_UNHALTED.REF are not affected by core frequency changes but counts as if the core is running at the maximum frequency all the time. The fixed events are CPU_CLK_UNHALTED.CORE and CPU_CLK_UNHALTED.REF_TSC and the programmable events are CPU_CLK_UNHALTED.CORE_P and CPU_CLK_UNHALTED.REF.", + "SampleAfterValue": "2000003", + "UMask": "0x2" + }, + { + "BriefDescription": "Core cycles when core is not halted", + "EventCode": "0x3C", + "EventName": "CPU_CLK_UNHALTED.CORE_P", + "PublicDescription": "This event counts the number of core cycles while the core is not in a halt state. The core enters the halt state when it is running the HLT instruction. In mobile systems the core frequency may change from time to time. 
For this reason this event may have a changing ratio with regard to time.", + "SampleAfterValue": "2000003" + }, + { + "BriefDescription": "Reference cycles when core is not halted", + "EventCode": "0x3C", + "EventName": "CPU_CLK_UNHALTED.REF", + "PublicDescription": "This event counts the number of reference cycles that the core is not in a halt state. The core enters the halt state when it is running the HLT instruction. In mobile systems the core frequency may change from time to time. This event is not affected by core frequency changes but counts as if the core is running at the maximum frequency all the time.", + "SampleAfterValue": "2000003", + "UMask": "0x1" + }, + { + "BriefDescription": "Fixed Counter: Counts the number of unhalted reference clock cycles", + "EventName": "CPU_CLK_UNHALTED.REF_TSC", + "PublicDescription": "Counts the number of reference cycles while the core is not in a halt state. The core enters the halt state when it is running the HLT instruction. This event is a component in many key event ratios. The core frequency may change from time to time. This event is not affected by core frequency changes but counts as if the core is running at the maximum frequency all the time. Divide this event count by core frequency to determine the elapsed time while the core was not in halt state. This event is architecturally defined and is a designated fixed counter. CPU_CLK_UNHALTED.CORE and CPU_CLK_UNHALTED.CORE_P use the core frequency which may change from time to time. CPU_CLK_UNHALTED.REF_TSC and CPU_CLK_UNHALTED.REF are not affected by core frequency changes but count as if the core is running at the maximum frequency all the time. The fixed events are CPU_CLK_UNHALTED.CORE and CPU_CLK_UNHALTED.REF_TSC and the programmable events are CPU_CLK_UNHALTED.CORE_P and CPU_CLK_UNHALTED.REF.", + "SampleAfterValue": "2000003", + "UMask": "0x3" + }, + { + "BriefDescription": "Cycles the divider is busy. Does not imply a stall waiting for the divider.", + "EventCode": "0xCD", + "EventName": "CYCLES_DIV_BUSY.ALL", + "PublicDescription": "Cycles the divider is busy. This event counts the cycles when the divide unit is unable to accept a new divide UOP because it is busy processing a previously dispatched UOP. The cycles will be counted irrespective of whether or not another divide UOP is waiting to enter the divide unit (from the RS). This event might count cycles while a divide is in progress even if the RS is empty. The divide instruction is one of the longest latency instructions in the machine. Hence, it has a special event associated with it to help determine if divides are delaying the retirement of instructions.", + "SampleAfterValue": "2000003", + "UMask": "0x1" + }, + { + "BriefDescription": "Fixed Counter: Counts the number of instructions retired", + "EventName": "INST_RETIRED.ANY", + "PublicDescription": "This event counts the number of instructions that retire. For instructions that consist of multiple micro-ops, this event counts exactly once, as the last micro-op of the instruction retires. The event continues counting while instructions retire, including during interrupt service routines caused by hardware interrupts, faults or traps. Background: Modern microprocessors employ extensive pipelining and speculative techniques. Since sometimes an instruction is started but never completed, the notion of \"retirement\" is introduced. 
A retired instruction is one that commits its state. Or, stated differently, an instruction might be abandoned at some point. No instruction is truly finished until it retires. This counter measures the number of completed instructions. The fixed event is INST_RETIRED.ANY and the programmable event is INST_RETIRED.ANY_P.", + "SampleAfterValue": "2000003", + "UMask": "0x1" + }, + { + "BriefDescription": "Instructions retired", + "EventCode": "0xC0", + "EventName": "INST_RETIRED.ANY_P", + "PublicDescription": "This event counts the number of instructions that retire execution. For instructions that consist of multiple micro-ops, this event counts the retirement of the last micro-op of the instruction. The counter continues counting during hardware interrupts, traps, and inside interrupt handlers.", + "SampleAfterValue": "2000003" + }, + { + "BriefDescription": "Counts all machine clears", + "EventCode": "0xC3", + "EventName": "MACHINE_CLEARS.ALL", + "PublicDescription": "Machine clears happen when something happens in the machine that causes the hardware to need to take special care to get the right answer. When such a condition is signaled on an instruction, the front end of the machine is notified that it must restart, so no more instructions will be decoded from the current path. All instructions \"older\" than this one will be allowed to finish. This instruction and all \"younger\" instructions must be cleared, since they must not be allowed to complete. Essentially, the hardware waits until the problematic instruction is the oldest instruction in the machine. This means all older instructions are retired, and all pending stores (from older instructions) are completed. Then the new path of instructions from the front end is allowed to start into the machine. There are many conditions that might cause a machine clear (including the receipt of an interrupt, or a trap or a fault). All those conditions (including but not limited to MACHINE_CLEARS.MEMORY_ORDERING, MACHINE_CLEARS.SMC, and MACHINE_CLEARS.FP_ASSIST) are captured in the ANY event. In addition, some conditions can be specifically counted (i.e. SMC, MEMORY_ORDERING, FP_ASSIST). However, the sum of SMC, MEMORY_ORDERING, and FP_ASSIST machine clears will not necessarily equal the number of ANY.", + "SampleAfterValue": "200003", + "UMask": "0x8" + }, + { + "BriefDescription": "Self-Modifying Code detected", + "EventCode": "0xC3", + "EventName": "MACHINE_CLEARS.SMC", + "PublicDescription": "This event counts the number of times that a program writes to a code section. Self-modifying code causes a severe penalty in all Intel architecture processors.", + "SampleAfterValue": "200003", + "UMask": "0x1" + }, + { + "BriefDescription": "Counts the number of cycles when no uops are allocated for any reason.", + "EventCode": "0xCA", + "EventName": "NO_ALLOC_CYCLES.ALL", + "PublicDescription": "The NO_ALLOC_CYCLES.ALL event counts the number of cycles when the front-end does not provide any instructions to be allocated for any reason. This event indicates the cycles where an allocation stall occurs, and no UOPS are allocated in that cycle.", + "SampleAfterValue": "200003", + "UMask": "0x3f" + }, + { + "BriefDescription": "Counts the number of cycles when no uops are allocated and the alloc pipe is stalled waiting for a mispredicted jump to retire. 
After the misprediction is detected, the front end will start immediately but the allocate pipe stalls until the mispredicted jump retires", + "EventCode": "0xCA", + "EventName": "NO_ALLOC_CYCLES.MISPREDICTS", + "PublicDescription": "Counts the number of cycles when no uops are allocated and the alloc pipe is stalled waiting for a mispredicted jump to retire. After the misprediction is detected, the front end will start immediately but the allocate pipe stalls until the mispredicted jump retires.", + "SampleAfterValue": "200003", + "UMask": "0x4" + }, + { + "BriefDescription": "Counts the number of cycles when no uops are allocated, the IQ is empty, and no other condition is blocking allocation.", + "EventCode": "0xCA", + "EventName": "NO_ALLOC_CYCLES.NOT_DELIVERED", + "PublicDescription": "The NO_ALLOC_CYCLES.NOT_DELIVERED event is used to measure front-end inefficiencies, i.e. when the front-end of the machine is not delivering micro-ops to the back-end and the back-end is not stalled. This event can be used to identify if the machine is truly front-end bound. When this event occurs, it is an indication that the front-end of the machine is operating at less than its theoretical peak performance. Background: We can think of the processor pipeline as being divided into 2 broader parts: Front-end and Back-end. The front-end is responsible for fetching the instruction, decoding it into micro-ops (uops) in a machine-understandable format and putting them into a micro-op queue to be consumed by the back-end. The back-end then takes these micro-ops and allocates the required resources. When all resources are ready, micro-ops are executed. If the back-end is not ready to accept micro-ops from the front-end, then we do not want to count these as front-end bottlenecks. However, whenever we have bottlenecks in the back-end, we will have allocation unit stalls that eventually force the front-end to wait until the back-end is ready to receive more UOPS. This event counts the cycles only when the back-end is requesting more uops and the front-end is not able to provide them. Some examples of conditions that cause front-end inefficiencies are: Icache misses, ITLB misses, and decoder restrictions that limit the front-end bandwidth.", + "SampleAfterValue": "200003", + "UMask": "0x50" + }, + { + "BriefDescription": "Counts the number of cycles when no uops are allocated and a RAT stall is asserted.", + "EventCode": "0xCA", + "EventName": "NO_ALLOC_CYCLES.RAT_STALL", + "SampleAfterValue": "200003", + "UMask": "0x20" + }, + { + "BriefDescription": "Counts the number of cycles when no uops are allocated and the ROB is full (less than 2 entries available)", + "EventCode": "0xCA", + "EventName": "NO_ALLOC_CYCLES.ROB_FULL", + "PublicDescription": "Counts the number of cycles when no uops are allocated and the ROB is full (less than 2 entries available).", + "SampleAfterValue": "200003", + "UMask": "0x1" + }, + { + "BriefDescription": "Counts the number of cycles the Alloc pipeline is stalled when any one of the RSs (IEC, FPC and MEC) is full. This event is a superset of all the individual RS stall event counts.", + "EventCode": "0xCB", + "EventName": "RS_FULL_STALL.ALL", + "SampleAfterValue": "200003", + "UMask": "0x1f" + }, + { + "BriefDescription": "Counts the number of cycles the allocation pipeline is stalled and is waiting for a free MEC reservation station entry. The cycles should be appropriately counted in case of cracked ops, e.g. 
in case of a cracked load-op, the load portion is sent to the MEC", + "EventCode": "0xCB", + "EventName": "RS_FULL_STALL.MEC", + "PublicDescription": "Counts the number of cycles the allocation pipeline is stalled and is waiting for a free MEC reservation station entry. The cycles should be appropriately counted in case of cracked ops, e.g. in case of a cracked load-op, the load portion is sent to the MEC.", + "SampleAfterValue": "200003", + "UMask": "0x1" + }, + { + "BriefDescription": "Micro-ops retired", + "EventCode": "0xC2", + "EventName": "UOPS_RETIRED.ALL", + "PublicDescription": "This event counts the number of micro-ops retired. The processor decodes complex macro instructions into a sequence of simpler micro-ops. Most instructions are composed of one or two micro-ops. Some instructions are decoded into longer sequences such as repeat instructions, floating point transcendental instructions, and assists. In some cases micro-op sequences are fused or whole instructions are fused into one micro-op. See other UOPS_RETIRED events for differentiating retired fused and non-fused micro-ops.", + "SampleAfterValue": "2000003", + "UMask": "0x10" + }, + { + "BriefDescription": "MSROM micro-ops retired", + "EventCode": "0xC2", + "EventName": "UOPS_RETIRED.MS", + "PublicDescription": "This event counts the number of micro-ops retired that were supplied from MSROM.", + "SampleAfterValue": "2000003", + "UMask": "0x1" + } +] diff --git a/tools/perf/pmu-events/arch/x86/silvermont/virtual-memory.json b/tools/perf/pmu-events/arch/x86/silvermont/virtual-memory.json new file mode 100644 index 000000000..1be3fa5c4 --- /dev/null +++ b/tools/perf/pmu-events/arch/x86/silvermont/virtual-memory.json @@ -0,0 +1,62 @@ +[ + { + "BriefDescription": "Loads missed DTLB", + "EventCode": "0x04", + "EventName": "MEM_UOPS_RETIRED.DTLB_MISS_LOADS", + "PEBS": "1", + "PublicDescription": "This event counts the number of load ops retired that had a DTLB miss.", + "SampleAfterValue": "200003", + "UMask": "0x8" + }, + { + "BriefDescription": "Total cycles for all the page walks (I-side and D-side)", + "EventCode": "0x05", + "EventName": "PAGE_WALKS.CYCLES", + "PublicDescription": "This event counts every cycle when a data (D) page walk or instruction (I) page walk is in progress. Since a pagewalk implies a TLB miss, the approximate cost of a TLB miss can be determined from this event.", + "SampleAfterValue": "200003", + "UMask": "0x3" + }, + { + "BriefDescription": "Duration of D-side page-walks in core cycles", + "EventCode": "0x05", + "EventName": "PAGE_WALKS.D_SIDE_CYCLES", + "PublicDescription": "This event counts every cycle when a D-side (walks due to a load) page walk is in progress. Page walk duration divided by number of page walks is the average duration of page-walks.", + "SampleAfterValue": "200003", + "UMask": "0x1" + }, + { + "BriefDescription": "D-side page-walks", + "EdgeDetect": "1", + "EventCode": "0x05", + "EventName": "PAGE_WALKS.D_SIDE_WALKS", + "PublicDescription": "This event counts when a data (D) page walk is completed or started. Since a page walk implies a TLB miss, the number of TLB misses can be counted by counting the number of pagewalks.", + "SampleAfterValue": "100003", + "UMask": "0x1" + }, + { + "BriefDescription": "Duration of I-side page-walks in core cycles", + "EventCode": "0x05", + "EventName": "PAGE_WALKS.I_SIDE_CYCLES", + "PublicDescription": "This event counts every cycle when an I-side (walks due to an instruction fetch) page walk is in progress. 
Page walk duration divided by number of page walks is the average duration of page-walks.", + "SampleAfterValue": "200003", + "UMask": "0x2" + }, + { + "BriefDescription": "I-side page-walks", + "EdgeDetect": "1", + "EventCode": "0x05", + "EventName": "PAGE_WALKS.I_SIDE_WALKS", + "PublicDescription": "This event counts when an instruction (I) page walk is completed or started. Since a page walk implies a TLB miss, the number of TLB misses can be counted by counting the number of pagewalks.", + "SampleAfterValue": "100003", + "UMask": "0x2" + }, + { + "BriefDescription": "Total page walks that are completed (I-side and D-side)", + "EdgeDetect": "1", + "EventCode": "0x05", + "EventName": "PAGE_WALKS.WALKS", + "PublicDescription": "This event counts when a data (D) page walk or an instruction (I) page walk is completed or started. Since a page walk implies a TLB miss, the number of TLB misses can be counted by counting the number of pagewalks.", + "SampleAfterValue": "100003", + "UMask": "0x3" + } +]
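
These JSON files are the data perf uses to build human-readable event aliases for Silvermont, but the encodings they document can also be programmed directly. The following is a minimal sketch, not part of the commit above, showing how PAGE_WALKS.CYCLES (EventCode 0x05, UMask 0x3, as listed in virtual-memory.json) could be counted for a single process through the raw perf_event_open() interface; the file name and the buffer-touching workload are placeholders chosen here only to generate some TLB misses. On a Silvermont machine with these aliases built into perf, something like "perf stat -e page_walks.cycles -- <command>" should select the same event.

/* pagewalk_cycles.c: count PAGE_WALKS.CYCLES (event 0x05, umask 0x3) for this
 * process via perf_event_open(). Build with: gcc -O2 -o pagewalk_cycles pagewalk_cycles.c
 */
#include <linux/perf_event.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/ioctl.h>
#include <sys/syscall.h>
#include <unistd.h>

static long perf_event_open(struct perf_event_attr *attr, pid_t pid,
			    int cpu, int group_fd, unsigned long flags)
{
	return syscall(SYS_perf_event_open, attr, pid, cpu, group_fd, flags);
}

int main(void)
{
	struct perf_event_attr attr;
	uint64_t count = 0;
	int fd;

	memset(&attr, 0, sizeof(attr));
	attr.type = PERF_TYPE_RAW;
	attr.size = sizeof(attr);
	/* Intel raw encoding: unit mask in bits 15:8, event select in bits 7:0. */
	attr.config = (0x3ULL << 8) | 0x05;	/* PAGE_WALKS.CYCLES */
	attr.disabled = 1;
	attr.exclude_kernel = 1;
	attr.exclude_hv = 1;

	fd = perf_event_open(&attr, 0 /* this process */, -1 /* any CPU */, -1, 0);
	if (fd < 0) {
		perror("perf_event_open");
		return 1;
	}

	ioctl(fd, PERF_EVENT_IOC_RESET, 0);
	ioctl(fd, PERF_EVENT_IOC_ENABLE, 0);

	/* Placeholder workload: touch one byte per page of a large buffer. */
	{
		size_t len = 64UL << 20, i;
		volatile char *buf = malloc(len);

		if (buf)
			for (i = 0; i < len; i += 4096)
				buf[i] = 1;
		free((void *)buf);
	}

	ioctl(fd, PERF_EVENT_IOC_DISABLE, 0);
	if (read(fd, &count, sizeof(count)) != sizeof(count)) {
		perror("read");
		close(fd);
		return 1;
	}
	printf("PAGE_WALKS.CYCLES: %llu\n", (unsigned long long)count);
	close(fd);
	return 0;
}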