diff options
author | 2023-02-21 18:24:12 -0800 | |
---|---|---|
committer | 2023-02-21 18:24:12 -0800 | |
commit | 5b7c4cabbb65f5c469464da6c5f614cbd7f730f2 (patch) | |
tree | cc5c2d0a898769fd59549594fedb3ee6f84e59a0 /drivers/media/platform/amphion/vpu_v4l2.c | |
download | linux-5b7c4cabbb65f5c469464da6c5f614cbd7f730f2.tar.gz linux-5b7c4cabbb65f5c469464da6c5f614cbd7f730f2.zip |
Merge tag 'net-next-6.3' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-nextgrafted
Pull networking updates from Jakub Kicinski:
"Core:
- Add dedicated kmem_cache for typical/small skb->head, avoid having
to access struct page at kfree time, and improve memory use.
- Introduce sysctl to set default RPS configuration for new netdevs.
- Define Netlink protocol specification format which can be used to
describe messages used by each family and auto-generate parsers.
Add tools for generating kernel data structures and uAPI headers.
- Expose all net/core sysctls inside netns.
- Remove 4s sleep in netpoll if carrier is instantly detected on
boot.
- Add configurable limit of MDB entries per port, and port-vlan.
- Continue populating drop reasons throughout the stack.
- Retire a handful of legacy Qdiscs and classifiers.
Protocols:
- Support IPv4 big TCP (TSO frames larger than 64kB).
- Add IP_LOCAL_PORT_RANGE socket option, to control local port range
on socket by socket basis.
- Track and report in procfs number of MPTCP sockets used.
- Support mixing IPv4 and IPv6 flows in the in-kernel MPTCP path
manager.
- IPv6: don't check net.ipv6.route.max_size and rely on garbage
collection to free memory (similarly to IPv4).
- Support Penultimate Segment Pop (PSP) flavor in SRv6 (RFC8986).
- ICMP: add per-rate limit counters.
- Add support for user scanning requests in ieee802154.
- Remove static WEP support.
- Support minimal Wi-Fi 7 Extremely High Throughput (EHT) rate
reporting.
- WiFi 7 EHT channel puncturing support (client & AP).
BPF:
- Add a rbtree data structure following the "next-gen data structure"
precedent set by recently added linked list, that is, by using
kfunc + kptr instead of adding a new BPF map type.
- Expose XDP hints via kfuncs with initial support for RX hash and
timestamp metadata.
- Add BPF_F_NO_TUNNEL_KEY extension to bpf_skb_set_tunnel_key to
better support decap on GRE tunnel devices not operating in collect
metadata.
- Improve x86 JIT's codegen for PROBE_MEM runtime error checks.
- Remove the need for trace_printk_lock for bpf_trace_printk and
bpf_trace_vprintk helpers.
- Extend libbpf's bpf_tracing.h support for tracing arguments of
kprobes/uprobes and syscall as a special case.
- Significantly reduce the search time for module symbols by
livepatch and BPF.
- Enable cpumasks to be used as kptrs, which is useful for tracing
programs tracking which tasks end up running on which CPUs in
different time intervals.
- Add support for BPF trampoline on s390x and riscv64.
- Add capability to export the XDP features supported by the NIC.
- Add __bpf_kfunc tag for marking kernel functions as kfuncs.
- Add cgroup.memory=nobpf kernel parameter option to disable BPF
memory accounting for container environments.
Netfilter:
- Remove the CLUSTERIP target. It has been marked as obsolete for
years, and we still have WARN splats wrt races of the out-of-band
/proc interface installed by this target.
- Add 'destroy' commands to nf_tables. They are identical to the
existing 'delete' commands, but do not return an error if the
referenced object (set, chain, rule...) did not exist.
Driver API:
- Improve cpumask_local_spread() locality to help NICs set the right
IRQ affinity on AMD platforms.
- Separate C22 and C45 MDIO bus transactions more clearly.
- Introduce new DCB table to control DSCP rewrite on egress.
- Support configuration of Physical Layer Collision Avoidance (PLCA)
Reconciliation Sublayer (RS) (802.3cg-2019). Modern version of
shared medium Ethernet.
- Support for MAC Merge layer (IEEE 802.3-2018 clause 99). Allowing
preemption of low priority frames by high priority frames.
- Add support for controlling MACSec offload using netlink SET.
- Rework devlink instance refcounts to allow registration and
de-registration under the instance lock. Split the code into
multiple files, drop some of the unnecessarily granular locks and
factor out common parts of netlink operation handling.
- Add TX frame aggregation parameters (for USB drivers).
- Add a new attr TCA_EXT_WARN_MSG to report TC (offload) warning
messages with notifications for debug.
- Allow offloading of UDP NEW connections via act_ct.
- Add support for per action HW stats in TC.
- Support hardware miss to TC action (continue processing in SW from
a specific point in the action chain).
- Warn if old Wireless Extension user space interface is used with
modern cfg80211/mac80211 drivers. Do not support Wireless
Extensions for Wi-Fi 7 devices at all. Everyone should switch to
using nl80211 interface instead.
- Improve the CAN bit timing configuration. Use extack to return
error messages directly to user space, update the SJW handling,
including the definition of a new default value that will benefit
CAN-FD controllers, by increasing their oscillator tolerance.
New hardware / drivers:
- Ethernet:
- nVidia BlueField-3 support (control traffic driver)
- Ethernet support for imx93 SoCs
- Motorcomm yt8531 gigabit Ethernet PHY
- onsemi NCN26000 10BASE-T1S PHY (with support for PLCA)
- Microchip LAN8841 PHY (incl. cable diagnostics and PTP)
- Amlogic gxl MDIO mux
- WiFi:
- RealTek RTL8188EU (rtl8xxxu)
- Qualcomm Wi-Fi 7 devices (ath12k)
- CAN:
- Renesas R-Car V4H
Drivers:
- Bluetooth:
- Set Per Platform Antenna Gain (PPAG) for Intel controllers.
- Ethernet NICs:
- Intel (1G, igc):
- support TSN / Qbv / packet scheduling features of i226 model
- Intel (100G, ice):
- use GNSS subsystem instead of TTY
- multi-buffer XDP support
- extend support for GPIO pins to E823 devices
- nVidia/Mellanox:
- update the shared buffer configuration on PFC commands
- implement PTP adjphase function for HW offset control
- TC support for Geneve and GRE with VF tunnel offload
- more efficient crypto key management method
- multi-port eswitch support
- Netronome/Corigine:
- add DCB IEEE support
- support IPsec offloading for NFP3800
- Freescale/NXP (enetc):
- support XDP_REDIRECT for XDP non-linear buffers
- improve reconfig, avoid link flap and waiting for idle
- support MAC Merge layer
- Other NICs:
- sfc/ef100: add basic devlink support for ef100
- ionic: rx_push mode operation (writing descriptors via MMIO)
- bnxt: use the auxiliary bus abstraction for RDMA
- r8169: disable ASPM and reset bus in case of tx timeout
- cpsw: support QSGMII mode for J721e CPSW9G
- cpts: support pulse-per-second output
- ngbe: add an mdio bus driver
- usbnet: optimize usbnet_bh() by avoiding unnecessary queuing
- r8152: handle devices with FW with NCM support
- amd-xgbe: support 10Mbps, 2.5GbE speeds and rx-adaptation
- virtio-net: support multi buffer XDP
- virtio/vsock: replace virtio_vsock_pkt with sk_buff
- tsnep: XDP support
- Ethernet high-speed switches:
- nVidia/Mellanox (mlxsw):
- add support for latency TLV (in FW control messages)
- Microchip (sparx5):
- separate explicit and implicit traffic forwarding rules, make
the implicit rules always active
- add support for egress DSCP rewrite
- IS0 VCAP support (Ingress Classification)
- IS2 VCAP filters (protos, L3 addrs, L4 ports, flags, ToS
etc.)
- ES2 VCAP support (Egress Access Control)
- support for Per-Stream Filtering and Policing (802.1Q,
8.6.5.1)
- Ethernet embedded switches:
- Marvell (mv88e6xxx):
- add MAB (port auth) offload support
- enable PTP receive for mv88e6390
- NXP (ocelot):
- support MAC Merge layer
- support for the the vsc7512 internal copper phys
- Microchip:
- lan9303: convert to PHYLINK
- lan966x: support TC flower filter statistics
- lan937x: PTP support for KSZ9563/KSZ8563 and LAN937x
- lan937x: support Credit Based Shaper configuration
- ksz9477: support Energy Efficient Ethernet
- other:
- qca8k: convert to regmap read/write API, use bulk operations
- rswitch: Improve TX timestamp accuracy
- Intel WiFi (iwlwifi):
- EHT (Wi-Fi 7) rate reporting
- STEP equalizer support: transfer some STEP (connection to radio
on platforms with integrated wifi) related parameters from the
BIOS to the firmware.
- Qualcomm 802.11ax WiFi (ath11k):
- IPQ5018 support
- Fine Timing Measurement (FTM) responder role support
- channel 177 support
- MediaTek WiFi (mt76):
- per-PHY LED support
- mt7996: EHT (Wi-Fi 7) support
- Wireless Ethernet Dispatch (WED) reset support
- switch to using page pool allocator
- RealTek WiFi (rtw89):
- support new version of Bluetooth co-existance
- Mobile:
- rmnet: support TX aggregation"
* tag 'net-next-6.3' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (1872 commits)
page_pool: add a comment explaining the fragment counter usage
net: ethtool: fix __ethtool_dev_mm_supported() implementation
ethtool: pse-pd: Fix double word in comments
xsk: add linux/vmalloc.h to xsk.c
sefltests: netdevsim: wait for devlink instance after netns removal
selftest: fib_tests: Always cleanup before exit
net/mlx5e: Align IPsec ASO result memory to be as required by hardware
net/mlx5e: TC, Set CT miss to the specific ct action instance
net/mlx5e: Rename CHAIN_TO_REG to MAPPED_OBJ_TO_REG
net/mlx5: Refactor tc miss handling to a single function
net/mlx5: Kconfig: Make tc offload depend on tc skb extension
net/sched: flower: Support hardware miss to tc action
net/sched: flower: Move filter handle initialization earlier
net/sched: cls_api: Support hardware miss to tc action
net/sched: Rename user cookie and act cookie
sfc: fix builds without CONFIG_RTC_LIB
sfc: clean up some inconsistent indentings
net/mlx4_en: Introduce flexible array to silence overflow warning
net: lan966x: Fix possible deadlock inside PTP
net/ulp: Remove redundant ->clone() test in inet_clone_ulp().
...
Diffstat (limited to 'drivers/media/platform/amphion/vpu_v4l2.c')
-rw-r--r-- | drivers/media/platform/amphion/vpu_v4l2.c | 857 |
1 files changed, 857 insertions, 0 deletions
diff --git a/drivers/media/platform/amphion/vpu_v4l2.c b/drivers/media/platform/amphion/vpu_v4l2.c new file mode 100644 index 000000000..6773b8855 --- /dev/null +++ b/drivers/media/platform/amphion/vpu_v4l2.c @@ -0,0 +1,857 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * Copyright 2020-2021 NXP + */ + +#include <linux/init.h> +#include <linux/interconnect.h> +#include <linux/ioctl.h> +#include <linux/list.h> +#include <linux/kernel.h> +#include <linux/module.h> +#include <linux/pm_runtime.h> +#include <linux/videodev2.h> +#include <media/v4l2-device.h> +#include <media/v4l2-event.h> +#include <media/v4l2-mem2mem.h> +#include <media/v4l2-ioctl.h> +#include <media/videobuf2-v4l2.h> +#include <media/videobuf2-dma-contig.h> +#include <media/videobuf2-vmalloc.h> +#include "vpu.h" +#include "vpu_core.h" +#include "vpu_v4l2.h" +#include "vpu_msgs.h" +#include "vpu_helpers.h" + +void vpu_inst_lock(struct vpu_inst *inst) +{ + mutex_lock(&inst->lock); +} + +void vpu_inst_unlock(struct vpu_inst *inst) +{ + mutex_unlock(&inst->lock); +} + +dma_addr_t vpu_get_vb_phy_addr(struct vb2_buffer *vb, u32 plane_no) +{ + if (plane_no >= vb->num_planes) + return 0; + return vb2_dma_contig_plane_dma_addr(vb, plane_no) + + vb->planes[plane_no].data_offset; +} + +unsigned int vpu_get_vb_length(struct vb2_buffer *vb, u32 plane_no) +{ + if (plane_no >= vb->num_planes) + return 0; + return vb2_plane_size(vb, plane_no) - vb->planes[plane_no].data_offset; +} + +void vpu_set_buffer_state(struct vb2_v4l2_buffer *vbuf, unsigned int state) +{ + struct vpu_vb2_buffer *vpu_buf = to_vpu_vb2_buffer(vbuf); + + vpu_buf->state = state; +} + +unsigned int vpu_get_buffer_state(struct vb2_v4l2_buffer *vbuf) +{ + struct vpu_vb2_buffer *vpu_buf = to_vpu_vb2_buffer(vbuf); + + return vpu_buf->state; +} + +void vpu_v4l2_set_error(struct vpu_inst *inst) +{ + vpu_inst_lock(inst); + dev_err(inst->dev, "some error occurs in codec\n"); + if (inst->fh.m2m_ctx) { + vb2_queue_error(v4l2_m2m_get_src_vq(inst->fh.m2m_ctx)); + vb2_queue_error(v4l2_m2m_get_dst_vq(inst->fh.m2m_ctx)); + } + vpu_inst_unlock(inst); +} + +int vpu_notify_eos(struct vpu_inst *inst) +{ + static const struct v4l2_event ev = { + .id = 0, + .type = V4L2_EVENT_EOS + }; + + vpu_trace(inst->dev, "[%d]\n", inst->id); + v4l2_event_queue_fh(&inst->fh, &ev); + + return 0; +} + +int vpu_notify_source_change(struct vpu_inst *inst) +{ + static const struct v4l2_event ev = { + .id = 0, + .type = V4L2_EVENT_SOURCE_CHANGE, + .u.src_change.changes = V4L2_EVENT_SRC_CH_RESOLUTION + }; + + vpu_trace(inst->dev, "[%d]\n", inst->id); + v4l2_event_queue_fh(&inst->fh, &ev); + return 0; +} + +int vpu_set_last_buffer_dequeued(struct vpu_inst *inst) +{ + struct vb2_queue *q; + + if (!inst || !inst->fh.m2m_ctx) + return -EINVAL; + + q = v4l2_m2m_get_dst_vq(inst->fh.m2m_ctx); + if (!list_empty(&q->done_list)) + return -EINVAL; + + if (q->last_buffer_dequeued) + return 0; + vpu_trace(inst->dev, "last buffer dequeued\n"); + q->last_buffer_dequeued = true; + wake_up(&q->done_wq); + vpu_notify_eos(inst); + return 0; +} + +bool vpu_is_source_empty(struct vpu_inst *inst) +{ + struct v4l2_m2m_buffer *buf = NULL; + + if (!inst->fh.m2m_ctx) + return true; + v4l2_m2m_for_each_src_buf(inst->fh.m2m_ctx, buf) { + if (vpu_get_buffer_state(&buf->vb) == VPU_BUF_STATE_IDLE) + return false; + } + return true; +} + +static int vpu_init_format(struct vpu_inst *inst, struct vpu_format *fmt) +{ + const struct vpu_format *info; + + info = vpu_helper_find_format(inst, fmt->type, fmt->pixfmt); + if (!info) { + info = vpu_helper_enum_format(inst, fmt->type, 0); + if (!info) + return -EINVAL; + } + memcpy(fmt, info, sizeof(*fmt)); + + return 0; +} + +static int vpu_calc_fmt_bytesperline(struct v4l2_format *f, struct vpu_format *fmt) +{ + struct v4l2_pix_format_mplane *pixmp = &f->fmt.pix_mp; + int i; + + if (fmt->flags & V4L2_FMT_FLAG_COMPRESSED) { + for (i = 0; i < fmt->comp_planes; i++) + fmt->bytesperline[i] = 0; + return 0; + } + if (pixmp->num_planes == fmt->comp_planes) { + for (i = 0; i < fmt->comp_planes; i++) + fmt->bytesperline[i] = pixmp->plane_fmt[i].bytesperline; + return 0; + } + if (pixmp->num_planes > 1) + return -EINVAL; + + /*amphion vpu only support nv12 and nv12 tiled, + * so the bytesperline of luma and chroma should be same + */ + for (i = 0; i < fmt->comp_planes; i++) + fmt->bytesperline[i] = pixmp->plane_fmt[0].bytesperline; + + return 0; +} + +static int vpu_calc_fmt_sizeimage(struct vpu_inst *inst, struct vpu_format *fmt) +{ + u32 stride = 1; + int i; + + if (!(fmt->flags & V4L2_FMT_FLAG_COMPRESSED)) { + const struct vpu_core_resources *res = vpu_get_resource(inst); + + if (res) + stride = res->stride; + } + + for (i = 0; i < fmt->comp_planes; i++) { + fmt->sizeimage[i] = vpu_helper_get_plane_size(fmt->pixfmt, + fmt->width, + fmt->height, + i, + stride, + fmt->field != V4L2_FIELD_NONE ? 1 : 0, + &fmt->bytesperline[i]); + fmt->sizeimage[i] = max_t(u32, fmt->sizeimage[i], PAGE_SIZE); + if (fmt->flags & V4L2_FMT_FLAG_COMPRESSED) { + fmt->sizeimage[i] = clamp_val(fmt->sizeimage[i], SZ_128K, SZ_8M); + fmt->bytesperline[i] = 0; + } + } + + return 0; +} + +u32 vpu_get_fmt_plane_size(struct vpu_format *fmt, u32 plane_no) +{ + u32 size; + int i; + + if (plane_no >= fmt->mem_planes) + return 0; + + if (fmt->comp_planes == fmt->mem_planes) + return fmt->sizeimage[plane_no]; + if (plane_no < fmt->mem_planes - 1) + return fmt->sizeimage[plane_no]; + + size = fmt->sizeimage[plane_no]; + for (i = fmt->mem_planes; i < fmt->comp_planes; i++) + size += fmt->sizeimage[i]; + + return size; +} + +int vpu_try_fmt_common(struct vpu_inst *inst, struct v4l2_format *f, struct vpu_format *fmt) +{ + struct v4l2_pix_format_mplane *pixmp = &f->fmt.pix_mp; + int i; + int ret; + + fmt->pixfmt = pixmp->pixelformat; + fmt->type = f->type; + ret = vpu_init_format(inst, fmt); + if (ret < 0) + return ret; + + fmt->width = pixmp->width; + fmt->height = pixmp->height; + if (fmt->width) + fmt->width = vpu_helper_valid_frame_width(inst, fmt->width); + if (fmt->height) + fmt->height = vpu_helper_valid_frame_height(inst, fmt->height); + fmt->field = pixmp->field == V4L2_FIELD_ANY ? V4L2_FIELD_NONE : pixmp->field; + vpu_calc_fmt_bytesperline(f, fmt); + vpu_calc_fmt_sizeimage(inst, fmt); + if ((fmt->flags & V4L2_FMT_FLAG_COMPRESSED) && pixmp->plane_fmt[0].sizeimage) + fmt->sizeimage[0] = clamp_val(pixmp->plane_fmt[0].sizeimage, SZ_128K, SZ_8M); + + pixmp->pixelformat = fmt->pixfmt; + pixmp->width = fmt->width; + pixmp->height = fmt->height; + pixmp->flags = fmt->flags; + pixmp->num_planes = fmt->mem_planes; + pixmp->field = fmt->field; + memset(pixmp->reserved, 0, sizeof(pixmp->reserved)); + for (i = 0; i < pixmp->num_planes; i++) { + pixmp->plane_fmt[i].bytesperline = fmt->bytesperline[i]; + pixmp->plane_fmt[i].sizeimage = vpu_get_fmt_plane_size(fmt, i); + memset(pixmp->plane_fmt[i].reserved, 0, sizeof(pixmp->plane_fmt[i].reserved)); + } + + return 0; +} + +static bool vpu_check_ready(struct vpu_inst *inst, u32 type) +{ + if (!inst) + return false; + if (inst->state == VPU_CODEC_STATE_DEINIT || inst->id < 0) + return false; + if (!inst->ops->check_ready) + return true; + return call_vop(inst, check_ready, type); +} + +int vpu_process_output_buffer(struct vpu_inst *inst) +{ + struct v4l2_m2m_buffer *buf = NULL; + struct vb2_v4l2_buffer *vbuf = NULL; + + if (!inst || !inst->fh.m2m_ctx) + return -EINVAL; + + if (!vpu_check_ready(inst, inst->out_format.type)) + return -EINVAL; + + v4l2_m2m_for_each_src_buf(inst->fh.m2m_ctx, buf) { + vbuf = &buf->vb; + if (vpu_get_buffer_state(vbuf) == VPU_BUF_STATE_IDLE) + break; + vbuf = NULL; + } + + if (!vbuf) + return -EINVAL; + + dev_dbg(inst->dev, "[%d]frame id = %d / %d\n", + inst->id, vbuf->sequence, inst->sequence); + return call_vop(inst, process_output, &vbuf->vb2_buf); +} + +int vpu_process_capture_buffer(struct vpu_inst *inst) +{ + struct v4l2_m2m_buffer *buf = NULL; + struct vb2_v4l2_buffer *vbuf = NULL; + + if (!inst || !inst->fh.m2m_ctx) + return -EINVAL; + + if (!vpu_check_ready(inst, inst->cap_format.type)) + return -EINVAL; + + v4l2_m2m_for_each_dst_buf(inst->fh.m2m_ctx, buf) { + vbuf = &buf->vb; + if (vpu_get_buffer_state(vbuf) == VPU_BUF_STATE_IDLE) + break; + vbuf = NULL; + } + if (!vbuf) + return -EINVAL; + + return call_vop(inst, process_capture, &vbuf->vb2_buf); +} + +struct vb2_v4l2_buffer *vpu_next_src_buf(struct vpu_inst *inst) +{ + struct vb2_v4l2_buffer *src_buf = NULL; + + if (!inst->fh.m2m_ctx) + return NULL; + + src_buf = v4l2_m2m_next_src_buf(inst->fh.m2m_ctx); + if (!src_buf || vpu_get_buffer_state(src_buf) == VPU_BUF_STATE_IDLE) + return NULL; + + while (vpu_vb_is_codecconfig(src_buf)) { + v4l2_m2m_src_buf_remove(inst->fh.m2m_ctx); + vpu_set_buffer_state(src_buf, VPU_BUF_STATE_IDLE); + v4l2_m2m_buf_done(src_buf, VB2_BUF_STATE_DONE); + + src_buf = v4l2_m2m_next_src_buf(inst->fh.m2m_ctx); + if (!src_buf || vpu_get_buffer_state(src_buf) == VPU_BUF_STATE_IDLE) + return NULL; + } + + return src_buf; +} + +void vpu_skip_frame(struct vpu_inst *inst, int count) +{ + struct vb2_v4l2_buffer *src_buf; + enum vb2_buffer_state state; + int i = 0; + + if (count <= 0 || !inst->fh.m2m_ctx) + return; + + while (i < count) { + src_buf = v4l2_m2m_src_buf_remove(inst->fh.m2m_ctx); + if (!src_buf || vpu_get_buffer_state(src_buf) == VPU_BUF_STATE_IDLE) + return; + if (vpu_get_buffer_state(src_buf) == VPU_BUF_STATE_DECODED) + state = VB2_BUF_STATE_DONE; + else + state = VB2_BUF_STATE_ERROR; + i++; + vpu_set_buffer_state(src_buf, VPU_BUF_STATE_IDLE); + v4l2_m2m_buf_done(src_buf, state); + } +} + +struct vb2_v4l2_buffer *vpu_find_buf_by_sequence(struct vpu_inst *inst, u32 type, u32 sequence) +{ + struct v4l2_m2m_buffer *buf = NULL; + struct vb2_v4l2_buffer *vbuf = NULL; + + if (!inst || !inst->fh.m2m_ctx) + return NULL; + + if (V4L2_TYPE_IS_OUTPUT(type)) { + v4l2_m2m_for_each_src_buf(inst->fh.m2m_ctx, buf) { + vbuf = &buf->vb; + if (vbuf->sequence == sequence) + break; + vbuf = NULL; + } + } else { + v4l2_m2m_for_each_dst_buf(inst->fh.m2m_ctx, buf) { + vbuf = &buf->vb; + if (vbuf->sequence == sequence) + break; + vbuf = NULL; + } + } + + return vbuf; +} + +struct vb2_v4l2_buffer *vpu_find_buf_by_idx(struct vpu_inst *inst, u32 type, u32 idx) +{ + struct v4l2_m2m_buffer *buf = NULL; + struct vb2_v4l2_buffer *vbuf = NULL; + + if (!inst || !inst->fh.m2m_ctx) + return NULL; + + if (V4L2_TYPE_IS_OUTPUT(type)) { + v4l2_m2m_for_each_src_buf(inst->fh.m2m_ctx, buf) { + vbuf = &buf->vb; + if (vbuf->vb2_buf.index == idx) + break; + vbuf = NULL; + } + } else { + v4l2_m2m_for_each_dst_buf(inst->fh.m2m_ctx, buf) { + vbuf = &buf->vb; + if (vbuf->vb2_buf.index == idx) + break; + vbuf = NULL; + } + } + + return vbuf; +} + +int vpu_get_num_buffers(struct vpu_inst *inst, u32 type) +{ + struct vb2_queue *q; + + if (!inst || !inst->fh.m2m_ctx) + return -EINVAL; + + if (V4L2_TYPE_IS_OUTPUT(type)) + q = v4l2_m2m_get_src_vq(inst->fh.m2m_ctx); + else + q = v4l2_m2m_get_dst_vq(inst->fh.m2m_ctx); + + return q->num_buffers; +} + +static void vpu_m2m_device_run(void *priv) +{ +} + +static void vpu_m2m_job_abort(void *priv) +{ + struct vpu_inst *inst = priv; + struct v4l2_m2m_ctx *m2m_ctx = inst->fh.m2m_ctx; + + v4l2_m2m_job_finish(m2m_ctx->m2m_dev, m2m_ctx); +} + +static const struct v4l2_m2m_ops vpu_m2m_ops = { + .device_run = vpu_m2m_device_run, + .job_abort = vpu_m2m_job_abort +}; + +static int vpu_vb2_queue_setup(struct vb2_queue *vq, + unsigned int *buf_count, + unsigned int *plane_count, + unsigned int psize[], + struct device *allocators[]) +{ + struct vpu_inst *inst = vb2_get_drv_priv(vq); + struct vpu_format *cur_fmt; + int i; + + cur_fmt = vpu_get_format(inst, vq->type); + + if (*plane_count) { + if (*plane_count != cur_fmt->mem_planes) + return -EINVAL; + for (i = 0; i < cur_fmt->mem_planes; i++) { + if (psize[i] < vpu_get_fmt_plane_size(cur_fmt, i)) + return -EINVAL; + } + return 0; + } + + if (V4L2_TYPE_IS_OUTPUT(vq->type)) + *buf_count = max_t(unsigned int, *buf_count, inst->min_buffer_out); + else + *buf_count = max_t(unsigned int, *buf_count, inst->min_buffer_cap); + *plane_count = cur_fmt->mem_planes; + for (i = 0; i < cur_fmt->mem_planes; i++) + psize[i] = vpu_get_fmt_plane_size(cur_fmt, i); + + return 0; +} + +static int vpu_vb2_buf_init(struct vb2_buffer *vb) +{ + struct vb2_v4l2_buffer *vbuf = to_vb2_v4l2_buffer(vb); + + vpu_set_buffer_state(vbuf, VPU_BUF_STATE_IDLE); + return 0; +} + +static int vpu_vb2_buf_out_validate(struct vb2_buffer *vb) +{ + struct vb2_v4l2_buffer *vbuf = to_vb2_v4l2_buffer(vb); + + vbuf->field = V4L2_FIELD_NONE; + + return 0; +} + +static int vpu_vb2_buf_prepare(struct vb2_buffer *vb) +{ + struct vpu_inst *inst = vb2_get_drv_priv(vb->vb2_queue); + struct vb2_v4l2_buffer *vbuf = to_vb2_v4l2_buffer(vb); + struct vpu_format *cur_fmt; + u32 i; + + cur_fmt = vpu_get_format(inst, vb->type); + for (i = 0; i < cur_fmt->mem_planes; i++) { + if (vpu_get_vb_length(vb, i) < vpu_get_fmt_plane_size(cur_fmt, i)) { + dev_dbg(inst->dev, "[%d] %s buf[%d] is invalid\n", + inst->id, vpu_type_name(vb->type), vb->index); + vpu_set_buffer_state(vbuf, VPU_BUF_STATE_ERROR); + } + } + + return 0; +} + +static void vpu_vb2_buf_finish(struct vb2_buffer *vb) +{ + struct vb2_v4l2_buffer *vbuf = to_vb2_v4l2_buffer(vb); + struct vpu_inst *inst = vb2_get_drv_priv(vb->vb2_queue); + struct vb2_queue *q = vb->vb2_queue; + + if (vbuf->flags & V4L2_BUF_FLAG_LAST) + vpu_notify_eos(inst); + + if (list_empty(&q->done_list)) + call_void_vop(inst, on_queue_empty, q->type); +} + +void vpu_vb2_buffers_return(struct vpu_inst *inst, unsigned int type, enum vb2_buffer_state state) +{ + struct vb2_v4l2_buffer *buf; + + if (V4L2_TYPE_IS_OUTPUT(type)) { + while ((buf = v4l2_m2m_src_buf_remove(inst->fh.m2m_ctx))) { + vpu_set_buffer_state(buf, VPU_BUF_STATE_IDLE); + v4l2_m2m_buf_done(buf, state); + } + } else { + while ((buf = v4l2_m2m_dst_buf_remove(inst->fh.m2m_ctx))) { + vpu_set_buffer_state(buf, VPU_BUF_STATE_IDLE); + v4l2_m2m_buf_done(buf, state); + } + } +} + +static int vpu_vb2_start_streaming(struct vb2_queue *q, unsigned int count) +{ + struct vpu_inst *inst = vb2_get_drv_priv(q); + struct vpu_format *fmt = vpu_get_format(inst, q->type); + int ret; + + vpu_inst_unlock(inst); + ret = vpu_inst_register(inst); + vpu_inst_lock(inst); + if (ret) { + vpu_vb2_buffers_return(inst, q->type, VB2_BUF_STATE_QUEUED); + return ret; + } + + vpu_trace(inst->dev, "[%d] %s %c%c%c%c %dx%d %u(%u) %u(%u) %u(%u) %d\n", + inst->id, vpu_type_name(q->type), + fmt->pixfmt, + fmt->pixfmt >> 8, + fmt->pixfmt >> 16, + fmt->pixfmt >> 24, + fmt->width, fmt->height, + fmt->sizeimage[0], fmt->bytesperline[0], + fmt->sizeimage[1], fmt->bytesperline[1], + fmt->sizeimage[2], fmt->bytesperline[2], + q->num_buffers); + vb2_clear_last_buffer_dequeued(q); + ret = call_vop(inst, start, q->type); + if (ret) + vpu_vb2_buffers_return(inst, q->type, VB2_BUF_STATE_QUEUED); + + return ret; +} + +static void vpu_vb2_stop_streaming(struct vb2_queue *q) +{ + struct vpu_inst *inst = vb2_get_drv_priv(q); + + vpu_trace(inst->dev, "[%d] %s\n", inst->id, vpu_type_name(q->type)); + + call_void_vop(inst, stop, q->type); + vpu_vb2_buffers_return(inst, q->type, VB2_BUF_STATE_ERROR); + if (V4L2_TYPE_IS_OUTPUT(q->type)) + inst->sequence = 0; +} + +static void vpu_vb2_buf_queue(struct vb2_buffer *vb) +{ + struct vb2_v4l2_buffer *vbuf = to_vb2_v4l2_buffer(vb); + struct vpu_inst *inst = vb2_get_drv_priv(vb->vb2_queue); + + if (V4L2_TYPE_IS_OUTPUT(vb->type)) + vbuf->sequence = inst->sequence++; + + v4l2_m2m_buf_queue(inst->fh.m2m_ctx, vbuf); + vpu_process_output_buffer(inst); + vpu_process_capture_buffer(inst); +} + +static const struct vb2_ops vpu_vb2_ops = { + .queue_setup = vpu_vb2_queue_setup, + .buf_init = vpu_vb2_buf_init, + .buf_out_validate = vpu_vb2_buf_out_validate, + .buf_prepare = vpu_vb2_buf_prepare, + .buf_finish = vpu_vb2_buf_finish, + .start_streaming = vpu_vb2_start_streaming, + .stop_streaming = vpu_vb2_stop_streaming, + .buf_queue = vpu_vb2_buf_queue, + .wait_prepare = vb2_ops_wait_prepare, + .wait_finish = vb2_ops_wait_finish, +}; + +static int vpu_m2m_queue_init(void *priv, struct vb2_queue *src_vq, struct vb2_queue *dst_vq) +{ + struct vpu_inst *inst = priv; + int ret; + + src_vq->type = V4L2_BUF_TYPE_VIDEO_OUTPUT_MPLANE; + inst->out_format.type = src_vq->type; + src_vq->io_modes = VB2_MMAP | VB2_DMABUF; + src_vq->timestamp_flags = V4L2_BUF_FLAG_TIMESTAMP_COPY; + src_vq->ops = &vpu_vb2_ops; + src_vq->mem_ops = &vb2_dma_contig_memops; + if (inst->type == VPU_CORE_TYPE_DEC && inst->use_stream_buffer) + src_vq->mem_ops = &vb2_vmalloc_memops; + src_vq->drv_priv = inst; + src_vq->buf_struct_size = sizeof(struct vpu_vb2_buffer); + src_vq->min_buffers_needed = 1; + src_vq->dev = inst->vpu->dev; + src_vq->lock = &inst->lock; + ret = vb2_queue_init(src_vq); + if (ret) + return ret; + + dst_vq->type = V4L2_BUF_TYPE_VIDEO_CAPTURE_MPLANE; + inst->cap_format.type = dst_vq->type; + dst_vq->io_modes = VB2_MMAP | VB2_DMABUF; + dst_vq->timestamp_flags = V4L2_BUF_FLAG_TIMESTAMP_COPY; + dst_vq->ops = &vpu_vb2_ops; + dst_vq->mem_ops = &vb2_dma_contig_memops; + if (inst->type == VPU_CORE_TYPE_ENC && inst->use_stream_buffer) + dst_vq->mem_ops = &vb2_vmalloc_memops; + dst_vq->drv_priv = inst; + dst_vq->buf_struct_size = sizeof(struct vpu_vb2_buffer); + dst_vq->min_buffers_needed = 1; + dst_vq->dev = inst->vpu->dev; + dst_vq->lock = &inst->lock; + ret = vb2_queue_init(dst_vq); + if (ret) { + vb2_queue_release(src_vq); + return ret; + } + + return 0; +} + +static int vpu_v4l2_release(struct vpu_inst *inst) +{ + vpu_trace(inst->vpu->dev, "%p\n", inst); + + vpu_release_core(inst->core); + put_device(inst->dev); + + if (inst->workqueue) { + cancel_work_sync(&inst->msg_work); + destroy_workqueue(inst->workqueue); + inst->workqueue = NULL; + } + + v4l2_ctrl_handler_free(&inst->ctrl_handler); + mutex_destroy(&inst->lock); + v4l2_fh_del(&inst->fh); + v4l2_fh_exit(&inst->fh); + + call_void_vop(inst, cleanup); + + return 0; +} + +int vpu_v4l2_open(struct file *file, struct vpu_inst *inst) +{ + struct vpu_dev *vpu = video_drvdata(file); + struct vpu_func *func; + int ret = 0; + + if (!inst || !inst->ops) + return -EINVAL; + + if (inst->type == VPU_CORE_TYPE_ENC) + func = &vpu->encoder; + else + func = &vpu->decoder; + + atomic_set(&inst->ref_count, 0); + vpu_inst_get(inst); + inst->vpu = vpu; + inst->core = vpu_request_core(vpu, inst->type); + if (inst->core) + inst->dev = get_device(inst->core->dev); + mutex_init(&inst->lock); + INIT_LIST_HEAD(&inst->cmd_q); + inst->id = VPU_INST_NULL_ID; + inst->release = vpu_v4l2_release; + inst->pid = current->pid; + inst->tgid = current->tgid; + inst->min_buffer_cap = 2; + inst->min_buffer_out = 2; + v4l2_fh_init(&inst->fh, func->vfd); + v4l2_fh_add(&inst->fh); + + ret = call_vop(inst, ctrl_init); + if (ret) + goto error; + + inst->fh.m2m_ctx = v4l2_m2m_ctx_init(func->m2m_dev, inst, vpu_m2m_queue_init); + if (IS_ERR(inst->fh.m2m_ctx)) { + dev_err(vpu->dev, "v4l2_m2m_ctx_init fail\n"); + ret = PTR_ERR(inst->fh.m2m_ctx); + goto error; + } + + inst->fh.ctrl_handler = &inst->ctrl_handler; + file->private_data = &inst->fh; + inst->state = VPU_CODEC_STATE_DEINIT; + inst->workqueue = alloc_workqueue("vpu_inst", WQ_UNBOUND | WQ_MEM_RECLAIM, 1); + if (inst->workqueue) { + INIT_WORK(&inst->msg_work, vpu_inst_run_work); + ret = kfifo_init(&inst->msg_fifo, + inst->msg_buffer, + rounddown_pow_of_two(sizeof(inst->msg_buffer))); + if (ret) { + destroy_workqueue(inst->workqueue); + inst->workqueue = NULL; + } + } + vpu_trace(vpu->dev, "tgid = %d, pid = %d, type = %s, inst = %p\n", + inst->tgid, inst->pid, vpu_core_type_desc(inst->type), inst); + + return 0; +error: + vpu_inst_put(inst); + return ret; +} + +int vpu_v4l2_close(struct file *file) +{ + struct vpu_dev *vpu = video_drvdata(file); + struct vpu_inst *inst = to_inst(file); + + vpu_trace(vpu->dev, "tgid = %d, pid = %d, inst = %p\n", inst->tgid, inst->pid, inst); + + vpu_inst_lock(inst); + if (inst->fh.m2m_ctx) { + v4l2_m2m_ctx_release(inst->fh.m2m_ctx); + inst->fh.m2m_ctx = NULL; + } + vpu_inst_unlock(inst); + + call_void_vop(inst, release); + vpu_inst_unregister(inst); + vpu_inst_put(inst); + + return 0; +} + +int vpu_add_func(struct vpu_dev *vpu, struct vpu_func *func) +{ + struct video_device *vfd; + int ret; + + if (!vpu || !func) + return -EINVAL; + + if (func->vfd) + return 0; + + func->m2m_dev = v4l2_m2m_init(&vpu_m2m_ops); + if (IS_ERR(func->m2m_dev)) { + dev_err(vpu->dev, "v4l2_m2m_init fail\n"); + func->vfd = NULL; + return PTR_ERR(func->m2m_dev); + } + + vfd = video_device_alloc(); + if (!vfd) { + v4l2_m2m_release(func->m2m_dev); + dev_err(vpu->dev, "alloc vpu decoder video device fail\n"); + return -ENOMEM; + } + vfd->release = video_device_release; + vfd->vfl_dir = VFL_DIR_M2M; + vfd->v4l2_dev = &vpu->v4l2_dev; + vfd->device_caps = V4L2_CAP_VIDEO_M2M_MPLANE | V4L2_CAP_STREAMING; + if (func->type == VPU_CORE_TYPE_ENC) { + strscpy(vfd->name, "amphion-vpu-encoder", sizeof(vfd->name)); + vfd->fops = venc_get_fops(); + vfd->ioctl_ops = venc_get_ioctl_ops(); + } else { + strscpy(vfd->name, "amphion-vpu-decoder", sizeof(vfd->name)); + vfd->fops = vdec_get_fops(); + vfd->ioctl_ops = vdec_get_ioctl_ops(); + } + + ret = video_register_device(vfd, VFL_TYPE_VIDEO, -1); + if (ret) { + video_device_release(vfd); + v4l2_m2m_release(func->m2m_dev); + return ret; + } + video_set_drvdata(vfd, vpu); + func->vfd = vfd; + + ret = v4l2_m2m_register_media_controller(func->m2m_dev, func->vfd, func->function); + if (ret) { + v4l2_m2m_release(func->m2m_dev); + func->m2m_dev = NULL; + video_unregister_device(func->vfd); + func->vfd = NULL; + return ret; + } + + return 0; +} + +void vpu_remove_func(struct vpu_func *func) +{ + if (!func) + return; + + if (func->m2m_dev) { + v4l2_m2m_unregister_media_controller(func->m2m_dev); + v4l2_m2m_release(func->m2m_dev); + func->m2m_dev = NULL; + } + if (func->vfd) { + video_unregister_device(func->vfd); + func->vfd = NULL; + } +} |