diff options
author | 2023-02-21 18:24:12 -0800 | |
---|---|---|
committer | 2023-02-21 18:24:12 -0800 | |
commit | 5b7c4cabbb65f5c469464da6c5f614cbd7f730f2 (patch) | |
tree | cc5c2d0a898769fd59549594fedb3ee6f84e59a0 /drivers/media/usb/pwc/pwc-dec23.c | |
download | linux-5b7c4cabbb65f5c469464da6c5f614cbd7f730f2.tar.gz linux-5b7c4cabbb65f5c469464da6c5f614cbd7f730f2.zip |
Merge tag 'net-next-6.3' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-nextgrafted
Pull networking updates from Jakub Kicinski:
"Core:
- Add dedicated kmem_cache for typical/small skb->head, avoid having
to access struct page at kfree time, and improve memory use.
- Introduce sysctl to set default RPS configuration for new netdevs.
- Define Netlink protocol specification format which can be used to
describe messages used by each family and auto-generate parsers.
Add tools for generating kernel data structures and uAPI headers.
- Expose all net/core sysctls inside netns.
- Remove 4s sleep in netpoll if carrier is instantly detected on
boot.
- Add configurable limit of MDB entries per port, and port-vlan.
- Continue populating drop reasons throughout the stack.
- Retire a handful of legacy Qdiscs and classifiers.
Protocols:
- Support IPv4 big TCP (TSO frames larger than 64kB).
- Add IP_LOCAL_PORT_RANGE socket option, to control local port range
on socket by socket basis.
- Track and report in procfs number of MPTCP sockets used.
- Support mixing IPv4 and IPv6 flows in the in-kernel MPTCP path
manager.
- IPv6: don't check net.ipv6.route.max_size and rely on garbage
collection to free memory (similarly to IPv4).
- Support Penultimate Segment Pop (PSP) flavor in SRv6 (RFC8986).
- ICMP: add per-rate limit counters.
- Add support for user scanning requests in ieee802154.
- Remove static WEP support.
- Support minimal Wi-Fi 7 Extremely High Throughput (EHT) rate
reporting.
- WiFi 7 EHT channel puncturing support (client & AP).
BPF:
- Add a rbtree data structure following the "next-gen data structure"
precedent set by recently added linked list, that is, by using
kfunc + kptr instead of adding a new BPF map type.
- Expose XDP hints via kfuncs with initial support for RX hash and
timestamp metadata.
- Add BPF_F_NO_TUNNEL_KEY extension to bpf_skb_set_tunnel_key to
better support decap on GRE tunnel devices not operating in collect
metadata.
- Improve x86 JIT's codegen for PROBE_MEM runtime error checks.
- Remove the need for trace_printk_lock for bpf_trace_printk and
bpf_trace_vprintk helpers.
- Extend libbpf's bpf_tracing.h support for tracing arguments of
kprobes/uprobes and syscall as a special case.
- Significantly reduce the search time for module symbols by
livepatch and BPF.
- Enable cpumasks to be used as kptrs, which is useful for tracing
programs tracking which tasks end up running on which CPUs in
different time intervals.
- Add support for BPF trampoline on s390x and riscv64.
- Add capability to export the XDP features supported by the NIC.
- Add __bpf_kfunc tag for marking kernel functions as kfuncs.
- Add cgroup.memory=nobpf kernel parameter option to disable BPF
memory accounting for container environments.
Netfilter:
- Remove the CLUSTERIP target. It has been marked as obsolete for
years, and we still have WARN splats wrt races of the out-of-band
/proc interface installed by this target.
- Add 'destroy' commands to nf_tables. They are identical to the
existing 'delete' commands, but do not return an error if the
referenced object (set, chain, rule...) did not exist.
Driver API:
- Improve cpumask_local_spread() locality to help NICs set the right
IRQ affinity on AMD platforms.
- Separate C22 and C45 MDIO bus transactions more clearly.
- Introduce new DCB table to control DSCP rewrite on egress.
- Support configuration of Physical Layer Collision Avoidance (PLCA)
Reconciliation Sublayer (RS) (802.3cg-2019). Modern version of
shared medium Ethernet.
- Support for MAC Merge layer (IEEE 802.3-2018 clause 99). Allowing
preemption of low priority frames by high priority frames.
- Add support for controlling MACSec offload using netlink SET.
- Rework devlink instance refcounts to allow registration and
de-registration under the instance lock. Split the code into
multiple files, drop some of the unnecessarily granular locks and
factor out common parts of netlink operation handling.
- Add TX frame aggregation parameters (for USB drivers).
- Add a new attr TCA_EXT_WARN_MSG to report TC (offload) warning
messages with notifications for debug.
- Allow offloading of UDP NEW connections via act_ct.
- Add support for per action HW stats in TC.
- Support hardware miss to TC action (continue processing in SW from
a specific point in the action chain).
- Warn if old Wireless Extension user space interface is used with
modern cfg80211/mac80211 drivers. Do not support Wireless
Extensions for Wi-Fi 7 devices at all. Everyone should switch to
using nl80211 interface instead.
- Improve the CAN bit timing configuration. Use extack to return
error messages directly to user space, update the SJW handling,
including the definition of a new default value that will benefit
CAN-FD controllers, by increasing their oscillator tolerance.
New hardware / drivers:
- Ethernet:
- nVidia BlueField-3 support (control traffic driver)
- Ethernet support for imx93 SoCs
- Motorcomm yt8531 gigabit Ethernet PHY
- onsemi NCN26000 10BASE-T1S PHY (with support for PLCA)
- Microchip LAN8841 PHY (incl. cable diagnostics and PTP)
- Amlogic gxl MDIO mux
- WiFi:
- RealTek RTL8188EU (rtl8xxxu)
- Qualcomm Wi-Fi 7 devices (ath12k)
- CAN:
- Renesas R-Car V4H
Drivers:
- Bluetooth:
- Set Per Platform Antenna Gain (PPAG) for Intel controllers.
- Ethernet NICs:
- Intel (1G, igc):
- support TSN / Qbv / packet scheduling features of i226 model
- Intel (100G, ice):
- use GNSS subsystem instead of TTY
- multi-buffer XDP support
- extend support for GPIO pins to E823 devices
- nVidia/Mellanox:
- update the shared buffer configuration on PFC commands
- implement PTP adjphase function for HW offset control
- TC support for Geneve and GRE with VF tunnel offload
- more efficient crypto key management method
- multi-port eswitch support
- Netronome/Corigine:
- add DCB IEEE support
- support IPsec offloading for NFP3800
- Freescale/NXP (enetc):
- support XDP_REDIRECT for XDP non-linear buffers
- improve reconfig, avoid link flap and waiting for idle
- support MAC Merge layer
- Other NICs:
- sfc/ef100: add basic devlink support for ef100
- ionic: rx_push mode operation (writing descriptors via MMIO)
- bnxt: use the auxiliary bus abstraction for RDMA
- r8169: disable ASPM and reset bus in case of tx timeout
- cpsw: support QSGMII mode for J721e CPSW9G
- cpts: support pulse-per-second output
- ngbe: add an mdio bus driver
- usbnet: optimize usbnet_bh() by avoiding unnecessary queuing
- r8152: handle devices with FW with NCM support
- amd-xgbe: support 10Mbps, 2.5GbE speeds and rx-adaptation
- virtio-net: support multi buffer XDP
- virtio/vsock: replace virtio_vsock_pkt with sk_buff
- tsnep: XDP support
- Ethernet high-speed switches:
- nVidia/Mellanox (mlxsw):
- add support for latency TLV (in FW control messages)
- Microchip (sparx5):
- separate explicit and implicit traffic forwarding rules, make
the implicit rules always active
- add support for egress DSCP rewrite
- IS0 VCAP support (Ingress Classification)
- IS2 VCAP filters (protos, L3 addrs, L4 ports, flags, ToS
etc.)
- ES2 VCAP support (Egress Access Control)
- support for Per-Stream Filtering and Policing (802.1Q,
8.6.5.1)
- Ethernet embedded switches:
- Marvell (mv88e6xxx):
- add MAB (port auth) offload support
- enable PTP receive for mv88e6390
- NXP (ocelot):
- support MAC Merge layer
- support for the the vsc7512 internal copper phys
- Microchip:
- lan9303: convert to PHYLINK
- lan966x: support TC flower filter statistics
- lan937x: PTP support for KSZ9563/KSZ8563 and LAN937x
- lan937x: support Credit Based Shaper configuration
- ksz9477: support Energy Efficient Ethernet
- other:
- qca8k: convert to regmap read/write API, use bulk operations
- rswitch: Improve TX timestamp accuracy
- Intel WiFi (iwlwifi):
- EHT (Wi-Fi 7) rate reporting
- STEP equalizer support: transfer some STEP (connection to radio
on platforms with integrated wifi) related parameters from the
BIOS to the firmware.
- Qualcomm 802.11ax WiFi (ath11k):
- IPQ5018 support
- Fine Timing Measurement (FTM) responder role support
- channel 177 support
- MediaTek WiFi (mt76):
- per-PHY LED support
- mt7996: EHT (Wi-Fi 7) support
- Wireless Ethernet Dispatch (WED) reset support
- switch to using page pool allocator
- RealTek WiFi (rtw89):
- support new version of Bluetooth co-existance
- Mobile:
- rmnet: support TX aggregation"
* tag 'net-next-6.3' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (1872 commits)
page_pool: add a comment explaining the fragment counter usage
net: ethtool: fix __ethtool_dev_mm_supported() implementation
ethtool: pse-pd: Fix double word in comments
xsk: add linux/vmalloc.h to xsk.c
sefltests: netdevsim: wait for devlink instance after netns removal
selftest: fib_tests: Always cleanup before exit
net/mlx5e: Align IPsec ASO result memory to be as required by hardware
net/mlx5e: TC, Set CT miss to the specific ct action instance
net/mlx5e: Rename CHAIN_TO_REG to MAPPED_OBJ_TO_REG
net/mlx5: Refactor tc miss handling to a single function
net/mlx5: Kconfig: Make tc offload depend on tc skb extension
net/sched: flower: Support hardware miss to tc action
net/sched: flower: Move filter handle initialization earlier
net/sched: cls_api: Support hardware miss to tc action
net/sched: Rename user cookie and act cookie
sfc: fix builds without CONFIG_RTC_LIB
sfc: clean up some inconsistent indentings
net/mlx4_en: Introduce flexible array to silence overflow warning
net: lan966x: Fix possible deadlock inside PTP
net/ulp: Remove redundant ->clone() test in inet_clone_ulp().
...
Diffstat (limited to 'drivers/media/usb/pwc/pwc-dec23.c')
-rw-r--r-- | drivers/media/usb/pwc/pwc-dec23.c | 678 |
1 files changed, 678 insertions, 0 deletions
diff --git a/drivers/media/usb/pwc/pwc-dec23.c b/drivers/media/usb/pwc/pwc-dec23.c new file mode 100644 index 000000000..a3aa8c717 --- /dev/null +++ b/drivers/media/usb/pwc/pwc-dec23.c @@ -0,0 +1,678 @@ +// SPDX-License-Identifier: GPL-2.0-or-later +/* Linux driver for Philips webcam + Decompression for chipset version 2 et 3 + (C) 2004-2006 Luc Saillard (luc@saillard.org) + + NOTE: this version of pwc is an unofficial (modified) release of pwc & pcwx + driver and thus may have bugs that are not present in the original version. + Please send bug reports and support requests to <luc@saillard.org>. + The decompression routines have been implemented by reverse-engineering the + Nemosoft binary pwcx module. Caveat emptor. + + +*/ + +#include "pwc-timon.h" +#include "pwc-kiara.h" +#include "pwc-dec23.h" + +#include <linux/string.h> +#include <linux/slab.h> + +/* + * USE_LOOKUP_TABLE_TO_CLAMP + * 0: use a C version of this tests: { a<0?0:(a>255?255:a) } + * 1: use a faster lookup table for cpu with a big cache (intel) + */ +#define USE_LOOKUP_TABLE_TO_CLAMP 1 +/* + * UNROLL_LOOP_FOR_COPYING_BLOCK + * 0: use a loop for a smaller code (but little slower) + * 1: when unrolling the loop, gcc produces some faster code (perhaps only + * valid for intel processor class). Activating this option, automatically + * activate USE_LOOKUP_TABLE_TO_CLAMP + */ +#define UNROLL_LOOP_FOR_COPY 1 +#if UNROLL_LOOP_FOR_COPY +# undef USE_LOOKUP_TABLE_TO_CLAMP +# define USE_LOOKUP_TABLE_TO_CLAMP 1 +#endif + +static void build_subblock_pattern(struct pwc_dec23_private *pdec) +{ + static const unsigned int initial_values[12] = { + -0x526500, -0x221200, 0x221200, 0x526500, + -0x3de200, 0x3de200, + -0x6db480, -0x2d5d00, 0x2d5d00, 0x6db480, + -0x12c200, 0x12c200 + + }; + static const unsigned int values_derivated[12] = { + 0xa4ca, 0x4424, -0x4424, -0xa4ca, + 0x7bc4, -0x7bc4, + 0xdb69, 0x5aba, -0x5aba, -0xdb69, + 0x2584, -0x2584 + }; + unsigned int temp_values[12]; + int i, j; + + memcpy(temp_values, initial_values, sizeof(initial_values)); + for (i = 0; i < 256; i++) { + for (j = 0; j < 12; j++) { + pdec->table_subblock[i][j] = temp_values[j]; + temp_values[j] += values_derivated[j]; + } + } +} + +static void build_bit_powermask_table(struct pwc_dec23_private *pdec) +{ + unsigned char *p; + unsigned int bit, byte, mask, val; + unsigned int bitpower = 1; + + for (bit = 0; bit < 8; bit++) { + mask = bitpower - 1; + p = pdec->table_bitpowermask[bit]; + for (byte = 0; byte < 256; byte++) { + val = (byte & mask); + if (byte & bitpower) + val = -val; + *p++ = val; + } + bitpower<<=1; + } +} + + +static void build_table_color(const unsigned int romtable[16][8], + unsigned char p0004[16][1024], + unsigned char p8004[16][256]) +{ + int compression_mode, j, k, bit, pw; + unsigned char *p0, *p8; + const unsigned int *r; + + /* We have 16 compressions tables */ + for (compression_mode = 0; compression_mode < 16; compression_mode++) { + p0 = p0004[compression_mode]; + p8 = p8004[compression_mode]; + r = romtable[compression_mode]; + + for (j = 0; j < 8; j++, r++, p0 += 128) { + + for (k = 0; k < 16; k++) { + if (k == 0) + bit = 1; + else if (k >= 1 && k < 3) + bit = (r[0] >> 15) & 7; + else if (k >= 3 && k < 6) + bit = (r[0] >> 12) & 7; + else if (k >= 6 && k < 10) + bit = (r[0] >> 9) & 7; + else if (k >= 10 && k < 13) + bit = (r[0] >> 6) & 7; + else if (k >= 13 && k < 15) + bit = (r[0] >> 3) & 7; + else + bit = (r[0]) & 7; + if (k == 0) + *p8++ = 8; + else + *p8++ = j - bit; + *p8++ = bit; + + pw = 1 << bit; + p0[k + 0x00] = (1 * pw) + 0x80; + p0[k + 0x10] = (2 * pw) + 0x80; + p0[k + 0x20] = (3 * pw) + 0x80; + p0[k + 0x30] = (4 * pw) + 0x80; + p0[k + 0x40] = (-1 * pw) + 0x80; + p0[k + 0x50] = (-2 * pw) + 0x80; + p0[k + 0x60] = (-3 * pw) + 0x80; + p0[k + 0x70] = (-4 * pw) + 0x80; + } /* end of for (k=0; k<16; k++, p8++) */ + } /* end of for (j=0; j<8; j++ , table++) */ + } /* end of foreach compression_mode */ +} + +/* + * + */ +static void fill_table_dc00_d800(struct pwc_dec23_private *pdec) +{ +#define SCALEBITS 15 +#define ONE_HALF (1UL << (SCALEBITS - 1)) + int i; + unsigned int offset1 = ONE_HALF; + unsigned int offset2 = 0x0000; + + for (i=0; i<256; i++) { + pdec->table_dc00[i] = offset1 & ~(ONE_HALF); + pdec->table_d800[i] = offset2; + + offset1 += 0x7bc4; + offset2 += 0x7bc4; + } +} + +/* + * To decode the stream: + * if look_bits(2) == 0: # op == 2 in the lookup table + * skip_bits(2) + * end of the stream + * elif look_bits(3) == 7: # op == 1 in the lookup table + * skip_bits(3) + * yyyy = get_bits(4) + * xxxx = get_bits(8) + * else: # op == 0 in the lookup table + * skip_bits(x) + * + * For speedup processing, we build a lookup table and we takes the first 6 bits. + * + * struct { + * unsigned char op; // operation to execute + * unsigned char bits; // bits use to perform operation + * unsigned char offset1; // offset to add to access in the table_0004 % 16 + * unsigned char offset2; // offset to add to access in the table_0004 + * } + * + * How to build this table ? + * op == 2 when (i%4)==0 + * op == 1 when (i%8)==7 + * op == 0 otherwise + * + */ +static const unsigned char hash_table_ops[64*4] = { + 0x02, 0x00, 0x00, 0x00, + 0x00, 0x03, 0x01, 0x00, + 0x00, 0x04, 0x01, 0x10, + 0x00, 0x06, 0x01, 0x30, + 0x02, 0x00, 0x00, 0x00, + 0x00, 0x03, 0x01, 0x40, + 0x00, 0x05, 0x01, 0x20, + 0x01, 0x00, 0x00, 0x00, + 0x02, 0x00, 0x00, 0x00, + 0x00, 0x03, 0x01, 0x00, + 0x00, 0x04, 0x01, 0x50, + 0x00, 0x05, 0x02, 0x00, + 0x02, 0x00, 0x00, 0x00, + 0x00, 0x03, 0x01, 0x40, + 0x00, 0x05, 0x03, 0x00, + 0x01, 0x00, 0x00, 0x00, + 0x02, 0x00, 0x00, 0x00, + 0x00, 0x03, 0x01, 0x00, + 0x00, 0x04, 0x01, 0x10, + 0x00, 0x06, 0x02, 0x10, + 0x02, 0x00, 0x00, 0x00, + 0x00, 0x03, 0x01, 0x40, + 0x00, 0x05, 0x01, 0x60, + 0x01, 0x00, 0x00, 0x00, + 0x02, 0x00, 0x00, 0x00, + 0x00, 0x03, 0x01, 0x00, + 0x00, 0x04, 0x01, 0x50, + 0x00, 0x05, 0x02, 0x40, + 0x02, 0x00, 0x00, 0x00, + 0x00, 0x03, 0x01, 0x40, + 0x00, 0x05, 0x03, 0x40, + 0x01, 0x00, 0x00, 0x00, + 0x02, 0x00, 0x00, 0x00, + 0x00, 0x03, 0x01, 0x00, + 0x00, 0x04, 0x01, 0x10, + 0x00, 0x06, 0x01, 0x70, + 0x02, 0x00, 0x00, 0x00, + 0x00, 0x03, 0x01, 0x40, + 0x00, 0x05, 0x01, 0x20, + 0x01, 0x00, 0x00, 0x00, + 0x02, 0x00, 0x00, 0x00, + 0x00, 0x03, 0x01, 0x00, + 0x00, 0x04, 0x01, 0x50, + 0x00, 0x05, 0x02, 0x00, + 0x02, 0x00, 0x00, 0x00, + 0x00, 0x03, 0x01, 0x40, + 0x00, 0x05, 0x03, 0x00, + 0x01, 0x00, 0x00, 0x00, + 0x02, 0x00, 0x00, 0x00, + 0x00, 0x03, 0x01, 0x00, + 0x00, 0x04, 0x01, 0x10, + 0x00, 0x06, 0x02, 0x50, + 0x02, 0x00, 0x00, 0x00, + 0x00, 0x03, 0x01, 0x40, + 0x00, 0x05, 0x01, 0x60, + 0x01, 0x00, 0x00, 0x00, + 0x02, 0x00, 0x00, 0x00, + 0x00, 0x03, 0x01, 0x00, + 0x00, 0x04, 0x01, 0x50, + 0x00, 0x05, 0x02, 0x40, + 0x02, 0x00, 0x00, 0x00, + 0x00, 0x03, 0x01, 0x40, + 0x00, 0x05, 0x03, 0x40, + 0x01, 0x00, 0x00, 0x00 +}; + +/* + * + */ +static const unsigned int MulIdx[16][16] = { + {0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,}, + {0, 1, 2, 3, 0, 1, 2, 3, 0, 1, 2, 3, 0, 1, 2, 3,}, + {0, 0, 0, 0, 1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3,}, + {4, 4, 4, 4, 5, 5, 5, 5, 5, 5, 5, 5, 4, 4, 4, 4,}, + {6, 7, 8, 9, 7, 10, 11, 8, 8, 11, 10, 7, 9, 8, 7, 6,}, + {4, 5, 5, 4, 4, 5, 5, 4, 4, 5, 5, 4, 4, 5, 5, 4,}, + {1, 3, 0, 2, 1, 3, 0, 2, 1, 3, 0, 2, 1, 3, 0, 2,}, + {0, 3, 3, 0, 1, 2, 2, 1, 2, 1, 1, 2, 3, 0, 0, 3,}, + {0, 1, 2, 3, 3, 2, 1, 0, 3, 2, 1, 0, 0, 1, 2, 3,}, + {1, 1, 1, 1, 3, 3, 3, 3, 0, 0, 0, 0, 2, 2, 2, 2,}, + {7, 10, 11, 8, 9, 8, 7, 6, 6, 7, 8, 9, 8, 11, 10, 7,}, + {4, 5, 5, 4, 5, 4, 4, 5, 5, 4, 4, 5, 4, 5, 5, 4,}, + {7, 9, 6, 8, 10, 8, 7, 11, 11, 7, 8, 10, 8, 6, 9, 7,}, + {1, 3, 0, 2, 2, 0, 3, 1, 2, 0, 3, 1, 1, 3, 0, 2,}, + {1, 2, 2, 1, 3, 0, 0, 3, 0, 3, 3, 0, 2, 1, 1, 2,}, + {10, 8, 7, 11, 8, 6, 9, 7, 7, 9, 6, 8, 11, 7, 8, 10} +}; + +#if USE_LOOKUP_TABLE_TO_CLAMP +#define MAX_OUTER_CROP_VALUE (512) +static unsigned char pwc_crop_table[256 + 2*MAX_OUTER_CROP_VALUE]; +#define CLAMP(x) (pwc_crop_table[MAX_OUTER_CROP_VALUE+(x)]) +#else +#define CLAMP(x) ((x)>255?255:((x)<0?0:x)) +#endif + + +/* If the type or the command change, we rebuild the lookup table */ +void pwc_dec23_init(struct pwc_device *pdev, const unsigned char *cmd) +{ + int flags, version, shift, i; + struct pwc_dec23_private *pdec = &pdev->dec23; + + mutex_init(&pdec->lock); + + if (pdec->last_cmd_valid && pdec->last_cmd == cmd[2]) + return; + + if (DEVICE_USE_CODEC3(pdev->type)) { + flags = cmd[2] & 0x18; + if (flags == 8) + pdec->nbits = 7; /* More bits, mean more bits to encode the stream, but better quality */ + else if (flags == 0x10) + pdec->nbits = 8; + else + pdec->nbits = 6; + + version = cmd[2] >> 5; + build_table_color(KiaraRomTable[version][0], pdec->table_0004_pass1, pdec->table_8004_pass1); + build_table_color(KiaraRomTable[version][1], pdec->table_0004_pass2, pdec->table_8004_pass2); + + } else { + + flags = cmd[2] & 6; + if (flags == 2) + pdec->nbits = 7; + else if (flags == 4) + pdec->nbits = 8; + else + pdec->nbits = 6; + + version = cmd[2] >> 3; + build_table_color(TimonRomTable[version][0], pdec->table_0004_pass1, pdec->table_8004_pass1); + build_table_color(TimonRomTable[version][1], pdec->table_0004_pass2, pdec->table_8004_pass2); + } + + /* Information can be coded on a variable number of bits but never less than 8 */ + shift = 8 - pdec->nbits; + pdec->scalebits = SCALEBITS - shift; + pdec->nbitsmask = 0xFF >> shift; + + fill_table_dc00_d800(pdec); + build_subblock_pattern(pdec); + build_bit_powermask_table(pdec); + +#if USE_LOOKUP_TABLE_TO_CLAMP + /* Build the static table to clamp value [0-255] */ + for (i=0;i<MAX_OUTER_CROP_VALUE;i++) + pwc_crop_table[i] = 0; + for (i=0; i<256; i++) + pwc_crop_table[MAX_OUTER_CROP_VALUE+i] = i; + for (i=0; i<MAX_OUTER_CROP_VALUE; i++) + pwc_crop_table[MAX_OUTER_CROP_VALUE+256+i] = 255; +#endif + + pdec->last_cmd = cmd[2]; + pdec->last_cmd_valid = 1; +} + +/* + * Copy the 4x4 image block to Y plane buffer + */ +static void copy_image_block_Y(const int *src, unsigned char *dst, unsigned int bytes_per_line, unsigned int scalebits) +{ +#if UNROLL_LOOP_FOR_COPY + const unsigned char *cm = pwc_crop_table+MAX_OUTER_CROP_VALUE; + const int *c = src; + unsigned char *d = dst; + + *d++ = cm[c[0] >> scalebits]; + *d++ = cm[c[1] >> scalebits]; + *d++ = cm[c[2] >> scalebits]; + *d++ = cm[c[3] >> scalebits]; + + d = dst + bytes_per_line; + *d++ = cm[c[4] >> scalebits]; + *d++ = cm[c[5] >> scalebits]; + *d++ = cm[c[6] >> scalebits]; + *d++ = cm[c[7] >> scalebits]; + + d = dst + bytes_per_line*2; + *d++ = cm[c[8] >> scalebits]; + *d++ = cm[c[9] >> scalebits]; + *d++ = cm[c[10] >> scalebits]; + *d++ = cm[c[11] >> scalebits]; + + d = dst + bytes_per_line*3; + *d++ = cm[c[12] >> scalebits]; + *d++ = cm[c[13] >> scalebits]; + *d++ = cm[c[14] >> scalebits]; + *d++ = cm[c[15] >> scalebits]; +#else + int i; + const int *c = src; + unsigned char *d = dst; + for (i = 0; i < 4; i++, c++) + *d++ = CLAMP((*c) >> scalebits); + + d = dst + bytes_per_line; + for (i = 0; i < 4; i++, c++) + *d++ = CLAMP((*c) >> scalebits); + + d = dst + bytes_per_line*2; + for (i = 0; i < 4; i++, c++) + *d++ = CLAMP((*c) >> scalebits); + + d = dst + bytes_per_line*3; + for (i = 0; i < 4; i++, c++) + *d++ = CLAMP((*c) >> scalebits); +#endif +} + +/* + * Copy the 4x4 image block to a CrCb plane buffer + * + */ +static void copy_image_block_CrCb(const int *src, unsigned char *dst, unsigned int bytes_per_line, unsigned int scalebits) +{ +#if UNROLL_LOOP_FOR_COPY + /* Unroll all loops */ + const unsigned char *cm = pwc_crop_table+MAX_OUTER_CROP_VALUE; + const int *c = src; + unsigned char *d = dst; + + *d++ = cm[c[0] >> scalebits]; + *d++ = cm[c[4] >> scalebits]; + *d++ = cm[c[1] >> scalebits]; + *d++ = cm[c[5] >> scalebits]; + *d++ = cm[c[2] >> scalebits]; + *d++ = cm[c[6] >> scalebits]; + *d++ = cm[c[3] >> scalebits]; + *d++ = cm[c[7] >> scalebits]; + + d = dst + bytes_per_line; + *d++ = cm[c[12] >> scalebits]; + *d++ = cm[c[8] >> scalebits]; + *d++ = cm[c[13] >> scalebits]; + *d++ = cm[c[9] >> scalebits]; + *d++ = cm[c[14] >> scalebits]; + *d++ = cm[c[10] >> scalebits]; + *d++ = cm[c[15] >> scalebits]; + *d++ = cm[c[11] >> scalebits]; +#else + int i; + const int *c1 = src; + const int *c2 = src + 4; + unsigned char *d = dst; + + for (i = 0; i < 4; i++, c1++, c2++) { + *d++ = CLAMP((*c1) >> scalebits); + *d++ = CLAMP((*c2) >> scalebits); + } + c1 = src + 12; + d = dst + bytes_per_line; + for (i = 0; i < 4; i++, c1++, c2++) { + *d++ = CLAMP((*c1) >> scalebits); + *d++ = CLAMP((*c2) >> scalebits); + } +#endif +} + +/* + * To manage the stream, we keep bits in a 32 bits register. + * fill_nbits(n): fill the reservoir with at least n bits + * skip_bits(n): discard n bits from the reservoir + * get_bits(n): fill the reservoir, returns the first n bits and discard the + * bits from the reservoir. + * __get_nbits(n): faster version of get_bits(n), but asumes that the reservoir + * contains at least n bits. bits returned is discarded. + */ +#define fill_nbits(pdec, nbits_wanted) do { \ + while (pdec->nbits_in_reservoir<(nbits_wanted)) \ + { \ + pdec->reservoir |= (*(pdec->stream)++) << (pdec->nbits_in_reservoir); \ + pdec->nbits_in_reservoir += 8; \ + } \ +} while(0); + +#define skip_nbits(pdec, nbits_to_skip) do { \ + pdec->reservoir >>= (nbits_to_skip); \ + pdec->nbits_in_reservoir -= (nbits_to_skip); \ +} while(0); + +#define get_nbits(pdec, nbits_wanted, result) do { \ + fill_nbits(pdec, nbits_wanted); \ + result = (pdec->reservoir) & ((1U<<(nbits_wanted))-1); \ + skip_nbits(pdec, nbits_wanted); \ +} while(0); + +#define __get_nbits(pdec, nbits_wanted, result) do { \ + result = (pdec->reservoir) & ((1U<<(nbits_wanted))-1); \ + skip_nbits(pdec, nbits_wanted); \ +} while(0); + +#define look_nbits(pdec, nbits_wanted) \ + ((pdec->reservoir) & ((1U<<(nbits_wanted))-1)) + +/* + * Decode a 4x4 pixel block + */ +static void decode_block(struct pwc_dec23_private *pdec, + const unsigned char *ptable0004, + const unsigned char *ptable8004) +{ + unsigned int primary_color; + unsigned int channel_v, offset1, op; + int i; + + fill_nbits(pdec, 16); + __get_nbits(pdec, pdec->nbits, primary_color); + + if (look_nbits(pdec,2) == 0) { + skip_nbits(pdec, 2); + /* Very simple, the color is the same for all pixels of the square */ + for (i = 0; i < 16; i++) + pdec->temp_colors[i] = pdec->table_dc00[primary_color]; + + return; + } + + /* This block is encoded with small pattern */ + for (i = 0; i < 16; i++) + pdec->temp_colors[i] = pdec->table_d800[primary_color]; + + __get_nbits(pdec, 3, channel_v); + channel_v = ((channel_v & 1) << 2) | (channel_v & 2) | ((channel_v & 4) >> 2); + + ptable0004 += (channel_v * 128); + ptable8004 += (channel_v * 32); + + offset1 = 0; + do + { + unsigned int htable_idx, rows = 0; + const unsigned int *block; + + /* [ zzzz y x x ] + * xx == 00 :=> end of the block def, remove the two bits from the stream + * yxx == 111 + * yxx == any other value + * + */ + fill_nbits(pdec, 16); + htable_idx = look_nbits(pdec, 6); + op = hash_table_ops[htable_idx * 4]; + + if (op == 2) { + skip_nbits(pdec, 2); + + } else if (op == 1) { + /* 15bits [ xxxx xxxx yyyy 111 ] + * yyy => offset in the table8004 + * xxx => offset in the tabled004 (tree) + */ + unsigned int mask, shift; + unsigned int nbits, col1; + unsigned int yyyy; + + skip_nbits(pdec, 3); + /* offset1 += yyyy */ + __get_nbits(pdec, 4, yyyy); + offset1 += 1 + yyyy; + offset1 &= 0x0F; + nbits = ptable8004[offset1 * 2]; + + /* col1 = xxxx xxxx */ + __get_nbits(pdec, nbits+1, col1); + + /* Bit mask table */ + mask = pdec->table_bitpowermask[nbits][col1]; + shift = ptable8004[offset1 * 2 + 1]; + rows = ((mask << shift) + 0x80) & 0xFF; + + block = pdec->table_subblock[rows]; + for (i = 0; i < 16; i++) + pdec->temp_colors[i] += block[MulIdx[offset1][i]]; + + } else { + /* op == 0 + * offset1 is coded on 3 bits + */ + unsigned int shift; + + offset1 += hash_table_ops [htable_idx * 4 + 2]; + offset1 &= 0x0F; + + rows = ptable0004[offset1 + hash_table_ops [htable_idx * 4 + 3]]; + block = pdec->table_subblock[rows]; + for (i = 0; i < 16; i++) + pdec->temp_colors[i] += block[MulIdx[offset1][i]]; + + shift = hash_table_ops[htable_idx * 4 + 1]; + skip_nbits(pdec, shift); + } + + } while (op != 2); + +} + +static void DecompressBand23(struct pwc_dec23_private *pdec, + const unsigned char *rawyuv, + unsigned char *planar_y, + unsigned char *planar_u, + unsigned char *planar_v, + unsigned int compressed_image_width, + unsigned int real_image_width) +{ + int compression_index, nblocks; + const unsigned char *ptable0004; + const unsigned char *ptable8004; + + pdec->reservoir = 0; + pdec->nbits_in_reservoir = 0; + pdec->stream = rawyuv + 1; /* The first byte of the stream is skipped */ + + get_nbits(pdec, 4, compression_index); + + /* pass 1: uncompress Y component */ + nblocks = compressed_image_width / 4; + + ptable0004 = pdec->table_0004_pass1[compression_index]; + ptable8004 = pdec->table_8004_pass1[compression_index]; + + /* Each block decode a square of 4x4 */ + while (nblocks) { + decode_block(pdec, ptable0004, ptable8004); + copy_image_block_Y(pdec->temp_colors, planar_y, real_image_width, pdec->scalebits); + planar_y += 4; + nblocks--; + } + + /* pass 2: uncompress UV component */ + nblocks = compressed_image_width / 8; + + ptable0004 = pdec->table_0004_pass2[compression_index]; + ptable8004 = pdec->table_8004_pass2[compression_index]; + + /* Each block decode a square of 4x4 */ + while (nblocks) { + decode_block(pdec, ptable0004, ptable8004); + copy_image_block_CrCb(pdec->temp_colors, planar_u, real_image_width/2, pdec->scalebits); + + decode_block(pdec, ptable0004, ptable8004); + copy_image_block_CrCb(pdec->temp_colors, planar_v, real_image_width/2, pdec->scalebits); + + planar_v += 8; + planar_u += 8; + nblocks -= 2; + } + +} + +/** + * pwc_dec23_decompress - Uncompress a pwc23 buffer. + * @pdev: pointer to pwc device's internal struct + * @src: raw data + * @dst: image output + */ +void pwc_dec23_decompress(struct pwc_device *pdev, + const void *src, + void *dst) +{ + int bandlines_left, bytes_per_block; + struct pwc_dec23_private *pdec = &pdev->dec23; + + /* YUV420P image format */ + unsigned char *pout_planar_y; + unsigned char *pout_planar_u; + unsigned char *pout_planar_v; + unsigned int plane_size; + + mutex_lock(&pdec->lock); + + bandlines_left = pdev->height / 4; + bytes_per_block = pdev->width * 4; + plane_size = pdev->height * pdev->width; + + pout_planar_y = dst; + pout_planar_u = dst + plane_size; + pout_planar_v = dst + plane_size + plane_size / 4; + + while (bandlines_left--) { + DecompressBand23(pdec, src, + pout_planar_y, pout_planar_u, pout_planar_v, + pdev->width, pdev->width); + src += pdev->vbandlength; + pout_planar_y += bytes_per_block; + pout_planar_u += pdev->width; + pout_planar_v += pdev->width; + } + mutex_unlock(&pdec->lock); +} |