diff options
author | 2023-02-21 18:24:12 -0800 | |
---|---|---|
committer | 2023-02-21 18:24:12 -0800 | |
commit | 5b7c4cabbb65f5c469464da6c5f614cbd7f730f2 (patch) | |
tree | cc5c2d0a898769fd59549594fedb3ee6f84e59a0 /Documentation/translations/zh_TW/process/stable-api-nonsense.rst | |
download | linux-5b7c4cabbb65f5c469464da6c5f614cbd7f730f2.tar.gz linux-5b7c4cabbb65f5c469464da6c5f614cbd7f730f2.zip |
Merge tag 'net-next-6.3' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-nextgrafted
Pull networking updates from Jakub Kicinski:
"Core:
- Add dedicated kmem_cache for typical/small skb->head, avoid having
to access struct page at kfree time, and improve memory use.
- Introduce sysctl to set default RPS configuration for new netdevs.
- Define Netlink protocol specification format which can be used to
describe messages used by each family and auto-generate parsers.
Add tools for generating kernel data structures and uAPI headers.
- Expose all net/core sysctls inside netns.
- Remove 4s sleep in netpoll if carrier is instantly detected on
boot.
- Add configurable limit of MDB entries per port, and port-vlan.
- Continue populating drop reasons throughout the stack.
- Retire a handful of legacy Qdiscs and classifiers.
Protocols:
- Support IPv4 big TCP (TSO frames larger than 64kB).
- Add IP_LOCAL_PORT_RANGE socket option, to control local port range
on socket by socket basis.
- Track and report in procfs number of MPTCP sockets used.
- Support mixing IPv4 and IPv6 flows in the in-kernel MPTCP path
manager.
- IPv6: don't check net.ipv6.route.max_size and rely on garbage
collection to free memory (similarly to IPv4).
- Support Penultimate Segment Pop (PSP) flavor in SRv6 (RFC8986).
- ICMP: add per-rate limit counters.
- Add support for user scanning requests in ieee802154.
- Remove static WEP support.
- Support minimal Wi-Fi 7 Extremely High Throughput (EHT) rate
reporting.
- WiFi 7 EHT channel puncturing support (client & AP).
BPF:
- Add a rbtree data structure following the "next-gen data structure"
precedent set by recently added linked list, that is, by using
kfunc + kptr instead of adding a new BPF map type.
- Expose XDP hints via kfuncs with initial support for RX hash and
timestamp metadata.
- Add BPF_F_NO_TUNNEL_KEY extension to bpf_skb_set_tunnel_key to
better support decap on GRE tunnel devices not operating in collect
metadata.
- Improve x86 JIT's codegen for PROBE_MEM runtime error checks.
- Remove the need for trace_printk_lock for bpf_trace_printk and
bpf_trace_vprintk helpers.
- Extend libbpf's bpf_tracing.h support for tracing arguments of
kprobes/uprobes and syscall as a special case.
- Significantly reduce the search time for module symbols by
livepatch and BPF.
- Enable cpumasks to be used as kptrs, which is useful for tracing
programs tracking which tasks end up running on which CPUs in
different time intervals.
- Add support for BPF trampoline on s390x and riscv64.
- Add capability to export the XDP features supported by the NIC.
- Add __bpf_kfunc tag for marking kernel functions as kfuncs.
- Add cgroup.memory=nobpf kernel parameter option to disable BPF
memory accounting for container environments.
Netfilter:
- Remove the CLUSTERIP target. It has been marked as obsolete for
years, and we still have WARN splats wrt races of the out-of-band
/proc interface installed by this target.
- Add 'destroy' commands to nf_tables. They are identical to the
existing 'delete' commands, but do not return an error if the
referenced object (set, chain, rule...) did not exist.
Driver API:
- Improve cpumask_local_spread() locality to help NICs set the right
IRQ affinity on AMD platforms.
- Separate C22 and C45 MDIO bus transactions more clearly.
- Introduce new DCB table to control DSCP rewrite on egress.
- Support configuration of Physical Layer Collision Avoidance (PLCA)
Reconciliation Sublayer (RS) (802.3cg-2019). Modern version of
shared medium Ethernet.
- Support for MAC Merge layer (IEEE 802.3-2018 clause 99). Allowing
preemption of low priority frames by high priority frames.
- Add support for controlling MACSec offload using netlink SET.
- Rework devlink instance refcounts to allow registration and
de-registration under the instance lock. Split the code into
multiple files, drop some of the unnecessarily granular locks and
factor out common parts of netlink operation handling.
- Add TX frame aggregation parameters (for USB drivers).
- Add a new attr TCA_EXT_WARN_MSG to report TC (offload) warning
messages with notifications for debug.
- Allow offloading of UDP NEW connections via act_ct.
- Add support for per action HW stats in TC.
- Support hardware miss to TC action (continue processing in SW from
a specific point in the action chain).
- Warn if old Wireless Extension user space interface is used with
modern cfg80211/mac80211 drivers. Do not support Wireless
Extensions for Wi-Fi 7 devices at all. Everyone should switch to
using nl80211 interface instead.
- Improve the CAN bit timing configuration. Use extack to return
error messages directly to user space, update the SJW handling,
including the definition of a new default value that will benefit
CAN-FD controllers, by increasing their oscillator tolerance.
New hardware / drivers:
- Ethernet:
- nVidia BlueField-3 support (control traffic driver)
- Ethernet support for imx93 SoCs
- Motorcomm yt8531 gigabit Ethernet PHY
- onsemi NCN26000 10BASE-T1S PHY (with support for PLCA)
- Microchip LAN8841 PHY (incl. cable diagnostics and PTP)
- Amlogic gxl MDIO mux
- WiFi:
- RealTek RTL8188EU (rtl8xxxu)
- Qualcomm Wi-Fi 7 devices (ath12k)
- CAN:
- Renesas R-Car V4H
Drivers:
- Bluetooth:
- Set Per Platform Antenna Gain (PPAG) for Intel controllers.
- Ethernet NICs:
- Intel (1G, igc):
- support TSN / Qbv / packet scheduling features of i226 model
- Intel (100G, ice):
- use GNSS subsystem instead of TTY
- multi-buffer XDP support
- extend support for GPIO pins to E823 devices
- nVidia/Mellanox:
- update the shared buffer configuration on PFC commands
- implement PTP adjphase function for HW offset control
- TC support for Geneve and GRE with VF tunnel offload
- more efficient crypto key management method
- multi-port eswitch support
- Netronome/Corigine:
- add DCB IEEE support
- support IPsec offloading for NFP3800
- Freescale/NXP (enetc):
- support XDP_REDIRECT for XDP non-linear buffers
- improve reconfig, avoid link flap and waiting for idle
- support MAC Merge layer
- Other NICs:
- sfc/ef100: add basic devlink support for ef100
- ionic: rx_push mode operation (writing descriptors via MMIO)
- bnxt: use the auxiliary bus abstraction for RDMA
- r8169: disable ASPM and reset bus in case of tx timeout
- cpsw: support QSGMII mode for J721e CPSW9G
- cpts: support pulse-per-second output
- ngbe: add an mdio bus driver
- usbnet: optimize usbnet_bh() by avoiding unnecessary queuing
- r8152: handle devices with FW with NCM support
- amd-xgbe: support 10Mbps, 2.5GbE speeds and rx-adaptation
- virtio-net: support multi buffer XDP
- virtio/vsock: replace virtio_vsock_pkt with sk_buff
- tsnep: XDP support
- Ethernet high-speed switches:
- nVidia/Mellanox (mlxsw):
- add support for latency TLV (in FW control messages)
- Microchip (sparx5):
- separate explicit and implicit traffic forwarding rules, make
the implicit rules always active
- add support for egress DSCP rewrite
- IS0 VCAP support (Ingress Classification)
- IS2 VCAP filters (protos, L3 addrs, L4 ports, flags, ToS
etc.)
- ES2 VCAP support (Egress Access Control)
- support for Per-Stream Filtering and Policing (802.1Q,
8.6.5.1)
- Ethernet embedded switches:
- Marvell (mv88e6xxx):
- add MAB (port auth) offload support
- enable PTP receive for mv88e6390
- NXP (ocelot):
- support MAC Merge layer
- support for the the vsc7512 internal copper phys
- Microchip:
- lan9303: convert to PHYLINK
- lan966x: support TC flower filter statistics
- lan937x: PTP support for KSZ9563/KSZ8563 and LAN937x
- lan937x: support Credit Based Shaper configuration
- ksz9477: support Energy Efficient Ethernet
- other:
- qca8k: convert to regmap read/write API, use bulk operations
- rswitch: Improve TX timestamp accuracy
- Intel WiFi (iwlwifi):
- EHT (Wi-Fi 7) rate reporting
- STEP equalizer support: transfer some STEP (connection to radio
on platforms with integrated wifi) related parameters from the
BIOS to the firmware.
- Qualcomm 802.11ax WiFi (ath11k):
- IPQ5018 support
- Fine Timing Measurement (FTM) responder role support
- channel 177 support
- MediaTek WiFi (mt76):
- per-PHY LED support
- mt7996: EHT (Wi-Fi 7) support
- Wireless Ethernet Dispatch (WED) reset support
- switch to using page pool allocator
- RealTek WiFi (rtw89):
- support new version of Bluetooth co-existance
- Mobile:
- rmnet: support TX aggregation"
* tag 'net-next-6.3' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (1872 commits)
page_pool: add a comment explaining the fragment counter usage
net: ethtool: fix __ethtool_dev_mm_supported() implementation
ethtool: pse-pd: Fix double word in comments
xsk: add linux/vmalloc.h to xsk.c
sefltests: netdevsim: wait for devlink instance after netns removal
selftest: fib_tests: Always cleanup before exit
net/mlx5e: Align IPsec ASO result memory to be as required by hardware
net/mlx5e: TC, Set CT miss to the specific ct action instance
net/mlx5e: Rename CHAIN_TO_REG to MAPPED_OBJ_TO_REG
net/mlx5: Refactor tc miss handling to a single function
net/mlx5: Kconfig: Make tc offload depend on tc skb extension
net/sched: flower: Support hardware miss to tc action
net/sched: flower: Move filter handle initialization earlier
net/sched: cls_api: Support hardware miss to tc action
net/sched: Rename user cookie and act cookie
sfc: fix builds without CONFIG_RTC_LIB
sfc: clean up some inconsistent indentings
net/mlx4_en: Introduce flexible array to silence overflow warning
net: lan966x: Fix possible deadlock inside PTP
net/ulp: Remove redundant ->clone() test in inet_clone_ulp().
...
Diffstat (limited to 'Documentation/translations/zh_TW/process/stable-api-nonsense.rst')
-rw-r--r-- | Documentation/translations/zh_TW/process/stable-api-nonsense.rst | 159 |
1 files changed, 159 insertions, 0 deletions
diff --git a/Documentation/translations/zh_TW/process/stable-api-nonsense.rst b/Documentation/translations/zh_TW/process/stable-api-nonsense.rst new file mode 100644 index 000000000..22caa5b8d --- /dev/null +++ b/Documentation/translations/zh_TW/process/stable-api-nonsense.rst @@ -0,0 +1,159 @@ +.. SPDX-License-Identifier: GPL-2.0 + +.. _tw_stable_api_nonsense: + +.. include:: ../disclaimer-zh_TW.rst + +:Original: :ref:`Documentation/process/stable-api-nonsense.rst + <stable_api_nonsense>` + +譯者:: + + 中文版維護者: 鍾宇 TripleX Chung <xxx.phy@gmail.com> + 中文版翻譯者: 鍾宇 TripleX Chung <xxx.phy@gmail.com> + 中文版校譯者: 李陽 Li Yang <leoyang.li@nxp.com> + 胡皓文 Hu Haowen <src.res@email.cn> + +Linux 內核驅動接口 +================== + +寫作本文檔的目的,是爲了解釋爲什麼Linux既沒有二進位內核接口,也沒有穩定 +的內核接口。這裡所說的內核接口,是指內核里的接口,而不是內核和用戶空間 +的接口。內核到用戶空間的接口,是提供給應用程式使用的系統調用,系統調用 +在歷史上幾乎沒有過變化,將來也不會有變化。我有一些老應用程式是在0.9版本 +或者更早版本的內核上編譯的,在使用2.6版本內核的Linux發布上依然用得很好 +。用戶和應用程式作者可以將這個接口看成是穩定的。 + + +執行綱要 +-------- + +你也許以爲自己想要穩定的內核接口,但是你不清楚你要的實際上不是它。你需 +要的其實是穩定的驅動程序,而你只有將驅動程序放到公版內核的原始碼樹里, +才有可能達到這個目的。而且這樣做還有很多其它好處,正是因爲這些好處使得 +Linux能成爲強壯,穩定,成熟的作業系統,這也是你最開始選擇Linux的原因。 + + +入門 +----- + +只有那些寫驅動程序的「怪人」才會擔心內核接口的改變,對廣大用戶來說,既 +看不到內核接口,也不需要去關心它。 + +首先,我不打算討論關於任何非GPL許可的內核驅動的法律問題,這些非GPL許可 +的驅動程序包括不公開原始碼,隱藏原始碼,二進位或者是用原始碼包裝,或者 +是其它任何形式的不能以GPL許可公開原始碼的驅動程序。如果有法律問題,請咨 +詢律師,我只是一個程式設計師,所以我只打算探討技術問題(不是小看法律問題, +法律問題很實際,並且需要一直關注)。 + +既然只談技術問題,我們就有了下面兩個主題:二進位內核接口和穩定的內核源 +代碼接口。這兩個問題是互相關聯的,讓我們先解決掉二進位接口的問題。 + + +二進位內核接口 +-------------- +假如我們有一個穩定的內核原始碼接口,那麼自然而然的,我們就擁有了穩定的 +二進位接口,是這樣的嗎?錯。讓我們看看關於Linux內核的幾點事實: + + - 取決於所用的C編譯器的版本,不同的內核數據結構里的結構體的對齊方 + 式會有差別,代碼中不同函數的表現形式也不一樣(函數是不是被inline + 編譯取決於編譯器行爲)。不同的函數的表現形式並不重要,但是數據 + 結構內部的對齊方式很關鍵。 + + - 取決於內核的配置選項,不同的選項會讓內核的很多東西發生改變: + + - 同一個結構體可能包含不同的成員變量 + - 有的函數可能根本不會被實現(比如編譯的時候沒有選擇SMP支持 + 一些鎖函數就會被定義成空函數)。 + - 內核使用的內存會以不同的方式對齊,這取決於不同的內核配置選 + 項。 + + - Linux可以在很多的不同體系結構的處理器上運行。在某個體系結構上編 + 譯好的二進位驅動程序,不可能在另外一個體系結構上正確的運行。 + +對於一個特定的內核,滿足這些條件並不難,使用同一個C編譯器和同樣的內核配 +置選項來編譯驅動程序模塊就可以了。這對於給一個特定Linux發布的特定版本提 +供驅動程序,是完全可以滿足需求的。但是如果你要給不同發布的不同版本都發 +布一個驅動程序,就需要在每個發布上用不同的內核設置參數都編譯一次內核, +這簡直跟噩夢一樣。而且還要注意到,每個Linux發布還提供不同的Linux內核, +這些內核都針對不同的硬體類型進行了優化(有很多種不同的處理器,還有不同 +的內核設置選項)。所以每發布一次驅動程序,都需要提供很多不同版本的內核 +模塊。 + +相信我,如果你真的要採取這種發布方式,一定會慢慢瘋掉,我很久以前就有過 +深刻的教訓... + + +穩定的內核原始碼接口 +-------------------- + +如果有人不將他的內核驅動程序,放入公版內核的原始碼樹,而又想讓驅動程序 +一直保持在最新的內核中可用,那麼這個話題將會變得沒完沒了。 +內核開發是持續而且快節奏的,從來都不會慢下來。內核開發人員在當前接口中 +找到bug,或者找到更好的實現方式。一旦發現這些,他們就很快會去修改當前的 +接口。修改接口意味著,函數名可能會改變,結構體可能被擴充或者刪減,函數 +的參數也可能發生改變。一旦接口被修改,內核中使用這些接口的地方需要同時 +修正,這樣才能保證所有的東西繼續工作。 + +舉一個例子,內核的USB驅動程序接口在USB子系統的整個生命周期中,至少經歷 +了三次重寫。這些重寫解決以下問題: + + - 把數據流從同步模式改成非同步模式,這個改動減少了一些驅動程序的 + 複雜度,提高了所有USB驅動程序的吞吐率,這樣幾乎所有的USB設備都 + 能以最大速率工作了。 + - 修改了USB核心代碼中爲USB驅動分配數據包內存的方式,所有的驅動都 + 需要提供更多的參數給USB核心,以修正了很多已經被記錄在案的死鎖。 + +這和一些封閉原始碼的作業系統形成鮮明的對比,在那些作業系統上,不得不額 +外的維護舊的USB接口。這導致了一個可能性,新的開發者依然會不小心使用舊的 +接口,以不恰當的方式編寫代碼,進而影響到作業系統的穩定性。 +在上面的例子中,所有的開發者都同意這些重要的改動,在這樣的情況下修改代 +價很低。如果Linux保持一個穩定的內核原始碼接口,那麼就得創建一個新的接口 +;舊的,有問題的接口必須一直維護,給Linux USB開發者帶來額外的工作。既然 +所有的Linux USB驅動的作者都是利用自己的時間工作,那麼要求他們去做毫無意 +義的免費額外工作,是不可能的。 +安全問題對Linux來說十分重要。一個安全問題被發現,就會在短時間內得到修 +正。在很多情況下,這將導致Linux內核中的一些接口被重寫,以從根本上避免安 +全問題。一旦接口被重寫,所有使用這些接口的驅動程序,必須同時得到修正, +以確定安全問題已經得到修復並且不可能在未來還有同樣的安全問題。如果內核 +內部接口不允許改變,那麼就不可能修復這樣的安全問題,也不可能確認這樣的 +安全問題以後不會發生。 +開發者一直在清理內核接口。如果一個接口沒有人在使用了,它就會被刪除。這 +樣可以確保內核儘可能的小,而且所有潛在的接口都會得到儘可能完整的測試 +(沒有人使用的接口是不可能得到良好的測試的)。 + + +要做什麼 +-------- + +如果你寫了一個Linux內核驅動,但是它還不在Linux原始碼樹里,作爲一個開發 +者,你應該怎麼做?爲每個發布的每個版本提供一個二進位驅動,那簡直是一個 +噩夢,要跟上永遠處於變化之中的內核接口,也是一件辛苦活。 +很簡單,讓你的驅動進入內核原始碼樹(要記得我們在談論的是以GPL許可發行 +的驅動,如果你的代碼不符合GPL,那麼祝你好運,你只能自己解決這個問題了, +你這個吸血鬼<把Andrew和Linus對吸血鬼的定義連結到這裡>)。當你的代碼加入 +公版內核原始碼樹之後,如果一個內核接口改變,你的驅動會直接被修改接口的 +那個人修改。保證你的驅動永遠都可以編譯通過,並且一直工作,你幾乎不需要 +做什麼事情。 + +把驅動放到內核原始碼樹里會有很多的好處: + + - 驅動的質量會提升,而維護成本(對原始作者來說)會下降。 + - 其他人會給驅動添加新特性。 + - 其他人會找到驅動中的bug並修復。 + - 其他人會在驅動中找到性能優化的機會。 + - 當外部的接口的改變需要修改驅動程序的時候,其他人會修改驅動程序 + - 不需要聯繫任何發行商,這個驅動會自動的隨著所有的Linux發布一起發 + 布。 + +和別的作業系統相比,Linux爲更多不同的設備提供現成的驅動,而且能在更多不 +同體系結構的處理器上支持這些設備。這個經過考驗的開發模式,必然是錯不了 +的 :) + +感謝 +---- +感謝 Randy Dunlap, Andrew Morton, David Brownell, Hanna Linder, +Robert Love, and Nishanth Aravamudan 對於本文檔早期版本的評審和建議。 + +英文版維護者: Greg Kroah-Hartman <greg@kroah.com> + |