AppleTech News

Linux 5.19: jumbo-sized IP packets and support for LoongArch

The open source kernel now supports the Chinese CPU architecture LoongArch and prepares the network stack for bandwidths beyond 100 Gbit/s.

The new Linux kernel 5.19 has been released, a week later than planned. It was one of the rare cases in which Linus Torvalds gave the kernel an extra week of testing. One reason was Retbleed, a variant of the Spectre CPU vulnerability. Patches to fix the vulnerability caused a performance hit in the seventh release candidate (rc7), prompting further fixes and more extensive testing. Btrfs patches that were withdrawn at short notice in rc7 added further work to an already busy development cycle.

In addition to Retbleed, the graphics driver for Intel Alder Lake P also caused trouble. In rc7, the new graphics driver required the GuC firmware to be updated to version 70. With older firmware, graphics only worked without hardware acceleration.

This violated a core principle of kernel development: changes to the kernel must never break userspace (“never break userspace”). Intel then had to deliver a fix that provides the usual features with older firmware as well. Otherwise the new driver could have been dropped from the kernel.

Red Hat developer David Airlie then drafted firmware guidelines to put this unwritten rule in writing. They are to be included in the upcoming kernel as part of the documentation.

Linux 5.19 brings BIG TCP for data centers and providers. To get around the 64 KB maximum for IP packets, directly connected network hosts (“hops”) can negotiate packet sizes of up to 4 GB in IPv6 using special headers. To do this, these “jumbo packets” (also called jumbograms) set the length field in the IPv6 header to zero and add a hop-by-hop header that carries the actual packet length.

With 5.19, the mainline kernel now also supports the hop-by-hop jumbo payload option defined in RFC 2675 for TCP. The kernel is thus prepared to serve bandwidths beyond 100 Gbit/s in the future.
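The header layout described above can be sketched in a few lines of Python. This is only an illustration of the RFC 2675 wire format, not the kernel's implementation; the function name `build_jumbo_headers` and the hop limit of 64 are assumptions made for the example.

```python
import struct

IPPROTO_HOPOPTS = 0     # "next header" value for the hop-by-hop options header
IPPROTO_TCP = 6         # "next header" value for TCP
JUMBO_OPT_TYPE = 0xC2   # Jumbo Payload option type (RFC 2675)
JUMBO_OPT_LEN = 4       # option data: one 32-bit length field
HBH_HEADER_LEN = 8      # hop-by-hop header carrying only the jumbo option


def build_jumbo_headers(src: bytes, dst: bytes, tcp_segment_len: int) -> bytes:
    """Return the 40-byte IPv6 header followed by the 8-byte hop-by-hop header.

    Hypothetical helper for illustration; src and dst are 16-byte addresses.
    """
    # The jumbo length counts everything after the fixed IPv6 header,
    # including the hop-by-hop header itself.
    jumbo_len = HBH_HEADER_LEN + tcp_segment_len
    if not 0xFFFF < jumbo_len <= 0xFFFFFFFF:
        raise ValueError("jumbo payload length must be in 65536..2**32-1")

    version_tc_flow = 6 << 28  # version 6, traffic class 0, flow label 0
    ipv6_header = struct.pack(
        "!IHBB16s16s",
        version_tc_flow,
        0,                 # 16-bit payload length is zero for jumbograms
        IPPROTO_HOPOPTS,   # the hop-by-hop header comes next, not TCP
        64,                # hop limit (arbitrary value for this sketch)
        src,
        dst,
    )
    hop_by_hop = struct.pack(
        "!BBBBI",
        IPPROTO_TCP,       # TCP follows the hop-by-hop header
        0,                 # header extension length: 0 means 8 bytes in total
        JUMBO_OPT_TYPE,
        JUMBO_OPT_LEN,
        jumbo_len,         # the real 32-bit packet length
    )
    return ipv6_header + hop_by_hop
```

Calling `build_jumbo_headers(src, dst, 100_000)` yields 48 bytes of headers: the 16-bit length field at offset 4 is zero, and the last four bytes carry the real length.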

In practice, however, jumbo packets have their pitfalls and are therefore not enabled by default. For example, many tools such as tcpdump and some eBPF programs trip over the extra header. They expect the TCP header right after the IP header. With BIG TCP, the hop-by-hop header sits between the two, which confuses them.
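The difference between a parser that trips over the extra header and one that does not can be illustrated with a small Python sketch. Both functions are hypothetical and greatly simplified; they are not taken from tcpdump or any eBPF program, and only three common extension header types are handled.

```python
IPV6_HEADER_LEN = 40
IPPROTO_TCP = 6
# Extension headers whose first two bytes are "next header" and
# "length in 8-byte units, not counting the first 8 bytes":
# hop-by-hop options (0), routing (43), destination options (60).
EXTENSION_HEADERS = {0, 43, 60}


def naive_tcp_offset(packet: bytes):
    """Assume TCP directly follows the fixed IPv6 header."""
    return IPV6_HEADER_LEN if packet[6] == IPPROTO_TCP else None


def find_tcp_offset(packet: bytes):
    """Walk the extension header chain until TCP is found."""
    next_header = packet[6]
    offset = IPV6_HEADER_LEN
    while next_header in EXTENSION_HEADERS:
        if offset + 2 > len(packet):
            return None  # truncated packet
        following = packet[offset]
        ext_len = packet[offset + 1]
        offset += (ext_len + 1) * 8
        next_header = following
    return offset if next_header == IPPROTO_TCP else None
```

For a packet with a hop-by-hop header, `naive_tcp_offset` finds no TCP header at all, while `find_tcp_offset` skips the eight extra bytes and returns offset 48.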

In addition, the network drivers must be designed for this, and by no means all of them have been adapted yet. This will keep future kernel releases busy, because the adaptation is not trivial. For example, some network cards generate packets directly in the interface hardware via segmentation offloading, which complicates the work.

Multipath TCP (MPTCP) bundles multiple TCP connections into one session. This allows multiple network paths to be used for a connection between systems. This is needed, for example, to use two Internet uplinks in parallel, whether to increase download speed or to survive the failure of one link without interruption. The individual TCP connections that make up an MPTCP session are called subflows.

Until now, a connection created as MPTCP always stayed MPTCP, even if it contained only one subflow. The new kernel allows MPTCP to fall back to “normal” TCP in selected situations when the MPTCP features cannot be used, shedding the MPTCP overhead when it adds no value.

Linux 5.19 also has a new userspace API to manage subflows, for example to add subflows to an MPTCP connection or remove them. The documentation on this is still very thin. Interested readers can find initial information in the commit message of the relevant commit.
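For readers who have never opened an MPTCP connection, the following Python sketch shows how an application requests one on Linux. It does not demonstrate the new subflow API or the in-kernel fallback described above; it only shows an application-level fallback to plain TCP when the kernel offers no MPTCP at all. The helper name `open_stream_socket` is an assumption for this example.

```python
import socket

# Linux uses protocol number 262 for MPTCP. Python exposes the constant
# as socket.IPPROTO_MPTCP from version 3.10 on; older versions need the
# numeric value.
IPPROTO_MPTCP = getattr(socket, "IPPROTO_MPTCP", 262)


def open_stream_socket(family: int = socket.AF_INET) -> socket.socket:
    """Return an MPTCP socket if the kernel supports it, else a TCP socket."""
    try:
        return socket.socket(family, socket.SOCK_STREAM, IPPROTO_MPTCP)
    except OSError:
        # MPTCP is not compiled in, is disabled, or the platform is not
        # Linux: use an ordinary TCP socket instead.
        return socket.socket(family, socket.SOCK_STREAM, socket.IPPROTO_TCP)
```

The returned socket is used like any other stream socket. Whether additional subflows are created is decided by the path manager, not by the application code shown here.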

With the new version 5.19, Linux supports the Chinese CPU architecture LoongArch for the first time. The RISC architecture is very similar to MIPS and RISC-V. In addition to a 32-bit standard variant (LA32S), it also offers a slimmed-down 32-bit version (LA32R) and a 64-bit architecture (LA64).

In the x86 environment, the new kernel drops a number of boot options: nosep, nosmap, nosmep, noexec, and noclflush. Until now, these switches on the kernel command line (cmdline) made it possible to turn off security-related features. They were originally introduced for backward compatibility with old hardware and no longer make sense today. The security gain these features provide is now considered standard and should no longer be optional.
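Administrators who want to check whether a boot configuration still uses one of these switches could use a small script along the following lines. It is a minimal sketch: the function name `find_removed_options` is made up for this example, and the list contains only the five options named above.

```python
# Boot options named in the article as removed in Linux 5.19 on x86
REMOVED_IN_5_19 = {"nosep", "nosmap", "nosmep", "noexec", "noclflush"}


def find_removed_options(cmdline: str) -> list:
    """Return all tokens of a kernel command line that use a removed option."""
    found = []
    for token in cmdline.split():
        # Options may carry a value, for example "noexec=off"
        name = token.split("=", 1)[0]
        if name in REMOVED_IN_5_19:
            found.append(token)
    return found
```

On a running Linux system, the current command line can be read from `/proc/cmdline` and passed to the function as a string.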

Linux 5.19 drops the old a.out format for x86 executables. The historical a.out format has several disadvantages and has long since been superseded in Linux by the more modern ELF (Executable and Linking Format). Version 5.1 had already marked a.out as deprecated.

The h8300 architecture has been on a roller coaster ride. Support for these microcontrollers was removed from the kernel as obsolete in 2013 and returned to the mainline kernel two years later. Since no significant work has been done on it since then, Linux is removing h8300 once again. It remains to be seen whether another revival will follow. H8/300 CPUs are still used in the embedded environment.

When it comes to power management, the kernel fixes a major problem on Intel notebooks: when the system tried to go into deep sleep, the battery drained quickly. Intel has followed up and fixed the problem for Linux 5.19.

Apart from that, the latest kernel improves power management on Intel systems, refines ACPI support and updates temperature monitoring. The newly added, not yet released processors of the Raptor Lake and Alder Lake N generations also benefit from the improved power management.

When it comes to file systems, the innovations in XFS stand out. The file system can now store not just 4 billion extended attributes in an inode, but up to 2⁴⁷. What sounds excessive is aimed at specific use cases. For example, the XFS developers want to use it to store reverse directory references for online checking and repair of the file system, as well as data on internal tree structures for fs-verity. Instead of developing an additional database for such information, the developers decided to use the key-value store that already exists implicitly in the form of extended attributes.

XFS in Linux 5.19 also provides “logged attribute replay”. This mechanism makes it possible to change multiple extended attributes of a file in a single atomic operation. This means that file system states spanning multiple attributes can always be kept consistent.

Linus Torvalds added a personal note to his release announcement. For the first time, he released a kernel from an ARM64-based system. He had been waiting for this for a “long” time. Linux has supported ARM64 for quite some time, but until now no system was suitable as a development platform.

He used an M1-based MacBook to release 5.19. It is the third time he has used Apple hardware for Linux development. Previously he used a PowerMac G5 (ppc970) for PowerPC development and a MacBook Air as the lightest Intel notebook available at the time.

He now also wants to use the M1 system when traveling so that he actually uses the ARM64 side of his own development work. He calls this “dogfooding”, short for “eating your own dog food”, which essentially means using your own products and services.

Since the release of Linux 5.19, the two-week merge window has been open for submitting new features and changes for the next kernel. As Linus Torvalds hints in his release note, the next kernel will not carry the expected version number 5.20. True to a “dislike for large version numbers”, Torvalds plans to christen it 6.0. The version jump has no technical reasons, only organizational ones. The situation was similar with the jump from 4.20 to 5.0.

All changes and innovations from 5.19 are available for reading in the ChangeLog. The kernel sources are available for download at kernel.org.
