The complexities of enabling OpenCL support

Hello, and welcome back to FOSS Fridays! One of the final preparations for the release of Adélie Linux 1.0-beta6 has been updating the graphical stack to support Wayland and the latest advancements in Linux graphics. This includes updating Mesa 3D. It’s been quite exciting doing the enablement work and seeing Wayfire and Sway running on a wide variety of computers in the Lab. And as part of our effort towards enabling Wayland everywhere, we have added a lot of support for Vulkan and SPIR-V. This has the side-effect of allowing us to build Mesa’s OpenCL support as well.

With Adélie Linux, as with every project we work on at WTI, we are proud to do our very best to offer the same feature set across all of our supported platforms. When we enable a feature, we work hard to enable it everywhere. Since OpenCL is now force-enabled by the Intel “Iris” Gallium driver, in addition to the Vulkan driver, I set off to ensure it was enabled everywhere.

Once again with LLVM

Mesa’s OpenCL support requires libclc, which is an LLVM plugin for OpenCL. This library in turn requires SPIRV-LLVM-Translator, which allows one to translate between LLVM IR and SPIR-V binaries. The Translator uses the familiar Lit test suite, like other components of LLVM. Unfortunately, there were a significant number of test failures on 64-bit PowerPC (big endian):

Total Discovered Tests: 821
  Passed           : 303 (36.91%)
  Failed           : 510 (62.12%)

Further down the rabbit hole

Digging in, I noticed that we had actually skipped tests in glslang because it had one test failure on big endian platforms. And while digging in to that failure, I found that the base SPIRV-Tools package was not handling cross-endian binaries correctly. That is, when run on a big endian system, it would fail to validate and disassemble little endian binaries, and when run on a little endian system, it would fail to validate and disassemble big endian binaries.

I found an outstanding merge request from a year ago against SPIRV-Tools which claimed to fix some endian issues. Applying that to our SPIRV-Tools package allowed those tools to function correctly for me. I then turned back to glslang and determined that the standalone remap tool was simply reading in the binary and assuming it would always match the host’s endianness. I’ve written a patch and submitted it in my issue report, but I am not happy with it and hope to improve it before opening a merge request.

Back to the familiar

I began looking around at the failures in SPIRV-LLVM-Translator and noticed that a lot of them seemed to revolve around unsafe assumptions on endianness. There were a lot of errors of the form:

error: line 11: Invalid extended instruction import ‘nepOs.LC’

Note that this string is actually ‘OpenCL.s’, byte-swapped on a 32-bit stride. Researching the specification, it defines strings as:

A string is interpreted as a nul-terminated stream of characters. All string comparisons are case sensitive. The character set is Unicode in the UTF-8 encoding scheme. The UTF-8 octets (8-bit bytes) are packed four per word, following the little-endian convention (i.e., the first octet is in the lowest-order 8 bits of the word).

As an aside, I want to express my deep and earnest gratitude that standards bodies are still paying attention to endianness and ensuring their standards will work on the widest number of platforms. This is a very good thing, and the entire community of software engineering is better for it. This also serves as a great example of the wide scope in which we, as those engineers responsible for portability and maintainability, need to be aware and pay attention.

The standards language quoted above means that on a big endian system, it should actually be written to disk/memory as ‘nepOs.LC’. The Translator was not doing this encoding and therefore the binaries were not correct. I attempted to look at how the strings were serialised, and I believe I found the answer in lib/SPIRV/SPIRVStream.cpp, but it seemed like it would be a challenge to do things the correct way. I decided that for the moment, it would be enough to make the translator operate only on little endian SPIR-V files. After massaging the SPIRVStream.h file to swap when running on a big endian system, I significantly reduced the count of failing tests:

Total Discovered Tests: 821
  Passed           : 500 (60.90%)
  Failed           : 313 (38.12%)

However, now we had some interesting looking errors:

test/DebugInfo/X86/static_member_array.ll:50:10: error: CHECK: expected string not found in input
; CHECK: DW_AT_count {{.*}} (0x04)
         ^
<stdin>:73:33: note: scanning from here
0x00000096: DW_TAG_subrange_type [10] (0x00000091)
                                ^
<stdin>:76:2: note: possible intended match here
 DW_AT_count [DW_FORM_data8] (0x0000000400000000)
 ^

You will note that the failure is that 0x04 != 0x04’0000’0000. This is what happens if you store a 32-bit value into a 64-bit pointer using bad casting, which is very similar to the Clang bug I found and fixed last month. On a hunch, I decided to look at all of the reinterpret_casts used in the translator’s code base, and I hit something that seemed promising. SPIRVValue::getValue, where they were doing equally questionable things with pointers, was amenable to a quick change:

#if __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
    if (ValueSize > CopyBytes) {
      if (ValueSize == 8 && CopyBytes == 4) {
        uint8_t *foo = reinterpret_cast<uint8_t *>(&TheValue);
        foo += 4;
        std::memcpy(&foo, Words.data(), CopyBytes);
        return TheValue;
      }
      assert(ValueSize == CopyBytes && "Oh no");
    }
#endif

Unfortunately, this only fixed eight tests. The last time there was an endian bug in an LLVM component, it was somewhat easier to find because I could bisect the error. There was a version at which point it had worked in the past, so the change revealed where the error was hiding. Not so with this issue: it seems to have been present since the translator was first written.

Upstream has been apprised of this issue nearly a year ago, with no movement. The amount of time it would take to do a proper root cause analysis on a codebase of this size (65,000+ lines of LLVM plugin style C++) would be prohibitive for the time constraints on beta6. I’d like to work on it, but this is a low priority unless someone wants to contract us to work on it.

The true impact of non-portable code

There is no way to conditionalise a dependency on architecture in an abuild package recipe. This means even if we condition enabling OpenCL support in Mesa on x86_64, Mesa on all other platforms would still pull the broken translator as a build-time dependency. It is likely that libclc itself would not build properly with a broken translator, meaning Mesa’s dependency graph would be incomplete on every other architecture due to libclc as well.

Unfortunately, this means I had to make the difficult decision to disable OpenCL globally, including on x86_64. We never had OpenCL support, so this isn’t a great loss for our users, and we can always come back to it later. However, a few of Mesa’s drivers require OpenCL: namely, the Intel Vulkan driver, and the Intel “Iris” Gallium driver, which supports Broadwell and newer. On these systems, using the iGPU will fall back to software rendering only. We cannot offer hardware acceleration on these systems until we enable OpenCL, for reasons that are not entirely clear to me. It was possible before, and while I understand the performance enhancements with writing shaders this way, not providing a fallback option is really binding us here.

It is my sincere hope that software authors, both corporate and individual, begin to realise the importance of portability and maintainability. If the SPIRV-LLVM-Translator was written with more portability in mind, this wouldn’t be an issue. If the Intel driver in Mesa was written with fewer dependencies in mind, this wouldn’t be an issue. The future is bright if we can all work together to create good, clean code. I greatly look forward to that future.

Until then, at least software rendering should work on Broadwell and newer, right?

Fear and loathing in kernel building

After a long and somewhat unreasonable delay, I have returned to bringing the Adélie Linux kernel package up to date with the latest LTS release, which at the time of this writing is 6.6.58.

Presently, we use the 5.15 LTS branch. I am hoping to see us land the 6.6 branch so that we can have support for newer hardware, features, and devices. There is also hope that there will be significant DRM improvements, allowing a better desktop experience for everyone.

Unfortunately, when it came time to build the x86_64 package, it failed to build. The kernel now requires elfutils to build an x86_64 kernel, even with CPU security issue mitigations disabled – and we don’t want to disable them anyway.

The elfutils library, being part of the GNU project, heavily relies on APIs that are only available in the GNU libc. It is not possible to build elfutils on a musl system without multiple shim libraries, in addition to patching out other behaviour that cannot be stubbed:

I have always been somewhat mistrustful of including software in the critical path that is not maintained and audited. And building the kernel is the most critical path in a distribution.

For this reason, if we must include an argp implementation, I want to make sure it is the best possible implementation we can have.

“Choice”: slim to none

I found a number of libraries that implement the argp interface, but all of them present significant challenges:

  • libargp: Based on gnulib code. Last commit: 9 years ago. Does not accept issues on GitHub, and links to a Bitbucket repository that has been removed. 9 years ago, gnulib didn’t support musl at all. In addition, the lack of ability to contact upstream isn’t great.
  • argp-standalone (Niels Möller edition): Based on glibc, which is what we are trying to emulate anyway. Last release: 20 years ago, which is approaching legal drinking age in the US. Pass.
  • argp-standalone (Érico edition): Based on glibc again. Last commit: 2 years ago. Okay, reasonable. The issues and pull requests are piling up, though: the build system isn’t generated in the released tar files, the install target doesn’t work, a shared library isn’t supported, and more.
  • argp-standalone (org edition): Somewhat we are now three forks deep. Last commit: just three months ago! It uses Meson as a build system, it seems to care about portability… but it fails to support non-English locales. It also appears to have issues with building on GCC 14, possibly, which could present issues in the future. They have self-identified that they should make a new release in June 2023, which is great, but they didn’t actually do it.

Honestly, the last option in that list isn’t so bad. Translation support could always be added later. However, these packages need to be added to our very core critical path of early packages built for the whole system. For that reason, we need to be excessively picky about:

  • Quality of implementation — is this trustworthy enough to be at the very centre of our dependency graph?
  • Dependencies — the Meson build system is great, but that introduces either Python or Muon into the very early graph, before the kernel is even built, which means kernel headers aren’t available.
  • Infrequency of updates — realistically, since changing packages this deep in the graph necessitates rebuilds of everything, updates cannot be done frequently.

And it is for that reason I am annoyed at this situation. The kernel has introduced a build time dependency that, at least on musl libc systems, presents a lot of uncomfortable challenges.

Oh well, it could be worse. It could be the Rust compiler, which means making Rust, LLVM, and all of their dependencies part of the early graph, and meaning that Rust compiler updates have to pass through the Platform Group and cause full system rebuilds!

I’ll see myself out now.

A funny thing happened on the way to the Java bootstrap…

One of my best friends, Horst, is working on “bootstrapping” Java in the Adélie Linux distribution. This means we will be able to build the Java runtime entirely from source, not relying on binaries from Oracle or others – which means we can certify and trust that our Java is free from any third-party code. For a little “light reading” on this subject, see Ken Thompson’s seminal 1983 paper, Reflections on Trusting Trust, and the Bootstrappable Builds site.

Roughly, the Java bootstrap looks like this: build a very old Java runtime that was written in C++, use that to build a slightly less old Java runtime written in Java, and up from there. And while the very old Java runtime that he was using built fine on his AMD Ryzen system, and also seemed to work great on the Arm-based Raspberry Pi, it hung (locked up, froze) on my Power9-based Talos II.

Looking up JamVM on PowerPC systems, Horst found a Gentoo bug from 2007 that describes the issue exactly. There was no solution found by the Gentoo maintainers; they simply removed the ability of PPC64 computers to install the JamVM package.

This obviously wouldn’t do for us. Adélie treats the PPC64 architecture as tier-1: we have to make sure everything works on PPC64 systems. So, I dove into the code and began spelunking around for causes.

The code that ensures thread safety seemed to be at fault, but I couldn’t tease out the real issue at first. I built JamVM with ASAN and UBSAN and found a cavalcade of awfulness, including a home-grown memory allocator that was returning misaligned addresses. That’s sure to give us trouble if we ever endeavour towards SPARC. There were a few signedness errors and a single byte buffer-overrun as well.

I fixed all of those issues, and while JamVM was now “sanitiser-clean” and also reported no issues in Valgrind, it was still hanging. I added some debug statements to the COMPARE_AND_SWAP routine and found that no debugging happened. I thought that was odd, but then I saw the problem: the 64-bit code was set to only compile if “__ppc64__” was defined. Linux defines “__PPC64__”, in all-caps, not lowercase.

I changed that, and for good measure also fixed the types it uses for reading the swapped variable and returning the result, and recompiled. Lo and behold, JamVM was now working on my 64-bit PowerPC.

And that is how I accidentally stumbled upon, and fixed, a 17-year-old bug in an ancient Java runtime. That bug would have been old enough to drive…

The patch is presently fermenting in a branch in Adélie’s packages.git, but will eventually land as bootstrap/jamvm/ppc64-lock.patch.

Porting systemd to musl libc-powered Linux

I have completed an initial new port of systemd to musl. This patch set does not share much in common with the existing OpenEmbedded patchset. I wanted to make a fully updated patch series targeting more current releases of systemd and musl, taking advantage of the latest features and updates in both. I also took a focus on writing patches that could be sent for consideration of inclusion upstream.

The final result is a system that appears to be surprisingly reliable considering the newness of the port, and very fast to boot.

Why?

I have wanted to do this work for almost a decade. In fact, a mention of multiple service manager options – including systemd – is present on the original Adélie Web site from 2015. Other initiatives have always taken priority, until someone contacted us at Wilcox Technologies Inc. (WTI) interested in paying on a contract basis to see this effort completed.

I want to be clear that I did not do this for money. I believe strongly that there is genuine value in having multiple service managers available. User freedom and user choice matter. There are cases where this support would have been useful to me and to many others in the community. I am excited to see this work nearing public release and honoured to be a part of creating more choice in the Linux world.

How?

I started with the latest release tag, v256.5. I wanted a version closely aligned to upstream’s current progress, yet not too far away from the present “stable” 255 release. I also wanted to make sure that the fallout from upstream’s removal of split-/usr support would be felt to its maximum, since reverting that decision is a high priority.

I fixed build errors as they happened until I finally had a built systemd. During this phase, I consulted the original OE patchset twice: once for usage of GLOB_BRACE, and the other for usage of malloc_info and malloc_trim. Otherwise, the patchset was authored entirely originally, mostly through the day (and into the night) of August 16th, 2024.

Many of the issues seen were related to inclusion of headers, and I am already working on bringing those fixes upstream. It was then time to run the test suite.

Tests!

The test suite started with 27 failures. Most of them were simple fixes, but one that gave me a lot of trouble was the time-util test. The strptime implementation in musl does not support the %z format specifier (for time zones), which the systemd test relies on. I could have disabled those tests, but I felt like this would be taking away a lot of functionality. I considered things like important journals from other systems – they would likely have timestamps with %z formats. I wrote a %z translation for systemd and saw the tests passing.

Other test failures were simple C portability fixes, which are also in the process of being sent upstream.

The test suite for systemd-sysusers was the next sticky one. It really exercises the POSIX library functions getgrent and getpwent. The musl implementations of these are fine, but they don’t cope well with the old NIS compatibility shims from the glibc world. They also can’t handle “incomplete” lines. The fix for incomplete line handling is pending, so in the meantime I made the test have no incomplete lines. I added a shim for the NIS compatibility entries in systemd’s putgrent_sane function, making it a little less “sane” but fixing the support perfectly.

Then it was time for the final failing test: test-recurse-dir, which was receiving an EFAULT error code from getdents64. Discussing this with my friends on the Gentoo IRC, we began to wonder if this was an architecture-specific bug. I was doing my port work on my Talos II, a 64-bit PowerPC system. I copied the code over to an Intel Skylake and found the test suite passed. That was both good, in that the tests were all passing, but also bad, because it meant I was dealing with a PPC64-specific bug. I wasn’t sure if this was a kernel bug, a musl bug, or a systemd bug.

Digging into it further, I realised that the pointer math being done would be invalid when cast to a pointer-to-structure on PPC64 due to object alignment guarantees in the ABI. I changed it to use a temporary variable for the pointer math and casting that temporary, and it passed!

And that is how I became the first person alive to see systemd passing its entire test suite on a big-endian 64-bit PowerPC musl libc system.

The moment of truth

I created a small disk image and ran a very strange command: apk add adelie-base-posix dash-binsh systemd. I booted it up as a KVM VM in Qemu and saw “Welcome to Adélie Linux 1.0 Beta 5” before a rather ungraceful – and due to Qemu framebuffer endian issues, colour-swapped – segmentation fault:

Welcome to an endian-swapped systemd core dump!

Debugging this was an experience in early systems debugging that I haven’t had in years. There’s a great summary on this methodology at Linus’s blog.

It turned out that I had disabled a test from build-util as I incorrectly assumed that was only used when debugging in the build root. Since I did not want to spend time digging around how it manually parses ELF files to find their RPATH entries for a feature we are unlikely to use, I stubbed that functionality out entirely. We can always fix it later.

Recreating the disk image and booting it up, I was greeted by an Adélie “rescue” environment booted by systemd. It was frankly bizarre, but also really cool.

The first time systemd ever booted an Adélie Linux system.

From walking to flying

Next, I built test packages on the Skylake builder we are using for x86_64 development. I have a 2012 MacBook Pro that I keep around for testing various experiments, and this felt like a good system for the ultimate experiment. The goal: swapping init systems with a single command.

It turns out that D-Bus and PolicyKit require systemd support to be enabled or disabled at build-time. There is no way to build them in a way that allows them to operate on both types of init system. This is an area I would like to work on more in the future.

I wrote package recipes for both that are built against systemd and “replace” the non-systemd versions. I also marked them to install_if the system wanted systemd.

Next up were some more configuration and dependency fixes. I found out via this experiment that some of the Adélie system packages do not place their pkg-config files in the proper place. I also decided that if I’m already testing this far, I’d use networkd to bring up the laptop in question.

I ran the fateful command apk del openrc; apk add systemd and rebooted. To my surprise, it all worked! The system booted up perfectly with systemd. The oddest sight was my utmps units running:

systemd running s6-ipcserver. The irony is not lost on me.

Still needed: polish…

While the system works really well, and boots in 1/3rd the time of OpenRC on the same system, it isn’t ready for prime time just yet.

Rebooting from a KDE session causes the compositor to freeze. I can reboot manually from a command line, or even from a Konsole inside the session, but not using Plasma’s built-in power buttons. This may be a PolicyKit issue – I haven’t debugged it properly yet.

There aren’t any service unit files written or packaged yet, other than OpenSSH and utmps. We are working with our sponsor on an effort to add -systemd split packages to any of the packages with -openrc splits. We should be able to rely on upstream units where present, and lean on Gentoo and Fedora’s systemd experts to have good base files to reference when needed. I’ve already landed support for this in abuild.

…and You!

This project could not have happened without the generous sponsors of Wilcox Technologies Inc (WTI) making it possible, nor without the generous sponsors of Adélie Linux keeping the distro running. Please consider supporting both Adélie Linux and WTI if you have the means. Together, we are creating the future of Linux systems – a future where users have the choice and freedom to use the tooling they desire.

If you want to help test this new system out, please reach out to me on IRC (awilfox on Interlinked or Libera), or the Adéliegram Telegram channel. It will be a little while before a public beta will be available, as more review and discussion with other projects is needed. We are working with systemd, musl, and other projects to make this as smooth as possible. We want to ensure that what we provide for testing is up to our highest standards of quality.