systemd through the eyes of a musl distribution maintainer

Welcome back to FOSS Fridays! This week, I’m covering a real pickle.

I’m acutely aware of the flames this blog post will inspire, but I feel it is important to write nevertheless. I volunteer my time towards helping to maintain a Linux distribution based on the musl libc, and I am writing an article about systemd. This is my take and my take alone. It is not the opinion of the project – or, as far as I am aware, any of the other volunteers working on it.

systemd, as a service manager, is not actually a bad piece of software by itself. The fact it can act as both a service manager and an inetd(8) replacement is really cool. The unit file format is very nice and expressive. Defining mechanism and leaving policy to the administrator is a good design.

Of course, nothing exists in a vacuum. I don’t like the encouragement to link daemons to libsystemd for better integration – all of the useful integrations can be done with more portable measures. And I really don’t like the fact they consider glibc to be “the Linux API” when musl, Bionic, and other libcs exist.
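
As one illustration of a portable integration, readiness notification does not need libsystemd at all. The sketch below assumes the documented NOTIFY_SOCKET datagram protocol (the manager exports the socket path in that environment variable, and a leading "@" marks a Linux abstract-namespace socket). The function name notify and its env parameter are illustrative choices, not part of any library.

```python
import os
import socket


def notify(state, env=os.environ):
    """Send a state string such as "READY=1" to the service manager.

    Returns False when NOTIFY_SOCKET is unset, which usually means the
    process was not started by a manager that speaks this protocol.
    """
    path = env.get("NOTIFY_SOCKET")
    if not path:
        return False
    if path.startswith("@"):
        # Abstract-namespace socket: the "@" stands for a leading NUL byte.
        addr = b"\0" + os.fsencode(path[1:])
    else:
        addr = os.fsencode(path)
    with socket.socket(socket.AF_UNIX, socket.SOCK_DGRAM) as sock:
        sock.sendto(state.encode(), addr)
    return True
```

A daemon would call notify("READY=1") once it has finished initialising. Under any other service manager the call simply returns False, so the same binary stays portable.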

I’d like to dive into detail on the good and the bad of systemd, as seen through my eyes as all of: end user, administrator, and developer.

Service management: Good

Unit files are easy to write by hand, and also easy to generate in an automated fashion. You can write a basic service in a few lines, and grow into using the other features as needs arise – or you can write a very detailed file, dozens of lines long, making it exact and precise.
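
To give a feel for the "generate in an automated fashion" part, here is a minimal sketch that renders a three-section service unit with Python's standard configparser module. The daemon name exampled and its path are made up for illustration, and render_unit is a hypothetical helper, not an existing tool.

```python
import configparser
import io


def render_unit(description, exec_start):
    """Render a minimal service unit as text."""
    # interpolation=None leaves "%" alone, since unit files use it for specifiers.
    unit = configparser.ConfigParser(interpolation=None)
    unit.optionxform = str  # keep key case; configparser lowercases by default
    unit["Unit"] = {"Description": description}
    unit["Service"] = {"ExecStart": exec_start}
    unit["Install"] = {"WantedBy": "multi-user.target"}
    out = io.StringIO()
    unit.write(out, space_around_delimiters=False)
    return out.getvalue()


# "exampled" is a hypothetical daemon used only for this example.
print(render_unit("Example daemon", "/usr/bin/exampled --foreground"))
```

One limitation of this approach: configparser cannot emit the same key twice in a section, which unit files do allow, so a real generator would probably need its own small writer.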

Parallel service startup and socket activation are first-class citizens as well, and both are important for making boot-up faster and more reliable.
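
For readers who have not seen socket activation from the daemon's side, the following is a minimal sketch, assuming the environment-variable protocol described in sd_listen_fds(3): LISTEN_PID names the intended process, LISTEN_FDS gives the count, and the descriptors start at 3. The helper names and the fallback address 127.0.0.1:8080 are illustrative assumptions.

```python
import os
import socket

SD_LISTEN_FDS_START = 3  # first inherited descriptor, per sd_listen_fds(3)


def inherited_fds(env=os.environ, pid=None):
    """Return the descriptor numbers passed by socket activation, or []."""
    pid = os.getpid() if pid is None else pid
    try:
        if int(env.get("LISTEN_PID", "")) != pid:
            return []  # the variables were meant for another process
        count = int(env.get("LISTEN_FDS", ""))
    except ValueError:
        return []  # variables missing or malformed
    return list(range(SD_LISTEN_FDS_START, SD_LISTEN_FDS_START + max(count, 0)))


def get_listener(fallback=("127.0.0.1", 8080)):
    """Adopt the first activated socket, or bind one when run standalone."""
    fds = inherited_fds()
    if fds:
        return socket.socket(fileno=fds[0])  # already bound and listening
    return socket.create_server(fallback)  # hypothetical standalone address
```

Because the manager holds the listening socket, dependent services can start connecting before the daemon itself is up, which is where the parallelism comes from.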

The best part about it is the concept that this configuration exactly describes the way the system should appear and exist while it is running. This is similar to how network device standards work – see NETCONF and its stepchild RESTCONF. You define how you want the device to look when it is running, apply the configuration, and eventually the device becomes consistent with that configuration.

This is a far cry from OpenRC or SysV init scripts, which focus almost exclusively on spawning processes. It’s a powerful paradigm shift, and one I wholeheartedly welcome and endorse.

Additionally, the use of cgroups per managed unit means that process tracking is always available, without messy pid files or requiring daemons to never fork. This is another very useful feature that not only helps with overall system control, but also helps debugging and even security auditing. When cgroups are used in this way, you always know which unit spawned any process on a fully-managed system.
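
As a rough sketch of how that lookup works in practice, the function below pulls the unit name out of the contents of /proc/<pid>/cgroup. It assumes the unified (cgroup v2) hierarchy, where the file holds a line of the form "0::/path", and it ignores the escaping systemd applies to some cgroup names. The function name is made up for this example.

```python
from typing import Optional


def unit_from_cgroup(cgroup_text: str) -> Optional[str]:
    """Extract the owning unit from the text of /proc/<pid>/cgroup."""
    for line in cgroup_text.splitlines():
        parts = line.split(":", 2)
        if len(parts) != 3 or parts[0] != "0":
            continue  # skip malformed lines and legacy v1 hierarchies
        # Walk from the innermost cgroup outwards and take the first
        # component that looks like a service or scope unit.
        for component in reversed(parts[2].split("/")):
            if component.endswith((".service", ".scope")):
                return component
    return None


print(unit_from_cgroup("0::/system.slice/sshd.service\n"))
```

On a live system the input would come from reading /proc/self/cgroup or /proc/<pid>/cgroup.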

Lack of competition: Not good

There is no reason that another service manager couldn’t exist with all of these features. In fact, I hope that there will be competition to systemd that is taken seriously by the community. Having a single package be all things for all use cases leads to significant problems. Changes in systemd will necessarily affect every single user – this may seem obvious, but that means it is more difficult for it to evolve. Evolution of the system may break, and in some cases already has broken, a wide number of use cases and machines.

Additionally, without competition there is no external pressure nudging it towards ideas and concepts that perhaps the maintainers aren’t sure about. GCC and Clang learn from each other’s successes and failures and use that knowledge to make each other better. There is no package doing that with systemd right now. Innovation is stifled where choice is removed.

Misnaming glibc as “the Linux API”: Bad

I am also unhappy about systemd’s lack of musl libc support. That is probably a blessing for me, because it’s an easy reason to avoid trying to ship it in Adélie. While I have just spent five paragraphs noting how great systemd is at service management, it is really bad at a lot of other things. This is where most articles go off the deep end, but I want to provide some constructive criticism on some of the issues I’ve personally faced and felt while using systemd-based machines.

The Journal: Very bad

journald is my least-favourite feature of systemd, bar none. While I understand the reasons why it was designed the way it was, I do not appreciate that it is the only way to log on a systemd system. Sure, you can ForwardToSyslog and set the journal to be in-memory-only with a small size, and pretend journald doesn’t exist. However, that is not only excess processor power and memory usage for negative gain, it’s also an additional attack surface. It would be great if there were a “stub” journald that was strictly a forwarder with no other code.

I am also unhappy with how the journal tries to “eat” core files. While the Linux default setting of “putting a file named ‘core’ in $CWD” is absolutely unusable for development and production, the weird mixture of FS and binary journal makes things needlessly complex. The documentation even explicitly calls out that core files may exist without corresponding journal entries, and journal entries may point to core files that no longer exist. Yet they use xattrs to put “some metadata” in the core files. Why not just have a sidecar file (maybe [core file name].info or .json or .whatever) that contains all the information from the journal, and have a single journal entry that points to that file if the administrator is interested in more information about the crash?
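
To make the sidecar idea concrete, here is a purely hypothetical sketch of what writing such a file could look like. Nothing here reflects how systemd-coredump actually works, and the helper name and the metadata fields (pid, signal, exe) are invented for illustration.

```python
import json
import os
import tempfile


def write_sidecar(core_path, info):
    """Write crash metadata to '<core_path>.json' and return that path."""
    sidecar = core_path + ".json"
    # Write to a temporary file first, then rename, so a reader never
    # sees a half-written sidecar next to the core file.
    fd, tmp = tempfile.mkstemp(dir=os.path.dirname(sidecar) or ".")
    with os.fdopen(fd, "w") as f:
        json.dump(info, f, indent=2, sort_keys=True)
    os.replace(tmp, sidecar)
    return sidecar
```

With a layout like this, removing a core file and its sidecar together is a plain file operation, and the journal only needs to carry one pointer to the sidecar path.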

resolved: A solution looking for a problem

resolved might be a decent idea on its own, but there are already other packages that can provide a local caching resolver without the many problems of resolved. Moreover, the very idea of a DNS resolver being part of “the system layer” seems ill-advised to me.

DNSSEC support is experimental and not handled correctly, and they readily admit that. It’s fine to know your limitations, but DNSSEC is something that is incredibly valuable to have on endpoints. I don’t really think resolved can be taken seriously without it. It’s beyond me how no one has contributed this feature to such a widely-used package.

There are odd issues with local domain search. This is made more complicated on home networks where a lot of what it does is overkill. On enterprise networks, it’s likely a bad fit anyway, which makes me question why it supports everything it does.

Lastly, and relatedly, in my opinion resolved tries to shoehorn in too many odd features and protocols without having the basics done first. mDNS is better taken care of by a dedicated package like Avahi. LLMNR has been deprecated by its creator, Microsoft, in favour of mDNS for over a year. As LLMNR has always been a security risk, I’m not sure why the support was added in the first place.

nspawn: Niche tool for niche uses

Any discussion including resolved would be incomplete without mentioning the main reason it exists, and that is nspawn. It’s an interesting take on being “in between” chroot and a full container like Docker. It has niche uses, and I don’t have any real qualms with it, but I’ve never found it useful in any of my work so I don’t have a lot of experience with it. Usually when I am reaching for chroot I want shared state between host and container, so nspawn wouldn’t make sense there. And when I reach for Podman, I want full isolation, which I feel more comfortable handing to a package that has more tooling around it.

Ancillary tools: Why in the system layer?

networkd is immature, doesn’t have a lot of support for advanced use cases, and has no GUI for end users. I don’t know why they want to stuff networking into the “system layer” when NetworkManager exists and keeps all the networking goop out of the system layer.

timedated seems like a cute way to allow users to change timezones via a PolicyKit action but otherwise seems like something that would be better taken care of by a “real” NTP client like Chrony or NTP. And again, I don’t know why it should live in the system layer.

systemd-boot only supports EFI, which makes it non-portable and inflexible. You won’t find EFI on Power or Z, and I have plenty of ARM boards that don’t support mainline U-Boot as well. This really isn’t a problem with systemd-boot, as it’s totally understandable to only want to deal with a single platform’s idiosyncrasies. What is concerning is the fact that distros like Fedora are pivoting away from GRUB in favour of it, which means they are losing even more portability.

In conclusion: A summary

What I really want to make clear with this article is:

  • I don’t blindly hate systemd, and in fact I really admire many of its qualities as an actual service manager. What I dislike is its attempt to take over what they term the “system layer”, when there are no alternatives available.
  • The problems I have with systemd are tangible and not just hand-wavy “Unix good, sysd bad”.
  • If there were an effort to separate systemd from all of the other tentacles it has grown, I would genuinely push to have it be available as a service manager in Adélie. I feel that as a service manager – and only as a service manager – it would provide a fantastic user experience that cannot be rivaled by other existing solutions.

Thank you for reading. Have a great day, and please remember that behind every keyboard is a real person with real feelings.