systemd through the eyes of a musl distribution maintainer

Welcome back to FOSS Fridays! This week, I’m covering a real pickle.

I’m acutely aware of the flames this blog post will inspire, but I feel it is important to write nevertheless. I volunteer my time towards helping to maintain a Linux distribution based on the musl libc, and I am writing an article about systemd. This is my take and my take alone. It is not the opinion of the project – or, as far as I am aware, any of the other volunteers working on it.

systemd, as a service manager, is not actually a bad piece of software by itself. The fact it can act as both a service manager and an inetd(8) replacement is really cool. The unit file format is very nice and expressive. Defining mechanism and leaving policy to the administrator is a good design.

Of course, nothing exists in a vacuum. I don’t like the encouragement to link daemons to libsystemd for better integration – all of the useful integrations can be done with more portable measures. And I really don’t like the fact they consider glibc to be “the Linux API” when musl, Bionic, and other libcs exist.
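
To make the "more portable measures" point concrete, here is a minimal sketch of readiness notification without libsystemd. It assumes only the documented protocol: the service manager puts a Unix datagram socket path in the NOTIFY_SOCKET environment variable, and the daemon sends it the text READY=1. The function name notify_ready is my own, and this is an illustration rather than a hardened implementation.

```python
import os
import socket


def notify_ready(environ=os.environ) -> bool:
    """Send READY=1 to the service manager, if one asked for it."""
    path = environ.get("NOTIFY_SOCKET")
    if not path:
        return False  # not started by a notify-aware service manager
    if path.startswith("@"):
        path = "\0" + path[1:]  # "@" marks a Linux abstract-namespace socket
    with socket.socket(socket.AF_UNIX, socket.SOCK_DGRAM) as sock:
        sock.sendto(b"READY=1", path)
    return True
```

Because it depends only on an environment variable and a socket, the same daemon should start unchanged under a service manager that never sets NOTIFY_SOCKET.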

I’d like to dive into detail on the good and the bad of systemd, as seen through my eyes as all of: end user, administrator, and developer.

Service management: Good

Unit files are easy to write by hand, and also easy to generate in an automated fashion. You can write a basic service in a few lines, and grow into using the other features as needs arise – or you can write a very detailed file, dozens of lines long, making it exact and precise.
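
As a rough illustration of the "easy to generate" half of that claim, the sketch below renders a small service unit with Python's standard configparser module. The daemon name exampled, its path, and the chosen settings are placeholders of mine, not anything shipped by a real package.

```python
import configparser
import io


def render_unit(description: str, exec_start: str, user: str) -> str:
    """Render a minimal .service unit as text."""
    unit = configparser.ConfigParser(interpolation=None)
    # Unit file keys are case-sensitive; configparser lowercases them by default.
    unit.optionxform = str
    unit["Unit"] = {"Description": description, "After": "network.target"}
    unit["Service"] = {
        "ExecStart": exec_start,
        "User": user,
        "Restart": "on-failure",
    }
    unit["Install"] = {"WantedBy": "multi-user.target"}
    buf = io.StringIO()
    unit.write(buf, space_around_delimiters=False)
    return buf.getvalue()


if __name__ == "__main__":
    # "exampled" is a hypothetical daemon used only for illustration.
    print(render_unit("Example daemon", "/usr/bin/exampled --foreground", "exampled"))
```

A hand-written unit would contain the same three sections. The generator only saves typing when there are many similar services.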

Parallel service starting and socket activation are first-class citizens as well, which is something very important to making boot-up faster and more reliable.
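
For readers who have not seen socket activation from the daemon's side, here is a minimal sketch of the hand-off, assuming the documented convention: the manager sets LISTEN_PID and LISTEN_FDS, and the inherited descriptors start at number 3. The function name inherited_fds is my own.

```python
import os

SD_LISTEN_FDS_START = 3  # first inherited descriptor, per the sd_listen_fds convention


def inherited_fds(environ=os.environ, pid=None):
    """Return the descriptor numbers passed in by the service manager, if any."""
    pid = os.getpid() if pid is None else pid
    if environ.get("LISTEN_PID") != str(pid):
        return []  # the descriptors were meant for some other process
    count = int(environ.get("LISTEN_FDS", "0"))
    return list(range(SD_LISTEN_FDS_START, SD_LISTEN_FDS_START + count))
```

Each returned number can be wrapped with socket.socket(fileno=fd) and handed straight to accept(). The daemon never binds the port itself, which is why the manager can open all sockets first and start the services behind them in parallel.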

The best part about it is the concept that this configuration exactly describes the way the system should appear and exist while it is running. This is similar to how network device standards work – see NETCONF and its stepchild RESTCONF. You define how you want the device to look when it is running, apply the configuration, and eventually the device becomes consistent with that configuration.

This is a far cry from OpenRC or SysV init scripts, which focus almost exclusively on spawning processes. It’s a powerful paradigm shift, and one I wholeheartedly welcome and endorse.

Additionally, the use of cgroups per managed unit means that process tracking is always available, without messy pid files or requiring daemons to never fork. This is another very useful feature that not only helps with overall system control, but also helps debugging and even security auditing. When cgroups are used in this way, you always know which unit spawned any process on a fully-managed system.
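
Here is a small sketch of what "you always know which unit spawned any process" looks like in practice on Linux. It parses the contents of /proc/<pid>/cgroup, where each line has the form hierarchy:controllers:path, and returns the innermost path component that looks like a service or scope unit. The helper name and the suffix list are my own simplifications.

```python
UNIT_SUFFIXES = (".service", ".scope")


def unit_from_cgroup(text: str):
    """Return the unit name found in /proc/<pid>/cgroup contents, or None."""
    for line in text.splitlines():
        parts = line.split(":", 2)
        if len(parts) != 3:
            continue
        hierarchy, controllers, path = parts
        # cgroup v2 is hierarchy 0; on v1 systemd tracks units under "name=systemd".
        if not (hierarchy == "0" or controllers == "name=systemd"):
            continue
        # Walk from the leaf upwards, since a unit may contain sub-cgroups.
        for component in reversed(path.split("/")):
            if component.endswith(UNIT_SUFFIXES):
                return component
    return None
```

Feeding it the text of /proc/self/cgroup from inside a managed service should give that service's unit name. No pid file is involved at any point.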

Lack of competition: Not good

There is no reason that another service manager couldn’t exist with all of these features. In fact, I hope that there will be competition to systemd that is taken seriously by the community. Having a single package being all things for all use cases leads to significant problems. Changes in systemd will necessarily affect every single user – this may seem obvious, but that means it is more difficult for it to evolve. Evolution of the system may break, and in some cases already has broken, a wide range of use cases and machines.

Additionally, without competition there is no external pressure nudging it towards ideas and concepts that perhaps the maintainers aren’t sure about. GCC and Clang learn from each other’s successes and failures and use that knowledge to make each other better. There is no package doing that with systemd right now. Innovation is stifled where choice is removed.

Misnaming glibc as “the Linux API”: Bad

I am also unhappy about systemd’s lack of musl libc support. That is probably a blessing for me, because it’s an easy reason to avoid trying to ship it in Adélie. While I have just spent five paragraphs noting how great systemd is at service management, it is really bad at a lot of other things. This is where most articles go off the deep end, but I want to provide some constructive criticism on some of the issues I’ve personally faced and felt while using systemd-based machines.

The Journal: Very bad

journald is my least-favourite feature of systemd, bar none. While I understand the reasons why it was designed the way it was, I do not appreciate that it is the only way to log on a systemd system. Sure, you can ForwardToSyslog and set the journal to be in-memory-only with a small size, and pretend journald doesn’t exist. However, that is not only excess processor power and memory usage for negative gain, it’s also an additional attack surface. It would be great if there were a “stub” journald that was strictly a forwarder with no other code.

I am also unhappy with how the journal tries to “eat” core files. While the Linux default setting of “putting a file named ‘core’ in $CWD” is absolutely unusable for development and production, the weird mixture of FS and binary journal makes things needlessly complex. The documentation even explicitly calls out that core files may exist without corresponding journal entries, and journal entries may point to core files that no longer exist. Yet they use xattrs to put “some metadata” in the core files. Why not just have a sidecar file (maybe [core file name].info or .json or .whatever) that contains all the information from the journal, and have a single journal entry that points to that file if the administrator is interested in more information about the crash?
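
To show how little machinery the sidecar idea needs, here is a hypothetical sketch. The .info suffix, the field names, and the function itself are assumptions of mine. Nothing here reflects how systemd-coredump actually stores its metadata.

```python
import json
import os
import time


def write_sidecar(core_path: str, metadata: dict) -> str:
    """Write crash metadata next to a core file and return the sidecar path."""
    sidecar = core_path + ".info"  # placeholder suffix; the naming is left open above
    record = dict(
        metadata,
        core_file=os.path.basename(core_path),
        written_at=int(time.time()),
    )
    tmp = sidecar + ".tmp"
    with open(tmp, "w", encoding="utf-8") as fh:
        json.dump(record, fh, indent=2, sort_keys=True)
    os.replace(tmp, sidecar)  # atomic rename, so readers never see a partial file
    return sidecar
```

With something like this, the journal would only need one entry carrying the sidecar's path. Deleting a core file and its sidecar together would leave nothing half-described.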

resolved: A solution looking for a problem

resolved might be a decent idea on its own, but there are already other packages that can provide a local caching resolver without the many problems of resolved. Moreover, the very idea of a DNS resolver being part of “the system layer” seems ill-advised to me.

DNSSEC support is experimental and not handled correctly, and they readily admit that. It’s fine to know your limitations, but DNSSEC is something that is incredibly valuable to have on endpoints. I don’t really think resolved can be taken seriously without it. It’s beyond me how no one has contributed this feature to such a widely-used package.

There are odd issues with local domain search. This is made more complicated on home networks where a lot of what it does is overkill. On enterprise networks, it’s likely a bad fit anyway, which makes me question why it supports everything it does.

Lastly, and relatedly, in my opinion resolved tries to shoehorn too many odd features and protocols without having the basics done first. mDNS is better taken care of by a dedicated package like Avahi. LLMNR support has been deprecated by its creator Microsoft in favour of mDNS for over a year. As LLMNR has always been a security risk, I’m not sure why the support was added in the first place.

nspawn: Niche tool for niche uses

Any discussion including resolved would be remiss without mentioning the main reason it exists, and that is nspawn. It’s an interesting take on being “in between” chroot and a full container like Docker. It has niche uses, and I don’t have any real qualms with it, but I’ve never found it useful in any of my work so I don’t have a lot of experience with it. Usually when I am grabbing for chroot I want shared state between host and container, so nspawn wouldn’t make sense there. And when I grab for Podman, I want full isolation, which I feel more comfortable handing to a package that has more tooling around it.

Ancillary tools: Why in the system layer?

networkd is immature, doesn’t have a lot of support for advanced use cases, and has no GUI for end users. I don’t know why they want to stuff networking into the “system layer” when NetworkManager exists and keeps all the networking goop out of the system layer.

timedated seems like a cute way to allow users to change timezones via a PolicyKit action but otherwise seems like something that would be better taken care of by a “real” NTP client like Chrony or NTP. And again, I don’t know why it should live in the system layer.

systemd-boot only supports EFI, which makes it non-portable and inflexible. You won’t find EFI on Power or Z, and I have plenty of ARM boards that don’t support mainline U-Boot as well. This really isn’t a problem with systemd-boot, as it’s totally understandable to only want to deal with a single platform’s idiosyncrasies. What is concerning is the fact that distros like Fedora are pivoting away from GRUB in favour of it, which means they are losing even more portability.

In conclusion: A summary

What I really want to make clear with this article is:

  • I don’t blindly hate systemd, and in fact I really admire many of its qualities as an actual service manager. What I dislike is its attempt to take over what they term the “system layer”, when there are no alternatives available.
  • The problems I have with systemd are tangible and not just hand-wavy “Unix good, sysd bad”.
  • If there was an effort to have systemd separate from all of the other tentacles it has grown, I would genuinely push to have it be available as a service manager in Adélie. I feel that as a service manager – and only as a service manager – it would provide a fantastic user experience that cannot be rivaled by other existing solutions.

Thank you for reading. Have a great day, and please remember that behind every keyboard is a real person with real feelings.

32 thoughts on “systemd through the eyes of a musl distribution maintainer”

    1. I am very hopeful that s6-rc grows to be that alternative. With enough time and effort, wrappers around some of the most common systemd interfaces (like systemctl, unit files, and sd_notify) could probably even be created so the transition is “seamless”.

      My main concern is that s6-rc is at the end of the day still skaware, and while I know that Adélie (and likely Void and Alpine) wouldn’t be averse to it, I don’t know if distributions like Debian or Fedora could be convinced. It would be a true game changer if they could. And, by having the components of systemd more modular as I suggest, you could potentially even “mix and match”. Maybe some people like the journal, so they could use it with s6-rc if it was separate.

      1. A transition from/to systemd to a different view of the machine will never, and could never, be seamless, because systemd is hopelessly holistic and maximalist.

        I have written a document for a customer who was thinking about writing an automated unit file conversion tool, and it took me *a week* to fully analyze the unit file format and write this document. And the results aren’t exactly hopeful: https://skarnet.org/software/s6/unit-conversion.html

        The best effort I’ve seen to provide systemd as an alternative to another service manager was done by Serge E. Hallyn in Ubuntu 16.04, that supported both systemd and sysvinit with sysv-rc. It was actually really well done – but ultimately it was a lot of hacks to fit a bunch of square pegs into round holes, and not in the fun way. The amount of duct tape and voodoo curses was just too high – and I don’t think it would be at all reasonable for Adélie, or any other distro, to do anything resembling it. Unfortunately, I believe systemd has hopelessly fractured the distro world into “systemd-only” and “without systemd”.

      2. That document is actually a lot more reassuring than I thought it would be.

        I hold no illusions, and the word “seamless” was behind scare quotes for a reason. Simple units should be convertible without much issue, but the only way we could ever have true seamless transitions would be if the systemd project wanted to have that as a goal as well. And I highly doubt they will.

        Having a better service manager that is stable, accessible, and more reliable would definitely be a reason to go through the transition one way, though. 😉

    1. Or systemd-homed. Or systemd-logind.

      I’m still hoping to get a systemd-cdd and a systemd-rm-rfd.

      I like the idea of a great init system. I thoroughly dislike the feature creeping they’ve done. A lot of the time, with worse results than the projects they replaced.

      I would have loved for them to contribute to existing projects instead of going all NIH and using RedHat’s influence to push all that stuff in gnome and other large projects that distros need to distribute.

  1. Have you considered Devuan as the basis for your distribution? It has done quite the good job of removing systemd dependencies, and that should be a decent launch-pad for switching C libraries.

    If not – perhaps you could pick-and-choose parts of their work to achieve the same for your own distribution.

    1. Our distribution is fine without rebasing on Devuan. We aren’t having a problem with removing systemd from the packages we ship, and we’re already using musl. This article was primarily about how I wish systemd wasn’t as viral, maximalist, and anti-portable as it is.

  2. This was very well written, thank you. Nice to see others share a similar perspective on systemd. I don’t love it, but I don’t hate it either, this considering the totality of how huge it’s gotten.

    I basically forced it to be modularized with ArchLinux, limiting what it controls (avoiding resolved, networkd, timesyncd, etc). Wholeheartedly agree that if it was more pluggable out the gate, I’d be an advocate for it as well, instead of walking on eggshells with it.

  3. There will be no viable alternatives to systemd because no one really wants to maintain alternatives to udev (eudev), logind (elogind), and tmpfiles.d (opentmpfiles, which is dead), and that is the big problem.

    1. I mean, there *are* alternatives – mdevd, ConsoleKit, and I still don’t know why tmpfiles.d needs to exist at all. If there were a *popular* alternative to systemd, that integrated all of those components, there could be a resurgence of support in upstream packages.

  4. My contention for the systemd project lies mostly in leadership, messaging, and a chaotic configuration environment. Its technical structure is also immense. I’m only one guy, and even on my server I don’t have complex needs. systemd definitely feels like something meant for enterprise deployment at cloud scale. My values in computing these days are leaning more toward something like KISS Linux’s goal of being understandable — and maintainable — by a single person. There are practical limits to this, but I believe a lot of our computing problems come from too many layers and too much stuff that frankly, YAGNI. Naturally, systemd is anathema to that. It’s essentially the Leatherman or All-tool when I’m just looking for a pocket knife.

    I have run into misconfiguration and auto-misconfiguration of systemd on some distros. Ubuntu server and Debian like to re-enable Apache and disrupt my Lighttpd service. I have used `systemctl disable apache.service` or equivalent a number of times, enabling lighttpd with a similar command, and I still have to double-check every time I upgrade apache. The degree to which it’s a delicate balance of cards, and disrupting that balance creating *in*determinate boot state, is another part of why I don’t like it. Every reboot is flipping a coin whether or not I’ll have to go SSH in and determine what messed up. These problems didn’t happen to me on Gentoo, Alpine, or Adelie back when I used it. At worst, I may have to alter the service starting order. That’s bearable and it makes sense. Eventual consistency doesn’t really make sense to me. It feels sloppy.

    Even with documentation and an evening or weekend to learn, systemd feels like it actively wants to push you away from managing your system and take over for you. I’m not cool with that angle. There are at least three different places to find unit files, and the distinction in purpose isn’t all that clear. Then there’s the difference between unit.unit, unit.service, @unit.service, unit.socket, and more! It also lacks an ootb facility for a one-time script like /etc/rc.local, which may be necessary to patch small problems like sysfs settings until a new kernel is installed. Putting one together is possible as a one-shot unit I think they’re called, but why did I have to set that up when other rc systems generally provide that?

    In general, it feels like a tool that belongs in a Microsoft toolset. Seriously, I would expect Azure Linux to boot via systemd. Which is fitting, because Poettering’s efforts in systemd are part of what got him a job at Microsoft. Many developers like him simply engage in the ecosystem to prop up profitable careers, and leave the messes behind for others to deal with. It has left me with a feeling of skepticism toward any new tool and developer that gains traction. Intent and results matter to me, it’s about more than code.

    Who will be head of the next cult of personality that we’re told (not by you, to clarify) we can’t criticize, who will then gain some fame and a cushy job somewhere after leaving a mess in the ecosystem? Is that something that should be incentivized or rewarded? That’s where I see projects like systemd, PulseAudio, Wayland, and PipeWire. Software with a goal of displacing other software is pernicious to the ecosystem.

    I liked the points you covered on systemd’s short-comings, good write-up with points that people tend to miss or ignore. The unit file idea on its own isn’t terrible, but how it achieves that and the rest of the “modular” tools leave me feeling like my system is less stable and less discoverable, despite all the system tools being under the same git repo. Some of systemd’s goals are admirable and can be very useful, but accessing that utility is difficult and I don’t identify with the community or methodology at all.

    I’ve spent the last 3 years on normie distros, giving systemd many chances at winning me over and it just isn’t the right tool for me. In 2024, I’m returning to a distro and init system that won’t fight me for control of my computer or mysteriously ignore my explicit settings. I was hoping to avoid LFS and KISS but it seems that’s what it takes to get the control you want anymore.

    Apologies for the length, I tried to shorten and fit in every major point.

    1. The troubles you had with apache vs. lighttpd come down to you using the administration tools incorrectly. Don’t get me wrong, it’s an easy mistake to make, the lack of simple documentation or direction not helping.

      This specific issue also isn’t unique to systemd. It could have happened on a system with a different service management mechanism. There, the packaging system would or could analogously overwrite your administrative decision by means of “update-rc.d apache2 defaults” or “service apache2 enable” or whatever it does when managing packaged services.

      Basically, “systemctl disable apache2.service” is the wrong incantation. It doesn’t properly reflect what you intended. It can easily be rendered null and void by purge-removing and then reinstalling Apache.

      “disable”/“enable” are not meant as permanent administrative policies. Other system management tools may use equivalents of them, and so will override an earlier change made with them.

      What you actually want is:

      systemctl mask --now apache2.service # the --now also stops the service if it is already running

      This is a policy decision that will survive package uninstalls, installs, upgrades etc.
      Even better, you can use this incantation prophylactically before even having the apache2 package installed and it will work as intended.

      1. I was unaware of the masking command. Why is that needed when ‘enable’ and ‘disable’ are explicitly meant to indicate what systemd enables or disables on boot? Enable and disable worked before, on other distros that used systemd. While the mask may be another, maybe better way to do it from within systemd, that’s not really the point.

        The problem is that whatever tools Debian or Ubuntu use to make service decisions are not as visible as they need to be. If I ever find myself in the same situation again, I might explore the ‘alternatives’ system they use, but to me it’s adding yet another layer onto a system layer that should already be doing the work. If something this complex wants to take over, then maybe it should also figure out that I explicitly enabled lighttpd while disabling apache2, and respect my wishes. That’s not hard. This isn’t Windows, we shouldn’t need “policy” to determine services on boot.

        The point is that other distros, service managers, and inits *don’t defy expectations*. Gentoo, Alpine, and Adélie never did this to me. They don’t surprise me with shit out of left field and require me to find the correct paragraph of documentation among a literal book’s worth of content. So rather than continue to fight the current, I will seek out or build an environment more suitable for me, which is closer to Gentoo, KISS, or LFS than Ubuntu or Debian. I have prior distribution experience so it’s within reach for me.

        Thanks for mentioning the feature though. I’ll try to remember it if I’m stuck on systemd again.

    2. First of all: yes, Debian and Ubuntu make a mess of systemd, caused by bad design and security decisions (auto-enable services by default), usage of old scripts, and not cleaning up their shit. Secondly, system administration means that you are capable of reading ten-step manuals and reading man pages and understanding why it is designed like this. A good example is your rant about the location of files; it is important that you find out not only where the files are but also why. The distinction is obvious and essential for security reasons and also helps with version management.

  5. Why is journald actually bad though? I could not tell from your article. If it was a forwarder, what would you forward logs to instead?

    As for the extra services like networkd, timedated, and systemd-boot, these belong in the system layer because they are system-level things. If systemd is booting up your whole desktop, then it makes sense to also make sure networking is setup and make the clock & timezones work seamlessly.

    1. The usual reproach is the binary format of its “log” files and how badly it fits with all other log-related tools. While it is probably highly optimized and all, it is a pain to work with, where just a simple text log with a logrotate configuration is simpler in most use cases. Just purging some data from these logs is made more complex by this binary format.

      1. It may be more straightforward in your environment, but that does not mean it is a good thing in every environment. systemd-journald solves many problems: it catches all information around a process, without the need for syslog library implementations that only work once the application has started. Logging into text files can put unnecessary pressure on storage, etc. Furthermore, journalctl makes searching in log files very easy, especially if you are searching for specific information, for instance, if you want all the logs around a user, device, or process. That you think it fits badly with other log-related tools says more about your knowledge than about systemd-journald.

  6. Great article. I agree with all your points.

    I like systemd’s fundamentals but there’s too much in there. I don’t understand the need for networkd, timesyncd, timedated, resolved, boot, nspawn, machinectl etc. when more capable alternatives exist. Fortunately they’re optional, but I think systemd just has its fingers in too many pies. Lennart seems to see issues in other packages and his solution, rather than improving them, is to create whole new alternatives with the systemd brand.

  7. I love systemd. This does not mean that I think systemd should be the only option, but I’ll never pick a systemd-less distribution over one with systemd. The amount of time I’ve saved thanks to systemd, journald, and the reliability they provided me simply makes it an obvious pick. It’s also clear why systemd takes a holistic approach, then – they want a guarantee that critical system components will be present (network connectivity, clock synchronization, and boot are all a must). That said, in a perfect world one should always be able to supplement faulty/imperfect functions of systemd with alternatives. I hope chrony and NetworkManager will remain fully usable over what systemd offers and systemd-boot will stay away from my ecosystem…

  8. I’ll comment on the parts I am familiar with. There is no sense in making uneducated comments.

    Regarding lack of competition, that is actually a very good thing in practice as it brings distributions closer together. You only need to learn how to operate a different package manager these days to work on different distributions. The rest is straightforward. Nevertheless, I understand the principle behind competition leading to faster bug resolution. But, while systemd does need much better QC, the situation is still easier than fragmentation.

    The “linux API” part is essentially larger than systemd. It is common among substantial numbers of projects and distributions and tied to very specific toolchain versions. You’ll need to blame a larger ecosystem for that.

    Networkd is indeed lacking. I am open minded though and I would like to see cases where it has something to offer over networkmanager to even justify its existence but I haven’t come across anything yet. It also doesn’t support pppoe. But then again, people wrote it for free and I can just forget networkd exists and use networkmanager.

    systemd-boot and timesyncd made my life easier. I don’t need to have grub anymore or figure out how to “generate” a configuration for it. Even two decades ago, I was better off writing a simplified grub config file manually.

  9. I would like for systemd to support other C libraries like musl and dietlibc; that is, it should accept patches to make that possible. As a sysadmin working with quite a lot of distributions, I like the way systemd simplifies service management, and I like journald, but I agree it would be nice if there was a journald stub, so that you could disable it altogether.

  10. In a sense, systemd simplifies the old UNIX philosophy into: everything is a file or a process. So, in that view, networking, users/groups, time, etc., are part of the process. It also has many options to isolate properties of the process from, for instance, the filesystem; the optimal isolation is systemd-nspawn.

    1. “systemd simplifies the old UNIX philosophy”

      Except that it doesn’t do the part that really matters to a lot of people: do one thing and do it well.

      1. First of all, historically, ‘do one thing and do it well’ is targeted at developers: do not add features to existing programs that do not belong there. Instead, write a new program.

        Systemd has one job: process management, and it does that very well. The fact that some distros and system engineers do not know how to configure things properly is not something you can blame systemd for.

  11. Like the previous commenter, I don’t hate or love systemd. I don’t love the way journald provides info. It seems obfuscated. I don’t like that it doesn’t create a text file in addition to the binary one, by default. I used networkd and systemd-boot and liked them just fine. I’m sick of grub’s antiquated, flaky BS. NetworkManager has always been buggy. I use Alpine on the server and I like the UI of OpenRC, but it seems a little half-baked or buggy, as I’ve had some strange behavior. I like the idea of musl, but I want to be able to trust that it is not spawning unseen gremlins somewhere, not that it’s actually musl’s fault. I’m also tired of new system software being written in non-memory-safe langs. It shouldn’t be acceptable to anyone. Skilled devs who don’t care about mem safety are questionable to me.

  12. I agree with nearly all of the points in the article. I have been a system administrator, among other things, since 1985. I prefer my systems simple, hard to break, easy to troubleshoot, easy to administrate: systemd is not that. It provides too many different functions through uncertain interfaces with logging that can make troubleshooting more difficult. It seems to be trying too hard to be too many different things at once. It is the opposite of a simple, dependable, easy to troubleshoot system. That, from a different level, goes to much of the point of your article. It is not that it does not work, or should be avoided, it’s that it is absolutely not the best tool for any of the jobs that it does.
