I’ve been enjoying using the JetBrains IDE CLion to do some refactoring and improvements to the Auctions code base. However, when I tried to build the Mac app bundle with it, the app failed to launch:
2022-07-30 19:54:15.117 Auctions[80371:16543044] Unable to load nib file: Auctions, exiting
The XIB files were definitely part of the CMake project. I later learned that CMake does not automatically add XIB compilation targets to a project. It relies on the Xcode generator to do that.
I found a long-archived documentation page from CMake on the Kitware GitLab that described a method to build NIB files from XIBs, and have modified it to make it simpler for Auctions.
You can see the change in the commit diff, but I’ll include the snippet here for posterity.
First, you define a list containing the XIB file names without their suffix. For instance, I’ve done set(COCOA_UI_XIBS AXAccountsWindow AXSignInWindow Auctions) for the three XIB files presently in the codebase.
Now it starts correctly and works properly when built from within CLion. This was surprisingly difficult to debug and fix, so I hope this post can help others avoid the hours of dead ends that I endured.
Today, I would like to discuss a project that I care very deeply about: the musl libc. One of the most controversial and long-standing debates in the musl community concerns the fact that musl does not define a preprocessor macro identifying itself.
What’s in a macro?
Simply put, preprocessor macros allow C code to build parts of itself conditionally. For example, the GNU libc defines the “__GLIBC__” macro. If your code needs to do something specific to function properly on systems using that library, it can conditionally build that code using “#ifdef __GLIBC__”.
The authors of musl have said that they will not add a preprocessor macro identifying the platform as musl because:
It’s a bug to assume a certain implementation has particular properties rather than testing.
I agree with this sentiment in theory, and in an idealised world this would hold up. However, I’d like to discuss why I think this may need to be reconsidered moving forward.
Sometimes you can’t test
One major reason this is an issue is that sometimes it is not possible to do what the authors consider the “correct” form of testing, which is compile-testing.
This practice requires you to build a small test program, determine whether it built properly, determine its runtime characteristics, and then use the results of that test to influence how your actual software is built. This is an alternative to using the conditional code with preprocessor macros.
However, there are many reasons you may not be able to successfully perform such testing. Cross compilation is a large gap here. In fact, many years ago when I was starting the Adélie project, this caused failures in the base image I was building.
The Bash shell could not perform any compile-time or run-time checks because it was being cross-compiled from a GNU libc system to a musl libc system. This caused it to use “fallback” code that worked improperly. If musl had defined a __MUSL__ macro, Bash would not have needed to assume it was running on a pre-POSIX system.
Similarly, the mailing list thread that made me feel strongly enough to write this article involves a header-only library. These types of libraries are meant to be “drop-in” and function without any changes to a developer’s build system. If header-only libraries start requiring you to use build-time tests, you lose the main reason to use them in the first place.
The author of this thread correctly points out that FreeBSD versions their API with a preprocessor macro. Any software that requires a certain API can simply check that __FreeBSD_version is greater than or equal to the version that introduced that API.
The main reason that the musl project is fearful of this approach, at least from my observation, is that features or APIs (or indeed, bug fixes) can be backported to prior versions. I feel very strongly that this is not the responsibility of the libc.
If a distribution backports a feature, API, or patch to an older version of a library, it is that distribution’s responsibility to ensure that the software they build against it continues to function. When I backported an API from Qt 5.10 to 5.9 to ensure KDE continued building for Adélie, it was my responsibility as maintainer of those packages to keep them building properly. It certainly does not mean Qt should stop defining a preprocessor macro to determine the version being built against.
Additionally, some APIs are privileged. Determining whether these APIs work correctly using run-time testing can prevent CI/CD from working properly because the CI user does not have permission to use them.
A versioned macro like FreeBSD’s makes sense
I feel that the best way forward for musl is to define a macro like FreeBSD’s. It monotonically increases as APIs or features are added.
I agree that simple bug fixes, and even behavioural changes, probably should not be tracked with this macro. However, this would make it significantly easier to use new APIs as they are introduced.
It also makes builds more efficient. The cost of compile-time tests racks up quickly. On my POWER9 Talos workstation, typical ./configure runs take longer than the builds themselves. This is because fork+exec is still a slow path on POWER. It is similar on ARM, MIPS, and many other RISC architectures.
Macros like these don’t fully eliminate the need for ./configure, but they lessen the workload. Compile-time tests make sense for behaviour detection, but they do not make sense for API detection.
Over the July 4th holiday weekend, I was working on a secret project. It was a resounding success and I can now announce to the world: Spotify runs on musl distributions!
This article will describe how I went about accomplishing this feat. If you just want to take Spotify for a test drive on your Adélie workstation or Void desktop, scroll to the “Instructions” heading.
Greetz
Thanks to these fine dwellers of IRC for helping make sense of the twisty mazes.
sroracle
Aerdan
cb
dalias
skarnet
gcompat 0.4.0: how very cash LC_MONETARY of you
The latest release version of gcompat did not get very far:
awilcox on laptop spotify % ./spotify
Segmentation fault (core dumped)
Inspecting the core file was minimally helpful:
Thread 1 "ld-musl-x86_64." received signal SIGSEGV, Segmentation fault.
0x0000000001d6ff60 in ?? ()
(gdb) bt
#0 0x0000000001d6ff60 in ?? ()
#1 0x00007fffffffd738 in ?? ()
#2 0x0000000001e94f13 in ?? ()
#3 0x00007fffffffd6d0 in ?? ()
#4 0x00007fffffffd738 in ?? ()
#5 0x0000000003e9d691 in ?? ()
#6 0x0000000003e9d698 in ?? ()
#7 0x0000000003e9d691 in ?? ()
#8 0x00007fffffffd738 in ?? ()
#9 0x00007fffffffdc40 in ?? ()
#10 0x0000000001ccd0f0 in ?? ()
#11 0x00007fffffffd7a0 in ?? ()
#12 0x0000000000000001 in ?? ()
#13 0x00007fffffffd720 in ?? ()
#14 0x0000000001e92e92 in ?? ()
#15 0x0000000003e9d691 in ?? ()
#16 0x0000000003e9d698 in ?? ()
#17 0x00007fffffffd738 in ?? ()
#18 0x00007fffffffd738 in ?? ()
#19 0x00007fffffffd760 in ?? ()
#20 0x0000000001e9dd51 in ?? ()
#21 0x00007fffffffdc40 in ?? ()
#22 0x0000000003e9b3e0 in ?? ()
#23 0x00007fffffffd7e8 in ?? ()
#24 0x00007fffffffd7b8 in ?? ()
#25 0x00007fffffffd7b8 in ?? ()
#26 0x00007fffffffd828 in ?? ()
#27 0x00007fffffffd810 in ?? ()
#28 0x0000000001e9df09 in ?? ()
#29 0x612f656d6f682f1a in ?? ()
#30 0x0000786f636c6977 in ?? ()
#31 0x0000000000000000 in ?? ()
(gdb) info registers
rax 0x54454e4f4d5f434c 6072345775086453580
rbx 0x53 83
rcx 0x53 83
rdx 0x2 2
rsi 0x53 83
rdi 0x3e9b1a0 65647008
rbp 0x7fffffffd6f0 0x7fffffffd6f0
rsp 0x7fffffffd690 0x7fffffffd690
r8 0x0 0
r9 0x0 0
r10 0x1 1
r11 0x7fffffffdb9c 140737488346012
r12 0x7fffffffd6b8 140737488344760
r13 0x7fffffffd6b0 140737488344752
r14 0x7fffffffd6a8 140737488344744
r15 0x7fffffffd6c0 140737488344768
rip 0x1d6ff60 0x1d6ff60
eflags 0x10202 [ IF RF ]
cs 0x33 51
ss 0x2b 43
ds 0x0 0
es 0x0 0
fs 0x0 0
gs 0x0 0
What are we trying to do? Looking at symbols present in the Spotify binary, this is actually part of the G++ runtime; specifically, std::ctype::do_tolower:
That rax value looks suspicious: translated to ASCII, it is the little-endian representation of “LC_MONET”, the first eight bytes of the string “LC_MONETARY”. We’re trying to reach 0x70 into a structure in %rax for a pointer value, but we’re getting a string instead.
It turns out that when libstdc++ is compiled on a glibc system, it will attempt to access the internal __ctype_* members in the locale_t of the current locale. musl’s locale_t is not ABI-compatible with glibc’s. In fact, it is only 48 bytes in length; 0x70 (or 112 bytes) is past the end of the locale object musl has provided it!
… but a blank white screen only. After some inspecting, I found that one of the many zygotes CEF was forking was segfaulting:
[158358.508029] ThreadPoolForeg[3230]: segfault at 0 ip 0000000000000000 sp 00007fe3203db448 error 14 in spotify[200000+1acd000]
[158365.067313] ThreadPoolForeg[3252]: segfault at 0 ip 0000000000000000 sp 00007f2d69c172e8 error 14 in spotify[200000+1acd000]
[158378.506832] ThreadPoolForeg[3312]: segfault at 0 ip 0000000000000000 sp 00007f52ed7c8448 error 14 in spotify[200000+1acd000]
[158383.654027] ThreadPoolForeg[3339]: segfault at 0 ip 0000000000000000 sp 00007fcb631eb2e8 error 14 in spotify[200000+1acd000]
I replaced libcef.so from the Spotify DEB package with a matched-version libcef.so from Spotify’s Open Source builds page. This allowed me to have more debugging symbols, and generating a core dump revealed:
Core was generated by `ld-linux-x86-64.so.2 --argv0 /usr/share/spotify/spotify --type=utility --field-'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0 0x0000000000000000 in ?? ()
[Current thread is 1 (LWP 12774)]
(gdb) bt
#0 0x0000000000000000 in ?? ()
#1 0x00007f79a8a3d671 in sqlite3MallocSize () at ../../third_party/sqlite/amalgamation/sqlite3.c:26957
#2 mallocWithAlarm () at ../../third_party/sqlite/amalgamation/sqlite3.c:26891
#3 sqlite3Malloc () at ../../third_party/sqlite/amalgamation/sqlite3.c:26913
#4 0x00007f79a8aff232 in sqlite3MallocZero () at ../../third_party/sqlite/amalgamation/sqlite3.c:27118
#5 pthreadMutexAlloc () at ../../third_party/sqlite/amalgamation/sqlite3.c:25755
#6 0x00007f79a8a4e9b2 in sqlite3MutexAlloc () at ../../third_party/sqlite/amalgamation/sqlite3.c:25298
#7 chrome_sqlite3_initialize () at ../../third_party/sqlite/amalgamation/sqlite3.c:24906
#8 0x00007f79a8a350bd in EnsureSqliteInitialized () at ../../sql/initialization.cc:55
#9 0x00007f79a8a30eb2 in OpenInternal () at ../../sql/database.cc:1357
#10 0x00007f79a8a30dfa in Open () at ../../sql/database.cc:270
#11 0x00007f79a8fb8de6 in InitializeDatabase () at ../../net/extras/sqlite/sqlite_persistent_store_backend_base.cc:99
#12 0x00007f79a8fb9751 in LoadNelPoliciesAndNotifyInBackground () at ../../net/extras/sqlite/sqlite_persistent_reporting_and_nel_store.cc:1041
#13 0x00007f79a5abe25b in Invoke<void (leveldb_proto::ProtoDatabaseSelector::*)(base::OnceCallback), scoped_refptr, base::OnceCallback > () at ../../base/bind_internal.h:498
#14 MakeItSo<void (leveldb_proto::ProtoDatabaseSelector::*)(base::OnceCallback), scoped_refptr, base::OnceCallback > ()
at ../../base/bind_internal.h:598
#15 RunImpl<void (leveldb_proto::ProtoDatabaseSelector::*)(base::OnceCallback), std::__1::tuple<scoped_refptr, base::OnceCallback >, 0, 1> () at ../../base/bind_internal.h:671
#16 RunOnce () at ../../base/bind_internal.h:640
#17 0x00007f79a7776fa0 in Run () at ../../base/callback.h:98
#18 RunTask () at ../../base/task/common/task_annotator.cc:142
#19 0x00007f79a7792862 in base::internal::TaskTracker::RunBlockShutdown(base::internal::Task*) () at ../../base/task/thread_pool/task_tracker.cc:743
#20 0x00007f79a7792062 in RunTask () at ../../base/task/thread_pool/task_tracker.cc:598
#21 0x00007f79a77d42fb in RunTask () at ../../base/task/thread_pool/task_tracker_posix.cc:23
#22 0x00007f79a7791a43 in RunAndPopNextTask () at ../../base/task/thread_pool/task_tracker.cc:450
#23 0x00007f79a7798386 in RunWorker () at ../../base/task/thread_pool/worker_thread.cc:321
#24 0x00007f79a77980f4 in base::internal::WorkerThread::RunPooledWorker() () at ../../base/task/thread_pool/worker_thread.cc:223
#25 0x00007f79a77d4a05 in ThreadFunc () at ../../base/threading/platform_thread_posix.cc:81
#26 0x00007f79ac9fe2dd in ?? ()
#27 0x00007f79aca799e8 in ?? ()
#28 0x00007f7998247ce0 in ?? ()
#29 0x0000000000000000 in ?? ()
(gdb) frame 1
#1 0x00007f79a8a3d671 in sqlite3MallocSize () at ../../third_party/sqlite/amalgamation/sqlite3.c:26957
26957 return sqlite3GlobalConfig.m.xSize(p);
Inspecting the SQLite3 code, I realised that it was somehow getting a nullptr for the malloc_usable_size pointer. Further inspection revealed that this was not exactly the case:
(gdb) disassemble 0x7f79a77d5520
Dump of assembler code for function malloc_usable_size():
0x00007f79a77d5520 <+0>: push %rbp
0x00007f79a77d5521 <+1>: mov %rsp,%rbp
0x00007f79a77d5524 <+4>: mov %rdi,%rsi
0x00007f79a77d5527 <+7>: mov 0x484a76a(%rip),%rdi # 0x7f79ac01fc98
0x00007f79a77d552e <+14>: mov 0x28(%rdi),%rax
0x00007f79a77d5532 <+18>: xor %edx,%edx
0x00007f79a77d5534 <+20>: pop %rbp
0x00007f79a77d5535 <+21>: jmpq *%rax
End of assembler dump.
Looking at how the Chromium allocator works internally, the issue is that RTLD_NEXT only searches libraries that come after libcef in the load order, so it cannot find symbols in libraries loaded before it. The output of ldd spotify showed both libm and libdl before libcef, and musl always redirects these to libc for glibc ABI compatibility.
Using PatchELF to remove these two DT_NEEDEDs from the binary yielded a surprising result…
Music makes the people come together
Spotify, playing “Rhinestone Eyes” by Gorillaz, on my Adélie laptop
It works! All the features I tested work: Spotify Connect, which means I can control the laptop’s playback using the iOS and Apple Watch apps; radio playback; Bluetooth speaker support.
Instructions
You will need to download the official Spotify 64-bit DEB. I have not tested this on a 32-bit system yet, but I see no reason it won’t work. Once you have the DEB, extract the data.tar.xz file somewhere. Use PatchELF on the Spotify binary like so:
Move the extracted usr/share/spotify directory to your system’s /usr/share directory. For better integration, I moved the /usr/share/spotify/spotify.desktop file to /usr/share/applications. Then move the usr/bin/spotify link to /usr/bin.
Ensure that you have the latest gcompat installed. As I write this, only Adélie has the newest version in the current repo. This week I’ll be submitting merge requests to the distros I know of that ship gcompat, to ensure everyone has a chance to play around with the new bits.
Have fun!
Do you like running Spotify on musl? Or do you just like reading about fun hacks? Consider donating to Adélie to keep the fun going!
I’m currently in the process of trying to bring up the PowerPC platform as a fully supported architecture in Firefox. I’ve already implemented better support for XPCOM / JS interfacing, and fixed a crash in the JavaScript interpreter. My next task is fixing graphical issues, which is proving to be more challenging than I initially anticipated.
However, I have broken some new ground. Before, the compositing engine had wildly inaccurate colouring caused by errant “swizzle methods” (which are functions that take images of one colour type and change them to another – or, “swizzle” them). This resulted in Firefox 64 (Nightly) looking like this on my workstation:
Firefox, with a broken compositor
I’ve just managed to fix these methods and lo and behold, Firefox 64 (Nightly) looks like this now:
Firefox, with working compositor
Obviously, there are still some minor nits to work out. (Namely that my avatar in the top right corner has a blue tint to it!) I believe the last issues are going to be in the Cairo code, which seems to get very confused by Skia’s byte ordering. I already have a lead on how to potentially fix this issue. The good news is that there are no longer any (non-debug) crashers, unless you attempt to view H.264 video. This is because video playback via FFmpeg crashes due to another byte ordering issue.
All in all, I’m very satisfied with what I was able to knock out in just a few hours on a Friday night. Thanks go to the #gfx chat room on Mozilla IRC for their guidance in what to look at.