Compiling XIBs with CMake without Xcode

I’ve been enjoying using the JetBrains IDE CLion to do some refactoring and improvements to the Auctions code base. However, when I tried to build the Mac app bundle with it, the app failed to launch:

2022-07-30 19:54:15.117 Auctions[80371:16543044] Unable to load nib file: Auctions, exiting

The XIB files were definitely part of the CMake project. I later learned that CMake does not automatically add XIB compilation targets to a project. It relies on the Xcode generator to do that.

I found a long-archived documentation page from CMake on the Kitware GitLab that described a method to build NIB files from XIBs, and have modified it to make it simpler for Auctions.

You can see the change in the commit diff, but I’ll include the snippet here for posterity.

First, you define an array with the XIB file names with no suffix. For instance, I’ve done set(COCOA_UI_XIBS AXAccountsWindow AXSignInWindow Auctions) for the three XIB files presently in the codebase.

Then we have the loop to build them:

find_program(IBTOOL ibtool REQUIRED)
foreach(XIBFILE ${COCOA_UI_XIBS})
add_custom_command(TARGET Auctions POST_BUILD
COMMAND ${IBTOOL} --compile ${CMAKE_CURRENT_BINARY_DIR}/Auctions.app/Contents/Resources/${XIBFILE}.nib ${CMAKE_CURRENT_SOURCE_DIR}/${XIBFILE}.xib
COMMENT "Compiling NIB file ${XIBFILE}.nib")
endforeach()

Now it starts correctly and works properly when built from within CLion. This was surprisingly difficult to debug and fix, so I hope this post can help others avoid the hours of dead ends that I endured.

Until next time, Happy Hacking!

The musl preprocessor debate

Today, I would like to discuss a project that I care very deeply about: the musl libc. One of the most controversial and long-standing debates in the musl community is that musl does not define a preprocessor macro.

What’s in a macro?

Simply put, preprocessor macros allow C code to build parts of itself conditionally. For example, the GNU libc defines the “__GLIBC__” macro. If your code needs to do something specific to function properly on systems using that library, it can conditionally build that code using “#ifdef __GLIBC__”.

The authors of musl have said that they will not add a preprocessor macro identifying the platform as musl because:

It’s a bug to assume a certain implementation has particular properties rather than testing.

Rich Felker, “Re: #define __MUSL__ in features.h”, 2013-03-29

I agree with this sentiment in theory, and in an idealised world this would hold up. However, I’d like to discuss why I think this may need to be reconsidered moving forward.

Sometimes you can’t test

One major reason this is an issue is that sometimes it is not possible to do what the authors consider the “correct” form of testing, which is compile-testing.

This practice requires you to build a small test program, determine whether it built properly, determine its runtime characteristics, and then use the results of that test to influence how your actual software is built. This is an alternative to using the conditional code with preprocessor macros.

However, there are many reasons you may not be able to successfully perform such testing. Cross compilation is a large gap here. In fact, many years ago when I was starting the Adélie project, this caused failures in the base image I was building.

The Bash shell could not perform any compile-time or run-time checks because it was being cross-compiled from a GNU libc system to a musl libc system. This caused it to use “fallback” code that worked improperly. If musl had defined a __MUSL__ macro, Bash would not have needed to assume it was running on a pre-POSIX system.

Similarly, the mailing list thread that made me feel strongly enough to write this article involves a header-only library. These types of libraries are meant to be “drop-in” and function without any changes to a developer’s build system. If header-only libraries start requiring you to use build-time tests, you lose the main reason to use them in the first place.

The author of this thread correctly points out that FreeBSD versions their API with a preprocessor macro. Any software that requires a certain API can simply ensure that __FreeBSD_version is defined as greater-or-equal than the versions that introduced that API.

The main reason that the musl project is fearful of this approach, at least to my observation, is that features or APIs (or indeed, bug fixes) can be backported to prior versions. I feel very strongly that this is not the responsibility of the libc.

If a distribution backports a feature, API, or patch to an older version of a library, it is that distribution’s responsibility to ensure that the software they build against it continues to function. When I backported an API from Qt 5.10 to 5.9 to ensure KDE continued building for Adélie, it was my responsibility as maintainer of those packages to keep them building properly. It certainly does not mean Qt should stop defining a preprocessor macro to determine the version being built against.

Additionally, some APIs are privileged. Determining whether these APIs work correctly using run-time testing can prevent CI/CD from working properly because the CI user does not have permission to use them.

A versioned macro like FreeBSD’s makes sense

I feel that the best way forward for musl is to define a macro like FreeBSD’s. It monotonically increases as APIs or features are added.

I agree that simple bug fixes, and even behavioural changes, probably should not be tracked with this macro. However, this would make it significantly easier to use new APIs as they are introduced.

It also makes builds more efficient. The cost of compile-time tests racks up quickly. On my POWER9 Talos workstation, typical ./configure runs take longer than the builds themselves. This is because fork+exec is still a slow path on POWER. It is similar on ARM, MIPS, and many other RISC architectures.

Macros like these don’t fully eliminate the need for ./configure, but they lessen the workload. Compile-time tests make sense for behaviour detection, but they do not make sense for API detection.

Live from Adélie: Streaming Spotify on musl

Over the July 4th holiday weekend, I was working on a secret project. It was a resounding success and I can now announce to the world: Spotify runs on musl distributions!

This article will describe how I went about accomplishing this feat. If you just want to take Spotify for a test drive on your Adélie workstation or Void desktop, scroll to the “Instructions” heading.

Greetz

Thanks to these fine dwellers of IRC for helping make sense of the twisty mazes.

  • [[sroracle]]
  • Aerdan
  • cb
  • dalias
  • skarnet

gcompat 0.4.0: how very cash LC_MONETARY of you

The latest release version of gcompat did not get very far:

awilcox on laptop spotify % ./spotify
Segmentation fault (core dumped)

Inspecting the core file was minimally helpful:

Thread 1 "ld-musl-x86_64." received signal SIGSEGV, Segmentation fault.
0x0000000001d6ff60 in ?? ()
(gdb) bt
#0  0x0000000001d6ff60 in ?? ()
#1  0x00007fffffffd738 in ?? ()
#2  0x0000000001e94f13 in ?? ()
#3  0x00007fffffffd6d0 in ?? ()
#4  0x00007fffffffd738 in ?? ()
#5  0x0000000003e9d691 in ?? ()
#6  0x0000000003e9d698 in ?? ()
#7  0x0000000003e9d691 in ?? ()
#8  0x00007fffffffd738 in ?? ()
#9  0x00007fffffffdc40 in ?? ()
#10 0x0000000001ccd0f0 in ?? ()
#11 0x00007fffffffd7a0 in ?? ()
#12 0x0000000000000001 in ?? ()
#13 0x00007fffffffd720 in ?? ()
#14 0x0000000001e92e92 in ?? ()
#15 0x0000000003e9d691 in ?? ()
#16 0x0000000003e9d698 in ?? ()
#17 0x00007fffffffd738 in ?? ()
#18 0x00007fffffffd738 in ?? ()
#19 0x00007fffffffd760 in ?? ()
#20 0x0000000001e9dd51 in ?? ()
#21 0x00007fffffffdc40 in ?? ()
#22 0x0000000003e9b3e0 in ?? ()
#23 0x00007fffffffd7e8 in ?? ()
#24 0x00007fffffffd7b8 in ?? ()
#25 0x00007fffffffd7b8 in ?? ()
#26 0x00007fffffffd828 in ?? ()
#27 0x00007fffffffd810 in ?? ()
#28 0x0000000001e9df09 in ?? ()
#29 0x612f656d6f682f1a in ?? ()
#30 0x0000786f636c6977 in ?? ()
#31 0x0000000000000000 in ?? ()
(gdb) info registers
rax            0x54454e4f4d5f434c  6072345775086453580
rbx            0x53                83
rcx            0x53                83
rdx            0x2                 2
rsi            0x53                83
rdi            0x3e9b1a0           65647008
rbp            0x7fffffffd6f0      0x7fffffffd6f0
rsp            0x7fffffffd690      0x7fffffffd690
r8             0x0                 0
r9             0x0                 0
r10            0x1                 1
r11            0x7fffffffdb9c      140737488346012
r12            0x7fffffffd6b8      140737488344760
r13            0x7fffffffd6b0      140737488344752
r14            0x7fffffffd6a8      140737488344744
r15            0x7fffffffd6c0      140737488344768
rip            0x1d6ff60           0x1d6ff60
eflags         0x10202             [ IF RF ]
cs             0x33                51
ss             0x2b                43
ds             0x0                 0
es             0x0                 0
fs             0x0                 0
gs             0x0                 0

What are we trying to do? Looking at symbols present in the Spotify binary, this is actually part of the G++ runtime; specifically, std::ctype::do_tolower:

  1d6ff51:       48 8b 05 18 a8 12 02    mov    0x212a818(%rip),%rax        # 3e9a770 
  1d6ff58:       48 8b 40 70             mov    0x70(%rax),%rax
  1d6ff5c:       48 0f be cb             movsbq %bl,%rcx
=>1d6ff60:       8a 1c 88                mov    (%rax,%rcx,4),%bl
  1d6ff63:       89 d8                   mov    %ebx,%eax
  1d6ff65:       5b                      pop    %rbx
  1d6ff66:       c3                      retq

That rax value looks suspicious, and we can see if we translate it to ASCII that it is the little-endian representation of the string “LC_MONETARY”. We’re trying to reach 0x70 into a structure in %rax for a pointer value, but we’re getting a string instead.

It turns out that when libstdc++ is compiled on a glibc system, it will attempt to access the internal __ctype_* members in the locale_t of the current locale. musl’s locale_t is not ABI-compatible with glibc’s. In fact, it is only 48 bytes in length; 0x70 (or 112 bytes) is past the end of the locale object musl has provided it!

I implemented a stub locale module in gcompat, and… it tried to exec /proc/self/exe, which broke under the gcompat loader. This required me to write a patch interposing the execv* functions to catch this. And suddenly…

The lights that stop me turn to stone

Slight success! We have a Spotify window!

Spotify, but only a white screen

… but a blank white screen only. After some inspecting, I found that one of the many zygotes CEF was forking was segfaulting:

[158358.508029] ThreadPoolForeg[3230]: segfault at 0 ip 0000000000000000 sp 00007fe3203db448 error 14 in spotify[200000+1acd000]
[158365.067313] ThreadPoolForeg[3252]: segfault at 0 ip 0000000000000000 sp 00007f2d69c172e8 error 14 in spotify[200000+1acd000]
[158378.506832] ThreadPoolForeg[3312]: segfault at 0 ip 0000000000000000 sp 00007f52ed7c8448 error 14 in spotify[200000+1acd000]
[158383.654027] ThreadPoolForeg[3339]: segfault at 0 ip 0000000000000000 sp 00007fcb631eb2e8 error 14 in spotify[200000+1acd000]

I replaced libcef.so from the Spotify DEB package with a matched-version libcef.so from Spotify’s Open Source builds page. This allowed me to have more debugging symbols, and generating a core dump revealed:

Core was generated by `ld-linux-x86-64.so.2 --argv0 /usr/share/spotify/spotify --type=utility --field-'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  0x0000000000000000 in ?? ()
[Current thread is 1 (LWP 12774)]
(gdb) bt
#0  0x0000000000000000 in ?? ()
#1  0x00007f79a8a3d671 in sqlite3MallocSize () at ../../third_party/sqlite/amalgamation/sqlite3.c:26957
#2  mallocWithAlarm () at ../../third_party/sqlite/amalgamation/sqlite3.c:26891
#3  sqlite3Malloc () at ../../third_party/sqlite/amalgamation/sqlite3.c:26913
#4  0x00007f79a8aff232 in sqlite3MallocZero () at ../../third_party/sqlite/amalgamation/sqlite3.c:27118
#5  pthreadMutexAlloc () at ../../third_party/sqlite/amalgamation/sqlite3.c:25755
#6  0x00007f79a8a4e9b2 in sqlite3MutexAlloc () at ../../third_party/sqlite/amalgamation/sqlite3.c:25298
#7  chrome_sqlite3_initialize () at ../../third_party/sqlite/amalgamation/sqlite3.c:24906
#8  0x00007f79a8a350bd in EnsureSqliteInitialized () at ../../sql/initialization.cc:55
#9  0x00007f79a8a30eb2 in OpenInternal () at ../../sql/database.cc:1357
#10 0x00007f79a8a30dfa in Open () at ../../sql/database.cc:270
#11 0x00007f79a8fb8de6 in InitializeDatabase () at ../../net/extras/sqlite/sqlite_persistent_store_backend_base.cc:99
#12 0x00007f79a8fb9751 in LoadNelPoliciesAndNotifyInBackground () at ../../net/extras/sqlite/sqlite_persistent_reporting_and_nel_store.cc:1041
#13 0x00007f79a5abe25b in Invoke<void (leveldb_proto::ProtoDatabaseSelector::*)(base::OnceCallback), scoped_refptr, base::OnceCallback > () at ../../base/bind_internal.h:498
#14 MakeItSo<void (leveldb_proto::ProtoDatabaseSelector::*)(base::OnceCallback), scoped_refptr, base::OnceCallback > ()
    at ../../base/bind_internal.h:598
#15 RunImpl<void (leveldb_proto::ProtoDatabaseSelector::*)(base::OnceCallback), std::__1::tuple<scoped_refptr, base::OnceCallback >, 0, 1> () at ../../base/bind_internal.h:671
#16 RunOnce () at ../../base/bind_internal.h:640
#17 0x00007f79a7776fa0 in Run () at ../../base/callback.h:98
#18 RunTask () at ../../base/task/common/task_annotator.cc:142
#19 0x00007f79a7792862 in base::internal::TaskTracker::RunBlockShutdown(base::internal::Task*) () at ../../base/task/thread_pool/task_tracker.cc:743
#20 0x00007f79a7792062 in RunTask () at ../../base/task/thread_pool/task_tracker.cc:598
#21 0x00007f79a77d42fb in RunTask () at ../../base/task/thread_pool/task_tracker_posix.cc:23
#22 0x00007f79a7791a43 in RunAndPopNextTask () at ../../base/task/thread_pool/task_tracker.cc:450
#23 0x00007f79a7798386 in RunWorker () at ../../base/task/thread_pool/worker_thread.cc:321
#24 0x00007f79a77980f4 in base::internal::WorkerThread::RunPooledWorker() () at ../../base/task/thread_pool/worker_thread.cc:223
#25 0x00007f79a77d4a05 in ThreadFunc () at ../../base/threading/platform_thread_posix.cc:81
#26 0x00007f79ac9fe2dd in ?? ()
#27 0x00007f79aca799e8 in ?? ()
#28 0x00007f7998247ce0 in ?? ()
#29 0x0000000000000000 in ?? ()
(gdb) frame 1
#1  0x00007f79a8a3d671 in sqlite3MallocSize () at ../../third_party/sqlite/amalgamation/sqlite3.c:26957
26957     return sqlite3GlobalConfig.m.xSize(p);

Inspecting the SQLite3 code, I realised that it was somehow getting a nullptr for the malloc_usable_size pointer. Further inspection revealed that this was not exactly the case:

(gdb) disassemble 0x7f79a77d5520
Dump of assembler code for function malloc_usable_size():
   0x00007f79a77d5520 :     push   %rbp
   0x00007f79a77d5521 :     mov    %rsp,%rbp
   0x00007f79a77d5524 :     mov    %rdi,%rsi
   0x00007f79a77d5527 :     mov    0x484a76a(%rip),%rdi        # 0x7f79ac01fc98 
   0x00007f79a77d552e :    mov    0x28(%rdi),%rax
   0x00007f79a77d5532 :    xor    %edx,%edx
   0x00007f79a77d5534 :    pop    %rbp
   0x00007f79a77d5535 :    jmpq   *%rax
End of assembler dump.

Looking at how the Chromium allocator works internally, the issue is that RTLD_NEXT won’t work on libraries loaded before libcef. And looking at the output of ldd spotify revealed both libm and libdl before libcef; musl always redirects these to libc for glibc ABI compatibility.

Using PatchELF to remove these two DT_NEEDEDs from the binary yielded a surprising result…

Music makes the people come together

Spotify on Adélie Linux
Spotify, playing “Rhinestone Eyes” by Gorillaz, on my Adélie laptop

It works! All the features I tested work: Spotify Connect, which means I can control the laptop’s playback using the iOS and Apple Watch apps; radio playback; Bluetooth speaker support.

Instructions

You will need to download the official Spotify 64-bit DEB. I have not tested this on a 32-bit system yet, but I see no reason it won’t work. Once you have the DEB, extract the data.tar.xz file somewhere. Use PatchELF on the Spotify binary as so:

$ patchelf --remove-needed libm.so.6 usr/share/spotify/spotify
$ patchelf --remove-needed libdl.so.2 usr/share/spotify/spotify

Move the extracted usr/share/spotify directory to your system’s /usr/share directory. For better integration, I moved the /usr/share/spotify/spotify.desktop file to /usr/share/applications. Then move the usr/bin/spotify link to /usr/bin.

Ensure that you have the latest gcompat installed. As I write this, only Adélie has the newest version in the current repo. I’ll be submitting merge requests to the distros I know that ship gcompat this week to ensure everyone has a chance to play around with the new bits.

Have fun!


Do you like running Spotify on musl? Or do you just like reading about fun hacks? Consider donating to Adélie to keep the fun going!

Big Endian Firefox: Now with more compositing

I’m currently in the process of trying to bring up the PowerPC platform as a fully supported architecture in Firefox.  I’ve already implemented better support for XPCOM / JS interfacing, and fixed a crash in the JavaScript interpreter. My next challenge is fixing graphical issues, which is proving to be more of a challenge than I initially anticipated.

However, I have broken some new ground.  Before, the compositing engine had wildly inaccurate colouring caused by errant “swizzle methods” (which are functions that take images of one colour type and change them to another – or, “swizzle” them).  This resulted in Firefox 64 (Nightly) looking like this on my workstation:

Firefox, with blue blocks and weird colours everywhere
Firefox, with a broken compositor

I’ve just managed to fix these methods and lo and behold, Firefox 64 (Nightly) looks like this now:

Screenshot_20181019_205618
Firefox, with working compositor

Obviously, there are still some minor nits to work out.  (Namely that my avatar in the top right corner has a blue tint to it!)  I believe the last issues are going to be in the Cairo code, which seems to get very confused by Skia’s byte ordering.  I already have a lead on how to potentially fix this issue.  The good news is that there are no longer any (non-debug) crashers, unless you attempt to view H.264 video.  This is because video playback via FFmpeg crashes due to another byte ordering issue.

All in all, I’m very satisfied with what I was able to knock out in just a few hours on a Friday night.  Thanks go to the #gfx chat room on Mozilla IRC for their guidance in what to look at.