fix: trigger ShipIt Mach service after SMJobSubmit to unblock on-demand-only mode (#51191)

* fix: trigger ShipIt Mach service to unblock on-demand-only mode

When a macOS system update is pending, launchd puts the user domain
into on-demand-only mode, preventing ShipIt from starting. The
MachServices endpoint in the job dictionary was registered but never
connected to (a leftover from the XPC removal in 2013).

Instead of removing MachServices, fire a lightweight XPC connection
to the Mach port after SMJobSubmit. This satisfies launchd's
on-demand trigger, starting ShipIt immediately while preserving
KeepAlive retry behavior.

Co-Authored-By: Claude <svc-devxp-claude@slack-corp.com>

* fix: add ResetAtClose to ShipIt MachServices to prevent standing demand

The XPC trigger message sent after SMJobSubmit sits in the Mach port's
kernel queue unread. Without ResetAtClose, this creates standing demand
that causes launchd to respawn ShipIt after a successful exit(0),
defeating KeepAlive.SuccessfulExit = NO.

Set ResetAtClose on the MachServices registration so launchd tears down
and recreates the port when ShipIt exits, flushing the stale trigger.

Co-Authored-By: Claude <svc-devxp-claude@slack-corp.com>

* fix: drain Mach port before exit(0) instead of using ResetAtClose

ResetAtClose blocks KeepAlive.SuccessfulExit retries in on-demand-only
mode because it removes demand when the port resets. Instead, have
ShipIt drain its own Mach service port (via bootstrap_check_in +
mach_msg) before each exit(EXIT_SUCCESS). This clears the standing
demand from the trigger message so launchd won't respawn after a
successful exit, while leaving the message in place on failure exits
so KeepAlive retries remain demand-backed.

Tested in on-demand-only mode (pending macOS update):
- exit(0) + drain: 1 run, no respawn ✓
- exit(1) + no drain: continuous respawn every 2s ✓

Co-Authored-By: Claude <svc-devxp-claude@slack-corp.com>

* chore: update patch

* chore: harden ShipIt Mach trigger and simplify port drain

Scope the XPC trigger to the unprivileged path and add a send barrier
so the connection cannot be released before the message is on the wire.
Reduce drainMachServicePort to bootstrap_check_in (process exit flushes
the queue), dropping the mach_msg loop whose buffer/dealloc usage was
incorrect, and remove the no-op drain from the posix_spawn'd launch
helper. Patch filename regenerated to match the commit subject.

* fix: restore explicit mach_msg drain in drainMachServicePort

bootstrap_check_in alone does not prevent respawn: launchd tracks
outstanding demand independently of the receive right's lifetime, so the
queued trigger message must be explicitly dequeued with mach_msg before
exit(0). Verified empirically (check-in-only: 5 respawns in 10s; full
drain: 1 run). Keep the correctness fixes from the previous commit
(4K buffer, mach_msg_destroy on each receive, no mach_port_deallocate).
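
The respawn cases described across these commits can be summarized as a small decision function. This is a simplified sketch of the observed behavior only, not launchd's actual logic; `would_spawn` and `last_exit_t` are hypothetical names introduced for illustration, and the model assumes the job dictionary keeps KeepAlive.SuccessfulExit = NO.

```c
#include <stdbool.h>

/* Hypothetical model of the behavior observed in this PR; launchd's
 * real implementation is not public API and may differ. */
typedef enum { NEVER_RAN, EXITED_OK, EXITED_FAIL } last_exit_t;

/* Returns true if the modelled launchd would (re)spawn the ShipIt job. */
bool would_spawn(bool on_demand_only, bool queued_message, last_exit_t last) {
    /* An unread message on the Mach service port is standing demand
     * and launches the job regardless of domain mode. */
    if (queued_message) return true;
    /* With no demand, on-demand-only mode suppresses RunAtLoad and
     * KeepAlive spawns. */
    if (on_demand_only) return false;
    /* Normal mode with KeepAlive.SuccessfulExit = NO: start on
     * submission, respawn only after an unsuccessful exit. */
    return last != EXITED_OK;
}
```

Under this model, draining before exit(0) corresponds to `would_spawn(true, false, EXITED_OK)`, which is false (one run, no respawn), while check-in without a drain leaves the message queued, so `would_spawn(true, true, EXITED_OK)` is true (respawn). A failure exit with the message left in place, `would_spawn(true, true, EXITED_FAIL)`, is true, which is the demand-backed retry.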

---------

Co-authored-by: Claude <svc-devxp-claude@slack-corp.com>
Co-authored-by: Samuel Attard <sattard@anthropic.com>
Keeley Hammond
2026-04-21 09:46:18 -07:00
committed by GitHub
parent 1ad6173286
commit 313f8955d1
3 changed files with 140 additions and 50 deletions


@@ -12,4 +12,4 @@ use_uttype_class_instead_of_deprecated_uttypeconformsto.patch
fix_clean_up_orphaned_staged_updates_before_downloading_new_update.patch
fix_add_explicit_json_property_mappings_for_shipit_request_model.patch
fix_resolve_target_bundle_path_once_at_start_of_install.patch
-fix_remove_vestigial_machservices_from_shipit_launchd_job.patch
+fix_trigger_shipit_mach_service_after_smjobsubmit_to_unblock.patch


@@ -1,49 +0,0 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Keeley Hammond <vertedinde@electronjs.org>
Date: Tue, 14 Apr 2026 10:00:00 -0700
Subject: fix: remove vestigial MachServices from ShipIt launchd job

The MachServices key in the ShipIt launchd job dictionary is a leftover
from when ShipIt was an XPC service (removed in Squirrel/Squirrel.Mac@d6ca1c2
":fire: XPC :fire:" in October 2013). Since then, nothing connects to
the registered Mach port.

When a macOS system update is pending, launchd puts the user domain into
"on-demand-only mode", where only jobs with an on-demand trigger (like a
MachServices connection) are started. Since nothing connects to ShipIt's
Mach port, launchd pends the spawn indefinitely with:

  "pending spawn, domain in on-demand-only mode"

This prevents ShipIt from running, causing Electron auto-updates to fail
whenever a macOS update is staged. By the time the domain exits
on-demand-only mode (after reboot/update completion), the staged update
bundle is often stale, leading to "Too many attempts to install" errors.

Removing the MachServices key eliminates the on-demand trigger, so
launchd falls back to evaluating KeepAlive.SuccessfulExit which
correctly starts ShipIt immediately on submission.

Also removes the stale "service name" comment on the jobLabel argument,
since ShipIt is no longer a Mach service.

diff --git a/Squirrel/SQRLShipItLauncher.m b/Squirrel/SQRLShipItLauncher.m
index 6a9151d92f399184fff9854eb00ea506165bbbe2..0ebd2a23d62424c41e15413edeab360bef87ecc4 100644
--- a/Squirrel/SQRLShipItLauncher.m
+++ b/Squirrel/SQRLShipItLauncher.m
@@ -50,14 +50,10 @@ + (RACSignal *)shipItJobDictionary {
@(LAUNCH_JOBKEY_KEEPALIVE_SUCCESSFULEXIT): @NO
};
- jobDict[@(LAUNCH_JOBKEY_MACHSERVICES)] = @{
- jobLabel: @YES
- };
-
NSMutableArray *arguments = [[NSMutableArray alloc] init];
[arguments addObject:[squirrelBundle URLForResource:@"ShipIt" withExtension:nil].path];
- // Pass in the service name so ShipIt knows how to broadcast itself.
+ // Pass in the job label so ShipIt can identify itself.
[arguments addObject:jobLabel];
// We need to pass the path to ShipIt rather than having ShipIt


@@ -0,0 +1,139 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Keeley Hammond <vertedinde@electronjs.org>
Date: Tue, 14 Apr 2026 10:00:00 -0700
Subject: fix: trigger ShipIt Mach service after SMJobSubmit to unblock
 on-demand-only mode

When a macOS system update is pending (downloaded but not yet installed),
launchd puts the user domain (gui/<uid>) into "on-demand-only mode".
In this mode, launchd only starts jobs triggered by an on-demand event
such as a Mach port connection -- KeepAlive and RunAtLoad are suppressed.

ShipIt's launchd job registers a MachServices endpoint, but nothing ever
connects to it. The MachServices key was originally used when ShipIt was
a full XPC service (removed in Squirrel/Squirrel.Mac@d6ca1c2 in October
2013). The client-side connection was removed but the server-side
MachServices registration was left behind, creating a trigger that
launchd waits on but nothing ever fires.

Fix: after SMJobSubmit, open a lightweight XPC connection to the
registered Mach service name and send an empty message. This satisfies
launchd's on-demand trigger and causes it to start ShipIt immediately.
In normal operation (no pending update), the job starts via KeepAlive
anyway and the trigger is a harmless no-op. Unlike launchctl kickstart,
this preserves KeepAlive.SuccessfulExit respawn behavior because launchd
treats the activation as a legitimate on-demand event.

The trigger message stays in the Mach port's kernel queue (ShipIt has
no XPC listener), which creates standing demand that provides the
on-demand activity needed for KeepAlive retries in on-demand-only mode.
To prevent this standing demand from also respawning ShipIt after a
successful exit(0), ShipIt checks in for its Mach service port and
dequeues every pending message before each exit(EXIT_SUCCESS). Checking
in alone is not sufficient: launchd tracks demand independently of the
port's lifetime and will respawn the job if the message was never read.
On failure exits, the message is left in place so launchd treats the
KeepAlive respawn as demand-backed.

diff --git a/Squirrel/SQRLShipItLauncher.m b/Squirrel/SQRLShipItLauncher.m
index 6a9151d92f399184fff9854eb00ea506165bbbe2..a087f20043fa79a07391ed065031396d7ec6fce4 100644
--- a/Squirrel/SQRLShipItLauncher.m
+++ b/Squirrel/SQRLShipItLauncher.m
@@ -10,6 +10,7 @@
#import <ReactiveObjC/EXTScope.h>
#import "SQRLDirectoryManager.h"
#import <ReactiveObjC/ReactiveObjC.h>
+#import <xpc/xpc.h>
#import <Security/Security.h>
#import <ServiceManagement/ServiceManagement.h>
#import <launch.h>
@@ -57,7 +58,7 @@ + (RACSignal *)shipItJobDictionary {
NSMutableArray *arguments = [[NSMutableArray alloc] init];
[arguments addObject:[squirrelBundle URLForResource:@"ShipIt" withExtension:nil].path];
- // Pass in the service name so ShipIt knows how to broadcast itself.
+ // Pass in the job label so ShipIt can identify itself.
[arguments addObject:jobLabel];
// We need to pass the path to ShipIt rather than having ShipIt
@@ -154,6 +155,23 @@ + (RACSignal *)launchPrivileged:(BOOL)privileged {
return [RACSignal error:CFBridgingRelease(cfError)];
}
+ // Trigger an on-demand launch by sending a message to the job's
+ // Mach service. When loginwindow begins a restart (e.g. for a
+ // pending macOS update) it puts the per-user launchd domain into
+ // on-demand-only mode, which defers RunAtLoad/KeepAlive spawns
+ // but still honors real IPC demand. The system domain is not
+ // affected, so this is only needed for the unprivileged path.
+ if (!privileged) {
+ xpc_connection_t trigger = xpc_connection_create_mach_service(self.shipItJobLabel.UTF8String, NULL, 0);
+ xpc_connection_set_event_handler(trigger, ^(xpc_object_t __unused event) {});
+ xpc_connection_resume(trigger);
+ xpc_connection_send_message(trigger, xpc_dictionary_create(NULL, NULL, 0));
+ // send_message is async; keep the connection alive until the
+ // message is actually on the wire so ARC releasing `trigger`
+ // at end-of-scope can't drop it first.
+ xpc_connection_send_barrier(trigger, ^{ (void)trigger; });
+ }
+
return [RACSignal empty];
}]
flatten]
diff --git a/Squirrel/ShipIt-main.m b/Squirrel/ShipIt-main.m
index acf545199dbf1831fe8a73155c6e4d0db4047934..e26c0f11870ddbe801e572b1696af484231bf1dc 100644
--- a/Squirrel/ShipIt-main.m
+++ b/Squirrel/ShipIt-main.m
@@ -15,6 +15,8 @@
#include <spawn.h>
#include <sys/wait.h>
+#include <mach/mach.h>
+#include <servers/bootstrap.h>
#import "NSError+SQRLVerbosityExtensions.h"
#import "RACSignal+SQRLTransactionExtensions.h"
@@ -63,6 +65,28 @@ static BOOL clearInstallationAttempts(NSString *applicationIdentifier) {
return CFPreferencesSynchronize((__bridge CFStringRef)applicationIdentifier, kCFPreferencesCurrentUser, kCFPreferencesCurrentHost);
}
+// Drain the Mach service port registered via MachServices in the launchd
+// job dictionary before exit(0) so launchd sees no outstanding demand and
+// does not immediately respawn the job. bootstrap_check_in transfers the
+// receive right into this task, but that alone is not sufficient: launchd
+// tracks demand independently of the port's lifetime, so the queued
+// trigger message must be explicitly dequeued. On failure exits the
+// message is intentionally left queued so the KeepAlive respawn is
+// demand-backed while the launchd domain is in on-demand-only mode.
+static void drainMachServicePort(const char *serviceName) {
+ mach_port_t port = MACH_PORT_NULL;
+ if (bootstrap_check_in(bootstrap_port, serviceName, &port) != KERN_SUCCESS) return;
+
+ struct {
+ mach_msg_header_t header;
+ uint8_t body[4096];
+ } msg;
+ while (mach_msg(&msg.header, MACH_RCV_MSG | MACH_RCV_TIMEOUT,
+ 0, sizeof(msg), port, 0, MACH_PORT_NULL) == KERN_SUCCESS) {
+ mach_msg_destroy(&msg.header);
+ }
+}
+
// Waits for all instances of the target application (as described in the
// `request`) to exit, then sends completed.
static RACSignal *waitForTerminationIfNecessary(SQRLShipItRequest *request) {
@@ -206,12 +230,14 @@ static void installRequest(RACSignal *readRequestSignal, NSString *applicationId
if ([[error domain] isEqual:SQRLInstallerErrorDomain] && [error code] == SQRLInstallerErrorAppStillRunning) {
NSLog(@"Installation cancelled: %@", error);
clearInstallationAttempts(applicationIdentifier);
+ drainMachServicePort(applicationIdentifier.UTF8String);
exit(EXIT_SUCCESS);
} else {
NSLog(@"Installation error: %@", error);
exit(EXIT_FAILURE);
}
} completed:^{
+ drainMachServicePort(applicationIdentifier.UTF8String);
exit(EXIT_SUCCESS);
}];
}