fix(auth): prevent Qt atexit deadlock in PAM fork children#88
fix(auth): prevent Qt atexit deadlock in PAM fork children#88felixonmars wants to merge 1 commit into
Conversation
PAM modules (notably pam_gnome_keyring) may fork() internally and call exit() instead of _exit() in the fork child, e.g. when starting a daemon or communicating with an existing one. Since the DDM session leader inherits Qt's atexit handlers, exit() triggers libQt6DBus cleanup which tries to join its dispatcher thread. After fork(), only the calling thread survives, so the join blocks forever on a futex. Fix this by registering a pthread_atfork child handler in the session leader, right before pam_open_session(). The handler pushes an atexit entry that calls _exit(0); since atexit handlers run in LIFO order, this runs first and terminates the grandchild without executing Qt's broken post-fork cleanup. This fixes the immediate stuck after clicking login on Arch Linux.
|
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: felixonmars The full list of commands accepted by this bot can be found here. DetailsNeeds approval from an approver in each of these files:Approvers can indicate their approval by writing |
Reviewer's guide (collapsed on small PRs)Reviewer's GuideAdds a pthread_atfork child handler in the DDM auth session setup so that PAM-spawned fork children exit via _exit(0) from an atexit handler, preventing Qt/DBus atexit thread-join deadlocks after fork. Sequence diagram for PAM fork child exit handling with pthread_atforksequenceDiagram
actor User
participant DDM_SessionLeader
participant PAM_Module
participant PAM_Child
participant C_Library_atexit
User->>DDM_SessionLeader: Click login
DDM_SessionLeader->>DDM_SessionLeader: Register pthread_atfork handlers
Note over DDM_SessionLeader: child handler pushes atexit(_exit(0))
DDM_SessionLeader->>PAM_Module: pam_open_session()
PAM_Module->>PAM_Module: fork()
PAM_Module->>PAM_Child: Create child process
activate PAM_Child
PAM_Child->>C_Library_atexit: Inherit atexit handlers
PAM_Child->>PAM_Child: exit()
PAM_Child->>C_Library_atexit: Run atexit handlers (LIFO)
C_Library_atexit->>PAM_Child: Call _exit(0)
deactivate PAM_Child
Note over C_Library_atexit,DDM_SessionLeader: Qt/DBus atexit handlers are not executed
DDM_SessionLeader->>User: Session opens successfully
File-Level Changes
Tips and commandsInteracting with Sourcery
Customizing Your ExperienceAccess your dashboard to:
Getting Help
|
There was a problem hiding this comment.
Hey - I've found 2 issues, and left some high level feedback:
- Registering a global pthread_atfork child handler here means all subsequent forks in this process (not just PAM internals) will inherit an atexit that unconditionally calls _exit(0), which may bypass expected cleanup or status codes for unrelated child processes; consider narrowing the scope or adding a guard to limit this behavior to the PAM-specific context.
- The atexit handler installed in the fork child always exits with status 0, which can mask genuine failure conditions in PAM modules or other forked helpers; if you keep this approach, consider a strategy that preserves or communicates the intended exit status while still avoiding the Qt atexit deadlock.
- pthread_atfork can fail (e.g., ENOMEM), and in that case the deadlock risk remains but without any indication; it may be worth checking the return value and logging or handling the failure path explicitly.
Prompt for AI Agents
Please address the comments from this code review:
## Overall Comments
- Registering a global pthread_atfork child handler here means *all* subsequent forks in this process (not just PAM internals) will inherit an atexit that unconditionally calls _exit(0), which may bypass expected cleanup or status codes for unrelated child processes; consider narrowing the scope or adding a guard to limit this behavior to the PAM-specific context.
- The atexit handler installed in the fork child always exits with status 0, which can mask genuine failure conditions in PAM modules or other forked helpers; if you keep this approach, consider a strategy that preserves or communicates the intended exit status while still avoiding the Qt atexit deadlock.
- pthread_atfork can fail (e.g., ENOMEM), and in that case the deadlock risk remains but without any indication; it may be worth checking the return value and logging or handling the failure path explicitly.
## Individual Comments
### Comment 1
<location path="src/daemon/Auth.cpp" line_range="279" />
<code_context>
+ // Register a pthread_atfork child handler here so that any
+ // grandchild processes spawned by PAM will have an atexit handler
+ // that runs _exit() first (LIFO), bypassing Qt's broken cleanup.
+ pthread_atfork(nullptr, nullptr, []() {
+ atexit([]() { _exit(0); });
+ });
</code_context>
<issue_to_address>
**suggestion (bug_risk):** Consider handling the pthread_atfork return code and avoiding repeated registration.
`pthread_atfork` can fail (e.g., ENOMEM); right now that failure is silently ignored, so the workaround may never be installed and the caller can’t detect it. In addition, if this path is executed multiple times, each call adds another atfork handler that then adds another atexit handler in children, causing unbounded growth of handler lists. Please check and handle the `pthread_atfork` return value, and ensure registration happens only once per process (e.g., via `std::once_flag`).
Suggested implementation:
```cpp
// Register a pthread_atfork child handler here so that any
// grandchild processes spawned by PAM will have an atexit handler
// that runs _exit() first (LIFO), bypassing Qt's broken cleanup.
//
// Use std::call_once to avoid registering multiple handlers in the
// same process, and check the pthread_atfork return value.
static std::once_flag pamAtforkOnce;
std::call_once(pamAtforkOnce, [] {
const int rc = pthread_atfork(
nullptr,
nullptr,
[]() {
atexit([]() { _exit(0); });
}
);
if (rc != 0) {
// Best-effort logging; failure means the workaround is not installed.
qWarning() << "Failed to register pthread_atfork handler for PAM:"
<< strerror(rc);
}
});
// Open session
auto sessionEnv = openSessionInternal(env);
```
To compile successfully, you will also need to:
1. Ensure the appropriate headers are included near the top of `src/daemon/Auth.cpp` (if they are not already):
- `#include <mutex>` for `std::once_flag` and `std::call_once`.
- `#include <cstring>` for `strerror`.
2. Ensure that `qWarning()` is available in this translation unit (it usually is via common Qt headers such as `<QDebug>` or similar). If not already present, add the appropriate Qt debug/logging include used elsewhere in this project.
3. If your project uses a different logging mechanism than `qWarning()`, replace the logging line inside the `if (rc != 0)` block with the project's standard error-logging macro or function.
</issue_to_address>
### Comment 2
<location path="src/daemon/Auth.cpp" line_range="276-277" />
<code_context>
+ // thread). After fork(), that thread doesn't exist, so the join
+ // blocks forever on a futex/semaphore.
+ //
+ // Register a pthread_atfork child handler here so that any
+ // grandchild processes spawned by PAM will have an atexit handler
+ // that runs _exit() first (LIFO), bypassing Qt's broken cleanup.
+ pthread_atfork(nullptr, nullptr, []() {
</code_context>
<issue_to_address>
**question (bug_risk):** Scope and lifetime of the atfork-based workaround may be broader than intended.
Since `pthread_atfork` is process-wide, this handler will run on every future `fork()`, not just those triggered indirectly by PAM, and will change termination behavior for all child processes by installing the `_exit(0)` atexit hook. If that broader effect isn’t intended, consider scoping the workaround more narrowly (e.g., via a helper process or only around the specific PAM operations known to fork) so other fork-using paths aren’t unintentionally altered.
</issue_to_address>Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.
| // Register a pthread_atfork child handler here so that any | ||
| // grandchild processes spawned by PAM will have an atexit handler | ||
| // that runs _exit() first (LIFO), bypassing Qt's broken cleanup. | ||
| pthread_atfork(nullptr, nullptr, []() { |
There was a problem hiding this comment.
suggestion (bug_risk): Consider handling the pthread_atfork return code and avoiding repeated registration.
pthread_atfork can fail (e.g., ENOMEM); right now that failure is silently ignored, so the workaround may never be installed and the caller can’t detect it. In addition, if this path is executed multiple times, each call adds another atfork handler that then adds another atexit handler in children, causing unbounded growth of handler lists. Please check and handle the pthread_atfork return value, and ensure registration happens only once per process (e.g., via std::once_flag).
Suggested implementation:
// Register a pthread_atfork child handler here so that any
// grandchild processes spawned by PAM will have an atexit handler
// that runs _exit() first (LIFO), bypassing Qt's broken cleanup.
//
// Use std::call_once to avoid registering multiple handlers in the
// same process, and check the pthread_atfork return value.
static std::once_flag pamAtforkOnce;
std::call_once(pamAtforkOnce, [] {
const int rc = pthread_atfork(
nullptr,
nullptr,
[]() {
atexit([]() { _exit(0); });
}
);
if (rc != 0) {
// Best-effort logging; failure means the workaround is not installed.
qWarning() << "Failed to register pthread_atfork handler for PAM:"
<< strerror(rc);
}
});
// Open session
auto sessionEnv = openSessionInternal(env);
To compile successfully, you will also need to:
- Ensure the appropriate headers are included near the top of
src/daemon/Auth.cpp(if they are not already):#include <mutex>forstd::once_flagandstd::call_once.#include <cstring>forstrerror.
- Ensure that
qWarning()is available in this translation unit (it usually is via common Qt headers such as<QDebug>or similar). If not already present, add the appropriate Qt debug/logging include used elsewhere in this project. - If your project uses a different logging mechanism than
qWarning(), replace the logging line inside theif (rc != 0)block with the project's standard error-logging macro or function.
| // Register a pthread_atfork child handler here so that any | ||
| // grandchild processes spawned by PAM will have an atexit handler |
There was a problem hiding this comment.
question (bug_risk): Scope and lifetime of the atfork-based workaround may be broader than intended.
Since pthread_atfork is process-wide, this handler will run on every future fork(), not just those triggered indirectly by PAM, and will change termination behavior for all child processes by installing the _exit(0) atexit hook. If that broader effect isn’t intended, consider scoping the workaround more narrowly (e.g., via a helper process or only around the specific PAM operations known to fork) so other fork-using paths aren’t unintentionally altered.
There was a problem hiding this comment.
Pull request overview
This PR fixes a post-fork() deadlock during PAM session opening in DDM’s session leader by ensuring fork-children created inside PAM modules don’t run Qt’s atexit cleanup (notably QtDBus thread teardown) that can hang after fork().
Changes:
- Add a
pthread_atfork()child handler in the session leader beforepam_open_session()work begins. - In the atfork child handler, register an
atexit()handler that immediately terminates via_exit()to bypass Qt’s post-fork cleanup path.
| pthread_atfork(nullptr, nullptr, []() { | ||
| atexit([]() { _exit(0); }); | ||
| }); |
|
TAG Bot New tag: 0.3.4 |
PAM modules (notably pam_gnome_keyring) may fork() internally and call exit() instead of _exit() in the fork child, e.g. when starting a daemon or communicating with an existing one. Since the DDM session leader inherits Qt's atexit handlers, exit() triggers libQt6DBus cleanup which tries to join its dispatcher thread. After fork(), only the calling thread survives, so the join blocks forever on a futex.
Fix this by registering a pthread_atfork child handler in the session leader, right before pam_open_session(). The handler pushes an atexit entry that calls _exit(0); since atexit handlers run in LIFO order, this runs first and terminates the grandchild without executing Qt's broken post-fork cleanup.
This fixes the immediate stuck after clicking login on Arch Linux.
Summary by Sourcery
Bug Fixes: