Init: Graceful termination and signal propagation #29

Merged
rawan merged 4 commits from development_signals into development 2026-03-12 11:10:10 +00:00
Member

Related issue: #8 (https://forge.ourworld.tf/geomind_code/my_hypervisor/issues/8)

Changes implemented:

  1. Re-entrancy guard on poweroff
    Added an AtomicBool flag. The first thread to call poweroff() claims it. Any subsequent caller loops on pause() forever, which prevents concurrent shutdown attempts from racing on the unmounts and the reboot syscall.

  2. Child process termination before unmounting
    Added terminate_all_children() which runs before sync() and unmounts:

    • Sends SIGTERM to all child processes
    • Polls for up to 3 seconds, reaping exited children every 50ms
    • If any children survive, sends SIGKILL and reaps for up to 1 second

This ensures that all file handles on the mounts are released before umount runs.
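
The re-entrancy guard in item 1 can be sketched with the standard library alone. This is a minimal illustration, not the PR's code: `try_claim` and `count_winners` are hypothetical names, and the memory ordering is an assumption, since the description only says an AtomicBool is used.

```rust
use std::sync::atomic::{AtomicBool, Ordering};

/// Atomically claims the flag. Returns true for exactly one caller.
/// (Hypothetical helper; the PR only states that an AtomicBool is used.)
fn try_claim(flag: &AtomicBool) -> bool {
    // `swap` returns the previous value, so only the caller that flips
    // false -> true observes `false` here.
    !flag.swap(true, Ordering::SeqCst)
}

/// Races `threads` callers against one flag and counts how many won the claim.
fn count_winners(threads: usize) -> usize {
    let flag = AtomicBool::new(false);
    std::thread::scope(|s| {
        let handles: Vec<_> = (0..threads)
            .map(|_| s.spawn(|| try_claim(&flag)))
            .collect();
        handles
            .into_iter()
            .map(|h| h.join().unwrap())
            .filter(|won| *won)
            .count()
    })
}
```

In a poweroff() shaped like the one described above, the winning caller would continue with the shutdown sequence, while every caller that gets `false` would park itself forever (the description uses pause() for that).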

feat: child termination before unmount, re-entrancy guard on poweroff
All checks were successful
Unit and Integration Test / test (push) Successful in 1m47s
5ea2d78324
salmaelsoly requested changes 2026-03-11 14:18:09 +00:00
Dismissed
@ -25,1 +31,4 @@
/// Non-blocking reap of all zombie children.
fn reap_all_children() {
loop {
Member

I think this is a normal blocking function. Can you update the comment? It is a bit confusing.

Author
Member

It is non-blocking because of the WNOHANG flag. The loop keeps reaping exited children and breaks as soon as there are none left to collect, so the function never stalls waiting for a child that is still running.

Member

I meant that you still have to wait for the function to return or break.

@ -26,0 +41,4 @@
}
Ok(WaitStatus::StillAlive) | Err(_) => break,
_ => continue,
}
Member

Why are we breaking on StillAlive? Shouldn't we continue?

Author
Member

StillAlive means there are no more zombies to collect right now. All remaining children are still running, so we should break (there is no reaping work left to do).

Also, this is an infinite loop, so we have to break somewhere to avoid running forever.

@ -26,0 +59,4 @@
while Instant::now() < deadline {
reap_all_children();
if no_children_left() {
return;
Member

Why do we still need this check? reap_all_children already matches on the waitpid result before it returns, so the state of the children can be known from it. I think no_children_left is redundant.

Member

We can change reap_all_children to return a bool and remove no_children_left.

rawan marked this conversation as resolved
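
A minimal sketch of the suggestion above, with reap_all_children returning a bool. The wait call is modelled by a closure so the control flow can be read and tested in isolation. `WaitOutcome` and `poll_once` are hypothetical stand-ins: in the PR the outcomes come from waitpid with WNOHANG, and mapping ECHILD to "no children" is an assumption, since the diff matches on `Err(_)` in general.

```rust
/// Hypothetical stand-in for the outcomes of one non-blocking wait
/// (in the PR these come from `waitpid` with `WNOHANG`).
#[derive(Debug, Clone, Copy, PartialEq)]
enum WaitOutcome {
    /// A zombie was collected; more may be pending.
    Reaped,
    /// Children exist, but none has exited yet (nix: `WaitStatus::StillAlive`).
    StillAlive,
    /// The process has no children left (assumed: the wait fails with ECHILD).
    NoChildren,
}

/// Drains every zombie that is ready right now, then returns without blocking.
/// Returns `true` when no children remain, so callers need no second check.
fn reap_all_children(mut poll_once: impl FnMut() -> WaitOutcome) -> bool {
    loop {
        match poll_once() {
            // Something was reaped: try again, another zombie may be waiting.
            WaitOutcome::Reaped => continue,
            // Nothing to collect now, but live children remain.
            WaitOutcome::StillAlive => return false,
            // Nothing to collect and nothing left to wait for.
            WaitOutcome::NoChildren => return true,
        }
    }
}
```

Both terminal arms leave the loop, which is why breaking on StillAlive is correct: continuing there would turn the non-blocking reap into a busy wait.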
@ -26,0 +66,4 @@
println!("[init] Sending SIGKILL to remaining child processes...");
let _ = nix::sys::signal::kill(Pid::from_raw(-1), Signal::SIGKILL);
Member

This signals all processes. Does it exclude the calling process itself?

Author
Member

Yes. kill with pid -1 sends the signal to every process that the calling process has permission to signal, except process 1 (init), which here is the calling process.

See the kill(2) man page: man 2 kill

@ -26,0 +74,4 @@
return;
}
std::thread::sleep(Duration::from_millis(50));
}
Member

This part is duplicated. We can extract it into a helper function that takes the timeout as a parameter:

use std::time::{Duration, Instant};

/// Polls until all children are reaped or `timeout` elapses.
/// Returns true if no children are left.
fn wait_for_children(timeout: Duration) -> bool {
    let deadline = Instant::now() + timeout;
    while Instant::now() < deadline {
        reap_all_children();
        if no_children_left() {
            return true;
        }
        std::thread::sleep(Duration::from_millis(50));
    }
    false
}
rawan marked this conversation as resolved
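
With such a helper, the two phases from the PR description reduce to "signal, then wait" twice. The sketch below assumes the signal sending and the "all children gone" check are passed in as closures; `Sig`, `Outcome`, and both closure parameters are hypothetical and only make the control flow testable without real processes. The helper here also checks once before sleeping, which is a small deviation from the suggestion above.

```rust
use std::time::{Duration, Instant};

/// Poll interval from the PR description (children are reaped every 50 ms).
const POLL_INTERVAL: Duration = Duration::from_millis(50);

/// Signals used by the two phases (hypothetical stand-in for nix's `Signal`).
#[derive(Debug, Clone, Copy, PartialEq)]
enum Sig {
    Term,
    Kill,
}

/// How the termination sequence ended.
#[derive(Debug, PartialEq)]
enum Outcome {
    ExitedAfterTerm,
    ExitedAfterKill,
    Survivors,
}

/// Calls `all_gone` every `POLL_INTERVAL` until it returns true or `timeout` elapses.
fn wait_for_children(timeout: Duration, mut all_gone: impl FnMut() -> bool) -> bool {
    let deadline = Instant::now() + timeout;
    loop {
        if all_gone() {
            return true;
        }
        if Instant::now() >= deadline {
            return false;
        }
        std::thread::sleep(POLL_INTERVAL);
    }
}

/// Sends SIGTERM first, then SIGKILL to whatever is still running.
fn terminate_all_children(
    mut send: impl FnMut(Sig),
    mut all_gone: impl FnMut() -> bool,
    term_timeout: Duration,
    kill_timeout: Duration,
) -> Outcome {
    send(Sig::Term);
    if wait_for_children(term_timeout, &mut all_gone) {
        return Outcome::ExitedAfterTerm;
    }
    send(Sig::Kill);
    if wait_for_children(kill_timeout, &mut all_gone) {
        Outcome::ExitedAfterKill
    } else {
        Outcome::Survivors
    }
}
```

In the PR's setting the timeouts would be 3 seconds and 1 second, `send` would wrap the kill call on pid -1 shown in the diff, and `all_gone` would wrap the non-blocking reap.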
refactor: remove code duplication, adding constants
All checks were successful
Unit and Integration Test / test (push) Successful in 1m46s
ee45cd3ec9
Member

Can you please fix the conflicts?

Merge branch 'development' into development_signals
Some checks failed
Unit and Integration Test / test (push) Failing after 27s
c2fc1d1511
fix: merge manual edits
All checks were successful
Unit and Integration Test / test (push) Successful in 1m46s
b375c01f76
rawan merged commit baac8d490e into development 2026-03-12 11:10:10 +00:00
rawan deleted branch development_signals 2026-03-12 11:10:10 +00:00
Reference
geomind_code/my_hypervisor!29