Reliable shutdown on Windows + lelab --stop #46
On Windows, stopping LeLab (Ctrl-C / closing the terminal) often leaves the
uvicorn and Vite child processes alive, which keeps port :8000 — and any open
camera handles — held. The next `lelab` then fails to bind, or a camera won't
open because the previous run never released it.
- `lelab --stop`: find and terminate a running LeLab and its child process tree,
freeing the port.
- Clean process-tree teardown on exit (psutil, with a Windows `taskkill /T`
fallback) so the uvicorn/Vite children and their handles are actually released.
- Pre-flight port checks with an actionable message ("…run `lelab --stop` to free
it") instead of an opaque bind error, plus a readiness wait before opening the
browser.
Tests in tests/test_scripts_lelab.py cover the stop / teardown / port-check paths
with mocked psutil + subprocess (no real processes spawned).
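A minimal sketch of the pre-flight port check described above, using only the standard library. The helper names (`port_is_free`, `preflight_message`) and the first half of the message are hypothetical; only the "run `lelab --stop` to free it" hint comes from this PR.

```python
import socket
from typing import Optional


def port_is_free(port: int, host: str = "127.0.0.1") -> bool:
    """Return True if host:port can be bound right now (hypothetical helper)."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
        except OSError:
            # Typically EADDRINUSE: another process still holds the port.
            return False
    return True


def preflight_message(port: int) -> Optional[str]:
    """Return an actionable message if the port is taken, else None."""
    if port_is_free(port):
        return None
    # Message wording is an assumption, apart from the `lelab --stop` hint.
    return f"Port {port} is already in use. Run `lelab --stop` to free it."
```

Calling `preflight_message(8000)` before starting the server lets the launcher print the hint and exit, rather than surfacing the raw bind error.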
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
952e688 to 5131314
| # asyncio Proactor loop doesn't wind down cleanly), leaving the terminal | ||
| # stuck. Take over signal handling: stop hard and reap any child | ||
| # subprocesses (training/recording/inference) so the prompt always returns. | ||
| server.install_signal_handlers = lambda: None |
The hard-exit path is gated by intent (the comment is about the Windows Proactor loop hanging) but not by code: `install_signal_handlers = lambda: None` and the `os._exit(0)` below run on every platform. On macOS/Linux this drops the graceful shutdown the old prod code explicitly opted into (`timeout_graceful_shutdown=2`) and skips the `@app.on_event("shutdown")` cleanup. Can we gate this to Windows (e.g. `if os.name == "nt":` around the override + hard exit) and keep uvicorn's native graceful handler elsewhere?
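One way the gating could look, as a sketch only. `configure_shutdown` and its return values are hypothetical, and a `SimpleNamespace` stands in for the real server object so the example runs without uvicorn.

```python
import os
from types import SimpleNamespace


def configure_shutdown(server, os_name: str = os.name) -> str:
    """Override the server's signal handling only on Windows (sketch)."""
    if os_name == "nt":
        # Windows: take over signal handling so the caller can hard-exit.
        server.install_signal_handlers = lambda: None
        return "hard-exit"
    # Elsewhere: leave the server's native graceful handler untouched.
    return "graceful"


# Stand-in for the real server object (assumption, for illustration only).
fake_server = SimpleNamespace(install_signal_handlers="native-handler")
mode = configure_shutdown(fake_server, os_name="posix")
```

With `os_name="posix"` the stand-in keeps its original handler; the caller would then run the `os._exit(0)` path only when the returned mode is `"hard-exit"`.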
| if returncode is not None: | ||
| logger.error("%s stopped with exit code %s.", name, returncode) | ||
| _shutdown_processes(processes) | ||
| raise SystemExit(returncode or 1) |
`returncode or 1` turns a clean child exit (`returncode == 0`) into exit code 1. Suggest `raise SystemExit(returncode if returncode is not None else 1)`.
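To make the difference concrete, here is a small sketch comparing the two expressions. The function names are hypothetical and exist only for this illustration.

```python
from typing import Optional


def exit_code_truthy(returncode: Optional[int]) -> int:
    """Current behaviour: `or` treats 0 as falsy, so 0 becomes 1."""
    return returncode or 1


def exit_code_explicit(returncode: Optional[int]) -> int:
    """Suggested behaviour: only a missing return code maps to 1."""
    return returncode if returncode is not None else 1
```

Both versions agree for non-zero codes and for `None`; they differ only when the child exits with 0.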
| def _wait_for_port(port: int, timeout: int = 30) -> bool: | ||
| for _ in range(timeout): | ||
| sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM) | ||
| def _fail(message: str) -> None: |
If you annotate this `-> typing.NoReturn`, the type checker knows control never returns past `_fail`, which lets you delete the unreachable `raise AssertionError("unreachable")` lines at 151 and 246 (they're only there because the return type currently looks like `None`).
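A sketch of the annotation in use. The body of `_fail` (log, then `SystemExit(1)`) and the caller `parse_port` are assumptions for illustration, not the PR's actual code.

```python
import logging
from typing import NoReturn

logger = logging.getLogger("lelab")


def _fail(message: str) -> NoReturn:
    """Log the error and exit; never returns to the caller."""
    logger.error(message)
    raise SystemExit(1)


def parse_port(raw: str) -> int:
    """Hypothetical caller: no trailing `raise AssertionError` is needed."""
    if not raw.isdigit():
        _fail(f"Invalid port: {raw!r}")
    # Because `_fail` is NoReturn, a type checker treats this line as
    # reachable only when `raw` is numeric.
    return int(raw)
```

`typing.NoReturn` affects static analysis only; runtime behaviour is unchanged.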
| def _install_signal_handlers(processes: Sequence[tuple[str, subprocess.Popen]]) -> None: | ||
| def shutdown(_signum, _frame) -> None: | ||
| logger.info("Shutting down LeLab...") | ||
| _shutdown_processes(processes) |
Minor: this runs teardown, then the `raise SystemExit(0)` is caught by the `except BaseException` at line 430, which calls `_shutdown_processes` again. It's idempotent so harmless, but the handler could just `raise SystemExit(0)` and let the `except` own teardown to avoid the double pass.
nicolas-rabault left a comment
Thank you @nobullryder for this contribution.
Here are some change requests that could improve it.