<feed xmlns='http://www.w3.org/2005/Atom'>
<title>dotfiles/etc/kernel/cmdline-linux-hardened.tmpl, branch master</title>
<subtitle>My linux config and rc files</subtitle>
<id>https://git.sommerfeld.dev/dotfiles/atom/etc/kernel/cmdline-linux-hardened.tmpl?h=master</id>
<link rel='self' href='https://git.sommerfeld.dev/dotfiles/atom/etc/kernel/cmdline-linux-hardened.tmpl?h=master'/>
<link rel='alternate' type='text/html' href='https://git.sommerfeld.dev/dotfiles/'/>
<updated>2026-05-29T10:18:15Z</updated>
<entry>
<title>feat(suspend): re-enable suspend on s2idle, drop diagnostic scaffolding</title>
<updated>2026-05-29T10:18:15Z</updated>
<author>
<name>sommerfeld</name>
<email>sommerfeld@sommerfeld.dev</email>
</author>
<published>2026-05-29T10:18:15Z</published>
<link rel='alternate' type='text/html' href='https://git.sommerfeld.dev/dotfiles/commit/?id=6e0c5c33438e5e898bd075c33a45b3abf9d1b26b'/>
<id>urn:sha1:6e0c5c33438e5e898bd075c33a45b3abf9d1b26b</id>
<content type='text'>
Confirmed root cause: this hardware's S3 (deep) firmware path triggers a
fatal wake-from-suspend hang only on linux-hardened. INIT_ON_FREE + slab
hardening + tighter locking turn a latent driver race that stock linux
gets away with into an unrecoverable panic so early the journal isn't
even flushed. mem_sleep_default=s2idle bypasses the BIOS S3 path
entirely (s2idle is a pure-software suspend state; S0ix is the platform
idle state it can reach) and suspends/resumes reliably under hardened.

This is a widespread Lenovo S3 firmware issue across post-2018
ThinkPads (see Ubuntu T560, X1C9/10/11 reports). Lenovo themselves
moved newer firmwares to s2idle-only. Not a linux-hardened bug per se;
just hardened being a strict enough kernel to make the bug fatal.

Keep:
* mem_sleep_default=s2idle in etc/kernel/cmdline-linux-hardened.tmpl
  (only the hardened UKI; stock linux keeps unchanged shared cmdline)

Revert (all the diagnostic / speculative scaffolding from the last
few commits):
* MODULES=(intel_lpss_pci) → MODULES=()  — Arch wiki touchpad fix was
  not the cause here
* nmi_watchdog=panic softlockup_panic=1 panic=10 — only needed to
  auto-reboot during diagnosis
* no_console_suspend — diagnostic-only
* etc/systemd/logind.conf.d/20-no-suspend.conf  — masking workaround
* sleep-target masking block in run_onchange_after_deploy-etc.sh.tmpl,
  replaced with a one-shot cleanup that removes any leftover
  /dev/null symlinks from systems that ran the previous version
* systemd-pstore.service from systemd-units/system.txt — added only to
  catch the diagnostic panic
* diagnose-suspend.sh helper (and its .gitignore/.chezmoiignore entries)
* sway suspend → lock-session keybind workaround
* power-menu.sh Suspend entry restoration
* KEYBINDS.md docs
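
One way to confirm the kept mem_sleep_default=s2idle setting took
effect is to read /sys/power/mem_sleep, where the kernel lists the
available suspend modes and puts the active one in brackets. A minimal
sketch, not part of this commit; the function name active_sleep_mode
and its optional file argument are hypothetical:

```shell
# Print the suspend mode the kernel marks as active, e.g. "s2idle".
# Hypothetical helper: the optional argument lets it read a copy of
# the file instead of the live /sys/power/mem_sleep.
active_sleep_mode() {
    file=${1:-/sys/power/mem_sleep}
    # The active mode is the token wrapped in square brackets.
    sed -n 's/.*\[\([^]]*\)\].*/\1/p' "$file"
}
```

On the hardened UKI this would be expected to print s2idle; the stock
kernel keeps whatever default the shared cmdline and firmware give it.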
</content>
</entry>
<entry>
<title>fix(suspend): switch hardened to s2idle, keep console alive, archive pstore</title>
<updated>2026-05-29T10:18:14Z</updated>
<author>
<name>sommerfeld</name>
<email>sommerfeld@sommerfeld.dev</email>
</author>
<published>2026-05-29T10:18:14Z</published>
<link rel='alternate' type='text/html' href='https://git.sommerfeld.dev/dotfiles/commit/?id=ad8e14860fa0ca978f5ef6e02860d24f5e39c361'/>
<id>urn:sha1:ad8e14860fa0ca978f5ef6e02860d24f5e39c361</id>
<content type='text'>
Previous attempt (early-loading intel_lpss_pci) did not fix the wake-from-suspend
panic on linux-hardened. The journal of the failed boot ends cleanly at the
last sync with no panic, oops, or even 'PM: suspend entry' message — the kernel
dies so fast nothing is flushed, even with panic=10 + watchdog knobs.

Three changes to make progress:

* mem_sleep_default=s2idle: switch from S3 'deep' (broken firmware path on
  Coffee Lake ThinkPads) to s2idle / S0ix. Many Lenovo machines only suspend
  reliably via s2idle; the stock linux kernel may be masking the issue elsewhere.
* no_console_suspend: keep console alive across the suspend/resume cycle so
  the panic actually prints somewhere visible, instead of being eaten when
  the framebuffer goes dark.
* systemd-pstore.service: archive /sys/fs/pstore/* to /var/lib/systemd/pstore/
  on every boot, so the next panic (if EFI variables capture it) survives.

Drop 'quiet' from hardened cmdline so console messages are visible.
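
To check after a reboot that the running kernel actually received these
parameters, the booted command line in /proc/cmdline can be scanned word by
word. A rough sketch, not part of this commit; the function name cmdline_has
and its optional file argument are hypothetical:

```shell
# Succeed if the given parameter appears as a whole word on the
# kernel command line. Hypothetical helper: the second argument
# defaults to the live /proc/cmdline.
cmdline_has() {
    param=$1
    file=${2:-/proc/cmdline}
    # Unquoted expansion splits the line on whitespace; this assumes
    # no parameter contains shell glob characters.
    for word in $(cat "$file"); do
        if [ "$word" = "$param" ]; then
            return 0
        fi
    done
    return 1
}
```

Under this change, cmdline_has mem_sleep_default=s2idle and
cmdline_has no_console_suspend should succeed on the hardened UKI, and
cmdline_has quiet should fail.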
</content>
</entry>
<entry>
<title>fix(suspend): load intel_lpss_pci from initramfs (Arch wiki touchpad fix)</title>
<updated>2026-05-29T10:18:14Z</updated>
<author>
<name>sommerfeld</name>
<email>sommerfeld@sommerfeld.dev</email>
</author>
<published>2026-05-29T10:18:14Z</published>
<link rel='alternate' type='text/html' href='https://git.sommerfeld.dev/dotfiles/commit/?id=be5f8a2e6be3af4963399bb7f994f76d76b3a239'/>
<id>urn:sha1:be5f8a2e6be3af4963399bb7f994f76d76b3a239</id>
<content type='text'>
Symptoms (Intel CPU + linux-hardened + blinking caps lock + hard
hang on resume from S3) are a direct match for the Arch wiki entry:

  https://wiki.archlinux.org/title/Power_management/Suspend_and_hibernate#Touchpad_causes_a_kernel_panic_on_resume
  https://bbs.archlinux.org/viewtopic.php?id=231881

When intel_lpss_pci is loaded late (via udev after userspace is up),
the touchpad/I2C controller it parents can be torn down by suspend
before the module's resume callback is registered, leading to a
NULL-deref panic during resume. The kernel never makes it far enough
to flush logs — which matches our 'PM: suspend entry (deep)' being
the last journal line.

Fix: load intel_lpss_pci from the initramfs so it's available before
the suspend/resume code path runs.

Why this only bites linux-hardened: the hardening config enables
INIT_ON_FREE, slab freelist hardening, page poisoning, and stricter
pointer validation, which turn what's a silent UAF on stock linux
into an immediate panic on hardened. Stock 'just works' by accident.

Also drop the speculative init_on_free=0 from the hardened cmdline
now that we have a targeted hypothesis. Keep nmi_watchdog=panic +
softlockup_panic=1 + panic=10 as belt-and-braces: if this fix is
wrong, the next hang will auto-reboot with a usable panic log in
'journalctl -b -1 -k' instead of needing the power button again.
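
A quick sanity check that the module is registered with the running
kernel is to look for its directory under /sys/module. This is only a
sketch with a hypothetical function name, and it only shows that the
module is present, not that it was loaded from the initramfs rather
than later by udev:

```shell
# Report whether a module has an entry under /sys/module.
# Hypothetical helper: the second argument defaults to /sys/module.
module_present() {
    name=$1
    root=${2:-/sys/module}
    if [ -d "$root/$name" ]; then
        echo "$name: present"
    else
        echo "$name: absent"
        return 1
    fi
}
```

Usage would be module_present intel_lpss_pci, run as early in boot as
is practical.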
</content>
</entry>
<entry>
<title>feat(suspend): hardened-only init_on_free=0 + hang-detection cmdline</title>
<updated>2026-05-29T10:18:14Z</updated>
<author>
<name>sommerfeld</name>
<email>sommerfeld@sommerfeld.dev</email>
</author>
<published>2026-05-29T10:18:14Z</published>
<link rel='alternate' type='text/html' href='https://git.sommerfeld.dev/dotfiles/commit/?id=e2a7a2fdb9ba66e777ec1a8c0d3c9301cc21bdab'/>
<id>urn:sha1:e2a7a2fdb9ba66e777ec1a8c0d3c9301cc21bdab</id>
<content type='text'>
Split the hardened UKI cmdline off the shared etc/kernel/cmdline.tmpl
so we can carry workarounds without poking the stock linux build.

Daily-driving linux-hardened on this hardware has reliably hung on
resume from S3: black screen, blinking caps-lock + power LED, only
the power button helps. The kernel journal stops at 'PM: suspend
entry (deep)' with nothing after, so the freeze is below the level
where logs can flush — characteristic of a hard hang inside a device
driver's suspend/resume callback rather than a userspace bug.

linux-hardened defaults to init_on_free=1, which zeroes pages on free.
On Intel + iwlwifi/i915/nvme stacks this routinely surfaces latent
UAFs as suspend hangs that are invisible on stock linux. Drop that
knob to 0 for the hardened cmdline as the working hypothesis.

Add nmi_watchdog=panic, softlockup_panic=1, panic=10 so if the next
attempt still wedges, a stuck CPU self-panics and auto-reboots
within ~10s, giving us a 'journalctl -b -1 -k' trace to look at
instead of having to force-power-off blindly.

Stock linux is untouched.
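
The hang-detection parameters can be checked at runtime through their
sysctl files under /proc/sys/kernel. A minimal sketch with a
hypothetical function name, assuming nmi_watchdog=panic is reflected
in hardlockup_panic; knobs that the running kernel does not expose are
reported as unavailable:

```shell
# Print the panic-related sysctl values, one per line.
# Hypothetical helper: the argument defaults to /proc/sys/kernel.
show_panic_knobs() {
    root=${1:-/proc/sys/kernel}
    for knob in panic softlockup_panic hardlockup_panic; do
        if [ -r "$root/$knob" ]; then
            printf '%s=%s\n' "$knob" "$(cat "$root/$knob")"
        else
            printf '%s=unavailable\n' "$knob"
        fi
    done
}
```

With this cmdline, panic=10 and softlockup_panic=1 are the values to
look for.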
</content>
</entry>
</feed>
