path: root/etc/kernel
Commit message | Author | Age | Files | Lines
* feat(suspend): re-enable suspend on s2idle, drop diagnostic scaffolding | sommerfeld | 3 days | 1 | -1/+1
  Confirmed root cause: this hardware's S3 (deep) firmware path triggers a
  fatal wake-from-suspend hang only on linux-hardened. INIT_ON_FREE + slab
  hardening + tighter locking turn a latent driver race that stock linux
  gets away with into an unrecoverable panic so early the journal isn't
  even flushed.

  mem_sleep_default=s2idle bypasses the BIOS S3 path entirely (s2idle is a
  kernel-driven suspend-to-idle path that reaches S0ix where the platform
  supports it) and suspends/resumes reliably under hardened.

  This is a widespread Lenovo S3 firmware issue across post-2018 ThinkPads
  (see Ubuntu T560, X1C9/10/11 reports). Lenovo themselves moved newer
  firmwares to s2idle-only. Not a linux-hardened bug per se; hardened is
  just strict enough to make the bug fatal.

  Keep:
  * mem_sleep_default=s2idle in etc/kernel/cmdline-linux-hardened.tmpl
    (only the hardened UKI; stock linux keeps the unchanged shared cmdline)

  Revert (all the diagnostic / speculative scaffolding from the last few
  commits):
  * MODULES=(intel_lpss_pci) → MODULES=(): the Arch wiki touchpad fix was
    not the cause here
  * nmi_watchdog=panic softlockup_panic=1 panic=10: only needed to
    auto-reboot during diagnosis
  * no_console_suspend: diagnostic-only
  * etc/systemd/logind.conf.d/20-no-suspend.conf: masking workaround
  * sleep-target masking block in run_onchange_after_deploy-etc.sh.tmpl,
    replaced with a one-shot cleanup that removes any leftover /dev/null
    symlinks from systems that ran the previous version
  * systemd-pstore.service from systemd-units/system.txt: added only to
    catch the diagnostic panic
  * diagnose-suspend.sh helper (and its .gitignore/.chezmoiignore entries)
  * sway suspend → lock-session keybind workaround
  * power-menu.sh Suspend entry restoration
  * KEYBINDS.md docs
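The one-shot cleanup mentioned in the revert list could look roughly like the sketch below. The function name, the list of units, and the default directory are assumptions for illustration; the actual block in run_onchange_after_deploy-etc.sh.tmpl is not shown in this log.

```shell
# Remove unit "masks" (symlinks to /dev/null) left behind by an earlier
# deploy. Hypothetical helper: the unit list and default directory are
# assumptions, not taken from the repository.
unmask_sleep_targets() {
    unit_dir=${1:-/etc/systemd/system}
    for unit in sleep.target suspend.target hibernate.target hybrid-sleep.target; do
        link="$unit_dir/$unit"
        # Only touch symlinks that point at /dev/null; leave real overrides alone.
        if [ -L "$link" ] && [ "$(readlink "$link")" = "/dev/null" ]; then
            rm -f -- "$link"
        fi
    done
}
```

On a live system this would need root, and a `systemctl daemon-reload` afterwards so systemd notices the units are no longer masked.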
* fix(suspend): switch hardened to s2idle, keep console alive, archive pstore | sommerfeld | 3 days | 1 | -1/+1
  Previous attempt (early-loading intel_lpss_pci) did not fix the
  wake-from-suspend panic on linux-hardened. The journal of the failed boot
  ends cleanly at the last sync with no panic, oops, or even a
  'PM: suspend entry' message; the kernel dies so fast nothing is flushed,
  even with panic=10 + watchdog knobs.

  Three changes to make progress:
  * mem_sleep_default=s2idle: switch S3 'deep' (broken firmware path on
    Coffee Lake ThinkPads) to s2idle / s0ix. Many Lenovo machines only
    suspend reliably via s2idle; the stock linux kernel may be masking the
    issue elsewhere.
  * no_console_suspend: keep console alive across the suspend/resume cycle
    so the panic actually prints somewhere visible, instead of being eaten
    when the framebuffer goes dark.
  * systemd-pstore.service: archive /sys/fs/pstore/* to
    /var/lib/systemd/pstore/ on every boot, so the next panic (if EFI
    variables capture it) survives.

  Drop 'quiet' from hardened cmdline so console messages are visible.
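After booting with mem_sleep_default=s2idle, the selected mode can be read back from /sys/power/mem_sleep, where the kernel marks the active entry with square brackets (for example `[s2idle] deep`). A small sketch for extracting it; `active_mem_sleep` is a hypothetical helper, not something from this repository:

```shell
# Print the active suspend mode from mem_sleep-style text,
# e.g. "s2idle [deep]" or "[s2idle] deep".
active_mem_sleep() {
    # Keep only what sits between the square brackets.
    printf '%s\n' "$1" | sed -n 's/.*\[\([^]]*\)\].*/\1/p'
}

# Typical use on a live system:
#   active_mem_sleep "$(cat /sys/power/mem_sleep)"
```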
* fix(suspend): load intel_lpss_pci from initramfs (Arch wiki touchpad fix) | sommerfeld | 3 days | 1 | -1/+1
  Symptoms (Intel CPU + linux-hardened + blinking caps lock + hard hang on
  resume from S3) are a direct match for the Arch wiki entry:

  https://wiki.archlinux.org/title/Power_management/Suspend_and_hibernate#Touchpad_causes_a_kernel_panic_on_resume
  https://bbs.archlinux.org/viewtopic.php?id=231881

  When intel_lpss_pci is loaded late (via udev after userspace is up), the
  touchpad/I2C controller it parents can be torn down by suspend before the
  module's resume callback is registered, leading to a NULL-deref panic
  during resume. The kernel never makes it far enough to flush logs, which
  matches our 'PM: suspend entry (deep)' being the last journal line.

  Fix: load intel_lpss_pci from the initramfs so it's available before the
  suspend/resume code path runs.

  Why this only bites linux-hardened: the hardening config enables
  INIT_ON_FREE, slab freelist hardening, page poisoning, and stricter
  pointer validation, which turn what's a silent UAF on stock linux into an
  immediate panic on hardened. Stock 'just works' by accident.

  Also drop the speculative init_on_free=0 from the hardened cmdline now
  that we have a targeted hypothesis. Keep nmi_watchdog=panic +
  softlockup_panic=1 + panic=10 as belt-and-braces: if this fix is wrong,
  the next hang will auto-reboot with a usable panic log in
  'journalctl -b -1 -k' instead of needing the power button again.
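For reference, loading a module from the initramfs under mkinitcpio comes down to listing it in MODULES. The excerpt below is a sketch that assumes the standard /etc/mkinitcpio.conf location; the rest of the repository's file is not shown in this log.

```shell
# /etc/mkinitcpio.conf (excerpt; location assumed)
# Modules listed here are included in the initramfs and loaded early,
# before udev would otherwise pull them in.
MODULES=(intel_lpss_pci)
```

The images need to be regenerated (for example with `sudo mkinitcpio -P`) before the change takes effect.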
* feat(suspend): hardened-only init_on_free=0 + hang-detection cmdline | sommerfeld | 3 days | 1 | -0/+1
  Split the hardened UKI cmdline off the shared etc/kernel/cmdline.tmpl so
  we can carry workarounds without poking the stock linux build.

  Daily-driving linux-hardened on this hardware has reliably hung on resume
  from S3: black screen, blinking caps-lock + power LED, only the power
  button helps. The kernel journal stops at 'PM: suspend entry (deep)' with
  nothing after, so the freeze is below the level where logs can flush,
  which is characteristic of a hard hang inside a device driver's
  suspend/resume callback rather than a userspace bug.

  linux-hardened defaults to init_on_free=1, which zeroes pages on free. On
  Intel + iwlwifi/i915/nvme stacks this routinely surfaces latent UAFs as
  suspend hangs that are invisible on stock linux. Drop that knob to 0 for
  the hardened cmdline as the working hypothesis.

  Add nmi_watchdog=panic, softlockup_panic=1, panic=10 so if the next
  attempt still wedges, a stuck CPU self-panics and auto-reboots within
  ~10s, giving us a 'journalctl -b -1 -k' trace to look at instead of
  having to force-power-off blindly.

  Stock linux is untouched.
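One way to confirm that the hang-detection knobs actually reached the running kernel is to look for them in /proc/cmdline. A minimal sketch; `cmdline_has` is a hypothetical helper and not part of this repository:

```shell
# Succeed if the exact parameter $1 appears as a whole word in the
# space-separated command line $2.
cmdline_has() {
    case " $2 " in
        *" $1 "*) return 0 ;;
        *) return 1 ;;
    esac
}

# Typical use on a live system:
#   cmdline_has panic=10 "$(cat /proc/cmdline)" && echo "panic=10 is set"
```

Matching on whole words avoids treating `panic=10` as evidence that `panic=1` is present.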
* fix(etc): restrict lsblk to the parent device only | sommerfeld | 2026-05-13 | 1 | -1/+1
  lsblk without -d lists the partition AND its children, so on a LUKS setup
  the second line (the mapper's UUID) was leaking into the rendered cmdline
  and deploy script. Add -d so only the partition's own UUID is emitted.
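The fix can be sketched as a small wrapper around lsblk. `partition_uuid` is a hypothetical name and the exact column options used in the repository are not shown in this log; `-d`, `-n` and `-o UUID` are standard lsblk flags.

```shell
# Print only the UUID of the named partition (e.g. nvme0n1p2).
# -d (--nodeps) skips child devices such as the opened LUKS mapper,
# -n drops the header row, -o UUID selects the single column.
partition_uuid() {
    lsblk -d -n -o UUID "/dev/$1"
}
```

Without `-d`, the same call on an unlocked LUKS partition prints one UUID per device in the tree, which is how the mapper's UUID ended up in the cmdline.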
* feat(etc): template kernel cmdline, derive LUKS UUID from partition name | sommerfeld | 2026-05-13 | 2 | -1/+1
  Prompt once at 'chezmoi init' time for the LUKS root partition (e.g.
  nvme0n1p2) and store it under [data].luksRootPartition in the per-machine
  chezmoi config. etc/kernel/cmdline.tmpl resolves the UUID at apply time
  via lsblk, so reinstalls only require re-entering the partition name.

  The etc deploy script now renders *.tmpl sources through
  'chezmoi execute-template' and installs them without the suffix. The
  resolved UUID is folded into the onchange hash so the script re-runs when
  the UUID changes even if etc/ content is unchanged.

  just etc-status/diff transparently handle .tmpl sources (strip suffix for
  the live-path mapping, render before diffing). etc-re-add skips .tmpl
  files since template sources can't be reverse-rendered from the live
  file.
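The suffix handling described above can be illustrated with plain parameter expansion. This is a sketch under the assumption that sources under etc/ map one-to-one onto /etc; `live_path` is a hypothetical helper, not the repository's code.

```shell
# Map a source path under etc/ to its live path under /etc,
# dropping a trailing .tmpl suffix if present.
live_path() {
    rel=${1#etc/}
    printf '/etc/%s\n' "${rel%.tmpl}"
}

# A deploy step could then render a template before installing it, e.g.:
#   chezmoi execute-template < etc/kernel/cmdline.tmpl
# and write the result to "$(live_path etc/kernel/cmdline.tmpl)".
```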
* feat(boot): switch to systemd initramfs + rd.luks.name cmdline | sommerfeld | 2026-05-13 | 1 | -1/+1
  Prerequisite for TPM2 LUKS unlock. systemd-cryptenroll stores TPM hints
  in LUKS2 token metadata, so no cmdline options are needed beyond
  rd.luks.name (sd-encrypt auto-discovers enrolled tokens).

  After chezmoi apply: sudo mkinitcpio -P && sudo sbctl verify, then
  reboot. Passphrase still works; TPM enrollment is a separate step.
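For orientation, rd.luks.name takes the form `rd.luks.name=<LUKS UUID>=<mapper name>`. The sketch below assembles such a fragment; the mapper name `root`, the `rw` flag, and the helper name `luks_cmdline` are assumptions, since the repository's actual cmdline is not shown in this log.

```shell
# Build the LUKS-related part of a kernel command line for sd-encrypt.
# $1: UUID of the LUKS partition, $2: mapper name (assumed default: root).
luks_cmdline() {
    uuid=$1
    name=${2:-root}
    printf 'rd.luks.name=%s=%s root=/dev/mapper/%s rw\n' "$uuid" "$name" "$name"
}
```

The UUID here is that of the encrypted partition itself, not of the filesystem inside the opened mapper device.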
* efistub -> UKI migration | sommerfeld | 2026-04-21 | 1 | -0/+1
  Track /etc/kernel/cmdline and enable default_uki/fallback_uki in
  linux.preset. Remove create-efi helper (UKI is self-contained; only
  needed once at install time). Update bootstrap to print the one-off
  efibootmgr command instead of launching create-efi.
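Enabling UKI output in a mkinitcpio preset typically means setting the `*_uki` variables. The excerpt below is a sketch modelled on the stock Arch linux.preset; the output paths are assumptions and may differ from what this repository uses.

```shell
# /etc/mkinitcpio.d/linux.preset (excerpt; paths are assumed)
ALL_kver="/boot/vmlinuz-linux"
PRESETS=('default' 'fallback')

# With *_uki set, mkinitcpio writes a unified kernel image and embeds
# the kernel command line from /etc/kernel/cmdline into it.
default_uki="/efi/EFI/Linux/arch-linux.efi"
fallback_uki="/efi/EFI/Linux/arch-linux-fallback.efi"
fallback_options="-S autodetect"
```

Because the command line is baked into the image, changing /etc/kernel/cmdline only takes effect after the UKI is rebuilt.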