| Commit message (Collapse) | Author | Age | Files | Lines | |
|---|---|---|---|---|---|
| * | perf(dictate): switch default model to base for ~5x speedup | 2026-05-13 | 1 | -2/+2 | |
| | | | | | | | | | | | large-v3-turbo-q5_0 ran ~1-2x realtime on the T490's CPU, making push-to-talk feel sluggish. The base multilingual model is ~142 MB (vs 547 MB) and runs ~7-10x realtime, dropping perceived latency on short utterances from a few seconds to near-instant. Quality on short EN/PT dictation remains usable; bump WHISPER_MODEL to small or large-v3-turbo if accuracy matters more than latency. | ||||
| * | feat(sway): add dictate (whisper.cpp) and ocr (tesseract) keybinds | 2026-05-13 | 1 | -0/+87 | |
| Push-to-talk dictation toggle on Super+i: parecord captures 16 kHz mono WAV, whisper-cli transcribes (auto language), output is typed via wtype and copied to the clipboard. Region OCR on Super+Shift+o: slurp + grim feed tesseract (eng+por), result lands in the clipboard with a notification preview. Adds wtype to wayland.txt; tesseract (+eng/por data) and whisper.cpp + the large-v3-turbo-q5_0 model package to extra.txt. | |||||
