aboutsummaryrefslogtreecommitdiffstatshomepage
path: root/dot_local/bin/executable_dictate
Commit message (Collapse)AuthorAgeFilesLines
* perf(dictate): switch default model to base for ~5x speedupLibravatar sommerfeld2026-05-131-2/+2
| | | | | | | | | | large-v3-turbo-q5_0 ran ~1-2x realtime on the T490's CPU, making push-to-talk feel sluggish. The base multilingual model is ~142 MB (vs 547 MB) and runs ~7-10x realtime, dropping perceived latency on short utterances from a few seconds to near-instant. Quality on short EN/PT dictation remains usable; bump WHISPER_MODEL to small or large-v3-turbo if accuracy matters more than latency.
* feat(sway): add dictate (whisper.cpp) and ocr (tesseract) keybindsLibravatar sommerfeld2026-05-131-0/+87
Push-to-talk dictation toggle on Super+i: parecord captures 16 kHz mono WAV, whisper-cli transcribes (auto language), output is typed via wtype and copied to the clipboard. Region OCR on Super+Shift+o: slurp + grim feed tesseract (eng+por), result lands in the clipboard with a notification preview. Adds wtype to wayland.txt; tesseract (+eng/por data) and whisper.cpp + the large-v3-turbo-q5_0 model package to extra.txt.