From 5c241d65ed4a6ec2bc3e5d75d6858ed6722f1b17 Mon Sep 17 00:00:00 2001
From: sommerfeld
Date: Wed, 13 May 2026 13:43:23 +0100
Subject: feat(sway): add dictate (whisper.cpp) and ocr (tesseract) keybinds

Push-to-talk dictation toggle on Super+i: parecord captures 16 kHz mono
WAV, whisper-cli transcribes (auto language), output is typed via wtype
and copied to the clipboard.

Region OCR on Super+Shift+o: slurp + grim feed tesseract (eng+por),
result lands in the clipboard with a notification preview.

Adds wtype to wayland.txt; tesseract (+eng/por data) and whisper.cpp +
the large-v3-turbo-q5_0 model package to extra.txt.
---
 meta/extra.txt   | 9 +++++++++
 meta/wayland.txt | 3 +++
 2 files changed, 12 insertions(+)
(limited to 'meta')

diff --git a/meta/extra.txt b/meta/extra.txt
index f6082d9..fb87c3d 100644
--- a/meta/extra.txt
+++ b/meta/extra.txt
@@ -3,3 +3,12 @@ pandoc-bin
 syncthing
 udisks2
 autenticacao-gov-pt-bin
+
+# OCR (used by ~/.local/bin/ocr)
+tesseract
+tesseract-data-eng
+tesseract-data-por
+
+# Speech-to-text (used by ~/.local/bin/dictate)
+whisper.cpp
+whisper.cpp-model-large-v3-turbo-q5_0
diff --git a/meta/wayland.txt b/meta/wayland.txt
index fa0f26f..91d68b4 100644
--- a/meta/wayland.txt
+++ b/meta/wayland.txt
@@ -30,6 +30,9 @@ grim
 slurp
 wf-recorder
 
+# Wayland typing (used by dictate, etc)
+wtype
+
 # Image viewer
 imv
 
-- 
cgit v1.3.1
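
Note: the ocr and dictate scripts themselves are outside this path-limited diff. As a rough illustration of the pipelines the commit message describes, a minimal sketch might look like the following. The function names, the WHISPER_MODEL variable, and the exact flag choices are assumptions for illustration, not the author's scripts, and the start/stop toggle logic is left out.

```shell
#!/bin/sh
# Hypothetical sketch only: the real ~/.local/bin/ocr and ~/.local/bin/dictate
# are not included in this diff.

# Capture a user-selected screen region and print the recognised text.
ocr_region() {
    # slurp prints the selected geometry; it exits non-zero if cancelled.
    region=$(slurp) || return 1
    # grim writes the region as an image to stdout ("-");
    # tesseract reads the image from stdin and writes text to stdout.
    grim -g "$region" - | tesseract stdin stdout -l eng+por
}

# Record 16 kHz mono WAV to the given file until interrupted.
record_wav() {
    parecord --rate=16000 --channels=1 --format=s16le --file-format=wav "$1"
}

# Transcribe a WAV file with automatic language detection.
# WHISPER_MODEL is an assumed variable holding the path to the model file;
# the installed location depends on the model package.
transcribe() {
    if [ -z "${WHISPER_MODEL:-}" ]; then
        echo "WHISPER_MODEL is not set" >&2
        return 1
    fi
    # -l auto: detect language; -nt: no timestamps; -np: print only the result.
    whisper-cli -m "$WHISPER_MODEL" -f "$1" -l auto -nt -np
}
```

With these in place, something like "ocr_region | wl-copy" or "transcribe /tmp/dictate.wav | wtype -" would cover the clipboard and typing steps. wl-copy is an assumed clipboard tool here; the commit message does not name one.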