From d643aaf8a3a3fc8a31f39cb2ae3eaaf681300d4e Mon Sep 17 00:00:00 2001
From: sommerfeld
Date: Wed, 13 May 2026 13:43:23 +0100
Subject: perf(dictate): switch default model to base for ~5x speedup

large-v3-turbo-q5_0 ran ~1-2x realtime on the T490's CPU, making
push-to-talk feel sluggish. The base multilingual model is ~142 MB (vs
547 MB) and runs ~7-10x realtime, dropping perceived latency on short
utterances from a few seconds to near-instant.

Quality on short EN/PT dictation remains usable; bump WHISPER_MODEL to
small or large-v3-turbo if accuracy matters more than latency.
---
 meta/extra.txt | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

(limited to 'meta/extra.txt')

diff --git a/meta/extra.txt b/meta/extra.txt
index fb87c3d..36c311f 100644
--- a/meta/extra.txt
+++ b/meta/extra.txt
@@ -10,5 +10,7 @@ tesseract-data-eng
 tesseract-data-por
 
 # Speech-to-text (used by ~/.local/bin/dictate)
+# `base` multilingual: ~142 MB, ~7-10x realtime on a 4c CPU. Override
+# WHISPER_MODEL in the script's environment to use a different ggml model.
 whisper.cpp
-whisper.cpp-model-large-v3-turbo-q5_0
+whisper.cpp-model-base
-- 
cgit v1.3.1
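
The commit does not show ~/.local/bin/dictate itself, so the following is only a rough sketch of how such a script might honour the WHISPER_MODEL override and why the realtime factor matters for short utterances. The function names, the MODEL_DIR variable, its default directory, and the ggml-<name>.bin file layout are all assumptions made for illustration; only WHISPER_MODEL and the `base` default come from the patch.

```shell
#!/bin/sh
# Hypothetical sketch, not the actual dictate script.
# MODEL_DIR, its default, and the ggml-<name>.bin naming are assumptions.

# Resolve the model name: WHISPER_MODEL wins, otherwise fall back to `base`.
resolve_model() {
    printf '%s\n' "${WHISPER_MODEL:-base}"
}

# Build the path to the ggml file for the resolved model.
model_path() {
    printf '%s/ggml-%s.bin\n' \
        "${MODEL_DIR:-/usr/share/whisper.cpp-models}" "$(resolve_model)"
}

# Rough transcription time in seconds: audio length / realtime factor.
estimate_seconds() {
    awk -v audio="$1" -v factor="$2" 'BEGIN { printf "%.1f\n", audio / factor }'
}
```

Under these assumptions, a one-off override would look like `WHISPER_MODEL=small dictate`. Taking the figures from the commit message, `estimate_seconds 5 2` gives 2.5 for a 5-second utterance at ~2x realtime, while `estimate_seconds 5 10` gives 0.5 at ~10x realtime, which matches the "few seconds to near-instant" description.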