Blurt - a GNOME shell extension for speech to text input into any window #1834
Replies: 1 comment
Reposting my comment here since it refers to the above: In response to a feature request, I added network transcription support (using the whisper.cpp server implementation) to the Blurt GNOME extension and the BlahST speech-to-text input tools, both based on whisper.cpp. I am blown away by the larger-than-expected transcription speedup when going through the server. Before, I was getting ~30x-faster-than-real-time transcription with a local whisper.cpp instance that loaded the model file on each call. The request to the server, timed to stderr by curl itself, takes ~140 ms for a 12.5 s speech clip, which is almost 90x faster than real time. Loading the model takes about 110 ms for the "main" executable, so model loading alone does not seem to account for the difference. It seems there is an extra advantage to running a local server with the model preloaded. Any thoughts?
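To make the comparison concrete, here is a rough back-of-the-envelope sketch of the numbers above. It assumes (this is an assumption, not something measured separately) that the ~30x local figure applies to the same 12.5 s clip and already includes the ~110 ms model load; the variable names are just for illustration.

```python
def realtime_factor(audio_s, wall_s):
    """How many times faster than real time a transcription ran."""
    return audio_s / wall_s

CLIP_S = 12.5         # speech clip length quoted above
SERVER_S = 0.140      # server request time as timed by curl
LOCAL_FACTOR = 30.0   # approximate speedup of the local run (assumed to include model load)
MODEL_LOAD_S = 0.110  # model load time for the "main" executable

server_factor = realtime_factor(CLIP_S, SERVER_S)
local_s = CLIP_S / LOCAL_FACTOR                # estimated local wall time
local_without_load_s = local_s - MODEL_LOAD_S  # local time if the model were already loaded

print(f"server: {server_factor:.0f}x real time")
print(f"local: {local_s * 1000:.0f} ms, "
      f"{local_without_load_s * 1000:.0f} ms without model load")
```

Under those assumptions the local run would still need roughly 307 ms after subtracting the model load, against ~140 ms through the server, which is why the preloaded model by itself does not look like the whole story.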
Blurt is a simple GNOME shell extension that evolved from the command-line utility NoteWhispers, which is itself built around the great whisper.cpp.
whisper.cpp has become a standard tool in my Linux workflow, initially mostly for Joplin note taking, but now, thanks to this extension, in every application with an editable text field. I wanted to avoid simulating input events (frowned upon for good reasons), so one still has to use the middle mouse button to paste the transcribed text from the clipboard.
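For anyone curious what "paste with the middle mouse button" implies, the text has to land in the PRIMARY selection rather than being typed via synthetic key events. Below is a minimal sketch of that idea, not necessarily how Blurt does it: it assumes either `wl-copy` (from wl-clipboard) or `xsel` is installed, and the helper names `primary_copy_command` and `copy_to_primary` are hypothetical.

```python
import os
import shutil
import subprocess

def primary_copy_command(env=None, which=shutil.which):
    """Pick a command that writes stdin to the PRIMARY selection, or None."""
    env = os.environ if env is None else env
    # Prefer wl-copy when a Wayland session is detected and the tool exists.
    if env.get("WAYLAND_DISPLAY") and which("wl-copy"):
        return ["wl-copy", "--primary"]
    # Fall back to xsel on X11 (or XWayland) if it is available.
    if which("xsel"):
        return ["xsel", "--primary", "--input"]
    return None

def copy_to_primary(text):
    """Put text into the PRIMARY selection so a middle click pastes it."""
    cmd = primary_copy_command()
    if cmd is None:
        raise RuntimeError("neither wl-copy nor xsel was found")
    subprocess.run(cmd, input=text.encode("utf-8"), check=True)
```

Calling `copy_to_primary("transcribed text")` and then middle-clicking in a text field should paste it, with no input events simulated.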
The base model is used by default, with 30x-faster-than-real-time transcription (CUBLAS), which works out to about 300 ms of transcription time for 10 s of speech on an average machine with a new(ish) CPU. Give it a try here or at extensions.gnome.org if you use GNOME on Linux.