That's a limitation of the original Whisper model. There are derivative projects, such as WhisperX, that employ other techniques (e.g. wav2vec 2.0) to try to improve upon this.
Hello, I am using whisper.cpp to create .SRT subtitle files from audio. Everything is working beautifully, except the timestamps are always on one-second boundaries. In all of the examples I see online, the start/end times of spoken sentences seem to have sub-second accuracy.
Is there a setting that controls this?
My setup:
Command:
Sample output:
As you can see, every line is output as if it were spoken exactly on a one-second boundary. Is this fixable? What have I done wrong?
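For anyone who wants to check their own output for the same symptom, here is a minimal Python sketch that scans SRT text and lists the cues whose start or end time is not on a whole second. It assumes the standard SRT timing line format (`HH:MM:SS,mmm --> HH:MM:SS,mmm`); the function name `subsecond_cues` and the sample text are made up for illustration and are not part of whisper.cpp.

```python
import re

# Standard SRT timing line, e.g. "00:00:01,000 --> 00:00:04,000".
TIMING = re.compile(
    r"(\d{2,}):(\d{2}):(\d{2}),(\d{3})\s*-->\s*(\d{2,}):(\d{2}):(\d{2}),(\d{3})"
)


def to_ms(hours, minutes, seconds, millis):
    """Convert the four timestamp fields to a total in milliseconds."""
    return ((hours * 60 + minutes) * 60 + seconds) * 1000 + millis


def subsecond_cues(srt_text):
    """Return (start_ms, end_ms) for each cue not aligned to whole seconds."""
    cues = []
    for match in TIMING.finditer(srt_text):
        fields = [int(g) for g in match.groups()]
        start = to_ms(*fields[:4])
        end = to_ms(*fields[4:])
        # A non-zero remainder means the cue has sub-second precision.
        if start % 1000 or end % 1000:
            cues.append((start, end))
    return cues


# Hypothetical sample, not real whisper.cpp output.
SAMPLE = """1
00:00:00,000 --> 00:00:03,000
Hello there.

2
00:00:03,000 --> 00:00:05,480
This cue ends mid-second.
"""

if __name__ == "__main__":
    print(subsecond_cues(SAMPLE))
```

If the function returns an empty list for a whole file, every cue sits on a one-second boundary, which matches what I am seeing.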
Thanks in advance, happy to provide more info...