Good work #6

Open
michoael opened this issue Sep 15, 2023 · 18 comments

Comments

@michoael

I've been waiting for a project like this for a long time, thank you bro.
I changed the code to transcribe a WAV file chosen from storage instead of from the recorder, but when I transcribe it I get only one sentence, not the complete audio.
My WAV file works fine with Vosk. What am I doing wrong?

@nyadla-sys
Owner

Can you share the WAV file here?

@michoael
Author

I'm sorry, I can't figure out how to upload a file here.
I converted the file from MP4 to WAV with ffmpeg using the command below. I also pad it with 10 extra seconds so that a short file becomes long enough to be processed. All of this works fine with the Vosk library:

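```java
String[] c = {
    "-y", "-i", nameFile,                                // overwrite output, read input file
    "-acodec", "pcm_s16le", "-ar", "16000", "-ac", "1",  // 16-bit PCM, 16 kHz, mono
    "-af", "apad=pad_dur=10s",                           // append 10 s of silence
    lastL.getPath()                                      // output WAV path
};
```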
The file is 4 minutes long.
Sometimes I get only the first sentence, and other times I get 5 sentences (without any change to the code).

If sharing the WAV file is important, please tell me how I can send it.
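
One way to sanity-check the converted file is to read its header with Python's standard `wave` module. This is only a minimal sketch: the function names `describe_wav` and `matches_expected_format` are illustrative, and the expected values simply mirror the ffmpeg flags above (16 kHz, mono, 16-bit PCM).

```python
import wave


def describe_wav(path):
    """Return (sample_rate, channels, sample_width_bytes, duration_seconds)."""
    with wave.open(path, "rb") as wav:
        rate = wav.getframerate()
        channels = wav.getnchannels()
        width = wav.getsampwidth()
        duration = wav.getnframes() / rate
    return rate, channels, width, duration


def matches_expected_format(path, rate=16000, channels=1, width=2):
    """True if the header reports the rate, channel count and sample width
    requested by the ffmpeg flags (-ar 16000 -ac 1 -acodec pcm_s16le)."""
    actual_rate, actual_channels, actual_width, _ = describe_wav(path)
    return (actual_rate, actual_channels, actual_width) == (rate, channels, width)
```

If the reported duration is far longer than 30 seconds, as with a 4-minute recording, the format may be fine and the truncated transcript would have to come from somewhere else.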

@nyadla-sys
Owner

Could you please upload it to Google Drive and share the link?

@michoael
Author

If it redirects to the wrong link, copy and paste it to download the file directly.

@nyadla-sys
Owner

I got the file.

@nyadla-sys
Owner

I got the file and I am working on it.

@nyadla-sys
Owner

I see the output below: "become our student and get access to effective and free educational materials". I think the recording has multiple voices. Let me try it with the original PyTorch OpenAI Whisper.

@nyadla-sys
Owner

I tried your file with the original OpenAI Whisper model on Google Colab and I see the same output as above. I guess it may be due to speaker diarisation.

```
!pip install transformers
from transformers import pipeline
pipe = pipeline(task="automatic-speech-recognition", model="openai/whisper-tiny")
pipe("/content/testwo.wav")

## output
{'text': ' become our student and get access to effective and free educational materials.'}
```

@nyadla-sys
Owner

By the way, right now the model accepts only a 30-second audio clip as input; the rest of the file is ignored. To make it work for a long file, we need to split the audio into 30 s chunks and feed each one to the Whisper model.
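
The fixed-length splitting described here could be sketched as follows, using only Python's standard `wave` module. The name `split_wav` is hypothetical, and the sketch assumes an uncompressed PCM WAV such as the one produced by the ffmpeg command earlier in the thread. It cuts at exact 30 s boundaries, so it may split in the middle of a word.

```python
import wave

CHUNK_SECONDS = 30  # input window length stated in the thread


def split_wav(path, chunk_seconds=CHUNK_SECONDS):
    """Yield (start_seconds, pcm_bytes) for consecutive fixed-length chunks.

    The last chunk may be shorter than chunk_seconds.
    """
    with wave.open(path, "rb") as wav:
        rate = wav.getframerate()
        frames_per_chunk = rate * chunk_seconds
        start_frame = 0
        while True:
            data = wav.readframes(frames_per_chunk)
            if not data:
                break
            yield start_frame / rate, data
            start_frame += frames_per_chunk
```

Each yielded chunk would then be passed to the model separately, and the resulting texts joined in order of `start_seconds`.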

@michoael
Author

If you try it again 3 or 4 times, you can get:

Become our student and get access to effective and free educational materials. Where are you studying and what's your major? I am studying at Beijing University. I major in civil law. Why did you choose Beijing University?

@michoael
Author

michoael commented Sep 16, 2023

You said: "I think it has multiple voices and let me try with original pytorch openai whisper"

But sometimes I can get this result:

Become our student and get access to effective and free educational materials. Where are you studying and what's your major? I am studying at Beijing University. I major in civil law. Why did you choose Beijing University?

This result is from your model!

@nyadla-sys
Owner

nyadla-sys commented Sep 16, 2023

That is the expected result. If you need more of the audio transcribed, split the file into 30 s chunks and feed each one to the model as input; you should then get the text of the full audio.

@michoael
Author

Hmm, but how can I cut the audio into 30-second pieces without cutting in the middle of someone speaking?
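
One common heuristic, offered here only as a sketch and not as what this project does, is to look back a few seconds from the 30 s mark and cut where the signal is quietest. The function names and the default `search_s` and `frame_s` values below are assumptions chosen for illustration; `samples` is assumed to be a list of signed 16-bit sample values from a mono recording.

```python
def frame_energies(samples, frame_len):
    """Mean absolute amplitude of each consecutive frame of frame_len samples."""
    return [
        sum(abs(s) for s in samples[i:i + frame_len]) / frame_len
        for i in range(0, len(samples) - frame_len + 1, frame_len)
    ]


def quietest_cut(samples, rate, target_s=30.0, search_s=5.0, frame_s=0.02):
    """Return a sample index at or before target_s where the audio is quietest.

    Only the search_s seconds before the target are examined, so the chunk
    that ends at the returned index never exceeds target_s.
    """
    frame_len = int(rate * frame_s)
    target = min(int(target_s * rate), len(samples))
    lo = max(0, target - int(search_s * rate))
    energies = frame_energies(samples[lo:target], frame_len)
    if not energies:
        return target
    # Index of the first frame with the lowest energy in the search window.
    best = min(range(len(energies)), key=energies.__getitem__)
    # Cut in the middle of that frame.
    return lo + best * frame_len + frame_len // 2
```

A pause between sentences usually shows up as a run of low-energy frames, so the cut tends to land between words. If the window contains no pause at all, the sketch still returns the least loud frame, which may fall inside speech.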

@michoael
Author

And is there no way to change that? I hope to use long files directly, because I want to add timestamps too.
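
If the audio is processed in chunks, timestamps for the whole file can still be recovered by adding each chunk's start offset to the times reported within it. The sketch below assumes a hypothetical data shape: each chunk yields `(start, end, text)` segments timed relative to the chunk. Whether the model in this project can produce such segments is not stated in the thread.

```python
def merge_segments(chunks):
    """Shift per-chunk timestamps onto the timeline of the full recording.

    chunks: list of (chunk_start_seconds, segments), where segments is a list
    of (start, end, text) tuples timed relative to the start of that chunk.
    Returns one flat list of (start, end, text) in absolute seconds.
    """
    merged = []
    for chunk_start, segments in chunks:
        for start, end, text in segments:
            merged.append((chunk_start + start, chunk_start + end, text))
    return merged
```

For example, a segment at 1.0 to 3.0 s inside the chunk that starts at 30 s ends up at 31.0 to 33.0 s in the merged list.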

@nyadla-sys
Owner

I need to work on supporting long files.

@michoael
Author

Okay, I will wait. I hope it can work like Vosk. Thank you bro.

@michoael
Author

If you can, please tell me what you will work on: the Java code or the models? (I want to learn and understand.)

@lrq3000

lrq3000 commented Oct 7, 2023

nyadla, I think you mean the challenge here is where to split, because we don't want to split in the middle of a word, right?
