Good work #6

michoael · 2023-09-15T18:42:41Z

I've been waiting for this project for a long time, thank you bro.
I changed the code to transcribe a WAV file picked from storage instead of from the recorder, but when I try to transcribe I get only one sentence, not the complete audio.
My WAV file worked fine with Vosk. What am I doing wrong?

nyadla-sys · 2023-09-15T21:36:21Z

Can you share the WAV file here?

michoael · 2023-09-16T12:50:56Z

I'm sorry, I can't figure out how to upload a file here.
But I converted the file with ffmpeg using this command (from MP4 to WAV), and I added 10 seconds of padding so that a short file becomes longer and can be listened to in full. All of this worked fine with the Vosk library.

`String[] c = {
    "-y", "-i", nameFile, "-acodec", "pcm_s16le", "-ar", "16000", "-ac", "1", "-af", "apad=pad_dur=10s", lastL.getPath()
};`
The file is long, 4 minutes.
Sometimes I get only the first sentence, and other times I get 5 sentences (without any change to the code).

If sharing the WAV file is important, please tell me how I can send it.

nyadla-sys · 2023-09-16T12:53:35Z

Could you please upload it to Google Drive and share the link?

michoael · 2023-09-16T13:31:59Z

If it redirects to the wrong link, copy and paste the link to download the file directly.

nyadla-sys · 2023-09-16T13:32:35Z

I got the file.

nyadla-sys · 2023-09-16T13:33:05Z

I got the file and I am working on it.

nyadla-sys · 2023-09-16T13:41:38Z

I see the output below: "become our student and get access to effective and free educational materials". I think the file has multiple voices, so let me try it with the original PyTorch OpenAI Whisper.

nyadla-sys · 2023-09-16T13:49:24Z

I tried your file with the original OpenAI Whisper model on Google Colab and I see the same output as above. I guess it may be due to speaker diarization.
!pip install transformers
from transformers import pipeline
pipe = pipeline(task="automatic-speech-recognition", model="openai/whisper-tiny")
pipe("/content/testwo.wav")

##output
{'text': ' become our student and get access to effective and free educational materials.'}

nyadla-sys · 2023-09-16T13:51:38Z

By the way, right now the model accepts only a 30-second audio clip as input, and the rest of the file is ignored. To make it work for a long file, we need to split the audio into 30-second chunks and feed each one to the Whisper model.
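As a rough illustration of the chunking idea above, here is a minimal Python sketch that cuts a PCM WAV file into consecutive pieces of at most 30 seconds using only the standard library. The function name `split_wav` and the `chunk_NNN.wav` naming are hypothetical, not part of this project, and the cuts are made at fixed positions, so a word may be split across two chunks.

```python
import os
import wave

CHUNK_SECONDS = 30  # input window length mentioned in this thread


def split_wav(path, out_dir, chunk_seconds=CHUNK_SECONDS):
    """Split a PCM WAV file into consecutive chunks of at most chunk_seconds.

    Returns a list of (chunk_path, start_offset_seconds) tuples. The offsets
    can later be added to per-chunk timestamps to place text in the full file.
    """
    os.makedirs(out_dir, exist_ok=True)
    chunks = []
    with wave.open(path, "rb") as src:
        params = src.getparams()
        frames_per_chunk = src.getframerate() * chunk_seconds
        index = 0
        while True:
            frames = src.readframes(frames_per_chunk)
            if not frames:
                break  # end of file reached
            chunk_path = os.path.join(out_dir, f"chunk_{index:03d}.wav")
            with wave.open(chunk_path, "wb") as dst:
                # Same channels, sample width and rate as the source; the
                # frame count in the header is corrected when the file closes.
                dst.setparams(params)
                dst.writeframes(frames)
            chunks.append((chunk_path, index * chunk_seconds))
            index += 1
    return chunks
```

Each chunk path could then be passed to the model one at a time, for example with the `pipe(...)` call from the Colab snippet in this thread, and the resulting texts joined in order.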

michoael · 2023-09-16T13:51:55Z

If you try it again 3 or 4 times, you can get:

Become our student and get access to effective and free educational materials. Where are you studying and what's your major? I am studying at Beijing University. I major in civil law. Why did you choose Beijing University?

michoael · 2023-09-16T14:04:01Z

You said: "I think it has multiple voices and let me try with original pytorch openai whisper"

But sometimes I can get this result:

Become our student and get access to effective and free educational materials. Where are you studying and what's your major? I am studying at Beijing University. I major in civil law. Why did you choose Beijing University?

This result is from your model!

nyadla-sys · 2023-09-16T14:09:36Z

That is the expected result. If you need more of the audio transcribed, split the audio file into 30-second chunks and feed each one to the model as input. You should then get the text of the full audio.

michoael · 2023-09-16T14:14:42Z

Hmm, but how can I cut the audio into 30-second pieces without cutting in the middle of someone speaking?

michoael · 2023-09-16T14:25:34Z

And is there no way to change it? I hope to use a long file directly, because I want to add timestamps too.

nyadla-sys · 2023-09-16T14:31:48Z

I need to work on supporting long files.

michoael · 2023-09-16T14:37:58Z

Okay, I will wait, and I hope it will work like Vosk. Thank you bro.

michoael · 2023-09-16T18:27:27Z

Could you tell me what you will work on:
the Java code or the models? (I want to learn and understand.)

lrq3000 · 2023-10-07T19:53:18Z

I think, nyadla, you mean the challenge here is where to split, because we don't want to split in the middle of a word, right?
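One possible way to pick split positions, sketched here in Python under simple assumptions: look at the last few seconds of each 30-second window and cut at the quietest short frame, on the guess that low energy means a pause between words. This is a plain energy heuristic, not a real voice activity detector and not what this project implements. The function names and the 5-second search window and 20 ms frame size are illustrative choices.

```python
import array
import sys
import wave


def read_mono_pcm16(path):
    """Load a 16-bit mono PCM WAV file as (samples, sample_rate)."""
    with wave.open(path, "rb") as wav:
        if wav.getsampwidth() != 2 or wav.getnchannels() != 1:
            raise ValueError("expected 16-bit mono PCM")
        samples = array.array("h")
        samples.frombytes(wav.readframes(wav.getnframes()))
        if sys.byteorder == "big":
            samples.byteswap()  # WAV data is little-endian
        return samples, wav.getframerate()


def quietest_frame_start(samples, start, end, frame_len):
    """Return the start index of the lowest-energy frame in samples[start:end]."""
    best_index, best_energy = start, None
    for i in range(start, max(start + 1, end - frame_len + 1), frame_len):
        energy = sum(s * s for s in samples[i:i + frame_len])
        if best_energy is None or energy < best_energy:
            best_index, best_energy = i, energy
    return best_index


def find_split_points(samples, sample_rate, max_seconds=30,
                      search_seconds=5, frame_ms=20):
    """Return cut positions (sample indices) so no chunk exceeds max_seconds.

    Each cut is placed in the middle of the quietest frame found within the
    last search_seconds of the current window.
    """
    if not 0 < search_seconds < max_seconds:
        raise ValueError("search_seconds must be between 0 and max_seconds")
    max_len = max_seconds * sample_rate
    search_len = search_seconds * sample_rate
    frame_len = max(1, sample_rate * frame_ms // 1000)
    points = []
    chunk_start = 0
    while len(samples) - chunk_start > max_len:
        window_end = chunk_start + max_len
        cut = quietest_frame_start(
            samples, window_end - search_len, window_end, frame_len)
        cut += frame_len // 2  # cut in the middle of the quiet frame
        points.append(cut)
        chunk_start = cut
    return points
```

Slicing `samples` at the returned indices gives chunks that each fit in 30 seconds, and dividing an index by the sample rate gives the chunk's start time in the original file, which is what timestamps would need. If the speech has no pause in the search window, the cut still lands on the least loud frame, which may be inside a word.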
