I see in the changelog for 0.0.3 that we should be able to "Play audio segments as they are generated #26", but I'm having trouble getting that to work.
I might be doing something silly! Here are my results. It still saves a WAV file first and only then starts playing:
% python -m mlx_audio.tts.generate --model mlx-community/orpheus-3b-0.1-ft-4bit --text "Hello world" --play
Fetching 6 files: 100%|████████████████████████| 6/6 [00:00<00:00, 62601.55it/s]
Model: mlx-community/orpheus-3b-0.1-ft-4bit
Text: Hello world
Voice: None
Speed: 1.0x
Language: a
0%| | 0/1200 [00:00<?, ?it/s]mx.metal.set_wired_limt is deprecated and will be removed in a future version. Use mx.set_wired_limit instead.
mx.metal.get_peak_memory is deprecated and will be removed in a future version. Use mx.get_peak_memory instead.
mx.metal.clear_cache is deprecated and will be removed in a future version. Use mx.clear_cache instead.
10%|███▋ | 114/1200 [00:00<00:07, 143.25it/s]
==========
Duration: 00:00:01.365
Samples/sec: 0.7
Prompt: 1 tokens, 0.7 tokens-per-sec
Audio: 1 samples, 0.7 samples-per-sec
Real-time factor: 1.12x
Processing time: 1.22s
Peak memory usage: 1.92GB
✅ Audio successfully generated and saving as: audio_000.wav
Thanks for the reply, Blaizzy, and thanks for all of your work!
It would be really exciting to have streaming! I think a lot of us are working on speech-to-speech pipelines, myself included, and streaming output from the TTS is the last gap to close.
I have an M1 Ultra that can generate faster than real time with both Orpheus and CSM, and I love the results. If I could just play the first audio bytes sooner, I'd be so happy!
I attempted to bring it over to MLX myself, but their implementation appears to use some operations that MLX does not support, and that is unfortunately over my head at this time.
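To make the request concrete, here is a minimal sketch of the "play the first audio bytes sooner" pattern: a background thread plays each segment while the generator is still producing the next one. It uses only the Python standard library and does not call any mlx_audio API. `stream_play`, `fake_segments`, and the `play` callback are hypothetical names invented for illustration; in a real pipeline the segments would come from the TTS model and `play` would write to an audio device.

```python
import queue
import threading
from typing import Callable, Iterable, Iterator

_END = object()  # sentinel that tells the player thread the stream is finished


def stream_play(segments: Iterable[bytes], play: Callable[[bytes], None]) -> int:
    """Hand each segment to `play` as soon as it is produced.

    Hypothetical helper, not part of mlx_audio. Returns the number of
    segments that were played.
    """
    buf: "queue.Queue[object]" = queue.Queue()  # unbounded, so the producer never blocks
    played = 0

    def player() -> None:
        nonlocal played
        while True:
            seg = buf.get()
            if seg is _END:
                break
            play(seg)  # assumed to block until this segment has been output
            played += 1

    worker = threading.Thread(target=player, daemon=True)
    worker.start()
    for seg in segments:  # generation continues while earlier segments play
        buf.put(seg)
    buf.put(_END)
    worker.join()
    return played


def fake_segments(n: int) -> Iterator[bytes]:
    """Stand-in for a TTS generator that yields audio one segment at a time."""
    for i in range(n):
        yield f"segment-{i}".encode()
```

For example, `stream_play(fake_segments(3), print)` prints each segment as it arrives instead of waiting for all three. The queue is unbounded to keep the sketch short; a model that generates much faster than real time may want a bounded queue to cap memory.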