diff --git a/.gitignore b/.gitignore new file mode 100644 index 00000000..e43b0f98 --- /dev/null +++ b/.gitignore @@ -0,0 +1 @@ +.DS_Store diff --git a/README.md b/README.md index 7f3463ba..c20a7fd4 100644 --- a/README.md +++ b/README.md @@ -12,13 +12,20 @@ In the subdirectories of this repository you can find some examples of using the - [rtp_to_hls](https://github.com/membraneframework/membrane_demo/tree/master/rtp_to_hls) - receiving RTP stream and broadcasting it via HLS - [rtsp_to_hls](https://github.com/membraneframework/membrane_demo/tree/master/rtsp_to_hls) - receiving RTSP stream and converting it to HLS - [video_mixer](https://github.com/membraneframework/membrane_demo/tree/master/video_mixer) - how to mix audio and video files -- [speech_to_text](https://github.com/membraneframework/membrane_demo/tree/master/speech_to_text) - real-time speech recognition using [Whisper](https://github.com/openai/whisper) in [Livebook] - [webrtc_to_hls](https://github.com/jellyfish-dev/membrane_rtc_engine/tree/master/examples/webrtc_to_hls) - converting WebRTC stream into HLS - [webrtc_videoroom](https://github.com/jellyfish-dev/membrane_rtc_engine/tree/master/examples/webrtc_videoroom) - basic example of [Membrane RTC Engine](https://github.com/jellyfish-dev/membrane_rtc_engine.git). It's as simple as possible just to show you how to use our API. -- + +There are also some Livebook examples located in the [livebooks](https://github.com/membraneframework/membrane_demo/tree/master/livebooks) directory: + +- [speech_to_text](https://github.com/membraneframework/membrane_demo/tree/master/livebooks/speech_to_text) - real-time speech recognition using [Whisper](https://github.com/openai/whisper) in [Livebook](https://livebook.dev) +- [audio_mixer](https://github.com/membraneframework/membrane_demo/tree/master/livebooks/audio_mixer) - mix a beep sound into background music +- [messages_source_and_sink](https://github.com/membraneframework/membrane_demo/tree/master/livebooks/messages_source_and_sink) - set up a simple pipeline and send messages through it +- [playing_mp3_file](https://github.com/membraneframework/membrane_demo/tree/master/livebooks/playing_mp3_file) - read an MP3 file, transcode it to AAC and play it +- [rtmp](https://github.com/membraneframework/membrane_demo/tree/master/livebooks/rtmp) - send and receive an `RTMP` stream +- [soundwave](https://github.com/membraneframework/membrane_demo/tree/master/livebooks/soundwave) - plot live audio amplitude on a graph ## Copyright and License -Copyright 2018, [Software Mansion](https://swmansion.com/?utm_source=git&utm_medium=readme&utm_campaign=membrane) +Copyright 2024, [Software Mansion](https://swmansion.com/?utm_source=git&utm_medium=readme&utm_campaign=membrane) [![Software Mansion](https://logo.swmansion.com/logo?color=white&variant=desktop&width=200&tag=membrane-github)](https://swmansion.com/?utm_source=git&utm_medium=readme&utm_campaign=membrane) diff --git a/livebooks/README.md b/livebooks/README.md new file mode 100644 index 00000000..c37d88e9 --- /dev/null +++ b/livebooks/README.md @@ -0,0 +1,18 @@ +# Livebook examples + +This folder contains interactive Livebook examples. To launch them, you need to install Livebook first. + +## Installation + +It is recommended to install Livebook via the command line ([see the official installation guide](https://github.com/livebook-dev/livebook#escript)).
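+ +If you already have Elixir, the escript installation from the linked guide boils down to two commands (a sketch; make sure `~/.mix/escripts` is in your `$PATH` afterwards): + +```shell +# install Livebook as an escript and start the server +mix escript.install hex livebook +livebook server +```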
+ +If Livebook was installed directly from the official page, you should add the `$PATH` variable to the Livebook environment: +![Setting path](./assets/path_set.png "Setting PATH") \ No newline at end of file diff --git a/livebooks/assets/path_set.png b/livebooks/assets/path_set.png new file mode 100644 index 00000000..a06d753e Binary files /dev/null and b/livebooks/assets/path_set.png differ diff --git a/livebooks/audio_mixer/README.md b/livebooks/audio_mixer/README.md new file mode 100644 index 00000000..5c259d12 --- /dev/null +++ b/livebooks/audio_mixer/README.md @@ -0,0 +1,13 @@ +# Membrane audio mixer + +This livebook shows how to mix a beep sound into background music over a period of time. + +To run the demo, [install Livebook](https://github.com/livebook-dev/livebook#escript) and open the `audio_mixer.livemd` file there. + +## Copyright and License + +Copyright 2024, [Software Mansion](https://swmansion.com/?utm_source=git&utm_medium=readme&utm_campaign=membrane) + +[![Software Mansion](https://docs.membrane.stream/static/logo/swm_logo_readme.png)](https://swmansion.com/?utm_source=git&utm_medium=readme&utm_campaign=membrane) + +Licensed under the [Apache License, Version 2.0](LICENSE) diff --git a/livebooks/audio_mixer/assets/beep.aac b/livebooks/audio_mixer/assets/beep.aac new file mode 100644 index 00000000..2637c555 Binary files /dev/null and b/livebooks/audio_mixer/assets/beep.aac differ diff --git a/livebooks/audio_mixer/assets/sample_music_short.mp3 b/livebooks/audio_mixer/assets/sample_music_short.mp3 new file mode 100644 index 00000000..16280ea7 Binary files /dev/null and b/livebooks/audio_mixer/assets/sample_music_short.mp3 differ diff --git a/livebooks/audio_mixer/audio_mixer.livemd b/livebooks/audio_mixer/audio_mixer.livemd new file mode 100644 index 00000000..121ce3c2 --- /dev/null +++ b/livebooks/audio_mixer/audio_mixer.livemd @@ -0,0 +1,120 @@ +# Mixing audio files + +```elixir +File.cd(__DIR__) +Logger.configure(level: :error) + +Mix.install([ + {:membrane_core, "~> 1.0"}, + {:membrane_audio_mix_plugin, "~> 0.16.0"}, + {:membrane_file_plugin, "~> 0.16.0"}, + {:membrane_mp3_mad_plugin, "~> 0.18.0"}, + {:membrane_ffmpeg_swresample_plugin, "~> 0.19.0"}, + {:membrane_aac_fdk_plugin, "~> 0.18.0"}, + {:membrane_kino_plugin, github: "membraneframework-labs/membrane_kino_plugin", tag: "v0.3.1"}, + {:membrane_tee_plugin, "~> 0.12.0"} +]) +``` + +## Installation + +To run this demo, you need to install the native dependencies: + +1. [MP3 MAD](https://github.com/membraneframework/membrane_mp3_mad_plugin/tree/v0.14.0#installation) +2. [AAC FDK](https://github.com/membraneframework/membrane_aac_fdk_plugin#installation) +3. [SWResample FFmpeg](https://github.com/membraneframework/membrane_ffmpeg_swresample_plugin#installation) + +## Description + +This is an example of mixing multiple short "beep" sounds into background music, one by one, every second. + +## Pipeline definition + +Define all constants. + +```elixir +n_beeps = 30 +beep_filepath = "./assets/beep.aac" +background_filepath = "./assets/sample_music_short.mp3" +:ok +``` + +The "beep" sound is loaded from a file, decoded from `AAC`, and split into separate inputs using the `Tee` element. These inputs are then fed into the mixer with corresponding time offsets.
+ +```elixir +import Membrane.ChildrenSpec + +alias Membrane.{File, AAC, Tee, Time} + +beep_audio_input = + child({:file_source, :beep}, %File.Source{location: beep_filepath}) + |> child({:decoder_aac, :beep}, AAC.FDK.Decoder) + |> child(:beeps, Tee.PushOutput) + +beeps_split = + for i <- 1..n_beeps do + get_child(:beeps) + |> via_in(:input, options: [offset: Time.seconds(i)]) + |> get_child(:mixer) + end + +:ok +``` + +The background music is loaded from a file and then decoded from the `MP3` format to the appropriate `RawAudio` format. + +All mixer inputs must be of the same format. + +```elixir +import Membrane.ChildrenSpec + +alias Membrane.{File, RawAudio, MP3} +alias Membrane.FFmpeg.SWResample.Converter + +background_audio_input = + child(:file_source, %File.Source{location: background_filepath}) + |> child(:decoder_mp3, MP3.MAD.Decoder) + |> child(:converter, %Converter{ + input_stream_format: %RawAudio{channels: 2, sample_format: :s24le, sample_rate: 44_100}, + output_stream_format: %RawAudio{channels: 1, sample_format: :s16le, sample_rate: 44_100} + }) + |> get_child(:mixer) + +:ok +``` + +The mixer is created and connected directly to the audio input of the player. + +```elixir +import Membrane.ChildrenSpec + +alias Membrane.{AudioMixer, AAC, Kino} + +kino = Membrane.Kino.Player.new(audio: true) + +mixer_output = + child(:mixer, Membrane.AudioMixer) + |> child(:encoder_aac, AAC.FDK.Encoder) + |> via_in(:audio) + |> child(:player, %Kino.Player.Sink{kino: kino}) + +:ok +``` + +Finally, we assemble the whole pipeline structure. + +```elixir +spec = beeps_split ++ [beep_audio_input, background_audio_input, mixer_output] +:ok +``` + +## Playing audio + +```elixir +alias Membrane.RCPipeline, as: RC + +pipeline = RC.start!() +RC.exec_actions(pipeline, spec: spec) + +kino +``` diff --git a/livebooks/messages_source_and_sink/README.md b/livebooks/messages_source_and_sink/README.md new file mode 100644 index 00000000..f3b9c183 --- /dev/null +++ b/livebooks/messages_source_and_sink/README.md @@ -0,0 +1,13 @@ +# Membrane messages source and sink + +This livebook shows how to set up a simple pipeline and send messages through it. + +To run the demo, [install Livebook](https://github.com/livebook-dev/livebook#escript) and open the `messages_source_and_sink.livemd` file there.
+ +## Copyright and License + +Copyright 2024, [Software Mansion](https://swmansion.com/?utm_source=git&utm_medium=readme&utm_campaign=membrane) + +[![Software Mansion](https://docs.membrane.stream/static/logo/swm_logo_readme.png)](https://swmansion.com/?utm_source=git&utm_medium=readme&utm_campaign=membrane) + +Licensed under the [Apache License, Version 2.0](LICENSE) diff --git a/livebooks/messages_source_and_sink/messages_source_and_sink.livemd b/livebooks/messages_source_and_sink/messages_source_and_sink.livemd new file mode 100644 index 00000000..9c88f212 --- /dev/null +++ b/livebooks/messages_source_and_sink/messages_source_and_sink.livemd @@ -0,0 +1,140 @@ +# Messages source and sink + +```elixir +File.cd(__DIR__) +Logger.configure(level: :error) + +Mix.install([ + {:membrane_core, "~> 1.0"} +]) +``` + +## Erlang message-driven source + +```elixir +defmodule MessageSource do + use Membrane.Source + + require Membrane.Logger + + def_output_pad :output, + flow_control: :push, + accepted_format: _any + + def_options register_name: [ + description: "The name under which the element's process will be registered", + spec: atom() + ] + + + @impl true + def handle_init(_ctx, opts) do + Process.register(self(), opts.register_name) + {[], %{buffered: []}} + end + + @impl true + def handle_playing(_ctx, state) do + {actions, state} = send_buffers(state) + {[stream_format: {:output, %Membrane.RemoteStream{type: :bytestream}}] ++ actions, state} + end + + @impl true + def handle_info({:message, message}, ctx, state) do + state = %{state | buffered: state.buffered ++ [message]} + + if ctx.playback == :playing do + send_buffers(state) + else + {[], state} + end + end + + @impl true + def handle_info(msg, _ctx, state) do + Membrane.Logger.warning("Unknown message received: #{inspect(msg)}") + {[], state} + end + + defp send_buffers(state) do + actions = + Enum.map(state.buffered, fn message -> + {:buffer, {:output, %Membrane.Buffer{payload: message}}} + end) + + {actions, %{state | buffered: []}} + end +end +``` + +## Erlang message-driven sink + +```elixir +defmodule MessageSink do + use Membrane.Sink + + def_input_pad :input, + flow_control: :push, + accepted_format: _any + + def_options receiver: [ + description: "PID of the process that will receive messages from the sink", + spec: pid() + ] + + @impl true + def handle_init(_ctx, opts) do + {[], %{receiver: opts.receiver}} + end + + @impl true + def handle_buffer(:input, buffer, _ctx, state) do + send(state.receiver, {:message, self(), buffer.payload}) + {[], state} + end +end +``` + +## Pipeline definition and startup + +```elixir +alias Membrane.RCPipeline +import Membrane.ChildrenSpec + +defmodule MyPipeline do + use Membrane.Pipeline + + @impl true + def handle_init(_ctx, opts) do + spec = + child(:source, %MessageSource{register_name: :messages_source}) + |> child(:sink, %MessageSink{receiver: Keyword.get(opts, :receiver)}) + + {[spec: spec], nil} + end +end + +{:ok, _supervisor, pipeline} = Membrane.Pipeline.start(MyPipeline, receiver: self()) +payloads = 1..10 + +Task.async(fn -> + Enum.each( + payloads, + &send(:messages_source, {:message, &1}) + ) +end) + +:ok +``` + +## Printing the received messages and terminating the pipeline + +```elixir +for _i <- 1..10 do + receive do + {:message, _pid, _value} = msg -> IO.inspect(msg) + end +end + +RCPipeline.terminate(pipeline) +``` diff --git a/livebooks/playing_mp3_file/README.md b/livebooks/playing_mp3_file/README.md new file mode 100644 index 00000000..fcee8bb4 --- /dev/null +++ 
b/livebooks/playing_mp3_file/README.md @@ -0,0 +1,13 @@ +# Membrane playing mp3 file demo + +This livebook shows how to load `MP3` audio from a file, transcode it to the `AAC` codec, and play it. + +To run the demo, [install Livebook](https://github.com/livebook-dev/livebook#escript) and open the `playing_mp3_file.livemd` file there. + +## Copyright and License + +Copyright 2024, [Software Mansion](https://swmansion.com/?utm_source=git&utm_medium=readme&utm_campaign=membrane) + +[![Software Mansion](https://docs.membrane.stream/static/logo/swm_logo_readme.png)](https://swmansion.com/?utm_source=git&utm_medium=readme&utm_campaign=membrane) + +Licensed under the [Apache License, Version 2.0](LICENSE) diff --git a/livebooks/playing_mp3_file/assets/sample.mp3 b/livebooks/playing_mp3_file/assets/sample.mp3 new file mode 100644 index 00000000..4d57d779 Binary files /dev/null and b/livebooks/playing_mp3_file/assets/sample.mp3 differ diff --git a/livebooks/playing_mp3_file/playing_mp3_file.livemd b/livebooks/playing_mp3_file/playing_mp3_file.livemd new file mode 100644 index 00000000..4017a8af --- /dev/null +++ b/livebooks/playing_mp3_file/playing_mp3_file.livemd @@ -0,0 +1,82 @@ +# Playing MP3 File + +```elixir +File.cd(__DIR__) +Logger.configure(level: :error) + +Mix.install([ + {:membrane_core, "~> 1.0"}, + {:membrane_file_plugin, "~> 0.16.0"}, + {:membrane_mp3_mad_plugin, "~> 0.18.0"}, + {:membrane_ffmpeg_swresample_plugin, "~> 0.19.0"}, + {:membrane_aac_fdk_plugin, "~> 0.18.0"}, + {:membrane_kino_plugin, github: "membraneframework-labs/membrane_kino_plugin", tag: "v0.3.1"} +]) +``` + +## Installation + +To run this demo, you need to install the native dependencies: + +1. [MP3 MAD](https://github.com/membraneframework/membrane_mp3_mad_plugin/tree/v0.14.0#installation) +2. [AAC FDK](https://github.com/membraneframework/membrane_aac_fdk_plugin#installation) +3. [SWResample FFmpeg](https://github.com/membraneframework/membrane_ffmpeg_swresample_plugin#installation) + +## Description + +This is an example of loading `MP3` audio from a file, transcoding it to the `AAC` codec, and playing it via `Membrane.Kino.Player`. + +## Pipeline definition + +It defines a simple linear pipeline with the following structure: + +1. Load the `MP3` audio from a file. +2. Transcode `MP3` to `AAC` (required by `Kino.Player`): + 1. Decode the `MP3` format to `RawAudio`, + 2. Change the `sample_format` from `s24le` to `s16le` (required by `FDK.Encoder`), + 3. Encode it to the `AAC` format. +3. Feed the audio stream to the player via the `:audio` input pad.
+ +```elixir +import Membrane.ChildrenSpec, + only: [{:child, 2}, {:child, 3}, {:via_in, 2}] + +alias Membrane.{ + File, + MP3, + FFmpeg, + RawAudio, + AAC, + Kino +} + +# https://freemusicarchive.org/music/Paper_Navy/All_Grown_Up/08_Swan_Song/ +audio_path = "./assets/sample.mp3" +kino = Membrane.Kino.Player.new(audio: true) + +spec = + child(:file_source, %File.Source{location: audio_path}) + |> child(:decoder_mp3, MP3.MAD.Decoder) + |> child(:converter, %FFmpeg.SWResample.Converter{ + input_stream_format: %RawAudio{channels: 2, sample_format: :s24le, sample_rate: 44_100}, + output_stream_format: %RawAudio{channels: 2, sample_format: :s16le, sample_rate: 44_100} + }) + |> child(:encoder_aac, AAC.FDK.Encoder) + |> via_in(:audio) + |> child(:player, %Kino.Player.Sink{kino: kino}) + +:ok +``` + +## Player + +Run the pipeline: + +```elixir +alias Membrane.RCPipeline, as: RC + +pipeline = RC.start!() +RC.exec_actions(pipeline, spec: spec) + +kino +``` diff --git a/livebooks/rtmp/README.md b/livebooks/rtmp/README.md new file mode 100644 index 00000000..b9904572 --- /dev/null +++ b/livebooks/rtmp/README.md @@ -0,0 +1,15 @@ +# Membrane RTMP demo + +The sender livebook shows how to download video and audio from the web using the `Hackney` plugin and stream them via `RTMP` to the other, receiver livebook. + +The receiver livebook shows how to receive the `RTMP` stream mentioned above and play it in the livebook. + +To run the demo, [install Livebook](https://github.com/livebook-dev/livebook#escript) and open both the `rtmp_sender.livemd` and `rtmp_receiver.livemd` files there. + +## Copyright and License + +Copyright 2024, [Software Mansion](https://swmansion.com/?utm_source=git&utm_medium=readme&utm_campaign=membrane) + +[![Software Mansion](https://docs.membrane.stream/static/logo/swm_logo_readme.png)](https://swmansion.com/?utm_source=git&utm_medium=readme&utm_campaign=membrane) + +Licensed under the [Apache License, Version 2.0](LICENSE) diff --git a/livebooks/rtmp/rtmp_receiver.livemd b/livebooks/rtmp/rtmp_receiver.livemd new file mode 100644 index 00000000..b84ff3a0 --- /dev/null +++ b/livebooks/rtmp/rtmp_receiver.livemd @@ -0,0 +1,177 @@ +# RTMP Receiver + +```elixir +File.cd(__DIR__) +Logger.configure(level: :error) + +Mix.install([ + {:membrane_core, "~> 1.0"}, + {:membrane_realtimer_plugin, "~> 0.9.0"}, + {:membrane_rtmp_plugin, "~> 0.19.0"}, + {:membrane_kino_plugin, github: "membraneframework-labs/membrane_kino_plugin", tag: "v0.3.1"} +]) +``` + +## Installation + +To run this demo, you need to install the native dependencies: + +1. [H264 FFmpeg](https://github.com/membraneframework/membrane_h264_ffmpeg_plugin/#installation) + +## Description + +Defines a server that receives a media stream from the RTMP source and plays it directly in the notebook. + +## Pipeline definition + +Here's the definition of the pipeline: + +1. The RTMP source provides video and audio data. +2. The data is then parsed into suitable `H264` and `AAC` formats. +3. Finally, the media is pushed to `Kino.Player`.
+ +```elixir +defmodule RTMP.Receiver.Pipeline do + use Membrane.Pipeline + + @impl true + def handle_init(_ctx, socket: socket, kino: kino) do + source = + child(:source, %Membrane.RTMP.SourceBin{ + socket: socket + }) + + playing_audio = + get_child(:source) + |> via_out(:audio) + |> child(:audio_parser, %Membrane.AAC.Parser{ + out_encapsulation: :ADTS + }) + |> via_in(:audio) + |> get_child(:player) + + playing_video = + get_child(:source) + |> via_out(:video) + |> child(:video_parser, %Membrane.H264.Parser{ + generate_best_effort_timestamps: %{framerate: {25, 1}}, + output_stream_structure: :annexb + }) + |> via_in(:video) + |> get_child(:player) + + player = child(:player, %Membrane.Kino.Player.Sink{kino: kino}) + + spec = [source, playing_audio, playing_video, player] + {[spec: spec], %{}} + end + + # Once the source initializes, we grant it control over the TCP socket + @impl true + def handle_child_notification( + {:socket_control_needed, _socket, _source} = notification, + :source, + _ctx, + state + ) do + send(self(), notification) + + {[], state} + end + + def handle_child_notification(_notification, _child, _ctx, state) do + {[], state} + end + + @impl true + def handle_info({:socket_control_needed, socket, source} = notification, _ctx, state) do + case Membrane.RTMP.SourceBin.pass_control(socket, source) do + :ok -> + :ok + + {:error, :not_owner} -> + Process.send_after(self(), notification, 200) + end + + {[], state} + end + + # The rest of the module is used for self-termination of the pipeline after processing finishes + @impl true + def handle_element_end_of_stream(:player, _pad, _ctx, state) do + Membrane.Pipeline.terminate(self()) + {[], state} + end + + @impl true + def handle_element_end_of_stream(_child, _pad, _ctx, state) do + {[], state} + end +end + +:ok +``` + +## Server + +Receiving an RTMP stream requires creating a TCP server. After the connection is established, a pipeline is created using the TCP socket.
+ +```elixir +defmodule RTMP.Receiver do + @server_ip {127, 0, 0, 1} + + def run(port: port, kino: kino) do + parent = self() + + server_options = %Membrane.RTMP.Source.TcpServer{ + port: port, + listen_options: [ + :binary, + packet: :raw, + active: false, + ip: @server_ip + ], + socket_handler: fn socket -> + # On new connection a pipeline is started + {:ok, _supervisor, pipeline} = + Membrane.Pipeline.start(RTMP.Receiver.Pipeline, socket: socket, kino: kino) + + send(parent, {:pipeline_spawned, pipeline}) + {:ok, pipeline} + end + } + + {:ok, pipeline} = start_server(server_options) + + await_termination(pipeline) + end + + defp start_server(server_options) do + {:ok, _server_pid} = Membrane.RTMP.Source.TcpServer.start_link(server_options) + + receive do + {:pipeline_spawned, pipeline} -> + {:ok, pipeline} + end + end + + defp await_termination(pipeline) do + monitor_ref = Process.monitor(pipeline) + + receive do + {:DOWN, ^monitor_ref, :process, _pid, _reason} -> + :ok + end + end +end + +:ok +``` + +```elixir +port = 1942 + +kino = Membrane.Kino.Player.new(video: true, audio: true) +Kino.render(kino) +RTMP.Receiver.run(port: port, kino: kino) +``` diff --git a/livebooks/rtmp/rtmp_sender.livemd b/livebooks/rtmp/rtmp_sender.livemd new file mode 100644 index 00000000..f429b7e6 --- /dev/null +++ b/livebooks/rtmp/rtmp_sender.livemd @@ -0,0 +1,140 @@ +# RTMP Sender + +```elixir +File.cd(__DIR__) +Logger.configure(level: :error) + +Mix.install([ + {:membrane_core, "~> 1.0"}, + {:membrane_realtimer_plugin, "~> 0.9.0"}, + {:membrane_hackney_plugin, "~> 0.11.0"}, + {:membrane_rtmp_plugin, "~> 0.21.0"} +]) +``` + +## Installation + +To run this demo, you need to install the native dependencies: + +1. [H264 FFmpeg](https://github.com/membraneframework/membrane_h264_ffmpeg_plugin/#installation) + +## Description + +Defines a pipeline that downloads the [Big Buck Bunny](https://en.wikipedia.org/wiki/Big_Buck_Bunny) trailer video and audio from Membrane's asset samples page using the `Hackney` plugin and sends them via `RTMP` to the other livebook. + +## Pipeline definition + +To download media from the internet, we use `Hackney`. We then parse the raw data into the appropriate format, such as `H264` or `AAC`. We also regulate the stream speed with `Membrane.Realtimer`, so that the media is sent in real time, and repackage the video into the `avc1` stream structure to comply with the `RTMP.Sink` requirements. Finally, we transmit both video and audio using `RTMP` to the other livebook. Once the entire stream has been sent, the pipeline will automatically terminate.
+ +```elixir +defmodule RTMP.Sender.Pipeline do + use Membrane.Pipeline + + @samples_url "https://raw.githubusercontent.com/membraneframework/static/gh-pages/samples/big-buck-bunny/" + @video_url @samples_url <> "bun33s_480x270.h264" + @audio_url @samples_url <> "bun33s.aac" + @impl true + def handle_init(_ctx, destination: destination) do + video_source = + child(:video_source, %Membrane.Hackney.Source{ + location: @video_url, + hackney_opts: [follow_redirect: true] + }) + |> child(:video_parser, %Membrane.H264.Parser{ + output_alignment: :au, + skip_until_keyframe: true, + generate_best_effort_timestamps: %{framerate: {25, 1}} + }) + |> child(:video_realtimer, Membrane.Realtimer) + |> child(:video_payloader, %Membrane.H264.Parser{output_stream_structure: :avc1}) + + audio_source = + child(:audio_source, %Membrane.Hackney.Source{ + location: @audio_url, + hackney_opts: [follow_redirect: true] + }) + |> child(:audio_parser, %Membrane.AAC.Parser{ + out_encapsulation: :ADTS + }) + |> child(:audio_realtimer, Membrane.Realtimer) + + rtmp_sink = + child(:rtmp_sink, %Membrane.RTMP.Sink{ + rtmp_url: destination, + max_attempts: :infinity + }) + + spec = [ + video_source + |> via_in(Pad.ref(:video, 0)) + |> get_child(:rtmp_sink), + audio_source + |> via_in(Pad.ref(:audio, 0)) + |> get_child(:rtmp_sink), + rtmp_sink + ] + + {[spec: spec], %{streams_to_end: 2}} + end + + # The rest of the example module is only used for self-termination of the pipeline after processing finishes + + @impl true + def handle_element_end_of_stream(:rtmp_sink, _pad, _ctx, %{streams_to_end: 1} = state) do + Membrane.Pipeline.terminate(self()) + {[], %{state | streams_to_end: 0}} + end + + @impl true + def handle_element_end_of_stream(:rtmp_sink, _pad, _ctx, state) do + {[], %{state | streams_to_end: 1}} + end + + @impl true + def handle_element_end_of_stream(_child, _pad, _ctx, state) do + {[], state} + end +end + +:ok +``` + +## Sender + +The RTMP protocol requires client-server communication, where the TCP server receives the data and the client sends it. + +```elixir +defmodule RTMP.Sender do + def run(port: port) do + destination_url = "rtmp://localhost:#{port}" + + {:ok, pipeline} = start_tcp_client(destination_url) + + await_termination(pipeline) + end + + defp start_tcp_client(destination_url) do + {:ok, _supervisor, pipeline} = + Membrane.Pipeline.start(RTMP.Sender.Pipeline, destination: destination_url) + + {:ok, pipeline} + end + + defp await_termination(pipeline) do + monitor_ref = Process.monitor(pipeline) + + receive do + {:DOWN, ^monitor_ref, :process, _pid, _reason} -> + :ok + end + end +end + +:ok +``` + +```elixir +port = 1942 + +RTMP.Sender.run(port: port) +``` diff --git a/livebooks/soundwave/README.md b/livebooks/soundwave/README.md new file mode 100644 index 00000000..06095a7f --- /dev/null +++ b/livebooks/soundwave/README.md @@ -0,0 +1,13 @@ +# Membrane Soundwave demo + +This livebook example shows how to perform real-time soundwave plotting with the use of the [Membrane Framework](https://github.com/membraneframework) and [Vega-Lite](https://vega.github.io/vega-lite/). + +To run the demo, [install Livebook](https://github.com/livebook-dev/livebook#escript) and open the `soundwave.livemd` file there.
+ +## Copyright and License + +Copyright 2024, [Software Mansion](https://swmansion.com/?utm_source=git&utm_medium=readme&utm_campaign=membrane) + +[![Software Mansion](https://docs.membrane.stream/static/logo/swm_logo_readme.png)](https://swmansion.com/?utm_source=git&utm_medium=readme&utm_campaign=membrane) + +Licensed under the [Apache License, Version 2.0](LICENSE) diff --git a/livebooks/soundwave/soundwave.livemd b/livebooks/soundwave/soundwave.livemd new file mode 100644 index 00000000..6abdf336 --- /dev/null +++ b/livebooks/soundwave/soundwave.livemd @@ -0,0 +1,273 @@ +# Soundwave plotting example + +```elixir +File.cd(__DIR__) +Logger.configure(level: :error) + +Mix.install([ + {:membrane_core, "~> 1.0"}, + {:membrane_raw_audio_parser_plugin, "~> 0.4.0"}, + {:membrane_portaudio_plugin, "~> 0.18.0"}, + {:vega_lite, "~> 0.1.8"}, + {:kino_vega_lite, "~> 0.1.11"} +]) +``` + +## Introduction + +This livebook example shows how to perform real-time soundwave plotting with the use of the [Membrane Framework](https://github.com/membraneframework) and [Vega-Lite](https://vega.github.io/vega-lite/). + +By following this example, you will learn how to read audio from the microphone, how audio is represented, and how to create a custom Membrane element that plots the soundwave with the use of the Elixir bindings to Vega-Lite. + +## Installation + +You need to have `PortAudio` installed, as it is used to capture audio from the microphone. For installation details, take a look at the [membrane_portaudio_plugin](https://github.com/membraneframework/membrane_portaudio_plugin) readme. + +## Soundwave plotting sink + +Since no plugin in the `Membrane Framework` provides an element capable of plotting a soundwave, we need to write one on our own. +The element, called `Visualizer`, is a sink placed at the end of a pipeline. + +The element has a single `:input` pad, on which raw audio is expected to appear. + +> Raw audio is represented as an array of samples, with each sample describing the amplitude of the sound at a given time. There might be a few samples (from different channels) for the same point in time. In such a case, the samples from different channels (e.g. samples `A` from the first channel and samples `B` from the second channel) might be either interleaved (`ABABABAB`) or put one sequence after the other (`AAAABBBB`). +> +> Each sample is of a particular format, and the format is defined by: +> +> * the type of a number - e.g. `f` might stand for a `float` and `s` might stand for a `signed` integer +> * the number of bits used to represent a number +> * endianness (order of bytes) - specifies the significance of the bytes in the byte sequence (little endian or big endian). +> An example sample format is `s16le`, which stands for a signed integer written on 16 bits, with little-endian byte order. +> +> For some intuition on the formats, you can take a look at the [`Membrane.RawAudio.SampleFormat` module](https://github.com/membraneframework/membrane_raw_audio_format/blob/master/lib/membrane_raw_audio/sample_format.ex) + +### Stream format handling + +Once the `stream_format` is received on the `:input` pad, some relevant information, i.e. the number of channels and the sampling rate, is fetched from the `Membrane.RawAudio` stream format structure. Based on that information, a `VegaLite` chart is prepared. + +### Buffers handling + +Once a buffer is received, its payload is split into samples, based on the `sample_format` of the `Membrane.RawAudio`. The amplitude of sound from different channels measured at the same time is averaged.
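+ +For instance, splitting an interleaved stereo `s16le` payload into samples and averaging the channels could look like this (a simplified sketch for illustration; the element below uses `RawAudio.sample_to_value/2` instead of manual matching): + +```elixir +# two stereo frames, interleaved (A1 B1 A2 B2), 16-bit little-endian signed samples +payload = <<1000::little-16, 3000::little-16, -2000::little-16, 0::little-16>> + +samples = for <<sample::little-signed-16 <- payload>>, do: sample +# => [1000, 3000, -2000, 0] + +# average the channels to get a single amplitude per point in time +samples |> Enum.chunk_every(2) |> Enum.map(&(Enum.sum(&1) / length(&1))) +# => [2000.0, -1000.0] +``` + +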
As a result, a list of samples is produced, with each sample being an amplitude of the sound at a given time. + +That list of samples is appended to the list of unprocessed samples stored in the element's state. Right after that, the `maybe_plot` function is invoked - if there are enough samples, they are used to produce points that are put on the plot. + +### Plotting of the soundwave + +Plotting all the audio samples at the typically used sampling rates (e.g. `44100 Hz`) is impossible due to limitations of the plot displaying system. That is why the list of samples is split into several chunks, and for each of these chunks, the samples with the `maximal` and `minimal` amplitude are found. Only these two samples representing a given chunk are later put on the plot, with the `x` value being the given sample's timestamp, and the `y` value being the measured amplitude of audio. + +The following module attributes are used to drive the plotting process: + +* `@window_size` - describes the maximum number of points that are visible together on the plot, +* `@window_duration` - describes the time range (in seconds) of points visible on the plot, +* `@plot_update_frequency` - describes how many times per second the plot should be updated with new points. + We encourage you to play with these attributes and adjust them to your needs. Please be aware that setting `@window_size` or `@plot_update_frequency` too high might cause the plot to not be generated in real time. At the same time, setting too low values of these parameters might result in a loss of the plot's accuracy (for instance making it insensitive to high-frequency sounds). + +For more implementation details, take a look at the code and the comments that describe the parts that might appear unobvious. + +```elixir +defmodule Visualizer do + use Membrane.Sink + + alias Membrane.RawAudio + alias VegaLite, as: Vl + + require Membrane.Logger + + @window_size 1000 + + # seconds + @window_duration 3 + + # Hz + @plot_update_frequency 50 + + @points_per_update @window_size / (@window_duration * @plot_update_frequency) + + def_input_pad :input, + accepted_format: %RawAudio{}, + flow_control: :auto + + @impl true + def handle_init(_ctx, _opts) do + {[], + %{ + chart: nil, + initial_pts: nil, + bytes_per_sample: nil, + sample_rate: nil, + sample_format: nil, + channels: nil, + samples: [] + }} + end + + defguardp has_stream_format_arrived(ctx) when ctx.pads.input.stream_format != nil + + @impl true + def handle_stream_format(:input, stream_format, ctx, state) + when not has_stream_format_arrived(ctx) do + {_sign, bits_per_sample, _endianness} = + RawAudio.SampleFormat.to_tuple(stream_format.sample_format) + + chart = create_chart(stream_format) + Kino.render(chart) + + {[], + %{ + state + | sample_rate: stream_format.sample_rate, + sample_format: stream_format.sample_format, + channels: stream_format.channels, + bytes_per_sample: :erlang.round(bits_per_sample / 8), + chart: chart + }} + end + + @impl true + def handle_stream_format(:input, _stream_format, _ctx, state) do + Membrane.Logger.warning(":input stream format received once again, ignoring.") + {[], state} + end + + @impl true + def handle_buffer(:input, buffer, ctx, state) do + state = if state.initial_pts == nil, do: %{state | initial_pts: buffer.pts}, else: state + + bytes_per_sample = state.bytes_per_sample + + samples = + for <<sample::binary-size(bytes_per_sample) <- buffer.payload>> do + RawAudio.sample_to_value(sample, ctx.pads.input.stream_format) + end + # we need to make an average out of the samples for all the channels + |> Enum.chunk_every(state.channels) + |> Enum.map(&(Enum.sum(&1) / length(&1))) + +
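# accumulate the averaged samples; maybe_plot/2 flushes them once enough have been gathered +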
state = %{state | samples: state.samples ++ samples} + + maybe_plot(buffer.pts, state) + end + + defp maybe_plot(pts, state) do + samples_per_update = state.sample_rate / @plot_update_frequency + samples_per_point = :erlang.ceil(samples_per_update / @points_per_update) + + state = + if length(state.samples) > samples_per_update do + sample_duration = Ratio.new(1, state.sample_rate) |> Membrane.Time.seconds() + + # `*2`, because in each loop run we are producing 2 points + points = + Enum.chunk_every(state.samples, 2 * samples_per_point) + |> Enum.with_index() + |> Enum.flat_map(fn {point_samples, chunk_i} -> + Enum.with_index(point_samples) + |> Enum.min_max_by(fn {value, _sample_i} -> value end) + |> Tuple.to_list() + |> Enum.map(fn {value, sample_i} -> + # the pts of a given sample is the pts of the buffer in which it has arrived + # plus the time that has elapsed for all the previous chunks from that buffer + # plus the time for all the preceding samples from a given chunk + # minus the first buffer's pts + x = + (pts + (chunk_i * samples_per_point + sample_i) * sample_duration - + state.initial_pts) + |> Membrane.Time.as_milliseconds(:round) + + %{x: x, y: value} + end) + end) + + Kino.VegaLite.push_many(state.chart, points, window: @window_size) + %{state | samples: []} + else + state + end + + {[], state} + end + + defp create_chart(stream_format) do + Vl.new(width: 1000, height: 400, title: "Amplitude vs time") + |> Vl.mark(:line, point: true) + |> Vl.encode_field(:x, "x", title: "Time [ms]", type: :quantitative) + |> Vl.encode_field(:y, "y", + title: "Amplitude", + type: :quantitative, + scale: %{ + domain: [ + # we want the range of the domain to be slightly bigger than the range of an amplitude + RawAudio.sample_min(stream_format) * 1.1, + RawAudio.sample_max(stream_format) * 1.1 + ] + } + ) + |> Kino.VegaLite.new() + end +end +``` + +## Pipeline structure + +Once we are ready with the `Visualizer` element, we can set the pipeline up. +The pipeline will consist of: + +* a microphone input, +* a raw audio parser (we need that element to provide timestamps to the buffers), +* the `Visualizer`. + +All the elements are connected linearly. + +```elixir +import Membrane.ChildrenSpec + +spec = + child(:microphone, Membrane.PortAudio.Source) + |> child(:audio_parser, %Membrane.RawAudioParser{ + overwrite_pts?: true + }) + |> child(:visualizer, Visualizer) + +:ok +``` + +## Running the pipeline + +First, we start the `Membrane.RCPipeline` (remote-controlled pipeline): + +```elixir +alias Membrane.RCPipeline + +pipeline = RCPipeline.start!() +``` + +Then, we can run the `spec` action with the previously created pipeline structure: + +```elixir +RCPipeline.exec_actions(pipeline, spec: spec) +``` + +On the plot above, you should be able to see the relation between audio amplitude and time. + +You can terminate the pipeline with the following code: + +```elixir +RCPipeline.terminate(pipeline) +``` diff --git a/speech_to_text/README.md b/livebooks/speech_to_text/README.md similarity index 100% rename from speech_to_text/README.md rename to livebooks/speech_to_text/README.md diff --git a/speech_to_text/speech_to_text.livemd b/livebooks/speech_to_text/speech_to_text.livemd similarity index 100% rename from speech_to_text/speech_to_text.livemd rename to livebooks/speech_to_text/speech_to_text.livemd