Detection intensity/loudness #183

jmtmp · 2024-09-02T15:46:10Z

Wouldn't it be interesting to evaluate and store the intensity/loudness of the bird recording?

alexbelgium · 2024-09-09T10:55:23Z

Lol, now that is synchronicity: I was just in the process of evaluating how we could calculate the overall sound level of the recording.

However, my interest was different: it is raining a lot at the moment, and as always when it does, my BirdNET-Pi starts detecting the most interesting (but unlikely) birds. This is probably due to saturation of the spectrogram, which interferes with the algorithm. I was therefore thinking that it could be interesting to measure the average loudness of a wav file and lower the confidence level (or the detectability of new birds) if the loudness is too high (= too much wind, too much rain, a neighbour mowing his grass).

However, I think we can only do it in a total/average manner: I don't think it would be possible to measure the loudness of the bird calls specifically, as that would mean isolating them from the rest of the recording.
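
A minimal sketch of that idea, assuming 16-bit PCM samples. The function names, the -20 dBFS threshold and the 0.5 penalty factor are placeholders chosen for illustration; none of them come from BirdNET-Pi.

```python
import numpy as np

def rms_dbfs(samples, full_scale=32768.0):
    """Average level of int16 PCM samples, in dB relative to full scale."""
    x = np.asarray(samples, dtype=np.float64) / full_scale
    rms = np.sqrt(np.mean(x ** 2))
    return 20 * np.log10(max(rms, 1e-12))  # guard against log(0) on silence

def adjust_confidence(confidence, level_dbfs, loud_threshold_dbfs=-20.0, penalty=0.5):
    """Scale the confidence down when the recording is louder than the threshold."""
    if level_dbfs > loud_threshold_dbfs:
        return confidence * penalty
    return confidence
```

Any real threshold would have to be calibrated against recordings made in rain or wind at the given site.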

8 replies

alexbelgium Sep 13, 2024

Perhaps this script could work when given the wav file as an argument. It is not much tested, though. I was thinking that if it works, we could add the two values (average loudness and SNR) to the observations in the sqlite database. This might allow interesting correlations.

You need to execute it with : $PYTHON_VIRTUAL_ENV script.py file.wav

script.py content :

import wave
import numpy as np
import sys

def read_wav_file(wav_file):
    with wave.open(wav_file, 'rb') as wf:
        signal = wf.readframes(-1)
        # Assumes 16-bit PCM; convert to float so that squaring cannot overflow int16
        return np.frombuffer(signal, dtype=np.int16).astype(np.float64)

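def calculate_snr(signal):
    signal_power = np.mean(signal**2)
    # "Noise" here is just the signal with its mean (DC offset) removed,
    # so for zero-mean audio this ratio stays close to 0 dB
    noise = signal - np.mean(signal)
    noise_power = np.mean(noise**2)
    snr = 10 * np.log10(signal_power / noise_power)
    return snr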

def calculate_average_loudness(signal):
    rms = np.sqrt(np.mean(signal**2))
    return rms

if __name__ == "__main__":
    if len(sys.argv) != 2:
        print("Usage: python script.py <wav_file>")
        sys.exit(1)

    wav_file = sys.argv[1]
    signal = read_wav_file(wav_file)
    
    snr = calculate_snr(signal)
    print(f"SNR: {snr:.2f}")

    average_loudness = calculate_average_loudness(signal)
    print(f"Loudness: {average_loudness:.2f}")
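
Storing the two values in the sqlite database could look roughly like the sketch below. The table name `detections`, its columns and the new column names `snr` and `loudness` are assumptions made for the example (an in-memory database is used so nothing real is touched); the actual BirdNET-Pi schema should be checked first.

```python
import sqlite3

def add_audio_metrics(conn, table="detections"):
    """Add snr and loudness columns if they are missing (names are placeholders)."""
    existing = {row[1] for row in conn.execute(f"PRAGMA table_info({table})")}
    for column in ("snr", "loudness"):
        if column not in existing:
            conn.execute(f"ALTER TABLE {table} ADD COLUMN {column} REAL")
    conn.commit()

# Hypothetical, simplified table standing in for the real observations table
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE detections (Date TEXT, Time TEXT, Sci_Name TEXT, Confidence REAL)")
add_audio_metrics(conn)
conn.execute(
    "INSERT INTO detections VALUES (?, ?, ?, ?, ?, ?)",
    ("2024-09-17", "17:49:27", "Erithacus rubecula", 0.7324, 0.000916, 0.002194),
)
conn.commit()
```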

jmtmp Sep 13, 2024
Author

That's great!
And which is better: working with the whole wav file, or evaluating chunk by chunk, just like the model does?

alexbelgium Sep 17, 2024

I made a branch here to calculate SNR and loudness on the 3s chunks and write them to the BirdDB.txt file

https://github.com/alexbelgium/BirdNET-Pi/tree/SNR
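
For illustration only (this is not the code from the branch), computing a loudness value per 3 s chunk could be sketched as follows. The function name and the choice to drop a trailing partial chunk are assumptions.

```python
import numpy as np

def chunk_rms(samples, sample_rate, chunk_seconds=3.0):
    """Return one RMS value per full chunk; a trailing partial chunk is dropped."""
    x = np.asarray(samples, dtype=np.float64)
    chunk_len = int(sample_rate * chunk_seconds)
    n_chunks = len(x) // chunk_len
    if n_chunks == 0:
        return np.array([])
    # One row per chunk, then RMS along each row
    chunks = x[: n_chunks * chunk_len].reshape(n_chunks, chunk_len)
    return np.sqrt(np.mean(chunks ** 2, axis=1))
```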

jmtmp Sep 17, 2024
Author

I would like to test it. Can you help me install it?

alexbelgium Sep 17, 2024

Run the following script. The SNR and loudness will then appear in your BirdDB.txt file.
Example : 2024-09-17;17:49:27;Erithacus rubecula;Rougegorge familier;0.7324;50.6786;4.7227;0.70;38;1.25;1;0.000916;0.002194 with SNR being 0.000916 and loudness 0.002194
Both seem to correlate exponentially, so tracking only SNR indeed makes more sense.

#!/bin/bash
# Replace the listed scripts with their versions from the SNR branch
for files in scripts/server.py scripts/install_services.sh scripts/utils/helpers.py scripts/utils/reporting.py; do
    echo "Adapting $files"
    rm "/home/$USER/BirdNET-Pi/$files"
    curl -o "/home/$USER/BirdNET-Pi/$files" "https://raw.githubusercontent.com/alexbelgium/BirdNET-Pi/SNR/$files"
    chown "$USER:$USER" "/home/$USER/BirdNET-Pi/$files"
    chmod 777 "/home/$USER/BirdNET-Pi/$files"
done
systemctl restart birdnet_recording
systemctl restart birdnet_log
systemctl restart birdnet_analysis

To revert to the original files, it should be:

#!/bin/bash
# Restore the listed scripts from the Nachtzuster main branch
for files in scripts/server.py scripts/install_services.sh scripts/utils/helpers.py scripts/utils/reporting.py; do
    echo "Restoring $files"
    rm "/home/$USER/BirdNET-Pi/$files"
    curl -o "/home/$USER/BirdNET-Pi/$files" "https://raw.githubusercontent.com/Nachtzuster/BirdNET-Pi/refs/heads/main/$files"
    chown "$USER:$USER" "/home/$USER/BirdNET-Pi/$files"
    chmod 777 "/home/$USER/BirdNET-Pi/$files"
done
systemctl restart birdnet_recording
systemctl restart birdnet_log
systemctl restart birdnet_analysis
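
To pull the two new values back out of BirdDB.txt, a small helper along these lines might do. It assumes, based only on the example line in this post, that SNR and loudness are the last two semicolon-separated fields; the function name is made up for the example.

```python
def parse_snr_loudness(line):
    """Return (snr, loudness) from one semicolon-separated BirdDB.txt line."""
    fields = line.strip().split(";")
    # Assumption: the two values added by the branch are the last two fields
    return float(fields[-2]), float(fields[-1])

example = ("2024-09-17;17:49:27;Erithacus rubecula;Rougegorge familier;"
           "0.7324;50.6786;4.7227;0.70;38;1.25;1;0.000916;0.002194")
snr, loudness = parse_snr_loudness(example)
print(snr, loudness)
```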

alexbelgium · 2024-09-18T07:53:36Z

I'm working on the model; the basic SNR doesn't work. I've added code that estimates the noise from the bottom 30% of the signal in terms of intensity, plus the <150 Hz band, which should not contain much birdsong. We might even try to increase the 30% share, as I don't think birdsong takes up that much of the total signal.
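
One way to read that description is sketched below; it is an illustration, not the code in the branch. How the two noise estimates are combined (here a simple sum), the 4th-order Butterworth filter and the function names are all assumptions.

```python
import numpy as np
from scipy.signal import butter, sosfilt

def estimate_noise_power(samples, sample_rate, quiet_percent=30, low_cut_hz=150):
    """Noise power from the quietest samples plus the content below low_cut_hz."""
    x = np.asarray(samples, dtype=np.float64)
    # Quietest quiet_percent of the samples by absolute amplitude
    threshold = np.percentile(np.abs(x), quiet_percent)
    quiet_power = np.mean(x[np.abs(x) <= threshold] ** 2)
    # Low band that should hold little birdsong
    sos = butter(4, low_cut_hz, btype="lowpass", fs=sample_rate, output="sos")
    low_power = np.mean(sosfilt(sos, x) ** 2)
    # Assumption: the two estimates are simply added
    return quiet_power + low_power

def snr_db(samples, sample_rate):
    """Total power over estimated noise power, in dB."""
    x = np.asarray(samples, dtype=np.float64)
    noise_power = max(estimate_noise_power(x, sample_rate), 1e-12)
    return 10 * np.log10(np.mean(x ** 2) / noise_power)
```

With this definition, low-frequency rumble such as wind or traffic lowers the SNR even when the birdsong band itself is clean.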

3 replies

jmtmp Sep 18, 2024
Author

Unfortunately, the script does not work for me: when I create an sh file and run it, I get the error "Syntax error: word unexpected (expecting 'do')". I don't know why; this is my first Raspberry Pi and my first time with Linux.

It would be nice to test SNR on live data, i.e. ideally to display the SNR for each recording in addition to the confidence and links, at least somewhere to begin with. I don't know how laborious that is to code.

My first idea to increase the accuracy of the SNR was to really focus on the area where the birdsong is, i.e. cut off the bottom and the top of the frequency range. In both cases the cutoff could be variable, depending on whether there is any signal in the given area (deviation from noise greater than some threshold?). This could be relatively simple. In theory it would then be possible to cut off the beginning and end of the chunk as well (again depending on whether there is a signal or just noise). The SNR should then be significantly more predictive.
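
A rough sketch of the variable frequency cutoff, under assumptions that are not part of the thread: the spectrum is split into equal-width bands, the median band power is taken as the noise floor, and a band counts as "signal" when its power exceeds that floor by a fixed ratio. The names, the 48 bands and the ratio of 4 are placeholders, and the recording must be long enough that every band contains at least one FFT bin.

```python
import numpy as np

def active_band(samples, sample_rate, n_bands=48, ratio=4.0):
    """Return (low_hz, high_hz) spanning the bands that stand out from the noise floor."""
    x = np.asarray(samples, dtype=np.float64)
    spectrum = np.abs(np.fft.rfft(x)) ** 2
    freqs = np.fft.rfftfreq(len(x), d=1.0 / sample_rate)
    edges = np.linspace(0, sample_rate / 2, n_bands + 1)
    # Mean power per band
    band_power = np.array([
        spectrum[(freqs >= lo) & (freqs < hi)].mean()
        for lo, hi in zip(edges[:-1], edges[1:])
    ])
    floor = np.median(band_power)  # assumed noise floor
    active = np.flatnonzero(band_power > ratio * floor)
    if active.size == 0:
        return None  # nothing stands out: probably just noise
    return edges[active[0]], edges[active[-1] + 1]
```

The same thresholding could be applied along the time axis to trim the start and end of a chunk.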

jmtmp Sep 18, 2024
Author

By the way, if this works, it could also serve as a primary filter for whether the signal is suitable for further evaluation by the model.

alexbelgium Sep 18, 2024

I'm currently doing some tests and tweaking the algorithm, so it might be easier for you to test once it's done. I think there was an issue with the initial code of the sh script, but it should have been fixed in the post above. I'm not sure whether you saw the edits, for example if you got the notification by email.

Regarding SNR calculation :

High/low filters : I wonder whether we should cut off the bottom and top: isn't that where the noise would be most representative of the background, since the bird songs should sit between 500 Hz and 10 kHz? I was actually going the opposite route: use those bands (in addition to the low-power parts of the signal) to calculate a more accurate SNR.
Chunks vs whole file : I'm also starting to wonder whether we are right to calculate on chunks instead of the whole extraction file (so 10-20 seconds). The whole file should be much more representative of ambient sound, as a 3 s chunk could, for example, be filled by a whole song and therefore contain no significant background noise to estimate from. So we actually "need" background noise without birdsong to calculate an accurate SNR, and the more background we have, the better the signal can be compared to it.
Btw, xeno-canto has described a similar method to classify their observations, but it is based on much longer samples, which is consistent with the idea of analyzing the whole extraction wav: https://xeno-canto.org/article/273

I'll continue doing some tests with both and post my results a bit later. Currently I see little or no correlation between detections and SNR (which is expected); however, I do see the SNR evolve with spectrogram "cleanliness".
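
The "chunks vs whole file" point could be sketched like this: the noise floor is taken from the quietest short frames of the whole file, and the 3 s chunk is then compared against it. The 50 ms frame length, the 20% share of quiet frames and the function name are assumptions for illustration.

```python
import numpy as np

def chunk_snr_vs_file_noise(samples, sample_rate, start_s, chunk_seconds=3.0,
                            frame_seconds=0.05, quiet_percent=20):
    """SNR (dB) of one chunk against a noise floor estimated on the whole file."""
    x = np.asarray(samples, dtype=np.float64)
    # Power of short frames over the whole file
    frame_len = max(1, int(sample_rate * frame_seconds))
    n_frames = len(x) // frame_len
    frames = x[: n_frames * frame_len].reshape(n_frames, frame_len)
    frame_power = np.mean(frames ** 2, axis=1)
    # Noise floor: mean power of the quietest frames
    cutoff = np.percentile(frame_power, quiet_percent)
    noise_power = max(np.mean(frame_power[frame_power <= cutoff]), 1e-12)
    # Signal: the chunk in which the detection was made
    start = int(start_s * sample_rate)
    chunk = x[start : start + int(chunk_seconds * sample_rate)]
    return 10 * np.log10(np.mean(chunk ** 2) / noise_power)
```

A chunk filled entirely by a song still gets a meaningful value this way, as long as the rest of the file contains some song-free background.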

alexbelgium · 2024-09-21T09:15:22Z

So, in the end I did many tests with many different methods based on low/high bands, loudness vs Fourier transform (= spectrum analysis)... Based on 9-second wav files, here is the result:

As you can see, two methods are the best in terms of amplitude, correlation with xeno-canto SNR and speed: methods 1 and 6. However, method 1 is much simpler in terms of code.

So in the end I've implemented method 1 in the code of this branch : https://github.com/alexbelgium/BirdNET-Pi/tree/SNR

To try it, follow the instructions in this post (#183 (reply in thread)); it will analyse the SNR for each detection independently (so on chunks of 3 seconds) and write it to the BirdDB.txt file.

If you want to try the Python code manually on individual wav files, you can put the code below in a Python file and run the script with the wav file as argument:

Method 1 and 3

import os
import argparse
import time
import numpy as np
from scipy.io import wavfile
from scipy.signal import butter, sosfilt
from scipy.fft import fft

# Common bandpass filter function
def bandpass_filter(audio_signal, sample_rate, low_freq, high_freq):
    sos = butter(4, [low_freq, high_freq], btype='bandpass', fs=sample_rate, output='sos')
    return sosfilt(sos, audio_signal)

# Method 1 : direct SNR
def method_1(audio_signal, sample_rate=48000):
    start_time = time.time()
    filtered_signal = bandpass_filter(audio_signal, sample_rate, 250, 10000)
    signal_power = np.mean(filtered_signal ** 2)
    quiet_threshold = np.percentile(np.abs(filtered_signal), 20)
    quiet_section_noise = filtered_signal[np.abs(filtered_signal) < quiet_threshold]
    noise_power = np.mean(quiet_section_noise ** 2) if len(quiet_section_noise) > 0 else 0.001
    snr = 10 * np.log10(signal_power / noise_power)
    return round(snr, 6), time.time() - start_time

# Method 3 : fourier transform
def method_3(audio_signal, sample_rate=48000):
    start_time = time.time()
    filtered_signal = bandpass_filter(audio_signal, sample_rate, 250, 10000)
    fft_signal = fft(filtered_signal)
    magnitude_spectrum = np.abs(fft_signal)
    dominant_threshold = np.max(magnitude_spectrum) * 0.3
    signal_mask = magnitude_spectrum > dominant_threshold
    noise_mask = ~signal_mask
    signal_power = np.mean(np.abs(fft_signal[signal_mask]) ** 2)
    noise_power = np.mean(np.abs(fft_signal[noise_mask]) ** 2)
    snr = 10 * np.log10(signal_power / noise_power)
    return round(snr, 6), time.time() - start_time


# Load WAV file
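def load_wav_file(filepath):
    if not os.path.exists(filepath):
        raise FileNotFoundError(f"The file {filepath} does not exist or cannot be accessed.")
    
    sample_rate, audio_signal = wavfile.read(filepath)
    # Mix stereo down to mono so the filter runs along the time axis
    if audio_signal.ndim == 2:
        audio_signal = audio_signal.mean(axis=1)
    # Work in float to avoid integer overflow when squaring
    return sample_rate, audio_signal.astype(np.float64)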

# Main function to compute SNR values and timings for each method
def compute_all_snrs(filepath):
    sample_rate, audio_signal = load_wav_file(filepath)
    results = {}
    
    snr_1, time_1 = method_1(audio_signal, sample_rate)
    snr_3, time_3 = method_3(audio_signal, sample_rate)

    results['Method 1'] = {'SNR': snr_1, 'Time': time_1}
    results['Method 3'] = {'SNR': snr_3, 'Time': time_3}

    return results

# Parse the file path argument from the command line
def parse_arguments():
    parser = argparse.ArgumentParser(description='Calculate SNR of an audio file using different methods.')
    parser.add_argument('filepath', type=str, help='Path to the WAV file')
    return parser.parse_args()

# Example usage
if __name__ == "__main__":
    args = parse_arguments()
    snr_results = compute_all_snrs(args.filepath)
    for method, result in snr_results.items():
        print(f"{method}: SNR = {result['SNR']} dB, Time = {result['Time']} seconds")

Method 6

import sys
import numpy as np
import scipy.io.wavfile as wav
from scipy.signal import butter, filtfilt
import time

# Define low-pass and high-pass filters
def apply_lowpass(data, highcut, fs):
    nyq = 0.5 * fs
    high = min(highcut / nyq, 1.0)
    if high == 1.0:
        return data  # No filtering needed
    b, a = butter(5, high, btype='low')
    return filtfilt(b, a, data)

def apply_highpass(data, lowcut, fs):
    nyq = 0.5 * fs
    low = max(0, lowcut / nyq)
    if low == 0:
        return data  # No filtering needed
    b, a = butter(5, low, btype='high')
    return filtfilt(b, a, data)

# Calculate SNR by applying low-pass and high-pass filters to separate signal and noise
def calculate_snr_with_details(signal, fs, signal_band=(250, 10000), noise_bands=((0, 200), (10000, None))):
    start_time = time.time()  # Start the execution timer
    
    # Filter for the signal band (birdsong frequencies)
    signal_filtered = apply_lowpass(apply_highpass(signal, signal_band[0], fs), signal_band[1], fs)
    
    # Calculate power of the signal
    signal_power = np.sum(signal_filtered**2)
    
    # Filter and calculate noise power from the low and high frequency bands
    low_noise = apply_lowpass(signal, noise_bands[0][1], fs)
    high_noise = apply_highpass(signal, noise_bands[1][0], fs)
    noise_power = np.sum(low_noise**2) + np.sum(high_noise**2)
    
    # Avoid division by zero in case noise power is very low
    noise_power = np.maximum(noise_power, 1e-10)
    
    # Calculate SNR in dB
    snr = 10 * np.log10(signal_power / noise_power)
    
    # Example loudest and quietest level (for demonstration purposes)
    loudest_level = 10 * np.log10(np.max(np.abs(signal_filtered))**2)
    quietest_level = 10 * np.log10(np.min(np.abs(signal_filtered[np.nonzero(signal_filtered)]))**2)
    
    # Measure the execution time
    execution_time = time.time() - start_time
    
    # Output the calculated values
    return snr, loudest_level, quietest_level, execution_time

def main(file_path):
    # Perform the calculation and show results
    try:
        # Reading the WAV file
        fs, audio_data = wav.read(file_path)
        
        # Handle stereo files by converting to mono
        if len(audio_data.shape) == 2:
            audio_data = np.mean(audio_data, axis=1)
        
        # Calculate SNR and detailed levels
        snr_value, loudest_level, quietest_level, execution_time = calculate_snr_with_details(audio_data, fs)
        
        # Display the results for the file
        #print(f"\nAnalyzed File: {file_path}")
        #print(f"SNR: {snr_value:.1f} dB (ISO/ITU)")
        #print(f"Fade-in removed: 0s")
        #print(f"Fade-out removed: 0s")
        #print(f"Loudest level: {loudest_level:.2f} dB (ISO)")
        #print(f"Quietest level: {quietest_level:.2f} dB (ITU)")
        #print(f"Execution Time: {execution_time:.2f} seconds")
        print(f"Method 6: SNR = {snr_value:.6f} dB, Time = {execution_time:.17f} seconds")
    
    except Exception as e:
        print(f"Error processing {file_path}: {e}")

# Check if the script is being run with a file argument
if __name__ == "__main__":
    if len(sys.argv) != 2:
        print("Usage: python script.py <file_path>")
    else:
        file_path = sys.argv[1]
        main(file_path)

0 replies

alexbelgium · 2024-09-29T09:14:57Z

New model pushed to the SNR branch. It estimates the modulation of the signal in 3 main bands (200-500, 500-1000 and 1000-8000 Hz) in order to calculate the SNR on the band that contains the birdsong.
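
The thread does not show the branch code, so the following is only a guess at what such a band selection could look like. Using the coefficient of variation of short-frame power as the "modulation" measure, the 50 ms frames and the 20th-percentile noise floor are assumptions, as are all function names.

```python
import numpy as np
from scipy.signal import butter, sosfilt

BANDS_HZ = ((200, 500), (500, 1000), (1000, 8000))  # bands named in the post

def frame_power(x, sample_rate, frame_seconds=0.05):
    """Mean power of consecutive short frames."""
    frame_len = max(1, int(sample_rate * frame_seconds))
    n = len(x) // frame_len
    return np.mean(x[: n * frame_len].reshape(n, frame_len) ** 2, axis=1)

def band_snr(samples, sample_rate, quiet_percent=20):
    """Pick the most modulated band and return (band, SNR in dB) for it."""
    x = np.asarray(samples, dtype=np.float64)
    best = None
    for low, high in BANDS_HZ:
        sos = butter(4, [low, high], btype="bandpass", fs=sample_rate, output="sos")
        power = frame_power(sosfilt(sos, x), sample_rate)
        # Assumed modulation measure: relative variation of the frame power
        modulation = np.std(power) / max(np.mean(power), 1e-12)
        if best is None or modulation > best[0]:
            best = (modulation, (low, high), power)
    _, band, power = best
    # Assumed noise floor: power of the quieter frames within the chosen band
    noise_power = max(np.percentile(power, quiet_percent), 1e-12)
    return band, 10 * np.log10(np.mean(power) / noise_power)
```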

0 replies
