-
Lol, now that is synchronicity: I was just in the process of evaluating how we can calculate the global sound level of the recording. However, my interest was different: it is raining very heavily at the moment, and as always when it does, my BirdNET-Pi starts detecting the most interesting (but unlikely) birds. It is probably due to a saturation of the spectrogram, which interferes with the algorithm. I was therefore thinking that it could be interesting to measure the average loudness of a wav file and modify the confidence level (or detectability of new birds) if it is too high (= too much wind, too much rain, neighbour mowing his grass). However, I think we can only do it in a total/average manner: I don't think it would be possible to extract the loudness of the bird calls specifically, as it would mean extracting them specifically.
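To make the idea concrete, here is a minimal sketch of measuring the average level of a recording and raising the confidence threshold when it is loud. The function names, the -20 dBFS cut-off and the 0.15 penalty are all hypothetical values picked for illustration, not anything BirdNET-Pi uses, and the integer scaling assumes signed PCM samples.

```python
import numpy as np

def rms_dbfs(samples):
    """Average (RMS) level of a recording in dB relative to full scale."""
    samples = np.asarray(samples)
    if np.issubdtype(samples.dtype, np.integer):
        # Assumption: signed integer PCM (e.g. int16), full scale = 2**(bits-1)
        full_scale = float(np.iinfo(samples.dtype).max) + 1.0
    else:
        # Assumption: float samples already scaled to [-1, 1]
        full_scale = 1.0
    x = samples.astype(np.float64) / full_scale
    if x.ndim == 2:
        x = x.mean(axis=1)  # stereo -> mono
    rms = np.sqrt(np.mean(x ** 2))
    return 20 * np.log10(max(rms, 1e-10))  # floor avoids log10(0) on silence

def adjusted_confidence_threshold(base_threshold, level_db,
                                  loud_db=-20.0, penalty=0.15):
    """Hypothetical rule: demand more confidence when the recording is loud."""
    if level_db > loud_db:
        return min(base_threshold + penalty, 0.99)
    return base_threshold
```

With `scipy.io.wavfile.read`, this would be called as `rms_dbfs(audio_signal)` on the array it returns. Note that this is a whole-file average, so rain and a loud bird raise it equally.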
-
I'm working on the model; basic SNR doesn't work. I've added code that extracts noise from the bottom 30% of the signal in terms of intensity, plus the <150 Hz section, which should not contain much birdsong. We might even try to increase the 30% part, as I don't think birdsong takes up that much bandwidth in the total signal.
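For anyone wanting to experiment, here is one way the "quietest 30% + below 150 Hz" noise estimate could be sketched. This is my reading of the description, not the code from the branch: the frame length, the filter order, and the choice to add the two noise contributions together are all assumptions.

```python
import numpy as np
from scipy.signal import butter, sosfilt

def estimate_noise_power(audio, sample_rate, quiet_fraction=0.30,
                         low_cut_hz=150.0, frame_len=1024):
    """Noise power = quietest frames above low_cut_hz + everything below it."""
    x = np.asarray(audio, dtype=np.float64)
    if x.ndim == 2:
        x = x.mean(axis=1)  # stereo -> mono
    if len(x) < frame_len:
        raise ValueError("audio is shorter than one frame")
    # Split the signal at low_cut_hz
    sos_low = butter(4, low_cut_hz, btype='lowpass', fs=sample_rate, output='sos')
    sos_high = butter(4, low_cut_hz, btype='highpass', fs=sample_rate, output='sos')
    low_band = sosfilt(sos_low, x)
    high_band = sosfilt(sos_high, x)
    # Mean power of each frame of the part above low_cut_hz
    n_frames = len(high_band) // frame_len
    frames = high_band[:n_frames * frame_len].reshape(n_frames, frame_len)
    frame_power = np.mean(frames ** 2, axis=1)
    # Keep the quietest fraction of frames as the noise floor
    n_quiet = max(1, int(np.ceil(quiet_fraction * n_frames)))
    quiet_power = np.mean(np.sort(frame_power)[:n_quiet])
    return quiet_power + np.mean(low_band ** 2)

def snr_db(audio, sample_rate, **kwargs):
    """Total power over estimated noise power, in dB."""
    x = np.asarray(audio, dtype=np.float64)
    if x.ndim == 2:
        x = x.mean(axis=1)
    total_power = max(np.mean(x ** 2), 1e-12)
    noise_power = max(estimate_noise_power(x, sample_rate, **kwargs), 1e-12)
    return 10 * np.log10(total_power / noise_power)
```

Raising `quiet_fraction` above 0.30 is the knob mentioned above: it is only safe as long as the bird is silent for at least that share of the clip.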
-
So, in the end I ran many tests with many different methods based on low/high bands, loudness vs. Fourier transform (= spectrum analysis)... Based on 9-second wav files, here is the result: two methods are the best in terms of amplitude, correlation with xeno-canto SNR, and speed: methods 1 and 6. However, method 1 is much simpler in terms of code. So in the end I've implemented method 1 in the code of this branch: https://github.com/alexbelgium/BirdNET-Pi/tree/SNR

To try it, follow the instructions in this post (#183 (reply in thread)) and it will analyse the SNR for each detection independently (so chunks of 3 seconds) and write it in the BirdDB.txt file. If you want to try the Python code manually on individual wav files, you can add the code below to a Python file and run the script with the wav file as argument:

Method 1 and 3
import os
import argparse
import time
import numpy as np
from scipy.io import wavfile
from scipy.signal import butter, sosfilt
from scipy.fftpack import fft


# Common bandpass filter function
def bandpass_filter(audio_signal, sample_rate, low_freq, high_freq):
    sos = butter(4, [low_freq, high_freq], btype='bandpass', fs=sample_rate, output='sos')
    return sosfilt(sos, audio_signal)


# Method 1 : direct SNR
def method_1(audio_signal, sample_rate=48000):
    start_time = time.time()
    filtered_signal = bandpass_filter(audio_signal, sample_rate, 250, 10000)
    signal_power = np.mean(filtered_signal ** 2)
    # Noise = samples quieter than the 20th percentile of absolute amplitude
    quiet_threshold = np.percentile(np.abs(filtered_signal), 20)
    quiet_section_noise = filtered_signal[np.abs(filtered_signal) < quiet_threshold]
    noise_power = np.mean(quiet_section_noise ** 2) if len(quiet_section_noise) > 0 else 0.001
    snr = 10 * np.log10(signal_power / noise_power)
    return round(snr, 6), time.time() - start_time


# Method 3 : fourier transform
def method_3(audio_signal, sample_rate=48000):
    start_time = time.time()
    filtered_signal = bandpass_filter(audio_signal, sample_rate, 250, 10000)
    fft_signal = fft(filtered_signal)
    magnitude_spectrum = np.abs(fft_signal)
    # Signal = bins above 30% of the spectral peak, noise = all other bins
    dominant_threshold = np.max(magnitude_spectrum) * 0.3
    signal_mask = magnitude_spectrum > dominant_threshold
    noise_mask = ~signal_mask
    signal_power = np.mean(np.abs(fft_signal[signal_mask]) ** 2)
    noise_power = np.mean(np.abs(fft_signal[noise_mask]) ** 2)
    snr = 10 * np.log10(signal_power / noise_power)
    return round(snr, 6), time.time() - start_time


# Load WAV file
def load_wav_file(filepath):
    if not os.path.exists(filepath):
        raise FileNotFoundError(f"The file {filepath} does not exist or cannot be accessed.")
    sample_rate, audio_signal = wavfile.read(filepath)
    # Convert stereo to mono, otherwise the filter would run across channels
    if audio_signal.ndim == 2:
        audio_signal = np.mean(audio_signal, axis=1)
    return sample_rate, audio_signal


# Main function to compute SNR values and timings for each method
def compute_all_snrs(filepath):
    sample_rate, audio_signal = load_wav_file(filepath)
    results = {}
    snr_1, time_1 = method_1(audio_signal, sample_rate)
    snr_3, time_3 = method_3(audio_signal, sample_rate)
    results['Method 1'] = {'SNR': snr_1, 'Time': time_1}
    results['Method 3'] = {'SNR': snr_3, 'Time': time_3}
    return results


# Parse the file path argument from the command line
def parse_arguments():
    parser = argparse.ArgumentParser(description='Calculate SNR of an audio file using different methods.')
    parser.add_argument('filepath', type=str, help='Path to the WAV file')
    return parser.parse_args()


# Example usage
if __name__ == "__main__":
    args = parse_arguments()
    snr_results = compute_all_snrs(args.filepath)
    for method, result in snr_results.items():
        print(f"{method}: SNR = {result['SNR']} dB, Time = {result['Time']} seconds")

Method 6
import sys
import numpy as np
import scipy.io.wavfile as wav
from scipy.signal import butter, filtfilt
import time


# Define low-pass and high-pass filters
def apply_lowpass(data, highcut, fs):
    nyq = 0.5 * fs
    high = min(highcut / nyq, 1.0)
    if high == 1.0:
        return data  # No filtering needed
    b, a = butter(5, high, btype='low')
    return filtfilt(b, a, data)


def apply_highpass(data, lowcut, fs):
    nyq = 0.5 * fs
    low = max(0, lowcut / nyq)
    if low == 0:
        return data  # No filtering needed
    b, a = butter(5, low, btype='high')
    return filtfilt(b, a, data)


# Calculate SNR by applying low-pass and high-pass filters to separate signal and noise
def calculate_snr_with_details(signal, fs, signal_band=(250, 10000), noise_bands=((0, 200), (10000, None))):
    start_time = time.time()  # Start the execution timer
    # Filter for the signal band (birdsong frequencies)
    signal_filtered = apply_lowpass(apply_highpass(signal, signal_band[0], fs), signal_band[1], fs)
    # Calculate power of the signal
    signal_power = np.sum(signal_filtered**2)
    # Filter and calculate noise power from the low and high frequency bands
    low_noise = apply_lowpass(signal, noise_bands[0][1], fs)
    high_noise = apply_highpass(signal, noise_bands[1][0], fs)
    noise_power = np.sum(low_noise**2) + np.sum(high_noise**2)
    # Avoid division by zero in case noise power is very low
    noise_power = np.maximum(noise_power, 1e-10)
    # Calculate SNR in dB
    snr = 10 * np.log10(signal_power / noise_power)
    # Example loudest and quietest level (for demonstration purposes)
    loudest_level = 10 * np.log10(np.max(np.abs(signal_filtered))**2)
    quietest_level = 10 * np.log10(np.min(np.abs(signal_filtered[np.nonzero(signal_filtered)]))**2)
    # Measure the execution time
    execution_time = time.time() - start_time
    # Output the calculated values
    return snr, loudest_level, quietest_level, execution_time


def main(file_path):
    # Perform the calculation and show results
    try:
        # Reading the WAV file
        fs, audio_data = wav.read(file_path)
        # Handle stereo files by converting to mono
        if len(audio_data.shape) == 2:
            audio_data = np.mean(audio_data, axis=1)
        # Calculate SNR and detailed levels
        snr_value, loudest_level, quietest_level, execution_time = calculate_snr_with_details(audio_data, fs)
        # Display the results for the file
        #print(f"\nAnalyzed File: {file_path}")
        #print(f"SNR: {snr_value:.1f} dB (ISO/ITU)")
        #print(f"Fade-in removed: 0s")
        #print(f"Fade-out removed: 0s")
        #print(f"Loudest level: {loudest_level:.2f} dB (ISO)")
        #print(f"Quietest level: {quietest_level:.2f} dB (ITU)")
        #print(f"Execution Time: {execution_time:.2f} seconds")
        print(f"Method 6: SNR = {snr_value:.6f} dB, Time = {execution_time:.17f} seconds")
    except Exception as e:
        print(f"Error processing {file_path}: {e}")


# Check if the script is being run with a file argument
if __name__ == "__main__":
    if len(sys.argv) != 2:
        print("Usage: python script.py <file_path>")
    else:
        file_path = sys.argv[1]
        main(file_path)
-
New model pushed to the SNR branch: it estimates the modulation of the signal in 3 main bands (200-500 Hz, 500-1000 Hz, and 1000-8000 Hz) to calculate the SNR on the band in which the birdsong is.
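As a rough illustration of the per-band idea (not the code in the branch, which is not reproduced here), the sketch below scores each of the three bands by how much its frame power varies over time and keeps the most modulated band. Treating "modulation" as the ratio between the 90th and 20th percentile of frame power is my assumption, as are the frame length, the filter order and the function names. It needs a sample rate above 16 kHz for the 8000 Hz edge.

```python
import numpy as np
from scipy.signal import butter, sosfilt

# The three bands named in the post, in Hz
BANDS_HZ = ((200, 500), (500, 1000), (1000, 8000))

def band_snr_db(audio, sample_rate, band, frame_len=1024):
    """Loud-frame vs quiet-frame power ratio (dB) inside one frequency band."""
    x = np.asarray(audio, dtype=np.float64)
    if x.ndim == 2:
        x = x.mean(axis=1)  # stereo -> mono
    sos = butter(4, band, btype='bandpass', fs=sample_rate, output='sos')
    y = sosfilt(sos, x)
    n_frames = len(y) // frame_len
    if n_frames < 2:
        raise ValueError("audio is too short for frame-based analysis")
    frames = y[:n_frames * frame_len].reshape(n_frames, frame_len)
    power = np.mean(frames ** 2, axis=1)
    # Assumption: loud frames carry the song, quiet frames the background
    loud = max(np.percentile(power, 90), 1e-12)
    quiet = max(np.percentile(power, 20), 1e-12)
    return 10 * np.log10(loud / quiet)

def best_band_snr(audio, sample_rate, bands=BANDS_HZ):
    """Return (band, snr_db) for the band with the strongest modulation."""
    scores = {band: band_snr_db(audio, sample_rate, band) for band in bands}
    best = max(scores, key=scores.get)
    return best, scores[best]
```

Steady noise such as rain has little modulation in any band, so it scores low everywhere, while a song that starts and stops stands out in its own band.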
-
Wouldn't it be interesting to evaluate and store the intensity/loudness of the bird recording?