@kgarwoodsdzwa
Member

Tools for isolating significant events in unlabeled WAV audio. They can visualize the RMS and Mel spectrogram of a WAV file, or extract the segments whose RMS exceeds a threshold based on the file's average RMS so they can be inspected more closely or labeled. Tested on a one-week deployment of passive acoustic recorders at a panda site, where the tools revealed a repeated feeding time during which panda chewing can be heard.

kgarwoodsdzwa and others added 14 commits June 6, 2025 14:57
Needs some work to go through numerous sound files and give
better resulting filenames. This script takes a sound file
and calculates the RMS over frames whose size and hop you
specify, giving you an array of RMS values.
It then calculates the average RMS, multiplies it by 1.5 to get
a threshold, and creates 3-second segments of audio centered on
each frame that exceeds that threshold. It will not create
overlapping segments. There are definitely some potential issues
with this, but for now it seems able to create segments based
on the relative loudness of the whole sample. For example, it
hasn't yet had to handle a case where the values that
exceed the threshold span more than the specified 3 s used for
clip creation.
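
The thresholding described in this commit message can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the code in this PR: the function names `frame_rms` and `loud_segments` are hypothetical, samples are a plain list of floats, and overlap is avoided by skipping any loud frame already covered by the previous clip and trimming the start of a clip that would otherwise reach back into it.

```python
import math


def frame_rms(samples, frame_length, hop_length):
    """Return one RMS value per full frame, stepping by hop_length samples."""
    values = []
    for start in range(0, len(samples) - frame_length + 1, hop_length):
        frame = samples[start:start + frame_length]
        values.append(math.sqrt(sum(x * x for x in frame) / frame_length))
    return values


def loud_segments(samples, sample_rate, frame_length, hop_length,
                  clip_seconds=3.0, factor=1.5):
    """Return non-overlapping (start, stop) sample ranges around loud frames."""
    rms = frame_rms(samples, frame_length, hop_length)
    if not rms:
        return []  # the file is shorter than one frame
    threshold = factor * (sum(rms) / len(rms))
    half_clip = int(clip_seconds * sample_rate) // 2
    segments = []
    last_stop = 0
    for i, value in enumerate(rms):
        if value <= threshold:
            continue
        center = i * hop_length + frame_length // 2
        if center < last_stop:
            continue  # already inside the previous clip
        start = max(center - half_clip, last_stop)
        stop = min(center + half_clip, len(samples))  # stay in bounds
        segments.append((start, stop))
        last_stop = stop
    return segments


# Toy signal: 10 seconds at 10 Hz, quiet except for one loud frame.
samples = [0.01] * 100
samples[50] = samples[51] = 1.0
print(loud_segments(samples, sample_rate=10, frame_length=2, hop_length=2))
# -> [(36, 66)]
```

A run of loud frames longer than the clip length produces back-to-back clips in this sketch, which is one possible answer to the open question about events that exceed 3 s.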

Displays two charts with the same time axis: the top one is the RMS
for each frame of your specified length and the bottom is a
Mel spectrogram. You can see the correlation between the two,
which can be helpful.
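
The two charts only line up if both are computed with the same hop. A minimal sketch of the shared time axis, assuming each frame is stamped at its starting sample (the helper name `frame_times` and the example numbers are hypothetical, not taken from this PR):

```python
def frame_times(n_frames, hop_length, sample_rate):
    """Return the start time in seconds of each analysis frame."""
    return [i * hop_length / sample_rate for i in range(n_frames)]


# Four frames with a hop of 512 samples at 16 kHz.
times = frame_times(4, 512, 16000)
```

Plotting the RMS array and the spectrogram columns against the same list of times keeps their x-axes comparable.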

Beginnings of a way to run multiple files through this process.
It still needs to create distinct filenames that identify the original
WAVs, but then it should be gravy.

Filenames assumed a single WAV file before. Now the clip name
includes the original WAV filename so it's
clearer. It also successfully uses the specified
outpath to save the files.
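
One way to embed the source filename in each clip name, as this commit message describes, is sketched below. The helper `clip_path` and the `_clip003`-style suffix are assumptions for illustration; the naming pattern actually used in this PR is not shown in the conversation.

```python
from pathlib import Path


def clip_path(out_dir, wav_path, clip_index):
    """Build an output path that embeds the source WAV's stem and a clip number."""
    stem = Path(wav_path).stem  # "recordings/site_a.wav" -> "site_a"
    return Path(out_dir) / f"{stem}_clip{clip_index:03d}.wav"
```

Zero-padding the index keeps clips from the same recording in order when a directory listing is sorted by name.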

There is one error for catching too broad an except, but I'm going
to ignore it for now because it helps with debugging.

The last catch for making the final segment in a WAV file was
creating a new right_index to stop the recording, when it should
have just used the stop_index so that it wouldn't go out of
bounds. Also made a sub-function that builds the RMS array to clean
up the code a bit.

I ultimately want to put this in the whoot package, so it shouldn't
necessarily be called main.

Added frame_length, hop_length, and the desired clip size to
the config file so the values are easy to adjust.
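
As a rough sketch of what moving these values into a config can look like: the conversation does not show the config file's format or key names, so everything below is an assumption. JSON is used only because it is in the standard library, `ExtractionConfig` and `load_config` are hypothetical names, and the defaults are illustrative (only the 3-second clip size comes from the commit messages above).

```python
import json
from dataclasses import dataclass, fields


@dataclass
class ExtractionConfig:
    frame_length: int = 2048   # samples per RMS frame (illustrative default)
    hop_length: int = 512      # samples between frame starts (illustrative default)
    clip_seconds: float = 3.0  # clip size mentioned in the commit messages


def load_config(text):
    """Parse JSON text into an ExtractionConfig, rejecting unknown keys."""
    raw = json.loads(text)
    known = {f.name for f in fields(ExtractionConfig)}
    unknown = set(raw) - known
    if unknown:
        raise ValueError(f"unknown config keys: {sorted(unknown)}")
    return ExtractionConfig(**raw)
```

Rejecting unknown keys means a misspelled setting fails loudly instead of silently falling back to a default.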

kgarwoodsdzwa requested a review from Sean1572 August 25, 2025 22:03
kgarwoodsdzwa linked an issue August 25, 2025 that may be closed by this pull request
kgarwoodsdzwa marked this pull request as draft August 26, 2025 22:54
Development

Successfully merging this pull request may close these issues.

add tool to extract loudest segments from panda zoo recordings

2 participants