Australian Bioacoustic Search Tool
December 1, 2023 2:20 AM   Subscribe

Australian Bioacoustic Search Tool
The Australian Acoustic Observatory has 360 microphones across the continent and over 2 million hours of audio. However, none of it is labeled. We want to make this enormous repository useful to researchers. We have found that researchers are often looking for 'hard' signals - specific call types, birds with very little available training data, and so on. So we built an acoustic-similarity search tool, allowing researchers to provide an example of what they're looking for, which we then match against embeddings from the A2O dataset.

Here are some fun examples!

Laughing Kookaburra

Pacific Koel

Chiming Wedgebill

How it works, in a nutshell:
We use audio source separation to pull apart the A2O data, and then run an embedding model on each channel of the separated audio to produce a 'fingerprint' of the sound. All of this is put in a vector database with a link back to the original audio. When someone performs a search, we embed their audio, and then match against all of the embeddings in the vector database.
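The indexing and search flow above can be sketched in a few lines. This is a hypothetical illustration only: `separate_sources` and `embed` are placeholders standing in for the real source-separation and embedding models (which are not specified here), and the "vector database" is just an in-memory list scored by cosine similarity.

```python
import numpy as np

def separate_sources(audio: np.ndarray, n_channels: int = 4) -> list:
    """Placeholder: split a mixed recording into per-source channels."""
    return [audio / (i + 1) for i in range(n_channels)]

def embed(channel: np.ndarray, dim: int = 8) -> np.ndarray:
    """Placeholder: produce a fixed-size, unit-norm 'fingerprint' embedding.

    Deterministically seeded from the signal so identical audio maps to
    identical embeddings (a stand-in for a real embedding model).
    """
    seed = abs(int(channel.sum() * 1000)) % (2**32)
    v = np.random.default_rng(seed).normal(size=dim)
    return v / np.linalg.norm(v)

# "Vector database": embeddings plus links back to the original audio.
index = []  # list of (embedding, recording_id, channel_idx)

def add_recording(recording_id: str, audio: np.ndarray) -> None:
    """Separate a recording and index an embedding per channel."""
    for i, ch in enumerate(separate_sources(audio)):
        index.append((embed(ch), recording_id, i))

def search(query_audio: np.ndarray, top_k: int = 3):
    """Embed the query and rank indexed embeddings by cosine similarity."""
    q = embed(query_audio)
    # Embeddings are unit-normalized, so the dot product is cosine similarity.
    scored = [(float(np.dot(q, e)), rec, ch) for e, rec, ch in index]
    return sorted(scored, reverse=True)[:top_k]
```

A real deployment would swap the list scan for an approximate-nearest-neighbor index, but the shape of the pipeline - separate, embed, store with a back-link, then embed-and-match at query time - is the same.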

Right now, about 1% of the A2O data is indexed (the first minute of every recording, evenly sampled across the day). We're looking to get initial feedback and will then continue to iterate and expand coverage.
Role: team lead, acoustic ml research
posted by kaibutsu (5 comments total) 5 users marked this as a favorite

Amazing! I listened to a few examples and they all seemed to be good matches (but I'm no expert). Does the system ever get it wrong?
posted by mpark at 3:46 PM on December 1, 2023

Thanks for trying it out!

It turns out that one-shot retrieval is a really hard task, so yes, especially when you get into hard examples.

However! We're aiming to make interaction with the system a bit more like how you normally use search: if you get bad answers, it might be worthwhile to try a different question. It may be easier to surface more distinctive call types and focus on those, or it might work better if you find a cleaner audio clip to query with. Part of the idea is to put the power to iterate in the hands of the user, instead of making them wait for model updates from one of a handful of ML engineers (who also generally don't have the appropriate domain knowledge to know exactly what 'better' looks like). (That said, we will keep making the model better!)
posted by kaibutsu at 6:24 PM on December 1, 2023 [2 favorites]

This is very cool, thanks!
posted by tiny frying pan at 10:50 AM on December 12, 2023

Is there a role for human evaluation (e.g., crowdsourced listening & tagging), or has the software been accurate enough?

This is cool!
posted by wenestvedt at 6:43 AM on January 30 [1 favorite]

There's a lot of need for human experts! Australian species are very underrepresented in every database of bird vocalisations; a big part of the idea here is to make it easier to collect a range of samples for a wide range of species, starting from a handful of examples and some expert knowledge.

There's really no end to bioacoustics questions. There are 10k species of birds, many with a wide range of different calls, as well as geographic or individual variation. Calls relate to behavior, so as we nail down species identification, we start getting into questions about call types, which are often not well annotated or consistently tracked. A good example is juvenile calls: they can help tell if a population is reproducing successfully, but the difference between adult and juvenile calls is entirely in the heads of experts... Until someone has good enough reason to train a classifier for their particular species of interest.
posted by kaibutsu at 10:08 PM on February 4 [1 favorite]
