bcAdmin 4 call identification using CoreML

22 February 2024

An ongoing project that started some years ago is the new version of batIdent, with additional bat calls and new measurements. Since the end of 2023, a first version of a classification algorithm that includes the new calls has been available. We want to give you some insight into how it was developed, so that you better understand the basis of automated species identification. This will include various descriptions of classification accuracy and quality. The new algorithm is available directly in bcAdmin4, and users whose license includes the year 2024 can make use of it.

Currently we are still evaluating and improving the classifiers for the final release.

Reference database

For roughly 15 years we have maintained our own database of reference recordings. For this purpose we even developed a highly specialised software tool called bcRefCalls. Now in its third major revision, it offers many custom filters and call type classification tools, which together help to choose the optimal set of calls for species identification. It includes many graphs and plots for checking that the data cover call variability well. In addition, we examine each call for completeness of the recording and quality of the measurements. Only the best calls are marked as good calls, while others are marked as bad. Many calls are not marked at all because of very poor quality. When training the classifiers, we test the effect of call quality.

Each call in training was examined and marked as good or bad.

The new classifiers proved to be stable when trained on both good and bad calls, which results in a classifier better suited to everyday recordings. When you deploy your detector, you will usually find many calls of lesser quality. To give the model a chance to classify these as well, we decided to include the bad calls in training.
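The choice between training on good calls only and training on good and bad calls can be pictured as a simple filter over the quality marks. The following is a minimal sketch, not the bcRefCalls implementation: the `Call` record, the `select_training_calls` function, and the species labels are hypothetical names used for illustration only.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Call:
    species: str            # placeholder label, not a real species list
    quality: Optional[str]  # "good", "bad", or None if the call was not marked


def select_training_calls(calls: List[Call], include_bad: bool = True) -> List[Call]:
    """Keep marked calls only; unmarked calls are always left out."""
    allowed = {"good", "bad"} if include_bad else {"good"}
    return [c for c in calls if c.quality in allowed]


calls = [
    Call("Species A", "good"),
    Call("Species A", "bad"),
    Call("Species B", None),   # unmarked because of very poor quality
    Call("Species B", "good"),
]

good_only = select_training_calls(calls, include_bad=False)
good_and_bad = select_training_calls(calls, include_bad=True)
print(len(good_only), len(good_and_bad))
```

Training one classifier on each selection and comparing the results is one way to test the effect of call quality.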

Validation recordings

The common procedure when training a model is that the software sets aside a random subset of recordings for cross-validation. Because these calls come from the same database as the training set, and thus from the same recordings, an error may be introduced into the accuracy estimate. At present there is no publicly available dataset of unrelated recordings for evaluation. We therefore decided to compile a set of 260 recordings from our database, mostly unrelated to the reference recordings, and to use them for an extra evaluation outside the training process.
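To illustrate why the origin of the evaluation calls matters, the sketch below holds out whole recordings instead of individual calls, so that no recording contributes calls to both sides of the split. It is an illustration under stated assumptions and not the procedure used for the classifiers described here: `split_by_recording`, the tuple layout, and the recording IDs are hypothetical.

```python
import random
from collections import defaultdict
from typing import List, Tuple

CallRow = Tuple[str, str]  # (recording_id, species_label), both hypothetical


def split_by_recording(calls: List[CallRow], holdout_fraction: float = 0.2,
                       seed: int = 0) -> Tuple[List[CallRow], List[CallRow]]:
    """Split calls so that each recording ends up entirely on one side."""
    by_recording = defaultdict(list)
    for call in calls:
        by_recording[call[0]].append(call)

    # Shuffle recording IDs, not calls, with a fixed seed for repeatability.
    recording_ids = sorted(by_recording)
    random.Random(seed).shuffle(recording_ids)

    n_holdout = max(1, round(len(recording_ids) * holdout_fraction))
    holdout_ids = recording_ids[:n_holdout]
    train_ids = recording_ids[n_holdout:]

    train = [c for rid in train_ids for c in by_recording[rid]]
    holdout = [c for rid in holdout_ids for c in by_recording[rid]]
    return train, holdout


# Ten made-up recordings with five calls each.
calls = [(f"rec{r:02d}", "Species A") for r in range(10) for _ in range(5)]
train, holdout = split_by_recording(calls)

train_recordings = {rid for rid, _ in train}
holdout_recordings = {rid for rid, _ in holdout}
print(len(train), len(holdout), train_recordings & holdout_recordings)
```

A call-level random split would instead scatter calls from the same recording across both sets, which is the situation the separate validation set is meant to avoid.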

We released our validation recordings to the public for private use

Since no datasets of European bat call recordings exist for the purpose of validating different identification tools, we decided to release our set to the public. These recordings are maintained alongside the reference calls used for training the classifiers. A selection process was used to compile recordings of acceptable quality. The recordings can be used free of charge for non-commercial tests of call classification software. For users of bcAdmin4, a database is included in the download.

Archive of validation set (445 MB)