Merge branch 'main' of github.com:NINAnor/rare_species_detections into main

femke-sintef committed Jan 10, 2024
2 parents 4018981 + f24030e · commit aa7e360

Showing 7 changed files with 99 additions and 72 deletions.
2 changes: 1 addition & 1 deletion BEATs_on_ESC50/Dockerfile
@@ -1,7 +1,7 @@
FROM busybox AS model

# Create a folder in which to store the BEATs checkpoints
ADD "https://valle.blob.core.windows.net/share/BEATs/BEATs_iter3_plus_AS2M.pt?sv=2020-08-04&st=2023-03-01T07%3A51%3A05Z&se=2033-03-02T07%3A51%3A00Z&sr=c&sp=rl&sig=QJXmSJG9DbMKf48UDIU1MfzIro8HQOf3sqlNXiflY1I%3D" /model/BEATs_iter3_plus_AS2M.pt
#ADD "https://valle.blob.core.windows.net/share/BEATs/BEATs_iter3_plus_AS2M.pt?sv=2020-08-04&st=2023-03-01T07%3A51%3A05Z&se=2033-03-02T07%3A51%3A00Z&sr=c&sp=rl&sig=QJXmSJG9DbMKf48UDIU1MfzIro8HQOf3sqlNXiflY1I%3D" /model/BEATs_iter3_plus_AS2M.pt

FROM busybox AS data

92 changes: 42 additions & 50 deletions README.md
@@ -2,13 +2,11 @@

[![DOI](https://zenodo.org/badge/597046464.svg)](https://zenodo.org/badge/latestdoi/597046464)

:collision: **A PIPELINE FOR FINE-TUNING BEATs ON ESC50 DATASET IS PROVIDED [HERE](https://github.com/NINAnor/rare_species_detections/tree/main/BEATs_on_ESC50)**. The rest of the repository covers training a prototypical network with BEATs as the feature extractor

:arrow_down:
:collision: **A PIPELINE FOR FINE-TUNING BEATs ON ESC50 DATASET IS PROVIDED [HERE](https://github.com/NINAnor/rare_species_detections/tree/main/BEATs_on_ESC50)**. The rest of the repository covers training a prototypical network with BEATs as the feature extractor :collision:

**Few-shot learning is a highly promising paradigm for sound event detection. It is also an extremely good fit for the needs of users in bioacoustics, where increasingly large acoustic datasets commonly need to be labelled for events of an identified category** (e.g. species or call-type), even though this category might not be known in other datasets or have any yet-known label. While satisfying user needs, this will also benchmark few-shot learning for the wider domain of sound event detection (SED).

<p align="center"><img src="images/VM.png" alt="figure" width="400" height="400"/></p>
<p align="center"><img src="images/VM.png" alt="figure" width="300" height="300"/></p>

**Few-shot learning describes tasks in which an algorithm must make predictions given only a few instances of each class, contrary to the standard supervised learning paradigm.** The main objective is to find reliable algorithms that are capable of dealing with data sparsity, class imbalance and noisy/busy environments. Few-shot learning is usually studied using N-way-K-shot classification, where N denotes the number of classes and K the number of examples for each class.
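To make the episodic setup concrete, here is a minimal sketch (not the sampler actually used in this repository) of how one N-way-K-shot episode can be drawn from a pool of labelled clips; `clips_by_class` is a hypothetical mapping from class label to a list of clips:

```python
import random

def sample_episode(clips_by_class, n_way=5, k_shot=5, n_query=5):
    """Draw one N-way-K-shot episode from a {label: [clip, ...]} mapping.

    Hypothetical structure for illustration; each sampled class must
    have at least k_shot + n_query clips available.
    """
    support, query = {}, {}
    for label in random.sample(sorted(clips_by_class), n_way):
        chosen = random.sample(clips_by_class[label], k_shot + n_query)
        support[label] = chosen[:k_shot]  # the K labelled examples per class
        query[label] = chosen[k_shot:]    # held-out clips the model must classify
    return support, query
```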

@@ -32,16 +30,9 @@ We have made a small wrapper to download the DCASE data and the BEATs model. Onl
```bash
./dcase_setup.sh /BASE/FOLDER/
```

The script should create a `DCASE` folder containing all the [DCASE data (i.e. Development and Evaluation set)](https://dcase.community/challenge2023/task-few-shot-bioacoustic-event-detection#validation-set) and a `BEATs` folder containing the [model weights](https://github.com/microsoft/unilm/tree/master/beats) in the specified base folder.

Once the necessary files have been downloaded, you can either pull the Docker image and rename it:

```bash
docker pull ghcr.io/ninanor/rare_species_detections:main
docker tag ghcr.io/ninanor/rare_species_detections:main beats
```
The script should create a `DCASE` folder containing the [DCASE Development Set (i.e. Training and Validation set)](https://dcase.community/challenge2023/task-few-shot-bioacoustic-event-detection#validation-set) and a `BEATs` folder containing the [model weights](https://github.com/microsoft/unilm/tree/master/beats) in the specified base folder.

Or create the Docker image from the Dockerfile located in our repository:
Once the necessary files have been downloaded, create the Docker image from the Dockerfile located in our repository:

```bash
git clone https://github.com/NINAnor/rare_species_detections.git
docker build -t beats -f Dockerfile .
```

## Processing the data

First we need to process the DCASE data (i.e. denoising, resampling ... and saving the data as `numpy arrays`). For this we can use:
Because preprocessing takes a long time, we save the preprocessed files as `numpy arrays`. This way we can experiment with the pipeline without constantly re-processing the data. To run the preprocessing step use:

```bash
./preprocess_data.sh /BASE/FOLDER
```

The script will create a new folder `DCASEfewshot` containing three subfolders (`train`, `validate` and `evaluate`). Each of these folders contains the processed data in the form of `numpy arrays`.
The script will create a new folder `DCASEfewshot` containing three subfolders (`train`, `validate` and `evaluate`). Each of these folders **contains a subfolder whose name is a hash computed from the processing parameters**. The processed data are stored there in the form of `numpy arrays`.

:black_nib: You can change the parameters for preprocessing the data in the [CONFIG.yaml file](/CONFIG.yaml)

:black_nib: Note that to create the `numpy arrays` for `train`, `validate` and `evaluate` you need to change the [CONFIG.yaml file](/CONFIG.yaml) at each iteration.
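The hash-based folder naming works roughly as in the sketch below (the parameter dictionary mirrors the `my_hash_dict` built in `evaluate/evaluateDCASE.py`; the specific hash function and the full parameter list here are assumptions, not the repository's exact code):

```python
import hashlib

def preprocessed_folder_name(cfg):
    """Sketch: derive a folder name from the preprocessing parameters,
    so each parameter combination gets its own cached output directory."""
    my_hash_dict = {
        "resample": cfg["data"]["resample"],
        # ... the remaining preprocessing parameters from CONFIG.yaml go here
    }
    return hashlib.sha1(repr(sorted(my_hash_dict.items())).encode()).hexdigest()
```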

## Train the model

It is now possible to train the network using `prototypicalbeats/trainer.py`:
Now that the data have been preprocessed into `numpy arrays`, you can use them as model input and train the model with `train_model.sh`:

```bash
./train_model.sh /BASE/FOLDER
```

The training script should create a log folder (`lightning_logs/`) in the base folder, in which the model weights (`version_X/checkpoints/*.ckpt`) and the training configuration (`version_X/checkpoints/config.yaml`) are stored.
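For convenience, the most recent checkpoint can be located with a few lines of Python (a sketch assuming the layout described above; this is not a repository utility):

```python
from pathlib import Path

# Assumes the layout described above: lightning_logs/version_X/checkpoints/*.ckpt
checkpoints = sorted(
    Path("/BASE/FOLDER/lightning_logs").glob("version_*/checkpoints/*.ckpt"),
    key=lambda p: p.stat().st_mtime,  # newest checkpoint last
)
print(checkpoints[-1] if checkpoints else "no checkpoint found yet")
```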

:black_nib: You can change the parameters for training the model in the [CONFIG.yaml file](/CONFIG.yaml)

## Using the model on the Validation / Evaluation dataset

To run the prediction use the script `test_model.sh`. Note that the `CONFIG.yaml` file needs to be updated. In particular you will need to change the `model_path`, `status` (either `test` or `validate`) and `set_type` (`Validation_Set` or `Evaluation_Set`)
:black_nib: Update the `status` parameter in the [CONFIG.yaml file](/CONFIG.yaml) to match the dataset you want to run the model on: either **validate** or **evaluate**.

:black_nib: Also update the `model_path` in the [CONFIG.yaml file](/CONFIG.yaml) to the checkpoint (`.ckpt`) trained in the previous step (stored in `lightning_logs`)
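If you prefer to script these edits, something along these lines works (a sketch assuming PyYAML is installed; where exactly `model_path` sits inside `CONFIG.yaml` is an assumption — check your copy of the file):

```python
import yaml  # PyYAML, assumed available

with open("CONFIG.yaml") as f:
    cfg = yaml.safe_load(f)

cfg["data"]["status"] = "validate"  # or "evaluate"
# Hypothetical key placement and file name -- adjust to your CONFIG.yaml:
cfg["model_path"] = "/BASE/FOLDER/lightning_logs/version_0/checkpoints/epoch=99.ckpt"

with open("CONFIG.yaml", "w") as f:
    yaml.safe_dump(cfg, f)
```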

To run the prediction use the script `test_model.sh`:

```bash
./test_model.sh /BASE/FOLDER
```

`test_model.sh` creates a result file `eval_out.csv` in the `BASE/FOLDER` containing all the detections made by the model.
`test_model.sh` creates a result file `eval_out.csv` in the folder containing the processed `validation` data. **The full path is printed in the console.**

Note that there are other advanced options. For instance, if `--wav_save` is specified, the script will also write, for each input file, a `.wav` file with additional channels: the ground truth labels, the predicted labels, the distance to the POS prototype and finally the p-values. The `.wav` file can be opened in [Audacity](https://www.audacityteam.org/) for closer inspection.
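A quick way to inspect those channels programmatically is sketched below (assuming the `soundfile` package; the file name and channel order here are illustrative, not guaranteed by the script):

```python
import soundfile as sf  # assumes the soundfile package is installed

# Hypothetical output file name; the actual name matches the input recording
audio, sr = sf.read("example_prediction.wav")
print(audio.shape, sr)  # (n_samples, n_channels)
# Assumed channel order, following the description above: audio,
# ground-truth labels, predicted labels, distance to POS prototype, p-values
```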

@@ -86,42 +87,33 @@
Once the `eval_out.csv` file has been created, it is possible to compute the results for our approach. Note that the metrics can only be computed for the `Validation_Set`, as it contains all ground truth labels, as opposed to the `Evaluation_Set`, for which only the first 5 samples of the POS class are labelled.

```bash
docker run -v $CODE_DIR:/app \
-v $DATA_DIR:/data \
--gpus all \
beats \
poetry run python evaluation/evaluation_metrics/evaluation.py \
-pred_file /data/eval_out.csv \
-ref_files_path /data/DCASE/Development_Set_annotations/Validation_Set \
-team_name BEATs \
-dataset VAL \
-savepath /data/.
./compute_metrics.sh /BASE/FOLDER /PATH/TO/eval_out.csv
```

The results we obtained:
Here are the results we obtain with the pipeline described in our [Technical Report](https://dcase.community/documents/challenge2023/technical_reports/DCASE2023_Gelderblom_SINTEF_t5.pdf):

```
Evaluation for: BEATs VAL
BUK1_20181011_001004.wav {'TP': 15, 'FP': 35, 'FN': 16, 'total_n_pos_events': 31}
BUK1_20181013_023504.wav {'TP': 2, 'FP': 258, 'FN': 22, 'total_n_pos_events': 24}
BUK4_20161011_000804.wav {'TP': 1, 'FP': 30, 'FN': 46, 'total_n_pos_events': 47}
BUK4_20171022_004304a.wav {'TP': 7, 'FP': 17, 'FN': 10, 'total_n_pos_events': 17}
BUK5_20161101_002104a.wav {'TP': 31, 'FP': 7, 'FN': 57, 'total_n_pos_events': 88}
BUK5_20180921_015906a.wav {'TP': 4, 'FP': 24, 'FN': 19, 'total_n_pos_events': 23}
ME1.wav {'TP': 9, 'FP': 18, 'FN': 2, 'total_n_pos_events': 11}
ME2.wav {'TP': 41, 'FP': 27, 'FN': 0, 'total_n_pos_events': 41}
R4_cleaned recording_13-10-17.wav {'TP': 19, 'FP': 14, 'FN': 0, 'total_n_pos_events': 19}
R4_cleaned recording_16-10-17.wav {'TP': 30, 'FP': 8, 'FN': 0, 'total_n_pos_events': 30}
R4_cleaned recording_17-10-17.wav {'TP': 36, 'FP': 9, 'FN': 0, 'total_n_pos_events': 36}
R4_cleaned recording_TEL_19-10-17.wav {'TP': 52, 'FP': 12, 'FN': 2, 'total_n_pos_events': 54}
R4_cleaned recording_TEL_20-10-17.wav {'TP': 64, 'FP': 8, 'FN': 0, 'total_n_pos_events': 64}
R4_cleaned recording_TEL_23-10-17.wav {'TP': 84, 'FP': 8, 'FN': 0, 'total_n_pos_events': 84}
R4_cleaned recording_TEL_24-10-17.wav {'TP': 99, 'FP': 14, 'FN': 0, 'total_n_pos_events': 99}
R4_cleaned recording_TEL_25-10-17.wav {'TP': 99, 'FP': 9, 'FN': 0, 'total_n_pos_events': 99}
file_423_487.wav {'TP': 57, 'FP': 13, 'FN': 0, 'total_n_pos_events': 57}
file_97_113.wav {'TP': 11, 'FP': 27, 'FN': 109, 'total_n_pos_events': 120}
Overall_scores: {'precision': 0.2911279078300433, 'recall': 0.4938446186969832, 'fmeasure (percentage)': 36.631}
Evaluation for: TeamBEATs VAL
BUK1_20181011_001004.wav {'TP': 13, 'FP': 22, 'FN': 18, 'total_n_pos_events': 31}
BUK1_20181013_023504.wav {'TP': 3, 'FP': 206, 'FN': 21, 'total_n_pos_events': 24}
BUK4_20161011_000804.wav {'TP': 1, 'FP': 22, 'FN': 46, 'total_n_pos_events': 47}
BUK4_20171022_004304a.wav {'TP': 6, 'FP': 15, 'FN': 11, 'total_n_pos_events': 17}
BUK5_20161101_002104a.wav {'TP': 39, 'FP': 7, 'FN': 49, 'total_n_pos_events': 88}
BUK5_20180921_015906a.wav {'TP': 4, 'FP': 9, 'FN': 19, 'total_n_pos_events': 23}
ME1.wav {'TP': 10, 'FP': 21, 'FN': 1, 'total_n_pos_events': 11}
ME2.wav {'TP': 41, 'FP': 35, 'FN': 0, 'total_n_pos_events': 41}
R4_cleaned recording_13-10-17.wav {'TP': 19, 'FP': 23, 'FN': 0, 'total_n_pos_events': 19}
R4_cleaned recording_16-10-17.wav {'TP': 30, 'FP': 9, 'FN': 0, 'total_n_pos_events': 30}
R4_cleaned recording_17-10-17.wav {'TP': 36, 'FP': 6, 'FN': 0, 'total_n_pos_events': 36}
R4_cleaned recording_TEL_19-10-17.wav {'TP': 52, 'FP': 29, 'FN': 2, 'total_n_pos_events': 54}
R4_cleaned recording_TEL_20-10-17.wav {'TP': 64, 'FP': 10, 'FN': 0, 'total_n_pos_events': 64}
R4_cleaned recording_TEL_23-10-17.wav {'TP': 84, 'FP': 5, 'FN': 0, 'total_n_pos_events': 84}
R4_cleaned recording_TEL_24-10-17.wav {'TP': 99, 'FP': 13, 'FN': 0, 'total_n_pos_events': 99}
R4_cleaned recording_TEL_25-10-17.wav {'TP': 99, 'FP': 8, 'FN': 0, 'total_n_pos_events': 99}
file_423_487.wav {'TP': 57, 'FP': 7, 'FN': 0, 'total_n_pos_events': 57}
file_97_113.wav {'TP': 11, 'FP': 30, 'FN': 109, 'total_n_pos_events': 120}
Overall_scores: {'precision': 0.348444259075038, 'recall': 0.525770811091538, 'fmeasure (percentage)': 41.912}
```
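For reference, the overall F-measure on the last line follows directly from the pooled precision and recall:

$$
F = \frac{2PR}{P + R} = \frac{2 \times 0.3484 \times 0.5258}{0.3484 + 0.5258} \approx 0.4191,
$$

which matches the `fmeasure (percentage)` of 41.912 above (and, by the same formula, 36.631 for the earlier run).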

## Taking the idea further:
17 changes: 17 additions & 0 deletions compute_metrics.sh
@@ -0,0 +1,17 @@
#!/bin/bash

BASE_FOLDER=$1
EVAL_CSV_PATH=$2
CONFIG_PATH="/app/CONFIG.yaml"

docker run -v $PWD:/app \
-v $BASE_FOLDER:/data \
-v $EVAL_CSV_PATH:/eval_folder/eval_out.csv \
--gpus all \
beats \
poetry run python /app/evaluate/evaluation_metrics/evaluation.py \
-pred_file /eval_folder/eval_out.csv \
-ref_files_path /data/DCASE/Development_Set_annotations/Validation_Set \
-team_name TeamBEATs \
-dataset VAL \
-savepath /data/.
25 changes: 14 additions & 11 deletions dcase_setup.sh
@@ -29,7 +29,7 @@ mkdir -p $TARGET_FOLDER
############################
# Download the BEATs model #
############################
MODEL_FOLDER=$BASE_FOLDER/BEATs
MODEL_FOLDER=$BASE_FOLDER/model/BEATs
mkdir -p $MODEL_FOLDER
wget -O "$MODEL_FOLDER/BEATs_iter3_plus_AS2M.pt" "https://valle.blob.core.windows.net/share/BEATs/BEATs_iter3_plus_AS2M.pt?sv=2020-08-04&st=2023-03-01T07%3A51%3A05Z&se=2033-03-02T07%3A51%3A00Z&sr=c&sp=rl&sig=QJXmSJG9DbMKf48UDIU1MfzIro8HQOf3sqlNXiflY1I%3D"

@@ -47,18 +47,21 @@ download_and_unzip "https://zenodo.org/record/6482837/files/Development_Set_anno
# Acoustic data
download_and_unzip "https://zenodo.org/record/6482837/files/Development_Set.zip?download=1" "$TARGET_FOLDER"

###############################
# Download the evaluation set #
###############################
mkdir -p "$TARGET_FOLDER/Development_Set/Evaluation_Set"
#####################################################################
# Download the evaluation set - OUTDATED AS THIS WAS FOR DCASE 2023 #
#####################################################################

download_and_unzip "https://zenodo.org/record/7879692/files/Annotations_only.zip?download=1" "$TARGET_FOLDER"
mv "$TARGET_FOLDER/Annotations_only" "$TARGET_FOLDER/Development_Set_annotations/Evaluation_Set"


#mkdir -p "$TARGET_FOLDER/Development_Set/Evaluation_Set"

#download_and_unzip "https://zenodo.org/record/7879692/files/Annotations_only.zip?download=1" "$TARGET_FOLDER"
#mv "$TARGET_FOLDER/Annotations_only" "$TARGET_FOLDER/Development_Set_annotations/Evaluation_Set"

# Acoustic data
for i in {1..3}
do
download_and_unzip "https://zenodo.org/record/7879692/files/eval_$i.zip?download=1" "$TARGET_FOLDER/Development_Set/Evaluation_Set"
done
#for i in {1..3}
#do
# download_and_unzip "https://zenodo.org/record/7879692/files/eval_$i.zip?download=1" "$TARGET_FOLDER/Development_Set/Evaluation_Set"
#done


17 changes: 16 additions & 1 deletion evaluate/evaluateDCASE.py
@@ -72,7 +72,12 @@ def train_model(

if pretrained_model:
# Load the pretrained model
pretrained_model = ProtoBEATsModel.load_from_checkpoint(pretrained_model)
try:
pretrained_model = ProtoBEATsModel.load_from_checkpoint(pretrained_model)
except KeyError:
print("Failed to load the pretrained model. Please check the checkpoint file.")
return None


# train the model
trainer.fit(model, datamodule=datamodule_class)
@@ -569,6 +574,16 @@ def write_wav(
training_config_path = os.path.join(version_path, "config.yaml")
version_name = os.path.basename(version_path)

# Select 'set_type' depending on chosen status
if cfg["data"]["status"]=="train":
cfg["data"]["set_type"] = "Training_Set"

elif cfg["data"]["status"]=="validate":
cfg["data"]["set_type"] = "Validation_Set"

else:
cfg["data"]["set_type"] = "Evaluation_Set"

# Get correct paths to dataset
my_hash_dict = {
"resample": cfg["data"]["resample"],
6 changes: 2 additions & 4 deletions preprocess_data.sh
@@ -11,9 +11,7 @@ for SET in "${!SETS[@]}"; do
docker run -v $DATA_DIR:/data \
-v $PWD:/app \
--gpus all \
# -it \
beats \
# bash \
poetry run python /app/data_utils/DCASEfewshot.py --config $CONFIG_PATH \

poetry run python /app/data_utils/DCASEfewshot.py --config $CONFIG_PATH

done
12 changes: 7 additions & 5 deletions train_model.sh
@@ -3,8 +3,10 @@
BASE_FOLDER=$1
CONFIG_PATH="/app/CONFIG.yaml"

docker run -v $BASE_FOLDER:/data \
-v $PWD:/app \
--gpus all \
beats \
poetry run prototypicalbeats/trainer.py fit --config $CONFIG_PATH
# Check if BASE_FOLDER is not set or empty
if [ -z "$BASE_FOLDER" ]; then
echo "Error: BASE_FOLDER is not specified."
exit 1
fi

docker run -v $BASE_FOLDER:/data -v $PWD:/app --gpus all beats poetry run prototypicalbeats/trainer.py fit --config $CONFIG_PATH
