v1.2.0
New features
- Datasets
  - Added a new dataset geolifeclef2024_pre_extracted following the 2024 edition of the GeoLifeCLEF Kaggle challenge
    - Computed rolling mean and rolling std values of the GeoLifeCLEF2024 dataset for each modality. These values are stored in this dataset's transform functions
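As an illustration of the rolling mean and std mentioned above, here is a minimal sketch of one common way to accumulate such statistics without loading a whole modality into memory (Welford's online method). The release notes do not state how the values were actually computed, so the algorithm choice, the `RunningStats` class, and the modality names below are assumptions made for the example only.

```python
import math


class RunningStats:
    """Accumulate mean and standard deviation one value at a time (Welford's method)."""

    def __init__(self):
        self.count = 0
        self.mean = 0.0
        self._m2 = 0.0  # running sum of squared deviations from the current mean

    def update(self, value):
        self.count += 1
        delta = value - self.mean
        self.mean += delta / self.count
        self._m2 += delta * (value - self.mean)

    @property
    def std(self):
        # Population standard deviation; 0.0 until at least one value is seen.
        return math.sqrt(self._m2 / self.count) if self.count else 0.0


# Hypothetical modality names and toy values, for illustration only.
batches = {
    "modality_a": [[1.0, 2.0], [3.0, 4.0]],
    "modality_b": [[10.0], [20.0, 30.0]],
}

stats = {name: RunningStats() for name in batches}
for name, chunks in batches.items():
    for chunk in chunks:  # each chunk stands in for one file or batch
        for value in chunk:
            stats[name].update(value)

for name, s in stats.items():
    print(name, round(s.mean, 4), round(s.std, 4))
```

Statistics obtained this way could then be passed to a normalization step in a transform function, which is one plausible reading of how the dataset uses them.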
- Models
  - Added a new model "MultimodalEnsemble" in geolifeclef2024_multimodal_ensemble based on @picekl's work on GeoLifeCLEF2024
- Scripts
  - Added new scripts:
    - split_obs_spatially.py: splits a CSV observation dataset into a training and a validation subset where validation observation plots are spatially separated from training ones. This script uses the newly added verde package.
    - sort_files_glc_fashion.sh: re-organizes the files of one folder into folders and sub-folders in the same way as for the GeoLifeCLEF challenge. A file named 'ABCDWXYZ.pt' located at 'root_path/' will be moved to 'root_path/YZ/WX/ABCDWXYZ.pt'. Each file name must be at least 3 characters long. For instance, a file named 'XYZ.pt' located at 'root_path/' will be moved to 'root_path/YZ/X/XYZ.pt'.
    - split_obs_per_species_frequency.py: splits a CSV observation dataset into a training and a validation subset based on species frequency
  - Added split_obs_spatially.py and split_obs_per_species_frequency.py scripts to Malpolon as modules in malpolon.data.utils
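The folder layout produced by sort_files_glc_fashion.sh can be sketched in Python as follows. This is not the shell script itself: the function name `glc_destination` is hypothetical, and the sketch assumes the 3-character minimum applies to the file name without its extension. It only computes the destination path and moves nothing.

```python
from pathlib import Path


def glc_destination(root_path, file_name):
    """Return the GeoLifeCLEF-style destination path for a file (illustrative helper)."""
    stem = Path(file_name).stem  # file name without its extension
    if len(stem) < 3:
        raise ValueError(f"'{file_name}' must be at least 3 characters long")
    last_two = stem[-2:]      # first-level folder, e.g. 'YZ'
    preceding = stem[-4:-2]   # second-level folder, e.g. 'WX' (or 'X' for a 3-character name)
    return Path(root_path) / last_two / preceding / file_name


for name in ("ABCDWXYZ.pt", "XYZ.pt"):
    print(name, "->", glc_destination("root_path", name).as_posix())
```

Separating the path computation from the actual move makes it easy to preview the resulting tree before touching any files.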
Changes
- Renamed scripts folder to toolbox
- Renamed scenarios from {"Ecologists", "Inference", "Kaggle"} to {"Custom_train", "Inference", "Benchmarks"} and re-organized experiments
- Fixed examples-related bugs, file links, duplicate files and cleaned config files
- Updated code documentation, repository READMEs and examples tutorial files