Release v1.2.0 · plantnet/malpolon

New features

Datasets
- Added a new dataset, geolifeclef2024_pre_extracted, for the 2024 edition of the GeoLifeCLEF Kaggle challenge
  - Computed the rolling mean and rolling std of the GeoLifeCLEF2024 dataset for each modality. These values are stored in the dataset's transform functions
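As an illustration of how a mean and std can be accumulated over a dataset too large to load at once, here is a minimal sketch using Welford's online algorithm. It assumes that "rolling" here means "updated incrementally, one value at a time"; the `RunningStats` class is hypothetical and is not Malpolon's implementation.

```python
import math


class RunningStats:
    """Hypothetical helper: incremental mean/std via Welford's algorithm."""

    def __init__(self) -> None:
        self.count = 0
        self.mean = 0.0
        self._m2 = 0.0  # running sum of squared deviations from the mean

    def update(self, value: float) -> None:
        """Fold one new value into the running statistics."""
        self.count += 1
        delta = value - self.mean
        self.mean += delta / self.count
        self._m2 += delta * (value - self.mean)

    @property
    def std(self) -> float:
        """Population standard deviation of the values seen so far."""
        return math.sqrt(self._m2 / self.count) if self.count else 0.0


# Toy values standing in for one modality's pixel values (made-up data).
stats = RunningStats()
for v in [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]:
    stats.update(v)

print(stats.mean, stats.std)
```

In practice one such accumulator would be kept per modality (and possibly per channel), and the resulting values would be passed to a normalization transform.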
Models
- Added a new model, "MultimodalEnsemble", in geolifeclef2024_multimodal_ensemble, based on @picekl's work on GeoLifeCLEF2024
Scripts
- Added new scripts split_obs_spatially.py, sort_files_glc_fashion.sh and split_obs_per_species_frequency.py
  - split_obs_spatially.py: splits a CSV observation dataset into a training subset and a val subset whose observation plots are spatially separated from the training ones. This script uses the newly added verde package.
  - sort_files_glc_fashion.sh: re-organizes the files of one folder into folders and sub-folders in the same way as for the GeoLifeCLEF challenge.

    A file named 'ABCDWXYZ.pt' located at 'root_path/' will be moved to
    'root_path/YZ/WX/ABCDWXYZ.pt'.

    Each file name must be at least 3 characters long. For instance,
    a file named 'XYZ.pt' located at 'root_path/' will be moved to
    'root_path/YZ/X/XYZ.pt'.
  - split_obs_per_species_frequency.py: splits a CSV observation dataset into a training and a val subset based on species frequency
- Added split_obs_spatially.py and split_obs_per_species_frequency.py scripts to Malpolon as modules in malpolon.data.utils
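
The folder layout produced by sort_files_glc_fashion.sh can be sketched in Python as follows. This is only an illustration of the naming rule described above, not the shell script itself: the function name `glc_destination` is hypothetical, and the sketch assumes the 3-character minimum applies to the file name without its extension.

```python
from pathlib import PurePosixPath


def glc_destination(root: str, filename: str) -> PurePosixPath:
    """Return the GeoLifeCLEF-style destination of a file located in `root`."""
    stem = PurePosixPath(filename).stem  # file name without its extension
    if len(stem) < 3:
        # Assumption: the length rule is checked on the name without extension.
        raise ValueError("file name must be at least 3 characters long")
    first_level = stem[-2:]     # last two characters, e.g. 'YZ'
    second_level = stem[-4:-2]  # the two before them, e.g. 'WX' ('X' for 'XYZ')
    return PurePosixPath(root) / first_level / second_level / filename


print(glc_destination("root_path", "ABCDWXYZ.pt"))  # root_path/YZ/WX/ABCDWXYZ.pt
print(glc_destination("root_path", "XYZ.pt"))       # root_path/YZ/X/XYZ.pt
```

The function only computes the target path; actually moving the files (for example with `shutil.move` after creating the parent folders) is left out of the sketch.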

Changes

- Renamed the scripts folder to toolbox
- Renamed scenarios from {"Ecologists", "Inference", "Kaggle"} to {"Custom_train", "Inference", "Benchmarks"} and re-organized experiments
- Fixed examples-related bugs, file links and duplicate files, and cleaned config files
- Updated code documentation, repository READMEs and examples tutorial files
