
Releases: microsoft/torchgeo

v0.6.1

10 Oct 18:22
d610ff1

TorchGeo 0.6.1 Release Notes

This is a bugfix release. There are no new features or API changes with respect to the 0.6.0 release.

This release fixes an important security vulnerability and properly documents a lack of support for rasterio 1.4. We recommend that all users of torchgeo.models.get_weight update to TorchGeo 0.6.1.

Dependencies

  • rasterio: 1.4 not yet supported (#2327)

Datamodules

  • Datamodule: use persistent workers for parallel data loading (#2291)
  • OSCD: update normalization statistics (#2282)

Datasets

  • Datasets: add support for os.PathLike (#2273)
  • GeoDataset: allow a mix of str and pathlib paths (#2270)
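
As a minimal sketch of what accepting os.PathLike and a mix of str and pathlib paths can look like, the snippet below normalizes every input to a plain string. The helper name normalize_paths is hypothetical and is not part of TorchGeo; it relies only on os.fspath from the standard library.

```python
from __future__ import annotations

import os
from collections.abc import Iterable
from pathlib import Path


def normalize_paths(
    paths: str | os.PathLike[str] | Iterable[str | os.PathLike[str]],
) -> list[str]:
    """Return plain string paths (hypothetical helper, not TorchGeo API)."""
    # A single path, whether str or PathLike, is wrapped in a list first.
    if isinstance(paths, (str, os.PathLike)):
        paths = [paths]
    # os.fspath converts PathLike objects to str and leaves str unchanged.
    return [os.fspath(p) for p in paths]


# A str and a pathlib.Path can be mixed in the same call.
mixed = normalize_paths(["data/a.tif", Path("data") / "b.tif"])
```

Converting once at the boundary keeps the rest of the code free of isinstance checks.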

Models

  • API: avoid use of eval in get_weight (#2323)
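
To illustrate why avoiding eval matters here, the following sketch resolves a string such as "ResNet18_Weights.SENTINEL2_RGB_MOCO" through an explicit lookup instead of executing it. This is not TorchGeo's actual implementation: the enum below is a stand-in, and get_weight_safely and _REGISTRY are hypothetical names.

```python
from enum import Enum


class ResNet18_Weights(Enum):
    """Stand-in for a weights enum (assumption, not the real class)."""

    SENTINEL2_RGB_MOCO = "resnet18_sentinel2_rgb_moco"


# Only enums listed here can ever be resolved from user input.
_REGISTRY = {ResNet18_Weights.__name__: ResNet18_Weights}


def get_weight_safely(name: str) -> Enum:
    """Resolve 'EnumName.MEMBER' by dictionary lookup, never by eval."""
    enum_name, _, member_name = name.partition(".")
    try:
        # Enum classes support lookup by member name via indexing.
        return _REGISTRY[enum_name][member_name]
    except KeyError as err:
        raise ValueError(f"{name!r} is not a valid weight") from err


weight = get_weight_safely("ResNet18_Weights.SENTINEL2_RGB_MOCO")
```

Because the input is only ever used as a dictionary key, a malicious string raises ValueError instead of running code.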

Tests

  • CD: set up continuous deployment to PyPI (#2342)
  • CI: install tensorboard to speed up notebooks (#2315)
  • CI: install TorchGeo from checked out repo (#2306)
  • dependabot: only update npm lockfile (#2277)
  • prettier: ignore cache directories (#2278)
  • prettier: prefer single quotes (#2280)
  • pytest: set default --cov and --cov-report (#2275)
  • pytest: set matplotlib backend locally too (#2326)
  • pytest: silence numpy 2 warnings in PyTorch (#2302)
  • ruff: remove NPY tests now that we test numpy 2 in CI (#2287)

Documentation

  • Alternatives: add scikit-eo to list of TorchGeo alternatives (#2340)
  • Contributing: installation-agnostic prettier usage (#2279)
  • Datasets: move dataset CSV to subdirectory (#2281, #2304)
  • Datasets: update NAIP resolution (#2325)
  • Tutorials: fix NAIP downloads by signing URL (#2343)
  • Tutorials: update recommended strategy for raster datasets containing images and masks (#2293)

Contributors

This release is thanks to the following contributors:

@adamjstewart
@calebrob6
@MathiasBaumgartinger
@Nowosad
@sfalkena

v0.6.0

01 Sep 10:12
7500ee2

TorchGeo 0.6.0 Release Notes

TorchGeo 0.6 adds 18 new datasets, 15 new datamodules, and 27 new pre-trained models, encompassing 11 months of hard work by 23 contributors from around the world.

Highlights of this release

Multimodal foundation models

[Figure: Diagram of a unified multimodal Earth foundation model]

There are thousands of Earth observation satellites orbiting the Earth at any given time. Historically, in order to use one of these satellites in a deep learning pipeline, you would first need to collect millions of manually-labeled images from this sensor in order to train a model. Self-supervised learning enabled label-free pre-training, but still required millions of diverse sensor-specific images, making it difficult to use newly launched or expensive commercial satellites.

TorchGeo 0.6 adds multiple new multimodal foundation models capable of being used with imagery from any satellite/sensor, even ones the model was not explicitly trained on. While GASSL and Scale-MAE only support RGB images, DOFA supports RGB, SAR, MSI, and HSI with any number of spectral bands. It uses a novel wavelength-based encoder to map the spectral wavelength of each band to a known range of wavelengths seen during training.

The following table describes the dynamic spatial (resolution), temporal (time span), and/or spectral (wavelength) support, either via their training data (implicit) or via their model architecture (explicit), offered by each of these models:

Model      Spatial   Temporal  Spectral
DOFA       implicit  -         explicit
GASSL      implicit  -         -
Scale-MAE  explicit  -         -

TorchGeo 0.6 also adds multiple new unimodal foundation models, including DeCUR and SatlasPretrain.

Source Cooperative migration

[Figure: Migration from Radiant MLHub to Source Cooperative]

TorchGeo contains a number of datasets from the recently defunct Radiant MLHub:

These datasets were recently migrated to Source Cooperative (and AWS in the case of SpaceNet), but with a completely different file format and directory structure. It took a lot of effort, but we have finally ported all of these datasets to the new download location and file hierarchy. As an added bonus, the new data loader code is significantly simpler, allowing us to remove 2.5K lines of code in the process!

OSGeo community project

TorchGeo is now officially a member of the OSGeo community! OSGeo is a not-for-profit foundation for open source geospatial software, providing financial, organizational, and legal support. We are in good company, with other OSGeo projects including GDAL, PROJ, GEOS, QGIS, and PostGIS. Membership in OSGeo helps promote TorchGeo to the community, and also ensures that we follow best practices for the stability, health, and interoperability of the open source geospatial ecosystem.

All TorchGeo users are encouraged to join us on Slack, join our Hugging Face organization, and join us in OSGeo using any of the following badges in our README:

slack
huggingface
osgeo

Lightning Studios support

TorchGeo has always had a close collaboration with Lightning AI, including active contributions to PyTorch Lightning and TorchMetrics. In this release, we added buttons allowing users to launch our tutorial notebooks in the new Lightning Studios platform. Lightning Studios is a more powerful version of Google Colab, with reproducible software and data environments allowing you to pick up where you left off, VS Code and terminal support, and the ability to quickly scale up to a large number of GPUs. All TorchGeo tutorials have been confirmed to work in both Lightning Studios and Google Colab, allowing users to get started with TorchGeo without having to invest in their own hardware.

Backwards-incompatible changes

  • All Radiant MLHub datasets have been ported to the Source Cooperative file hierarchy (#1830)
  • GeoDataset: the bbox sample key was renamed to bounds in order to support Kornia (#2199)
  • Chesapeake7 and Chesapeake13: datasets were removed when updating to the 2022 edition (#2214)
  • Benin Cashews and Rwanda Field Boundary: remove os.path.expanduser for consistency (#1705)
  • LEVIR-CD and OSCD: images key was split into image1 and image2 for change detection (#1684, #1696)
  • EuroSAT: B08A was renamed to B8A to match Sentinel-2 (#1646)
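
Code written against older releases may still read the bbox key from GeoDataset samples. The snippet below is a minimal migration sketch, assuming samples are plain dictionaries; migrate_sample is a hypothetical helper and not part of TorchGeo.

```python
from typing import Any


def migrate_sample(sample: dict[str, Any]) -> dict[str, Any]:
    """Return a copy of a sample with the old `bbox` key renamed to `bounds`."""
    migrated = dict(sample)  # shallow copy so the input is left untouched
    if "bbox" in migrated and "bounds" not in migrated:
        migrated["bounds"] = migrated.pop("bbox")
    return migrated


old_sample = {"image": "pixels", "bbox": (0, 1, 0, 1)}
new_sample = migrate_sample(old_sample)
```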

Dependencies

New (optional) dependencies

  • aws-cli: to download datasets from AWS (#2203)
  • azcopy: to download datasets from Azure (#2064)
  • prettier: for YAML file formatting (#2018)
  • ruff: for code style and documentation testing (#1994)

Removed (optional) dependencies

  • radiant-mlhub: website no longer exists (#1830)
  • rarfile: datasets rehosted as zip files (#2210)
  • zipfile-deflate: no longer needed for newer Chesapeake data (#2214)
  • black: replaced by ruff (#1994)
  • flake8: replaced by ruff (#1994)
  • isort: replaced by ruff (#1994)
  • pydocstyle: replaced by ruff (#1994)
  • pyupgrade: replaced by ruff (#1994)

Changes to existing dependencies

  • python: 3.10+ required following SPEC 0 (#1966)
  • fiona: 1.8.21+ required (#1966)
  • kornia: 0.7.3+ required (#1979, #2144)
  • lightly: 1.4.5+ required (#2196)
  • lightning: 2.3 not supported due to bug (#2155, #2211)
  • matplotlib: 3.5+ required (#1966)
  • numpy: 1.21.2+ required (#1966), numpy 2 support added (#2151)
  • pandas: 1.3.3+ required (#1966)
  • pillow: 3.3+ required (#1966), jpeg2000 support required (#2209)
  • pyproj: 3.3+ required (#1966)
  • rasterio: 1.3+ required (#1966)
  • shapely: 1.8+ required (#1966)
  • torch: 1.13+ required (#1358)
  • torchvision: 0.14+ required (#1358)
  • h5py: 3.6+ required (#1966)
  • opencv: 4.5.4+ required (#1966)
  • pycocotools: 2.0.7+ required (#1966)
  • scikit-image: 0.19+ required (#1966)
  • scipy: 1.7.2+ required (#1966)

Datamodules

New datamodules

Changes to existing datamodules

  • Remove torchgeo.datamodules.utils.dataset_split (#2005)
  • EuroSAT: make sure normalization is actually applied (#2176)

Changes to existing base classes

  • Fix plotting in datamodules when dataset is a subset (#2003)

Datasets

New datasets

Changes to existing datasets

  • Benin Cashews: migrate to Source Cooperative (#2116)
  • Benin Cashews: remove os.path.expanduser for consistency (#1705)
  • BigEarthNet: fix broken download link (#2174)
  • CDL: add 2023 checksum (#1844)
  • Chesapeake: update to 2022 edition (#2214)
  • ChesapeakeCVPR: reuse NLCD colormap (#1690)
  • Cloud Cover: migrate to Source Cooperative (#2117)
  • CV4A Kenya Crop Type: migrate to Source Cooperative (#2090)
  • EuroSAT: rename B08A to B8A to match Sentinel-2 (#1646)
  • FireRisk: redistribute on Hugging Face (#2000)
  • GlobBiomass: add min/max timestamp ...

v0.5.2

03 Mar 19:51

TorchGeo 0.5.2 Release Notes

This is a bugfix release. There are no new features or API changes with respect to the 0.5.1 release.

This release contains a number of important fixes to reproducibility and determinism. We recommend that all users who want to ensure the reproducibility of their work upgrade to 0.5.2.

TorchGeo has always supported Python 3.12, but this is now officially tested!

Dependencies

  • Test TorchGeo support for Python 3.12 (#1837)
  • lightly 1.4.26 is incompatible with smp (#1824, #1825)
  • Add dev container to support GitHub Codespaces development (#1085)

Datamodules

  • L7 Irish previously used a nondeterministic train/val/test split. This is now fixed (#1899, #1908)
  • L8 Biome previously used a nondeterministic train/val/test split. This is now fixed (#1899, #1908)
  • Tropical Cyclone previously used a nondeterministic train/val/test split. This is now fixed (#1839)
  • SEN12MS previously used a nondeterministic train/val/test split. This is now fixed (#1839)
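
One common way to make a train/val/test split reproducible is to sort the items and shuffle them with a seeded random number generator. The sketch below shows the idea using only the standard library; it is not necessarily how these datamodules were fixed, and deterministic_split is a hypothetical name.

```python
import random


def deterministic_split(items, fractions=(0.8, 0.1, 0.1), seed=0):
    """Split items into train/val/test lists, reproducibly for a given seed."""
    # Sorting first makes the result independent of the input order,
    # e.g. the order in which files happen to be listed on disk.
    shuffled = sorted(items)
    # A private, seeded generator avoids depending on global random state.
    random.Random(seed).shuffle(shuffled)
    n_train = int(fractions[0] * len(shuffled))
    n_val = int(fractions[1] * len(shuffled))
    train = shuffled[:n_train]
    val = shuffled[n_train : n_train + n_val]
    test = shuffled[n_train + n_val :]
    return train, val, test


scenes = [f"scene_{i:02d}" for i in range(10)]
train, val, test = deterministic_split(scenes)
```

Calling the function again with the same seed, even on a reordered input, yields the same three lists.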

Datasets

  • RasterDataset: clarify documentation of is_image and dtype (#1811)
  • GeoDataset previously used a nondeterministic train/val/test split. This is now fixed (#1899, #1908)
  • xView2 previously used a nondeterministic order. This is now fixed (#1918)
  • HuggingFace: use stable download URLs (#1916)
  • GitLab: use stable download URLs (#1917)
  • Deep Globe Land Cover: document download steps (#1797, #1921)
  • PASTIS: fix default folds (#1810)
  • SustainBench Crop Yield: fix download support (#1753, #1755)
  • SustainBench Crop Yield: eager data loading (#1754, #1756)

Models

  • HuggingFace: use stable download URLs (#1916)
  • ViTSmall16_Weights: fix typo (#1904)

Samplers

  • RandomGeoSampler: optional length is optional (#1907)

Trainers

  • Remove unnecessary argmax before call to torchmetrics (#1777)
  • Better document default trainer metrics (#1874, #1914, #1923, #1924)
  • ObjectDetectionTask: increase test coverage (#1739)

Scripts

  • SSL4EO download: skip downloading missing coordinates (#1821)
  • Ensure that all files have the license header at the top (#1787)

Tests

  • Notebooks: use stable dependency versions (#1838)
  • Don't cast warnings to errors (#1793)
  • Fix lightning-utilities deprecation warning (#1733)
  • Fix pre-commit dependency versions (#1781)

Documentation

  • RasterDataset: clarify documentation of is_image and dtype (#1811)
  • RtD: use stable dependency versions (#1827)
  • Document TorchGeo alternatives (#1742)
  • Tutorials: load_state_dict does not return the model (#1503)
  • README: fix VHR-10 example (#1686, #1920)
  • README: add TorchGeo podcast episodes (#1806)
  • README: add PyTorch badge (#1882)
  • README: add OSGeo badge (#1880)
  • README: add color lexing of bibtex (#1820)
  • README: fix Spack link (#1804)

Contributors

This release is thanks to the following contributors:

@adamjstewart
@ashnair1
@calebrob6
@DimitrisMantas
@dmeaux
@isaaccorley
@jdilger
@julien-blanchon
@konstantinklemmer
@nilsleh
@tatsubori

v0.5.1

10 Nov 16:46
6694cbd

TorchGeo 0.5.1 Release Notes

This is a bugfix release. There are no new features or API changes with respect to the 0.5.0 release.

Datamodules

  • EuroSAT: make channel normalization statistics responsive to dynamic band selection (#1634, #1681)
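
The idea behind this fix can be sketched as follows: keep per-band statistics keyed by band name, and build the mean and standard deviation vectors from whichever bands the user selects. The numbers below are placeholders, not the real EuroSAT statistics, and stats_for_bands is a hypothetical helper.

```python
# Placeholder per-band statistics (NOT the real EuroSAT values).
MEAN = {"B02": 1.0, "B03": 2.0, "B04": 3.0, "B08": 4.0}
STD = {"B02": 0.1, "B03": 0.2, "B04": 0.3, "B08": 0.4}


def stats_for_bands(bands):
    """Return (mean, std) lists ordered to match the selected bands."""
    mean = [MEAN[band] for band in bands]
    std = [STD[band] for band in bands]
    return mean, std


# Selecting a subset, in any order, yields matching statistics.
mean, std = stats_for_bands(["B04", "B03", "B02"])
```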

Datasets

  • AGB Live Woody Biomass: update download link for dataset (#1679, #1713)
  • EuroSAT: remove classes attribute and instead rely on ImageFolder classes (#1648, #1650)
  • OSCD: change image datatype to be float instead of int (#1652, #1656)
  • RESISC45: remove classes attribute and instead rely on ImageFolder classes (#1648, #1650)
  • UC Merced: fix plotting which expects images from dataset to be normalized already (#1712)
  • UC Merced: remove classes attribute and instead rely on ImageFolder classes (#1648, #1650)
  • GeoDataset: check whether a path points to a virtual file system, to avoid errors from looking for it locally and not finding it (#1605, #1612)
  • GeoDatasets: consistent use of paths argument instead of root in RuntimeError of several datasets (#1704, #1717)
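
A check for virtual file system paths could look roughly like the sketch below. GDAL's virtual file systems use a /vsi prefix (for example /vsicurl/ or /vsizip/); treating any string containing "://" as remote is an additional assumption made here. The function name looks_like_vsi is hypothetical.

```python
def looks_like_vsi(path: str) -> bool:
    """Heuristically decide whether a path is remote or a GDAL /vsi path."""
    # URL-style schemes such as https:// or s3:// (assumed to be non-local).
    if "://" in path:
        return True
    # GDAL virtual file system prefixes, e.g. /vsicurl/, /vsis3/, /vsizip/.
    return path.startswith("/vsi")


local = looks_like_vsi("data/landsat/scene.tif")
remote = looks_like_vsi("/vsicurl/https://example.com/scene.tif")
```

Paths that pass this check can be handed to GDAL directly instead of being validated against the local file system.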

Trainers

  • Trainers previously expected a datamodule with plot functionality during logging, which prevented them from being used with custom PyTorch DataLoaders (#1703)
  • Remove default callback configurations of trainers and leave it to user instead (#1640, #1641, #1642, #1645, #1647)
  • Skip weights and augmentations when saving hparams, allowing these parameters to be changed (#1622, #1639, #1670)

Scripts

  • Solve logging conflict by allowing config.yaml file to be overwritten (#1621, #1625)

Tests

  • Greatly reduce memory footprint of CI which was causing PR tests to fail (#1658)
  • Copy testing csv file instead of downloading it for MapInWild dataset test (#1657)
  • Fix choco install unrar in CI by using 7zip instead of unrar (#1697)
  • CI: use unique names for release caches (#1723)

Documentation

  • README: update SemanticSegmentationTask example with arguments introduced in 0.5 (#1608)
  • README: add section on LightningCLI usage with torchgeo (#1626, #1628)
  • README: add section on availability of pretrained weights in torchgeo (#1716)
  • BioMassters: fix typo in docs' overview table of non-geo datasets (#1718)
  • SSL4EO-L Benchmark: add dataset information to documentation (#1719)

Contributors

This release is thanks to the following contributors (in alphabetical order):

@adamjstewart
@ashnair1
@dylanrstewart
@kaybe20
@menglutao
@nilsleh
@pioneerHitesh
@robmarkcole

v0.5.0

30 Sep 21:53
fe546bf

TorchGeo 0.5.0 Release Notes

0.5.0 encompasses over 8 months of hard work and new features contributed by 20 users from around the world. Below, we detail specific features worth highlighting.

Highlights of this release

New command-line interface

TorchGeo has always had tight integration with PyTorch Lightning, including datamodules for common benchmark datasets and trainers for most computer vision tasks. TorchGeo 0.5.0 introduces a new command-line interface for model training based on LightningCLI. It can be invoked in two ways:

# If torchgeo has been installed
torchgeo
# If torchgeo has been installed, or if it has been cloned to the current directory
python3 -m torchgeo

It supports command-line configuration or YAML/JSON config files. Valid options can be found from the help messages:

# See valid stages
torchgeo --help
# See valid trainer options
torchgeo fit --help
# See valid model options
torchgeo fit --model.help ClassificationTask
# See valid data options
torchgeo fit --data.help EuroSAT100DataModule

Using the following config file:

trainer:
  max_epochs: 20
model:
  class_path: ClassificationTask
  init_args:
    model: "resnet18"
    in_channels: 13
    num_classes: 10
data:
  class_path: EuroSAT100DataModule
  init_args:
    batch_size: 8
  dict_kwargs:
    download: true

we can see the script in action:

# Train and validate a model
torchgeo fit --config config.yaml
# Validate-only
torchgeo validate --config config.yaml
# Calculate and report test accuracy
torchgeo test --config config.yaml

It can also be imported and used in a Python script if you need to extend it to add new features:

from torchgeo.main import main

main(["fit", "--config", "config.yaml"])

See the Lightning documentation for more details.

Self-supervised learning and Landsat

Self-supervised learning has become a dominant technique for model pre-training, especially in domains (like remote sensing) that are rich in data but lacking in large labeled datasets. The 0.5.0 release adds powerful trainers for the following SSL techniques:

  • BYOL [1]
  • MoCo [1, 2, 3]
  • SimCLR [1, 2]

large unlabeled datasets for multiple satellite platforms:

  • SeCo [1]
  • SSL4EO-L [1]
  • SSL4EO-S12 [1]

and the first ever models pre-trained on Landsat imagery. See our SSL4EO-L paper for more details.

Utilities for splitting GeoDatasets

In prior releases, the only way to create train/val/test splits of GeoDatasets was to use a Sampler roi. This limited the types of splits you could perform, and was unintuitive for users coming from PyTorch where the dataset can be split into multiple datasets. TorchGeo 0.5.0 introduces new splitting utilities for GeoDatasets in torchgeo.datasets, including:

  • random_bbox_assignment: randomly assigns each scene to a different split
  • random_bbox_splitting: randomly split each scene and assign each half to a different split
  • random_grid_cell_assignment: overlay a grid and randomly assign each grid cell to a different split
  • roi_split: split using a roi just like with Sampler
  • time_series_split: split along the time axis

Splitting with a Sampler roi is not yet deprecated, but users are encouraged to adopt the new dataset splitting utility functions.
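
To give a feel for what splitting along the time axis means, here is a conceptual sketch that divides a time range into consecutive sub-ranges by fraction. It uses only plain Python and is not the TorchGeo implementation or its signature; split_time_range is a hypothetical name.

```python
def split_time_range(mint, maxt, fractions):
    """Split [mint, maxt] into consecutive sub-ranges sized by fractions."""
    if abs(sum(fractions) - 1.0) > 1e-9:
        raise ValueError("fractions must sum to 1")
    total = maxt - mint
    spans = []
    start = mint
    for i, fraction in enumerate(fractions):
        # Pin the last span to maxt so rounding never leaves a gap at the end.
        is_last = i == len(fractions) - 1
        end = maxt if is_last else start + fraction * total
        spans.append((start, end))
        start = end
    return spans


# Earliest half for training, then a quarter each for validation and testing.
spans = split_time_range(0.0, 100.0, [0.5, 0.25, 0.25])
```

Each resulting span could then be used to restrict a dataset to the samples whose timestamps fall inside it.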

GeoDatasets now accept lists as input

Previously, each GeoDataset accepted a single root directory as input. Now, users can pass one or more directories, or a list of files they want to include. At first glance, this doesn't seem like a big deal, but it actually opens a lot of possibilities for how users can construct GeoDatasets. For example, users can use custom filters:

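import glob

from torchgeo.datasets import Landsat8

files = []
for file in glob.glob("*.tif"):
    # get_cloud_cover is a hypothetical user-defined helper that checks
    # the pixel QA band or metadata file and returns a percentage
    if get_cloud_cover(file) < 20:  # select images with minimal cloud cover
        files.append(file)
ds = Landsat8(files)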

or use remote files from S3 buckets or Azure blob storage. Basically, as long as GDAL knows how to read the file, TorchGeo supports it, wherever the file lives.

Note that some datasets may not support a list of files if you also want to automatically download the dataset because we need to know the directory to download to.

Building a community

With over 50 contributors from around the world, we needed a better way to discuss ideas and share announcements. TorchGeo now has a public Slack channel! Join us and say hello 👋

Now that the majority of the features we've needed have been implemented, one of our goals for the next release is to improve our documentation and tutorials. Expect to see TorchGeo tutorials at all the popular ML/RS conferences next year! We're excited to meet our users in person and learn more about their unique use cases and needs.

Backwards-incompatible changes

  • GeoDataset: first parameter renamed from root to paths (#1442, #1597)
  • Trainers: many parameters renamed (#1541)
  • FAIR1M datamodule: *_split_pct parameters removed (#1275)
  • Inria datamodule: *_split_pct parameters removed (#1540)
  • SemanticSegmentationTask: changes to weights parameter (#1046)

Dependencies

  • Drop Python 3.7 and 3.8 support following NEP 29 (#1058, #1246)
  • Dependencies now listed in pyproject.toml (#1446)
  • Drop upper bounds on dependencies (#1480)
  • Lightly: new required dependency (#1252, #1285)
  • Lightning: extra dependencies now required (#1559)
  • Omegaconf: no longer a dependency (#1559)
  • Pandas: now supports v2.1 (#1537)
  • Pandas: new required dependency (#1586)
  • Scikit-Learn: no longer a dependency (#1063)
  • TorchMetrics: now supports v1 (#1465)

Datamodules

New datamodules:

Changes to existing datamodules:

  • FAIR1M: add val/test splits, drop split parameters (#1275)
  • Inria: add val split, drop split parameters (#654, #1540)
  • RESISC45: better normalization (#1349)
  • So2Sat: support RGB-only mode (#1283)
  • So2Sat: control size of validation dataset (#1283)

New base classes:

Changes to existing base classes:

  • GeoDataModule: automatically infer epoch length (#1257)
  • BaseDataModule: better error messages (#1307, #1441)

Datasets

New datasets:

Changes to existing datasets:

  • CDL: add years parameter (#1337)
  • CDL: add classes parameter (#1392)
  • CDL: map class labels to ordinal numbers (#1364, #1368)
  • CDL: return figure (#1369)
  • CMS Mangrove Canopy: return figure (#1369)
  • DFC2022: avoid interpolation in colormap (#1372)
  • FAIR1M: add val/test splits (#1275)
  • FAIR1M: add download support (#1275)
  • Inria: add validation split (#654, #1540)
  • SeCo: add seasons parameter (#1168)
  • SeCo: faster initialization (#1168)
  • SeCo: support new directory structure (#1235)
  • So2Sat: add version 3 (#1086, #1283)
  • UCMerced: fix image shape bug (#1238)
  • USAVars: return lat/lon of centroid (#1240)
  • USAVars: convert image to float32 (#1433)
  • USAVars: download from Hugging Face (#1453)

Changes to existing base classes:

  • GeoDataset: accept list of files or directories (#1427, #1442, #1597)
  • GeoDataset: add files property (#1442, #1597)
  • Intersection/UnionDataset: fix crs/res propagation (#1341, #1344)
  • RasterDataset: add dtype attribute (#1149)
  • RasterDataset: allow sampling outside bounds of image (#1329, #1344)

New utility functions:

  • Add utilities to split GeoDatasets (#536, #866)
  • BoundingBox has a new split function (#866)

Models

Changes to existing models:

  • RCF: add empirical sampling mode (#1339)

New pre-trained model weights:

Changes to existing pre-trained model weights:

Samplers

Changes to existing samplers:

  • GridGeoSampler: don't change stride of last patch (#1245, #1329)

Trainers

New trainers:

Changes to existing trainers:

  • Add ability to freeze backbones and decoders (#1290)
  • Fix support for datasets without a plot method (#1551, #1585)
  • BYOL: add random season contrast (#1168)
  • Classification: add class weights for cross entropy loss (#1592)
  • Semantic Segmentation: add class weights for cross entropy loss (#1221)
  • Semantic Segmentation: add ...

v0.4.1

11 Apr 21:52

TorchGeo 0.4.1 Release Notes

This is a bugfix release. There are no new features or API changes with respect to the 0.4.0 release.

Dependencies

Some dependencies have changed:

  • nbmake: 1.3.3+ required now (#1124)
  • omegaconf: now optional (#1214)
  • pytorch-lightning: replaced with lightning (#1178, #1179)
  • sphinx: 6+ not yet supported (#1144)
  • tensorboard: now optional (#1214)
  • pip install torchgeo[all] added, installs all optional dependencies (#1095)

Other dependencies now support newer versions:

  • black: add 23 support (#1080)
  • kornia: add 0.6.10 support (#1123)
  • mypy: add 1 support (#1089)
  • nbsphinx: add 0.9 support (#1173)
  • pandas: add 2 support (#1216)
  • pyvista: add 0.38 support (#1083)
  • radiant-mlhub: add 0.5 support (#1102)
  • scikit-image: add 0.20 support (#1153)
  • setuptools: add 67 support (#1066)
  • torch: add 2 support (#1177)
  • torchvision: add 0.15 support (#1177)

Datamodules

  • SeCo: fix transforms (#1166)

Datasets

Fixes for benchmark datasets:

  • BigEarthNet: fix order of class labels (#1127)
  • CDL: add checksum for 2022 mask (#1201)
  • EuroSAT: fix SSL issue, redistribute on Hugging Face (#1065, #1072)
  • FAIR1M: fix directory name (#1098, #1099)
  • Landsat: better default bands (#1169)
  • UC Merced: redistribute on Hugging Face (#1076)
  • USAVars: fix class labels (#1138)

Fixes for base classes:

  • RasterDataset: fix support for datasets where all_bands does not actually contain all bands (e.g., Landsat) (#1134, #1135)
  • RasterDataset: fix support for datasets where all_bands is not defined and separate_files is False (#1135)
  • RasterDataset: fix bug when separate_files and no date in filename_regex (#1191)
  • RasterDataset: remove unnecessary glob (#1219)
  • RasterDataset: better error message when no data found (#1193)
  • IntersectionDataset: better error message when no overlap (#1192)

Models

There are several improvements to our new pre-trained weights:

  • Add sha256 suffix for security (#1105)
  • Add and improve normalizations (#1119, #1166)
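
A hash suffix in a weight filename lets a downloader verify that the file was not corrupted or tampered with. The sketch below assumes a name of the form name-<sha256 prefix>.ext, similar to the convention torch.hub checks when hash checking is enabled; hash_matches_filename is a hypothetical helper, not TorchGeo or PyTorch API.

```python
import hashlib
import re
from pathlib import Path


def hash_matches_filename(path: Path) -> bool:
    """Check that the hex suffix in the filename prefixes the file's SHA-256."""
    # Expect at least eight lowercase hex digits between '-' and the extension.
    match = re.search(r"-([0-9a-f]{8,})\.", path.name)
    if match is None:
        return False
    digest = hashlib.sha256(path.read_bytes()).hexdigest()
    return digest.startswith(match.group(1))
```

A file whose contents have changed no longer matches its own name, so the mismatch is detected before the weights are loaded.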

Trainers

  • BYOL: Fix image size to match ViT patch size (#1084)
  • Fix support for loading ViT weights (#1049, #1084)
  • Fix support for non-TensorBoardLogger (#1143, #1145)

Tests

A lot of work in this patch release went towards improving CI:

  • Constrain dependencies to avoid CI hang (#1062)
  • Codecov: use repository upload token (#1077)
  • Cache pip installs (#1057)
  • Cancel in-progress jobs on new commit (#1094) but not the labeler tasks (#1187)
  • Test notebooks when they are modified (#1097)
  • Speed up object detection tests (#1148)
  • Fix tests on macOS arm64 (MPS support) (#1188)
  • Properly test pre-trained model transforms (#1166)
  • Speed up notebook tests (#665, #1124)

Documentation

  • Update the example embedded in the README (#1211)
  • Fix broken URLs throughout the documentation (#1125)
  • Tutorial downloads are now much smaller and faster (#1124)
  • Replace CSV with TensorBoard in Trainer tutorial (#1163, #1189)
  • Fix version selection button (#1144)

Contributors

This release is thanks to the following contributors:

@adamjstewart
@ashnair1
@bugraaldal
@calebrob6
@isaaccorley
@julien-blanchon
@lucastao
@nilsleh
@SpontaneousDuck
@TolgaAktas

v0.4.0

24 Jan 23:47
671737f

TorchGeo 0.4.0 Release Notes

This is our biggest release yet, with improved support for pre-trained models, faster datamodules and transforms, and more powerful trainers. See the following sections for specific changes to each module:

As always, thanks to our many contributors!

Backwards-incompatible changes

  • Datasets: So2Sat bands were renamed (#735)
  • Datasets: TropicalCycloneWindEstimation was renamed to TropicalCyclone (#815, #846)
  • Datasets: VisionDataset and VisionClassificationDataset (deprecated in 0.3) have been removed (#627)
  • Datamodules: many arguments have been renamed or reordered (#666, #730, #992)
  • Datamodules: CycloneDataModule was renamed to TropicalCycloneDataModule (#815, #846)
  • Models: resnet50 has a new multi-weight API (#917)
  • Trainers: many arguments have been renamed (#916, #917, #918, #919, #920)
  • Transforms: now take a single image as input instead of a sample dict (#999)

Dependencies

  • Open3D replaced by PyVista (#663)
  • Remove packaging dependency (#1019)
  • Support einops 0.6 (#896)
  • Support flake8 6 (#910)
  • Support mypy 0.991 (#900)
  • Support pytest-cov 4 (#801)
  • Support pyupgrade 3 (#817)
  • Support setuptools 66 (#1017)
  • Support shapely 2 (#949)
  • Support sphinx 6 (#990)
  • Support timm 0.6 (#1002)
  • Support torchmetrics 0.11 (#925)
  • Support torchvision 0.14 (#875)

Datamodules

Our existing datamodules worked well, but suffered from several performance issues. For the average dataset with 3 splits (train/val/test), we were instantiating the dataset 10 times! All data augmentation was done on the CPU, one sample at a time. A multiprocessing bug prevented parallel data loading on macOS and Windows. And a serious bug was discovered in some of our datamodules that allowed training images to leak into the test set (only affected datamodules using torchgeo.datamodules.utils.dataset_split). All of these bugs have been fixed, and performance has been drastically improved. Datasets are only instantiated 3 times (once for each split). All data augmentation happens on the GPU, an entire batch at a time. And multiprocessing is now supported on all platforms. By refactoring our datamodules and adding new base classes, we were able to remove 1.6K lines of duplicated code in the process!

New datamodules:

Changes to existing datamodules:

  • Only instantiate dataset in prepare_data if download is requested (#967, #974)
  • Only instantiate datasets needed for a given stage (#992)
  • Use Kornia for all data augmentation (#992)
  • Faster data augmentation (CPU → GPU, sample → batch) (#992)
  • Fix macOS/Windows multiprocessing bug (#886, #992)
  • Fix bug with train images leaking into test set (#992)
  • Add plot method to all datamodules (#814, #992)
  • torchgeo.datamodules.utils.dataset_split is deprecated, use torch.utils.data.random_split instead (#992)
  • Pass kwargs directly to datasets (#666, #730)
  • Add random cropping to several datamodules (#851, #853, #855, #876, #929)
  • Inria Aerial Image Labeling: fix predict dimensions (#975)
  • LandCover.ai: fix mIoU calculation and plotting (#959)
  • Tropical Cyclone: CycloneDataModule was renamed to TropicalCycloneDataModule (#815, #846)

New base classes:

  • Add GeoDataModule and NonGeoDataModule base classes (#992)

Datasets

This release adds a new Sentinel-1 dataset. Here is a scene taken over the Big Island of Hawai'i:

[Figure: HH_HV]

Additionally, all image datasets now have a plot method.

New datasets:

  • Cloud Cover Detection (#510)
  • Sentinel-1 (#821)
  • SpaceNet 6 (#878)

Changes to existing datasets:

  • Add default root argument to all datasets (#802)
  • Consistent capitalization of band names (#778)
  • Many datasets now return float images and int labels (#992)
  • Chesapeake CVPR: add plot method (#820)
  • ETCI 2021: fix data loading (#861)
  • NASA Marine Debris: fix plot warning when model outputs no prediction boxes (#988)
  • OSCD: images are now stacked channel-wise (#992)
  • SEN12MS: mask is only single channel (#992)
  • Sentinel-2: use 10,000 as scale factor (#1027)
  • So2Sat: rename bands (#735)
  • Tropical Cyclone: renamed from TropicalCycloneWindEstimation to TropicalCyclone (#815, #846)
  • Tropical Cyclone: images are RGB, not grayscale (#992)
  • VHR-10: add plot method (#847)
  • xView2: remove labels folder (#787)

Changes to existing base classes:

  • RasterDataset supports band indexing now (#687)
  • UnionDataset actually works now (#769, #786)
  • UnionDataset and IntersectionDataset support transforms (#867, #870)
  • VectorDataset supports multi-label datasets (#862)

Models

Due to the nature of satellite imagery (different number of spectral bands for every satellite), it is impossible to have a single set of pre-trained weights for each model. TorchGeo has always had multi-weight support:

model = resnet50(sensor="sentinel2", bands="all", pretrained=True)

However, this is difficult to extend if you want more fine-grained control over model weights. More recently, torchvision introduced a new multi-weight support API:

With the 0.4.0 release, TorchGeo has now adopted the same API:

model = resnet50(weights=ResNet50_Weights.SENTINEL2_ALL_MOCO)

We also support PyTorch Hub now:

>>> import torch
>>> from torchgeo.models import ResNet18_Weights
>>> torch.hub.list("microsoft/torchgeo", trust_repo=True)
Downloading: "https://github.com/microsoft/torchgeo/zipball/models/weights" to ~/.cache/torch/hub/models_weights.zip
['resnet18', 'resnet50', 'vit_small_patch16_224']
>>> model = torch.hub.load("microsoft/torchgeo", "resnet18")
Using cache found in ~/.cache/torch/hub/microsoft_torchgeo_models_weights
>>> model = torch.hub.load("microsoft/torchgeo", "resnet18", weights=ResNet18_Weights.SENTINEL2_RGB_MOCO)
Using cache found in ~/.cache/torch/hub/microsoft_torchgeo_models_weights

In our previous release, we had 1 model pre-trained on 1 satellite with 1 training procedure. We now have 3 models (ResNet-18, ResNet-50, ViT) trained on both Sentinel-1 and Sentinel-2 for all bands and RGB-only bands with 3 SSL techniques (MoCo, DINO, SeCo), and plans to expand this in the future. Shoutout to Zhu Lab and ServiceNow for publishing these weights!

New models:

  • Add ResNet-18 and ViT models (#917)

Changes to existing models:

New utility functions:

  • Functions to list, query, and initialize models and weights (#917)

Samplers

Changes to existing samplers:

  • All random samplers now have a default value for length (#755)

New utility functions:

  • get_random_bounding_box and tile_to_chips are now public functions (#755)

Trainers

This release introduces a new trainer for object detection, one of our most highly requested features. All trainers now support prediction. Our old trainers only supported ResNet backbones; our new trainers support the 600+ backbones provided by the timm library. All of the new pre-trained models mentioned above are supported by our trainers as well.

New trainers:

  • Object Detection: add trainer, add Faster R-CNN (#442, #758)
  • Object Detection: add RetinaNet and FCOS (#984)

Changes to existing trainers:

Transforms

Whenever possible, we try to avoid reinventing the wheel. For data augmentation transforms that aren't specific to geospatial data or satellite imagery, we use existing implementations in popular libraries like:

Until now, we've been fairly agnostic towards data augmentation libraries. However, neither PIL nor OpenCV supports multispectral imagery. Because of this, we've decided to use Kornia for all transforms.

Changes to existing transforms:

  • All transforms are now compatible with kornia.augmentation.AugmentationSequential (#999)
  • All transforms now take a single image as input instead of a sample dict (#999)
  • `torchgeo.transforms.Augmentation...

v0.3.1

08 Sep 21:30
44fa413

TorchGeo 0.3.1 Release Notes

This is a bugfix release. There are no new features or API changes with respect to the 0.3.0 release.

Dependencies

  • pytorch-lightning: add 1.7 support (#697, #771)
  • radiant-mlhub: 0.5 not yet supported (#711)
  • segmentation-models-pytorch: add 0.3 support (#692)
  • setuptools: add 65 support (#715, #753)
  • torchvision: fix 0.12 pretrained model support (#761)

DataModules

  • Fix rounding bugs in train/val/test split sizes (#675, #679, #736)
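
To illustrate the kind of rounding problem behind this fix, here is a minimal sketch. It is not TorchGeo's implementation, and `split_sizes` is a hypothetical helper. Rounding each fraction independently can produce sizes that do not add up to the dataset length, so the sketch truncates all but the last split and assigns the remainder to the last one.

```python
def split_sizes(total: int, fractions: list[float]) -> list[int]:
    """Hypothetical helper: integer split sizes that always sum to ``total``."""
    if abs(sum(fractions) - 1.0) > 1e-9:
        raise ValueError("fractions must sum to 1")
    # Truncate every split except the last...
    sizes = [int(total * f) for f in fractions[:-1]]
    # ...and give the remainder to the last split so no sample is lost.
    sizes.append(total - sum(sizes))
    return sizes


print(split_sizes(101, [0.8, 0.1, 0.1]))  # [80, 10, 11]
```

Whichever split receives the remainder is a design choice; the point is only that the sizes are derived from one another rather than rounded separately.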

Datasets

  • Fix rounding bugs leading to inconsistent image shapes in vector datasets (#674, #675, #679, #736)
  • IDTReeS: fix (x, y) coordinate swap in boxes (#683, #684)
  • IDTReeS: clip boxes to bounds of image (#684, #760)
  • Sentinel-2: add support for files downloaded from USGS EarthExplorer (#505, #754)
  • Sentinel-2: prevent dataset from loading bands at different resolutions (#754)
  • Sentinel-2: support loading even when band B02 is not present (#754)
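
The IDTReeS box clipping above can be pictured with a short sketch. This is an illustration under assumed conventions (pixel boxes stored as `(xmin, ymin, xmax, ymax)`), not the dataset's actual code, and `clip_box` is a hypothetical name.

```python
def clip_box(
    box: tuple[float, float, float, float], width: float, height: float
) -> tuple[float, float, float, float]:
    """Clip an (xmin, ymin, xmax, ymax) pixel box to a width x height image."""
    xmin, ymin, xmax, ymax = box
    return (
        max(0, min(xmin, width)),   # x coordinates are limited to [0, width]
        max(0, min(ymin, height)),  # y coordinates are limited to [0, height]
        max(0, min(xmax, width)),
        max(0, min(ymax, height)),
    )


print(clip_box((-5, 10, 250, 190), width=200, height=200))  # (0, 10, 200, 190)
```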

Samplers

  • GridGeoSampler: adjust stride of last row/col to sample entire ROI (#431, #448, #630)
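
As a rough one-dimensional picture of this fix (a sketch under our own assumptions, not the sampler's actual code), the hypothetical `chip_offsets` helper below places chips at a regular stride and shortens only the final step so that the last chip ends at the edge of the ROI instead of leaving a strip unsampled.

```python
import math


def chip_offsets(extent: float, size: float, stride: float) -> list[float]:
    """Hypothetical 1-D helper: start offsets of chips covering [0, extent]."""
    if size >= extent:
        return [0.0]
    n = math.ceil((extent - size) / stride) + 1
    # Clamp the final offset so the last chip ends exactly at the ROI edge.
    return [min(i * stride, extent - size) for i in range(n)]


print(chip_offsets(100, 32, 32))  # [0, 32, 64, 68]
```

In the example, a regular grid would stop at offset 64 and miss the last 4 units; the extra chip at offset 68 covers them at the cost of some overlap.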

Transforms

  • NDVI: fix computation, we were computing the negative (#713, #714)
  • SWI: fix band names (#714)
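
For context on the NDVI fix, the index is the normalized difference (NIR - Red) / (NIR + Red), so swapping the two bands yields its negative. The sketch below uses plain floats and hypothetical function names; it is not the transform's implementation, and the small `eps` term is our own addition to avoid division by zero.

```python
def normalized_difference(a: float, b: float, eps: float = 1e-10) -> float:
    """Normalized difference (a - b) / (a + b); eps avoids division by zero."""
    return (a - b) / (a + b + eps)


def ndvi(nir: float, red: float) -> float:
    # Argument order matters: swapping nir and red flips the sign.
    return normalized_difference(nir, red)


# Healthy vegetation reflects more near-infrared than red light.
print(ndvi(nir=0.5, red=0.1) > 0)  # True
```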

Documentation

API docs:

  • USAVars is a regression dataset (#699)

Tutorials:

  • Use IntersectionDataset in sampler (#707)
  • Custom Raster Datasets: complete overhaul with real data (#766, #772)
  • Trainers: optional datasets required (#759)
  • Transforms: replace cell magic with shell command (#756)
  • Transforms: fix GPU usage (#763, #767)
  • Clean up file names, execution counts, and output (#770)

Contributors

This release is thanks to the following contributors:

v0.3.0

11 Jul 05:55
cc553c4

TorchGeo 0.3.0 Release Notes

This release contains a number of new features, and brings increased stability to installations and testing.

In previous releases, not all dependencies had a minimum supported version listed, causing issues if users had old versions lying around. Old releases would also install the latest version of all dependencies even if they had never been tested before. TorchGeo now lists a minimum and maximum supported version for all dependencies. Moreover, we now test the minimum supported versions of all dependencies. Dependencies are automatically updated using dependabot to prevent unrelated CI failures from sneaking into PRs. We hope this makes it even easier to contribute to TorchGeo, and ensures that old releases will continue to work even if our dependencies make backwards-incompatible changes.

Backwards-incompatible changes

  • VisionDataset and VisionClassificationDataset have been renamed to NonGeoDataset and NonGeoClassificationDataset (#627)
  • Sample size now defaults to pixel units, use units=Units.CRS for old behavior (#294)
  • RasterDataset no longer has a plot method, subclasses have their own plot methods (#476)
  • Plot methods of RasterDataset subclasses now take sample dicts, not image tensors (#476)
  • Removed FCEF model, use segmentation_models_pytorch.Unet instead (#345)
  • SemanticSegmentationTask: ignore_zeros renamed to ignore_index (#444, #644)
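
For the change in sample size units, the relationship between the two conventions is a multiplication by the dataset resolution. The sketch below is only an illustration; `size_in_crs_units` is a hypothetical helper and the 10 m resolution is an assumed example value.

```python
def size_in_crs_units(size_px: int, res: float) -> float:
    """Hypothetical helper: side length in CRS units of a size_px-pixel patch."""
    return size_px * res


# A 256-pixel patch at an assumed resolution of 10 CRS units per pixel.
print(size_in_crs_units(256, 10.0))  # 2560.0
```

Code that previously passed a size such as 2560 in CRS units would now pass 256, or keep the old number together with units=Units.CRS.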

Dependencies

  • Python 3.7+ is now required (#413, #482, #486)
  • Add lower version bounds to all dependencies based on testing (#574)
  • Add upper version bounds to all dependencies based on semver (#544, #557)
  • Fix Conda environment installation (#527, #528, #529, #545)

Datamodules

New datamodules:

  • Inria Aerial Image Labeling (#498)
  • USAVars (#441)

Changes to existing datamodules:

  • Improved consistency between datamodules (#657)

Datasets

New datasets:

Changes to existing datasets:

  • Benin Small Holder Cashews: return geospatial metadata (#377)
  • BigEarthNet: fix checksum (#550)
  • CBF: add plot method (#410)
  • CDL: add 2021 download (#418)
  • CDL: add plot method (#415)
  • Chesapeake: add plot method (#417)
  • EuroSAT: new bands parameter (#396, #397)
  • LandCover.ai: update download URL (#559, #579)
  • Landsat: add support for all Level-1 and Level-2 products (#492, #504)
  • Landsat: add plot method (#661)
  • NAIP: add plot method (#407)
  • Seasonal Contrast: ensure that all images are square (#658)
  • Sentinel: add plot method (#416, #493)
  • SEN12MS: avoid casting float to int (#500, #502)
  • So2Sat: new bands parameter (#394)

Base classes and utilities:

  • VisionDataset and VisionClassificationDataset have been renamed to NonGeoDataset and NonGeoClassificationDataset (#627)
  • RasterDataset no longer has a plot method, subclasses have their own plot methods (#476)
  • Plot methods of RasterDataset subclasses now take sample dicts, not image tensors (#476)
  • BoundingBox has new area and volume attributes (#375)
  • Don't subtract microsecond from mint (#506)
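
The new area and volume attributes can be understood with a small stand-in class. The `Box` dataclass below is hypothetical and is not TorchGeo's BoundingBox; we assume area means the spatial footprint and volume means that footprint multiplied by the time span.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """Stand-in for a spatiotemporal bounding box (not TorchGeo's class)."""

    minx: float
    maxx: float
    miny: float
    maxy: float
    mint: float
    maxt: float

    @property
    def area(self) -> float:
        # Spatial footprint only: width times height.
        return (self.maxx - self.minx) * (self.maxy - self.miny)

    @property
    def volume(self) -> float:
        # Footprint extended through the time dimension.
        return self.area * (self.maxt - self.mint)


box = Box(minx=0, maxx=10, miny=0, maxy=5, mint=0, maxt=2)
print(box.area, box.volume)  # 50 100
```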

Models

Changes to existing models:

  • Removed FCEF model, use segmentation_models_pytorch.Unet instead (#345)
  • FCSiamConc and FCSiamDiff now inherit from segmentation_models_pytorch.Unet, allowing for easily loading pretrained weights (#345)

Samplers

New samplers:

  • PreChippedGeoSampler (#479)

Changes to existing samplers:

  • Allow for point sampling (#477)
  • Allow for sampling of entire scene (#477)
  • RandomGeoSampler no longer suffers from area bias (#408, #477)
  • Sample size now defaults to pixel units, use units=Units.CRS for old behavior (#294)
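
One way to picture the area bias mentioned above: if every tile is equally likely to be chosen regardless of its size, small tiles are sampled far more densely than large ones. The sketch below weights the choice of tile by its area instead. It is a simplified illustration with a hypothetical `pick_tile` helper, not the sampler's actual code.

```python
import random


def pick_tile(areas: list[float], rng: random.Random) -> int:
    """Hypothetical helper: choose a tile index with probability proportional to its area."""
    return rng.choices(range(len(areas)), weights=areas, k=1)[0]


rng = random.Random(0)
areas = [1.0, 9.0]  # the second tile is nine times larger
counts = [0, 0]
for _ in range(10_000):
    counts[pick_tile(areas, rng)] += 1
print(counts)  # roughly a 1:9 ratio
```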

Trainers

Changes to existing trainers:

  • BYOLTask: fix in_channels handling (#522)
  • BYOLTask: fix loading of encoder weights (#524)
  • SemanticSegmentationTask: ignore_zeros renamed to ignore_index (#444, #644)

Transforms

New spectral indices:

New base classes:

  • AppendTriBandNormalizedDifferenceIndex (#414)

Documentation

  • Improved README (#589, #626)
  • Add dataset tables (#435, #478, #649)
  • Shorter dataset/datamodule/model names (#569, #571)
  • Spectral indices now display mathematical equations (#400)
  • Fix NAIP download in tutorials (#526, #531)
  • Add issue templates on GitHub (#584, #590)
  • Clarify Windows conda installation (#581)
  • Public type hints (#508)

Tests

  • Test on Python 3.10 (#457)
  • Use dependabot to manage dependencies (#488, #551, #647)
  • Test minimum version of dependencies (#574)
  • Resolve and test for deprecation warnings (#567)
  • FCSiam tests no longer require internet access (#495, #497)

Contributors

This release is thanks to the following contributors:

v0.2.1

20 Mar 16:41
af38975

TorchGeo 0.2.1 Release Notes

This is a bugfix release. There are no new features or API changes with respect to the 0.2.0 release.

Dependencies

  • Fix minimum supported kornia version (#350)
  • Support older pytorch-lightning (#347, #351)
  • Add support for torchmetrics 0.8+ (#361, #382)

DataModules

  • RESISC45: fix normalization statistics (#440)

Datasets

Fixes for dataset base classes:

  • GeoDataset: fix len() of empty dataset (#374)
  • RasterDataset: add support for float dtype (#379, #384)
  • RasterDataset: don't override custom cmap (#421, #422)
  • VectorDataset: fix issue with empty query (#399, #454, #467)

Fixes for specific datasets:

  • CDL: update checksums due to new file formats (#423, #424, #428)
  • Chesapeake: support extraction of deflate64-compressed zip files (#59, #282)
  • Chesapeake: allow multiple datasets to share same root (#419, #420)
  • ChesapeakeCVPR: update prior extension data to version 1.1 (#359)
  • IDTReeS: fix citation (#389)
  • LandCover.ai: support already-downloaded dataset (#383)
  • Sentinel-2: fix regex to support band 8A (#393)
  • SpaceNet 2: update checksum due to data format consistency fix (#469)
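
The Sentinel-2 band 8A fix is a reminder that band names are not purely numeric. The pattern below is a hypothetical, simplified regex written for this example, and the file names are made up; the dataset's real filename pattern contains more fields.

```python
import re

# Hypothetical, simplified pattern: "B", one digit, then a digit or "A".
BAND = re.compile(r"_(?P<band>B[0-9][0-9A])\.")

# A digits-only pattern such as r"_B\d\d\." would miss the second file.
for name in ["T10TEM_20200101T000000_B08.jp2", "T10TEM_20200101T000000_B8A.jp2"]:
    match = BAND.search(name)
    print(name, "->", match.group("band") if match else None)
```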

Samplers

  • Avoid bounding boxes smaller than patch size (#319, #376)

Tutorials

  • Fix variable name in trainer notebook (#434)

Tests

  • Fix integration tests on macOS/Windows (#349, #468)

Contributors

This release is thanks to the following contributors: