Commit 365e6f6

Merge pull request #24 from dlr-eoc/l2a
retrained models, support for l2a, refactoring
2 parents bdf735d + bd82c28 commit 365e6f6

20 files changed, 869 additions and 158 deletions

.github/workflows/pythonapp.yml

+1-1
Original file line number | Diff line number | Diff line change
@@ -12,7 +12,7 @@ jobs:
1212
runs-on: ubuntu-latest
1313
strategy:
1414
matrix:
15-
python-version: ['3.8', '3.9', '3.10']
15+
python-version: ['3.8', '3.9', '3.10', '3.11']
1616

1717
steps:
1818
- uses: actions/checkout@v3

.gitignore

+2-1
@@ -8,4 +8,5 @@ test.py
88
/dist
99
/build
1010
*.egg-info
11-
/*.egg
11+
/*.egg
12+
/src

CHANGELOG.rst

+17
@@ -1,6 +1,23 @@
11
Changelog
22
=========
33

4+
[1.0.0] (2024-11-XX)
5+
--------------------
6+
Added
7+
*******
8+
- support for L2A product level
9+
- example notebooks
10+
- accuracy assessment and comparison with previous models
11+
12+
Changed
13+
*******
14+
- retrained all models with new architecture and training data
15+
- removed backwards compatibility with old band naming scheme
16+
17+
Fixed
18+
*******
19+
- nodata selection across all bands
20+
421
[0.2.2] (2024-10-21)
522
--------------------
623
Added
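The changelog entry "nodata selection across all bands" does not say whether a pixel counts as nodata when any band or when every band holds the nodata value. The following is a minimal sketch of both readings, assuming a channel-last array; `nodata_masks` is a hypothetical helper for illustration and not part of ukis-csmask.

```python
import numpy as np


def nodata_masks(img, nodata_value=0):
    """Return two boolean masks for an array of shape (rows, cols, bands).

    any_band:  True where at least one band equals nodata_value
    all_bands: True where every band equals nodata_value
    """
    is_nodata = img == nodata_value
    return is_nodata.any(axis=2), is_nodata.all(axis=2)


# tiny 2 x 2 image with 3 bands
img = np.array(
    [[[0, 0, 0], [0, 5, 7]],
     [[3, 4, 0], [1, 2, 3]]],
    dtype=np.float32,
)
any_band, all_bands = nodata_masks(img)
print(any_band.tolist())   # [[True, True], [True, False]]
print(all_bands.tolist())  # [[True, False], [False, False]]
```

The two masks differ for pixels that are only partially filled, so which reading a model uses changes how scene edges and band misalignments are treated.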

MANIFEST.in

+8-2
@@ -1,5 +1,11 @@
11
include requirements.txt
22
include README.md
33
include LICENSE
4-
include ukis_csmask/model_4b.onnx
5-
include ukis_csmask/model_6b.onnx
4+
include ukis_csmask/model_4b_l1c.onnx
5+
include ukis_csmask/model_4b_l1c.json
6+
include ukis_csmask/model_4b_l2a.onnx
7+
include ukis_csmask/model_4b_l2a.json
8+
include ukis_csmask/model_6b_l1c.onnx
9+
include ukis_csmask/model_6b_l1c.json
10+
include ukis_csmask/model_6b_l2a.onnx
11+
include ukis_csmask/model_6b_l2a.json
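The manifest now ships one `.onnx` and one `.json` file per band count (4 or 6) and product level (`l1c` or `l2a`). Below is a minimal sketch of how such a file name could be derived from a band list. `model_basename` is a hypothetical helper, not a function from ukis-csmask, and the rule that the six-band model applies when both SWIR bands are present is an assumption based on the README.

```python
from pathlib import Path

FOUR_BANDS = ["blue", "green", "red", "nir"]
SIX_BANDS = FOUR_BANDS + ["swir16", "swir22"]


def model_basename(band_order, product_level):
    """Return a base name such as 'model_6b_l2a' (hypothetical helper)."""
    level = product_level.lower()
    if level not in ("l1c", "l2a"):
        raise ValueError(f"unsupported product level: {product_level!r}")
    if all(band in band_order for band in SIX_BANDS):
        n_bands = 6
    elif all(band in band_order for band in FOUR_BANDS):
        n_bands = 4
    else:
        raise ValueError("band_order must contain at least blue, green, red and nir")
    return f"model_{n_bands}b_{level}"


base = model_basename(["blue", "green", "red", "nir"], "L2A")
print(Path("ukis_csmask") / f"{base}.onnx")
```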

README.md

+14-56
@@ -8,7 +8,7 @@
88
[![Code Style](https://img.shields.io/badge/code%20style-black-000000.svg)](https://black.readthedocs.io/en/stable/)
99
[![DOI](https://zenodo.org/badge/328616234.svg)](https://zenodo.org/badge/latestdoi/328616234)
1010

11-
UKIS Cloud Shadow MASK (ukis-csmask) package masks clouds and cloud shadows in Sentinel-2, Landsat-9, Landsat-8, Landsat-7 and Landsat-5 images. Masking is performed with a pre-trained convolution neural network. It is fast and works directly on Level-1C data (no atmospheric correction required). Images just need to be in Top Of Atmosphere (TOA) reflectance and include at least the "blue", "green", "red" and "nir" spectral bands. Best performance (in terms of accuracy and speed) is achieved when images also include "swir16" and "swir22" spectral bands and are resampled to approximately 30 m spatial resolution.
11+
UKIS Cloud Shadow MASK (ukis-csmask) package masks clouds and cloud shadows in Sentinel-2, Landsat-9, Landsat-8, Landsat-7 and Landsat-5 images. Masking is performed with a pre-trained convolutional neural network. It is fast and works with both Level-1C (no atmospheric correction) and Level-2A (atmospherically corrected) data. Images just need to be in reflectance and include at least the "blue", "green", "red" and "nir" spectral bands. Best performance (in terms of accuracy and speed) is achieved when images also include "swir16" and "swir22" spectral bands and are resampled to approximately 30 m spatial resolution.
1212

1313
This [publication](https://doi.org/10.1016/j.rse.2019.05.022) provides further insight into the underlying algorithm and compares it to the widely used [Fmask](http://www.pythonfmask.org/en/latest/) algorithm across a heterogeneous test dataset.
1414

@@ -23,31 +23,23 @@ If you use ukis-csmask in your work, please consider citing one of the above pub
2323

2424
![Examples](img/examples.png)
2525

26-
## Example (Sentinel 2)
27-
Here's an example on how to compute a cloud and cloud shadow mask from an image. Please note that here we use [ukis-pysat](https://github.com/dlr-eoc/ukis-pysat) for convencience image handling, but you can also work directly with [numpy](https://numpy.org/) arrays.
26+
## Example
27+
Here's an example of how to compute a cloud and cloud shadow mask from an image. Note that we use [ukis-pysat](https://github.com/dlr-eoc/ukis-pysat) here for convenient image handling, but you can also work directly with [numpy](https://numpy.org/) arrays. Further examples can be found [here](examples).
2828

2929
````python
3030
from ukis_csmask.mask import CSmask
3131
from ukis_pysat.raster import Image, Platform
3232

3333
# read Level-1C image from file, convert digital numbers to TOA reflectance
3434
# and make sure resolution is 30 m to get best performance
35+
# NOTE: band_order must match the order of bands in the input image. it does not have to be in this explicit order.
36+
band_order = ["blue", "green", "red", "nir", "swir16", "swir22"]
3537
img = Image(data="sentinel2.tif", dimorder="last")
36-
img.dn2toa(platform=Platform.Sentinel2)
37-
img.warp(
38-
resampling_method=0,
39-
resolution=30,
40-
dst_crs=img.dataset.crs
41-
)
38+
img.dn2toa(platform=Platform.Sentinel2, wavelengths=band_order)
39+
img.warp(resampling_method=0, resolution=30, dst_crs=img.dataset.crs)
4240

4341
# compute cloud and cloud shadow mask
44-
# NOTE: band_order must match the order of bands in the input image. it does not have to be in this explicit order.
45-
# make sure to use these six spectral bands to get best performance
46-
csmask = CSmask(
47-
img=img.arr,
48-
band_order=["blue", "green", "red", "nir", "swir16", "swir22"],
49-
nodata_value=0,
50-
)
42+
csmask = CSmask(img=img.arr, product_level="l1c", band_order=band_order, nodata_value=0)
5143

5244
# access cloud and cloud shadow mask
5345
csmask_csm = csmask.csm
@@ -64,48 +56,14 @@ csmask_csm.write_to_file("sentinel2_csm.tif", dtype="uint8", compress="PACKBITS"
6456
csmask_valid.write_to_file("sentinel2_valid.tif", dtype="uint8", compress="PACKBITS", kwargs={"nbits":2})
6557
````
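The README notes that you can also work directly with numpy arrays. A minimal sketch of that path follows, assuming a band-first integer stack that is converted to a channel-last `float32` reflectance array. `prepare_array`, `compute_masks` and the `scale` factor of 10000 are illustrative assumptions, not part of ukis-csmask; the correct scaling depends on the product metadata. The `CSmask` call mirrors the keyword arguments shown in the example above.

```python
import numpy as np

# must match the band order of the input stack
BAND_ORDER = ["blue", "green", "red", "nir", "swir16", "swir22"]


def prepare_array(dn, scale=10000.0):
    """Convert a (bands, rows, cols) stack of digital numbers into a
    (rows, cols, bands) float32 reflectance array.

    scale is an assumed DN-to-reflectance factor; check your product metadata.
    """
    arr = np.moveaxis(np.asarray(dn), 0, -1)
    return (arr / scale).astype(np.float32)


def compute_masks(arr, product_level="l1c"):
    """Run ukis-csmask on a prepared array and return (csm, valid)."""
    # imported lazily so that prepare_array works without ukis-csmask installed
    from ukis_csmask.mask import CSmask

    csmask = CSmask(img=arr, product_level=product_level, band_order=BAND_ORDER, nodata_value=0)
    return csmask.csm, csmask.valid


# synthetic stand-in for a real six-band image
dn = np.random.default_rng(0).integers(1, 10000, size=(6, 256, 256), dtype=np.uint16)
arr = prepare_array(dn)
print(arr.shape, arr.dtype)  # (256, 256, 6) float32
```

With ukis-csmask installed, `csm, valid = compute_masks(arr, product_level="l1c")` would then return the two masks as numpy arrays.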
6658

67-
## Example (Landsat 8)
68-
Here's a similar example based on Landsat 8.
59+
## Accuracy assessment
60+
The original ukis-csmask models, which are available in [ukis-csmask<=v0.2.2](https://github.com/dlr-eoc/ukis-csmask/releases/tag/v0.2.2) and are described in this [publication](https://doi.org/10.1016/j.rse.2019.05.022), have been trained and tested on a custom reference dataset specifically for Level-1C data.
6961

70-
````python
71-
import rasterio
72-
import numpy as np
73-
from ukis_csmask.mask import CSmask
74-
from ukis_pysat.raster import Image, Platform
62+
From [ukis-csmask>=v1.0.0](https://github.com/dlr-eoc/ukis-csmask/releases/tag/v1.0.0) on, we provide new models for Level-1C (L1C) and Level-2A (L2A) data, which have been trained on a much larger reference dataset (consisting of [SPARCS](https://www.usgs.gov/landsat-missions/spatial-procedures-automated-removal-cloud-and-shadow-sparcs-validation-data), [CloudSEN12+](https://cloudsen12.github.io/) and some additional custom samples). Both datasets natively only provide L1C images. Therefore, we have compiled corresponding L2A images for each sample.
7563

76-
# set Landsat 8 source path and prefix (example)
77-
data_path = "/your_data_path/"
78-
L8_file_prefix = "LC08_L1TP_191015_20210428_20210507_02_T1"
79-
80-
data_path = data_path+L8_file_prefix+"/"
81-
mtl_file = data_path+L8_file_prefix+"_MTL.txt"
82-
83-
# stack [B2:'Blue', B3:'Green', B4:'Red', B5:'NIR', B6:'SWIR1', B7:'SWIR2'] as numpy array
84-
L8_band_files = [data_path+L8_file_prefix+'_B'+ x + '.TIF' for x in [str(x+2) for x in range(6)]]
85-
86-
# >> adopted from https://gis.stackexchange.com/questions/223910/using-rasterio-or-gdal-to-stack-multiple-bands-without-using-subprocess-commands
87-
# read metadata of first file
88-
with rasterio.open(L8_band_files[0]) as src0:
89-
meta = src0.meta
90-
# update meta to reflect the number of layers
91-
meta.update(count = len(L8_band_files))
92-
# read each layer and append it to numpy array
93-
L8_bands = []
94-
for id, layer in enumerate(L8_band_files, start=1):
95-
with rasterio.open(layer) as src1:
96-
L8_bands.append(src1.read(1))
97-
L8_bands = np.stack(L8_bands,axis=2)
98-
# <<
99-
100-
img = Image(data=L8_bands, crs = meta['crs'], transform = meta['transform'], dimorder="last")
101-
102-
img.dn2toa(
103-
platform=Platform.Landsat8,
104-
mtl_file=mtl_file,
105-
wavelengths = ["blue", "green", "red", "nir", "swir16", "swir22"]
106-
)
107-
# >> proceed by analogy with Sentinel 2 example
108-
````
64+
![Accuracy](img/accuracy.png)
65+
66+
The bar plot above compares the new [ukis-csmask>=v1.0.0](https://github.com/dlr-eoc/ukis-csmask/releases/tag/v1.0.0) models against the previous [ukis-csmask<=v0.2.2](https://github.com/dlr-eoc/ukis-csmask/releases/tag/v0.2.2) models on the [CloudSEN12+](https://cloudsen12.github.io/) and [SPARCS](https://www.usgs.gov/landsat-missions/spatial-procedures-automated-removal-cloud-and-shadow-sparcs-validation-data) test splits for both L1C and L2A images. The results indicate that the new models outperform the previous models across all tested datasets and product levels. Providing separate models for each product level yields further improvements and enables greater flexibility.
10967

11068
## Installation
11169
The easiest way to install ukis-csmask is through pip. To install ukis-csmask with [default CPU provider](https://onnxruntime.ai/docs/execution-providers/) run the following.

examples/landsat8_l1c.ipynb

+158
@@ -0,0 +1,158 @@
1+
{
2+
"cells": [
3+
{
4+
"cell_type": "markdown",
5+
"metadata": {},
6+
"source": [
7+
"# Segment clouds and cloud shadows in Landsat-8 images (L1C)\n",
8+
"This notebook shows an example of how to use [ukis-csmask](https://github.com/dlr-eoc/ukis-csmask) to segment clouds and cloud shadows in Level-1C images from Landsat-8. Images are loaded from the local file system. Here we use [ukis-pysat](https://github.com/dlr-eoc/ukis-pysat) for convenient image handling, but you can also work directly with [numpy](https://numpy.org/) arrays.\n",
9+
"\n",
10+
"> NOTE: to run this notebook, we first need to install some additional dependencies for image handling\n",
11+
"```shell\n",
12+
"pip install ukis-pysat[complete]\n",
13+
"```"
14+
]
15+
},
16+
{
17+
"cell_type": "code",
18+
"execution_count": null,
19+
"id": "703f3744-902d-470b-a80f-9a8d3ea08dfa",
20+
"metadata": {},
21+
"outputs": [],
22+
"source": [
23+
"import rasterio\n",
24+
"import numpy as np\n",
25+
"\n",
26+
"from pathlib import Path\n",
27+
"from ukis_csmask.mask import CSmask\n",
28+
"from ukis_pysat.raster import Image, Platform"
29+
]
30+
},
31+
{
32+
"cell_type": "code",
33+
"execution_count": null,
34+
"id": "c9cd86e4",
35+
"metadata": {},
36+
"outputs": [],
37+
"source": [
38+
"# user settings\n",
39+
"data_path = \"/your_data_path/\"\n",
40+
"L8_file_prefix = \"LC08_L1TP_191015_20210428_20210507_02_T1\"\n",
41+
"product_level = \"l1c\"\n",
42+
"band_order = [\"blue\", \"green\", \"red\", \"nir\", \"swir16\", \"swir22\"]\n",
43+
"providers = [\"CUDAExecutionProvider\"]\n",
44+
"out_dir = \"ukis-csmask/examples\""
45+
]
46+
},
47+
{
48+
"cell_type": "code",
49+
"execution_count": null,
50+
"id": "8ca03c78-1e24-479c-9786-a1b43206a08b",
51+
"metadata": {},
52+
"outputs": [],
53+
"source": [
54+
"# set Landsat 8 source path and prefix (example)\n",
55+
"data_path = data_path + L8_file_prefix + \"/\"\n",
56+
"mtl_file = data_path + L8_file_prefix + \"_MTL.txt\"\n",
57+
"\n",
58+
"# stack [B2:'Blue', B3:'Green', B4:'Red', B5:'NIR', B6:'SWIR1', B7:'SWIR2'] as numpy array\n",
59+
"L8_band_files = [f\"{data_path}{L8_file_prefix}_B{band}.TIF\" for band in range(2, 8)]\n",
60+
"\n",
61+
"# >> adopted from https://gis.stackexchange.com/questions/223910/using-rasterio-or-gdal-to-stack-multiple-bands-without-using-subprocess-commands\n",
62+
"# read metadata of first file\n",
63+
"with rasterio.open(L8_band_files[0]) as src0:\n",
64+
" meta = src0.meta\n",
65+
"# update meta to reflect the number of layers\n",
66+
"meta.update(count=len(L8_band_files))\n",
67+
"# read each layer and append it to numpy array\n",
68+
"L8_bands = []\n",
69+
"for layer in L8_band_files:\n",
70+
" with rasterio.open(layer) as src1:\n",
71+
" L8_bands.append(src1.read(1))\n",
72+
"L8_bands = np.stack(L8_bands, axis=2)\n",
73+
"# <<\n",
74+
"\n",
75+
"img = Image(data=L8_bands, crs=meta[\"crs\"], transform=meta[\"transform\"], dimorder=\"last\")\n",
76+
"img.dn2toa(platform=Platform.Landsat8, mtl_file=mtl_file, wavelengths=band_order)\n",
77+
"img.warp(resampling_method=0, resolution=30, dst_crs=img.dataset.crs)"
78+
]
79+
},
80+
{
81+
"cell_type": "code",
82+
"execution_count": null,
83+
"id": "7b568942-84e6-4baf-b490-a213b3787f80",
84+
"metadata": {},
85+
"outputs": [],
86+
"source": [
87+
"# compute cloud and cloud shadow mask\n",
88+
"csmask = CSmask(\n",
89+
" img=img.arr,\n",
90+
" band_order=band_order,\n",
91+
" product_level=product_level,\n",
92+
" nodata_value=0,\n",
93+
" invalid_buffer=4,\n",
94+
" intra_op_num_threads=0,\n",
95+
" inter_op_num_threads=0,\n",
96+
" providers=providers,\n",
97+
" batch_size=1,\n",
98+
")\n",
99+
"\n",
100+
"# access cloud and cloud shadow mask as numpy array\n",
101+
"csm = csmask.csm\n",
102+
"\n",
103+
"# access valid mask as numpy array\n",
104+
"valid = csmask.valid"
105+
]
106+
},
107+
{
108+
"cell_type": "code",
109+
"execution_count": null,
110+
"id": "68eb9c30-06f7-409e-914d-21e00f45de99",
111+
"metadata": {},
112+
"outputs": [],
113+
"source": [
114+
"# convert results to ukis-pysat Image\n",
115+
"# this assigns back the georeference\n",
116+
"csm = Image(csm, transform=img.dataset.transform, crs=img.dataset.crs, dimorder=\"last\")\n",
117+
"valid = Image(valid, transform=img.dataset.transform, crs=img.dataset.crs, dimorder=\"last\")\n",
118+
"\n",
119+
"# write results to file\n",
120+
"csm.write_to_file(\n",
121+
" path_to_file=Path(out_dir) / Path(f\"{L8_file_prefix}_csm.tif\"),\n",
122+
" dtype=csm.dtype,\n",
123+
" driver=\"COG\",\n",
124+
" compress=\"LZW\",\n",
125+
" kwargs={\"BLOCKSIZE\": 512, \"BIGTIFF\": \"IF_SAFER\"},\n",
126+
")\n",
127+
"valid.write_to_file(\n",
128+
" path_to_file=Path(out_dir) / Path(f\"{L8_file_prefix}_valid.tif\"),\n",
129+
" dtype=valid.dtype,\n",
130+
" driver=\"COG\",\n",
131+
" compress=\"LZW\",\n",
132+
" kwargs={\"BLOCKSIZE\": 512, \"BIGTIFF\": \"IF_SAFER\"},\n",
133+
")"
134+
]
135+
}
136+
],
137+
"metadata": {
138+
"kernelspec": {
139+
"display_name": "Python 3 (ipykernel)",
140+
"language": "python",
141+
"name": "python3"
142+
},
143+
"language_info": {
144+
"codemirror_mode": {
145+
"name": "ipython",
146+
"version": 3
147+
},
148+
"file_extension": ".py",
149+
"mimetype": "text/x-python",
150+
"name": "python",
151+
"nbconvert_exporter": "python",
152+
"pygments_lexer": "ipython3",
153+
"version": "3.11.10"
154+
}
155+
},
156+
"nbformat": 4,
157+
"nbformat_minor": 5
158+
}
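The notebook hard-codes `providers = ["CUDAExecutionProvider"]`, which presumes a GPU build of ONNX Runtime. A minimal sketch of choosing providers with a CPU fallback is shown below. `pick_providers` is a hypothetical helper for illustration; `onnxruntime.get_available_providers()` is the ONNX Runtime call that lists the providers available in the installed build.

```python
def pick_providers(available, preferred=("CUDAExecutionProvider",)):
    """Keep the preferred providers that are available and always
    append the CPU provider as a fallback (hypothetical helper)."""
    chosen = [name for name in preferred if name in available]
    if "CPUExecutionProvider" not in chosen:
        chosen.append("CPUExecutionProvider")
    return chosen


try:
    import onnxruntime as ort

    available = ort.get_available_providers()
except ImportError:
    # onnxruntime not installed; assume CPU only for this sketch
    available = ["CPUExecutionProvider"]

providers = pick_providers(available)
print(providers)
```

The resulting list can be passed as the `providers` argument of `CSmask`, as in the notebook cell that computes the mask.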
