gistool v0.1.0-rc1 release (#1)
* creating dev branch

* parsing MERIT-Hydro extents

* modifications on MERIT-Hydro script and validating its output

* adding .gitignore and details

* adding stats extraction feature without testing

* adding renv setup and files

* debugging and adding stats and quantiles capabilities

* finalizing merit-hydro

* finalizing merit-hydro stats and subsetting

* Changing GDAL version from 3.0.4 to 3.4.1

The change is made for full compatibility with all
the datasets included in this tool, mainly the MODIS
MCD12Q1 (land cover) dataset that works best with
the 3.0.4 version.

* Adding the capability to subset and implement efficient zonal statistics
on SoilGridsV1 GeoTIFFs

The file reads the .tif files, geographically subsets the files based on
the given latitude and longitude extents, and prints given input
`--stats` in a .csv file.

Users are expected to enter complete variable names taken from the
GitHub repository README page for this specific dataset.

Reported by: Kasra Keshavarz
Signed-off-by: Kasra Keshavarz <[email protected]>

* Debugging problems with zonal statistics and adding log date for errors

The zonal statistics were migrated from the merit-hydro scripts without
modifications, so the necessary edits were made to conform to the
SoilGridsV1 dataset and its nomenclature.

The log date is now prepended to the program-generated errors and
warnings, which is useful for future debugging.

Reported by: Kasra Keshavarz
Signed-off-by: Kasra Keshavarz <[email protected]>

* initial README.md file for SoilGrids dataset

* README file created for MODIS dataset

* missing logDate from previous commit added now

* Adding MODIS zonal statistics and GeoTIFF Subsetting feature

MODIS land cover provides valuable information for setting up
hydrological models and has therefore been added to the repository.

Zonal statistics computed with `exactextractr` are very efficient and
produce the `frac` of each land cover class. Apart from that, all other
statistics that are available to other datasets can be used with this
dataset as well.

Reported by: Kasra Keshavarz
Signed-off-by: Kasra Keshavarz <[email protected]>

* removing debugging lines

* adding initial information for the main README page

* renaming for better clarity of the tool

* adding a few options to take into account shapefiles that do not have a CRS defined

* renaming

* initial example and README files

* initial README file

* merit-hydro example initialized

* adding the job submission option to the example

* initial soil grids example added

* modis example initialized

* typos corrected

* fixing typos

* Added description of README file for future reference

* adding license header

* relevant info added to the README file

* typos and corrections

* adding no verbose option to wget download

* adding missing backslashes for line continuation

* adding quiet option to wget

* typos

* correcting wget options orders

* fixing job submission cache path typos

* adding full path to the shapefile argument

* necessary technical details of the MERIT-Hydro dataset

* typos, correction, and adding file contents of one .tar file as an example

* Added necessary technical information to the READMEs

* necessary extra info added

* correcting soil_grids directory address typo

* fixing typos
kasra-keshavarz authored Jul 12, 2022
1 parent 861f932 commit b6e6d79
Showing 17 changed files with 2,102 additions and 328 deletions.
5 changes: 5 additions & 0 deletions .gitignore
@@ -0,0 +1,5 @@
*.pyc
.git
.ipynb_checkpoints
.DS_Store
*.swp
75 changes: 43 additions & 32 deletions README.md
@@ -1,68 +1,79 @@
# Description
This repository contains scripts to process necessary GeoTIFF datasets. The general usage of the script (i.e., `./extract-geotiff.sh`) is as follows:
This repository contains scripts to process necessary geospatial datasets and implement efficient zonal statistics on given ESRI Shapefiles. The general usage of the script (i.e., `./extract-gis.sh`) is as follows:

```console
Usage:
extract-geotiff [options...]
extract-gis [options...]

Script options:
-d, --dataset GeoTIFF dataset of interest,
currently available options are:
'MODIS';'MERIT-Hydro';'SoilGridsV1';
'SoilGridsV2';
-d, --dataset Geospatial dataset of interest, currently
available options are: 'MODIS';
'MERIT-Hydro';'SoilGridsV1'
-i, --dataset-dir=DIR The source path of the dataset file(s)
-r, --crs=INT The EPSG code of interest; optional
[defaults to 4326]
-v, --variable=var1[,var2[...]] If applicable, variables to process
-o, --output-dir=DIR Writes processed files to DIR
-s, --start-date=DATE If applicable, start date of the GeoTIFF
-s, --start-date=DATE If applicable, start date of the geospatial
data; optional
-e, --end-date=DATE If applicable, end date of the GeoTIFF
-e, --end-date=DATE If applicable, end date of the geospatial
data; optional
-l, --lat-lims=REAL,REAL Latitude's upper and lower bounds; optional
-n, --lon-lims=REAL,REAL Longitude's upper and lower bounds; optional
-p, --shape-file=PATH Path to the ESRI '.shp' file; optional
-f, --shape-file=PATH Path to the ESRI '.shp' file; optional
-j, --submit-job Submit the data extraction process as a job
on the SLURM system; optional
-t, --stats=stat1[,stat2[...]] If applicable, extract the statistics of
-t, --print-geotiff=BOOL Extract the subsetted GeoTIFF file; optional
[defaults to 'true']
-a, --stat=stat1[,stat2[...]] If applicable, extract the statistics of
interest, currently available options are:
'min';'max';'mean';'majority';'minority';
'median';'quantiles';'variety';'variance';
'median';'quantile';'variety';'variance';
'stdev';'coefficient_of_variation';'frac';
optional
-q, --quantile=q1[,q2[...]] Quantiles of interest to be produced if 'quantile'
is included in the '--stat' argument. The values
must be comma delimited float numbers between
0 and 1; optional [defaults to every 5th quantile]
-p, --prefix=STR Prefix prepended to the output files
-c, --cache=DIR Path of the cache directory; optional
-E, --email=STR E-mail when job starts, ends, and
-E, --email=STR E-mail when job starts, ends, and
finishes; optional
-V, --version Show version
-h, --help Show this screen and exit
```
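
The `--quantile` option above expects comma-delimited floats between 0 and 1. As a rough illustration of that rule, the sketch below checks such a list before it would be handed to the script. The function name `validate_quantiles` is hypothetical and is not part of `extract-gis.sh`; treating 0 and 1 as valid endpoints is also an assumption, since the usage message does not say whether the bounds are inclusive.

```shell
# Hypothetical helper, not part of extract-gis.sh: returns 0 when every
# comma-delimited value is a plain decimal in [0, 1], and 1 otherwise.
# Assumption: the bounds 0 and 1 themselves are accepted.
validate_quantiles() {
  printf '%s\n' "$1" | awk -F',' '
    NF == 0 { bad = 1 }                      # empty list is rejected
    {
      for (i = 1; i <= NF; i++) {
        # reject anything that is not a non-negative decimal, or is above 1
        if ($i !~ /^[0-9]*\.?[0-9]+$/ || $i + 0 > 1) { bad = 1 }
      }
    }
    END { exit (bad ? 1 : 0) }'
}
```

A caller could guard the real invocation with something like `validate_quantiles "0.1,0.5,0.9" || echo "invalid --quantile list" >&2`.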


# Available Datasets
|**#**|Dataset |Time Scale |DOI |Description |
|-----|--------------------------------------------|----------------------|-----------------------|---------------------|
|**1**|MODIS |2000 - 2021 | |[link](modis) |
|**2**|MERIT Hydro |Not Applicable (N/A) |10.1029/2019WR024873 |[link](merit_hydro) |
|**3**|Soil Grids (v1) | N/A |*ditto* |[link](soil_grids_v1)|
|**4** |Soil Grids (v2) | N/A |*ditto* |[link](soil_grids_v2)|
|**#**|Dataset |Time Scale |CRS |DOI |Description |
|-----|--------------------------------------------|----------------------|-----|-------------------------------|---------------------|
|**1**|MODIS |2000 - 2021 | |10.5067/MODIS/MCD12Q1.006 |[link](modis) |
|**2**|MERIT Hydro |Not Applicable (N/A) |4326 |10.1029/2019WR024873 |[link](merit_hydro) |
|**3**|Soil Grids (v1) |Not Applicable (N/A) |4326 |10.1371/journal.pone.0169748 |[link](soil_grids)|


# General Example
As an example, follow the code block below. Please remember that you MUST have access to the Graham cluster with Compute Canada (CC) and to the `MERIT-Hydro` dataset. Also, remember to generate a [Personal Access Token](https://docs.github.com/en/authentication/keeping-your-account-and-data-secure/creating-a-personal-access-token) with GitHub in advance. Enter the following commands in your Graham shell as a test case:

```console
foo@bar:~$ git clone https://github.com/kasra-keshavarz/geotifftool # clone the repository
foo@bar:~$ cd ./geotifftool/ # always move to the repository's directory
foo@bar:~$ ./extract-geotiff.sh -h # view the usage message
foo@bar:~$ ./extract-geotiff.sh --geotiff="MERIT-Hydro" \
--dataset-dir="/project/rpp-kshook/Model_Output/WRF/CONUS/CTRL" \
--output-dir="$HOME/scratch/conus_i_output/" \
--start-date="2001-01-01 00:00:00" \
--end-date="2001-12-31 23:00:00" \
--lat-lims=49,51 \
--lon-lims=-117,-115 \
--variable=T2,PREC_ACC_NC,Q2,ACSWDNB,ACLWDNB,U10,V10,PSFC \
--prefix="conus_i";
foo@bar:~$ git clone https://github.com/kasra-keshavarz/gistool # clone the repository
foo@bar:~$ cd ./gistool/ # always move to the repository's directory
foo@bar:~$ ./extract-gis.sh -h # view the usage message
foo@bar:~$ wget -m -nd -nv -q -A "cat_pfaf_71_MERIT_Hydro_v07_Basins_v01_bugfix1.*" \
"http://hydrology.princeton.edu/data/mpan/MERIT_Basins/MERIT_Hydro_v07_Basins_v01_bugfix1/pfaf_level_02/";
# downloading a sample shapefile
foo@bar:~$ ./extract-gis.sh --dataset="merit-hydro" \
--dataset-dir="/project/rpp-kshook/CompHydCore/merit_hydro/raw_data/" \
--output-dir="$HOME/scratch/merit-hydro-test" \
--shape-file="./cat_pfaf_71_MERIT_Hydro_v07_Basins_v01_bugfix1.shp" \
--print-geotiff=true \
--stat="min,max,mean,median,quantile" \
--quantile="0.1,0.5,0.9" \
--variable="elv,hnd" \
--prefix="merit_test_";

```
See the [example](./example) directory for real-world scripts for each GeoTIFF dataset included in this repository.
See the [example](./example) directory for real-world scripts for each geospatial dataset included in this repository.
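
When `--lat-lims` and `--lon-lims` are given instead of a shapefile, subsetting comes down to a bounding box. The sketch below shows one way such limits could be mapped onto the `ulx uly lrx lry` order used by GDAL's `gdal_translate -projwin`. This is only an assumption about how a subset could be done, not a description of what `extract-gis.sh` does internally, and the function name `bbox_to_projwin` is hypothetical.

```shell
# Hypothetical helper, not part of extract-gis.sh: converts "lat1,lat2" and
# "lon1,lon2" (in either order) into "ulx uly lrx lry", i.e. the order
# expected by gdal_translate's -projwin option.
bbox_to_projwin() {
  awk -v lat="$1" -v lon="$2" 'BEGIN {
    split(lat, lats, ","); split(lon, lons, ",")
    a = lats[1] + 0; b = lats[2] + 0         # force numeric values
    c = lons[1] + 0; d = lons[2] + 0
    lat_min = (a < b) ? a : b; lat_max = (a < b) ? b : a
    lon_min = (c < d) ? c : d; lon_max = (c < d) ? d : c
    # upper-left corner first, then lower-right corner
    print lon_min, lat_max, lon_max, lat_min
  }'
}

# Possible use (file names are placeholders):
#   gdal_translate -projwin $(bbox_to_projwin "49,51" "-117,-115") in.tif out.tif
```

Sorting the limits inside the helper means the caller does not need to remember whether the upper or the lower bound comes first.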


# New Datasets
@@ -74,7 +85,7 @@ Please open a new ticket on the **Issues** tab of the current repository in case


# License
GeoTIFF Processing Workflow<br>
Geospatial Dataset Processing Workflow<br>
Copyright (C) 2022, University of Saskatchewan<br>

This program is free software: you can redistribute it and/or modify
4 changes: 4 additions & 0 deletions assets/README.md
@@ -0,0 +1,4 @@
# Description

This directory contains two main files: 1) `renv.lock`, which holds the metadata the R `renv` package needs to set up the environment and libraries required by the [`exactextractr`](https://github.com/isciences/exactextractr) package that implements zonal statistics, and 2) `stats.R`, which calls the [`exactextractr`](https://github.com/isciences/exactextractr) package once the R environment has been fully set up.
