This repository was archived by the owner on Nov 16, 2023. It is now read-only.

Commit 080cf46

maxkazmsft, yalaudah, fazamani, Sharat Chikkerur, and kirasoderstrom authored
0.2 release (#395)
* cleaning up files which are no longer needed * fixes after removing forking workflow (#322) * PR to resolve merge issues * updated main build as well * added ability to read in git branch name directly * manually updated the other files * fixed number of classes for main build tests (#327) * fixed number of classes for main build tests * corrected DATASET.ROOT in builds * added dev build script * Fixes for development inside the docker container (#335) * Fix the mound command for the HRNet pretrained model in the docker readme * Properly catch InvalidGitRepository exception * make repo paths consistent with non-docker runs -- this way configs paths do not need to be changed * Properly catch InvalidGitRepository exception in train.py * Readme update (#337) * README updates * Removing user specific path from config Authored-by: Fatemeh Zamanian <[email protected]> * Fixing #324 and #325 (#338) * update colormap to a non-discrete one -- fixes #324 * fix mask_to_disk to normalize by n_classes * changes to test.py * Updating data.py * bug fix * increased timeout time for main_build * retrigger build * retrigger the build * increase timeout * fixes 318 (#339) * finished 318 * increased checkerboard test timeout * fix 333 (#340) * added label correction to train gradient * changing the gradient data generator to take inline/crossline argument conssistent with the patchloader * changing variable name to be more descriptive Co-authored-by: maxkazmsft <[email protected]> * bug fix to model predictions (#345) * replace hrnet with seresnet in experiments - provides stable default model (#343) * PR to fix #342 (#347) * intermediate work for normalization * 1) normalize function runs based on global MIN and MAX 2) has a error handling for division by zero, np.finfo 3) decode_segmap normalizes the label/mask based on the n_calsses * global normalization added to test.py * increasing the threshold on timeout * trigger * revert * idk what happened * increase timeout * picking up global min and max * passing config to TrainPatchLoader to facilitate access to global min and max and other attr in low level functions, WIP * removed print statement * changed section loaders * updated test for min and max from config too * adde MIN and MAX to config * notebook modified for loaders * another dataloader in notebook * readme update * changed the default values for min max, updated the docstring for loaders, removed suppressed lines * debug * merging work from CSE team into main staging branch (#357) * Adding content to interpretation README (#171) * added sharat, weehyong to authors * adding a download script for Dutch F3 dataset * Adding script instructions for dutch f3 * Update README.md prepare scripts expect root level directory for dutch f3 dataset. 
(it is downloaded into $dir/data by the script) * Adding readme text for the notebooks and checking if config is correctly setup * fixing prepare script example * Adding more content to interpretation README * Update README.md * Update HRNet_Penobscot_demo_notebook.ipynb Co-authored-by: maxkazmsft <[email protected]> * Updates to prepare dutchf3 (#185) * updating patch to patch_size when we are using it as an integer * modifying the range function in the prepare_dutchf3 script to get all of our data * updating path to logging.config so the script can locate it * manually reverting back log path to troubleshoot build tests * updating patch to patch_size for testing on preprocessing scripts * updating patch to patch_size where applicable in ablation.sh * reverting back changes on ablation.sh to validate build pass * update patch to patch_size in ablation.sh (#191) Co-authored-by: Sharat Chikkerur <[email protected]> * TestLoader's support for custom paths (#196) * Add testloader support for custom paths. * Add test * added file name workaround for Train*Loader classes * adding comments and clean up * Remove legacy code. * Remove parameters that dont exist in init() from documentation. * Add unit tests for data loaders in dutchf3 * moved unit tests Co-authored-by: maxkazmsft <[email protected]> * select contiguous data splits for val and train (#200) * select contiguous data splits for test and train * changed data-dir to data_dir as arg to prepare_dutchf3.py * update script with new required parameter label_file * ignoring split_alaudah_et_al_19 as it is not updated * changed TEST to VALIDATION for clarity in the code * included job to run scripts unit test * Fix val/train split and add tests * adjust to consider the whole horz_lines * update environment - gitpython version * Segy Converter Utility (#199) * Add convert_segy utility script and related notebooks * add segy files to .gitignore * readability update * Create methods for normalizing and clipping separately. * Add comment * update file paths * cleanup tests and terminology for the normalization/clipping code * update notes to provide more context for using the script * Add tests for clipping. * Update comments * added Microsoft copyright * Update root README * Add a flag to turn on clipping in dataprep script. * Remove hard coded values and fix _filder_data method. * Fix some minor issues pointed out on comments. * Remove unused lib. * Rename notebooks to impose order; set env; move all def funtions into utils; improve comments in notebooks; and include code example to run prepare_dutchf3.py * Label missing data with 255. * Remove cell with --help command. * Add notebooks to test pipeline. 
* grammer edits * update notebook output and utils naming * fix output dir error and cleanup notebook * fix yaml indent error in notebooks_build.yml * fix merge issues and job name errors * debugging the build pipeline * combine notebook tests for segy converter since they are dependent on each other Co-authored-by: Geisa Faustino <[email protected]> * Azureml train pipeline (#195) * initial add of azure ml pipeline * update references and dependencies * fix integration tests * remove incomplete tests * add azureml requirements.txt for dutchf3 local patch and update pipeline config * add empty __init__.py to cv_lib dutchf3 * Get train,py to run in pipeline * allow output dir in train.py * Clean up README and __init__ * only pass output if available and use input dir for output in train.py * update comment in train.py * updating azureml_requirements to only pull from /master * removing windows guidance in azureml_pipelines/README.md * adding .env.example * adding azureml config example * updating documentation in azureml_pipelines README.md * updating main README.md to refer to AML guidance documentation * updating AML README.md to include additional guidance to cancel runs * adding documentation on AzureML pipelines in the AML README.me * adding files needed section for AML training run * including hyperlink in format poiniting to additional detail on Azure Machine Learning pipeslines in AML README.md * removing the mention of VSCode in the AML README.md * fixing typo * modifying config to pipeline configuration in README.md * fixing typo in README.md * adding documentation on how to create a blob container and copy data onto it * adding documentation on blob storage guidance * adding guidance on how to get the subscription id * adding guidance to activate environment and then run the kick off train pipeline from ROOT * adding ability to pass in experiement name and different pipeline configuration to kickoff_train_pipeline.py * adding Microsoft Corporation Copyright to kickoff_train_pipeline.py * fixing format in README.md * adding trouble shooting section in README.md for connection to subscription * updating troubleshooting title * adding guidance on how to download the config.json from the Azure Portal in the README.md * adding additional guidance and information on AzureML compute targets and naming conventions * changing the configuation file example to only include the train step that is currently supported * updating config to pipeline configuration when applicable * adding link to Microsoft docs for additional information on pipeline steps * updated AML test build definitions * updated AML test build definitions * adding job to aml_build.yml * updating example config for testing * modifying the test_train_pipeline.py to have appropriate number of pipeline steps and other required modifications * updating AML_pipeline_tests in aml_build.yml to consume environment variables * updating scriptType, sciptLocation, and inlineScript in aml_build.yml * trivial commit to re-trigger broken build pipelines * fix to aml yml build to use env vars for secrets and everything else * another yml fix * another yml fix * reverting structure format of jobs for aml_build pipeline tests * updating path to test_train_pipeline.py * aml_pipeline_tests timed out, extending timeoutInMinutes from 10 to 40 * adding additional pytest * adding az login * updating variables in aml pipeline tests Co-authored-by: Anna Zietlow <[email protected]> Co-authored-by: maxkazmsft <[email protected]> * moved contrib 
contributions around from CSE * fixed dataloader tests - updated them to work with new code from staging branch * segyconverter notebooks and tests run and pass; updated documentation * added test job for segy converter notebooks * removed AML training pipeline from this release * fixed training model tolerance precision in the tests - wasn't working * fixed train.py build issues after the merge * addressed PR comments * fixed bug in check_performance Co-authored-by: Sharat Chikkerur <[email protected]> Co-authored-by: kirasoderstrom <[email protected]> Co-authored-by: Sharat Chikkerur <[email protected]> Co-authored-by: Geisa Faustino <[email protected]> Co-authored-by: Ricardo Squassina Lee <[email protected]> Co-authored-by: Michael Zawacki <[email protected]> Co-authored-by: Anna Zietlow <[email protected]> * make tests simpler (#368) * removed Dutch F3 job from main_build * fixed a bug in data subset in debug mode * modified epoch numbers to pass the performance checks, checkedout check_performance from Max's branch * modified get_data_for_builds.sh to set up checkerboard data for smaller size, minor improvements on gen_checkerboard * send all the batches, disabled the performance checks for patch_deconvnet * added comment to enable tests for patch_deconvnet after debugging, renamed gen_checkerboard, added options to new arg per Max's suggestion * Replace HRNet with SEResNet model in the notebook (#362) * replaced HRNet with SEResNet model in the notebook * removed debugging cell info * fixed bug where resnet_unet model wasn't loading the pre-trained version in the notebook * fixed build VM problems * Multi-GPU training support (#359) * Data flow tests (#375) * renamed checkerboard job name * restructured default outputs from test.py to be dumped under output dir and not debug dir * test.py output re-org * removed outdated variable from check_performance.py * intermediate work * intermediate work * bunch of intermediate works * changing args for different trainings * final to run dev_build" * remove print statements * removed print statement * removed suppressed lines * added assertion error msg * added assertion error msg, one intential bug to test * testing a stupid bug * debug * omg * final * trigger build * fixed multi-GPU termination in train.py (#379) * PR to fix #371 and #372 (#380) * added learning rate to logs * changed epoch for patch_deconvnet, and enabled the tests * removed TODOs * changed tensorflow pinned version (#387) * changed tensorflow pinned version * trigger build * closes 385 (#389) * Fixing #259 by adding symmetric padding along depth direction (#386) * BYOD Penobscot (#390) * minor updates to files * added penobscot conversion code * docker build test (#388) * added a new job to test bulding the docker, for now it is daisy-chained to the end * this is just a TEST * test * test * remove old image * debug * debug * test * debug * enabled all the jobs * quick fix * removing non-tagged iamges Co-authored-by: maxkazmsft <[email protected]> * added missing license headers and fixed formatting (#391) * added missing license headers and fixed formatting * some more license headers * updated documentation to close 354 and 381 (#392) * fix test.py and notebook issues (#394) * resolved conflicts for 0.2 release (#396) * V00.01.00003 release (#356) * cleaning up files which are no longer needed * fixes after removing forking workflow (#322) * PR to resolve merge issues * updated main build as well * added ability to read in git branch name directly * manually updated the 
other files * fixed number of classes for main build tests (#327) * fixed number of classes for main build tests * corrected DATASET.ROOT in builds * added dev build script * Fixes for development inside the docker container (#335) * Fix the mound command for the HRNet pretrained model in the docker readme * Properly catch InvalidGitRepository exception * make repo paths consistent with non-docker runs -- this way configs paths do not need to be changed * Properly catch InvalidGitRepository exception in train.py * Readme update (#337) * README updates * Removing user specific path from config Authored-by: Fatemeh Zamanian <[email protected]> * Fixing #324 and #325 (#338) * update colormap to a non-discrete one -- fixes #324 * fix mask_to_disk to normalize by n_classes * changes to test.py * Updating data.py * bug fix * increased timeout time for main_build * retrigger build * retrigger the build * increase timeout * fixes 318 (#339) * finished 318 * increased checkerboard test timeout * fix 333 (#340) * added label correction to train gradient * changing the gradient data generator to take inline/crossline argument conssistent with the patchloader * changing variable name to be more descriptive Co-authored-by: maxkazmsft <[email protected]> * bug fix to model predictions (#345) * replace hrnet with seresnet in experiments - provides stable default model (#343) Co-authored-by: yalaudah <[email protected]> Co-authored-by: Fatemeh <[email protected]> * typos Co-authored-by: yalaudah <[email protected]> Co-authored-by: Fatemeh <[email protected]> Co-authored-by: yalaudah <[email protected]> Co-authored-by: Fatemeh <[email protected]> Co-authored-by: Sharat Chikkerur <[email protected]> Co-authored-by: kirasoderstrom <[email protected]> Co-authored-by: Sharat Chikkerur <[email protected]> Co-authored-by: Geisa Faustino <[email protected]> Co-authored-by: Ricardo Squassina Lee <[email protected]> Co-authored-by: Michael Zawacki <[email protected]> Co-authored-by: Anna Zietlow <[email protected]>
1 parent 15d45fb commit 080cf46


91 files changed: +4,088 −881 lines

Diff for: .azureml.example/config.json

+5
@@ -0,0 +1,5 @@
+{
+    "subscription_id": "input_sub_id",
+    "resource_group": "input_resource_group",
+    "workspace_name": "input_workspace_name"
+}
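This is the standard Azure ML workspace configuration layout. As a hedged illustration (not part of the commit), a script targeting the azureml-core SDK would typically pick this file up with `Workspace.from_config()`:

```python
# Minimal sketch, assuming the azureml-core SDK is installed and that
# .azureml/config.json has been populated with real values.
from azureml.core import Workspace

# from_config() searches for .azureml/config.json (or config.json) starting
# in the current directory and walking up the tree.
ws = Workspace.from_config()
print(ws.name, ws.resource_group, ws.subscription_id)
```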

Diff for: .env.example

+8
@@ -0,0 +1,8 @@
+BLOB_ACCOUNT_NAME=
+BLOB_CONTAINER_NAME=
+BLOB_ACCOUNT_KEY=
+BLOB_SUB_ID=
+AML_COMPUTE_CLUSTER_NAME=
+AML_COMPUTE_CLUSTER_MIN_NODES=
+AML_COMPUTE_CLUSTER_MAX_NODES=
+AML_COMPUTE_CLUSTER_SKU=
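A hedged sketch of how settings like these are commonly consumed at runtime (assuming the python-dotenv package; the pipeline scripts may load them differently):

```python
# Sketch only: load .env into the process environment and read the AML compute settings.
import os

from dotenv import load_dotenv  # pip install python-dotenv

load_dotenv()  # reads .env from the current working directory
cluster_name = os.getenv("AML_COMPUTE_CLUSTER_NAME")
min_nodes = int(os.getenv("AML_COMPUTE_CLUSTER_MIN_NODES", "0"))
max_nodes = int(os.getenv("AML_COMPUTE_CLUSTER_MAX_NODES", "1"))
print(cluster_name, min_nodes, max_nodes)
```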

Diff for: .gitignore

+5 −1
@@ -115,4 +115,8 @@ interpretation/environment/anaconda/local/src/cv-lib
 # Rope project settings
 .ropeproject

-*.pth
+*.pth
+
+# Seismic data files
+*.sgy
+*.segy

Diff for: README.md

+70 −52
Large diffs are not rendered by default.

Diff for: conftest.py

+2
@@ -0,0 +1,2 @@
+# Copyright (c) Microsoft Corporation. All rights reserved.
+# Licensed under the MIT License.

Diff for: contrib/README.md

+12
@@ -6,3 +6,15 @@ We encourage submissions to the contrib folder, and once they are well-tested, d

 Thank you.

+#### Azure Machine Learning
+If you would like to leverage Azure Machine Learning to create a Training Pipeline with this dataset we have guidance on how do so [here](interpretation/deepseismic_interpretation/azureml_pipelines/README.md)
+
+### HRNet model guidance (experimental for now)
+
+#### HRNet ImageNet weights model
+
+To enable training from scratch on seismic data and to achieve the same results as the benchmarks quoted below you will need to download the HRNet model [pretrained](https://github.com/HRNet/HRNet-Image-Classification) on ImageNet. We are specifically using the [HRNet-W48-C](https://1drv.ms/u/s!Aus8VCZ_C_33dKvqI6pBZlifgJk) pre-trained model; other HRNet variants are also available [here](https://github.com/HRNet/HRNet-Image-Classification) - you can navigate to those from the [main HRNet landing page](https://github.com/HRNet/HRNet-Object-Detection) for object detection.
+
+Unfortunately, the OneDrive location which is used to host the model is using a temporary authentication token, so there is no way for us to script up model download. There are two ways to upload and use the pre-trained HRNet model on DS VM:
+- download the model to your local drive using a web browser of your choice and then upload the model to the DS VM using something like `scp`; navigate to Portal and copy DS VM's public IP from the Overview panel of your DS VM (you can search your DS VM by name in the search bar of the Portal) then use `scp local_model_location username@DS_VM_public_IP:./model/save/path` to upload
+- alternatively, you can use the same public IP to open remote desktop over SSH to your Linux VM using [X2Go](https://wiki.x2go.org/doku.php/download:start): you can basically open the web browser on your VM this way and download the model to VM's disk
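After uploading the weights, a quick sanity check is to confirm the file loads as a PyTorch state dict. This is a hedged sketch (not part of the commit); the path below matches the one referenced in contrib/scripts/run_distributed.sh and is otherwise an assumption:

```python
# Sketch: verify the downloaded HRNet-W48-C checkpoint before pointing a config at it.
import torch

weights_path = "/home/alfred/models/hrnetv2_w48_imagenet_pretrained.pth"  # assumed upload location
state_dict = torch.load(weights_path, map_location="cpu")
print(f"checkpoint contains {len(state_dict)} tensors")  # ImageNet backbone weights
```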

Diff for: contrib/experiments/interpretation/dutchf3_section/README.md

+1 −1
@@ -19,7 +19,7 @@ Now you're all set to run training and testing experiments on the F3 Netherlands
 ### Monitoring progress with TensorBoard
 - from the this directory, run `tensorboard --logdir='output'` (all runtime logging information is
 written to the `output` folder
-- open a web-browser and go to either vmpublicip:6006 if running remotely or localhost:6006 if running locally
+- open a web-browser and go to either `<vm_public_ip>:6006` if running remotely or localhost:6006 if running locally
 > **NOTE**:If running remotely remember that the port must be open and accessible

 More information on Tensorboard can be found [here](https://www.tensorflow.org/get_started/summaries_and_tensorboard#launching_tensorboard).

Diff for: contrib/experiments/interpretation/penobscot/README.md

+1 −1
@@ -20,7 +20,7 @@ Also follow instructions for [downloading and preparing](../../../README.md#peno
 ### Monitoring progress with TensorBoard
 - from the this directory, run `tensorboard --logdir='output'` (all runtime logging information is
 written to the `output` folder
-- open a web-browser and go to either vmpublicip:6006 if running remotely or localhost:6006 if running locally
+- open a web-browser and go to either `<vm_public_ip>:6006` if running remotely or `localhost:6006` if running locally
 > **NOTE**:If running remotely remember that the port must be open and accessible

 More information on Tensorboard can be found [here](https://www.tensorflow.org/get_started/summaries_and_tensorboard#launching_tensorboard).

Diff for: scripts/run_all.sh renamed to contrib/scripts/run_all.sh

+1 −1
@@ -39,7 +39,7 @@ nohup time python train.py \
 # wait for python to pick up the runtime env before switching it
 sleep 1

-cd ../../dutchf3_patch/local
+cd ../../dutchf3_patch

 # patch based without skip connections
 export CUDA_VISIBLE_DEVICES=2

Diff for: scripts/run_distributed.sh renamed to contrib/scripts/run_distributed.sh

+6 −3
@@ -1,7 +1,11 @@
 #!/bin/bash

 # number of GPUs to train on
-NGPU=8
+NGPUS=$(nvidia-smi -L | wc -l)
+if [ "$NGPUS" -lt "2" ]; then
+    echo "ERROR: cannot run distributed training without 2 or more GPUs."
+    exit 1
+fi
 # specify pretrained HRNet backbone
 PRETRAINED_HRNET='/home/alfred/models/hrnetv2_w48_imagenet_pretrained.pth'
 # DATA_F3='/home/alfred/data/dutch/data'
@@ -15,9 +19,8 @@ unset CUDA_VISIBLE_DEVICES
 # bug to fix conda not launching from a bash shell
 source /data/anaconda/etc/profile.d/conda.sh
 conda activate seismic-interpretation
-export PYTHONPATH=/storage/repos/forks/seismic-deeplearning-1/interpretation:$PYTHONPATH

-cd experiments/interpretation/dutchf3_patch/distributed/
+cd experiments/interpretation/dutchf3_patch/

 # patch based without skip connections
 nohup time python -m torch.distributed.launch --nproc_per_node=${NGPU} train.py \
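For reference, the GPU-count guard added above can also be expressed in Python before launching distributed training; a hedged sketch (illustration only, not part of the commit) using torch.cuda.device_count():

```python
# Sketch: refuse to start distributed training with fewer than 2 visible GPUs.
import sys

import torch

n_gpus = torch.cuda.device_count()
if n_gpus < 2:
    sys.exit(f"ERROR: cannot run distributed training with {n_gpus} GPU(s); 2 or more are required.")
print(f"launching with --nproc_per_node={n_gpus}")
```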

Diff for: scripts/test_all.sh renamed to contrib/scripts/test_all.sh

+2 −2
@@ -59,7 +59,7 @@ nohup time python test.py \
 --cfg "configs/${CONFIG_NAME}.yaml" > ${CONFIG_NAME}_test.log 2>&1 &
 sleep 1

-cd ../../dutchf3_patch/local
+cd ../../dutchf3_patch

 # patch based without skip connections
 export CUDA_VISIBLE_DEVICES=2
@@ -140,7 +140,7 @@ wait

 # scoring scripts are in the local folder
 # models are in the distributed folder
-cd ../../dutchf3_patch/local
+cd ../../dutchf3_patch

 # patch based without skip connections
 export CUDA_VISIBLE_DEVICES=2

Diff for: contrib/tests/cicd/aml_build.yml

+110
@@ -0,0 +1,110 @@
+# Copyright (c) Microsoft Corporation. All rights reserved.
+# Licensed under the MIT License.
+
+# Pull request against these branches will trigger this build
+pr:
+- master
+- staging
+- contrib
+
+# Any commit to this branch will trigger the build.
+trigger:
+- master
+- staging
+- contrib
+
+jobs:
+
+# partially disable setup for now - done manually on build VM
+- job: setup
+  timeoutInMinutes: 10
+  displayName: Setup
+  pool:
+    name: deepseismicagentpool
+  steps:
+  - bash: |
+      # terminate as soon as any internal script fails
+      set -e
+
+      echo "Running setup..."
+      pwd
+      ls
+      git branch
+      uname -ra
+
+      # TODO: uncomment in the next release to bring back AML
+      # # setup run environment
+      # ./scripts/env_reinstall.sh
+      #
+      # # use hardcoded root for now because not sure how env changes under ADO policy
+      # DATA_ROOT="/home/alfred/data_dynamic"
+      # ./tests/cicd/src/scripts/get_data_for_builds.sh ${DATA_ROOT}
+      #
+      # # upload pre-processed data to AML build WASB storage - overwrites by default and auto-creates container name
+      # azcopy --quiet --recursive \
+      #   --source ${DATA_ROOT}/dutch_f3/data --destination https://${BLOB_ACCOUNT_NAME}.blob.core.windows.net/${BLOB_CONTAINER_NAME}/data \
+      #   --dest-key ${BLOB_ACCOUNT_KEY}
+    # env:
+    #   BLOB_ACCOUNT_NAME: $(amlbuildstore)
+    #   BLOB_CONTAINER_NAME: "amlbuild"
+    #   BLOB_ACCOUNT_KEY: $(amlbuildstorekey)
+#
+#
+#- job: AML_pipeline_tests
+#  dependsOn: setup
+#  timeoutInMinutes: 20
+#  displayName: AML pipeline tests
+#  pool:
+#    name: deepseismicagentpool
+#  steps:
+#  - bash: |
+#      source activate seismic-interpretation
+#      # TODO: add code which launches your pytest files ("pytest sometest" OR "python test.py")
+#      # data is in $(amlbuildstore).blob.core.windows.net/amlbuild/data (container amlbuild, virtual folder data)
+#      # storage key is $(amlbuildstorekey)
+#      az --version
+#      az account show
+#      az login --service-principal -u $SPIDENTITY -p $SPECRET --tenant $SPTENANT
+#      az account set --subscription $SUB_ID
+#      mkdir .azureml
+#      cat <<EOF > .azureml/config.json
+#      {
+#          "subscription_id": "$SUB_ID",
+#          "resource_group": "$RESOURCE_GROUP",
+#          "workspace_name": "$WORKSPACE_NAME"
+#      }
+#      EOF
+#      pytest interpretation/tests/test_train_pipeline.py || EXITCODE=123
+#      exit $EXITCODE
+#      pytest
+#    env:
+#      SUB_ID: $(subscription_id)
+#      RESOURCE_GROUP: $(resource_group)
+#      WORKSPACE_NAME: $(workspace_name)
+#      BLOB_ACCOUNT_NAME: $(amlbuildstore)
+#      BLOB_CONTAINER_NAME: "amlbuild"
+#      BLOB_ACCOUNT_KEY: $(amlbuildstorekey)
+#      BLOB_SUB_ID: $(subscription_id)
+#      AML_COMPUTE_CLUSTER_NAME: "testcluster"
+#      AML_COMPUTE_CLUSTER_MIN_NODES: "1"
+#      AML_COMPUTE_CLUSTER_MAX_NODES: "8"
+#      AML_COMPUTE_CLUSTER_SKU: "STANDARD_NC6"
+#      SPIDENTITY: $(spidentity)
+#      SPECRET: $(spsecret)
+#      SPTENANT: $(sptenant)
+#    displayName: 'integration tests'
+
+# - job: AML_short_pipeline_test
+#   dependsOn: setup
+#   timeoutInMinutes: 5
+#   displayName: AML short pipeline test
+#   pool:
+#     name: deepseismicagentpool
+#   steps:
+#   - bash: |
+#       source activate seismic-interpretation
+#       # TODO: OPTIONAL! Add a job which launches entire training pipeline for 1 epoch of training (train model for single epoch)
+#       # if you don't want this then delete the entire job from this file
+#       python interpretation/deepseismic_interpretation/azureml_pipelines/dev/kickoff_train_pipeline.py --experiment=DEV-train-pipeline-name --orchestrator_config=orchestrator_config="interpretation/deepseismic_interpretation/azureml_pipelines/pipeline_config.json"

Diff for: cv_lib/cv_lib/__init__.py

+2
@@ -0,0 +1,2 @@
+# Copyright (c) Microsoft Corporation. All rights reserved.
+# Licensed under the MIT License.

Diff for: cv_lib/cv_lib/event_handlers/__init__.py

+1 −1
@@ -31,7 +31,7 @@ def _create_checkpoint_handler(self):
     def __call__(self, engine, to_save):
         self._checkpoint_handler(engine, to_save)
         if self._snapshot_function():
-            files = glob.glob(os.path.join(self._model_save_location, self._running_model_prefix + "*"))
+            files = glob.glob(os.path.join(self._model_save_location, self._running_model_prefix + "*"))
             name_postfix = os.path.basename(files[0]).lstrip(self._running_model_prefix)
             copyfile(
                 files[0],

Diff for: cv_lib/cv_lib/event_handlers/azureml_handlers.py

+2
@@ -0,0 +1,2 @@
+# Copyright (c) Microsoft Corporation. All rights reserved.
+# Licensed under the MIT License.

Diff for: cv_lib/cv_lib/event_handlers/tensorboard_handlers.py

+5 −4
@@ -10,6 +10,7 @@
 from cv_lib.segmentation.dutchf3.utils import np_to_tb
 from cv_lib.utils import decode_segmap

+
 def create_summary_writer(log_dir):
     writer = SummaryWriter(logdir=log_dir)
     return writer
@@ -20,9 +21,9 @@ def _transform_image(output_tensor):
     return torchvision.utils.make_grid(output_tensor, normalize=True, scale_each=True)


-def _transform_pred(output_tensor):
+def _transform_pred(output_tensor, n_classes):
     output_tensor = output_tensor.squeeze().cpu().numpy()
-    decoded = decode_segmap(output_tensor)
+    decoded = decode_segmap(output_tensor, n_classes)
     return torchvision.utils.make_grid(np_to_tb(decoded), normalize=False, scale_each=False)


@@ -111,5 +112,5 @@ def log_results(engine, evaluator, summary_writer, n_classes, stage):
     y_pred[mask == 255] = 255

     summary_writer.add_image(f"{stage}/Image", _transform_image(image), epoch)
-    summary_writer.add_image(f"{stage}/Mask", _transform_pred(mask), epoch)
-    summary_writer.add_image(f"{stage}/Pred", _transform_pred(y_pred), epoch)
+    summary_writer.add_image(f"{stage}/Mask", _transform_pred(mask, n_classes), epoch)
+    summary_writer.add_image(f"{stage}/Pred", _transform_pred(y_pred, n_classes), epoch)
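The change threads n_classes through to decode_segmap so the TensorBoard mask and prediction images are normalized by the actual class count (part of the #324/#325 fix). A hedged, self-contained sketch of the idea; decode_to_rgb below is a hypothetical stand-in for cv_lib.utils.decode_segmap:

```python
# Sketch: map an integer label mask to RGB, scaling the colormap by n_classes.
import matplotlib.pyplot as plt
import numpy as np


def decode_to_rgb(label_mask: np.ndarray, n_classes: int) -> np.ndarray:
    """Hypothetical stand-in for decode_segmap: normalize labels by the class count, then colormap."""
    normalized = label_mask.astype(float) / max(n_classes - 1, 1)  # guard against division by zero
    return plt.get_cmap("viridis")(normalized)[..., :3]  # drop the alpha channel


mask = np.random.randint(0, 6, size=(64, 64))
print(decode_to_rgb(mask, n_classes=6).shape)  # (64, 64, 3)
```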

Diff for: cv_lib/cv_lib/segmentation/dutchf3/__init__.py

+2
@@ -0,0 +1,2 @@
+# Copyright (c) Microsoft Corporation. All rights reserved.
+# Licensed under the MIT License.

Diff for: cv_lib/cv_lib/segmentation/dutchf3/utils.py

-1
@@ -37,4 +37,3 @@ def git_branch():
 def git_hash():
     repo = Repo(search_parent_directories=True)
     return repo.active_branch.commit.hexsha
-

Diff for: cv_lib/cv_lib/segmentation/models/patch_deconvnet_skip.py

+1
@@ -304,4 +304,5 @@ def get_seg_model(cfg, **kwargs):
         cfg.MODEL.IN_CHANNELS == 1
     ), f"Patch deconvnet is not implemented to accept {cfg.MODEL.IN_CHANNELS} channels. Please only pass 1 for cfg.MODEL.IN_CHANNELS"
     model = patch_deconvnet_skip(n_classes=cfg.DATASET.NUM_CLASSES)
+
     return model

Diff for: cv_lib/cv_lib/segmentation/models/resnet_unet.py

+5
@@ -1,11 +1,16 @@
 # Copyright (c) Microsoft Corporation.
 # Licensed under the MIT License.

+import logging
+import os
+
 import torch
 import torch.nn as nn
 import torch.nn.functional as F
 import torchvision

+logger = logging.getLogger(__name__)
+

 class FPAv2(nn.Module):
     def __init__(self, input_dim, output_dim):

Diff for: cv_lib/cv_lib/segmentation/models/section_deconvnet.py

+1
@@ -304,4 +304,5 @@ def get_seg_model(cfg, **kwargs):
         cfg.MODEL.IN_CHANNELS == 1
     ), f"Section deconvnet is not implemented to accept {cfg.MODEL.IN_CHANNELS} channels. Please only pass 1 for cfg.MODEL.IN_CHANNELS"
     model = section_deconvnet(n_classes=cfg.DATASET.NUM_CLASSES)
+
     return model

Diff for: cv_lib/cv_lib/segmentation/models/section_deconvnet_skip.py

+1
@@ -304,4 +304,5 @@ def get_seg_model(cfg, **kwargs):
         cfg.MODEL.IN_CHANNELS == 1
     ), f"Section deconvnet is not implemented to accept {cfg.MODEL.IN_CHANNELS} channels. Please only pass 1 for cfg.MODEL.IN_CHANNELS"
     model = section_deconvnet_skip(n_classes=cfg.DATASET.NUM_CLASSES)
+
     return model

Diff for: cv_lib/cv_lib/segmentation/models/seg_hrnet.py

+4 −5
@@ -430,21 +430,20 @@ def init_weights(

         if pretrained and not os.path.isfile(pretrained):
             raise FileNotFoundError(f"The file {pretrained} was not found. Please supply correct path or leave empty")
-
+
         if os.path.isfile(pretrained):
             pretrained_dict = torch.load(pretrained)
             logger.info("=> loading pretrained model {}".format(pretrained))
             model_dict = self.state_dict()
             pretrained_dict = {k: v for k, v in pretrained_dict.items() if k in model_dict.keys()}
             for k, _ in pretrained_dict.items():
-                logger.info(
-                    '=> loading {} pretrained model {}'.format(k, pretrained))
+                logger.info("=> loading {} pretrained model {}".format(k, pretrained))
             model_dict.update(pretrained_dict)
             self.load_state_dict(model_dict)


 def get_seg_model(cfg, **kwargs):
     model = HighResolutionNet(cfg, **kwargs)
-    model.init_weights(cfg.MODEL.PRETRAINED)
-
+    if "PRETRAINED" in cfg.MODEL.keys():
+        model.init_weights(cfg.MODEL.PRETRAINED)
     return model
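With this guard, the ImageNet weights become optional: init_weights runs only when the experiment config defines MODEL.PRETRAINED. A hedged sketch of the toggle (assuming yacs-style configs, as used by the experiments; the path is illustrative):

```python
# Sketch: MODEL.PRETRAINED is only consulted when the key exists in the config.
from yacs.config import CfgNode as CN

cfg = CN()
cfg.MODEL = CN()
# Uncomment to load the ImageNet-pretrained HRNet backbone:
# cfg.MODEL.PRETRAINED = "/models/hrnetv2_w48_imagenet_pretrained.pth"  # illustrative path

if "PRETRAINED" in cfg.MODEL.keys():
    print("init_weights would load", cfg.MODEL.PRETRAINED)
else:
    print("no PRETRAINED key: the model keeps its random initialization")
```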

Diff for: cv_lib/cv_lib/segmentation/models/unet.py

+1
@@ -113,4 +113,5 @@ def forward(self, x):

 def get_seg_model(cfg, **kwargs):
     model = UNet(cfg.MODEL.IN_CHANNELS, cfg.DATASET.NUM_CLASSES)
+
     return model

Diff for: cv_lib/cv_lib/segmentation/utils.py

+1 −2
@@ -3,7 +3,6 @@

 import numpy as np

+
 def _chw_to_hwc(image_array_numpy):
     return np.moveaxis(image_array_numpy, 0, -1)
-
-

0 commit comments
