This repository contains the authors' demonstration implementation of the group equivariant Ambisonic signal processing DNNs described in [1].
This repository (except submodules) is released under a specific license. Read the LICENSE file in this repository before you download and use this software.
The submodule `seld-dcase2019` is distributed under its own license. The script `fCGModule.py` is taken from zlin7/CGNet, which was originally released under the MIT License.
```
.
├── LICENSE
├── README.md
├── adversarial_attack.py
├── article_figure
│   └── taslp
├── boot_tensorboard.sh
├── checkpoints
├── dcase19_dataset.py
├── docker
│   ├── Dockerfile
│   └── build.sh
├── evaluation.py
├── fCGModule.py
├── feature_extraction.py
├── login_torch_sh.sh
├── main.py
├── math_util.py
├── models.py
├── modules.py
├── parameter.py
├── render_taslp_fig3.py
├── render_taslp_fig4.py
├── result
├── ret_adv
├── ret_eval
├── run_adversarial_attack.sh
├── run_experiment.sh
└── seld-dcase2019
```
We assume an environment in which `docker/Dockerfile` works appropriately.
- Clone this repository.

  ```sh
  git clone --recursive https://github.com/nttrd-mdlab/group-equiv-seld
  cd group-equiv-seld
  ```
- Build the Docker environment.

  ```sh
  $ cd docker
  $ ./build.sh
  > ...
  > Successfully built 31cc484c9976
  > Successfully tagged cgdcase:0.2
  $ cd ../
  ```
- Download the dataset files from the link on this website. You need `foa_dev.z**`, `metadata_dev.zip`, `foa_eval.zip`, and `metadata_eval.zip`. Then generate the normalized dataset using `feature_extraction.py` (do not forget to rewrite the path to the downloaded files in `feature_extraction.py`; see the sketch after the commands below).

  ```sh
  ./login_torch_sh.sh
  python3 feature_extraction.py
  exit
  ```
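  As a rough guide, the path rewrite in `feature_extraction.py` amounts to pointing a couple of directory variables at the downloaded data. The sketch below is hypothetical: the variable names (`dataset_dir`, `feature_dir`) are placeholders, and the actual names in the script may differ.

  ```python
  # Hypothetical sketch only -- the variable names in the real feature_extraction.py may differ.

  # Directory containing the unzipped foa_dev / foa_eval / metadata_dev / metadata_eval data.
  dataset_dir = '/path/to/dcase2019_task3_dataset'

  # Directory where the normalized features will be written.
  feature_dir = '/path/to/normalized_features'
  ```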
- Start model training.

  ```sh
  ./run_experiment.sh 0  # Specify the GPU number (0-origin) as the argument
  ```

  The trained model is saved to `./checkpoints`, and the log is saved to `./result`.
- Change the experiment conditions by rewriting `parameter.py` and re-running `./run_experiment.sh` (a sketch of the relevant settings follows this list):
  - Toggle `model=['Conventional', 'Proposed'][1]` to `[0]` to test the baseline model.
  - Toggle `scale_equivariance=True` to `False` to disable the scale equivariance of the proposed method.
  - Switch `train_rotation_bias=['virtual_rot', 'azi_random', None][0]` to `[1]` to enable rotational data augmentation.
  - Rewrite `feature_phase_different_bin=0` to `None` to disable the time translation invariance of the proposed method.
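  Taken together, the switches described above correspond roughly to the following lines in `parameter.py`. This is a sketch based only on the settings listed here; the rest of the file is not shown and may differ.

  ```python
  # Sketch of the experiment switches in parameter.py described above.
  # Only these four assignments are taken from this README; surrounding code may differ.

  # Index 1 ('Proposed') selects the group equivariant model; use index 0 for the baseline.
  model = ['Conventional', 'Proposed'][1]

  # Set to False to disable scale equivariance of the proposed method.
  scale_equivariance = True

  # Index 0 = virtual rotation; switch to index 1 ('azi_random') for rotational data augmentation.
  train_rotation_bias = ['virtual_rot', 'azi_random', None][0]

  # Set to None to disable time translation invariance of the proposed method.
  feature_phase_different_bin = 0
  ```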
- Check and compare performance.

  Evaluate a trained model:

  ```sh
  $ ./login_torch_sh.sh 0
  $ python3 evaluation.py --resume ./checkpoints/(name of checkpoint file).checkpoint
  $ exit
  ```

  Compare the progress of trained (or currently training) models:

  ```sh
  $ ./boot_tensorboard.sh
  ```

  Then view http://localhost:6006 with your browser.
- Render the figures in the paper:

  ```sh
  $ ./login_torch_sh.sh 0
  $ python3 render_taslp_fig3.py
  $ python3 render_taslp_fig4.py
  $ exit
  ```
- Run the adversarial attack experiment.

  ```sh
  $ ./run_adversarial_attack.sh 0 ./checkpoints/(name of checkpoint file).checkpoint (output file name)
  ```
- [1] R. Sato, K. Niwa, and K. Kobayashi, "Ambisonic Signal Processing DNNs Guaranteeing Rotation, Scale and Time Translation Equivariance," IEEE/ACM Transactions on Audio, Speech, and Language Processing, 2021 (to be published).