This repository contains code for the following papers:
[1] Kristina Tesch, Nils-Hendrik Mohrmann, and Timo Gerkmann, "On the Role of Spatial, Spectral, and Temporal Processing for DNN-based Non-linear Multi-channel Speech Enhancement", Proceedings of Interspeech, pp. 2908-2912, 2022, [arxiv], [audio examples]
[2] Kristina Tesch and Timo Gerkmann, "Insights into Deep Non-linear Filters for Improved Multi-channel Speech Enhancement", IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 31, pp. 563-575, 2023, [audio examples]
[3] Kristina Tesch and Timo Gerkmann, "Spatially Selective Deep Non-linear Filters for Speaker Extraction", accepted for ICASSP 2023, [audio examples]
[4] Kristina Tesch and Timo Gerkmann, "Multi-channel Speech Separation Using Spatially Selective Deep Non-linear Filters", IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 32, pp. 542-553, 2024, [audio examples]
Take a look at a video of our real-time multi-channel enhancement demo: http://uhh.de/inf-sp-jnf-demo
To train a model for the fixed-speaker-position setup:
- Prepare a dataset by running `data_gen_fixed_pos.py`.
- Prepare a config file. Examples can be found in the `config` folder.
- Run the training script in the `scripts` folder (adjust the path so that it points to your config file). Example commands are sketched below.
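A minimal sketch of these steps is given below. Only `data_gen_fixed_pos.py` and the `config`/`scripts` folders are named in this README; the training entry point, its `--config` flag, and the config file name are assumptions and should be checked against the actual contents of the `scripts` and `config` folders.

```bash
# Generate the fixed-position dataset.
python data_gen_fixed_pos.py

# Launch training; the script name, flag, and config path below are placeholders.
# Substitute the actual training script from the scripts folder and your own config.
python scripts/train.py --config config/fixed_pos_example.yaml
```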
To train a model for the variable-speaker-position setup:
- Prepare a dataset by running `data_gen_var_pos.py`.
- Prepare a config file. Examples can be found in the `config` folder.
- Run the training script in the `scripts` folder (adjust the path so that it points to your config file). See the sketch after this list.
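The variable-position workflow mirrors the one above; only `data_gen_var_pos.py` comes from the steps in this README, and the training command and config path remain placeholders to be replaced with the actual script and your own config.

```bash
# Generate the variable-position dataset.
python data_gen_var_pos.py

# Train with a config adapted for the variable-position setup (placeholder paths).
python scripts/train.py --config config/var_pos_example.yaml
```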