This is the repository of baseline systems for the Multimodal Information Based Speech Processing (MISP) Challenge 2021. It provides a baseline system for Task 1 (Audio-Visual Wake Word Spotting) in the task1_wws folder, and an NN-HMM baseline system for Task 2 (Audio-Visual Speech Recognition with Oracle Speaker Diarization) in the task2_avsr_nn_hmm folder. Please refer to each folder for details.
We also provide an end-to-end baseline system for Task 2; please refer to the task2_avsr_e2e repository for details.