Hi there,
I am wondering what fbank actually gives us in the dataloader. I went through the torchaudio docs and did not find much information about what it is. Does anyone have a link to an explanation?
Thank you,
I have the same question. The paper claims that 'we transform audio recordings into Mel spectrograms and divide them into non-overlapped regular grid patches', but the codebase seems to use fbank instead of a spectrogram. Is there a reason for this?