
4x8 block computation in vec_avx.h #185

Open

Jo0o0Hyung opened this issue Apr 18, 2022 · 2 comments

Jo0o0Hyung commented Apr 18, 2022

@jmvalin, thank you for sharing your code on GitHub.
I have a question about the 8x4 block computation in vec_avx.h.
(My question is based on the lpcnet_efficiency branch, after reading the paper "Neural Speech Synthesis on a Shoestring: Improving the Efficiency of LPCNet".)

https://github.com/xiph/LPCNet/blob/lpcnet_efficiency/src/vec_avx.h#L788
At the linked line, the vector_ps_to_epi8 function converts the float-typed state (_x) into an unsigned char state (x), and to turn the signed values into unsigned ones, the scalar value 127 is added.
I understand that this operation is done so that _mm256_maddubs_epi16 can be applied to vxj and vw.
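
For reference, this is a scalar sketch of my understanding of what vector_ps_to_epi8 does (the rounding and clamping details here are my own assumption, not the exact AVX code):

#include <math.h>

/* Scalar sketch (an assumption, not the actual vector_ps_to_epi8 code):
 * the float activation, roughly in [-1, 1], is scaled by 127 and offset
 * by 127 so it fits in an unsigned byte, which is what
 * _mm256_maddubs_epi16 expects for its first operand. */
static void ps_to_epi8_sketch(unsigned char *x, const float *_x, int len)
{
    for (int i = 0; i < len; i++) {
        int q = (int)floorf(.5f + 127.f*_x[i]) + 127;  /* round, then add the 127 offset */
        if (q < 0) q = 0;                              /* clamp to the unsigned char range */
        if (q > 255) q = 255;
        x[i] = (unsigned char)q;
    }
}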

But in my opinion, after this operation there should be some kind of 'compensation' step that subtracts the added value of 127 (because the GRU state is not confined to the unsigned char range during training).
So, for example, I think the following line should be added after the aforementioned operation:

// Before
tmp = _mm256_maddubs_epi16(vxj, vw);    /* u8 x s8 multiply-add; vxj still carries the +127 offset */
tmp = _mm256_madd_epi16(tmp, ones);

// After
tmp = _mm256_maddubs_epi16(vxj, vw);
tmp = _mm256_sub_epi16(tmp, _mm256_maddubs_epi16(const_127i, vw));  /* const_127i: a vector of 127s; removes the 127*w term */
tmp = _mm256_madd_epi16(tmp, ones);

In summary, I reckon that 127 should be subtracted during the 8x4 sgemv operation to compensate for the value of 127 added in vector_ps_to_epi8, and I wonder if I have missed anything.
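
To spell out the arithmetic behind this (my own derivation, just to make the concern explicit): with the offset, the integer dot product computes sum_j w_j * (x_j + 127) = sum_j w_j * x_j + 127 * sum_j w_j, so there is an extra 127 * sum_j w_j term that depends only on the weights and has to be removed somewhere, either per frame (as in the "After" snippet above) or by folding it into a constant.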

In addition, and unrelated to the question above, I wonder why Gaussian noise is added between GRU_A and GRU_B:

gru_out1, _ = rnn(rnn_in)                                   # GRU_A
gru_out1 = GaussianNoise(.005)(gru_out1)                    # noise injected between GRU_A and GRU_B
gru_out2, _ = rnn2(Concatenate()([gru_out1, rep(cfeat)]))   # GRU_B

Thanks.

jmvalin (Member) commented Apr 18, 2022

Regarding the 127 offset, you're right that there needs to be a compensation. That's done offline by changing the bias values we add at the end; see the "bias" vs "subias" values in the model ("su" stands for "signed*unsigned" multiply). As for the GaussianNoise, it kinda simulates the quantization noise we add to the activations when going from float to int8.

Jo0o0Hyung (Author) commented:

Thank you for the reply. :-)
Thanks to your comment, I understand the role of "subias" in the dump_lpcnet.py and nnet.c scripts.
I also checked that the wav is synthesized well without the line
tmp = _mm256_sub_epi16(tmp, _mm256_maddubs_epi16(const_127i, vw));
when the "subias" values are used.
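
For anyone who finds this later, here is a minimal sketch of how I understand the offline compensation (my own notation and assumptions about the scaling constants, not the actual dump_lpcnet.py code): since the +127 offset on the activations corresponds to +1.0 in the float domain, the spurious 127*sum_j(w_ij) term can be folded into the bias once, ahead of time.

/* Sketch only: fold the offset contribution into a "subias" so that no
 * per-frame subtraction is needed in the int8 sgemv. Assumes column-major
 * weights (weights[j*rows + i] feeds output i) and an activation scale of
 * 127, so the +127 offset is worth exactly +1.0 in the float domain. */
void compute_subias(float *subias, const float *bias, const float *weights,
                    int rows, int cols)
{
    for (int i = 0; i < rows; i++) {
        float wsum = 0.f;
        for (int j = 0; j < cols; j++)
            wsum += weights[j*rows + i];   /* sum of the weights feeding output i */
        subias[i] = bias[i] - wsum;        /* remove the constant offset term once */
    }
}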
