unfreeze_layer_from_past parameter #25
Nice repo!!!

It seems that the default parameters for the policy freeze all layers of the language model and update only the lm_head. I tried the provided flan-T5 example here: https://colab.research.google.com/drive/1DYHt0mi6cyl8ZTMJEkMNpsSZCCvR4jM1?usp=sharing

When I changed the value of unfreeze_layer_from_past to 1, so that the weights of the final layer of flan-t5 are also updated, the behavior changed: the actor started to generate empty text, and after training it still gave me empty text.

What is the reason for this behavior?

NOTE: I did not change anything else in the flan-t5 code example.
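For concreteness, the change described above amounts to something like the following sketch. It assumes the TextRLActor construction used in the linked notebook; the surrounding env, model, and tokenizer setup is as in the example, and only the unfreeze_layer_from_past value is modified.

```python
from textrl import TextRLActor

# `env`, `model`, and `tokenizer` are set up as in the notebook; the only
# modification is the final keyword argument. With the default value (0),
# the language model stays frozen and only the lm_head is trained.
actor = TextRLActor(env, model, tokenizer,
                    unfreeze_layer_from_past=1)  # also unfreeze the last layer
```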
Comments

I observed the same thing. I also tried penalizing the generation of the '_' token directly in the reward function. Unfortunately, it does not seem to learn how to stop generating the blank token.
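A penalty of that kind might look like the following sketch. It assumes TextRL's TextRLEnv.get_reward interface as shown in the repository's README; the class name, the -1.0 penalty scale, and the way blanks are counted are illustrative choices, not the commenter's actual code.

```python
from textrl import TextRLEnv

class BlankPenaltyEnv(TextRLEnv):
    """Illustrative sketch: subtract a penalty for every SentencePiece
    blank ('▁'/'_') token in the finished generation."""

    def get_reward(self, input_item, predicted_list, finish):
        reward = [0]
        if finish:
            # Count generated tokens that consist only of blank markers.
            blanks = sum(1 for tok in predicted_list[0]
                         if tok.strip('▁_ ') == '')
            reward = [-1.0 * blanks]  # the penalty scale is arbitrary
        return reward
```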
Hi all, the issue is probably caused by https://github.com/huggingface/transformers/blob/bffac926ca6bc6c965a92bfbfd00c567a2c0fb90/src/transformers/models/t5/modeling_t5.py#L1147C8-L1147C8: it adds a position_bias after each layer's output, so the newly initialized model will perform badly.
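The referenced line sits in T5Stack.forward, where the relative position bias computed by the first block is reused by every later block. The toy module below is a simplified illustration of that sharing pattern, not the actual transformers implementation:

```python
import torch
from torch import nn

class ToyBlock(nn.Module):
    """Stand-in for a T5 block: it computes a relative position bias only
    when none is passed in, mirroring the real T5 blocks."""
    def __init__(self, dim, num_heads=4):
        super().__init__()
        self.linear = nn.Linear(dim, dim)
        self.num_heads = num_heads

    def forward(self, hidden_states, position_bias=None):
        if position_bias is None:
            seq_len = hidden_states.size(1)
            position_bias = torch.zeros(1, self.num_heads, seq_len, seq_len)
        # (attention using position_bias would happen here)
        return self.linear(hidden_states), position_bias

class ToyStack(nn.Module):
    """Simplified version of the sharing in T5Stack.forward: the first
    block produces position_bias and all later blocks reuse it."""
    def __init__(self, dim, n_layers):
        super().__init__()
        self.block = nn.ModuleList(ToyBlock(dim) for _ in range(n_layers))

    def forward(self, hidden_states):
        position_bias = None
        for blk in self.block:
            hidden_states, position_bias = blk(hidden_states, position_bias)
        return hidden_states
```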
Hey! Did you guys figure out a solution to this problem? Thanks!
Unfortunately not yet. I spent a lot of time trying to figure out a way to do it with this library, but I ended up leaving it (at least for now).