
Adding new vocab doesn't save the model #773

Open

andymvp2018 opened this issue Nov 4, 2024 · 4 comments

Comments

@andymvp2018

System Info

8 A100 GPUs

Information

  • The official example scripts
  • My own modified scripts

🐛 Describe the bug

In the finetuning.py script, right after
https://github.com/meta-llama/llama-recipes/blob/main/src/llama_recipes/finetuning.py#L188
I added new tokens and resized the embeddings:

tokenizer.add_tokens(['wreqw', 'ewqr', 'weqrqewrqw', ...])
model.resize_token_embeddings(len(tokenizer))

But after training finished and the model was saved, I converted it from an FSDP checkpoint into a Hugging Face checkpoint and saw that
model.get_input_embeddings().weight.shape[0] is still the pre-resize vocabulary size, which means the newly added embedding rows weren't saved.
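The mismatch boils down to the embedding matrix losing its extra rows between training and conversion. A minimal sketch of what the check above observes, simulating the two embedding matrices with NumPy (the vocab size of 32000, the 3 added tokens, and the tiny hidden size are made-up illustration values, not taken from the issue):

```python
import numpy as np

# Hypothetical sizes for illustration: a 32000-token base vocab plus 3 added tokens.
base_vocab_size = 32000
added_tokens = 3
expanded_vocab_size = base_vocab_size + added_tokens
hidden_size = 16  # tiny stand-in for the real hidden dimension

# What the model holds in memory after model.resize_token_embeddings(len(tokenizer)):
resized_embeddings = np.zeros((expanded_vocab_size, hidden_size))

# What the converted HF checkpoint reports instead, per this issue:
saved_embeddings = np.zeros((base_vocab_size, hidden_size))

print(resized_embeddings.shape[0])  # 32003
print(saved_embeddings.shape[0])    # 32000 -- the resize was lost on save/convert
```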

Error logs

N/A

Expected behavior

The converted model's input embedding matrix should have len(tokenizer) rows, i.e. the enlarged vocabulary size.

@jeffxtang
Contributor

@wukaixingxp @mreso can you please take a look?

@wukaixingxp
Contributor

@andymvp2018 Can you show me the complete log of how you trained the model and how you converted the FSDP checkpoint to an HF model? What command did you use?

@andymvp2018
Author

andymvp2018 commented Nov 4, 2024

For the training, I just used https://github.com/meta-llama/llama-recipes/blob/main/src/llama_recipes/finetuning.py#L188.

For converting FSDP to HF:

python src/llama_recipes/inference/checkpoint_converter_fsdp_hf.py --fsdp_checkpoint_path fsdp_path --consolidated_model_path hf_path
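One way to confirm what the converter produced is to look at the input-embedding weight in the converted state dict and compare its row count to len(tokenizer). A minimal sketch; the state-dict key follows the usual Llama HF naming (`model.embed_tokens.weight`), and the fake dict below stands in for what `torch.load` on the converted checkpoint would return:

```python
import numpy as np

def embedding_rows(state_dict, key="model.embed_tokens.weight"):
    """Return the vocabulary dimension (row count) of the input-embedding
    weight in a Llama-style HF state dict; raises KeyError if absent."""
    return state_dict[key].shape[0]

# In practice state_dict would come from torch.load on the converted checkpoint;
# here a fake dict with a made-up 32003-row matrix stands in for it.
fake_sd = {"model.embed_tokens.weight": np.zeros((32003, 8))}
print(embedding_rows(fake_sd))  # 32003 -- should equal len(tokenizer) after adding tokens
```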

@andymvp2018
Author

I think the problem is probably due to https://github.com/meta-llama/llama-recipes/blob/main/src/llama_recipes/tools/convert_hf_weights_to_llama.py#L45;

here, we should also change the dimensionality of the model (i.e., add the new tokens and then resize the embeddings).
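The kind of fix being suggested here, sketched in isolation: before loading the trained weights, grow the embedding matrix to the new vocabulary size, initializing the new rows from the mean of the existing ones (a common heuristic, similar in spirit to what transformers' resize_token_embeddings does). The function name and the toy sizes are hypothetical, not from the converter script:

```python
import numpy as np

def grow_embeddings(weight: np.ndarray, new_vocab_size: int) -> np.ndarray:
    """Pad an (old_vocab, hidden) embedding matrix to new_vocab_size rows.

    New rows are initialized to the mean of the existing embeddings;
    if new_vocab_size is not larger, the matrix is returned unchanged.
    """
    old_vocab_size, _hidden = weight.shape
    if new_vocab_size <= old_vocab_size:
        return weight
    mean_row = weight.mean(axis=0, keepdims=True)
    new_rows = np.repeat(mean_row, new_vocab_size - old_vocab_size, axis=0)
    return np.concatenate([weight, new_rows], axis=0)

# Toy example: a 4-token vocab grown to 6 rows after adding 2 tokens.
w = np.arange(8, dtype=np.float64).reshape(4, 2)
grown = grow_embeddings(w, 6)
print(grown.shape)  # (6, 2)
print(grown[4])     # each new row is the mean of the original rows
```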
