[LoRA] CogView4 #10981
Conversation
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.
class TokenizerWrapper:
    @staticmethod
    def from_pretrained(*args, **kwargs):
        return AutoTokenizer.from_pretrained("THUDM/glm-4-9b-chat", trust_remote_code=True)


class TextEncoderWrapper:
    @staticmethod
    def from_pretrained(*args, **kwargs):
        config = GlmConfig(hidden_size=32, intermediate_size=8, num_hidden_layers=2, num_attention_heads=4, head_dim=8)
        return GlmModel(config)
I had to do this because I'm not quite sure how to make the trust_remote_code=True
part compatible with our test structure. But it seems like a security hole...
@sayakpaul What are your suggestions for handling this? Is this okay or do we create the dummy checkpoints? Creating a dummy checkpoint for the text encoder is no problem, but I'm not sure how to handle the tokenizer here -- do I just copy the entire repo for tokenization to our hf-internal-testing org?
Thanks for investigating. How about:
- Create a dummy checkpoint for the text encoder and host it under the internal-testing org.
- Copy over the tokenizer files (they are small enough) to that same repo while crediting the source in the model card (a rough sketch follows below).
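A minimal sketch of how that could be done; the repo id is a placeholder, not an existing checkpoint, and the snippet assumes write access to the hf-internal-testing org:

# Hypothetical sketch: push a tiny randomly initialized GLM text encoder and the
# original tokenizer files to a single placeholder testing repo.
from transformers import AutoTokenizer, GlmConfig, GlmModel

repo_id = "hf-internal-testing/tiny-random-glm-cogview4"  # placeholder id

# Tiny, randomly initialized text encoder matching the config used in the tests.
config = GlmConfig(hidden_size=32, intermediate_size=8, num_hidden_layers=2, num_attention_heads=4, head_dim=8)
GlmModel(config).push_to_hub(repo_id)

# Reuse the real tokenizer files in the same repo; credit THUDM/glm-4-9b-chat
# in the model card.
tokenizer = AutoTokenizer.from_pretrained("THUDM/glm-4-9b-chat", trust_remote_code=True)
tokenizer.push_to_hub(repo_id)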
LGTM after the test-related comments have been addressed.
It's also great that finetrainers now supports image models like CogView4. I think that deserves some communication to the community. Let's tackle that later.
@@ -627,6 +635,7 @@ def __call__(
                original_size=original_size,
                target_size=target_size,
                crop_coords=crops_coords_top_left,
                attention_kwargs=attention_kwargs,
Let's add this to docstrings.
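For reference, a possible docstring entry for the new argument, following the wording used for similar kwargs in other diffusers pipelines (the exact phrasing in the final PR may differ):

            attention_kwargs (`dict`, *optional*):
                A kwargs dictionary that if specified is passed along to the `AttentionProcessor` as defined under
                `self.processor` in
                [diffusers.models.attention_processor](https://github.com/huggingface/diffusers/blob/main/src/diffusers/models/attention_processor.py).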
    @unittest.skip(
        "Needs additional debugging. OSError: Incorrect path_or_model_id: ''. Please provide either the path to a local folder or the repo_id of a model on the Hub."
    )
    def test_simple_inference_save_pretrained(self):
        pass
This seems like an important test -- should we debug further? I can help look into it after the dummy checkpoints for the text encoder and tokenizer are created. I won't mind passing additional pretrained_kwargs to the tokenizer and text encoder if needed.
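For illustration only, a sketch of one way the tokenizer wrapper could forward pretrained kwargs once a dummy repo exists; the repo id is a placeholder and this is not part of the PR:

# Hypothetical sketch: fall back to a placeholder dummy repo when the test passes
# an empty path, while still forwarding any extra pretrained kwargs.
from transformers import AutoTokenizer

DUMMY_REPO = "hf-internal-testing/tiny-random-glm-cogview4"  # placeholder id


class TokenizerWrapper:
    @staticmethod
    def from_pretrained(pretrained_model_name_or_path=None, *args, **kwargs):
        # Fall back to the dummy repo when the caller passes an empty path.
        path = pretrained_model_name_or_path or DUMMY_REPO
        return AutoTokenizer.from_pretrained(path, *args, **kwargs)

TextEncoderWrapper could forward its kwargs the same way.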
Wandb logs: https://wandb.ai/aryanvs/finetrainers-cogview4
Checkpoints:
Training: