
Qwen 2.5 fails to load: Key weight not found in Linear #214

Open
DePasqualeOrg opened this issue Feb 14, 2025 · 7 comments

Comments

@DePasqualeOrg (Contributor) commented Feb 14, 2025

I've verified in mlx-swift-examples that the change from 0.21.2 to 0.21.3 causes Qwen 2.5 0.5B and 1.5B to fail to load with the error "Key weight not found in Linear".

@deet commented Feb 16, 2025

I have observed this with a larger Qwen model as well.

@rudrankriyam (Contributor) commented Feb 16, 2025

Ah, I have been banging my head on this for the whole day. So the problem is not on my end, phew.

Edit: I think it is related to the error check added in ml-explore/mlx-swift#174?

@davidkoski (Collaborator) commented

The error means that the parameters being loaded are missing an expected value: weight for a Linear layer. It is triggered because the loading code specifies that it wants .all validation.

In particular, this means you might think you have loaded the model, but one of the Linear layers actually still has random values for its weight parameter -- to me that seems like a bug in the saved parameters, assuming the values are indeed missing.
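For context, a minimal sketch of what that load path looks like, assuming the mlx-swift update(parameters:verify:) API (an illustration, not the exact loader code):

    import MLX
    import MLXNN

    // `weights` is the flat [String: MLXArray] dictionary read from the
    // safetensors files. With verify: [.all], update(parameters:) throws
    // if any module parameter -- e.g. a Linear's weight -- has no
    // matching key, which surfaces as "Key weight not found in Linear".
    func applyWeights(to model: Module, weights: [String: MLXArray]) throws {
        try model.update(
            parameters: ModuleParameters.unflattened(weights),
            verify: [.all])
    }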

@davidkoski (Collaborator) commented

From Qwen2VL.swift:

    fileprivate class LanguageModel: Module, KVCacheDimensionProvider {
...
        @ModuleInfo(key: "lm_head") var lmHead: Linear?

        public init(_ args: Qwen2VLConfiguration.TextConfiguration) {
            self.model = Qwen2Model(args)

            if !args.tieWordEmbeddings {
                _lmHead.wrappedValue = Linear(args.hiddenSize, args.vocabularySize, bias: false)
            }

and indeed tieWordEmbeddings is true and there is no lm_head in the weights, so if this were to be loaded:

        public func callAsFunction(
            _ inputs: MLXArray?, cache: [KVCache]? = nil, inputEmbedding: MLXArray? = nil
        ) -> LMOutput {
            var out = model(inputs, cache: cache, inputEmbedding: inputEmbedding)
            if let lmHead {
                out = lmHead(out)
            } else {
                out = model.embedTokens.asLinear(out)
            }
            return LMOutput(logits: out)
        }

it would be applying random weights.

The check now works, but I think this means the model weights are incorrect.
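For readers unfamiliar with weight tying, a small illustrative sketch of what Embedding.asLinear does (the shapes here are made up for the example):

    import MLX
    import MLXNN

    // With tied embeddings, the embedding matrix doubles as the output
    // projection: asLinear(x) computes x @ weight.T, so the logits get
    // the vocabulary dimension without a separate lm_head tensor.
    let embedding = Embedding(embeddingCount: 32768, dimensions: 1024)
    let hidden = MLXArray.zeros([1, 8, 1024])  // [batch, tokens, hiddenSize]
    let logits = embedding.asLinear(hidden)    // [1, 8, 32768]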

@davidkoski (Collaborator) commented

Ah, just a sec, I am mixing up my models. The one that fails here is the LLM version, not the VLM. Same issue I described -- there is no lm_head in the safetensors.

If we go back to the Python version, the Swift VLM code doesn't match -- it doesn't check if not args.tie_word_embeddings at initialization. Instead it was handled like this:

        if configuration.tieWordEmbeddings {
            out = model.embedTokens.asLinear(out)
        } else {
            out = lmHead(out)
        }

That will work (evaluate correctly), but it isn't correct in terms of parameter validation. It does half-way match the Python code though :-)
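A sketch of the corresponding fix, mirroring the Python check (type names here are approximations of the ones in mlx-swift-examples):

    // Make lm_head optional and only create it when untied, so no
    // parameter exists that .all validation could report as unloaded.
    @ModuleInfo(key: "lm_head") var lmHead: Linear?

    public init(_ args: Qwen2Configuration) {  // assumed config type name
        self.model = Qwen2ModelInner(args)     // assumed inner model name
        if !args.tieWordEmbeddings {
            _lmHead.wrappedValue = Linear(args.hiddenSize, args.vocabularySize, bias: false)
        }
    }

    public func callAsFunction(_ inputs: MLXArray, cache: [KVCache]?) -> MLXArray {
        let out = model(inputs, cache: cache)
        // Branch on the optional rather than on the configuration flag.
        if let lmHead {
            return lmHead(out)
        }
        return model.embedTokens.asLinear(out)
    }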

davidkoski transferred this issue from ml-explore/mlx-swift Feb 27, 2025
@davidkoski (Collaborator) commented

Moving to mlx-swift-examples -- the bug is actually there, just detected in mlx-swift :-)

@davidkoski (Collaborator) commented

@adrgrondin provided some model ids in #210:

mlx-community/Qwen1.5-0.5B-Chat-4bit
mlx-community/Qwen2.5-7B-Instruct-4bit
mlx-community/Qwen2.5-1.5B-Instruct-4bit
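A quick repro sketch with one of those ids, using the loadContainer API from mlx-swift-examples (names may differ slightly by checkout); on mlx-swift 0.21.3 this throws the error above:

    import MLXLLM
    import MLXLMCommon

    // Loading a tied-embeddings Qwen checkpoint fails with
    // "Key weight not found in Linear" before the fix.
    let container = try await LLMModelFactory.shared.loadContainer(
        configuration: ModelConfiguration(id: "mlx-community/Qwen2.5-1.5B-Instruct-4bit"))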

davidkoski added a commit that referenced this issue Feb 27, 2025 (#215)

- Qwen2 (LLM) had slightly incorrect logic in its initialization regarding lm_head
- it was initialized even when not used, which causes parameter loading to fail with the current 0.21.3 mlx-swift