
Encountering assert len(indices) == len(inputs) error when using Qwen2-VL for MMMU evaluation #2720

Open
Ben81828 opened this issue Feb 21, 2025 · 1 comment

Comments

@Ben81828

I was trying to use the following command to evaluate a fine-tuned Qwen2-VL-7B model on MMMU:

export HF_HUB_OFFLINE=1

lm_eval --model vllm-vlm \
    --model_args pretrained=/my_model_path,tensor_parallel_size=2,dtype=float16,gpu_memory_utilization=0.8 \
    --tasks mmmu_val \
    --batch_size auto \
    --output_path /home/ben/repos/stenosis_detect/saves/generations/lm_eval_harness \
    --trust_remote_code

However, I encountered the following error:

[rank0]: Traceback (most recent call last):
[rank0]:   File "/home/ben/venv/.lm_eval_harness/bin/lm_eval", line 8, in <module>
[rank0]:     sys.exit(cli_evaluate())
[rank0]:   File "/home/ben/repos/stenosis_detect/lm-evaluation-harness/lm_eval/__main__.py", line 387, in cli_evaluate
[rank0]:     results = evaluator.simple_evaluate(
[rank0]:   File "/home/ben/repos/stenosis_detect/lm-evaluation-harness/lm_eval/utils.py", line 402, in _wrapper
[rank0]:     return fn(*args, **kwargs)
[rank0]:   File "/home/ben/repos/stenosis_detect/lm-evaluation-harness/lm_eval/evaluator.py", line 304, in simple_evaluate
[rank0]:     results = evaluate(
[rank0]:   File "/home/ben/repos/stenosis_detect/lm-evaluation-harness/lm_eval/utils.py", line 402, in _wrapper
[rank0]:     return fn(*args, **kwargs)
[rank0]:   File "/home/ben/repos/stenosis_detect/lm-evaluation-harness/lm_eval/evaluator.py", line 524, in evaluate
[rank0]:     resps = getattr(lm, reqtype)(cloned_reqs)
[rank0]:   File "/home/ben/repos/stenosis_detect/lm-evaluation-harness/lm_eval/models/vllm_vlms.py", line 272, in generate_until
[rank0]:     cont = self._model_generate(
[rank0]:   File "/home/ben/repos/stenosis_detect/lm-evaluation-harness/lm_eval/models/vllm_vlms.py", line 138, in _model_generate
[rank0]:     outputs = self.model.generate(
[rank0]:   File "/home/ben/venv/.lm_eval_harness/lib/python3.10/site-packages/vllm/utils.py", line 1063, in inner
[rank0]:     return fn(*args, **kwargs)
[rank0]:   File "/home/ben/venv/.lm_eval_harness/lib/python3.10/site-packages/vllm/entrypoints/llm.py", line 398, in generate
[rank0]:     self._validate_and_add_requests(
[rank0]:   File "/home/ben/venv/.lm_eval_harness/lib/python3.10/site-packages/vllm/entrypoints/llm.py", line 875, in _validate_and_add_requests
[rank0]:     self._add_request(
[rank0]:   File "/home/ben/venv/.lm_eval_harness/lib/python3.10/site-packages/vllm/entrypoints/llm.py", line 893, in _add_request
[rank0]:     self.llm_engine.add_request(
[rank0]:   File "/home/ben/venv/.lm_eval_harness/lib/python3.10/site-packages/vllm/utils.py", line 1063, in inner
[rank0]:     return fn(*args, **kwargs)
[rank0]:   File "/home/ben/venv/.lm_eval_harness/lib/python3.10/site-packages/vllm/engine/llm_engine.py", line 855, in add_request
[rank0]:     processed_inputs = self.input_processor(preprocessed_inputs)
[rank0]:   File "/home/ben/venv/.lm_eval_harness/lib/python3.10/site-packages/vllm/inputs/registry.py", line 346, in process_input
[rank0]:     processed_inputs = processor(
[rank0]:   File "/home/ben/venv/.lm_eval_harness/lib/python3.10/site-packages/vllm/model_executor/models/qwen2_vl.py", line 975, in input_processor_for_qwen2_vl
[rank0]:     prompt_token_ids = _expand_pad_tokens(image_inputs,
[rank0]:   File "/home/ben/venv/.lm_eval_harness/lib/python3.10/site-packages/vllm/model_executor/models/qwen2_vl.py", line 872, in _expand_pad_tokens
[rank0]:     assert len(indices) == len(inputs)
[rank0]: AssertionError

I'm not entirely sure my understanding is correct, but I suspect the image placeholder token wasn't properly inserted into the prompt, which is what triggers this assertion.
I haven't seen any similar reports in the issues, so I may have missed something in my command.
Could anyone provide me with some hints or guidance?
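
For reference, here is a minimal standalone vLLM call that I would expect to work with Qwen2-VL outside lm-evaluation-harness (just a sketch, assuming vLLM's dict-prompt multi-modal API and the Qwen2-VL <|vision_start|><|image_pad|><|vision_end|> placeholder format). My understanding is that the assertion in _expand_pad_tokens fires when the number of placeholder spans in the prompt does not match the number of images passed in multi_modal_data:

from vllm import LLM, SamplingParams
from PIL import Image

llm = LLM(
    model="Qwen/Qwen2-VL-7B-Instruct",
    max_model_len=8192,
    limit_mm_per_prompt={"image": 2},
)

# One placeholder span per image; with a single image the prompt must contain
# exactly one <|image_pad|> span.
prompt = (
    "<|im_start|>user\n"
    "<|vision_start|><|image_pad|><|vision_end|>Describe the image.<|im_end|>\n"
    "<|im_start|>assistant\n"
)

image = Image.new("RGB", (448, 448), color="white")

outputs = llm.generate(
    {"prompt": prompt, "multi_modal_data": {"image": image}},
    SamplingParams(max_tokens=32),
)
print(outputs[0].outputs[0].text)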

@Ben81828
Author

I later tested with the original Qwen2-VL and Idefics3: Qwen2-VL still hit the same error, while Idefics3 ran the evaluation successfully.

Here are my scripts:

export HF_HUB_OFFLINE=1

source /home/ben/venv/.lm_eval_harness/bin/activate
lm_eval --model vllm-vlm \
    --model_args pretrained=Qwen/Qwen2-VL-7B-Instruct,tensor_parallel_size=2,dtype=float16,gpu_memory_utilization=0.8,max_images=2,max_model_len=8192 \
    --tasks mmmu_val \
    --output_path /output_path  \
    --trust_remote_code

export HF_HUB_OFFLINE=1

source /home/ben/venv/.lm_eval_harness/bin/activate
lm_eval --model vllm-vlm \
    --model_args pretrained=HuggingFaceM4/Idefics3-8B-Llama3,tensor_parallel_size=2,dtype=float16,gpu_memory_utilization=0.8,max_images=2,max_model_len=8192 \
    --tasks mmmu_val \
    --output_path /output_path \
    --trust_remote_code
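
One thing I still want to rule out (only a guess): whether some MMMU examples attach more images than the placeholders that end up in the prompt, e.g. because of the max_images=2 cap. A quick sanity check along these lines, assuming the MMMU/MMMU dataset layout with image_1 ... image_7 columns and <image n> references in the question text:

from datasets import load_dataset

# Compare the number of images attached to each example with the number of
# <image n> references in its question text.
ds = load_dataset("MMMU/MMMU", "Art", split="validation")
for ex in ds.select(range(10)):
    n_images = sum(ex[f"image_{i}"] is not None for i in range(1, 8))
    n_refs = ex["question"].count("<image")
    print(ex["id"], n_images, n_refs)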
