
Aligning modeling code for GPT2 to work with vLLM (fallback) #36934

Open
wants to merge 19 commits into main
Conversation

ariG23498 (Contributor)

This PR changes the modeling code for GPT2 so that it can run on vLLM via the transformers fallback backend.

The changes are as follows (a minimal sketch of them is given after the list):

  • Introduction of kwargs, which are used to propagate information about attention_indices in vLLM
  • Adding a base_model_tp_plan, which is currently empty; this is a required attribute for vLLM
  • Changing the reshape of the attn_outputs (adapted from the Llama modeling code)
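
For intuition, here is a minimal, hypothetical sketch of the shape of these changes. ToyAttention and ToyModel are illustrative stand-ins, not the actual GPT2 classes, and the real attention math is elided; the point is only how kwargs flows down to the attention layer, where the empty base_model_tp_plan lives, and the Llama-style reshape.

import torch
from torch import nn

class ToyAttention(nn.Module):
    # Stand-in for GPT2Attention: **kwargs lets a backend (e.g. vLLM) inject
    # extra data such as attention indices without changing the signature.
    def forward(self, hidden_states, **kwargs):
        input_shape = hidden_states.shape[:-1]
        attn_output = hidden_states  # placeholder for the real attention computation
        # Llama-style reshape: flatten the trailing dims back into the hidden size.
        return attn_output.reshape(*input_shape, -1).contiguous()

class ToyModel(nn.Module):
    # Empty tensor-parallel plan; vLLM currently requires this attribute to exist.
    base_model_tp_plan = {}

    def __init__(self):
        super().__init__()
        self.attn = ToyAttention()

    def forward(self, hidden_states, **kwargs):
        # Propagate kwargs down to the attention layer untouched.
        return self.attn(hidden_states, **kwargs)

x = torch.randn(1, 4, 8)
print(ToyModel()(x, attention_indices=None).shape)  # torch.Size([1, 4, 8])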

This PR depends on vllm-project/vllm#15290 on the vLLM side in order to work.

You can use the following snippets to check the implementation (vLLM first, then a plain transformers comparison):

# 1) Generate with GPT-2 through vLLM's transformers fallback backend.
from vllm import LLM, SamplingParams

llm = LLM("openai-community/gpt2", model_impl="transformers", tensor_parallel_size=1)
prompts = [
    "Hello, my name is",
    "The president of the United States is",
    "The capital of France is",
    "The future of AI is",
]
sampling_params = SamplingParams(temperature=0.8, top_p=0.95)
outputs = llm.generate(prompts, sampling_params)
for output in outputs:
    prompt = output.prompt
    generated_text = output.outputs[0].text
    print(f"Prompt: {prompt!r}, Generated text: {generated_text!r}")

# 2) Sanity-check the same checkpoint with plain transformers generation.
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

inputs = tok("Hello there ", return_tensors="pt")
outputs = model.generate(**inputs)
print(tok.batch_decode(outputs))

github-actions bot marked this pull request as draft on March 24, 2025 at 15:55

Hi 👋, thank you for opening this pull request! The pull request is converted to draft by default. When it is ready for review, please click the Ready for review button (at the bottom of the PR page).

Comment on lines 464 to 465
base_model_prefix = "model" # vllm
# base_model_prefix = "transformer" # transformers
ariG23498 (Contributor, Author)


This is done for weight loading: the prefix needs to be set differently for the two platforms.
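
For context, assuming stock transformers behavior (not part of this PR), the prefix is visible directly in the checkpoint weight names:

from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("openai-community/gpt2")
print(model.base_model_prefix)          # "transformer" in stock transformers
print(next(iter(model.state_dict())))   # first weight name, e.g. "transformer.wte.weight"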

Member


I'm planning to investigate this on the vLLM side to see if we can remove this requirement.

ariG23498 marked this pull request as ready for review on March 24, 2025 at 15:56
@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@ArthurZucker (Collaborator) left a comment


nice thanks!

@ArthurZucker (Collaborator)

Just make sure CI is green!

@ariG23498 (Contributor, Author)

@ArthurZucker I don't think the CI errors stem from the modeling changes. Do you want me to investigate further?

@hmellor (Member) commented Mar 24, 2025

> Adding a base_model_tp_plan, which is currently empty; this is a required attribute for vLLM

I could make this optional so that a model which does not have it simply doesn't support TP, rather than not working at all?
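
A rough sketch of that idea (resolve_tp_plan is a hypothetical helper, not actual vLLM code):

def resolve_tp_plan(model, tensor_parallel_size: int) -> dict:
    # Hypothetical: if the model defines no base_model_tp_plan, treat it as
    # "TP not supported" instead of refusing to load the model entirely.
    tp_plan = getattr(model, "base_model_tp_plan", None)
    if tp_plan is None:
        if tensor_parallel_size > 1:
            raise ValueError(
                "Model has no base_model_tp_plan; tensor_parallel_size > 1 is unsupported."
            )
        return {}
    return tp_plan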

@ariG23498 (Contributor, Author)

@ArthurZucker would it be okay to merge? The CI failures seem unrelated to this PR 😅
