add ScaleToHwAligned for loading fp8 vllm model #941

Open
wants to merge 3 commits into
base: habana_main

changwangss commented Mar 21, 2025

https://jira.habana-labs.com/browse/SW-207506

CI fails because this PR depends on HabanaAI/vllm-hpu-extension#118: the vllm-hpu-extension entry in requirements-hpu.txt needs to be updated so that a version containing that change is installed.

Example script:

from vllm import LLM, SamplingParams
from transformers import AutoTokenizer
import time
import os

os.environ["VLLM_SKIP_WARMUP"] = "true"

model = "neuralmagic/Meta-Llama-3.1-8B-Instruct-FP8"
# tokenizer = AutoTokenizer.from_pretrained(model)
llm = LLM(
    model=model, 
    # tokenizer=model,
    # trust_remote_code=True,
    # dtype="bfloat16",
    # max_model_len=16384,
    # gpu_memory_utilization=0.8,
)

prompts = [
    "Hello, my name is",
    "0.999 compares to 0.9 is ",
    "The capital of France is",
    "The future of AI is",
]

sampling_params = SamplingParams(temperature=0, max_tokens=128, ignore_eos=True)
start = time.perf_counter()
outputs = llm.generate(prompts, sampling_params)
end = time.perf_counter()
# Print the outputs.
gt = None
print(f"e2e took {end - start} seconds")
for output_i, output in enumerate(outputs):
    gt_i = None if gt is None else gt[output_i]
    prompt = output.prompt
    generated_text = output.outputs[0].text
    print("====================================")
    print(f"Prompt: {prompt!r}")
    print(f"Generated text: {generated_text!r}")
    print(f"Ground truth: {gt_i!r}")
    print("====================================")

del llm

Results without this PR:

====================================
Prompt: 'Hello, my name is'
Generated text: ' Emily and I am a 25 year old artist living in the beautiful city of Portland, Oregon. I am a painter, a printmaker, and a lover of all things creative. I am also a bit of a hopeless romantic, always chasing my dreams and living life to the fullest.\nI have been an artist for as long as I can remember, and I have been fortunate enough to turn my passion into a career. I have had the opportunity to show my work in galleries and exhibitions all over the country, and I have even had the chance to teach art classes to children and adults alike.\nBut even with all of the success I'
Ground truth: None
====================================
====================================
Prompt: '0.999 compares to 0.9 is '
Generated text: '1:1.1\n0.999 compares to 0.9 is 1:1.1\n0.999 compares to 0.9 is 1:1.1\n0.999 compares to 0.9 is 1:1.1\n0.999 compares to 0.9 is 1:1.1\n0.999 compares to 0.9 is 1:1.1\n0.999 compares to 0.9 is 1:1.1\n0.999 compares to 0.9 is 1:1.1\n0.999'
Ground truth: None
====================================
====================================
Prompt: 'The capital of France is'
Generated text: ' a city of romance, art, fashion, and cuisine. Paris is a must-visit destination for anyone who loves history, architecture, and culture. From the iconic Eiffel Tower to the world-class museums like the Louvre and Orsay, Paris has something to offer for every interest and age.\nThe city is divided into 20 arrondissements, each with its own unique character and charm. The Latin Quarter, Montmartre, and Le Marais are some of the most popular neighborhoods to explore, with their narrow streets, charming cafes, and historic landmarks.\nParis is also famous for its fashion, with the Champs'
Ground truth: None
====================================
====================================
Prompt: 'The future of AI is'
Generated text: " bright, but it also raises concerns about bias, accountability, and the potential for AI to be used for malicious purposes. As AI becomes increasingly integrated into our daily lives, it's essential to consider the ethical implications of its development and use.\nThe Future of AI: Opportunities and Challenges\nThe future of AI is filled with opportunities for innovation and growth, but it also raises concerns about bias, accountability, and the potential for AI to be used for malicious purposes. As AI becomes increasingly integrated into our daily lives, it's essential to consider the ethical implications of its development and use.\nOne of the most significant opportunities presented by AI is the potential"
Ground truth: None
====================================

Results with this PR:

====================================
Prompt: 'Hello, my name is'
Generated text: " Emily and I am a 3rd year student at the University of Edinburgh. I am studying a BSc in Psychology with a focus on Clinical Psychology. I am excited to be a part of the Edinburgh Student Psychology Society (ESPS) and contribute to the community of students interested in psychology.\nI am particularly interested in the areas of mental health, cognitive psychology, and research methods. I am passionate about understanding the complexities of the human mind and how we can use psychology to improve people's lives.\nOutside of university, I enjoy hiking, reading, and trying out new recipes in the kitchen. I am also a bit of a music lover"
Ground truth: None
====================================
====================================
Prompt: '0.999 compares to 0.9 is '
Generated text: '1:1\n0.999 compares to 0.9 is 1:1\n0.999 compares to 0.9 is 1:1\n0.999 compares to 0.9 is 1:1\n0.999 compares to 0.9 is 1:1\n0.999 compares to 0.9 is 1:1\n0.999 compares to 0.9 is 1:1\n0.999 compares to 0.9 is 1:1\n0.999 compares to 0.9 is 1:1\n0.999 compares'
Ground truth: None
====================================
====================================
Prompt: 'The capital of France is'
Generated text: " Paris, which is located in the northern part of the country. Paris is known for its beautiful architecture, art museums, fashion, and romantic atmosphere. The city is home to many famous landmarks, such as the Eiffel Tower, Notre Dame Cathedral, and the Louvre Museum.\nThe Eiffel Tower is a iconic symbol of Paris and one of the most recognizable landmarks in the world. It was built for the 1889 World's Fair and stands at 324 meters (1,063 feet) tall. Visitors can take the elevator to the top for stunning views of the city.\nNotre Dame Cathedral is a beautiful Gothic church"
Ground truth: None
====================================
====================================
Prompt: 'The future of AI is'
Generated text: ' bright, but it also raises important questions about the impact of technology on society. As AI becomes increasingly integrated into our daily lives, we need to consider the potential consequences of its development and deployment. Here are some of the key issues that need to be addressed:\n1. Job displacement: AI has the potential to automate many jobs, which could lead to significant job displacement and unemployment. This could exacerbate existing social and economic inequalities.\n2. Bias and discrimination: AI systems can perpetuate and amplify existing biases and discrimination if they are trained on biased data or designed with a particular worldview. This could lead to unfair outcomes and perpetuate social injust'
Ground truth: None
====================================
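
The two runs above are compared by eye, and the script's `gt` placeholder is never filled in. As a minimal sketch, assuming the outputs of one run are saved to a JSON file and loaded as `gt` in the next run, the comparison could be automated as follows. The helper names and the `baseline.json` path are hypothetical and not part of this PR or of vLLM.

```python
import json
from pathlib import Path


def save_baseline(texts, path):
    """Write generated texts (one per prompt, in prompt order) to a JSON file."""
    Path(path).write_text(
        json.dumps(list(texts), ensure_ascii=False, indent=2), encoding="utf-8"
    )


def load_baseline(path):
    """Return the saved texts, or None if no baseline file exists yet."""
    p = Path(path)
    if not p.exists():
        return None
    return json.loads(p.read_text(encoding="utf-8"))


def common_prefix_len(a, b):
    """Count the leading characters shared by strings a and b."""
    n = 0
    for ch_a, ch_b in zip(a, b):
        if ch_a != ch_b:
            break
        n += 1
    return n


def compare_to_baseline(texts, gt):
    """Return (index, exact_match, shared_prefix_length) for each prompt."""
    return [
        (i, text == ref, common_prefix_len(text, ref))
        for i, (text, ref) in enumerate(zip(texts, gt))
    ]
```

In the script, this would amount to `gt = load_baseline("baseline.json")` before the print loop and `save_baseline([o.outputs[0].text for o in outputs], "baseline.json")` after generation. Exact matches are not necessarily expected when the scaling method changes, so the shared prefix length is only a rough indicator of where two runs diverge.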
