
Commit 2fbbb51

transformers==4.37, yi & yuan2 & vicuna (#11805)
* transformers==4.37
* added yi model
* added yi model
* xxxx
* delete prompt template
* / and delete
1 parent f43da2d commit 2fbbb51

4 files changed (+27, -25 lines)

Diff for: python/llm/example/GPU/HuggingFace/LLM/vicuna/README.md

+6-6
@@ -1,5 +1,5 @@
 # Vicuna
-In this directory, you will find examples on how you could apply IPEX-LLM INT4 optimizations on Vicuna models. For illustration purposes, we utilize the [lmsys/vicuna-13b-v1.3](https://huggingface.co/lmsys/vicuna-13b-v1.3) and [eachadea/vicuna-7b-1.1](https://huggingface.co/eachadea/vicuna-7b-1.1) as reference Vicuna models.
+In this directory, you will find examples on how you could apply IPEX-LLM INT4 optimizations on Vicuna models. For illustration purposes, we utilize the [lmsys/vicuna-13b-v1.5](https://huggingface.co/lmsys/vicuna-13b-v1.5) and [lmsys/vicuna-7b-v1.5](https://huggingface.co/lmsys/vicuna-7b-v1.5) as reference Vicuna models.
 
 ## 0. Requirements
 To run these examples with IPEX-LLM, we have some recommended requirements for your machine, please refer to [here](../../../README.md#requirements) for more information.
@@ -109,7 +109,7 @@ python ./generate.py --repo-id-or-model-path REPO_ID_OR_MODEL_PATH --prompt PROM
 ```
 
 Arguments info:
-- `--repo-id-or-model-path REPO_ID_OR_MODEL_PATH`: argument defining the huggingface repo id for the Vicuna model (e.g. `lmsys/vicuna-13b-v1.3` and `eachadea/vicuna-7b-1.1`) to be downloaded, or the path to the huggingface checkpoint folder. It is default to be `'lmsys/vicuna-13b-v1.3'`.
+- `--repo-id-or-model-path REPO_ID_OR_MODEL_PATH`: argument defining the huggingface repo id for the Vicuna model (e.g. `lmsys/vicuna-13b-v1.5` and `lmsys/vicuna-7b-v1.5`) to be downloaded, or the path to the huggingface checkpoint folder. It is default to be `'lmsys/vicuna-13b-v1.5'`.
 - `--prompt PROMPT`: argument defining the prompt to be inferred (with integrated prompt format for chat). It is default to be `'What is AI?'`.
 - `--n-predict N_PREDICT`: argument defining the max number of tokens to predict. It is default to be `32`.
 
@@ -118,7 +118,7 @@ Arguments info:
 > Please select the appropriate size of the Vicuna model based on the capabilities of your machine.
 
 #### Sample Output
-#### [lmsys/vicuna-13b-v1.3](https://huggingface.co/lmsys/vicuna-13b-v1.3)
+#### [lmsys/vicuna-13b-v1.5](https://huggingface.co/lmsys/vicuna-13b-v1.5)
 ```log
 Inference time: xxxx s
 -------------------- Prompt --------------------
@@ -130,10 +130,10 @@ What is AI?
 ### Human:
 What is AI?
 ### Assistant:
-AI, or Artificial Intelligence, refers to the development of computer systems that can perform tasks that typically require human intelligence, such as visual perception,
+AI stands for Artificial Intelligence. It refers to the development of computer systems that can perform tasks that typically require human intelligence, such as visual perception
 ```
 
-#### [eachadea/vicuna-7b-1.1](https://huggingface.co/eachadea/vicuna-7b-1.1)
+#### [lmsys/vicuna-7b-v1.5](https://huggingface.co/lmsys/vicuna-7b-v1.5)
 ```log
 Inference time: xxxx s
 -------------------- Prompt --------------------
@@ -145,5 +145,5 @@ What is AI?
 ### Human:
 What is AI?
 ### Assistant:
-AI, or artificial intelligence, refers to the ability of a machine or computer program to mimic human intelligence and perform tasks that would normally require human intelligence to
+AI stands for "Artificial Intelligence." It refers to the development of computer systems that can perform tasks that typically require human intelligence, such as visual per
 ```

Diff for: python/llm/example/GPU/HuggingFace/LLM/vicuna/generate.py

+3-3
@@ -27,8 +27,8 @@
 
 if __name__ == '__main__':
     parser = argparse.ArgumentParser(description='Predict Tokens using `generate()` API for Vicuna model')
-    parser.add_argument('--repo-id-or-model-path', type=str, default="lmsys/vicuna-13b-v1.3",
-                        help='The huggingface repo id for the Vicuna (e.g. `lmsys/vicuna-13b-v1.3` and `eachadea/vicuna-7b-1.1`) to be downloaded'
+    parser.add_argument('--repo-id-or-model-path', type=str, default="lmsys/vicuna-13b-v1.5",
+                        help='The huggingface repo id for the Vicuna (e.g. `lmsys/vicuna-13b-v1.5` and `lmsys/vicuna-7b-v1.5`) to be downloaded'
                              ', or the path to the huggingface checkpoint folder')
     parser.add_argument('--prompt', type=str, default="What is AI?",
                         help='Prompt to infer')
@@ -57,7 +57,7 @@
         # enabling `use_cache=True` allows the model to utilize the previous
         # key/values attentions to speed up decoding;
         # to obtain optimal performance with IPEX-LLM INT4 optimizations,
-        # it is important to set use_cache=True for vicuna-v1.3 models
+        # it is important to set use_cache=True for vicuna-v1.5 models
        output = model.generate(input_ids,
                                use_cache=True,
                                max_new_tokens=args.n_predict)
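The argument handling changed above follows the plain `argparse` pattern shared by these example scripts. As a minimal stand-alone sketch (standard library only, no IPEX-LLM dependency; the defaults shown are the ones this commit sets, and `build_parser` is a hypothetical helper name used here for illustration):

```python
import argparse

def build_parser():
    # Mirrors the CLI of the Vicuna generate.py after this commit:
    # the default checkpoint is now lmsys/vicuna-13b-v1.5 instead of v1.3.
    parser = argparse.ArgumentParser(
        description='Predict Tokens using `generate()` API for Vicuna model')
    parser.add_argument('--repo-id-or-model-path', type=str,
                        default="lmsys/vicuna-13b-v1.5",
                        help='Hugging Face repo id or local checkpoint path')
    parser.add_argument('--prompt', type=str, default="What is AI?",
                        help='Prompt to infer')
    parser.add_argument('--n-predict', type=int, default=32,
                        help='Max tokens to predict')
    return parser

if __name__ == '__main__':
    # Parsing an empty list yields the defaults; argparse converts
    # `--repo-id-or-model-path` to the attribute `repo_id_or_model_path`.
    args = build_parser().parse_args([])
    print(args.repo_id_or_model_path)
```

Note that hyphens in option names become underscores in the resulting namespace attributes, which is why the scripts read `args.n_predict`.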

Diff for: python/llm/example/GPU/HuggingFace/LLM/yi/README.md

+15-5
@@ -1,5 +1,5 @@
 # Yi
-In this directory, you will find examples on how you could apply IPEX-LLM INT4 optimizations on Yi models on [Intel GPUs](../../../README.md). For illustration purposes, we utilize the [01-ai/Yi-6B](https://huggingface.co/01-ai/Yi-6B) as a reference Yi model.
+In this directory, you will find examples on how you could apply IPEX-LLM INT4 optimizations on Yi models on [Intel GPUs](../../../README.md). For illustration purposes, we utilize the [01-ai/Yi-6B](https://huggingface.co/01-ai/Yi-6B) and [01-ai/Yi-6B-Chat](https://huggingface.co/01-ai/Yi-6B-Chat) as reference Yi models.
 
 ## 0. Requirements
 To run these examples with IPEX-LLM on Intel GPUs, we have some recommended requirements for your machine, please refer to [here](../../../README.md#requirements) for more information.
@@ -112,7 +112,7 @@ python ./generate.py
 
 In the example, several arguments can be passed to satisfy your requirements:
 
-- `--repo-id-or-model-path REPO_ID_OR_MODEL_PATH`: argument defining the huggingface repo id for the Yi model (e.g. `01-ai/Yi-6B`) to be downloaded, or the path to the huggingface checkpoint folder. It is default to be `'01-ai/Yi-6B'`.
+- `--repo-id-or-model-path REPO_ID_OR_MODEL_PATH`: argument defining the huggingface repo id for the Yi model (e.g. `01-ai/Yi-6B` and `01-ai/Yi-6B-Chat`) to be downloaded, or the path to the huggingface checkpoint folder. It is default to be `'01-ai/Yi-6B-Chat'`.
 - `--prompt PROMPT`: argument defining the prompt to be inferred (with integrated prompt format for chat). It is default to be `'What is AI?'`.
 - `--n-predict N_PREDICT`: argument defining the max number of tokens to predict. It is default to be `32`.
 
@@ -122,8 +122,18 @@ In the example, several arguments can be passed to satisfy your requirements:
 ```log
 Inference time: xxxx s
 -------------------- Prompt --------------------
-AI是什么?
+What is AI?
 -------------------- Output --------------------
-AI是什么?
-人工智能(Artificial Intelligence),英文缩写为AI。它是研究、开发用于模拟、延伸和扩展人的智能的理论、方法、技术及
+What is AI?
+Artificial Intelligence (AI) is the simulation of human intelligence in machines. AI is the science and engineering of making intelligent machines, especially intelligent computer programs.
 ```
+
+#### [01-ai/Yi-6B-Chat](https://huggingface.co/01-ai/Yi-6B-Chat)
+```log
+Inference time: xxxx s
+-------------------- Prompt --------------------
+What is AI?
+-------------------- Output --------------------
+What is AI?
+Artificial Intelligence (AI) refers to the simulation of human intelligence processes by machines, especially computer systems. These processes include learning, reasoning, and self-
+```

Diff for: python/llm/example/GPU/HuggingFace/LLM/yi/generate.py

+3-11
@@ -21,21 +21,13 @@
 from ipex_llm.transformers import AutoModelForCausalLM
 from transformers import AutoTokenizer
 
-# Refer to https://huggingface.co/01-ai/Yi-6B-Chat#31-use-the-chat-model
-YI_PROMPT_FORMAT = """
-<|im_start|>system
-You are a helpful assistant. If you don't understand what the user means, ask the user to provide more information.<|im_end|>
-<|im_start|>user
-{prompt}<|im_end|>
-<|im_start|>assistant
-"""
 
 if __name__ == '__main__':
     parser = argparse.ArgumentParser(description='Predict Tokens using `generate()` API for Yi model')
-    parser.add_argument('--repo-id-or-model-path', type=str, default="01-ai/Yi-6B",
+    parser.add_argument('--repo-id-or-model-path', type=str, default="01-ai/Yi-6B-Chat",
                         help='The huggingface repo id for the Yi model to be downloaded'
                              ', or the path to the huggingface checkpoint folder')
-    parser.add_argument('--prompt', type=str, default="AI是什么?",
+    parser.add_argument('--prompt', type=str, default="What is AI?",
                         help='Prompt to infer')
     parser.add_argument('--n-predict', type=int, default=32,
                         help='Max tokens to predict')
@@ -60,7 +52,7 @@
 
     # Generate predicted tokens
     with torch.inference_mode():
-        prompt = YI_PROMPT_FORMAT.format(prompt=args.prompt)
+        prompt = args.prompt
         input_ids = tokenizer.encode(prompt, return_tensors="pt").to('xpu')
         # ipex_llm model needs a warmup, then inference time can be accurate
         output = model.generate(input_ids,
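For reference, the chat-template behaviour this commit removes amounted to plain string formatting before tokenization. A minimal sketch, using only the template text quoted in the deleted lines (no tokenizer or model involved; `wrap` is a hypothetical helper name for illustration):

```python
# The template removed by this commit, taken from the Yi chat model card;
# after the change, the raw user prompt goes to the tokenizer unmodified.
YI_PROMPT_FORMAT = """<|im_start|>system
You are a helpful assistant. If you don't understand what the user means, ask the user to provide more information.<|im_end|>
<|im_start|>user
{prompt}<|im_end|>
<|im_start|>assistant
"""

def wrap(prompt: str) -> str:
    # Old behaviour: embed the user prompt in the ChatML-style template.
    return YI_PROMPT_FORMAT.format(prompt=prompt)

print(wrap("What is AI?"))
```

Dropping the hard-coded template makes the example model-agnostic; with `transformers` 4.37, chat formatting can instead be delegated to the tokenizer's own chat template when one is needed.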
