[BUG/Help] <problem about api parameter 'use_stream'> #1479

Open
1 task done
bpodq opened this issue May 20, 2024 · 0 comments
bpodq commented May 20, 2024

Is there an existing issue for this?

  • I have searched the existing issues

Current Behavior

I tested the API on two servers: one with an A800 GPU (running in Docker, with exclusive use of the GPU), the other with a 3090 GPU (used directly, no Docker).

I found that the use_stream parameter has a large effect on the results.

When use_stream is False, the A800 is much faster than the 3090:
with 10 threads, the A800 finishes in about 15 seconds and the 3090 in about 35 seconds.

But when use_stream is True,
with 10 threads the A800 takes about 45 seconds (3x its False case), while the 3090 finishes in about 30 seconds, which is actually faster than the A800.

The same pattern holds with 30 and 100 threads.

What could be causing this?

Expected Behavior

No response

Steps To Reproduce

Start the API with python openai_api.py

Then send requests with:
python openai_api_request2.py

This script is a slightly modified copy of openai_api_request.py.
The main change is shown below:

from threading import Thread

# simple_chat and prompts are defined in openai_api_request.py
L = []
m = 1   # number of batches
n = 10  # threads per batch
for j in range(m):
    for i in range(n):
        # Toggle the last argument between False and True to switch use_stream
        t = Thread(target=simple_chat, args=(j * 10 + i, prompts[i], False))
        # t = Thread(target=simple_chat, args=(j * 10 + i, prompts[i], True))
        t.start()
        L.append(t)

for t in L:
    t.join()
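
For reference, simple_chat is not shown above. Below is a minimal sketch of what such a helper might look like, assuming the server is the OpenAI-compatible endpoint started by openai_api.py on http://127.0.0.1:8000/v1 and that the openai Python client (v1.x) is used; the base URL, the model name "chatglm3-6b", and the timing printout are assumptions for illustration, not the exact code from openai_api_request.py.

# Sketch of a simple_chat helper (assumed, not the repository's exact code).
# It sends one chat request to the locally served OpenAI-compatible API,
# either streaming or non-streaming, and prints the per-thread elapsed time.
import time
from openai import OpenAI

# Assumption: host/port match the defaults of openai_api.py.
client = OpenAI(base_url="http://127.0.0.1:8000/v1", api_key="EMPTY")

def simple_chat(idx: int, prompt: str, use_stream: bool):
    start = time.time()
    response = client.chat.completions.create(
        model="chatglm3-6b",  # assumed model name
        messages=[{"role": "user", "content": prompt}],
        stream=use_stream,
    )
    if use_stream:
        # Consume the stream chunk by chunk; total latency is the time
        # until the last token arrives.
        for chunk in response:
            if chunk.choices:
                _ = chunk.choices[0].delta.content
    else:
        _ = response.choices[0].message.content
    print(f"thread {idx}: use_stream={use_stream}, {time.time() - start:.1f}s")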

Environment

- OS: 20.04
- Python: 3.10
- Transformers: 4.39.2 and 4.40.1, respectively
- PyTorch: 2.1.2 and 2.0.1, respectively
- CUDA Support: True

Anything else?

No response
