
FastAPI Integration with OpenAI Client Returns 500 Error on /chat/completions Endpoint #169

Open
Abhiraj-Alois opened this issue Feb 19, 2025 · 0 comments


When using LMStudio with FastAPI and the OpenAI async client, calls to the /chat/completions endpoint fail and the FastAPI route returns a 500 error, even though the same endpoint works correctly when accessed directly. LMStudio answers with a 200 status code, but the response body contains only the error "Unexpected endpoint or method. (POST /chat/completions)".

Environment

  • LMStudio Version: 0.3.10
  • Python Version: 3.12.3
  • LMStudio Host Operating System: Windows
  • Client Code Operating System: Ubuntu

Current Behavior

The API endpoint returns a 500 error with the message "Unexpected endpoint or method. (POST /chat/completions)" when called through FastAPI, despite the endpoint being correct and functional when tested directly.

Expected Behavior

The endpoint should process the request and return a valid response, as it does when tested directly through LMStudio's interface.
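
For reference, what the AsyncOpenAI client sends under the hood is a plain POST to <base_url>/chat/completions on LMStudio's OpenAI-compatible server. A minimal sketch of issuing that request by hand (httpx, the timeout value, and the redacted IP are illustrative only):

import asyncio
import httpx

async def raw_check() -> None:
    # Equivalent of the request the OpenAI client builds for chat.completions.create().
    payload = {
        "model": "llama-3.2-3b-instruct",
        "messages": [{"role": "user", "content": "When was Valentine's Day?"}],
        "temperature": 0.2,
    }
    async with httpx.AsyncClient() as client:
        resp = await client.post(
            "http://192.168.X.XXX:1234/v1/chat/completions",
            json=payload,
            timeout=60,
        )
        print(resp.status_code)
        print(resp.json())

asyncio.run(raw_check())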

Example Code:

@router.get("/test_llm")
async def test_llm():
    llm = LMStudioLLM()

    prompt = "When was Valentine's Day?"
    response = await llm._call(prompt)

    md(response)  # md() is a logging/print helper defined elsewhere
    return {"response": response}

import os
import asyncio
from typing import List, Optional

from openai import AsyncOpenAI, APIConnectionError, OpenAIError

LMSTUDIO_MODEL = os.getenv("LMSTUDIO_MODEL", "llama-3.2-3b-instruct")
LMSTUDIO_BASE_URL = os.getenv("LMSTUDIO_BASE_URL", "http://192.168.X.XXX:1234/v1")
LMSTUDIO_API_KEY = os.getenv("LMSTUDIO_API_KEY", "lm_studio")

async def lmstudio_llm(inputs: str) -> Optional[str]:
    client = AsyncOpenAI(base_url=LMSTUDIO_BASE_URL, api_key=LMSTUDIO_API_KEY)
    md(f"[DEBUG] LLM API Endpoint: {LMSTUDIO_BASE_URL}")
    md(f"[DEBUG] LLM API Model: {LMSTUDIO_MODEL}")
    md(f"[DEBUG] LLM API Key Exists: {bool(LMSTUDIO_API_KEY)}")
    # Retry up to three times; connection errors and empty responses are retried,
    # while other OpenAI errors abort immediately.
    for attempt in range(3):
        try:
            response = await client.chat.completions.create(
                model=LMSTUDIO_MODEL,
                messages=[{"role": "user", "content": inputs}],
                temperature=0.2,
                stream=False
            )
            
            if response and response.choices:
                return response.choices[0].message.content.strip()
            else:
                md(f"[ERROR] Empty response or no choices in response: {response}")

        except APIConnectionError as e:
            md(f"[ERROR] Connection failed (attempt {attempt+1}/3): {e}")

        except OpenAIError as e:
            md(f"[ERROR] OpenAI API error: {e}")
            return None 

        await asyncio.sleep(5) 

    md("[ERROR] Failed to connect after multiple attempts.")
    return None

class LMStudioLLM:
    async def _call(self, prompt: str, stop: Optional[List[str]] = None) -> str:
        response = await lmstudio_llm(prompt)
        if response is None:
            raise RuntimeError("LLM failed to generate a response.")
        return response
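
For completeness, "tested directly" here means running the module above as a standalone script, along the lines of the following driver (the __main__ guard is illustrative); run that way, a normal completion is printed:

if __name__ == "__main__":
    # Outside FastAPI, the same call chain returns a normal completion.
    print(asyncio.run(LMStudioLLM()._call("When was Valentine's Day?")))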

Error Log:

[DEBUG] LLM API Endpoint: http://192.168.X.XXX:1234/v1
[DEBUG] LLM API Model: llama-3.2-3b-instruct
[DEBUG] LLM API Key Exists: True
[ERROR] Empty response or no choices in response: 
ChatCompletion(id=None, choices=None, created=None, 
model=None, object=None, service_tier=None, 
system_fingerprint=None, usage=None, error='Unexpected 
endpoint or method. (POST /chat/completions)')
[ERROR] Empty response or no choices in response: 
ChatCompletion(id=None, choices=None, created=None, 
model=None, object=None, service_tier=None, 
system_fingerprint=None, usage=None, error='Unexpected 
endpoint or method. (POST /chat/completions)')
[ERROR] Empty response or no choices in response: 
ChatCompletion(id=None, choices=None, created=None, 
model=None, object=None, service_tier=None, 
system_fingerprint=None, usage=None, error='Unexpected 
endpoint or method. (POST /chat/completions)')
[ERROR] Failed to connect after multiple attempts.
INFO:     192.168.X.XXX:45184 - "GET /api/test_llm HTTP/1.1" 500 Internal Server Error
ERROR:    Exception in ASGI application
Traceback (most recent call last):
  File "backend/.venv/lib/python3.12/site-packages/uvicorn/protocols/http/h11_impl.py", line 403, in run_asgi
    result = await app(  # type: ignore[func-returns-value]
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "backend/.venv/lib/python3.12/site-packages/uvicorn/middleware/proxy_headers.py", line 60, in __call__
    return await self.app(scope, receive, send)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "backend/.venv/lib/python3.12/site-packages/fastapi/applications.py", line 1054, in __call__
    await super().__call__(scope, receive, send)
  File "backend/.venv/lib/python3.12/site-packages/starlette/applications.py", line 112, in __call__
    await self.middleware_stack(scope, receive, send)
  File "backend/.venv/lib/python3.12/site-packages/starlette/middleware/errors.py", line 187, in __call__
    raise exc
  File "backend/.venv/lib/python3.12/site-packages/starlette/middleware/errors.py", line 165, in __call__
    await self.app(scope, receive, _send)
  File "backend/.venv/lib/python3.12/site-packages/starlette/middleware/cors.py", line 85, in __call__
    await self.app(scope, receive, send)
  File "backend/.venv/lib/python3.12/site-packages/starlette/middleware/exceptions.py", line 62, in __call__
    await wrap_app_handling_exceptions(self.app, conn)(scope, receive, send)
  File "backend/.venv/lib/python3.12/site-packages/starlette/_exception_handler.py", line 53, in wrapped_app
    raise exc
  File "backend/.venv/lib/python3.12/site-packages/starlette/_exception_handler.py", line 42, in wrapped_app
    await app(scope, receive, sender)
  File "backend/.venv/lib/python3.12/site-packages/starlette/routing.py", line 715, in __call__
    await self.middleware_stack(scope, receive, send)
  File "backend/.venv/lib/python3.12/site-packages/starlette/routing.py", line 735, in app
    await route.handle(scope, receive, send)
  File "backend/.venv/lib/python3.12/site-packages/starlette/routing.py", line 288, in handle
    await self.app(scope, receive, send)
  File "backend/.venv/lib/python3.12/site-packages/starlette/routing.py", line 76, in app
    await wrap_app_handling_exceptions(app, request)(scope, receive, send)
  File "backend/.venv/lib/python3.12/site-packages/starlette/_exception_handler.py", line 53, in wrapped_app
    raise exc
  File "backend/.venv/lib/python3.12/site-packages/starlette/_exception_handler.py", line 42, in wrapped_app
    await app(scope, receive, sender)
  File "backend/.venv/lib/python3.12/site-packages/starlette/routing.py", line 73, in app
    response = await f(request)
               ^^^^^^^^^^^^^^^^
  File "backend/.venv/lib/python3.12/site-packages/fastapi/routing.py", line 301, in app
    raw_response = await run_endpoint_function(
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "backend/.venv/lib/python3.12/site-packages/fastapi/routing.py", line 212, in run_endpoint_function
    return await dependant.call(**values)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "backend/app/routes/batch_documents.py", line 119, in test_llm
    response1 = await llm._call(prompt2)
            ^^^^^^^^^^^^^^^^^^^^^^^^
  File "backend/app/services/lmstudio_llm.py", line 63, in _call
    raise RuntimeError("LLM failed to generate a response.")
RuntimeError: LLM failed to generate a response.

Steps to Reproduce

  • Set up the LMStudio server with the specified configuration (a connectivity-check sketch follows this list)
  • Implement the FastAPI endpoint as shown in the code example
  • Make a GET request to the /test_llm endpoint
  • Observe the 500 error response
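
One way to confirm, before touching the FastAPI route, that the base URL is reachable from the machine running the code (a minimal sketch, assuming LMStudio's OpenAI-compatible /v1/models listing endpoint; requests and the timeout are illustrative):

import requests

BASE_URL = "http://192.168.X.XXX:1234/v1"

# A successful response here confirms the host, port and /v1 prefix are reachable
# from the client machine and should list the models loaded in LMStudio.
resp = requests.get(f"{BASE_URL}/models", timeout=10)
print(resp.status_code)
print(resp.json())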

Additional Context

  • The same code works when the module is run directly as a standalone script
  • Multiple retry attempts produce the same error
  • I've tried multiple approaches, including the LMStudio integration from llama_index and the direct AsyncOpenAI client
  • All configuration parameters match between the working and non-working scenarios
  • Tested with both the async and sync clients; the result is the same

Questions

  • Is there a specific way to configure FastAPI to work with LMStudio's OpenAI-compatible API?
  • Are there known issues with async OpenAI client usage in FastAPI applications?
  • Is there a recommended approach for integrating LMStudio with FastAPI? (a sketch of the kind of setup I mean follows this list)
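
To make the last question concrete, the setup I have in mind is roughly the following: one AsyncOpenAI client created at application startup and reused by the routes. This is only a sketch using the same environment variables as above; the lifespan wiring and route are illustrative, not something confirmed to work against LMStudio:

import os
from contextlib import asynccontextmanager

from fastapi import FastAPI
from openai import AsyncOpenAI

LMSTUDIO_MODEL = os.getenv("LMSTUDIO_MODEL", "llama-3.2-3b-instruct")
LMSTUDIO_BASE_URL = os.getenv("LMSTUDIO_BASE_URL", "http://192.168.X.XXX:1234/v1")
LMSTUDIO_API_KEY = os.getenv("LMSTUDIO_API_KEY", "lm_studio")

@asynccontextmanager
async def lifespan(app: FastAPI):
    # Create a single client for the whole application instead of one per request.
    app.state.llm_client = AsyncOpenAI(base_url=LMSTUDIO_BASE_URL, api_key=LMSTUDIO_API_KEY)
    yield
    await app.state.llm_client.close()

app = FastAPI(lifespan=lifespan)

@app.get("/test_llm")
async def test_llm():
    response = await app.state.llm_client.chat.completions.create(
        model=LMSTUDIO_MODEL,
        messages=[{"role": "user", "content": "When was Valentine's Day?"}],
        temperature=0.2,
    )
    return {"response": response.choices[0].message.content}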