I would like to try a new interface for servers that should make it easier to use local servers such as vLLM or SGLang for small workloads:
from typing import List, Union


class SgLang:
    """Simple SgLang server interface.

    Automatically handles server startup/shutdown and provides a clean
    interface for text generation.
    """

    def __init__(
        self,
        model_id: str,
        host: str = "127.0.0.1",
        port: int = 8000,
        **kwargs
    ):
        """Initialize a SgLang server."""
        self.model_id = model_id
        self.host = host  # use the argument instead of a hard-coded "localhost"
        self.port = port
        self._server_process = None
        self._client = None
        self._url = f"http://{self.host}:{self.port}/v1"

    def __enter__(self):
        """Start the server when entering the context manager."""
        self.start()
        return self

    def __exit__(self, exc_type, exc_val, exc_tb):
        """Stop the server when exiting the context manager."""
        self.stop()

    async def __aenter__(self):
        """Async context manager entry point."""
        pass

    async def __aexit__(self, exc_type, exc_val, exc_tb):
        """Async context manager exit point."""
        pass

    def start(self):
        """Start the SgLang server in a subprocess and create the HTTP client.

        Poll the /health endpoint until the server is up.
        """

    def stop(self):
        """Stop the SgLang server."""
        pass

    def generate(
        self,
        prompt: Union[str, List[str]],
        **kwargs
    ) -> Union[str, List[str]]:
        """Generate text from a prompt or batch of prompts (synchronous)."""
        pass

    async def generate_async(
        self,
        prompt: Union[str, List[str]],
        **kwargs
    ) -> Union[str, List[str]]:
        """Generate text from a prompt or batch of prompts (asynchronous)."""
        pass
It can be used both synchronously and asynchronously:
import outlines

with outlines.servers.sglang() as model:
    result = model("prompt", dict)

async def main():
    async with outlines.servers.sglang() as model:
        result = await model("prompt", dict)
        return result
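The `start` docstring in the interface describes polling the `/health` endpoint until the server is up. A minimal sketch of that polling step, using only the standard library, might look like the following. The function name `wait_until_healthy`, its `timeout` and `interval` parameters, and the assumption that `/health` lives at the server root (rather than under `/v1`) are illustrative and not part of the proposal.

```python
import http.client
import time
import urllib.error
import urllib.request


def wait_until_healthy(base_url: str, timeout: float = 60.0, interval: float = 0.5) -> bool:
    """Poll `<base_url>/health` until it answers HTTP 200 or `timeout` elapses.

    Hypothetical helper: `base_url` is assumed to be the server root,
    e.g. "http://127.0.0.1:8000". Returns True once healthy, False on timeout.
    """
    health_url = base_url.rstrip("/") + "/health"
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with urllib.request.urlopen(health_url, timeout=interval) as response:
                if response.status == 200:
                    return True
        except (urllib.error.URLError, http.client.HTTPException, OSError):
            # The server is not accepting connections yet, or it answered
            # with an error status; wait and try again.
            pass
        time.sleep(interval)
    return False
```

In a `start()` implementation this could be called right after launching the subprocess, with the process terminated and an error raised if the function returns `False`.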
We should look into integrating SGLang as an inference library in Outlines.