
Add SGLang integration #1460

Open
rlouf opened this issue Feb 28, 2025 · 1 comment

rlouf commented Feb 28, 2025

We should look into integrating SGLang as an inference library in Outlines.

rlouf added this to the 1.0 milestone Feb 28, 2025
rlouf commented Mar 5, 2025

I would like to try a new interface for servers, which should make it easier to use local servers such as vLLM and SGLang for small workloads:

from typing import List, Union


class SgLang:
    """Simple SgLang server interface.

    Automatically handles server startup/shutdown and provides a
    clean interface for text generation.
    """

    def __init__(
        self,
        model_id: str,
        host: str = "127.0.0.1",
        port: int = 8000,
        **kwargs
    ):
        """Initialize a SgLang server."""
        self.model_id = model_id
        self.host = host
        self.port = port
        self._server_process = None
        self._client = None
        self._url = f"http://{self.host}:{self.port}/v1"

    def __enter__(self):
        """Start the server when entering the context manager."""
        self.start()
        return self

    def __exit__(self, exc_type, exc_val, exc_tb):
        """Stop the server when exiting the context manager."""
        self.stop()

    async def __aenter__(self):
        """Start the server when entering the async context manager."""
        self.start()
        return self

    async def __aexit__(self, exc_type, exc_val, exc_tb):
        """Stop the server when exiting the async context manager."""
        self.stop()

    def start(self):
        """Start the SgLang server in a subprocess and create the HTTP
        client. Poll the /health endpoint until the server is up."""
        pass

    def stop(self):
        """Stop the SgLang server."""
        pass

    def generate(
        self,
        prompt: Union[str, List[str]],
        **kwargs
    ) -> Union[str, List[str]]:
        """Generate text from a prompt or batch of prompts (synchronous)."""
        pass

    async def generate_async(
        self,
        prompt: Union[str, List[str]],
        **kwargs
    ) -> Union[str, List[str]]:
        """Generate text from a prompt or batch of prompts (asynchronous)."""
        pass

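The `start` method's docstring calls for polling a `/health` endpoint until the server is up. A minimal, stdlib-only sketch of such a readiness check (the function name, endpoint path, and timeout defaults are illustrative assumptions, not part of the proposal):

```python
import time
import urllib.error
import urllib.request


def wait_until_healthy(url: str, timeout: float = 60.0, interval: float = 0.5) -> bool:
    """Poll `url` until it returns HTTP 200 or `timeout` seconds elapse.

    Returns True as soon as a 200 response is seen, False on timeout.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with urllib.request.urlopen(url, timeout=interval) as response:
                if response.status == 200:
                    return True
        except (urllib.error.URLError, OSError):
            # Server not accepting connections yet; retry after a short sleep.
            pass
        time.sleep(interval)
    return False
```

`start` could then launch the subprocess and call `wait_until_healthy(f"http://{self.host}:{self.port}/health")` before returning, raising if it comes back `False`.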
This can be used both synchronously and asynchronously:

import outlines

with outlines.servers.sglang() as model:
    result = model("prompt", dict)

async def main():
    async with outlines.servers.sglang() as model:
        result = await model("prompt", dict)
    return result
