Deployment of RAG + LLM model serving on multiple K8s cloud clusters

elotl/GenAI-infra-stack

Question-Answer Chatbot with Self-hosted LLMs & RAG

  • Set up the complete infrastructure stack for a question-answer chatbot over your private data in just a few minutes!
  • Your stack is powered by self-hosted open-source Large Language Models and Retrieval-Augmented Generation (RAG) running on Kubernetes cloud clusters.

Overview

The Question-Answer Chatbot is powered by these technologies:

  1. Open-Source Large Language Models
  2. Retrieval Augmented Generation (RAG)
  3. Vector Stores
  4. Ray AI/ML compute framework
  5. Elotl Luna

[Figure: elotl_genai_stack_enduser — end-user view of the GenAI stack]

Retrieval Augmented Generation

The graphic below shows how RAG determines an answer to an end-user's question using a specific knowledge base.

[Figure: elotl_genai_stack_enduser — RAG question-answering flow]
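The RAG flow above can be sketched in a few lines of Python. This is an illustrative toy, not the stack's actual implementation: it stands in a bag-of-words similarity for a real embedding model and vector store, and stops at building the augmented prompt rather than calling a self-hosted LLM. All function names here (`embed`, `retrieve`, `build_prompt`) are hypothetical.

```python
# Toy sketch of the RAG flow: embed documents, retrieve the most relevant
# one for a question, and build an augmented prompt for the LLM.
import math
from collections import Counter


def embed(text: str) -> Counter:
    # Toy "embedding": bag-of-words term counts. A real stack would use a
    # sentence-embedding model and a vector store instead.
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    # Cosine similarity between two sparse term-count vectors.
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def retrieve(question: str, docs: list[str], k: int = 1) -> list[str]:
    # Rank knowledge-base documents by similarity to the question.
    q = embed(question)
    ranked = sorted(docs, key=lambda d: cosine(q, embed(d)), reverse=True)
    return ranked[:k]


def build_prompt(question: str, docs: list[str]) -> str:
    # The augmented prompt is what would be sent to the self-hosted LLM.
    context = "\n".join(retrieve(question, docs))
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"


docs = [
    "Luna autoscales Kubernetes compute for AI workloads.",
    "Ray distributes Python AI/ML workloads across a cluster.",
]
print(build_prompt("What does Ray do?", docs))
```

In the real stack, the retrieval step queries a vector store populated from your private data, and the augmented prompt is served to an open-source LLM running on the Kubernetes cluster.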

Installation

The complete installation guide is available here.
