- Set up the complete infrastructure stack for a Question-Answer chatbot over your private data in just a few minutes!
- Your stack is powered by self-hosted open-source Large Language Models and Retrieval Augmented Generation, running on cloud Kubernetes clusters.
The Question-Answer Chatbot is powered by these technologies:
- Open-Source Large Language Models
- Retrieval Augmented Generation (RAG)
- Vector Stores
- Ray AI/ML compute framework
- Elotl Luna
The graphic below shows how RAG is used to answer the end-user's question from a specific knowledge base.
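To complement the graphic, here is a minimal, illustrative sketch of that flow: the question is embedded, the closest passages are retrieved from a FAISS vector store, and the retrieved context is combined with the question into the prompt sent to the self-hosted LLM. The embedding model, sample passages, and the idea of POSTing to a serving endpoint are assumptions for illustration, not part of this install.

```python
# Illustrative RAG flow: embed the question, retrieve matching passages
# from a FAISS index, and build the prompt for the LLM.
# Model name, passages, and endpoint handling are hypothetical.
import faiss
import numpy as np
from sentence_transformers import SentenceTransformer

# Example knowledge base and embedding model (placeholder choices).
passages = [
    "Luna provisions right-sized cloud nodes for Ray workers on demand.",
    "The vector store is built from your private documents.",
]
embedder = SentenceTransformer("all-MiniLM-L6-v2")
passage_vectors = embedder.encode(passages, normalize_embeddings=True)

# Build an in-memory FAISS index over the passage embeddings.
index = faiss.IndexFlatIP(passage_vectors.shape[1])
index.add(np.asarray(passage_vectors, dtype="float32"))

# Retrieve the passage closest to the user's question.
question = "How are Ray worker nodes provisioned?"
query_vector = embedder.encode([question], normalize_embeddings=True)
_, ids = index.search(np.asarray(query_vector, dtype="float32"), k=1)
context = passages[ids[0][0]]

# The retrieved context is prepended to the question before it is sent
# to the model-serving endpoint (sending step not shown here).
prompt = f"Answer using only this context:\n{context}\n\nQuestion: {question}"
print(prompt)
```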
- Cluster Setup Summary
- Install Infrastructure Tools
- Install Model Serve Stack
- Model Serving
- Retrieval Augmented Generation using FAISS
- Creation of the Vector Store
- Install the RAG & LLM querying service
- Send a question to your LLM with RAG
- Query your LLM with RAG using a Chat UI
- Uninstall
Jump to the complete install doc, available here.