
Starred repositories
An open-source photography travel blog 📸 built with Next.js, Drizzle, Neon, Better Auth, shadcn/ui, and tRPC.
A high-performance distributed file system designed to address the challenges of AI training and inference workloads.
DeepGEMM: clean and efficient FP8 GEMM kernels with fine-grained scaling
DeepEP: an efficient expert-parallel communication library
Fully open reproduction of DeepSeek-R1
Utility to manage SSH public keys stored in LDAP.
xDiT: A Scalable Inference Engine for Diffusion Transformers (DiTs) with Massive Parallelism
Chat with Lex! A RAG app using HyDE, with Milvus as the vector store, vLLM for LLM inference, and FastEmbed for embeddings (see the sketch after this list).
Efficient Triton Kernels for LLM Training
Infinity is a high-throughput, low-latency serving engine for text embeddings, reranking models, CLIP, CLAP, and ColPali.
🤗 LeRobot: Making AI for Robotics more accessible with end-to-end learning
Machine Learning Engineering Open Book
ROS 2 project involving 2D and 3D mapping of an environment.
neuralmagic / nm-vllm
Forked from vllm-project/vllm. A high-throughput and memory-efficient inference and serving engine for LLMs.
A Gradio web UI for Large Language Models with support for multiple inference backends.
A repository for hacking Generative Fill with Open Source Tools
High-speed Large Language Model Serving for Local Deployment
This repository provides an original implementation of Detecting Pretraining Data from Large Language Models by *Weijia Shi, *Anirudh Ajith, Mengzhou Xia, Yangsibo Huang, Daogao Liu, Terra Blevins…
A framework for few-shot evaluation of language models.
DiffusionFastForward: a free course and experimental framework for diffusion-based generative models
ComfyUI custom nodes for inpainting/outpainting using the new latent consistency model (LCM)
SD.Next: all-in-one for AI generative images
Dynamic Resource Allocation (DRA) for NVIDIA GPUs in Kubernetes
ML model optimization product to accelerate inference.
20+ high-performance LLMs with recipes to pretrain, finetune and deploy at scale.
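
The "Chat with Lex" entry above describes a HyDE-style RAG pipeline built on Milvus, vLLM, and FastEmbed. Below is a minimal, hypothetical sketch of how such a flow could be wired together; the model name, collection name, and field names are illustrative assumptions, not taken from that repository.

```python
# Hypothetical HyDE retrieval flow: generate a hypothetical answer with vLLM,
# embed it with FastEmbed, and search Milvus with that embedding instead of
# the raw question, then answer again using the retrieved context.
from fastembed import TextEmbedding
from pymilvus import MilvusClient
from vllm import LLM, SamplingParams

question = "What did Lex and his guest say about AGI timelines?"

# 1. HyDE step: ask the LLM to write a plausible (hypothetical) answer first.
llm = LLM(model="meta-llama/Llama-3.1-8B-Instruct")  # assumed model
params = SamplingParams(temperature=0.7, max_tokens=256)
hypothetical = llm.generate(
    [f"Write a short passage that answers: {question}"], params
)[0].outputs[0].text

# 2. Embed the hypothetical passage rather than the question itself.
embedder = TextEmbedding()  # FastEmbed's default small embedding model
query_vector = list(embedder.embed([hypothetical]))[0]

# 3. Retrieve the nearest transcript chunks from Milvus.
client = MilvusClient("milvus_demo.db")  # assumed local Milvus Lite file
hits = client.search(
    collection_name="lex_transcripts",   # assumed collection
    data=[query_vector.tolist()],
    limit=5,
    output_fields=["text"],              # assumed text field
)
context = "\n".join(hit["entity"]["text"] for hit in hits[0])

# 4. Final answer grounded in the retrieved chunks.
answer = llm.generate(
    [f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"], params
)[0].outputs[0].text
print(answer)
```

Embedding the hypothetical answer instead of the question is the core HyDE idea: the generated passage usually lies closer in embedding space to relevant documents than the bare query does.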