This repository provides a Triton Inference Server setup for face detection and face recognition using ONNX models with dynamic batching.
- Face Detection: RetinaFace-MobileNetV2 (Dynamic Batch, ONNX)
- Face Recognition: MobileNetV2 (Dynamic Batch, ONNX)
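Once the server is running, the models can be exercised directly with the Triton Python client. The sketch below is a minimal example, not part of this repository: the model name, tensor names, and 640×640 NCHW input shape are assumptions and should be adjusted to match the actual `config.pbtxt` files in the model repository.

```python
# Minimal Triton HTTP client sketch (pip install tritonclient[http] numpy).
# Model name ("face_detection"), tensor names ("input", "boxes", "scores"),
# and input shape are placeholders; check config.pbtxt for the real values.
import numpy as np
import tritonclient.http as httpclient

client = httpclient.InferenceServerClient(url="localhost:8000")

# Fake batch of two 640x640 RGB images in NCHW float32 (assumed preprocessing).
batch = np.random.rand(2, 3, 640, 640).astype(np.float32)

infer_input = httpclient.InferInput("input", list(batch.shape), "FP32")
infer_input.set_data_from_numpy(batch)

result = client.infer(model_name="face_detection", inputs=[infer_input])
print(result.as_numpy("boxes").shape, result.as_numpy("scores").shape)
```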
Build the Triton server image and run it with GPU support, exposing the HTTP (8000), gRPC (8001), and metrics (8002) ports:

```bash
docker build -t tritonserver .
docker run --gpus all --rm -p 8000:8000 -p 8001:8001 -p 8002:8002 tritonserver:latest
```

Alternatively, start and stop the stack with Docker Compose:

```bash
docker compose up --build -d
docker compose down
```
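Before wiring anything to the server, it can help to confirm that it is live and the models have loaded. Below is a small sketch using the official Triton Python client; the model names are assumptions and should match the directory names in the model repository.

```python
# Readiness check against the Triton HTTP endpoint (pip install tritonclient[http]).
import tritonclient.http as httpclient

client = httpclient.InferenceServerClient(url="localhost:8000")
print("server ready:", client.is_server_ready())

# Model names below are placeholders; use the folder names from the model repository.
for model in ("face_detection", "face_recognition"):
    print(model, "ready:", client.is_model_ready(model))
```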
The repository also includes a FastAPI application to visualize and test the API endpoints. Start it with:

```bash
uvicorn api.face_api:app --host 0.0.0.0 --port 8000 --reload
```

Open your browser and navigate to http://localhost:8000/docs to access the FastAPI Swagger UI, where you can test the available endpoints for face detection and recognition.
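The endpoints can also be called from a script instead of the Swagger UI. The route (`/detect`) and form field (`file`) below are placeholders, not the repository's actual API; use the routes listed at /docs.

```python
# Hypothetical call to a face-detection endpoint with an image upload.
# The route "/detect" and field "file" are assumptions; see /docs for the real ones.
import requests

with open("sample.jpg", "rb") as f:
    response = requests.post(
        "http://localhost:8000/detect",
        files={"file": ("sample.jpg", f, "image/jpeg")},
        timeout=30,
    )

response.raise_for_status()
print(response.json())
```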
To improve inference performance with TensorRT, enable GPU acceleration by adding the following `optimization` block to the model's `config.pbtxt`:

```protobuf
optimization {
  execution_accelerators {
    gpu_execution_accelerator : [
      {
        name : "tensorrt"
        parameters { key: "precision_mode" value: "FP16" }  # Run inference in FP16 for better performance
      }
    ]
  }
}
```
✅ Upcoming Enhancements:
- Integrate TensorRT optimizations
- Improve model inference speed
Feel free to open an issue or submit a PR if you have improvements or suggestions! 🚀
This project is open-source and available under the MIT License.