ViTI is a high-performance, production-grade inference server for Vision Transformer (ViT) models and other state-of-the-art architectures from the `timm` library. Built with FastAPI, ViTI supports dynamic model loading, concurrent image processing, and an enterprise-level cost calculation system based on model complexity, image count, and processing time.
- Dynamic Model Inference: Load and run any vision model from the `timm` library.
- Concurrent Image Processing: Process multiple images simultaneously, taking full advantage of your CPU cores.
- Enterprise-Grade Cost Calculation: Scalable pricing based on model size, number of images, and processing time.
- FastAPI: A modern, fast (high-performance) web framework for building APIs with Python 3.7+.
- Loguru Logging: For robust, customizable, and readable logging throughout the inference process.
- Pydantic: Ensures fast and accurate data validation, guaranteeing reliability in production environments.
Follow the steps below to get started with ViTI.
- Python 3.7+
- `pip` for managing Python packages
- `git` to clone the repository
git clone https://github.com/The-Swarm-Corporation/ViTI.git
cd ViTI
pip install -r requirements.txt
uvicorn main:app --reload --host 0.0.0.0 --port 8000
The server will now be running at http://localhost:8000.
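Because ViTI is built on FastAPI, interactive API documentation is also served at http://localhost:8000/docs by default, which is a convenient way to explore the endpoints described below.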
Single image inference using `resnet50`:
curl -X POST "http://localhost:8000/v1/vision/inference" \
-H "Content-Type: application/json" \
-d '{
"image_base64": ["iVBORw0KGgoAAAANSUhEUgAAAGQAAABkCAIAAAD/gAIDAAABBklEQVR4nO3cwQnDMBQFQcukuaSX+Jge7GOKSX1uwQsCE5gp4CGW
"],
"model_name": "resnet50"
}'
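If you prefer a programmatic client, here is a minimal sketch that base64-encodes a local image and calls the single-image endpoint. The `requests` library and the `cat.jpg` file path are illustrative assumptions, not part of ViTI itself.

```python
# Minimal client sketch (assumes the `requests` package is installed).
import base64
import requests

# Hypothetical example image; replace with your own file.
with open("cat.jpg", "rb") as f:
    image_b64 = base64.b64encode(f.read()).decode("utf-8")

payload = {
    "image_base64": [image_b64],
    "model_name": "resnet50",
}

response = requests.post(
    "http://localhost:8000/v1/vision/inference",
    json=payload,
    timeout=60,
)
response.raise_for_status()

# The response schema is documented below (logits, top_5_classes, cost).
result = response.json()
print(result["top_5_classes"], result["cost"])
```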
Multiple image inference:
curl -X POST "http://localhost:8000/v1/vision/inference/multiple" \
-H "Content-Type: application/json" \
-d '{
"image_base64": ["<your_base64_encoded_image_1>", "<your_base64_encoded_image_2>"],
"model_name": "efficientnet_b0"
}'
You can install ViTI's dependencies via pip using the provided requirements.txt, or manually by installing the following:
- FastAPI: The core web framework.
- Uvicorn: ASGI server for high-performance API deployment.
- Torch: For deep learning model inference.
- Timm: Access to hundreds of pre-trained vision models.
- Loguru: For logging.
pip3 install -r requirements.txt
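If you skip requirements.txt, a manual install of the packages listed above would look roughly like this (exact pinned versions may differ from the provided file):

```bash
pip3 install fastapi uvicorn torch timm loguru pydantic
```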
- Endpoint: `/v1/health`
- Method: `GET`
- Description: Check the health of the API server.
- Response:
{ "status": "ok", "time": "2024-09-27 10:30:00" }
- Endpoint: `/v1/models`
- Method: `GET`
- Description: Get a list of available models from the `timm` library.
- Response:
[ "resnet50", "efficientnet_b0", "mobilenetv2_100" ]
- Endpoint: `/v1/vision/inference`
- Method: `POST`
- Description: Run inference on a single image.
- Request Body:
{ "image_base64": ["<base64_encoded_image>"], "model_name": "resnet50" }
- Response:
{ "logits": [[0.45, 0.35, 0.1, 0.05, 0.05]], "top_5_classes": [["class_23", "class_34", "class_5", "class_101", "class_56"]], "cost": 0.02 }
- Endpoint: `/v1/vision/inference/multiple`
- Method: `POST`
- Description: Run inference on multiple images concurrently.
- Request Body:
{ "image_base64": ["<base64_encoded_image_1>", "<base64_encoded_image_2>"], "model_name": "efficientnet_b0" }
- Response:
{ "logits": [[0.45, 0.35, 0.1], [0.55, 0.15, 0.12]], "top_5_classes": [["class_23", "class_34", "class_5"], ["class_12", "class_56", "class_78"]], "cost": 0.03 }
ViTI implements a dynamic pricing model based on:
- Number of Images: Each additional image increases the cost.
- Model Size: Models with more parameters incur higher costs.
- Processing Time: Longer inference times increase the cost.
Cost Formula:
cost = base_cost + (image_count * image_cost) + (processing_time * time_cost) + (log(model_size) * model_size_factor)
Where:
- `base_cost`: A fixed base cost ($0.01).
- `image_cost`: Cost per image ($0.01).
- `time_cost`: Cost per second of inference time ($0.01).
- `model_size_factor`: A scaling factor of $0.0001 applied to log(model size).
For 2 images using `efficientnet_b0`:
- Number of images: 2
- Model size: 5.3M parameters
- Processing time: 2 seconds
cost = 0.01 + (2 * 0.01) + (2 * 0.01) + (log(5288548) * 0.0001)
cost ≈ 0.05
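For reference, here is a small Python sketch of the same formula. The constants mirror the values above; the function name and the use of the natural logarithm (`math.log`) are assumptions for illustration.

```python
import math

# Illustrative sketch of the pricing formula described above.
# Constants mirror the documented values; natural log is assumed.
BASE_COST = 0.01            # fixed base cost in dollars
IMAGE_COST = 0.01           # cost per image
TIME_COST = 0.01            # cost per second of inference time
MODEL_SIZE_FACTOR = 0.0001  # scaling factor applied to log(model size)

def estimate_cost(image_count: int, processing_time_s: float, model_params: int) -> float:
    """Estimate the inference cost for a request."""
    return (
        BASE_COST
        + image_count * IMAGE_COST
        + processing_time_s * TIME_COST
        + math.log(model_params) * MODEL_SIZE_FACTOR
    )

# Worked example from above: 2 images, 2 seconds, efficientnet_b0 (~5.3M parameters).
print(round(estimate_cost(2, 2.0, 5_288_548), 4))  # ≈ 0.0515
```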
You can run ViTI using Docker to easily deploy the inference server in a containerized environment.
# Base image
FROM python:3.9-slim
# Set environment variables
ENV PYTHONDONTWRITEBYTECODE=1
ENV PYTHONUNBUFFERED=1
# Install dependencies
WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
# Copy application code
COPY . .
# Expose port and run the server
EXPOSE 8000
CMD ["uvicorn", "main:app", "--host", "0.0.0.0", "--port", "8000", "--workers", "4"]
docker build -t viti-inference .
docker run -d -p 8000:8000 viti-inference
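Once the container is running, the /v1/health endpoint described above can be used against the mapped port (http://localhost:8000) to verify the deployment.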
ViTI is released under the MIT License.
We welcome contributions from the community! Please read our contributing guide for more information.