A machine learning model that generates concise summaries of email threads using the BART architecture. The model is deployed as a FastAPI service on Google Cloud Platform.
- Base Model: facebook/bart-large-cnn
- Fine-tuned On: Email Summary Dataset (xprilion/email-summary-dataset)
The model was trained on the following infrastructure:
- Computing Platform: Kaggle Notebooks
- Hardware: 2x NVIDIA Tesla T4 GPUs
- Notebook Link: View Implementation
- Framework: PyTorch + Transformers
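A minimal sketch of loading the base checkpoint and dataset named above, assuming the `xprilion/email-summary-dataset` identifier resolves on the Hugging Face Hub:

```python
from datasets import load_dataset
from transformers import BartForConditionalGeneration, BartTokenizer

# Fine-tuning starts from the facebook/bart-large-cnn checkpoint.
model = BartForConditionalGeneration.from_pretrained("facebook/bart-large-cnn")
tokenizer = BartTokenizer.from_pretrained("facebook/bart-large-cnn")

# Email summary dataset (assumed to be hosted on the Hugging Face Hub).
dataset = load_dataset("xprilion/email-summary-dataset")
```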
ROUGE scores on the test set:
- ROUGE-1: F1 = 0.469 (recall = 0.497, precision = 0.450)
- ROUGE-2: F1 = 0.227 (recall = 0.244, precision = 0.217)
- ROUGE-L: F1 = 0.442 (recall = 0.468, precision = 0.425)
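These scores can be recomputed with the `rouge-score` package; a minimal sketch with illustrative strings (the actual evaluation harness is not shown here):

```python
from rouge_score import rouge_scorer

scorer = rouge_scorer.RougeScorer(["rouge1", "rouge2", "rougeL"], use_stemmer=True)

reference = "Team agreed to move the launch to Friday and assigned QA to Dana."
prediction = "The launch moves to Friday; Dana will handle QA."

# score() returns a Score(precision, recall, fmeasure) tuple per metric.
for name, s in scorer.score(reference, prediction).items():
    print(f"{name}: F1 = {s.fmeasure:.3f} (recall = {s.recall:.3f}, precision = {s.precision:.3f})")
```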
The model was fine-tuned using the following configuration:
- Batch size: 8 with gradient accumulation steps of 4
- Learning rate: 5e-5 with StepLR scheduler (gamma=0.9)
- Number of epochs: 10 with early stopping
- Maximum input length: 1024 tokens
- Maximum summary length: 128 tokens
- Training optimizations (see the loop sketch after this list):
  - Gradient clipping
  - Early stopping with patience of 2 epochs
  - Learning rate decay
  - Multi-GPU training support
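This configuration maps onto a standard PyTorch loop. In the sketch below, `train_loader` and `val_loader` are assumed `DataLoader`s yielding `input_ids`/`attention_mask`/`labels` batches, and both the AdamW optimizer and the clipping norm of 1.0 are assumptions (the source specifies only "gradient clipping"):

```python
import torch
from torch.optim import AdamW
from torch.optim.lr_scheduler import StepLR

device = "cuda" if torch.cuda.is_available() else "cpu"
model.to(device)
if torch.cuda.device_count() > 1:
    model = torch.nn.DataParallel(model)                 # multi-GPU support (2x T4)

optimizer = AdamW(model.parameters(), lr=5e-5)           # optimizer choice assumed
scheduler = StepLR(optimizer, step_size=1, gamma=0.9)    # decay LR each epoch

accum_steps = 4                                          # gradient accumulation
best_val, patience, bad_epochs = float("inf"), 2, 0

for epoch in range(10):
    model.train()
    optimizer.zero_grad()
    for step, batch in enumerate(train_loader):          # DataLoader with batch size 8
        batch = {k: v.to(device) for k, v in batch.items()}
        loss = model(**batch).loss.mean() / accum_steps  # .mean() handles DataParallel
        loss.backward()
        if (step + 1) % accum_steps == 0:
            torch.nn.utils.clip_grad_norm_(model.parameters(), 1.0)  # clip value assumed
            optimizer.step()
            optimizer.zero_grad()
    scheduler.step()

    model.eval()
    with torch.no_grad():                                # simple validation-loss pass
        val = sum(model(**{k: v.to(device) for k, v in b.items()}).loss.mean().item()
                  for b in val_loader) / len(val_loader)
    if val < best_val:
        best_val, bad_epochs = val, 0
    else:
        bad_epochs += 1
        if bad_epochs >= patience:                       # early stopping, patience 2
            break
```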
The training data underwent several preprocessing steps (a heuristic sketch follows the list):
- Cleaning email bodies (removing quoted replies, signatures, disclaimers)
- Formatting metadata (From, To, Subject fields)
- Thread concatenation for multi-email conversations
- Removal of duplicates and invalid entries
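A sketch of what such a cleaning pass might look like; the regular expressions here are illustrative heuristics, not the exact rules used:

```python
import re

def clean_email_body(body: str) -> str:
    """Strip quoted replies, signatures, and disclaimers (heuristic)."""
    # Drop quoted reply blocks ("On ... wrote:" headers and "> " quoted lines).
    body = re.split(r"\nOn .* wrote:\n", body)[0]
    body = "\n".join(line for line in body.splitlines() if not line.startswith(">"))
    # Cut at the conventional signature delimiter ("-- ").
    body = body.split("\n-- \n")[0]
    # Drop common disclaimer boilerplate onward.
    body = re.split(r"(?i)\nthis e-?mail .*confidential", body)[0]
    return body.strip()

def format_email(meta: dict, body: str) -> str:
    """Prepend From/To/Subject metadata so the model sees thread context."""
    header = f"From: {meta['from']}\nTo: {meta['to']}\nSubject: {meta['subject']}"
    return f"{header}\n{clean_email_body(body)}"

def build_thread(emails: list[tuple[dict, str]]) -> str:
    """Concatenate a multi-email conversation into one input sequence."""
    return "\n\n".join(format_email(meta, body) for meta, body in emails)
```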
The model is deployed as a FastAPI service with the following features (a minimal endpoint sketch follows the list):
- `POST /summarize/`
  - Accepts email text and a summarization level
  - Returns the generated summary
  - Supports three summarization levels:
    - `shortest`: 10-20% of input length
    - `normal`: 20-40% of input length
    - `elaborative`: 30-60% of input length
- `GET /`
  - Health check endpoint
  - Returns service status
- Input validation:
  - Minimum input length: 10 words
  - Maximum input length: 1024 tokens
  - Required-fields validation
  - Error handling for invalid inputs
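A minimal sketch of this service surface; the field names, status codes, and generation parameters are assumptions rather than the deployed code:

```python
from enum import Enum

from fastapi import FastAPI, HTTPException
from pydantic import BaseModel
from transformers import BartForConditionalGeneration, BartTokenizer

app = FastAPI()
model = BartForConditionalGeneration.from_pretrained("model/")  # local artifact dir (see below)
tokenizer = BartTokenizer.from_pretrained("model/")

class Level(str, Enum):
    shortest = "shortest"        # 10-20% of input length
    normal = "normal"            # 20-40% of input length
    elaborative = "elaborative"  # 30-60% of input length

RATIOS = {
    Level.shortest: (0.10, 0.20),
    Level.normal: (0.20, 0.40),
    Level.elaborative: (0.30, 0.60),
}

class SummarizeRequest(BaseModel):
    text: str                    # required-field validation via pydantic
    level: Level = Level.normal

@app.get("/")
def health():
    """Health check endpoint."""
    return {"status": "ok"}

@app.post("/summarize/")
def summarize(req: SummarizeRequest):
    if len(req.text.split()) < 10:          # minimum input length: 10 words
        raise HTTPException(status_code=422, detail="Input must be at least 10 words.")
    # Inputs are truncated to the model's 1024-token limit.
    inputs = tokenizer(req.text, truncation=True, max_length=1024, return_tensors="pt")
    n_tokens = inputs["input_ids"].shape[1]
    lo, hi = RATIOS[req.level]
    ids = model.generate(
        **inputs,
        min_length=max(10, int(n_tokens * lo)),
        max_length=max(20, int(n_tokens * hi)),
        num_beams=4,                        # beam count assumed
    )
    return {"summary": tokenizer.decode(ids[0], skip_special_tokens=True)}
```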
The model is deployed on Google Cloud Platform with the following components (a startup sketch follows the list):
- Model artifacts stored in Google Cloud Storage
- FastAPI service for inference
- CORS middleware enabled for API access
- Environment-based configuration
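A sketch of the startup path these components imply, reusing the `app` from the endpoint sketch above; the bucket and prefix variables are illustrative environment-based configuration:

```python
import os
from pathlib import Path

from fastapi.middleware.cors import CORSMiddleware
from google.cloud import storage

# Environment-based configuration (variable names are illustrative).
BUCKET = os.environ["MODEL_BUCKET"]
PREFIX = os.environ.get("MODEL_PREFIX", "model/")
LOCAL_DIR = Path("model")

def download_model_artifacts() -> None:
    """Fetch model files (config.json, model.safetensors, ...) from Cloud Storage."""
    LOCAL_DIR.mkdir(exist_ok=True)
    client = storage.Client()
    for blob in client.list_blobs(BUCKET, prefix=PREFIX):
        blob.download_to_filename(str(LOCAL_DIR / Path(blob.name).name))

# CORS middleware enabled for API access (open policy shown; tighten in production).
app.add_middleware(
    CORSMiddleware,
    allow_origins=["*"],
    allow_methods=["*"],
    allow_headers=["*"],
)
```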
The service depends on the following Python packages:
- fastapi
- uvicorn[standard]
- torch
- transformers
- pydantic
- google-cloud-storage
- enum34
The trained model files are stored in the following structure:
- config.json
- generation_config.json
- merges.txt
- model.safetensors
- special_tokens_map.json
- tokenizer_config.json
- vocab.json
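This layout matches what the Transformers `save_pretrained` calls write; a sketch, with the output directory as a placeholder and `model`/`tokenizer` as in the training sketch:

```python
# Writes config.json, generation_config.json, model.safetensors:
model.save_pretrained("model/")
# Writes vocab.json, merges.txt, tokenizer_config.json, special_tokens_map.json:
tokenizer.save_pretrained("model/")
```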
Potential areas for enhancement:
- Add more customization options for summary generation
- Implement caching for frequently requested summaries
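For the caching item, a minimal in-process sketch; `generate_summary` is a hypothetical hook into the model pipeline, and a deployed version would more likely use an external cache such as Redis:

```python
from functools import lru_cache

@lru_cache(maxsize=1024)
def cached_summary(text: str, level: str) -> str:
    """Memoize summaries for repeated (text, level) pairs.

    lru_cache requires hashable arguments, so the raw text serves as the key;
    a hash digest would be preferable for long inputs.
    """
    return generate_summary(text, level)  # hypothetical call into the model pipeline
```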
MIT License