A content aggregation and social engagement platform focused on science articles.
- Automated web scraping of science news websites
- Article translation and summarization using OpenAI GPT API
- User authentication and commenting system
- Discussion forum for science topics (planned)

Requirements:

- Python 3.8+
- MongoDB
- Redis (the Docker Compose setup starts it alongside MongoDB)
- Node.js
- Docker (optional but recommended for easy setup)

Clone the repository and install the Python dependencies:

    git clone https://github.com/nbursa/science-data-service.git
    cd science-data-service
    pip install -r requirements.txt

If you prefer Docker, start MongoDB and Redis with Docker Compose:

    docker-compose up -d

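Start the API server with Uvicorn:

    uvicorn app.main:app --host 0.0.0.0 --port 8000 --reload

Start the Celery worker and the Celery beat scheduler, each in its own terminal:

    celery -A scraping.celery_tasks worker --loglevel=info
    celery -A scraping.celery_tasks beat --loglevel=info

Run a spider manually, replacing `<spider_name>` with the name of a spider defined in the project:

    python -m scrapy crawl <spider_name>
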
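
To run several spiders one after another from a script, the crawl command above can be wrapped in a small helper. This is only a minimal sketch: `crawl_command` and `run_spiders` are illustrative names, not part of this repository, and the sketch assumes Scrapy is installed in the same Python environment and that the script is run from the project directory.

```python
import subprocess
import sys
from typing import List


def crawl_command(spider_name: str) -> List[str]:
    """Build the argv for `python -m scrapy crawl <spider_name>`."""
    name = spider_name.strip()
    if not name:
        raise ValueError("spider name must not be empty")
    # sys.executable keeps the crawl inside the current (virtual) environment.
    return [sys.executable, "-m", "scrapy", "crawl", name]


def run_spiders(spider_names: List[str]) -> None:
    """Run each spider in turn and stop at the first failing crawl."""
    for name in spider_names:
        # check=True raises CalledProcessError on a non-zero exit code.
        subprocess.run(crawl_command(name), check=True)


if __name__ == "__main__":
    # Spider names are taken from the command line; none are assumed here.
    run_spiders(sys.argv[1:])
```

Saved under a name of your choice, the script takes the spider names as arguments and runs them in the order given.
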
Once the Uvicorn server is running, the application is available at http://localhost:8000 and the API documentation at http://localhost:8000/docs.
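
With `--reload` or a Docker-based setup, the server can take a moment to come up. The following sketch polls the documentation URL until it answers with HTTP 200. It uses only the standard library. The helper names (`fetch_status`, `wait_until_up`), the retry count, and the delay are assumptions chosen for illustration.

```python
import time
import urllib.request
from typing import Callable

DOCS_URL = "http://localhost:8000/docs"  # URL given in this README


def fetch_status(url: str) -> int:
    """Return the HTTP status code of a GET request to `url`."""
    with urllib.request.urlopen(url, timeout=5) as response:
        return response.status


def wait_until_up(
    url: str = DOCS_URL,
    attempts: int = 10,
    delay: float = 1.0,
    fetch: Callable[[str], int] = fetch_status,
) -> bool:
    """Poll `url` until it returns 200 or the attempts run out."""
    for attempt in range(attempts):
        try:
            if fetch(url) == 200:
                return True
        except OSError:
            # URLError, HTTPError, timeouts and refused connections
            # are all OSError subclasses; treat them as "not ready yet".
            pass
        if attempt < attempts - 1:
            time.sleep(delay)
    return False
```

Calling `wait_until_up()` right after launching Uvicorn returns `True` once the docs page responds and `False` if it never does within the allotted attempts.
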
Before starting the services, set the environment variables for the MongoDB and Redis connections and for the OpenAI API key used for article translation and summarization.
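
This README does not name the variables, so the sketch below uses hypothetical names (`MONGODB_URI`, `REDIS_URL`, `OPENAI_API_KEY`); check the project's own configuration for the real ones. It shows one way to read the three settings at startup and fail early with a clear message if any is missing.

```python
import os
from dataclasses import dataclass
from typing import Mapping

# Assumed variable names, for illustration only; the project may use others.
REQUIRED_VARS = ("MONGODB_URI", "REDIS_URL", "OPENAI_API_KEY")


@dataclass(frozen=True)
class Settings:
    mongodb_uri: str
    redis_url: str
    openai_api_key: str


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Read the required settings, raising if any is unset or empty."""
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise RuntimeError(
            "missing environment variables: " + ", ".join(missing)
        )
    return Settings(
        mongodb_uri=env["MONGODB_URI"],
        redis_url=env["REDIS_URL"],
        openai_api_key=env["OPENAI_API_KEY"],
    )
```

Passing the environment as a parameter keeps the loader easy to test with a plain dictionary instead of the real process environment.
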
Contributions are welcome! Please open an issue or submit a pull request.