Includes analysis of a large corpus of positive and negative user comments, data cleaning, model selection, and deployment to a Flask REST API
- Nathaniel Haddad [email protected]
- Northeastern University
- Disclosure: this is an academic project
Technologies: TensorFlow, scikit-learn, NLTK, pandas, Flask, Python
packages:
pip3 install pickle-mixin
pip3 install tensorflow
pip3 install -U scikit-learn
pip3 install Flask
pip3 install nltk
pip3 install pandas
run:
- train a model: (from the root folder)
python comment_clf_model.py
- run the server: (from the root folder)
python comment_clf_app.py
- go to the server home:
http://127.0.0.1:5000/v1/api
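To confirm the server is up without opening a browser, a small script along these lines can request the home URL. This is only a sketch: the helper name `check_server` is hypothetical, it uses the Python standard library only, and it assumes nothing about what the home endpoint returns beyond an HTTP status.

```python
import urllib.error
import urllib.request

# URL taken from the step above; assumes the Flask server is running locally.
HOME_URL = "http://127.0.0.1:5000/v1/api"


def check_server(url=HOME_URL, timeout=5.0):
    """Return the HTTP status code of `url`, or None if the server is unreachable.

    `check_server` is a hypothetical helper, not part of this project.
    """
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # The server answered, but with an error status (for example 404 or 500).
        return err.code
    except OSError:
        # Connection refused, timeout, etc.: the server is probably not running.
        return None


if __name__ == "__main__":
    status = check_server()
    if status is None:
        print("Server not reachable; start it with: python comment_clf_app.py")
    else:
        print(f"Server responded with HTTP status {status}")
```

A return value of `None` usually means the server was not started, or is listening on a different host or port.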
This project is a series of machine learning models that use natural language processing to identify attacks on Wikipedia users. Using scikit-learn and other packages, I built several classifiers that predict with high accuracy whether a comment is an attack.
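As a rough illustration of the kind of classifier described above, the sketch below trains a TF-IDF plus logistic regression pipeline with scikit-learn. It is not the code in `comment_clf_model.py`: the choice of features and model is an assumption, and the comments and labels are made-up toy data, far too small to say anything about accuracy.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Made-up toy data for illustration only; 1 = attack, 0 = not an attack.
comments = [
    "you are an idiot and should stop editing",
    "thanks for fixing the citation on that page",
    "nobody wants you here, get lost",
    "I added a source for the second paragraph",
    "what a stupid, worthless contribution",
    "could you review my changes to the intro?",
]
labels = [1, 0, 1, 0, 1, 0]

# Assumed model choice: TF-IDF word features feeding a logistic regression.
clf = make_pipeline(
    TfidfVectorizer(lowercase=True),
    LogisticRegression(max_iter=1000),
)
clf.fit(comments, labels)

# Predict on unseen comments; each prediction is 0 or 1.
preds = clf.predict(["thanks for the helpful edit", "you are an idiot"])
print(preds)
```

Wrapping the vectorizer and the model in one pipeline means the fitted object can be pickled as a single unit and loaded by a server process, which fits the train-then-serve workflow in the run steps.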
- Some of the resources I used: