DistilBERT Sentiment Analysis on IMDb Movie Reviews vs Amazon Fine Foods Reviews

Made by: Horia-Gabriel Radu and Alexandru-Liviu Bratosin

Structure

├── data
│   ├── amazon (on Google Drive)
│   └── imdb   (on Google Drive)
├── demo
│   └── main.py
├── experiments
│   ├── test.yaml
│   └── train.yaml
├── lib
│   ├── data.py
│   ├── test.py
│   └── train.py
├── models
│   └── distilbert.py
└── notebooks
    └── main.ipynb

train.yaml and test.yaml: Config files for training and testing.
data.py: Dataset utilities.
test.py and train.py: Training and testing functions.
distilbert.py: PyTorch-Lightning model wrapper class.
notebooks/main.ipynb: Notebook used for running the training and testing experiments.

Datasets

Saved versions of the dataset splits we have used are also uploaded on Google Drive.

Models

Saved checkpoints and training logs are uploaded on Google Drive.

Evaluation and Experiment Results

The following results are achieved with the parameters found in experiments/train.yaml and experiments/test.yaml config files.

Experiment: Train on IMDb,   test on IMDb,   w/o pretrain
    | Test accuracy: 0.850320 | pretrain=false;train=imdb;test=imdb     |

Experiment: Train on Amazon, test on Amazon, w/o pretrain
    | Test accuracy: 0.861111 | pretrain=false;train=amazon;test=amazon |

Experiment: Train on IMDb,   test on Amazon, w/o pretrain
    | Test accuracy: 0.730711 | pretrain=false;train=imdb;test=amazon   |

Experiment: Train on Amazon, test on IMDb,   w/o pretrain
    | Test accuracy: 0.725720 | pretrain=false;train=amazon;test=imdb   |


Experiment: Train on IMDb,   test on IMDb,   w/ pretrain
    | Test accuracy: 0.905840 | pretrain=true;train=imdb;test=imdb      |

Experiment: Train on Amazon, test on Amazon, w/ pretrain
    | Test accuracy: 0.920089 | pretrain=true;train=amazon;test=amazon  |

Experiment: Train on IMDb,   test on Amazon, w/ pretrain
    | Test accuracy: 0.868133 | pretrain=true;train=imdb;test=amazon    |

Experiment: Train on Amazon, test on IMDb,   w/ pretrain
    | Test accuracy: 0.849640 | pretrain=true;train=amazon;test=imdb    |

Demo

The demo file is demo/main.py.

Usage:

usage: main.py [-h] --checkpoint CHECKPOINT [--max_sequence_length MAX_SEQUENCE_LENGTH] [--padding PADDING] [--truncation TRUNCATION]

NLU Sentiment Classification demo.

optional arguments:
  -h, --help            show this help message and exit
  --checkpoint CHECKPOINT
                        Path to model checkpoint.
  --max_sequence_length MAX_SEQUENCE_LENGTH
                        Maximum sequence length.
  --padding PADDING     Padding type.
  --truncation TRUNCATION
                        Truncation.

References

IMDb Dataset - Andrew L. Maas, Raymond E. Daly, Peter T. Pham, Dan Huang, Andrew Y. Ng, and Christopher Potts. (2011). Learning Word Vectors for Sentiment Analysis. The 49th Annual Meeting of the Association for Computational Linguistics (ACL 2011).
Amazon Fine Food Dataset - J. McAuley and J. Leskovec. From amateurs to connoisseurs: modeling the evolution of user expertise through online reviews. WWW, 2013.
DistilBERT model from Hugging Face - https://huggingface.co/docs/transformers/model_doc/distilbert (Accessed 11 May 2022)

Name		Name	Last commit message	Last commit date
Latest commit History 3 Commits
demo		demo
experiments		experiments
lib		lib
models		models
notebooks		notebooks
NLU_Report.pdf		NLU_Report.pdf
README.MD		README.MD
refresh.sh		refresh.sh
requirements.txt		requirements.txt
submit.sh		submit.sh

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

DistilBERT Sentiment Analysis on IMDb Movie Reviews vs Amazon Fine Foods Reviews

Made by: Horia-Gabriel Radu and Alexandru-Liviu Bratosin

Structure

Datasets

Models

Evaluation and Experiment Results

Demo

References

About

Releases

Packages

Languages

horiaradu1/DistilBERT-Sentiment-Analysis-on-IMDb-Movie-Reviews-vs-Amazon-Fine-Foods-Reviews

Folders and files

Latest commit

History

Repository files navigation

DistilBERT Sentiment Analysis on IMDb Movie Reviews vs Amazon Fine Foods Reviews

Made by: Horia-Gabriel Radu and Alexandru-Liviu Bratosin

Structure

Datasets

Models

Evaluation and Experiment Results

Demo

References

About

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages