SOTA approach to modeling fine-grained sentiment expressions in financial news articles. Detached[^3] CNN-BiLSTM regression head trained on fine-tuned DeBERTa entity embeddings[^1]. Refer to the PDF for more detail.
- Base model comparison notebook
  - BERT, RoBERTa, FinBERT, DeBERTa
- Main training/experiments notebook with DeBERTa
The experiments showed that sentiment regression performance was improved by:
- Incorporating the final hidden states of both the [CLS] token and the masked target entity token into the classification model (a minimal extraction sketch follows this list)
- Detaching the classification model from the token-level fine-tuning process
  - In other words, placing a complex architecture inside the fine-tuning process (an "attached" head[^2]) performed worse than placing the same architecture after the standard (boilerplate `transformers.BertForSequenceClassification`) pooling + dense layer
  - Intuitively, the error propagation backwards through DeBERTa during training seemed to benefit from a closer/simpler signal, resulting in better inputs for the detached CNN-BiLSTM
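For illustration, here is a minimal sketch of how the two final hidden states might be extracted and concatenated. The `microsoft/deberta-base` checkpoint, the example sentence, and the variable names are assumptions for the sketch, not the repo's actual code; in practice the checkpoint would be the fine-tuned DeBERTa produced by the training notebook.

```python
import torch
from transformers import AutoModel, AutoTokenizer

# Illustrative checkpoint -- swap in the fine-tuned DeBERTa weights in practice.
tokenizer = AutoTokenizer.from_pretrained("microsoft/deberta-base")
model = AutoModel.from_pretrained("microsoft/deberta-base")
model.eval()

# Example sentence with the target entity masked out.
enc = tokenizer("Shares of [MASK] rallied after the earnings beat.",
                return_tensors="pt")

with torch.no_grad():
    hidden = model(**enc).last_hidden_state              # (1, seq_len, hidden)

cls_vec = hidden[:, 0, :]                                # [CLS] final hidden state
mask_pos = (enc["input_ids"][0] == tokenizer.mask_token_id).nonzero(as_tuple=True)[0]
entity_vec = hidden[:, mask_pos, :].squeeze(1)           # masked entity token state

# Concatenated features for the downstream classification/regression head.
features = torch.cat([cls_vec, entity_vec], dim=-1)      # (1, 2 * hidden)
```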
The tradeoff between inference time in production systems and model performance is an interesting area for further research.
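For concreteness, below is a minimal sketch of what a detached CNN-BiLSTM regression head could look like, operating on precomputed final hidden states. The class name, layer sizes, and pooling choice are illustrative assumptions, not the repo's actual configuration.

```python
import torch
import torch.nn as nn

class DetachedCNNBiLSTMHead(nn.Module):
    """Secondary network trained on precomputed DeBERTa final hidden states."""

    def __init__(self, hidden_size=768, conv_channels=128, lstm_hidden=64):
        super().__init__()
        # 1D convolution over the token axis captures local n-gram patterns
        self.conv = nn.Conv1d(hidden_size, conv_channels, kernel_size=3, padding=1)
        self.relu = nn.ReLU()
        # BiLSTM models longer-range dependencies over the conv features
        self.lstm = nn.LSTM(conv_channels, lstm_hidden,
                            batch_first=True, bidirectional=True)
        # Single scalar output for sentiment regression
        self.regressor = nn.Linear(2 * lstm_hidden, 1)

    def forward(self, hidden_states):          # (batch, seq_len, hidden_size)
        x = self.relu(self.conv(hidden_states.transpose(1, 2)))  # (batch, C, seq_len)
        x, _ = self.lstm(x.transpose(1, 2))    # (batch, seq_len, 2 * lstm_hidden)
        pooled = x.mean(dim=1)                 # mean-pool over tokens
        return self.regressor(pooled).squeeze(-1)

# The inputs come from an already fine-tuned DeBERTa that is not part of this
# network, so no gradient flows back into the transformer ("detached" head).
head = DetachedCNNBiLSTMHead()
dummy = torch.randn(4, 32, 768)                # (batch, seq_len, hidden)
scores = head(dummy)                           # (4,) sentiment scores
```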
[^1]: For BERT-based models, the final token-level embeddings that are output by the fine-tuned model are referred to as the "final hidden states".

[^2]: "Attached" classification/regression head -- a single network is used to simultaneously fine-tune DeBERTa and perform classification/regression. The loss from the "classification" phase directly affects "representation" (i.e. the production of fine-tuned final hidden states).

[^3]: "Detached" classification/regression head -- the production of fine-tuned final hidden states is performed using a simple primary network (pooling + dense), then a (completely separate) secondary network is utilized for classification/regression, using the output of the primary network as input.