Volume 73, Issue 9 pp. 1314-1335
RESEARCH ARTICLE

SEntFiN 1.0: Entity-aware sentiment analysis for financial news

Ankur Sinha

Production and Quantitative Methods, IIM, Ahmedabad, India

Satishwar Kedas

Corresponding Author

Production and Quantitative Methods, IIM, Ahmedabad, India

Correspondence

Satishwar Kedas, Production and Quantitative Methods, IIM, Ahmedabad, India.

Email: [email protected]

Rishu Kumar

Production and Quantitative Methods, IIM, Ahmedabad, India

Pekka Malo

Department of Information and Service Economy, Aalto University, Espoo, Finland

First published: 08 March 2022
Citations: 3

Funding information: India Gold Policy Centre, Grant/Award Number: 9209100:1815012

Abstract

Fine-grained financial sentiment analysis on news headlines is a challenging task that requires human-annotated datasets to achieve high performance. Few studies have addressed sentiment extraction in a setting where multiple entities are present in a single news headline. To further research in this area, we make publicly available SEntFiN 1.0, a human-annotated dataset of 10,753 news headlines with entity-sentiment annotations, of which 2,847 headlines contain multiple entities, often with conflicting sentiments. We augment our dataset with a database of over 1,000 financial entities and their various representations in news media, amounting to over 5,000 phrases. We propose a framework that enables the extraction of entity-relevant sentiments using a feature-based approach rather than an expression-based approach. For sentiment extraction, we use 12 different learning schemes, built on lexicon-based and pretrained sentence representations, and five classification approaches. Our experiments indicate that lexicon-based N-gram ensembles compare favorably with pretrained word embedding schemes such as GloVe. Overall, RoBERTa and finBERT (domain-specific BERT) achieve the highest average accuracy of 94.29% and F1-score of 93.27%. Further, using over 210,000 entity-sentiment predictions, we validate the economic effect of sentiments on aggregate market movements over a long duration.
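
To make the role of the entity database concrete, the following is a minimal sketch of how a table of entities and their alternative phrasings might be used to locate entities in a headline before any sentiment is assigned. It is not the authors' implementation: the dictionary `ENTITY_PHRASES`, its entries, and the helper names `build_matcher` and `find_entities` are hypothetical and chosen only for illustration.

```python
import re

# Hypothetical miniature entity database: canonical name -> surface forms.
# These entries are illustrative only and are not taken from SEntFiN.
ENTITY_PHRASES = {
    "Reliance Industries": ["reliance industries", "reliance", "ril"],
    "Tata Consultancy Services": ["tata consultancy services", "tcs"],
    "Gold": ["gold", "yellow metal"],
}


def build_matcher(entity_phrases):
    """Compile one regex over all phrases and build a phrase -> entity lookup."""
    phrase_to_entity = {
        phrase: entity
        for entity, phrases in entity_phrases.items()
        for phrase in phrases
    }
    # Longest phrases first so "reliance industries" wins over "reliance".
    ordered = sorted(phrase_to_entity, key=len, reverse=True)
    pattern = re.compile(
        r"\b(?:" + "|".join(re.escape(p) for p in ordered) + r")\b",
        flags=re.IGNORECASE,
    )
    return pattern, phrase_to_entity


def find_entities(headline, pattern, phrase_to_entity):
    """Return canonical entities mentioned in the headline, in order of appearance."""
    found = []
    for match in pattern.finditer(headline):
        entity = phrase_to_entity[match.group(0).lower()]
        if entity not in found:
            found.append(entity)
    return found


PATTERN, LOOKUP = build_matcher(ENTITY_PHRASES)
```

Calling `find_entities("RIL gains while TCS slips; yellow metal steady", PATTERN, LOOKUP)` resolves three different surface forms to their canonical entities, each of which could then receive its own sentiment label. Word-boundary matching is a deliberate simplification here, so that a headline about "Goldman" is not mistaken for one about gold.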