Self-supervised sentiment analysis in Spanish to understand the university narrative of the Colombian conflict

Julian Enrique Paez Valdez; Paula Andrea Rendón Cardona; Julián Gil González; Mauricio Rivera-Henao

doi:https://doi.org/10.3390/app12115472

Research output: Contribution to journal › Article › peer-review

3 Citations (Scopus)

Abstract

Sentiment analysis is an area of natural language processing (NLP) that extracts opinions on topics such as customer service and political elections.
It is usually carried out with supervised learning approaches that rely on labeled data. However, obtaining such labels is generally expensive or even infeasible. These problems can be addressed with models based on self-supervised learning, which aims to handle various machine learning paradigms in the absence of labels. Accordingly, we propose a self-supervised approach for sentiment analysis in Spanish that comprises a lexicon-based method and a supervised classifier. We test our proposal on three corpora; the first two are labeled datasets, namely CorpusCine and PaperReviews. The third is an unlabeled corpus of news related to the Colombian conflict, which we use to understand the university journalistic narrative of the war in Colombia.
The results show that our proposal can handle sentiment analysis on unlabeled corpora; in fact, it achieves competitive performance compared with state-of-the-art techniques on partially labeled datasets.
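
The abstract names only two components, a lexicon-based method and a supervised classifier, without giving implementation details. The following is a minimal sketch of how such a pipeline might be wired together, not the authors' implementation: the toy lexicon, the example sentences, the choice of a Naive Bayes classifier, and all function names are assumptions made for illustration.

```python
import math
from collections import Counter, defaultdict

# Hypothetical toy lexicon (assumption); a real system would load a full
# Spanish sentiment lexicon instead.
POSITIVE = {"excelente", "buena"}
NEGATIVE = {"terrible", "mala"}


def tokenize(text):
    """Lowercase and split on whitespace."""
    return text.lower().split()


def lexicon_label(text):
    """Return 'pos', 'neg', or None when the lexicon gives no clear signal."""
    tokens = tokenize(text)
    score = sum(t in POSITIVE for t in tokens) - sum(t in NEGATIVE for t in tokens)
    if score > 0:
        return "pos"
    if score < 0:
        return "neg"
    return None


def train_naive_bayes(docs, labels):
    """Collect per-class word counts for a multinomial Naive Bayes model."""
    word_counts = defaultdict(Counter)
    class_counts = Counter(labels)
    for doc, label in zip(docs, labels):
        word_counts[label].update(tokenize(doc))
    vocab = {w for counts in word_counts.values() for w in counts}
    return word_counts, class_counts, vocab


def predict(model, text):
    """Pick the class with the highest log-probability (add-one smoothing)."""
    word_counts, class_counts, vocab = model
    total_docs = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label in sorted(class_counts):
        total_words = sum(word_counts[label].values())
        score = math.log(class_counts[label] / total_docs)
        for t in tokenize(text):
            if t in vocab:  # words never seen in training are skipped
                score += math.log(
                    (word_counts[label][t] + 1) / (total_words + len(vocab))
                )
        if score > best_score:
            best_label, best_score = label, score
    return best_label


def self_supervised_fit(corpus):
    """Pseudo-label the corpus with the lexicon, then train on those labels."""
    pseudo = [(doc, lexicon_label(doc)) for doc in corpus]
    docs = [doc for doc, label in pseudo if label is not None]
    labels = [label for _, label in pseudo if label is not None]
    return train_naive_bayes(docs, labels)


# Unlabeled toy corpus (assumption); the last document has no lexicon words,
# so it receives no pseudo-label and is left out of training.
corpus = [
    "una película excelente y emotiva",
    "una historia buena y emotiva",
    "una película terrible y lenta",
    "una trama mala y lenta",
    "un final lento",
]
model = self_supervised_fit(corpus)
```

In this sketch the classifier can assign a label to a text such as "una obra emotiva", which contains no lexicon word, because "emotiva" co-occurred with pseudo-labeled positive documents. That generalization beyond the lexicon is the motivation for pairing the two components.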

Original language	Spanish (Colombia)
Article number	5472
Pages (from-to)	1-16
Number of pages	17
Journal	Applied Sciences (Switzerland)
Volume	12
Issue number	5472
DOI	https://doi.org/10.3390/app12115472
Publication status	Published - May 28, 2022

Access to document

https://doi.org/10.3390/app12115472

https://www.mdpi.com/2076-3417/12/11/5472/htm

Cite this

@article{4d766c7f95c64431aab58e0e0e3a8ea5,

title = "Self-supervised sentiment analysis in Spanish to understand the university narrative of the Colombian conflict",

abstract = "Sentiment analysis is an area of natural language processing (NLP) that extracts opinions on topics such as customer service and political elections. It is usually carried out with supervised learning approaches that rely on labeled data. However, obtaining such labels is generally expensive or even infeasible. These problems can be addressed with models based on self-supervised learning, which aims to handle various machine learning paradigms in the absence of labels. Accordingly, we propose a self-supervised approach for sentiment analysis in Spanish that comprises a lexicon-based method and a supervised classifier. We test our proposal on three corpora; the first two are labeled datasets, namely CorpusCine and PaperReviews. The third is an unlabeled corpus of news related to the Colombian conflict, which we use to understand the university journalistic narrative of the war in Colombia. The results show that our proposal can handle sentiment analysis on unlabeled corpora; in fact, it achieves competitive performance compared with state-of-the-art techniques on partially labeled datasets.",

author = "{Paez Valdez}, {Julian Enrique} and {Rend{\'o}n Cardona}, {Paula Andrea} and {Gil Gonz{\'a}lez}, Juli{\'a}n and Mauricio Rivera-Henao",

year = "2022",

month = may,

day = "28",

doi = "https://doi.org/10.3390/app12115472",

language = "Spanish (Colombia)",

volume = "12",

pages = "1--16",

journal = "Applied Sciences (Switzerland)",

issn = "2076-3417",

publisher = "Multidisciplinary Digital Publishing Institute (MDPI)",

number = "5472",

}

Self-supervised sentiment analysis in Spanish to understand the university narrative of the Colombian conflict. / Paez Valdez, Julian Enrique; Rendón Cardona, Paula Andrea; Gil González, Julián et al.
In: Applied Sciences (Switzerland), Vol. 12, No. 5472, 5472, 28.05.2022, p. 1-16.

TY - JOUR

T1 - Self-supervised sentiment analysis in Spanish to understand the university narrative of the Colombian conflict

AU - Paez Valdez, Julian Enrique

AU - Rendón Cardona, Paula Andrea

AU - Gil González, Julián

AU - Rivera-Henao, Mauricio

PY - 2022/5/28

Y1 - 2022/5/28

AB - Sentiment analysis is an area of natural language processing (NLP) that extracts opinions on topics such as customer service and political elections. It is usually carried out with supervised learning approaches that rely on labeled data. However, obtaining such labels is generally expensive or even infeasible. These problems can be addressed with models based on self-supervised learning, which aims to handle various machine learning paradigms in the absence of labels. Accordingly, we propose a self-supervised approach for sentiment analysis in Spanish that comprises a lexicon-based method and a supervised classifier. We test our proposal on three corpora; the first two are labeled datasets, namely CorpusCine and PaperReviews. The third is an unlabeled corpus of news related to the Colombian conflict, which we use to understand the university journalistic narrative of the war in Colombia. The results show that our proposal can handle sentiment analysis on unlabeled corpora; in fact, it achieves competitive performance compared with state-of-the-art techniques on partially labeled datasets.

U2 - https://doi.org/10.3390/app12115472

DO - https://doi.org/10.3390/app12115472

M3 - Article

SN - 2076-3417

VL - 12

SP - 1

EP - 16

JO - Applied Sciences (Switzerland)

JF - Applied Sciences (Switzerland)

IS - 5472

M1 - 5472

ER -