Using back-and-forth translation to create artificial augmented textual data for sentiment analysis models
Article
Article Title | Using back-and-forth translation to create artificial augmented textual data for sentiment analysis models |
---|---|
ERA Journal ID | 17852 |
Article Category | Article |
Authors | Body, Thomas (Author), Tao, Xiaohui (Author), Li, Yuefeng (Author), Li, Lin (Author) and Zhong, Ning (Author) |
Journal Title | Expert Systems with Applications |
Journal Citation | 178, pp. 1-12 |
Article Number | 115033 |
Number of Pages | 12 |
Year | 2021 |
Publisher | Elsevier |
Place of Publication | United Kingdom |
ISSN | 0957-4174; 1873-6793 |
Digital Object Identifier (DOI) | https://doi.org/10.1016/j.eswa.2021.115033 |
Web Address (URL) | https://www.sciencedirect.com/science/article/pii/S0957417421004747 |
Abstract | Sentiment analysis classification models trained using neural networks require large amounts of data, but collecting these datasets requires significant time and resources. Although artificial data has been used successfully in computer vision, there are few effective and generalizable methods for creating artificial augmented text data. This paper proposes a text-based data augmentation method called back-and-forth translation that can be used to artificially increase the size of any natural language dataset. Empirical experiments demonstrate that creating augmented text data and adding it to the original dataset can reduce the error rate in binary sentiment classification models by up to 3.4%. These results are shown to be statistically significant. |
Keywords | Natural language processing; Translation; Sentiment analysis; Data augmentation |
ANZSRC Field of Research 2020 | 460507. Information extraction and fusion; 460208. Natural language processing; 460502. Data mining and knowledge discovery |
Public Notes | Files associated with this item cannot be displayed due to copyright restrictions. |
Byline Affiliations | School of Sciences; Queensland University of Technology; Wuhan University of Technology, China; Maebashi Institute of Technology, Japan |
Institution of Origin | University of Southern Queensland |
https://research.usq.edu.au/item/q65x5/using-back-and-forth-translation-to-create-artificial-augmented-textual-data-for-sentiment-analysis-models
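
The abstract describes back-and-forth translation only at a high level, so the following is a minimal sketch of the general idea rather than the authors' implementation: each labelled text is translated into a pivot language and back again, and any changed result is added to the dataset under the original label. The names `back_and_forth`, `augment`, and `toy_translate`, the `(text, source_lang, target_lang)` translator signature, and the choice of pivot language are all assumptions made for illustration; in practice `toy_translate` would be replaced by a real machine translation system.

```python
from typing import Callable, List, Tuple

# Hypothetical translator signature: (text, source_lang, target_lang) -> text.
Translator = Callable[[str, str, str], str]
Example = Tuple[str, int]  # (text, sentiment label)


def back_and_forth(text: str, translate: Translator,
                   pivot_lang: str, source_lang: str = "en") -> str:
    """Translate text into a pivot language and back to the source language."""
    pivot_text = translate(text, source_lang, pivot_lang)
    return translate(pivot_text, pivot_lang, source_lang)


def augment(dataset: List[Example], translate: Translator,
            pivot_langs: List[str], source_lang: str = "en") -> List[Example]:
    """Return the original examples plus any new round-trip variants.

    Each variant keeps the label of the example it came from. Variants that
    duplicate a text already in the dataset are skipped.
    """
    augmented = list(dataset)
    seen = {text for text, _ in dataset}
    for text, label in dataset:
        for lang in pivot_langs:
            variant = back_and_forth(text, translate, lang, source_lang)
            if variant not in seen:
                seen.add(variant)
                augmented.append((variant, label))
    return augmented


def toy_translate(text: str, src: str, tgt: str) -> str:
    """Stand-in word-for-word translator, for demonstration only.

    It is deliberately lossy so that the round trip changes the wording,
    imitating the paraphrasing effect of a real translation system.
    """
    table = {
        ("en", "fr"): {"great": "super", "movie": "film"},
        ("fr", "en"): {"super": "terrific", "film": "film"},
    }
    mapping = table.get((src, tgt), {})
    return " ".join(mapping.get(word, word) for word in text.split())


if __name__ == "__main__":
    data = [("great movie", 1)]
    for text, label in augment(data, toy_translate, ["fr"]):
        print(label, text)
```

Passing the translator in as a function keeps the sketch independent of any particular translation service, and skipping unchanged round trips avoids adding exact duplicates of the original examples.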