Financial Sentiment across languages
The SSIX project targets sentiment analysis in the financial domain. One of its main objectives is to overcome language barriers and build a financial sentiment platform capable of scoring textual data in different languages. The three-year project was launched with English. In Year 2, we developed an English Gold Standard (GS) corpus and trained an English sentiment classifier on it. To extend our research from English to multilingual sentiment analysis, we have investigated several strategies. The most valuable are the Native, the Foreign, and the Direct Translation approaches. As a continuation of our previous work, we are currently investigating the Foreign approach, which creates a new gold standard for a target language by translating an existing gold standard from a source language. In our case, we are translating the English GS into German. The aim of our current research is to quantify the impact of machine translation (MT) on the quality of the GS and to outline strategies for improving MT quality.
We consider MT quality optimal when the discrepancy between the original GS sentiment and the sentiment of the machine-translated text is minimal. To assess this, we examined several MT methods and evaluated their output in terms of both translation quality and sentiment.
We collected a sample of 200 English tweets annotated with sentiment scores from the English GS. We translated the sample into German within the Geofluent framework, the real-time translation tool developed by Lionbridge Inc. Geofluent makes it possible to query several MT engines and to apply pre- and post-processing rules (so-called text normalization rules). These rules correct consistent spelling errors, resolve abbreviations, and correct words in the target language for better domain adaptation. For example, beyond spelling corrections, normalization handles English financial terminology such as cup-with-handle, which clearly cannot be translated literally as *Tasse mit Griff.
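To make the idea of pre- and post-processing rules concrete, here is a minimal sketch in Python. It does not use Geofluent's rule format or API, which are not described here. The abbreviation table, the list of protected terms, the placeholder scheme, and the function names are all hypothetical. The sketch also assumes that the English term is kept as-is in the German output and that the MT engine leaves the placeholder untouched; both are only possible treatments.

```python
import re

# Hypothetical rules for illustration only; NOT Geofluent's actual rules or API.
ABBREVIATIONS = {"qtr": "quarter", "yr": "year"}
# Domain terms that should not be translated word by word.
PROTECTED_TERMS = ["cup-with-handle"]


def pre_process(text):
    """Expand abbreviations and mask protected terms before translation."""
    for abbr, full in ABBREVIATIONS.items():
        text = re.sub(rf"\b{re.escape(abbr)}\b", full, text, flags=re.IGNORECASE)
    placeholders = {}
    for i, term in enumerate(PROTECTED_TERMS):
        token = f"__TERM{i}__"
        text, count = re.subn(re.escape(term), token, text, flags=re.IGNORECASE)
        if count:
            placeholders[token] = term
    return text, placeholders


def post_process(translated, placeholders):
    """Restore the protected terms in the translated text."""
    for token, term in placeholders.items():
        translated = translated.replace(token, term)
    return translated
```

In this sketch, pre_process would run on the English tweet before it is sent to an MT engine, and post_process would run on the German output.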
We intended to demonstrate that text normalization plays an essential role in translating financial tweets. We therefore translated the sample twice: once with pre- and post-processing, and once with neither pre- and post-processing nor Translation Memory entries. We used three different engines to perform the translation: Microsoft Translator (MS), Google Translate (G), and Google Neural Machine Translation (GN). The same sample of English tweets was also translated into German by a domain expert.
We then asked two native German-speaking finance experts to assign a “sentiment” to each translated tweet, keeping the sentiment of the original English tweet hidden. The sentiment ranges from 1 (“very bearish”) to 10 (“very bullish”). Whenever the two domain experts assigned substantially different sentiments to the same translation, we asked a third expert to reconcile the disagreement. We also asked the domain experts to rate the quality of each translation on a scale from 1 (“very inaccurate”) to 5 (“very accurate”).
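The annotation procedure can be sketched as follows. The text does not define what counts as a relevant difference, nor how two agreeing scores are combined, so the threshold of 3 points and the averaging step below are assumptions made purely for illustration. The function name and the callable standing in for the third expert are also hypothetical.

```python
def reconcile(score_a, score_b, third_expert, threshold=3):
    """Return a final sentiment on the 1 (very bearish) to 10 (very bullish) scale.

    threshold is an ASSUMED cut-off for a "relevant" disagreement.
    third_expert is a callable that returns the third annotator's score.
    """
    for score in (score_a, score_b):
        if not 1 <= score <= 10:
            raise ValueError(f"sentiment must be in 1..10, got {score}")
    if abs(score_a - score_b) >= threshold:
        # The two experts disagree: defer to the third expert.
        return third_expert()
    # ASSUMPTION: agreeing scores are averaged.
    return (score_a + score_b) / 2
```

For example, reconcile(6, 7, ask_third) would average the two scores without consulting the third expert, while reconcile(2, 8, ask_third) would return whatever the hypothetical ask_third callable provides.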
In total, we examined seven translations of the original English sample: one human translation, plus the normalized and non-normalized variants produced by Microsoft Translator (MS), Google Translate (G), and Google Neural Machine Translation (GN).
There are several questions we intend to address with our research. Generally speaking, how is sentiment preserved across translation? Can we assume that sentiment is preserved in human translation (HT) by a domain expert? How does sentiment change with MT and what factors affect the change?
To answer these questions, we need to investigate how consistent the sentiment scores are across translations, in other words, how much the sentiment of one translation deviates from that of another. We also want to know how far each MT sentiment deviates from the HT sentiment and which MT engine produces the sentiment closest to it.
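One simple way to express the deviation of each MT sentiment from the HT sentiment is the mean absolute difference per tweet. The sketch below is not the measure used in the project, which has yet to be reported. The metric, the function names, and the data layout (a dictionary of per-engine score lists aligned with the HT scores) are assumptions.

```python
from statistics import mean


def mean_abs_deviation(scores, reference):
    """Mean absolute difference between two aligned lists of sentiment scores."""
    if len(scores) != len(reference):
        raise ValueError("score lists must be aligned tweet by tweet")
    return mean(abs(s - r) for s, r in zip(scores, reference))


def rank_engines(mt_scores, ht_scores):
    """Sort engines from closest to farthest from the human translation."""
    deviations = {
        engine: mean_abs_deviation(scores, ht_scores)
        for engine, scores in mt_scores.items()
    }
    return sorted(deviations.items(), key=lambda item: item[1])
```

Calling rank_engines with one list of scores per engine (for instance under the keys "MS", "G", and "GN") would return the engines ordered by their average distance from the HT sentiment. The same mean_abs_deviation helper could be applied to any pair of translations to check their consistency with each other.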
As noted above, Geofluent allows specific pre- and post-processing of text. We also want to quantify the impact of this normalization on sentiment preservation.
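A paired, per-tweet comparison is one possible way to quantify this impact. The sketch below compares the non-normalized and normalized output of a single engine against the original GS sentiment. It assumes that all three score lists are aligned tweet by tweet and share the same scale; the function name and the summary fields are hypothetical, and this is not the project's evaluation method.

```python
def normalization_effect(gs, raw, normalized):
    """Summarize how normalization changes the distance to the GS sentiment.

    A positive gain means the normalized translation is closer to the
    original GS score than the non-normalized one for that tweet.
    """
    if not (len(gs) == len(raw) == len(normalized)) or not gs:
        raise ValueError("need three non-empty, aligned score lists")
    gains = [abs(r - g) - abs(n - g) for g, r, n in zip(gs, raw, normalized)]
    return {
        "mean_gain": sum(gains) / len(gains),
        "improved": sum(x > 0 for x in gains),
        "worsened": sum(x < 0 for x in gains),
        "unchanged": sum(x == 0 for x in gains),
    }
```

A mean gain above zero would indicate that, on average, normalization brings the translated sentiment closer to the original. Whether such a difference is statistically meaningful on a sample of 200 tweets would need a separate test.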
The investigation is ongoing, and we cannot yet give a satisfactory answer to the above questions. The first assessments on the normalized data show no significant deviation in sentiment between the English originals and any of the human or machine translations. While not conclusive, these results suggest that the normalization rules and the environment provided by Geofluent play an important role in preserving sentiment. They also help explain why Geofluent has been chosen as the translation interface for the Direct Translation approach, which is used for most languages supported in SSIX. The normalization tools and translation memories available in Geofluent are valuable resources that make sentiment analysis more reliable across languages. Providing quantitative answers to the above questions and assessing sentiment quality via MT remain the focus of our ongoing investigation and will be the subject of a forthcoming publication.