Machine Translation in Financial Sentiment Analysis

As we discussed in previous posts, the SSIX project [1] aims to overcome language barriers by creating a financial sentiment platform that can score textual data in different languages. Because the cost of multilingual sentiment analysis can be high, we developed and tested several strategies and assessed the advantages and disadvantages of each.

Here we report our results for what we call the Foreign Approach (https://ssix-project.eu/financial-sentiment-across-languages/), in which we build a target sentiment Gold Standard (GS) corpus (in this case German) by translating a source English sentiment GS corpus. This approach combines human resources and Machine Translation (MT) to maximize the advantage of human translation (HT) while keeping the cost low. A crucial assumption of our approach is that the sentiment of the source GS can be transferred via translation to the target GS. To verify this assumption, we studied the impact of MT on the quality of the GS and outlined strategies to improve MT quality.

Background
In the SSIX project, we created an English Sentiment Gold Standard (GS) in the domain of finance. The GS corpus was translated into German by three engines (Microsoft, Google, and Google Neural Network) integrated into Geofluent [2], which allows pre-/post-processing such as DO-NOT-TRANSLATE rules to handle special financial terms. Our research setup has been discussed in Lionbridge’s previous posts.
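
The exact rule syntax is Geofluent-specific and not covered here, but conceptually a DO-NOT-TRANSLATE rule masks protected terms before the text is sent to MT and restores them afterwards. The sketch below illustrates that idea in Python; the term list and the translate() call are hypothetical placeholders for illustration, not Geofluent’s actual API.

```python
import re

# Illustrative list of financial terms and tickers that should not be translated.
# In Geofluent these would be configured as DO-NOT-TRANSLATE rules; this list and
# the translate() call further down are hypothetical placeholders.
DO_NOT_TRANSLATE = ["EBITDA", "FTSE 100", "$AAPL", "short squeeze"]

def protect_terms(text, terms):
    """Pre-processing: replace protected terms with placeholders before sending to MT."""
    mapping = {}
    for i, term in enumerate(terms):
        placeholder = f"__DNT{i}__"
        pattern = re.compile(re.escape(term), re.IGNORECASE)
        if pattern.search(text):
            text = pattern.sub(placeholder, text)
            mapping[placeholder] = term
    return text, mapping

def restore_terms(text, mapping):
    """Post-processing: put the original terms back into the MT output."""
    for placeholder, term in mapping.items():
        text = text.replace(placeholder, term)
    return text

# masked, mapping = protect_terms(tweet, DO_NOT_TRANSLATE)
# translated = translate(masked, target="de")   # hypothetical MT call
# final = restore_terms(translated, mapping)
```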

In conducting our research, we are mainly concerned with these questions:

  1. What is the quality of MT in general, and how does it vary across engines?
  2. Can we rely on translations (HT & MT) to preserve sentiment?
  3. Assuming MT can preserve sentiment, which engine produces the most satisfactory translations?

We answer the first question in Study 1, and the rest in Study 2.

Study 1
A sample of 700 tweets with clear sentiment was selected and translated into German by one human expert in the domain of finance and by the three MT engines mentioned above. We calculated BLEU scores [3]; the results suggested that Google and Google Neural Network perform better than Microsoft on 1-grams, while Microsoft performs better on 2-/3-/4-grams.
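
For reference, the BLEU scores were computed with NLTK’s bleu_score module [3]. The following is a minimal sketch of such a computation; the tokenization and example sentences are illustrative only, not our actual data or evaluation script.

```python
from nltk.translate.bleu_score import corpus_bleu, SmoothingFunction

# Illustrative example: one human reference translation per tweet and one MT hypothesis.
# In Study 1 the references came from the human expert translation of the 700 tweets.
references = [
    ["die Aktie fällt nach den Quartalszahlen deutlich".split()],
    ["der DAX erreicht ein neues Allzeithoch".split()],
]
hypotheses = [
    "die Aktie sinkt nach den Quartalszahlen stark".split(),
    "der DAX erreicht ein neues Allzeithoch".split(),
]

smooth = SmoothingFunction().method1  # avoids zero scores when higher n-grams have no overlap

# The weights select which n-gram orders contribute to the score.
bleu_1 = corpus_bleu(references, hypotheses, weights=(1, 0, 0, 0), smoothing_function=smooth)
bleu_4 = corpus_bleu(references, hypotheses, weights=(0.25, 0.25, 0.25, 0.25), smoothing_function=smooth)
print(f"BLEU-1: {bleu_1:.3f}  BLEU-4: {bleu_4:.3f}")
```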

Since sentiment analysis is the target of the SSIX project, we consider translation quality to be best when there is minimal discrepancy in sentiment between the original English GS and the translation. To explore how well sentiment was preserved after translation, we conducted a second study.

Study 2
We used a subset of the sample (200 tweets) and asked three German financial domain experts to assign sentiment scores to the translations.

We compared the sentiment of the English GS to the sentiment of the four translations. We found that HT is significantly more reliable in preserving sentiment: all three MT engines lost a significant amount of sentiment. Since HT proved the most reliable, we then used it as the benchmark against which to compare the MT engines. It turned out that both Google and Google Neural Network successfully transferred sentiment from the English texts to the German translations, whereas Microsoft lost sentiment to the extent that its scores differed significantly from HT.
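
We do not go into the statistical details here, but the kind of paired comparison described above can be sketched as follows, assuming one sentiment score per tweet from the English GS and one from a translation, compared with a Wilcoxon signed-rank test. The choice of test and the scores shown are illustrative assumptions, not necessarily those used in the study.

```python
import numpy as np
from scipy.stats import wilcoxon

# Hypothetical paired sentiment scores per tweet (e.g. in [-1, 1]):
# one from the English GS and one assigned by the German experts to a translation.
gs_scores = np.array([0.8, -0.5, 0.3, 0.0, 0.6, -0.7, 0.4, 0.2])
mt_scores = np.array([0.6, -0.4, 0.0, 0.1, 0.5, -0.2, 0.3, 0.0])

# Paired non-parametric test of whether the translation systematically shifts sentiment.
stat, p_value = wilcoxon(gs_scores, mt_scores)
print(f"Wilcoxon statistic: {stat:.2f}, p-value: {p_value:.3f}")
```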

Conclusion
In our studies, we evaluated the performance of three different translation engines when equipped with pre-/post-processing rules in Geofluent. Our findings show that an approach combining HT and MT is capable of extracting the sentiment of financial tweets on a multilingual platform. In future work, we plan to introduce more language-specific rules to increase the accuracy of MT, as well as to increase the data size in order to further investigate the role of pre-/post-processing rules. As a native sentiment classifier for German is currently being made available within the consortium, we plan to benchmark it against the MT-based approaches.


References:
[1] Social Sentiment Index – https://ssix-project.eu/

[2] Geofluent (Lionbridge Inc.) – http://www.lionbridge.com/geofluent/

[3] Source code for calculating the BLEU score – http://www.nltk.org/_modules/nltk/translate/bleu_score.html


This blog post was written by SSIX partner Lionbridge.
For the latest updates, like us on Facebook, follow us on Twitter, and join us on LinkedIn.