Breaking the terminology barrier in Neural Machine Translation
Gábor Bessenyei
[lead]One challenge Neural Machine Translation technology faces today stems from the very same thing that makes it so amazing and effective. Let's see how Globalese solves the Terminology Problem with the help of AIDA.[/lead]
Neural Machine Translation was an amazing breakthrough in many respects. It improved the overall quality of machine translation compared to pre-neural times. For the first time, it provided truly usable, sound-quality output for the language industry. It also opened up opportunities for languages such as Japanese, Chinese or Russian, which had performed poorly with Statistical MT technology.
The downside of the Neural Machine Translation revolution: terminology
As with every groundbreaking invention, NMT technology also has its limitations. One of the major issues with neural engines is handling terminology. This challenge stems from the very thing that makes NMT so exciting. With statistical MT technology, users could provide a terminology list that the MT system could safely rely on during translation. In the NMT world, it is not directly possible to provide a master terminology for the translation process. Technically, you can, of course, add a glossary to an engine as part of the training corpora, but it will not act the way you would expect: the engine will not prioritize the translations in the glossary over the content in the rest of the training data. In NMT technology, there is currently no way to influence terminology translation directly during the machine translation process.
That doesn’t mean that developers haven’t made attempts to solve this issue. One solution we have seen from many MT providers is terminology replacement based on a glossary after the machine translation phase. While it certainly sounds promising, the results are unfortunately not always encouraging. The problem is that you run a considerable risk of losing grammatical information during the replacement process. Just imagine the problems a changed noun gender can cause in German. In the better cases, you will have to spend many hours of editing to fish out the problematic bits. In other cases, you end up with output of limited usability that leaves you, your clients and your translators disappointed.
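To make the gender problem concrete, here is a minimal Python sketch of naive post-translation glossary replacement. It is an illustration only and does not represent how Globalese or any particular MT provider implements the step. The glossary entry, the gender table and the function names (`apply_glossary`, `gender_changes`) are all hypothetical, chosen just for this example.

```python
import re

# Hypothetical glossary: term the engine produced -> term the client requires.
GLOSSARY = {"Bildschirm": "Anzeige"}

# Hypothetical grammatical genders for the nouns involved
# (m = masculine, f = feminine).
GENDER = {"Bildschirm": "m", "Anzeige": "f"}


def apply_glossary(text, glossary):
    """Naively swap whole-word matches of each glossary key for its preferred term."""
    for produced, preferred in glossary.items():
        text = re.sub(rf"\b{re.escape(produced)}\b", preferred, text)
    return text


def gender_changes(glossary, gender):
    """Return the replacements whose gender differs and can therefore break agreement."""
    return [(old, new) for old, new in glossary.items()
            if gender.get(old) != gender.get(new)]


mt_output = "Der Bildschirm ist kaputt."
post_edited = apply_glossary(mt_output, GLOSSARY)

print(post_edited)                       # Der Anzeige ist kaputt.
print(gender_changes(GLOSSARY, GENDER))  # [('Bildschirm', 'Anzeige')]
```

The replacement itself succeeds, but the masculine article "Der" is left in front of a feminine noun, where the sentence needs "Die Anzeige". A plain string swap has no access to the surrounding grammar, so every such mismatch has to be found and fixed by a human editor.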