
Machine Translation: the Right Expectations, the Right Priorities

Machine Translation today is a real productivity service. The performance data MT services deliver has long made this conclusion obvious, and many organizations with the right characteristics have now decided to adopt the technology to support their workflows, cut costs and save resources.

Will the rollout of a Neural Machine Translation solution lead you into the space age? Oh, well – yes it will. But it is important to manage our expectations.

So what should you expect from introducing the technology into your workflows? It all depends on how you set out to change your world: start small and expect little; start large and get more results early. Adjust your expectations to how much ammunition you have.

Machine Translation and the right rollout conditions

“It is always important not to expect very radical improvements in cost and throughput when starting out on small projects,” warns Globalese CEO Gábor Bessenyei in an interview he recently gave to the TAUS blog together with Crosslang’s Luc Meertens. Small projects will not produce giant results. Gábor cited an exemplary rollout by a Turkish company: “This company managed to double productivity to 5,000 or 6,000 words a day, about twice the rate of the human-only process. But we should be very careful not to believe or spread stories about ten-fold productivity figures from MT projects.” More often than not, however, end clients and LSPs alike expect too much from an MT deployment. Gábor is certain that the issue is not only a poor understanding of the technology or inflated expectations based on very limited corpora: the problem often lies in overestimating the benefits of an MT deployment.

Win some, lose some: take good care of your ecosystem

“End clients think they can save a lot of money, while at the same time they don’t have the right compensation package in place to pay either their LSPs or their translators. It’s very important that there is an ecosystem process in place whereby everyone can see how they benefit from automation. For example, translators ought to be able to see a benefit from the introduction of MT, such as applying a lower word rate but being able to work much faster. Rolling out a new pricing model should be done very carefully.” Pricing schemes and compensation packages are an absolute must. As Andrew Joscelyne, the author of the TAUS article, puts it: “The vision of transparency that Gabor pictures is real. We can track translation throughput on a real-time basis and share the reporting with translators and clients. We would whole-heartedly agree that this type of business intelligence is part-and-parcel of the paradigm shift that NMT is taking the industry through these days.” Gábor believes the technological change driven by MT is happening fast. He thinks, however, that while the technology may render traditional generic translation obsolete in many contexts (e.g. travel guides, menus, other B2C content), it will also create ample opportunities, especially in the areas of content management and quality assurance. Read the full article on TAUS here.

Meet us at GALA Munich!

We will be at GALA Munich – meet us between the 25th and the 27th: check in below and we will be waiting for you at our booth at the designated time!

Breaking the terminology barrier in Neural Machine Translation

One challenge Neural Machine Translation technology faces today stems from the very same thing that makes it so amazing and effective. Let's see how Globalese solves the Terminology Problem with the help of AIDA.
The end of the second act of the opera Aida in the Verona Arena in July 2011. – AIDA, Automated In-Domain Adaptation is probably not as grandiose, but probably similarly spectacular for terminology-savvy users of Neural Machine Translation. Photo by Jakub Hałun, CC BY-SA 4.0

Neural Machine Translation was an amazing breakthrough from many points of view. It has improved the overall quality of machine translations compared to pre-neural times. It has provided, for the first time, truly usable and sound quality output for the language industry. It has also opened up opportunities for languages like Japanese, Chinese or Russian, which performed poorly with Statistical MT technology.

The downside of the Neural Machine Translation revolution: terminology

As with every groundbreaking invention, NMT technology also has its limitations. One of the major issues with neural MT is handling terminology. This challenge stems from the very thing that makes NMT so exciting. With statistical MT technology, users could provide a terminology list that the MT system could safely rely on during translation; in the NMT world, it is not directly possible to provide a master terminology for the translation process. Technically, you can, of course, add a glossary to an engine as part of the training corpora, but it will not act the way you would expect: it will not prioritize the translations in the glossary over the content in the rest of the training data. In NMT technology, there is currently no way to influence the translation of terminology directly during the machine translation process.


That doesn’t mean that developers haven’t made attempts to solve this issue. One of the solutions we have seen from many MT providers is to implement glossary-based terminology replacement after the machine translation phase. While it certainly sounds promising, the results are unfortunately not always that encouraging. The problem is that you run a considerable risk of losing grammatical information during the replacement process. Just imagine the problems a changed word gender can cause in German. In the better cases, you will have to spend many hours of editing to fish out the problematic bits. In other cases, you end up with output of limited usability that leaves you, your clients and your translators disappointed.
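To make the gender problem concrete, here is a minimal sketch of a naive post-translation replacement step. The function name `apply_glossary`, the sample sentence and the replacement table are hypothetical illustrations, not the implementation of any particular MT provider.

```python
import re

def apply_glossary(mt_output: str, replacements: dict[str, str]) -> str:
    """Naively swap each unwanted target term for the preferred glossary term.

    Hypothetical helper: it replaces whole words only and knows nothing
    about gender, case or number of the words it touches.
    """
    for unwanted, preferred in replacements.items():
        mt_output = re.sub(rf"\b{re.escape(unwanted)}\b", preferred, mt_output)
    return mt_output

# Assumed MT output: "Bildschirm" is masculine, so the article is "Der".
mt_output = "Der Bildschirm ist defekt."

# Assumed glossary preference: use the feminine noun "Anzeige" instead.
replacements = {"Bildschirm": "Anzeige"}

fixed = apply_glossary(mt_output, replacements)
print(fixed)  # Der Anzeige ist defekt.
```

The term is now the one the glossary asks for, but the article still agrees with the old masculine noun. The correct sentence would be "Die Anzeige ist defekt.", and a plain string replacement cannot produce it.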

Introducing automated in-domain adaptation (AIDA)

Globalese is answering this challenge by introducing its proprietary technology, automated in-domain adaptation. So what is this all about? With automated in-domain adaptation, you as a Globalese user can mark content from the training data of an engine as the most important in-domain content. For example, if a user has a Translation Memory (TM) of medical device documentation, it can be marked as the master TM. Globalese will analyze the content of the master TM(s) and extend the engine only with similar and related training data from the auxiliary TMs. Additionally, the engine will be tuned based on the master TM. The result is a highly customized engine focusing on the content of the master TM.
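The selection step can be pictured with a toy sketch. AIDA itself is proprietary and its actual method is not described here, so everything below is an assumption made for illustration: the function names, the vocabulary-overlap measure, the 0.5 threshold and the sample segments. The sketch also looks only at source-side text and leaves out the tuning step entirely.

```python
import re

def tokens(text: str) -> set[str]:
    """Lowercased word set of a segment."""
    return set(re.findall(r"\w+", text.lower()))

def select_related(master: list[str], auxiliary: list[str],
                   threshold: float = 0.5) -> list[str]:
    """Keep auxiliary segments whose words mostly occur in the master TM.

    Hypothetical stand-in for 'similar and related' data selection:
    the score is the share of a segment's words found in the master vocabulary.
    """
    master_vocab: set[str] = set()
    for segment in master:
        master_vocab |= tokens(segment)

    selected = []
    for segment in auxiliary:
        segment_tokens = tokens(segment)
        if not segment_tokens:
            continue  # skip empty segments
        overlap = len(segment_tokens & master_vocab) / len(segment_tokens)
        if overlap >= threshold:
            selected.append(segment)
    return selected

# Assumed sample data: a medical-device master TM and a mixed auxiliary TM.
master_tm = [
    "Insert the catheter into the infusion pump.",
    "Sterilize the infusion pump before use.",
]
auxiliary_tm = [
    "Replace the infusion pump battery before use.",
    "Book your hotel room before the summer season.",
]

related = select_related(master_tm, auxiliary_tm)
print(related)  # ['Replace the infusion pump battery before use.']
```

Only the auxiliary segment that shares most of its vocabulary with the master TM is kept; the travel sentence is left out of the training data. A production system would presumably use a far more robust similarity measure than word overlap.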

Maxing out terminological accuracy and keeping quality

The result of this process is an engine in which the wording and the style of the master TM take priority over the rest of the training data, even if there are competing terms. This way, you can reach a maximum level of terminology accuracy without losing grammatical information or decreasing the overall language quality. Naturally, the cleaner and more up to date your master TM is in the relevant topic or domain, the better the overall quality will be. This Globalese solution to the terminology barrier of Neural MT technology paves the way to even better optimized workflows: content owners and Language Service Providers can save considerable time and resources when post-editing output.

Join us for a coffee in Munich!