Since the introduction of core and auxiliary corpora in version 3.1, we have seen both more and less successful engines trained in Globalese. The successful ones usually have ample and well-maintained core corpora (which we renamed ‘master’ in version 3.5 to resonate more with CAT tool users), have plenty of auxiliary corpora to use as the foundation, and are […]
In the past, LSPs and content owners with a need for MT would often struggle when building engines, because they wouldn’t have the required volumes of specific corpora to train successful engines. To tackle this, Globalese 3.1 introduces the concept of core and auxiliary corpora.
The small corpus struggle
To train a working MT engine, a training corpus of less than 100,000 […]
Not all language pairs are created equal. Anyone who has experience with Statistical Machine Translation (SMT) knows it is always easier to get good results from an English-to-Spanish engine than from, say, a French-to-Japanese one.
The concept of composite engines makes its debut in Globalese 2.0. Every Globalese engine now includes a phrase-based and a hierarchical part. These reflect two […]
You justify your investment by the return on costs. However, in the long run, your savings should come from streamlining and not from pressuring the members of your ecosystem who depend on you.
Straining the members of your organization may work for a while, but in the long run you will lose. You will lose your best translators, editors, proofreaders and […]
Machine Translation (MT) is increasingly becoming part of the standard translation workflow. However, to use MT as a productivity tool for increasing the profitability of projects and decreasing delivery time, it is essential to use high-quality MT engines. This post summarizes the most important factors that influence MT engine quality, focusing on Statistical Machine […]