To create an engine:
- Go to Engines.
- Click Create new.
- Specify the name, the languages and the group the new engine will belong to. Unlike corpora, an engine can belong to only one group.
- Based on the above data, Globalese will display a list of corpora to choose from.
- Select the corpora you want to include in the engine.
A good rule of thumb is to collect at least 100,000 segment pairs for the engine. If the volume is below that, try adding other resources. Even if they are not 100% relevant to your engine, they can still improve the overall language quality.
- Optionally, mark corpora as core. Core corpora play an elevated role during training.
If you mark a corpus as core, Globalese will use it as a reference when training the engine. The training process will use segment pairs from the auxiliary (non-core) corpora that are from the same domain as the core corpus, and discard those that are not.
If you mark all corpora as core, Globalese will simply ignore the core marking.
- Click Save.
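
The checks described in the steps above can be sketched in Python. This is only an illustration and not part of Globalese: the `Corpus` class, the `review_selection` function and the corpus names are hypothetical, and the domain filtering that Globalese performs during training is not modelled here. The sketch only adds up segment pairs against the 100,000 rule of thumb and reports whether the core marking would have any effect.

```python
from dataclasses import dataclass

# Rule of thumb from the steps above: at least 100,000 segment pairs.
MIN_SEGMENT_PAIRS = 100_000


@dataclass
class Corpus:
    """Hypothetical stand-in for a corpus selected for an engine."""
    name: str
    segment_pairs: int
    core: bool = False


def review_selection(corpora):
    """Summarise a corpus selection before the engine is saved."""
    total = sum(c.segment_pairs for c in corpora)
    core_names = [c.name for c in corpora if c.core]
    # The marking only matters when some corpora are core and some are not.
    # If every corpus is core, the marking is ignored.
    core_effective = 0 < len(core_names) < len(corpora)
    return {
        "total_segment_pairs": total,
        "below_rule_of_thumb": total < MIN_SEGMENT_PAIRS,
        "core_marking_effective": core_effective,
    }


if __name__ == "__main__":
    selection = [
        Corpus("product_manuals", 60_000, core=True),
        Corpus("general_tm", 30_000),
    ]
    print(review_selection(selection))
```

In this example the selection falls short of the rule of thumb, so adding further resources would be worth considering before training.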