Description
Contribute to Helsinki-NLP/Tatoeba-Challenge development by creating an account on GitHub.
Summary
- This package provides data sets for machine translation in many languages, with test data taken from Tatoeba.
- Naturally, the training data do not include Tatoeba sentences, and the popular WMT test sets are also excluded to allow a fair comparison with other models that use those data sets.
- We will also publish (reasonable) models that can be reused and deployed through OPUS-MT; they will be linked from the model subdirectory of this GitHub repository.
- However, both sets can contain identical source sentences or identical target sentences, as long as these are not linked to the same translations.
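
The last point distinguishes overlap at the level of full sentence pairs from overlap on one side only. A minimal sketch of how such a check might look, assuming each set is available as a list of `(source, target)` string pairs; the function name `overlap_report` and the toy sentences are hypothetical and not part of the package:

```python
from typing import Dict, Iterable, Set, Tuple

Pair = Tuple[str, str]  # (source sentence, target sentence)


def overlap_report(set_a: Iterable[Pair], set_b: Iterable[Pair]) -> Dict[str, Set]:
    """Report which full pairs, source sentences, and target sentences two sets share.

    Hypothetical helper for illustration only; not part of the Tatoeba-Challenge tooling.
    """
    a, b = set(set_a), set(set_b)
    return {
        # Identical (source, target) pairs present in both sets.
        "pairs": a & b,
        # Source sentences present in both sets, regardless of translation.
        "sources": {src for src, _ in a} & {src for src, _ in b},
        # Target sentences present in both sets, regardless of source.
        "targets": {tgt for _, tgt in a} & {tgt for _, tgt in b},
    }


# Toy data (invented for this example): no pair is shared,
# but one source sentence and one target sentence are.
set_a = [("Hello.", "Hallo."), ("Thank you.", "Danke.")]
set_b = [("Hello.", "Guten Tag."), ("Thanks.", "Danke.")]

report = overlap_report(set_a, set_b)
print(report)
```

In this toy case the two sets are disjoint as sentence pairs, yet they share the source sentence "Hello." and the target sentence "Danke.", which is the situation the bullet describes.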