Pangeanic heads a consortium with leading machine translation and NLP companies KantanMT and Tilde to create the largest-ever neural machine translation engine farm, translating between all European languages.
The 3 technical partners and the State Secretariat for Digital Advancement (SEAD) met in Valencia on 27th-28th August to define the strategies for a successful delivery of the NTEU objectives, namely:
- Parallel data mining into 503 language combinations
- Build direct combinations for all language pairs in the EU except English.
- Collect training data:
– 15M segments 1-1 resourced languages
– 10-12M segments under-resourced languages
– 10M segments ultra-under-resourced languages
- Engage PAs: deployment and awareness campaigns (together with ELRC action).
- Deposit engines and deliverables for use by PAs under ELRC-Share lock.
You can download the agenda here
Picture of some members that participated in NTEU’s kickoff meeting in Valencia