Automatic Transcription and Translation of MOOC content

by Dr. Jorge Civera, Associate Professor of computer science at UPV, Spain

One of the most appealing services of the EMMA platform is the possibility of having your course materials automatically translated into other European languages, opening your course up to a wider student community. For audiovisual materials such as videos, subtitles are automatically generated in multiple languages. At the current stage of the project, transcription systems that generate subtitles from videos in English, Spanish, Italian, Dutch and Portuguese have been developed. Translation systems have also been deployed for the following language pairs: from Italian, Spanish, Dutch and Portuguese into English, and from Spanish into Catalan. The translation systems will be used to translate both the subtitles and the textual content of MOOCs from the original language.

The Automatic Speech Recognition (ASR) technology behind the EMMA transcription systems was developed in the European project transLectures (translectures.eu), coordinated by the Universitat Politècnica de València. This technology is based on state-of-the-art ASR techniques that combine deep neural networks for acoustic modelling with n-gram models for language modelling. The translation systems are built on the publicly available Moses translation toolkit, which implements a state-of-the-art phrase-based translation system.

In addition, lecturers and students may be able to review the automatic transcriptions and translations of their MOOCs. These reviews improve not only the current transcriptions and translations but also future ones, since the reviewed materials are used to update the transcription and translation systems. As the project progresses, MOOCs from other European universities will be uploaded, bringing in new languages such as French and Estonian, which will be automatically transcribed and translated into English.
Moreover, an English-into-Italian translation system will be put in place so that Spanish MOOCs can be translated into Italian (and vice versa), using English as a pivot language that also serves for quality assurance. To find out more and to see the transLectures tools in action, visit www.translectures.eu
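
For readers curious about what translating through a pivot language looks like in practice, here is a minimal Python sketch. It is not the EMMA or transLectures implementation: the function names and the tiny word-for-word lexicons are hypothetical stand-ins for real translation systems, and returning the intermediate English text is just one possible reading of how the pivot could support quality assurance.

```python
from typing import Callable, Dict, Tuple

# A translator is modelled as any function from text to text (an assumption
# made for this sketch; real systems are far more involved).
Translator = Callable[[str], str]


def make_toy_translator(lexicon: Dict[str, str]) -> Translator:
    """Build a word-for-word translator from a lexicon (illustration only)."""
    def translate(text: str) -> str:
        # Unknown words are passed through unchanged.
        return " ".join(lexicon.get(word, word) for word in text.split())
    return translate


def translate_via_pivot(
    text: str,
    source_to_pivot: Translator,
    pivot_to_target: Translator,
) -> Tuple[str, str]:
    """Translate source -> pivot -> target and return both outputs.

    The pivot text is returned as well, so that it can be reviewed
    before or alongside the final translation.
    """
    pivot_text = source_to_pivot(text)
    target_text = pivot_to_target(pivot_text)
    return pivot_text, target_text


# Hypothetical toy lexicons standing in for Spanish->English and
# English->Italian translation systems.
spanish_to_english = make_toy_translator({"hola": "hello", "mundo": "world"})
english_to_italian = make_toy_translator({"hello": "ciao", "world": "mondo"})

english, italian = translate_via_pivot(
    "hola mundo", spanish_to_english, english_to_italian
)
print(english)  # hello world
print(italian)  # ciao mondo
```

The point of the design is that only systems into and out of English are needed, rather than a dedicated system for every language pair. The trade-off is that any error made in the first step is carried into the second.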
