Knowledge 4 All Foundation Ltd.
Jožef Stefan Institute
Universitat Politècnica de València
Rheinisch-Westfaelische Technische Hochschule Aachen
European Media Laboratory GmbH
Deluxe Media Europe
The innovative solutions being developed at transLectures are already being deployed in several educational repositories, allowing these portals to overcome language barriers and reach wider audiences while supporting linguistic diversity. We are confident that the adoption of transLectures solutions will continue at a rapid pace across many other educational repositories in Europe and worldwide.
We have fully implemented our ideas on our two case study sites: JSI-K4A's VideoLectures.NET and the UPVLC's poliMedia. For transcription, we have worked on English and Slovenian (mainly via VideoLectures.NET) and on Spanish and Catalan via poliMedia. Meanwhile, for translation we have considered the language pairs {Spanish, Slovenian} into English, English into {French, German, Slovenian, Spanish}, and the pair {Spanish, Catalan} in both directions. It is worth noting that the languages considered are official languages in the EU and participant countries.
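The translation directions listed above can be spelled out explicitly. The following is a minimal sketch that only enumerates the (source, target) pairs named in the text; the variable and function names are illustrative assumptions and are not part of any transLectures tool.

```python
# Language groupings as stated in the text above.
# All names here are hypothetical, chosen only for illustration.
TO_ENGLISH_SOURCES = ["Spanish", "Slovenian"]
FROM_ENGLISH_TARGETS = ["French", "German", "Slovenian", "Spanish"]
BIDIRECTIONAL_PAIR = ("Spanish", "Catalan")


def translation_directions():
    """Return every ordered (source, target) direction named in the text."""
    directions = [(src, "English") for src in TO_ENGLISH_SOURCES]
    directions += [("English", tgt) for tgt in FROM_ENGLISH_TARGETS]
    first, second = BIDIRECTIONAL_PAIR
    directions += [(first, second), (second, first)]
    return directions


directions = translation_directions()
for src, tgt in directions:
    print(f"{src} -> {tgt}")
```

Listing the directions as ordered pairs makes it clear that Spanish and Catalan are covered both ways, while the English pairs are one-directional except for Spanish and Slovenian, which appear in both groups.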
Recently, we have added transcription for Dutch, Italian and Portuguese within the framework of the EC-funded project EMMA, with French and Estonian also coming soon. All of these services are being tested by public and private organisations worldwide via the Try our Tools service on the project website.
Some of these toolkits are free software and can be downloaded from the links we provide.
The transLectures Platform (TLP) is an open source (Apache License 2.0) set of software tools which includes everything you need in order to integrate transLectures transcription and translation technologies into a media repository. Its main components are the transLectures Database, Ingest Service, Web Service and Player.
The transLectures-UPV toolkit (TLK) is an open source (Apache License 2.0) set of tools for automatic speech recognition (ASR) developed at Universitat Politècnica de València (UPV). Among other functionalities, it features parameter estimation of hidden Markov models (HMMs) and recognition (speech, text…).
TLK is the software behind the implementation of the transLectures automatic transcription system in UPV’s poliMedia video lecture repository. It is being actively developed as the transLectures project progresses, and new versions will be released with improved features and usability.
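The TLK description mentions parameter estimation of hidden Markov models. As background only, the sketch below shows the forward algorithm for a small discrete HMM, which computes the likelihood of an observation sequence and is the quantity that HMM training procedures typically build on. It does not use or reproduce TLK's interface; the function name and the toy probabilities are assumptions made for illustration.

```python
def forward_likelihood(init, trans, emit, observations):
    """Likelihood of an observation sequence under a discrete HMM.

    init[s]      : probability of starting in state s
    trans[p][s]  : probability of moving from state p to state s
    emit[s][o]   : probability of emitting symbol o in state s
    """
    n_states = len(init)
    # alpha[s] = P(observations so far, current state = s)
    alpha = [init[s] * emit[s][observations[0]] for s in range(n_states)]
    for obs in observations[1:]:
        alpha = [
            emit[s][obs] * sum(alpha[p] * trans[p][s] for p in range(n_states))
            for s in range(n_states)
        ]
    return sum(alpha)


# Toy two-state model with two output symbols (values are made up).
init = [0.6, 0.4]
trans = [[0.7, 0.3], [0.4, 0.6]]
emit = [[0.9, 0.1], [0.2, 0.8]]

likelihood = forward_likelihood(init, trans, emit, [0, 1, 0])
print(likelihood)
```

A real ASR toolkit works with continuous acoustic features and log-domain arithmetic to avoid numerical underflow; the plain-probability version here is kept only because it is easier to read.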
The transLectures Matterhorn Plug-in provides a transLectures Matterhorn Service and a transLectures Matterhorn Custom Workflow in order to integrate the transLectures Platform tools into the Opencast Matterhorn platform. It has been developed and tested for Opencast Matterhorn 1.4.0, but it can easily be extended to support other versions.
RASR (short for “RWTH ASR”) is a software package containing a speech recognition decoder together with tools for developing acoustic models for use in speech recognition systems. It has been developed by the Human Language Technology and Pattern Recognition Group at RWTH Aachen University since 2001. Speech recognition systems developed using this framework have been applied successfully in several international research projects and corresponding evaluations.
The software rwthlm supports different kinds of neural network layers (feedforward, standard recurrent, and long short-term memory neural networks), arbitrarily deep networks, and arbitrary combinations of these kinds of layers. It has been developed at the Human Language Technology and Pattern Recognition Group at RWTH Aachen University since 2013. The software has been successfully used in international evaluations, giving substantial improvements in speech recognition as well as machine translation applications.
Jane is RWTH's open source statistical machine translation toolkit. Jane supports state-of-the-art techniques for phrase-based and hierarchical phrase-based machine translation. Many advanced features are implemented in the toolkit, such as forced alignment phrase training for the phrase-based model and several syntactic extensions for the hierarchical model. RWTH has been developing Jane for several years, and it has been used successfully in numerous machine translation evaluations.
EML Transcription Platform
For the large-scale adaptation of both acoustic models and stochastic language models, European Media Laboratory (EML) uses its own set of tools, which are integrated into the EML Transcription Platform, a web-service based framework for the creation and adaptation of language components (acoustic model, language model) and their deployment in 24×7 usage scenarios.
The XEROX TunaTon Toolkit
The XEROX TunaTon Toolkit is a facility for training a large number of translation model variations in parallel, in order to compare their performance for a particular application domain.