Related Projects

SMART is closely related to another research project carried out at the Centre for Translation Studies, University of Surrey: MATRIC (Machine Translation and Respeaking in Interlingual Communication).

About MATRIC

MATRIC (Machine Translation and Respeaking in Interlingual Communication) was a project carried out at the Centre for Translation Studies, University of Surrey, from 2020 to 2022. MATRIC explored an emerging, hybrid, semi-automated workflow that combines intralingual respeaking and machine translation to produce live speech-to-text output from one language to another.
The project focused on one key aspect of performance quality, namely accuracy. The content of the subtitles produced via MATRIC’s experimental workflow (intralingual respeaking plus machine translation) was compared with a benchmark: the output produced by highly skilled human interpreters with EU accreditation.

Rationale for MATRIC

Both automatic speech recognition (ASR) and machine translation (MT) technologies have made huge advances, but both still face challenges.
The range and variety of possible set-ups and stakeholders in live (intra- and interlingual) communication make it difficult for ASR solutions to generalize across use cases.
ASR-related challenges include speaker variability, imperfect delivery, segmentation and punctuation of output, as well as ambient noise and environment variability.
MT-related challenges include poor out-of-domain performance, problems with rare word translation or long sentence translation, terminological consistency, and relative unpredictability of output in advanced neural MT models.
These and other challenges mean that full automation in live interlingual communication still carries many risks, and multimodal communication is understudied.
We need to validate different interlingual workflows empirically and see where they perform best (e.g., parallel research on interlingual respeaking, see SMART project).

MATRIC’s method

Source input: three authentic EU speeches (from the European Parliament’s events) interpreted from English into four target languages (IT, ES, PL, FR).
Target outputs:
Benchmark workflow: transcripts of the performances of six accredited EU interpreters.
Experimental workflow: output from our semi-automated workflow, also in text format, produced by feeding live English subtitles created by a professional respeaker into eTranslation (the EU's official machine translation system).
The accuracy of both sets of texts was evaluated with the NTR model (Romero-Fresco and Pöchhacker 2017), originally developed for accuracy assessment in interlingual respeaking (derived from the NER model for intralingual respeaking, Romero-Fresco 2011). The same evaluation model was used in SMART.
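
As an illustration of the kind of accuracy score the NTR model yields, the sketch below computes a rate of the form (N - T - R) / N x 100, where N is the number of words in the output, T the weighted translation errors and R the weighted recognition errors. This is a minimal sketch and not the project's own tooling: the function name, the severity weights (0.25 for minor, 0.5 for major, 1.0 for critical, as the model is commonly described) and the sample error lists are all assumptions made for illustration.

```python
# Hypothetical helper for an NTR-style accuracy rate; not MATRIC's own code.
# Assumed severity weights (as the NTR model is commonly described):
SEVERITY_WEIGHTS = {"minor": 0.25, "major": 0.5, "critical": 1.0}


def ntr_accuracy(n_words, translation_errors, recognition_errors):
    """Return (N - T - R) / N * 100 for one evaluated text.

    n_words: number of words in the target output (N).
    translation_errors / recognition_errors: lists of severity labels,
    one label per error found by the evaluator.
    """
    if n_words <= 0:
        raise ValueError("n_words must be positive")
    t = sum(SEVERITY_WEIGHTS[s] for s in translation_errors)
    r = sum(SEVERITY_WEIGHTS[s] for s in recognition_errors)
    return (n_words - t - r) / n_words * 100


# Invented example data: a 500-word output with four scored errors.
score = ntr_accuracy(500, ["minor", "major"], ["minor", "critical"])
print(round(score, 2))
```

Scoring the benchmark and experimental outputs with the same function would make the two workflows directly comparable, which is the comparison the method above describes.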

MATRIC’s findings

The experimental workflow is capable of generating outputs that are (very) close in terms of accuracy and completeness to the outputs produced in the benchmark workflow.
Many errors identified in our data are due to the MT component; even small changes in MT may therefore yield better results for the semi-automated workflow.
Output from other and pre-trained MT engines needs to be investigated in further research.
Results cannot be generalized as of now: we used source texts from a very specific environment (EuroParl), in a small-scale experiment. Replication on a larger body of data is necessary to better understand the variation.

Contact: e.davitti@surrey.ac.uk, t.korybski@surrey.ac.uk

MATRIC PAPER: https://aclanthology.org/2022.lrec-1.468/

MATRIC POSTER (key findings):
