• It would be wonderful if real-time interpretation could be done by machines, but it seems challenging to start translating before a sentence is complete.

The article “Learning to Translate in Real-time with Neural Machine Translation” examines whether this task is feasible. Is it really impossible?

  • Relying only on the utterance being translated seems limiting.
    • When the source and target languages have different word orders, translation often cannot start until the sentence is complete.
  • It would be easier if there were a model that could predict what the speaker is likely to say, based on the conversation up to that point.
    • If it can predict to some extent what has not been said yet, it would be possible to anticipate and translate ahead.
    • Studies of human interpreters’ behavior have been conducted on this.
    • Machines are also doing it.
    • The SeamlessStreaming model enables streaming speech-to-speech and speech-to-text translation with less than 2 seconds of latency. It decides when to output the next translated segment based on the context it has learned, adapting to different language structures. This learned read/write policy improves performance across various language pairs.
  • Exciting news! The model has also been publicly released.
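
To make the idea of a read/write policy concrete, here is a minimal sketch in Python. It uses a fixed "wait-k" rule (stay at most k source tokens ahead of the output), which is a simpler stand-in and not the learned policy that SeamlessStreaming uses. The names `simultaneous_decode` and `toy_translate_next`, the tiny lexicon, and the word-for-word translation are all assumptions made for illustration.

```python
from typing import Callable, Iterable, List, Optional, Tuple

Action = Tuple[str, str]  # ("READ", source_token) or ("WRITE", target_token)


def simultaneous_decode(
    source_tokens: Iterable[str],
    translate_next: Callable[[List[str], List[str]], Optional[str]],
    k: int = 2,
) -> List[Action]:
    """Interleave READ and WRITE actions using a fixed wait-k rule.

    The decoder reads until it is k source tokens ahead of its output,
    then alternates writing and reading. Once the source is exhausted,
    it writes until translate_next returns None.
    """
    if k < 1:
        raise ValueError("k must be at least 1")

    read: List[str] = []
    written: List[str] = []
    actions: List[Action] = []
    src_iter = iter(source_tokens)
    exhausted = False

    while True:
        # READ while the lag is below k and the speaker has more to say.
        if not exhausted and len(read) - len(written) < k:
            token = next(src_iter, None)
            if token is None:
                exhausted = True
            else:
                read.append(token)
                actions.append(("READ", token))
            continue

        # Otherwise WRITE the next target token, if there is one.
        target = translate_next(read, written)
        if target is None:
            break
        written.append(target)
        actions.append(("WRITE", target))

    return actions


# Hypothetical toy translator: word-for-word lookup, same word order.
TOY_LEXICON = {"ich": "I", "sehe": "see", "dich": "you"}


def toy_translate_next(read: List[str], written: List[str]) -> Optional[str]:
    """Return the translation of the next untranslated source token."""
    if len(written) >= len(read):
        return None
    source_token = read[len(written)]
    return TOY_LEXICON.get(source_token, source_token)


if __name__ == "__main__":
    for action in simultaneous_decode(["ich", "sehe", "dich"], toy_translate_next, k=2):
        print(action)
```

With a fixed rule like this, k trades latency against context: a small k outputs sooner but sees less of the sentence, which is exactly where differing word order hurts. A learned policy replaces the fixed lag with a per-step decision about whether to read more or write now.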