Neural Machine Translation (NMT): A Linguistic Revolution
In 2017, when neural machine translation (NMT) was ready to be integrated into our production processes, I tasked my R&D team with investigating the possibilities right away. I used Systran’s new NMT engine to translate our company presentation, and I was amazed at the results.
The presentation in question was a complete overview of our service offerings—59 pages of French text—that had been edited three separate times to ensure top-notch quality. (Thanks, Faten, Boris, and Laurence!)
Several days later, just as I was putting the finishing touches to the presentation as part of an RFP bid, I found out that the prospective client (a major French manufacturer) wanted the bid in English. I had just one day to deliver 59 pages of pristine English content!
Let me give you some background to explain why the CEO of a translation company would decide to entrust one of their most important business documents for one of their biggest RFP bids to neural machine translation.
Since 2007, Lexcelera has been listening and responding to our customers’ demands for machine translation. They might have any number of reasons for wanting it: to help their clients understand the contents of a patent or technical document; to translate internal reports; to automatically localize website content; or for the first draft of a translation that our team would then post-edit until it was of irreproachable quality.
Since then, we’ve been closely following every technological evolution in the field of machine translation.
Let me begin with a few words about neural machine translation (NMT).
What is Neural Machine Translation (NMT)?
Neural machine translation is still a young technology: the first articles that talked about this new phenomenon appeared in 2013.
While its MT predecessors relied on grammatical rules or statistical probability, NMT uses deep-learning technologies, a branch of artificial intelligence, such as recurrent neural networks (RNNs). These translation engines can take the context (the text before and after the segment to be translated) into consideration before they process a given phrase. They are also capable of learning and improving as they are trained on more and better-quality content.
A large number of tech companies, including the Big Four, have turned their attention to NMT. These industry-leading corporations have developed their own neural machine translation solutions.
Research breakthroughs in artificial neural networks are also facilitating impressive advances in the fields of speech synthesis and facial recognition.
The Players and Their Solutions
Microsoft Translator, Systran’s Pure Neural, and Google Translate are among the solutions currently on the market.
Other, less well-known vendors have also achieved impressive results, even going so far as to steal the thunder from giants such as Google. Such is the case with DeepL, as well as SDL Translate and Iconic, all of which have claimed a piece of the NMT market pie.
Of particular note is Systran, one of the pioneers of machine translation, which shares its NMT-related research with an open-source development community with the goal of advancing the field.
The Benefits of Neural Machine Translation (NMT)
The benefits to our clients are, of course, significant cost savings and an undeniable increase in speed. MT also enables the translation of content that, because of a lack of time or money, would never have otherwise been translated. And when the MT engine has been well trained, the productivity gains are considerable.
Thus far, machine translation has been reserved for simple or technical content, rather than marketing materials or content that requires a quality editorial voice or writing style. Our company presentation definitely fell into the marketing category. But I was desperate. I needed to meet the deadline for our RFP bid, and so I needed a translation ASAP.
My Experience with Lexcelera’s NMT Beta Test
With machine translation, there is usually a lot to correct—unless you have trained the engine very, very well. Training good MT engines is our specialty. But the Systran NMT engine was a completely untrained, generic engine. Even so, its translation of our company presentation was amazing.
We needed to make corrections, of course. But I was surprised at how few errors there were. Mostly, the Systran NMT engine seemed to understand what we wanted to say and translated it in an extremely fluent fashion. The majority of the time, the terminology was spot on. And the sentences? Well, they sounded human.
The translation did go off the rails in a few places. (I wonder how NMT would translate “off the rails”!) Once in a while, it would leave a French word in the middle of an English sentence. Or, weirdly, it would repeat a word. And about every three to four sentences, there would be a glaring error that I had to correct. Even so, I was stunned to find that I could leave whole sentences intact.
Most of the errors I spotted would be easily fixable, like the duplicates and the terms the system didn’t know. (Ironically, Systran’s NMT system did not recognize the term “post-édition” in French, and it mistranslated “relecture” as “re-reading.”)
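To give an idea of why these errors are easy to catch, here is a minimal Python sketch of two automated checks a post-editor might run on raw MT output: one for words repeated back to back, one for source-language words left in the target text. This is an illustration only, not Lexcelera’s or Systran’s tooling. The function names, the sample sentence, and the small set of French words are all hypothetical.

```python
import re

# Matches words made of letters, allowing internal hyphens or apostrophes
# (so "post-édition" is treated as a single word).
WORD_PATTERN = re.compile(r"[^\W\d_]+(?:[-'’][^\W\d_]+)*")


def tokenize(text):
    """Split text into lowercase words."""
    return WORD_PATTERN.findall(text.lower())


def find_repeated_words(text):
    """Return every word that appears twice in a row, e.g. 'the the'."""
    words = tokenize(text)
    return [first for first, second in zip(words, words[1:]) if first == second]


def find_untranslated_words(text, source_words):
    """Return words in the MT output that are on a list of source-language words."""
    return [word for word in tokenize(text) if word in source_words]


# Hypothetical MT output and hypothetical list of French words to watch for.
mt_output = "Our post-édition team checks the the final text."
french_words = {"post-édition", "relecture"}

repeated = find_repeated_words(mt_output)
untranslated = find_untranslated_words(mt_output, french_words)
print(repeated)       # ['the']
print(untranslated)   # ['post-édition']
```

Checks like these only flag candidates. Legitimate repetitions (“had had”) would be flagged too, so a human still makes the final call.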
Today, after three years of further development, Systran’s Pure Neural tool can analyze the semantic context of the content it is translating and produce even better results. The progress that has been made in such a short time is truly flabbergasting.
Towards New Paradigms for Neural Machine Translation
Some of the first NMT engines could translate sequences of several words at a time and compare their internal consistency. These engines were built on convolutional neural networks, or CNNs.
Later engines adopted recurrent neural network (RNN) models, and new NMT engines are capable of analyzing the context of an entire document or documents to find the best translation of a given sentence or segment of text. Advances such as these have eliminated most of the early errors and created translations that feel both fluent and natural, a far cry from the initial babble produced by early machine translation.
Such advances in technology inevitably raise the question: what is the role of the human being in the translation process now?
What Role Do Humans Play in an Era of Neural Machine Translation?
Lexcelera’s computational linguists still have and will continue to have a role to play in the training of NMT engines. There are always product names and internal terminology to respect and protect and colloquialisms to understand—a whole customer “language” that the machine constantly has to learn from the databases our linguists prepare for it.
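As a rough illustration of what “respecting” customer terminology can mean in practice, the Python sketch below checks MT output against a bilingual glossary: if a source term appears in the source sentence, the approved target term should appear in the translation. It is a simplified stand-in, not the way any particular engine is trained. The function name, the glossary entries, and the sample sentences are hypothetical.

```python
def check_glossary(source, target, glossary):
    """Return (source_term, expected_target_term) pairs for every glossary
    term found in the source whose approved translation is missing
    from the target. Matching is a simple case-insensitive substring test."""
    source_lower = source.lower()
    target_lower = target.lower()
    return [
        (source_term, target_term)
        for source_term, target_term in glossary.items()
        if source_term.lower() in source_lower
        and target_term.lower() not in target_lower
    ]


# Hypothetical customer glossary (French -> approved English term).
glossary = {
    "post-édition": "post-editing",
    "relecture": "proofreading",
}

source_sentence = "La relecture et la post-édition sont incluses."
mt_sentence = "Re-reading and post-editing are included."

violations = check_glossary(source_sentence, mt_sentence, glossary)
print(violations)  # [('relecture', 'proofreading')]
```

Substring matching is deliberately naive here. A real check would have to cope with inflected forms, plurals, and terms that are legitimately translated differently depending on context, which is exactly where the linguists come in.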
Unlike neural machine translation engines, human beings also make different choices based on the cultural and emotional values they bring to a text. A machine doesn’t know how to stray from the source text, meaning that disciplines such as transcreation (a freer, more creative, and more idiomatic form of translation into a target language) remain outside an NMT engine’s bailiwick for now.
Most importantly, Lexcelera’s post-editors will always need to proofread every word to ensure texts are free of errors and omissions. (I still believe this to be true, despite the major advances that have been made.) They will even have to be more perceptive than ever, since output errors are much more subtle and more difficult to detect now. Yet even if NMT remains imperfect, it has gone from being a tool that provided a general understanding of a text to one that, well, creates a proper translation. Not a perfect one, but a damn good one all the same. This means that the post-editing process is more efficient, and that human post-editors can work on more content in less time. Two years after implementing NMT at Lexcelera, we’ve seen a 60% increase in productivity when compared to previously existing technologies, and a 230% increase when compared to human translation alone.
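To put those two percentages side by side, here is a short back-of-the-envelope calculation in Python. It assumes that “productivity” means throughput (content handled per unit of time) and that both figures describe the same NMT-plus-post-editing workflow. Those are assumptions about how the numbers were measured, and the function name is hypothetical.

```python
def throughput_ratio(increase_percent):
    """Convert a percentage increase into a throughput multiplier,
    e.g. a 60% increase means 1.6 times the baseline."""
    return 1 + increase_percent / 100


# Figures quoted in the text above.
nmt_vs_human = throughput_ratio(230)     # NMT + post-editing vs. human translation alone
nmt_vs_previous = throughput_ratio(60)   # NMT + post-editing vs. previous technologies

# Implied throughput of the previous technologies relative to human translation alone.
previous_vs_human = nmt_vs_human / nmt_vs_previous

# Share of the original time needed for the same volume with NMT + post-editing.
time_share_vs_human = 1 / nmt_vs_human

print(f"NMT vs. human alone: {nmt_vs_human:.2f}x")
print(f"NMT vs. previous technologies: {nmt_vs_previous:.2f}x")
print(f"Previous technologies vs. human alone: {previous_vs_human:.2f}x")
print(f"Time needed vs. human alone: {time_share_vs_human:.0%}")
```

Under these assumptions, the earlier technologies had already roughly doubled throughput compared with human translation alone, and NMT with post-editing handles the same volume in about 30% of the time.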
The future starts now. Lexcelera’s challenge, like the challenge of every language-loving, multicultural and multilingual body of professionals (in other words, translators), will be to maintain our relevance in a world where tools are improving almost daily, particularly with the latest advancements in artificial intelligence and deep learning.
Lexcelera is already working with this new future in mind by ensuring we remain at the forefront of our field. We know that MT, especially with the advent of NMT, is here to stay. That’s why we have been spending around 7% of our revenue on R&D: to make sure we stay ahead.
And the investment has paid off. We’ve been working with language technologies long enough to master them. We know how to customize them, how to adapt them, how to improve them. But this is not just computing work. We train MT engines, good ones, by relying on human talent. Professional translators, post-editors, and computational linguists will always have a home at Lexcelera.
Our field is in constant evolution, but this isn’t necessarily bad news. It’s up to us to adapt to these new methods and to carve out a niche for ourselves in the areas where NMT cannot go. And let’s take advantage of the enormous benefits these advancements offer in terms of the efficiency with which we can handle ever-increasing amounts of content. Three years after we beta tested Systran’s NMT tool, neural machine translation has yet to totally transform our industry. What it does offer us is a new tool that opens up a world of possibilities for those who dare take the leap.
Stay tuned!
Lori Thicke