By Danielle Sánchez
Members of ATA’s Language Technology Division (LTD), in collaboration with the Professional Development Committee, welcomed Kirti Vashee, Language Technology Evangelist at Translated, to talk about Translated’s ModernMT. The online event, held in early February, had approximately 400 registrants.
In this first episode of our fourth year, Kirti presented a detailed overview of ModernMT’s adaptive machine translation system, focusing on technical insights into how ModernMT works, its security measures, and comparisons with other systems such as Google Translate and DeepL. Our discussion included practical demonstrations and a deep-dive Q&A with Kirti. Questions collected by LTD members that went unanswered before the session’s abrupt end were later answered by Kirti via email and are included in this post.
This TEKTalk was hosted by LTD Admin Bridget Hylak, Assistant Admin Blayse Hylak, and LTD Committee Member Danielle Sánchez.
Kirti started by reminding us of Translated’s LARA campaign, a short film about a magical girl who knew all languages. It was reassuring to remember that language connects people, and translation creates bridges between them:
“WHEN LANGUAGE WORKS ACROSS CULTURES, IT ENABLES COMMUNICATION; IT ENABLES CONTACT; IT ENABLES ENGAGEMENT. AND SO, LANGUAGE IS A MAGICAL, WONDERFUL THING. AND CROSSING LANGUAGES IS A MAGICAL, WONDERFUL THING.”
We asked Kirti to explain the differences between LARA and ModernMT, both of which come from Translated. He explained that ModernMT originated as part of an EU-funded project: it began as an open-source system designed to be more responsive for professional users and, over time, evolved from statistical machine translation (SMT) to neural machine translation (NMT) while remaining adaptive. His explanation of NMT was on point:
“NEURAL MACHINE TRANSLATION IS AN ENCODING AND DECODING TECHNOLOGY. SO, YOU TRAIN IT, AND ONCE YOU BUILD THE MODEL, THEN YOU USE THAT MODEL TO DO TRANSLATIONS. TRAINING IS VERY MUCH PART OF THE USE PROCESS.”
LARA, on the other hand, is Translated’s next-generation solution. It uses open-source large language models and currently supports 10 major languages, with plans to expand coverage. Fundamentally, the underlying technology is quite different: ModernMT is built for rapid, real-time adaptation, while LARA leverages a larger contextual framework; it is a fine-tuned foundation model optimized for production use.
We focused our questions on ModernMT, Translated’s established NMT solution, and he gifted us with a thorough explanation of its concepts and how it works for us translators.
“MODERNMT CAME FROM A VERY TRANSLATOR ORIENTED ENVIRONMENT. SO, IT WAS ALWAYS SEEN AS “CAN MACHINE TRANSLATION BE USEFUL TO TRANSLATORS?” AND CAN IT BE USEFUL DAILY, NOT JUST AT THE ENTERPRISE LEVEL, BUT ALSO AT AN INDIVIDUAL TRANSLATOR LEVEL.”
Notably, ModernMT is a solution focused on translators, not companies or agencies. Kirti talked about how ModernMT continuously improves its output by leveraging user-provided data, including translation memories, glossaries, and corrective feedback.
ModernMT uses your translation memory, glossaries, and corrective feedback to train on the fly. As soon as you upload your TM, the system quickly adapts itself, customizing the translation model output to your specific data. This dynamic retraining process means the system continuously improves with every correction, avoiding the need to correct the same error repeatedly. This approach makes it highly responsive for individual translators as well as for enterprise-level operations. The modus operandi of ModernMT, according to Kirti’s explanation, is “simplicity is the ultimate sophistication”:
“… AND THE BASIC UNDERLYING PRINCIPLE, IS A QUOTE FROM LEONARDO DA VINCI, ‘SIMPLICITY IS THE ULTIMATE SOPHISTICATION.’ … IT MEANS INSTANT ADAPTATION TO ANY SPECIFIC PROJECT YOU’RE WORKING ON. IT MEANS RAPID INCORPORATION OF ANY CORRECTIVE FEEDBACK… IN MODERNMT, ALL YOU REALLY DO IS ‘HERE IS MY TM,’ AND YOU POINT TO IT, AND THEN YOU START TRANSLATING, AND IT’LL REFERENCE YOUR TM AND USE THAT TM TO CUSTOMIZE THE MODEL IF IT NEEDS TO. SO, IT’S ALWAYS CONTEXT AWARE.”
We then asked Kirti what it really means for us translators to have an adaptive machine translation system versus a standard, static API like Google Translate. How fast does ModernMT adapt, and what does that mean in practical terms for a translator?
Kirti explained that adaptive MT means that ModernMT uses your own translation memory (TM) and corrective feedback to adjust the model immediately. For example, as soon as you upload your TM via the web portal, ModernMT begins training on the fly—often within milliseconds. This rapid fine-tuning means that if you correct an error, the system learns it quickly and applies that change in future translations. Essentially, it personalizes the output based on your data, which static systems like Google’s API cannot do unless you have massive datasets and follow a bulk training process with AutoML.
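The corrective-feedback loop Kirti describes can be illustrated with a toy sketch. This is purely conceptual (the class, names, and similarity heuristic are invented for illustration, not ModernMT’s actual implementation): the translator’s corrections are cached, and a sufficiently similar future segment reuses the correction instead of the base engine’s output.

```python
from difflib import SequenceMatcher

class AdaptiveMTCache:
    """Toy illustration of adaptive corrective feedback.

    Corrections are stored and reused for similar future segments;
    anything else falls back to the base MT function. This is NOT
    how ModernMT is implemented -- it is a conceptual sketch only.
    """

    def __init__(self, base_translate, threshold=0.85):
        self.base_translate = base_translate   # fallback MT function
        self.threshold = threshold             # similarity cutoff
        self.corrections = {}                  # source segment -> corrected target

    def learn(self, source, corrected_target):
        """Record a translator's correction (the 'corrective feedback')."""
        self.corrections[source] = corrected_target

    def translate(self, source):
        """Prefer a learned correction when a stored source is similar enough."""
        best, score = None, 0.0
        for src, tgt in self.corrections.items():
            s = SequenceMatcher(None, source, src).ratio()
            if s > score:
                best, score = tgt, s
        if best is not None and score >= self.threshold:
            return best
        return self.base_translate(source)

mt = AdaptiveMTCache(lambda s: f"[MT] {s}")
mt.learn("The bank approved the loan.", "El banco aprobó el préstamo.")
print(mt.translate("The bank approved the loan."))   # reuses the correction
print(mt.translate("Completely unrelated sentence."))  # falls back to base MT
```

The real system generalizes far beyond exact or near-exact matches, of course; the point of the sketch is only the shape of the loop: correct once, and similar segments downstream benefit.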
One of the standout features of ModernMT is its ability to work effectively even with smaller datasets. While many MT engines need hundreds of thousands of segments to make a noticeable impact, ModernMT can start adapting with as few as a thousand segments. As you translate, ModernMT uses recent data to re-rank and adjust its output, leading to quicker adaptation and lower error rates over time. Also, ModernMT can manage multiple TMs simultaneously. Whether you are working with several clients or different subject areas, ModernMT automatically identifies which TM is relevant to the current translation task. In Kirti’s words:
“ADAPTABILITY GENERALLY MEANS I USE MY DATA TO INFLUENCE HOW THE BASIC SYSTEM WORKS, AND MY DATA NOW DOMINATES THE BEHAVIOR OF THE SYSTEM… SO, AS YOU TRANSLATE AND YOU SEE THERE’S AN ERROR, YOU CORRECT IT. THEN, IF FIVE SENTENCES DOWN, THERE’S ANOTHER SIMILAR SENTENCE, THERE’S A LIKELIHOOD THAT IT WILL BE CORRECTED. SOMETIMES IT MAY TAKE TWO OR THREE CORRECTIONS BEFORE THE CORRECTION TAKES, BUT IT’S DESIGNED TO WORK IN REAL TIME, DYNAMICALLY, CONTINUOUSLY, TO LEARN AND IMPROVE, EVEN ON AN INDIVIDUAL BASIS.”
After Kirti’s presentation, we held a Q&A. Here, we present a combination of what happened during the webinar and the answers Kirti provided in follow-up emails to questions left unanswered online due to time constraints. Since part of the Q&A was not recorded, his answers are presented here in full, lightly edited for our readers’ deeper understanding.
MORE ABOUT MODERNMT AND COMPARISON WITH LLMs
One of our participants asked about the difference between LLMs and NMT systems like DeepL, and why LLMs can perform worse than DeepL. Kirti explained that LLMs have many capabilities beyond translation and require sophisticated priming for optimal performance. The industry is still learning how to do this well, and we are only now approaching the development of a translation-optimized LLM. He thinks that for top-tier languages like FIGSP-CJK, LLM MT will soon consistently outperform NMT models like DeepL. In 15 months, this could extend to a much larger set of languages, but we should expect NMT to remain in use for domain-focused production systems for at least 1-2 more years.
Another participant inquired about studies conducted by Translated, specifically those comparing ModernMT with its competitors. Kirti said that Translated had conducted several independent studies over the years, which show that ModernMT outperforms competitors when TMs, glossaries, and corrective feedback are available. A detailed overview of a study done last year is presented here: https://blog.modernmt.com/comparing-mt-system-performance/
Kirti also pointed to the article https://blog.modernmt.com/mt-and-the-translator/, which presents successful use cases of ModernMT for freelance translators. In that post, the interviewed translators also provided feedback on needed and desirable improvements.
He also explained further about ModernMT’s fine-tuning.
First, he shared the article https://blog.modernmt.com/understanding-adaptive-machine-translation/, where the differences between “static” and “adaptive” MT are explained in more detail.
Second, he talked about strategies to fine-tune ModernMT. He explained that fine-tuning can happen at multiple levels: 1) TMs, 2) glossaries and terminology, and 3) corrective feedback, which reduces future errors. Aggregating your TMs is generally recommended unless similar words are used very differently in the new translations. Collecting your TMs and glossaries and actively correcting MT output all contribute to producing output that needs less correction. If your strategy is to use a TM developed for a specific client, all you need to do is upload the TM and point to it; ModernMT will use it immediately.
Third, in answering about the maintenance of glossaries on the fly with ModernMT, Kirti explained that ModernMT supports glossaries at a basic level, as described in the link https://www.modernmt.com/api#glossary-file-structure. However, LARA will have a much richer glossary capability that will handle verb inflections and other variations that have been difficult for traditional MT systems.
Lastly, when asked which type of data feeds ModernMT, he answered that ModernMT uses general web data as well as a large amount of human-curated private data to ensure the model performs better than generic solutions like Google. It is a continuously evolving model: it improves daily as you provide relevant TMs, glossaries, or corrective feedback.
DATA SECURITY
We asked about data security, privacy, and confidentiality and how ModernMT ensures them. Kirti explained that ModernMT was built with security in mind: it is compliant with ISO 27001 standards and adheres strictly to GDPR guidelines. Users control how long their data stays on Translated’s servers, and access is limited strictly to customers or Translated’s engineers, and only with explicit permission. In ModernMT’s controlled, secure environment, once translations are done, you also have the option to remove any trace of your data immediately. This setup contrasts with some public LLMs, where data might be used in unintended ways. Specifically, regarding the identity of people whose personal documents are being translated, their identities are protected in ModernMT’s systems because no data is kept beyond a short window needed for debugging. Even this window can be eliminated if there is a data security concern, so data can be deleted immediately after translation.
INTEGRATIONS
We also asked about another common topic: integration. Kirti explained that ModernMT integration is currently available in three platforms (MateCAT, Trados, and memoQ) because they support the dynamic, continuous training ModernMT uses. Other systems (e.g., Phrase, XTM) can submit data to ModernMT, but there is no dynamic improvement, since they are architected for batch training.
LARA is currently available only in MateCAT, but it may follow ModernMT’s path and become available for other CAT tools.
PRICING
Another important question concerned pricing: what is the pricing model for ModernMT and LARA, and how does it compare to other engines? Kirti said that Translated’s pricing is very competitive. For ModernMT, the cost is around $15 per million characters – roughly 200,000 to 250,000 words. There is also a monthly subscription option, which might be about 25 euros for ModernMT and around 10 euros for LARA. You only pay for the words you translate.
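The quoted figures are easy to sanity-check. Assuming English-like text averages roughly 4-5 characters per word (our assumption, not a figure from the talk), $15 per million characters works out as follows:

```python
price_per_million_chars = 15.00  # USD, figure quoted in the talk

# Assume ~4-5 characters per word for English-like text.
for chars_per_word in (4, 5):
    words = 1_000_000 // chars_per_word            # words per million characters
    cost_per_word = price_per_million_chars / words
    print(f"{chars_per_word} chars/word: {words:,} words at ${cost_per_word:.6f}/word")
# 4 chars/word: 250,000 words at $0.000060/word
# 5 chars/word: 200,000 words at $0.000075/word
```

That matches the 200,000-250,000 word range Kirti cited, and puts the per-word cost in the hundredths-of-a-cent range.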
FUTURE
We also asked about the future, taking into consideration generative AI and tools like ChatGPT entering the scene: how does he see the MT landscape evolving, particularly for freelancers and large enterprises? Kirti answered that he believes translation quality for top-tier languages will continue to improve dramatically as more data becomes available. With tools like LARA, you will see more context-rich translations that reduce the need for extensive post-editing. However, NMT systems like ModernMT will remain essential for domain-specific production tasks, especially for languages where data is less abundant. In the near term, he expects adaptive NMT and LLM-based approaches to coexist. Translators will increasingly take on a role of steering and fine-tuning these systems rather than doing all the translation manually.
Kirti also thinks that the likelihood is high that the machine will become a much more able and useful assistant to a human expert and allow work to be done faster and more accurately as translators learn to steer the new technology.
As translators, we are used to systems and CAT tools that segment the content and present it in a two-column view. Kirti said that the source-and-target window will remain the preferred model for most people. However, there is increasing use of speech-to-text, encouraged and improved by the widespread use of mobile phones. But for now, he thinks we will continue to see a two-column view where the translation can be modified immediately if needed or deemed in need of improvement.
Another interesting question about the future concerned how the MT landscape will change with the adoption of tools based on generative AI (like ChatGPT) and tools like LARA. Kirti thinks it likely that LLM MT will gain momentum, but we should expect NMT to coexist with it for the next few years. LLM cost and latency (how quickly a translation is returned) are still challenges, and we are still in the early stages of defining how best to fine-tune an LLM engine.
One of our participants asked about Translated’s view of the synergy between LARA and human translators. Kirti explained that LARA is currently available in 10 major languages where data is plentiful and is performing at what can sometimes be characterized as professional human translator quality. This happens mostly when specialized fine-tuning, TM, and other context data are available. In addition to TM, LARA uses much more contextual data around the material being translated, including communications between translators and reviewers, so it also learns from expert human opinions on translations. Based on accuracy and error data collected over the years from thousands of projects done at Translated, the company has established the error levels seen from different translators in routine production work: average translators produce about 5 errors per thousand words (EPT), and top-performing translators produce fewer than 2 EPT. LARA regularly produces fewer than 5 EPT and is expected to improve.
What is astonishing is that when LARA is properly tuned and optimized, it performs better, i.e., with a lower EPT, than average translators across a specific and very large sample. Typical MT systems produce 10-12 EPT. This does NOT mean a future without translators; rather, expert humans will assume a larger steering role on new and varied projects.
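EPT is a simple ratio, and a minimal helper (the function name is ours, not Translated’s metric code) shows how the figures above relate on a concrete sample:

```python
def errors_per_thousand_words(errors, words):
    """EPT: the quality metric cited in the talk."""
    return errors * 1000 / words

# The thresholds mentioned above, applied to a 10,000-word sample:
print(errors_per_thousand_words(50, 10_000))    # 5.0  -> average-translator level
print(errors_per_thousand_words(15, 10_000))    # 1.5  -> top-performer level
print(errors_per_thousand_words(110, 10_000))   # 11.0 -> typical MT (10-12 EPT)
```

On a 10,000-word job, the gap between typical MT and a well-tuned system is roughly 60 errors fewer to find and fix.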
Kirti said that Translated will make LARA available to all translators who engage with the company in production work and will make concerted efforts to compensate translators fairly based on measured productivity impacts. For example, if production translation throughput rises to 10,000 words/day, Translated will compensate translators as if the MT contribution were smaller, so that they earn as much as they did before the better MT became available. Human experts are also expected to take on a larger oversight and AI-steering role that could be compensated in ways other than per-word rates.
Whisper and ChatGPT are side-contributors to this blog post.
NOTE: LTD TEKTalks are not language tool trainings. We aim to showcase language industry tools and technologies, and especially the people and philosophies behind them.