
How Unbabel Delivers Scalable, High-Quality Translations

The advantage of machine translation software is that it’s fast, far faster than humans, and can translate messages in milliseconds, allowing for scalability. However, the quality of machine translations can vary from human-like to terrible. And while a subpar translation may be fine for tourists trying to read a street sign, it’s unacceptable for business use cases because bad translations can frustrate consumers and, in the worst cases, can be misleading or offensive. That reflects poorly on the business and its brand.

So how do we measure “quality” when it comes to translations? It’s challenging because translating is traditionally more art than science. The definition of a “correct” or “wrong” translation is nuanced and highly dependent on the use case and customer expectation. Understanding this, we developed software and a quality evaluation framework to produce better machine translations.    

Let’s take a look at how Unbabel delivers scalable, high-quality translations to eliminate language barriers and drive better customer experiences. 

A unique approach to language translation 

The Unbabel LangOps platform combines leading MT technology with humans-in-the-loop to produce high-quality multilingual translations. That is to say, we blend the speed and scalability of machine translation with the quality of human translators, all within a process that detects and corrects translation errors before they reach the end user. We then use feedback from our community of human translators to retrain the AI so that it evolves, grows smarter and more fluent, and needs less human input over time.

Our technology can be customized to fit the business needs of each enterprise customer so that translated content and marketing messaging are always accurate and on brand. And for businesses, translations must be on brand for them to be “high quality”.    

How does Unbabel estimate translation quality?

To illustrate how Unbabel estimates translation quality, let’s consider an asynchronous communication (e.g., an email or a message sent through an online form) between a retail customer and a support agent who do not speak the same language.

Unbabel’s machine translation system sits invisibly between the two parties, translating the communication into the receiver’s native language.   

Within milliseconds, the machine-generated translation is sent to Kiwi, our proprietary AI technology that estimates translation quality. Kiwi identifies likely translation errors in each machine-translated sentence and then assigns each sentence an estimated quality score.  These scores are then aggregated to determine an overall score for the entire message.  Kiwi then determines whether the message translation meets our predefined target quality levels and can go directly to the end user, or whether it needs to be sent to a human editor. 

In this way, Kiwi serves as a filter, catching subpar translated communications and redirecting them for human review, while allowing high-quality translations to go directly to the receiver. Leveraging Kiwi allows for faster and more affordable high-quality translations as only lower-quality translations are diverted for human review.
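
To make the filtering step concrete, here is a minimal sketch of the routing logic described above. It is not Kiwi itself: the function name `route_message`, the 0-to-1 score range, the use of a plain average to aggregate sentence scores, and the example threshold are all assumptions made for illustration.

```python
from dataclasses import dataclass
from statistics import mean
from typing import List


@dataclass
class RoutingDecision:
    message_score: float      # aggregated score for the whole message
    needs_human_review: bool  # True when the message should go to an editor


def route_message(sentence_scores: List[float], target_quality: float) -> RoutingDecision:
    """Aggregate per-sentence quality estimates and decide where the message goes.

    Assumptions (not from the article): scores lie in [0, 1], higher is better,
    and the message score is the plain mean of the sentence scores.
    """
    if not sentence_scores:
        raise ValueError("at least one sentence score is required")
    message_score = mean(sentence_scores)
    return RoutingDecision(
        message_score=message_score,
        # Below the target: divert to a human editor. Otherwise: send directly.
        needs_human_review=message_score < target_quality,
    )


# Hypothetical per-sentence scores for two machine-translated messages.
good = route_message([0.90, 0.80, 0.95], target_quality=0.8)
poor = route_message([0.90, 0.40, 0.50], target_quality=0.8)
```

In this sketch the first message clears the target and would go straight to the receiver, while the second falls below it and would be redirected for human review. A production system could aggregate differently, for example by weighting longer sentences more heavily or by flagging any message that contains a single very low-scoring sentence.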

How does Unbabel evaluate and train its machine translation systems? 

As previously mentioned, Unbabel customizes a machine translation system for each enterprise customer so that it speaks the brand’s language. To ensure the engine delivers high-quality translations, we’ve developed COMET, a neural AI system designed to evaluate MT systems at the time they are trained or retrained. COMET measures quality by comparing the translations an MT system generates for curated test sets against a human-verified version (a ‘gold standard’ translation). This allows us to compare COMET scores for different MT systems on the same test data and determine which system generates more accurate translations.
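
The comparison described above can be sketched as follows. This is an illustration of the evaluation loop only: the `Scorer` signature, the function names, and the tiny test set are hypothetical, and `toy_overlap_scorer` is a crude word-overlap stand-in, not COMET. In practice a trained COMET model would take the place of the toy scorer.

```python
from statistics import mean
from typing import Callable, Dict, List, Tuple

# A segment scorer takes (source, machine_translation, reference) and returns
# a quality score where higher is better. This signature is an assumption.
Scorer = Callable[[str, str, str], float]


def toy_overlap_scorer(source: str, mt: str, reference: str) -> float:
    """Stand-in metric, NOT COMET: share of reference tokens present in the MT output."""
    ref_tokens = reference.lower().split()
    mt_tokens = set(mt.lower().split())
    if not ref_tokens:
        return 0.0
    return sum(tok in mt_tokens for tok in ref_tokens) / len(ref_tokens)


def system_score(sources: List[str], outputs: List[str],
                 references: List[str], scorer: Scorer) -> float:
    """Average the segment scores of one MT system over the test set."""
    if not (len(sources) == len(outputs) == len(references)):
        raise ValueError("sources, outputs and references must be aligned")
    return mean(scorer(s, m, r) for s, m, r in zip(sources, outputs, references))


def rank_systems(sources: List[str], system_outputs: Dict[str, List[str]],
                 references: List[str], scorer: Scorer) -> List[Tuple[str, float]]:
    """Score every system on the same test data and sort best-first."""
    scores = {name: system_score(sources, outputs, references, scorer)
              for name, outputs in system_outputs.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical test set: source sentences with human-verified references.
sources = ["Olá mundo", "Obrigado pela ajuda"]
references = ["hello world", "thank you for your help"]
system_outputs = {
    "system_a": ["hello world", "thank you for your help"],
    "system_b": ["hello earth", "thanks for help"],
}
ranking = rank_systems(sources, system_outputs, references, toy_overlap_scorer)
```

Because every system is scored on the same sources and the same references, the resulting numbers are directly comparable, which is what makes it possible to say that one system produces more accurate translations than another.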

Furthermore, our machine translation systems are continuously retrained on newly accumulated human-edited and corrected translations. By frequently evaluating the translation system itself using COMET, we can ensure that its machine translation model is accurate and improving over time. COMET is so innovative and effective that Google, Microsoft, Welocalize, and Intento have all publicly acknowledged COMET as a leading technology and use it themselves.

Quality is the core of our product

Generic machine translation software solutions offer a “one-size-fits-all” approach to translation and, as such, can never truly reach the highest translation quality, because they are not designed to speak a brand’s unique language. At Unbabel, quality is at the core of our product. Our machine translation software is built on rich domain data, customized for each enterprise customer, and embedded with technologies we pioneered to estimate and evaluate translation quality for unparalleled accuracy.

In fact, both Kiwi and COMET have received ‘Best Paper’ awards at flagship academic conferences. To further innovation and support the translation community at large, we make versions of our translation quality technology available publicly, all in the service of removing language barriers so that people can connect more easily.  

Check out our article Is Google Translate Right for My Business? to delve deeper into the quality of machine translation. 



About the Author

Alon Lavie is the Vice President of Language Technologies at Unbabel. He leads and manages Unbabel’s U.S. artificial intelligence lab based in Pittsburgh, and provides strategic leadership for AI R&D teams company-wide. Previously, Alon was a senior manager at Amazon, where he led the Amazon Machine Translation R&D group.
