Found in Translation: How to Stream Video in Multiple Languages

Reach out to an international audience. Here are video translation steps to follow for strong results, as well as expert tips for high quality.

Many companies consider multilanguage video to be so valuable that they will not talk about it on the record. That’s right, for many companies, multilanguage video amounts to a trade secret.

That fact alone should cause any producer to think for a moment: How much more value would my videos have if they were immediately accessible to people who do not speak my language? The question applies as much to a public audience as it does to a diverse pool of employees or customers.

Moving pictures in multiple languages are nothing new. Movies and television have been translated and dubbed for decades. In the U.S., secondary audio tracks have been on TV since the 1980s and are part of the DVD and Blu-ray standards. It’s common to find major releases in two or more languages.

Multiple audio tracks are widely supported by major streaming platforms, CDNs, and players. Yet the availability of multilanguage video lags. In the U.S., the iTunes Store offers a pretty good selection of new releases in more than one language, and Netflix has a limited multilanguage inventory. However, you’ll be hard-pressed to find anything in more than one language on Amazon Instant Video or Google Play.

What we can conclude, then, is that the most critical aspects of creating and distributing multilanguage video aren’t technical. Translation, the key element of multilanguage video, is its own specific skill. Doing it well requires expertise that most producers do not have in-house.

The good news is that a growing range of options is available. I talked with representatives from four vendors that offer translation services to match a variety of applications and budgets.

Sovee and Moravia both offer services to localize a wide variety of content—including documents and websites—as well as video. Ramp is a cloud video platform that is rolling out a translation product in the fourth quarter of 2014. Groovy Gecko specializes in live webcasts, with the option to provide live multiple language tracks.

On-Demand Video

There’s a wider array of multilanguage services available for on-demand content than live content. Working with finished productions allows for more options to balance quality and turnaround time with cost.

Many of these services are offered by vendors that started in localization of documents and websites. They also offer broad expertise that can help clients with some of the cultural and practical implications of delivering videos in languages other than their own. At the same time, the increasing accuracy and accessibility of computer-based tools allow more companies to enter the space.

Still, as it stands, there is no commercially available software that will automatically translate speech. Therefore, translation requires several steps.


The first step is creating an accurate transcript in the original language, as you would for captioning. If you already have an accurate script or transcript, most vendors can work with that.

Otherwise, speech-to-text engines have come far enough that you can choose among a fully machine-generated transcript, a machine-generated transcript that is corrected by a human, or a fully human-generated transcript. Machine transcription speeds up the turnaround time and reduces cost, but is not yet perfect.

When choosing among these options, Moravia CMO Renato Beninatto says the source language is a factor. For videos in which the source is English, the company uses a hybrid approach, with humans editing a machine transcription. He notes, however, that the accuracy of automated transcription is lower with some less-common languages.


The choice of transcription method also depends on how your video will be translated, which is the second step. Here, too, you can choose among machine-generated, machine-generated with human assistance, and fully human-generated translations.

Ramp will offer machine translation with its service. That’s why president and COO Stuart Patterson says, “If you’re going to translate that video you need to start from a human-transcribed version of the audio track.”

Automated translation can be around 90 percent accurate, as can automated transcription, Patterson says. “If you are 90 percent accurate on the transcription and 90 percent accurate on the translation, then you would be around 80 percent accurate in the final piece.”
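Patterson’s arithmetic can be checked with a short sketch. It assumes, as the quote implies, that errors in each stage are independent, so the stage accuracies simply multiply. The function name and the 100 percent figure for a human transcript are illustrative assumptions, not figures from any vendor.

```python
def pipeline_accuracy(*stage_accuracies: float) -> float:
    """Multiply per-stage accuracies, assuming errors in each stage are independent."""
    result = 1.0
    for accuracy in stage_accuracies:
        result *= accuracy
    return result


# Machine transcription followed by machine translation, both at 90 percent.
machine_both = pipeline_accuracy(0.90, 0.90)

# Hypothetical best case: a perfect human transcript fed into machine translation.
human_then_machine = pipeline_accuracy(1.00, 0.90)

print(f"machine + machine: {machine_both:.0%}")
print(f"human + machine:   {human_then_machine:.0%}")
```

Under these assumptions, two 90 percent stages compound to roughly 81 percent, which matches the “around 80 percent” in the quote, while starting from a clean transcript leaves only the translation stage’s errors.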

Sovee also uses machine translation, which is then corrected by people. Those corrections are entered back into the company’s Smart Engine system to help improve future accuracy. Even so, Sovee founder Steve Steele says, “We don’t ever encourage people to produce a video through just raw machine translation. I don’t think that most people would be happy with that.”

Image caption: Translation can be machine-generated, machine-generated with human assistance, or fully human-generated. Ramp’s upcoming product will use machine translation.


The third step is voicing the new language soundtracks. Once again, the choice is between human and machine, using a flesh-and-blood voice actor or a synthesized voice. Using a synthesized voice is both faster and less expensive than a real human, though it might not be appropriate for all projects.

“The synthesizers are powerful,” Patterson says. “It’s amazing how many languages can now be synthesized automatically.”

Sovee President Scott Gaskill says that using synthetic voice technology allows the company to quickly translate and provide the language soundtrack for “perishable videos,” such as when “you’ve got to have the video out in 24 hours and it’s no longer valid in 2 weeks.”

With synthesized voices, Steele says, “a lot of our customers are saying, ‘Hey, that’s more than adequate because I can get something done faster, and a whole lot cheaper than with human [voice] talent involved.’”

Yet using human voice talent offers a wider range of expression to match the content and tone of your video in a way that computers just aren’t able to. Synthesized voices might work well for an instructional video with only one or two speakers, but things grow complex when there are more voices. Regional accent and dialect should also be kept in mind.

As Beninatto says, in some countries or regions, “You wouldn’t want a particular accent associated with your brand.”

Image caption: In the Sovee SmartEngine, the English and Spanish (Mexico region) translations are displayed side by side in sequence. The boxes below each sequence allow human post-editing if needed, such as telling the SmartEngine to keep a certain word in English.

For instance, Beninatto says, “In Brazil there are five accents, and two major ones. You want to pick the appropriate one and have it be natural and consistent” across your productions. Although Portuguese is spoken in both Brazil and Portugal, using a voice actor from Portugal might not be appropriate for a video intended for a Brazilian audience.

Another consideration that might not be obvious is that different languages require different lengths of time to express the same idea. That means a translated audio track can get out of sync with the video.

Beninatto says that even though English is commonly thought to be more compact than most other languages, “A more accurate statement is that translation always increases the length. In my experience it doesn’t matter. If I translate Portuguese into English it will grow by 20 percent, even though English is theoretically shorter.”

There are two ways to compensate: Edit the video or edit the voice script. With screencasts or videos where the speaker is not seen, it can be appropriate to add or remove frames to fit the timing of the translation. However, vendors also are able to edit videos featuring live action and on-camera speakers.

Steele gives the example of a promotional video that a client, real estate contractor CGL, wanted Sovee to translate into Spanish and Arabic. The original English video was 2 minutes long, but went to 2:36 in Spanish and 2:45 in Arabic. (Both videos can be viewed at etheatershowcase.sovee.com.)

The different lengths were not an issue for the company’s website. However, CGL also wanted to run the videos as commercials that fit a slot of exactly 2 minutes. Sovee offered to edit the content “without disrupting the message” or to speed up the speaker a bit, and ended up using a combination of the two strategies.
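To get a feel for the numbers in the CGL example, the sketch below computes how much longer each translated version ran and how much playback would have to be sped up to fit the 2-minute slot if no content were cut. The runtimes come from the example above; the function names are illustrative, and the calculation says nothing about how Sovee actually split the work between editing and speed-up.

```python
def to_seconds(timestamp: str) -> int:
    """Convert an 'M:SS' runtime such as '2:36' into whole seconds."""
    minutes, seconds = timestamp.split(":")
    return int(minutes) * 60 + int(seconds)


def speedup_needed(translated: str, slot: str) -> float:
    """Playback-rate factor required to fit the translated runtime into the slot."""
    return to_seconds(translated) / to_seconds(slot)


slot = "2:00"
runtimes = {"Spanish": "2:36", "Arabic": "2:45"}

# Factor per language, assuming speed-up alone and no script edits.
factors = {language: speedup_needed(runtime, slot) for language, runtime in runtimes.items()}

for language, factor in factors.items():
    growth = factor - 1.0
    print(f"{language}: {growth:.1%} longer, needs {factor:.3f}x playback to fit {slot}")
```

With speed-up alone, the Spanish version would need to play at 1.3 times normal speed and the Arabic version at 1.375 times. Any content trimmed from the script lowers the factor that remains to be covered by speeding up the speaker.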

Beninatto says that for “artistic videos” with significant investment in production value—such as commercials—Moravia will edit the script to fit the length and pacing. In some cases, particularly with animated or slide-based instructional and elearning content, Moravia will produce the video itself, along with all language versions.

On-screen text is another thing to consider in editing. If that text is relevant to understanding the video content, then it should be translated as well. Some vendors will help with this.

Quality Matters

When having a video translated, it’s best to provide the vendor with a master file. Steele says Sovee “will take any file they have. There has not been a format that we cannot use.” Nevertheless, he still recommends providing the highest-quality file, “because you can never have output that’s higher quality than the input.”

Although it seems obvious, to get a good transcript it’s very important that the voice track have clear and comprehensible audio. If the video contains background music or sound effects, Steele also suggests providing them as separate tracks so that they can be mixed back in with the final translated voice track.

Because translating videos requires so many steps, most vendors like to work closely with clients to make sure that the workflow, cost, and the final translated video meet the client’s goals.

Services such as Sovee and Ramp offer their own video management and streaming platforms into which their translation technology is integrated. In most cases that means once a client has developed a workflow with the vendor, each translation job can be ordered and completed quickly.

Moravia takes a very hands-on approach, providing a great deal of customization, as well as consultation on local preferences for the target audiences, and choosing appropriate voice talent. That’s why Beninatto says the company “works with large programs and projects, not one-offs.”

Turnaround times can vary based upon the complexity and length of the video itself, along with the number of languages. For instance, Steele says a 3-minute, 17-second video for the Masters golf tournament, which required a full English transcription, was translated into nine languages in 2 1/2 days. This included editing for time synchronization in all nine finished videos.

Due to differences in length and on-screen text, each translated video is produced and delivered as a separate asset. Generally, you won’t find a video with selectable language soundtracks, since the videos themselves will have been edited to fit each language. However, that means translated videos will work with nearly every video platform or CDN.

What About Live?

Providing multiple languages in a live streaming environment requires a workflow that still relies heavily on human talent. U.K.-based Groovy Gecko specializes in translating live events for internal and external audiences.

According to Jake Ward, Groovy Gecko’s business development director, the company uses a team of translators for an event, each of whom is able to speak with the correct dialect and accent for the target market. Two translators are required for each language.

Because live translation is inherently in sync, viewers can switch language tracks on the fly. “We always give the option to select languages, even when we know the locations where people are coming from,” Ward says.

Ward says Groovy Gecko works with many CDNs. It does, however, have a software layer for Akamai that uses the Flow player. After the event is over, Groovy Gecko is able to provide an on-demand version within 5 minutes.

With the kind of manpower and complexity involved, live translation services are not something to order up just a few days ahead. Ward says clients should contact the vendor at least 4 weeks in advance if the only thing needed is live video translation. When other materials, like PowerPoint slides or localized websites, also require translation, the minimum lead time rises to 6 weeks.

Aside from choosing the languages and voices, Ward says another consideration is being ready to handle technical support questions in the same languages. Also, since one reason to provide multiple-language streams is to grow the audience, a client should be prepared for increased demand for its stream.

Ward says the cost for a typical, single-language live webcast starts at about £6,000 (about ,560). “When you’re talking about multilanguage with five languages, it’s more like £20,000 (about ,889).” While adding translation basically triples the cost, Ward notes that the value comes from potentially serving five times the audience.
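Ward’s tradeoff can be laid out as simple arithmetic. The sketch below uses only the two pound figures he quotes. Treating each language as reaching an audience of equal size is an assumption made here for illustration, in line with his “potentially” serving five times the audience.

```python
single_language_cost = 6_000   # pounds, typical single-language live webcast
five_language_cost = 20_000    # pounds, same webcast with five language tracks
languages = 5

# How many times more the five-language webcast costs.
cost_multiple = five_language_cost / single_language_cost

# Cost per language track once the total is spread across five languages.
cost_per_language = five_language_cost / languages

# Assumption: each language reaches an audience as large as the original one.
audience_multiple = languages

print(f"cost multiple:     {cost_multiple:.2f}x")
print(f"cost per language: £{cost_per_language:,.0f}")
print(f"audience multiple: {audience_multiple}x (if every language draws an equal audience)")
```

On these figures the five-language webcast costs a little more than three times the single-language one, or £4,000 per language track, against an audience that could be up to five times larger.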

Final Considerations

Translating videos into multiple languages opens them up to wider audiences. Thanks to advances in computerized transcription and translation, the cost and turnaround time for on-demand videos are both going down. Still, multilanguage is an investment that merits planning and consideration.

When evaluating a vendor, be sure to see examples of its work. Ask what edits or changes were made to produce the final, translated versions, and how long the process took. Especially if you’re translating into languages you don’t speak, get the opinion of a native speaker on the quality of the translation itself, along with the accent, style, and dialect of the voice, whether human or synthesized.

The vendor should want to know about your workflow and what your goals are. It should also be able to explain the tradeoffs behind a lower price, and when those tradeoffs might not be acceptable for your application.

Even though computers have become much better at translation, they still are not very good at helping you figure out the cultural considerations that go along with working in other languages. That’s something to keep in mind, too.

Now, allez, seguir adelante, and go forth!




1x1 Translations Ltd.


Phone: +36 70 33 24 905

Email: info@1daytranslation.com

Skype: onebyonetranslation

All rights reserved | 1x1 Translations ©