Preparing Your Content for Machine Translation
2014.10.07


The history of machine translation reaches far back into the past, long before preteens in the late 1990s were using Babelfish in the middle school library to translate bad words into other languages. Some trace the history of machine translation all the way back to the French philosopher René Descartes, who, in the 17th century, proposed the creation of a universal language in which common symbols would be used to represent equivalent ideas in different languages.

It was in the 1950s, though, that computer scientists began to teach grammatical rules to computers in an effort to create artificial translation machines. In 1954, a team from IBM and Georgetown University staged a public demonstration in which an operator punched pithy Russian phrases onto IBM cards and the so-called brain returned accurate English translations such as, "We transmit thoughts by means of speech." As it turns out, those early machine-translation pioneers who were attempting to teach grammar rules to computers had their process a little bit backward. But we'll get back to that in a moment.

Any organization with an interest in disseminating its content to a global audience needs to familiarize itself with the basics of machine translation. At a time when the globalization of commerce is yesterday's news, the number of companies with the capacity for international ecommerce has grown considerably. Machine translation is not necessarily reserved for the titans of industry, as many translation service providers are beginning to offer more scalable services to assist even small companies with translation projects.

Once you've decided that your business model needs to incorporate an element of translation, you'll start thinking about what you'll need to do to prepare your content for the translation process. What follows is an exploration of some tips and techniques for getting your content ready.

BETTER ARCHIVES, BETTER TRANSLATION

Kevin Nelson is the SVP of strategy and technology at MultiLing, a company that has been providing machine-translation services out of Provo, Utah, since 1988. The first tip Nelson would give to anyone getting ready for a machine-translation project is to collect as many examples of previously translated text as you possibly can. (And here's where that part about the history of machine translation being a little bit backward comes in.)

"If you look at the ways humans learn, we learn from memorization first," Nelson says. The machine-translation industry has been moving away from exclusively teaching the rules and structures of grammar to the translation engine. Instead, it now starts by feeding as much previously completed translation as possible to the server and allowing the server to learn from those examples.

"In the last number of decades, we have started storing previously done translations and feeding them into the servers as a way of teaching them," Nelson says. "Before, translations weren't being stored in a database for future access. Doing everything in a digital way rather than on paper had to come first. But now that we're virtually unlimited in terms of storage space and archive capabilities, this is the direction that we're headed."

The specificity of the translation server you use, and of the training you give that server, is also immensely important to the quality of the translation output, Nelson says. Training your translation server to recognize the terms and constructions that are unique to your industry is vastly preferable to using a machine-translation server that is more broadly trained for general translation.

"If you keep the realm narrow, and you keep content you're creating within that same realm, you'll have a lot better chance of getting good content," Nelson says. "This is part of the big difference between what you get when you use Google Translate, as opposed to using a specific server that is trained for your material. Google is really working on being the translator for the world, but it's still a ways down the road for it to get there. To hand your content to a translator as vast as that, the odds are that it hasn't seen enough content that's related to your industry yet, and the output won't be quite up to business standard." When you feed as much of your archive of previously translated material as possible into your machine-translation server, the server learns to tailor its translations to the particularities of your translation needs: not just the language into which you're translating, but also the terminology of your industry, the style of your writers, and the voice of your company.
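
As a rough illustration of what collecting an archive can look like in practice, here is a minimal Python sketch that cleans and deduplicates previously translated segment pairs and writes them out as a tab-separated file. The function names, the sample sentence pair, and the two-column layout are assumptions made for this example; the format your machine-translation provider actually accepts for training material may differ.

```python
import csv
import io


def clean_pairs(pairs):
    """Normalize whitespace, drop empty segments, and remove exact duplicates."""
    seen = set()
    cleaned = []
    for source, target in pairs:
        source = " ".join(source.split())
        target = " ".join(target.split())
        if not source or not target:
            continue  # a pair with a missing side is of no use for training
        if (source, target) in seen:
            continue
        seen.add((source, target))
        cleaned.append((source, target))
    return cleaned


def corpus_to_tsv(pairs):
    """Serialize (source, target) pairs as tab-separated text with a header row."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
    writer.writerow(["source", "target"])  # hypothetical column names
    writer.writerows(pairs)
    return buffer.getvalue()


if __name__ == "__main__":
    archive = [
        ("Tighten  the axial bearing.", "Ziehen Sie das Axiallager fest."),
        ("Tighten the axial bearing.", "Ziehen Sie das Axiallager fest."),
        ("Untranslated segment", "   "),
    ]
    print(corpus_to_tsv(clean_pairs(archive)))
```

Keeping the archive narrow, as Nelson recommends, is then a matter of choosing which past projects go into the list before it is cleaned.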

WRITING STYLE

That style and voice, as it turns out, are important pieces of creating good translation-ready content. The primary driver of your company's writing style should be your business objective, Nelson says. Beyond that, there are some easy tips to keep in mind when creating your written content that will make the translation process that much easier.

Making sure content is as clean and free of spelling and grammar errors as possible is, of course, one of the first and most important steps to ensuring a good translation. "Machine translation cannot think like a human," says Lori Thicke, founder and VP of marketing and innovation at LexWorks, a machine-translation service provider based in Vancouver, British Columbia. "A machine-translation engine cannot fill in the blanks if a word is missing as the human brain can or look past a spelling mistake to understand the meaning of a sentence. For this reason, spelling and grammar need to be both correct and unambiguous."

Another important factor to keep in mind is the consistency of your terminology. If your industry uses a particular term as part of its business process, the usage of that term should be consistent throughout your content. "When writing for humans, we use synonyms to avoid repetition of the same term," Thicke says. "But with machine translation, better results will come from using consistent terminology throughout. For instance, 'double-click on the top menu bar' should always be written exactly the same way, rather than as 'click twice' or any other variation."

Along these lines, both Thicke and Nelson recommend keeping a glossary of the terms that are most germane to your industry and making sure that your writers are familiar with it and that your machine-translation server is trained on it. "You should be using and identifying your key terms as you're creating your content when you're feeding it into your server," Nelson says. "Those can be pre-translated and fed into your server. If I'm an auto mechanic and I'm writing about axial bearings, I know what that means, and it should always mean the same thing."
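
A glossary like this can also be checked automatically before content goes to translation. The Python sketch below is one possible approach, not a feature of any particular tool: it scans text for discouraged variants and reports the preferred term. The glossary contents and the function name are hypothetical and reuse Thicke's "double-click" example.

```python
import re

# Hypothetical glossary: preferred term -> variants writers should avoid.
GLOSSARY = {
    "double-click": ["click twice", "double click"],
}


def find_term_variants(text, glossary=GLOSSARY):
    """Return (found_text, preferred_term, position) for each discouraged variant."""
    findings = []
    for preferred, variants in glossary.items():
        for variant in variants:
            pattern = r"\b" + re.escape(variant) + r"\b"
            for match in re.finditer(pattern, text, flags=re.IGNORECASE):
                findings.append((match.group(0), preferred, match.start()))
    # Report findings in the order they appear in the text.
    return sorted(findings, key=lambda finding: finding[2])


if __name__ == "__main__":
    sample = "Click twice on the top menu bar, then double-click the icon."
    for found, preferred, position in find_term_variants(sample):
        print(f"{position}: '{found}' should be '{preferred}'")
```

A check of this kind only catches variants that someone has already listed, so the glossary itself still needs a human owner.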

In addition to a glossary for terminology, another best practice is keeping a style guide. This will be a tool for your writers to use when creating your content, and its focus should be on maintaining a voice, a style, a cadence, and an overarching sound that is consistent across all of your content. The kind of content that achieves the best output from a machine-translation server is the kind of content that MultiLing's Nelson refers to as "tight."

LexWorks' Thicke suggests a few grammatical techniques that also assist in creating good translation-ready text. They are as follows:

  • Don't use standalone demonstrative pronouns: replace "This allows you to" with "This function allows you to."
  • Avoid passive sentence construction: replace "The job was done" with "She did the job."
  • Avoid the conditional: replace "This would open the window" with "This will open the window."
  • Limit sentences to 25 words or fewer: instead of adding clauses, start a new sentence.
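
Some of the rules above lend themselves to a simple automated check. What follows is a minimal Python sketch, assuming a naive sentence splitter and a short, hand-picked list of verbs after "This" or "That"; it flags overlong sentences, the word "would," and a bare demonstrative at the start of a sentence. It does not attempt to detect passive constructions, and it is not a substitute for a commercial style checker.

```python
import re

MAX_WORDS = 25  # the sentence-length limit suggested in the list above

# A bare demonstrative followed directly by a common verb, e.g. "This allows".
# The verb list is a small hypothetical sample, not an exhaustive grammar.
BARE_DEMONSTRATIVE = re.compile(
    r"(This|That|These|Those)\s+(is|are|was|were|allows?|lets?|means?)\b"
)


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [part for part in parts if part]


def check_sentence(sentence):
    """Return a list of warnings for a single sentence."""
    warnings = []
    word_count = len(sentence.split())
    if word_count > MAX_WORDS:
        warnings.append(f"{word_count} words (limit {MAX_WORDS})")
    if re.search(r"\bwould\b", sentence, flags=re.IGNORECASE):
        warnings.append("conditional 'would'")
    if BARE_DEMONSTRATIVE.match(sentence):
        warnings.append("demonstrative pronoun without a noun")
    return warnings


def check_text(text):
    """Return (sentence, warnings) for every sentence that triggers a warning."""
    report = []
    for sentence in split_sentences(text):
        warnings = check_sentence(sentence)
        if warnings:
            report.append((sentence, warnings))
    return report


if __name__ == "__main__":
    for sentence, warnings in check_text("This would open the window. She did the job."):
        print(sentence, "->", ", ".join(warnings))
```

Because the rules have exceptions, the output is best treated as a list of sentences to review rather than a list of errors.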

Style guides will vary from company to company and industry to industry, but most of the rules cited above are generally good writing practice, even apart from the fact that they make for good machine-translation-ready content. Even the previously mentioned rules have their occasional exceptions, of course, but on the whole, the cleaner your grammar, the better your translation will end up being.

Nelson makes the point that there are automated tools in the marketplace designed to assist companies in maintaining a consistent style and voice. He also notes that there is something of a debate among content creators about the usage of software tools, such as Acrolinx, that work to standardize a writing style across a company's written content. Acrolinx analyzes text within your company and makes suggestions for style and voice based on your own in-house standards. Nelson and Thicke both say tools such as this create content that will flow more cleanly through a machine-translation engine, but again, how much autonomy to take from your content creators is a decision that each company will have to make based on its individual needs.

PREPARING YOUR FILE

Once your content is written and ready to go, it's important to keep in mind a few final tweaks to ensure that it doesn't contain any formatting that is likely to trip up your translator. As a general rule, the best translation will come from content that is structured in sentence and paragraph form and is as free as possible of additional text formatting. Keep these best practices in mind for preparing your file for translation:

  • Use cascading style sheets (CSS) instead of local formatting.
  • Don't embed text in images.
  • Remove extraneous project files (targets, topics, tables of contents, etc.).
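
For HTML content, the first two practices can be partly audited with a short script. The Python sketch below uses the standard library's html.parser to list elements that carry local formatting (an inline style attribute or a font element) and to collect image sources so a person can check them for embedded text. The class and function names are hypothetical, and the script cannot tell whether an image actually contains text.

```python
from html.parser import HTMLParser


class FormattingAudit(HTMLParser):
    """Collect local formatting and image references from an HTML document."""

    def __init__(self):
        super().__init__()
        self.local_formatting = []  # (tag, detail) pairs
        self.images = []            # src values to review by hand

    def handle_starttag(self, tag, attrs):
        attr_map = dict(attrs)
        if "style" in attr_map:
            self.local_formatting.append((tag, attr_map["style"]))
        if tag == "font":
            self.local_formatting.append((tag, "font element"))
        if tag == "img":
            self.images.append(attr_map.get("src") or "")


def audit_html(markup):
    """Return (local_formatting, images) found in the given markup."""
    audit = FormattingAudit()
    audit.feed(markup)
    audit.close()
    return audit.local_formatting, audit.images


if __name__ == "__main__":
    page = '<p style="color:red">Hello</p><img src="banner.png"><p class="note">Ok</p>'
    formatting, images = audit_html(page)
    print("Local formatting:", formatting)
    print("Images to review:", images)
```

Elements styled through a class, like the second paragraph in the sample, pass the audit because their formatting lives in the style sheet.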

YOU GET WHAT YOU GIVE

It goes without saying that a machine translation of your content can be no better than the quality of the content that you put into it. But even the best-prepared content being fed through the most sophisticated translation engine will still not be perfect.

An important part of the work cycle of machine translation remains the use of human post-editors to give your content a final polish. These post-editors are trained in the language from which and into which you're translating, as well as in your industry's terminology. "We make sure that translators are experts in the content type for whatever content we are working with," says MultiLing's Nelson. "We build teams to partner for the long term with our customers. And these teams are expected to be with the content over the course of its life span. When we take on a new client, we identify those translators and post-editors who are most qualified for their content, get them even more familiar with the industry, and, after a very short period of time, these folks are experts in the material as well as in the customer's style of content. The linguist is an important part of the cycle."

If history is any guide, automated translation technology is likely to continue to grow in accuracy and complexity. We've come a long way from the IBM/Georgetown demonstration of the 1950s, in which a few pithy phrases were translated from Russian to English on a paper printout. But language, in all its complexity, may remain outside of the grasp of automated technology for a long time to come, if it's ever truly mastered. And so, it will continue to be important for any company with an interest in having its content translated from one language into another (or into multiple other languages) to be mindful of the best practices it can employ to ensure a better output from its machine-translation provider. From style guides and glossaries to providing as many relevant past translations as possible, there are plenty of easy but important steps to take to ensure that the product you get from your translation is as good as it can be.
 
(EContentmag)
