Introduction
The Multilingual XML Filetype is a new filetype available for Trados Studio SR2 and later. The basic use cases for this filetype are:
- create bilingual or multilingual projects from an XML file containing at least two languages
- handle large CDATA dumps in XLIFF files correctly
- handle well-formed but invalid XLIFF files that Trados Studio will not handle out of the box
- handle bilingual or multilingual XML files without the need to copy the source into the target element and translate the target element only
- handle partially translated bilingual or multilingual XML files
Installation
The filetype can be installed through the AppStore integration in Trados Studio, or by downloading the sdlplugin file from the website and double-clicking it to invoke the plugin manager. Once installed, you should find the new filetype here:
You will also find two new batch tasks:
These will be explained below.
Settings
Multilingual XML
- the only option available here is to specify the file dialog wildcard expression if needed. In this example you can see *.tmx has been added. This would allow you to use this filetype to translate a TMX file. The only requirement is that the type of file you are opening is a well-formed XML file.
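Because well-formedness is the only requirement, it can be worth checking a file before adding its extension to the wildcard expression. The following is a minimal sketch using Python's standard library, not part of the filetype itself; the function name is_well_formed and both sample strings are made up for illustration.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text: str) -> bool:
    """Return True if xml_text parses as XML, False otherwise."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


# Hypothetical samples: the second one never closes its <tu> element.
good = "<tmx><body><tu><tuv><seg>Hello</seg></tuv></tu></body></tmx>"
bad = "<tmx><body><tu></body></tmx>"

print(is_well_formed(good))
print(is_well_formed(bad))
```

A file that fails this kind of check would need to be repaired before any XML-based filetype could be expected to open it.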
Language Mapping
- here you specify the location of the languages root element. This is done by using an XPath expression. In this example the TMX file looks like this:
The XPath expression to the languages root element would be:
/tmx/body/tu
This is because the languages are defined within the tuv element for each tu element.
- each language in your file can be added to this section by clicking on the Add button, defining the Language and specifying the XPath query that locates the actual translatable content:
In this example the translatable content is actually in the seg element, but the language defined in each case is controlled by an attribute in the tuv element. So the XPath expression needs to specifically point to where each language will be found:
tuv[@xml:lang='EN']/seg
The screenshot above shows multiple languages because the file is a multilingual TMX and this filetype can support the creation of a multilingual project from this single TMX file.
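To see how the language root and the per-language expressions work together, the sketch below evaluates them outside Trados Studio with Python's standard library. The TMX fragment is a hypothetical stand-in for the file described above, and the helper name segments is an assumption made for illustration. Note that ElementTree evaluates paths relative to the root element, so /tmx/body/tu is written as ./body/tu here.

```python
import xml.etree.ElementTree as ET

# Hypothetical TMX fragment; the header is reduced for brevity.
TMX = """<tmx version="1.4">
  <header/>
  <body>
    <tu>
      <tuv xml:lang="EN"><seg>Hello</seg></tuv>
      <tuv xml:lang="DE"><seg>Hallo</seg></tuv>
      <tuv xml:lang="FR"><seg>Bonjour</seg></tuv>
    </tu>
    <tu>
      <tuv xml:lang="EN"><seg>Goodbye</seg></tuv>
      <tuv xml:lang="DE"><seg>Auf Wiedersehen</seg></tuv>
    </tu>
  </body>
</tmx>"""

# The xml: prefix is predefined by the XML specification, but ElementTree
# needs it mapped explicitly when it is used inside a path.
NS = {"xml": "http://www.w3.org/XML/1998/namespace"}

root = ET.fromstring(TMX)

# Language root: one match per translation unit.
units = root.findall("./body/tu")


def segments(lang):
    """Return the seg text of every unit for one language (None if absent)."""
    path = f"tuv[@xml:lang='{lang}']/seg"
    return [tu.findtext(path, namespaces=NS) for tu in units]


for lang in ("EN", "DE", "FR"):
    print(lang, segments(lang))
```

In this sample the second unit has no French tuv, so its French entry comes back as None, which is the kind of partially translated content the filetype is meant to handle.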
A simpler example could be something like a basic bilingual XML file:
Here the settings would be like this:
The Language Root XPath would be:
/sitecore/item
The location would be the language element used in the file:
en and de respectively.
It's very important to make sure that the paths are correct. The main reason for the filetype not being associated with the file opened for translation is failing to accurately specify the XPath locations. If you have any questions on how to do this for your filetype, the recommendation is to ask for help in the appropriate forum.
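One way to catch a wrong path before opening the file in Trados Studio is to count what each expression matches. The sketch below does this for a hypothetical bilingual file shaped like the simpler example above; the sample content and the function name check_mapping are assumptions, and the language root /sitecore/item is written as ./item because ElementTree paths start at the root element.

```python
import xml.etree.ElementTree as ET

# Hypothetical bilingual file; the second item has an empty German element.
XML = """<sitecore>
  <item id="1"><en>Home</en><de>Startseite</de></item>
  <item id="2"><en>Contact us</en><de/></item>
</sitecore>"""


def check_mapping(xml_text, root_path, language_paths):
    """Count matches for the language root and for each language path."""
    root = ET.fromstring(xml_text)
    units = root.findall(root_path)
    report = {"units": len(units)}
    for lang, path in language_paths.items():
        # A language counts as present when its element exists in the unit.
        report[lang] = sum(1 for unit in units if unit.find(path) is not None)
    return report


languages = {"en": "en", "de": "de"}

# Correct language root: every count is non-zero.
report = check_mapping(XML, "./item", languages)
print(report)

# Mistyped language root: everything drops to zero.
bad = check_mapping(XML, "./items", languages)
print(bad)
```

A count of zero for the language root, or for any language, is a strong hint that the corresponding expression needs another look.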
XML files come in all shapes and sizes so some knowledge of how to use XPath is essential. If you are completely new to XPath you may also find these resources useful:
- XPath tutorial in W3Schools - https://www.w3schools.com/xml/xpath_intro.asp
- XML in a nutshell, chapter 9 XPath - https://docstore.mik.ua/orelly/xml/xmlnut/ch09_01.htm
- More regex? No, it's time for something completely different - https://multifarious.filkin.com/2013/07/30/xpath/
- X Files - ATA 56 - https://multifarious.filkin.com/2015/11/07/x-files-ata56/
The last two, at least, relate specifically to using Trados Studio.
Embedded Content
This section is the same as for any embedded content processing in Trados Studio. An important point is that the XLIFF filetypes supported by Trados Studio do not let you use another filetype (HTML, for example) as an embedded content processor. So this is a big advantage, especially when handling XLIFF files that make heavy use of large CDATA sections in single translation units.
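To illustrate why a second processor matters, the sketch below shows what a CDATA section looks like after XML parsing: the embedded HTML arrives as one plain string, and only a second pass with an HTML parser separates markup from translatable text. This is an illustration of the general idea in Python, not how Trados Studio implements it; the XLIFF fragment is hypothetical (the namespace is omitted for brevity) and the class name TextCollector is an assumption.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser

# Hypothetical XLIFF fragment with HTML stored inside a CDATA section.
XLIFF = """<xliff version="1.2">
  <file source-language="en" target-language="de" datatype="plaintext" original="demo">
    <body>
      <trans-unit id="1">
        <source><![CDATA[<p>Click <b>Save</b> to continue.</p><p>Then close the window.</p>]]></source>
      </trans-unit>
    </body>
  </file>
</xliff>"""


class TextCollector(HTMLParser):
    """Separate the embedded markup (tags) from the translatable text."""

    def __init__(self):
        super().__init__()
        self.tags = []
        self.text = []

    def handle_starttag(self, tag, attrs):
        self.tags.append(tag)

    def handle_data(self, data):
        if data.strip():
            self.text.append(data)


root = ET.fromstring(XLIFF)

# First pass (XML): the CDATA content is just one string, tags included.
raw = root.findtext("./file/body/trans-unit/source")
print(raw)

# Second pass (HTML): tags and text are now told apart.
collector = TextCollector()
collector.feed(raw)
collector.close()
print(collector.tags)
print(collector.text)
```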
Placeholders
This section supports the ability to "tag" text as a placeholder in Trados Studio irrespective of whether you are also using an embedded content processor or not. This is an extremely useful feature because all other filetypes in Trados Studio that offer embedded content processing make you choose one approach or the other: either regex rules for everything, or another filetype such as HTML for the embedded content processing.
The filetype comes with four predefined rules that you can use if appropriate, or remove and just create your own rules. The process is simple, just click on Add:
Then write your regular expression, define the segmentation hint, and optionally add a description. A description is worth adding because rules can be imported and exported, which makes it easier to share them with others or to maintain your own library: import the library when setting up the filetype for a specific project, then remove the rules you don't need. The different options available to you are defined across the toolbar:
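Before adding a rule, it can help to try the regular expression against some sample text. The sketch below does that in Python with two made-up rules; they are not the four predefined rules that ship with the filetype, and the names RULES and find_placeholders are assumptions. Trados Studio uses .NET regular expressions, so more advanced patterns may behave differently there, but simple ones like these should match the same text.

```python
import re

# Hypothetical rules as (description, pattern) pairs.
RULES = [
    ("Numbered placeholder such as {0}", r"\{\d+\}"),
    ("printf-style placeholder such as %s or %d", r"%[sd]"),
]


def find_placeholders(text):
    """Return (start, matched text, description) for every match, in text order."""
    hits = []
    for description, pattern in RULES:
        for match in re.finditer(pattern, text):
            hits.append((match.start(), match.group(), description))
    return sorted(hits)


sample = "Welcome {0}, you have %d new messages."
for start, token, description in find_placeholders(sample):
    print(start, token, description)
```

Keeping the description next to each pattern mirrors the advice above: a rule that explains itself is much easier to reuse once it has been exported and imported elsewhere.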
Entities
This section provides entity support that works the same way as it does for any filetype in Trados Studio that supports them. See the product help for more details.
QuickInsert
This section provides support for QuickInserts that works the same way as it does for any filetype in Trados Studio that supports them. See the product help for more details.