How to translate a multilingual XML file in Trados Studio 2017

 My client sent me an XML file with the following format:

  <Item key="[ezSearch] Search.Compare" parentKey="[ezSearch] Search">
    <en-GB><![CDATA[Add To Compare]]></en-GB>
    <nl-NL><![CDATA[]]></nl-NL>
  </Item>

Trados Studio should read the en-GB content and then save the translation to the nl-NL target, with the following result:

  <Item key="[ezSearch] Search.Compare" parentKey="[ezSearch] Search">
    <en-GB><![CDATA[Add To Compare]]></en-GB>
    <nl-NL><![CDATA[Toevoegen aan vergelijking]]></nl-NL>
  </Item>

I can't get Studio to do this at the moment. How can I set it up?

  • Apart from the "little detail" that Passolo costs a fortune :( so it's not a solution for a standard translator's one-off need, I'm also wondering whether it can really handle this particular stupidly structured XML, where the language code is used as the node name... (especially if the target node does not exist in the source XML and needs to be created).
    I used Passolo quite a long time ago, so I'm not sure... but I think that such a situation is a problem...

    For the same reason it also gets pretty complicated using XML DOM, at least with MSXML2, which I'm used to (because it's built into every Windows), because the basic XML operations simply do not support "copying/cloning an XML node under a different name" or changing a node name :-\.
    Though there are XML DOM implementations which do support these operations...

  • Agreed, Passolo is for engineers who work on software localization projects regularly, so the below is for information only.

    Yes, the source format does not look suitable for localization into multiple languages. With some regex search and replace, the source XML would need to be converted to look more like this; Passolo would then be able to parse it as multilingual XML.

    <?xml version="1.0" encoding="UTF-8"?>
    <items>
    <Item key="[ezSearch] Search.Compare" parentKey="[ezSearch] Search">
    <ttxxtt xml:lang="en-GB"><![CDATA[Add To Compare]]></ttxxtt>
    <ttxxtt xml:lang="nl-NL"><![CDATA[]]></ttxxtt>
    </Item>
    <Item key="[ezReplace] Replace.Compare" parentKey="[ezReplace] Replace">
    <ttxxtt xml:lang="en-GB"><![CDATA[Add To Replace]]></ttxxtt>
    <ttxxtt xml:lang="nl-NL"><![CDATA[]]></ttxxtt>
    </Item>
    </items>

    Multilingual XML needs to contain appropriate xml:lang attributes or similar for Passolo to understand where to write the translation, but if target entries are not in the source XML, Passolo will add them as needed. One note on element names for translatable content: if you use fairly unique names, you won't run into namespace issues.
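
    The regex conversion described above could be scripted roughly as follows. This is a minimal Python sketch, not Passolo functionality: the function names to_xml_lang and from_xml_lang are made up for illustration, the pattern only covers codes shaped like en-GB (two-letter language, two-letter region), and the reverse step assumes the client wants the original layout back after translation.

```python
import re

ELEMENT = "ttxxtt"  # placeholder element name taken from the sample above

# Opening or closing tags whose name looks like a language code (<en-GB>, </nl-NL>).
# Assumption: no such tag-like text occurs inside the CDATA content itself.
LANG_TAG = re.compile(r"<(/?)([a-z]{2}-[A-Z]{2})>")

# The converted form, used to restore the original layout afterwards.
WRAPPED = re.compile(
    rf'<{ELEMENT} xml:lang="([a-z]{{2}}-[A-Z]{{2}})">(.*?)</{ELEMENT}>',
    re.DOTALL,
)


def to_xml_lang(xml_text):
    """Rewrite <en-GB>...</en-GB> as <ttxxtt xml:lang="en-GB">...</ttxxtt>."""
    def repl(match):
        closing, lang = match.groups()
        return f"</{ELEMENT}>" if closing else f'<{ELEMENT} xml:lang="{lang}">'
    return LANG_TAG.sub(repl, xml_text)


def from_xml_lang(xml_text):
    """Reverse the conversion: the xml:lang value becomes the element name again."""
    return WRAPPED.sub(r"<\1>\2</\1>", xml_text)


if __name__ == "__main__":
    sample = "<en-GB><![CDATA[Add To Compare]]></en-GB>\n<nl-NL><![CDATA[]]></nl-NL>"
    print(to_xml_lang(sample))
```

    The XML declaration and a single root element, as in the sample above, would still have to be added by hand if the client's file lacks them.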

    Thanks,
    Naoko
  • I also think, after looking at the different solutions, that I'd just use regex as first suggested. It looks like a pretty straightforward structure for regex in a decent text editor, and then you just process the result as a monolingual XML in Studio.

    Search for this:
    (<en-GB>)(.+?)(</en-GB>.+?<nl-NL>).+?(</nl-NL>)

    Replace with this:
    $1$2$3$2$4

    And enable "dot matches line breaks" in your text editor. It might be more embarrassing than your XML DOM, Evzen, but for this file I'd use it.
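
    For anyone without a suitable text editor, the same search and replace can be run from a script. A minimal Python sketch, assuming the file looks like the sample in the question: re.DOTALL stands in for the "dot matches line breaks" option, \g<n> is Python's spelling of $n, and the function name copy_source_to_target is made up for illustration.

```python
import re

# Same pattern as in the post above; re.DOTALL lets "." match line breaks.
PATTERN = re.compile(
    r"(<en-GB>)(.+?)(</en-GB>.+?<nl-NL>).+?(</nl-NL>)",
    re.DOTALL,
)


def copy_source_to_target(xml_text):
    """Copy each en-GB value into the nl-NL element that follows it."""
    # Equivalent of the editor replacement $1$2$3$2$4.
    return PATTERN.sub(r"\g<1>\g<2>\g<3>\g<2>\g<4>", xml_text)


SAMPLE = """<Item key="[ezSearch] Search.Compare" parentKey="[ezSearch] Search">
  <en-GB><![CDATA[Add To Compare]]></en-GB>
  <nl-NL><![CDATA[]]></nl-NL>
</Item>"""

if __name__ == "__main__":
    print(copy_source_to_target(SAMPLE))
```

    Note that the pattern expects at least one character between <nl-NL> and </nl-NL> (here the empty CDATA section), so a completely empty target element would need a slightly different pattern.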

    Paul Filkin | RWS Group


  • As I said, regex processing works, I'm not saying it doesn't... for one-off processing of a particular fixed format (I mean the physical, 'visual' format) you can create regexes tailored to that specific situation.

    But you easily get into problems once you start aiming at generic, reusable scripts able to work with any physical representation the XML specification allows... e.g. that attribute values can be enclosed in either quotation marks or apostrophes (<foobar name="Johnny's bar" owner='Johnny "Party Animal" Booze'/>), that characters may be represented by numeric entities (in decimal or hexadecimal... not to mention possible embedded HTML and its countless named entities), that you might need to support any possible IETF language tag including all the optional subtags like script, region, variant, etc.... and so on.
    That's where you really DON'T want to start re-inventing the wheel (especially because it's fairly impossible using plain regexes) and would rather re-use what some other smart guys have already created - the XML DOM ;-).
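
    As an illustration of the DOM route, here is a minimal sketch using Python's built-in xml.dom.minidom rather than MSXML2 (a swapped-in implementation, chosen only because it ships with Python). It assumes a single root element around the Item nodes; the function name and its parameters are made up for illustration. The parser deals with either attribute quoting style, and a missing target node is simply created.

```python
from xml.dom import minidom


def copy_source_to_target(xml_text, source="en-GB", target="nl-NL"):
    """Copy the content of each source-language node into the target-language node."""
    doc = minidom.parseString(xml_text)
    for item in doc.getElementsByTagName("Item"):
        elements = [n for n in item.childNodes if n.nodeType == n.ELEMENT_NODE]
        sources = [e for e in elements if e.tagName == source]
        if not sources:
            continue  # nothing to copy for this item
        targets = [e for e in elements if e.tagName == target]
        if targets:
            tgt = targets[0]
            # Empty the existing target before filling it.
            while tgt.firstChild is not None:
                tgt.removeChild(tgt.firstChild)
        else:
            # Target node missing in the source XML: create it.
            tgt = doc.createElement(target)
            item.appendChild(tgt)
        # Deep-copy the source children (CDATA sections stay CDATA sections).
        for child in sources[0].childNodes:
            tgt.appendChild(child.cloneNode(True))
    return doc.toxml()


SAMPLE = """<items>
<Item key='[ezSearch] Search.Compare' parentKey="[ezSearch] Search">
  <en-GB><![CDATA[Add To Compare]]></en-GB>
  <nl-NL><![CDATA[]]></nl-NL>
</Item>
<Item key="[ezReplace] Replace.Compare" parentKey="[ezReplace] Replace">
  <en-GB><![CDATA[Add To Replace]]></en-GB>
</Item>
</items>"""

if __name__ == "__main__":
    print(copy_source_to_target(SAMPLE))
```

    The serializer normalizes details such as attribute quotes, so the output is equivalent XML but not necessarily byte-for-byte identical to the input.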

  • Unknown said:
    Yes the source format does not look suitable for localization into multiple languages.

    Exactly.
    Now, the most "amusing" part is that it looks like Sitecore XML intended for localization... how embarrassing (for the creator of this XML design, of course)!

    (If this particular XML does not come from Sitecore, then it doesn't really matter... the point is that Sitecore's XML has EXACTLY the same problem - the language code used as the node name)