Custom bilingual XML with HTML embedded content

Hi Community,


I received a custom XML file, with HTML embedded content, with both the source and the target content in it.

Screenshot of XML code with HTML tags such as 'span' and 'li' not being processed correctly in Trados Studio.

Screenshot showing XML content with embedded HTML not recognized by Trados Studio, displaying tags like 'span' and 'li' as plain text.

I tried creating a custom file type in Studio, but despite selecting the 'HTML embedded content processor', the HTML code such as <span> and <li> is not picked up and processed by Studio.

Also, both the source content and the translation should be imported, so that the translation can be reviewed in Studio. I am not quite sure where to start.

Please find below a sample file and the custom file type, as far as I got:

XML file.zip

Could you please point me in the right direction?

Thank you!!
Greta



  • Studio doesn't support bilingual XML, so you can only handle this with a monolingual XML filetype, as you have attempted to create. But that is of little use to you unless you can copy the source into the target and translate the target, then (perhaps) add a TM that contains the current translations. So if you have to use Studio for this, your course of action would be:

    1. create a TM from the existing file
    2. manipulate the XML to get the source into the target element

    To do the first, I would perhaps try this: extract the source with a custom XML filetype and export to Excel (appstore). Then extract the target (change the filetype settings to pull the target) and export to Excel. Merge the two Excel files into one and convert the Excel file to a TM (Glossary Converter perhaps... there are many ways to do this).
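For step 1, a short script could stand in for the two Excel exports. This is not the Studio route described above, only a minimal sketch: it assumes a hypothetical layout in which each `item` element holds one `source` and one `target` child (the real element names in the sample file are not shown in this thread), and it writes the pairs as a two-column CSV that can be opened in Excel and converted to a TM in the same way as the merged sheet.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical layout: the element names below are assumptions for illustration.
SAMPLE = """<items>
  <item id="1">
    <source>&lt;span&gt;Our best product&lt;/span&gt;</source>
    <target>&lt;span&gt;Unser bestes Produkt&lt;/span&gt;</target>
  </item>
  <item id="2">
    <source>Contact us</source>
    <target>Kontakt</target>
  </item>
</items>"""


def extract_pairs(xml_text, unit_tag="item", source_tag="source", target_tag="target"):
    """Return (source, target) text pairs, one per unit element."""
    root = ET.fromstring(xml_text)
    pairs = []
    for unit in root.iter(unit_tag):
        src = unit.findtext(source_tag, default="").strip()
        tgt = unit.findtext(target_tag, default="").strip()
        # Units without a translation add nothing to a TM, so skip them.
        if src and tgt:
            pairs.append((src, tgt))
    return pairs


def pairs_to_csv(pairs, source_lang="en-US", target_lang="de-DE"):
    """Serialise the pairs as CSV with a language-code header row."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow([source_lang, target_lang])
    writer.writerows(pairs)
    return buf.getvalue()


pairs = extract_pairs(SAMPLE)
csv_text = pairs_to_csv(pairs)
print(csv_text)
```

The embedded HTML comes through as literal tag text in both columns, so the TM entries would still carry the `span` markup unless it is handled separately.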

    Then, using a text editor that supports regex, replace the target content with the source. Then open the XML again, using the target extraction in your custom filetype (this will now contain the source content), pretranslate from your TM and review.

    Finally save the target file and you have a reviewed target.
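The "replace the target content with the source" step can also be scripted with the same kind of regex a text editor would use. The sketch below assumes hypothetical element names `source` and `target`, with no attributes, and with each `target` following its `source` directly; the pattern would need adjusting to the real file, and self-closing `<target/>` elements are not handled.

```python
import re

# Assumed layout: <source>...</source> immediately followed by <target>...</target>.
PAIR = re.compile(
    r"(<source>(?P<src>.*?)</source>\s*<target>).*?(</target>)",
    re.DOTALL,
)


def copy_source_to_target(xml_text):
    """Overwrite each target element's content with its source content."""
    # A function is used as the replacement so that backslashes in the
    # source text are never interpreted as regex escapes.
    return PAIR.sub(lambda m: m.group(1) + m.group("src") + m.group(3), xml_text)


sample = (
    "<item><source>Our best product</source>\n"
    "<target>Unser bestes Produkt</target></item>"
)
result = copy_source_to_target(sample)
print(result)
```

Running this on a copy of the file keeps the original translations available for building the TM.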

    A bit messy... if your client has a developer I'd recommend creating a bilingual XML filetype to handle this, especially if they get a lot of these files.  The SDK contains an example of how to do this and I know several companies who have done it, but none of them have put their solutions on the appstore I'm afraid.

    Maybe someone has a better idea... that would be my approach if I was unable to develop a filetype for this.  Sometimes I think the cost of employing a developer to create filetypes when you need them would pay for itself over and over again.  Studio is great for custom solutions if you have a developer!

    Paul Filkin | RWS Group

    ________________________
    Design your own training!

    You've done the courses and still need to go a little further, or still not clear? 
    Tell us what you need in our Community Solutions Hub

  • Thank you Paul for your help, I understand it now.
    Let's see what option the client chooses.

    One last question please:

    Do you think Passolo could handle multilingual XMLs? I understand it handles multilingual Excel files.
    Maybe then we could prepare the SDLXLIFF in Passolo? 

    Thank you,
    Greta

  • I heard that Passolo (which I don't use) can handle bilingual XMLs. I developed a method to work with bilingual XMLs in Studio, with segmentation and automatic alignment. Works well if there is sufficient segment parity. I can give you more info if you are interested.

    Daniel

    EDIT:

    My approach would produce this:

    Screenshot of Trados Studio showing a bilingual XML editing method with side-by-side comparison of source and target segments, including segmentation and automatic alignment indicators.

  • Is that a conversion to XLF?  Would indeed work well with this xml file.

    Paul Filkin | RWS Group


  • Yes, I am converting to XLF.

  • Hi there,

    What about the "Bilingual XML File Type 1.0.0.0"? I use it from time to time. I know it is not supported and it has disappeared from the Internet, but it would be great for Studio.

  • This new filetype for 2021 SR2 may be a good solution for you:

    https://community.rws.com/product-groups/translationproductivity/w/customer-experience/6039/multilingual-xml-filetype

    Paul Filkin | RWS Group


  • Hi, I'm facing the same issue and would be interested in your approach to automatic alignment. Would you mind sharing it?

    Thanks in advance!

  • Hi,

    I don't know whether my approach is universally applicable, but here is how I do it.

    I usually apply this to short texts from a CMS. In this CMS, each field (page title, meta description ...) will be in a separate XML element, so I know that a certain element in the source language corresponds to a certain element in the target language.

    Example

    Field "page title":

    EN: "Our best product"

    DE: "Unser bestes Produkt"

    Using Regex and a program called PowerGREP (you can do this with any scripting language as well) I split the bilingual file into a source and a target language file per field, so I'd come out with a list of files like:

    Folder "EN":

    001-soandsopage-pagetitle.xml
    002-soandsopage-metadescription.xml

    ....

    Folder "DE":

    001-soandsopage-pagetitle.xml
    002-soandsopage-metadescription.xml

    ....
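The split described above can be done with any scripting language. As a minimal sketch in Python, assuming a hypothetical CMS export in which a `page` element contains `field` elements with one `en` and one `de` child each (the real structure and the PowerGREP patterns are not shown in this thread), and using a made-up `text` wrapper element for the output files:

```python
import tempfile
import xml.etree.ElementTree as ET
from pathlib import Path

# Hypothetical bilingual export; element and attribute names are assumptions.
SAMPLE = """<page name="soandsopage">
  <field name="pagetitle"><en>Our best product</en><de>Unser bestes Produkt</de></field>
  <field name="metadescription"><en>Buy it now.</en><de>Jetzt kaufen.</de></field>
</page>"""


def split_by_field(xml_text, out_dir, langs=("en", "de")):
    """Write one small XML file per field and language, e.g. EN/001-page-field.xml."""
    root = ET.fromstring(xml_text)
    page = root.get("name", "page")
    written = []
    for number, field in enumerate(root.iter("field"), start=1):
        file_name = f"{number:03d}-{page}-{field.get('name')}.xml"
        for lang in langs:
            folder = Path(out_dir) / lang.upper()
            folder.mkdir(parents=True, exist_ok=True)
            # Wrap the field text in a single element so each file is valid XML.
            text_el = ET.Element("text")
            text_el.text = field.findtext(lang, default="")
            path = folder / file_name
            path.write_text(ET.tostring(text_el, encoding="unicode"), encoding="utf-8")
            written.append(path)
    return written


out_dir = tempfile.mkdtemp()
written = split_by_field(SAMPLE, out_dir)
for path in written:
    print(path.relative_to(out_dir))
```

Identical file names in the two folders are what make the later pairing of each source file with its target file straightforward.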

    Using Okapi, I do a sentence alignment for each of these files. The outcome is a list of XLF files, one per file pair.

    Those files I translate in Trados Studio.

    Then I use Okapi to convert the translated XLF files back to XML, merge the XML files into one and upload.

    That's the outline. Hope it helps.

    Daniel

    Do give the Multilingual XML filter a try, it has quite an ingenious way of handling bilingual alignment which is more manual, but also more robust than my method.

  • Thank you!

    An interesting approach, although I'd say it's even a bit more laborious than the one I have managed to come up with.

    My source file is a "lazy XLIFF" (a bilingual XLIFF from WordPress, already pretranslated, with big chunks of code inside CDATA sections). The built-in Trados XLIFF filetype delivered bad segmentation (paragraph-level) and some tag issues. The Multilingual XML filetype delivers nice segmentation and tag handling, but has alignment issues. So, using the Multilingual XML filetype, here's my workaround:

    1. Process source xliff/xml file [a].
    2. Copy text to target (all segments).
    3. Rename source xliff/xml file [b] and process it again, but in reverse (mapping the XML target element as the source language).
    4. Export [a] & [b] as Word (I use SDL XLIFF Converter).
    5. In Word, merge the exported [a]+[b] files (by copying either the source or the target column of [b] to the target column of [a]).
    6. Import the merged Word file back into Trados, into file [a].

    BEFORE:

    Screenshot of Trados Studio interface showing a segment comparison with a red box highlighting alignment issues in the target text.

    AFTER:

    Screenshot of Trados Studio interface after corrections, with a green box indicating proper alignment and segmentation in the target text.

    Hope it helps and please feel free to suggest any improvement ideas.
