Custom bilingual XML with HTML embedded content

Hi Community,


I received a custom XML file with embedded HTML content. It contains both the source and the target text.

Screenshot of XML code with HTML tags such as 'span' and 'li' not being processed correctly in Trados Studio.

Screenshot showing XML content with embedded HTML not recognized by Trados Studio, displaying tags like 'span' and 'li' as plain text.

I tried creating a custom file type in Studio, but even though I selected the 'HTML embedded content processor', HTML tags such as <span> and <li> are not picked up and processed by Studio.

Also, both the source content and the translation should be imported, so that the translation can be reviewed in Studio. I am not quite sure where to start.

Please find below a sample file and the custom file type, as far as I got:

XML file.zip

Could you please point me in the right direction?

Thank you!!
Greta



[edited by: Trados AI at 4:46 AM (GMT 0) on 5 Mar 2024]
  • Hi,

    I don't know whether my approach is universally applicable, but here is how I do it.

    I usually apply this to short texts from a CMS. In this CMS, each field (page title, meta description ...) will be in a separate XML element, so I know that a certain element in the source language corresponds to a certain element in the target language.

    Example

    Field "page title":

    EN: "Our best product"

    DE: "Unser bestes Produkt"

    Using regular expressions and a program called PowerGREP (any scripting language would work as well), I split the bilingual file into one source-language file and one target-language file per field, so I end up with a list of files like:

    Folder "EN":

    001-soandsopage-pagetitle.xml
    002-soandsopage-metadescription.xml

    ....

    Folder "DE":

    001-soandsopage-pagetitle.xml
    002-soandsopage-metadescription.xml

    ....
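For anyone who prefers a script to PowerGREP, the split described above could be sketched roughly as follows in Python. This is only a minimal sketch under assumptions: the element names `page`, `field`, `source` and `target`, the `id` and `name` attributes, and the second sample field's text are all made up for illustration and will differ in a real CMS export. It also assumes the field content is plain text, CDATA, or escaped HTML rather than nested HTML elements.

```python
import xml.etree.ElementTree as ET

# Hypothetical bilingual export; the real CMS structure will look different.
SAMPLE = """<page id="soandsopage">
  <field name="pagetitle">
    <source>Our best product</source>
    <target>Unser bestes Produkt</target>
  </field>
  <field name="metadescription">
    <source>Short description</source>
    <target>Kurze Beschreibung</target>
  </field>
</page>"""


def split_bilingual(xml_text, src_lang="EN", tgt_lang="DE"):
    """Return {language folder: {file name: XML content}}, one file per field."""
    root = ET.fromstring(xml_text)
    page = root.get("id", "page")
    files = {src_lang: {}, tgt_lang: {}}
    for number, field in enumerate(root.iter("field"), start=1):
        # Same file name in both folders, so the pairs can be aligned later.
        file_name = f"{number:03d}-{page}-{field.get('name')}.xml"
        for lang, tag in ((src_lang, "source"), (tgt_lang, "target")):
            elem = ET.Element("field", name=field.get("name"))
            # findtext() returns only the text of the child element.
            elem.text = field.findtext(tag, default="")
            files[lang][file_name] = ET.tostring(elem, encoding="unicode")
    return files


files = split_bilingual(SAMPLE)
for lang, contents in files.items():
    for file_name, xml in contents.items():
        print(f"{lang}/{file_name}: {xml}")
```

The function only builds the file contents in memory; writing each entry to its language folder is left out to keep the sketch short.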

    Using Okapi, I do a sentence alignment for each of these files. The outcome is a list of XLF files, one per file pair.

    I translate those files in Trados Studio.

    Then I use Okapi to convert the translated XLF files back to XML, merge the XML files into one and upload.

    That's the outline. Hope it helps.

    Daniel

    Do give the Multilingual XML filter a try. It has quite an ingenious way of handling bilingual alignment, which is more manual but also more robust than my method.

  • Thank you!

    An interesting approach, although I'd say it's even a bit more laborious than the one I have managed to come up with.

    My source file is a "lazy XLIFF": a bilingual XLIFF from WordPress, already pretranslated, with big chunks of code inside CDATA sections. The built-in Trados XLIFF file type delivered poor (paragraph-level) segmentation and some tag issues. The Multilingual XML file type delivers nice segmentation and tag handling, but has alignment issues. So, using the Multilingual XML file type, here's my workaround:

    1. Process source xliff/xml file [a].
    2. Copy the source text to target (all segments).
    3. Rename the source xliff/xml file [b] and process it again, but in reverse (mapping the XML target element as the source language).
    4. Export [a] and [b] as Word files (I use SDL XLIFF Converter).
    5. In Word, merge the exported [a] and [b] files by copying either the source or the target column of [b] into the target column of [a].
    6. Import the merged Word file back into Trados, into file [a].
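The column copy in step 5 boils down to pairing the rows of the two exports by position. A minimal Python sketch of that logic is below. It assumes the rows of each exported table have already been read into a list of dictionaries with the hypothetical keys `source` and `target`; reading and writing the Word files themselves is not shown. The sample text is illustrative only.

```python
def merge_columns(rows_a, rows_b, column="source"):
    """Copy one column of export [b] into the target column of export [a].

    Rows are paired by position, as when a whole table column is pasted in Word,
    so both exports must contain the same number of segments.
    """
    if len(rows_a) != len(rows_b):
        raise ValueError(
            f"Segment counts differ: {len(rows_a)} in [a], {len(rows_b)} in [b]"
        )
    # Keep everything from [a], replace only its target text.
    return [{**a, "target": b[column]} for a, b in zip(rows_a, rows_b)]


# [a]: source language processed normally, source copied to target (steps 1-2).
rows_a = [{"source": "Our best product", "target": "Our best product"}]
# [b]: processed in reverse, so its source column holds the translation (step 3).
rows_b = [{"source": "Unser bestes Produkt", "target": "Unser bestes Produkt"}]

merged = merge_columns(rows_a, rows_b)
print(merged)
```

The length check matters: if the two languages segment into a different number of rows, a positional copy shifts every following translation, so it is safer to stop and fix the mismatch by hand.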

    BEFORE:

    Screenshot of Trados Studio interface showing a segment comparison with a red box highlighting alignment issues in the target text.

    AFTER:

    Screenshot of Trados Studio interface after corrections, with a green box indicating proper alignment and segmentation in the target text.

    Hope it helps and please feel free to suggest any improvement ideas.
