How to avoid translating metadata? (converted an XML file to Word)

Hi everyone,

I'm working with an XML file that I need to export to bilingual Word so that two translators can work on it concurrently. This is required because my client doesn't own Trados and would like to comment in Word. Once everything is finalized, we can simply accept all changes and convert it back to XML for delivery. The client will then upload the file to their design platform, Xyleme, for further processing and output.

Before the project started, they asked us to provide a mock XML with pseudo-translation for testing. So I used the pseudo-translation function in Trados to auto-fill the text, and returned the mock XML. However, they told me that the instructional metadata has been tampered with (probably caused by the pseudo-translation) and suggested that I change the parser settings. I have no idea what these are or where to find them. Is this a setting I can apply during the XML to Word conversion? Or how can I avoid translating the metadata by accident? I assume Trados is smart enough to leave out the metadata but seems like it's not. Hope someone can shed some light and thanks in advance!

Best Regards,

L

  • I assume Trados is smart enough to leave out the metadata but seems like it's not.

    Trados is only as smart as its user... so you need to customise the parser rules accordingly.  XML is very flexible and Trados provides you with the tools to do this.  I suggest you start here:

    https://multifarious.filkin.com/2014/06/01/custom-xml/

    And if you still need help you will have to share the xml file, or part of it, for anyone to be able to help you.

    Paul Filkin | RWS Group

    ________________________
    Design your own training!

    You've done the courses and still need to go a little further, or still not clear? 
    Tell us what you need in our Community Solutions Hub

  • Hi Paul,

    Thanks for sharing the link. It's very helpful, but I'm not sure I can learn how to create filters in such a short time frame. If I can't, is there any other workaround? For example, will I be able to tell from the converted Word file which parts are instructional metadata so that I can leave them in their original form? Or might Trados extract certain words within a string of metadata as a segment, which would make that impossible?

    I can't share the XML on the forum as I've signed an NDA, but if anyone is kind enough to look into this with me, I may be able to share part of it in a secure way.

  • I can't share the XML on the forum as I've signed an NDA, but if anyone is kind enough to look into this with me, I may be able to share part of it in a secure way.

    All you need to do is pseudo-translate the XML and save the target.  Once you have done this surely you can anonymise anything left in the header that you think is important to hide and share that?

    For example, will I be able to tell from the converted Word file which parts are instructional metadata so that I can leave them in their original form?

    Frankly this is a ridiculous way to work. If you need to handle XML for a customer then you should learn how to do it properly. It's probably not difficult and if you were to take the time to prepare a sample of the structure we could help you.  We're not interested in the translatable content at all... only the structure of the file.

    Paul Filkin | RWS Group

  • Hi Paul,

    Totally agree. I should just get this over with and try to learn the filters. Below please find the XML with pseudo translation applied so you can have a look at the structure.

    XML download link: https://www.dropbox.com/s/4wagoaujdj37ysj/Test%20file.xml?dl=0

    I was able to locate the filter settings and understand that I can change the properties to "not translatable", but I'm not sure which filter will fix the problems and whether I need to change the other properties as well. The client listed the issues below and asked me to try changing the parser settings. Hope you can give me a few pointers:

    - There are some instructional metadata parts that were translated that should not have been. It should be related to the filter metadata

    - Text within graphics was not translated

    - Another area of concern is the footnote numbers.  Where several footnotes are identified in the same area, the spaces between the footnote reference numbers are missing. The spaces were there in the English source doc, so they are being removed somewhere during translation.

    Many thanks!

  • OK - the file is pretty basic so it should be easy.  My only problem here is that I can't tell what's metadata and what's not because I can't read Japanese... and even if I could, now that it's pseudo-translated it makes no sense at all ;-)

    What you need to do is open the XML file in an editor and see where the metadata is held.  For example, is it held in any of these elements?  It's not obvious from the description, and the default XML filetype only extracts text from the elements.  So which of these contains the metadata you wish to exclude?

    AutoNumberToken
    BoundedText
    Cell
    Chapter
    Classification
    content
    CopyrightBlock
    CopyrightOwner
    CopyrightRestrictions
    CoverPage
    Credits
    CustomNote
    Definition
    DesignData
    Distractor
    Emph
    EndTextWrap
    Entry
    Figure
    FilterMetadata
    Footnote
    FrontMatter
    Glossary
    GlossaryItem
    Href
    IA
    ID
    InLineVariableText
    IntroBlock
    Italic
    Item
    ItemBlock
    ItemPara
    Lesson
    LifeCycle
    List
    ListPreamble
    LOM
    MediaObject
    Module
    Modules
    MultipleChoice
    name
    Notice
    Option
    OverlayObject
    OverlayObjects
    OverlayType
    ParaBlock
    props
    QuestionBlock
    QuestionStem
    Renditions
    RichText
    Rights
    sdoc
    sdoc-backup
    SimpleBlock
    SubList
    SubTitle
    Table
    TableRow
    TargetAudience
    TargetAudiences
    Taxon
    TaxonPath
    TblBody
    TblCol
    TblFooter
    TblGroup
    TblHeader
    TblTitle
    Term
    Title
    TitledBlock
    Topic
    Underline
    Version
    Web

    Once you know this the process is really simple because you'll just make the appropriate elements non-translatable and this will exclude them.
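
As an illustration of the inspection step described above, a short script can list which element names actually carry text, which narrows down the candidates for non-translatable rules. This is only a minimal sketch using Python's standard `xml.etree.ElementTree`. The sample fragment is hypothetical: the element names come from the list above, but the nesting and values are invented, and the script says nothing about how Trados itself parses the file.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Hypothetical sample: element names are taken from the list above,
# but the nesting and the values are invented for illustration.
SAMPLE = """<Lesson>
  <FilterMetadata><TargetAudience>Internal</TargetAudience></FilterMetadata>
  <Title>Risk basics</Title>
  <ParaBlock><RichText>Read this <Emph>carefully</Emph>.</RichText></ParaBlock>
</Lesson>"""


def elements_with_text(xml_string):
    """Count, per element name, how many instances hold non-whitespace text."""
    root = ET.fromstring(xml_string)
    counts = Counter()
    for elem in root.iter():
        # Text directly inside the element, before its first child.
        parts = [elem.text or ""]
        # Text that follows each child still belongs to this element.
        parts += [child.tail or "" for child in elem]
        if any(part.strip() for part in parts):
            counts[elem.tag] += 1
    return counts


for name, count in sorted(elements_with_text(SAMPLE).items()):
    print(name, count)
# prints:
# Emph 1
# RichText 1
# TargetAudience 1
# Title 1
```

In this invented sample the metadata text sits in `TargetAudience`, so that is the element a non-translatable rule would target. For a real file, `ET.parse(path).getroot()` can replace `ET.fromstring`.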

    Paul Filkin | RWS Group

  • Hi Paul,

    Sorry for the late reply; I've been testing the settings with the client's tech team. We have resolved some of the issues but there's one left: the text laid on top of graphics is not translated (image overlays?). They suggested that there are several pieces of metadata which should not be touched and asked us to change 4 filters, which we have tried, but the problem still exists. Would you mind taking a look at the files to see if you have any suggestions?

    These are the 4 files I'm sharing:

    Download link: 

    1) Source XML

    2) Test XML, this was exported after we changed the filter and with pseudo translation applied

    3) Image overlays: the client took a screenshot of the metadata which should not be translated

    4) Settings change: the filters that the client asked us to change

    Many thanks!

    Louis

  • This isn't particularly helpful. If you provided the settings file it would have been better because I have no idea at all what you have done.  The screenshot of what should not be translated would never be translated by default because they are all attributes so you must have actually created a filetype and told Studio to translate them.

    The screenshot of metadata that should not be translated is similarly useless.  BoundedText for example does not contain any text at all, so that rule isn't even doing anything.

    Perhaps we can take a quick call because I feel as though I'm wasting my time here when we could probably resolve this quickly if I can speak to you and see what you are trying to achieve?

    Paul Filkin | RWS Group

  • I wish I could do a quick phone call but I'm actually living in Hong Kong. Plus I'm actually asking on behalf of my translator, as he is the one who's managing the files. (I was pretending to be the translator earlier as his English level is limited. So he has Trados but I don't, and I'm helping him to ask. My apologies.) Please bear with me so I can explain it again. If you still don't get it, we can set up a web call to discuss. We really appreciate your help.

    Please find attached the settings file (hope it's right). We have only set some of the parser rules to "untranslatable" in an attempt to resolve the issues we mentioned. It actually works except for one last problem:

    As mentioned previously, my client will import our finished XML file to their design platform (Xyleme) for output, so we applied pseudo-translation and sent them a mock XML to try. They ran it on Xyleme and exported to Word, but the text in the figures (or tables) was not translated (see the attached screenshot, or p. 29 of the Word file). It should have been overwritten by pseudo-translation if Trados had been able to detect it.

    I have a feeling that it may have nothing to do with the parser settings. And my translator has only created the project with the default settings. Is there something that we forgot to click so that it also selects the text in graphs/tables for translation?

    7318.Files.zip

    Screenshot of a Word document with a table showing untranslatable text in figures, indicating an issue with Trados Studio not detecting text for pseudo translation.

  • ok - we're getting closer.  That table seems to be in here for example:

    Screenshot of Trados Studio error message indicating 'Not translatable' for the BoundedText element.

    Note that the translatable text is inside the text attribute of the BoundedText element.  Currently you have excluded this element and have no rule for the attribute:

    Screenshot showing the BoundedText element set to 'Not translatable' in Trados Studio.

    So I changed it as follows:

    Screenshot showing the BoundedText element set to 'Always translatable' in Trados Studio.

    In the rule it's this:

    Screenshot of Trados Studio 'Edit Rule' dialog box with BoundedText element and text attribute set to 'Always translatable'.

    This will also pick up this for example... there are more:

    Screenshot of a diagram with text in Japanese related to risk assessment in KYCCDD sanctions due diligence.

    Presumably this should also be translated?

    Hopefully this will help you understand the difference between elements and attributes at least, so you can probably fix anything else like this.
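
To make the element/attribute distinction concrete, here is a minimal sketch using Python's standard `xml.etree.ElementTree`. The fragment is hypothetical: only the `BoundedText` element and its `text` attribute come from this thread, while the `Figure` wrapper, the `width` attribute, and the values are invented. It is an illustration of the XML concept, not of how Studio implements its parser rules.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment: only BoundedText and its text attribute are taken
# from the thread; the wrapper, the width attribute and the values are invented.
SAMPLE = """<Figure>
  <BoundedText text="Customer risk rating" width="120"/>
  <BoundedText text="Sanctions screening" width="90"/>
</Figure>"""


def attribute_values(xml_string, element="BoundedText", attribute="text"):
    """Return the value of one attribute for every matching element."""
    root = ET.fromstring(xml_string)
    values = []
    for elem in root.iter(element):
        value = elem.get(attribute)
        if value is not None:
            values.append(value)
    return values


def element_texts(xml_string, element="BoundedText"):
    """Return the text content held between the element's tags."""
    root = ET.fromstring(xml_string)
    return [(elem.text or "").strip() for elem in root.iter(element)]


# The words live in the attribute...
print(attribute_values(SAMPLE))  # ['Customer risk rating', 'Sanctions screening']
# ...while the element itself holds no text at all.
print(element_texts(SAMPLE))     # ['', '']
```

This is why a rule on the element alone finds nothing to extract here: the words are only reachable through a rule that names the attribute, written in XPath as `//BoundedText/@text`.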

    Paul Filkin | RWS Group

  • We tried your suggestion but were not able to locate the figure text in the exported Word file. Would you mind checking our settings file to see if we have done anything wrong? I have also attached our new XML and Word files for your reference. Many thanks!

    New.zip

  • You have not applied the rule, you simply made the BoundedText translatable.  I told you to do this:

    Screenshot showing Trados Studio rule configuration with 'BoundedText' set to 'Always translatable' but missing attribute extraction.

    This is what you did:

    Screenshot displaying Trados Studio rule setup incorrectly with 'BoundedText' marked as 'Always translatable' without attribute extraction.

    You have not extracted the attribute at all.  Here's what you need to do:

    Note: I made a small error in describing why you should name the filetype.  You also need to name the File type identifier which you can't change now... you need to do this when you create the filetype in the first place.  If you start to work with XML more you'll be glad you have the ability to do this because it can be very confusing when you can't tell one filetype from another.

    Paul Filkin | RWS Group

  • This is very helpful, Paul, and we really appreciate it. My translator simply missed the attribute part and only changed the filter to translatable. Now it all makes sense and I'll ask him to try it tomorrow morning.

    At the end of the video, I saw that you typed "tag" under the advanced filter to show all the bounded text. Can you explain that part please? (Like why is it under "tag"?) I was wondering if there's a way to export that list of bounded text, as it also shows which segment each one is under. This is because my translators have already started translating with the exported Word file, where the bounded texts are not captured. So I want to see if there's an easier way for them to catch up than sending them the new Word files and asking them to copy and paste their translations and double-check line by line. The only thing I can think of is for them to create a TM from what they have done so far, and then use it on the new Word file to reduce their manual work.

    Thanks again.

  • At the end of the video, I saw that you typed "tag" under the advanced filter to show all the bounded text. Can you explain that part please? (Like why is it under "tag"?) I was wondering if there's a way to export that list of bounded text, as it also shows which segment each one is under.

    I typed it into the DSI (Document Structure Information) box. This is because the document structure for all extracted attributes contains the word "tag". You can see this in the column on the right in the editor:

    Trados Studio Document Structure Information dialog box showing 'Tag' typed in the Code field with bounded text information in the content area.

    I just thought I'd find it faster that way... nothing you need to do.

    If you want to export the filtered selection just click on "generate":

    Trados Studio Advanced Display Filter with 'Tag' typed in the DSI Information box and the 'Generate' button highlighted.

    That will get you an SDLXLIFF with only these segments.  Now you could, if you wanted, open the SDLXLIFF in Studio and export to Bilingual Word, or to Excel... whatever you needed.

    Paul Filkin | RWS Group

  • Sorry to bother you again, Paul. My translator followed your steps to change the attribute but it didn't work; we couldn't find the bounded text in our exported file or in Trados. I've helped him check the filter and it's showing as boundedtext/@text, so I trust that he did it right this time.

    Would you mind checking our settings file again to see what went wrong? I have also attached the exported XML and Word files for your reference.

    Latest try.zip

  • My translator also gave me the list of filters that we had set to untranslatable before we tried the BoundedText rule. Not sure if it matters:

    DesignData

    TargetAudience

    TargetAudiences

    FilterMetadata

    Taxon

    TaxonPath

    CopyrightOwner

    Rights

    LOM

    OverlayType

    OverlayObject

    OverlayObjects

  • My translator followed your steps to change the attribute but it didn't work; we couldn't find the bounded text in our exported file or in Trados. I've helped him check the filter and it's showing as boundedtext/@text, so I trust that he did it right this time.

    Your settings file looks fine.  Did you recreate the project with this new settings file? Maybe you simply changed the settings file in the existing project, and in this case you're wasting your time as the project has already been created with the old one.
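
One way to check outside Studio whether a re-created project picked up the attribute rule is to compare the source XML with the pseudo-translated target and list the attribute values that came back unchanged. The following is a minimal sketch using Python's standard `xml.etree.ElementTree`. The fragments, the `Figure` wrapper, and the function name are hypothetical; only `BoundedText` and its `text` attribute come from this thread. It assumes pseudo-translation alters every string it extracts and that the export keeps the element order intact.

```python
import xml.etree.ElementTree as ET

# Hypothetical source and target fragments. Only BoundedText and its text
# attribute come from the thread; the wrapper and the values are invented.
SOURCE = """<Figure>
  <BoundedText text="Customer risk rating"/>
  <BoundedText text="Sanctions screening"/>
</Figure>"""

TARGET = """<Figure>
  <BoundedText text="[Cuustoomeer riisk raatiing]"/>
  <BoundedText text="Sanctions screening"/>
</Figure>"""


def untouched_values(source_xml, target_xml,
                     element="BoundedText", attribute="text"):
    """List attribute values that are identical in source and target.

    Elements are paired by document order, which assumes the export
    left the structure of the file unchanged.
    """
    source_values = [e.get(attribute)
                     for e in ET.fromstring(source_xml).iter(element)]
    target_values = [e.get(attribute)
                     for e in ET.fromstring(target_xml).iter(element)]
    return [
        s for s, t in zip(source_values, target_values)
        if s is not None and s == t
    ]


print(untouched_values(SOURCE, TARGET))  # ['Sanctions screening']
```

Under those assumptions, an empty list would suggest the attribute was extracted everywhere, while a list containing every value would be consistent with a project still running on the old settings.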

    My translator also gave me the list of filters that we had set to untranslatable before we tried the BoundedText rule. Not sure if it matters:

    I suggest you speak to someone like who "might" be happy to agree a fair price with you to produce the filetype you are looking for.  I have the impression that you are using a very inexperienced Trados user, and certainly one who doesn't understand enough about how to work with XML to tackle your project professionally.  You would save yourself a lot of problems if you hire the right resource to help you.

    Paul Filkin | RWS Group