Auto Segmentation after a Specific String of Characters - Trados Studio 2021

Hi,

I regularly have to translate files which have been imported from another translation software and feature strings of random characters within segments, e.g. </mt:t><mt:p/><mt:b><mt:t>.

I want to make it so that Trados automatically starts a new segment after these characters but cannot figure out how to do it.  I think that they represent the presence of bullet points in the original software but this has not transferred well into Trados.  I cannot delete them because I need to be able to transfer the translation back into the original software at the end.

Is anyone able to tell me how to make Trados automatically start a new segment after these characters?  It would make my life so much easier if someone was able to show me how and I would be eternally grateful!  The files are imported as .xml.sdlxliff files.

Thank you in advance.

  • Hi Daniel,

    I receive the files as .mdb files and use the company's translation tool to convert them to .xml files.  I then import the .xml files into Trados.

    I hope this helps.

    Rachel

  • I then import the .xml files into Trados.

    Perfect.  In this case you should create a custom XML file type for your XML files and create rules to handle the tags the way you'd like them handled.  If you can provide a sample of the XML file, it'll be easier to explain what you need to do.

    Paul Filkin | RWS Group

    ________________________
    Design your own training!

    You've done the courses and still need to go a little further, or still not clear? 
    Tell us what you need in our Community Solutions Hub

  • I'm not sure I understand what you mean by a sample of the XML file.  I can't send you the file itself as that would breach confidentiality.  Do you mean an excerpt from the text?

  • I'm not sure I understand what you mean by a sample of the XML file.

    This would be a small sample of a file:

    <?xml version="1.0" encoding="UTF-8"?>
    <rootelement>
      <story_one>
        <title>This is a confidential title</title>
        <description status='top secret'>The topic in this element must not be shared with anyone as it’s highly confidential</description>
      </story_one>
    </rootelement>

    Do you mean an excerpt from the text?

    No I mean a sample of the file so I can see the structure.  A small snippet as I showed above would be fine... but just anonymise the text:

    <?xml version="1.0" encoding="UTF-8"?>
    <rootelement>
      <story_one>
        <title>A poem</title>
        <description status='top secret'>Mary had a little lamb its fleece was white as snow</description>
      </story_one>
    </rootelement>

    I'm not interested in the text in the file at all.  Just the structure of the file.  So open it with a text editor, save it as a new file so you have a copy, then delete all the lines apart from a representative sample of what the structure looks like.  Finally, just anonymise the text in the few lines you are providing.

    Paul Filkin | RWS Group


  • Hi,

    Thank you for talking me through that!  Is this enough of a sample?

    Screenshot of Trados Studio showing a sample text with no visible errors or warnings.

  • OK... that helps.  Although if you'd actually given me text instead of an image it would have saved me some time retyping it all.  I used this in the end, which seems representative of your file:

    <?xml version="1.0" encoding="UTF-8"?>
    <trados_mediando>
      <job_id>323462</job_id>
      <texts>
        <text id="21941">
          <master_textblock>&lt;mt:t&gt;Sample text sample text sample text. Sample text sample text sample text. Sample text sample text sample text. &lt;/mt:t&gt;</master_textblock>
          <textblock>&lt;mt:t&gt;Sample text sample text sample text. Sample text sample text sample text. Sample text sample text sample text. &lt;/mt:t&gt;</textblock>
          <original_textblock>&lt;mt:t&gt;Sample text sample text sample text. Sample text sample text sample text. Sample text sample text sample text. &lt;/mt:t&gt;</original_textblock>
        </text>
        <text id="155769">
          <master_textblock>&lt;mt:t&gt;more sample text more sample text more sample text. more sample text more sample text more sample text. more sample text. &lt;/mt:t&gt;</master_textblock>
          <textblock>&lt;mt:t&gt;more sample text more sample text more sample text. more sample text more sample text more sample text. more sample text. &lt;/mt:t&gt;</textblock>
          <original_textblock>&lt;mt:t&gt;more sample text more sample text more sample text. more sample text more sample text more sample text. &lt;/mt:t&gt;</original_textblock>
        </text>
      </texts>
    </trados_mediando>

    I created a quick custom XML file type for this sample using these rules:

    Trados Studio options menu showing rules for a custom XML filetype.

    If I preview the file I see this, which I guess is what you are seeing:

    Preview of XML file in Trados Studio with sample text blocks displayed side by side.

    To solve it, I process the text parsed from each element as embedded content, using the HTML file type:

    Trados Studio embedded content processing options with HTML filetype selected.

    Now it looks like this:

    Preview of XML file in Trados Studio with text blocks correctly parsed and displayed.

    And to get rid of the orange tabs if they annoy you...

    Trados Studio options menu with annotations indicating steps to remove orange tabs in the preview.

    The tabs are there because you are now essentially parsing the content of every element twice: once with your XML file type, and then again with the HTML file type.  But now it would look like this:

    Trados Studio preview showing segments with orange warning tabs for potential issues.

    Segments 10, 11 and 12 don't break after the full stop in my example because I searched and replaced with a lowercase "more" to get additional text :-)  If I fix it, in case this is a concern for you:

    Trados Studio preview after processing, showing segments without orange warning tabs.
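
    The "parsing twice" idea above can be sketched outside Studio.  This is only a rough illustration in Python, not how Trados implements it: the sample string is hypothetical (built from the tag sequence quoted in the original question), and the function names and the simple tag pattern are assumptions that ignore tags with attributes.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical sample: the inner markup is entity-escaped, so an XML
# parser sees it as plain text rather than as tags.
SAMPLE = """<trados_mediando>
  <texts>
    <text id="21941">
      <textblock>&lt;mt:t&gt;First item. &lt;/mt:t&gt;&lt;mt:p/&gt;&lt;mt:b&gt;&lt;mt:t&gt;Second item. &lt;/mt:t&gt;</textblock>
    </text>
  </texts>
</trados_mediando>"""

# Assumed pattern for simple tags without attributes, e.g. <mt:t>, </mt:t>, <mt:p/>.
TAG = re.compile(r"</?[A-Za-z][\w:.-]*\s*/?>")


def first_pass(xml_text):
    """Pass 1: parse the outer XML and return the text of each <textblock>."""
    root = ET.fromstring(xml_text)
    return [el.text or "" for el in root.iter("textblock")]


def second_pass(unescaped_text):
    """Pass 2: treat the unescaped text as markup and split it at each tag."""
    pieces = TAG.split(unescaped_text)
    return [p.strip() for p in pieces if p.strip()]


for block in first_pass(SAMPLE):
    print(block)               # the tags show up as literal text after pass 1
    print(second_pass(block))  # translatable pieces once the tags are recognised
```

    After the first pass the tags are still ordinary characters inside the segment, which is the symptom described in the question.  Only the second pass recognises them as tags, and that is the job the embedded content processor does in the file type settings.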

    For more information on creating a custom XML filetype this article may help:

    https://multifarious.filkin.com/2014/06/01/custom-xml/

    I recommend you take some time to learn because if you want to handle files like this it makes sense to understand how to do it.

    Paul Filkin | RWS Group

  • Sorry, I didn't realise you meant that I should post the text file.  I'm glad you managed to create a suitable substitute.

    I have followed the instructions and created a custom XML file type.  However, I don't seem to be able to parse the content as you describe.  This may be because I am using a different version of Trados, so I have different options.  My screen looks like this:

    Trados Studio screenshot showing an 'Embedded content' error with details and a 'Processing errors' section indicating 'Document structure errors found'.

    Do you have any further suggestions?

    Thank you in advance.

  • Well... you have asked the tool to parse CDATA with the HTML processor.  But the file you showed us didn't contain any CDATA at all.  That's probably part of the problem.  If you can take one of your real files and provide it to me, I'll happily show you how to handle it off-forum.  In here I think we're playing a guessing game because you don't know enough to ensure you give us the right information to be able to help you, and when we do suggest something I don't think you really understand it.

    I don't mean to pick on you, but this is a technical task and not everyone has the knowledge to be able to perform it.  Very happy to help you learn though.

    So if you can find a way to do this please send me the xml file - pfilkin@sdl.com
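
    For anyone unsure which case applies to their own file, a quick check along these lines may help.  It is a minimal sketch, not part of Trados: the function name and both sample strings are hypothetical, and it only looks for a CDATA opener or an entity-escaped tag in the raw file text.

```python
import re

CDATA_OPEN = "<![CDATA["
# An entity-escaped tag such as &lt;mt:t&gt; or &lt;/mt:t&gt;
ESCAPED_TAG = re.compile(r"&lt;/?[A-Za-z]")


def embedded_markup_styles(raw_xml):
    """Return which styles of embedded markup appear in the raw file text."""
    styles = set()
    if CDATA_OPEN in raw_xml:
        styles.add("cdata")
    if ESCAPED_TAG.search(raw_xml):
        styles.add("escaped")
    return styles


# Hypothetical one-line samples of each style.
escaped_sample = "<textblock>&lt;mt:t&gt;Sample text. &lt;/mt:t&gt;</textblock>"
cdata_sample = "<textblock><![CDATA[<mt:t>Sample text. </mt:t>]]></textblock>"

print(embedded_markup_styles(escaped_sample))
print(embedded_markup_styles(cdata_sample))
```

    If the result contains only "escaped", as with the sample posted earlier in this thread, there is no CDATA in the file for a CDATA-only embedded content setting to pick up.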

    Paul Filkin | RWS Group


  • Hi Paul,

    I have sent an email to you.  I hope it gets to you and you are able to help me.

    Rachel

  • Thank you.  I sent you the settings file I explained above; it works with your file perfectly.

    Paul Filkin | RWS Group
