When I pretranslate my Word file, all tab stops and soft carriage returns from the source are gone in the target. How can I tag them so that they remain as tags?

I added a Word file to Trados Studio. The file contains tab stops and soft carriage returns. When I pretranslate the file with the Google API plugin, all tabs and soft carriage returns are gone in the target. I know this has to do with Studio and not with the MT plugin.

I read a couple of discussions about tab stops not being shown in the target, but none of the links to solutions work anymore. They were removed from the Studio community.

I would like the tab stops and soft carriage returns to be turned into tags so that they appear in the target when I pretranslate (tags do remain in the target when connecting to the MT plugin).

  • Hello,

    Based on your query, it seems you want to ensure that tab stops and soft carriage returns in your Word file remain as tags when pretranslating in Trados Studio using the Google API plugin. Here's how you can achieve this:

    Step 1: Open Trados Studio and go to "File" > "Options" > "File Types".

    Step 2: Find and select "Microsoft Word 2007-2019" from the list.

    Step 3: Navigate to "Embedded content" and activate the option "Enable embedded content processing".

    Step 4: Click on "Add..." next to "Document structure information". Here, select the document structures that contain the soft return. This could be a heading, a paragraph, or any other structure, depending on your document.

    Step 5: Click on "Add..." next to "Tag definition rules". Here, select "Placeholder" under "Tag Type" and add "\n" (backslash n) under "Start Tag". This will ensure that soft returns are treated as tags.

    Please note that if you are using a custom project template, you need to change the settings detailed above in that template. You can edit it under "File" > "Setup" > "Project templates".

    I hope this helps! If you have any other questions, feel free to ask.

    Best regards,

    RWS Community AI

  • I am using Studio 2015, and its Word filter does not have an embedded content option.

  • Hello,

    I understand that you're using Trados Studio 2015 and you're unable to find the "Embedded content" option in the Word filter. In Trados Studio 2015, the process to ensure that tab stops and soft carriage returns remain as tags is slightly different. Here's how you can do it:

    Step 1: Open Trados Studio and go to "File" > "Options" > "File Types".

    Step 2: Find and select "Microsoft Word 2007-2019" from the list.

    Step 3: Navigate to "Common" > "Paragraph formatting" and check the box for "Preserve soft line breaks".

    Step 4: Under "Tag settings", ensure that "Display tags in text" is checked.

    Please note that these settings will apply to new projects. If you want to apply these settings to an existing project, you need to go to "Projects" > "Project Settings" > "File Types" and adjust the settings there.

    I hope this helps! If you have any other questions, feel free to ask.

    Best regards,

    RWS Community AI

  • Thanks. However, this option does not exist: in "Common" there is no "Paragraph formatting". These are the available settings:

    Screenshot of Trados Studio project settings showing a list of options under 'Common' with no 'paragraph formatting' option visible.

  • Hello,

    I apologize for the confusion. In Trados Studio 2015, the options to preserve soft line breaks or tab stops are not directly available in the Word file type settings. However, you can try to use a workaround by creating a custom segmentation rule in your Translation Memory settings. Here's how you can do it:

    Step 1: Open Trados Studio and go to "File" > "Options" > "Language Pairs" > "All Language Pairs" > "Common" > "Segmentation" > "Rules".

    Step 2: Click on "Add..." to create a new rule.

    Step 3: In the "Before break" field, enter "{Alphanumeric}" (without quotes). This will match any alphanumeric character.

    Step 4: In the "After break" field, enter "{Any}" (without quotes). This will match any character.

    Step 5: In the "Break" field, enter "\n" (without quotes) for soft line breaks or "\t" (without quotes) for tab stops.

    Step 6: Click "OK" to save the rule.

    Please note that this rule will apply to all new translation memories you create. If you want to apply this rule to an existing translation memory, you need to go to "Translation Memories" > "Translation Memory Settings" > "Language Pairs" > "All Language Pairs" > "Common" > "Segmentation" > "Rules" and adjust the settings there.

    I hope this helps! If you have any other questions, feel free to ask.

    Best regards,

    RWS Community AI

  • Thanks,

    I changed the settings in the TM, removed the file, and added it again. I prepared it, but nothing changed: the tabs and soft carriage returns are still not tagged.

  • Hi

    I noticed that when I look into the SDLXLIFF file, the tabs are there in the target. However, they are outside the <mrk> elements. When I then switch the display filter in Trados to "all content", I notice that the leading tabs and spaces that are not shown in Trados appear in the content that is not in a segment, that is, the content between segments. That makes sense given how it is stored in the SDLXLIFF. I still do not understand, though, why it is stored outside the elements and segments, or why the inline tabs and spaces are gone.
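To make the observation above concrete, here is a minimal Python sketch that lists whitespace sitting directly under a <target> element, outside any <mrk>. The XML fragment is a hypothetical, simplified stand-in: a real SDLXLIFF file uses namespaces and many more attributes, so the lookup would need adjusting before it could run on an actual file.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified fragment (assumption): real SDLXLIFF targets
# carry XML namespaces and extra attributes that are omitted here.
SAMPLE = (
    '<target>\t<mrk mtype="seg" mid="1">Access control system</mrk>'
    '\t<mrk mtype="seg" mid="2">Number of doors</mrk></target>'
)

def whitespace_outside_segments(target_xml):
    """Return whitespace-only text found under <target> but outside any <mrk>."""
    target = ET.fromstring(target_xml)
    found = []
    # Text before the first child element.
    if target.text and target.text.strip() == "":
        found.append(target.text)
    # Text that follows each child element (its "tail").
    for child in target:
        if child.tail and child.tail.strip() == "":
            found.append(child.tail)
    return found

outside = whitespace_outside_segments(SAMPLE)
print([repr(ws) for ws in outside])
```

In this toy fragment both tabs are reported as lying between the segments, which is the situation described above: the characters exist in the file but are not part of any segment.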

    In the editor it looks like this (the dots are spaces, as I switched on the formatting marks):

    Screenshot of Trados Studio editor showing mismatched formatting between source and target segments for 'Access-control-system' and 'Number-of-doors'. Formatting marks are visible.

    Does this help anyone find a solution for me?

    Thanks

    [edited by: Trados AI at 4:46 PM (GMT 0) on 5 Mar 2024]
  •  

    "I know this has to do with Studio and not with the MT Plugin."

    How do you know this?

    I think your problem is that, since the white space is not needed for translation, it is all moved out of the way so you don't need to handle it. The tabs look to be safely in the target, but the spaces are definitely removed, and that is most probably an issue with the plugin, not Studio. I think that because you show an unedited translation in the target that has come from MT, and it has removed the spaces. If it's important to you, you could probably verify that by using a tool like Fiddler to capture what gets returned from Google before it goes into Studio.

    Whatever the answer, I think you have a bigger problem: irrespective of the source of the problem, neither the MT plugin nor the 2015 version of Studio is supported any more. So if it's a bug in that version it will never be fixed, since the issue is most likely resolved in one of the four major versions that have been released since then.

    Paul Filkin | RWS Group

    ________________________
    Design your own training!

    You've done the courses and still need to go a little further, or still not clear? 
    Tell us what you need in our Community Solutions Hub

  • Hi Paul,

    Thanks a lot for your reply.

    I still have the issue with the tabs and the soft breaks within the text. They did not work properly either. In the meantime I tried with memoQ, and because tabs and soft breaks are tagged automatically there, they remain in the MT translation.

    However, I understand your reply: these are all bugs, and they are not going to be fixed in unsupported versions.

    As for Fiddler, I had not heard of this tool, but I found this post in the RWS community:

     05 What is Fiddler and how to use it 

    So I will have a look to see whether this tool can help me fix the issue.

    Kr

  •  

    Out of curiosity I knocked up a file that looked like yours and tested it:

    Screenshot of Trados Studio 2022 showing a translation comparison with missing spaces before line breaks in the target text.

    This is with Trados Studio 2022 and the new Google plugin. The result is also not exactly the same as the source, as the spaces before the line break are removed in the translation. So I also tested this with Fiddler:

    In Fiddler I can see our request (what we send) and the response (what Google sends back):

    Screenshot of Fiddler's Raw mode displaying a request sent to Google Translate API with spaces (%20) and newline character (%0A) included.

    1. Using Raw mode, I can see in the request that we do send all the spaces (%20 is a space) and also the newline character (%0A) in the right places.
    2. In the response from Google, the spaces before the newline character are removed, so we don't see them at all in Trados Studio.
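For anyone unfamiliar with the %20 and %0A notation in the request, this short Python sketch shows how a string with spaces before a line break is percent-encoded, and how to count the spaces that sit immediately before each newline. The sample text is made up for illustration; it is not the content of the attached test file.

```python
import re
from urllib.parse import quote, unquote

# Made-up sample (assumption): three spaces before a line break.
source = "Access control system   \nNumber of doors"

# Percent-encode everything that is not an unreserved character.
encoded = quote(source, safe="")
print(encoded)

def spaces_before_newlines(text):
    """Return, for each newline, how many spaces immediately precede it."""
    return [len(m.group(1)) for m in re.finditer(r"( *)\n", text)]

print(spaces_before_newlines(source))
print(unquote(encoded) == source)
```

Running the same count on the decoded request and on the response text captured in Fiddler is one way to confirm on which side the spaces disappear.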

    We might be able to resolve this in the plugin by handling the spaces in a different way before they are sent, but this would be a fix we apply only to the current version of the plugin and not to the 2015 version, which seems to be handling it even less appropriately than the new one.
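One possible way of "handling the spaces differently" is to swap whitespace runs for placeholder tokens before sending and put them back afterwards. The Python sketch below is only an illustration of that idea and not how the plugin works: the token format, the `protect`/`restore` helpers, and the `fake_mt` stand-in are all hypothetical, and it assumes the MT engine returns the placeholder tokens unchanged.

```python
import re

# Tabs/newlines with any adjacent spaces, or runs of two or more spaces.
WHITESPACE_RUN = re.compile(r" *[\t\n]+ *| {2,}")
PLACEHOLDER = re.compile(r'<x id="(\d+)"/>')

def protect(text):
    """Replace whitespace runs with numbered placeholder tokens (hypothetical format)."""
    saved = []
    def stash(match):
        saved.append(match.group(0))
        return f'<x id="{len(saved) - 1}"/>'
    return WHITESPACE_RUN.sub(stash, text), saved

def restore(text, saved):
    """Put the original whitespace runs back in place of the tokens."""
    return PLACEHOLDER.sub(lambda m: saved[int(m.group(1))], text)

def fake_mt(text):
    # Stand-in for an MT engine (assumption): it only mimics the observed
    # behaviour of dropping spaces before a newline; it does not translate.
    return re.sub(r" +\n", "\n", text)

source = "Access control system   \n\tNumber of doors"
masked, saved = protect(source)
round_trip = restore(fake_mt(masked), saved)
print(repr(masked))
print(round_trip == source)
```

Sent unprotected, the same text loses its trailing spaces in `fake_mt`, whereas the masked version comes back intact. Whether a real engine leaves such tokens alone would need to be checked, for example with Fiddler.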

    It's always tricky to handle files that are put together by people who don't know how to use Microsoft Word to format their documents properly... but then that's the world we live in and what does properly even mean to someone who doesn't care!!

    I attached my test file for reference.

    tabs and spaces.zip

    Paul Filkin | RWS Group

    [edited by: Trados AI at 4:59 PM (GMT 0) on 7 Mar 2024]