Translation memory segments not identified when they have hard line breaks

Hello,

I've been working on a project (with Excel files) where the number of 100% and fuzzy matches is high. However, I realized that when the source strings have hard line breaks, Trados Studio 2022 does not identify them automatically, and I have to search for the translations manually and paste them into the target text.

One important detail about the TM in question is that it was created by importing a .tmx file generated by memoQ.

Is there any way of getting Trados to recognize these strings automatically without me having to look for the translations?

  •  

    This won't be possible unless you use a TM that does not use paragraph segmentation. The whole point of that method is to keep paragraphs together and not split text that has no paragraph break in it. I know you said "hard line break", but you probably meant "soft line break", as you cannot put a hard break (a paragraph break) into an Excel cell. So these, for example:

    Text demonstrating the difference between a soft break and a hard break. The soft break is indicated by a left angle bracket followed by a newline, and the hard break is shown with a newline only.

    If you try to enter this text, with these breaks, into Excel, the hard break (paragraph break) gets converted into a soft line break. Since this is Excel, and each cell is usually treated as a paragraph by default anyway, I think your solution would be to create a new sentence-based TM and add it to your project as the first TM in the list. When you open your project, that is the one that will be used to segment the file. Then just use the memoQ imported TMX for lookup etc.
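To make the segmentation difference concrete, here is a minimal Python sketch. It is only an illustration, not how Trados Studio actually segments: `paragraph_segments` and `sentence_segments` are hypothetical helpers, the sentence rule is a naive split after `.`, `!` or `?`, the sample text is made up, and the TM is modelled as a plain dictionary with exact lookup.

```python
import re

def paragraph_segments(cell_text):
    # Paragraph-based: the whole cell stays one segment, soft line breaks included.
    return [cell_text]

def sentence_segments(cell_text):
    # Naive sentence rule (an assumption, not the Trados rule set):
    # split on whitespace that follows ., ! or ?, which includes a soft line break.
    return [s for s in re.split(r"(?<=[.!?])\s+", cell_text.strip()) if s]

# One Excel cell containing a soft line break between two sentences.
cell = "Save your work.\nThen close the file."

# A TM that stores the whole cell as a single translation unit.
tm = {cell: "Enregistrez votre travail.\nFermez ensuite le fichier."}

for name, segmenter in [("paragraph", paragraph_segments), ("sentence", sentence_segments)]:
    for segment in segmenter(cell):
        status = "exact match" if segment in tm else "no exact match"
        print(f"{name}: {segment!r} -> {status}")
```

In this toy model the paragraph-level segment is identical to the stored unit, while neither of the two sentence-level pieces is, so the way the file is segmented decides whether a stored translation can come back as an exact match.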

    I am surprised to see the memoQ imported TMX uses paragraph based segmentation though.  Did you do that on purpose?

    Paul Filkin | RWS Group

    ________________________
    Design your own training!

    You've done the courses and still need to go a little further, or still not clear? 
    Tell us what you need in our Community Solutions Hub

    [edited by: RWS Community AI at 3:40 PM (GMT 1) on 19 Jun 2024]
  • Hi there,

    Thank you for the quick reply and correction.

    Just to make sure I understood it correctly: so is there no way to make this memoQ imported TM's 100%/fuzzy matches be identified and "autocompleted" by Trados due to that line break issue, so all I can do is to use the TM for lookups, fragment matches, etc.?

    Or if I use the solution provided by the AI will those segments be identified?

    Thanks again for your attention!

    PS. Yes, the client usually works with paragraph-based TMs due to the projects' particularities.

  •  

    If I translate my Excel file with paragraph segmentation so it matches everything like this:

    Screenshot of Trados Studio showing text segments with 100% match and CM (Context Match) indicators. No visible errors or warnings.

    And then open it in a new project, but with sentence-based segmentation, I will get this, which presumably is what you're seeing:

    Screenshot of Trados Studio with text segments showing varying match percentages such as 63%, 53%, and 74%. No visible errors or warnings.

    This is because you will have a TM setup something like this, where the TM at the top of the list is the sentence-based TM (my own TM) and the one at the bottom is the memoQ TM I imported (based on paragraph segmentation):

    Screenshot of Trados Studio Translation Memory settings with two TMs listed: 'en-fr (sentence based).sdltm' and 'en-fr (para based).sdltm'. Both are enabled with no penalties.
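As a rough illustration of why sentence-level segments only score as fuzzy matches against a TM unit that holds the whole cell, here is a small Python sketch. The `similarity` helper is hypothetical and uses `difflib.SequenceMatcher` from the standard library as a stand-in; Trados Studio calculates its match percentages differently, so the numbers will not agree with the screenshots, and the sample text is made up.

```python
from difflib import SequenceMatcher

def similarity(a, b):
    # Character-level similarity in [0, 1]; a stand-in for a fuzzy match score.
    return SequenceMatcher(None, a, b).ratio()

# A TM unit stored at cell (paragraph) level, with a soft line break inside.
tm_unit = "Save your work.\nThen close the file."

# The same cell as it would look after sentence-based segmentation.
sentence_level = ["Save your work.", "Then close the file."]

print(f"whole cell: {similarity(tm_unit, tm_unit):.0%}")
for segment in sentence_level:
    print(f"{segment!r}: {similarity(segment, tm_unit):.0%}")
```

The whole cell compares as identical to the stored unit, while each sentence on its own only partially overlaps with it, which is the same pattern as exact matches turning into fuzzy ones.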

    So if I reverse the TMs BEFORE I create my project so I have this:

    Screenshot of Trados Studio Translation Memory settings with 'en-fr (para based).sdltm' at the top and 'en-fr (sentence based).sdltm' below it. Both are enabled with no penalties.

    Then when I open the file I get this:

    Screenshot of Trados Studio showing text segments with 100% match and CM (Context Match) indicators. No visible errors or warnings.

    Now I can benefit from the memoQ TM completely and use my own TM to update the work. The downside, however, is that the whole file will be segmented this way, which might be a problem for you if you actually want to save the segments at sentence level and not paragraph level.

    Ultimately this all comes down to whether or not you want to have to use concordance search every time you don't get a match to see if the sentence is in the memoQ TM or not.

    "PS. Yes, the client usually works with paragraph-based TMs due to the projects' particularities."

    On this basis you can solve the problem easily by simply doing the same thing yourself or do as I just explained and use the memoQ TM to prepare the segmentation for the project by making sure it's at the top of the list.

    It might also be that your memoQ TM is not actually set up for paragraph segmentation at all. It may simply be a sentence-level TM holding content that was never segmented, and that could be because of how the content in the TMX is provided.
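One way to check how the content in the TMX is provided is to count how many `<seg>` elements contain a line break. The Python sketch below uses only the standard library. It assumes a TMX 1.4 style file without XML namespaces and assumes line breaks are stored as literal newline characters inside `<seg>`, which may not be how your memoQ export writes them. The function name `count_segments_with_breaks` and the inline sample are made up for illustration.

```python
import io
import xml.etree.ElementTree as ET

def count_segments_with_breaks(tmx_source):
    """Return (total, with_breaks) for the <seg> elements in a TMX file.

    tmx_source may be a file path or a binary file object.
    """
    total = with_breaks = 0
    for seg in ET.parse(tmx_source).getroot().iter("seg"):
        # Join the text of the segment, including text inside inline tags.
        text = "".join(seg.itertext()).strip()
        total += 1
        if "\n" in text:
            with_breaks += 1
    return total, with_breaks

# Tiny made-up TMX sample; in practice pass the path to the real export instead.
sample = b"""<tmx version="1.4"><header/><body>
<tu>
  <tuv xml:lang="en"><seg>Save your work.
Then close the file.</seg></tuv>
  <tuv xml:lang="fr"><seg>Enregistrez votre travail.
Fermez ensuite le fichier.</seg></tuv>
</tu>
<tu>
  <tuv xml:lang="en"><seg>Cancel</seg></tuv>
  <tuv xml:lang="fr"><seg>Annuler</seg></tuv>
</tu>
</body></tmx>"""

total, with_breaks = count_segments_with_breaks(io.BytesIO(sample))
print(f"{with_breaks} of {total} segments contain a line break")
```

If a large share of the segments contain line breaks, the TM is most likely holding whole cells as single units, whatever its segmentation setting says.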

    Either way, the only way to solve it, I think, is to segment the Excel file so that each cell is treated as a whole without any further segmenting, and the only way to do that is to create a new TM that uses 100% paragraph-based segmentation and make sure that it's at the top of your list BEFORE you create your project:

    Screenshot of Trados Studio's 'New Translation Memory' dialog with 'Paragraph based segmentation' selected in the 'Segmentation Rules' section. No visible errors.

    Paul Filkin | RWS Group

    [edited by: RWS Community AI at 5:00 PM (GMT 1) on 19 Jun 2024]