Term recognition in Trados 2024

Dear Community, 

This post is an inquiry aimed at better understanding what the new Trados 2024 version brings.

Here is a curious example of term recognition performed by Multiterm today:

Screenshot of Trados software with term recognition highlighting 'comprehensive' and 'maintenance' in a text segment. Term Recognition pane shows 'comprehensive maintenance' with a 78% match.

I have NEVER seen Trados recognize separate words as part of a term saved in the database (comprehensive and maintenance, in this particular case). 

The minimum match value is set to 70%, which is the standard setting:

Screenshot of Trados software showing Project Settings with Search Settings expanded, highlighting the 'Minimum match value %' set to 70.

First, I thought that Multiterm recognized only "comprehensive" and returned a fuzzy match from the database (even this would be a 50-60% match, definitely not 70%, in my opinion).

But then I saw that "maintenance" does not appear in the hitlist as a separate term, meaning that both words were recognized as parts of "comprehensive maintenance". How come? There are 3 words in between!

This has never happened in the previous versions. Is there any technical explanation? New search processes in Trados 2024?

Would be very grateful for the explanation! Thanks in advance!



  •   

    I tested this in 2022 and got exactly the same results as I do in 2024, so I don't believe there is anything different in 2024 in this regard.
    As to how this works... I tested it a bit and can explain by deduction, rather than giving details that only a developer could provide.  But first some basics.
    Term recognition match scores are not calculated like translation memory fuzzy matches.  Instead, they consider:
    1. Exactness of phrase match
      • Is the entire term ("comprehensive maintenance") present as-is?
      • If yes => 100% match
      • If only components are found, Studio will attempt a fuzzy term match
    2. Proximity of tokens
      • If the term is made up of multiple words, Trados checks how far apart those words are in the segment.
      • Closer = higher match score
      • Separated by multiple words or punctuation = lower score
    3. Additional modifiers or interference
      • Words like “digitized” and “complex” interrupt the expected phrase, reducing confidence in the match.
    4. Order of tokens
      • If the words are out of order, it further weakens the score.
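
To make the four factors above concrete, here is a minimal Python sketch. It is a toy model only: the function name `toy_term_score` and the penalty values `GAP_PENALTY` and `ORDER_PENALTY` are invented for illustration, and nothing here is the actual Trados or MultiTerm formula, which this thread does not document. It just shows how exactness, proximity and token order could combine into a single percentage.

```python
import re

# Hypothetical weights, chosen only for illustration (not Trados values).
GAP_PENALTY = 5.0     # points lost per word sitting between the matched tokens
ORDER_PENALTY = 10.0  # points lost when the tokens appear out of order


def tokenize(text: str) -> list[str]:
    """Lower-case the text and split it into word tokens."""
    return re.findall(r"\w+", text.lower())


def toy_term_score(term: str, segment: str) -> float:
    """Return a 0-100 toy score for how well `term` is matched in `segment`.

    Assumes the tokens of `term` are distinct.
    """
    term_tokens = tokenize(term)
    seg_tokens = tokenize(segment)
    n = len(term_tokens)
    if n == 0:
        return 0.0

    # 1. Exactness: the whole term appears as one contiguous phrase.
    for i in range(len(seg_tokens) - n + 1):
        if seg_tokens[i:i + n] == term_tokens:
            return 100.0

    # Position of the first occurrence of each term token that is present.
    positions = [seg_tokens.index(t) for t in term_tokens if t in seg_tokens]
    if not positions:
        return 0.0

    # Share of the term's tokens that were found at all.
    coverage = len(positions) / n
    # 2. and 3. Proximity: count the words interrupting the matched tokens.
    gap = max((max(positions) - min(positions) + 1) - len(positions), 0)
    # 4. Order: the tokens should appear in the same order as in the term.
    out_of_order = positions != sorted(positions)

    score = 100.0 * coverage - GAP_PENALTY * gap
    if out_of_order:
        score -= ORDER_PENALTY
    return max(score, 0.0)
```

With these made-up weights, `toy_term_score("comprehensive maintenance", "comprehensive train maintenance")` scores higher than the same term matched against "comprehensive digitized and complex maintenance", because fewer words separate the two tokens. The absolute numbers mean nothing; only the ordering mirrors the behaviour described above.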

    So, if I then look at some examples:

    Screenshot of MultiTerm software showing term recognition results with 'comprehensive maintenance' at 78% match and other terms at 100%. Text being analyzed is visible at the bottom.

    Screenshot of MultiTerm software showing term recognition results with 'comprehensive maintenance' at 87% match and other terms at 100%. Analyzed text is displayed at the bottom.

    Screenshot of MultiTerm software showing term recognition results with 'comprehensive maintenance' at 72% match and other terms at 100%. The text for analysis is shown at the bottom.

    I changed the distance between "comprehensive" and "maintenance" in each one resulting in 78%, 87% and 72% matches, all based on the same single termbase entry "comprehensive maintenance".

    First, I thought that Multiterm recognized only "comprehensive" and returned a fuzzy match from the database (even this would be a 50-60% match, definitely not 70%, in my opinion).

    If it was a translation memory I'd agree... but it's not :-)

    Studio attempts to infer the intent and provide useful term suggestions even if the exact match isn’t there.  In your use case (technical documentation/rail), this behaviour might be quite useful.  But equally it could be misleading if translators assume that “comprehensive” or “maintenance” are individually present in the termbase (when they're not).  So one difference I do see between 2024 and 2022 is that the match value seems to be there by default in 2024 (it wasn't in 2022 - although I didn't investigate this to see if my earlier or current default settings are at play).  Either way the match score helps flag that partial matches are being made, based on token recognition and order.

    Hope that helps?

    Paul Filkin | RWS Group

    ________________________
    Design your own training!

    You've done the courses and still need to go a little further, or still not clear? 
    Tell us what you need in our Community Solutions Hub

  • As always, thank you for answering, Paul!

    As for the match value, it appears in the 2024 version. My colleagues with Trados 2021/2022 do not have it, and there is no such option in the Hitlist settings.

    But this is not the issue. Technically, I do not know how the match calculation is performed, but let me assure you that Multiterm has never "recognized" terms separated by one or more words before.

    A recent example: my colleague asked me about a very common system used on the train (Juridical recording unit / Unidad de Registro Jurídico); she needed the exact, "official" term in Russian. Since I work on the TDB on a daily basis, I knew it HAD to be there, so I wondered why it did not appear in the Multiterm window.

    In fact, it was not there, because the engineer used the wording "unidad jurídica", and I had only the first three options saved:

    Screenshot of a MultiTerm database entry showing terms in Spanish and Russian related to juridical recording units.

    Therefore, Multiterm had "unidad de registro jurídico", but when it came across "unidad jurídica", it was not recognized as a fuzzy match (null result). As you can see, I added this term "variation" to the database.

    This is why the example I came across yesterday kind of struck me. Because it would save me recording so many "similar" wordings in the database (I really have plenty of them):

    Screenshot of a MultiTerm database entry with an image diagram and terms in Spanish and Russian related to welding.

    And in Russian, for example, the adjective can come before or after the noun (without any change in meaning), so I have to save both. If not, the term won't be recognized.

    So I think some kind of update or improvement has been made to the Multiterm search engine. But then, I am just the end user :).

    And if it works as you have explained above, then it is really great, since this may save us many manual fuzzy searches.

    Best regards!

  •  

    Technically, I do not know how the match calculation is performed, but let me assure you that Multiterm has never "recognized" terms separated by one or more words before.

    Really... 

    Paul Filkin | RWS Group

  • Paul, you are always so helpful!!! Then it's great news! The more terms are recognized by Multiterm, the better!

    And I have double-checked. The colleague with the JRU example has Trados 2021 (we are working with 3 Trados versions: 2021, 2022 and 2024), so maybe it was slightly different in that version.

    Anyway, the more we shift towards revising machine translation instead of translating from scratch, the more useful Multiterm becomes. At least, this is how I see it...

    Thank you very much once again!!!!

  •  

    The JRU example didn't get found because of the order, proximity and presence of the tokens.  In your termbase you have "Unidad de Registro Jurídico", but if your sentence only contains the wording "unidad jurídica" then only two out of four tokens from the full term are there.  So the phrase is farther in meaning and structure from the full term entry.  This is the same in 2024.

    If I do something like this where all four tokens are found I get a 64% match:

    Screenshot of MultiTerm software showing Term Recognition pane with a 64% match for 'Unidad de Registro Juridico' and a translation below it.

    But still only the one term:

    Close-up of MultiTerm's Term Recognition pane with a term match highlighted, and a pop-up showing the term 'Unidad de Registro Juridico'.

    So possibly useful, but you'll only be accurate by entering the terms you actually want control over.

    As I do this I can see why it was worth adding the fuzzy values, because now users will have a better understanding of what this is actually telling them. It's essentially a tool they can use where appropriate to help find the term that is actually needed.  But they should not mistake this for a 100% match that they have to use!

    Paul Filkin | RWS Group

  • Dear Paul!

    What you mean is that if the word "registro" had been present in that sentence and if the match % had been lowered to 60%, then I would have probably gotten a fuzzy match, right? Ok, interesting.

    However, I do not go below 70%, and sometimes even set it to 80%. And if I want to search for incomplete matches, I use the Fuzzy search in the Termbase window.

    Well, there are so many ways of searching the Multiterm TDB; the most important thing is to have good content :)

    Best wishes and have a good week, Paul (though I cannot promise I won't reappear soon with some other question of mine :).

    As always, thank you very very much!

  • Me again, Paul:)

    Something does not work. Look at this example:

    Screenshot of a translation software interface showing a segment with the text 'Based on the DB product, fully adapted to DSB needs (customer experience, interior, passenger information and train connectivity...)' highlighted in yellow.

    We have "Passenger information system" in the TDB:

    Screenshot of a terminology database entry with terms in Russian and English for 'passenger information system' and other related terms.

    So, as far as I can understand: two of the three words (tokens) are present in the source segment, the match % is lowered to 60%, however, it is not detected by the search engine. For whatever reason. Nevertheless, "passenger" and "information" are recognized separately. 

    I also selected "passenger information" in the source segment and used CTRL+ALT+F1 to perform the same search via Multiterm Widget: 

    Screenshot of the MultiTerm Widget search results for 'passenger information' showing entries in English and Russian.

    Then I have another question: how do these two options affect the search results: 

    Screenshot of search settings sliders with 'Search depth' set to 200 and 'Term repetition threshold' set to 10.

    I played with them, but saw no difference...

    Thanks in advance!

  •  

    ok - so you're not using 2024, so here's my 2024:

    Screenshot of a translation software interface showing term recognition with a 75% match for 'passenger information system' in English and Russian. No visible errors or warnings.

    You can see I get a 75% match.

    Then checking in 2022 I also get the same recognised terms:

    Screenshot of a translation software interface displaying a 100% match for a term in a translation memory. The term is highlighted in green, indicating a successful match. No visible errors or warnings.

    I can't use 2021 on my laptop now so if that's what you're using you're on your own ;-)

    Then I have another question: how do these two options affect the search results: 

    Screenshot of search settings sliders with 'Search depth' set to 200 and 'Term repetition threshold' set to 10.

    I played with them, but saw no difference...

    Most likely you are affected by whatever is causing you not to get a result in the first place!

    But you can always use the help...

    https://docs.rws.com/en-US/trados-studio-2024-1145319/terminology-recognition-settings-345706

    Paul Filkin | RWS Group

  • Somehow the same version works differently:

    Screenshot of a computer desktop showing two overlapping windows of Trados Studio - 2024.11.11, with a focus on the TermBase Viewer panel.

    Thank you for the link with the settings explanation!!!! Will study it!

    Most likely you are affected by whatever is causing you not to get a result in the first place!

    I am thinking about all the merging I do on the entries... maybe that is causing the "recognition" troubles.

    I will read the article; it probably has the answers :)

    Thank you, Paul!!!

  •  

    Maybe a reorganisation would help?

    Paul Filkin | RWS Group
