Term recognition in Trados 2024

Dear Community,

This post is an inquiry aimed at better understanding what the new Trados 2024 version brings.

Here is a curious example of term recognition performed by Multiterm today:

Screenshot of Trados software with term recognition highlighting 'comprehensive' and 'maintenance' in a text segment. Term Recognition pane shows 'comprehensive maintenance' with a 78% match.

I have NEVER seen Trados recognize separate words as part of a term saved in the database (comprehensive and maintenance, in this particular case). 

The minimum match value is set to 70%, which is the standard setting:

Screenshot of Trados software showing Project Settings with Search Settings expanded, highlighting the 'Minimum match value %' set to 70.

First, I thought that Multiterm recognized only "comprehensive" and returned a fuzzy match from the database (even this would be a 50-60% match, definitely not 70%, in my opinion).

But then I saw that "maintenance" does not appear in the hitlist as a separate term, meaning that both words were recognized as parts of "comprehensive maintenance". How come? There are 3 words in between!

This has never happened in previous versions. Is there a technical explanation? Are there new search processes in Trados 2024?

I would be very grateful for an explanation! Thanks in advance!



[edited by: RWS Community AI at 8:09 AM (GMT 1) on 2 Apr 2025]
  •   

    I tested this in 2022 and got exactly the same results as I do in 2024, so I don't believe there is anything different in 2024 in this regard.
    As to how this works... I tested it a bit and can explain by deduction, rather than with the implementation details that only a developer could provide.  But first some basics.
    Term recognition match scores are not calculated like translation memory fuzzy matches.  Instead, they consider:
    1. Exactness of phrase match
      • Is the entire term ("comprehensive maintenance") present as-is?
      • If yes => 100% match
      • If only components are found, Studio will attempt a fuzzy term match
    2. Proximity of tokens
      • If the term is made up of multiple words, Trados checks how far apart those words are in the segment.
      • Closer = higher match score
      • Separated by multiple words or punctuation = lower score
    3. Additional modifiers or interference
      • Words like “digitized” and “complex” interrupt the expected phrase, reducing confidence in the match.
    4. Order of tokens
      • If the words are out of order, it further weakens the score.
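
    To make the four factors above concrete, here is a minimal Python sketch of how such a score could be put together. It is purely illustrative and is not the actual Studio/MultiTerm algorithm: the function name toy_term_score, the two penalty weights and the example segments are all assumptions invented for this illustration, so its numbers will not reproduce the percentages Studio shows.

```python
# Toy illustration only: NOT the real Trados Studio / MultiTerm formula.
# The penalty weights below are hypothetical values chosen for readability.
GAP_PENALTY = 5.0     # assumed cost per extra word between the components
ORDER_PENALTY = 10.0  # assumed cost when the components appear out of order


def toy_term_score(term: str, segment: str) -> float:
    """Return a made-up recognition score between 0 and 100."""
    term_tokens = term.lower().split()
    seg_tokens = [t.strip('.,;:!?()"\'') for t in segment.lower().split()]
    n = len(term_tokens)

    # 1. Exactness: the whole term present as-is scores 100.
    for i in range(len(seg_tokens) - n + 1):
        if seg_tokens[i:i + n] == term_tokens:
            return 100.0

    # Simplification: every component of the term must occur somewhere.
    positions = []
    for tok in term_tokens:
        if tok not in seg_tokens:
            return 0.0
        positions.append(seg_tokens.index(tok))

    # 2. and 3. Proximity and interruptions: count the extra words
    # sitting between the first and the last component.
    span = max(positions) - min(positions) + 1
    gap = max(span - n, 0)
    score = 100.0 - GAP_PENALTY * gap

    # 4. Order: components found in a different order lose more points.
    if positions != sorted(positions):
        score -= ORDER_PENALTY

    return max(score, 0.0)


term = "comprehensive maintenance"
print(toy_term_score(term, "We offer comprehensive maintenance services."))      # 100.0
print(toy_term_score(term, "comprehensive complex maintenance"))                 # 95.0
print(toy_term_score(term, "comprehensive, digitized and complex maintenance"))  # 85.0
```

    The only point of the sketch is the shape of the behaviour: the score drops as more words are pushed between the components, and drops further if their order is reversed.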

    So, if I then look at some examples:

    Screenshot of MultiTerm software showing term recognition results with 'comprehensive maintenance' at 78% match and other terms at 100%. Text being analyzed is visible at the bottom.

    Screenshot of MultiTerm software showing term recognition results with 'comprehensive maintenance' at 87% match and other terms at 100%. Analyzed text is displayed at the bottom.

    Screenshot of MultiTerm software showing term recognition results with 'comprehensive maintenance' at 72% match and other terms at 100%. The text for analysis is shown at the bottom.

    I changed the distance between "comprehensive" and "maintenance" in each one, resulting in 78%, 87% and 72% matches, all based on the same single termbase entry "comprehensive maintenance".

    First, I thought that Multiterm recognized only "comprehensive" and returned a fuzzy match from the database (even this would be a 50-60% match, definitely not 70%, in my opinion).

    If it was a translation memory I'd agree... but it's not :-)

    Studio attempts to infer the intent and provide useful term suggestions even if the exact match isn’t there.  In your use case (technical documentation/rail), this behaviour might be quite useful.  But equally it could be misleading if translators assume that “comprehensive” or “maintenance” are individually present in the termbase (when they're not).  So one difference I do see between 2024 and 2022 is that the match value seems to be there by default in 2024 (it wasn't in 2022 - although I didn't investigate this to see if my earlier or current default settings are at play).  Either way the match score helps flag that partial matches are being made, based on token recognition and order.

    Hope that helps?

    Paul Filkin | RWS Group

    ________________________
    Design your own training!

    You've done the courses and still need to go a little further, or still not clear? 
    Tell us what you need in our Community Solutions Hub

    [edited by: RWS Community AI at 10:01 AM (GMT 1) on 3 Apr 2025]
  • As always, thank you for answering, Paul!

    As for the match value, it appears in the 2024 version. My colleagues with Trados 2021/2022 do not have it, and there is no such option in the Hitlist settings.

    But this is not the issue. Technically, I do not know how the match calculation is performed, but let me assure you that Multiterm has never "recognized" terms separated by word/words before.

    A recent example: my colleague asked me about a very common system used on the train (Juridical recording unit / Unidad de Registro Jurídico); she needed the exact, "official" term in Russian. Since I work on the TDB on a daily basis, I knew it HAD to be there, so I wondered why it did not appear in the Multiterm window.

    Indeed, it was not there, because the engineer used the wording "unidad jurídica", and I had only the first three options saved:

    Screenshot of a MultiTerm database entry showing terms in Spanish and Russian related to juridical recording units.

    Therefore, Multiterm had "unidad de registro jurídico", but when it came across "unidad jurídica", it was not recognized as a fuzzy match (null result). As you can see, I added this term "variation" to the database.
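
    A rough way to picture why an inflected form like this can slip through: if terms are compared word by word as exact strings, "jurídica" and "jurídico" are simply two different words. The Python sketch below contrasts exact word matching with a character-level similarity check based on difflib from the standard library. It is only an illustration under stated assumptions (the helper names exact_overlap and fuzzy_overlap and the 0.8 threshold are invented here) and says nothing about how Multiterm actually works internally.

```python
from difflib import SequenceMatcher


def exact_overlap(term: str, text: str) -> list[str]:
    """Words of the term that occur in the text as identical strings."""
    text_tokens = set(text.lower().split())
    return [t for t in term.lower().split() if t in text_tokens]


def fuzzy_overlap(term: str, text: str, threshold: float = 0.8) -> list[str]:
    """Words of the term that are at least `threshold` similar to some text word.

    The threshold is a hypothetical value picked for this example.
    """
    text_tokens = text.lower().split()
    hits = []
    for t in term.lower().split():
        if any(SequenceMatcher(None, t, s).ratio() >= threshold for s in text_tokens):
            hits.append(t)
    return hits


stored_term = "unidad de registro jurídico"
source_text = "la unidad jurídica"

print(exact_overlap(stored_term, source_text))  # ['unidad']
print(fuzzy_overlap(stored_term, source_text))  # ['unidad', 'jurídico']
```

    With exact comparison only one of the four words is found, while the character-level check also links "jurídico" to "jurídica".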

    This is why the example I came across yesterday kind of struck me, because it would save me from recording so many "similar" wordings in the database (I really have plenty of them):

    Screenshot of a MultiTerm database entry with an image diagram and terms in Spanish and Russian related to welding.

    And in Russian, for example, the adjective can come before or after the noun (without any change in meaning), so I have to save both. Otherwise, the term won't be recognized.

    So I think some kind of update or improvement has been made to the Multiterm search engine. But then, I am just the end user :).

    And if it works as you have explained above, then it is really great, since this may save us many manual fuzzy searches.

    Best regards!

    [edited by: RWS Community AI at 2:45 PM (GMT 1) on 3 Apr 2025]
  •  

    Technically, I do not know how the match calculation is performed, but let me assure you that Multiterm has never "recognized" terms separated by word/words before.

    Really... 

    Paul Filkin | RWS Group

  • Paul, you are always so helpful!!! Then, it's great news! The more terms are recognized by Multiterm, the better!

    And I have double-checked. The colleague with the JRU example has Trados 2021 (we are working with 3 Trados versions: 2021, 2022 and 2024), so maybe it was slightly different in that version.

    Anyway, the more we are shifting towards TA revision instead of translating from scratch, the more useful Multiterm becomes. At least, this is how I see it...

    Thank you very much once again!!!!
