Issues with alphanumeric string pre-translation and recognition

Hi everyone,

We've recently come across a strange issue related to alphanumeric string recognition in Trados Studio.

Project settings/conditions:
Project files are created by the customer in Trados Enterprise and received/processed by the PMs or linguists in Trados Studio 2024 (SR1).
All of the TMs used have alphanumeric recognition enabled in the TM settings.
Auto-substitution is enabled for alphanumeric strings for each language pair.
There is no penalty applied to AT of alphanumeric strings.

Issue 1: Only a portion of the alphanumeric strings is marked as pre-translated.
In the translation editor, those segments are highlighted as CM, even though there is no exact match from the TM.
We assume that the pre-translations are indeed the result of AT, and we would expect Studio to mark them as AT, not CM.
AT, however, does not work the same way for all segments. The 99% segments basically have the same structure as the ones marked as CM,
but they don't have the same status.

So two questions here:
Why does Studio/Enterprise use a CM marker for something that is clearly an AT match with no entry in the TM?
And why are some segments treated as CM and others as 99% matches even though they should be treated equally?

Issue 2: For the 99% matches, Studio suggests various CM or 100% matches that clearly have nothing to do with the actual source text.
For example, the TM unit "PH2 > PH2" is suggested as a CM for "3601JK3200".
Translators obviously won't have a problem with that as they already have a pre-populated target segment with the correct string.
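But this completely messes up the analysis results, as the Studio analysis will show more CM/100% matches (which are usually not paid for) than are actually present in the project files.
Screenshot: https://community.rws.com/resized-image/__size/1800x1200/__key/communityserver-discussions-components-files/90/pastedimage1755593613760v2.png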

So what would be your take on this? Ask the customer to disable alphanumeric string recognition and AT altogether,
which would result in fewer AT hits? Or is there a defect in the recognition pattern that can be fixed in a future update?

Thanks in advance for your support!

Best regards,
Julian


Jesús Prieto 6 months ago

Julian Hamm

I see in segment 1273 that the alphanumeric string has been recognized. I thought you needed to have auto-substitution enabled, but you said it was (maybe you can double-check it).

You can try upgrading the TMs.

Finally, I think the issue is on your client's side.

If you don’t want to be stuck while your client decides anything and/or RWS says something about this issue, I’d filter by these alphanumeric strings (a regex something like ^\d+[A-Z]+\d+$ will do the trick), copy source to target in all segments, change them to Translated status, lock them, and that’s it.
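
To check what that pattern would and would not catch before using it as a filter, here is a minimal sketch in Python. It is only an illustration: the sample strings are taken from this thread or made up, the helper name `is_alnum_code` is hypothetical, and Studio evaluates its filters with its own regex engine, so treat this as a sanity check of the pattern rather than of Studio's behaviour.

```python
import re

# Pattern quoted from the suggestion above: digits, then uppercase letters, then digits.
ALNUM_CODE = re.compile(r"^\d+[A-Z]+\d+$")


def is_alnum_code(source_text: str) -> bool:
    """Return True if the whole segment is a single digits-letters-digits code."""
    # strip() removes surrounding whitespace so only the code itself is tested.
    return ALNUM_CODE.match(source_text.strip()) is not None


# "3601JK3200" and "PH2" come from the thread; the other two are made-up samples.
samples = ["3601JK3200", "PH2", "3601 JK 3200", "Order 3601JK3200 today"]
matches = [s for s in samples if is_alnum_code(s)]
print(matches)
```

Note that a string such as "PH2" starts with letters, so this particular pattern skips it; the pattern would need adjusting if codes of that shape should be included in the filter.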

[edited by: RWS Community AI at 12:34 PM (GMT 1) on 19 Aug 2025]
Julian Hamm 6 months ago in reply to Jesús Prieto

Hi Jesús Prieto

Thanks for your feedback! Yes, auto-substitution is enabled by default, so it does make sense that Studio copied the string to the target segment. What I don't understand is that it assigns different match attributes to these segments. I'd expect a CM to be an actual match from the TM, which clearly it is not. We have pre-translated segments with a CM marker that are the result of auto-substitution/recognition (and not a 1:1 match from the TM) and others that only have a 99% match (where Studio suggests a completely different TU as a CM).
Thanks for the suggestion regarding the regex. It won't help us much specifically, as we usually deal with large multilingual projects. Even with automation through plugins, that's a lot of additional manual overhead. I also tried upgrading and re-indexing the TMs, but that did not make a difference.

There already are a lot of open tickets regarding the alphanumerical string recognition. The one that comes closest to my request is this one from over 6 years ago: Issue with Alphanumeric Strings Recognition

I'd appreciate it if someone from the RWS support team could explain why some of the recognized/auto-substituted segments are treated as CM or 100% matches and others aren't.
Without a proper 1:1 TM match, I'd basically consider all of these strings as AT segments, also putting them in a completely different match category (in terms of QA and payment).

Best regards,
Julian