With Studio 2021, machine-translated segments in TMX files are no longer marked as AT, but as 100%

Hi all,

I have a client who, with every job, sends me a TMX file containing machine translations (created by the client's proprietary machine-translation engine) of all segments in the job. These segments have a CreationID of "MT!", which marks them as machine translations. Before starting work, I import these TMX files into the TM for the job. With Studio 2019, these segments would appear as AT (automated translation) in the editor, making it easy to see that they are machine translations that should be reviewed closely. The client's TM would also contain fuzzies and 100% matches, which were presented correctly as such.
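
For readers who want to check which units in such a TMX file carry the "MT!" marker, here is a minimal Python sketch. It assumes the marker is stored in the TMX `creationid` attribute, which the TMX format allows on both `<tu>` and `<tuv>` elements, so both are checked. The sample data and the function names `is_machine_translated` and `mt_segments` are made up for illustration; this is not how Studio itself detects these units.

```python
import xml.etree.ElementTree as ET

MT_ID = "MT!"  # creation user the client uses for machine-translated units

# Invented sample data; a real client file will have a different header and content.
SAMPLE_TMX = """<tmx version="1.4">
  <header creationtool="example" creationtoolversion="1.0" segtype="sentence"
          o-tmf="example" adminlang="en-US" srclang="en-US" datatype="plaintext"/>
  <body>
    <tu creationid="MT!" changeid="MT!">
      <tuv xml:lang="en-US"><seg>Hello world</seg></tuv>
      <tuv xml:lang="de-DE"><seg>Hallo Welt</seg></tuv>
    </tu>
    <tu creationid="DENNIS">
      <tuv xml:lang="en-US"><seg>Good morning</seg></tuv>
      <tuv xml:lang="de-DE"><seg>Guten Morgen</seg></tuv>
    </tu>
  </body>
</tmx>"""


def is_machine_translated(tu):
    """Return True if the <tu> or any of its <tuv> children has creationid 'MT!'."""
    if tu.get("creationid") == MT_ID:
        return True
    return any(tuv.get("creationid") == MT_ID for tuv in tu.findall("tuv"))


def mt_segments(tmx_text):
    """Return the segment texts of every translation unit marked as MT."""
    root = ET.fromstring(tmx_text)
    found = []
    for tu in root.iter("tu"):
        if is_machine_translated(tu):
            # Collect the plain text of each <seg>, ignoring inline tags.
            found.append(["".join(seg.itertext()) for seg in tu.iter("seg")])
    return found


if __name__ == "__main__":
    for source, target in mt_segments(SAMPLE_TMX):
        print(f"MT unit: {source!r} -> {target!r}")
```

Running this against a real file only needs the file's text passed to `mt_segments` instead of `SAMPLE_TMX`.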

After I upgraded to Studio 2021, the machine-translated segments are being presented as 100% matches, making it impossible to identify them and to distinguish them from real 100% matches during translation work.

As far as I remember, I never changed any settings in Studio 2019 to get it to recognize machine-translated segments and mark them as AT, so I don't know what I can do to get the same behavior in Studio 2021, if anything. Can anyone help?

Thanks a lot in advance!

Best,

Dennis

  • I have a feeling this fell into the category of "good bugs/bad bugs", where this was a bug that you actually found useful... so a good bug. Most users do not want this behaviour because normally anything in your TM has been corrected and approved by you, and the fact that it originated as MT is now irrelevant. The entry should be a correct translation unit. I can recall this behaviour, and it was also something I didn't like either.

    I think that if you receive TMX files that are actually MT then the better approach is to import them into a new TM and then simply penalise the TM.  This way you'll never get a 100% match and will always review closely.  You could also add a field into your TM stating the origin and you would also get a good visual cue in the TM results window as you work.
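
    One way to prepare for this workaround is to split the client's TMX into two files before importing, so that only the MT units go into the separate, penalised TM. The following is a rough Python sketch under the assumption that the "MT!" marker sits in the TMX `creationid` attribute of the `<tu>` or one of its `<tuv>` children. The sample data and the function name `split_tmx` are hypothetical, and the import and penalty settings themselves still have to be done in Studio.

```python
import copy
import xml.etree.ElementTree as ET

# Invented sample data for illustration only.
SAMPLE_TMX = """<tmx version="1.4">
  <header creationtool="example" creationtoolversion="1.0" segtype="sentence"
          o-tmf="example" adminlang="en-US" srclang="en-US" datatype="plaintext"/>
  <body>
    <tu creationid="MT!">
      <tuv xml:lang="en-US"><seg>Hello world</seg></tuv>
      <tuv xml:lang="de-DE"><seg>Hallo Welt</seg></tuv>
    </tu>
    <tu creationid="DENNIS">
      <tuv xml:lang="en-US"><seg>Good morning</seg></tuv>
      <tuv xml:lang="de-DE"><seg>Guten Morgen</seg></tuv>
    </tu>
  </body>
</tmx>"""


def _is_mt(tu, mt_id):
    """True if the unit or any of its variants was created by the MT user."""
    if tu.get("creationid") == mt_id:
        return True
    return any(tuv.get("creationid") == mt_id for tuv in tu.findall("tuv"))


def split_tmx(tmx_text, mt_id="MT!"):
    """Return (mt_only_xml, without_mt_xml) as two TMX strings."""
    root = ET.fromstring(tmx_text)
    mt_root = copy.deepcopy(root)      # will keep only the MT units
    other_root = copy.deepcopy(root)   # will keep everything else
    for tree_root, keep_mt in ((mt_root, True), (other_root, False)):
        body = tree_root.find("body")
        # Iterate over a copy of the list because units are removed in the loop.
        for tu in list(body.findall("tu")):
            if _is_mt(tu, mt_id) != keep_mt:
                body.remove(tu)
    return (ET.tostring(mt_root, encoding="unicode"),
            ET.tostring(other_root, encoding="unicode"))


if __name__ == "__main__":
    mt_xml, other_xml = split_tmx(SAMPLE_TMX)
    print(mt_xml)
    print(other_xml)
```

    Both outputs keep the original header, so each can be saved as its own TMX file and imported into a different TM.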

    Paul Filkin | RWS

    Design your own training!
    You've done the courses and still need to go a little further, or still not clear? 
    Tell us what you need in our Community Solutions Hub

  • Hi Paul,

    Thanks for taking the time to answer. However, I don't think I agree with you. The situation is that all TMX imports behave exactly as expected and are shown as 100% or fuzzies - except if CreationID is "MT!", then they are shown as AT. I don't see how that could happen as a bug. The only way this behaviour could happen is if the software (Studio) actually contains code (programming) that actively recognizes and processes TUs with this particular CreationID in a particular way, that is to mark it AT. It definitely seems to be "by design" - otherwise it would be almost miraculous that only and exactly TUs marked with MT! are presented as AT.

    You say that "normally anything in your TM has been corrected and approved" - well, no! Not with this particular client - that is the whole point. And to me and this particular client (which is a very significant intergovernmental organisation that knows what it is doing when it translates texts into lots of languages...), being able to add (the client's own) machine-translated TUs to a TM offers an extra and very useful functionality. You also write that you have encountered this behaviour and didn't like it - but it only happens if the imported TUs are marked with the CreationID "MT!", so you must have imported such TUs, and in that case it would be a problem for you if they were presented as 100%/fuzzies rather than AT, wouldn't you agree? Conversely, if they are in fact corrected and approved TUs, they would not be marked with MT!.

    So it seems clear to me that this was a well thought-out functionality in Studio 2019 that has for some reason disappeared in Studio 2021. I could also point out that, as far as I remember, good old WinAlign used a variation of the same functionality by marking aligned TUs with CreationID "ALIGN!", causing these TUs to be presented as aligned TUs that should be checked more than "normal" TUs (the alignment function in Studio may work the same way, I haven't used it). So it is a known behaviour in Trados/Studio - the only question is why it has been removed with regard to machine-translated segments marked with "MT!" in 2021.

    And you are right, there are many things I could do to work around this problem - but since a very elegant solution was already implemented in Studio 2019, I think it's too bad that it has been removed - especially if it is the removal that is an unintentional error or a bug, not its presence to begin with! :-) I would of course like to hear your comments on this extra info, just as I would love to hear from some of SDL's developers. Either way, since this functionality is being used actively by freelancers and a very large international client, I hope that SDL will at least consider re-introducing the functionality, perhaps as an opt-in function ("Mark this check box to present TUs with CreationID 'MT!' as AT rather than as 100%/fuzzies").

  • Have you seen this discussion?
    https://community.sdl.com/product-groups/translationproductivity/f/studio/12369/segments-found-in-tm-appear-with-at-status
    Maybe your issue has the same origin ("because the TU was created after it was translated with Machine Translation.  The origin gets saved into the TM." — but the other way round in your case)?

  • Thanks for the thoughtful response.  I do see where you're coming from so I spent some time searching through the dev database today and found that it was changed on purpose but there is a plan to reverse this behaviour in the next CU.  I think given the strength of opinion here it makes sense to ensure that what we are proposing matches your expectations too, so I'm sharing some of the content from the database for you to review.  Let me know if you are in agreement with this or if you still see a problem:


    1. Imported a TMX into a new en-us_de-de Studio 2017 TM.

    2. Analysed a document against it. It has two exact matches, 1) a segment that matches an alignment TU with creation user ALIGN! and 2) a segment that matches a machine translation TU with creation user MT!

    3. When you translate the file interactively in Studio, you get specific origin and status information on these segments and their TUs in the TM.

    a) alignment - is handled as draft status, origin alignment and 1% alignment penalty applied by default:

    Screenshot of Trados Studio showing Translation Results with a segment created by an alignment tool, marked as 'Alignment Tool result' with a draft status and 1% alignment penalty.

    When such an aligned segment is confirmed, the change user changes to the current user. You can see this in the re-exported TM, where the change user has changed to the user's ID. This is by design (the human has seen the aligned TU and so the penalty should no longer be applied).

    b) machine translation - is handled as draft status, origin "automated translation", no penalty applied by default:

    Screenshot of Trados Studio displaying Translation Results with a segment created by a machine translation program, labeled as 'Automated Translation' with a draft status and no penalty applied.

    When such a segment is confirmed, nothing changes in terms of origin. It will stay in the state origin "automated translation". However it will be picked up as an exact match unless you manually use a penalty for TUs that don't contain the creation user MT!  This was the behaviour in Studio 2017.

    After re-exporting the TM you get a TMX where for this segment it still says Creation and Change User MT!

    On creating reports the MT! segment ends up as a 100% match and the alignment as 99% by default.


    Please read it carefully and let me know what you think. This is the behaviour that is currently planned for the next CU in 2021.

    Paul Filkin | RWS

    [edited by: Trados AI at 12:59 AM (GMT 0) on 29 Feb 2024]
  • Hi Paul,

    Thanks so much for looking into this so thoroughly! I'm happy to hear that this functionality is on its way back. Do you know when to expect the CU that will reintroduce this functionality?

    With regard to the details of how MT TUs are handled, that is not the most important thing to me, but since you ask let me say that offhand I don't see a good argument for treating them differently from aligned TUs. Once a human translator has looked over an MT TU and approved it (with or without changes), I feel that the most logical thing would be to update the Change User to the current user - just as with aligned TUs. Sure, aligned TUs (probably) originate in "human-produced" texts to begin with as opposed to MT TUs, but once an MT TU has been looked over and approved by a human, it should be considered "as good as" any other approved TU - in my opinion. At the very least if the user makes a change in the MT TU, the Change User should be updated to the current user - I really don't see any sound argument why not to do that, especially if you allow an MT TU to be recognized as a 100% match the next time - which I definitely agree that it should. I would be interested in hearing your views on this.

    Do you know why MT TUs seem to "block out" 100%/fuzzies from the translation results window? I am experiencing this, and I have seen and heard other users report it. What seems to happen is that if there is an MT match, then neither fuzzies nor 100% matches are even shown in the translation results window - which seems strange. We can discuss the order in which the various match types are shown (I would prefer 100% matches > MT TUs > fuzzies), but I don't see a good argument for not showing 100% matches/fuzzies at all if an MT TU is found.

  • "Do you know when to expect the CU that will reintroduce this functionality?"

    No.  There are no dates attached to this update yet.

    On the rest, I'm adding and to the thread as they will be interested in the discussion I think.

    Paul Filkin | RWS


  • Thanks - indeed a great discussion, and happy to see it validate our thinking about making the confirmation of MT TUs consistent with aligned TUs.

    Daniel Brockmann
    Team Trados @ RWS

  • Hi Paul,

    Can you (or anyone else at SDL) tell me if this functionality has been reintroduced in Studio 2021 yet? I am in the slightly troublesome situation that I am still using Studio 2019, and I cannot simply test Studio 2021 since my trial license has expired, and if I convert my 2019 license to a 2021 license and the functionality has not yet been reintroduced, I will be "stuck" with 2021 without this important functionality. So I need to know when the change is implemented in 2021 before I can convert my license and start using 2021.

    Thanks!

  • Hi Dennis - it's tentatively scheduled for the next CU which would be around the beginning of March 2021 as far as we can say today. This is always tentative but we could reconfirm this as we get closer to the release? Thanks, Daniel

    Daniel Brockmann
    Team Trados @ RWS