Comparing two translation memories and purging translation units from one if duplicated?

Hi everyone!

I have a (perhaps) strange need; let's see if somebody has an idea about it.

I need to use two Translation Memories. One is "old" (let's call it Old TM) and one is "new and polished" (let's call it Good New TM). They have a lot of different Translation Units, but they also share plenty whose source is identical (while the target may or may not differ). The Old TM has some outdated translations and some valid ones. All the Good New TM's translations are good. However, these TMs are NOT identical: some Translation Units are exclusive to just ONE of the TMs, while others are common to both.

I need to purge from the Old TM those segments whose source is exactly the same as in the Good New TM, so that when I perform a concordance search or a lookup, Trados will only show me one 100% match at most.

As things are now, a lookup often returns two different 100% matches for the same segment: one from Old TM and another from Good New TM. Since, in such cases, the valid option is ALWAYS the one from Good New TM, I'd like a way to delete/purge/clean up such segments from Old TM. If a segment does not appear in both Translation Memories, everything should stay as before.

Example:

If the source segment says:

Open the window

And the lookup operation results in two 100% matches:

Abre la ventana (from Old TM)

Abre la ventanilla (from Good New TM).

I'd like a way to batch delete this Translation Unit from Old TM because the good translation is "Abre la ventanilla" and I don't want two 100% matches when using these two TM's.

Is there a way or a plugin to do this? I have done some research, but to no avail. The reason I am asking is that both Translation Memories are huge, and I can't afford to delete Translation Units one by one.

And no, I can't just dump the Old TM into the Good New TM with the "don't overwrite" setting. I need both separated as I can trust the Good New TM but the Old TM is not too reliable. So no, I need two independent Translation Memories, I only need to remove segments in the OLD TM if and only if their source text is IDENTICAL to the source text in the Good New TM.

This behaviour would be useful for purging outdated translations from otherwise still valid Translation Memories; I am sure some people could definitely use that.

Any ideas? Is there any plugin out there? Would exporting to TMX and performing some kind of operation on the files work?
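
For the TMX route, here is a minimal sketch of what such an operation might look like, assuming both TMs have been exported to TMX and that "identical source" means the plain text of the source segment matches exactly (inline tags are ignored, case and whitespace are not). The function names, file names and the "en-US" language code are hypothetical, and this is only an illustration of the filtering logic, not a tested Trados workflow. It reads both files with Python's standard xml.etree.ElementTree and removes from the Old TM every translation unit whose source text also occurs in the Good New TM.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes the xml:lang attribute under this namespaced key.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def source_text(tu, src_lang):
    """Return the plain text of the TU's source segment, or None if absent."""
    for tuv in tu.iter("tuv"):
        # Assumption: TMX 1.4 uses xml:lang; older files may use a plain "lang".
        lang = tuv.get(XML_LANG) or tuv.get("lang") or ""
        if lang.lower() == src_lang.lower():
            seg = tuv.find("seg")
            if seg is not None:
                # itertext() concatenates the text and skips the inline tag
                # elements themselves (their text content, if any, is kept).
                return "".join(seg.itertext())
    return None


def purge_shared_sources(old_root, good_root, src_lang):
    """Remove from old_root every TU whose source text also occurs in good_root.

    Returns the number of translation units removed.
    """
    good_sources = {source_text(tu, src_lang) for tu in good_root.iter("tu")}
    good_sources.discard(None)

    body = old_root.find("body")
    removed = 0
    for tu in list(body.findall("tu")):  # copy the list: we mutate body
        if source_text(tu, src_lang) in good_sources:
            body.remove(tu)
            removed += 1
    return removed


def purge_file(old_path, good_path, out_path, src_lang):
    """File-level wrapper: writes the purged Old TM to out_path."""
    old_tree = ET.parse(old_path)
    good_root = ET.parse(good_path).getroot()
    removed = purge_shared_sources(old_tree.getroot(), good_root, src_lang)
    old_tree.write(out_path, encoding="utf-8", xml_declaration=True)
    return removed
```

A call such as purge_file("old_tm.tmx", "good_new_tm.tmx", "old_tm_purged.tmx", "en-US") would write a new file and leave both exports untouched; the purged file could then be imported into a fresh, empty TM. Whether a plain-text comparison lines up with what Trados reports as a 100% match is something to verify on a small sample first, since tags and formatting may be treated differently.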

  • That is actually something I am working on, too. Good incentive to get thinking about it...

    First I generate an Old TM:

    Screenshot of Trados Studio showing an 'Old TM' with columns for Name, Source Language, and Target Language.

    Then a Good New TM:

    Screenshot of Trados Studio displaying a 'Good New TM' with columns for Name, Source Language, and Target Language.

    This Good New TM I export to TMX.

    I add a field to the Old TM: "Source".

    I import the content of the Good New TM into the Old TM using the "overwrite" setting and setting the "Source" value to "Good New TM" for each imported segment.

    This is the result:

    Screenshot of Trados Studio with an updated 'Old TM' showing an added 'Source' field and imported content from the 'Good New TM'.

    Content unique to the Good New TM, shared content with an identical target, and shared content with a different target all have the "Source" field value "Good New TM". Only units whose source segments are unique to the Old TM have no "Source" value.

    Now I can create a filter on the value of "Source" and batch delete every unit marked "Good New TM", which leaves only the units unique to the Old TM. (Do back up your files before you do this...)

    Daniel

    [edited by: Trados AI at 8:17 PM (GMT 0) on 28 Feb 2024]