Translation memory management: duplicate TUs or overwritten TUs

Hello,

I manage the TMs and translation resources for my company. When I import a TU with a custom set of fields (filename, category, subcategory, translator, date entry, native check) and then import the exact same TU from a different project with a different set of fields, the original TU is overwritten. Instead of having two TUs I have just one. This is problematic: if a TU is used for one manual and then overwritten with info from another manual, the reliability of the TU is called into question, even though it may have been used several times.

Is there any way to stop this from happening? I would like to have two TUs rather than have old TUs overwritten.

As of now, the only way I know to NOT overwrite an old TU is to import the new TU into another TM. I split our master TM into 3 TMs for our main divisions and this sort of helped. Another workaround is to customize the field settings of the TU to allow multiple values. That way I could keep track of where the TU was used (useful for manuals or, say, making sure that a speech by the company president stays associated with the president). But this would lead to really huge "filename" fields for me, and it would potentially take an incredibly long time to set up because this setting can only be chosen on a fresh new TM... right? How would I do this for a TM with a few hundred projects and several tens of thousands of TUs?

So anyway, if anyone has a good translation memory management method or tips, especially for in-house translators, I'd love to hear your thoughts.

Best regards,

Keenan

  • Hello,

    If you import a TM into another TM, the TUs will be overwritten.

    The solution for you will be to use the CTRL + SHIFT + U function while translating and confirming segments.

    Also the below article might be useful:
    gateway.sdl.com/communityknowledge

    Wish you a great day.

    Best Regards,
    Ana

    Ana-Maria Matefi | RWS Group

    _____
    Design your own training!

    You've done the courses and still need to go a little further, or still not clear? 
    Tell us what you need in our Community Solutions Hub

  • What I don't get is how this process works if the translator works completely "independently", i.e. on a translation package (either with the Project TM or with a separate local copy of the Main TM).
    Then the return package contains only the SDLXLIFFs... so is this "Add as New Translation" flag remembered somewhere in the SDLXLIFF, so that when the engineer runs the Update Main TM batch task on his side (i.e. completely separately from the translators!), the TUs are written to the TM exactly as the translator originally intended?!
  • Hi,

    Thanks for your email, and your patience, explaining to me again what your problem is.  I'm going to summarise the situation so it's clear (for me at least):

    • you need to be able to differentiate between duplicate translations based on your field values
    • you have many field values in use, all of which are important, so using "allow multiple values" is not a solution for you because there would be too many in the TM results pane to make it possible to use them during translation
    • you could apply filters to the TMs based on a field value so you only get results from the field you want, but this is not a solution for you either
    • using multiple TMs is not a solution for you because the management of so many TMs is impractical
    • what you would like is the ability to update and create duplicates during import

    Currently, as you know, it is very difficult by design to create a true duplicate in a Studio TM.  Even if you go to TM Maintenance and manually create one by copy-pasting over another, you'll find Studio removes one automatically.  So making the change to allow this would be an enhancement, and probably not a simple one either.  It is also very likely to cause duplication where you don't want it, because it would probably be an all-or-nothing approach, and this would add unwanted maintenance to your TMs. Maybe an option to suggest is to add new translation units if the field values differ; perhaps that would give you the desired effect?

    TM updates are quite a complicated area and all changes need to be very carefully thought through. However, I think the best approach is for you to log this request in the ideas site, so the product management team can pick it up and decide whether it's something they wish to introduce, based on their views and the amount of support you get.

    I hope this reflects the situation.  Sorry it took so long to get there too!

    Paul Filkin | RWS Group

  • Unknown said:
    Maybe an option to suggest is to add new translation units if the field values differ

    How do I "add new translation unit" as an engineer when performing a TM update after receiving translation packages from translators?
    AFAIK, adding a new TU is possible only during translation (i.e. on the translator's side), but that action is completely disconnected from what happens in the main project after importing the return package...
    Or am I missing something? Is the "this segment is supposed to be added as a new TU" information recorded somewhere in the SDLXLIFF (so that a subsequent TM update can act accordingly)? I believe it's not.

  • Hi,

    You're not missing anything... there is no such option. I was suggesting Keenan might want to suggest this when/if he creates his idea.  But re-reading what I have written I can see it wasn't very clear!  Sorry about that.

    Paul Filkin | RWS Group

  • Hi Keenan,

    I guess you are trying to put the cart before the horse.

    You don't need multiple TUs with the same translation, which, as Paul said, is difficult to implement. But what about a workaround which might be way easier to implement?

    Your example:

    TU#1 Source: パワーコンディショナー Target: PV inverter    Filename: P11 manual
    TU#2 Source: パワーコンディショナー Target: PV inverter    Filename: T11 manual
    TU#3 Source: パワーコンディショナー Target: PV inverter    Filename: R11 manual
    TU#4 Source: パワーコンディショナー Target: PV inverter    Filename: S11 manual
    TU#5 Source: パワーコンディショナー Target: PV inverter    Filename: Q11 manual
    TU#6 Source: パワーコンディショナー Target: power conditioner    Filename: F11 manual

    So all six of those TUs are imported, in that order, which means there are only two TUs in that TM:
    TU#1 Source: パワーコンディショナー Target: PV inverter    Filename: P11 manual
    TU#6 Source: パワーコンディショナー Target: power conditioner    Filename: F11 manual

    Now I am translating something and I want it to be similar to the R11 manual, like, an update of the R11 manual.

     

    You suggest you need several TUs for the various Filename values, so that translators know which one to use. But what if importing TUs into your existing TM offered the following option (with a checkbox to enable it):

    If source and target are the same, then update the following fields (the fields would be selected with checkboxes).

    In the background, the process would use a regex to check whether the field's current string contains the string from the new TU's field, and update it if necessary:

    old value for field Filename:

    P11 manual

    value for field Filename of new TU:

    R11 manual

    new value for that TU in TM:

    P11 manual; R11 manual
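
    To make the merge step concrete, here is a minimal Python sketch. It is only an illustration of the idea, not anything Trados does: the function name merge_field and the "; " separator are assumptions. Instead of a regex "contains" check it splits the field and compares whole values, so that "R11 manual" is not mistaken for part of a longer filename.

```python
def merge_field(existing: str, incoming: str, sep: str = "; ") -> str:
    """Append `incoming` to a separator-joined field unless it is already listed.

    Hypothetical helper: models the proposed "update fields on identical
    source and target" import option, not an existing Trados feature.
    """
    # Split the stored field into its individual values, ignoring empty pieces.
    values = [v.strip() for v in existing.split(sep.strip()) if v.strip()]
    incoming = incoming.strip()
    # Compare whole values rather than substrings to avoid false positives.
    if incoming and incoming not in values:
        values.append(incoming)
    return sep.join(values)


print(merge_field("P11 manual", "R11 manual"))
print(merge_field("P11 manual; R11 manual", "R11 manual"))
```

    The first call adds the new filename to the field; the second leaves the field unchanged because "R11 manual" is already listed.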

     

    During translation it would look like this:

    1 Source: パワーコンディショナー Target: power conditioner    Filename: F11 manual 99% match (-1 penalty)
    2 Source: パワーコンディショナー Target: PV inverter    Filename: P11 manual; R11 manual 99% match (-1 penalty)

     

    So the translators would see which TU to use for the manual they are translating without the TM result window being cluttered with too many strings.

     

    Alternatively, Trados could have an additional filter for the TM result window that could be set in the project options:

    Filter Fieldname: R11 manual

    TM result window would then show results with Fieldname containing "R11 manual" as first result (most relevant one), which would mean less scrolling for the translator.

     

    Regards,

    Pascal

  • Unknown said:

    During translation it would look like this:

    1 Source: パワーコンディショナー Target: power conditioner    Filename: F11 manual 99% match (-1 penalty)
    2 Source: パワーコンディショナー Target: PV inverter    Filename: P11 manual; R11 manual 99% match (-1 penalty)

     

    So the translators would see which TU to use for the manual they are translating without the TM result window being cluttered with too many strings.

    Ugh, can you imagine the mess if there were like 20+ filenames, 50+ characters each?!

  • Pascal,

    Thank you for your attempt. But unfortunately, your advice is not applicable. 
    Also, in response to "you don't need duplicate translations": I've said, repeatedly, that I and my team clearly do. If you're hung up on the word "duplicate" (I know the design team built Trados to avoid duplicates on purpose), then think of it as "protection". I want certain data to be protected from being overwritten.

    Your first suggestion, though I don't quite understand how you would implement it, arrives at the impractical "cram multiple values in one field" solution, which, as I've explained before, would result in huge fields. On top of that, I'd first have to turn the "allow multiple values" option on, which would delete the contents of the field. I'm resigned to having to try this, though, using the import/export advice Daniel gave me in the ideas group.

    You wrote:
    Alternatively, Trados could have an additional filter for TM result window could be set in project options:

    Filter Fieldname: R11 manual

    TM result window would then show results with Fieldname containing "R11 manual" as first result (most relevant one), which would mean less scrolling for the translator.

    This wouldn't work because essential R11 TUs would've been overwritten by later data. I can't use a filter to see only R11 data if some of that data no longer exists (isn't associated with R11 anymore). In the TM view, I've made changes to older files, filtered only what I needed, and noticed that once a batch edit was complete, the number of edited TUs was always LESS than what was originally added according to my records. So filtering doesn't work when data has been overwritten. Also, if I wanted to extract/export data from a TM using a field, I wouldn't actually get all the data I wanted, because some of it has been overwritten and is no longer associated with its original source.

  • Yes! This! This is exactly my worry and why I'm not enthusiastic about this solution.
  • Hi Keenan,

    The filter I mentioned was meant in combination with my idea; with the way Trados works now, it certainly is not possible in your case.

    Still, as a programmer and database admin I know there are better solutions to avoid your problem, but I'm not sure SDL could easily implement this one for TMs.

    They would need to set up a second table with foreign keys to the TU table, for TUs that exist in different manuals. In that table they would link the fields connected to each TU. By doing this you can keep the fields clean and store several references per TU. With this setup my suggested filter would still work and would show only one TM match with the required fields.

    For those who don't know much about DB structure, it would look like this (for your example):

    Table 1: (contains only TUs)

    ID Source Target

    TU#1 Source: パワーコンディショナー Target: PV inverter
    TU#2 Source: パワーコンディショナー Target: power conditioner

    Table 2: (contains the fields linked to TUs for various projects)

    ID ID (foreign key from other table) Field
    1 TU#1 Filename: P11 manual
    2 TU#1 Filename: T11 manual
    3 TU#1 Filename: R11 manual
    4 TU#1 Filename: S11 manual
    5 TU#1 Filename: Q11 manual
    6 TU#2 Filename: F11 manual

    With filters, Studio would query it like this:

    Filter for Filename: T11 manual would result in showing:

    TU#1 Source: パワーコンディショナー Target: PV inverter Filename: T11 manual
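
    A minimal sketch of this two-table layout, using Python's built-in sqlite3 module. The table names (tu, tu_field), the column names and the helpers import_tu and lookup are made up for illustration; this is an assumption about how such a design could look, not how Studio actually stores a TM.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tu (                 -- Table 1: one row per unique source/target pair
    id     INTEGER PRIMARY KEY,
    source TEXT NOT NULL,
    target TEXT NOT NULL,
    UNIQUE (source, target)
);
CREATE TABLE tu_field (           -- Table 2: field values linked to a TU
    id    INTEGER PRIMARY KEY,
    tu_id INTEGER NOT NULL REFERENCES tu(id),
    name  TEXT NOT NULL,
    value TEXT NOT NULL,
    UNIQUE (tu_id, name, value)
);
""")


def import_tu(source, target, filename):
    """Store the TU once and attach the filename as an extra field row."""
    conn.execute("INSERT OR IGNORE INTO tu (source, target) VALUES (?, ?)",
                 (source, target))
    (tu_id,) = conn.execute(
        "SELECT id FROM tu WHERE source = ? AND target = ?",
        (source, target)).fetchone()
    conn.execute(
        "INSERT OR IGNORE INTO tu_field (tu_id, name, value) "
        "VALUES (?, 'Filename', ?)", (tu_id, filename))


def lookup(source, filename):
    """Return matches for `source` restricted to one Filename value."""
    return conn.execute(
        "SELECT tu.source, tu.target, f.value FROM tu "
        "JOIN tu_field f ON f.tu_id = tu.id "
        "WHERE tu.source = ? AND f.name = 'Filename' AND f.value = ?",
        (source, filename)).fetchall()


SRC = "パワーコンディショナー"
for target, filename in [
    ("PV inverter", "P11 manual"), ("PV inverter", "T11 manual"),
    ("PV inverter", "R11 manual"), ("PV inverter", "S11 manual"),
    ("PV inverter", "Q11 manual"), ("power conditioner", "F11 manual"),
]:
    import_tu(SRC, target, filename)

print(lookup(SRC, "T11 manual"))
```

    After the six imports the tu table holds two rows and tu_field holds six, so no filename association is lost and a filter on "T11 manual" returns a single match.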

    As a user, programmer and database admin, I understand that users always want their ideas implemented straightforwardly (I see it every day when my colleagues ask me to quickly implement some obvious idea in the app, and these ideas often make sense for everyday use). But I also know the other side of the communication channel. Unfortunately, altering existing tables and databases is not as easy as it seems, because programmers need to change a lot of the application's code to do it, not to mention the risk of corrupting working data structures.

    So you need to find a middle ground. The problem is that users don't understand the programmers' "language" and vice versa, because they think on different levels. So it can happen that they talk about the same thing, each in their own language, without realizing it, and both get frustrated.

    So I'm not saying that your idea is crap or that you don't know what you need, but simply that we need to find a middle ground between user needs and programming that programmers could easily apply to the existing structure. As I don't know the DB structure inside a TM, even my suggestions could be completely useless to them, as it might not be possible to add tables at all. ;)

    regards,
    Pascal