We have a very large translation memory (TM) which we had to split into four smaller ones. Which one do you suggest we update? Or how should we go about keeping the memory properly updated? Second: how do we get rid of duplicate TUs, or those very similar to the first TU?
If you split up the TM according to a certain criterion, like Client or Domain, then you can use the same criterion to update the relevant TM, i.e. when working on a project for Client A, you update the TM for Client A.
If you split it up without any particular criterion, then it’s difficult for me to say which TM you should update.
One way of reducing the size of a TM is to create a filter based on the Usage count and Created on fields. This allows you to filter out all TUs that were created before a certain date and have never been reused since, as they probably correspond to sentences that are ‘less useful’. You can then use the Export option to export those units to TMX first (and create a distinct TM with those units, which you could use as a reference TM), and delete them from the main TM afterwards (via the Batch Delete function).
You could also create a filter based on the Last Used On field, and do the same thing as above with all units that have not been used for a long time.
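If you prefer to check such a filter outside the TM editor first, the same idea can be sketched against a TMX export. This is only a rough illustration in Python, assuming the export stores `creationdate` and `usagecount` as attributes on each `<tu>` element in the TMX timestamp format (for example `20090101T120000Z`); your export may place them elsewhere. The function names are hypothetical, not part of any tool.

```python
import xml.etree.ElementTree as ET
from datetime import datetime


def parse_tmx_date(value):
    """Parse a TMX timestamp such as 20090101T120000Z; return None if absent."""
    if not value:
        return None
    return datetime.strptime(value, "%Y%m%dT%H%M%SZ")


def split_stale_units(tmx_text, cutoff, max_usage=0):
    """Split the <tu> elements of a TMX document into (keep, stale).

    A unit counts as stale when it was created before `cutoff` and its
    usage count is at most `max_usage`. Units without a creation date
    are kept, since there is nothing to judge them by.
    Assumption: both attributes sit on <tu>, not on <tuv>.
    """
    root = ET.fromstring(tmx_text)
    keep, stale = [], []
    for tu in root.iter("tu"):
        created = parse_tmx_date(tu.get("creationdate"))
        usage = int(tu.get("usagecount", "0"))
        is_stale = created is not None and created < cutoff and usage <= max_usage
        (stale if is_stale else keep).append(tu)
    return keep, stale
```

Calling `split_stale_units(tmx_text, datetime(2015, 1, 1))` would return the units to keep and the units to move to a reference TM. Swapping `creationdate` for `lastusagedate` gives the Last Used On variant, under the same assumption about where the attribute is stored.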
My preferred tool for finding duplicates is Okapi Olifant (okapi-olifant.software.informer.com/.../). It allows you to flag different types of duplicates, which you can then verify and delete, if required:
When you use the Flag entries menu item to search for duplicates (see above), Olifant will indicate the number of duplicates found, and you can then activate the filter to see only the flagged entries. Put your cursor in front of the first segment, press Ctrl + A to select all flagged entries, and press Delete to delete all duplicates.
Note that if you do that, you need to make sure the option ‘Flag also the original of each duplicate’ is NOT checked during the search; otherwise the originals are flagged as well and would be deleted along with their duplicates. Of course, you can also delete segments in a more controlled way by going through them manually.
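To make the ‘flag the duplicates but not the original’ idea concrete, here is a minimal Python sketch that works on a TMX export. It is not how Olifant works internally. It assumes TMX 1.4-style `xml:lang` attributes on each `<tuv>`, and it treats two units as duplicates when source and target match after collapsing whitespace and ignoring case, which is my own choice of criterion. The function names are hypothetical.

```python
import xml.etree.ElementTree as ET

# Fully qualified name that ElementTree uses for the xml:lang attribute.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def normalize(text):
    """Collapse whitespace and ignore case so trivial variants compare equal."""
    return " ".join(text.split()).casefold()


def segment_texts(tu):
    """Map lowercased language code -> normalized segment text for one <tu>."""
    texts = {}
    for tuv in tu.iter("tuv"):
        seg = tuv.find("seg")
        if seg is not None:
            # itertext() also picks up text inside inline tags within <seg>.
            texts[tuv.get(XML_LANG, "").lower()] = normalize("".join(seg.itertext()))
    return texts


def flag_duplicates(tmx_text, source_lang, target_lang):
    """Return the positions of <tu> elements that repeat an earlier unit.

    The first occurrence (the 'original') is never flagged, so deleting
    every flagged unit still leaves one copy of each pair in the TM.
    """
    root = ET.fromstring(tmx_text)
    seen = set()
    flagged = []
    for index, tu in enumerate(root.iter("tu")):
        texts = segment_texts(tu)
        key = (texts.get(source_lang.lower()), texts.get(target_lang.lower()))
        if key in seen:
            flagged.append(index)
        else:
            seen.add(key)
    return flagged
```

Because only the later occurrences are returned, the result can be reviewed manually or used as a deletion list without losing the first copy of each unit.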