Deletion of multiple entries

When I imported my MT 5.1 files into new MT Studio databases, quite a number of entries were somehow duplicated. Some even appear 12 times in my database. Since my MT database is rather extensive, fixing this manually would take days.

In MT 5.1 it was very simple: just synchronize on the index field, and duplicate entries were combined.
Why doesn't this work in MT 2014?

  • Hi Ineke,

    It should work in MT 2014 as well.  Maybe try reorganizing your termbase first?  Perhaps something was lost during the upgrade?

    Another way that might be interesting for you is to export to Excel using the Glossary Converter, then use Excel to remove the duplicates and convert back again afterwards.  How easy this will be depends mostly on the complexity of your termbase, but the Glossary Converter is an excellent tool that everyone should have in their armoury.

    Paul Filkin | RWS Group

    ________________________
    Design your own training!

    You've done the courses and still need to go a little further, or still not clear? 
    Tell us what you need in our Community Solutions Hub

  • Paul, thank you for your reply. My MT termbase is very extensive, so extensive that I am using 3 of them now, because someone told me they stop working properly when they are bigger than 200 MB. And all combined, mine are definitely larger.

    I tried reorganising, exporting and importing them into a new termbase, and the only effect was that even more entries were multiplied, so it doesn't work as easily as in MT 5.1.
    The best way would be to have an export file that could be imported into MT 5.1, which worked very well for me, then create a new export and import it into MT 2014 again.

    Cleaning up the file in Excel is not an option; it is simply too big and too time-consuming.
    I tried the Glossary Converter, but it did not make me happy. I spent a few hours wrestling with it, but it didn't do what I wanted, so I gave up again.
  • Hi Ineke,

    The practical limit of a file-based MultiTerm termbase is around 2 GB, because it's basically an Access database. So 200 MB should not be a problem at all.

    If you are able to share one of your termbases, I'd be happy to take a look at it.  If you can do this, please send a link to pfilkin@sdl.com. You can use Dropbox or wesendit perhaps?

    Regards

    Paul


  • 2 GB, okay, that is better.
    I have 3 databases now, totalling about 600 MB, so in your view I should be able to combine them into one and still be able to add to it. That would be great.

    I have created a folder in Dropbox and have sent you an invitation. The files are being uploaded right now.

    Thanks!
  • Paul, thank you so much for your help in creating a new database with all the entries in it, without the duplicates. This is very helpful and will save me time, since until now I was fixing the files one entry at a time, every time I saw a duplicate.
    Now I will not have to spend any more time on this, and I can use just one database instead of several.

    I am really grateful, thank you so much!
  • Thanks Ineke,

    No problem. Your files were pretty large, and I had to keep them split for the import because MultiTerm struggled with the reorganisation.  There is a KB article somewhere related to that, but importing them as separate files worked fine.  I also double-checked that the export was OK, as this would now be one large XML file, and using the Glossary Converter this worked easily in around 3 seconds!  So I'd recommend you do this regularly to keep a backup of your XDT and XML.

    The steps I used to resolve the problem you were having were as follows:

    1. Used the Glossary Converter to get your three files into Excel
    2. Tidied up the Excel files (simple to do with filtering) and removed empty fields altogether
    3. Used the Glossary Converter to create the XDT and MultiTerm XML files for the three tidied-up Excel files
    4. Created a new MultiTerm termbase with one of the XDT files
    5. Created a new import model based on merging on English to make sure I had what you wanted
    6. Imported all three MultiTerm XML files
    7. The import was fast; the reorganisation takes a little while, especially for the biggest of the files
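
For anyone who prefers to script the tidy-up instead of filtering in Excel, here is a rough sketch of what "merging on English" amounts to. It is not what MultiTerm or the Glossary Converter do internally. It assumes the sheet has been saved as CSV with one row per entry, and the column names `English` and `Dutch`, the sample rows, and the `merge_on_index` helper are all made up for illustration. Matching on a trimmed, case-insensitive term is also an assumption.

```python
import csv
import io


def merge_on_index(rows, index_field):
    """Combine rows that share the same index term, keeping each distinct value once."""
    merged = {}
    for row in rows:
        # Assumption: duplicates are matched on the trimmed, case-insensitive term.
        key = (row.get(index_field) or "").strip().casefold()
        if not key:
            continue  # skip rows that have no index term at all
        entry = merged.setdefault(key, {})
        for field, value in row.items():
            if field is None:
                continue  # ignore stray cells beyond the header columns
            value = (value or "").strip()
            values = entry.setdefault(field, [])
            if not value or value in values:
                continue
            if field == index_field and values:
                continue  # keep only the first spelling of the index term
            values.append(value)
    return [
        {field: " | ".join(values) for field, values in entry.items()}
        for entry in merged.values()
    ]


# Hypothetical sample data standing in for an exported sheet.
SAMPLE = """English,Dutch
invoice,factuur
Invoice,factuur
invoice,rekening
termbase,termenbank
"""

rows = list(csv.DictReader(io.StringIO(SAMPLE)))
merged = merge_on_index(rows, "English")
for entry in merged:
    print(entry)
```

In this sample the three "invoice" rows collapse into one entry that keeps both distinct Dutch terms. The cleaned rows could then be written back out with `csv.DictWriter` and converted again with the Glossary Converter.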

    Kind regards

    Paul

