How to handle a project with 80.000 words/200 files with many repetitions?

Dear colleagues,

Does anyone have a good working procedure for a huge project of 80.000 words in Studio 2014? I don’t know how to tackle the project so that I can work quickly and efficiently.

The project consists of 200 files; only 27.000 words out of the 80.000 words are ‘new/no match’.

Most of the rest are internal repetitions, but the problem with these repetitions is that the same words are sometimes a title (capitalized) and in other cases an image caption (lower case), etc., so they have to be checked case by case...

Do you think it makes sense to export, for instance, all the text that is not an internal repetition to a separate file, and to translate that single Studio file first before working my way through all 80.000 words?

Or would you suggest another working method that would be fast? The main aim is to avoid having to read the 40.000 repeated words again and again.

Your opinion would be highly appreciated.

Have a nice day,

Phil

  • Hi Phil,

    I'll work on the basis that you still want to see these segments as the context may be useful.  So in this case, if it was me I'd do this:

    • First I'd run the analyze batch task and "Export frequent segments"
        
    • This will create a folder in your Project Folder called "Exports" and in there you'll find an sdlxliff that contains the segments that are repeated throughout your project.
    • Add this file to your Project and translate it first.  Then run Pre-translate across your entire project, and maybe lock 100% matches automatically too.  When you come to translate the files, these segments will then be skipped as you work, so you don't have to worry about them.
    I don't know what type of content you have, and this may change the approach a little.  If, for example, it was full of number-only segments, then I'd probably handle these first using the SDLXLIFF Toolkit.
    But perhaps this will help, and maybe someone else can suggest a better way based on their experience.
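
To get a feel for how many repeated segments differ only by capitalization (the title versus caption case raised in the question), a rough check outside Studio can help. The sketch below is not a Studio feature: it assumes the source segments have already been extracted as plain strings, and the function name `repetition_report` is hypothetical.

```python
from collections import Counter, defaultdict


def repetition_report(segments):
    """Return exact repetitions and groups of segments differing only by case.

    `segments` is assumed to be a list of source strings that were already
    extracted from the project files by some other means.
    """
    exact = Counter(segments)

    # Group the distinct segments by their case-folded form.
    by_folded = defaultdict(set)
    for text in exact:
        by_folded[text.casefold()].add(text)

    # Segments that occur more than once with identical text.
    repeated = {text: n for text, n in exact.items() if n > 1}

    # Segments that exist in more than one capitalization.
    case_variants = {
        key: sorted(forms) for key, forms in by_folded.items() if len(forms) > 1
    }
    return repeated, case_variants


segments = [
    "Pump", "pump", "Pump",
    "Replace the filter.", "Replace the filter.",
    "Valve",
]
repeated, case_variants = repetition_report(segments)
print(repeated)
print(case_variants)
```

Segments listed in `case_variants` are the ones most likely to need a case-by-case look; the exact repetitions are the candidates for translating once and propagating.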

    Paul Filkin | RWS Group