Different Analysis wordcounts between project languages for the same set of source files (Studio 2015 SR3)

Good afternoon,

We're setting up a Studio project for a translation from English into German and Chinese of a set of 38 files which together add up to around 30,000 words, but the Studio analysis gives different total (source) wordcounts between the German and Chinese projects - see screenshots below. I understand that the different TMs involved will of course affect the split of matches/repetitions across the match categories for each language pair, but what I don't understand is why there's a difference of ~4000 source words as well as a difference in the number of placeables between the English>German and English>Chinese parts (given that the files involved are identical). The total numbers of characters and tags are identical so it's clearly the same set of files. Can anybody help me to understand this please?
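
To make the symptom concrete, here is a toy sketch in Python of how a source word total could change while the character total stays fixed. It is not Studio's counting algorithm and does not claim to explain the cause. The `toy_analysis` function, the `MEASUREMENT` pattern and the rule that a recognized measurement counts as one word are all assumptions made up for illustration.

```python
import re

# Hypothetical pattern (an assumption, not a Studio rule):
# a number followed by a unit, e.g. "12 mm".
MEASUREMENT = re.compile(r"\d+(?:\.\d+)? (?:mm|cm|kg)\b")


def toy_analysis(text: str, recognise_measurements: bool) -> dict:
    """Count words, placeables and non-space characters in a toy way."""
    # Characters are counted on the raw text, so this total never
    # depends on whether recognition is switched on.
    characters = len(text.replace(" ", ""))
    placeables = 0
    if recognise_measurements:
        # Each recognized measurement collapses into a single token,
        # so it is counted as one word and one placeable.
        text, placeables = MEASUREMENT.subn("<M>", text)
    words = len(text.split())
    return {"words": words, "placeables": placeables, "characters": characters}


sample = "Cut the 12 mm pipe and the 3.5 kg block"
print(toy_analysis(sample, recognise_measurements=True))
print(toy_analysis(sample, recognise_measurements=False))
```

In this toy, both calls report the same number of characters, while the word and placeable totals differ depending on whether recognition is on. That is the same pattern as in the two analyses described above.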

Thanks very much!

Richard

  • Hi Richard,

    I can only see the German analysis. Can you add the one for Chinese too?

    Regards

    Paul

    Paul Filkin | RWS Group

  • Sorry, Paul - I've just realized that the project was actually created in Studio 2011, not Studio 2015 (a couple of older versions are still in use within our team). Perhaps this makes a difference - sorry for giving you the wrong information. My colleague who created the project has since found that disabling all TMs for the German project (the Chinese project already had none enabled) produces the same total wordcount for EN>ZH and EN>DE, so whether or not any TMs are enabled seems to make a difference. I'm not sure why that should account for ~4000 words, though.
  • Hi Paul,

    Another quick update on this issue - we're seeing the same thing in Studio 2015 too. I've replicated the project in Studio 2015 using the same source Word files, and the same thing happens: the EN>ZH analysis shows a total wordcount that's about 3500 words lower than the EN>DE analysis but has the same total number of characters, and whether or not there is a TM enabled for the language combination seems to make a big difference to the total count. The only way we can get to the same wordcount for both the German and Chinese in Studio 2015 is by not having any TMs enabled for either language. Would you expect this to happen?

    As additional background, there are lots of hyperlinks throughout the source files and we've now decided to change the Word 2007 v 2.0.0.0 File Type setting to "Extract only hyperlink text" rather than "Always process hyperlinks", but I guess any difference resulting from this change should apply equally to both languages, if you see what I mean.

    Thanks very much for your help!

    Richard