Word count is different to client's word count

Hi everyone,

I have noticed that my word count differs from the client's when they send me a project, and I can't figure out why. Here is the process I follow when setting up a project:

1. Import client's project package (contains TMs).

2. Add my own TM.

3. Re-analyse.

When I re-analyse, the CM and 100% match counts are lower than they were before I added my own TM. If I re-analyse before adding my own TM, the numbers match what the client reports.

Any ideas why this may be happening?

Thanks in advance,

Katie

  • Hi 

    Re the CMs: for a context match (CM), not only do both source and target segments have to be identical to those in the TU in the TM, but the preceding segment also has to match the one stored with that TU. So this count is likely to be higher with the client's TM than with yours, as their TM may contain more TUs with matching preceding segments than yours does.
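To make the CM rule concrete, here is a minimal Python sketch. It is a simplified model only, not how Trados Studio stores or scores matches internally; the `TU` class, its `prev_source` field and the `classify` function are hypothetical names made up for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class TU:
    """Hypothetical, simplified translation unit."""
    source: str
    target: str
    # Assumption: the TM remembers the source text of the segment that
    # preceded this one when the TU was stored.
    prev_source: Optional[str]


def classify(segment: str, prev_segment: Optional[str], tm: List[TU]) -> str:
    """Return 'CM', '100%' or 'no exact match' under this simplified model."""
    exact = [tu for tu in tm if tu.source == segment]
    if not exact:
        return "no exact match"
    # A context match additionally needs the preceding segment to agree.
    if any(tu.prev_source == prev_segment for tu in exact):
        return "CM"
    return "100%"


# Same TU text in both TMs, but stored with a different preceding segment.
client_tm = [TU("Press OK.", "Drücken Sie OK.", prev_source="Open the dialog.")]
my_tm = [TU("Press OK.", "Drücken Sie OK.", prev_source="Select a file.")]

print(classify("Press OK.", "Open the dialog.", client_tm))  # CM
print(classify("Press OK.", "Open the dialog.", my_tm))      # 100%
```

In this toy model the same segment is a CM against one TM and only a 100% match against the other, purely because of what was stored as the preceding segment.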

    Re the 100% matches: firstly, the client's TM may contain 100% matches that other users have generated and that therefore don't appear in your TM. In addition, in my experience the software 'sees' things that the user can't: background properties of a TU that go beyond the visible words.

    For example, in my experience a TM can create what we would perceive as 'identical duplicates': two TUs that are apparently identical in source and target BUT have different background properties. Many years ago, this was explained to me when I was experiencing this problem with the biggest TM of the translation agency I then worked for. As we couldn't find a cause specific to our setup, the problem was referred to the developers. They established that the TM was generating new TUs by adding different background hexadecimal values, even though the translation in the document was ostensibly identical to that in the TM. No cause was established at that time, and there are reports in the community suggesting that this still happens. Because it doesn't happen to everyone in every case, there is no way of defining what actually triggers it.

    If your TM is doing this (the same client TM is still sometimes doing this even though I'm now a freelancer and using it on my own setup), then for every so-called identical duplicate you'll be getting a 99% match in the analysis rather than 100%. Thus your final count of 100% matches will be lower than your client's if their TM has no 'identical duplicates'.
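One way to check a TM for such 'identical duplicates' is to export it to TMX and group the TUs by their visible text. The following is a minimal sketch, assuming a TMX export with the usual `tu`/`tuv`/`seg` elements; the function name `visible_duplicates` and the sample data are made up for illustration, and the stray placeholder tag in the sample is just one possible hidden difference, not an established cause of the problem described above.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict


def visible_duplicates(tmx_text: str) -> dict:
    """Group TUs by visible segment text; return groups with more than one TU."""
    root = ET.fromstring(tmx_text)
    groups = defaultdict(list)
    for index, tu in enumerate(root.iter("tu")):
        # itertext() drops the markup and keeps only the text a user would read.
        texts = tuple("".join(seg.itertext()).strip() for seg in tu.iter("seg"))
        groups[texts].append(index)
    return {texts: idxs for texts, idxs in groups.items() if len(idxs) > 1}


# Cut-down sample: TUs 0 and 1 read the same, but TU 1 hides a placeholder tag.
SAMPLE = """<tmx version="1.4"><header/><body>
<tu><tuv xml:lang="en"><seg>Press OK.</seg></tuv>
    <tuv xml:lang="de"><seg>Drücken Sie OK.</seg></tuv></tu>
<tu><tuv xml:lang="en"><seg>Press <ph x="1"/>OK.</seg></tuv>
    <tuv xml:lang="de"><seg>Drücken Sie OK.</seg></tuv></tu>
<tu><tuv xml:lang="en"><seg>Cancel</seg></tuv>
    <tuv xml:lang="de"><seg>Abbrechen</seg></tuv></tu>
</body></tmx>"""

for texts, indexes in visible_duplicates(SAMPLE).items():
    print(texts, "appears in TUs", indexes)
```

Any group this reports is worth opening in the TM editor to see whether the TUs really differ in something that is not shown on screen.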

    When you're dealing with heavy-duty software with functionality as wide as that of SDL Trados Studio and MultiTerm, in environments/PC setups as varied as they are today, it isn't surprising that anomalies occur.

    So, in short, different preceding TUs and duplicate TUs, for example, could be causing this difference.

    All the best,

    Alison