Please explain the differences in results due to different "Analyze" settings

Hello,

I'm preparing a package to be sent out to translation companies. There are 8 very similar files. I'm first preparing the analysis results to be used as the basis for a quotation. I ran the analysis with two different settings and got two different results, which will ultimately affect the value of the quote. I want the result to be accurate, but I also want whichever gives my company the lowest quotation. Alternatively, could someone tell me whether both analyses would result in the same quote value?

Analysis profile 1 (山洋_IS研修資料_171106_見積用解析データ_internal.xlsx)
Settings:
Report cross-file repetitions: yes
Report internal fuzzy match leverage: yes
Results:
Repetitions: 7.49%
Cross-file Repetitions: 0.42%
New/AT: 74.23%

Analysis profile 2 (山洋_IS研修資料_171106_見積用解析データ.xlsx)
Settings:
Report cross-file repetitions: yes
Report internal fuzzy match leverage: no
Results:
Repetitions: 8.73%
Cross-file Repetitions: 7.19%
New/AT: 76.70%

And since there's no sensitive material, I'll just upload the two Excel files.

Any thoughts on which version would be best to use for a quote, from the customer's standpoint, would be appreciated.

Attachments:
山洋_IS研修資料_171106_見積用解析データ_internal.xlsx
山洋_IS研修資料_171106_見積用解析データ.xlsx

  • Hi Keenan,

    Usually you get a better count from the customer's standpoint if you enable internal fuzzy match leverage (you get fewer no matches and more fuzzy matches). You get an even better count if you analyse a single merged file instead of several individual files (with internal fuzzy match leverage enabled). Then, obviously, it depends on how you pay for internal fuzzy matches, fuzzy matches, no matches, etc.
    I notice that your report does not include words (just characters), so I guess your source language is Chinese (?). This may change the considerations about the count.
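To illustrate why the outcome depends on how each match band is paid, here is a minimal Python sketch of a weighted count. The band names, the weights in `RATE_WEIGHTS`, and the example counts are all made-up assumptions for illustration; they are not taken from the reports in this thread or from any vendor's actual rate grid.

```python
# Hypothetical discount grid: fraction of the full per-character rate paid
# for each match band. These weights are illustrative assumptions only.
RATE_WEIGHTS = {
    "repetitions": 0.10,
    "internal_fuzzy": 0.60,
    "new": 1.00,
}


def weighted_count(counts, weights=RATE_WEIGHTS):
    """Payable volume: each band's character count times its weight."""
    return sum(counts[band] * weights[band] for band in counts)


def quote(counts, full_rate, weights=RATE_WEIGHTS):
    """Quote value for a given full (no-match) rate per character."""
    return weighted_count(counts, weights) * full_rate


# Made-up example: the same 1000-character job analysed two ways.
without_internal = {"repetitions": 100, "new": 900}
with_internal = {"repetitions": 100, "internal_fuzzy": 200, "new": 700}

payable_without = weighted_count(without_internal)
payable_with = weighted_count(with_internal)
```

With these assumed weights the second analysis is cheaper, but if a vendor pays internal fuzzy matches at the full rate (weight 1.00) the two payable volumes come out the same.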

    Stefano

  • The "Report Internal Fuzzy Match Leverage" option calculates matches against a theoretical live TM rather than the static TM you are using for the analysis. This means an analysis with this option on takes into account the TM updates that take place during translation, so it generally gets better leverage, because the theoretical live TM simply has more entries than the actual static TM.

    The "Report Internal Fuzzy Match Leverage" option only works within a file; it does not work across files. If you merge all of your files during project creation, you will generally get better leverage, because internal fuzzy matches are then calculated across all of the merged files.

    So, to answer your question: turning "Report Internal Fuzzy Match Leverage" on and merging all of your files in the project would provide the best possible leverage, i.e. a lower quote, but it might take longer to get your translation project turned around. Which one is best is really up to the client, as long as the quote and the corresponding translation packages match.
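The within-file versus merged behaviour described above can be sketched with a toy simulation. This is only an illustration under stated assumptions: it uses Python's `difflib.SequenceMatcher` as a stand-in similarity measure, an assumed 75% fuzzy threshold, an empty starting TM, and invented sentences. It does not reproduce the product's actual matching algorithm, and it does not model cross-file repetitions.

```python
from difflib import SequenceMatcher


def best_similarity(segment, seen):
    """Highest similarity between `segment` and any earlier segment (0.0 if none)."""
    return max((SequenceMatcher(None, segment, s).ratio() for s in seen), default=0.0)


def analyze(files, internal_fuzzy, threshold=0.75):
    """Count match bands; each file gets its own simulated "live TM"."""
    counts = {"repetition": 0, "internal_fuzzy": 0, "new": 0}
    for segments in files:
        seen = []  # segments already "translated" in this file only
        for seg in segments:
            if seg in seen:
                counts["repetition"] += 1
            elif internal_fuzzy and best_similarity(seg, seen) >= threshold:
                counts["internal_fuzzy"] += 1
            else:
                counts["new"] += 1
            seen.append(seg)
    return counts


# Invented example segments (not from the files in this thread).
file_a = ["Press the start button.", "Press the stop button.", "Check the oil level."]
file_b = ["Press the reset button.", "Check the water level."]

option_off = analyze([file_a, file_b], internal_fuzzy=False)
separate_on = analyze([file_a, file_b], internal_fuzzy=True)
merged_on = analyze([file_a + file_b], internal_fuzzy=True)
```

In this toy data, `option_off` counts every segment as new, `separate_on` finds a fuzzy match only inside `file_a`, and `merged_on` finds the most fuzzy matches because segments from `file_b` can now match earlier segments from `file_a`.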

    What's curious is that, according to your word/character count reports, the analysis with "Report Internal Fuzzy Match Leverage" ON reports a significantly lower total character count (37571 characters), roughly half of the total reported with the option OFF (73255 characters). Could this be a defect in the character count analysis? This does not happen when we analyze word counts, i.e. the total word count is the same regardless of the "Report Internal Fuzzy Match Leverage" setting.
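Because the two reports have different totals, their percentages are not directly comparable. The short Python sketch below converts the New/AT percentages quoted in the question into absolute character counts, using the totals mentioned above. It assumes that each percentage is a share of that report's own total character count, which the thread does not confirm.

```python
# Figures quoted in this thread; the pairing of each percentage with its
# report total is an assumption.
profiles = {
    "internal fuzzy ON": {"total_chars": 37571, "new_pct": 74.23},
    "internal fuzzy OFF": {"total_chars": 73255, "new_pct": 76.70},
}


def new_chars(profile):
    """Approximate New/AT characters implied by a percentage of the total."""
    return profile["total_chars"] * profile["new_pct"] / 100


new_on = new_chars(profiles["internal fuzzy ON"])
new_off = new_chars(profiles["internal fuzzy OFF"])

# Ratio of the two report totals (ON / OFF).
total_ratio = (
    profiles["internal fuzzy ON"]["total_chars"]
    / profiles["internal fuzzy OFF"]["total_chars"]
)

for name, profile in profiles.items():
    print(f"{name}: about {new_chars(profile):.0f} New/AT characters")
print(f"ON total / OFF total = {total_ratio:.3f}")
```

Under that assumption, the similar-looking percentages correspond to very different absolute volumes, so the discrepancy in totals matters more for the quote than the percentage differences.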

     

    Thank you,

    Naoko