How does SDL Language Cloud MT calculate the number of characters?

One of my colleagues used this MT last month but her account was quickly blocked as she reached the limit. However, she knows she hasn't actually translated that much using the MT.

So I did a simple test: I ran a file through the MT one segment at a time while keeping my account page open. To my surprise, my account shows I translated 6416 characters, while a count of the same text in Word gives me 1644 characters (no spaces) and 1926 characters (with spaces)...

So... I do wonder. How does the algorithm calculate the number of characters? No wonder my colleague reached the limit so easily!

  • Hello  ,

     As there appears to be quite a discrepancy in your example and indeed quite technical (algorithms) maybe  would be able to help answer this one.

     

    In the meantime, could you please share the test file you used (swhale@sdl.com)? We can then compare the results and see whether a few setting changes in Studio make a difference.

    Lydia Simplicio | RWS Group


  • Hi Steven,

    Any update on this? I just upgraded to Studio 2019 and did a quick test, finding exactly the same behaviour and issue... Why would we want to pay for a service if we know it does not operate fairly?
  • Hi Daniel,

    I've tested this with a sample file in Trados Studio 2019 and I cannot see such a difference.

    Word is giving me 1,122 characters (without spaces) or 1,329 (with spaces), and after I pre-translate the file with Language Cloud in Studio 2019 my usage increases by 1,304.

    Please note that if you have LC MT added as an MT provider, Studio might use it in the batch tasks before you run a pre-translate, even if the "Apply MT when no match is found" option is NOT ticked. So please make sure you only add LC MT before you pre-translate. Studio will also send segments to Language Cloud when you go segment by segment to post-edit and confirm. This is why the same segment might be sent for translation twice. If you use Language Cloud only segment by segment, it should send each segment for translation only once. If the segment is not confirmed, Studio will send the same segment to Language Cloud again every time you land on it.

    You can add an idea on the Trados Studio ideas page asking for this to be improved, and if it gets more supporters Product Management should prioritize it: https://community.sdl.com/ideas/translation-productivity-ideas/i/trados-studio-ideas

    If you want, we can take a look at this in detail. Just contact SDL Support for Language Cloud MT through https://gateway.sdl.com/webtocaseLCMT

    Thank you

    Radu

     

    [Image: Word Count dialog box showing statistics: 2 pages, 239 words, 1,122 characters without spaces, 1,329 characters with spaces, 42 paragraphs, 57 lines.]
    [Image: Trados Studio Language Cloud usage dashboard showing 96,673 used characters and 9,903,327 remaining characters with 0.97% used.]

  • Hi Radu,

    For the purpose of my test, I had LC MT as the only provider in my settings. No other TM or MT was set. I did not use the batch task, but opened the file directly and went through it one segment at a time, waiting for the output from LC MT... I did not confirm any segment, just went from one segment to the next, waiting for the MT to populate it.

    In the end, I still found a big difference:
    In Word: 2227 characters without spaces/2617 characters with spaces
    In LC MT: 4992 characters...

    I can't figure out any reasonable explanation. To me, there is still a bug that needs to be fixed. Could it be related to some pairs only? I am using it from English to French.

    Regards,

    Daniel
  • Hi Daniel,

    I've done some tests on English to French with a smaller file. The number of characters used by LC MT is lower than the number of characters with spaces from Word. LC MT counts the spaces between words as characters, as this is how it identifies the words, but it doesn't count the spaces at the end of a line or sentence.
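The counting rule described above can be sketched roughly as follows. This is only an illustration under assumptions, not the actual LC MT algorithm: the function names `billed_characters` and `word_style_counts` are hypothetical, and "spaces at the end" is modelled as trailing whitespace stripped from each segment.

```python
def billed_characters(segments):
    """Assumed rule: count every character in a segment, including the
    spaces between words, but ignore whitespace at the end of the segment."""
    return sum(len(seg.rstrip()) for seg in segments)


def word_style_counts(segments):
    """Word-like counts for comparison: (without spaces, with spaces)."""
    text = "".join(segments)
    without_spaces = sum(1 for ch in text if not ch.isspace())
    return without_spaces, len(text)


segments = ["Hello world. ", "This is a test. "]
print(billed_characters(segments))   # 27
print(word_style_counts(segments))   # (23, 29)
```

Under this assumed rule the usage falls between Word's two figures, which matches the 1,304 observed against 1,122 and 1,329.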

    Can you try with a smaller file, maybe with 3-4 segments, and check after each segment how much the LC MT usage increased?
    Also, if you have Lookahead activated (File - Options - Editor - Automation), Trados Studio will send 2 more segments in advance for translation, but when you get to those segments it should not send them again.

    Actually, after doing some more tests I noticed that the issue is the Lookahead option combined with the speed of going from segment to segment. If you move quickly from segment to segment, the segments will be sent by Lookahead in advance, but because the Lookahead process has not finished yet, Studio will send the segment again, even though it was already sent by Lookahead.

    So try to:
    1. Disable Lookahead
    2. Go segment by segment slowly, allowing 1-2 seconds.
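To see why speed matters, here is a toy simulation of the behaviour described above. It is a simplified model under stated assumptions, not Studio's real logic: `simulate_usage` is a hypothetical name, every request is assumed to be billed by segment length, and in "fast" mode a Lookahead request is assumed never to come back before the segment is requested again.

```python
def simulate_usage(segments, lookahead=2, fast=False):
    """Return the billed characters for one pass through the file.

    Assumption: a segment is re-sent whenever its MT result has not
    arrived yet; with fast=True, Lookahead results never arrive in time.
    """
    cached = set()  # indexes of segments whose MT result has arrived
    billed = 0
    for i in range(len(segments)):
        # Opening an unconfirmed segment triggers a lookup unless cached.
        if i not in cached:
            billed += len(segments[i])
            cached.add(i)
        # Lookahead requests the following segments in advance.
        for j in range(i + 1, min(i + 1 + lookahead, len(segments))):
            if j in cached:
                continue
            billed += len(segments[j])
            if not fast:
                cached.add(j)  # result arrives before the user moves on
    return billed


segments = ["x" * 10] * 5  # five segments of 10 characters each
print(simulate_usage(segments, lookahead=0))             # 50
print(simulate_usage(segments, lookahead=2, fast=False)) # 50
print(simulate_usage(segments, lookahead=2, fast=True))  # 120
```

In this model, moving fast with Lookahead on bills most segments up to three times, while disabling Lookahead or moving slowly bills each segment once.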

    Studio treats LC MT like a regular TM and will search it every time you land on a segment that is not confirmed.
    Also, for AdaptiveMT to work you need to go segment by segment, post-edit each one, and then confirm it, so that Studio sends the post-edited segment back to LC MT AdaptiveMT and it can learn your style of translation.

    We recommend using LC MT in one of these two ways:
    1. Pre-translate, then deactivate the LC MT provider in the project settings. The LC MT hits will be in the target segments for you to post-edit and confirm.
    2. Use LC MT segment by segment, post-editing and confirming as you go. If you are using Lookahead, it should finish retrieving the next 2 segments from LC MT while you post-edit, so it will not send them again.

    Please try this and let me know if the usage is still doubled.

    Thanks
    Radu
  • Ha! That's exactly what it is! Disabling Lookahead gives me exactly the right count. With it enabled, I do indeed find that the count for each segment also includes the next 2 segments... So there we are, you found the explanation for the issue!