PerfectMatch drops the last part of some segments (even when using the workaround for this known issue)

Hi everyone,

I've had this issue before: when I prepare a new project for Manual B against a previous Studio project for the very similar (in most places identical) Manual A, using PerfectMatch with formatting tags, the last part of some segments is lost in the translated target segment. It happens where the segmentation process has combined 2 sentences into 1 segment.

Due to tag/file type issues with PerfectMatch, I am aware that this is a known bug - i.e. PM should not be used to segment the files, i.e. PerfectMatch should not be run at Project creation stage, but rather after as a separate batch task. That's fine: when creating the project, I add the TM and let the TM segment the files. At this stage I also uncheck Lookup (and Concordance) so that no matches are inserted during project creation (otherwise I have to clear all segments after project creation before I can run PerfectMatch as a separate step). Only once the project has been created do I run PerfectMatch against the sdlxliffs of the previous project.

Up until now, doing this separately has worked (at least I thought so, though I'm starting to doubt that; it is always something I look out for). I just happened to notice the issue again on the latest project I created in this way.

The frustrating thing is that we use PM as a cost-saver for the client. They have manuals for machinery (50-90 files, 30,000-50,000 words in total); some of the source files are customised each time, whilst most files stay the same, and they want the full translated manual back each time. This is why PM seems handy and better than using the TM. But this issue with the PMs not being reliable is really quite frustrating, especially because we do not check the PerfectMatch segments and because I thought the workaround had solved it.

I checked the Reports, in case I had forgotten and used PM during Project creation for this particular project, but I definitely followed the right process. In this particular project, the translator is using memoQ - might this have anything to do with it? I doubt it as the translator couldn't amend the segmentation simply by opening it in memoQ? Or could it be the fact that I disable Lookup/Concordance?

Any help would be really appreciated.

Many thanks,

Gemma 

  • Have you tried checking this by using a file you handled in Trados instead of memoQ?  The reason I ask is because of this:

    I doubt it as the translator couldn't amend the segmentation simply by opening it in memoQ?

    It would need checking, but I do think you can amend the segmentation simply by opening the file in memoQ.  memoQ can split/join segments in order to achieve better matches and I'm not really sure what effect this would have on an sdlxliff when you save the target later.  That part would need investigating... but you can validate by not using memoQ and see if the same problem occurs?

    I'm also curious about this:

    Due to tag/file type issues with PerfectMatch, I am aware that this is a known bug - i.e. PM should not be used to segment the files, i.e. PerfectMatch should not be run at Project creation stage, but rather after as a separate batch task.

    Do you have the bug number for this?

    Paul Filkin | RWS Group

  • Thanks for your help, Paul. 

    Ah, I hadn't realised that memoQ could do that... So I just checked the original version of one of the files where I know it has definitely happened - i.e. the prepared sdlxliff I sent to the translator before memoQ got involved. Unfortunately the error was present in that version of the file too, so I guess that means it can't be down to memoQ here.

    As for the known 'bug', I am unsure whether it is classed as a bug or an issue (apologies if my wording was misleading). I don't have a bug number, but I can give you the case reference I got from Support when I raised a ticket in January last year: 00477887. I have just re-raised it this week as new case ref: 00613825, which your team are currently looking into.

    I'm really at a bit of a loss as to how to check for and resolve this issue without involving too much manual work, and even then if I re-created the projects, the issue would happen again as I did things the 'right way' the first time (though something must be wrong). The only thing I can think of currently is to go through all locked segments and check them manually myself for any missing parts... But I don't really fancy that as the locked segments amount to 75,000 words!

    So I appreciate any other ideas you may have. Thank you very much!

    Many thanks,

    Gemma

  • Hi,

    Just returning with an update on this one.

    Support have raised this issue as a defect CRQ-25143 for the development team to check and fix.

    In the meantime, the only suggested workaround was to try to fix the original translation (i.e. the old project) and then re-do PerfectMatch (for the new project). Unfortunately this is not a realistic workaround for us, so instead I ended up manually re-checking all PerfectMatch segments (a mere 75,000 words) this time...

    Thanks again for your help and hopefully a fix will be found soon.

    Best wishes,

    Gemma

  • Hi, Gemma.

    Thanks for sharing this. In the end, was this caused by having processed the files with memoQ, or was it totally unrelated? What fix was required in the original translation?

    We don't use memoQ, but it would be good to know if we have to check the PM output.

    Daniel

  • Hi,

    Thanks for your message. My particular case was unrelated: I did a test and found that the SDLXLIFFs already had the wrong segmentation before I sent them to our memoQ-using team.

    As to the suggested fix: there was an extra cf tag in some segments. The Support team fixed the original translation by splitting the segments before the extra cf tag and fixing the target translation.

    It seems to happen when there is a cf tag between 2 sentences, which for some reason then get wrongly segmented into 1 combined segment. Typically in my case, the translation of the first sentence of the segment is pulled into the target by PerfectMatch, but the translation of the second sentence is missed entirely. I am unable to share a screenshot due to client confidentiality, sorry, but I hope my explanation makes sense.

    Hope this helps.

    Best wishes,

    Gemma
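
The symptom described in this thread (a source segment containing 2 sentences whose target only carries the translation of the first) suggests a rough heuristic that might narrow down a manual re-check: flag any segment whose target appears to contain fewer sentences than its source. The following is only a minimal sketch under stated assumptions. It assumes the source and target text of each segment has already been extracted from the SDLXLIFF files with tags stripped (that step is not shown), and the function names and sample segments are made up for illustration. It is not a Trados Studio feature, and abbreviations or lists will produce false positives and negatives.

```python
import re

# Sentence-final punctuation that is followed by whitespace or the end of the text.
# This is a deliberately crude approximation of sentence boundaries.
_SENTENCE_END = re.compile(r"[.!?](?=\s|$)")


def count_sentences(text: str) -> int:
    """Rough sentence count: 0 for empty text, otherwise at least 1."""
    stripped = text.strip()
    if not stripped:
        return 0
    return max(1, len(_SENTENCE_END.findall(stripped)))


def flag_suspect_segments(segments):
    """Return the ids of segments whose target has fewer sentences than its source.

    `segments` is an iterable of (segment_id, source_text, target_text) tuples.
    """
    suspects = []
    for seg_id, source, target in segments:
        if count_sentences(target) < count_sentences(source):
            suspects.append(seg_id)
    return suspects


# Hypothetical sample data, invented for this sketch.
segments = [
    ("1", "Switch off the machine. Remove the cover.",
          "Maschine ausschalten. Abdeckung entfernen."),
    ("2", "Switch off the machine. Remove the cover.",
          "Maschine ausschalten."),  # second sentence missing in target
    ("3", "Check the oil level", "Ölstand prüfen"),
]

print(flag_suspect_segments(segments))
```

With the sample data above, only segment "2" is flagged. A check like this would at best reduce the number of locked segments to review by hand; it cannot confirm that an unflagged segment is complete.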