Under Community Review

Two small upgrades to upLIFT fuzzy repair

Last week I had a chance to see upLIFT fuzzy repair at work on a large, real-life localization project. Even though it often fails for inflected languages, one tiny upgrade could save a bit of time: checking the case of the added or removed words. upLIFT fuzzy repair tends to merge fuzzy matches, but it ignores case. Fixing a capital letter takes just a second, but across a couple of thousand segments it adds up to a significant amount of time. It also often happens that the punctuation at the end of the segment goes wrong and the result is a double dot. This could be checked automatically, which would save even more time.

To summarize, the ideas are:

1) Case-sensitive fuzzy repairs: check that the part glued on the right doesn't start with a capital letter in the middle of a sentence.

2) Punctuation checks for double dots and similar errors: we don't want to have to catch these later with QA Checker if the punctuation can be inserted correctly in the first place.
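
To make the two ideas concrete, here is a rough Python sketch of what such post-repair checks might look like. It is only an illustration: the function names `fix_glued_case` and `collapse_double_punctuation` are made up for this post and are not part of Trados Studio or upLIFT, and the case heuristic is deliberately naive (it would wrongly lowercase proper names and German nouns, so a real implementation would need language-aware rules).

```python
import re


def fix_glued_case(left: str, right: str) -> str:
    """Join two fragments (hypothetical helper, not an upLIFT API).

    If `right` lands in the middle of a sentence and starts with a plain
    capitalized word such as "Then", its first letter is lowercased.
    Acronyms such as "QA" are left untouched.
    """
    left = left.rstrip()
    right = right.lstrip()
    if not left or not right:
        return left or right

    # The fragment is mid-sentence unless `left` ends with closing punctuation.
    mid_sentence = left[-1] not in ".!?:"
    first_word = right.split()[0]

    # Naive assumption: "Xxxx" is an ordinary word, "XX" is an acronym.
    if mid_sentence and first_word[:1].isupper() and first_word[1:].islower():
        right = right[0].lower() + right[1:]
    return left + " " + right


def collapse_double_punctuation(text: str) -> str:
    """Collapse doubled punctuation (hypothetical helper, not an upLIFT API)."""
    # Exactly two dots become one; ellipses of three or more dots are kept.
    text = re.sub(r"(?<!\.)\.\.(?!\.)", ".", text)
    # Repeated commas, semicolons, and colons become a single mark.
    text = re.sub(r"([,;:])\1+", r"\1", text)
    return text


if __name__ == "__main__":
    print(fix_glued_case("Click the button and", "Then save the file."))
    print(collapse_double_punctuation("Save the file.."))
```

Running both checks on the repaired segment before it is shown to the translator would mean fewer manual fixes and fewer hits in QA Checker afterwards.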