Suggest invariant parts of words in case-ending languages

I translate to and from Russian.

For translating FROM Russian, it should be possible to set things up so that the invariant part of a word is recognized as the basis for a suggestion. E.g., a term base where the pairing would be: RUSSIAN: the first x characters of a word are ____; ENGLISH: the translation of that (say) noun (there's only one, after all).

For translation from a case-ending language, this would add an immense amount of power and speed to the translating process. And it seems easy enough to do.

Is it possible to make this happen now? If not, it should be suggested.

Thanks!

  • Hi John,

    I don't completely see how this would be easy, but it would be good to see some actual examples from you. Maybe you could create the idea and explain a little more here:

    community.sdl.com/.../trados-studio-ideas

    See how many others agree with you that we should spend development time on this.

    Regards

    Paul

    Paul Filkin | RWS Group


  • CAT users seem to fall into two categories: those who translate highly repetitive material, where TMs can be the chief tool in essentially constructing new texts from old materials, and those who use CAT tools to speed up translation composition through autocorrect. We're talking largely about the second group, which is obviously large, but sometimes harder to help.

    A very simple TB-style situation: "Loving" in Russian will begin with любящ... but there are a pile of endings because of gender and case. If we're translating RU>EN, we might want the TM to recognize a wild-card term, e.g. любящ* and give us "loving" for all of them. If we're translating EN>RU, we might want "loving" to give us любящ without a space after it and let us just fill in the case ending.

    As it is, any words or phrases with case endings are very difficult to include in a TB - you'd have to insert dozens of entries for each single word/phrase. The TB situation looks easy to fix - allow wild-card definitions and a setting for whether the rendering should be treated as a full word (so you get a space after it after hitting enter) or not.

    *All you'd need is to allow terms that include wildcards!*
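
To make the wildcard idea concrete, here is a minimal Python sketch. It is only an illustration of the behaviour I have in mind, not how any existing CAT tool handles terms; the names `TERM_BASE_RU_EN`, `TERM_BASE_EN_RU`, `suggest` and `completion_text` are all made up for the example, and a trailing `*` is assumed to be the wildcard marker.

```python
import re

# Hypothetical RU>EN term base: a trailing "*" marks an invariant stem.
TERM_BASE_RU_EN = {"любящ*": "loving"}

# Hypothetical EN>RU term base: (text to insert, is it a whole word?).
TERM_BASE_EN_RU = {"loving": ("любящ", False)}

WORD_RE = re.compile(r"\w+")


def suggest(segment, term_base=TERM_BASE_RU_EN):
    """Return (source word, suggested translation) for each word matching a term."""
    hits = []
    for word in WORD_RE.findall(segment):
        lowered = word.lower()
        for term, translation in term_base.items():
            if term.endswith("*"):
                # Wildcard term: match any word that starts with the stem.
                matched = lowered.startswith(term[:-1])
            else:
                # Ordinary term: require an exact match.
                matched = lowered == term
            if matched:
                hits.append((word, translation))
    return hits


def completion_text(source_term, term_base=TERM_BASE_EN_RU):
    """Text to insert for a term: add a trailing space only for whole words."""
    text, whole_word = term_base[source_term]
    return text + " " if whole_word else text


print(suggest("Любящая мать и любящий отец"))
print(repr(completion_text("loving")))
```

With the single entry любящ*, both "Любящая" and "любящий" get the suggestion "loving", and going the other way "loving" inserts любящ with no trailing space, so the ending can be typed straight after it.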

    The TM situation would probably take more development work, but identifying that there are a lot of segment fragments (words) that begin with the same seven letters, for instance, and having an option to see them as related, doesn't seem infeasible.
    One approach: users could have the option of entering invariants, building up a library of them; the program then could see all words with that invariant as being the same, thus increasing the match value.
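
The invariant-library approach could be sketched roughly like this in Python. Everything here is hypothetical: the `INVARIANTS` list, the function names, and the use of `difflib.SequenceMatcher` as a stand-in for whatever match scoring a real TM engine uses. The point is only that words sharing a user-entered invariant are treated as the same token before segments are compared.

```python
import re
from difflib import SequenceMatcher

# Hypothetical user-built library of invariants (word stems).
INVARIANTS = ["любящ", "сестр"]

WORD_RE = re.compile(r"\w+")


def normalize(segment, invariants):
    """Split a segment into lowercase words, replacing each word by its invariant."""
    tokens = []
    for word in WORD_RE.findall(segment.lower()):
        stems = [inv for inv in invariants if word.startswith(inv)]
        # If several invariants fit, prefer the longest; otherwise keep the word.
        tokens.append(max(stems, key=len) if stems else word)
    return tokens


def match_score(a, b, invariants=INVARIANTS):
    """Word-level similarity (0.0 to 1.0) after collapsing words to invariants."""
    return SequenceMatcher(
        None, normalize(a, invariants), normalize(b, invariants)
    ).ratio()


new_segment = "любящая сестра"
tm_segment = "любящей сестры"
print(match_score(new_segment, tm_segment, invariants=[]))  # no invariant library
print(match_score(new_segment, tm_segment))                 # with the library
```

Without the library the two segments share no identical words at all; with it, they normalize to the same two tokens, so the match value goes up.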

    And perhaps there are other functions I haven't thought of yet.