Concordance search German

Former Member

Hi!
In the Concordance Search, and also when searching in the TM view, several German words are not found in the TM unless I search for the word in exactly the same form (see examples below). I tried different values for the “minimum match value” (including the lowest possible value, 30), but the results are always the same (with character-based concordance search disabled).
Examples:
“befindet” is not found when searching for “befinden” or “befinde”;
“erhalten” is not found when searching for “erhalte”, or the other way round;
“Informationen” is not found when searching for “Information”;
“elektrischen” is not found when searching for “elektrisch” or “elektrische”;
“geeignete” is not found when searching for “geeignet”;
etc.
By contrast, segments containing “festgestellt” are found when searching for “feststellen”, “feststellte” or “festgestellte”, and “Gefahrenquellen” is found when searching for “Gefahrenquelle”.

I'm aware that character-based concordance search can be enabled (although only when creating the TM, and it is not advised for larger TMs), but I think this is generally a suboptimal solution for TMs with German as the source or target language.
Since some words, such as "feststellen" and its inflected forms, appear to be found when searching for any of the verb's forms (without character-based concordance search), I think this should also be possible for other German words. Is this being considered for future releases?
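
To picture why lowering the minimum match value would not help in word-based mode while a character-based mode could, here is a minimal Python sketch. It is not Studio's actual algorithm: the functions `word_score` and `char_score`, the use of trigrams, and the sample segments are all assumptions made purely for illustration.

```python
# Illustrative only: NOT how Studio scores concordance hits.
# word_score and char_score are hypothetical helpers for this example.

def word_score(query: str, segment: str) -> float:
    """Share of query words that appear as whole words in the segment."""
    q = set(query.lower().split())
    s = set(segment.lower().split())
    return len(q & s) / len(q) if q else 0.0

def char_ngrams(text: str, n: int = 3) -> set:
    """All character n-grams of the lowercased text."""
    text = text.lower()
    return {text[i:i + n] for i in range(len(text) - n + 1)}

def char_score(query: str, segment: str, n: int = 3) -> float:
    """Share of the query's character n-grams that occur in the segment."""
    q = char_ngrams(query, n)
    s = char_ngrams(segment, n)
    return len(q & s) / len(q) if q else 0.0

# Sample segments invented for this example.
seg_info = "Weitere Informationen finden Sie im Handbuch"
seg_befindet = "Das Gerät befindet sich im Keller"

print(word_score("Information", seg_info))    # whole-word comparison
print(char_score("Information", seg_info))    # trigram comparison
print(word_score("befinden", seg_befindet))
print(char_score("befinden", seg_befindet))
```

Under these assumptions, a whole-word comparison gives "Information" no overlap at all with a segment containing "Informationen", so no threshold would surface it, whereas the trigram comparison shares most or all of its fragments with the inflected form.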

Thanks,
Klara

  • Hi Klara,

    At the moment Studio doesn't perform any morphological processing, as you can see. Perhaps the solution for now is to use character-based concordance search, and if you find you're getting a performance hit, maybe split the TM up so you use several smaller ones rather than one massive TM?

    I don't know what longer term plans we have for using machine translation to help with this sort of thing, but I'd be surprised if we haven't thought about it given there is already a lot of technical capability accumulated for many languages within our MT teams. I think might be able to give a better answer, but right now I think there are no plans in the short term to tackle this particular enhancement.

    Regards

    Paul

    Paul Filkin | RWS Group

    ________________________
    Design your own training!

    You've done the courses and still need to go a little further, or still not clear? 
    Tell us what you need in our Community Solutions Hub

  • Former Member in reply to Paul
    Hi Paul,

    Thank you for your answer!
    Maybe the problem actually lies in some other setting? Because:
    I did some further tests, and in the TM view it actually works fine now (I don't quite understand why it didn't seem to work when I last tried in October). In the Concordance Search, however, the problem persists. I also raised the number of hits for the Concordance Search to the maximum of 99, but I always get the same results. I find it curious that it works in the TM view but not in the Concordance Search.
    I then exported all the segments of the TM and imported them into a new TM with character-based concordance search enabled, and searched for the words I listed before. I still get the same results, so sadly character-based concordance search does not seem to solve the problem.

    Am I the only one who cannot find "Informationen" when searching for "Information" in the Concordance Search (language: German (Germany))? If so, the problem surely lies in some other setting.

    Thanks,
    Klara
  • Hi Klara

    I'm not sure what to say... other than it works for me in Studio 2015:

    Regards

    Paul Filkin | RWS Group


  • Former Member in reply to Paul
    Hi Paul,
    Thank you for the video! It clearly shows that the problem is with my settings.
    I have now found that with character-based concordance search it always finds something (the nearest matches?), but it seems that if it finds exact matches, it shows only those. For example, if I search for "befinden", it only gives me results that contain "befinden", not those with "befindet" or "befinde"; the same goes for "Information"/"Informationen". So in this case too, I have to think of the different forms of the verb or noun and search for each one separately. I will have to go with this workaround until I find where the actual problem lies.
    Thank you,
    klara
  • Former Member in reply to Former Member
    Addendum: With a newly created TM (with character-based concordance search enabled) and newly saved translation units, it works as your video shows :), and it finds all possible matches (even too many, actually) for "befinden" etc. However, in the new TM (with character-based concordance search) into which I imported TUs from the old TM (without it), the search results include only TUs containing the exact word, even for newly added TUs.
    Regards,
    klara