"POSSIBLE HASH COLLISION". Any idea what this means?

Former Member

Hello,

I'm seeing the following message in WorldServer: "com.idiominc.ws.tm.lookup.LookupAsset.GENERAL: Hit result is not exact match as expected. Ignoring tmTrans.(POSSIBLE HASH COLLISION)"

Any idea what's going on?

Thanks

François

  • Hi François,

    This message is generated during the lookup of translation memory entries for an asset, for example in the Segment Asset autoaction in a workflow, or in a similar operation on assets during scoping or when opening them in Browser Workbench.

    During the leverage process, there may be more than one appropriate match candidate. For instance, there may be multiple ICE matches or multiple exact matches.

    Hash algorithms are used to generate a shortened version of each segment, which is compared to find exact matches, and to compare the context of the preceding and following segments for ICE matches. Ideally every distinct text would get a distinct hash, but because a hash is shorter than the text it represents, different texts will occasionally produce the same hash value.

    The message is a warning. It means that the lookup found a TM entry with the same hash value, but the source string did not match. At that point, WorldServer will simply continue looking for more appropriate matches. The larger a TM becomes, the more these messages may occur. If the number of messages is small compared to the number of segments being processed, the speed benefit of hashing is much greater than the speed loss from any hash collisions, which is why the technique is used.

    I have just published an article with this very same explanation to our Knowledge Center, for your reference:

    gateway.sdl.com/.../communityknowledge

    I hope this helps you!

    Regards
    Caterina
  • Former Member in reply to Caterina Lazzarini
    Hello Caterina,

    It helps.

    In our case, this is happening with a brand new TM group injecting the content into a brand new TM. Let's see how this goes.
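
For anyone who wants to picture the lookup Caterina describes, here is a minimal Python sketch. It is not WorldServer's code and says nothing about its real hash algorithm: the `TranslationMemory` class, its `add` and `lookup_exact` methods, and the injectable `hash_fn` parameter are all hypothetical names made up for illustration. The idea is only to show a hash-keyed lookup that verifies the source text, logs a warning when the hash matches but the text does not, and then keeps looking.

```python
import hashlib
from collections import defaultdict


def segment_hash(text: str) -> int:
    """Shortened fingerprint of a segment (an assumption, not WorldServer's real hash)."""
    digest = hashlib.sha1(text.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big")  # keep only 32 bits


class TranslationMemory:
    """Toy TM keyed by the hash of the source segment (hypothetical structure)."""

    def __init__(self, hash_fn=segment_hash):
        self._hash_fn = hash_fn
        self._buckets = defaultdict(list)  # hash value -> [(source, target), ...]
        self.warnings = []

    def add(self, source: str, target: str) -> None:
        self._buckets[self._hash_fn(source)].append((source, target))

    def lookup_exact(self, source: str):
        """Return the target of an exact match, or None if there is none."""
        for tm_source, tm_target in self._buckets.get(self._hash_fn(source), []):
            if tm_source == source:
                return tm_target  # same hash and same text: a real exact match
            # Same hash but different text: warn, ignore the entry, keep looking.
            self.warnings.append(
                f"Hit result is not exact match as expected. "
                f"Ignoring {tm_source!r}. (POSSIBLE HASH COLLISION)"
            )
        return None


# Force a collision with a deliberately weak hash (string length) for the demo.
tm = TranslationMemory(hash_fn=len)
tm.add("Open file", "Ouvrir le fichier")
tm.add("Save file", "Enregistrer le fichier")

print(tm.lookup_exact("Save file"))  # prints: Enregistrer le fichier
print(len(tm.warnings))              # prints: 1
```

In this sketch the lookup still returns the right translation. The only cost of the collision is one extra string comparison and one warning, which fits the description of the message as a warning rather than an error.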