Compare issue: a sentence removed from a para and replaced with a new sentence is broken up instead of being marked as one whole change

The title is a bit confusing and hopefully I can explain it more clearly here.

In version one there is a para:

<p>All licensed carriers in a state whose participation type is an Association Member under Option 2 may have their participation type automatically changed by the state regulatory authority to a Direct Assignment Carrier under Option 1 in accordance with the Participation status table. Under this provision, all licensed carriers are automatically deemed approved as Direct Assignment Carriers and do not need to seek regulatory approval.</p>

In version two the last sentence is removed and replaced, and looks like the following:

<p>All licensed carriers in a state whose participation type is an Association Member under Option 2 may have their participation type automatically changed by the state regulatory authority to a Direct Assignment Carrier under Option 1 in accordance with the Participation status table. Testing adding a sentence with periods commas, and semicolons and colons to make sure it doesn't add a space.</p>

The compared <p> looks like the following:

<p>All licensed carriers in a state whose participation type is an Association Member under Option 2 may have their participation type automatically changed by the state regulatory authority to a Direct Assignment Carrier under Option 1 in accordance with the Participation status table. <?ish text_remove_begin?>Under this provision, all licensed carriers are automatically deemed approved as Direct Assignment Carriers <?ish text_remove_end?><?ish text_insert_begin?>Testing adding a sentence with periods commas, and semicolons <?ish text_insert_end?>and <?ish text_remove_begin?>do not need <?ish text_remove_end?><?ish text_insert_begin?>colons <?ish text_insert_end?>to <?ish text_remove_begin?>seek regulatory approval. <?ish text_remove_end?><?ish text_insert_begin?>make sure it doesn't add a space. <?ish text_insert_end?></p>

Note: In the original post I used the color purple for the words that the two sentences were "broken" on, namely the shared words "and" and "to".

The client does not want the last sentence broken up like the above. What they would like to see is the original last sentence wrapped in <?ish text_remove_begin?><?ish text_remove_end?> and then followed by the new last sentence wrapped in <?ish text_insert_begin?><?ish text_insert_end?>.
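To make the requested behaviour concrete, here is a minimal Python sketch of a sentence-level comparison that produces that kind of markup. It is only an illustration and not how the product's compare works: the function names are hypothetical, the sentence splitter is deliberately naive, and it assumes plain text inside a single <p> with no inline tagging.

```python
import re
from difflib import SequenceMatcher

# Processing-instruction pairs as they appear in the compared output above.
REMOVE = ("<?ish text_remove_begin?>", "<?ish text_remove_end?>")
INSERT = ("<?ish text_insert_begin?>", "<?ish text_insert_end?>")


def split_sentences(text):
    # Naive assumption: a sentence ends at ., ! or ? followed by whitespace.
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]


def wrap(markers, sentences):
    return markers[0] + " ".join(sentences) + markers[1]


def mark_sentence_changes(old_text, new_text):
    """Diff two paragraphs with whole sentences as the smallest unit."""
    old, new = split_sentences(old_text), split_sentences(new_text)
    parts = []
    matcher = SequenceMatcher(a=old, b=new, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            parts.append(" ".join(old[i1:i2]))
            continue
        if i2 > i1:  # sentences only present in the old version
            parts.append(wrap(REMOVE, old[i1:i2]))
        if j2 > j1:  # sentences only present in the new version
            parts.append(wrap(INSERT, new[j1:j2]))
    return " ".join(parts)


old = "Carriers may be reassigned. They do not need to seek approval."
new = "Carriers may be reassigned. Testing commas, and colons to be sure."
print(mark_sentence_changes(old, new))
```

Because a sentence is the smallest unit here, shared words such as "and" or "to" can no longer split the change. The trade-off is that a one-word edit marks the whole sentence as removed and re-inserted.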

Is this possible? Are there any parameters for how the compare operates?

I have a hack to do this but it is tag abuse and not an option.

kr

Mario Madunic

  • Hi Mario,

    I think you described it well. I understand what humans (customers :-)) want in this scenario; however, this is not what computers understand. The engine behind this comparison, used in any preview and also at scale when comparing publication versions, is historically called ChangeTracker in the product. The scaled-out case of comparing publications is important: some customers easily have 50 000 topics in one publication, times two for a comparison. That made the product team avoid true XML tree comparisons, because they would be dramatically slow. Instead they chose the longest-stretch algorithm (see https://en.wikipedia.org/wiki/Longest_common_substring).

    The algorithm decides what the longest common substring is, doing its best with added/moved XML tagging while still ending up with well-formed XML. To my knowledge there is no way to make this algorithm aware of sentence punctuation; in fact, some customers find added commas and periods just as important.

    In the publishing pipeline, it is possible to override this with a custom compare engine; see interface TD14SP4 - IPublishComparePlugin - IshPublishCompare. This is not low-hanging fruit, as the comparison itself, and in turn laying out the result so that humans understand it, is not easy. It has been done multiple times though.

    Hope this clarifies something,
    Dave
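For readers who want to see why common-stretch matching fragments a replaced sentence, the following Python sketch reproduces the effect on a small scale. It is an assumption-laden illustration, not ChangeTracker's implementation: it uses the standard library's difflib.SequenceMatcher (which repeatedly picks the longest matching block) on whitespace-separated words, and the function name is hypothetical.

```python
from difflib import SequenceMatcher


def mark_word_changes(old_text, new_text):
    """Diff two strings word by word and wrap the differences in markers."""
    old, new = old_text.split(), new_text.split()
    parts = []
    matcher = SequenceMatcher(a=old, b=new, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            # Any word shared by both versions is kept as unchanged text.
            parts.append(" ".join(old[i1:i2]))
            continue
        if i2 > i1:
            parts.append("<?ish text_remove_begin?>" + " ".join(old[i1:i2])
                         + "<?ish text_remove_end?>")
        if j2 > j1:
            parts.append("<?ish text_insert_begin?>" + " ".join(new[j1:j2])
                         + "<?ish text_insert_end?>")
    return " ".join(parts)


old = "all carriers are deemed approved and do not need to seek regulatory approval."
new = "Testing a sentence with commas, and colons to make sure it works."
print(mark_word_changes(old, new))
```

The two sentences are unrelated, yet the shared words "and" and "to" are treated as unchanged anchors, so the replacement is reported as three separate remove/insert pairs rather than one. A custom compare engine would have to choose a coarser unit, or a similarity threshold, to avoid this.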
