Is there any way to get Studio's segmentation to include closing punctuation following a break plus a hard space?

I work in a French/English environment where we translate in both directions and, in order to maximize leverage of our respective memories, wish to respect our respective styles. Our house style for French is to insert a space after opening French quotation marks («) and before the closing ones (»). I cannot figure out how to make Studio 2015 break after the closing quotation mark where it is separated from a full stop, question mark or exclamation point by a space. I am new to regular expressions, but I tried to force Studio to include a closing quotation mark preceded by a space using (?:\p{Zs}\p{Pe})|[\s] in the After break space. I tried with and without Include closing punctuation, but it did not work. Any suggestions?
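
One detail about the pattern above can be checked outside Studio: in Unicode, » (U+00BB) is classed as final-quote punctuation (Pf) rather than close punctuation (Pe), so \p{Pe} alone would not be expected to match it. A minimal sketch that prints the categories involved follows. Python is used here only for illustration; Studio itself uses .NET regular expressions, and the labels in the dictionary are made up for display.

```python
import unicodedata

# Characters relevant to the question; the dictionary keys are display labels only.
chars = {
    "space": "\u0020",
    "no-break space": "\u00A0",
    "opening guillemet": "\u00AB",
    "closing guillemet": "\u00BB",
    "right parenthesis": ")",
}

# Look up the Unicode general category of each character.
categories = {label: unicodedata.category(ch) for label, ch in chars.items()}

for label, cat in categories.items():
    print(f"{label}: {cat}")
```

If that is the cause, a class such as [\p{Pe}\p{Pf}], or the literal » character, may be worth trying in place of \p{Pe}.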

  • Hi Paul; thanks for the quick response.

    Here is some sample text with the three stops that might be followed by a closing French quotation mark:

    « Voici un exemple avec un point final. » « Pourquoi est-ce que Studio ne marche pas comme il faut pour nous? » « Zut, alors! »

    French-Canadian style is to separate the quotation marks from the text they surround with a space. (Our house rules require a hard space, and incoming source documents are normally edited to change soft spaces to hard.) The default segmentation rules for our French-English memories cause Studio to place the closing quotation mark in a segment all on its own, requiring us to manually edit the source segments.

    My latest test regex, @"(?:\u0020\u00BB[\s])|[\s]", with Include closing punctuation unchecked, put all of the above three sentences in one segment. The same string without the @ and quotation marks gave me the same results as the default segmentation. (I guess I need to paste from Regex Buddy as is? I have .NET flavour selected.)

    The result is four improperly punctuated segments instead of the three properly punctuated ones:

    « Voici un exemple avec un point final.
    » « Pourquoi est-ce que Studio ne marche pas comme il faut pour nous?
    » « Zut, alors!
    »
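
A possible explanation for the two regex results described above, sketched in Python rather than in Studio: the @"..." wrapper is C# string-literal syntax, so when it is pasted into a regex field the @ and the quotation marks are likely treated as literal characters that never occur in the text. Without the wrapper, the second alternative [\s] matches any whitespace on its own. The sample string and variable names below are assumptions for illustration only.

```python
import re

# Sample from the post, typed here with ordinary spaces around the quotation marks.
sample = (
    "« Voici un exemple avec un point final. » "
    "« Pourquoi est-ce que Studio ne marche pas comme il faut pour nous? » "
    "« Zut, alors! »"
)

# Pattern pasted together with the C#-style wrapper: @ and " become literals.
wrapped = re.compile(r'@"(?:\u0020\u00BB[\s])|[\s]"')
# The same pattern without the wrapper.
bare = re.compile(r'(?:\u0020\u00BB[\s])|[\s]')

wrapped_hits = wrapped.findall(sample)  # needs a literal @" or a trailing "
bare_hits = bare.findall(sample)        # any whitespace satisfies [\s]

# \u0020 is the ordinary space; it does not match a hard (no-break) space.
hard_spaced = sample.replace(" »", "\u00A0»")
hard_space_match = re.search(r"\u0020\u00BB", hard_spaced)

print(wrapped_hits, len(bare_hits), hard_space_match)
```

Under these assumptions the wrapped pattern matches nothing, which would be consistent with everything landing in one segment, and the bare pattern matches every whitespace character, which would be consistent with the default behaviour. Since the house style uses hard spaces, \u00A0 (or \p{Zs}) may be a better fit than \u0020.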

  • Hi Janis,

    It sounds like you need an exception to the full stop and other terminating punctuation rules, rather than a new segmentation rule, as explained here:

    To achieve what you want, add an exception like this:

    Make sure to add it both to the full stop rule and to the other terminating punctuation rule. That will do the trick. : )

    Edit: To elaborate on the above, make sure you have selected both ? and ! when you add the exception to the "Other terminating punctuation rule", so that it will look like this:

    You will also need a segmentation rule to break before «, like this:

  • Thanks, Nora, but I don't see how this is going to work for us as 1) not every sentence that ends with a closing quotation mark is going to be followed by one that starts with an opening quotation mark and 2) French quotation marks can occur within a sentence as well, just like English quotation marks. (J'ai dit « bleu », pas « rouge »!/I said "blue", not "red"!) I think what we need is a rule that recognizes and includes a single space plus closing punctuation as closing punctuation.

  • Oh I see, sorry, I went by your provided example. The problem with the idea of the closing rule with the space plus closing quote is that you would need to get rid of the other rules, i.e., full stop and other terminating punctuation, for it to work. In other words, the reason you're getting the segmentation you're seeing is that the full stop, ? and ! rules are being applied regardless of what comes after them, so I really think exceptions are the best solution. Then you would just need to tweak the "Break before «" rule to make sure the break is only applied when the « is followed by uppercase, something like this:

    And then a separate rule would need to be added for a break after » + space plus an uppercase letter, like this:

    Using the above rules in your provided sample text would result in this:
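
The combined logic described in this reply can be roughly simulated outside Studio. The sketch below is an approximation in Python, not the actual Studio rule settings: the function names, the regular expressions and the hand-written uppercase class are all assumptions made for illustration. It compares a naive "break after . ? !" rule with a version that adds an exception for a following » and a break after » when an uppercase letter (optionally preceded by «) comes next.

```python
import re

NBSP = "\u00A0"  # hard space, as required by the house style

sample = (
    f"«{NBSP}Voici un exemple avec un point final.{NBSP}» "
    f"«{NBSP}Pourquoi est-ce que Studio ne marche pas comme il faut pour nous?{NBSP}» "
    f"«{NBSP}Zut, alors!{NBSP}»"
)

# Rough stand-in for "uppercase letter" (Python's re has no \p{Lu}).
UPPER = "[A-ZÀ-ÖØ-Þ]"

def naive_segments(text):
    """Break after . ? ! whenever whitespace follows (no exceptions)."""
    return re.split(r"(?<=[.?!])\s+", text)

def segments_with_exceptions(text):
    """Break after . ? ! unless » follows; also break after » + space + uppercase."""
    pattern = (
        r"(?<=[.?!])\s+(?![\s»])"                # exception: no break before »
        rf"|(?<=»)\s+(?=«\s?{UPPER}|{UPPER})"    # break after » before a new sentence
    )
    return re.split(pattern, text)

before = naive_segments(sample)
after = segments_with_exceptions(sample)
inline = segments_with_exceptions("J'ai dit « bleu », pas « rouge »! C'est clair.")

for seg in after:
    print(seg)
```

In this sketch the naive rule leaves each » stranded at the start of the following segment, while the version with exceptions keeps it with its sentence and leaves quotation marks inside a sentence alone. One limitation of the approximation: a mid-sentence » followed by a space and a capitalized word (a proper noun, for example) would still trigger a break.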

  • Wow, Nora, it looks like you have done it! I admit I have worn out my brain trying to wrap it around this business, but I can't think of any reason why this won't do the trick. Thanks, you're a gem!