Help with regex - non greedy option?

Hello,

I would like to lock particular content in Studio with a regex I've created. The regex and the sample text are in the link below.

https://regex101.com/r/GpKaEV/1

Please note that "LED" and "MMT3" are not highlighted in blue, which means that they shouldn't be locked in Studio.

Unfortunately this is not the case; see the screenshots from Studio:

Screenshot showing 'LED' not highlighted in blue, indicating it is not locked in Trados Studio as expected.

Screenshot displaying 'MMT3' not in blue, suggesting it is not locked in Trados Studio contrary to user's regex settings.

Any idea why this is happening? Is there a non-greedy option?

To help you reproduce the case, you can find below the source xml and the file type settings.

/cfs-file/__key/communityserver-discussions-components-files/171/4201.test.zip

Thanks

Pavlos



[edited by: Trados AI at 4:34 AM (GMT 0) on 5 Mar 2024]
  • I wasn't able to figure out a perfect solution, but I managed to simplify your regex a bit and to exclude the unwanted parts from tagging by removing the space (\s) at the end of the regex:
    ^[A-Z]+\d*[A-Z]*\d*(?:\*\*)?

    Screenshot of Trados Studio showing text with multiple tags, including unwanted tags at the end of some strings.

    From what I can see in the preview, the problem isn't greediness as such: Studio actually creates two tags, the first one wanted and the second one unwanted. The issue is rather what Studio considers to be the start (^) of the string. My theory is that once Studio has tagged a piece of text, it continues parsing the remainder of the string and treats the position after the last match as the new start of the string. That would also explain why adding a negative lookbehind (?<!\s) at the beginning of the search pattern doesn't have any effect.
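
To make the theory concrete, here is a small Python sketch that emulates it. This is not how Studio is known to work internally; it only shows that "re-anchoring after each match" would produce the observed double tagging. The sample string and the helper `tag_with_reanchoring` are made up for illustration, and the pattern with the trailing \s is only an assumed stand-in for the original regex, which lives on regex101.

```python
import re

# Simplified pattern from this reply.
WITHOUT_SPACE = r"^[A-Z]+\d*[A-Z]*\d*(?:\*\*)?"
# Assumed stand-in for the original regex: same pattern plus a trailing \s.
WITH_SPACE = WITHOUT_SPACE + r"\s"


def tag_with_reanchoring(pattern: str, text: str) -> list[str]:
    """Emulate the theorised behaviour: after each match, the remainder
    is treated as a fresh string, so ^ can match again at its start."""
    compiled = re.compile(pattern)
    tags = []
    rest = text
    while rest:
        m = compiled.match(rest)
        if m is None or m.end() == 0:  # no match, or empty match: stop
            break
        tags.append(m.group())
        rest = rest[m.end():]  # the remainder becomes the "new string"
    return tags


sample = "ABC12 LED lamp"  # hypothetical text, not taken from the test file

# A standard engine anchors ^ only at the real start of the string.
print(re.findall(WITH_SPACE, sample))               # ['ABC12 ']
# With re-anchoring, the consumed space exposes "LED" at the new start.
print(tag_with_reanchoring(WITH_SPACE, sample))     # ['ABC12 ', 'LED ']
# Without the trailing \s, the remainder starts with a space: no second tag.
print(tag_with_reanchoring(WITHOUT_SPACE, sample))  # ['ABC12']
```

If the theory holds, this would also be why dropping the trailing \s helps: the space is left in front of the remainder, so the re-anchored ^ no longer sits directly before an uppercase word.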

    I hope this helps a bit!

    Greetings from Brussels,
    Raphaël
