Why would [\d-]* not exclude numbers with dashes?

In a different thread of this forum (https://community.rws.com/product-groups/trados-portfolio/trados-studio/f/studio/52008/how-to-exclude-include-certain-elements-from-the-source-file-and-therefore-from-the-word-count-in-trados-studio-2022) I excluded numbers with dashes from translation using the embedded content processor for Word files.

To my surprise, [\d-]* did not work: the preview, at least, returned a blank editor with no segments. [\d+]+ did work. In my eyes, this is a bug.

RegexBuddy warns me: “C# (.NET 2.0–7.0) allows a zero-length match at the position where the previous match ends.”

But a zero-length match should not result in anything being converted into a tag.

Regex101 shows all the zero-length matches:

[screenshot]

as opposed to

[screenshot]

but IMHO this should not affect the conversion into tags.

(All for .NET 7.0)
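
To make the zero-length matches concrete, here is a minimal sketch. It uses Python's re module as a stand-in, not the .NET engine that Studio uses, so treat it as an illustration of the general behaviour only. The sample text and the helper names (spans, non_empty) are made up for this example, and the filter at the end is just one way a tool could ignore empty matches; it says nothing about how Studio actually handles them.

```python
import re

# Hypothetical sample text containing one number with a dash.
text = "Order 12-34 shipped"

def spans(pattern, s):
    """Return (start, end, matched_text) for every match of pattern in s."""
    return [(m.start(), m.end(), m.group()) for m in re.finditer(pattern, s)]

def non_empty(matches):
    """Keep only matches that actually consumed at least one character."""
    return [m for m in matches if m[1] > m[0]]

# '*' allows zero repetitions, so the pattern also "matches" an empty
# string at every position where no digit or dash is present.
star = spans(r"[\d-]*", text)

# '+' requires at least one character, so only the real number matches.
plus = spans(r"[\d-]+", text)

print("star:", star)
print("plus:", plus)
print("star without empty matches:", non_empty(star))
```

With the empty matches filtered out, both patterns pick out the same span, which is the behaviour I would have expected from the tag conversion.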


    I don't think ignoring the fact that there are zero-length matches is appropriate. How would Studio know when you intended this and when you didn't? I would expect a more precise regex to avoid all doubt.

    But I can see your point. Skipping zero-length matches could make regex more accessible for general use, but there are advanced use cases where the ability to match or recognize zero-length strings is valuable and necessary for precise text manipulation and analysis. Adding an option could take care of that... but then anyone who wrote such a regex by accident may not know what the option was for, and so wouldn't benefit from it anyway ;-)

    Paul Filkin | RWS Group
