In 2022 SR1, Regex's Negative Lookbehind does not work

Question

Hi dears, 
 When translating a patent specification, I use a negative lookbehind in the "Find and Replace" window of the Trados Studio editor view to find numbers and enclose each match in parentheses. A simplified example of the regex I'm using is "(?<!step|\s도)\s([0-9]{2.4})" as the search expression and "($1)" as the replacement. 
 However, after updating to 2022 SR1, the pattern is found but not replaced. 
 Please check this.

Paul · Accepted Answer

Youngrok Kim Jean-François Simard 
 Just a quick update to let you know that this can now be achieved using the SDLXLIFF Toolkit from the AppStore. I know it's not as convenient as using the find & replace in the editor, but at least it works and you can do multiple files in one go... even multiple projects in one go... and edit source and/or target as well as a host of other things. So it's good to have a workaround until the issue is solved in the core product.

Paul · Answer

Youngrok Kim 
 I'm not sure whether your regex is correct because you didn't provide a sample text. The find expression [0-9]{2.4} looks as though it is meant to specify a range quantifier, i.e. {2,4}, with a comma between the numbers and not a dot. With the comma it would match any run of 2 to 4 digits. It's also tricky to imagine what you're trying to find given the mix and match between English and Korean characters. So I created this text to test with and just copied source to target: 
 이 신규 화합물의 제조는 여러 단계 과정을 포함합니다. Step 1 involves the reaction of substance A with substance B at a temperature of 500도 for 12시간. 두 번째 단계는 물질 C를 추가하고 혼합물을 300도로 24시간 동안 유지하는 것입니다. Finally, step 3 involves cooling the mixture down to 100도 over a period of 48 hours. The purity of the resulting compound is typically 99 percent. 
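 As a quick check of the dot-versus-comma point above, here is a minimal Python sketch run outside Studio. It assumes Python's re module rather than the .NET engine Studio uses, so it only illustrates the idea: with a dot, {2.4} is not a quantifier and the braces are read as literal text, while with a comma, {2,4} matches runs of 2 to 4 digits. The excerpt is taken from the sample text; the variable names are made up for the illustration.

```python
import re

# "{2.4}" is not a valid quantifier, so Python's re reads "{", "2", ".", "4", "}"
# as ordinary pattern items following a single digit.
typo_pattern = re.compile(r"[0-9]{2.4}")

# "{2,4}" is the range quantifier: two to four repetitions of [0-9].
range_pattern = re.compile(r"[0-9]{2,4}")

# Short excerpt from the sample text in this thread.
excerpt = "at a temperature of 500도 for 12시간"

typo_hits = typo_pattern.findall(excerpt)
range_hits = range_pattern.findall(excerpt)

print("with dot:  ", typo_hits)
print("with comma:", range_hits)
```

 If the version with the dot finds nothing in your own text, the quantifier is the first thing to check before looking at the lookbehind.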
 Using Find and Replace I can repro your problem. I also think this may be a known issue with lookarounds. So I tested in the SDLXLIFF Toolkit as a workaround... a preferred solution for me in fact!: 
 
 That correctly shows the operation having worked in the toolkit, but then fails to update the sdlxliff!! 
 So, I'll make sure this is logged under the search & replace in Studio, and I'll also get the AppStore Team to fix this in the toolkit... this may be quicker! 
 In the meantime, perhaps you can clarify what you were actually trying to achieve, and even provide a proper sample text, to make sure we have the right expression? 
 Oana Nagy fyi.

Paul · Answer

Jean-François Simard 
 I also added your example to the dev notes so we'll be double certain we fix it for you too. In the meantime I also created a workaround you may find useful if it works for your sample text... it's helpful when you provide one! 
 (\b\d\b) (\w+)|(\d{3}-\d{3}-\d{4})([,.]) 
 $1 $2$3$4 
 It doesn't use a lookaround and might solve this particular problem for you.

Paul · Answer

Youngrok Kim 
 In the meantime, there might be a workaround for you depending on what your source text looks like. For example: 
 (step\s|\s도\s)|(\s([0-9]{2,4})) 
 $1($3) 
 This will capture all instances of 'step ' and ' 도 ' along with the spaces and replace them with themselves, effectively ignoring them. Meanwhile, instances of a space followed by 2 to 4 digits are wrapped in parentheses. However, this only works with my sample. It won't work if the phrases to avoid (step or 도) can appear elsewhere in the text and not just in front of the number patterns you're interested in. The original regular expression with lookbehinds would be more precise for this case. 
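 The skip-by-alternation idea above can be sketched in Python outside Studio. This is only an illustration under assumptions: it uses Python's re module, not the .NET engine in Studio, and the function names and test strings are hypothetical. Two things differ from the Find and Replace version. First, the sketch uses a replacement callback instead of a replacement string, so that a skipped phrase is put back unchanged. Second, Python does not accept alternatives of different widths inside one lookbehind, so the lookbehind used for comparison is split into two, which should be equivalent.

```python
import re

# Find expression from the post: group 1 = phrases to skip, group 3 = digits to wrap.
SKIP_OR_NUMBER = re.compile(r"(step\s|\s도\s)|(\s([0-9]{2,4}))")

# The lookbehind this workaround stands in for, written as two fixed-width
# lookbehinds because Python's re rejects "(?<!step|\s도)".
LOOKBEHIND = re.compile(r"(?<!step)(?<!\s도)\s([0-9]{2,4})")


def wrap_by_alternation(text):
    """Wrap numbers in parentheses, leaving 'step ' and ' 도 ' matches untouched."""
    def repl(match):
        if match.group(1) is not None:
            # A phrase to skip was matched: return it as it was.
            return match.group(1)
        # A number was matched: mirror the "($1)" replacement, which drops the space.
        return "(" + match.group(3) + ")"
    # Note: in Python, a plain replacement string such as r"\1(\3)" would also
    # emit "()" after a skipped phrase, which is why a callback is used here.
    return SKIP_OR_NUMBER.sub(repl, text)


def wrap_by_lookbehind(text):
    """Same operation using the negative lookbehind, for comparison."""
    return LOOKBEHIND.sub(r"(\1)", text)


# Hypothetical test strings, not taken from a real patent text.
for example in ("step 12 uses 300 ml", "A 도 45 B 678"):
    print(wrap_by_alternation(example), "|", wrap_by_lookbehind(example))
```

 On these two strings both functions should give the same result; they are only meant as a way to try out variations of the expression before running them in Studio.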
 If the phrases we want to avoid are used more extensively, one workaround might be to perform multiple passes in a more complex process. For example: 
 
 1. Replace all instances of step or 도 followed by a space and a number with a unique placeholder, such as @@@. 
 2. Perform your existing operation to replace all instances of a space followed by 2 to 4 digits. 
 3. Finally, replace all instances of your unique placeholder @@@ back to the original value. 
 
 This would be less efficient and more error-prone than using lookarounds, but it could potentially achieve the desired result. 
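 The multi-pass idea can be sketched as follows in Python, again outside Studio and using Python's re module rather than the .NET engine. It is one possible reading of the steps, not a tested Studio recipe: only the space between the phrase and the number is swapped for the placeholder, so that the last pass can restore it with a plain replacement. It assumes that @@@ never occurs in the real text and that the protected whitespace is an ordinary space. The function name and test strings are hypothetical.

```python
import re

PLACEHOLDER = "@@@"  # assumed not to occur anywhere in the real text


def wrap_in_three_passes(text):
    # Pass 1: protect the space between "step" / " 도" and a 2-4 digit number
    # by swapping it for the placeholder.
    masked = re.sub(
        r"(step|\s도)\s([0-9]{2,4})",
        r"\1" + PLACEHOLDER + r"\2",
        text,
    )
    # Pass 2: the existing operation, wrapping every remaining
    # "space + 2-4 digits" in parentheses (the space is dropped, as in "($1)").
    wrapped = re.sub(r"\s([0-9]{2,4})", r"(\1)", masked)
    # Pass 3: turn the placeholder back into a space.
    return wrapped.replace(PLACEHOLDER, " ")


# Hypothetical test strings, not taken from a real patent text.
for example in ("step 12 uses 300 ml", "A 도 45 B 678"):
    print(wrap_in_three_passes(example))
```

 Each pass corresponds to one separate Find and Replace run, so the order matters: running the second pass before the first would wrap the numbers that were meant to be protected.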
 In the meantime we'll plan in some time to look at the SDLXLIFF Toolkit until a more permanent solution is implemented into the core product.
