Issues with regex in Find/Replace operations

A discussion started today on ProZ (http://www.proz.com/forum/sdl_trados_support/309565-regex_f_r_question_trados_2015.html) has brought to light several issues with regex in Find/Replace operations that, if I remember correctly, are not new but are still present even in Studio 2017.

I thought I'd start a thread here to get the discussion going with people who may be monitoring the SDL Community but not the ProZ Trados forum.

Some of the issues discussed in the ProZ thread include:

- Some segments are skipped during the Find operation for no apparent reason

- Some segments throw an error saying that the "Segment start/end cannot be deleted" when attempting to execute a replacement

- Tags are ignored in the Find/Replace operation 

- [ cannot be used in the Replace field (although including both the opening and closing brackets works); using \[ will naturally result in \[ being inserted in the replacement, instead of the bracket being escaped

- ^ is ignored as a start of segment/string anchor during the Find operation

- Find/Replace stops responding after several of these operations
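
For comparison, here is a rough sketch of how a stand-alone regex engine treats the bracket case. It uses Python's `re` module, not Studio's own engine (which, as far as I know, follows .NET syntax), so the group reference is written `\1` where Studio would use `$1`. The sample segment text is made up for illustration.

```python
import re

# Made-up segment text, used only for illustration.
segment = "Hello world"

# In a replacement string "[" is an ordinary character, so it needs no escaping.
# Python writes the first captured group as \1 (Studio would use $1).
unescaped = re.sub(r"^(.)", r"[[\1", segment)

# A backslash before "[" is not a recognised replacement escape,
# so both characters are copied into the result unchanged.
escaped = re.sub(r"^(.)", r"\[\1", segment)

print(unescaped)  # [[Hello world
print(escaped)    # \[Hello world
```

So an unpaired [ in the replacement is harmless to the regex engine itself, which suggests the restriction comes from the Find/Replace dialog and not from the regex syntax.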

 

 

  • Hi Nora,

    Thanks for sharing this... good way to kill a long train journey!  Although every hotel and every train I have to use reminds me how far we are from being able to work online completely!

    Unknown said:
    - Some segments are skipped during the Find operation for no apparent reason

    I can reproduce this and I think I can see a pattern for what's skipped.  If the segments are within a paragraph then it looks as though only the last segment in the paragraph unit is dealt with.  Like this for example:

    So the paragraph units before and after, which each contain only one segment, are OK, and the segment that is last in its paragraph unit (in this case there are only 2 segments, but it would still be the last one) is OK.

    I don't have problems with the [[ and ]] though.

    I also get better results with the SDLXLIFF Toolkit as this doesn't skip any segments.  But it does have problems with tags, and with segments containing code... like <texttag>text in here</texttag> for example.  So we do need to address a few things.  I'm going to start with the toolkit as I can do something about this faster.

    Unknown said:
    - Some segments throw an error saying that the "Segment start/end cannot be deleted" when attempting to execute a replacement

    How do I reproduce this?

    Unknown said:
    - Tags are ignored in the Find/Replace operation 

    Yep... got it.

    Unknown said:
    - [ cannot be used in the Replace field (although including both the opening and closing brackets works); using \[ will naturally result in \[ being inserted in the replacement, instead of the bracket being escaped

    Works for me.  Would be good to have some examples of the failing files and steps to reproduce.

    Unknown said:
    - ^ is ignored as a start of segment/string anchor during the Find operation

    Again... examples please.  Seems to be fine for me.

    Unknown said:
    - Find/Replace stops responding after several of these operations

    I could not reproduce this specifically... but I could crash it altogether if this is what you mean?

    Regards

    Paul

  • Hi Paul,

    I tested on a file that I can't share publicly, but will create a similar one to show what happens here when I have a little time later. A couple of things in the meantime:

    Unknown said:

     

    I don't have problems with the [[ and ]] though.

    Me neither, when both [[ and ]] are used in the replacement field, but when attempting, for example, to Find ^(.) and Replace with [[$1, Studio won't accept [[ in the replacement field.

     

    Unknown said:

    - Some segments throw an error saying that the "Segment start/end cannot be deleted" when attempting to execute a replacement

    I could be wrong, but it seems to me that if a segment has, for example, bold formatting, with no tags showing, this happens. 

    Unknown said:
     
    - [ cannot be used in the Replace field (although including both the opening and closing brackets works); using \[ will naturally result in \[ being inserted in the replacement, instead of the bracket being escaped
     
    Works for me.  Would be good to have some examples of the failing files and steps to reproduce.

    Unknown said:
    Nora Díaz
    - ^ is ignored as a start of segment/string anchor during the Find operation

    For this, try searching for ^(.) and see if Find Next takes you to the first character of the next segment or simply to the next character in the same segment.
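
To make the expected result concrete, here is a small sketch of what I would expect Find Next to do if ^ is honoured as a start-of-segment anchor. It is written with Python's `re` module and not Studio's engine, and the segment list and the `find_all` helper are hypothetical stand-ins for the editor rows and the Find Next loop.

```python
import re

# Hypothetical segments standing in for the rows of the editor.
segments = ["First segment.", "Second segment.", "Third one."]

# ^ anchors the match to the start of the string being searched.
pattern = re.compile(r"^(.)")

def find_all(segments):
    """Return (segment_index, character_offset, matched_text) for every match."""
    hits = []
    for index, text in enumerate(segments):
        for match in pattern.finditer(text):
            hits.append((index, match.start(), match.group(1)))
    return hits

hits = find_all(segments)
for hit in hits:
    print(hit)
# (0, 0, 'F')
# (1, 0, 'S')
# (2, 0, 'T')
```

Each segment gives exactly one hit at offset 0. If Find Next instead moves to the second character of the same segment, the anchor is being ignored.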

     
    Unknown said:
              Nora Díaz
    - Find/Replace stops responding after several of these operations