Regex to identify invalid URL slugs

I am struggling to find a regex that will identify invalid URL slugs. I often have revision tasks where the translators add apostrophes or accented letters in the URL slugs and I keep removing them manually. I’d like to improve my workflow and spot the erroneous characters automatically.

A valid URL slug is composed of numbers, letters and hyphens (or underscores, depending on the chosen variant). Nothing else. Thus the regex for valid URL slugs is fairly simple: ^[a-z0-9]+(?:-[a-z0-9]+)*$

Now, where I am struggling is in negating this expression. I tried to use negative lookahead but it did not work. Any hint?

Philippe



Added the mention of "underscore"
[edited by: Philippe Noth at 6:57 PM (GMT 0) on 9 Feb 2023]
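
For reference, one way the negation asked about above could be written is sketched below. It is only an illustration in Python, not something taken from the thread: the names `INVALID_SLUG`, `OFFENDING_CHAR`, `is_invalid_slug` and `offending_characters` are hypothetical, and the sketch assumes the hyphen variant of the slug. The usual pitfall with a negative lookahead is leaving the end anchor outside it.

```python
import re

# The valid-slug pattern from the question, without its anchors.
VALID_SLUG = r"[a-z0-9]+(?:-[a-z0-9]+)*"

# Negated form. The "$" must sit inside the lookahead: without it, any string
# that merely starts with a valid character would be treated as valid.
INVALID_SLUG = re.compile(rf"^(?!{VALID_SLUG}$).*$")

# Alternative: match the individual characters that are not allowed
# (assumption: hyphen variant; add "_" to the class for the underscore variant).
OFFENDING_CHAR = re.compile(r"[^a-z0-9-]")


def is_invalid_slug(text: str) -> bool:
    """Return True when the whole string is not a valid slug."""
    return INVALID_SLUG.match(text) is not None


def offending_characters(text: str) -> list[str]:
    """Return every character that can never appear in a slug."""
    return OFFENDING_CHAR.findall(text)
```

The character-class variant is handy for highlighting apostrophes and accented letters, but it does not catch structural problems such as a doubled or leading hyphen; the lookahead variant does.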
  •  

    I think a better approach might be to create a QA check to look for segments where the source was correct but the target was not... like this:

    Trados Studio project settings window showing Regular Expressions section with a URL Slug Check configured to report if source matches but not the target.

    This way you can reuse the expression you have already validated, which will always identify a correct URL slug, and test it against both the source and the target.  If the source matches but the target doesn't then it will report an error... like this for example:

    Trados Studio Messages window displaying 3 warnings from QA Checker 3.0 for URL Slug Check with source matches count and target matches count.
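
The logic of that check ("report if source matches but not the target") can be sketched outside Trados as follows. This is a rough Python illustration under stated assumptions, not the QA Checker's actual implementation: `slug_check` is a hypothetical helper name and the sample segments in the usage lines are invented.

```python
import re

# The validated expression from the question, anchors included.
VALID_SLUG = re.compile(r"^[a-z0-9]+(?:-[a-z0-9]+)*$")


def slug_check(source: str, target: str) -> bool:
    """Return True when an error should be reported for this segment pair.

    An error is reported only if the source is a valid slug and the
    target is not; segments whose source is not a slug are ignored.
    """
    source_ok = VALID_SLUG.match(source) is not None
    target_ok = VALID_SLUG.match(target) is not None
    return source_ok and not target_ok


# Invented example pairs (source, target), for illustration only.
pairs = [
    ("contact-us", "contactez-nous"),
    ("about-us", "\u00e0-propos"),
    ("About us", "\u00c0 propos"),
]
flagged = [pair for pair in pairs if slug_check(*pair)]
```

Because the same expression is applied to both sides, no negated regex is needed; ordinary text segments are skipped automatically since their source does not match.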

    Would that work?

    Paul Filkin | RWS Group

    ________________________
    Design your own training!

    You've done the courses and still need to go a little further, or still not clear? 
    Tell us what you need in our Community Solutions Hub



    Generated Image Alt-Text
    [edited by: Trados AI at 4:47 AM (GMT 0) on 5 Mar 2024]
  • Good idea, I forgot about that possibility. Just in time for the new project that also has a few URL slugs with errors…

    So… it turns out that it will not work for me: I am getting GroupShare projects with no possibility to change the project settings except for TMs and TBs. I added the custom QA check in the options but it is ignored.

    Philippe

  •  

    My first thought is you should encourage your client to include this setting in the project for their own benefit.

    My second is even simpler and might be better anyway!  Use the Advanced display filter like this:

    Trados Studio interface showing Advanced Display Filter 2.0 with a regular expression entered in the Source field and the 'Reverse' button highlighted indicating the filter is reversed.

    And then just click on "Reverse" next to "Apply Filter" which gets you this:

    Trados Studio interface displaying filtered results with segments containing errors highlighted in red, such as 'regex-to-identify-invalid-url-slugs'.

    Just the ones you need to correct.
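
The idea behind "Reverse" is that the match result is inverted rather than the expression itself. A minimal Python sketch of that idea, assuming a hypothetical `display_filter` helper and invented segment texts (this is not how the Advanced Display Filter is implemented):

```python
import re

# The expression for a valid slug, exactly as given in the question.
VALID_SLUG = re.compile(r"^[a-z0-9]+(?:-[a-z0-9]+)*$")


def display_filter(segments, pattern, reverse=False):
    """Keep segments that match the pattern, or those that do not if reversed."""
    return [
        text for text in segments
        if (pattern.search(text) is not None) != reverse
    ]


# Invented segments, for illustration only.
segments = ["contactez-nous", "\u00e0-propos", "mentions-l\u00e9gales", "tarifs"]
to_correct = display_filter(segments, VALID_SLUG, reverse=True)
```

With `reverse=True` only the segments that fail the valid-slug expression remain, which is the list that needs correcting.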

    Paul Filkin | RWS Group

  • Thanks Paul, of course the Reverse option does the trick!
