Using ChatGPT to learn and create regular expressions

I’ve just tried using ChatGPT to create a regular expression, and it worked!

I asked ChatGPT: What’s the regex to match a string without the string hello?

And the answer was:

^(?!.*hello).*$

with a helpful explanation:

Screenshot of a regex expression in a text editor with a breakdown explanation. The expression is designed to match a string that does not contain the word 'hello'.

Always take the answers from ChatGPT with a grain of salt, because they can be plain wrong, but I believe this tool can be a real help when creating regular expressions or AutoHotkey macros.
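
A quick way to apply that grain of salt is to run a suggested pattern against a few strings before trusting it. The sketch below does this with Python's `re` module, which is an assumption (the post does not say which regex flavour was being targeted), and the sample strings are made up purely for illustration.

```python
import re

# Pattern suggested by ChatGPT in the post above.
NO_HELLO = re.compile(r"^(?!.*hello).*$")

# Hypothetical sample strings, chosen only to exercise the pattern.
samples = ["good morning", "say hello to them", "Hello there", ""]

for text in samples:
    # The negative lookahead (?!.*hello) fails if "hello" occurs anywhere
    # after the start of the string, so the whole match is rejected.
    print(repr(text), bool(NO_HELLO.match(text)))
```

Note that the check is case-sensitive as written, so "Hello there" is still accepted; whether that is wanted depends on the original intent.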



Generated Image Alt-Text
[edited by: Trados AI at 10:12 AM (GMT 0) on 4 Mar 2024]
  •  

    Thanks for this.  I moved your post into this forum I set up because AI is surely going to be a bigger topic as time goes by.  I also started by looking at AI for regex before I looked at anything more complex, and in the end I found it harder to work with!

    It's great for simple things, and also for many really complex things... but I did find it often gets the syntax wrong, and when I used it for more complex stuff it became hard to identify where it slipped up.  For example... I ask it this:

    Provide a regular expression, using c# (.net) syntax, for use in a tool like notepad++. The expression should be able to take a number without thousands separators and correct it using a comma for the separator. If there is a period in the number this is the decimal place and the numbers after it should not be separated.

    The response:

    Here's a regular expression that matches numbers without thousands separators and adds a comma as a separator:

    Screenshot of a regular expression in SCSS format that does not work as intended for adding comma separators in numbers.

    \b(\d{1,3})(?=(\d{3})+(?!\d))\b

    To use it in Notepad++, follow these steps:

    1. Open the Find and Replace dialog box by pressing Ctrl + H.
    2. In the "Find what" field, enter the regular expression above.
    3. In the "Replace with" field, enter "$1," (without the quotes).
    4. Check the "Regular expression" option.
    5. Click "Replace All" to replace all occurrences of the matched pattern with the corrected version.
    This regular expression uses a positive lookahead to find groups of three digits that are not followed by another digit. This ensures that only the thousands separators are added, and not the ones that are already part of the decimal portion of the number. The "\b" at the beginning and end of the expression ensure that only whole numbers are matched, and not parts of larger numbers. The "$1" in the replacement string refers to the first capturing group, which contains the three digits that are being separated by the comma.

    Looks great... but it doesn't work.  So I ask it to try a number:

    Screenshot showing a conversation where a user points out the regular expression error and an updated version is provided in Ruby format.

    That also doesn't work...

    Screenshot of an apology for the confusion with an updated regular expression in Ruby format, including instructions for using it in Notepad++.

    Neither does this... I can keep trying to get it to correct itself, provide hints etc., but I can never get it to correctly solve this task.  It even occasionally provides expressions without escaping special characters, which adds to the confusion.
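
One way to see where the first suggestion slips up is to try it outside the chat. The sketch below uses Python's `re` module rather than .NET or Notepad++, so it only approximates the original setup, and `add_separators` with its helper patterns is an illustrative alternative of my own naming, not an expression from this thread. In Python at least, the first expression can never match: after the captured digits the lookahead demands more digits, yet the trailing `\b` demands a word boundary at that same position, and there is no word boundary between two digits.

```python
import re

# First expression from the ChatGPT answer quoted above.
CHATGPT_FIRST = re.compile(r"\b(\d{1,3})(?=(\d{3})+(?!\d))\b")

# Illustrative alternative (an assumption, not the thread's solution):
# find each number, then group only its integer part.
NUMBER = re.compile(r"(?<![\d.])\d+(?:\.\d+)?")
GROUP_POINT = re.compile(r"(?<=\d)(?=(?:\d{3})+$)")


def add_separators(text: str) -> str:
    """Insert commas into the integer part of every number in text."""

    def fix(match):
        integer, dot, fraction = match.group(0).partition(".")
        # Zero-width match at each point that is followed by a multiple
        # of three digits up to the end of the integer part.
        return GROUP_POINT.sub(",", integer) + dot + fraction

    return NUMBER.sub(fix, text)


sample = "1234567.891234"
print(CHATGPT_FIRST.sub(r"\1,", sample))  # unchanged: the pattern never matches
print(add_separators(sample))             # 1,234,567.891234
```

Splitting the job into "find the number" and "group the integer part" sidesteps the variable-length lookbehind that a single-expression solution would otherwise need, which Python's `re` does not support.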

    So despite me believing AI definitely has a place as a tool to help with technical tasks, I don't think it's got there with regex yet.  At least not for anything other than fairly straightforward requests.  But it is fun trying!

    Paul Filkin | RWS Group

    ________________________
    Design your own training!

    You've done the courses and still need to go a little further, or still not clear? 
    Tell us what you need in our Community Solutions Hub

  •

    Your request was a really difficult one :-)

    I don’t know much about ChatGPT or any other AIs, but I believe they learn from content already available on the internet. So, for example, forums where users raise and answer questions are probably good sources for ChatGPT. If a question has not already been asked and answered, ChatGPT is probably not going to be able to answer it properly.

    Out of curiosity, I didn’t understand why you asked for a .NET regex to be used in Notepad++, because they use different regex flavours, don’t they? Anyway, I don’t think this is the reason ChatGPT failed to give the right answer.

    Lastly, I’ve just learned that the Bing search engine is releasing Bing Chat, a new way of searching the web via ChatGPT, which uses ChatGPT 4.0 (currently only 3.5 is available). It’s on my to-do list to test it (and also to read your latest posts about AI). I know Google is not lagging behind, and there is a Bard Chat Search engine. Many things on my to-do list. :-)

  •  

    Your request was a really difficult one :-)

    I know :-)

    I don’t know much about ChatGPT or any other AIs, but I believe they learn from content already available on the internet.

    I think this is partially true.  ChatGPT would say this:

    As an AI language model, I am not programmed with a fixed set of regular expressions. However, I have been trained on a vast amount of text data, including text that contains regular expressions. Through this training process, I have learned to recognize patterns and identify common regular expression syntax and metacharacters.

    When asked to create a regular expression, I use my knowledge of regular expression syntax and patterns to generate a possible expression based on the given criteria. However, as with any complex task, the accuracy of my response may vary depending on the specific requirements of the regular expression and the complexity of the pattern being matched.

    It's worth noting that while I can generate regular expressions, it's always a good idea to test and refine the expression to ensure that it accurately matches the intended pattern.

    So I think it "knows" the rules, or patterns required, and is then able to analyse the requirements of a question and put them all together to answer it.  In a way, it's basically what we do.

    I didn’t understand why you asked for a .NET regex to be used in Notepad++, because they use different regex flavours, don’t they?

    I did this because the hardest part of getting good technical answers is the prompt, or context.  ChatGPT isn't familiar enough with Trados Studio to know how the expression will be used, but if I tell it I want the expression to be used in a search & replace tool that it should know about, then the format of the answer makes more sense.  Without this I can even get expressions embedded in a longer piece of code written in JavaScript or C#, for example.  I specified the syntax I wanted to make sure the regular expression returned would be suitable for use in Trados Studio, which only uses this flavour.
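
The flavour matters because the same expression can be valid in one engine and rejected by another. As a small illustration, using Python's `re` module as the second engine (my choice, not something from this thread): .NET accepts lookbehinds of variable length, whereas Python's `re` insists on fixed-width ones. The pattern below is a made-up example of a .NET-style lookbehind.

```python
import re

# Hypothetical .NET-style pattern: a digit NOT preceded by a period and
# any run of digits. The lookbehind \.\d* has variable width.
DOTNET_STYLE = r"(?<!\.\d*)\d"

try:
    re.compile(DOTNET_STYLE)
    accepted = True
except re.error as exc:
    # Python's re module only supports fixed-width lookbehind.
    accepted = False
    print("rejected by Python's re:", exc)
```

So an answer that is correct for one tool can still fail in another, which is why naming the flavour in the prompt helps.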

    Sometimes it takes a lot of trial and error to get the answer you need, and this all happens in a "discussion" format with the engine interface.  So I often ask at the end, "How should this question have been written to ensure you could answer it the first time?"  That is something I find helpful too because I then get the sort of prompt or context I need to use the next time.  Eventually the models train us so we can make better use of these tools!

    I know Google is not lagging behind, and there is a Bard Chat Search engine

    There are all sorts already!  The race is on and I'm sure Google will be up there soon.  But I am also sure we are at a sea change in terms of how we interact with our technology.  This sort of technology, whilst carrying significant concerns for many (myself included), definitely has the potential to help with productivity in almost everything we do in a digital environment already.

    Paul Filkin | RWS Group

  •  

    Your request was a really difficult one :-)

    And with the upgrade tonight to GPT-4 it has actually solved it!

    https://multifarious.filkin.com/2023/03/15/the-elusive-regex-with-gpt-4/

    Paul Filkin | RWS Group

  • You did it!

    Out of curiosity: Did you do it with the Edge browser or with a ChatGPT Plus account? It seems ChatGPT 4.0 is not available everywhere…

    Screenshot of Release Notes dated March 14 stating excitement about bringing GPT-4, the latest model, to ChatGPT Plus subscribers.

  •

    An improvement would be to correctly handle numbers with leading zeroes:

    00000000001.23456
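
To make the leading-zero case concrete, here is a small sketch in Python (an assumption; the thread's expression targets .NET). It applies a plain digit-grouping pattern to the integer part of the number above and then, as one possible policy that the post does not specify, strips the leading zeroes before grouping. The names `group_digits` and `strip_then_group` are hypothetical.

```python
import re

# Zero-width point preceded by a digit and followed by a multiple of
# three digits up to the end of the string.
GROUP_POINT = re.compile(r"(?<=\d)(?=(?:\d{3})+$)")


def group_digits(number: str) -> str:
    """Comma-group the integer part, leaving the decimals untouched."""
    integer, dot, fraction = number.partition(".")
    return GROUP_POINT.sub(",", integer) + dot + fraction


def strip_then_group(number: str) -> str:
    """Assumed policy: drop leading zeroes (keeping one digit), then group."""
    integer, dot, fraction = number.partition(".")
    integer = integer.lstrip("0") or "0"
    return group_digits(integer + dot + fraction)


sample = "00000000001.23456"
print(group_digits(sample))      # the zeroes get grouped like any other digits
print(strip_then_group(sample))  # 1.23456
```

Whether the zeroes should be dropped, kept ungrouped, or treated as an error is a requirement question rather than a regex one, so it would need to be stated in the prompt.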

  • Out of curiosity: Did you do it with the Edge browser or with a ChatGPT Plus account? It seems ChatGPT 4.0 is not available everywhere…

    Good question.  I actually got so interested in this that I subscribed to ChatGPT Plus, so I use the interface for this and log in as a paying user.  That gives me access to the latest and greatest as they release updates.  I don't get the full benefits of all the multimodal capabilities as this UI is only text, and at some point I think I'll probably revert to the free access available to everyone.

    Paul Filkin | RWS Group
