Regex help for custom JSON filetype filter

Hi there

I need to write a custom JSON file type filter (Trados Studio 2021) and am struggling to get any regex to work for the JSON paths in Trados Studio.

A sample of the JSON paths that would need to be translated:

bronze[0].localisable.title
bronze[0].localisable.subTitle
bronze[0].localisable.listItems
bronze[1].localisable.title
bronze[1].localisable.subTitle
bronze[1].localisable.listItems
bronze[1].localisable.ctas[0].text
bronze[1].localisable.ctas[1].text
silver[0].localisable.title
gold[0].localisable.ctas[0].text

I can't work out the correct regex for the words bronze/gold/silver or for the digits inside the [].

Any pointers would be appreciated

TIA

Peter

  •  

    Perhaps it would help if you shared a snippet of the JSON file? The expressions look OK at first glance, assuming that the structure of your JSON data aligns with these paths. So we're missing info to be able to assist you.

    Paul Filkin | RWS Group

    ________________________
    Design your own training!

    You've done the courses and still need to go a little further, or still not clear? 
    Tell us what you need in our Community Solutions Hub

  • Hi  

    Here's the sample (randomized) code:

    {
    "bronze": [{
    "localisable": {
    "title": "This is Title 1",
    "subTitle": "This is SubTitle 1",
    "listItems": [
    "List Item 1"
    ],
    "ctas": [{
    "text": "Text 1"
    },
    {
    "text": "Text 2"
    }
    ]
    },
    "image": "URL1",
    "ctas": [{
    "link": "URL2"
    },
    {
    "link": "URL3"
    }
    ],
    "dates": {
    "start": "26/06/2023",
    "end": ""
    },
    "priority": "2",
    "publications": [
    "AB",
    "CD",
    "EFTG",
    "csdtggtf"
    ]
    },
    {
    "localisable": {
    "title": "This is Title 12",
    "subTitle": "This is SubTitle 14",
    "listItems": [
    "Item 2",
    "Item 23",
    "Item 33"
    ],
    "ctas": [{
    "text": "Text 1"
    },
    {
    "text": "Text 12"
    }
    ]
    },
    "image": "URL",
    "ctas": [{
    "lURL"
    },
    {
    "link": "IRL"
    }
    ],
    "dates": {
    "start": "28/06/2023",
    "end": "13/07/2023"
    },
    "priority": "3",
    "publications": [
    "AB",
    "CsdtggF",
    "sdtgg",
    "dsdsd"
    ]
    }
    ],
    "silver": [{
    "localisable": {
    "title": "Titlke 1",
    "subTitle": "Titkle 2",
    "listItems": [
    "Item 1",
    "Item 33 ",
    "Item 4"
    ],
    "ctas": [{
    "text": "Text 1"
    },
    {
    "text": "Text 12"
    }
    ]
    },
    "image": "URL",
    "ctas": [{
    "link": "URL"
    },
    {
    "link": "URL"
    }
    ],
    "dates": {
    "start": "28/06/2023",
    "end": "13/07/2023"
    },
    "priority": "3",
    "publications": [
    "AB",
    "CsdtggF",
    "sdtgg",
    "dsdsd"
    ]
    }],
    "gold": [{
    "localisable": {
    "title": "This is Title 1",
    "subTitle": "Totl 2",
    "listItems": [
    "Item 1",
    "Item 2",
    "Item 3"
    ],
    "ctas": [{
    "text": "Text 1"
    },
    {
    "text": "Text 12"
    }
    ]
    },
    "image": "URL",
    "ctas": [{
    "link": "URL"
    },
    {
    "link": "URL"
    }
    ],
    "dates": {
    "start": "28/06/2023",
    "end": "13/07/2023"
    },
    "priority": "3",
    "publications": [
    "AB",
    "CsdtggF",
    "sdtgg",
    "dsdsd"
    ]
    }]
    }

    TIA

    Peter

  • Thanks  

    The file you gave me contained an error in the "ctas" array under the second "bronze" object as one of the objects was missing a key for its value (which is "URL").  I had to fix that first as I couldn't parse your example at all.
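
    A quick way to locate that kind of error before loading the file into Studio is to run it through any strict JSON parser. The sketch below assumes Python's standard json module, not anything built into Studio; find_json_error is a hypothetical helper name, and the two strings are cut-down versions of the broken and corrected "ctas" arrays.

```python
import json


def find_json_error(text):
    """Return None if text parses as JSON, else (line, column, message)."""
    try:
        json.loads(text)
    except json.JSONDecodeError as err:
        # lineno and colno are 1-based positions of the first problem found
        return (err.lineno, err.colno, err.msg)
    return None


# Cut-down version of the broken array: the first object has a value but no key
broken = '{"ctas": [{"lURL"}, {"link": "IRL"}]}'
# The same array with the missing key restored
fixed = '{"ctas": [{"link": "URL"}, {"link": "IRL"}]}'

print(find_json_error(broken))  # a (line, column, message) tuple
print(find_json_error(fixed))   # None, because the text parses cleanly
```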

    Here's my corrected file:

    {
      "bronze": [
        {
          "localisable": {
            "title": "This is Title 1",
            "subTitle": "This is SubTitle 1",
            "listItems": ["List Item 1"],
            "ctas": [
              {
                "text": "Text 1"
              },
              {
                "text": "Text 2"
              }
            ]
          },
          "image": "URL1",
          "ctas": [
            {
              "link": "URL2"
            },
            {
              "link": "URL3"
            }
          ],
          "dates": {
            "start": "26/06/2023",
            "end": ""
          },
          "priority": "2",
          "publications": ["AB", "CD", "EFTG", "csdtggtf"]
        },
        {
          "localisable": {
            "title": "This is Title 12",
            "subTitle": "This is SubTitle 14",
            "listItems": ["Item 2", "Item 23", "Item 33"],
            "ctas": [
              {
                "text": "Text 1"
              },
              {
                "text": "Text 12"
              }
            ]
          },
          "image": "URL",
          "ctas": [
            {
              "link": "URL"
            },
            {
              "link": "IRL"
            }
          ],
          "dates": {
            "start": "28/06/2023",
            "end": "13/07/2023"
          },
          "priority": "3",
          "publications": ["AB", "CsdtggF", "sdtgg", "dsdsd"]
        }
      ],
      "silver": [
        {
          "localisable": {
            "title": "Titlke 1",
            "subTitle": "Titkle 2",
            "listItems": ["Item 1", "Item 33 ", "Item 4"],
            "ctas": [
              {
                "text": "Text 1"
              },
              {
                "text": "Text 12"
              }
            ]
          },
          "image": "URL",
          "ctas": [
            {
              "link": "URL"
            },
            {
              "link": "URL"
            }
          ],
          "dates": {
            "start": "28/06/2023",
            "end": "13/07/2023"
          },
          "priority": "3",
          "publications": ["AB", "CsdtggF", "sdtgg", "dsdsd"]
        }
      ],
      "gold": [
        {
          "localisable": {
            "title": "This is Title 1",
            "subTitle": "Totl 2",
            "listItems": ["Item 1", "Item 2", "Item 3"],
            "ctas": [
              {
                "text": "Text 1"
              },
              {
                "text": "Text 12"
              }
            ]
          },
          "image": "URL",
          "ctas": [
            {
              "link": "URL"
            },
            {
              "link": "URL"
            }
          ],
          "dates": {
            "start": "28/06/2023",
            "end": "13/07/2023"
          },
          "priority": "3",
          "publications": ["AB", "CsdtggF", "sdtgg", "dsdsd"]
        }
      ]
    }

    Then I added all your rules and they seem to work just fine:

    Screenshot showing all the jsonpath rules added and the successful preview.

    Can you explain what your problem is exactly?

    Paul Filkin | RWS Group

  • Hi  

    The issue is that we need to be able to replace bronze, gold and silver with regular expressions, as we don't know exactly how many objects are going to be in the files. The same applies to the digits within [].

    The code above is only an example given to us by the developers. It is currently a work in progress and content will be added to it on an ongoing basis, but this is going to be the basic structure of the JSON they will use.

    Thanks

    Peter

  •  

    I'm not sure I follow what you want to do.  I don't think JSONPath supports regex for key matching.  If you want to extract content from the arrays containing the words "bronze", "gold", or "silver", you would have to explicitly use those key names in your JSONPath expressions.

    For example:

    • bronze[*].localisable.title would extract the title from all "localisable" objects in the "bronze" array.
    • gold[*].localisable.ctas[*].text would extract the text from all "ctas" objects in all "localisable" objects in the "gold" array.
    • silver[*].dates.start would extract the start date from all "dates" objects in the "silver" array.
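
    To make the wildcard behaviour concrete, here is a minimal sketch of how [*] and [n] resolve against data shaped like the sample. It is a hand-rolled illustration in Python, not the JSONPath engine Studio uses; resolve is a hypothetical helper and only handles dotted keys followed by [n] or [*].

```python
import re


def resolve(data, path):
    """Resolve a dotted path whose keys may be followed by [n] or [*]."""
    current = [data]
    for part in path.split("."):
        match = re.fullmatch(r"(\w+)((?:\[(?:\d+|\*)\])*)", part)
        if match is None:
            raise ValueError(f"unsupported path segment: {part!r}")
        key = match.group(1)
        indexes = re.findall(r"\[(\d+|\*)\]", match.group(2))
        # Step into the named key on every object matched so far
        current = [n[key] for n in current if isinstance(n, dict) and key in n]
        for index in indexes:
            next_nodes = []
            for node in current:
                if not isinstance(node, list):
                    continue
                if index == "*":
                    next_nodes.extend(node)  # wildcard: every array element
                elif int(index) < len(node):
                    next_nodes.append(node[int(index)])  # one fixed element
            current = next_nodes
    return current


sample = {
    "bronze": [
        {"localisable": {"title": "This is Title 1",
                         "ctas": [{"text": "Text 1"}, {"text": "Text 2"}]}},
        {"localisable": {"title": "This is Title 12",
                         "ctas": [{"text": "Text 1"}, {"text": "Text 12"}]}},
    ]
}

print(resolve(sample, "bronze[*].localisable.title"))
# ['This is Title 1', 'This is Title 12']
print(resolve(sample, "bronze[*].localisable.ctas[*].text"))
# ['Text 1', 'Text 2', 'Text 1', 'Text 12']
```

    The point is that [*] already covers an unknown number of array elements, so the digits inside [] do not need a regex.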

    If you want to apply the same JSONPath expression to the "bronze", "gold", and "silver" arrays, you would need to replicate the expression three times, once for each key. JSONPath also doesn't offer a way to apply a single expression to multiple keys... as far as I know.
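
    If the set of top-level keys keeps changing, one workaround is to generate the replicated expressions from the file itself and then add them to the filter. This is only a sketch under assumptions: build_rules and TRANSLATABLE_SUFFIXES are hypothetical names, the suffixes are taken from the paths listed in the original question, and it assumes every top-level array follows the same "localisable" structure. It produces a list of strings and does not touch Studio settings.

```python
import json

# Hypothetical list of translatable fields, taken from the paths in the question
TRANSLATABLE_SUFFIXES = [
    "localisable.title",
    "localisable.subTitle",
    "localisable.listItems",
    "localisable.ctas[*].text",
]


def build_rules(json_text, suffixes=TRANSLATABLE_SUFFIXES):
    """Return one wildcard path per (top-level array key, suffix) pair."""
    data = json.loads(json_text)
    rules = []
    for key, value in data.items():
        # Only top-level arrays (bronze, silver, gold, ...) get rules
        if isinstance(value, list):
            rules.extend(f"{key}[*].{suffix}" for suffix in suffixes)
    return rules


# Cut-down stand-in for the developers' file; "version" is an invented non-array key
example = '{"bronze": [], "silver": [], "gold": [], "version": "1"}'
for rule in build_rules(example):
    print(rule)
```

    Re-running this whenever the developers add a new top-level key would give the updated list of expressions to add.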

    Have I completely misunderstood you?

    Paul Filkin | RWS Group

  • Hi  

    No, you have understood exactly what I want to achieve, but it looks like JSON may not be the way to go in this case. I'll go back to the developers and see if there's another format they can use.

    Thanks for your help on this.

    Peter

  •  

    I'll go back to the developers and see if there's another format they can use.

    Plain old XML may be better. It's far more flexible in terms of what you can do with the content of the file than JSON (in my opinion).

    Paul Filkin | RWS Group
