Regex help for custom JSON filetype filter

Hi there

I need to write a custom JSON file type filter (Trados Studio 2021) and am struggling to get any regex to work for the JSON paths in Trados Studio.

A sample of the JSON paths that would need to be translated:

bronze[0].localisable.title
bronze[0].localisable.subTitle
bronze[0].localisable.listItems
bronze[1].localisable.title
bronze[1].localisable.subTitle
bronze[1].localisable.listItems
bronze[1].localisable.ctas[0].text
bronze[1].localisable.ctas[1].text
silver[0].localisable.title
gold[0].localisable.ctas[0].text

I can't work out the correct regex for the words bronze/gold/silver or for the digits inside the [].

Any pointers would be appreciated

TIA

Peter

  • Hi  

    Here's the sample (randomized) code:

    {
    "bronze": [{
    "localisable": {
    "title": "This is Title 1",
    "subTitle": "This is SubTitle 1",
    "listItems": [
    "List Item 1"
    ],
    "ctas": [{
    "text": "Text 1"
    },
    {
    "text": "Text 2"
    }
    ]
    },
    "image": "URL1",
    "ctas": [{
    "link": "URL2"
    },
    {
    "link": "URL3"
    }
    ],
    "dates": {
    "start": "26/06/2023",
    "end": ""
    },
    "priority": "2",
    "publications": [
    "AB",
    "CD",
    "EFTG",
    "csdtggtf"
    ]
    },
    {
    "localisable": {
    "title": "This is Title 12",
    "subTitle": "This is SubTitle 14",
    "listItems": [
    "Item 2",
    "Item 23",
    "Item 33"
    ],
    "ctas": [{
    "text": "Text 1"
    },
    {
    "text": "Text 12"
    }
    ]
    },
    "image": "URL",
    "ctas": [{
    "lURL"
    },
    {
    "link": "IRL"
    }
    ],
    "dates": {
    "start": "28/06/2023",
    "end": "13/07/2023"
    },
    "priority": "3",
    "publications": [
    "AB",
    "CsdtggF",
    "sdtgg",
    "dsdsd"
    ]
    }
    ],
    "silver": [{
    "localisable": {
    "title": "Titlke 1",
    "subTitle": "Titkle 2",
    "listItems": [
    "Item 1",
    "Item 33 ",
    "Item 4"
    ],
    "ctas": [{
    "text": "Text 1"
    },
    {
    "text": "Text 12"
    }
    ]
    },
    "image": "URL",
    "ctas": [{
    "link": "URL"
    },
    {
    "link": "URL"
    }
    ],
    "dates": {
    "start": "28/06/2023",
    "end": "13/07/2023"
    },
    "priority": "3",
    "publications": [
    "AB",
    "CsdtggF",
    "sdtgg",
    "dsdsd"
    ]
    }],
    "gold": [{
    "localisable": {
    "title": "This is Title 1",
    "subTitle": "Totl 2",
    "listItems": [
    "Item 1",
    "Item 2",
    "Item 3"
    ],
    "ctas": [{
    "text": "Text 1"
    },
    {
    "text": "Text 12"
    }
    ]
    },
    "image": "URL",
    "ctas": [{
    "link": "URL"
    },
    {
    "link": "URL"
    }
    ],
    "dates": {
    "start": "28/06/2023",
    "end": "13/07/2023"
    },
    "priority": "3",
    "publications": [
    "AB",
    "CsdtggF",
    "sdtgg",
    "dsdsd"
    ]
    }]
    }

    TIA

    Peter

  • Thanks  

    The file you gave me contained an error in the "ctas" array under the second "bronze" object as one of the objects was missing a key for its value (which is "URL").  I had to fix that first as I couldn't parse your example at all.
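A quick way to locate this kind of error before loading a file into Studio is to run it through any strict JSON parser. The sketch below uses Python's standard `json` module; the helper name `find_json_error` and the two one-line snippets are made up for illustration and only mimic the faulty "ctas" entry.

```python
import json


def find_json_error(text):
    """Return None if text parses as JSON, else a short description of the first error."""
    try:
        json.loads(text)
    except json.JSONDecodeError as err:
        # lineno/colno point at the place where the parser gave up
        return f"line {err.lineno}, column {err.colno}: {err.msg}"
    return None


# Hypothetical snippets: an object holding a bare string with no key, and a repaired version
broken = '{"ctas": [{"lURL"}, {"link": "IRL"}]}'
fixed = '{"ctas": [{"link": "URL"}, {"link": "IRL"}]}'

print(find_json_error(broken))
print(find_json_error(fixed))
```

The first call reports where parsing failed; the second prints `None` because the repaired snippet is valid.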

    Here's my corrected file:

    {
      "bronze": [
        {
          "localisable": {
            "title": "This is Title 1",
            "subTitle": "This is SubTitle 1",
            "listItems": ["List Item 1"],
            "ctas": [
              {
                "text": "Text 1"
              },
              {
                "text": "Text 2"
              }
            ]
          },
          "image": "URL1",
          "ctas": [
            {
              "link": "URL2"
            },
            {
              "link": "URL3"
            }
          ],
          "dates": {
            "start": "26/06/2023",
            "end": ""
          },
          "priority": "2",
          "publications": ["AB", "CD", "EFTG", "csdtggtf"]
        },
        {
          "localisable": {
            "title": "This is Title 12",
            "subTitle": "This is SubTitle 14",
            "listItems": ["Item 2", "Item 23", "Item 33"],
            "ctas": [
              {
                "text": "Text 1"
              },
              {
                "text": "Text 12"
              }
            ]
          },
          "image": "URL",
          "ctas": [
            {
              "link": "URL"
            },
            {
              "link": "IRL"
            }
          ],
          "dates": {
            "start": "28/06/2023",
            "end": "13/07/2023"
          },
          "priority": "3",
          "publications": ["AB", "CsdtggF", "sdtgg", "dsdsd"]
        }
      ],
      "silver": [
        {
          "localisable": {
            "title": "Titlke 1",
            "subTitle": "Titkle 2",
            "listItems": ["Item 1", "Item 33 ", "Item 4"],
            "ctas": [
              {
                "text": "Text 1"
              },
              {
                "text": "Text 12"
              }
            ]
          },
          "image": "URL",
          "ctas": [
            {
              "link": "URL"
            },
            {
              "link": "URL"
            }
          ],
          "dates": {
            "start": "28/06/2023",
            "end": "13/07/2023"
          },
          "priority": "3",
          "publications": ["AB", "CsdtggF", "sdtgg", "dsdsd"]
        }
      ],
      "gold": [
        {
          "localisable": {
            "title": "This is Title 1",
            "subTitle": "Totl 2",
            "listItems": ["Item 1", "Item 2", "Item 3"],
            "ctas": [
              {
                "text": "Text 1"
              },
              {
                "text": "Text 12"
              }
            ]
          },
          "image": "URL",
          "ctas": [
            {
              "link": "URL"
            },
            {
              "link": "URL"
            }
          ],
          "dates": {
            "start": "28/06/2023",
            "end": "13/07/2023"
          },
          "priority": "3",
          "publications": ["AB", "CsdtggF", "sdtgg", "dsdsd"]
        }
      ]
    }

    Then I added all your rules and they seem to work just fine:

    Screenshot showing all the jsonpath rules added and the successful preview.

    Can you explain what your problem is exactly?

  • Hi  

    The issue is that we need to be able to replace bronze, gold and silver with regular expressions, as we don't know exactly how many objects are going to be in the files. The same goes for the digits within [].

    The code above is only an example given to us by the developers. It is currently a work in progress and content will be added to it on an ongoing basis, but this is going to be the basic structure of the JSON they will use.

    Thanks

    Peter

  •  

    I'm not sure I follow what you want to do.  I don't think JSONPath supports regex for key matching.  If you want to extract content from the arrays containing the words "bronze", "gold", or "silver", you would have to explicitly use those key names in your JSONPath expressions.

    For example:

    • bronze[*].localisable.title would extract the title from all "localisable" objects in the "bronze" array.
    • gold[*].localisable.ctas[*].text would extract the text from all "ctas" objects in all "localisable" objects in the "gold" array.
    • silver[*].dates.start would extract the start date from all "dates" objects in the "silver" array.
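To make the effect of `[*]` concrete, here is a minimal sketch of an evaluator for the simplified path style used in this thread (keys, `[n]` and `[*]` only). It is not the engine Trados Studio uses and it is not a full JSONPath implementation; the function name `select` and the trimmed `data` sample are assumptions for illustration.

```python
import re

# A path token is either a key name or a bracketed index / wildcard
TOKEN = re.compile(r"([^.\[\]]+)|\[(\*|\d+)\]")


def select(node, path):
    """Return every value reached by following path from node."""
    current = [node]
    for key, index in TOKEN.findall(path):
        matched = []
        for item in current:
            if key:
                # Key step: descend into objects that have this key
                if isinstance(item, dict) and key in item:
                    matched.append(item[key])
            elif index == "*":
                # Wildcard step: take every element of the array
                if isinstance(item, list):
                    matched.extend(item)
            elif isinstance(item, list) and int(index) < len(item):
                # Numeric step: take one element of the array
                matched.append(item[int(index)])
        current = matched
    return current


# Trimmed-down version of the "gold" part of the sample file
data = {
    "gold": [
        {
            "localisable": {
                "title": "This is Title 1",
                "ctas": [{"text": "Text 1"}, {"text": "Text 12"}],
            }
        }
    ]
}

print(select(data, "gold[*].localisable.ctas[*].text"))  # ['Text 1', 'Text 12']
print(select(data, "gold[0].localisable.title"))         # ['This is Title 1']
```

Because `[*]` takes every element, the rule keeps working no matter how many objects the developers add to each array.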

    If you want to apply the same JSONPath expression to the "bronze", "gold", and "silver" arrays, you would need to replicate the expression three times, once for each key. JSONPath also doesn't offer a way to apply a single expression to multiple keys... as far as I know.
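If the top-level keys are not known in advance, one possible workaround is to generate the explicit rule list from each new file outside Trados Studio and then add the rules to the filter by hand. The sketch below is only that: a Python script, not a Studio feature. The regex is applied by Python to the discovered paths, the names `leaf_paths` and `translatable_rules` are hypothetical, and whether Studio expects `listItems` or `listItems[*]` for an array of strings is an assumption to check against the preview.

```python
import json
import re


def leaf_paths(node, prefix=""):
    """Yield the path of every scalar leaf, with array indices collapsed to [*]."""
    if isinstance(node, dict):
        for key, value in node.items():
            yield from leaf_paths(value, f"{prefix}.{key}" if prefix else key)
    elif isinstance(node, list):
        for item in node:
            yield from leaf_paths(item, f"{prefix}[*]")
    else:
        yield prefix


def translatable_rules(document, pattern=r"^[^.\[]+\[\*\]\.localisable\."):
    """Return the distinct wildcard paths that sit under any top-level key's 'localisable' object."""
    matcher = re.compile(pattern)
    return sorted({path for path in leaf_paths(document) if matcher.search(path)})


# Small hypothetical file in the same shape as the sample in this thread
sample = json.loads("""
{
  "bronze": [{"localisable": {"title": "T", "listItems": ["a", "b"],
                              "ctas": [{"text": "x"}]},
              "image": "URL1"}],
  "gold": [{"localisable": {"subTitle": "S"}, "priority": "3"}]
}
""")

rules = translatable_rules(sample)
for rule in rules:
    print(rule)
```

For the small sample this prints one rule per distinct translatable path under bronze and gold, and skips "image" and "priority" because they sit outside "localisable".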

    Have I completely misunderstood you?

  • Hi  

    No, you have understood exactly what I want to achieve, but it looks like JSON may not be the way to go in this case. I'll go back to the developers and see if there's another format they can use.

    Thanks for your help on this.

    Peter

  •  

    I'll go back to the developers and see if there's another format they can use.

    Plain old XML may be better. In my opinion, it's far more flexible than JSON in terms of what you can do with the content of the file.
