Segmentation after full stop or line break on an already translated json file

Good morning,

Apart from my translations, I've recently been asked to review JSON files that already include their translation, so that I can edit the content and add it to my TM. It seems that, because the translation is already filled in, Trados "blocks" it and doesn't apply the normal segmentation that I need for my translation memories. This is what it looks like:


Screenshot of Trados Studio showing a JSON file with French text. Text is not segmented after full stops or line breaks, and there are no visible errors or warnings.

I need a new segment every time there is a full stop or a line break. I've tried adding a TM that has segmentation rules, and I've checked under File > Options > File Types > JSON, but I can't find anything that matches what I need, so I'm afraid I'll need your help.

Thank you in advance,

Noa



Generated Image Alt-Text
[edited by: Trados AI at 9:21 AM (GMT 0) on 29 Feb 2024]
  •   

    I think a bit more information is needed here. If the translation is already filled in, I assume this means you are not opening a JSON file, but rather an XLIFF or SDLXLIFF? If so, the segmentation is already complete and is determined by markup in the XLIFF.

    If this is the case then perhaps you can save the target file, and also the source, so you have the actual json files.  Then align them to get a TM based on segmentation the way you'd prefer to have it, and finally create a new project from the source json file and pretranslate from your new TM.

    Essentially I think the problem seems to be coming from how the project was originally created.

    Paul Filkin | RWS Group

    ________________________
    Design your own training!

    You've done the courses and still need to go a little further, or still not clear? 
    Tell us what you need in our Community Solutions Hub

  • Hi, Paul!

    No, it is actually a .json file that has both the source and the target texts filled in. It's a file with product descriptions: I've been asked to review these products, which the brand has already translated, and add them to my TM so that the sentences can be reused for new products with matching information in future translation projects. I can send you the example I used in the post, but this is what the file I used to create the project looks like:

    Screenshot of a .json file in Trados Studio showing source and target texts for product descriptions in French and Spanish with article IDs and operation code redacted.

    I created the project with this JSON file, and what I got in the Trados editor is what you can see in the screenshot in my opening post. Normally, when I create a translation project with a .json file in which all the targets are empty, I get the normal segmentation at every full stop or line break, but doing the same thing with this file gave me segmentation based only on the "articleID" shown in the .json screenshot.

    I don't know if this sheds any light on my issue... Thank you for your help!

  •  

    Please can you explain how you managed to get source and target extracted from this file in Studio out of the box if it is a JSON file? Studio, as far as I am aware, will only extract all the strings into the source. So how did you do this? I feel as though I'm either missing something I never thought was possible, or you are using a plugin, or perhaps you did something else altogether?

    It would help to understand how you have processed this file to get both languages into the Studio editor?

    Paul Filkin | RWS Group

  • Hi!

    Sure! The client sent me the .json file with both source and target already filled in (as in the screenshot) so that I could review the content and update my TM with it. I only received the .json file, not a Trados project or anything else. The only thing I did in Trados was create a normal project with this file, without any plugins or manipulation at all. However, the segmentation I get isn't the same as with a .json file that has empty targets, and I need it to be the same (mainly at full stops and line breaks).

    I hope this helps, but let me know if I can give you any other info that could clarify my issue.

    Thank you!

  • Oh, and also, I'm going to paste in here a screenshot of the parsing settings that I have in my json configuration. I've had it for a while now, I've used it many times with the json files that have empty targets with no segmentation issues. This configuration is not the Trados default one:

    Trados Studio screenshot showing JSON file parsing settings with 'Enable path filter' checked and path filter rules for source and target path expressions.

    This is how Studio extracts source and target from my json files.

  •  

    ok...

    This configuration is not the Trados default one:

    Not just the configuration! This is an app from SuperText, one we don't actually have on the appstore at all at the moment... although I wish we did.  I really like their implementation of the JSON filetype as it's a lot better than the out of the box solution.

    So I think your problem is going to be related to what I said here:

    once the target has been translated it's unlikely you'll get a fully resegmented file anyway because the software is based on source segmentation and not target.  So I would imagine any filetype you have that was capable of managing this resegmentation in the editor would probably dump all the target in one segment anyway...or simply not resegment if the translation is present.

    So this means that to do what you want you need to do this:

    save the target file, and also the source, so you have the actual json files.  Then align them to get a TM based on segmentation the way you'd prefer to have it, and finally create a new project from the source json file and pretranslate from your new TM.

    Paul Filkin | RWS Group

  • Good morning, Paul!

    Thank you so much for your help! I wasn't aware that I had an app from SuperText, otherwise I would have let you know for sure.

    I've followed your steps. First I tried the source file saved via "File > Advanced Save", but I got the same JSON with the targets already filled in, and the alignment gave me the same result without the segmentation. Then I manually emptied the "target" texts in a copy of the source JSON, used that as the source, and aligned it with the exported target, but this is what it looks like:

    Source text with emptied targets:

    Screenshot of a JSON file with source text in French and empty target fields for translation.

    Alignment:

    Trados Studio alignment window showing source text in French on the left and an unsegmented target text area on the right.

    It does segment the source, but the target remains unsegmented, and it hasn't aligned with the translation but with the same source. This is what the target file that I used for the alignment actually looks like (the target text in Spanish is present in the JSON but not in the Trados alignment):

    Screenshot of a JSON file with source text in French and corresponding target text in Spanish filled in.

    For some reason, Trados isn't detecting the actual target and the alignment doesn't work. Could this be related to the SuperText app?

    Thank you so much!

  •  

    The easiest way to tackle this is probably this... maybe other ways... but this is how I would do it if I had to address it right now:

    1. open the json file in a text editor
    2. save it with the name source.json
      1. search and replace this string:
        ^\s+"target".+?\n
        with nothing
      2. save the file
    3. open the original json and save it with the name target.json
      1. search and replace this:
        ^\s+"source".+?\n
        with nothing
      2. search and replace this object:
        "target"
        with this:
        "source"
      3. save the file

    Now you have two JSON files, each containing a single language and only one translatable object called "source". One has a FR "source" and the other an ES "source". This means you can now align these two files with a single filetype setting that extracts the "source" object.
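
For anyone who would rather script this preparation than run the search-and-replace by hand, here is a minimal Python sketch of the same split. It assumes the file is a list of objects with "articleID", "source" and "target" keys, which is only a guess based on the screenshots; the sample data and the helper name keep_one_language are hypothetical. It parses the JSON instead of using the regular expressions above, so it is an alternative to those steps, not a copy of them.

```python
import json


def keep_one_language(node, keep_key):
    """Copy a parsed JSON value, keeping only one language per object.

    The kept text is always stored under "source", mirroring the renaming
    step described above; the other language is dropped.
    """
    if isinstance(node, dict):
        result = {}
        for key, value in node.items():
            if key in ("source", "target"):
                if key == keep_key:
                    result["source"] = value
            else:
                result[key] = keep_one_language(value, keep_key)
        return result
    if isinstance(node, list):
        return [keep_one_language(item, keep_key) for item in node]
    return node


# Hypothetical sample; the real file layout is only known from a screenshot.
sample = """
[
  {"articleID": "A1", "source": "Bonjour. Merci.", "target": "Hola. Gracias."},
  {"articleID": "A2", "source": "Au revoir.", "target": "Hasta luego."}
]
"""

original = json.loads(sample)
source_only = keep_one_language(original, "source")  # FR text under "source"
target_only = keep_one_language(original, "target")  # ES text under "source"

# ensure_ascii=False keeps accented characters readable in the output.
source_json = json.dumps(source_only, ensure_ascii=False, indent=2)
target_json = json.dumps(target_only, ensure_ascii=False, indent=2)
print(source_json)
print(target_json)
```

Writing source_json and target_json to source.json and target.json (opened with encoding="utf-8") gives the two single-language files described above. Because the text is re-serialized by the json module, the output stays valid JSON whichever property came last in each object.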

    When you finish aligning the FR with the ES you will be able to update into an FR-ES translation memory.

    Finally you take the original json file containing both source and target and search replace this:

    ("target":\s+).+?(,\n)

    With this:

    $1""$2

    Now you have a JSON file that you can open in Trados Studio with your filetype and that will segment properly. When you pre-translate from the TM you created above, the target will be populated so you can review it and finally save the file.

    Might sound complicated, but in my head this would be the most straightforward way to go.
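
The blanking step can also be run outside a text editor. The sketch below applies the same pattern with Python's re module; the only change is that Python writes the replacement as \1""\2 rather than $1""$2. The sample text and the function name blank_targets are hypothetical, and the sample assumes "target" is followed by another property, since the pattern requires a trailing comma.

```python
import json
import re

# The pattern from the post, unchanged. Python writes the replacement groups
# as \1 and \2 where many text editors use $1 and $2.
BLANK_TARGET = re.compile(r'("target":\s+).+?(,\n)')


def blank_targets(text):
    """Replace every "target" value that is followed by a comma with ""."""
    return BLANK_TARGET.sub(r'\1""\2', text)


# Hypothetical sample in which "target" is not the last property.
sample = (
    '{\n'
    '  "source": "Bonjour. Merci.",\n'
    '  "target": "Hola. Gracias.",\n'
    '  "articleID": "A1"\n'
    '}\n'
)

blanked = blank_targets(sample)
parsed = json.loads(blanked)  # raises json.JSONDecodeError if the edit broke the file
print(blanked)
```

Parsing the result with json.loads is a cheap way to confirm the file is still valid before opening it in Trados Studio.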

    Paul Filkin | RWS Group

  •  

    I added a quick video to explain in case this looks overly complicated:

    Paul Filkin | RWS Group

  • Good morning, Paul

    Thank you so much for your step-by-step explanations and the video, I really appreciate them! I tried your tutorial with a couple of files yesterday and the day before, in between other urgent translations, and the step of unchecking the SuperText filetype was definitely what had been blocking me before your video. I've managed to do the alignment and update the TM, but since it was my first time using regular expressions, I wouldn't have succeeded without your help.

    I got a bit stuck preparing the "source.json" file. After replacing "^\s+"target".+?\n" with nothing, a spare comma was left over that prevented the alignment. I solved it by replacing a comma followed by a line break and "}," with just "}," and then saving. I'll leave a screenshot here in case somebody else has a similar issue. I don't know if there is a better way to solve this with the regular expression, but this worked for me with 2 different files and alignments.

    Screenshot of Trados Studio's 'Replace' function showing the 'Find what' field with ', ' and the 'Replace with' field with ','.
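
A possible explanation for the spare comma, assuming "target" was the last property in each object: deleting that line leaves the comma at the end of the preceding "source" line, and JSON does not allow a comma before a closing brace. The Python sketch below is one way to clean that up in a single pass while keeping the line breaks; the sample text and the function name strip_trailing_commas are hypothetical.

```python
import json
import re

# A comma followed only by whitespace and then a closing brace or bracket.
TRAILING_COMMA = re.compile(r',(\s*[}\]])')


def strip_trailing_commas(text):
    """Remove commas that directly precede a closing } or ]."""
    return TRAILING_COMMA.sub(r'\1', text)


# Hypothetical result of deleting the "target" line when it was the last property.
broken = (
    '[\n'
    '  {\n'
    '    "articleID": "A1",\n'
    '    "source": "Bonjour. Merci.",\n'
    '  },\n'
    '  {\n'
    '    "articleID": "A2",\n'
    '    "source": "Au revoir.",\n'
    '  }\n'
    ']\n'
)

fixed = strip_trailing_commas(broken)
data = json.loads(fixed)  # raises json.JSONDecodeError if the text is still invalid
print(data)
```

One caveat: the pattern does not know about JSON strings, so a comma inside a text value that happens to be followed by a closing brace would be removed as well. For product descriptions that is unlikely, but it is worth a quick look at the result.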


    I also got stuck on the third file (the one where we only want to empty the target), because I got an error. I solved it by taking the comma out of the expression:

    ("target":\s+).+?(,\n) -> ("target":\s+).+?(\n)

    But I don't know if this would work for every case.
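
To see when each version of the expression applies, the Python sketch below runs three variants on two hypothetical lines: one where "target" is followed by a comma and one where it is the last property. The third variant, with an optional comma and an optional carriage return, is an assumption added here for illustration and does not come from the thread.

```python
import re

ORIGINAL = re.compile(r'("target":\s+).+?(,\n)')      # requires a comma
NO_COMMA = re.compile(r'("target":\s+).+?(\n)')       # comma taken out of the pattern
OPTIONAL = re.compile(r'("target":\s+).+?(,?\r?\n)')  # comma and CR both optional

REPLACEMENT = r'\1""\2'

# Hypothetical lines: "target" followed by another property, and "target" last.
with_comma = '  "target": "Hola. Gracias.",\n'
without_comma = '  "target": "Hola. Gracias."\n'

variants = (("original", ORIGINAL), ("no comma", NO_COMMA), ("optional", OPTIONAL))
for name, pattern in variants:
    # repr() makes the comma and the line ending visible in the output.
    print(name)
    print("  with comma:   ", repr(pattern.sub(REPLACEMENT, with_comma)))
    print("  without comma:", repr(pattern.sub(REPLACEMENT, without_comma)))
```

In this sketch the original pattern leaves the line without a comma untouched, the variant without the comma drops the comma when one is present (which would break the JSON if another property follows), and the optional-comma variant keeps whatever line ending was there. So the edited expression should be fine for files where "target" is always the last property, but it is worth validating the result for other layouts.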


    Thank you so much for your assistance and your time, Paul!

  •   

    I'm glad that helped.

    I've had a bit of a block with the "source.json" file preparation. After replacing "^\s+"target".+?\n" with nothing, there was a spare comma that didn't let me do the alignment. I've managed to solve it by replacing ",

    There could be a couple of reasons for this:

    1. I made up the test file, so my file could have been different to yours, especially with the non-printable spaces at the end
    2. Perhaps the syntax for the flavour of regex you are using in the tool you used is different?
    3. Also worth checking curly vs straight quotes... a common problem when transferring characters between applications

    Whatever the reason I'm sure your posts will be helpful to others in the future.

    Paul Filkin | RWS Group
