Trados Studio adds extra line breaks to source MS Word file, breaking formatting

Trados is adding extra line breaks and breaking the formatting of my source file. The first screenshot shows the original file opened in Word, and the second shows the same file after passing through Studio. How do I solve this?

First screenshot shows a product information page for 22-03 BRICK, with text in Portuguese, images of guitar amplifier knobs, and sections titled CONTROLES and ESPECIFICACOES.

Second screenshot is similar to the first, displaying the same product information page for 22-03 BRICK with text in Portuguese and images of guitar amplifier knobs.



[edited by: RWS Community AI at 1:05 AM (GMT 0) on 22 Jan 2025]
  •  

    From what I can see, the source was a PDF, so the problem seems to be the PDF conversion and not Studio. To avoid such problems, ask for the original source file or convert the PDF more carefully. The problem with PDFs is always the same: an automatic conversion will never fully meet the real need. It is always better to get the original source or to convert the PDF manually.

    _________________________________________________________

    When asking for help here, please be as accurate as possible. Please always remember to give the exact version of product used and all possible error messages received. The better you describe your problem, the better help you will get.

    Want to learn more about Trados Studio? Visit the Community Hub. Have a good idea to make Trados Studio better? Publish it here.

  • Thank you for your answer, but the source file is actually an MS Word document (.docx), not a PDF. 

  •  

    Can you tell me how you see extra paragraph breaks in your document? In your screenshot I do not see any paragraph breaks, as non-printable characters are not shown.

  • For example, in screenshot 2, compared to screenshot 1, there are 2 extra blank lines between "CONTROLES" and "VOICE", 2 extra blank lines between "VOICE" and the paragraph below it ("Aumenta a resposta de harmônicos..."), 2 extra blank lines between "GAIN" and the paragraph below it ("Aumenta a saturação global no sentido horário..."), etc.  

  •  

    Please show the non-printable characters. Only then can you really count paragraph breaks. There may be different characters there, causing the layout to change. Your target text is also not exactly as long as the source, and this causes changes.

  •  

    Perhaps it would be helpful to see the sdlxliff you have translated. Can you share that? If not, and you can only share screenshots, please show a screenshot of the Editor view with non-printable characters showing, using the All Content filter instead of All Segments.

    Paul Filkin | RWS Group

    ________________________
    Design your own training!

    You've done the courses and still need to go a little further, or still not clear? 
    Tell us what you need in our Community Solutions Hub

  • I've taken a closer look at your second screenshot. It seems the line spacing of your formatting has changed. It might be that the standard style was overridden in the source file and restored in the target file. Anyway, without seeing the non-printable characters no one can tell what happened there. Please also make column breaks, page breaks, and text frames visible.

  • I've attached the original source Word file, the Trados source preview Word file, and the sdlxliff file.

    trados sdlxliff.zip

  •  

    Thank you for sending the files. I have to say I think this is a bug. The original file looks like this:

    Screenshot of a text box with the heading 'CONTROLES' and subheading 'VOICE' followed by Portuguese text, with extra symbols between words.

    In Studio it looks like this (showing All Content):

    Screenshot of Trados Studio interface showing HTML-like tags and formatting instructions around the headings 'CONTROLES' and 'VOICE', and the subsequent Portuguese text.

    Then just copying source to target and saving I get this in the target file:

    Screenshot of a text box with the heading 'CONTROLES' and subheading 'VOICE' followed by Portuguese text, with extra symbols and line breaks added.

    For some reason Trados Studio is adding a soft return and a hard return after the text and hard breaks, but only inside some of these text boxes. I'm not sure why.
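To compare the breaks without relying on screenshots, the paragraph marks and soft returns inside each text box can be counted directly from the DOCX package. The following is only a minimal sketch, not something Studio or Word provides: the function name `count_text_box_breaks` is made up for illustration, and it assumes the text boxes are stored as `w:txbxContent` elements in `word/document.xml`, which is how Word normally writes them.

```python
import io
import xml.etree.ElementTree as ET
import zipfile

# WordprocessingML main namespace used by word/document.xml.
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def count_text_box_breaks(docx_bytes):
    """Return one (paragraph_marks, soft_returns) pair per text box found.

    Hypothetical helper for illustration. Word often stores each text box
    twice (a modern drawing plus a legacy fallback), so duplicated pairs in
    the result are expected; nested boxes are counted in their parent too.
    """
    with zipfile.ZipFile(io.BytesIO(docx_bytes)) as archive:
        root = ET.fromstring(archive.read("word/document.xml"))

    counts = []
    for box in root.iter(f"{{{W_NS}}}txbxContent"):
        # Every w:p element ends in a paragraph mark (a hard return).
        paragraphs = sum(1 for _ in box.iter(f"{{{W_NS}}}p"))
        # A w:br with no type, or type "textWrapping", is a soft return;
        # page and column breaks are deliberately left out.
        soft_returns = sum(
            1
            for br in box.iter(f"{{{W_NS}}}br")
            if br.get(f"{{{W_NS}}}type") in (None, "textWrapping")
        )
        counts.append((paragraphs, soft_returns))
    return counts
```

Running it on the bytes of the original file and of the target saved from Studio, and comparing the two lists, should show which text boxes gained extra breaks.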

    Paul Filkin | RWS Group

  •  

    Even stranger: if I then translate the target file that has these additional breaks, the target created from that is perfect. It still includes the additional breaks, but it doesn't duplicate them.

    Paul Filkin | RWS Group

  •  

    This is strange, as the lines are already in the source file when I open it in Trados Studio. I have never seen anything similar before. However, the formatting of the source file is poor and gives the impression that it was converted from PDF. Unfortunately, I cannot help. The only solution I see here is to remove the paragraph breaks in the target file. I have no explanation for where they come from.

  • The original source docx file was indeed obtained by converting a PDF with OmniPage Ultimate; however, that doesn't explain why it looks different when I open it directly in Word versus when I open it in Trados Studio and then save it to docx, right? Studio shouldn't alter the formatting of the document.

  •  

    I have no explanation for it. As Paul already stated, this might be a bug. A very strange one, as in all my years of using Studio, since early 2009, I have never seen anything similar.

    Does this happen with other files on your side too or is this the only file with such strange behavior?

  •  

    I think I found the problem... but I don't know why!  If I delete the large image and then process the file the target is perfect:

    Screenshot of a text document with formatting issues in Trados Studio. Text is overlapped and unreadable with various symbols and numbers interspersed, indicating a possible conversion error.

    That is the target after roundtripping in Trados Studio.

    So it is probably easy to solve by removing the image, translating the file, and then putting the image back afterwards. Why it happens... I have no idea. But PDF conversions can certainly do odd things that we probably never tested for before.

    I'll log it with the support team so they can put it into the system to be resolved.  In the meantime you have a workaround, and it might be a problem you never ever see again!

    Paul Filkin | RWS Group

  •    

    In fact the problem is even easier than that!

    "The original source docx file was indeed obtained by converting a PDF with OmniPage Ultimate"

    I don't know much about that application, but I think the DOCX format it creates is not an up-to-date DOCX... at least not in all parts of the standard (whatever that is). If I simply open your original source file and then save it into a new folder, I first get prompted with this:

    Microsoft Word dialog box stating 'Your document will be upgraded to the newest file format.' with options to proceed or cancel.

    If I say yes and save it I then get a file that is a little larger, but it now processes perfectly well.

    So I don't think I can give this to the support team, because the first thing they will need to do is ensure the DOCX is a valid DOCX, based on what we would receive when it is created with Word.

    So the solution is simply to do that... open the file in Word, save it into a new folder, upgrading it on the way, and then translate the file in Trados Studio. It will no longer have this problem.

    One last thing... it's worth checking what version of Word you are using as it might be your Word that causes the problem instead of OmniPage.
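For anyone who wants to spot such files before importing them, here is a rough sketch of a check for the compatibility mode a DOCX declares. It rests on assumptions not confirmed in this thread: that Word's upgrade prompt is tied to the `compatibilityMode` entry in `word/settings.xml`, that a value of 15 means the current format, and that a missing entry means an older mode. The function name `compatibility_mode` is made up for illustration. Whether OmniPage's output actually differs in this setting was not verified here.

```python
import io
import xml.etree.ElementTree as ET
import zipfile

# WordprocessingML main namespace used by word/settings.xml.
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def compatibility_mode(docx_bytes):
    """Return the declared compatibilityMode as an int, or None if absent.

    Hypothetical helper for illustration. None is returned both when the
    setting is missing and when the package has no word/settings.xml.
    """
    with zipfile.ZipFile(io.BytesIO(docx_bytes)) as archive:
        if "word/settings.xml" not in archive.namelist():
            return None
        root = ET.fromstring(archive.read("word/settings.xml"))

    # The setting is stored as a w:compatSetting element whose w:name
    # attribute is "compatibilityMode" and whose w:val holds the number.
    for setting in root.iter(f"{{{W_NS}}}compatSetting"):
        if setting.get(f"{{{W_NS}}}name") == "compatibilityMode":
            value = setting.get(f"{{{W_NS}}}val")
            return int(value) if value is not None else None
    return None
```

If this returns `None` or a number below 15 for a source file, opening it in Word and saving an upgraded copy before translating, as described above, is probably worth trying first.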

    Paul Filkin | RWS Group
