Any advice on how to separate, in a Word file, source paragraphs followed sequentially by their translation

Hello,

I wonder if anyone may have a suggestion to solve this problem I’m facing:

I have received a large Word file in which a paragraph in the source language is followed by the corresponding translation, followed by another paragraph in the source language, which is followed by its translation, and so on. That’s the only document available.

Out of this file we need to create a translation memory with the translations provided within.

Using Alignment with the file as it is would be beyond messy, I think.

Besides brute force, is there by any chance some way of separating/extracting the source language from the translations?

Thank you in advance for any advice/suggestion you may have.

Gilberto

  • Hi 

    Create 2 copies of the Word file, naming the first to indicate English and the second to indicate Spanish.

    In the English document use Find and Replace as follows:

    Ctrl+H - opens 'Find and Replace' to the Replace tab.

    Find what: > Format > Language > Spanish (selecting the version of Spanish your document has)

    and

    Replace with: > Format > Language > English (selecting the version of English your document has)

    Leave the 'Find what' line blank and in the 'Replace with' line, type ^p

    Then click 'Replace All'

    This runs a Find and Replace on the Spanish text and replaces it with a paragraph mark, which should then leave each English entry beginning on a new line.

    Finally, highlight the whole document and double-click on the language title on the bottom bar, which opens the Language dialog where you can 'Mark selected text' as English. Then click OK.

    Repeat the process in the second file to delete the English text fully and make the whole document Spanish.

    Then you should be able to use Alignment to produce an SDLXLIFF.

    You can then check this in the Studio Editor with a new TM added so you can confirm each segment as you check it. Or simply import the SDLXLIFF to a new TM.

    You may have to use 'trial and error' to make the process work better depending on the textual content.

    See here for a description of Translation Alignment: www.trados.com/solutions/translation-alignment/

    Let us know if this works OK,

    All the best,

    Ali

  • If the text is indeed marked as English AND Spanish, you could also try to go for:

    Search for English (that is, text marked with the English language attribute)

    Replace with ^&^t

    This will add a tab character after the English text.

    The next step would be replacing ^t^p (or ^t followed by as many ^p paragraph marks as there are between the EN and ES text) with just a tab, ^t

    Now your text should have EN followed by TAB followed (WITHOUT a paragraph mark) by ES. If so, select all the text and generate a table out of it. From this table create a TMX.
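    If the tab-separated text is saved as plain text instead of being converted to a table, the last step could also be scripted. The following is only a minimal sketch: the function names, the language codes (en-US, es-ES) and the header values are assumptions for illustration, and the resulting file should be checked in Studio before relying on it.

```python
import xml.etree.ElementTree as ET

# ElementTree writes this qualified name out as the attribute "xml:lang".
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def parse_tab_lines(text):
    """Turn 'EN<TAB>ES' lines into (source, target) tuples.

    Lines without a tab, or with an empty side, are skipped.
    """
    rows = []
    for line in text.splitlines():
        if "\t" not in line:
            continue
        src, tgt = line.split("\t", 1)
        if src.strip() and tgt.strip():
            rows.append((src.strip(), tgt.strip()))
    return rows


def rows_to_tmx(rows, src_lang="en-US", tgt_lang="es-ES"):
    """Build a TMX 1.4 document (as a string) from (source, target) rows.

    The language codes and header values are placeholders (assumptions).
    """
    tmx = ET.Element("tmx", version="1.4")
    ET.SubElement(tmx, "header", {
        "creationtool": "tsv2tmx-sketch",  # hypothetical tool name
        "creationtoolversion": "0.1",
        "segtype": "paragraph",
        "o-tmf": "plain-text",
        "adminlang": "en-US",
        "srclang": src_lang,
        "datatype": "plaintext",
    })
    body = ET.SubElement(tmx, "body")
    for src, tgt in rows:
        tu = ET.SubElement(body, "tu")
        for lang, segment in ((src_lang, src), (tgt_lang, tgt)):
            tuv = ET.SubElement(tu, "tuv", {XML_LANG: lang})
            ET.SubElement(tuv, "seg").text = segment
    return ET.tostring(tmx, encoding="unicode")
```

    To use it, read the exported text file, pass its content to parse_tab_lines, and write the string returned by rows_to_tmx to a .tmx file encoded as UTF-8.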

    _________________________________________________________

    When asking for help here, please be as accurate as possible. Please always remember to give the exact version of product used and all possible error messages received. The better you describe your problem, the better help you will get.

    Want to learn more about Trados Studio? Visit the Community Hub. Have a good idea to make Trados Studio better? Publish it here.

  • Thank you, Jerzy. Excellent suggestion.

    Sometime yesterday I tried the method of finding English or Spanish text, but unfortunately the document is very inconsistently formatted and a lot of text was missed in both languages. That's the problem.

    As I mentioned to Alison: Maybe I'll try a combination of whatever this method can identify as well as some brute force on the segments missed by the finder.

    Thanks again.

    Gilberto

  • In that case maybe you could start a different way. It will require some manual work, but might give you what you need.

    First, make sure you see ALL non-printable characters including hidden text. Then go through the document, select the English paragraphs and press CTRL+SHIFT+H. This will format the text as hidden. You can obviously also use any other text attribute which will not change formatting. An option would be highlighting the text. When done, save the document. Now search for the text attribute you used. Replace with ^&^t and then use the other part I suggested before. When done, remove the text attribute you added. In case you used "hide", run a search and replace for "hidden" and replace with "not hidden". For this replacement both search and replace fields are simply kept empty.

    I understand this is a lot of work, but at the end of the day this could bring you what you need, as creating a TMX from a table is a piece of cake.
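    The pairing logic behind this approach can be sketched in a few lines, which may help with spotting paragraphs that have no counterpart. This is only an illustration under assumptions: it presumes the paragraphs have already been extracted as (text, is_source) tuples, where is_source is True for the paragraphs you marked as English, and the function name is hypothetical.

```python
def pair_marked_paragraphs(paragraphs):
    """Pair each marked (source) paragraph with the unmarked one after it.

    paragraphs: list of (text, is_source) tuples in document order.
    Returns (pairs, unmatched); unmatched collects paragraphs that have
    no counterpart, so they can be reviewed by hand.
    """
    pairs = []
    unmatched = []
    pending = None  # source paragraph still waiting for its translation
    for text, is_source in paragraphs:
        text = text.strip()
        if not text:
            continue  # ignore empty paragraphs
        if is_source:
            if pending is not None:
                # Two source paragraphs in a row: the first has no translation.
                unmatched.append(pending)
            pending = text
        elif pending is not None:
            pairs.append((pending, text))
            pending = None
        else:
            # A translation with no source paragraph before it.
            unmatched.append(text)
    if pending is not None:
        unmatched.append(pending)
    return pairs, unmatched
```

    The unmatched list corresponds to the places where the tab and paragraph replacement would produce broken rows, so those are the spots to fix manually before building the table.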