Any advice on how to separate, in a Word file, source paragraphs followed sequentially by their translation

Hello,

I wonder if anyone may have a suggestion to solve this problem I’m facing:

I have received a large Word file in which each paragraph in the source language is followed by its translation, then the next source paragraph and its translation, and so on. That's the only document available.

Out of this file we need to create a translation memory with the translations provided within.

Using Alignment with the file as it is would be beyond messy, I think.

Besides brute force, is there by any chance some way of separating/extracting the source language from the translations?

Thank you in advance for any advice/suggestion you may have.

Gilberto

  • Too easy, eh? Well, let's hope so.
    This is a sample from one of the Word files:

    Important Notes: This document has information about the drugs covered by this plan. For more up-to-date information or if you have any questions, please call <Column X>

     

    Notas Importantes: Este documento tiene información sobre los medicamentos que cubre este plan. Para obtener información más actualizada o si tiene alguna pregunta, llame a Servicio al Cliente <Column AL> al:]

     

    Toll-free <Column J>, TTY <Column M>

     

    Llamada gratuita: <Column J>, TTY <Column M>

     

    If you are a member of a group sponsored plan (your coverage is provided through a former employer, union group or trust), please call the Customer Service number on the back of your member ID card.]

     

    Si usted es miembro de un plan patrocinado por un grupo (recibe su cobertura a través de un fideicomiso, sindicato o empleador anterior), llame al número de Servicio al Cliente que se encuentra en la parte de atrás de su tarjeta de ID de miembro.]

     

    What is a drug list?

    A drug list, or formulary, is a list of prescription drugs covered by your planPage3_WhatIsADrugList_UofCA_ENG33.0 [If (Column F = UofCA), then print, UC Medicare Choice]. Your plan and a team of health care providers work together in selecting drugs that are needed for well-rounded care and treatment.

     

    Your plan will generally cover the drugs listed in our drug list as long as:

    • The drug is used for a medically accepted indication,
    • The prescription is filled at a network pharmacy and
    • Other plan rules are followed.

     

    For more information about your drug coverage, please review your Evidence of Coverage.]]

     

    ¿Qué es una Lista de Medicamentos?

    Una Lista de Medicamentos, o Formulario, es una lista de los medicamentos con receta que cubre su plan. Su plan y un equipo de proveedores de cuidado de la salud colaboran en la selección de los medicamentos que se necesitan para ofrecer cuidado y tratamiento integrales.

     

    Su plan generalmente cubrirá los medicamentos incluidos en la Lista de Medicamentos, siempre y cuando:

    • El medicamento se use para una indicación médicamente aceptada,
    • La receta se surta en una farmacia de la red y
    • Se sigan otras reglas del plan.

     

    Para obtener más información sobre su cobertura de medicamentos, consulte su Evidencia de Cobertura.]

  • Hi Gilberto,

    Create 2 copies of the Word file, named to indicate first English and second Spanish.

    In the English document use Find and Replace as follows:

    Press Ctrl+H to open 'Find and Replace' on the Replace tab.

    With the cursor in 'Find what', choose Format > Language > Spanish (selecting the version of Spanish your document has)

    and

    with the cursor in 'Replace with', choose Format > Language > English (selecting the version of English your document has)

    Leave the 'Find what' box itself blank, so that only the language format is searched for, and in the 'Replace with' box type ^p

    Then click 'Replace All'.

    This finds every stretch of text marked as Spanish and replaces it with a paragraph mark, which should leave each English entry beginning on a new line.

    Finally, select the whole document and click the language indicator on the status bar at the bottom of the window. This opens the Language dialog, where you can mark the selected text as English. Then click OK.

    Repeat the process in the second file to delete the English text fully and make the whole document Spanish.

    Then you should be able to use Alignment to produce an SDLXLIFF.

    You can then check this in the Studio Editor with a new TM added so you can confirm each segment as you check it. Or simply import the SDLXLIFF to a new TM.

    You may have to use 'trial and error' to make the process work better depending on the textual content.
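Since this method depends on the language tags stored in the document, it may help to check how much of the text actually carries one before running Replace All. Below is a minimal Python sketch using only the standard library. The function name `paragraphs_with_language` is hypothetical, and the sketch assumes a .docx file whose body text sits in `word/document.xml`, with explicit tags stored as `w:lang` elements in the run properties. Runs with no explicit tag take their language from a style or the document defaults, which this sketch does not resolve.

```python
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML namespace used inside word/document.xml
W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
NS = {"w": W}


def paragraphs_with_language(docx_file):
    """Return (text, explicit_language_tags) for each non-empty paragraph.

    docx_file may be a path or a binary file-like object.
    An empty tag set means no run in the paragraph has an explicit w:lang,
    so the language comes from a style or the document defaults.
    """
    with zipfile.ZipFile(docx_file) as archive:
        root = ET.fromstring(archive.read("word/document.xml"))

    result = []
    for para in root.iter(f"{{{W}}}p"):
        text_parts = []
        tags = set()
        for run in para.iter(f"{{{W}}}r"):
            text_parts.extend(t.text or "" for t in run.findall("w:t", NS))
            lang = run.find("w:rPr/w:lang", NS)
            if lang is not None:
                value = lang.get(f"{{{W}}}val")
                if value:
                    tags.add(value)
        text = "".join(text_parts).strip()
        if text:
            result.append((text, tags))
    return result
```

Paragraphs that come back with an empty tag set, or with both an English and a Spanish tag, are the ones a language-based Find is most likely to handle unexpectedly.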

    See here for a description of Translation Alignment: www.trados.com/solutions/translation-alignment/

    Let us know if this works OK,

    All the best,

    Ali

  • If the text is indeed marked as English AND Spanish, you could also try to go for:

    Search for text marked as English (leave 'Find what' blank and set Format > Language > English)

    Replace with ^&^t

    This will add a tab character after each stretch of English text.

    The next step would be replacing ^t^p (or ^t followed by as many ^p paragraph marks as there are between the EN and ES text) with just a tab, ^t

    Now your text should have EN followed by a TAB followed (WITHOUT a paragraph mark) by ES. If so, select all the text and convert it to a table. From this table create a TMX.
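If the tab-separated text is saved as plain text instead of being converted to a table, the last step could be sketched in Python roughly as follows. This is only an illustration under assumptions: the function name `tab_pairs_to_tmx`, the tool name in the header, and the language codes `en-US` and `es-US` are placeholders, not anything taken from the actual files. It writes a bare TMX 1.4 structure with one translation unit per line and skips lines that have no tab or an empty side.

```python
import xml.etree.ElementTree as ET

# ElementTree serializes this qualified name as the xml:lang attribute
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def tab_pairs_to_tmx(lines, src_lang="en-US", tgt_lang="es-US"):
    """Build a TMX string from lines shaped like 'source<TAB>target'."""
    tmx = ET.Element("tmx", version="1.4")
    ET.SubElement(tmx, "header", {
        "creationtool": "tab-pairs-sketch",   # placeholder tool name
        "creationtoolversion": "0.1",
        "segtype": "paragraph",
        "o-tmf": "plain-text",
        "adminlang": "en-US",
        "srclang": src_lang,
        "datatype": "plaintext",
    })
    body = ET.SubElement(tmx, "body")

    for line in lines:
        source, sep, target = line.rstrip("\r\n").partition("\t")
        # Skip lines without a tab or with an empty side; review those by hand
        if not sep or not source.strip() or not target.strip():
            continue
        tu = ET.SubElement(body, "tu")
        for lang, text in ((src_lang, source), (tgt_lang, target)):
            tuv = ET.SubElement(tu, "tuv", {XML_LANG: lang})
            ET.SubElement(tuv, "seg").text = text.strip()

    return ET.tostring(tmx, encoding="unicode")
```

The returned string can be written to a .tmx file in UTF-8. Placeholders such as <Column J> are escaped as plain text here, not turned into tags.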

  • Thank you very much, Ali, for your suggestion.


    I tried to isolate the English and Spanish text the way you described, but unfortunately the document's formatting is very inconsistent: when searching for either Spanish or English text, Find missed a lot of text in both languages.

    Maybe I'll try a combination of whatever this method can identify as well as some brute force on the segments missed by the finder.
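For the paragraphs that a language-based Find misses, one option is a rough guess from the words themselves, leaving anything ambiguous for manual review. The sketch below is a crude heuristic, not real language detection: the word lists and the name `guess_language` are made up for illustration, and short lines consisting mostly of placeholders will come back as "unknown".

```python
import re

# Small hand-picked lists of common function words (illustrative only)
EN_WORDS = {"the", "and", "of", "to", "is", "are", "your", "you", "this",
            "for", "or", "by", "in", "if", "please", "as", "with"}
ES_WORDS = {"el", "la", "los", "las", "de", "del", "que", "y", "su", "para",
            "es", "un", "una", "se", "en", "al", "por", "o", "si", "con"}


def guess_language(paragraph):
    """Return 'en', 'es', or 'unknown' for one paragraph of text."""
    lowered = paragraph.lower()
    # Runs of letters only, so digits and punctuation are ignored
    words = re.findall(r"[^\W\d_]+", lowered)
    en_score = sum(word in EN_WORDS for word in words)
    es_score = sum(word in ES_WORDS for word in words)
    # Accented vowels, ñ and inverted punctuation count as Spanish evidence
    if re.search(r"[áéíóúñ¿¡]", lowered):
        es_score += 2
    if en_score == es_score:
        return "unknown"
    return "en" if en_score > es_score else "es"
```

With this, "What is a drug list?" is guessed as English and "¿Qué es una Lista de Medicamentos?" as Spanish, while a line like "Toll-free <Column J>, TTY <Column M>" stays "unknown" and would still need the brute-force pass.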

    Thanks again.

    Gilberto

  • Thank you, Jerzy. Excellent suggestion.

    Yesterday I tried the method of finding English or Spanish text, but unfortunately the document's formatting is very inconsistent and a lot of text was missed in both languages. That's the problem.

    As I mentioned to Alison: Maybe I'll try a combination of whatever this method can identify as well as some brute force on the segments missed by the finder.
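Once each paragraph has a language label, from Word's tags or from manual marking, the pairing itself could be sketched as below. The sample shows that an English block can span several paragraphs before its Spanish block starts, so this hypothetical `pair_blocks` function groups consecutive paragraphs of the same language and only pairs them one-to-one when both blocks have the same number of paragraphs. Everything else is returned separately for the brute-force pass. The input shape, a list of (language, text) tuples with "en" and "es" labels, is an assumption.

```python
from itertools import groupby


def pair_blocks(tagged):
    """Pair English paragraphs with the Spanish paragraphs that follow them.

    tagged: list of (lang, text) tuples in document order, lang is 'en' or 'es'.
    Returns (pairs, leftovers): pairs is a list of (en_text, es_text);
    leftovers is a list of (en_texts, es_texts) blocks that need manual work.
    """
    # Collapse consecutive paragraphs of the same language into blocks
    blocks = [(lang, [text for _, text in group])
              for lang, group in groupby(tagged, key=lambda item: item[0])]

    pairs, leftovers = [], []
    i = 0
    while i < len(blocks):
        lang, texts = blocks[i]
        has_translation = i + 1 < len(blocks) and blocks[i + 1][0] == "es"
        if lang == "en" and has_translation:
            es_texts = blocks[i + 1][1]
            if len(texts) == len(es_texts):
                pairs.extend(zip(texts, es_texts))
            else:
                # Paragraph counts differ, so do not guess the alignment
                leftovers.append((texts, es_texts))
            i += 2
        else:
            leftovers.append((texts, []) if lang == "en" else ([], texts))
            i += 1
    return pairs, leftovers
```

Matching on paragraph counts is a deliberately conservative choice: a block where the translator merged or split paragraphs ends up in the leftovers instead of producing misaligned pairs in the TM.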

    Thanks again.

    Gilberto

  • Hi Gilberto,

    If you're going to have to work hard to sort each file out, perhaps you should try the suggestion made by Jerzy?

    I think his recommendation would mean you have to work on only one document... is that correct, Jerzy?

    He's much more of an expert than I am.

    All the best,

    Ali