Extracting only translatable text from HTM file

Hi everyone,

A bit of forewarning: I’ve never worked with HTML/HTM/XML and I can’t seem to make sense of parsing and attributes and whatnot, so I apologize if this is a stupid question!

We’ve received a document in HTM format. It looks like this:

 

<table style="border-width: 0px; width: 100%; font-family: Times New Roman, Times, serif; font-size: 11pt; border-collapse: collapse; border-spacing: 0px;">
     <tbody>
         <tr>
             <td style="width: 6.5in;">
             <h2 style="font-family: Times New Roman, Times, serif; font-size: 11pt; font-weight: bold;"><span>4.3</span> <span style="padding-left: 0.78in;">Acceptance of Amendment</span></h2>
             </td>
         </tr>
         <tr>
             <td>&#160;</td>
         </tr>
     </tbody>
</table>

<table style="border-width: 0px; width: 100%; border-collapse: collapse; border-spacing: 0px;">
     <tbody>
         <tr style="font-family: Times New Roman, Times, serif; font-size: 11pt;">
             <td style="width: 1in; font-weight: bold;">&#160;</td>
             <td style="width: 5.5in; text-align: justify;">
             <div th:if="${policy.groupInfo.situsState == 'QC'}"><span>REDACTED may from time to time unilaterally amend this Policy. The Policyholder will be provided with a copy of the Amendment and the Effective Date of the Amendment.</span><br />

 

In this example, only the following needs to be translated: "Acceptance of Amendment" and "REDACTED may from time to time unilaterally amend this Policy. The Policyholder will be provided with a copy of the Amendment and the Effective Date of the Amendment." But when I run it through Trados Studio, I see all the text copied above. How do I extract only what needs to be translated?
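To make the goal concrete, here is a minimal sketch in Python of the kind of extraction described above. It works outside Trados Studio and uses only the standard library's `html.parser`. The names `TextExtractor`, `extract_text`, and `sample` are made up for illustration, and the sample is a shortened copy of the first table. It keeps the text between tags and drops the tags, the attributes, and cells that contain only a non-breaking space. It would also keep the section number "4.3", since that is text too.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect non-empty text nodes, ignoring tags and attributes."""

    def __init__(self):
        # convert_charrefs=True turns &#160; into a real non-breaking space.
        super().__init__(convert_charrefs=True)
        self.segments = []

    def handle_data(self, data):
        # Treat non-breaking spaces as ordinary spaces, then trim.
        text = data.replace("\u00a0", " ").strip()
        if text:
            self.segments.append(text)


def extract_text(html):
    """Return the list of text segments found in an HTML string."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return parser.segments


# Shortened, hypothetical sample based on the first table above.
sample = """
<table><tbody><tr><td style="width: 6.5in;">
<h2 style="font-weight: bold;"><span>4.3</span> <span style="padding-left: 0.78in;">Acceptance of Amendment</span></h2>
</td></tr><tr><td>&#160;</td></tr></tbody></table>
"""

print(extract_text(sample))
```

This only shows which parts of the file count as text. It does not write translations back into the file, which is what a translation tool would need to do.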

Thanks!
