HTML Codes within Excel File

Hello all,

I want Trados Studio to split the contents of an Excel cell into segments based on the HTML codes embedded in it.

For example, this cell content should be split into the segments listed below it:

<B>Product Features</B><BR>Throws are acrylic knitted<BR>Product size : 130x170 cm <BR>Product colour is beige. <BR><BR><B>Washing Recommendations</B><BR>Washable at 30 degrees.<BR>Do not bleach.<BR>Do not iron.

----------------------

Product Features
Throws are acrylic knitted
Product size : 130x170 cm
Product colour is beige. 
Washing Recommendations
Washable at 30 degrees.
Do not bleach.
Do not iron.

--------
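
To make the expected result concrete, here is a minimal Python sketch of the split shown above. It is not how Studio processes the file; it only illustrates the intended segmentation, assuming `<BR>` marks a segment break and other tags such as `<B>` are inline formatting. The function name `split_cell` is hypothetical.

```python
import html
import re


def split_cell(cell: str) -> list[str]:
    """Split one cell's HTML content into plain-text segments."""
    # Assumption: <BR> (any case, optional slash) is the only segment break.
    parts = re.split(r"(?i)<br\s*/?>", cell)
    segments = []
    for part in parts:
        # Drop remaining inline tags such as <B>...</B>.
        text = re.sub(r"<[^>]+>", "", part)
        # Decode HTML character codes such as &amp; or &#176;.
        text = html.unescape(text).strip()
        if text:  # consecutive <BR><BR> produce empty parts; skip them
            segments.append(text)
    return segments


sample = (
    "<B>Product Features</B><BR>Throws are acrylic knitted"
    "<BR>Product size : 130x170 cm <BR>Product colour is beige. "
    "<BR><BR><B>Washing Recommendations</B><BR>Washable at 30 degrees."
    "<BR>Do not bleach.<BR>Do not iron."
)

for segment in split_cell(sample):
    print(segment)
```

Run against the sample cell, this yields the eight lines listed above, which can serve as a quick check of what the file type settings should produce.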

This is a sample file:

sample-br.xlsx

Thanks

  • Maybe this is an 'overkill' solution, but I can see that your sample also contains non-HTML columns. And while the Embedded Content solution provided by Paul does the trick for identifying HTML tags, it does not recognise HTML character codes (in case you have any in your files).

    So, just my two cents:

    For such files, we use the XML options in Excel's Developer tab:

    1. Use Notepad++ to create a simple XML file with the names of the Excel columns as elements, which looks like this:

              <?xml version="1.0" encoding="UTF-8" standalone="yes"?>
              <File xmlns:xsi="www.w3.org/.../XMLSchema-instance">
              <Element>
              <Column1>a</Column1>
              <Column2>a</Column2>
              </Element>
              <Element>
              <Column1>a</Column1>
              <Column2>a</Column2>
              </Element>
              </File>

    2. Use the Source button in Developer > XML in Excel to add this XML map to the Excel file for translation. Then drag and drop each of the XML elements from the XML map onto the respective Excel column heading (which will create a table).

    3. Click Export in Developer > XML to export your table to an XML file.

    4. In Studio, create specific XML file type settings. That way you can configure which elements / columns need to be translated (maybe not all Excel columns need to be translated), and you can add document structure information to elements. You can then use this document structure information to have HTML content processed by Studio's embedded "Html Embedded Content 5 2.0.0.0" processor, which recognises both HTML tags and HTML character codes. Non-HTML columns will not be processed as containing HTML.

    5. After translation of the XML file, just open the Excel file and click Developer > XML > Import to import the translated XML file.

    It's a bit of a long process, I know, but once you know how it works, we have found the results to be worth the effort.

    It won't work for Excel files with multiple languages either, of course...

    Best,
    Lieven
