Trados 2022: The document cannot be processed since it contains unexpected contents

Hello,

I am having problems uploading an Excel file with HTML content to Trados. I get this error: "The document cannot be processed since it contains unexpected contents".
In addition, I have configured the embedded content settings in the project, but it does not work.
How can I configure Trados so that it accepts the file and treats the HTML markup as tags? What rules should I use?

Thank you.

  •  

    Does this only happen when you have the embedded content set up?  If so, the problem could be related to the rules you created.  Perhaps you can share a small sample of the Excel file and also the rules you created?

    Paul Filkin | RWS Group

    ________________________
    Design your own training!

    You've done the courses and still need to go a little further, or still not clear? 
    Tell us what you need in our Community Solutions Hub

  • Hi Paul,

    This is the type of content that is giving me an error message:

    <tbody>
    <tr>
    <th colspan="2">Características principales</th>
    </tr>
    <tr>
    <th>Talla:</th>
    <td>43,47,51</td>
    </tr>
    <tr>
    <th>Horquilla:</th>
    <td>Bianchi Steel, 1.1/8"</td>
    </tr>
    <tr>
    <th>Cuadro:</th>
    <td>Spillo Alloy 6061, 1.1/8" HT, OLD 135mm</td>
    </tr>
    </tbody>

    Also this:

    Se incluye el accesorio Kevlar Core.&nbsp;<br><br><span style="font-weight: bold;">Especificaciones:</span></p><ul><li>Peso: 170 g</li><li>Materiales:&nbsp;Alu 7075, POM, acero inoxidable</li><li>DIN: 5-10</li><li>Modos de marcha:&nbsp;Plano / +45 mm&nbsp; / +60 mm</li><li>Rango de ajuste de longitud: 30 mm</li><li>Recorrido elástico el talón: No</li><li>Freno de esquí: Único</li><li>Sistema de freno: No</li><li>Ideales para:&nbsp;Anchura esquís: 60 - 95 mm /&nbsp;Peso de los esquís: 800 - 1500 g /&nbsp;Peso del esquiador: 50 - 85 kg</li></ul><p><br><span style="font-weight: bold;">Tecnologías:</span></p><ul><li>CAM Release System</li></ul><p>

    And I have these rules configured. How should they look?

    </?[\p{Ll}\p{Lu}]\w*[^<>]*>

    <[a-z][a-z0-9][^<>]>

    </[a-z][a-z0-9][^<>]>

    Thanks.

  •   

    So, the first rule is the default.  This one will simply turn all tags into placeholders, like this:

    Screenshot of a side-by-side view in Trados Studio showing the Spanish product specifications ("Talla", "Horquilla", "Cuadro", "Peso") with the HTML tags ("tbody", "tr", "th", "td", "br", "span") and non-breaking spaces displayed inline.

    The other two you have created do nothing for the sample text you provided since they don't recognise anything in there.
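As a quick way to see this outside Studio, here is a minimal sketch that runs the two custom rules over a shortened copy of the sample. It uses Python's `re` module, which is an assumption for illustration only and may differ in details from the regex engine Studio uses; the variable names are hypothetical. As posted, each pattern only matches a tag with exactly three characters between the angle brackets (for example `<pre>`), and the sample contains none.

```python
import re

# Shortened copy of the sample content posted above.
sample = (
    '<tbody><tr><th colspan="2">Características principales</th></tr>'
    '<tr><th>Talla:</th><td>43,47,51</td></tr></tbody>'
    'Se incluye el accesorio Kevlar Core.&nbsp;<br><br>'
    '<span style="font-weight: bold;">Especificaciones:</span></p>'
    '<ul><li>Peso: 170 g</li></ul><p>'
)

# The two custom rules exactly as posted in the thread.
custom_rules = [
    r'<[a-z][a-z0-9][^<>]>',
    r'</[a-z][a-z0-9][^<>]>',
]

# Collect every match of each rule in the sample.
matches = {rule: re.findall(rule, sample) for rule in custom_rules}
for rule, found in matches.items():
    print(rule, '->', found)

# A tag with exactly three characters inside the brackets does match.
three_char_match = re.findall(custom_rules[0], '<pre>')
```

Both rules come back empty for the sample, which is consistent with them not picking up any of the tags in it.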

    So, you either work with the placeholders and create additional placeholders for anything not picked up, or you get a little smarter.  You could use a tag pair like this and remove the default rule completely (I think the default rule is partly the reason for your error, as you potentially have overlapping rules):

    Start tag:
    <[^/>]+?>

    End tag:
    </[^>]+?>

    Set the Advanced to "Exclude".  Also add a placeholder for the non-breaking space:

    &nbsp;

    And then you get something like this:

    Screenshot of a two-column preview of the Spanish product specifications ("Talla", "Horquilla", "Cuadro", "Freno de esquí", "Peso", "Materiales"), with some "&nbsp;" character references visible in the text.
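The start tag, end tag, and placeholder patterns suggested above can also be tried on a fragment of the sample before entering them in Studio. This is only a sketch in Python's `re` module (an assumption; Studio's own engine may behave differently in edge cases), and the names `START_TAG`, `END_TAG`, and `NBSP` are hypothetical labels for the three patterns.

```python
import re

# Patterns from the reply above.
START_TAG = r'<[^/>]+?>'   # opening tags: no "/" anywhere inside
END_TAG = r'</[^>]+?>'     # closing tags: start with "</"
NBSP = r'&nbsp;'           # placeholder for the non-breaking space entity

# Fragment of the second sample posted in the thread.
sample = (
    'Se incluye el accesorio Kevlar Core.&nbsp;<br><br>'
    '<span style="font-weight: bold;">Especificaciones:</span></p>'
    '<ul><li>Peso: 170 g</li></ul>'
)

starts = re.findall(START_TAG, sample)
ends = re.findall(END_TAG, sample)

# Whatever is left after removing tags and entities is the text
# that would remain for translation.
combined = '|'.join([START_TAG, END_TAG, NBSP])
text = [part for part in re.split(combined, sample) if part.strip()]

print(starts)
print(ends)
print(text)
```

In this fragment every tag is picked up by one of the two tag patterns, and only the three Spanish text runs remain.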

    Maybe that will help you?

    Paul Filkin | RWS Group
