CDATA section causes end of segment - how do I have to modify my rules?

Former Member

Hello all!

I would like to hear your advice on this XML with unusual CDATA sections.
It's not an urgent problem, because I already found a workaround, but I'd love to know if I could have done it straight away with the SDL Studio 2015 file type rules.

This is the XML I'd like to prepare:

<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<root>
<FORMAT name="p"/>
<FORMAT name="p">Some text.  Mehr auf dieser <LINK linktemplate="externer_link" type="genericLink"><![CDATA[PFRFnothingtotranslatehereSFDGERT=]]>
<CMS_VALUE name="lt_linktext">Website</CMS_VALUE>
<CMS_VALUE name="lt_link">www.umweltbundesamt.de/</CMS_VALUE>
</LINK>.
</FORMAT>
</root>

As you can see, there is an inline CDATA section that must not be translated.
I managed to create the rules, but without my workaround I cannot prevent Studio from splitting my segment before the CDATA section.

This is my set of rules:



The segmentation hint for //LINK is "include".

The output looks more or less like this:

My workaround is:

<LINKCDATA hide=" replaces <![CDATA[

and

"></LINKCDATA> replaces ]]>

plus an additional rule for //LINKCDATA: not translatable, inline.

The result is a single segment, not two.
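
The search-and-replace above could be scripted roughly like this. It is only a sketch: the function names hide_cdata and restore_cdata are made up for illustration, and the escaping of the CDATA content (so that characters like " or < cannot break the hide attribute) is an added assumption that the manual workaround did not include. It also assumes that Studio writes the placeholder back unchanged as <LINKCDATA hide="..."></LINKCDATA>.

```python
import re
from xml.sax.saxutils import escape, unescape

# Quotes need extra handling because the content ends up in an attribute value.
_QUOTE = {'"': "&quot;"}
_UNQUOTE = {"&quot;": '"'}

CDATA_RE = re.compile(r"<!\[CDATA\[(.*?)\]\]>", re.DOTALL)
PLACEHOLDER_RE = re.compile(r'<LINKCDATA hide="(.*?)"></LINKCDATA>', re.DOTALL)


def hide_cdata(xml_text: str) -> str:
    """Before import into Studio: replace each CDATA section with a placeholder element."""
    return CDATA_RE.sub(
        lambda m: '<LINKCDATA hide="{}"></LINKCDATA>'.format(escape(m.group(1), _QUOTE)),
        xml_text,
    )


def restore_cdata(xml_text: str) -> str:
    """After export from Studio: turn the placeholders back into CDATA sections."""
    return PLACEHOLDER_RE.sub(
        lambda m: "<![CDATA[{}]]>".format(unescape(m.group(1), _UNQUOTE)),
        xml_text,
    )
```

Run hide_cdata on the source file before opening it in Studio and restore_cdata on the exported target file; the //LINKCDATA rule (not translatable, inline) is still needed in the file type settings.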

But I'd really like to know whether I'm overcomplicating this...

As I was writing this post, I found this article:

community.sdl.com/.../6616

The solution indicated there does not seem to apply in my case. Should it?


Thank you,
Andreas

  • Hi Andreas,

    This is not going to be possible, as CDATA sections are always treated as structure, which means they always start a new segment in the editor. I don't know how the example you quoted worked either... the original poster didn't respond... but I don't think it would; at least, I spent ten minutes trying to make sure and it won't work for me!

    So your workaround is the only way to do this.

    Regards

    Paul

    Paul Filkin | RWS Group

  • Former Member in reply to Paul

    Hello Paul, hello community,

    This code keeps giving me a headache.

    The workaround is fine, but there is a different problem now:

    I processed a few of these files in the past, apparently without problems, but the latest one would not import into the source application anymore.

    The reason is the way SDL Studio writes the target XML:

    IN:

    <FORMAT name="p">
    <LINK linktemplate="inline_image" type="genericLink"><LINKCDATA hide="xxx"></LINKCDATA>
    <CMS_VALUE name="lt_alt">Wesentliche Themen</CMS_VALUE>
    <CMS_VALUE name="lt_title">Wesentliche Themen</CMS_VALUE>
    </LINK>
    <FORMAT name="h3">Wesentliche Themen</FORMAT>
    </FORMAT>
    
    

     

    OUT:

    <FORMAT name="p">
    <LINK linktemplate="inline_image" type="genericLink"><LINKCDATA hide="xxx"></LINKCDATA>
    </LINK><CMS_VALUE name="lt_alt"><LINK linktemplate="inline_image" type="genericLink">Key issues</LINK></CMS_VALUE><LINK linktemplate="inline_image" type="genericLink">
    </LINK><CMS_VALUE name="lt_title"><LINK linktemplate="inline_image" type="genericLink">Key issues</LINK></CMS_VALUE><LINK linktemplate="inline_image" type="genericLink">
    </LINK><FORMAT name="h3">Key issues</FORMAT>
    </FORMAT>

     

    As you can see, the original <LINK> element has been closed early and reopened several times. This might be valid, but the source application does not accept it.

    How can I avoid this?

    Thank you,

    best,

    Andreas

    EDIT:

    I found the solution...

    The author had added new attribute values for @name that were not covered by the XPath rules. The general rule for @name was
    //*[@name], Always translatable, Tag type not specified.

    Up to now the only exceptions were
    //*[@name="lt_linktext"], Always translatable, inline, include
    //*[@name="lt_link"], Not translatable, inline, include


    So the new attribute values fell under the //*[@name] rule.

    It looked the same to the translator, but the output was not correct. I still do not quite understand why...

    But defining //*[@name="lt_title"] and //*[@name="lt_alt"] as Always translatable, inline, include did the job.
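
    To catch new @name values before they silently fall under the generic //*[@name] rule, the source file could be scanned up front. The following is only a sketch: the helper uncovered_names, the condensed SAMPLE and the restriction to CMS_VALUE elements are assumptions made for illustration, and the set of covered values has to be kept in sync with the file type settings by hand.

```python
import xml.etree.ElementTree as ET

# Condensed, hypothetical sample in the shape of the files discussed here.
SAMPLE = """<root>
<FORMAT name="p">
<LINK linktemplate="inline_image" type="genericLink">
<CMS_VALUE name="lt_alt">Wesentliche Themen</CMS_VALUE>
<CMS_VALUE name="lt_title">Wesentliche Themen</CMS_VALUE>
<CMS_VALUE name="lt_linktext">Website</CMS_VALUE>
</LINK>
</FORMAT>
</root>"""

# @name values that already had a dedicated XPath rule before the fix.
SPECIFIC_RULES = {"lt_linktext", "lt_link"}


def uncovered_names(xml_text, element="CMS_VALUE", covered=SPECIFIC_RULES):
    """Return @name values on `element` that only the generic //*[@name] rule would match."""
    root = ET.fromstring(xml_text)
    found = {el.get("name") for el in root.iter(element) if el.get("name") is not None}
    return sorted(found - set(covered))


print(uncovered_names(SAMPLE))  # prints ['lt_alt', 'lt_title']
```

    Every value it reports is a candidate for its own //*[@name="..."] rule with an explicit inline tag type.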

  • Hi Andreas,
    One question first: can you tell us which tag types are set for <FORMAT>, <LINK>, <LINKCDATA> and <CMS_VALUE>?

    Kind regards
    Sébastien
  • Former Member in reply to Sébastien Desautel

    Hello Sébastien, thank you for answering!
    Did you see I just edited the post and solved my problem?


    There is a rule for the ROOT element: Not Translatable 

    There are no rules for FORMAT and CMS_VALUE, they should fall under the ROOT rule.

    All other rules are now (solved):

    Before, I had no rules for @name="lt_title" and @name="lt_alt".

  • Hello Andreas,
    Just a quick idea; I haven't been able to test your settings yet. I assume this is connected to the tag type (structure, inline or unspecified). Inline tags should sit inside structure tags, and structure tags should never sit inside inline tags. When the type is left unspecified, Studio chooses the "right" type based on the whole structure and your settings.
    In your initial settings, <LINK> is structure. <CMS_VALUE> is unspecified, but it carries an @name attribute just like <FORMAT>. CMS_VALUE should be inline and FORMAT should be structure, yet there is only one rule for both of them. This may be the source of the problem.
    In your second set of rules, you separate the settings of CMS_VALUE and FORMAT through the different XPath expressions for @name, and I can imagine that this is why it works.

    Not sure I was clear enough.
    I will come back a little later with some tests when I find some time today or tomorrow.

    Kind regards
    Sébastien
  • Hi again,
    I could almost reproduce the situation (I just couldn't get the same segmentation before and after your corrections). This is not due to the inline/structure entanglement alone, but to its combination with translatable/not translatable. If everything is set to "translatable", there is no problem at all as far as I could see. When inline tags are set to "not translatable", it gets harder and you get these padlocks in the editor.

    So, as already written, I think it's related to the //*[@name] rule, which governs both FORMAT and CMS_VALUE in your first attempt and is set to "not specified". In your second attempt, you split it into three separate rules, two of them set to "inline", so the segmentation is simply better defined.

    Kind regards
    Sébastien
  • Former Member in reply to Sébastien Desautel
    Hello Sébastien, thanks a lot for your investigation!
    Indeed, at first I did not pay attention to the new attribute values added by the customer; I should have created the rules right away.

    What puzzled me, though, is that the segmentation in the editor was the same, but the structure of the XML output after translation differed from the original.
    Now, with the new rules, it still looks the same in the editor as before, but the output is correct.

    I would have expected some kind of warning from SDL Studio that the target XML document structure was going to differ from the source. I guess it's necessary to always run a verification on the target output outside of Studio...
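
    Such a check outside of Studio could be sketched as follows. The helper names element_paths and same_structure are invented for this example, and the comparison is deliberately simple: it only looks at element names and nesting, in document order, and ignores text and attributes.

```python
import xml.etree.ElementTree as ET


def element_paths(xml_text):
    """List every element as a slash-separated path of tag names, in document order."""
    paths = []

    def walk(node, prefix):
        path = prefix + "/" + node.tag
        paths.append(path)
        for child in node:
            walk(child, path)

    walk(ET.fromstring(xml_text), "")
    return paths


def same_structure(source_xml, target_xml):
    """True if both documents have identical element nesting; translated text is ignored."""
    return element_paths(source_xml) == element_paths(target_xml)
```

    For the IN/OUT pair quoted earlier in this thread, same_structure returns False, because the extra <LINK> elements inside and around <CMS_VALUE> change the list of paths.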

    Regarding the inline tags set to "not translatable", I should probably try setting the "LINK" element to "translatable" to avoid the "padlocks", but I remember there were some cases where long URLs would become translatable. I'll check again.
    Thanks again,
    best,
    Andreas