Inline patterns in XML

Hi,

This has probably been asked already, but I couldn't find a good match; if so, my apologies.

I have the below XML structure, where I want to treat the variables {} as inline tags in Studio (and ideally also newlines \n). 

  <Term Translate="true">
    <Id>Check36VStatus.FailText</Id>
    <String>{0} - {1} - Have you turned on the Rider with the key?</String>
    <Reference>Text shown when a test step fails for product X</Reference>
    <Added>0001-01-01T00:00:00</Added>
  </Term>
  <Term Translate="true">
    <Id>Check36VStatus.SuccessText</Id>
    <String>{0} - {1}</String>
    <Reference>Text shown when the test step is successfully executed for product X</Reference>
    <Added>0001-01-01T00:00:00</Added>
  </Term>
  <Term Translate="true">
    <Id>Check36VStatusTextDescription</Id>
    <String>Turn on the machine by turning the key.\n\nImportant, if this test fails then all other tests will fail as well.</String>
    <Reference>Description of a UI component for product X</Reference>
    <Added>0001-01-01T00:00:00</Added>
  </Term>

I've tried Embedded content with two different settings, but neither causes the brackets to be rendered as tags.

I don't want to use the TM Variable list as I need a stricter enforcement of the rule. Also, I realise that a workaround would be to tag the source text with something like <DNT>{0}</DNT> but I'd prefer not to have to search and replace in the source/final files.

Are ECP/inline tags the correct way to address this, and if so, what am I doing wrong?

Thanks!

Simon

  • Have you defined the TXT embedded content processor in your XML file type? If so, did you enter proper document structure information for all relevant tags?
    If yes, try {[^}]*} or at least {.*}, but the second one will mask ANYTHING between the first { and the last }, which will result in wrong parsing. Your rule has an error: you used {*.}. You can also test {.+?}, which means { followed by at least one character, stopping as soon as } is found (lazy rule).

    But why don't you simply use XML parser with legacy embedded content and use the same regex, but there?
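
To see how the patterns mentioned above differ, here is a minimal sketch in Python. It only illustrates the regex behaviour; Studio has its own regex engine, so the details there may differ. The variable names are made up for the example, and the test string is the one from the question.

```python
import re

# Sample string taken from the <String> element in the question.
text = "{0} - {1} - Have you turned on the Rider with the key?"

# Patterns exactly as written in the thread (braces left unescaped).
patterns = {
    "original": r"{*.}",     # zero or more "{", then one character, then "}"
    "greedy": r"{.*}",       # runs from the first "{" to the last "}"
    "lazy": r"{.+?}",        # stops at the first "}" after at least one character
    "negated": r"{[^}]*}",   # never crosses a "}"
}

results = {name: re.findall(rx, text) for name, rx in patterns.items()}

for name, found in results.items():
    print(f"{name:9} {patterns[name]:9} {found}")
```

In this Python run the greedy pattern returns one match, "{0} - {1}", which swallows the text between the placeholders, while the lazy and negated patterns return "{0}" and "{1}" separately. The original {*.} also happens to catch these single-character placeholders here, but it cannot match a longer one such as {CR} as a whole.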

    _________________________________________________________

    When asking for help here, please be as accurate as possible. Please always remember to give the exact version of product used and all possible error messages received. The better you describe your problem, the better help you will get.

    Want to learn more about Trados Studio? Visit the Community Hub. Have a good idea to make Trados Studio better? Publish it here.

  • Thanks for your reply Jerzy,

    I defined the TXT ECP in the XML file type and used {.+?} as the rule (I also tried "{[^}]*}"), but the brackets are still shown as plain text.

    I haven't tried the legacy embedded content processor, is it better? Will give it a shot on Monday morning!

    Wish you a good weekend,
    Simon

  • Hi Simon,

    Jerzy referred to the XML filetype. You, for some reason, have chosen to use the text filetype and not the XML filetype. So if you persevere with your method I think your problem is just that your regex is incorrect. You used this


    {*.}


    You should have used this:

    {.*?}


    But looking at your file the XML filetype would be better because you'll then have your elements taken care of automatically and you'll only have to use the embedded content processor for the {0} and \n\n etc.
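
As a rough illustration of what the embedded content rules are meant to achieve for {0} and \n, the sketch below splits a string into plain text and inline "tags" in Python. This is not how Studio works internally; the `tokenize` helper and the combined pattern are assumptions made for the example, and the pattern only covers numbered placeholders and the literal two-character sequence \n.

```python
import re

# Assumed rule set: numbered placeholders such as {0}, plus the literal
# two-character sequence backslash + n that appears in the sample file.
INLINE = re.compile(r"\{\d+\}|\\n")

def tokenize(text):
    """Split text into ("text", s) and ("tag", s) pieces, in document order."""
    pieces = []
    pos = 0
    for match in INLINE.finditer(text):
        if match.start() > pos:
            pieces.append(("text", text[pos:match.start()]))
        pieces.append(("tag", match.group()))
        pos = match.end()
    if pos < len(text):
        pieces.append(("text", text[pos:]))
    return pieces

sample = r"Turn on the machine by turning the key.\n\nImportant, if this test fails then all other tests will fail as well."
for kind, value in tokenize(sample):
    print(kind, repr(value))
```

Everything reported as "tag" is what should end up protected as an inline tag; everything reported as "text" stays editable for the translator.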

    Regards

    Paul

    Paul Filkin | RWS Group

    ________________________
    Design your own training!

    You've done the courses and still need to go a little further, or still not clear? 
    Tell us what you need in our Community Solutions Hub

  • Yes, it is - definitely. The regex you provide must catch the brackets.

    Please check whether you have defined the ECP in the XML file type properly.

    The document structure information is the crucial point here. The tags in which your brackets appear must have a document structure attribute defined, and that attribute MUST be included in the list above.

    Same applies for the legacy file type, though...


  • I think the legacy filetype is the way to go here Jerzy... the only embedded content seems to be {} and \n\n and these need regex to be handled. So don't specify the html embedded content processor.

    Paul Filkin | RWS Group



  • Hi Paul
    Oh, this was just a screenshot from my own file type, as you may have noticed. In that particular case we have HTML inside XML, with a huge problem with entities that do not get parsed properly. But that is another story, already looked at by support.
    I am still learning here, but thanks to your blogs and help the learning curve is not too steep.
    The screenshot was meant to show how to switch the ECP on and assign the document structure information.

    In this particular case I would indeed use the legacy one. The XML structure seems to be simple, so using the method you've shown, with //* as not translatable and then opening the other tags one by one, would be my way here.

    Best regards, Jerzy
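
The idea of locking everything with //* and then opening only the elements that hold translatable text can be sketched in Python as follows. This is an illustration of the selection logic only, not a Studio parser rule. The <Terms> root element is an assumption (the thread only shows the <Term> elements), and `translatable_text` is a made-up helper name.

```python
import xml.etree.ElementTree as ET

# The <Terms> root is assumed; the <Term> content is copied from the question.
SAMPLE = """<Terms>
  <Term Translate="true">
    <Id>Check36VStatus.FailText</Id>
    <String>{0} - {1} - Have you turned on the Rider with the key?</String>
    <Reference>Text shown when a test step fails for product X</Reference>
    <Added>0001-01-01T00:00:00</Added>
  </Term>
  <Term Translate="true">
    <Id>Check36VStatus.SuccessText</Id>
    <String>{0} - {1}</String>
    <Reference>Text shown when the test step is successfully executed for product X</Reference>
    <Added>0001-01-01T00:00:00</Added>
  </Term>
</Terms>"""

# Everything is treated as not translatable unless its tag is listed here.
TRANSLATABLE = {"String"}

def translatable_text(xml_text):
    """Return the text of every element whose tag has been opened for translation."""
    root = ET.fromstring(xml_text)
    return [el.text for el in root.iter() if el.tag in TRANSLATABLE and el.text]

for line in translatable_text(SAMPLE):
    print(line)
```

Only the <String> content is returned; <Id>, <Reference> and <Added> stay locked because they were never opened.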


  • May I join the discussion? I had a very similar problem, and using Embedded plain text plus correctly configuring the ECP in the filetype worked well.

    The only thing i couldn't do is this:

    Between my XML segments I have something like this:

    La corretta installazione della macchina.{CR}La conoscenza del funzionamento

    Where {CR} would be the paragraph break character.

    Is there any way I can tell trados to use it as a segment split character (and of course hide it)?



  • In the embedded content settings you can declare that this {CR} (defined literally) should be excluded from segmentation (see Simon's screenshots), but in my experience this does not work properly. So I'm afraid I do not know a reliable method to exclude it.
    However, if you have embedded HTML, you can indeed declare <br /> as a break in the ECP for HTML. This works.

    BTW, you will not use all this information in Trados - it applies to SDL Trados Studio :-)
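
For clarity, the behaviour being asked for (split at {CR} and hide the marker) looks like this when written as a small Python sketch. It is not a Studio setting, only an illustration of the desired result; `split_segments` is a hypothetical helper.

```python
def split_segments(text, marker="{CR}"):
    """Split text at each marker and drop the marker itself."""
    return [part.strip() for part in text.split(marker) if part.strip()]

sample = "La corretta installazione della macchina.{CR}La conoscenza del funzionamento"
for segment in split_segments(sample):
    print(segment)
```

Each piece would become its own segment, and the {CR} marker would no longer be visible to the translator.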


  • Yep, SDL Trados Studio 2015, the latest. I absolutely apologize for the "friendly short nickname" I used :-) That's the only "Trados" for me :-)

    I have several tags composed as [b:my bold sentence] (and many variants of it) that I can track with plain text embedded content. If I used embedded HTML I'd lose the (much loved) grep features of the filter....

    Thank you very much.
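
A tag such as [b:my bold sentence] can be thought of as an opening part, translatable content, and a closing part. The Python sketch below shows one way to express that with a regex. The pattern is an assumption: it only covers the single form quoted above (a word, a colon, then text up to the next closing bracket), and the "many variants" mentioned are not known here.

```python
import re

# Assumed tag shape: "[" + word + ":" opens the tag, the next "]" closes it.
TAG_PAIR = re.compile(r"(\[\w+:)([^\]]*)(\])")

def find_tag_pairs(text):
    """Return (opening, content, closing) for each bracket tag found."""
    return TAG_PAIR.findall(text)

sample = "I have several tags composed as [b:my bold sentence] in my files."
for opening, content, closing in find_tag_pairs(sample):
    print(repr(opening), repr(content), repr(closing))
```

The opening and closing parts are what would be protected as a tag pair, while the content in between remains translatable.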