Processing XML files

We have an XML file we are trying to process for a client that is not coming out correctly in Trados. For some reason we’re not getting segment breaks at the tags, so all the text flows continuously (resulting in segments that are 2000-5000 words long). Also, the source XML file uses tabs to separate certain content types (product description \t product number \t currency), and Trados is not picking these up: it replaces each tab with a simple space. If the client needs these tabs in the file to separate content types, we’re worried that losing them may cause import problems when the client tries to bring the translated content back into their native environment.

  • Hi Alaina,

    I had a look at your SDL Account and noticed that you have a valid PSMA contract with us. I would therefore also suggest logging a support case directly with our dedicated Support Team via your SDL online Account/Support. Please note that the support case must be logged by a registered Support User from your account. Once the ticket is submitted, they will be in contact with you as soon as possible.

    Kind regards,

    Roman
  • Roman, would you be able to help me to navigate to where I log the query? When I go to SUPPORT it takes me on a loop to the SDL Community. I'm having a hard time finding where to log queries.
  • Hello,

    I received your email, thank you.  It would have been useful to have a sample file, but I think I have recreated the problem.  So I have a sample file like this:

    <?xml version="1.0" encoding="UTF-8"?>
    <root>
      <Accessories>
      <Accessory>Nice cup, PG Tips	$193.00	£168	8 lbs	(4 kg)	85</Accessory>
      <Accessory>Not so nice cup, English Breakfast	$200.00	£174	7 lbs	(3 kg)	85</Accessory>
      <Accessory>Large cup, Tiptons	$243.00	£208	12 lbs	(6 kg)	85</Accessory>
      </Accessories>
    </root>

    If I open this file with a default TM and deliberately create parser rules where the tags are inline then I can reproduce what you see:

    So, the tabbed spaces have become ordinary spaces and of course the tags are deliberately inline so they are not on separate segments.  The solution to the problem I have created for myself is simple, and will hopefully help you too (but I am guessing!).
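
To make the two effects concrete outside of Trados, here is a minimal Python sketch. It is only an illustration under my own assumptions, not how Studio is implemented: the helper names (`normalize`, `element_texts`) are hypothetical, and the sample is a shortened copy of the file above. It shows that the tabs are present in the parsed XML, that whitespace normalization turns them into single spaces, and that treating the elements as inline amounts to one continuous run of text.

```python
import re
import xml.etree.ElementTree as ET

# Shortened copy of the sample file; \t marks the real tab characters.
SAMPLE = (
    "<root><Accessories>"
    "<Accessory>Nice cup, PG Tips\t$193.00\t£168\t8 lbs\t(4 kg)\t85</Accessory>"
    "<Accessory>Large cup, Tiptons\t$243.00\t£208\t12 lbs\t(6 kg)\t85</Accessory>"
    "</Accessories></root>"
)


def normalize(text):
    """Collapse every run of whitespace (tabs included) into one space."""
    return re.sub(r"\s+", " ", text).strip()


def element_texts(xml_source, tag):
    """Return the raw text of each <tag> element with whitespace untouched."""
    root = ET.fromstring(xml_source)
    return [el.text or "" for el in root.iter(tag)]


# "Preserve": one text unit per element, tabs intact.
preserved = element_texts(SAMPLE, "Accessory")

# "Normalize": same units, but each tab has become an ordinary space.
normalized = [normalize(t) for t in preserved]

# Elements treated as inline: everything flows into one long unit.
inline_style = normalize(" ".join(preserved))

print(repr(preserved[0]))
print(repr(normalized[0]))
print(repr(inline_style))
```

Printing with `repr` makes the difference visible, since the tabs show up as `\t` in the preserved text and are gone from the other two.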

    First of all you need to preserve the whitespace, as the default is to normalize it.  So change this setting to preserve:

    Next, make sure the parser rules for your xml filetype are not set to inline (hopefully you did create a custom XML filetype for your file and are not just using the defaults):

    Now if you open your file with this new filetype you should see this:

    This is much better... the tabs are retained and the elements have been handled as separate segments.  This is all you asked for, but you could improve this further by segmenting on the tabs.  If you add a simple rule to your TM segmentation rules like this:

    Now when you open the file with your new filetype and custom segmentation rule you see this:

    Probably much better to translate this way!
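
If you want to picture what segmenting on the tabs gives you, and check that the tabs can be put back afterwards, a small Python sketch like the one below may help. It is an illustration under my own assumptions rather than the Trados segmentation rule itself: `tab_segments` and `rejoin` are hypothetical helpers, and the sample is a shortened copy of the file above.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the sample file; \t marks the real tab characters.
SAMPLE = (
    "<root><Accessories>"
    "<Accessory>Nice cup, PG Tips\t$193.00\t£168\t8 lbs\t(4 kg)\t85</Accessory>"
    "<Accessory>Large cup, Tiptons\t$243.00\t£208\t12 lbs\t(6 kg)\t85</Accessory>"
    "</Accessories></root>"
)


def tab_segments(xml_source, tag):
    """Split the text of each <tag> element into tab-delimited segments."""
    root = ET.fromstring(xml_source)
    return [(el.text or "").split("\t") for el in root.iter(tag)]


def rejoin(segments):
    """Rebuild one element's text, restoring a tab between segments."""
    return "\t".join(segments)


rows = tab_segments(SAMPLE, "Accessory")
for row in rows:
    print(row)

# Round trip: the rebuilt text has the same number of tabs as the source.
rebuilt = rejoin(rows[0])
print(rebuilt.count("\t"))
```

Each element becomes a short list of translatable pieces, and counting the tabs in the rebuilt text is a quick way to confirm that the delimiters the client relies on are still there.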

    Hope that helps.

    Regards

    Paul

    Paul Filkin | RWS Group

    ________________________
    Design your own training!

    You've done the courses and still need to go a little further, or still not clear? 
    Tell us what you need in our Community Solutions Hub

  • Paul, this is incredibly helpful, and much better to translate this way, indeed! Thank you very much!
  • Hi Paul,

    Thanks for pointing me to the segmentation rules. I have another file with tabs that I need to segment. I created a rule, and I can find the rule under the Translation Memories pane, but when I double-click to open the rule, it's not saving the language pair (don't know if that matters). I also don't see how to apply it to my file. When setting up a project, is there somewhere I can select additional Language Resources to apply? Do I apply it to the TM somehow?

    Any advice you could provide would be most appreciated.

    Thank you!
    Alaina
  • Hi Alaina,

    I'm a little confused by this post and don't really have any idea what you mean by "it's not saving the language pair". The segmentation rules are on the TM. So if you open a file against a TM using the segmentation rules you created then they will apply.

    If this isn't clear perhaps you can elaborate a little... pictures help!

    Regards

    Paul

  • Hi Paul,

    Seems like I'm good at asking confusing questions. :)

    So I went to File > New > New Language Resource Template. I was under the impression that I had to set the language in the popup that opens when I select that. So I selected my target language. 

    Then I went to Segmentation Rules > Add and then followed the TAB rule you showed me earlier in this thread.

    Then I saved the Language Resource Template, and it shows up under Language Resource Template in the Translation Memories pane. But how do I apply it to a TM now?

    Thank you for your help!

    Alaina

  • Hi Alaina,

    ok.  Couple of things here probably:

    1. A Language Resource Template is used to create a TM, so you don't have to repeat the settings over and over for similar things.  The language would be your source language and not the target, since we are segmenting the file based on the source content, not the target.
    2. To do what you need, go to the settings on the TM itself and you'll find similar options.  This is where you change the segmentation rules, or
    3. create a new TM based on your new Language Resource Template and use that.

    Make sense?

    I know it's a little counterintuitive, as it would be much better to have the Language Resource Template available for any TM you like, so you only had to change the rules in the template and not on every TM.  But that's just the way it works... unfortunately.

    Regards

    Paul

  • Hi Paul,

    This worked beautifully. Thank you so much for the advice and help.

    By the way, how do you keep up with all of these questions?

    Thanks again, and have a great rest of the week.

    Best,
    Alaina