How to fix very-large-segments in Studio in xliff projects from WordPress/WPML

Hi Team, 

I was wondering whether someone could help with a recent issue we're having. 

Essentially, we get website translation projects in the form of xliff files, generated from WordPress with WPML. There seems to be a consistent issue with such xliffs in Studio in that the translation projects are not being broken down into segments properly. The whole file becomes just a few segments, with lots of blank lines in between whole blocks of text.

The consequences of this are not nice: you can't use MT, you can't use your TMs, you can't even translate anything manually either, for fear of damaging the file so it won't fit back into the website. 

I don't want to make any unfair comparisons, but in the interest of research the following might be useful to pre-empt: I initially thought the segmentation error could be a WPML issue. But then I put the same xliff into one of the popular alternative CATs and hey - what was 3 segments in Studio 2017 (one big blob plus 2 one-liners at the bottom) turned out to be 94 proper segments in the other CAT! I suppose this makes WordPress / WPML innocent enough.

I admit I noticed some relevant information on this Forum regarding legacy Studio 2014, but that piece of advice is thankfully now very old - because it seemed to be a multi-stage workaround of incredible technical complexity. To be honest, I need a real-life fix which simply works, just like the other CAT does. Surely there must be some setting deep inside the newest Studio that I'm not noticing?

Please would anyone be able to advise the quickest way to correct that segmentation issue?

Many thanks indeed, 

Adam

Parents
  • Unknown said:
    There seems to be a consistent issue with such xliffs in Studio in that the translation projects are not being broken down into segments properly.

    This is NOT a Trados Studio problem! This is caused PURELY by the TOTAL IGNORANCE of the WPML developer (or the entire WP squad)!

    What they call an XLIFF is in fact TOTAL CRAP which has - basically apart from the file extension - nothing in common with XLIFF. They just take the complete HTML content as-is and put it as a CDATA section into a SINGLE LAAAARGE translation unit.
    The IGNORANTS apparently did not get the fundamentals of XLIFF at all :(
    wpml.org/.../

    There is absolutely NOTHING Studio can do about that!

Children
  • Hello Evzen,

    I very much appreciate your contribution. Indeed, I do observe exactly what you're describing: a single large translation unit which you can't do anything with.

    However, if you imagine yourself in my situation, I don't own WPML. I can't change how they produce xliffs.

    I only own an LSP that tries to serve some very decent clients, none of whom wish to entertain the various detailed IT problems. They just want their website professionally localised into their desired languages, preferably very quickly, and you really can't blame them.

    On the other hand, I very much wish to stay entirely loyal to the SDL Studio / GroupShare suite, as a business principle.

    Therefore, I'm a bit stuck here: it is confusing to me how my CAT platform of choice seems to be unable to support a straightforward website translation project (to quote my clients) which was taken from the world's most popular website management system (that's undisputed).

    Meanwhile, a competing CAT tool seems to be able to simply crack on with it on default settings.

    It is perhaps the tragedy of my situation that I can't afford to appreciate the wonderful technicalities and the different flavours of this or other tag inside some computer file. Put bluntly, the idea behind Studio must be to perform translation projects that real-world clients wish to pay for. And what I'm observing is a situation that is... commercially unhelpful.

    Therefore, let me cry for help again: is everyone here absolutely sure nothing can be done to carry out WordPress / WPML projects with Studio 2017 (in terms of fixing the very-large-segment issue)?

    Kind regards,

    Adam
  • Unknown said:
    I only own an LSP that tries to serve some very decent clients, none of whom wish to entertain the various detailed IT problems. They just want their website professionally localised into their desired languages, preferably very quickly, and you really can't blame them.

    Of course I can... because the ONLY way out of this is for major WP(ML) users to PUSH VERY HARD on the lame WP(ML) developers to fix the BAD functionality, or simply to stop using their crappy products. Period.
    Ignorant product users are as bad as ignorant product developers; there is absolutely NO difference.

    Unknown said:
    it is confusing to me how my CAT platform of choice seems to be unable to support a straightforward website translation project (to quote my clients) which was taken from the world's most popular website management system (that's undisputed).

    The ONLY undisputed fact is that this "world's most popular system" is BADLY DESIGNED and produces CRAP, not proper XLIFFs. Period. And ignorant clients' claims WON'T change anything about that.

    Unknown said:
    Meanwhile, a competing CAT tool seems to be able to simply crack on with it on default settings.

    I simply DON'T believe that. A crippled XLIFF is a crippled XLIFF and CANNOT be correctly processed as XLIFF.
    Extracting the complete HTML content embedded inside the wannabe-XLIFF and re-processing, re-parsing and segmenting it from scratch is a completely different story. It is nothing but a WORKAROUND and does NOT mean anything like "that tool can process WPML XLIFFs while the other tool can't" (no matter if ignorant users see it like this).

    Unknown said:
    It is perhaps the tragedy of my situation that I can't afford to appreciate the wonderful technicalities and the different flavours of this or other tag inside some computer file. Put bluntly, the idea behind Studio must be to perform translation projects that real-world clients wish to pay for. And what I'm observing is a situation that is... commercially unhelpful.

    This is just completely WRONG.
    If someone started producing traffic lights which work exactly the opposite way to the rest of the world (green for STOP, red for GO), would you ask drivers to adapt to this, or would you push the producer to fix their product (and not use the crippled traffic lights until they fix the problem)?!?!

    Unknown said:
    Therefore, let me cry for help again: is everyone here absolutely sure nothing can be done to carry out WordPress / WPML projects with Studio 2017 (in terms of fixing the very-large-segment issue)?

    As Jerzy wrote, you're on your own - create a customized file type which parses the HTML embedded inside the wannabe-XLIFF... just as I described above.
    If you don't like this, bark up the right tree, i.e. blame the WPML producer.
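
For anyone who wants to see what "parsing the HTML embedded inside the wannabe-XLIFF" could look like outside Studio, here is a minimal Python sketch. It is not how Studio or any other CAT tool does it, and it only covers extraction, not writing translations back into the file. The sample file is hypothetical, modelled on the description in this thread (one trans-unit holding a whole HTML page as CDATA), and the names SAMPLE_XLIFF, BLOCK_TAGS, BlockTextExtractor and extract_blocks are made up for illustration.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser

# Hypothetical sample, NOT a real WPML export: one trans-unit whose
# <source> holds a whole chunk of HTML wrapped in CDATA.
SAMPLE_XLIFF = """<xliff version="1.2" xmlns="urn:oasis:names:tc:xliff:document:1.2">
  <file original="post-1" source-language="en" target-language="de" datatype="html">
    <body>
      <trans-unit id="body">
        <source><![CDATA[<h2>Welcome</h2>

<p>First paragraph. It has two sentences.</p>

<ul><li>Item one</li><li>Item two</li></ul>]]></source>
      </trans-unit>
    </body>
  </file>
</xliff>
"""

NS = "{urn:oasis:names:tc:xliff:document:1.2}"

# Assumption: these tags mark the boundaries of a translatable block.
BLOCK_TAGS = {"p", "div", "li", "h1", "h2", "h3", "h4", "h5", "h6",
              "td", "th", "blockquote"}


class BlockTextExtractor(HTMLParser):
    """Collects the text of each block-level element as one unit."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.blocks = []
        self._buf = []

    def _flush(self):
        # Collapse whitespace and keep the block only if it has text.
        text = " ".join("".join(self._buf).split())
        if text:
            self.blocks.append(text)
        self._buf = []

    def handle_starttag(self, tag, attrs):
        if tag in BLOCK_TAGS or tag == "br":
            self._flush()

    def handle_endtag(self, tag):
        if tag in BLOCK_TAGS:
            self._flush()

    def handle_data(self, data):
        self._buf.append(data)

    def close(self):
        super().close()
        self._flush()  # keep any trailing text outside a block element


def extract_blocks(xliff_text):
    """Map each trans-unit id to the list of text blocks in its source."""
    root = ET.fromstring(xliff_text)
    result = {}
    for unit in root.iter(NS + "trans-unit"):
        source = unit.find(NS + "source")
        html = source.text if source is not None and source.text else ""
        parser = BlockTextExtractor()
        parser.feed(html)  # ElementTree returns CDATA content as plain text
        parser.close()
        result[unit.get("id")] = parser.blocks
    return result
```

Calling extract_blocks(SAMPLE_XLIFF) turns the single blob into one entry per heading, paragraph and list item. Inline tags such as links or bold are dropped here for brevity; a real file type would have to keep them as placeholders so the translated HTML can be put back into the CDATA section intact.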