How to quickly add stuff (abbreviations, etc.) to segmentation rules when translating?

OK, in memoQ, when you run into an incorrectly segmented segment caused by e.g. an abbreviation, you can quickly add it to your segmentation rules for the current source language, and even re-segment your entire project (if you want). I just ran into this problem in Studio: it segmented on "FIG.", and I don't want it to do this. How can I quickly tell Studio not to break on "FIG."?

Michael

  • Hi Michael,

    I'm afraid this is something that is not possible in Studio. You have to add FIG. to your abbreviation list in the TM settings and then prepare the file again. Sadly, this is something which is not so easy to do in Studio as we work on the files and not a database. I guess it would be possible to automate the process but even then it would not be as fast as it is in some tools that are based on a database.

    Regards

    Paul

    Paul Filkin | RWS Group

    ________________________
    Design your own training!

    You've done the courses and still need to go a little further, or still not clear? 
    Tell us what you need in our Community Solutions Hub
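
To illustrate why an entry in the abbreviation list prevents the break at "FIG.", here is a minimal sketch of a rule-based segmenter. It is not Studio's actual implementation: the `ABBREVIATIONS` set, the `BREAK` pattern and the `segment` function are all hypothetical names chosen for the example, and real segmentation rules are considerably richer.

```python
import re

# Hypothetical abbreviation list (assumption: stands in for the TM's own list).
ABBREVIATIONS = {"FIG.", "fig.", "enz.", "tel."}

# Candidate break: sentence-final punctuation followed by whitespace.
BREAK = re.compile(r"[.!?]\s+")


def segment(text, abbreviations=ABBREVIATIONS):
    """Split text into segments, skipping breaks right after a listed abbreviation."""
    segments, start = [], 0
    for match in BREAK.finditer(text):
        # Token that ends at the punctuation mark, e.g. "FIG."
        words = text[start:match.start() + 1].split()
        last_token = words[-1] if words else ""
        if last_token in abbreviations:
            continue  # not a sentence end, keep scanning
        segments.append(text[start:match.end()].strip())
        start = match.end()
    tail = text[start:].strip()
    if tail:
        segments.append(tail)
    return segments


print(segment("See FIG. 3 for details. The valve is closed."))
```

With "FIG." in the list the sample text yields two segments; with an empty list it yields three, which is the kind of unwanted split described above.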

  • Oh dear, I was hoping you wouldn't say that. I am at 45% of a largish project, and have made all kinds of comments, merged and split segments, etc. However, looking down, I see TONS of messed up segments because of this "FIG." I figured out how to add it to the seg rules for my current TM, but re-importing the file to re-segment it isn't looking particularly great, as I would lose all my previous stuff (comments, merges/splits, etc). Darn.

    Since Studio can't do this, I suppose it might be a good idea to devise some tricks to help users who run into this problem, which I imagine many people do. Maybe even a little SDL App? … "(Re)Segment Buddy". E.g., a trick that would at least copy all my comments back into place after re-importing an identical file. I just ran a test: after adding FIG. to my seg list, I created an identical copy of the file, gave it a different name, and then added this file to my project. Obviously, it was segmented correctly, but all my comments are now missing. I imagine it shouldn't be too hard to make Studio copy the comments back in, somehow. And perhaps even remember where I merged and split stuff, as the file content is exactly the same. I suppose this information would need to be saved in the TM somewhere as metadata.
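
The "copy the comments back" idea can be sketched independently of Studio. The snippet below is only an illustration under stated assumptions: segments are plain source strings, comments are a dict keyed by old segment index, and both segmentations cover exactly the same source text. The function names are hypothetical and no Studio or SDK API is used.

```python
from bisect import bisect_right


def _squash(segment):
    """Drop whitespace so boundary spaces do not disturb the offsets."""
    return "".join(segment.split())


def _starts(segments):
    """Start offset of each segment, counted in non-whitespace characters."""
    starts, pos = [], 0
    for seg in segments:
        starts.append(pos)
        pos += len(_squash(seg))
    return starts


def remap_comments(old_segments, old_comments, new_segments):
    """Map {old_index: [comments]} onto {new_index: [comments]}."""
    old_text = "".join(_squash(s) for s in old_segments)
    new_text = "".join(_squash(s) for s in new_segments)
    if old_text != new_text:
        raise ValueError("source texts differ; cannot align segments")
    old_starts = _starts(old_segments)
    new_starts = _starts(new_segments)
    remapped = {}
    for old_index, comments in sorted(old_comments.items()):
        # New segment that contains the first character of the old segment.
        new_index = bisect_right(new_starts, old_starts[old_index]) - 1
        remapped.setdefault(new_index, []).extend(comments)
    return remapped


old = ["See FIG.", "3 for details.", "The valve is closed."]
new = ["See FIG. 3 for details.", "The valve is closed."]
print(remap_comments(old, {1: ["check figure number"], 2: ["term: valve"]}, new))
```

Aligning by character offset rather than by segment number is what makes this survive re-segmentation: a comment follows the text it was attached to, even when the segment boundaries around it move.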

  • OK, problem solved, kind of. I just went through the whole damn thing and merged all of the incorrectly segmented bits manually. A hellish job, and not great for my RSI, but it's done. Thank the lord the job wasn't any bigger.

    OK, so another question: how can I save this new seg rule so it is ALWAYS used, rather than just linked to this 1 particular TM?

  • Unknown said:
    OK, so another question: how can I save this new seg rule so it is ALWAYS used, rather than just linked to this 1 particular TM?

    Hi Michael,

    I hate to ruin your night :-(  Studio has a concept of language resource templates so you can create one of these and use it to create all your new TMs with the appropriate segmentation rules, abbreviations, ordinal followers, variables etc.  But if you have existing TMs then you'll have to go into them all and make the changes to them all separately.

    You don't have to tell me how much better it would be if we could have a language resource that was used by multiple TMs so that you only had to change the resource once to see it reflected across all the TMs you use.  But this is something that is fixed deeper in the software and working a different way would require a major change to the way they currently work.  You won't be the first to want this, and probably not the last!

    Tricky to do with an app as well... but perhaps we can have another look at this when we get some time (that means it won't be soon unfortunately!).  But perhaps another developer has played with this sort of problem to try and find a better solution?

    Regards

    Paul


  • Hi,
    I fully agree with Michael's observations and wish!

    In the recent SDL app idea contest, I submitted an idea for an app that identifies possible abbreviations as early as project creation, when the source files are imported (just as STAR Transit does). However, my idea did not make it onto the shortlist that users could vote on :-(

    Kind regards
    Christine
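
A rough sketch of what such a pre-import abbreviation check could look like, assuming the source text is available as a plain string. The heuristic is an assumption made for illustration and is not how Transit or memoQ actually do it: a token ending in a period is flagged if it contains an internal period or is followed by a lowercase word or a digit.

```python
from collections import Counter


def find_abbreviation_candidates(text):
    """Count tokens that end in '.' but probably do not end a sentence."""
    tokens = text.split()
    counts = Counter()
    for token, following in zip(tokens, tokens[1:] + [""]):
        word = token.lstrip("([\"'")
        # Only consider tokens that end in a period and contain a letter.
        if not word.endswith(".") or not any(ch.isalpha() for ch in word):
            continue
        internal_period = "." in word[:-1]  # e.g. "t.n.v."
        next_not_capital = following[:1].islower() or following[:1].isdigit()
        if internal_period or next_not_capital:
            counts[word] += 1
    return counts


sample = "Zie FIG. 3 en fig. 4, excl. btw. Betaling t.n.v. Dhr. Jansen."
print(sorted(find_abbreviation_candidates(sample)))
```

The heuristic deliberately errs on the cautious side: it misses abbreviations followed by a capitalised word (such as "Dhr." before a name), so the output is a list for a human to review, not a finished abbreviation list.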
  • Hi Michael, hi Paul,

    just a quick comment: When you work with server TMs (on SDL GroupShare), then a change to the Language Resources is applied to existing server TMs as well - during the nightly maintenance task.
    This is not possible for file based TMs, unfortunately...

    Kind regards
    Christine
  • Yes, this is very useful. Also present in memoQ 2015 and CafeTran.

    [Screenshot: the second dialogue in memoQ's Find Abbreviation tool]

    As you can see, it would have found the "FIG." that caused me so much trouble in my current project.

    Michael

  • Jesus, I have been running into this problem on a daily basis, and in my view, this is a real problem, making Studio's approach to segmentation pretty much unusable. I have tons of TMs for various clients and various domains, and it is simply impossible to add all of the new abbreviations that pop up on a daily basis in any way that would be of any use.

    I have tried creating a special "Language Resource Template" (LRT) and adding every new abbreviation I run into to it (usually around 5 new ones a day in my pair, Dutch to English!). However, this is useless, as none of these abbreviations are added to any of my older TMs, only to new TMs I create, and only if I remember to use the correct LRT.

    The current system just doesn't work.

    I know you said that it would be a lot of work, but I really do think SDL should prioritise this, as it is a terrible system. Well, almost no system at all really.

    If you can't change it because the required changes would run too deep into the rest of the software, I think an app is very much needed. A word counter is nice, but I think this is more important: some kind of SDL app that automatically propagates any abbreviations newly added to an LRT across a batch of old TMs.

    Michael

    PS: so far, I ran into all of these just this morning: enz., FIG., fig., excl., T.n.v., Dhr., drs., plv., tel. ...

  • +1! This is really needed! Our organization became so frustrated that we ended up using the Trados SDK to develop a tool that creates a language resource file from SRX rules (I have no idea why SDL does not support SRX directly).
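
For readers unfamiliar with SRX (Segmentation Rules eXchange): an abbreviation exception is expressed there as a rule with `break="no"` plus a `beforebreak` and an `afterbreak` pattern. The sketch below only generates such a rule fragment from an abbreviation list with Python's standard library. It is not the tool mentioned above, the function name and the "Custom" rule name are made up, and the output is a bare `languagerule` element without the SRX namespace or header, not a complete SRX file.

```python
import re
import xml.etree.ElementTree as ET


def abbreviation_rules(abbreviations, rule_name="Custom"):
    """Build an SRX <languagerule> with one no-break rule per abbreviation."""
    language_rule = ET.Element("languagerule", languagerulename=rule_name)
    for abbreviation in sorted(abbreviations):
        rule = ET.SubElement(language_rule, "rule", {"break": "no"})
        before = ET.SubElement(rule, "beforebreak")
        # Assumption: Python's escaping is close enough to the regex flavour
        # the consuming tool expects for simple abbreviations like "FIG."
        before.text = r"\b" + re.escape(abbreviation)
        after = ET.SubElement(rule, "afterbreak")
        after.text = r"\s"
    return language_rule


print(ET.tostring(abbreviation_rules({"FIG.", "enz."}), encoding="unicode"))
```

Keeping abbreviations in a tool-neutral form like this is the appeal of SRX: one list could in principle feed several CAT tools, with a converter producing whatever each tool needs.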