How to prepare a txt subtitle file for translation in Studio?

Hello everybody,

I've been sent a couple of txt files that seem to have been exported from subtitling software. The text looks like this:

{\rtf1\ansi\ansicpg1252\cocoartf1561\cocoasubrtf200
{\fonttbl\f0\fswiss\fcharset0 Helvetica;}
{\colortbl;\red255\green255\blue255;}
{\*\expandedcolortbl;;}
\paperw11900\paperh16840\margl1440\margr1440\vieww26700\viewh18280\viewkind0
\pard\tx566\tx1133\tx1700\tx2267\tx2834\tx3401\tx3968\tx4535\tx5102\tx5669\tx6236\tx6803\pardirnatural\partightenfactor0

\f0\fs24 \cf0 1\
00:00:10,000 --> 00:00:20,000\
\pard\pardeftab708\ri-386\partightenfactor0
\cf0 Dear trader\
\pard\tx566\tx1133\tx1700\tx2267\tx2834\tx3401\tx3968\tx4535\tx5102\tx5669\tx6236\tx6803\pardirnatural\partightenfactor0
\cf0 \
2\
00:00:20,000 --> 00:00:30,000\
\pard\pardeftab708\ri-386\partightenfactor0
\cf0 Are you fully aware of the information you are required by law \
\pard\tx566\tx1133\tx1700\tx2267\tx2834\tx3401\tx3968\tx4535\tx5102\tx5669\tx6236\tx6803\pardirnatural\partightenfactor0
\cf0 \

Obviously, the only things to translate here are the sentences "Dear trader" and "Are you fully aware of the information you are required by law". Can someone suggest how to prepare these files in Studio so that only the text is extracted for translation?

I tried adding "\cf0 " as an opening pattern and "\" as a closing one in the document structure for txt files, but the backslash doesn't seem to be accepted as the end of a pattern. I also changed all "\" to "$" and re-entered the patterns with the dollar sign, but again nothing.

As a last resort, I might try to hide unnecessary stuff with Transtools, but I have to do it manually and I am afraid of skipping sentences accidentally.

TIA

Parents
  • OMG, which software created this? This is obviously RTF, not plain text!
    Change the file extension to .rtf, then open it in Word and save it as a real plain-text file (the actual subtitle format is SRT, see https://en.wikipedia.org/wiki/SubRip).
    BTW, translating subtitles with CAT tools is basically pointless because a) by the nature of subtitles you are pretty unlikely to find any reusable text, and b) again by the nature of subtitles, sentences are divided into multiple parts (as the parts are displayed one by one at different times), i.e. the sentences are nonsensically broken down into multiple separate segments.
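For anyone who would rather script this step than go through Word, here is a minimal Python sketch of the same conversion. It is tailored to the flat, TextEdit-style RTF shown in the question and is not a general RTF reader: it assumes header groups are not nested, that a backslash at the end of a line marks a line break, and that the code page is 1252 (as the `\ansicpg1252` header suggests). The names `rtf_to_plain` and `SAMPLE` are made up for illustration, and the sample is an abridged copy of the text above.

```python
import re

# Header groups seen in the sample; other RTF writers may nest these differently.
HEADER_GROUP = re.compile(
    r"\{\\(?:\*\\)?(?:fonttbl|colortbl|expandedcolortbl)\b[^{}]*\}"
)
# A control word: backslash, letters, optional signed number, optional one-space delimiter.
CONTROL_WORD = re.compile(r"\\[A-Za-z]+-?\d* ?")
# \'hh escapes a single byte in the document code page (assumed cp1252 here).
HEX_ESCAPE = re.compile(r"\\'([0-9A-Fa-f]{2})")
LINE_BREAK = "\x00"  # temporary marker for real line breaks


def rtf_to_plain(rtf: str) -> str:
    """Strip the RTF markup from a simple subtitle export and return plain text."""
    text = HEADER_GROUP.sub("", rtf)
    # In this file a backslash at the end of a line marks a real line break.
    text = text.replace("\\\r\n", LINE_BREAK).replace("\\\n", LINE_BREAK)
    text = HEX_ESCAPE.sub(
        lambda m: bytes.fromhex(m.group(1)).decode("cp1252", "replace"), text
    )
    text = CONTROL_WORD.sub("", text)
    # Raw newlines and the remaining braces carry no text in RTF.
    text = re.sub(r"[\r\n{}]", "", text)
    lines = [line.rstrip() for line in text.split(LINE_BREAK)]
    return "\n".join(lines).strip() + "\n"


# Abridged copy of the sample from the question (tab stops shortened).
SAMPLE = r"""{\rtf1\ansi\ansicpg1252\cocoartf1561\cocoasubrtf200
{\fonttbl\f0\fswiss\fcharset0 Helvetica;}
{\colortbl;\red255\green255\blue255;}
{\*\expandedcolortbl;;}
\paperw11900\paperh16840\margl1440\margr1440\vieww26700\viewh18280\viewkind0
\pard\tx566\tx1133\pardirnatural\partightenfactor0

\f0\fs24 \cf0 1\
00:00:10,000 --> 00:00:20,000\
\pard\pardeftab708\ri-386\partightenfactor0
\cf0 Dear trader\
\pard\tx566\tx1133\pardirnatural\partightenfactor0
\cf0 \
2\
00:00:20,000 --> 00:00:30,000\
\pard\pardeftab708\ri-386\partightenfactor0
\cf0 Are you fully aware of the information you are required by law \
\pard\tx566\tx1133\pardirnatural\partightenfactor0
\cf0 \
"""

if __name__ == "__main__":
    print(rtf_to_plain(SAMPLE), end="")
```

The result keeps the cue numbers, time codes and text in SRT layout. The sketch does not handle escaped backslashes or braces, `\u` Unicode escapes, or nested groups, so for anything beyond files like this one a proper RTF reader (such as Word, as suggested above) is the safer route.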

Children
  • Thanks Evzen, I will try this. I need to use the TM as the client first sent me the scripts, and I translated them, so I intend to use the memory for fuzzy matches and concordance. Basically the sentences are split into 2 or 3 parts, so they should be easy to find. Plus I want to get rid of all code-like stuff so I can really see where the text is.
  • Unknown said:
    BTW, translating subtitles with CAT tools is basically pointless because a) by the nature of subtitles you are pretty unlikely to find any reusable text, and b) again by the nature of subtitles, sentences are divided into multiple parts (as the parts are displayed one by one at different times), i.e. the sentences are nonsensically broken down into multiple separate segments.

    Hi Evzen,

    I don't completely agree with you here. We have many clients who use CAT tools for subtitling, and even though the things you say are not true for all texts, the more important point is that using a CAT tool supports a more controlled process. I even have a TQA model for subtitling, although it still needs a preview to be developed (way down my list unfortunately, as I'd love to do it) so that the finer points of working with subtitles can be applied.

    Paul Filkin | RWS Group

    ________________________
    Design your own training!

    You've done the courses and still need to go a little further, or still not clear? 
    Tell us what you need in our Community Solutions Hub

  • Hi Evzen, your solution was pretty straightforward, thank you! Now I only see the time codes and the text.
  • Unknown said:
    We have many clients who use CAT tools for subtitling

    Well, we use CAT tools for translating subtitles too... but that doesn't mean that it's not pretty pointless ;-)

    Unknown said:
    the more important part is that using a CAT supports a more controlled process.

    A controlled process doesn't have much to do with CAT tools as such... maybe with using tools in general.
    What I meant was using TMs for pretranslation or storing the (broken) segments in a TM: there is basically ZERO benefit in real life. And that was my point. Unlike many people theorizing about these things, I've personally done quite a number of transcriptions (incl. subtitle timing), processed translations, done timing corrections, etc., so I could see the (almost) zero real benefit.

    The point is that properly done subtitle translation is far from classical translation; it's closer to transcreation. The target sentence often only "presents the message" of the source using different words, so it does not make much sense to use or create a TM.

  • Unknown said:
    Unlike many people theorizing about these things, I've personally done quite a number of transcriptions (incl. subtitle timing), processed translations, done timing corrections, etc., so I could see the (almost) zero real benefit.

    Perhaps not enough, Evzen. I haven't done any, but I'm familiar with the process, and we have enough large customers who do nothing but subtitling for me to see that they do get a benefit. But at the end of the day it doesn't matter what you or I think, does it!

    Paul Filkin | RWS Group
