Unable to import xliff 1.2 file (Index was out of range)

Hello all,

We're having difficulty importing XLIFF 1.2 files we have received (Studio is unable to convert them to a translatable format).

We receive the following error message: 

<SDLErrorDetails time="22/12/2022 18:15:21">
  <ErrorMessage>Index was out of range. Must be non-negative and less than the size of the collection.
Parameter name: index</ErrorMessage>
  <Exception>
    <Type>System.ArgumentOutOfRangeException, mscorlib, Version=4.0.0.0, Culture=neutral, PublicKeyToken=b77a5c561934e089</Type>
    <ParamName>index</ParamName>
    <HelpLink />
    <Source>mscorlib</Source>
    <HResult>-2146233086</HResult>
    <StackTrace><![CDATA[   at System.ThrowHelper.ThrowArgumentOutOfRangeException(ExceptionArgument argument, ExceptionResource resource)
   at Sdl.FileTypeSupport.Filters.Xliff.Infrastructure.ParagraphUnitBuilder.HasElementsBetweenSegments(Int32 previousSegmentPosition, Int32 currentSegmentIndex, List`1 paragraph)
   at Sdl.FileTypeSupport.Filters.Xliff.Infrastructure.ParagraphUnitBuilder.MatchParagraphElements()
   at Sdl.FileTypeSupport.Filters.Xliff.Infrastructure.ParagraphUnitBuilder.CheckSegmentsAndMatch()
   at Sdl.FileTypeSupport.Filters.Xliff.Infrastructure.ParagraphUnitBuilder.ProcessCurrentParagraphUnit()
   at Sdl.FileTypeSupport.Filters.Xliff.Infrastructure.ParagraphUnitBuilder.OutputParagraphUnit()
   at Sdl.FileTypeSupport.Filters.Xliff.Infrastructure.Consumers.AttributesConsumers.TranslateAttributeConsumer.Consume(XmlNodeParsed message)
   at lambda_method(Closure , IMessage )
   at Sdl.FileTypeSupport.Filters.Xliff.Infrastructure.InMemoryBus.Publish(IMessage message)
   at Sdl.FileTypeSupport.Filters.Xliff.Extractor.ParserImpl.Publish(XmlNodeParsed message)
   at Sdl.FileTypeSupport.Filters.Xliff.Infrastructure.XmlParser.Parse(XmlTextReader reader)
   at Sdl.FileTypeSupport.Filters.Xliff.Extractor.ParserImpl.Parse(String xliffPath)
   at Sdl.FileTypeSupport.Filters.Xliff.Parser.ParseNext()
   at Sdl.FileTypeSupport.Framework.Integration.FileExtractor.ParseNext()
   at Sdl.FileTypeSupport.Framework.Integration.MultiFileConverter.ParseNext()
   at Sdl.FileTypeSupport.Framework.Integration.MultiFileConverter.Parse()
   at Sdl.TranslationStudio.Editor.TranslationEditor.TranslatableDocument.Load(IJobExecutionContext context)
   at Sdl.Desktop.Platform.Services.JobRequest.Execute(IJobExecutionContext context)
   at Sdl.Desktop.Platform.Implementation.Services.Job.<_worker_DoWork>b__47_0()
   at Sdl.Desktop.Logger.Log.Resources(Object message, Action action)
   at Sdl.Desktop.Platform.Implementation.Services.Job._worker_DoWork(Object sender, DoWorkEventArgs e)
   at System.ComponentModel.BackgroundWorker.OnDoWork(DoWorkEventArgs e)
   at System.ComponentModel.BackgroundWorker.WorkerThreadStart(Object argument)]]></StackTrace>
  </Exception>
  <Environment>
    <ProductName>SDL Trados Studio</ProductName>
    <ProductVersion>15.0.0.0</ProductVersion>
    <EntryAssemblyFileVersion>15.2.7.2849</EntryAssemblyFileVersion>
    <OperatingSystem>Microsoft Windows 10 Entreprise</OperatingSystem>
    <ServicePack>NULL</ServicePack>
    <OperatingSystemLanguage>1036</OperatingSystemLanguage>
    <CodePage>1252</CodePage>
    <LoggedOnUser>PWCGLB\jtoner001</LoggedOnUser>
    <DotNetFrameWork>4.0.30319.42000</DotNetFrameWork>
    <ComputerName>W10-PC1SFAVD</ComputerName>
    <ConnectedToNetwork>True</ConnectedToNetwork>
    <PhysicalMemory>16524420 MB</PhysicalMemory>
  </Environment>
</SDLErrorDetails>

We have checked the xliff file and it is a valid xliff 1.2 file. Also, we can import it into another CAT tool we use (DVX). For this project however, we need to use Studio.

We reviewed the following thread, but it does not seem to be quite the same case: community.rws.com/.../bug-importing-xliff-1-2

Can anyone offer clues as to why this may not be working? Here is an example of the code in question:

<?xml version="1.0" encoding="UTF-8"?>
<xliff version="1.2" xmlns="urn:oasis:names:tc:xliff:document:1.2" xmlns:okp="okapi-framework:xliff-extensions" xmlns:its="http://www.w3.org/2005/11/its" xmlns:itsxlf="http://www.w3.org/ns/its-xliff/" its:version="2.0"> 
 <file original="/opt/tomcat/eol/tomcat/temp/1062-21-a27e5ec76869a021d9e777897c69cdb585470b8c.html" source-language="fr-FR" target-language="en-GB" datatype="html" okp:inputEncoding="UTF-8" okp:configId="/opt/tomcat/eol/config/tikal/plugins/okf_html@eolng-html.fprm"> 
  <body> <trans-unit id="tu7" restype="x-paragraph"> 
    <source xml:lang="fr-FR">ABC DEF GHI.</source> 
    <seg-source>
     JKL MNF PQR
    </seg-source> 
    <target xml:lang="en-GB"></target> 
   </trans-unit>  
  </body> 
 </file> 
</xliff>

Many thanks for your help!


    I believe this is caused by the lack of any segmentation information in the <seg-source>. If I rewrite your file like this, for example, with a marker element around each segment:

    <?xml version="1.0" encoding="UTF-8"?>
    <xliff version="1.2" xmlns="urn:oasis:names:tc:xliff:document:1.2" xmlns:okp="okapi-framework:xliff-extensions" xmlns:its="http://www.w3.org/2005/11/its" xmlns:itsxlf="http://www.w3.org/ns/its-xliff/" its:version="2.0"> 
     <file original="/opt/tomcat/eol/tomcat/temp/1062-21-a27e5ec76869a021d9e777897c69cdb585470b8c.html" source-language="fr-FR" target-language="en-GB" datatype="html" okp:inputEncoding="UTF-8" okp:configId="/opt/tomcat/eol/config/tikal/plugins/okf_html@eolng-html.fprm"> 
      <body> <trans-unit id="tu7" restype="x-paragraph"> 
        <source xml:lang="fr-FR">ABC DEF GHI.</source> 
        <seg-source>
         <mrk mtype="seg">JKL</mrk>
         <mrk mtype="seg">MNF</mrk>
         <mrk mtype="seg">PQR</mrk>
        </seg-source> 
        <target xml:lang="en-GB"></target> 
       </trans-unit>  
      </body> 
     </file> 
    </xliff>

    Or even this:

    <?xml version="1.0" encoding="UTF-8"?>
    <xliff version="1.2" xmlns="urn:oasis:names:tc:xliff:document:1.2" xmlns:okp="okapi-framework:xliff-extensions" xmlns:its="http://www.w3.org/2005/11/its" xmlns:itsxlf="http://www.w3.org/ns/its-xliff/" its:version="2.0"> 
     <file original="/opt/tomcat/eol/tomcat/temp/1062-21-a27e5ec76869a021d9e777897c69cdb585470b8c.html" source-language="fr-FR" target-language="en-GB" datatype="html" okp:inputEncoding="UTF-8" okp:configId="/opt/tomcat/eol/config/tikal/plugins/okf_html@eolng-html.fprm"> 
      <body> <trans-unit id="tu7" restype="x-paragraph"> 
        <source xml:lang="fr-FR">ABC DEF GHI.</source> 
        <seg-source>
         <mrk mtype="seg">JKL MNF PQR</mrk>
        </seg-source> 
        <target xml:lang="en-GB"></target> 
       </trans-unit>  
      </body> 
     </file> 
    </xliff>

    Then it will process perfectly fine in Studio.
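If the missing marker is indeed the cause, the second rewrite above (a single <mrk mtype="seg"> around the whole <seg-source> content) could be automated for files that arrive without markers. The following is only a minimal sketch using Python's standard xml.etree.ElementTree, not an RWS tool: the function name and the trimmed sample (including its original="example.html" value) are hypothetical, and ElementTree will drop comments, the XML declaration, and unused namespace declarations when it re-serializes the file.

```python
import io
import xml.etree.ElementTree as ET

XLIFF_NS = "urn:oasis:names:tc:xliff:document:1.2"
MRK_TAG = f"{{{XLIFF_NS}}}mrk"
SEG_SOURCE_TAG = f"{{{XLIFF_NS}}}seg-source"


def wrap_unsegmented_seg_source(xliff_text: str) -> str:
    """Hypothetical helper: wrap the content of every <seg-source> that has
    no <mrk mtype="seg"> in one such marker and return the new XML text."""
    # Re-register the prefixes declared in the file so that they survive
    # serialization. Note that register_namespace is a global setting.
    for _event, (prefix, uri) in ET.iterparse(
        io.StringIO(xliff_text), events=("start-ns",)
    ):
        ET.register_namespace(prefix, uri)

    root = ET.fromstring(xliff_text)
    # Take a snapshot of the matches because the tree is modified in the loop.
    for seg_source in list(root.iter(SEG_SOURCE_TAG)):
        if any(m.get("mtype") == "seg" for m in seg_source.iter(MRK_TAG)):
            continue  # already segmented, leave untouched

        children = list(seg_source)
        mrk = ET.Element(MRK_TAG, {"mtype": "seg"})
        # Move the text and any inline elements into the marker, trimming
        # the indentation whitespace at both ends.
        mrk.text = (seg_source.text or "").lstrip()
        mrk.extend(children)
        if children:
            children[-1].tail = (children[-1].tail or "").rstrip()
        else:
            mrk.text = mrk.text.rstrip()

        for child in children:
            seg_source.remove(child)
        seg_source.text = None
        seg_source.append(mrk)

    return ET.tostring(root, encoding="unicode")


# Trimmed variant of the file from the question (illustrative only).
SAMPLE = """<?xml version="1.0" encoding="UTF-8"?>
<xliff version="1.2" xmlns="urn:oasis:names:tc:xliff:document:1.2">
 <file original="example.html" source-language="fr-FR" target-language="en-GB" datatype="html">
  <body> <trans-unit id="tu7" restype="x-paragraph">
    <source xml:lang="fr-FR">ABC DEF GHI.</source>
    <seg-source>
     JKL MNF PQR
    </seg-source>
    <target xml:lang="en-GB"></target>
   </trans-unit>
  </body>
 </file>
</xliff>
"""

if __name__ == "__main__":
    print(wrap_unsegmented_seg_source(SAMPLE))
```

Treating the whole paragraph as one segment keeps the text unchanged; splitting it into several <mrk> elements, as in the first rewrite, would need real segmentation rules and is left out here. Work on a copy of the file and check the result in Studio before relying on it.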

    I'm not sure whether <mrk> is a mandatory element inside <seg-source>, but the specification does say this:

    Each segment inside the <seg-source> and <target> content is represented using the <mrk> element with attribute mtype set to the value "seg".

    And there seems to be little point in using <seg-source> in the first place without it.
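To see which trans-units in a larger file have a <seg-source> without any <mrk mtype="seg">, a quick check along these lines might help. It is a sketch under the assumption that the missing marker is what triggers the error; the function name is hypothetical, and the sample is a trimmed variant of the file from the question with a second, made-up unit (tu8) added for contrast.

```python
import xml.etree.ElementTree as ET

XLIFF_NS = "urn:oasis:names:tc:xliff:document:1.2"
NS = {"x": XLIFF_NS}


def find_unsegmented_units(xliff_text: str) -> list:
    """Hypothetical helper: return the ids of trans-units whose <seg-source>
    contains no <mrk mtype="seg"> element."""
    root = ET.fromstring(xliff_text)
    flagged = []
    for unit in root.iterfind(".//x:trans-unit", NS):
        seg_source = unit.find("x:seg-source", NS)
        if seg_source is None:
            continue  # no <seg-source> at all, so nothing to check
        has_seg_marker = any(
            mrk.get("mtype") == "seg"
            for mrk in seg_source.iter(f"{{{XLIFF_NS}}}mrk")
        )
        if not has_seg_marker:
            flagged.append(unit.get("id"))
    return flagged


# Illustrative sample: tu7 mirrors the question, tu8 is invented for contrast.
SAMPLE = """<?xml version="1.0" encoding="UTF-8"?>
<xliff version="1.2" xmlns="urn:oasis:names:tc:xliff:document:1.2">
 <file original="example.html" source-language="fr-FR" target-language="en-GB" datatype="html">
  <body>
   <trans-unit id="tu7" restype="x-paragraph">
    <source xml:lang="fr-FR">ABC DEF GHI.</source>
    <seg-source>
     JKL MNF PQR
    </seg-source>
    <target xml:lang="en-GB"></target>
   </trans-unit>
   <trans-unit id="tu8" restype="x-paragraph">
    <source xml:lang="fr-FR">ABC DEF GHI.</source>
    <seg-source><mrk mtype="seg">JKL MNF PQR</mrk></seg-source>
    <target xml:lang="en-GB"></target>
   </trans-unit>
  </body>
 </file>
</xliff>
"""

if __name__ == "__main__":
    print(find_unsegmented_units(SAMPLE))  # prints ['tu7']
```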

    Paul Filkin | RWS Group
