Is it allowed to use a multilingual Termbase in a translation project? Do we risk the dreaded "not supported culture"?

Former Member

I'm using a multilingual Termbase in which one of the languages is German. Where can I fix "DE-01", which is reported as an invalid culture identifier? Thanks.

  • Hello Ozzie,

    Of course this is allowed. But to be honest I've never seen this before. Where did you get the termbase from? Can I see it?

    Thank you

    Paul

    Paul Filkin | RWS Group

    ________________________
    Design your own training!

    You've done the courses and still need to go a little further, or still not clear? 
    Tell us what you need in our Community Solutions Hub

  • Former Member, in reply to Paul

    Well, actually I did some research myself and came to this conclusion: Glossary Converter (though a great application) apparently is the culprit. Let me explain, in a nutshell:

    1) The Termbase was created from a TBX file containing the substring <langSet xml:lang="de">, then

    2) during the conversion process by Glossary Converter it became <l lang="DE-01" type="German (Germany)"></l> (hence the error message I was getting),

    3) instead of the more correct <l type="German (Germany)" lang="DE-DE"></l>

    The XML string in the entity called "mtConcepts", column "text" is, for instance,

    <cG><c>1</c><trG><tr type="origination">glossaryconverter</tr><dt>2017-02-17T16:26:14</dt></trG><trG><tr type="modification">glossaryconverter</tr><dt>2017-02-17T16:26:14</dt></trG><dG><d type="Subject">Entwicklung</d></dG><lG><l lang="DE-01" type="German (Germany)"></l><tG><t>Genossenschaft</t><trG>... etc.
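A quick way to spot such codes in an exported concept string is sketched below in Python. This is only an illustration: the function name `suspicious_lang_codes` and the "plausible code" rule (a language code, optionally followed by a two-letter region) are my own assumptions, not anything MultiTerm itself applies. A regular expression is used rather than an XML parser because strings copied out of the database may be truncated, as above.

```python
import re
from typing import List

# Matches the lang attribute of an <l ...> language element, whatever the attribute order.
# The \b after "<l" keeps it from matching <lG>.
LANG_ATTR = re.compile(r'<l\b[^>]*\blang="([^"]*)"')

# Assumed rule (hypothetical): "DE" or "DE-DE" style codes are plausible, anything else is flagged.
PLAUSIBLE = re.compile(r'^[A-Za-z]{2,3}(-[A-Za-z]{2})?$')

def suspicious_lang_codes(concept_xml: str) -> List[str]:
    """Return the lang values that do not look like language or language-region codes."""
    return [code for code in LANG_ATTR.findall(concept_xml)
            if not PLAUSIBLE.match(code)]

# Shortened, made-up sample in the shape of the concept XML shown above.
sample = ('<cG><c>1</c><lG><l lang="DE-01" type="German (Germany)"></l>'
          '<tG><t>Genossenschaft</t></tG></lG>'
          '<lG><l type="English (United States)" lang="EN-US"></l></lG></cG>')

print(suspicious_lang_codes(sample))  # ['DE-01']
```

The check only looks at the shape of the code; it does not verify that the region is a real ISO country code.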

    Incidentally, my other multilingual Termbases that include German, where it is correctly identified as "DE-DE" (Deutsch-Deutschland), work flawlessly. For the faulty Termbases with the wrong "Language-Locale" code, all I have to do is repeat the conversion process. And, yes, you are right, multilingual Termbases can be used in any translation project.

    I have to say, the more I use Termbase, the more I like it.

    Cheers!

    Ozzie

  • Former Member, in reply to Former Member

    Or was the culprit SDL MultiTerm 2017 Convert? Or simply a faulty .xcd (a conversion session file)? Apparently, yes. I got these XML strings:

    <property name="label" value="German (Germany)" />
    <property name="renamedLabel" value="German (Germany)" />
    <property name="locale" value="DE-01" />

    Go figure.
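For anyone who wants to check a session file before converting, here is a minimal Python sketch that lists the `locale` properties it contains. I only know the three `<property>` lines quoted above, so the wrapping `<session>` element in the sample is hypothetical, and the sketch assumes the file is well-formed XML without namespaces.

```python
import xml.etree.ElementTree as ET
from typing import List

def locale_values(xcd_xml: str) -> List[str]:
    """Collect the value of every <property name="locale" .../> element, at any depth."""
    root = ET.fromstring(xcd_xml)
    return [prop.get("value", "")
            for prop in root.iter("property")
            if prop.get("name") == "locale"]

# The <session> root is a made-up wrapper; only the property lines come from my file.
sample = """<session>
  <property name="label" value="German (Germany)" />
  <property name="renamedLabel" value="German (Germany)" />
  <property name="locale" value="DE-01" />
</session>"""

print(locale_values(sample))  # ['DE-01']
```

To run it on a real file, read the file's text and pass it to `locale_values`; any value ending in "-01" is the kind of code discussed here.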

  • Hi Ozzie,

    The problem is not the Glossary Converter. It is MultiTerm Convert, but I still can't reproduce the problem you had. I can only get DE-01 as a language code if I convert a DE and EN TBX to a generic German and English XDT/XML. Then, because there is no specific variant stated MultiTerm Convert uses EN-01 and DE-01.

    Now, if I then create a new termbase using the XDT then MultiTerm ignores the fact the languages are generic and suggests DE-DE and EN-US. So I have to remove these and add back the generic languages if this is what I want.

    But irrespective of whether I change the languages or not I still can't get this error when I add the termbase in Studio. Very strange but good you seem to have resolved the problem.

    Regards

    Paul


  • I also moved your thread to the MultiTerm forum. MultiTerm Workflow is a different product:

    www.sdl.com/.../multiterm-workflow.html

    Paul


  • Former Member, in reply to Paul

    "Then, because there is no specific variant stated MultiTerm Convert uses EN-01 and DE-01": I believe this should not happen, simply because the convention is that you define a culture, or language culture name, as "Language-Locale" using the corresponding standard ISO codes. So, in this context, "01" means nothing. In this particular case it should be "DE-DE" (German-Germany), just as you would have "DE-AT" (German-Austria), "ES-ES" (Spanish-Spain), "ES-UY" (Spanish-Uruguay), and so on.

    I also noticed two different results when importing a TBX file (with an original attribute of xml:lang="de", i.e. just the language) into an SDL Termbase:

    1. If imported using "Glossary Converter", a language, e.g. German, has this XML element:

    <l lang="DE" type="German">

    2. If imported using this procedure: a) create an empty Termbase within MultiTerm 2015/2017 Desktop, b) use MultiTerm Convert to convert the TBX file into an XML file, c) in MultiTerm 2015/2017 Desktop, import the XML file from (b) into the Termbase from (a),

    then the XML language element will look like this:

    <l type="German (Germany)" lang="DE-DE">

    In other words, coding like "EN-01", "DE-01", etc. should not happen ever. That's my opinion.

  • Unknown said:
    In other words, coding like "EN-01", "DE-01", etc. should not happen ever. That's my opinion.

    Of course you are entitled to your opinion, but tell me how you would handle a capability to not specify a variant?

    MultiTerm supports you not selecting a variant. This can be very useful because it means you can use one language for all variants of the language without having to save de-AT as de-DE when it's not. It also allows you to use attributes for the language codes instead, and as MultiTerm doesn't do any kind of linguistic check on what you are saving as terms, this gives you a lot of flexibility, since you can then set certain variants as forbidden if you like as part of a verification check. For example:

    https://multifarious.filkin.com/2014/04/08/yanks-versus-brits/

    So using these main languages instead of the sublanguages can be very helpful.

    Now, the convention is something else altogether. The Glossary Converter just uses DE because it's more modern. But MultiTerm Convert is a pretty old codebase, so it probably required a fully designated language for every language used. But if you don't specify the variant because you want more flexibility, it has to use something... I suppose DE-01 and EN-01 were as good as any in those days. But this is so old I'm guessing completely!

    Today we use Microsoft culture names (the names that go with the LCIDs), which are built from three ISO standards:

    • ISO-639 - language code
      • Gives us EN, or DE for example
    • ISO-3166 - country code
      • Gives us en-GB, or de-AT for example
    • ISO-15924 - script tag (occasionally used)
      • Gives us ff-Latn-SN, or sr-Cyrl-CS for example
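To make the three parts concrete, here is a rough Python sketch that splits a culture name into language, optional script, and optional region. The `CultureName` type and `parse_culture_name` function are made up for illustration, and the pattern only checks the shape of each part (letter counts), not whether the codes actually exist in the ISO lists.

```python
import re
from typing import NamedTuple, Optional

class CultureName(NamedTuple):
    language: str            # ISO 639 style, e.g. "de"
    script: Optional[str]    # ISO 15924 style, e.g. "Cyrl"
    region: Optional[str]    # ISO 3166 style, e.g. "AT"

# language, then an optional four-letter script, then an optional two-letter region
_PATTERN = re.compile(r'^([A-Za-z]{2,3})(?:-([A-Za-z]{4}))?(?:-([A-Za-z]{2}))?$')

def parse_culture_name(name: str) -> Optional[CultureName]:
    """Return the parts of a culture name, or None if it does not fit the expected shape."""
    match = _PATTERN.match(name)
    if match is None:
        return None
    language, script, region = match.groups()
    return CultureName(
        language.lower(),
        script.title() if script else None,
        region.upper() if region else None,
    )

print(parse_culture_name("de-AT"))       # language 'de', no script, region 'AT'
print(parse_culture_name("sr-Cyrl-CS"))  # language 'sr', script 'Cyrl', region 'CS'
print(parse_culture_name("DE"))          # language only
print(parse_culture_name("DE-01"))       # None: "01" is not a two-letter region
```

Under this shape check a bare "DE" is accepted as a language without a variant, while "DE-01" is rejected because the second part is not made of letters.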

    Studio always requires a fully qualified variant, so you will never see EN or DE in an SDLXLIFF or an SDLTM.  But MultiTerm can.  Anyway, my original post was just explaining what was happening and not trying to justify it in any way.  In my tests MultiTerm happily ignored DE-01 anyway so I didn't get the error you did.  Probably not worth worrying about it anymore unless you can reliably find a way to reproduce it?

    Regards

    Paul
