I'm using a multilingual Termbase with German as one of its languages. Where can I fix "DE-01", which is reported as an invalid culture identifier? Thanks.
Paul Filkin | RWS Group
________________________
Design your own training!
You've done the courses and still need to go a little further, or still not clear?
Tell us what you need in our Community Solutions Hub
Well, actually I did some research myself and came to this conclusion: Glossary Converter (though a great application) apparently is the culprit. Let me explain, in a nutshell:
1) The Termbase was created from a TBX file containing the substring <langSet xml:lang="de"> then
2) during the conversion process by Glossary Converter it became <l lang="DE-01" type="German (Germany)"></l> (hence the error message I was getting),
3) instead of the more correct <l type="German (Germany)" lang="DE-DE"></l>
The XML string in the entity called "mtConcepts", column "text" is, for instance,
<cG><c>1</c><trG><tr type="origination">glossaryconverter</tr><dt>2017-02-17T16:26:14</dt></trG><trG><tr type="modification">glossaryconverter</tr><dt>2017-02-17T16:26:14</dt></trG><dG><d type="Subject">Entwicklung</d></dG><lG><l lang="DE-01" type="German (Germany)"></l><tG><t>Genossenschaft</t><trG>... etc.
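For anyone who wants to check which language codes ended up in their concept XML, here is a minimal Python sketch. It assumes you have already copied the XML text out of the "text" column into a string; how you read it from the Termbase is not covered. The function name and the sample fragment (a shortened, partly made-up one with an extra English entry) are only illustrations. It uses a regular expression rather than an XML parser so that truncated fragments still work, and it accepts the attributes in either order.

```python
import re

# Matches the opening <l ...> element only (not <lG> and not </l>).
L_TAG = re.compile(r"<l\b([^>]*)>")
# Matches name="value" pairs inside the tag.
ATTR = re.compile(r'(\w+)="([^"]*)"')

def language_elements(concept_xml):
    """Return one dict of attributes per <l ...> element found in the text."""
    return [dict(ATTR.findall(m.group(1))) for m in L_TAG.finditer(concept_xml)]

# Hypothetical, shortened sample; the English entry is invented for illustration.
sample = ('<cG><c>1</c><lG><l lang="DE-01" type="German (Germany)"></l>'
          '<tG><t>Genossenschaft</t></tG></lG>'
          '<lG><l type="English (United Kingdom)" lang="EN-GB"></l></lG></cG>')

found = language_elements(sample)
for attrs in found:
    print(attrs.get("lang"), "|", attrs.get("type"))
```

This only lists what is stored; it does not change anything in the Termbase.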
Incidentally, my other multilingual Termbases that include German, correctly identified as "DE-DE" (Deutsch-Deutschland), work flawlessly. For the faulty Termbases with this allegedly wrong "Language-Locale" code, all I have to do is repeat the conversion process. And, yes, you are right, multilingual Termbases can run alongside any translation project.
I have to say, the more I use Termbase, the more I like it.
Cheers!
Ozzie
Or was the culprit SDL MultiTerm 2017 Convert? Or simply a faulty .xcd (a conversion session file)? Apparently, yes. I got these XML strings:
<property name="label" value="German (Germany)" />
<property name="renamedLabel" value="German (Germany)" />
<property name="locale" value="DE-01" />
Go figure.
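If you want to see quickly which locale values a session file carries, something like the following sketch could help. It assumes the .xcd file is plain text and that the locale lines look exactly like the three property lines quoted above; I don't know the full structure of the file, so this only scans line by line. The function name is made up for the example.

```python
import re

# Assumes the attribute order name="locale" ... value="..." seen in the quoted lines.
LOCALE_PROPERTY = re.compile(r'<property\s+name="locale"\s+value="([^"]*)"')

def locale_properties(session_text):
    """Return (line_number, locale_value) for each locale property line."""
    hits = []
    for number, line in enumerate(session_text.splitlines(), start=1):
        match = LOCALE_PROPERTY.search(line)
        if match:
            hits.append((number, match.group(1)))
    return hits

sample = '''<property name="label" value="German (Germany)" />
<property name="renamedLabel" value="German (Germany)" />
<property name="locale" value="DE-01" />'''

hits = locale_properties(sample)
for number, value in hits:
    print(f"line {number}: locale={value}")
```

To run it on a real file you would read the file contents into a string first and pass that in.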
"Then, because there is no specific variant stated MultiTerm Convert uses EN-01 and DE-01." I believe this should not happen, simply because the convention is to define a culture, or language culture name, as "Language-Locale" using the corresponding standard ISO codes. Thus, I believe, in this context, "01" means nothing. In this particular case it should be "DE-DE" (German-Germany), just as you would have DE-AT (German-Austria), ES-ES (Spanish-Spain), ES-UY (Spanish-Uruguay), and so on.
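To make the "Language-Locale" convention concrete, here is a small Python sketch that checks only the shape of a code: letters for the language, a hyphen, then two letters for the country. It is a simplification I am assuming for illustration; it does not check whether the code is actually a registered culture, and the function name is hypothetical.

```python
import re

# Two or three letters, a hyphen, then exactly two letters (e.g. DE-DE, es-UY).
CULTURE_SHAPE = re.compile(r"[A-Za-z]{2,3}-[A-Za-z]{2}")

def looks_like_language_region(code):
    """True if the code has the 'language-REGION' shape; no registry lookup."""
    return CULTURE_SHAPE.fullmatch(code) is not None

for code in ["DE-DE", "DE-AT", "ES-UY", "DE-01", "DE"]:
    print(code, looks_like_language_region(code))
```

Note that a bare language code such as "DE" fails this check on purpose, because the check is for a fully qualified language-plus-country code.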
I also noticed two different results when importing a TBX file (with an original attribute of xml:lang="de", i.e. just the language) into an SDL Termbase:
1. If imported using "Glossary Converter", a language, e.g. German, has this XML element:
<l lang="DE" type="German">
2. If imported using this procedure: a) create a(n empty) Termbase within Multiterm 2015/2017 Desktop, b) using Multiterm Convert convert the TBX file into an XML file, c) using Multiterm 2015/2017 Desktop, import the XML file in (b) into the Termbase in (a).
then the XML language element will look like this:
<l type="German (Germany)" lang="DE-DE">
In other words, coding like "EN-01", "DE-01", etc. should not happen ever. That's my opinion.
Unknown said: In other words, coding like "EN-01", "DE-01", etc. should not happen ever. That's my opinion.
Of course you are entitled to your opinion, but tell me how you would handle a capability to not specify a variant?
MultiTerm supports not selecting a variant. This can be very useful because it means you can use one language for all variants of the language without having to save de-AT as de-DE when it's not. It also allows you to use attributes for the language codes instead, and as MultiTerm doesn't do any kind of linguistic check on what you are saving as terms, this gives you a lot of flexibility, since you can then set certain variants as forbidden if you like as part of a verification check. For example:
https://multifarious.filkin.com/2014/04/08/yanks-versus-brits/
So using these main languages instead of the sublanguages can be very helpful.
Now, the convention is something else altogether. The Glossary Converter just uses DE because it's more modern. But MultiTerm Convert is a pretty old codebase, so it probably required a fully designated language for every language used. But if you don't specify the variant because you want more flexibility, it has to use something... I suppose DE-01 and EN-01 were as good as any in those days. But this is so old I'm guessing completely!
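If "-01" really is just a stand-in for "no variant selected" (which is a guess, as said), then reading such a code back as the bare language could be sketched like this in Python. The function name and the idea of treating "01" as the placeholder are assumptions for illustration, not something MultiTerm is confirmed to do.

```python
def collapse_placeholder(code, placeholder="01"):
    """Turn e.g. 'DE-01' into 'DE'; leave real variants and bare codes alone."""
    language, sep, variant = code.partition("-")
    if sep and variant == placeholder:
        return language
    return code

for code in ["DE-01", "EN-01", "DE-AT", "EN"]:
    print(code, "->", collapse_placeholder(code))
```

A fully qualified code such as DE-AT passes through unchanged, so nothing is lost for languages where a variant was chosen.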
Today we use Microsoft LCIDs, which consist of three ISO standards:
Studio always requires a fully qualified variant, so you will never see EN or DE in an SDLXLIFF or an SDLTM. But MultiTerm can. Anyway, my original post was just explaining what was happening and not trying to justify it in any way. In my tests MultiTerm happily ignored DE-01 anyway so I didn't get the error you did. Probably not worth worrying about it anymore unless you can reliably find a way to reproduce it?
Regards
Paul