Language codes

From ECRIN-MDR Wiki
Revision as of 18:46, 6 September 2019 by Admin (talk | contribs)

All data objects in the MDR (and several entities in source data and other ECRIN databases) include an indication of their language, usually as a code. The MDR system uses the two-character ISO 639-1 codes, as used on the web. An initial set of such codes, a subset of the full 639-1 listing, was added when the schema was first created in 2016. The codes were obtained from a 639-1 listing on the web, which is why their source is given as 'web' in the table below.

Some source systems, however, use the larger set of three-letter codes represented by ISO 639-2 (adapted as MARC, ‘MAchine-Readable Cataloging’, codes by the US Library of Congress and used by the National Library of Medicine). The MARC codes are therefore also included in the table, though only for the languages already selected in the ISO 639-1 set, plus a relatively small set of additional 639-2 languages used within specific source data. At the moment, this second group is derived only from a small set of additional PubMed language codes. In addition, the accumulation of organisational data in ECRIN led to the introduction of Latin as an additional language (a few universities use a Latin name).

The table below lists the languages currently in the system, with their codes and the source of each entry.

639-1 639-2 Language Source
af afr Afrikaans web
am amh Amharic PubMed
ar ara Arabic web
az aze Azerbaijani web
be bel Belarusian web
bg bul Bulgarian web
bn ben Bengali web
bo tib Tibetan web
br bre Breton web
bs bos Bosnian web
ca cat Catalan web
ce che Chechen web
co cos Corsican web
cs cze Czech web
cy wel Welsh web
da dan Danish web
de ger German web
el gre Greek web
en eng English web
eo epo Esperanto PubMed
es spa Spanish web
et est Estonian web
eu baq Basque web
fa per Persian web
fi fin Finnish web
fr fre French web
ga gle Irish Gaelic web
gd gla Scottish Gaelic web
gl glg Galician web
gu guj Gujarati web
ha hau Hausa web
he heb Hebrew web
hi hin Hindi web
hr hrv Croatian web
hu hun Hungarian web
hy arm Armenian web
id ind Indonesian web
is ice Icelandic web
it ita Italian web
iu iku Inuktitut web
ja jpn Japanese web
jv jav Javanese web
ka geo Georgian web
kk kaz Kazakh web
kl kal Greenlandic, Kalaallisut web
km khm Central Khmer web
ko kor Korean web
ks kas Kashmiri web
ku kur Kurdish web
la lat Latin ECRIN
lb ltz Luxembourgish web
lo lao Lao web
lt lit Lithuanian web
lv lav Latvian web
mi mao Maori web
mk mac Macedonian web
ml mal Malayalam PubMed
mn mon Mongolian web
ms may Malay web
mt mlt Maltese web
mu* mul Multiple languages PubMed
my bur Burmese web
ne nep Nepali web
nl dut Dutch web
no nor Norwegian web
os oss Ossetian web
pa pan Punjabi web
pl pol Polish web
ps pus Pashto PubMed
pt por Portuguese web
qu que Quechua web
rm roh Romansh web
ro rum Romanian, Moldavian web
ru rus Russian web
rw kin Kinyarwanda PubMed
se sme Northern Sami web
si sin Sinhalese web
sk slo Slovak web
sl slv Slovenian web
sm smo Samoan web
sn sna Shona web
so som Somali web
sq alb Albanian web
sr srp Serbian web
sv swe Swedish web
sw swa Swahili web
ta tam Tamil web
te tel Telugu web
tg tgk Tajik web
th tha Thai web
tk tuk Turkmen web
to ton Tongan web
tr tur Turkish web
tt tat Tatar web
ty tah Tahitian web
uk ukr Ukrainian web
un* und Undetermined PubMed
ur urd Urdu web
uz uzb Uzbek web
vi vie Vietnamese web
xh xho Xhosa web
yo yor Yoruba web
zh chi Chinese web
zu zul Zulu web

*Not an official ISO 639-1 code (no ISO two-letter code exists for this entry)
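
As an illustration of how the table might be used, the following minimal Python sketch converts a three-letter 639-2 / MARC code, as received from a source system such as PubMed, into the two-letter code used in the MDR. It is not taken from the MDR code base: the names `Language`, `LANGUAGES`, `BY_CODE3` and `to_two_letter` are hypothetical, and only a few rows of the table are included.

```python
from typing import NamedTuple, Optional


class Language(NamedTuple):
    code2: str      # two-letter code used in the MDR
    code3: str      # three-letter ISO 639-2 / MARC code
    name: str
    source: str     # "web", "PubMed" or "ECRIN", as in the table
    official: bool  # False for the starred, non-ISO two-letter codes


# Hypothetical excerpt of the table above, not the full list.
LANGUAGES = [
    Language("de", "ger", "German", "web", True),
    Language("en", "eng", "English", "web", True),
    Language("fr", "fre", "French", "web", True),
    Language("la", "lat", "Latin", "ECRIN", True),
    Language("mu", "mul", "Multiple languages", "PubMed", False),
    Language("un", "und", "Undetermined", "PubMed", False),
]

# Index by three-letter code for direct lookup.
BY_CODE3 = {lang.code3: lang for lang in LANGUAGES}


def to_two_letter(code3: str) -> Optional[str]:
    """Return the two-letter code for a three-letter code, or None if unknown."""
    lang = BY_CODE3.get(code3.strip().lower())
    return lang.code2 if lang else None
```

For example, `to_two_letter("ger")` returns `"de"`, while a code that is not in the table returns `None`, so the caller can decide whether to fall back to `un` (Undetermined) or to add the language to the table.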