Languages in the unfoldingWord digital publishing system are identified using Internet Engineering Task Force (IETF) language tags. An IETF tag is an abbreviated language code that follows modern computing standards. It is backward compatible with ISO 639 language codes and adds a standardized means of identifying further information, including language variants and scripts.
In the IETF standard, a language that has a two-letter code in ISO 639-1 is identified by that code, while all other languages use the three-letter “Ethnologue code” (ISO 639-3) where this code exists. A language tag is composed of subtags separated by hyphens. The IETF standard also provides a flexible means of adding new language variants: the subtag “x” (written “-x” inside a tag) marks everything after it as private use (not in the official registry).
These are examples of language tags:
hi
: Hindi language
aaa
: Ghotuo language
en-AU
: English language, as written and spoken in Australia
az-Latn-IR
: Azeri language, written in the Latin script, as used in Iran
ttt-x-ismai
: Tat language, Ismaili variant (for private use only)
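To make the subtag structure concrete, the examples above can be taken apart with a short Python sketch. This is a simplified illustration, not a full implementation of the IETF standard: the function name `parse_language_tag` is hypothetical, and the sketch assumes a tag contains only a language subtag, an optional four-letter script, an optional region, and an optional private use part. Other subtag kinds (such as extended language subtags and registered variants) are ignored.

```python
def parse_language_tag(tag):
    """Split a simple language tag into its subtags.

    Simplified sketch: handles language, script, region, and
    private use ("x") subtags only; anything else is skipped.
    """
    subtags = tag.split("-")
    result = {
        "language": subtags[0].lower(),
        "script": None,
        "region": None,
        "private_use": None,
    }
    rest = subtags[1:]
    for i, sub in enumerate(rest):
        if sub.lower() == "x":
            # Everything after "x" is private use, not in the registry.
            result["private_use"] = "-".join(rest[i + 1:])
            break
        if len(sub) == 4 and sub.isalpha() and result["script"] is None:
            # Four letters: a script subtag, conventionally title case.
            result["script"] = sub.title()
        elif result["region"] is None and (
            (len(sub) == 2 and sub.isalpha()) or (len(sub) == 3 and sub.isdigit())
        ):
            # Two letters or three digits: a region subtag, upper case.
            result["region"] = sub.upper()
    return result


if __name__ == "__main__":
    for example in ["hi", "aaa", "en-AU", "az-Latn-IR", "ttt-x-ismai"]:
        print(example, parse_language_tag(example))
```

Subtags are matched by length and character class here because that is how the position of a subtag can be recognized without consulting the registry; a production system would validate each subtag against the registry instead.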
IETF language tags are used in many protocols, including HTTP (the browser can indicate the user’s language preference to the server, and the server can indicate to the browser the language and script in which the content is served) and XML (through the xml:lang attribute).
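As an illustration of the HTTP case, a browser states its language preferences in the `Accept-Language` request header (the header name is not given in the text above; it comes from the HTTP specification). The following Python sketch shows one way a server might read that header. The function name `parse_accept_language` is hypothetical, and the parsing is deliberately minimal: it assumes each entry is a tag optionally followed by a single `;q=` weight.

```python
def parse_accept_language(header):
    """Return (tag, weight) pairs from an Accept-Language header value,
    ordered from most to least preferred.

    Minimal sketch: assumes entries look like "tag" or "tag;q=0.8".
    """
    preferences = []
    for item in header.split(","):
        item = item.strip()
        if not item:
            continue
        tag, _, params = item.partition(";")
        weight = 1.0  # A missing q value means full preference.
        params = params.strip()
        if params.startswith("q="):
            try:
                weight = float(params[2:])
            except ValueError:
                weight = 0.0  # Treat an unreadable weight as "not wanted".
        preferences.append((tag.strip(), weight))
    # sorted() is stable, so entries with equal weight keep header order.
    return sorted(preferences, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    print(parse_accept_language("en-AU, en;q=0.8, hi;q=0.5"))
```

A server could walk this list in order and serve the first language it has content for, then report its choice back to the browser in the `Content-Language` response header.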
Resources:
- The IETF Wikipedia article.
- The registry of existing language tags.
- A utility to look up tags.
- A searchable language table in translationDatabase.
- Additional information.