Project:Adding Languages

From The Languages of David J. Peterson

Adding Languages to the DJP Wiki

Add the language code

Each language added to this wiki needs a unique four-character code. The code is how the language is referenced in many templates, as well as in some special pages. These codes are analogous to the two- and three-letter codes used in Wiktionary and elsewhere on the internet. Usually the code is the first four letters of the language's name, but that is not a strict rule. For example, the code for High Valyrian is hval.
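The usual naming habit can be sketched in Lua, the language the data modules are written in. The helper suggest_code is hypothetical and is not part of the wiki's modules; it assumes a name written in ASCII letters and only offers a starting point, since the final code is chosen by hand.

```lua
-- Hypothetical helper: suggest a default four-character code from a language name.
-- Assumes ASCII letters only; it is not part of the wiki's modules.
local function suggest_code(name)
	-- Drop everything that is not an ASCII letter (spaces, apostrophes, and so on).
	local letters = name:gsub("%A", "")
	-- Lowercase the result and keep the first four letters.
	return letters:lower():sub(1, 4)
end

print(suggest_code("G'Vunna")) -- gvun
```

A name such as High Valyrian would come out as high under this sketch, which shows why the first-four-letters habit is only a default and the code is ultimately picked by hand.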

Once the code is determined, it needs to be added to the language list at Module:Languages/data/djp. This page is solely for languages documented by The Languages of David J. Peterson. The entry has to be in a specific format, with the code at the top followed by a number of values. Detailed documentation on what can go in a language entry can be found on the page itself; what follows is a summary.

Here's an example:

m["gvun"] = {
	"G'Vunna",
	"Q999999018",
	"atha",
	"Gvoz,Latn",
	ancestors = {"veda"},
	case_insensitive = true,
	sort_key = {
		from = {"[äàáâå]", "Ǝ", "[ëèéêǝ]", "[ïìíî]", "[öòóô]", "[üùúû]"},
		to   = {"a", "E", "e", "i", "o", "u"},
	},
	standardChars = s["default-chars"].."ÖöÜüƎǝ" .. c.punc,
}

Don't forget the commas after each line in the entry!

Each entry has four unlabeled values, which must appear in this order:

  • Canonical name (required). This is the 'official' name of the language. It should be in quotes.
  • Wikidata item ID (required). The item ID is left over from the Wiktionary code we borrowed, so you don't need to work one out, but something must be there. You can just put nil.
  • Language family (optional). This is a value from one of the pages in Module:Families/data or Module:Families/data/djp. If this doesn't apply, you can put nil or "x".
  • Scripts (optional). A list of scripts that apply to the language, using the scripts in Module:Scripts/data or Module:Scripts/data/djp. This should be in quotes. If there is more than one script, separate them with commas. If you do include this, you should list at least Latn for the Latin/Roman alphabet, and make sure you put in something for Language Family before it even if it is nil.
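The ordering rule can be illustrated with a minimal entry. This is only a sketch: the code xmpl and the name Example are made-up placeholders, and the local table m stands in for the table that the data page itself provides.

```lua
-- Stand-in for the table that the data page fills in.
local m = {}

-- "xmpl" and "Example" are made-up placeholders, not a real entry.
m["xmpl"] = {
	"Example", -- 1: canonical name (required)
	nil,       -- 2: Wikidata item ID (something must be here; nil is fine)
	nil,       -- 3: language family (placeholder so the scripts stay in slot 4)
	"Latn",    -- 4: scripts, comma-separated inside one string
}

print(m["xmpl"][1], m["xmpl"][4])
```

Because the values are positional, leaving out the nil for the language family would shift the scripts string into the third slot, where it would be read as a family.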

After these, there are a number of labeled values, all of which are optional. Each one consists of the name of the value, an equals sign (=), and the desired value. They can be added in any order after the four unlabeled ones.

Common labeled values are:

  • ancestors. Identifies the ancestor languages, by language code. The codes are listed as separate values in a table.
For example: ancestors = {"veda","en","etc"},
  • standardChars. These are the characters that are considered standard for the romanization of the language. Any character in a word that isn't in the language's standard characters gets called out into a category. In many entries, you'll see several things strung together with .., which joins multiple strings into one. You can use s["default-chars"] to mean the usual 26 English letters (a-z and A-Z). It also includes numerals (0-9). Put other characters in a string inside quotes. You'll usually see c.punc at the end, to signify that standard punctuation isn't supposed to be called out.
  • case_insensitive. If this is true, the case doesn't matter. This is also the default value if it isn't included.
  • sort_key. This is used to equate certain characters with others when it comes to sorting in lists and categories. For example, you may want versions of 'e' with diacritics to be sorted with the e without diacritics. Look at the existing entries in the data page and the detailed instructions for more on that.
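To show what the .. operator does in a standardChars value, here is a small sketch. The tables s and c below are shortened stand-ins invented for illustration; the real s["default-chars"] and c.punc on the data page hold more characters.

```lua
-- Stand-ins for the shared tables on the data page; the real contents differ.
local s = { ["default-chars"] = "abcABC012" } -- shortened for illustration
local c = { punc = " .,!?" }                  -- shortened for illustration

-- ".." joins strings, so the result is one string of allowed characters.
local standardChars = s["default-chars"] .. "ÖöÜüƎǝ" .. c.punc

print(standardChars) -- abcABC012ÖöÜüƎǝ .,!?
```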

Add extended information to the language

Additional information on the language that is accessed less often is kept on a separate page. For the languages documented on this wiki, that page is Module:Languages/data/djp/extra. This information is optional: if the language doesn't need it, you don't have to add anything to the page.

Here's an example from the same language:

m["gvun"] = {
	aliases = {"G'Vunnǝ", "Lokheim", "Gvunna", "Gvunnǝ"},
}

Right now, we only use the labeled value aliases for other known names for the language. As seen above, this is a labeled value with a table of strings in quotes.

Add language to name tables

There are two pages that help connect languages to their language codes. The language information will need to be added to both pages.

Module:Languages/canonical names

Insert a line into the page that looks like this example (substitute your language's code and name, of course).

      ["G'Vunna"] = "gvun",

Add the entry in the appropriate place in alphabetical order, and remember the comma at the end of the line.

Module:Languages/code to canonical name

Insert a line into the page that looks like this example (substitute your language's code and name, of course).

      ["gvun"] = "G'Vunna",

Add the entry in the appropriate place in alphabetical order, and remember the comma at the end of the line.
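Since the two name tables have to mirror each other, a quick consistency check can be sketched as follows. The function tables_agree is hypothetical, and the two local tables are stand-ins that hold only the example entry from this page.

```lua
-- Stand-ins for the two name tables, each holding the example entry from this page.
local canonical_names = { ["G'Vunna"] = "gvun" }
local code_to_canonical_name = { ["gvun"] = "G'Vunna" }

-- Hypothetical check: every name must map to a code that maps back to the same
-- name, and every code must map to a name that maps back to the same code.
local function tables_agree(names, codes)
	for name, code in pairs(names) do
		if codes[code] ~= name then
			return false
		end
	end
	for code, name in pairs(codes) do
		if names[name] ~= code then
			return false
		end
	end
	return true
end

print(tables_agree(canonical_names, code_to_canonical_name)) -- true
```

If an entry is added to only one of the two pages, a check along these lines would return false.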

Create basic pages and categories for language

Every language needs a few basic pages, which you will have to create. Note that NAME in the examples should be replaced with the canonical name of the language.

NAME Language

This is the basic page for the language. Links on the main page will link to this page. You'll want to give a brief description of the language and some basic links, such as a link to the vocabulary page.

NAME Vocabulary

This is the basic vocabulary page. In most cases, you will want this to redirect to the lemmas page for the language. Just put the following for the whole page, replacing NAME with the canonical name of the language:

#REDIRECT [[Category:NAME lemmas]]

Category:NAME Language

This is the category for your language. In here you only have to put the following:

{{auto cat}}

Category:NAME lemmas

This will be the category page for your basic, uninflected words (i.e. lemmas). In here you only have to put the following:

{{auto cat}}
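As a checklist, the four pages and their starting contents can be generated from the canonical name. The helper basic_pages is hypothetical and only builds strings; it does not create anything on the wiki.

```lua
-- Hypothetical helper: list the pages this section asks for, with starting text.
-- It only builds strings and does not touch the wiki.
local function basic_pages(name)
	return {
		{ title = name .. " Language", text = "(brief description and links, written by hand)" },
		{ title = name .. " Vocabulary", text = "#REDIRECT [[Category:" .. name .. " lemmas]]" },
		{ title = "Category:" .. name .. " Language", text = "{{auto cat}}" },
		{ title = "Category:" .. name .. " lemmas", text = "{{auto cat}}" },
	}
end

-- Print one line per page for the example language used on this page.
for _, page in ipairs(basic_pages("G'Vunna")) do
	print(page.title .. ": " .. page.text)
end
```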

Add Scripts

If there are scripts associated with the language that have not been added already, those should be added now. See Project:Adding Scripts for details.