From The Languages of David J. Peterson

This is the documentation page for Module:Utilities

This module exports various general utility functions, which can be used by other modules.



Escapes the magic characters used in patterns (Lua's version of regular expressions). For example, "^$()%.[]*+-?" becomes "%^%$%(%)%%%.%[%]%*%+%-%?".
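
The escaping described above can be sketched in a few lines of plain Lua. This is an illustration only, not the module's actual source: the name escape_pattern is hypothetical, and it covers only the twelve magic characters listed in the example.

```lua
-- Hypothetical helper (not the module's own code): prefix every Lua
-- pattern magic character with "%" so that it matches literally.
local function escape_pattern(text)
    -- "%%" in the replacement is a literal "%"; "%0" is the whole match.
    -- The outer parentheses drop gsub's second return value (the count).
    return (text:gsub("[%^%$%(%)%%%.%[%]%*%+%-%?]", "%%%0"))
end

-- Usage: search for the literal text "a+b" instead of "one or more a, then b".
local needle = escape_pattern("a+b")
local first, last = ("1 a+b 2"):find(needle)  -- first = 3, last = 5
```

Text containing none of these characters is returned unchanged, so the helper can be applied to any string before it is placed into a pattern.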


format_categories(categories, lang, sort_key, sort_base, force_output)

Formats a list (table) of category names. The output is a string in which each category is wrapped in [[Category:...]] with the given sort key added. If the current page is not in the main, Appendix or Reconstruction namespace, the output is an empty string unless force_output is given. If no sort key is given:

  1. A default sort key is generated from sort_base (if given) or the current subpage name, with any leading hyphens removed (so that suffixes can be sorted without a key).
  2. If sort-key rules are available for the given language, they are then applied to this default to produce a sort key that follows the rules for that language.
  3. If the final sort key ends up being identical to the page name (which is the default sort key used by the software), it is omitted entirely, so that the function can be used in combination with DEFAULTSORT.
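
The behaviour described above can be sketched in standalone Lua as follows. This is a rough illustration under stated assumptions, not the module's implementation: the extra page argument (with namespace, name and subpage fields) and the lang.make_sort_key field are hypothetical stand-ins for what the real function reads from MediaWiki and from the language data.

```lua
-- Sketch only. `page` and `lang.make_sort_key` are hypothetical stand-ins;
-- the real function obtains this information from MediaWiki itself.
local allowed_namespaces = { [""] = true, Appendix = true, Reconstruction = true }

local function format_categories_sketch(categories, lang, sort_key, sort_base, force_output, page)
    -- Outside the allowed namespaces, emit nothing unless forced.
    if not (force_output or allowed_namespaces[page.namespace]) then
        return ""
    end

    if not sort_key then
        -- Step 1: default key from sort_base or the subpage name,
        -- with leading hyphens removed.
        sort_key = (sort_base or page.subpage):gsub("^%-+", "")
        -- Step 2: apply language-specific rules, if any are available.
        if lang and lang.make_sort_key then
            sort_key = lang.make_sort_key(sort_key)
        end
        -- Step 3: a key identical to the page name is redundant; omit it.
        if sort_key == page.name then
            sort_key = nil
        end
    end

    local out = {}
    for i, category in ipairs(categories) do
        if sort_key and sort_key ~= "" then
            out[i] = "[[Category:" .. category .. "|" .. sort_key .. "]]"
        else
            out[i] = "[[Category:" .. category .. "]]"
        end
    end
    return table.concat(out)
end
```

In this sketch, a main-namespace page named "-ness" yields the sort key "ness", while a page named "water" yields no sort key at all, because the generated key would equal the page name.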



This function is used by the {{categorize}}, {{catlangname}} and {{catlangcode}} templates.


This function adds a "catfix", which is used on language-specific category pages to add language attributes and often script classes to all entry names. The addition of language attributes and script classes makes the entry names display better (using the language- or script-specific styles specified in MediaWiki:Common.css), which is particularly important for non-English languages that do not have consistent font support in browsers.

Language attributes are added for all languages, but script classes are only added for languages with exactly one script listed in their data file, or for languages that have a default script listed in the catfix_script list in Module:utilities/data. Some languages clearly have a default script, but still have other scripts listed in their data file and therefore need their default script to be specified. Others do not have a default script. For example:

  • Serbo-Croatian is regularly written in both the Latin and Cyrillic scripts. Because it uses two scripts, Serbo-Croatian cannot have a script class applied to entries in its category pages, as only one script class can be specified at a time.
  • Russian is usually written in the Cyrillic script (Cyrl), but Braille (Brai) is also listed in its data file. So Russian needs an entry in the catfix_script list, so that the Cyrl (Cyrillic) script class will be applied to entries in its category pages.
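
The rule for choosing a script class can be sketched as a small lookup. The tables below are illustrative sample data, not the contents of the real data files: the language codes (sh, ru, fi), the Finnish entry and the function name are assumptions made for this example.

```lua
-- Illustrative sample data only; the real script lists live in the
-- language data files and in Module:utilities/data.
local language_scripts = {
    sh = { "Latn", "Cyrl" },  -- Serbo-Croatian: two scripts, no default
    ru = { "Cyrl", "Brai" },  -- Russian: Cyrillic plus Braille
    fi = { "Latn" },          -- hypothetical single-script language
}
local catfix_script = { ru = "Cyrl" }

-- Hypothetical helper: returns the script code to use as a class,
-- or nil when no single script can be chosen.
local function catfix_script_for(code)
    local scripts = language_scripts[code] or {}
    if #scripts == 1 then
        return scripts[1]       -- only one script listed: use it
    end
    return catfix_script[code]  -- otherwise, only an explicit default
end
```

With this sample data, "fi" resolves to Latn, "ru" resolves to Cyrl through the override list, and "sh" resolves to nil, so no script class would be applied.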

To find the scripts listed for a language, go to Module:languages and use the search box to find the data file for the language. To find out what a script code means, search for the script code in Module:scripts/data.