Class LanguageNames

java.lang.Object
org.apache.tika.language.detect.LanguageNames

public class LanguageNames extends Object
Support for language tags, as defined by BCP 47 (https://tools.ietf.org/html/bcp47).

See https://en.wikipedia.org/wiki/List_of_ISO_639-3_codes for a list of three-letter language codes.

TODO: change to LanguageTag, and use these instead of strings everywhere in the language detector API?
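
To make the BCP 47 tag structure concrete, here is a minimal sketch that uses only the JDK's java.util.Locale. The class LanguageTagSketch and its methods makeTag and normalizeTag are hypothetical names chosen for illustration. They show one way to compose and canonicalize a tag, and are not necessarily how makeName or normalizeName behave in this class.

```java
import java.util.Locale;

// Hypothetical illustration class, not part of Tika.
final class LanguageTagSketch {

    // Builds a BCP 47 tag from its subtags; an empty string means "subtag absent".
    static String makeTag(String language, String script, String region) {
        return new Locale.Builder()
                .setLanguage(language)
                .setScript(script)
                .setRegion(region)
                .build()
                .toLanguageTag();
    }

    // Round-trips a tag through Locale to obtain canonical casing.
    static String normalizeTag(String languageTag) {
        return Locale.forLanguageTag(languageTag).toLanguageTag();
    }

    public static void main(String[] args) {
        System.out.println(makeTag("zh", "Hant", "TW")); // language + script + region
        System.out.println(makeTag("en", "", "US"));     // language + region only
        System.out.println(normalizeTag("EN-us"));       // casing is canonicalized
    }
}
```

The subtags appear in the order language, script, region, separated by hyphens, which matches the parameter order of makeName.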

  • Constructor Details

    • LanguageNames

      public LanguageNames()
  • Method Details

    • makeName

      public static String makeName(String language, String script, String region)
    • normalizeName

      public static String normalizeName(String languageTag)
    • isMacroLanguage

      public static boolean isMacroLanguage(String languageTag)
    • hasMacroLanguage

      public static boolean hasMacroLanguage(String languageTag)
    • getMacroLanguage

      public static String getMacroLanguage(String languageTag)
      If the language is a specific variant of a macro language (e.g. 'nb' for Norwegian Bokmål), returns the macro language (e.g. 'no' for Norwegian). If it has no macro language, returns the tag unchanged.
      Parameters:
      languageTag - the language tag to look up
      Returns:
      the macro language tag, or languageTag unchanged if it has no macro language
    • equals

      public static boolean equals(String languageTagA, String languageTagB)
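
The macro-language lookup described for getMacroLanguage can be sketched as a simple map lookup with a fallback to the input. The class MacroLanguageSketch and its two-entry table are assumptions made for illustration only. The real class may use a different, larger table and may handle full tags with script or region subtags, and the reading of hasMacroLanguage shown here is inferred from the method name alone.

```java
import java.util.Map;

// Hypothetical illustration class, not part of Tika.
final class MacroLanguageSketch {

    // Hand-picked sample: Norwegian Bokmal (nb) and Nynorsk (nn) both belong
    // to the macro language Norwegian (no). A real table would be far larger.
    private static final Map<String, String> MACRO_OF = Map.of(
            "nb", "no",
            "nn", "no");

    // Assumed meaning: true if the tag is a member of some macro language.
    static boolean hasMacroLanguage(String languageTag) {
        return MACRO_OF.containsKey(languageTag);
    }

    // Returns the macro language, or the input unchanged if there is none.
    static String getMacroLanguage(String languageTag) {
        return MACRO_OF.getOrDefault(languageTag, languageTag);
    }

    public static void main(String[] args) {
        System.out.println(getMacroLanguage("nb")); // mapped to its macro language
        System.out.println(getMacroLanguage("fr")); // no entry, returned as-is
    }
}
```

Returning the input unchanged when no mapping exists lets callers apply the lookup to every tag without first checking hasMacroLanguage.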