Package org.apache.tika.language.detect
Class LanguageNames
java.lang.Object
org.apache.tika.language.detect.LanguageNames
Support for language tags as defined by BCP 47 (https://tools.ietf.org/html/bcp47).
See https://en.wikipedia.org/wiki/List_of_ISO_639-3_codes for a list of three-letter language codes.
TODO: change to LanguageTag, and use these instead of strings everywhere in the language detector API?
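To make the tag format concrete, here is a minimal sketch of how a BCP 47 tag breaks down into language, script, and region subtags. It uses only the JDK's java.util.Locale and does not call LanguageNames at all; the class name LanguageTagSketch and its helper methods are hypothetical names chosen for this illustration.

```java
import java.util.Locale;

// Hypothetical helper class for illustration only; not part of Tika.
class LanguageTagSketch {

    // Primary language subtag, e.g. "zh" for "zh-Hant-TW".
    static String language(String languageTag) {
        return Locale.forLanguageTag(languageTag).getLanguage();
    }

    // Script subtag, e.g. "Hant"; empty string if the tag has none.
    static String script(String languageTag) {
        return Locale.forLanguageTag(languageTag).getScript();
    }

    // Region subtag, e.g. "TW"; empty string if the tag has none.
    static String region(String languageTag) {
        return Locale.forLanguageTag(languageTag).getCountry();
    }

    public static void main(String[] args) {
        String tag = "zh-Hant-TW";
        System.out.println(language(tag)); // prints "zh"
        System.out.println(script(tag));   // prints "Hant"
        System.out.println(region(tag));   // prints "TW"
    }
}
```

A bare language code such as "nb" is also a valid tag; in that case the script and region come back as empty strings.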
Constructor Summary
LanguageNames()
Method Summary
Modifier and Type   Method                                 Description
static boolean      equals
static String       getMacroLanguage(String languageTag)   If language is a specific variant of a macro language (e.g. 'nb' for Norwegian Bokmål), return the macro language (e.g. 'no' for Norwegian).
static boolean      hasMacroLanguage(String languageTag)
static boolean      isMacroLanguage(String languageTag)
static String       makeName
static String       normalizeName(String languageTag)
Constructor Details
LanguageNames
public LanguageNames()
Method Details
makeName
normalizeName
isMacroLanguage
hasMacroLanguage
getMacroLanguage
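If language is a specific variant of a macro language (e.g. 'nb' for Norwegian Bokmål), return the macro language (e.g. 'no' for Norwegian). If it doesn't have a macro language, return it unchanged.
Parameters:
languageTag - the language tag to look up
Returns:
the macro language for languageTag, or languageTag unchanged if it has no macro language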
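The contract described above can be sketched in a few lines. This is a standalone illustration of the documented behavior, not Tika's implementation: the class MacroLanguageSketch and its two-entry lookup table are assumptions made for this example, and a real mapping would need the full ISO 639-3 macrolanguage data.

```java
import java.util.Map;

// Hypothetical standalone sketch of the documented contract; not Tika's code.
class MacroLanguageSketch {

    // Assumed, hand-written sample mapping from individual language to macro language.
    // Only the two Norwegian written standards are listed here.
    private static final Map<String, String> MACRO_BY_LANGUAGE = Map.of(
            "nb", "no",  // Norwegian Bokmål -> Norwegian
            "nn", "no"); // Norwegian Nynorsk -> Norwegian

    // Returns the macro language if one is known, otherwise the tag unchanged.
    static String getMacroLanguage(String languageTag) {
        return MACRO_BY_LANGUAGE.getOrDefault(languageTag, languageTag);
    }

    // True if the tag is a known member of some macro language.
    static boolean hasMacroLanguage(String languageTag) {
        return MACRO_BY_LANGUAGE.containsKey(languageTag);
    }

    public static void main(String[] args) {
        System.out.println(getMacroLanguage("nb")); // prints "no"
        System.out.println(getMacroLanguage("en")); // prints "en"
        System.out.println(hasMacroLanguage("nb")); // prints "true"
    }
}
```

Returning the input unchanged, rather than null or an exception, lets callers apply the method to every detected tag without checking hasMacroLanguage first.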
equals