Package org.apache.tika.language
Class LanguageIdentifier
- java.lang.Object
-
- org.apache.tika.language.LanguageIdentifier
-
@Deprecated public class LanguageIdentifier extends Object
Deprecated. Use a concrete class of LanguageDetector.
Identifier of the language that best matches a given content profile. The content profile is compared to generic language profiles based on material from various sources.
- Since:
- Apache Tika 0.5
- See Also:
- Europarl: A Parallel Corpus for Statistical Machine Translation, ISO 639 Language Codes
-
-
Constructor Summary
Constructors
Constructor Description
LanguageIdentifier(String content)
Deprecated. Constructs a language identifier based on a String of text content
LanguageIdentifier(LanguageProfile profile)
Deprecated. Constructs a language identifier based on a LanguageProfile
-
Method Summary
Modifier and Type Method Description
static void addProfile(String language, LanguageProfile profile)
Deprecated. Adds a single language profile
static void clearProfiles()
Deprecated. Clears the current map of language profiles
static String getErrors()
Deprecated. Returns a string of error messages related to initializing language profiles
String getLanguage()
Deprecated. Gets the identified language
static Set<String> getSupportedLanguages()
Deprecated. Returns what languages are supported for language identification
static boolean hasErrors()
Deprecated. Tests whether there were errors initializing language config
static void initProfiles()
Deprecated. Builds the language profiles.
static void initProfiles(Map<String,LanguageProfile> profilesMap)
Deprecated. Initializes the language profiles from a user supplied initialized Map.
boolean isReasonablyCertain()
Deprecated. Tries to judge whether the identification is certain enough to be trusted.
String toString()
Deprecated.
-
-
-
Constructor Detail
-
LanguageIdentifier
public LanguageIdentifier(LanguageProfile profile)
Deprecated. Constructs a language identifier based on a LanguageProfile
- Parameters:
profile - the language profile
-
LanguageIdentifier
public LanguageIdentifier(String content)
Deprecated. Constructs a language identifier based on a String of text content
- Parameters:
content - the text
-
-
Method Detail
-
addProfile
public static void addProfile(String language, LanguageProfile profile)
Deprecated. Adds a single language profile
- Parameters:
language - an ISO 639 code representing the language
profile - the language profile
-
getLanguage
public String getLanguage()
Deprecated. Gets the identified language
- Returns:
- an ISO 639 code representing the detected language
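Because getLanguage() returns a bare ISO 639 code, callers often want a readable name for display. The sketch below uses only the JDK's java.util.Locale to do that mapping; the class LanguageNameSketch and its displayName method are hypothetical helpers for illustration and are not part of Tika.

```java
import java.util.Locale;

// Hypothetical helper, not part of Tika: turns an ISO 639 code such as the
// one returned by getLanguage() into an English display name.
final class LanguageNameSketch {

    // Looks up the English name of a language from its ISO 639 code.
    static String displayName(String isoCode) {
        return Locale.forLanguageTag(isoCode).getDisplayLanguage(Locale.ENGLISH);
    }

    public static void main(String[] args) {
        System.out.println(displayName("en")); // English
        System.out.println(displayName("de")); // German
    }
}
```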
-
isReasonablyCertain
public boolean isReasonablyCertain()
Deprecated. Tries to judge whether the identification is certain enough to be trusted. WARNING: Will never return true for small amounts of input text.
- Returns:
true if the distance is smaller than 0.022, false otherwise
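The documented rule is a strict threshold on a distance value. The following is a minimal sketch of that rule and of how a caller might fall back when the identification is not trusted. The class CertaintySketch, its method names, and the sample distances are assumptions made for illustration; only the 0.022 threshold comes from the description above, and how the distance itself is computed is not shown here.

```java
// Hypothetical stand-in for the certainty rule described above; not Tika's source.
final class CertaintySketch {

    // Threshold taken from the documentation of isReasonablyCertain().
    static final double CERTAINTY_THRESHOLD = 0.022;

    // True only when the distance is strictly smaller than the threshold.
    static boolean isReasonablyCertain(double distance) {
        return distance < CERTAINTY_THRESHOLD;
    }

    // Returns the detected code when it is trusted, otherwise the fallback.
    static String languageOrFallback(String detected, double distance, String fallback) {
        return isReasonablyCertain(distance) ? detected : fallback;
    }

    public static void main(String[] args) {
        // The distances below are made-up sample values.
        System.out.println(languageOrFallback("en", 0.010, "unknown")); // en
        System.out.println(languageOrFallback("en", 0.050, "unknown")); // unknown
    }
}
```

A caller following this pattern never acts on a language code for short inputs, which matches the warning that small amounts of text are never judged certain.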
-
initProfiles
public static void initProfiles()
Deprecated. Builds the language profiles. The list of languages is fetched from a property file named "tika.language.properties". If a file called "tika.language.override.properties" is found on the classpath, it is used instead. The property file contains a key "languages" whose value is a comma-separated list of language codes.
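The property file format described above can be illustrated with a short sketch that parses the "languages" key and applies the override-instead-of-default precedence. This is not Tika's implementation: the class LanguageListSketch, its methods, the whitespace trimming, and the sample language codes are all assumptions, and the file contents are passed in as strings rather than loaded from the classpath.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;

// Hypothetical helper, not part of Tika: reads the "languages" key described above.
final class LanguageListSketch {

    // Parses the comma-separated "languages" value from properties text.
    static List<String> parseLanguages(String propertiesText) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(propertiesText));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        List<String> codes = new ArrayList<>();
        for (String code : props.getProperty("languages", "").split(",")) {
            String trimmed = code.trim(); // trimming is an assumption of this sketch
            if (!trimmed.isEmpty()) {
                codes.add(trimmed);
            }
        }
        return codes;
    }

    // Documented precedence: when override text exists, it is used instead of the default.
    static List<String> resolve(String defaultText, String overrideText) {
        return parseLanguages(overrideText != null ? overrideText : defaultText);
    }

    public static void main(String[] args) {
        String defaults = "languages=da,de,en,fr"; // sample codes only
        String override = "languages = en, sv";    // sample codes only
        System.out.println(resolve(defaults, null));     // [da, de, en, fr]
        System.out.println(resolve(defaults, override)); // [en, sv]
    }
}
```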
-
initProfiles
public static void initProfiles(Map<String,LanguageProfile> profilesMap)
Deprecated. Initializes the language profiles from a user-supplied, initialized Map. This overrides the default set of profiles initialized at startup and provides an alternative to configuring profiles through a property file.
- Parameters:
profilesMap - map of language profiles
-
clearProfiles
public static void clearProfiles()
Deprecated. Clears the current map of language profiles
-
hasErrors
public static boolean hasErrors()
Deprecated. Tests whether there were errors initializing language config
- Returns:
- true if there are errors; use getErrors() to retrieve them
-
getErrors
public static String getErrors()
Deprecated. Returns a string of error messages related to initializing language profiles
- Returns:
- the String containing the error messages
-
getSupportedLanguages
public static Set<String> getSupportedLanguages()
Deprecated. Returns what languages are supported for language identification
- Returns:
- a set of Strings containing the ISO 639 language codes
-
-