Package org.apache.tika.langdetect.tika
Class LanguageProfile
java.lang.Object
org.apache.tika.langdetect.tika.LanguageProfile
Language profile based on ngram counts.
Since: Apache Tika 0.5

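As a rough illustration of what "ngram counts" means, the sketch below slides a fixed-size window over a string and tallies each character ngram. It is a standalone example, not the Tika implementation: the class `NgramWindowSketch` and the method `countNgrams` are hypothetical names, and any preprocessing the real class may apply (padding, case folding, skipping punctuation) is not described on this page and is left out.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper for illustration only; not part of Apache Tika.
final class NgramWindowSketch {

    // Counts every run of `length` consecutive characters in `text`.
    static Map<String, Integer> countNgrams(String text, int length) {
        if (length < 1) {
            throw new IllegalArgumentException("ngram length must be positive");
        }
        Map<String, Integer> counts = new HashMap<>();
        // The last window starts at text.length() - length.
        for (int i = 0; i + length <= text.length(); i++) {
            counts.merge(text.substring(i, i + length), 1, Integer::sum);
        }
        return counts;
    }
}
```

For the input "banana" with length 3, the windows are "ban", "ana", "nan", "ana", so "ana" is counted twice and the other two once each.
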
Field Summary
Fields
Modifier and Type    Field                   Description
static final int     DEFAULT_NGRAM_LENGTH
static boolean       useInterleaved

Constructor Summary
Constructors
Constructor                                    Description
LanguageProfile()
LanguageProfile(int length)
LanguageProfile(String content)
LanguageProfile(String content, int length)

Method Summary
Modifier and Type    Method                            Description
void                 add(String ngram)                 Adds a single occurrence of the given ngram to this profile.
void                 add(String ngram, int count)      Adds multiple occurrences of the given ngram to this profile.
double               distance(LanguageProfile that)    Calculates the geometric distance between this and the given other language profile.
long                 getCount()
long                 getCount(String ngram)
String               toString()

Field Details
DEFAULT_NGRAM_LENGTH
public static final int DEFAULT_NGRAM_LENGTH

useInterleaved
public static boolean useInterleaved

Constructor Details
LanguageProfile
public LanguageProfile(int length)

LanguageProfile
public LanguageProfile()

LanguageProfile
public LanguageProfile(String content, int length)

LanguageProfile
public LanguageProfile(String content)

Method Details
getCount
public long getCount()

getCount
public long getCount(String ngram)

add
public void add(String ngram)
Adds a single occurrence of the given ngram to this profile.
Parameters:
ngram - the ngram

add
public void add(String ngram, int count)
Adds multiple occurrences of the given ngram to this profile.
Parameters:
ngram - the ngram
count - number of occurrences to add

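The two add overloads can be pictured as one counter update, with the single-occurrence form acting as shorthand for a count of 1. The sketch below shows that relationship under the assumption that the profile keeps a per-ngram tally plus a running total; `NgramCounterSketch` is a hypothetical stand-in, not the Tika class, and the real internal data structures may differ.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical counter for illustration only; not the Tika implementation.
final class NgramCounterSketch {
    private final Map<String, Long> counts = new HashMap<>();
    private long total = 0;

    // Single occurrence: delegates to the multi-occurrence overload.
    void add(String ngram) {
        add(ngram, 1);
    }

    // Multiple occurrences: updates the per-ngram tally and the running total.
    void add(String ngram, int count) {
        counts.merge(ngram, (long) count, Long::sum);
        total += count;
    }

    // Total number of ngram occurrences added so far.
    long getCount() {
        return total;
    }

    // Occurrences recorded for one ngram, or 0 if it was never added.
    long getCount(String ngram) {
        return counts.getOrDefault(ngram, 0L);
    }
}
```

Keeping the total alongside the per-ngram counts makes it cheap to turn counts into relative frequencies later, which is what a distance measure between profiles typically needs.
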
distance
public double distance(LanguageProfile that)
Calculates the geometric distance between this and the given other language profile.
Parameters:
that - the other language profile
Returns:
distance between the profiles

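The description does not spell out a formula for "geometric distance". One plausible reading is the Euclidean distance between the two profiles' relative ngram frequencies, and the sketch below assumes that reading. `ProfileDistanceSketch` and its plain count maps are hypothetical; the actual method operates on LanguageProfile instances and may compute the value differently.

```java
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical illustration of a Euclidean profile distance; not the Tika code.
final class ProfileDistanceSketch {

    // Distance between two ngram count maps, each normalized to relative frequencies.
    static double distance(Map<String, Long> a, Map<String, Long> b) {
        // Guard against division by zero for empty profiles.
        double totalA = Math.max(sum(a), 1.0);
        double totalB = Math.max(sum(b), 1.0);

        // Compare over the union of ngrams seen in either profile.
        Set<String> ngrams = new HashSet<>(a.keySet());
        ngrams.addAll(b.keySet());

        double sumOfSquares = 0.0;
        for (String ngram : ngrams) {
            double freqA = a.getOrDefault(ngram, 0L) / totalA;
            double freqB = b.getOrDefault(ngram, 0L) / totalB;
            double difference = freqA - freqB;
            sumOfSquares += difference * difference;
        }
        return Math.sqrt(sumOfSquares);
    }

    private static long sum(Map<String, Long> counts) {
        long total = 0;
        for (long c : counts.values()) {
            total += c;
        }
        return total;
    }
}
```

Because the counts are normalized first, two profiles with the same proportions but different sizes come out at distance 0, so a short sample can still be compared against a profile built from a much larger corpus.
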
toString
public String toString()