org.apache.tika.language
Class LanguageProfile
java.lang.Object
org.apache.tika.language.LanguageProfile
public class LanguageProfile
- extends java.lang.Object
Language profile based on ngram counts.
- Since:
- Apache Tika 0.5
Method Summary
void add(java.lang.String ngram)
- Adds a single occurrence of the given ngram to this profile.
void add(java.lang.String ngram, long count)
- Adds multiple occurrences of the given ngram to this profile.
double distance(LanguageProfile that)
- Calculates the geometric distance between this and the given other language profile.
long getCount()
long getCount(java.lang.String ngram)
java.lang.String toString()
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
DEFAULT_NGRAM_LENGTH
public static final int DEFAULT_NGRAM_LENGTH
- See Also:
- Constant Field Values
LanguageProfile
public LanguageProfile(int length)
LanguageProfile
public LanguageProfile()
LanguageProfile
public LanguageProfile(java.lang.String content, int length)
LanguageProfile
public LanguageProfile(java.lang.String content)
getCount
public long getCount()
getCount
public long getCount(java.lang.String ngram)
add
public void add(java.lang.String ngram)
- Adds a single occurrence of the given ngram to this profile.
- Parameters:
ngram
- the ngram
add
public void add(java.lang.String ngram, long count)
- Adds multiple occurrences of the given ngram to this profile.
- Parameters:
ngram
- the ngram
count
- number of occurrences to add
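The counting behaviour described for the two add methods can be sketched with a plain map. This is a minimal, self-contained illustration and not the Tika implementation: the class name NgramCounterSketch is hypothetical, and the meanings given to getCount() (total occurrences) and getCount(String) (occurrences of one ngram) are assumptions, since this page does not document them.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in used for illustration only; not org.apache.tika.language.LanguageProfile.
class NgramCounterSketch {
    private final Map<String, Long> counts = new HashMap<>();
    private long total = 0;

    // Mirrors add(String): records a single occurrence.
    void add(String ngram) {
        add(ngram, 1);
    }

    // Mirrors add(String, long): records several occurrences at once.
    void add(String ngram, long count) {
        counts.merge(ngram, count, Long::sum);
        total += count;
    }

    // Assumption: getCount() reports the total number of occurrences added.
    long getCount() {
        return total;
    }

    // Assumption: getCount(String) reports occurrences of one ngram, 0 if never added.
    long getCount(String ngram) {
        return counts.getOrDefault(ngram, 0L);
    }

    public static void main(String[] args) {
        NgramCounterSketch profile = new NgramCounterSketch();
        profile.add("the");
        profile.add("the");
        profile.add("and", 3);
        System.out.println(profile.getCount("the")); // prints 2
        System.out.println(profile.getCount("xyz")); // prints 0
        System.out.println(profile.getCount());      // prints 5
    }
}
```

Keeping a running total alongside the per-ngram map avoids re-summing the map whenever relative frequencies are needed.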
distance
public double distance(LanguageProfile that)
- Calculates the geometric distance between this and the given other language profile.
- Parameters:
that
- the other language profile
- Returns:
- distance between the profiles
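One plausible reading of "geometric distance" is the Euclidean distance between the two profiles' relative ngram frequencies. The sketch below works under that assumption only; this page does not state the formula. The class ProfileDistanceSketch and its helper methods are hypothetical, and profiles are represented as plain maps from ngram to count.

```java
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical illustration; the formula is an assumption, not confirmed by this page.
class ProfileDistanceSketch {

    // Sums all occurrence counts in a profile.
    static long total(Map<String, Long> profile) {
        long sum = 0;
        for (long value : profile.values()) {
            sum += value;
        }
        return sum;
    }

    // Euclidean distance between relative frequencies of two ngram count maps.
    static double distance(Map<String, Long> a, Map<String, Long> b) {
        // Guard against division by zero for empty profiles.
        double totalA = Math.max(total(a), 1);
        double totalB = Math.max(total(b), 1);

        // Compare over the union of ngrams seen in either profile.
        Set<String> ngrams = new HashSet<>(a.keySet());
        ngrams.addAll(b.keySet());

        double sumOfSquares = 0.0;
        for (String ngram : ngrams) {
            double freqA = a.getOrDefault(ngram, 0L) / totalA;
            double freqB = b.getOrDefault(ngram, 0L) / totalB;
            double diff = freqA - freqB;
            sumOfSquares += diff * diff;
        }
        return Math.sqrt(sumOfSquares);
    }

    public static void main(String[] args) {
        Map<String, Long> first = Map.of("abc", 2L, "bcd", 2L);
        Map<String, Long> second = Map.of("abc", 1L, "bcd", 1L);
        Map<String, Long> third = Map.of("xyz", 4L);

        // Same relative frequencies, so the distance is zero.
        System.out.println(distance(first, second)); // prints 0.0
        // No ngrams in common, so the distance is larger.
        System.out.println(distance(first, third));
    }
}
```

Because counts are normalised by each profile's total, a short text and a long text with the same ngram mix end up at distance zero under this reading.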
toString
public java.lang.String toString()
- Overrides:
toString
in class java.lang.Object
Copyright © 2007-2011 The Apache Software Foundation. All Rights Reserved.