org.apache.tika.language
Class LanguageProfile

java.lang.Object
  extended by org.apache.tika.language.LanguageProfile

public class LanguageProfile
extends java.lang.Object

Language profile based on ngram counts.
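
To make "ngram counts" concrete, here is a minimal sketch of counting fixed-length character ngrams in a string. It is an illustration only: the class NgramCountSketch and its countNgrams method are hypothetical names that are not part of Tika, and the plain sliding window used here is an assumption, since this page does not describe how the real class tokenizes its input.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration only; not part of org.apache.tika.language.
final class NgramCountSketch {

    // Counts every substring of the given length using a plain sliding window.
    // Assumption: no case folding or punctuation handling is applied here.
    static Map<String, Long> countNgrams(String content, int length) {
        if (length <= 0) {
            throw new IllegalArgumentException("length must be positive");
        }
        // LinkedHashMap keeps first-seen order so the printed result is stable.
        Map<String, Long> counts = new LinkedHashMap<>();
        for (int i = 0; i + length <= content.length(); i++) {
            counts.merge(content.substring(i, i + length), 1L, Long::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        // "banana" with length 3 yields the windows ban, ana, nan, ana.
        System.out.println(countNgrams("banana", 3)); // prints {ban=1, ana=2, nan=1}
    }
}
```

Input shorter than the ngram length produces an empty map in this sketch.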

Since:
Apache Tika 0.5

Field Summary
static int DEFAULT_NGRAM_LENGTH
           
 
Constructor Summary
LanguageProfile()
           
LanguageProfile(int length)
           
LanguageProfile(java.lang.String content)
           
LanguageProfile(java.lang.String content, int length)
           
 
Method Summary
 void add(java.lang.String ngram)
          Adds a single occurrence of the given ngram to this profile.
 void add(java.lang.String ngram, long count)
          Adds multiple occurrences of the given ngram to this profile.
 double distance(LanguageProfile that)
          Calculates the geometric distance between this profile and the given language profile.
 long getCount()
           
 long getCount(java.lang.String ngram)
           
 java.lang.String toString()
           
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
 

Field Detail

DEFAULT_NGRAM_LENGTH

public static final int DEFAULT_NGRAM_LENGTH
See Also:
Constant Field Values

Constructor Detail

LanguageProfile

public LanguageProfile(int length)

LanguageProfile

public LanguageProfile()

LanguageProfile

public LanguageProfile(java.lang.String content,
                       int length)

LanguageProfile

public LanguageProfile(java.lang.String content)

Method Detail

getCount

public long getCount()

getCount

public long getCount(java.lang.String ngram)

add

public void add(java.lang.String ngram)
Adds a single occurrence of the given ngram to this profile.

Parameters:
ngram - the ngram

add

public void add(java.lang.String ngram,
                long count)
Adds multiple occurrences of the given ngram to this profile.

Parameters:
ngram - the ngram
count - number of occurrences to add
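
The two add methods and the two getCount methods can be pictured with a small stand-in class. This is a sketch under stated assumptions, not the Tika source: ProfileSketch is a hypothetical name, and it assumes that getCount() reports the total number of occurrences added and that getCount(ngram) reports 0 for an ngram never added. This page does not document either behavior.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in, not the Tika class: it only mirrors the method
// names listed on this page to show how the counts could fit together.
final class ProfileSketch {
    private final Map<String, Long> ngrams = new HashMap<>();
    private long total = 0; // sum of all occurrences added so far

    // Adds a single occurrence by delegating to the two-argument form.
    void add(String ngram) {
        add(ngram, 1);
    }

    // Adds several occurrences at once and keeps the running total in step.
    void add(String ngram, long count) {
        ngrams.merge(ngram, count, Long::sum);
        total += count;
    }

    // Assumption: the no-argument form reports the total over all ngrams.
    long getCount() {
        return total;
    }

    // Assumption: an ngram that was never added has a count of zero.
    long getCount(String ngram) {
        return ngrams.getOrDefault(ngram, 0L);
    }

    public static void main(String[] args) {
        ProfileSketch p = new ProfileSketch();
        p.add("the");
        p.add("the", 4);
        p.add("and", 2);
        System.out.println(p.getCount("the") + " " + p.getCount("and") + " " + p.getCount()); // prints 5 2 7
    }
}
```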

distance

public double distance(LanguageProfile that)
Calculates the geometric distance between this profile and the given language profile.

Parameters:
that - the other language profile
Returns:
distance between the profiles
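
One plausible reading of "geometric distance" is the Euclidean distance between the two profiles' ngram frequency vectors. The sketch below follows that reading. It is an assumption, not a statement of how Tika computes the value: DistanceSketch is a hypothetical helper, the profiles are represented as plain ngram-to-count maps, and the normalization to relative frequencies is a design choice made here so that texts of different sizes stay comparable.

```java
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical helper, not the Tika implementation.
final class DistanceSketch {

    // Euclidean distance between two profiles, each given as ngram -> count.
    // Assumption: counts are normalized to relative frequencies first.
    static double distance(Map<String, Long> a, Map<String, Long> b) {
        // Guard against division by zero for empty profiles.
        double totalA = Math.max(sum(a), 1);
        double totalB = Math.max(sum(b), 1);

        // Compare over the union of ngrams seen in either profile.
        Set<String> keys = new HashSet<>(a.keySet());
        keys.addAll(b.keySet());

        double sumOfSquares = 0.0;
        for (String k : keys) {
            double d = a.getOrDefault(k, 0L) / totalA - b.getOrDefault(k, 0L) / totalB;
            sumOfSquares += d * d;
        }
        return Math.sqrt(sumOfSquares);
    }

    private static long sum(Map<String, Long> m) {
        long s = 0;
        for (long v : m.values()) {
            s += v;
        }
        return s;
    }

    public static void main(String[] args) {
        Map<String, Long> x = Map.of("the", 3L, "and", 1L);
        Map<String, Long> y = Map.of("the", 6L, "and", 2L);
        Map<String, Long> z = Map.of("der", 4L);
        System.out.println(distance(x, y)); // same proportions, prints 0.0
        System.out.println(distance(x, z)); // no ngrams in common, larger value
    }
}
```

Under this reading, a smaller value means the two profiles have more similar ngram distributions, and two profiles with identical proportions are at distance zero regardless of text size.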

toString

public java.lang.String toString()
Overrides:
toString in class java.lang.Object


Copyright © 2010 The Apache Software Foundation. All Rights Reserved.