Class LanguageProfile


  • public class LanguageProfile
    extends Object
    Language profile based on ngram counts.
    Since:
    Apache Tika 0.5
    • Field Detail

      • useInterleaved

        public static boolean useInterleaved
    • Constructor Detail

      • LanguageProfile

        public LanguageProfile(int length)
      • LanguageProfile

        public LanguageProfile()
      • LanguageProfile

        public LanguageProfile(String content,
                               int length)
      • LanguageProfile

        public LanguageProfile(String content)
    • Method Detail

      • getCount

        public long getCount()
      • getCount

        public long getCount(String ngram)
      • add

        public void add(String ngram)
        Adds a single occurrence of the given ngram to this profile.
        Parameters:
        ngram - the ngram
      • add

        public void add(String ngram,
                        long count)
        Adds multiple occurrences of the given ngram to this profile.
        Parameters:
        ngram - the ngram
        count - number of occurrences to add
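        The counting behaviour described for the two add methods can be sketched
        with a plain map from ngram to count. This is a minimal illustration,
        not Tika's source: the class name NgramCountSketch and its internals are
        hypothetical, and it assumes that getCount() returns the total of all
        added occurrences while getCount(String) returns the count for one ngram.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in used only for illustration; not Tika's LanguageProfile.
final class NgramCountSketch {
    // Occurrences recorded per ngram.
    private final Map<String, Long> counts = new HashMap<>();
    // Assumed meaning of the no-argument getCount(): total occurrences added.
    private long total = 0;

    // Adds a single occurrence of the given ngram.
    void add(String ngram) {
        add(ngram, 1);
    }

    // Adds multiple occurrences of the given ngram.
    void add(String ngram, long count) {
        counts.merge(ngram, count, Long::sum);
        total += count;
    }

    long getCount() {
        return total;
    }

    // Returns 0 for an ngram that was never added (an assumption of this sketch).
    long getCount(String ngram) {
        return counts.getOrDefault(ngram, 0L);
    }

    public static void main(String[] args) {
        NgramCountSketch profile = new NgramCountSketch();
        profile.add("the");
        profile.add("he_", 4);
        profile.add("the", 2);
        System.out.println(profile.getCount("the")); // prints 3
        System.out.println(profile.getCount());      // prints 7
    }
}
```

        In this sketch the single-occurrence overload simply delegates to the
        multi-occurrence one, so both paths update the per-ngram count and the
        total together.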
      • distance

        public double distance(LanguageProfile that)
        Calculates the geometric distance between this profile and the given language profile.
        Parameters:
        that - the other language profile
        Returns:
        distance between the profiles
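        One plausible reading of "geometric distance" is the Euclidean distance
        between the two profiles' relative ngram frequencies. The sketch below
        works under that assumption; it does not reproduce Tika's
        implementation, and the class ProfileDistanceSketch and its helper
        methods are hypothetical names.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical illustration of a frequency-based profile distance; not Tika code.
final class ProfileDistanceSketch {

    // Sums all occurrence counts in a profile.
    static long total(Map<String, Long> counts) {
        long sum = 0;
        for (long c : counts.values()) {
            sum += c;
        }
        return sum;
    }

    // Euclidean distance between the relative-frequency vectors of two profiles.
    static double distance(Map<String, Long> a, Map<String, Long> b) {
        // Guard against division by zero for empty profiles.
        double totalA = Math.max(total(a), 1.0);
        double totalB = Math.max(total(b), 1.0);

        // Compare over the union of ngrams seen in either profile.
        Set<String> ngrams = new HashSet<>(a.keySet());
        ngrams.addAll(b.keySet());

        double sumOfSquares = 0.0;
        for (String ngram : ngrams) {
            double freqA = a.getOrDefault(ngram, 0L) / totalA;
            double freqB = b.getOrDefault(ngram, 0L) / totalB;
            double diff = freqA - freqB;
            sumOfSquares += diff * diff;
        }
        return Math.sqrt(sumOfSquares);
    }

    public static void main(String[] args) {
        Map<String, Long> english = new HashMap<>();
        english.put("the", 3L);
        english.put("and", 1L);

        Map<String, Long> sameShape = new HashMap<>(english);

        System.out.println(distance(english, sameShape)); // prints 0.0
    }
}
```

        Normalising by each profile's total makes the result independent of how
        much text went into each profile, so a short sample and a long corpus
        with the same ngram proportions are at distance zero in this sketch.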