org.apache.tika.language
Class ProfilingHandler

java.lang.Object
  extended by org.xml.sax.helpers.DefaultHandler
      extended by org.apache.tika.sax.ContentHandlerDecorator
          extended by org.apache.tika.sax.WriteOutContentHandler
              extended by org.apache.tika.language.ProfilingHandler
All Implemented Interfaces:
ContentHandler, DTDHandler, EntityResolver, ErrorHandler

public class ProfilingHandler
extends WriteOutContentHandler

SAX content handler that builds a language profile based on all the received character content.

Since:
Apache Tika 0.5

Constructor Summary
ProfilingHandler()
ProfilingHandler(LanguageProfile profile)
ProfilingHandler(ProfilingWriter writer)
 
Method Summary
 LanguageIdentifier getLanguage()
          Returns the language that best matches the current state of the language profile.
 LanguageProfile getProfile()
          Returns the language profile being built by this content handler.
 
Methods inherited from class org.apache.tika.sax.WriteOutContentHandler
characters, isWriteLimitReached
 
Methods inherited from class org.apache.tika.sax.ContentHandlerDecorator
endDocument, endElement, endPrefixMapping, handleException, ignorableWhitespace, processingInstruction, setContentHandler, setDocumentLocator, skippedEntity, startDocument, startElement, startPrefixMapping, toString
 
Methods inherited from class org.xml.sax.helpers.DefaultHandler
error, fatalError, notationDecl, resolveEntity, unparsedEntityDecl, warning
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
 

Constructor Detail

ProfilingHandler

public ProfilingHandler(ProfilingWriter writer)

ProfilingHandler

public ProfilingHandler(LanguageProfile profile)

ProfilingHandler

public ProfilingHandler()

Method Detail

getProfile

public LanguageProfile getProfile()
Returns the language profile being built by this content handler. The returned profile is live: it is updated whenever this content handler receives new SAX events. Use the getLanguage() method to get the language that best matches the current state of the profile.

Returns:
language profile
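
The "live profile" behaviour described for getProfile() can be illustrated with a minimal sketch that uses only the JDK's SAX classes. The names LiveProfileSketch, TrigramHandler and total are hypothetical, and the trigram-count map is an assumed stand-in for LanguageProfile, not Tika's implementation. The point is only that a reference obtained once keeps reflecting later characters() events.

```java
import java.util.HashMap;
import java.util.Map;

import org.xml.sax.helpers.DefaultHandler;

public class LiveProfileSketch {

    /** Hypothetical stand-in for ProfilingHandler: counts character trigrams. */
    public static class TrigramHandler extends DefaultHandler {
        // Assumed stand-in for a language profile: trigram -> occurrence count.
        private final Map<String, Integer> counts = new HashMap<>();
        // Sliding window of the most recent characters (at most three),
        // kept across calls so trigrams can span characters() boundaries.
        private final StringBuilder window = new StringBuilder();

        @Override
        public void characters(char[] ch, int start, int length) {
            for (int i = start; i < start + length; i++) {
                window.append(Character.toLowerCase(ch[i]));
                if (window.length() > 3) {
                    window.deleteCharAt(0);
                }
                if (window.length() == 3) {
                    counts.merge(window.toString(), 1, Integer::sum);
                }
            }
        }

        /** Returns the map itself, not a copy, so later events are visible. */
        public Map<String, Integer> getProfile() {
            return counts;
        }
    }

    /** Total number of trigrams recorded in the profile. */
    public static int total(Map<String, Integer> profile) {
        int sum = 0;
        for (int c : profile.values()) {
            sum += c;
        }
        return sum;
    }

    public static void main(String[] args) {
        TrigramHandler handler = new TrigramHandler();
        Map<String, Integer> profile = handler.getProfile(); // fetched once

        char[] first = "hello".toCharArray();
        handler.characters(first, 0, first.length);
        System.out.println(total(profile)); // prints 3

        char[] second = " world".toCharArray();
        handler.characters(second, 0, second.length);
        System.out.println(total(profile)); // prints 9, same reference as before
    }
}
```

A caller that needs a stable snapshot under this model would have to copy the profile before feeding more content to the handler.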

getLanguage

public LanguageIdentifier getLanguage()
Returns the language that best matches the current state of the language profile.

Returns:
language that best matches the current profile
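
This page does not say how the best match is chosen. As a rough illustration of what "best matches the current profile" could mean, the sketch below picks the reference profile with the smallest distance to the profile built so far. Everything in it is an assumption made for illustration: the class BestMatchSketch, the trigram-count representation, the Euclidean distance over relative frequencies, and the two tiny reference texts. It is not Tika's algorithm or data.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

public class BestMatchSketch {

    /** Counts lower-cased character trigrams in the given text. */
    public static Map<String, Integer> profileOf(String text) {
        Map<String, Integer> counts = new HashMap<>();
        String lower = text.toLowerCase();
        for (int i = 0; i + 3 <= lower.length(); i++) {
            counts.merge(lower.substring(i, i + 3), 1, Integer::sum);
        }
        return counts;
    }

    private static double total(Map<String, Integer> profile) {
        int sum = 0;
        for (int c : profile.values()) {
            sum += c;
        }
        return Math.max(1, sum); // avoid division by zero for empty profiles
    }

    /** Euclidean distance between the relative trigram frequencies (assumed metric). */
    public static double distance(Map<String, Integer> a, Map<String, Integer> b) {
        double totalA = total(a);
        double totalB = total(b);
        Set<String> keys = new HashSet<>(a.keySet());
        keys.addAll(b.keySet());
        double sum = 0.0;
        for (String key : keys) {
            double d = a.getOrDefault(key, 0) / totalA - b.getOrDefault(key, 0) / totalB;
            sum += d * d;
        }
        return Math.sqrt(sum);
    }

    /** Returns the name of the reference profile closest to the given profile. */
    public static String bestMatch(Map<String, Integer> profile,
                                   Map<String, Map<String, Integer>> references) {
        String best = null;
        double bestDistance = Double.POSITIVE_INFINITY;
        for (Map.Entry<String, Map<String, Integer>> entry : references.entrySet()) {
            double d = distance(profile, entry.getValue());
            if (d < bestDistance) {
                bestDistance = d;
                best = entry.getKey();
            }
        }
        return best;
    }

    public static void main(String[] args) {
        // Toy reference profiles; real ones would be built from far more text.
        Map<String, Map<String, Integer>> references = new LinkedHashMap<>();
        references.put("en", profileOf(
                "the quick brown fox jumps over the lazy dog and the cat"));
        references.put("de", profileOf(
                "der schnelle braune fuchs springt mit dem faulen hund und der katze durch den wald"));

        System.out.println(bestMatch(profileOf("the dog and the fox"), references));
    }
}
```

Because the match is computed from whatever profile exists at call time, calling such a method early in a document may give a different answer than calling it after all content has been received.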


Copyright © 2007-2012 The Apache Software Foundation. All Rights Reserved.