Class URLEmailNormalizingFilterFactory


  • public class URLEmailNormalizingFilterFactory
    extends org.apache.lucene.analysis.util.TokenFilterFactory
    Factory for a filter that normalizes URLs and email addresses to the tokens __url__ and __email__, respectively. WARNING: This filter works correctly only when the UAX29URLEmailTokenizer is used. It must run before the AlphaIdeographFilterFactory; otherwise the emails and URLs will already have been removed.
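The normalization described above can be sketched without any Lucene dependency. This is a minimal illustration, not the filter's actual implementation. It assumes the filter decides by token type, using the `<URL>` and `<EMAIL>` type labels that UAX29URLEmailTokenizer assigns, which would explain why that tokenizer is required. The class `UrlEmailNormalizationSketch`, its nested `Token` class, and the constant names are hypothetical and exist only for this example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Stand-alone sketch of URL/email normalization; hypothetical, not the real filter. */
final class UrlEmailNormalizationSketch {

    // Token type labels emitted by Lucene's UAX29URLEmailTokenizer.
    static final String URL_TYPE = "<URL>";
    static final String EMAIL_TYPE = "<EMAIL>";

    // Replacement strings named in the class description.
    static final String URL_TOKEN = "__url__";
    static final String EMAIL_TOKEN = "__email__";

    /** Minimal token: surface text plus the type assigned by the tokenizer. */
    static final class Token {
        final String text;
        final String type;

        Token(String text, String type) {
            this.text = text;
            this.type = type;
        }
    }

    /** Returns the token texts with URL and email tokens replaced by placeholders. */
    static List<String> normalize(List<Token> tokens) {
        List<String> out = new ArrayList<>();
        for (Token t : tokens) {
            if (URL_TYPE.equals(t.type)) {
                out.add(URL_TOKEN);
            } else if (EMAIL_TYPE.equals(t.type)) {
                out.add(EMAIL_TOKEN);
            } else {
                out.add(t.text); // all other tokens pass through unchanged
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<Token> tokens = Arrays.asList(
            new Token("mail", "<ALPHANUM>"),
            new Token("bob@example.com", EMAIL_TYPE),
            new Token("or", "<ALPHANUM>"),
            new Token("visit", "<ALPHANUM>"),
            new Token("http://example.com/docs", URL_TYPE));
        // prints [mail, __email__, or, visit, __url__]
        System.out.println(normalize(tokens));
    }
}
```

If a tokenizer that does not emit `<URL>` and `<EMAIL>` types were used instead, no token in this sketch would match and nothing would be normalized, which mirrors the warning in the description.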
    • Field Summary

      Fields 
      Modifier and Type    Field    Description
      static String        EMAIL
      static String        URL
      • Fields inherited from class org.apache.lucene.analysis.util.AbstractAnalysisFactory

        LUCENE_MATCH_VERSION_PARAM, luceneMatchVersion
    • Method Summary

      Modifier and Type                         Method                                                        Description
      org.apache.lucene.analysis.TokenStream    create(org.apache.lucene.analysis.TokenStream tokenStream)
      • Methods inherited from class org.apache.lucene.analysis.util.TokenFilterFactory

        availableTokenFilters, findSPIName, forName, lookupClass, normalize, reloadTokenFilters
      • Methods inherited from class org.apache.lucene.analysis.util.AbstractAnalysisFactory

        get, get, get, get, get, getBoolean, getChar, getClassArg, getFloat, getInt, getLines, getLuceneMatchVersion, getOriginalArgs, getPattern, getSet, getSnowballWordSet, getWordSet, isExplicitLuceneMatchVersion, require, require, require, requireBoolean, requireChar, requireFloat, requireInt, setExplicitLuceneMatchVersion, splitAt, splitFileNames
    • Constructor Detail

      • URLEmailNormalizingFilterFactory

        public URLEmailNormalizingFilterFactory(Map<String,String> args)
    • Method Detail

      • create

        public org.apache.lucene.analysis.TokenStream create(org.apache.lucene.analysis.TokenStream tokenStream)
        Specified by:
        create in class org.apache.lucene.analysis.util.TokenFilterFactory