Factory for a filter that only lets through tokens with characters that are "isAlphabetic" or "isIdeographic".
Creates a very narrowly focused TokenFilter that limits tokens based on length _unless_ they've been identified as <DOUBLE> or <SINGLE> by the CJKBigramFilter.
Computes some corpus contrast statistics.
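The acceptance rules described for the two filters above can be sketched in plain Java. This is an illustration only, not the classes' actual source. The class name `TokenFilterSketch` and both method names are hypothetical. It assumes that a token passes the first filter if at least one of its code points is alphabetic or ideographic, and that the length limit in the second filter is a maximum. The summaries do not confirm either point. The type strings match the `<DOUBLE>` and `<SINGLE>` token types that CJKBigramFilter assigns.

```java
public final class TokenFilterSketch {
    // Token type strings as emitted by Lucene's CJKBigramFilter
    // (CJKBigramFilter.DOUBLE_TYPE and CJKBigramFilter.SINGLE_TYPE).
    static final String DOUBLE_TYPE = "<DOUBLE>";
    static final String SINGLE_TYPE = "<SINGLE>";

    // Assumption: one alphabetic or ideographic code point is enough to keep the token.
    // A stricter reading would use allMatch instead of anyMatch.
    static boolean hasAlphabeticOrIdeographic(CharSequence token) {
        return token.codePoints()
                .anyMatch(cp -> Character.isAlphabetic(cp) || Character.isIdeographic(cp));
    }

    // Assumption: the limit is a maximum length measured in UTF-16 chars.
    // CJK bigram and unigram tokens bypass the length check entirely.
    static boolean acceptByLength(CharSequence token, String type, int maxLength) {
        if (DOUBLE_TYPE.equals(type) || SINGLE_TYPE.equals(type)) {
            return true;
        }
        return token.length() <= maxLength;
    }

    public static void main(String[] args) {
        System.out.println(hasAlphabeticOrIdeographic("abc"));
        System.out.println(hasAlphabeticOrIdeographic("123"));
        System.out.println(acceptByLength("\u4e2d\u6587", DOUBLE_TYPE, 1));
        System.out.println(acceptByLength("toolong", "<ALPHANUM>", 3));
    }
}
```

In a real analysis chain these checks would typically live in the `accept()` method of a subclass of Lucene's `FilteringTokenFilter`, reading the term text and token type from the stream's attributes.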
Copyright © 2007–2018 The Apache Software Foundation. All rights reserved.