public class TextDetector extends Object implements Detector
Note that text documents with a character encoding like UTF-16 are better
detected with MagicDetector and an appropriate magic byte pattern.
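
One plausible reason for the note above is that UTF-16 text carries a fixed byte order mark at the start of the stream, which is exactly the kind of thing a magic byte pattern can match. The stand-alone sketch below checks for that mark by hand. It does not use MagicDetector, and the class `Utf16BomCheck` and method `startsWithUtf16Bom` are hypothetical names introduced only for illustration.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration only; not part of Tika.
final class Utf16BomCheck {

    // Returns true if the prefix starts with a UTF-16 byte order mark:
    // FE FF (big-endian) or FF FE (little-endian).
    static boolean startsWithUtf16Bom(byte[] prefix) {
        if (prefix == null || prefix.length < 2) {
            return false;
        }
        int b0 = prefix[0] & 0xFF;
        int b1 = prefix[1] & 0xFF;
        return (b0 == 0xFE && b1 == 0xFF) || (b0 == 0xFF && b1 == 0xFE);
    }

    public static void main(String[] args) {
        // Java's UTF_16 encoder writes a big-endian byte order mark first.
        byte[] utf16 = "hi".getBytes(StandardCharsets.UTF_16);
        byte[] utf8 = "hi".getBytes(StandardCharsets.UTF_8);
        System.out.println(startsWithUtf16Bom(utf16)); // prints "true"
        System.out.println(startsWithUtf16Bom(utf8));  // prints "false"
    }
}
```

UTF-16 files written without a byte order mark would not match this pattern, so the check is only a starting point.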
| Constructor and Description |
|---|
| TextDetector(): Constructs a TextDetector which will look at the default number of bytes from the beginning of the document. |
| TextDetector(int bytesToTest): Constructs a TextDetector which will look at a given number of bytes from the beginning of the document. |
| Modifier and Type | Method and Description |
|---|---|
| MediaType | detect(InputStream input, Metadata metadata): Looks at the beginning of the document input stream to determine whether the document is text or not. |
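
The idea of inspecting only the first bytesToTest bytes of a stream can be pictured with the stand-alone sketch below. It is not Tika's implementation. The class `PrefixTextSniffer`, the method `looksLikeText`, the 512-byte constant, and the control-character rule are all assumptions made for illustration, and the real detect method returns a MediaType rather than a boolean.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical stand-in for illustration only; not part of Tika.
final class PrefixTextSniffer {

    // Assumed prefix length for this sketch; the library's default is not stated here.
    static final int DEFAULT_BYTES_TO_TEST = 512;

    // Reads up to bytesToTest bytes, applies a simple (assumed) rule, then
    // rewinds the stream so the caller can still read it from the start.
    static boolean looksLikeText(InputStream input, int bytesToTest) throws IOException {
        if (input == null) {
            return false; // nothing to inspect
        }
        if (!input.markSupported()) {
            throw new IllegalArgumentException("stream must support mark/reset");
        }
        input.mark(bytesToTest);
        try {
            byte[] prefix = new byte[bytesToTest];
            int total = 0;
            int n;
            while (total < bytesToTest
                    && (n = input.read(prefix, total, bytesToTest - total)) != -1) {
                total += n;
            }
            // Assumed rule: control bytes other than common whitespace mean "not text".
            // An empty prefix has no such bytes, so this sketch reports true for it.
            for (int i = 0; i < total; i++) {
                int b = prefix[i] & 0xFF;
                boolean whitespace = b == '\t' || b == '\n' || b == '\r' || b == '\f';
                if (b < 0x20 && !whitespace) {
                    return false;
                }
            }
            return true;
        } finally {
            input.reset();
        }
    }

    public static void main(String[] args) throws IOException {
        InputStream text = new ByteArrayInputStream(
                "Hello, world!\n".getBytes(StandardCharsets.US_ASCII));
        InputStream binary = new ByteArrayInputStream(new byte[] {0x00, 0x01, 0x02, 'a'});
        System.out.println(looksLikeText(text, DEFAULT_BYTES_TO_TEST));   // prints "true"
        System.out.println(looksLikeText(binary, DEFAULT_BYTES_TO_TEST)); // prints "false"
    }
}
```

Marking and resetting the stream keeps the check non-destructive, which matters when the same stream is later handed to a parser.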
public TextDetector()
Constructs a TextDetector which will look at the default number of bytes from the beginning of the document.

public TextDetector(int bytesToTest)
Constructs a TextDetector which will look at a given number of bytes from the beginning of the document.

public MediaType detect(InputStream input, Metadata metadata) throws IOException
Specified by:
detect in interface Detector
Parameters:
input - document input stream, or null
metadata - ignored
Throws:
IOException - if the document input stream could not be read

Copyright © 2007–2022 The Apache Software Foundation. All rights reserved.