public class TextDetector extends Object implements Detector
Note that text documents with a character encoding like UTF-16 are better detected with MagicDetector and an appropriate magic byte pattern.
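
As a rough illustration of what a magic byte pattern for UTF-16 could match, the sketch below checks a byte prefix for the two UTF-16 byte order marks (FE FF and FF FE). The names `Utf16BomSketch` and `utf16ByteOrder` are hypothetical and not part of Tika, and this is not how MagicDetector is configured; it only shows the byte comparison involved.

```java
/** Hypothetical helper for illustration only; not part of Tika. */
final class Utf16BomSketch {

    private Utf16BomSketch() {
    }

    /**
     * Returns "UTF-16BE" or "UTF-16LE" if the prefix starts with the
     * corresponding byte order mark, or null if neither mark is present.
     * Caveat: FF FE is also the start of the UTF-32LE mark (FF FE 00 00),
     * which this sketch does not try to tell apart.
     */
    static String utf16ByteOrder(byte[] prefix) {
        if (prefix == null || prefix.length < 2) {
            return null;
        }
        int b0 = prefix[0] & 0xFF; // mask to read the byte as unsigned
        int b1 = prefix[1] & 0xFF;
        if (b0 == 0xFE && b1 == 0xFF) {
            return "UTF-16BE";
        }
        if (b0 == 0xFF && b1 == 0xFE) {
            return "UTF-16LE";
        }
        return null;
    }
}
```

UTF-16 encodes ASCII characters with a zero byte in each code unit, so a check that inspects individual bytes is likely to see such a document as binary; matching the byte order mark sidesteps that.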
Constructor and Description |
---|
TextDetector() Constructs a TextDetector which will look at the default number of bytes from the beginning of the document. |
TextDetector(int bytesToTest) Constructs a TextDetector which will look at a given number of bytes from the beginning of the document. |
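
The idea of looking at a fixed number of bytes from the beginning of a document can be sketched with plain `java.io` mark/reset, so that the stream is left at its original position afterwards. `PrefixPeekSketch` and `peek` are hypothetical names used for illustration; this is an assumption about the general technique, not Tika's own code.

```java
import java.io.IOException;
import java.io.InputStream;
import java.util.Arrays;

/** Hypothetical helper for illustration only; not part of Tika. */
final class PrefixPeekSketch {

    private PrefixPeekSketch() {
    }

    /**
     * Reads up to bytesToTest bytes from the start of the stream and then
     * rewinds it, so the caller can still read the document from the start.
     * The returned array is shorter than bytesToTest if the stream ends early.
     */
    static byte[] peek(InputStream in, int bytesToTest) throws IOException {
        if (!in.markSupported()) {
            throw new IllegalArgumentException("stream must support mark/reset");
        }
        in.mark(bytesToTest);
        try {
            byte[] buffer = new byte[bytesToTest];
            int total = 0;
            // A single read() may return fewer bytes than requested, so loop.
            while (total < bytesToTest) {
                int n = in.read(buffer, total, bytesToTest - total);
                if (n < 0) {
                    break; // end of stream reached before bytesToTest bytes
                }
                total += n;
            }
            return Arrays.copyOf(buffer, total);
        } finally {
            in.reset(); // rewind to the marked position
        }
    }
}
```

A stream that does not support mark/reset can be wrapped in a `java.io.BufferedInputStream` before it is passed to a helper like this.
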
Modifier and Type | Method and Description |
---|---|
MediaType | detect(InputStream input, Metadata metadata) Looks at the beginning of the document input stream to determine whether the document is text or not. |
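
To make the "text or not" decision concrete, here is a deliberately simplified sketch that classifies a byte prefix by scanning for control bytes. It is an assumption about the general shape of such a check, not the heuristic this class actually implements, which may be more tolerant. `TextGuessSketch` and `guess` are hypothetical names, the result is a plain media type string rather than a MediaType, and treating a null input as binary is also an assumption of the sketch.

```java
/** Hypothetical helper for illustration only; not part of Tika. */
final class TextGuessSketch {

    private TextGuessSketch() {
    }

    /**
     * Returns "text/plain" if the prefix contains no control bytes other
     * than tab, line feed and carriage return, and
     * "application/octet-stream" otherwise (including for a null prefix).
     */
    static String guess(byte[] prefix) {
        if (prefix == null) {
            return "application/octet-stream";
        }
        for (byte b : prefix) {
            int c = b & 0xFF; // mask to read the byte as unsigned
            boolean whitespace = c == '\t' || c == '\n' || c == '\r';
            boolean control = c < 0x20 || c == 0x7F;
            if (control && !whitespace) {
                return "application/octet-stream";
            }
        }
        return "text/plain";
    }
}
```

With this sketch, the ASCII bytes of `"plain text\n"` are classified as `text/plain`, while any prefix containing a zero byte is classified as `application/octet-stream`.
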
public TextDetector()
Constructs a TextDetector which will look at the default number of bytes from the beginning of the document.
public TextDetector(int bytesToTest)
Constructs a TextDetector which will look at a given number of bytes from the beginning of the document.
public MediaType detect(InputStream input, Metadata metadata) throws IOException
Specified by:
detect in interface Detector
Parameters:
input - document input stream, or null
metadata - ignored
Throws:
IOException - if the document input stream could not be read
Copyright © 2007–2023 The Apache Software Foundation. All rights reserved.