Class XWPFFeatureExtractor
java.lang.Object
org.apache.tika.parser.microsoft.ooxml.xwpf.XWPFFeatureExtractor
Extracts features that are useful for forensics, e-discovery, and digital preservation:
specifically, the presence of tracked changes, hidden text, comments, and comment authors.
Because several of these features can be set on run properties, which can appear in many
places in a document, this class scrapes the document XML directly.
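To illustrate what scraping the document XML for these features can look like, here is a minimal sketch that uses only the JDK's StAX reader. It is not the Tika implementation: the class name DocxFeatureSketch, its Features holder, and the choice of elements to check (w:ins, w:del, w:moveFrom, w:moveTo, w:vanish, w:comment) are assumptions made for illustration, and they cover only a subset of the WordprocessingML markup that can signal these features.

```java
import java.io.InputStream;
import java.util.Set;
import java.util.TreeSet;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

/** Hypothetical illustration only; not the code used by XWPFFeatureExtractor. */
final class DocxFeatureSketch {

    /** Main WordprocessingML namespace (the usual "w:" prefix). */
    static final String W_NS =
            "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

    /** Simple holder for the features seen in one XML part. */
    static final class Features {
        boolean trackedChanges;
        boolean hiddenText;
        boolean comments;
        final Set<String> commentAuthors = new TreeSet<>();
    }

    /** Streams through one XML part and records which features it contains. */
    static Features scan(InputStream xml) {
        XMLInputFactory factory = XMLInputFactory.newInstance();
        // Disable DTDs and external entities: the input is untrusted.
        factory.setProperty(XMLInputFactory.SUPPORT_DTD, false);
        factory.setProperty(XMLInputFactory.IS_SUPPORTING_EXTERNAL_ENTITIES, false);

        Features found = new Features();
        try {
            XMLStreamReader reader = factory.createXMLStreamReader(xml);
            while (reader.hasNext()) {
                if (reader.next() != XMLStreamConstants.START_ELEMENT) {
                    continue;
                }
                if (!W_NS.equals(reader.getNamespaceURI())) {
                    continue;
                }
                switch (reader.getLocalName()) {
                    case "ins":
                    case "del":
                    case "moveFrom":
                    case "moveTo":
                        // Revision markup for inserted, deleted, or moved content.
                        found.trackedChanges = true;
                        break;
                    case "vanish": {
                        // A bare w:vanish hides the run; val="0"/"false" turns it off.
                        String val = reader.getAttributeValue(W_NS, "val");
                        if (val == null || !(val.equals("0") || val.equals("false"))) {
                            found.hiddenText = true;
                        }
                        break;
                    }
                    case "comment": {
                        // w:comment elements live in the comments part.
                        found.comments = true;
                        String author = reader.getAttributeValue(W_NS, "author");
                        if (author != null && !author.isEmpty()) {
                            found.commentAuthors.add(author);
                        }
                        break;
                    }
                    default:
                        break;
                }
            }
            reader.close();
        } catch (XMLStreamException e) {
            throw new IllegalArgumentException("Could not parse XML part", e);
        }
        return found;
    }
}
```

In a real .docx package, the same scan would be applied to each relevant part, for example word/document.xml for revision and run-property markup and word/comments.xml for comments and their authors. Streaming over every start element avoids having to know in advance which containers can carry run properties.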
-
Constructor Summary
Constructors
Constructor    Description
XWPFFeatureExtractor()
Method Summary
Modifier and Type    Method    Description
void    process(org.apache.poi.openxml4j.opc.PackagePart packagePart, Metadata metadata, ParseContext parseContext)
void    process(org.apache.poi.xwpf.usermodel.XWPFDocument xwpfDocument, Metadata metadata, ParseContext parseContext)
-
Constructor Details
-
XWPFFeatureExtractor
public XWPFFeatureExtractor()
-
-
Method Details
-
process
public void process(org.apache.poi.xwpf.usermodel.XWPFDocument xwpfDocument, Metadata metadata, ParseContext parseContext)
-
process
public void process(org.apache.poi.openxml4j.opc.PackagePart packagePart, Metadata metadata, ParseContext parseContext)
-