Class RTFHtmlDecapsulator

java.lang.Object
org.apache.tika.parser.microsoft.rtf.jflex.RTFHtmlDecapsulator

public class RTFHtmlDecapsulator extends Object
Extracts the original HTML from an RTF document that contains encapsulated HTML, as indicated by the \fromhtml1 control word. Parsing uses a JFlex-based tokenizer and a shared RTFState for font and code page tracking.

Embedded objects and pictures are extracted in the same pass via RTFEmbeddedHandler.
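
The \fromhtml1 control word mentioned above is what marks an RTF document as carrying encapsulated HTML. As a rough illustration only, the check could be sketched as follows. The class FromHtmlCheck, the method looksLikeEncapsulatedHtml, and the 1024-character scan limit are hypothetical names and values chosen for this sketch; they are not part of Apache Tika and do not describe how RTFHtmlDecapsulator detects the control word internally.

```java
import java.util.regex.Pattern;

/** Hypothetical helper for illustration; not part of Apache Tika. */
final class FromHtmlCheck {

    // Matches the control word \fromhtml with parameter 1. The negative
    // lookahead rejects longer numeric parameters such as \fromhtml10.
    private static final Pattern FROM_HTML =
            Pattern.compile("\\\\fromhtml1(?![0-9])");

    // Assumption: the control word sits in the RTF header, so scanning a
    // bounded prefix of the document is enough for this sketch.
    private static final int HEADER_SCAN_LIMIT = 1024;

    static boolean looksLikeEncapsulatedHtml(String rtf) {
        // An RTF document starts with the group "{\rtf".
        if (rtf == null || !rtf.startsWith("{\\rtf")) {
            return false;
        }
        String head = rtf.substring(0, Math.min(rtf.length(), HEADER_SCAN_LIMIT));
        return FROM_HTML.matcher(head).find();
    }

    public static void main(String[] args) {
        String encapsulated = "{\\rtf1\\ansi\\ansicpg1252\\fromhtml1 \\deff0{\\fonttbl}}";
        String plain = "{\\rtf1\\ansi\\deff0{\\fonttbl}}";
        System.out.println(looksLikeEncapsulatedHtml(encapsulated));
        System.out.println(looksLikeEncapsulatedHtml(plain));
    }
}
```

This is a plain text search rather than a tokenizer, so it can be fooled by an escaped backslash followed by the same characters. A tokenizer-based approach, as this class uses, avoids that ambiguity.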