Class FieldCodeParser
java.lang.Object
org.apache.tika.parser.microsoft.ooxml.FieldCodeParser
Parses OOXML field codes (instrText) to extract URLs from HYPERLINK,
INCLUDEPICTURE, INCLUDETEXT, IMPORT, and LINK fields.
This class has no Tika dependencies and could be contributed to POI.
Method Summary
Modifier and Type | Method | Description
static String | parseExternalRefFromInstrText(String instrText, StringBuilder fieldType) | Parses URLs from instrText field codes that reference external resources.
static String | parseHyperlinkFromInstrText(String instrText) | Parses a HYPERLINK URL from instrText field code content.
Method Details
parseHyperlinkFromInstrText
Parses a HYPERLINK URL from instrText field code content. Field codes like:
HYPERLINK "https://example.com"
Parameters:
instrText - the accumulated instrText content
Returns:
the URL if found, or null
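The behavior described above can be illustrated with a minimal, self-contained sketch. It is not the Tika implementation: the class name HyperlinkFieldSketch, the method name parseHyperlink, and the regular expression are assumptions made for illustration, and the real parser may handle switches, escapes, and unquoted targets differently.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class HyperlinkFieldSketch {

    // Assumed pattern: the HYPERLINK keyword followed by a double-quoted target.
    // Field switches placed before the target are not modeled in this sketch.
    private static final Pattern HYPERLINK =
            Pattern.compile("^\\s*HYPERLINK\\s+\"([^\"]*)\"", Pattern.CASE_INSENSITIVE);

    /** Returns the quoted target of a HYPERLINK field code, or null if none is found. */
    public static String parseHyperlink(String instrText) {
        if (instrText == null) {
            return null;
        }
        Matcher m = HYPERLINK.matcher(instrText);
        if (!m.find()) {
            return null;
        }
        String url = m.group(1).trim();
        return url.isEmpty() ? null : url;
    }

    public static void main(String[] args) {
        // prints "https://example.com"
        System.out.println(parseHyperlink(" HYPERLINK \"https://example.com\" "));
        // prints "null" because PAGE is not a HYPERLINK field
        System.out.println(parseHyperlink("PAGE \\* MERGEFORMAT"));
    }
}
```

Returning null rather than an empty string mirrors the documented contract ("the URL if found, or null"), so callers can use a single null check.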
parseExternalRefFromInstrText
Parses URLs from instrText field codes that reference external resources. This includes INCLUDEPICTURE, INCLUDETEXT, IMPORT, and LINK fields.
Parameters:
instrText - the accumulated instrText content
fieldType - output parameter; will contain the field type if found
Returns:
the URL if found, or null
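The output-parameter contract can be sketched as follows. This is an illustrative approximation, not the Tika source: the class ExternalRefFieldSketch, the method parseExternalRef, and the regular expression are hypothetical. The sketch assumes the first double-quoted string after the field keyword is the target, which also covers LINK fields where an application class name precedes the quoted source.

```java
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ExternalRefFieldSketch {

    // Assumed pattern: one of the four field keywords, any unquoted text
    // (switches or a LINK class name), then the first double-quoted string.
    private static final Pattern EXTERNAL_REF = Pattern.compile(
            "^\\s*(INCLUDEPICTURE|INCLUDETEXT|IMPORT|LINK)\\b[^\"]*\"([^\"]*)\"",
            Pattern.CASE_INSENSITIVE);

    /**
     * Returns the quoted target, or null if the field is not one of the four types.
     * On success, fieldType (if non-null) is overwritten with the upper-cased keyword.
     */
    public static String parseExternalRef(String instrText, StringBuilder fieldType) {
        if (instrText == null) {
            return null;
        }
        Matcher m = EXTERNAL_REF.matcher(instrText);
        if (!m.find()) {
            return null;
        }
        String target = m.group(2).trim();
        if (target.isEmpty()) {
            return null;
        }
        if (fieldType != null) {
            fieldType.setLength(0); // clear anything left from a previous call
            fieldType.append(m.group(1).toUpperCase(Locale.ROOT));
        }
        return target;
    }

    public static void main(String[] args) {
        StringBuilder fieldType = new StringBuilder();
        String url = parseExternalRef(
                " INCLUDEPICTURE \"https://example.com/logo.png\" \\d ", fieldType);
        // prints "INCLUDEPICTURE -> https://example.com/logo.png"
        System.out.println(fieldType + " -> " + url);
    }
}
```

A StringBuilder serves as the output parameter because Java passes references by value: the method cannot rebind the caller's String, but it can mutate a shared buffer. In this sketch the buffer is left untouched when no field matches.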