Class RTFCharsetMaps
java.lang.Object
org.apache.tika.parser.microsoft.rtf.jflex.RTFCharsetMaps
Shared charset maps for RTF parsing. Maps RTF \fcharsetN and \ansicpgN values to Java Charset instances.
Extracted from the original TextExtractor so both the JFlex-based
parser and decapsulator can reuse them.
Field Summary
Fields
ANSICPG_MAP: Maps \ansicpgN values to Java charsets.
FCHARSET_MAP: Maps \fcharsetN values to Java charsets.
WINDOWS_1252: static final Charset
Method Summary
Modifier and Type: static Charset
Method: resolveCodePage(int cpNumber)
Description: Resolve an ANSI code page number to a Java Charset.
Field Details
WINDOWS_1252
FCHARSET_MAP
Maps \fcharsetN values to Java charsets. The RTF font table uses these to declare per-font character encodings.
ANSICPG_MAP
Maps \ansicpgN values to Java charsets. This is the global ANSI code page declared in the RTF header.
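As a rough illustration of how a per-font \fcharsetN lookup could work, the sketch below uses a small stand-in map. The class name, the map's type and entries, and the fall-back-to-default behaviour are assumptions made for this example and are not taken from the Tika source. The three entries follow the commonly documented RTF values (0 = ANSI, 161 = Greek, 204 = Russian).

```java
import java.nio.charset.Charset;
import java.util.HashMap;
import java.util.Map;

class FcharsetLookupSketch {
    // Hypothetical stand-in for FCHARSET_MAP; the real field's type and contents may differ.
    static final Map<Integer, Charset> FCHARSET_SKETCH = new HashMap<>();
    static {
        FCHARSET_SKETCH.put(0, Charset.forName("windows-1252"));   // ANSI
        FCHARSET_SKETCH.put(161, Charset.forName("windows-1253")); // Greek
        FCHARSET_SKETCH.put(204, Charset.forName("windows-1251")); // Russian
    }

    // Returns the charset for a font's \fcharsetN value, or the supplied
    // default (for example the global \ansicpgN charset) when N is not mapped.
    static Charset charsetForFont(int fcharset, Charset globalDefault) {
        return FCHARSET_SKETCH.getOrDefault(fcharset, globalDefault);
    }

    public static void main(String[] args) {
        Charset global = Charset.forName("windows-1252");
        System.out.println(charsetForFont(204, global));
        System.out.println(charsetForFont(999, global));
    }
}
```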
Method Details
resolveCodePage
Resolve an ANSI code page number to a Java Charset. Tries the ANSICPG_MAP first, then falls back to windows-N and cpN. Returns WINDOWS_1252 if nothing matches.
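The lookup order described above can be sketched as follows. This is a minimal re-creation of the documented behaviour, not the Tika implementation: the class name, the one-entry stand-in map, and the use of Charset.isSupported to probe the fallback names are all assumptions.

```java
import java.nio.charset.Charset;
import java.util.Collections;
import java.util.Map;

class CodePageFallbackSketch {
    static final Charset WINDOWS_1252 = Charset.forName("windows-1252");

    // Hypothetical stand-in for ANSICPG_MAP with a single illustrative entry.
    static final Map<Integer, Charset> ANSICPG_SKETCH =
            Collections.singletonMap(1252, WINDOWS_1252);

    static Charset resolve(int cpNumber) {
        // Step 1: explicit map lookup.
        Charset mapped = ANSICPG_SKETCH.get(cpNumber);
        if (mapped != null) {
            return mapped;
        }
        // Step 2: try the conventional names "windows-N" and then "cpN".
        for (String name : new String[] {"windows-" + cpNumber, "cp" + cpNumber}) {
            try {
                if (Charset.isSupported(name)) {
                    return Charset.forName(name);
                }
            } catch (IllegalArgumentException e) {
                // Name is not a legal charset name; try the next candidate.
            }
        }
        // Step 3: nothing matched, so use the default.
        return WINDOWS_1252;
    }

    public static void main(String[] args) {
        System.out.println(resolve(1252));
        System.out.println(resolve(999999));
    }
}
```

Returning a default instead of throwing means a caller always has some charset to decode with, at the cost of possibly wrong text when the code page is genuinely unknown.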