Class ChmCommons
java.lang.Object
org.apache.tika.parser.microsoft.chm.ChmCommons
-
Nested Class Summary
Modifier and Type | Class | Description
static enum | | Represents entry types: uncompressed, compressed
static enum | | Represents Intel file states during decompression
static enum | | Represents LZX states: started decoding, not started decoding
-
Field Summary
Modifier and Type | Field | Description
static final int | ALIGNED_OFFSET |
static final int | UNCOMPRESSED |
static final int | UNDEFINED | Represents LZX block types, which are decompressed differently
static final int | VERBATIM |
-
Method Summary
Modifier and Type | Method | Description
static void | assertByteArrayNotNull(byte[] data) |
static byte[] | copyOfRange(byte[] original, int from, int to) |
static byte[] | getChmBlockSegment(byte[] data, ChmLzxcResetTable resetTable, int blockNumber, int lzxcBlockOffset, int lzxcBlockLength) |
static String | getLanguage(long langID) | Returns a textual representation of the LangID
static int | getWindowSize(int window) | LZX supports window sizes of 2^15 (32 KB) through 2^21 (2 MB). Returns X, where the window size is 2^X
static boolean | hasSkip(DirectoryListingEntry directoryListingEntry) | Checks skippable patterns
static int | indexOfDataSpaceStorageElement(byte[] text, byte[] pattern) | Searches for a pattern in a byte[]
static int | indexOfDataSpaceStorageElement(List<DirectoryListingEntry> list, String pattern) | Searches for a pattern in the directory listing entry list. This requires that the entry name start with "::DataSpaceStorage". See TIKA-4204
static final int | indexOfResetTableBlock(byte[] text, byte[] pattern) | Returns an index of the reset table
static boolean | isEmpty |
static void | reverse(byte[] array) | Reverses the order of the given array
static void | writeFile | Writes byte[][] to the file
-
Field Details
-
UNDEFINED
public static final int UNDEFINED
Represents LZX block types, which are decompressed differently
-
VERBATIM
public static final int VERBATIM
-
ALIGNED_OFFSET
public static final int ALIGNED_OFFSET
-
UNCOMPRESSED
public static final int UNCOMPRESSED
-
-
Method Details
-
assertByteArrayNotNull
Throws:
TikaException
-
getWindowSize
public static int getWindowSize(int window)
LZX supports window sizes of 2^15 (32 KB) through 2^21 (2 MB). Returns X, where the window size is 2^X.
Parameters:
window - chmLzxControlData.getWindowSize()
Returns:
window size
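The relationship between a window size and its exponent X can be sketched as follows. This is an illustrative stand-in, not Tika's implementation: the class name WindowSizeSketch, the method name windowBits, and the range check are assumptions made for the example.

```java
// Illustrative sketch only; names and validation are assumptions, not Tika's code.
final class WindowSizeSketch {

    /**
     * Returns X such that 2^X == window.
     * Assumes window is a power of two between 2^15 (32 KB) and 2^21 (2 MB).
     */
    static int windowBits(int window) {
        boolean inRange = window >= (1 << 15) && window <= (1 << 21);
        boolean powerOfTwo = Integer.bitCount(window) == 1;
        if (!inRange || !powerOfTwo) {
            throw new IllegalArgumentException("unsupported LZX window size: " + window);
        }
        // For a power of two, the number of trailing zero bits is the exponent.
        return Integer.numberOfTrailingZeros(window);
    }

    public static void main(String[] args) {
        System.out.println(windowBits(32 * 1024));       // prints 15
        System.out.println(windowBits(2 * 1024 * 1024)); // prints 21
    }
}
```

Whether the real method rejects out-of-range values is not stated above; the check here only makes the documented range explicit.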
-
getChmBlockSegment
public static byte[] getChmBlockSegment(byte[] data, ChmLzxcResetTable resetTable, int blockNumber, int lzxcBlockOffset, int lzxcBlockLength) throws TikaException
Throws:
TikaException
-
getLanguage
Returns a textual representation of the LangID
Parameters:
langID
Returns:
language name
-
hasSkip
Checks skippable patterns
Parameters:
directoryListingEntry
Returns:
boolean
-
writeFile
Writes byte[][] to the file
Parameters:
buffer
fileToBeSaved - file name
Throws:
TikaException
-
reverse
public static void reverse(byte[] array)
Reverses the order of the given array
Parameters:
array
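An in-place reversal of a byte array, as described for reverse, might look like the sketch below. The class ReverseSketch is hypothetical and the two-index swap is one common way to do this; it is not taken from Tika's source.

```java
import java.util.Arrays;

// Illustrative sketch only; ReverseSketch is a hypothetical stand-in class.
final class ReverseSketch {

    /** Reverses the array in place by swapping elements from both ends inward. */
    static void reverse(byte[] array) {
        for (int i = 0, j = array.length - 1; i < j; i++, j--) {
            byte tmp = array[i];
            array[i] = array[j];
            array[j] = tmp;
        }
    }

    public static void main(String[] args) {
        byte[] data = {1, 2, 3, 4};
        reverse(data);
        System.out.println(Arrays.toString(data)); // prints [4, 3, 2, 1]
    }
}
```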
-
indexOfResetTableBlock
public static final int indexOfResetTableBlock(byte[] text, byte[] pattern) throws ChmParsingException
Returns an index of the reset table
Parameters:
text
pattern
Returns:
index of the reset table
Throws:
ChmParsingException
-
indexOfDataSpaceStorageElement
public static int indexOfDataSpaceStorageElement(byte[] text, byte[] pattern) throws ChmParsingException
Searches for a pattern in a byte[]
Parameters:
text - byte[]
pattern - byte[]
Returns:
an index, or -1 if nothing is found
Throws:
ChmParsingException
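A byte-pattern search that returns -1 when nothing is found can be sketched with a straightforward scan. The class BytePatternSearchSketch and the method name indexOf are hypothetical, and the treatment of null or empty input here is an assumption; the documented method declares ChmParsingException, which this sketch does not model.

```java
import java.nio.charset.StandardCharsets;

// Illustrative sketch only; not Tika's implementation.
final class BytePatternSearchSketch {

    /** Returns the first index at which pattern occurs in text, or -1 if absent. */
    static int indexOf(byte[] text, byte[] pattern) {
        // Assumption: null, empty, or oversized patterns are treated as "not found".
        if (text == null || pattern == null
                || pattern.length == 0 || pattern.length > text.length) {
            return -1;
        }
        for (int i = 0; i <= text.length - pattern.length; i++) {
            int j = 0;
            while (j < pattern.length && text[i + j] == pattern[j]) {
                j++;
            }
            if (j == pattern.length) {
                return i; // every pattern byte matched starting at i
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        byte[] text = "xx::DataSpace".getBytes(StandardCharsets.UTF_8);
        byte[] pattern = "::Data".getBytes(StandardCharsets.UTF_8);
        System.out.println(indexOf(text, pattern)); // prints 2
    }
}
```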
-
indexOfDataSpaceStorageElement
Searches for a pattern in the directory listing entry list. This requires that the entry name start with "::DataSpaceStorage". See TIKA-4204.
Parameters:
list
pattern
Returns:
an index, or -1 if nothing is found
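One reading of the prefix requirement is sketched below, using plain strings in place of DirectoryListingEntry names. The class name, the entry names in the example, and the use of contains for the pattern match are all assumptions; only the "::DataSpaceStorage" prefix comes from the description above.

```java
import java.util.List;

// Illustrative sketch only; entry names are modeled as plain strings.
final class EntryNameSearchSketch {

    private static final String REQUIRED_PREFIX = "::DataSpaceStorage";

    /**
     * Returns the index of the first name that starts with the required prefix
     * and contains the pattern, or -1 if there is no such name.
     */
    static int indexOf(List<String> entryNames, String pattern) {
        for (int i = 0; i < entryNames.size(); i++) {
            String name = entryNames.get(i);
            if (name.startsWith(REQUIRED_PREFIX) && name.contains(pattern)) {
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        // Hypothetical entry names, chosen only to exercise the prefix rule.
        List<String> names = List.of(
                "/index.html",
                "::DataSpaceStorage/Example/Content",
                "::DataSpaceStorage/Example/ControlData");
        System.out.println(indexOf(names, "ControlData")); // prints 2
        System.out.println(indexOf(names, "index"));       // prints -1
    }
}
```

The second call returns -1 even though "/index.html" contains the pattern, because that name lacks the required prefix.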
-
copyOfRange
Throws:
TikaException
-
isEmpty
-