Uses of Package
org.apache.tika.parser.chm.accessor
Packages that use org.apache.tika.parser.chm.accessor

Package
    Description

org.apache.tika.parser.chm.accessor
org.apache.tika.parser.chm.assertion
org.apache.tika.parser.chm.core
org.apache.tika.parser.chm.lzx

Classes in org.apache.tika.parser.chm.accessor used by org.apache.tika.parser.chm.accessor

Class
    Description

ChmAccessor
    Defines an accessor interface.

ChmItsfHeader
    The Header
      0000: char[4]  'ITSF'
      0004: DWORD    3 (Version number)
      0008: DWORD    Total header length, including header section table and following data.

ChmItspHeader
    Directory header. The directory starts with a header; its format is as follows:
      0000: char[4]  'ITSP'
      0004: DWORD    Version number 1
      0008: DWORD    Length of the directory header
      000C: DWORD    $0a (unknown)
      0010: DWORD    $1000 Directory chunk size
      0014: DWORD    "Density" of quickref section, usually 2
      0018: DWORD    Depth of the index tree: 1 if there is no index, 2 if there is one level of PMGI chunks
      001C: DWORD    Chunk number of root index chunk, -1 if there is none (though at least one file has 0 despite there being no index chunk, probably a bug)
      0020: DWORD    Chunk number of first PMGL (listing) chunk
      0024: DWORD    Chunk number of last PMGL (listing) chunk
      0028: DWORD    -1 (unknown)
      002C: DWORD    Number of directory chunks (total)
      0030: DWORD    Windows language ID
      0034: GUID     {5D02926A-212E-11D0-9DF9-00A0C922E6EC}
      0044: DWORD    $54 (This is the length again)
      0048: DWORD    -1 (unknown)
      004C: DWORD    -1 (unknown)
      0050: DWORD    -1 (unknown)

ChmLzxcControlData
    ::DataSpace/Storage//ControlData
    This file contains $20 bytes of information on the compression.

ChmLzxcResetTable
    LZXC reset table. Used to ensure decompression.

ChmPmgiHeader
    Note: does not always exist. An index chunk has the following format:
      0000: char[4]  'PMGI'
      0004: DWORD    Length of quickref/free area at end of directory chunk
      0008:          Directory index entries (to quickref/free area)
    The quickref area in a PMGI is the same as in a PMGL.
    The format of a directory index entry is as follows:
      BYTE:    length of name
      BYTEs:   name (UTF-8 encoded)
      ENCINT:  directory listing chunk which starts with name
    Encoded Integers aka ENCINT: an ENCINT is a variable-length integer.

ChmPmglHeader
    There are two types of directory chunks -- index chunks, and listing chunks.

DirectoryListingEntry
    The format of a directory listing entry is as follows:
      BYTE:    length of name
      BYTEs:   name (UTF-8 encoded)
      ENCINT:  content section
      ENCINT:  offset
      ENCINT:  length
    The offset is from the beginning of the content section the file is in, after the section has been decompressed (if appropriate).

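
The ENCINT fields named in the entry formats above can be read with a small decoder. The sketch below is illustrative only: the page itself says no more than that an ENCINT is a variable-length integer, so the byte layout used here (seven payload bits per byte, most significant group first, high bit set on every byte except the last) is an assumption taken from the commonly circulated description of the CHM format. The class and method names are hypothetical and are not part of the Tika API.

```java
/** Hypothetical helper for illustration; not part of org.apache.tika.parser.chm. */
final class EncIntSketch {

    private EncIntSketch() {
    }

    /**
     * Decodes one ENCINT starting at data[offset].
     * Assumed layout: 7 payload bits per byte, most significant group first,
     * high bit set means "another byte follows".
     *
     * @return a two-element array: { decoded value, number of bytes consumed }
     */
    static long[] decode(byte[] data, int offset) {
        long value = 0;
        int pos = offset;
        while (true) {
            if (pos >= data.length) {
                throw new IllegalArgumentException("Truncated ENCINT at offset " + offset);
            }
            if (pos - offset >= 9) {
                // 9 bytes carry 63 payload bits; more would overflow a positive long.
                throw new IllegalArgumentException("ENCINT too long at offset " + offset);
            }
            int b = data[pos++] & 0xFF;
            value = (value << 7) | (b & 0x7F);
            if ((b & 0x80) == 0) {
                return new long[] {value, pos - offset};
            }
        }
    }
}
```

Under that assumed layout, the two bytes $EA $15 decode to $3515, and the second array element tells the caller how far to advance before reading the next field.
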
Classes in org.apache.tika.parser.chm.accessor used by org.apache.tika.parser.chm.assertion

Class
    Description

ChmAccessor
    Defines an accessor interface.

ChmLzxcResetTable
    LZXC reset table. Used to ensure decompression.

Classes in org.apache.tika.parser.chm.accessor used by org.apache.tika.parser.chm.core

Class
    Description

ChmDirectoryListingSet
    Holds chm listing entries.

ChmItsfHeader
    The Header
      0000: char[4]  'ITSF'
      0004: DWORD    3 (Version number)
      0008: DWORD    Total header length, including header section table and following data.

ChmItspHeader
    Directory header. The directory starts with a header; its format is as follows:
      0000: char[4]  'ITSP'
      0004: DWORD    Version number 1
      0008: DWORD    Length of the directory header
      000C: DWORD    $0a (unknown)
      0010: DWORD    $1000 Directory chunk size
      0014: DWORD    "Density" of quickref section, usually 2
      0018: DWORD    Depth of the index tree: 1 if there is no index, 2 if there is one level of PMGI chunks
      001C: DWORD    Chunk number of root index chunk, -1 if there is none (though at least one file has 0 despite there being no index chunk, probably a bug)
      0020: DWORD    Chunk number of first PMGL (listing) chunk
      0024: DWORD    Chunk number of last PMGL (listing) chunk
      0028: DWORD    -1 (unknown)
      002C: DWORD    Number of directory chunks (total)
      0030: DWORD    Windows language ID
      0034: GUID     {5D02926A-212E-11D0-9DF9-00A0C922E6EC}
      0044: DWORD    $54 (This is the length again)
      0048: DWORD    -1 (unknown)
      004C: DWORD    -1 (unknown)
      0050: DWORD    -1 (unknown)

ChmLzxcControlData
    ::DataSpace/Storage//ControlData
    This file contains $20 bytes of information on the compression.

ChmLzxcResetTable
    LZXC reset table. Used to ensure decompression.

DirectoryListingEntry
    The format of a directory listing entry is as follows:
      BYTE:    length of name
      BYTEs:   name (UTF-8 encoded)
      ENCINT:  content section
      ENCINT:  offset
      ENCINT:  length
    The offset is from the beginning of the content section the file is in, after the section has been decompressed (if appropriate).

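
The ITSP layout listed above maps directly onto fixed-offset reads. The following is a minimal sketch of such a reader, not the ChmItspHeader implementation: the class name, field names, and parse method are hypothetical, and it assumes that each DWORD is a 32-bit little-endian value, which the page does not state explicitly.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.charset.StandardCharsets;

/** Hypothetical reader for illustration; not the Tika ChmItspHeader class. */
final class ItspHeaderSketch {
    static final int HEADER_LENGTH = 0x54;

    final String signature;      // 0000: char[4] 'ITSP'
    final int version;           // 0004
    final int headerLength;      // 0008
    final int chunkSize;         // 0010
    final int quickrefDensity;   // 0014
    final int indexDepth;        // 0018
    final int rootIndexChunk;    // 001C
    final int firstPmglChunk;    // 0020
    final int lastPmglChunk;     // 0024
    final int directoryChunks;   // 002C
    final int languageId;        // 0030

    private ItspHeaderSketch(byte[] data, int start) {
        // Assumption: DWORDs are little-endian. slice() resets the byte
        // order to big-endian, so order(...) must be applied after it.
        ByteBuffer buf = ByteBuffer.wrap(data, start, HEADER_LENGTH)
                .slice()
                .order(ByteOrder.LITTLE_ENDIAN);
        signature = new String(data, start, 4, StandardCharsets.US_ASCII);
        version = buf.getInt(0x04);
        headerLength = buf.getInt(0x08);
        chunkSize = buf.getInt(0x10);
        quickrefDensity = buf.getInt(0x14);
        indexDepth = buf.getInt(0x18);
        rootIndexChunk = buf.getInt(0x1C);
        firstPmglChunk = buf.getInt(0x20);
        lastPmglChunk = buf.getInt(0x24);
        directoryChunks = buf.getInt(0x2C);
        languageId = buf.getInt(0x30);
    }

    /** Reads the header fields found at data[start .. start + 0x54). */
    static ItspHeaderSketch parse(byte[] data, int start) {
        if (start < 0 || data.length - start < HEADER_LENGTH) {
            throw new IllegalArgumentException("Need 0x54 bytes for an ITSP header");
        }
        ItspHeaderSketch header = new ItspHeaderSketch(data, start);
        if (!"ITSP".equals(header.signature)) {
            throw new IllegalArgumentException("Not an ITSP header: " + header.signature);
        }
        return header;
    }
}
```

The fields marked "unknown" in the layout and the GUID at 0034 are skipped here; a stricter reader might also check that the length at 0008 matches the repeated length at 0044.
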
Classes in org.apache.tika.parser.chm.accessor used by org.apache.tika.parser.chm.lzx

Class
    Description

ChmLzxcControlData
    ::DataSpace/Storage//ControlData
    This file contains $20 bytes of information on the compression.

DirectoryListingEntry
    The format of a directory listing entry is as follows:
      BYTE:    length of name
      BYTEs:   name (UTF-8 encoded)
      ENCINT:  content section
      ENCINT:  offset
      ENCINT:  length
    The offset is from the beginning of the content section the file is in, after the section has been decompressed (if appropriate).
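
To make the directory listing entry format concrete, here is a rough sketch of reading one entry from a byte array. It is not the DirectoryListingEntry class itself: the class name, fields, and parse method are hypothetical. It follows the format as stated on this page (a single unsigned byte for the name length) and assumes ENCINTs carry seven payload bits per byte, most significant group first, with the high bit marking continuation.

```java
import java.nio.charset.StandardCharsets;

/** Hypothetical value holder for illustration; not the Tika DirectoryListingEntry. */
final class ListingEntrySketch {
    final String name;
    final long contentSection;
    final long offset;        // relative to the start of the (decompressed) content section
    final long length;
    final int bytesConsumed;  // lets the caller advance to the next entry

    private ListingEntrySketch(String name, long section, long offset, long length, int consumed) {
        this.name = name;
        this.contentSection = section;
        this.offset = offset;
        this.length = length;
        this.bytesConsumed = consumed;
    }

    /** Parses one entry starting at data[start]. */
    static ListingEntrySketch parse(byte[] data, int start) {
        if (start < 0 || start >= data.length) {
            throw new IllegalArgumentException("Start offset out of range: " + start);
        }
        int pos = start;
        int nameLength = data[pos++] & 0xFF;           // BYTE: length of name
        if (pos + nameLength > data.length) {
            throw new IllegalArgumentException("Truncated entry name");
        }
        String name = new String(data, pos, nameLength, StandardCharsets.UTF_8);
        pos += nameLength;

        long[] fields = new long[3];                   // content section, offset, length
        for (int i = 0; i < fields.length; i++) {
            long value = 0;
            int count = 0;
            int b;
            do {
                // At most 9 bytes (63 payload bits) fit in a positive long.
                if (pos >= data.length || count++ == 9) {
                    throw new IllegalArgumentException("Malformed ENCINT in entry " + name);
                }
                b = data[pos++] & 0xFF;
                value = (value << 7) | (b & 0x7F);     // assumed 7-bit groups
            } while ((b & 0x80) != 0);                 // high bit: more bytes follow
            fields[i] = value;
        }
        return new ListingEntrySketch(name, fields[0], fields[1], fields[2], pos - start);
    }
}
```

A caller walking a listing chunk could call parse repeatedly, adding bytesConsumed to the start offset each time, until it reaches the quickref/free area.
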