Package org.apache.tika.parser.microsoft
Class ListManager
java.lang.Object
org.apache.tika.parser.microsoft.AbstractListManager
org.apache.tika.parser.microsoft.ListManager
Computes the number text that goes at the beginning of each list paragraph.
Note: This class handles only the raw number text; it does not apply the further formatting described in [MS-DOC], v20140721, 2.4.6.3, Part 3.
Note 2: The tplc, a visual override for the appearance of list levels as defined in [MS-DOC], v20140721, 2.9.328, is not handled by this class.
Further, this class does not yet handle overrides.
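To make "raw number text" concrete, the following is a minimal sketch of how such text could be derived from one counter per list level. It is not Tika's implementation: the class LevelCounterSketch, the method next, and the dotted output format are all hypothetical, and level format strings, number styles, and overrides are ignored.

```java
/** Hypothetical, simplified numbering state for one list; not Tika's implementation. */
final class LevelCounterSketch {
    // One counter per list level (Word lists have nine levels, 0..8).
    private final int[] counters = new int[9];

    /** Advances the counter for the given level and returns the raw number text. */
    String next(int level) {
        if (level < 0 || level >= counters.length) {
            throw new IllegalArgumentException("level out of range: " + level);
        }
        counters[level]++;
        // A new item at this level restarts every deeper level.
        for (int i = level + 1; i < counters.length; i++) {
            counters[i] = 0;
        }
        // Join the counters of all levels up to and including this one.
        StringBuilder text = new StringBuilder();
        for (int i = 0; i <= level; i++) {
            text.append(counters[i]).append('.');
        }
        return text.toString();
    }
}
```

Because each result depends on every earlier call, a sketch like this only produces the expected numbers when paragraphs are fed to it in document order.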
Nested Class Summary
Nested classes/interfaces inherited from class org.apache.tika.parser.microsoft.AbstractListManager
AbstractListManager.LevelTuple, AbstractListManager.ParagraphLevelCounter
Field Summary
Fields inherited from class org.apache.tika.parser.microsoft.AbstractListManager
listLevelMap, overrideTupleMap
Constructor Summary
Constructor: ListManager(org.apache.poi.hwpf.HWPFDocument document)
Description: Ordinary constructor for a new list reader.
Method Summary
Modifier and Type: String
Method: getFormattedNumber(org.apache.poi.hwpf.usermodel.Paragraph paragraph)
Description: Get the formatted number for a given paragraph.
Constructor Details
ListManager
public ListManager(org.apache.poi.hwpf.HWPFDocument document)
Ordinary constructor for a new list reader.
Parameters:
document - Document to process
Method Details
getFormattedNumber
Get the formatted number for a given paragraph.
Note: This only works correctly if called in sequence for all paragraphs in a valid selection (main document, text field, ...) that are part of a list.
Parameters:
paragraph - list paragraph to process
Returns:
String which represents the numbering of this list paragraph; never null, but can be an empty string if something goes wrong in getList()
Throws:
IllegalArgumentException - If the given paragraph is null or is not part of a list
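The contract above has three consequences for callers: visit paragraphs in document order, never pass a paragraph that is not in a list, and treat an empty result as "no number available". The sketch below shows one way a caller might honor this. Para, render, and the numberOf function are hypothetical stand-ins and not POI or Tika types; with the real API the guard would presumably be Paragraph.isInList() and the function would delegate to getFormattedNumber.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

/** Hypothetical caller-side helper; Para and numberOf are stand-ins, not POI or Tika types. */
final class ListPrefixSketch {
    /** Minimal stand-in for a paragraph: its text and whether it belongs to a list. */
    static final class Para {
        final String text;
        final boolean inList;

        Para(String text, boolean inList) {
            this.text = text;
            this.inList = inList;
        }
    }

    /** Renders paragraphs in order, prefixing list paragraphs with their number text. */
    static List<String> render(List<Para> paragraphs, Function<Para, String> numberOf) {
        List<String> out = new ArrayList<>();
        for (Para p : paragraphs) { // document order; no list paragraph is skipped
            if (!p.inList) {
                // Per the contract, a non-list paragraph would cause IllegalArgumentException.
                out.add(p.text);
                continue;
            }
            String number = numberOf.apply(p); // documented as never null
            // An empty string means no number could be computed; emit the text alone.
            out.add(number.isEmpty() ? p.text : number + " " + p.text);
        }
        return out;
    }
}
```

Checking list membership before the call keeps the exception path for genuine programming errors rather than ordinary body text.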