Class RUnpackExtractorFactory

java.lang.Object
org.apache.tika.extractor.RUnpackExtractorFactory
All Implemented Interfaces:
Serializable, EmbeddedDocumentByteStoreExtractorFactory, EmbeddedDocumentExtractorFactory

public class RUnpackExtractorFactory extends Object implements EmbeddedDocumentByteStoreExtractorFactory
  • Field Details

    • DEFAULT_MAX_EMBEDDED_BYTES_FOR_EXTRACTION

      public static long DEFAULT_MAX_EMBEDDED_BYTES_FOR_EXTRACTION
  • Constructor Details

    • RUnpackExtractorFactory

      public RUnpackExtractorFactory()
  • Method Details

    • setWriteFileNameToContent

      @Field public void setWriteFileNameToContent(boolean writeFileNameToContent)
    • setEmbeddedBytesIncludeMimeTypes

      @Field public void setEmbeddedBytesIncludeMimeTypes(List<String> includeMimeTypes)
    • setEmbeddedBytesExcludeMimeTypes

      @Field public void setEmbeddedBytesExcludeMimeTypes(List<String> excludeMimeTypes)
    • setEmbeddedBytesIncludeEmbeddedResourceTypes

      @Field public void setEmbeddedBytesIncludeEmbeddedResourceTypes(List<String> includeAttachmentTypes)
    • setEmbeddedBytesExcludeEmbeddedResourceTypes

      @Field public void setEmbeddedBytesExcludeEmbeddedResourceTypes(List<String> excludeAttachmentTypes)
    • setMaxEmbeddedBytesForExtraction

      @Field public void setMaxEmbeddedBytesForExtraction(long maxEmbeddedBytesForExtraction) throws TikaConfigException
      Sets the maximum total number of bytes to write out. A well-crafted zip bomb may pack petabytes of data into a few kilobytes, so choose a limit low enough that extraction cannot fill up a disk. The count covers only the lengths of the embedded files; the container file itself is not included.
      Parameters:
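      maxEmbeddedBytesForExtraction - the maximum total number of bytes of embedded files to write out; the container file is not counted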
      Throws:
      TikaConfigException
    • newInstance

      public EmbeddedDocumentExtractor newInstance(Metadata metadata, ParseContext parseContext)
      Specified by:
      newInstance in interface EmbeddedDocumentExtractorFactory
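
The byte limit described under setMaxEmbeddedBytesForExtraction can be pictured as a running budget shared by all embedded files of one container. The following is a minimal, self-contained sketch of that idea only. It does not use or reproduce Tika's code: the class EmbeddedByteBudget and its methods are hypothetical names introduced for illustration, and truncating the file that crosses the limit (rather than skipping it or failing) is an assumption, not documented behavior.

```java
/**
 * Hypothetical illustration, not part of Tika: tracks the total number of
 * embedded-file bytes written so far against a fixed cap. The container
 * file is never passed to this class, so it is not counted.
 */
final class EmbeddedByteBudget {
    private final long maxBytes;
    private long written;

    EmbeddedByteBudget(long maxBytes) {
        // Assumption for this sketch: a negative limit is treated as a configuration error.
        if (maxBytes < 0) {
            throw new IllegalArgumentException("maxBytes must be >= 0");
        }
        this.maxBytes = maxBytes;
    }

    /**
     * Asks to write {@code requested} bytes of an embedded file and returns how
     * many of them fit in the remaining budget. The granted bytes are recorded.
     */
    long reserve(long requested) {
        if (requested < 0) {
            throw new IllegalArgumentException("requested must be >= 0");
        }
        long remaining = maxBytes - written;
        long granted = Math.min(requested, remaining);
        written += granted;
        return granted;
    }

    /** Total embedded bytes granted so far. */
    long written() {
        return written;
    }

    /** True once the cap has been reached and no further bytes will be granted. */
    boolean exhausted() {
        return written >= maxBytes;
    }

    public static void main(String[] args) {
        // A cap of 10 bytes and three embedded files of 4, 5 and 3 bytes.
        EmbeddedByteBudget budget = new EmbeddedByteBudget(10);
        long[] embeddedLengths = {4, 5, 3};
        for (long length : embeddedLengths) {
            long granted = budget.reserve(length);
            System.out.println("requested=" + length + " granted=" + granted);
        }
        System.out.println("total written=" + budget.written());
    }
}
```

In this sketch the third file is cut short because only one byte of budget remains, and any later file would be granted nothing. Whatever the real extractor does at the boundary, the point of the limit is the same: the sum of embedded-file bytes written to disk stays bounded no matter how much data the container expands to.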