Class RecursiveMetadataResource
-
Field Summary
Fields
Modifier and Type, Field, Description
protected static final BasicContentHandlerFactory.HANDLER_TYPE DEFAULT_HANDLER_TYPE
protected static final String HANDLER_TYPE_PARAM
-
Constructor Summary
Constructors
RecursiveMetadataResource()
-
Method Summary
Modifier and Type, Method, Description
jakarta.ws.rs.core.Response getMetadata(InputStream is, jakarta.ws.rs.core.HttpHeaders httpHeaders, String handlerTypeName)
    Returns an InputStream that can be deserialized as a list of Metadata objects.
jakarta.ws.rs.core.Response getMetadataFromMultipart(org.apache.cxf.jaxrs.ext.multipart.Attachment att, String handlerTypeName)
    Returns an InputStream that can be deserialized as a list of Metadata objects.
jakarta.ws.rs.core.Response getMetadataWithConfig(List<org.apache.cxf.jaxrs.ext.multipart.Attachment> attachments, jakarta.ws.rs.core.HttpHeaders httpHeaders)
    Multipart endpoint with per-request ParseContext configuration.
static List<Metadata> parseMetadata(TikaInputStream tis, Metadata metadata, jakarta.ws.rs.core.MultivaluedMap<String, String> httpHeaders, ServerHandlerConfig handlerConfig)
    Parses content and returns metadata list.
-
Field Details
-
HANDLER_TYPE_PARAM
- See Also:
-
DEFAULT_HANDLER_TYPE
-
-
Constructor Details
-
RecursiveMetadataResource
public RecursiveMetadataResource()
-
-
Method Details
-
parseMetadata
public static List<Metadata> parseMetadata(TikaInputStream tis, Metadata metadata, jakarta.ws.rs.core.MultivaluedMap<String, String> httpHeaders, ServerHandlerConfig handlerConfig) throws Exception
Parses the content and returns the resulting list of metadata. Metadata filtering is done in the child process, so no filtering is needed here.
- Throws:
Exception
-
getMetadataFromMultipart
@POST @Consumes("multipart/form-data") @Produces("application/json") @Path("form{handler : (\\w+)?}")
public jakarta.ws.rs.core.Response getMetadataFromMultipart(org.apache.cxf.jaxrs.ext.multipart.Attachment att, @PathParam("handler") String handlerTypeName) throws Exception
Returns an InputStream that can be deserialized as a list of Metadata objects. The first in the list represents the main document, and the rest represent metadata for the embedded objects. This works recursively through all descendants of the main document, not just the immediate children.
The extracted text content is stored with the key TikaCoreProperties.TIKA_CONTENT.
Specify the handler for the content (xml, html, text, markdown, ignore) in the path:
/rmeta/form (default: xml)
/rmeta/form/xml (store the content as xml)
/rmeta/form/text (store the content as text)
/rmeta/form/markdown (store the content as markdown)
/rmeta/form/ignore (don't record any content)
-
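The list layout described above (main document first, every descendant after it) can be sketched in plain Java. This is only an illustration under stated assumptions: each Metadata object is modelled as a simple `Map<String, String>` stand-in rather than Tika's own class, the class and method names are hypothetical, and the string `"X-TIKA:content"` is assumed to be the value of `TikaCoreProperties.TIKA_CONTENT`, which should be checked against the Tika version in use.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class RmetaResultSketch {
    // Assumption: string value of TikaCoreProperties.TIKA_CONTENT; verify for your Tika version.
    static final String TIKA_CONTENT_KEY = "X-TIKA:content";

    /** Returns the extracted content of the main document (first entry), or "" if absent. */
    static String mainContent(List<Map<String, String>> metadataList) {
        if (metadataList.isEmpty()) {
            return "";
        }
        return metadataList.get(0).getOrDefault(TIKA_CONTENT_KEY, "");
    }

    /** Returns every entry after the first, i.e. the embedded documents at any depth. */
    static List<Map<String, String>> embedded(List<Map<String, String>> metadataList) {
        if (metadataList.size() <= 1) {
            return List.of();
        }
        return metadataList.subList(1, metadataList.size());
    }

    public static void main(String[] args) {
        // Hypothetical response: one container document with one embedded attachment.
        Map<String, String> container = new LinkedHashMap<>();
        container.put(TIKA_CONTENT_KEY, "outer text");
        Map<String, String> attachment = new LinkedHashMap<>();
        attachment.put(TIKA_CONTENT_KEY, "attachment text");

        List<Map<String, String>> result = List.of(container, attachment);
        System.out.println(mainContent(result));
        System.out.println(embedded(result).size());
    }
}
```

With the ignore handler no content is recorded, so a lookup like `mainContent` would presumably fall back to the empty string.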
getMetadataWithConfig
@POST @Consumes("multipart/form-data") @Produces("application/json") @Path("config")
public jakarta.ws.rs.core.Response getMetadataWithConfig(List<org.apache.cxf.jaxrs.ext.multipart.Attachment> attachments, @Context jakarta.ws.rs.core.HttpHeaders httpHeaders) throws Exception
Multipart endpoint with per-request ParseContext configuration. Accepts two parts: "file" (the document) and "config" (JSON configuration with parseContext). Uses the default handler type (XML).
- Throws:
Exception
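A two-part request of the kind described above might be assembled as follows with the JDK's `java.net.http` API. This is a rough sketch, not the project's client: the class and method names are hypothetical, the `/rmeta/config` path is inferred from the `@Path("config")` annotation together with the `/rmeta` paths documented for the other methods, the base URL is an assumption, and the JSON `{"parseContext": {}}` is only a placeholder because the configuration schema is not given here. A text document is used so the body can be built as a string; binary files would need byte-level concatenation.

```java
import java.net.URI;
import java.net.http.HttpRequest;
import java.nio.charset.StandardCharsets;

public class RmetaConfigRequestSketch {
    private static final String CRLF = "\r\n";

    /** Builds a multipart/form-data body with a "file" part and a "config" part. */
    static String multipartBody(String boundary, String fileName, String fileText, String configJson) {
        StringBuilder sb = new StringBuilder();
        // Part 1: the document, under the part name "file".
        sb.append("--").append(boundary).append(CRLF)
          .append("Content-Disposition: form-data; name=\"file\"; filename=\"")
          .append(fileName).append("\"").append(CRLF)
          .append("Content-Type: application/octet-stream").append(CRLF).append(CRLF)
          .append(fileText).append(CRLF);
        // Part 2: the JSON configuration, under the part name "config".
        sb.append("--").append(boundary).append(CRLF)
          .append("Content-Disposition: form-data; name=\"config\"").append(CRLF)
          .append("Content-Type: application/json").append(CRLF).append(CRLF)
          .append(configJson).append(CRLF);
        // Closing delimiter.
        sb.append("--").append(boundary).append("--").append(CRLF);
        return sb.toString();
    }

    /** Wraps the body in a POST request; baseUrl is an assumed server address. */
    static HttpRequest buildRequest(String baseUrl, String boundary, String body) {
        return HttpRequest.newBuilder(URI.create(baseUrl + "/rmeta/config"))
            .header("Content-Type", "multipart/form-data; boundary=" + boundary)
            .header("Accept", "application/json")
            .POST(HttpRequest.BodyPublishers.ofString(body, StandardCharsets.UTF_8))
            .build();
    }

    public static void main(String[] args) {
        String boundary = "rmeta-sketch-boundary";
        // Placeholder configuration; the real parseContext schema is not specified here.
        String body = multipartBody(boundary, "note.txt", "hello", "{\"parseContext\": {}}");
        HttpRequest request = buildRequest("http://localhost:9998", boundary, body);
        System.out.println(request.method() + " " + request.uri());
    }
}
```

Because this endpoint uses the default handler type, the content in the response would be stored as XML regardless of anything in the request path.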
-
getMetadata
@PUT @Produces("application/json") @Path("{handler : (\\w+)?}")
public jakarta.ws.rs.core.Response getMetadata(InputStream is, @Context jakarta.ws.rs.core.HttpHeaders httpHeaders, @PathParam("handler") String handlerTypeName) throws Exception
Returns an InputStream that can be deserialized as a list of Metadata objects. The first in the list represents the main document, and the rest represent metadata for the embedded objects. This works recursively through all descendants of the main document, not just the immediate children.
The extracted text content is stored with the key TikaCoreProperties.TIKA_CONTENT.
Specify the handler for the content (xml, html, text, markdown, ignore) in the path:
/rmeta (default: xml)
/rmeta/xml (store the content as xml)
/rmeta/text (store the content as text)
/rmeta/markdown (store the content as markdown)
/rmeta/ignore (don't record any content)
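The path-based handler selection listed above can be illustrated with a small client-side sketch that builds the PUT request using the JDK's `java.net.http` API. The class name, the `buildRequest` helper, and the base URL `http://localhost:9998` are assumptions for illustration only; an empty or null handler is mapped to the bare `/rmeta` path, which the list above documents as the xml default.

```java
import java.net.URI;
import java.net.http.HttpRequest;
import java.nio.charset.StandardCharsets;

public class RmetaPutRequestSketch {
    /** Builds a PUT request whose path selects the content handler. */
    static HttpRequest buildRequest(String baseUrl, String handler, byte[] document) {
        // No handler means the bare /rmeta path (documented default: xml).
        String path = (handler == null || handler.isEmpty()) ? "/rmeta" : "/rmeta/" + handler;
        return HttpRequest.newBuilder(URI.create(baseUrl + path))
            .header("Accept", "application/json")
            .PUT(HttpRequest.BodyPublishers.ofByteArray(document))
            .build();
    }

    public static void main(String[] args) {
        byte[] document = "hello".getBytes(StandardCharsets.UTF_8);
        HttpRequest request = buildRequest("http://localhost:9998", "text", document);
        System.out.println(request.method() + " " + request.uri());
        // prints "PUT http://localhost:9998/rmeta/text"
    }
}
```

To actually send it, the request could be passed to `HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString())`, with the JSON response body then deserialized by whatever JSON library the client already uses.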
-