Class SolrEmitter
java.lang.Object
org.apache.tika.pipes.emitter.AbstractEmitter
org.apache.tika.pipes.emitter.solr.SolrEmitter
All Implemented Interfaces: Initializable, Emitter

Nested Class Summary
Nested Classes
Modifier and Type    Class    Description
static enum
static enum

Field Summary
Fields

Constructor Summary
Constructors

Method Summary
Modifier and Type    Method    Description
void  checkInitialization(InitializableProblemHandler problemHandler)
void  emit(String emitKey, List<Metadata> metadataList, ParseContext parseContext)
void  emit
      The default behavior is to call Emitter.emit(String, List, ParseContext) on each item.
int   getCommitWithin()
void  initialize(Map<String, Param> params)
void  setAttachmentStrategy(String attachmentStrategy)
      Options: SKIP, CONCATENATE_CONTENT, PARENT_CHILD.
void  setAuthScheme(String authScheme)
void  setCommitWithin(int commitWithin)
void  setConnectionTimeout(int connectionTimeout)
void  setEmbeddedFileFieldName(String embeddedFileFieldName)
      If using SolrEmitter.AttachmentStrategy.PARENT_CHILD, this is the field name used to store the child documents.
void  setIdField(String idField)
      Specify the field in the first Metadata that should be used as the id field for the document.
void  setPassword(String password)
void  setProxyHost(String proxyHost)
void  setProxyPort(int proxyPort)
void  setSocketTimeout(int socketTimeout)
void  setSolrCollection(String solrCollection)
void  setSolrUrls(List<String> solrUrls)
void  setSolrZkChroot(String solrZkChroot)
void  setSolrZkHosts(List<String> solrZkHosts)
void  setUpdateStrategy(String updateStrategy)
void  setUserName(String userName)

Methods inherited from class org.apache.tika.pipes.emitter.AbstractEmitter
getName, setName

Field Details

DEFAULT_EMBEDDED_FILE_FIELD_NAME

Constructor Details

SolrEmitter
Throws: TikaConfigException

Method Details

emit
public void emit(String emitKey, List<Metadata> metadataList, ParseContext parseContext) throws IOException, TikaEmitterException
Specified by: emit in interface Emitter
Throws: IOException, TikaEmitterException

emit
Description copied from class: AbstractEmitter
The default behavior is to call Emitter.emit(String, List, ParseContext) on each item. Some implementations, e.g. Solr/ES/Vespa, can benefit from subclassing this and emitting a batch of documents at once.
Specified by: emit in interface Emitter
Overrides: emit in class AbstractEmitter
Throws: IOException, TikaEmitterException

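The difference between the per-item default and a batching override can be illustrated with a minimal, self-contained sketch. It does not use the Tika API: SketchEmitter, emitOne, emitAll, and the requests counter are hypothetical stand-ins invented for this example, and documents are reduced to key/value string pairs.

```java
import java.util.ArrayList;
import java.util.List;

public class BatchEmitSketch {

    /** Hypothetical stand-in for an emitter; this is NOT the Tika Emitter interface. */
    public interface SketchEmitter {
        void emitOne(String emitKey, String doc);

        // Default behavior: forward every item to the single-document method.
        default void emitAll(List<String[]> keyedDocs) {
            for (String[] keyedDoc : keyedDocs) {
                emitOne(keyedDoc[0], keyedDoc[1]);
            }
        }
    }

    /** Relies on the default emitAll, so it issues one simulated request per document. */
    public static class PerItemEmitter implements SketchEmitter {
        public int requests = 0;                          // simulated round trips
        public final List<String> sent = new ArrayList<>();

        @Override
        public void emitOne(String emitKey, String doc) {
            requests++;
            sent.add(emitKey + "=" + doc);
        }
    }

    /** Overrides emitAll to send the whole list in a single simulated request. */
    public static class BatchingEmitter extends PerItemEmitter {
        @Override
        public void emitAll(List<String[]> keyedDocs) {
            requests++;                                   // one round trip for the batch
            for (String[] keyedDoc : keyedDocs) {
                sent.add(keyedDoc[0] + "=" + keyedDoc[1]);
            }
        }
    }
}
```

Calling emitAll with three documents increments requests three times on PerItemEmitter and once on BatchingEmitter, while both record the same documents in sent.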
setAttachmentStrategy
Options: SKIP, CONCATENATE_CONTENT, PARENT_CHILD. The default is PARENT_CHILD.
SKIP: index only the main file and ignore all information in the attachments.
CONCATENATE_CONTENT: concatenate the content extracted from the attachments into the main document, then index the main document with the concatenated content and the main document's own metadata. Metadata from attachments is discarded.
PARENT_CHILD: index the attachments as children of the parent document via Solr's parent-child relationship.
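The three options can be compared with a small sketch that models each metadata record as a plain map. This is an illustration under stated assumptions, not the emitter's implementation: the field names "content" and "children", the newline separator, and the buildDoc helper are hypothetical, and the first list entry is assumed to be the main document.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class AttachmentStrategySketch {

    public enum Strategy { SKIP, CONCATENATE_CONTENT, PARENT_CHILD }

    // Hypothetical field names, used only in this sketch.
    static final String CONTENT = "content";
    static final String CHILDREN = "children";

    /**
     * Builds one index document from a non-empty list in which entry 0 is the
     * main document and the remaining entries are its attachments.
     */
    public static Map<String, Object> buildDoc(List<Map<String, String>> metadataList,
                                               Strategy strategy) {
        Map<String, String> main = metadataList.get(0);
        List<Map<String, String>> attachments = metadataList.subList(1, metadataList.size());

        // Every strategy starts from the main document's own metadata.
        Map<String, Object> doc = new LinkedHashMap<>(main);

        if (strategy == Strategy.CONCATENATE_CONTENT) {
            // Append attachment content; attachment metadata is dropped.
            StringBuilder sb = new StringBuilder(main.getOrDefault(CONTENT, ""));
            for (Map<String, String> attachment : attachments) {
                sb.append('\n').append(attachment.getOrDefault(CONTENT, ""));
            }
            doc.put(CONTENT, sb.toString());
        } else if (strategy == Strategy.PARENT_CHILD) {
            // Keep each attachment as a separate child record under the parent.
            List<Map<String, String>> children = new ArrayList<>(attachments);
            doc.put(CHILDREN, children);
        }
        // SKIP: attachments are ignored entirely.
        return doc;
    }
}
```

With a main document and one attachment, SKIP returns only the main fields, CONCATENATE_CONTENT returns the main fields with a longer content value, and PARENT_CHILD returns the main fields plus a one-element child list.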
setUpdateStrategy

setConnectionTimeout

setSocketTimeout

getCommitWithin
public int getCommitWithin()

setCommitWithin

setIdField
Specify the field in the first Metadata that should be used as the id field for the document.
Parameters: idField

setSolrCollection

setSolrUrls

setSolrZkHosts

setSolrZkChroot

setUserName

setPassword

setAuthScheme

setProxyHost

setProxyPort

setEmbeddedFileFieldName
If using SolrEmitter.AttachmentStrategy.PARENT_CHILD, this is the field name used to store the child documents. Note that all embedded documents, no matter how deeply nested in the container document, are artificially flattened into direct children of the root document.
Parameters: embeddedFileFieldName

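The flattening described for PARENT_CHILD can be sketched as a tree walk that collects every descendant into a single list under the root. The Node class and flattenChildren helper below are hypothetical and exist only for this illustration; the depth-first ordering is also an assumption, since the text above does not state the order in which children are stored.

```java
import java.util.ArrayList;
import java.util.List;

public class FlattenEmbeddedSketch {

    /** Hypothetical container node: a document plus the files embedded in it. */
    public static final class Node {
        final String name;
        final List<Node> embedded = new ArrayList<>();

        public Node(String name) {
            this.name = name;
        }

        /** Adds an embedded file and returns this node to allow chaining. */
        public Node add(Node child) {
            embedded.add(child);
            return this;
        }
    }

    /** Returns the names of all descendants of root as one flat list. */
    public static List<String> flattenChildren(Node root) {
        List<String> out = new ArrayList<>();
        collect(root, out);
        return out;
    }

    // Depth-first, pre-order walk (an assumption made for this sketch).
    private static void collect(Node node, List<String> out) {
        for (Node child : node.embedded) {
            out.add(child.name);
            collect(child, out);
        }
    }
}
```

For an email that contains a zip archive with two files plus an inline image, flattenChildren returns four names, all of which would become direct children of the email regardless of their original nesting depth.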
initialize
Specified by: initialize in interface Initializable
Parameters: params - params to use for initialization
Throws: TikaConfigException

checkInitialization
public void checkInitialization(InitializableProblemHandler problemHandler) throws TikaConfigException
Specified by: checkInitialization in interface Initializable
Parameters: problemHandler - if there is a problem and no custom initializableProblemHandler has been configured via Initializable parameters, this is called to respond.
Throws: TikaConfigException