Package org.apache.tika.parser.jdbc
Class AbstractDBParser
java.lang.Object
org.apache.tika.parser.jdbc.AbstractDBParser
- All Implemented Interfaces:
- Serializable, Parser
- Direct Known Subclasses:
- SQLite3DBParser
Abstract class that handles iterating through tables within a database.
Constructor Summary
- AbstractDBParser()
Method Summary
- protected void close(): Override this for any special handling of closing the connection.
- protected void extractMetadata(Connection connection, Metadata metadata): This is called before parsing the tables to extract metadata from the db, if any.
- protected Connection getConnection(InputStream stream, Metadata metadata, ParseContext context): Override this for special configuration of the connection, such as limiting the number of rows to be held in memory.
- protected abstract String getConnectionString(InputStream stream, Metadata metadata, ParseContext parseContext): Implement for db specific connection information, e.g. "jdbc:sqlite:/docs/mydb.db".
- protected abstract String getJDBCClassName(): JDBC class name, e.g. org.sqlite.JDBC
- Set<MediaType> getSupportedTypes(ParseContext context): Returns the set of media types supported by this parser when used with the given parse context.
- protected abstract List<String> getTableNames(Connection connection, Metadata metadata, ParseContext context): Returns the names of the tables to process.
- protected abstract JDBCTableReader getTableReader(Connection connection, String tableName, EmbeddedDocumentUtil embeddedDocumentUtil): Given a connection and a table name, return the JDBCTableReader for this db.
- protected abstract JDBCTableReader getTableReader(Connection connection, String tableName, ParseContext parseContext): Deprecated.
- void parse(InputStream stream, ContentHandler handler, Metadata metadata, ParseContext context): Parses a document stream into a sequence of XHTML SAX events.
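The hooks listed above follow a template-method shape: the base class drives the iteration and subclasses supply the db-specific pieces. The following is a minimal sketch of that shape, not the Tika implementation. The names TableWalkerSketch, InMemoryWalker, walk and readTable are hypothetical, a plain String stands in for the JDBC Connection, and the exact call order (beyond extractMetadata running before the tables are parsed) is an assumption inferred from the method descriptions.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical, simplified stand-in for the base class; it is NOT AbstractDBParser. */
abstract class TableWalkerSketch {
    // Hooks a db-specific subclass must supply (mirrors the abstract methods above).
    protected abstract String connectionString(String dbPath);
    protected abstract List<String> tableNames(String connection);
    protected abstract List<String> readTable(String connection, String tableName);

    // Optional hooks; both are no-ops unless overridden.
    protected void extractMetadata(String connection, Map<String, String> metadata) { }
    protected void close() { }

    /** Template method: fixed iteration order, db-specific steps delegated to the hooks. */
    public final List<String> walk(String dbPath, Map<String, String> metadata) {
        String connection = connectionString(dbPath);
        List<String> events = new ArrayList<>();
        try {
            extractMetadata(connection, metadata);      // before any table is read
            for (String tableName : tableNames(connection)) {
                events.add("table:" + tableName);
                events.addAll(readTable(connection, tableName));
            }
        } finally {
            close();                                    // always release the connection
        }
        return events;
    }
}

/** Hypothetical subclass backed by an in-memory map instead of a real database. */
class InMemoryWalker extends TableWalkerSketch {
    private final Map<String, List<String>> tables = new LinkedHashMap<>();
    boolean closed = false;

    InMemoryWalker() {
        tables.put("people", List.of("alice", "bob"));
        tables.put("pets", List.of("rex"));
    }

    @Override protected String connectionString(String dbPath) { return "fake:" + dbPath; }
    @Override protected List<String> tableNames(String connection) { return new ArrayList<>(tables.keySet()); }
    @Override protected List<String> readTable(String connection, String tableName) { return tables.get(tableName); }
    @Override protected void extractMetadata(String connection, Map<String, String> metadata) {
        metadata.put("connection", connection);
    }
    @Override protected void close() { closed = true; }

    public static void main(String[] args) {
        System.out.println(new InMemoryWalker().walk("my.db", new LinkedHashMap<>()));
    }
}
```

In the real class the hooks operate on java.sql.Connection and JDBCTableReader; the sketch only illustrates which steps are fixed and which are left to subclasses such as SQLite3DBParser.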
- 
Constructor Details
AbstractDBParser
public AbstractDBParser()
 
- 
- 
Method Details
getSupportedTypes
public Set<MediaType> getSupportedTypes(ParseContext context)
Description copied from interface: Parser
Returns the set of media types supported by this parser when used with the given parse context.
- Specified by:
- getSupportedTypes in interface Parser
- Parameters:
- context- parse context
- Returns:
- immutable set of media types
 
- 
parse
public void parse(InputStream stream, ContentHandler handler, Metadata metadata, ParseContext context) throws IOException, SAXException, TikaException
Description copied from interface: Parser
Parses a document stream into a sequence of XHTML SAX events. Fills in related document metadata in the given metadata object.
The given document stream is consumed but not closed by this method. The responsibility to close the stream remains on the caller.
Information about the parsing context can be passed in the context parameter. See the parser implementations for the kinds of context information they expect.
- Specified by:
- parse in interface Parser
- Parameters:
- stream- the document stream (input)
- handler- handler for the XHTML SAX events (output)
- metadata- document metadata (input and output)
- context- parse context
- Throws:
- IOException- if the document stream could not be read
- SAXException- if the SAX events could not be processed
- TikaException- if the document could not be parsed
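The stream contract above (consumed, but not closed, by the parser) means the caller should own the stream's lifetime, typically with try-with-resources. The sketch below illustrates only that contract with JDK classes. CallerClosesStreamSketch, consume and TrackingStream are hypothetical names, and consume is a stand-in for a call such as parser.parse(stream, handler, metadata, context).

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

class CallerClosesStreamSketch {

    /** Hypothetical stand-in for parse(): reads the stream to the end, never closes it. */
    static int consume(InputStream stream) throws IOException {
        int total = 0;
        byte[] buffer = new byte[1024];
        int n;
        while ((n = stream.read(buffer)) != -1) {
            total += n;
        }
        return total;
    }

    /** Test helper that records whether close() has been called. */
    static class TrackingStream extends ByteArrayInputStream {
        boolean closed = false;

        TrackingStream(byte[] data) {
            super(data);
        }

        @Override
        public void close() throws IOException {
            closed = true;
            super.close();
        }
    }

    /** Returns {closed right after consume, closed after the try block}. */
    static boolean[] demo() throws IOException {
        TrackingStream stream = new TrackingStream(new byte[] {1, 2, 3});
        boolean closedAfterConsume;
        try (TrackingStream s = stream) {       // the caller owns the stream
            consume(s);
            closedAfterConsume = s.closed;      // still open: consume() did not close it
        }
        return new boolean[] {closedAfterConsume, stream.closed};
    }
}
```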
 
- 
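extractMetadata
protected void extractMetadata(Connection connection, Metadata metadata)
This is called before parsing the tables to extract metadata from the db, if any. Override this for db specific metadata. This implementation is a no-op.
- Parameters:
- connection- connection to the db
- metadata- metadata object to populate with db specific metadata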
 
- 
close
protected void close() throws SQLException, IOException
Override this for any special handling of closing the connection.
- Throws:
- SQLException
- IOException
 
- 
getConnection
protected Connection getConnection(InputStream stream, Metadata metadata, ParseContext context) throws IOException, TikaException
Override this for special configuration of the connection, such as limiting the number of rows to be held in memory.
- Parameters:
- stream- stream to use
- metadata- metadata that could be used in parameterizing the connection
- context- parse context that could be used in parameterizing the connection
- Returns:
- connection
- Throws:
- IOException
- TikaException
 
- 
getConnectionString
protected abstract String getConnectionString(InputStream stream, Metadata metadata, ParseContext parseContext) throws IOException
Implement for db specific connection information, e.g. "jdbc:sqlite:/docs/mydb.db". Include any optimization settings, user name, password, etc.
- Parameters:
- stream- stream for processing
- metadata- metadata might be useful in determining connection info
- parseContext- context to use to help create connectionString
- Returns:
- connection string to be used by getConnection(InputStream, Metadata, ParseContext)
- Throws:
- IOException
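As an illustration of what an implementation might return, the sketch below spools the incoming stream to a temporary file and builds a SQLite-style connection string in the "jdbc:sqlite:<path>" form quoted above. This is an assumption about one possible approach, not the code used by SQLite3DBParser. ConnectionStringSketch and sqliteConnectionString are hypothetical names, and the Metadata and ParseContext parameters are omitted to keep the example free of Tika dependencies.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

class ConnectionStringSketch {

    /**
     * Hypothetical helper: copies the stream to a temp file (file-based JDBC
     * drivers need a path, not a stream) and returns a "jdbc:sqlite:" URL for it.
     */
    static String sqliteConnectionString(InputStream stream) throws IOException {
        Path tmp = Files.createTempFile("db-sketch-", ".db");
        tmp.toFile().deleteOnExit();    // best-effort cleanup when the JVM exits
        Files.copy(stream, tmp, StandardCopyOption.REPLACE_EXISTING);
        return "jdbc:sqlite:" + tmp.toAbsolutePath();
    }
}
```

Driver-specific options, user name, or password would be appended to this string in whatever form the target driver documents.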
 
- 
getJDBCClassName
protected abstract String getJDBCClassName()
JDBC class name, e.g. org.sqlite.JDBC
- Returns:
- JDBC class name
 
- 
getTableNames
protected abstract List<String> getTableNames(Connection connection, Metadata metadata, ParseContext context) throws SQLException
Returns the names of the tables to process.
- Parameters:
- connection- Connection to use to make the sql call(s) to get the names of the tables
- metadata- Metadata to use (potentially) in decision about which tables to extract
- context- ParseContext to use (potentially) in decision about which tables to extract
- Returns:
- the names of the tables to process
- Throws:
- SQLException
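One generic way to list tables is the standard JDBC DatabaseMetaData.getTables call, sketched below. Whether a given subclass uses this call or a db-specific query is not stated in this page, so treat it as an assumption. TableNamesSketch and tableNames are hypothetical names, and the Metadata and ParseContext parameters are left out to keep the example free of Tika dependencies.

```java
import java.sql.Connection;
import java.sql.DatabaseMetaData;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.List;

class TableNamesSketch {

    /** Hypothetical helper: lists ordinary tables via standard JDBC metadata. */
    static List<String> tableNames(Connection connection) throws SQLException {
        List<String> names = new ArrayList<>();
        DatabaseMetaData meta = connection.getMetaData();
        // null catalog/schema = do not filter on them; "%" matches every table name;
        // the type filter restricts the result to plain tables (no views, etc.).
        try (ResultSet rs = meta.getTables(null, null, "%", new String[] {"TABLE"})) {
            while (rs.next()) {
                names.add(rs.getString("TABLE_NAME"));
            }
        }
        return names;
    }
}
```

An implementation could additionally consult the metadata or parse context here to skip tables that should not be extracted.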
 
- 
getTableReader
@Deprecated protected abstract JDBCTableReader getTableReader(Connection connection, String tableName, ParseContext parseContext)
Deprecated. Use getTableReader(Connection, String, EmbeddedDocumentUtil) instead.
Given a connection and a table name, return the JDBCTableReader for this db.
- Parameters:
- connection- connection to the db
- tableName- name of the table to read
- Returns:
- a reader
 
- 
getTableReader
protected abstract JDBCTableReader getTableReader(Connection connection, String tableName, EmbeddedDocumentUtil embeddedDocumentUtil)
Given a connection and a table name, return the JDBCTableReader for this db.
- Parameters:
- connection- connection to the db
- tableName- name of the table to read
- embeddedDocumentUtil- embedded doc util
- Returns:
- a JDBCTableReader for the given table
 
 