Class GeoPkgParser

  • All Implemented Interfaces:
    Serializable, Initializable, Parser

    public class GeoPkgParser
    extends SQLite3Parser
    Customization of sqlite parser to skip certain common blob columns.

    The motivation is that "geom" and "data" columns are intrinsic to geopkg and are not regular embedded files. Tika treats every blob column as a potential embedded file, which can add dramatically to the time needed to parse geopkg files that may contain hundreds of thousands of uninteresting blobs.

    Users may modify which columns are ignored or turn off "ignoring" of all columns.

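    To add a column to the default "ignore blob columns" list via tika-config.xml, supply the column names as a list in this parser's configuration. The example names geom and data together with one additional column, something:

           geom
           data
           something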

    Or use an empty list to parse all columns.
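
    The ignore-list behavior described above can be pictured with a minimal, self-contained sketch. It is not Tika's implementation: the class BlobColumnFilterSketch and the method shouldProcessBlob are hypothetical names, and exact, case-sensitive matching of column names is an assumption made for illustration only.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical illustration only; this class is not part of Tika.
public class BlobColumnFilterSketch {

    // Names of blob columns that should NOT be treated as embedded files.
    private final Set<String> ignored = new HashSet<>();

    public BlobColumnFilterSketch(List<String> ignoreBlobColumns) {
        ignored.addAll(ignoreBlobColumns);
    }

    // Returns true when the blob column should be handled as a potential
    // embedded file. Assumption: names are compared exactly (case-sensitive).
    public boolean shouldProcessBlob(String columnName) {
        return !ignored.contains(columnName);
    }

    public static void main(String[] args) {
        // The same three names as in the configuration example.
        BlobColumnFilterSketch filter =
                new BlobColumnFilterSketch(List.of("geom", "data", "something"));
        System.out.println(filter.shouldProcessBlob("geom"));      // false
        System.out.println(filter.shouldProcessBlob("thumbnail")); // true

        // An empty list means nothing is ignored, so every blob column is parsed.
        BlobColumnFilterSketch parseAll = new BlobColumnFilterSketch(List.of());
        System.out.println(parseAll.shouldProcessBlob("geom"));    // true
    }
}
```

    In this sketch an empty list leaves the ignore set empty, so every blob column is processed, which mirrors the "parse all columns" case.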

    See Also:
    Serialized Form
    • Constructor Detail

      • GeoPkgParser

        public GeoPkgParser()
        Checks whether the org.sqlite.JDBC class is available.

        If it is not, this class returns an EMPTY_SET from getSupportedTypes().
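
        The availability check described above might look roughly like the following sketch. It is not the actual Tika source: DriverCheckSketch, isClassAvailable, and supportedTypes are hypothetical names, the media type string is a placeholder, and loading the class without initializing it is an assumption.

```java
import java.util.Collections;
import java.util.Set;

// Hypothetical illustration only; this class is not part of Tika.
public class DriverCheckSketch {

    // Placeholder type name used purely for illustration.
    private static final Set<String> TYPES =
            Collections.singleton("application/x-example");

    // Returns true if the named class can be loaded from the classpath.
    // The class is located but not initialized (second argument is false).
    public static boolean isClassAvailable(String className) {
        try {
            Class.forName(className, false, DriverCheckSketch.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    // Advertises the supported types only when the driver class is present;
    // otherwise returns an empty set.
    public static Set<String> supportedTypes(String driverClass) {
        return isClassAvailable(driverClass) ? TYPES : Collections.emptySet();
    }

    public static void main(String[] args) {
        // The result depends on whether the SQLite JDBC driver is on the classpath.
        System.out.println(supportedTypes("org.sqlite.JDBC"));
    }
}
```

        Returning an empty set when the driver is missing means a parser built this way simply advertises no supported types, rather than failing at construction time.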