Class BoilerpipeContentHandler

    • Constructor Detail

      • BoilerpipeContentHandler

        public BoilerpipeContentHandler(ContentHandler delegate)
        Creates a new boilerpipe-based content extractor, using the DefaultExtractor extraction rules and "delegate" as the content handler.
        Parameters:
        delegate - The ContentHandler that receives the extracted main content
      • BoilerpipeContentHandler

        public BoilerpipeContentHandler(Writer writer)
        Creates a content handler that writes XHTML body character events to the given writer.
        Parameters:
        writer - The writer that receives the XHTML body character events
      • BoilerpipeContentHandler

        public BoilerpipeContentHandler(ContentHandler delegate,
                                        de.l3s.boilerpipe.BoilerpipeExtractor extractor)
        Creates a new boilerpipe-based content extractor, using the given extraction rules. The extracted main content will be passed to the content handler.
        Parameters:
        delegate - The ContentHandler that receives the extracted main content
        extractor - Extraction rules to use, e.g. ArticleExtractor
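The two-argument constructor might be used roughly as follows. This is a minimal sketch, not a definitive recipe: it assumes the Tika HTML parser and the boilerpipe library are on the classpath, and that BoilerpipeContentHandler lives in org.apache.tika.parser.html (some Tika releases place it in org.apache.tika.parser.html.boilerpipe instead, so adjust the import to match your version). The class name BoilerpipeExample and the sample HTML are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

import org.apache.tika.metadata.Metadata;
import org.apache.tika.parser.ParseContext;
// Assumed package; may be org.apache.tika.parser.html.boilerpipe in other releases.
import org.apache.tika.parser.html.BoilerpipeContentHandler;
import org.apache.tika.parser.html.HtmlParser;
import org.apache.tika.sax.BodyContentHandler;

import de.l3s.boilerpipe.extractors.ArticleExtractor;

// Hypothetical example class, for illustration only.
public class BoilerpipeExample {
    public static void main(String[] args) throws Exception {
        // Made-up input: a page with navigation boilerplate around an article body.
        String html = "<html><head><title>Sample</title></head><body>"
                + "<div><a href=\"/\">Home</a> | <a href=\"/about\">About</a></div>"
                + "<p>This paragraph stands in for the main article text. It is written "
                + "as several full sentences so that an extractor has something that "
                + "looks like content rather than navigation links.</p>"
                + "</body></html>";

        // The delegate collects whatever the boilerpipe handler passes on as main content.
        BodyContentHandler delegate = new BodyContentHandler();

        // Use ArticleExtractor rules instead of the DefaultExtractor
        // that the single-argument constructor uses.
        BoilerpipeContentHandler handler =
                new BoilerpipeContentHandler(delegate, ArticleExtractor.INSTANCE);

        try (InputStream in = new ByteArrayInputStream(html.getBytes(StandardCharsets.UTF_8))) {
            new HtmlParser().parse(in, handler, new Metadata(), new ParseContext());
        }

        // Print the text that reached the delegate.
        System.out.println(delegate.toString());
    }
}
```

Passing only the delegate, as in new BoilerpipeContentHandler(delegate), would apply the DefaultExtractor rules instead; which rule set works better depends on the pages being processed.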