Package org.apache.nutch.indexer
Class IndexingFilters
java.lang.Object
    org.apache.nutch.indexer.IndexingFilters
public class IndexingFilters extends Object
Creates and caches IndexingFilter implementing plugins.
Field Summary
Fields
Modifier and Type    Field                   Description
static String        INDEXINGFILTER_ORDER
Constructor Summary
Constructors
Constructor                            Description
IndexingFilters(Configuration conf)
Method Summary
All Methods
Modifier and Type    Method    Description
NutchDocument    filter(NutchDocument doc, Parse parse, Text url, CrawlDatum datum, Inlinks inlinks)    Run all defined filters.
Field Detail
INDEXINGFILTER_ORDER
public static final String INDEXINGFILTER_ORDER
See Also:
Constant Field Values
Constructor Detail
IndexingFilters
public IndexingFilters(Configuration conf)
Method Detail
filter
public NutchDocument filter(NutchDocument doc, Parse parse, Text url, CrawlDatum datum, Inlinks inlinks) throws IndexingException
Run all defined filters. Note: may return null if the document was filtered.
Parameters:
doc - the NutchDocument to process with filters
parse - corresponding Parse object for the document
url - corresponding Text url for the document
datum - corresponding CrawlDatum for the document
inlinks - corresponding Inlinks for the document
Returns:
the NutchDocument, or null if the document was filtered
Throws:
IndexingException - if an error occurs within a filter
See Also:
IndexingFilter.filter(NutchDocument, Parse, Text, CrawlDatum, Inlinks)
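The null return described above can be illustrated with a minimal, self-contained sketch of a filter chain. It does not use Nutch classes: FilterChainSketch, runFilters, and the String "documents" are hypothetical stand-ins, and the sketch assumes that filters run in order and that the chain stops as soon as one filter returns null.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.UnaryOperator;

/** Hypothetical stand-in for a null-aware filter chain; not a Nutch class. */
public class FilterChainSketch {

    /**
     * Applies each filter in order. A filter that returns null drops the
     * document, and the remaining filters are skipped (assumed behavior).
     */
    public static <D> D runFilters(List<UnaryOperator<D>> filters, D doc) {
        for (UnaryOperator<D> filter : filters) {
            doc = filter.apply(doc);
            if (doc == null) {
                return null; // document was filtered out
            }
        }
        return doc;
    }

    public static void main(String[] args) {
        List<UnaryOperator<String>> filters = new ArrayList<>();
        filters.add(d -> d.trim());                  // normalize the document
        filters.add(d -> d.isEmpty() ? null : d);    // drop empty documents
        filters.add(d -> d.toUpperCase());           // never sees a dropped document

        System.out.println(runFilters(filters, "  nutch  ")); // prints NUTCH
        System.out.println(runFilters(filters, "   "));       // prints null
    }
}
```

Callers of filter should likewise check the returned NutchDocument for null before passing it on for indexing.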