Package org.apache.nutch.util
Class NutchTool
- java.lang.Object
  - org.apache.hadoop.conf.Configured
    - org.apache.nutch.util.NutchTool
- All Implemented Interfaces:
Configurable
- Direct Known Subclasses:
CommonCrawlDataDumper, CrawlDb, DeduplicationJob, Fetcher, Generator, IndexingJob, Injector, LinkDb, ParseSegment
public abstract class NutchTool extends Configured
Constructor Summary
Constructors
- NutchTool()
- NutchTool(Configuration conf)
-
Method Summary
Modifier and Type, Method, Description
- float getProgress(): Get relative progress of the tool.
- Map<String,Object> getStatus(): Returns current status of the running tool.
- boolean killJob(): Kill the job immediately.
- abstract Map<String,Object> run(Map<String,Object> args, String crawlId): Runs the tool, using a map of arguments.
- void setConf(Configuration conf)
- boolean stopJob(): Stop the job with the possibility to resume.
Methods inherited from class org.apache.hadoop.conf.Configured
getConf
Constructor Detail
-
NutchTool
public NutchTool(Configuration conf)
-
NutchTool
public NutchTool()
-
-
Method Detail
-
run
public abstract Map<String,Object> run(Map<String,Object> args, String crawlId) throws Exception
Runs the tool, using a map of arguments. May return results, or null.
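The run contract described above (a map of arguments in, a map of results or null out) can be sketched roughly as follows. This is only an illustration: RunContractSketch is a stand-in for the abstract method, not the real NutchTool, so that the example compiles without Hadoop or Nutch on the classpath. EchoTool, RunContractDemo, and the "seedDir" argument key are hypothetical names chosen for the example.

```java
import java.util.HashMap;
import java.util.Map;

// Stand-in for the abstract run(...) contract; NOT the real org.apache.nutch.util.NutchTool.
abstract class RunContractSketch {
    // Same shape as the documented method: arguments in, results (or null) out.
    public abstract Map<String, Object> run(Map<String, Object> args, String crawlId) throws Exception;
}

// Hypothetical tool that reads one argument and reports it back.
class EchoTool extends RunContractSketch {
    // An override may narrow the throws clause; this one declares no checked exceptions.
    @Override
    public Map<String, Object> run(Map<String, Object> args, String crawlId) {
        Object seedDir = args.get("seedDir"); // "seedDir" is an assumed key, for illustration only
        if (seedDir == null) {
            return null; // the contract allows null when there is nothing to report
        }
        Map<String, Object> results = new HashMap<>();
        results.put("crawlId", crawlId);
        results.put("seedDir", seedDir);
        return results;
    }
}

class RunContractDemo {
    public static void main(String[] args) {
        Map<String, Object> toolArgs = new HashMap<>();
        toolArgs.put("seedDir", "urls");
        Map<String, Object> results = new EchoTool().run(toolArgs, "crawl-1");
        System.out.println(results);
    }
}
```

Because the method may return null, callers should check the result before reading from it.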
-
setConf
public void setConf(Configuration conf)
- Specified by:
setConf in interface Configurable
- Overrides:
setConf in class Configured
-
getProgress
public float getProgress()
Get relative progress of the tool. Progress is represented as a float in the range [0,1], where 1 is complete.
- Returns:
a float in the range [0,1].
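A caller that wants to display the [0,1] progress value as a percentage might do something like the sketch below. ProgressSource is a hypothetical stand-in exposing only a getProgress() method with the documented return type; it is not the real NutchTool. ProgressFormatter and its clamping of out-of-range values are assumptions of this example, not documented behaviour.

```java
// Hypothetical stand-in exposing only getProgress(); NOT the real NutchTool.
interface ProgressSource {
    float getProgress();
}

class ProgressFormatter {
    // Converts a [0,1] progress value to a whole-number percentage.
    // Out-of-range input is clamped defensively; that is a choice made for this sketch.
    static int toPercent(float progress) {
        float clamped = Math.max(0.0f, Math.min(1.0f, progress));
        return Math.round(clamped * 100.0f);
    }

    // Builds a short human-readable line for a progress source.
    static String describe(ProgressSource source) {
        return toPercent(source.getProgress()) + "% complete";
    }
}

class ProgressDemo {
    public static void main(String[] args) {
        ProgressSource halfway = () -> 0.5f; // fixed value standing in for a running tool
        System.out.println(ProgressFormatter.describe(halfway));
    }
}
```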
-
getStatus
public Map<String,Object> getStatus()
Returns the current status of the running tool.
- Returns:
a populated Map, the fields of which can be accessed to obtain status.
-
stopJob
public boolean stopJob() throws Exception
Stop the job with the possibility to resume. Subclasses should override this, since by default it calls killJob().
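The relationship between stopJob() and killJob() described above can be sketched as follows. StoppableSketch is a stand-in that only mirrors the documented default (stopping falls back to killing); it is not the real NutchTool. ResumableSketch and the killed and paused flags are hypothetical. The throws Exception clause is left out of the stand-in to keep the example short.

```java
// Stand-in mirroring the documented stop/kill relationship; NOT the real NutchTool.
class StoppableSketch {
    private boolean killed = false;

    // Hard stop: no expectation of resuming.
    public boolean killJob() {
        killed = true;
        return true;
    }

    // Default behaviour as documented: stopping simply delegates to killJob().
    public boolean stopJob() {
        return killJob();
    }

    public boolean isKilled() {
        return killed;
    }
}

// Hypothetical subclass that overrides stopJob() so the job could be resumed later.
class ResumableSketch extends StoppableSketch {
    private boolean paused = false;

    @Override
    public boolean stopJob() {
        paused = true; // assumed bookkeeping; a real tool would persist resumable state
        return true;
    }

    public boolean isPaused() {
        return paused;
    }
}

class StopJobDemo {
    public static void main(String[] args) {
        StoppableSketch plain = new StoppableSketch();
        plain.stopJob();
        System.out.println("default stop killed the job: " + plain.isKilled());

        ResumableSketch resumable = new ResumableSketch();
        resumable.stopJob();
        System.out.println("overridden stop killed the job: " + resumable.isKilled());
    }
}
```

Without the override, a stop request is indistinguishable from a kill, which is why the documentation recommends that subclasses provide their own stopJob().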