Package org.apache.nutch.crawl
Class LinkDbReader
- java.lang.Object
  - org.apache.hadoop.conf.Configured
    - org.apache.nutch.util.AbstractChecker
      - org.apache.nutch.crawl.LinkDbReader
- All Implemented Interfaces:
Closeable, AutoCloseable, Configurable, Tool
public class LinkDbReader extends AbstractChecker implements Closeable
Read utility for the LinkDb.
-
Nested Class Summary
Modifier and Type    Class
static class         LinkDbReader.LinkDBDumpMapper
-
Field Summary
-
Fields inherited from class org.apache.nutch.util.AbstractChecker
keepClientCnxOpen, stdin, tcpPort, usage
-
Constructor Summary
Constructors
LinkDbReader()
LinkDbReader(Configuration conf, Path directory)
-
Method Summary
Modifier and Type    Method
void                 close()
String[]             getAnchors(Text url)
Inlinks              getInlinks(Text url)
void                 init(Path directory)
static void          main(String[] args)
void                 openReaders()
protected int        process(String line, StringBuilder output)
void                 processDumpJob(String linkdb, String output, String regex)
int                  run(String[] args)
-
Methods inherited from class org.apache.nutch.util.AbstractChecker
getProtocolOutput, parseArgs, processSingle, processStdin, processTCP, run
-
Methods inherited from class org.apache.hadoop.conf.Configured
getConf, setConf
-
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
-
Methods inherited from interface org.apache.hadoop.conf.Configurable
getConf, setConf
-
Constructor Detail
-
LinkDbReader
public LinkDbReader()
-
LinkDbReader
public LinkDbReader(Configuration conf, Path directory) throws Exception
- Throws:
Exception
-
Method Detail
-
openReaders
public void openReaders() throws IOException
- Throws:
IOException
-
getAnchors
public String[] getAnchors(Text url) throws IOException
- Throws:
IOException
-
getInlinks
public Inlinks getInlinks(Text url) throws IOException
- Throws:
IOException
-
close
public void close() throws IOException
- Specified by:
close in interface AutoCloseable
- Specified by:
close in interface Closeable
- Throws:
IOException
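Because LinkDbReader implements Closeable (and therefore AutoCloseable), an instance can be managed with try-with-resources. The sketch below illustrates that pattern with a hypothetical stand-in class, StandInReader, so that it compiles without Nutch or Hadoop on the classpath. StandInReader and its lookup method are assumptions made for illustration and say nothing about how LinkDbReader is implemented.

```java
import java.io.Closeable;
import java.io.IOException;
import java.io.UncheckedIOException;

public class CloseableReaderSketch {

    /** Hypothetical stand-in for a reader that holds resources; not a Nutch class. */
    static final class StandInReader implements Closeable {
        private boolean closed = false;

        /** Illustrative lookup that fails once the reader has been closed. */
        String lookup(String url) throws IOException {
            if (closed) {
                throw new IOException("reader is closed");
            }
            return "inlinks for " + url;
        }

        @Override
        public void close() throws IOException {
            closed = true;
        }

        boolean isClosed() {
            return closed;
        }
    }

    /** Opens the stand-in reader, performs one lookup, and reports whether close() ran. */
    public static boolean closedAfterUse(String url) {
        StandInReader reader = new StandInReader();
        // try-with-resources calls close() when the block exits, normally or by exception.
        try (StandInReader r = reader) {
            r.lookup(url);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return reader.isClosed();
    }

    public static void main(String[] args) {
        System.out.println(closedAfterUse("http://example.org/"));
    }
}
```

The same shape should carry over to a reader created with LinkDbReader(Configuration conf, Path directory), with getInlinks or getAnchors called inside the try block.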
-
processDumpJob
public void processDumpJob(String linkdb, String output, String regex) throws IOException, InterruptedException, ClassNotFoundException
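processDumpJob takes a regex argument, but this page does not say how it is applied. One plausible reading, stated here as an assumption, is that only entries whose URL matches the pattern in full are written to the output. The sketch below shows that kind of whole-string filtering with java.util.regex; the class DumpRegexSketch and the method filterUrls are hypothetical names and are not part of Nutch.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

public class DumpRegexSketch {

    /**
     * Returns the URLs that match the regex in full.
     * Assumption: whole-string matching via Matcher.matches(); the
     * reference page does not specify the matching semantics.
     */
    public static List<String> filterUrls(List<String> urls, String regex) {
        Pattern pattern = Pattern.compile(regex);
        List<String> kept = new ArrayList<>();
        for (String url : urls) {
            if (pattern.matcher(url).matches()) {
                kept.add(url);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<String> urls = Arrays.asList(
            "http://example.org/a",
            "http://example.com/b",
            "https://example.org/c");
        // Keeps only the URLs that contain "example.org".
        System.out.println(filterUrls(urls, ".*example\\.org.*"));
    }
}
```

Under whole-string matching, a bare pattern such as example selects nothing, so patterns usually need leading and trailing .* to act as a substring filter.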
-
process
protected int process(String line, StringBuilder output) throws Exception
- Specified by:
process in class AbstractChecker
- Throws:
Exception
-