Package org.apache.nutch.segment
Class SegmentChecker
- java.lang.Object
- org.apache.nutch.segment.SegmentChecker
public class SegmentChecker extends Object
Checks whether a segment is valid, or has a certain status (generated, fetched, parsed), or can be used safely for a certain processing step (e.g., indexing).
Constructor Summary
- SegmentChecker()
Method Summary
- static boolean checkSegmentDir(Path segmentPath, FileSystem fs): Checks whether the segment is valid, based on its subdirectories.
- static boolean isIndexable(Path segmentPath, FileSystem fs): Checks whether the segment is indexable.
- static boolean isParsed(Path segment, FileSystem fs): Checks whether the segment has already been parsed.
Method Detail
isIndexable
public static boolean isIndexable(Path segmentPath, FileSystem fs) throws IOException
Checks whether the segment is indexable. Further checks may be added to this method.
- Parameters:
segmentPath - path to an individual segment on disk
fs - the FileSystem that the segment resides on
- Returns:
- true if the checks pass and the segment can be indexed, false otherwise
- Throws:
IOException - if an I/O error occurs while locating the segment on the filesystem or checking its contents
checkSegmentDir
public static boolean checkSegmentDir(Path segmentPath, FileSystem fs) throws IOException
Checks whether the segment is valid, based on its subdirectories.
- Parameters:
segmentPath - path to an individual segment on disk
fs - the FileSystem that the segment resides on
- Returns:
- true if the checks pass, false otherwise
- Throws:
IOException - if an I/O error occurs while locating the segment on the filesystem or checking its contents
isParsed
public static boolean isParsed(Path segment, FileSystem fs) throws IOException
Checks whether the segment has already been parsed.
- Parameters:
segment - path to an individual segment on disk
fs - the FileSystem that the segment resides on
- Returns:
- true if the checks pass and the segment has been parsed, false otherwise
- Throws:
IOException - if an I/O error occurs while locating the segment on the filesystem or checking its contents
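The idea behind checkSegmentDir and isParsed, deciding a segment's status from the subdirectories it contains, can be illustrated with a small local-filesystem sketch. This is not the Nutch implementation: it uses java.nio.file.Path instead of Hadoop's Path and FileSystem, and the class name SegmentLayoutSketch, both helper methods, and the subdirectory names (crawl_fetch, crawl_parse, parse_data, parse_text) are assumptions for illustration that this page does not state.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;

public class SegmentLayoutSketch {

    // ASSUMPTION: subdirectories a complete segment is expected to contain.
    // These names are illustrative and are not taken from this page.
    static final List<String> REQUIRED_SUBDIRS =
            List.of("crawl_fetch", "crawl_parse", "parse_data", "parse_text");

    // ASSUMPTION: the subdirectory whose presence marks a segment as parsed.
    static final String PARSE_MARKER = "crawl_parse";

    // Hypothetical analogue of a sub-directory based validity check:
    // true only if every required subdirectory exists.
    static boolean hasRequiredSubdirs(Path segment) {
        for (String name : REQUIRED_SUBDIRS) {
            if (!Files.isDirectory(segment.resolve(name))) {
                return false;
            }
        }
        return true;
    }

    // Hypothetical analogue of a "has this been parsed?" check.
    static boolean looksParsed(Path segment) {
        return Files.isDirectory(segment.resolve(PARSE_MARKER));
    }

    public static void main(String[] args) throws IOException {
        // Build a throwaway segment that has only been fetched.
        Path segment = Files.createTempDirectory("segment-sketch");
        Files.createDirectory(segment.resolve("crawl_fetch"));
        System.out.println(hasRequiredSubdirs(segment)); // false
        System.out.println(looksParsed(segment));        // false

        // Add the remaining subdirectories; createDirectories skips existing ones.
        for (String name : REQUIRED_SUBDIRS) {
            Files.createDirectories(segment.resolve(name));
        }
        System.out.println(hasRequiredSubdirs(segment)); // true
        System.out.println(looksParsed(segment));        // true
    }
}
```

In a real crawl the equivalent calls are the static methods documented above, each taking a Hadoop Path to the segment and the FileSystem it resides on, and each declaring IOException.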