Package org.apache.nutch.parse
Interface Parse
All Known Implementing Classes:
ParseImpl
public interface Parse
The result of parsing a page's raw content.
See Also:
Parser.getParse(Content)
Method Summary
Modifier and Type    Method           Description
ParseData            getData()        Other data extracted from the page.
String               getText()        The textual content of the page.
boolean              isCanonical()    Indicates whether the parse comes from a URL or a sub-URL.
Method Detail
getText
String getText()
The textual content of the page. This is indexed, searched, and used when generating snippets.
Returns:
the entire text String
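As a rough illustration of how the text returned by getText() might feed snippet generation, the sketch below truncates the page text to a fixed length. SimpleParse is a hypothetical stand-in that mirrors only getText() so the example compiles without Nutch on the classpath, and the snippet helper with its truncation rule is an assumption for illustration, not Nutch's own summarizer.

```java
public class SnippetSketch {

    /** Hypothetical stand-in mirroring Parse.getText(); not the Nutch interface itself. */
    interface SimpleParse {
        String getText();
    }

    /**
     * Returns at most maxLen characters of the trimmed parse text,
     * appending "..." when the text had to be cut.
     */
    static String snippet(SimpleParse parse, int maxLen) {
        String text = parse.getText();
        if (text == null) {
            return "";
        }
        String trimmed = text.trim();
        if (trimmed.length() <= maxLen) {
            return trimmed;
        }
        return trimmed.substring(0, maxLen) + "...";
    }

    public static void main(String[] args) {
        // A lambda is enough here because the stand-in has a single method.
        SimpleParse parse = () -> "Apache Nutch is an extensible web crawler.";
        System.out.println(snippet(parse, 12));
    }
}
```

With the real interface, the same helper would accept an org.apache.nutch.parse.Parse in place of SimpleParse, since it calls nothing but getText().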
getData
ParseData getData()
Other data extracted from the page.
Returns:
a populated ParseData object
isCanonical
boolean isCanonical()
Indicates whether the parse comes from a URL or a sub-URL.
Returns:
true if canonical, false otherwise
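The following sketch shows one way a caller might use isCanonical() to keep only some of the parses from a batch. It assumes, as the description suggests but does not state outright, that a canonical parse is the one for the URL itself rather than for a sub-URL. SimpleParse, SimpleParseImpl, and canonicalTexts are hypothetical names introduced for this example; they mirror the documented method signatures but are not Nutch classes.

```java
import java.util.ArrayList;
import java.util.List;

public class CanonicalSketch {

    /** Hypothetical stand-in mirroring Parse.getText() and Parse.isCanonical(). */
    interface SimpleParse {
        String getText();
        boolean isCanonical();
    }

    /** Minimal immutable implementation used only for this example. */
    static final class SimpleParseImpl implements SimpleParse {
        private final String text;
        private final boolean canonical;

        SimpleParseImpl(String text, boolean canonical) {
            this.text = text;
            this.canonical = canonical;
        }

        @Override
        public String getText() {
            return text;
        }

        @Override
        public boolean isCanonical() {
            return canonical;
        }
    }

    /** Collects the text of every parse that reports itself as canonical. */
    static List<String> canonicalTexts(List<SimpleParse> parses) {
        List<String> texts = new ArrayList<>();
        for (SimpleParse parse : parses) {
            if (parse.isCanonical()) {
                texts.add(parse.getText());
            }
        }
        return texts;
    }

    public static void main(String[] args) {
        List<SimpleParse> parses = new ArrayList<>();
        parses.add(new SimpleParseImpl("main page text", true));
        parses.add(new SimpleParseImpl("embedded entry text", false));
        System.out.println(canonicalTexts(parses));
    }
}
```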