Package org.apache.nutch.tools
Class CommonCrawlFormatJettinson
- java.lang.Object
-
- org.apache.nutch.tools.AbstractCommonCrawlFormat
-
- org.apache.nutch.tools.CommonCrawlFormatJettinson
-
- All Implemented Interfaces:
Closeable, AutoCloseable, CommonCrawlFormat
public class CommonCrawlFormatJettinson extends AbstractCommonCrawlFormat
This class provides methods to map crawled data to JSON using the Jettison APIs.
-
-
Field Summary
-
Fields inherited from class org.apache.nutch.tools.AbstractCommonCrawlFormat
conf, content, inLinks, jsonArray, keyPrefix, LOG, metadata, reverseKey, reverseKeyValue, simpleDateFormat, url
-
-
Constructor Summary
Constructors
CommonCrawlFormatJettinson(String url, Content content, Metadata metadata, Configuration nutchConf, CommonCrawlConfig config)
-
Method Summary
All Methods (all are concrete instance methods)
Modifier and Type, Method
protected void closeArray(String key, boolean nested, boolean newline)
protected void closeObject(String key)
protected String generateJson()
protected void startArray(String key, boolean nested, boolean newline)
protected void startObject(String key)
protected void writeArrayValue(String value)
protected void writeKeyNull(String key)
protected void writeKeyValue(String key, String value)
-
Methods inherited from class org.apache.nutch.tools.AbstractCommonCrawlFormat
close, getImported, getInLinks, getJsonData, getJsonData, getJsonData, getKey, getMethod, getRequestAccept, getRequestAcceptEncoding, getRequestAcceptLanguage, getRequestContactEmail, getRequestContactName, getRequestHostAddress, getRequestHostName, getRequestRobots, getRequestSoftware, getRequestUserAgent, getResponseAddress, getResponseContent, getResponseContentEncoding, getResponseContentType, getResponseDate, getResponseHostName, getResponseServer, getResponseStatus, getTimestamp, getUrl, setInLinks
-
-
-
-
Constructor Detail
-
CommonCrawlFormatJettinson
public CommonCrawlFormatJettinson(String url, Content content, Metadata metadata, Configuration nutchConf, CommonCrawlConfig config) throws IOException
- Throws:
IOException
-
-
Method Detail
-
writeKeyValue
protected void writeKeyValue(String key, String value) throws IOException
- Specified by:
writeKeyValue in class AbstractCommonCrawlFormat
- Throws:
IOException
-
writeKeyNull
protected void writeKeyNull(String key) throws IOException
- Specified by:
writeKeyNull in class AbstractCommonCrawlFormat
- Throws:
IOException
-
startArray
protected void startArray(String key, boolean nested, boolean newline) throws IOException
- Specified by:
startArray in class AbstractCommonCrawlFormat
- Throws:
IOException
-
closeArray
protected void closeArray(String key, boolean nested, boolean newline) throws IOException
- Specified by:
closeArray in class AbstractCommonCrawlFormat
- Throws:
IOException
-
writeArrayValue
protected void writeArrayValue(String value) throws IOException
- Specified by:
writeArrayValue in class AbstractCommonCrawlFormat
- Throws:
IOException
-
startObject
protected void startObject(String key) throws IOException
- Specified by:
startObject in class AbstractCommonCrawlFormat
- Throws:
IOException
-
closeObject
protected void closeObject(String key) throws IOException
- Specified by:
closeObject in class AbstractCommonCrawlFormat
- Throws:
IOException
-
generateJson
protected String generateJson() throws IOException
- Specified by:
generateJson in class AbstractCommonCrawlFormat
- Throws:
IOException
-
-
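Example (illustrative sketch, not Nutch code)
The protected methods listed above are all specified by the abstract parent class, which suggests a template-method design: the parent decides the order of the calls and the subclass decides how each piece of JSON is written. The sketch below illustrates that idea under stated assumptions. SketchFormat, SketchStringFormat, SketchDemo, and the render method are hypothetical stand-ins invented for this example; only the hook names and signatures are copied from this page. It writes JSON with a plain StringBuilder rather than Jettison, so it says nothing about how the real class is implemented.

```java
import java.io.IOException;
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical stand-in for an abstract format class. The hook signatures
// mirror the ones documented on this page; everything else is assumed.
abstract class SketchFormat {
    protected abstract void writeKeyValue(String key, String value) throws IOException;
    protected abstract void writeKeyNull(String key) throws IOException;
    protected abstract void startArray(String key, boolean nested, boolean newline) throws IOException;
    protected abstract void closeArray(String key, boolean nested, boolean newline) throws IOException;
    protected abstract void writeArrayValue(String value) throws IOException;
    protected abstract void startObject(String key) throws IOException;
    protected abstract void closeObject(String key) throws IOException;
    protected abstract String generateJson() throws IOException;

    // Hypothetical template method: drives the hooks in a fixed order.
    public String render(String url, String[] inLinks) throws IOException {
        startObject(null);
        writeKeyValue("url", url);
        writeKeyNull("server");
        startArray("inlinks", false, false);
        for (String link : inLinks) {
            writeArrayValue(link);
        }
        closeArray("inlinks", false, false);
        closeObject(null);
        return generateJson();
    }
}

// Hypothetical concrete subclass that builds the JSON text by hand.
// The nested and newline flags are ignored in this sketch.
class SketchStringFormat extends SketchFormat {
    private final StringBuilder out = new StringBuilder();
    // One entry per open object or array: true once it holds a member.
    private final Deque<Boolean> hasMember = new ArrayDeque<>();

    // Emits a comma if the enclosing container already has a member.
    private void separator() {
        if (!hasMember.isEmpty()) {
            if (hasMember.pop()) {
                out.append(',');
            }
            hasMember.push(true);
        }
    }

    private void key(String key) {
        separator();
        if (key != null) {
            out.append(quote(key)).append(':');
        }
    }

    // Minimal JSON string escaping: quotes, backslashes, control characters.
    private static String quote(String s) {
        StringBuilder sb = new StringBuilder("\"");
        for (char c : s.toCharArray()) {
            if (c == '"' || c == '\\') {
                sb.append('\\').append(c);
            } else if (c < 0x20) {
                sb.append(String.format("\\u%04x", (int) c));
            } else {
                sb.append(c);
            }
        }
        return sb.append('"').toString();
    }

    @Override protected void writeKeyValue(String key, String value) { key(key); out.append(quote(value)); }
    @Override protected void writeKeyNull(String key) { key(key); out.append("null"); }
    @Override protected void startArray(String key, boolean nested, boolean newline) { key(key); out.append('['); hasMember.push(false); }
    @Override protected void closeArray(String key, boolean nested, boolean newline) { hasMember.pop(); out.append(']'); }
    @Override protected void writeArrayValue(String value) { separator(); out.append(quote(value)); }
    @Override protected void startObject(String key) { key(key); out.append('{'); hasMember.push(false); }
    @Override protected void closeObject(String key) { hasMember.pop(); out.append('}'); }
    @Override protected String generateJson() { return out.toString(); }
}

class SketchDemo {
    public static void main(String[] args) throws IOException {
        String json = new SketchStringFormat()
                .render("http://example.org/", new String[] {"a", "b"});
        System.out.println(json);
    }
}
```

Running SketchDemo prints {"url":"http://example.org/","server":null,"inlinks":["a","b"]}. Swapping in a different subclass changes how the JSON is produced without touching the call order in render, which is the main appeal of this kind of design.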