Uses of Class
org.apache.nutch.metadata.Metadata
Packages that use Metadata
  org.apache.nutch.indexer
      Index content, configure and run indexing and cleaning jobs to add, update, and delete documents from an index.
  org.apache.nutch.metadata
      A multi-valued Metadata container, and a set of constant fields for Nutch Metadata.
  org.apache.nutch.net.protocols
      Helper classes related to the Protocol interface, see also org.apache.nutch.protocol.
  org.apache.nutch.parse
      The Parse interface and related classes.
  org.apache.nutch.protocol
      Classes related to the Protocol interface, see also org.apache.nutch.net.protocols.
  org.apache.nutch.protocol.htmlunit
      Protocol plugin which supports retrieving documents via HTTP/HTTPS using Selenium and the HtmlUnitDriver web driver for the HtmlUnit headless browser.
  org.apache.nutch.protocol.http
      Protocol plugin which supports retrieving documents via the http protocol.
  org.apache.nutch.protocol.httpclient
      Protocol plugin which supports retrieving documents via the HTTP and HTTPS protocols, optionally with Basic, Digest and NTLM authentication schemes for web server as well as proxy server.
  org.apache.nutch.protocol.interactiveselenium
      Protocol plugin which supports retrieving documents using and interacting with Selenium.
  org.apache.nutch.protocol.okhttp
      Protocol plugin for HTTP/HTTPS based on okhttp, supports HTTP 1.1 and/or HTTP/2.
  org.apache.nutch.protocol.selenium
      Protocol plugin which supports retrieving documents via Selenium.
  org.apache.nutch.scoring.webgraph
  org.apache.nutch.segment
      A segment stores all data from one generate/fetch/update cycle: fetch list, protocol status, raw content, parsed content, and extracted outgoing links.
  org.apache.nutch.tools
      Miscellaneous tools.
  org.apache.nutch.tools.warc
      Tools to import / export between Nutch segments and WARC archives.
  org.creativecommons.nutch
      Sample plugins that parse and index Creative Commons metadata.
Uses of Metadata in org.apache.nutch.indexer
Methods in org.apache.nutch.indexer that return Metadata
  Metadata NutchDocument.getDocumentMeta()
Uses of Metadata in org.apache.nutch.metadata
Subclasses of Metadata in org.apache.nutch.metadata
  class CaseInsensitiveMetadata
      A decorator to Metadata that adds case-insensitive lookup of keys.
  class SpellCheckedMetadata
      A decorator to Metadata that adds spellchecking capabilities to property names.
Methods in org.apache.nutch.metadata that return Metadata
  Metadata MetaWrapper.getMetadata()
      Get all metadata.
Methods in org.apache.nutch.metadata with parameters of type Metadata
  void Metadata.addAll(Metadata metadata)
      Add all name/value mappings (merge two metadata mappings).
Constructors in org.apache.nutch.metadata with parameters of type Metadata
  MetaWrapper(Metadata metadata, Writable instance, Configuration conf)
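The addAll entry in this package is described as merging two name/value mappings in a multi-valued container. The sketch below illustrates that idea in plain Java. MultiValuedMetadataSketch is a hypothetical stand-in written for this example and is not org.apache.nutch.metadata.Metadata; in particular, appending merged values after the existing ones (rather than replacing them) is an assumption of the sketch, not something this page confirms.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a multi-valued metadata container.
// NOT the Nutch Metadata class; names and behavior are assumptions.
class MultiValuedMetadataSketch {
    // Each name maps to an ordered list of values.
    private final Map<String, List<String>> values = new LinkedHashMap<>();

    // Append one value under a name, keeping any values already stored.
    public void add(String name, String value) {
        values.computeIfAbsent(name, k -> new ArrayList<>()).add(value);
    }

    // Return a copy of the values for a name (empty list if absent).
    public List<String> getValues(String name) {
        return new ArrayList<>(values.getOrDefault(name, Collections.emptyList()));
    }

    // Merge: append every name/value pair of 'other' to this container.
    // Assumption: merged values are appended, not substituted.
    public void addAll(MultiValuedMetadataSketch other) {
        for (Map.Entry<String, List<String>> entry : other.values.entrySet()) {
            for (String value : entry.getValue()) {
                add(entry.getKey(), value);
            }
        }
    }

    public static void main(String[] args) {
        MultiValuedMetadataSketch content = new MultiValuedMetadataSketch();
        content.add("Content-Type", "text/html");

        MultiValuedMetadataSketch parse = new MultiValuedMetadataSketch();
        parse.add("Content-Type", "application/xhtml+xml");
        parse.add("Language", "en");

        content.addAll(parse);
        // prints [text/html, application/xhtml+xml]
        System.out.println(content.getValues("Content-Type"));
    }
}
```

Under these assumptions a merge never discards a value, so a caller that needs a single value per name has to choose one explicitly after merging.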
Uses of Metadata in org.apache.nutch.net.protocols
Methods in org.apache.nutch.net.protocols that return Metadata
  Metadata Response.getHeaders()
      Get all the headers.
Uses of Metadata in org.apache.nutch.parse
Methods in org.apache.nutch.parse that return Metadata
  Metadata ParseData.getContentMeta()
      The original Metadata retrieved from content.
  Metadata HTMLMetaTags.getGeneralTags()
  Metadata ParseData.getParseMeta()
      Other content properties.
Methods in org.apache.nutch.parse with parameters of type Metadata
  void ParseData.setParseMeta(Metadata parseMeta)
Constructors in org.apache.nutch.parse with parameters of type Metadata
  ParseData(ParseStatus status, String title, Outlink[] outlinks, Metadata contentMeta)
  ParseData(ParseStatus status, String title, Outlink[] outlinks, Metadata contentMeta, Metadata parseMeta)
Uses of Metadata in org.apache.nutch.protocol
Methods in org.apache.nutch.protocol that return Metadata
  Metadata Content.getMetadata()
      Other protocol-specific data.
Methods in org.apache.nutch.protocol with parameters of type Metadata
  void Content.setMetadata(Metadata metadata)
      Other protocol-specific data.
Constructors in org.apache.nutch.protocol with parameters of type Metadata
  Content(String url, String base, byte[] content, String contentType, Metadata metadata, Configuration conf)
  Content(String url, String base, byte[] content, String contentType, Metadata metadata, MimeUtil mimeTypes)
Uses of Metadata in org.apache.nutch.protocol.htmlunit
Methods in org.apache.nutch.protocol.htmlunit that return Metadata
  Metadata HttpResponse.getHeaders()
Uses of Metadata in org.apache.nutch.protocol.http
Methods in org.apache.nutch.protocol.http that return Metadata
  Metadata HttpResponse.getHeaders()
Uses of Metadata in org.apache.nutch.protocol.httpclient
Methods in org.apache.nutch.protocol.httpclient that return Metadata
  Metadata HttpResponse.getHeaders()
Methods in org.apache.nutch.protocol.httpclient with parameters of type Metadata
  HttpAuthentication HttpAuthenticationFactory.findAuthentication(Metadata header)
Uses of Metadata in org.apache.nutch.protocol.interactiveselenium
Methods in org.apache.nutch.protocol.interactiveselenium that return Metadata
  Metadata HttpResponse.getHeaders()
Uses of Metadata in org.apache.nutch.protocol.okhttp
Methods in org.apache.nutch.protocol.okhttp that return Metadata
  Metadata OkHttpResponse.getHeaders()
Uses of Metadata in org.apache.nutch.protocol.selenium
Methods in org.apache.nutch.protocol.selenium that return Metadata
  Metadata HttpResponse.getHeaders()
Uses of Metadata in org.apache.nutch.scoring.webgraph
Methods in org.apache.nutch.scoring.webgraph that return Metadata
  Metadata Node.getMetadata()
Methods in org.apache.nutch.scoring.webgraph with parameters of type Metadata
  void Node.setMetadata(Metadata metadata)
Uses of Metadata in org.apache.nutch.segment
Methods in org.apache.nutch.segment with parameters of type Metadata
  static Charset SegmentReader.getCharset(Metadata parseMeta)
      Try to get HTML encoding from parse metadata.
Uses of Metadata in org.apache.nutch.tools
Fields in org.apache.nutch.tools declared as Metadata
  protected Metadata AbstractCommonCrawlFormat.metadata
Methods in org.apache.nutch.tools with parameters of type Metadata
  static CommonCrawlFormat CommonCrawlFormatFactory.getCommonCrawlFormat(String formatType, String url, Content content, Metadata metadata, Configuration nutchConf, CommonCrawlConfig config)
      Deprecated.
  String AbstractCommonCrawlFormat.getJsonData(String url, Content content, Metadata metadata)
  String AbstractCommonCrawlFormat.getJsonData(String url, Content content, Metadata metadata, ParseData parseData)
  String CommonCrawlFormat.getJsonData(String url, Content content, Metadata metadata)
      Returns a string representation of the JSON structure of the URL content.
  String CommonCrawlFormat.getJsonData(String url, Content content, Metadata metadata, ParseData parseData)
      Returns a string representation of the JSON structure of the URL content.
  String CommonCrawlFormatWARC.getJsonData(String url, Content content, Metadata metadata, ParseData parseData)
Constructors in org.apache.nutch.tools with parameters of type Metadata
  AbstractCommonCrawlFormat(String url, Content content, Metadata metadata, Configuration nutchConf, CommonCrawlConfig config)
  CommonCrawlFormatJackson(String url, Content content, Metadata metadata, Configuration nutchConf, CommonCrawlConfig config)
  CommonCrawlFormatJettinson(String url, Content content, Metadata metadata, Configuration nutchConf, CommonCrawlConfig config)
  CommonCrawlFormatSimple(String url, Content content, Metadata metadata, Configuration nutchConf, CommonCrawlConfig config)
  CommonCrawlFormatWARC(String url, Content content, Metadata metadata, Configuration nutchConf, CommonCrawlConfig config, ParseData parseData)
Uses of Metadata in org.apache.nutch.tools.warc
Methods in org.apache.nutch.tools.warc with parameters of type Metadata
  protected com.google.gson.JsonObject WARCExporter.WARCMapReduce.WARCReducer.metadataToJson(Metadata meta)
      Adds keys/values of a Nutch metadata container to a JsonObject.
Uses of Metadata in org.creativecommons.nutch
Methods in org.creativecommons.nutch with parameters of type Metadata
  static void CCParseFilter.Walker.walk(Node doc, URL base, Metadata metadata, Configuration conf)
      Scan the document adding attributes to metadata.