Class QuerystringURLNormalizer
- java.lang.Object
-
- org.apache.nutch.net.urlnormalizer.querystring.QuerystringURLNormalizer
-
- All Implemented Interfaces:
Configurable, URLNormalizer
public class QuerystringURLNormalizer extends Object implements URLNormalizer
URL normalizer plugin that normalizes query strings by sorting their parameters. Leaving query strings unsorted can produce large numbers of duplicate URLs that differ only in parameter order, such as ?a=x&b=y vs ?b=y&a=x.
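The idea can be illustrated with a minimal, standalone sketch. This is not the plugin's actual source: the class name QuerySortSketch and the method sortQuery are hypothetical, and the sketch assumes parameters are separated by "&" and compared as plain strings, with no decoding or validation.

```java
import java.util.Arrays;

// Hypothetical illustration only; not the Nutch implementation.
class QuerySortSketch {

    // Returns the URL with its query parameters sorted lexicographically.
    // URLs without a query string are returned unchanged.
    static String sortQuery(String url) {
        // Split off the fragment first so a '?' inside it is not mistaken for a query.
        int hash = url.indexOf('#');
        String fragment = hash < 0 ? "" : url.substring(hash);
        String base = hash < 0 ? url : url.substring(0, hash);

        int q = base.indexOf('?');
        if (q < 0 || q == base.length() - 1) {
            return url; // no query string, or an empty one
        }

        // Assumption: parameters are separated by '&' and compared as raw strings.
        String[] params = base.substring(q + 1).split("&");
        Arrays.sort(params);

        return base.substring(0, q + 1) + String.join("&", params) + fragment;
    }
}
```

Under these assumptions, sortQuery maps both http://example.com/page?b=y&a=x and http://example.com/page?a=x&b=y to the same string, so a crawler would treat them as one URL.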
Field Summary
-
Fields inherited from interface org.apache.nutch.net.URLNormalizer
X_POINT_ID
-
-
Constructor Summary
Constructors
Constructor                   Description
QuerystringURLNormalizer()
-
Method Summary
Modifier and Type    Method                                       Description
Configuration        getConf()
String               normalize(String urlString, String scope)
void                 setConf(Configuration conf)
Method Detail
-
getConf
public Configuration getConf()
- Specified by:
getConf in interface Configurable
-
setConf
public void setConf(Configuration conf)
- Specified by:
setConf in interface Configurable
-
normalize
public String normalize(String urlString, String scope) throws MalformedURLException
- Specified by:
normalize in interface URLNormalizer
- Throws:
MalformedURLException