Class HttpWebClient
- java.lang.Object
  - org.apache.nutch.protocol.selenium.HttpWebClient
-
public class HttpWebClient extends Object
-
Constructor Summary
Constructor Description
- HttpWebClient()
-
Method Summary
Modifier and Type, Method, Description
- static void cleanUpDriver(org.openqa.selenium.WebDriver driver)
- static org.openqa.selenium.remote.RemoteWebDriver createChromeRemoteWebDriver(URL seleniumHubUrl, boolean enableHeadlessMode)
- static org.openqa.selenium.WebDriver createChromeWebDriver(String chromeDriverPath, boolean enableHeadlessMode)
- static org.openqa.selenium.remote.RemoteWebDriver createDefaultRemoteWebDriver(URL seleniumHubUrl, boolean enableHeadlessMode)
- static org.openqa.selenium.remote.RemoteWebDriver createFirefoxRemoteWebDriver(URL seleniumHubUrl, boolean enableHeadlessMode)
- static org.openqa.selenium.WebDriver createFirefoxWebDriver(String firefoxDriverPath, boolean enableHeadlessMode)
- static org.openqa.selenium.remote.RemoteWebDriver createRandomRemoteWebDriver(URL seleniumHubUrl, boolean enableHeadlessMode)
- static org.openqa.selenium.WebDriver getDriverForPage(String url, Configuration conf)
- static String getHtmlPage(String url)
- static String getHtmlPage(String url, Configuration conf)
  Function for obtaining the HTML using the selected Selenium WebDriver. There are a number of configuration properties within nutch-site.xml which determine whether to take screenshots of the rendered pages and persist them as timestamped .png files into HDFS.
-
Method Detail
-
getDriverForPage
public static org.openqa.selenium.WebDriver getDriverForPage(String url, Configuration conf)
-
createFirefoxWebDriver
public static org.openqa.selenium.WebDriver createFirefoxWebDriver(String firefoxDriverPath, boolean enableHeadlessMode)
-
createChromeWebDriver
public static org.openqa.selenium.WebDriver createChromeWebDriver(String chromeDriverPath, boolean enableHeadlessMode)
-
createFirefoxRemoteWebDriver
public static org.openqa.selenium.remote.RemoteWebDriver createFirefoxRemoteWebDriver(URL seleniumHubUrl, boolean enableHeadlessMode)
-
createChromeRemoteWebDriver
public static org.openqa.selenium.remote.RemoteWebDriver createChromeRemoteWebDriver(URL seleniumHubUrl, boolean enableHeadlessMode)
-
createRandomRemoteWebDriver
public static org.openqa.selenium.remote.RemoteWebDriver createRandomRemoteWebDriver(URL seleniumHubUrl, boolean enableHeadlessMode)
-
createDefaultRemoteWebDriver
public static org.openqa.selenium.remote.RemoteWebDriver createDefaultRemoteWebDriver(URL seleniumHubUrl, boolean enableHeadlessMode)
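The four remote factory methods above all take a java.net.URL pointing at a Selenium hub. The following is a minimal sketch of building such a URL with the standard library only. The class name SeleniumHubUrlExample, the helper buildHubUrl, and the hub address (http, localhost, 4444, /wd/hub) are illustrative assumptions, not values taken from this page.

```java
import java.net.MalformedURLException;
import java.net.URI;
import java.net.URISyntaxException;
import java.net.URL;

public class SeleniumHubUrlExample {

    // Hypothetical helper: assembles a hub URL from its parts.
    // Wraps the checked exceptions so callers get a single unchecked failure.
    static URL buildHubUrl(String protocol, String host, int port, String path) {
        try {
            // URI(scheme, userInfo, host, port, path, query, fragment)
            return new URI(protocol, null, host, port, path, null, null).toURL();
        } catch (URISyntaxException | MalformedURLException e) {
            throw new IllegalArgumentException("Invalid Selenium hub address", e);
        }
    }

    public static void main(String[] args) {
        // Assumed hub location; replace with the address of your own hub.
        URL hubUrl = buildHubUrl("http", "localhost", 4444, "/wd/hub");
        System.out.println(hubUrl);

        // With Nutch and Selenium on the classpath, the URL could then be passed
        // to one of the factory methods listed above, for example:
        //   RemoteWebDriver driver =
        //       HttpWebClient.createChromeRemoteWebDriver(hubUrl, true);
        // and released afterwards with HttpWebClient.cleanUpDriver(driver).
    }
}
```

The factory call is left as a comment so the sketch compiles without Nutch or Selenium; the second argument corresponds to the enableHeadlessMode parameter in the signatures above.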
-
cleanUpDriver
public static void cleanUpDriver(org.openqa.selenium.WebDriver driver)
-
getHtmlPage
public static String getHtmlPage(String url, Configuration conf)
Function for obtaining the HTML using the selected Selenium WebDriver. There are a number of configuration properties within nutch-site.xml which determine whether to take screenshots of the rendered pages and persist them as timestamped .png files into HDFS.
- Parameters:
  url - the URL to fetch and render
  conf - the Configuration
- Returns:
  the HTML page