Introduction
The dependencies page lists all the jars that you will need on your classpath.
The class com.gargoylesoftware.htmlunit.WebClient is the main starting point. This simulates a web browser and will be used to execute all of the tests.
Most unit testing is done within a framework like JUnit, so all the examples here assume that we are using JUnit.
In the first sample, we create the web client and have it load the homepage from the HtmlUnit website. We then verify that this page has the correct title. Note that getPage() can return different types of pages based on the content type of the returned data. In this case we are expecting a content type of text/html, so we cast the result to a com.gargoylesoftware.htmlunit.html.HtmlPage.
@Test
public void homePage() throws Exception {
    // try-with-resources closes the client (and its windows) when the test ends
    try (final WebClient webClient = new WebClient()) {
        final HtmlPage page = webClient.getPage("http://htmlunit.sourceforge.net");
        Assert.assertEquals("HtmlUnit - Welcome to HtmlUnit", page.getTitleText());

        final String pageAsXml = page.asXml();
        Assert.assertTrue(pageAsXml.contains("<body class=\"composite\">"));

        final String pageAsText = page.asText();
        Assert.assertTrue(pageAsText.contains("Support for the HTTP and HTTPS protocols"));
    }
}
Imitating a specific browser
Often you will want to simulate a specific browser. This is done by passing a com.gargoylesoftware.htmlunit.BrowserVersion into the WebClient constructor. Constants are provided for some common browsers, but you can create your own specific version by instantiating a BrowserVersion.
@Test
public void homePage_Firefox() throws Exception {
    try (final WebClient webClient = new WebClient(BrowserVersion.FIREFOX_38)) {
        final HtmlPage page = webClient.getPage("http://htmlunit.sourceforge.net");
        Assert.assertEquals("HtmlUnit - Welcome to HtmlUnit", page.getTitleText());
    }
}
Specifying this BrowserVersion will change the user agent header that is sent up to the server and will change the behavior of some of the JavaScript.
Finding a specific element
Once you have a reference to an HtmlPage, you can search for a specific HtmlElement by using one of the 'get' methods, or by using XPath.
Below is an example of finding a 'div' by an ID, and getting an anchor by name:
@Test
public void getElements() throws Exception {
    try (final WebClient webClient = new WebClient()) {
        final HtmlPage page = webClient.getPage("http://some_url");
        final HtmlDivision div = page.getHtmlElementById("some_div_id");
        final HtmlAnchor anchor = page.getAnchorByName("anchor_name");
    }
}
XPath is the suggested way to perform more complex searches. A brief tutorial can be found at W3Schools.
@Test
public void xpath() throws Exception {
    try (final WebClient webClient = new WebClient()) {
        final HtmlPage page = webClient.getPage("http://htmlunit.sourceforge.net");

        //get list of all divs
        final List<?> divs = page.getByXPath("//div");

        //get div which has a 'name' attribute of 'John'
        final HtmlDivision div = (HtmlDivision) page.getByXPath("//div[@name='John']").get(0);
    }
}
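XPath expressions such as //div and //div[@name='John'] can be tried out on their own before they are used in a test. The following is a minimal sketch that uses the XPath support built into the JDK (javax.xml.xpath) rather than HtmlUnit, so it only illustrates the expressions themselves. The class name XPathSketch, its helper methods, and the sample markup are hypothetical and exist only for this illustration.

```java
import java.io.StringReader;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper, not part of HtmlUnit: evaluates XPath with the JDK only.
final class XPathSketch {

    // Hypothetical, well-formed sample markup used only for this illustration.
    static final String MARKUP =
        "<html><body>"
        + "<div name='Jane'>first</div>"
        + "<div name='John'>second</div>"
        + "<div>third</div>"
        + "</body></html>";

    // Parses MARKUP and returns every node matched by the expression.
    static NodeList select(final String expression) {
        try {
            return (NodeList) XPathFactory.newInstance().newXPath().evaluate(
                expression, new InputSource(new StringReader(MARKUP)), XPathConstants.NODESET);
        } catch (final XPathExpressionException e) {
            throw new IllegalArgumentException("Invalid XPath: " + expression, e);
        }
    }

    // Number of nodes matched by the expression.
    static int count(final String expression) {
        return select(expression).getLength();
    }

    // Text content of the first match, or null when nothing matches.
    static String textOfFirst(final String expression) {
        final NodeList nodes = select(expression);
        return nodes.getLength() == 0 ? null : nodes.item(0).getTextContent();
    }

    public static void main(final String[] args) {
        System.out.println(count("//div"));                     // prints 3
        System.out.println(textOfFirst("//div[@name='John']")); // prints second
    }
}
```

The predicate in square brackets narrows the match from every div to the one whose name attribute equals 'John'. The same expression strings can be passed to page.getByXPath(), although real pages are parsed as HTML rather than strict XML, so results on a live page may differ from this sketch.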
Using a proxy server
The WebClient constructor that takes a proxy host and port in addition to a BrowserVersion allows you to specify proxy server information in those cases where you need to connect through one.
@Test
public void homePage_proxy() throws Exception {
    try (final WebClient webClient = new WebClient(BrowserVersion.FIREFOX_38, "myproxyserver", myProxyPort)) {

        //set proxy username and password
        final DefaultCredentialsProvider credentialsProvider = (DefaultCredentialsProvider) webClient.getCredentialsProvider();
        credentialsProvider.addCredentials("username", "password");

        final HtmlPage page = webClient.getPage("http://htmlunit.sourceforge.net");
        Assert.assertEquals("HtmlUnit - Welcome to HtmlUnit", page.getTitleText());
    }
}
Submitting a form
Frequently we want to change values in a form and submit the form back to the server. The following example shows how you might do this.
@Test
public void submittingForm() throws Exception {
    try (final WebClient webClient = new WebClient()) {

        // Get the first page
        final HtmlPage page1 = webClient.getPage("http://some_url");

        // Get the form that we are dealing with and within that form,
        // find the submit button and the field that we want to change.
        final HtmlForm form = page1.getFormByName("myform");

        final HtmlSubmitInput button = form.getInputByName("submitbutton");
        final HtmlTextInput textField = form.getInputByName("userid");

        // Change the value of the text field
        textField.setValueAttribute("root");

        // Now submit the form by clicking the button and get back the second page.
        final HtmlPage page2 = button.click();
    }
}
HtmlUnit provides JavaScript support, simulating the behavior of the configured browser (Firefox or Internet Explorer). It uses the Rhino JavaScript engine for the core language (plus workarounds for some Rhino bugs) and provides the implementation for the objects specific to execution in a browser.
The unit tests of some well-known JavaScript libraries are included in HtmlUnit's own unit tests; based on these unit tests, the following libraries are known to work well with HtmlUnit:
- jQuery 1.2.6: Full support (see unit test here)
- MochiKit 1.4.1: Full support (see unit tests here)
- GWT 2.5.0: Full support (see unit test here)
- Sarissa 0.9.9.3: Full support (see unit test here)
- MooTools 1.2.1: Full support (see unit test here)
- Prototype 1.6.0: Very good support (see unit test here)
- Ext JS 2.2: Very good support (see unit test here)
- Dojo 1.0.2: Good support (see unit test here)
- YUI 2.3.0: Good support (see unit test here)