Karate Driver

UI Test Automation Made Simple.


This is new, and this first version 0.9.X should be considered experimental.

Especially after the Gherkin parser and execution engine were re-written from the ground-up, Karate is arguably a mature framework that elegantly solves quite a few test-automation engineering challenges - with capabilities such as parallel execution, data-driven testing, environment-switching, powerful assertions, and an innovative UI for debugging.

This led us to ask: what if we could add UI automation without disturbing the core HTTP API testing capabilities? So we gave it a go, and we are releasing the results so far as this experimental version.

Please note that this is a work in progress, and not all actions needed for test automation may be in place yet. We hope that releasing early will lead more users to try it in a variety of environments, provide valuable feedback, and even contribute code where possible.

We know too well that UI automation is hard to get right and suffers from two big challenges: what we like to call the “flaky test” problem and the “wait for UI element” problem.

With the help of the community, we would like to try valiantly to see if we can get as close to an ideal state as possible. So wish us luck!


Chrome Java API

Karate also has a Java API to automate the Chrome browser directly, designed for common needs such as converting HTML to PDF or taking a screenshot of a page. Here is an example:

import com.intuit.karate.FileUtils;
import com.intuit.karate.driver.chrome.Chrome;
import java.io.File;
import java.util.Collections;

public class Test {

    public static void main(String[] args) {
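        Chrome chrome = Chrome.startHeadless();
        // navigate to a page first; this URL is only an example (assumed from the output file names)
        chrome.setLocation("https://github.com/login");
        // an empty map means the default Page.printToPDF options are used
        byte[] bytes = chrome.pdf(Collections.emptyMap());
        FileUtils.writeToFile(new File("target/github.pdf"), bytes);
        bytes = chrome.screenshot();
        FileUtils.writeToFile(new File("target/github.png"), bytes);
        // close the browser when done
        chrome.quit();
    }

}
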
The optional parameters that you can pass via the Map argument to the pdf() method are documented under Page.printToPDF in the Chrome DevTools Protocol reference.

If Chrome is not installed in the default location, you can pass a String argument like this: Chrome.startHeadless(executable) or Chrome.start(executable).
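
As a rough illustration of the Map argument, the sketch below builds a set of options using parameter names from the DevTools Page.printToPDF command (landscape, printBackground, paperWidth, paperHeight). The class and method names are hypothetical, and the block deliberately avoids any Karate dependency so that it runs on its own.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of Karate: it only builds the options map.
class PdfOptionsSketch {

    // Keys are Page.printToPDF parameter names; paper sizes are in inches.
    static Map<String, Object> landscapeA4() {
        Map<String, Object> options = new LinkedHashMap<>();
        options.put("landscape", true);
        options.put("printBackground", true);
        options.put("paperWidth", 8.27);   // A4 width in inches
        options.put("paperHeight", 11.69); // A4 height in inches
        return options;
    }

    public static void main(String[] args) {
        Map<String, Object> options = landscapeA4();
        System.out.println(options);
        // With Karate on the classpath, the map would be passed along as:
        // byte[] bytes = chrome.pdf(options);
    }
}
```

Passing an empty map, as in the example above, simply leaves every option at its default.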

Syntax Guide


Web Browser

Driver Configuration

configure driver

The following declares that the native (direct) Chrome integration should be used, on both Mac OS and Windows, from the default installed location.

* configure driver = { type: 'chrome' }

If you want to customize the start-up, you can use a batch-file:

* configure driver = { type: 'chrome', executable: 'chrome' }

Here a batch-file called chrome can be placed in the system PATH (and made executable) with the following contents:

"/Applications/Google Chrome.app/Contents/MacOS/Google Chrome" $*

For Windows it would be chrome.bat in the system PATH as follows:

"C:\Program Files (x86)\Google\Chrome\Application\chrome" %*

Another example for WebDriver, again assuming that chromedriver is in the PATH:

* configure driver = { type: 'chromedriver', port: 9515, executable: 'chromedriver' }

key | description
--- | -----------
type | see driver types
executable | if present, Karate will attempt to invoke this; if it is not in the system PATH, you can use a full path instead of just the name of the executable (batch files should also work)
start | default true, Karate will attempt to start the executable, and if the executable is not defined, Karate will try to assume the default for the OS in use
port | optional, Karate will choose the “traditional” port for the given type
headless | only applies to type: 'chrome' for now
showDriverLog | default false, will include webdriver HTTP traffic in the Karate report, useful for troubleshooting or bug reports
showProcessLog | default false, will also include executable (webdriver or browser) logs in the Karate report

Driver Types

type | default port | default executable | description
---- | ------------ | ------------------ | -----------
chrome | 9222 | mac: /Applications/Google Chrome.app/Contents/MacOS/Google Chrome<br/>win: C:/Program Files (x86)/Google/Chrome/Application/chrome.exe | “native” Chrome automation via the DevTools protocol
chromedriver | 9515 | chromedriver | W3C Chrome Driver
geckodriver | 4444 | geckodriver | W3C Gecko Driver (Firefox)
safaridriver | 5555 | safaridriver | W3C Safari Driver
mswebdriver | 17556 | MicrosoftWebDriver | W3C Microsoft Edge WebDriver
msedge | 9222 | MicrosoftEdge | very experimental - using the DevTools protocol
winappdriver | 4727 | C:/Program Files (x86)/Windows Application Driver/WinAppDriver | Windows Desktop automation, similar to Appium
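
To make the "port is optional" behaviour concrete, here is a minimal Java sketch of how a default port could be resolved from the driver type. The class and method names are hypothetical and this is not Karate's actual implementation; only the type names and port numbers are taken from the table above.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch: resolves the port to use for a given driver type.
class DriverDefaultsSketch {

    private static final Map<String, Integer> DEFAULT_PORTS = new LinkedHashMap<>();

    static {
        DEFAULT_PORTS.put("chrome", 9222);
        DEFAULT_PORTS.put("chromedriver", 9515);
        DEFAULT_PORTS.put("geckodriver", 4444);
        DEFAULT_PORTS.put("safaridriver", 5555);
        DEFAULT_PORTS.put("mswebdriver", 17556);
        DEFAULT_PORTS.put("msedge", 9222);
        DEFAULT_PORTS.put("winappdriver", 4727);
    }

    // An explicitly configured port wins; otherwise fall back to the default for the type.
    static int resolvePort(String type, Integer configuredPort) {
        if (configuredPort != null) {
            return configuredPort;
        }
        Integer port = DEFAULT_PORTS.get(type);
        if (port == null) {
            throw new IllegalArgumentException("unknown driver type: " + type);
        }
        return port;
    }

    public static void main(String[] args) {
        System.out.println(resolvePort("chromedriver", null));
        System.out.println(resolvePort("chromedriver", 9999));
    }
}
```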


Locators

The standard locator syntax is supported. For web automation, a / prefix means XPath, and a locator with no prefix is evaluated as a “CSS selector”.

And driver.input('input[name=someName]', 'test input')
When driver.submit("//input[@name='commit']")

platform | prefix | means | example
-------- | ------ | ----- | -------
web | (none) | css selector | input[name=someName]
web | / | xpath | //input[@name='commit']
web | ^ | link text | ^Click Me
web | * | partial link text | *Click Me
win | (none) | name | Submit
win | @ | accessibility id | @CalculatorResults
win | # | id | #MyButton
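
The web prefix rules can be read as a simple first-character check. The Java sketch below is only an illustration of that reading, with a hypothetical class and method name; it is not how Karate itself parses locators.

```java
// Hypothetical sketch of the web locator prefix conventions.
class LocatorSketch {

    // Returns a description of how a web locator string would be interpreted.
    static String describe(String locator) {
        if (locator.startsWith("/")) {
            return "xpath";
        }
        if (locator.startsWith("^")) {
            return "link text";
        }
        if (locator.startsWith("*")) {
            return "partial link text";
        }
        // no recognized prefix: treat as a CSS selector
        return "css selector";
    }

    public static void main(String[] args) {
        String[] samples = {
            "input[name=someName]", "//input[@name='commit']", "^Click Me", "*Click Me"
        };
        for (String sample : samples) {
            System.out.println(sample + " -> " + describe(sample));
        }
    }
}
```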


Only one keyword, driver, sets up UI automation in Karate, typically by specifying the URL to open in a browser. After that, you use the built-in driver JS object for all other operations, combined with Karate’s match syntax for assertions where needed.


driver

Navigates to a web address and initializes the driver instance for future step operations, as per what is configured. You can also use variable expressions from karate-config.js. For example:

Given driver webUrlBase + '/page-01'

driver JSON

A variation where the argument is JSON instead of a URL / address string. It is used only if you are testing a desktop (or mobile) application. For Windows, you can provide the app, appArguments and other parameters expected by WinAppDriver. For example:

Given driver { app: 'Microsoft.WindowsCalculator_8wekyb3d8bbwe!App' }


The built-in driver JS object is where you script UI automation.

Behind the scenes this does an eval. As a convenience, you can omit the eval keyword when executing an action, as long as you don’t need to save a result using def.

You can refer to the Java interface definition of the driver object to better understand what the various operations are. Note that Map<String, Object> translates to JSON, and JavaBean getters and setters translate to JS properties - e.g. driver.getTitle() becomes driver.title.
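
The getter and setter naming convention mentioned above can be sketched in a few lines of Java. The class and method names here are hypothetical, and the sketch only shows the naming rule; it does not reproduce how the JS engine performs the mapping.

```java
// Hypothetical sketch: derives a JS-style property name from a JavaBean accessor name.
class PropertyNameSketch {

    static String toProperty(String methodName) {
        boolean accessor = (methodName.startsWith("get") || methodName.startsWith("set"))
                && methodName.length() > 3;
        if (!accessor) {
            // plain methods such as quit() keep their name and are called as functions
            return methodName;
        }
        String rest = methodName.substring(3);
        // lower-case the first character after the get / set prefix
        return Character.toLowerCase(rest.charAt(0)) + rest.substring(1);
    }

    public static void main(String[] args) {
        System.out.println(toProperty("getTitle"));
        System.out.println(toProperty("setLocation"));
    }
}
```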


driver.location

Get the current URL / address for matching. Example:

Then match driver.location == webUrlBase + '/page-02'

This can also be used as a “setter”:

* driver.location = 'http://localhost:8080/test'


driver.title

Get the current page title for matching. Example:

Then match driver.title == 'Test Page'


driver.dimensions

Set the position and size of the browser window. Example:

And driver.dimensions = { left: 0, top: 0, width: 300, height: 800 }


driver.input()

Takes two string arguments: the locator and the value to enter.

* driver.input('input[name=someName]', 'test input')


driver.click()

Just triggers a click event on the DOM element; it does not wait for a page load.

* driver.click('input[name=someName]')

There is a second rarely used variant which will wait for a JavaScript dialog to appear:

* driver.click('input[name=someName]', true)


driver.submit()

Triggers a click event on the DOM element, and waits for the next page to load.

* driver.submit('.myClass')


driver.select()

Specifically for select boxes. There are four variations, and the text-based ones use the locator prefix conventions.

# select by displayed text
Given driver.select('select[name=data1]', '^Option Two')

# select by partial displayed text
And driver.select('select[name=data1]', '*Two')

# select by `value`
Given driver.select('select[name=data1]', 'option2')

# select by index
Given driver.select('select[name=data1]', 2)
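
One way to read the four variations is that the second argument alone decides the selection mode. The Java sketch below illustrates that reading under the assumption that a number means an index and the ^ and * prefixes mean displayed text; the class and method names are hypothetical and this is not Karate's own code.

```java
// Hypothetical sketch: classifies the second argument of a select call.
class SelectSketch {

    static String mode(Object argument) {
        if (argument instanceof Integer) {
            return "index";
        }
        String text = String.valueOf(argument);
        if (text.startsWith("^")) {
            return "displayed text";
        }
        if (text.startsWith("*")) {
            return "partial displayed text";
        }
        // no prefix: match against the option's value attribute
        return "value";
    }

    public static void main(String[] args) {
        System.out.println(mode("^Option Two"));
        System.out.println(mode("*Two"));
        System.out.println(mode("option2"));
        System.out.println(mode(2));
    }
}
```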


driver.focus()

* driver.focus('.myClass')


driver.close()

Close the page / tab.


driver.quit()

Close the browser.


driver.html()

Get the innerHTML. Example:

And match driver.html('.myClass') == '<span>Class Locator Test</span>'


driver.text()

Get the text-content. Example:

And match driver.text('.myClass') == 'Class Locator Test'


driver.value()

Get the HTML form-element value. Example:

And match driver.value('.myClass') == 'some value'


driver.attribute()

Get the HTML element attribute value. Example:

And match driver.attribute('#eg01SubmitId', 'type') == 'submit'


driver.waitUntil()

Waits for the JS expression to evaluate to true, polling according to the configured retry settings.

* eval driver.waitUntil("document.readyState == 'complete'")
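
The polling behaviour can be pictured as a bounded retry loop. The Java sketch below assumes the retry settings consist of a maximum number of attempts and a fixed interval between them; the class, method, and parameter names are hypothetical and the code is not taken from Karate.

```java
import java.util.function.BooleanSupplier;

// Hypothetical sketch of "poll until true, with a retry limit".
class WaitUntilSketch {

    // Returns true as soon as the condition holds, or false once the attempts are used up.
    static boolean waitUntil(BooleanSupplier condition, int maxAttempts, long intervalMillis) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            if (condition.getAsBoolean()) {
                return true;
            }
            if (attempt < maxAttempts) {
                try {
                    Thread.sleep(intervalMillis);
                } catch (InterruptedException e) {
                    // restore the interrupt flag and give up
                    Thread.currentThread().interrupt();
                    return false;
                }
            }
        }
        return false;
    }

    public static void main(String[] args) {
        long start = System.currentTimeMillis();
        // stands in for a check such as document.readyState == 'complete'
        boolean ready = waitUntil(() -> System.currentTimeMillis() - start >= 50, 10, 20);
        System.out.println("ready: " + ready);
    }
}
```

A bounded loop like this is one common answer to the “wait for UI element” problem: the test neither sleeps for a fixed worst-case time nor fails on the first early check.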


driver.eval()

Evaluates the given string as JavaScript within the browser.


driver.refresh()

Normal page reload, does not clear cache.


driver.reload()

Hard page reload, after clearing the cache.







driver.cookie

Set a cookie:

Given def cookie2 = { name: 'hello', value: 'world' }
When driver.cookie = cookie2
Then match driver.cookies contains '#(^cookie2)'
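
In the match above, the ^ inside '#(^cookie2)' is Karate's shortcut for a "contains" comparison, so each cookie only needs to include the expected keys and values. The Java sketch below illustrates that subset check on plain maps. The class and method names are hypothetical, and the extra domain and path fields are an assumption about what a browser typically adds to a cookie.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of a "contains" (subset) comparison between two JSON-like maps.
class CookieContainsSketch {

    // True if every expected key / value pair is present in the actual map.
    static boolean containsAll(Map<String, Object> actual, Map<String, Object> expected) {
        return actual.entrySet().containsAll(expected.entrySet());
    }

    public static void main(String[] args) {
        Map<String, Object> expected = new HashMap<>();
        expected.put("name", "hello");
        expected.put("value", "world");

        // what the browser reports usually has more fields than what was set (assumed here)
        Map<String, Object> actual = new HashMap<>(expected);
        actual.put("domain", "localhost");
        actual.put("path", "/");

        System.out.println(containsAll(actual, expected));
    }
}
```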


driver.cookie()

Get a cookie by name:

* def cookie1 = { name: 'foo', value: 'bar' }
And match driver.cookies contains '#(^cookie1)'
And match driver.cookie('foo') contains cookie1


driver.cookies

See above examples.


driver.deleteCookie()

Delete a cookie by name:

When driver.deleteCookie('foo')
Then match driver.cookies !contains '#(^cookie1)'


driver.clearCookies()

Clear all cookies.

When driver.clearCookies()
Then match driver.cookies == '#[0]'


driver.dialog()

Two forms. The first takes a single boolean argument: whether to “accept” or “cancel”. The second form has an additional string argument, which is the text to enter for cases where the dialog is expecting user input.

Also works as a “getter” to retrieve the text of the currently visible dialog:

* match driver.dialog == 'Please enter your name:'


driver.screenshot()

Two forms: if a locator is provided, only that HTML element will be captured; otherwise the browser viewport will be captured.


driver.pdf()

Only supported for driver type chrome. See Chrome Java API.


driver.highlight()

Useful to visually highlight an element in the browser, especially when working in the Karate UI.

* driver.highlight('#eg01DivId')