Introducing Taiko

Introducing Taiko, a free and open source browser automation tool for driving your web browser programmatically.

Presented by Scott Davis. His email address is scott.davis@thoughtworks.com. His Twitter handle is @scottdavis99.

Instructions

Keyboard Shortcut   Description
→ or [space]        Next slide
←                   Previous slide
f                   Fullscreen view (toggles on/off)
t                   Transcript view (opens in separate window)
p                   Play/pause audio for the current slide
a                   Auto-play audio and advance to next slide (toggles on/off)

Here are some quick keyboard shortcuts to help you get around.

  • Press the right arrow or space bar to go to the next slide.
  • Press the left arrow to go to the previous slide.
  • Press 'f' to toggle full screen mode.
  • Press 't' to open the written transcripts in a separate window.
  • Press 'p' to play or pause audio for the current slide.
  • Press 'a' to autoplay audio and advance to the next slide.

Scott Davis' Biography

Hi! My name is Scott Davis. I'm currently a Principal Engineer with ThoughtWorks. Before that, I ran a software consultancy out of Denver, Colorado called ThirstyHead.

I've been writing about web development for years -- articles for IBM, books for O'Reilly and the Pragmatic Bookshelf, and most recently videos for O'Reilly as well.

https://taiko.gauge.org

Taiko Homepage

Today I'd like to talk to you about Taiko -- a free and open source browser automation tool built by ThoughtWorks. So, what exactly is a browser automation tool?

Driver of a car

"Browser automation tools" are much like the "driver automation tools" you can find in modern automobiles today.

Cruise control is a "driver automation tool" that keeps your car driving at a constant speed. On newer cars, cruise control can slow you down, or even stop so that you don't get in an accident.

More advanced cars can parallel park on your behalf, or even be "summoned" with a simple click of your key fob. They'll back out of a tight parking spot, or back out of your garage, and autonomously drive to where you are.

W3C WebDriver

This is what's so exciting about the idea of "browser automation tools". WebDriver is a W3C standard that will soon be baked into every modern browser.

WebDriver allows you to remotely control your browser from a script -- visiting new web pages, filling in form fields, clicking on buttons, and so on.

Automated testing is an obvious use of this. Testing the behavior of your web app in an actual browser couldn't be easier. But it's not limited to simple testing -- you can grab screenshots at any step along the way, do sophisticated analytics like calculating download speeds and measuring time-to-interactive, and even verify how accessible your website is to folks with disabilities like low vision or lack of fine motor skills.

And notice the lead editor on this spec: Simon Stewart. As it turns out, Simon is a former ThoughtWorker who has been working on this kind of browser automation for over a decade.

Selenium History

As a matter of fact, ThoughtWorks has been actively working on browser automation tools for over 15 years at this point.

You might be familiar with "Selenium", a free and open source browser automation tool that we released way back in 2004.

Simon Stewart released a free and open source tool called "WebDriver" back in 2007, and the two projects merged in 2009.

https://taiko.gauge.org

Taiko Homepage

So, Taiko is the latest in a long history of browser automation tools released by ThoughtWorks, going back over 15 years.

Where Taiko really shines is stability and ease of use. You're a quick 'npm install' away from getting a stable, reliable browser automation tool installed on your developer machine or added to your Continuous Delivery pipeline.

Taiko includes Chromium

Chromium

What makes Taiko so stable is that it ships with a known-good, compatible, working release of Chromium when you 'npm install' it.

And what is Chromium? Well, outside of software, it's the element used to make the chrome features on your car. In the web development world, it's the free and open source core that Google uses to make their Chrome browser. (Clever, eh?)

But Chromium is also at the core of the Opera browser. And in a recent, unexpected move, Microsoft abandoned more than 20 years of building their own web technologies to make Chromium the core of their Microsoft Edge browser.

So while there are still healthy competitors to Chromium in the open source ecosystem -- Firefox uses a core called Gecko, and Apple's Safari uses a core called WebKit -- Chromium is a popular, well-supported foundation for many of the web browsers that you and your users are most likely already using.

Chrome DevTools Protocol

Since the new WebDriver W3C standard isn't fully implemented across all browsers today, Taiko uses the next best thing: the Chrome DevTools Protocol.

As you might've guessed from the name, this is the same stable low-level protocol that the Chrome Developer Tools use to interact with the browser, like the JavaScript console. If you've ever run a Lighthouse Audit on your website, Lighthouse uses the Chrome DevTools Protocol to pull performance metrics and accessibility results to build out its detailed analytics reports.

Taiko has access to all of the same data and browser automation capabilities that these familiar development tools have. And now you do, too, in an easy scriptable environment.

Using Taiko

So, let's get started.

https://thirstyhead.com/groceryworks

GroceryWorks

So, here's a website for a fictional grocery store called GroceryWorks. As you click around, you'll see that tapping the categories on the left shows you different types of food items -- beans, nuts, pasta, produce, and so on. Clicking on a food item adds it to the cart on the right. Clicking on the food item again removes it from the cart. Clicking 'Purchase' allows you to place your order or cancel it.

Taiko REPL


$ taiko


Version: 0.5.0 (Chromium:73.0.3638.0)
Type .api for help and .exit to quit

>
          

If you've already installed Taiko, you can simply type 'taiko' at the command prompt. This will launch you into the interactive REPL where you can explore the features of Taiko at your own pace.

At this point, you are given two suggestions to type. One of these suggestions will make this a very short discussion. Why don't you type '.api' instead and see what other commands are available to you?


> .api

Browser actions
    openBrowser, closeBrowser, client, switchTo, 
    setViewPort, openTab, closeTab

Page actions
    goto, reload, title, click, doubleClick, rightClick

[snip]

Run `.api <name>` for more info on a specific function. 
    For Example: `.api click`.
Complete documentation is available at 
    http://taiko.gauge.org.
          

Typing '.api' brings up a list of all of the Taiko commands.

These should be fairly straightforward to understand. The 'openBrowser' and 'closeBrowser' commands do what you'd expect them to do. 'openTab', 'closeTab', 'click', 'doubleClick', 'rightClick' -- you get the idea.

But if you want to learn more about any of these specific commands, simply type '.api' and the command name to pull up more information.


> .api openBrowser

Launches a browser with a tab. The browser will be closed 
    when the parent node.js process is closed.

Example:
	openBrowser()
	openBrowser({ headless: false })
	openBrowser({args:['--window-size=1440,900']})
	openBrowser({args: [
	      '--disable-gpu',
	      '--disable-dev-shm-usage',
	      '--disable-setuid-sandbox',
	      '--no-first-run',
	      '--no-sandbox',
	      '--no-zygote']}) - These are recommended args 
               that have to be passed when running in docker
          

For example, type '.api openBrowser'. Not surprisingly, this is how you open a new Chromium instance. But there are a number of flags that you can pass to control or customize your new Chromium instance.

When you're in the Taiko REPL, typing 'openBrowser' without any options will open a visible instance of Chromium so that you can see what you're doing. But if you're running Taiko as a part of your Continuous Delivery pipeline, you'll almost certainly want to run it in headless mode, which means nothing visible will be displayed. You can also control the size of the window, visible or not, to more closely emulate smartphone, tablet, or laptop users.

All of these are standard, out-of-the-box, Chromium flags.
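
The choices described above (visible or headless, and window size) can be gathered into one small helper. This is a minimal sketch rather than anything Taiko provides: the function name 'browserOptions' and its 'ci', 'width', and 'height' parameters are hypothetical, while 'headless' and '--window-size' are taken from the '.api openBrowser' output shown earlier.

```javascript
// Hypothetical helper: builds the argument object for Taiko's openBrowser().
// Only `headless` and `--window-size` come from the `.api openBrowser` output;
// the function name and its parameters are assumptions for illustration.
function browserOptions({ ci = false, width = 1440, height = 900 } = {}) {
    return {
        // Hide the browser in a pipeline, show it on a developer machine.
        headless: ci,
        // Chromium flag that sets the window size in pixels.
        args: [`--window-size=${width},${height}`]
    };
}

// A laptop-sized, visible window for local work...
console.log(browserOptions());
// ...and a phone-sized, headless one for a Continuous Delivery run.
console.log(browserOptions({ ci: true, width: 375, height: 667 }));
```

In a script, the result would be passed straight through, for example 'await openBrowser(browserOptions({ ci: true }))'.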


> openBrowser()

 ✔ Browser opened
          
A blank Chromium window

So, if you type 'openBrowser()' right now in your Taiko REPL, you should see a new instance of Chromium pop up with a blank tab, just waiting for you to do something interesting.


> .api goto

Opens the specified URL in the browser's tab. Adds http 
    protocol to the URL if not present.

Example:
	goto('https://google.com')
	goto('google.com')
          

For example, you could go to a website. If you type '.api goto', you'll see that you can either type in a full URL, protocol included, or any shortened version that you'd normally type into your web browser as an end user.


> goto('https://thirstyhead.com/groceryworks')

 ✔ Navigated to url "https://thirstyhead.com/groceryworks" 
          
Viewing the website for groceryworks

So, if you type in "goto('https://thirstyhead.com/groceryworks')", you'll visit the GroceryWorks website that we talked about just a moment ago.

Of course, this works just as well visiting 'localhost' on your development machine, or visiting the URL of your staging environment, your test server, or even your production website. It's completely up to you.
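
One way to point the same script at different servers is to look the URL up by environment name. The sketch below is an assumption-laden illustration: only the production URL comes from this talk, and the 'local' and 'staging' entries, the port number, and the 'targetUrl' function name are placeholders to be replaced with your own values.

```javascript
// Hypothetical mapping from environment name to the URL under test.
// Only the production URL appears in the talk; the others are placeholders.
const TARGETS = {
    local: 'http://localhost:8080/groceryworks',
    staging: 'https://staging.example.com/groceryworks',
    production: 'https://thirstyhead.com/groceryworks'
};

// Returns the URL for the given environment name, or throws if it is unknown.
function targetUrl(env = 'production') {
    if (!Object.prototype.hasOwnProperty.call(TARGETS, env)) {
        throw new Error(`Unknown environment: ${env}`);
    }
    return TARGETS[env];
}

console.log(targetUrl('local'));
console.log(targetUrl());
```

A script could then call 'await goto(targetUrl(process.env.TARGET_ENV))', where 'TARGET_ENV' is a made-up variable name; when it is not set, the default of 'production' applies.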


> .api click

Fetches an element with the given selector, scrolls it 
   into view if needed, and then clicks in the center of 
   the element. If there's no element matching selector, 
   the method throws an error.

Example:
	click('Get Started')
	click(link('Get Started')) 

          

If you want to click on something onscreen, not surprisingly, you'll use the 'click' command. Type '.api click' to learn more about it.

One of the most powerful features of Taiko is its selector logic. Whatever you (or your end users) see on the screen, you can use here in conjunction with the 'click' command.


> click('Pasta')

 ✔ Clicked element matching text "Pasta" 
          
Clicking on the Pasta menu item

So, if you want to see what kind of Pasta you can add to your cart, simply type "click('Pasta')".

You don't have to worry about what the underlying class selector is, or ID selector, or anything like that. Use the text on the screen, which is exactly what your end user would do.


> click('Penne')
 ✔ Clicked element matching text "Penne"
> click('Produce')
 ✔ Clicked element matching text "Produce"
> click('Eggplant')
 ✔ Clicked element matching text "Eggplant"
> click('Purchase')
 ✔ Clicked element matching text "Purchase" 
          
Clicking on the Purchase button

So, if you type "click('Penne')", "click('Produce')", "click('Eggplant')", "click('Purchase')", you, in fact, are clicking on radio buttons, checkboxes, and an HTML button using the same text that your user sees on-screen.

From here, you could type "click('Purchase Order')", or "click('Cancel')" to proceed.


> .api

[snip]

Selectors
    $, image, link, listItem, button, inputField, 
    fileField, textField, comboBox, checkBox, 
    radioButton, text, contains

Proximity selectors
    toLeftOf, toRightOf, above, below, near

[snip]

Run `.api <name>` for more info on a specific function. 
    For Example: `.api click`.
Complete documentation is available at 
    http://taiko.gauge.org.
          

Of course, if you need to be more specific, you can be. You can say, "Click on the image", or "Click on the link", or "Click on the listItem". Click on the screen element "toLeftOf", "toRightOf", "above", or "below". You can even use the familiar "$" to use class selectors or ID selectors if that's the level of specificity you need.

But believe it or not, the less specific you can make your Taiko scripts, the more resilient they'll be in the face of change. When you change that link to a button, or that checkBox to a comboBox, or that class name yet again, you'll be happy that you used the on-screen text instead of the underlying implementation.
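
To make that trade-off concrete, here is a rough sketch of the same click written at three levels of specificity. The 'button' and '$' selectors are listed in the '.api' output above; the function name 'clickPurchase', its 'level' parameter, and the '#purchase' ID are hypothetical, since the actual markup of the GroceryWorks page isn't shown here. The Taiko API is passed in as a parameter so the function can be tried out with a stand-in object before pointing it at a real browser.

```javascript
// Sketch: one click, three levels of specificity.
// `taiko` is expected to be the object returned by require('taiko').
async function clickPurchase(taiko, level = 'text') {
    const { click, button, $ } = taiko;
    switch (level) {
        case 'text':
            // What the user sees on screen; survives most markup changes.
            return click('Purchase');
        case 'element':
            // Tied to the element type; breaks if the button becomes a link.
            return click(button('Purchase'));
        case 'css':
            // Tied to the implementation; '#purchase' is a hypothetical ID.
            return click($('#purchase'));
        default:
            throw new Error(`Unknown level: ${level}`);
    }
}
```

Inside a script that has already opened the browser and navigated to the page, this would be called as 'await clickPurchase(require('taiko'))'. The default is the least specific form, in keeping with the advice above.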


> .code

const { openBrowser, goto, 
        click, closeBrowser } = require('taiko');
(async () => {
    try {
        await openBrowser();
        await goto('https://thirstyhead.com/groceryworks');
        await click('Pasta');
        await click('Penne');
        await click('Produce');
        await click('Eggplant');
        await click('Purchase');
    } catch (e) {
        console.error(e);
    } finally {
        await closeBrowser();
    }
})();
          

While all of this typing in the REPL has been fun, what if you want to run these commands as a part of your test suite or your Continuous Delivery pipeline?

If you type '.code', you'll see that the Taiko REPL has bundled everything that you've typed up to this point into modern JavaScript source code that's ready to be run from Node.js.

Save to file


> .code buy-ingredients-for-pasta-dinner.js
> .exit
          

Run the file


$ taiko buy-ingredients-for-pasta-dinner.js

 ✔ Browser opened
 ✔ Navigated to url "https://thirstyhead.com/groceryworks"
 ✔ Clicked element matching text "Pasta"
 ✔ Clicked element matching text "Penne"
 ✔ Clicked element matching text "Produce"
 ✔ Clicked element matching text "Eggplant"
 ✔ Clicked element matching text "Purchase"
 ✔ Browser closed
          

Even better, if you type '.code' and a filename, the Taiko REPL will save that source code out to the filesystem in your current working directory.

Once you've done that, you can type 'taiko' and the name of your script to see it run anywhere that you have Taiko installed -- your test servers, your Continuous Delivery servers, or even your local development machines.
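
Once the script lives in a file, it can also be reshaped by hand. The following is one possible variation, not what '.code' generates: it drives the same sequence from a list of on-screen labels. The function name 'runClicks' and its parameters are hypothetical; the four Taiko calls are the ones that appear in the generated script above.

```javascript
// Sketch: replay a list of on-screen labels as clicks.
// `taiko` is expected to be the object returned by require('taiko').
async function runClicks(taiko, url, labels) {
    const { openBrowser, goto, click, closeBrowser } = taiko;
    try {
        await openBrowser();
        await goto(url);
        for (const label of labels) {
            // One click per label, in the order given.
            await click(label);
        }
    } finally {
        // Always close the browser, even if a click failed.
        await closeBrowser();
    }
}
```

It might be called as 'runClicks(require('taiko'), 'https://thirstyhead.com/groceryworks', ['Pasta', 'Penne', 'Produce', 'Eggplant', 'Purchase'])'. Unlike the generated script, this version has no 'catch' block, so a failed click rejects the returned promise and the caller (a test runner, for instance) decides how to report it.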

Conclusion

So, what have we learned?

Driver of a car

The types of automation features that we're beginning to see baked into modern cars -- like adaptive cruise control and automatic parallel parking -- are similar to the browser automation tools that we're beginning to see baked into modern web browsers.

W3C WebDriver

We're on the cusp of an exciting new era of web development -- one where we have full programmatic remote control of our web browsers. For testing purposes. For performance and accessibility analytics. For uses that we haven't even thought of yet.

https://taiko.gauge.org

Taiko Homepage

This is why we're so excited about Taiko -- a free and open source browser automation tool. ThoughtWorkers have been building tools like this for over 15 years now. We want to make sure that you have modern, stable, easy-to-use tools for your developers, for your test suites, and for your Continuous Delivery pipelines.

I hope that you've enjoyed learning a little bit about Taiko and how to drive your web browser programmatically. Thanks for your time.