Running a Taurus Http Performance test in TeamCity using Docker

I’ve wanted to get some automated performance tests into my CI pipeline for a while, and after finally finding time over the quiet holiday period, I wanted to share how I did it, as I found much of the material online harder to follow than it needed to be.

I chose to use Taurus after seeing a presentation on it at Australian Testing Days 2018. It is a wrapper that runs on top of popular performance testing tools such as JMeter and Gatling, and makes it really easy to run a performance test: you simply create a config file and run it. I use TeamCity because that’s what my company uses, but it can easily be swapped out for another CI tool. And I wanted to use Docker to avoid adding installation steps to all our build agents, and to make the test cleaner and more reliable by using a known, reproducible environment. (Docker lets you spin up a machine in a known state with known programs installed.)

1 – Create a git repo for your Taurus test

[screenshot: the git repo containing the Taurus test file]

Start by setting up a simple git repo to contain your Taurus test file. There are a whole lot of variables you can set in the config file; I chose to keep it fairly basic while starting out. Here is my file (with a few details omitted):


[screenshot: getting-started-test.yml]
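Reconstructed from the description that follows, the file looks roughly like this. Treat it as a sketch rather than my exact file: the placeholder tokens (the __CAPS__ names) and the request details are my own illustration, to be substituted by TeamCity later on.

```yaml
execution:
- scenario: getting-started-load-test
  concurrency: __CONCURRENCY__   # placeholders, filled in by TeamCity
  ramp-up: __RAMP_UP__
  hold-for: __HOLD_FOR__

scenarios:
  getting-started-load-test:
    requests:
    - url: __TEST_URL__
      method: GET
      headers:
        Authorization: __AUTH_TOKEN__

reporting:
- module: passfail
  criteria:
  - fail>0%, stop as failed   # any failed request fails the whole test
```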
Let’s break down each section.

The execution section describes which test scenario I am going to run and the settings of my performance test. In this case, I am running the ‘getting-started-load-test’ scenario, with variables to be pre-filled in my TeamCity run later on (so they are easy to change, not hard-coded, and for privacy reasons too). Using example numbers, if I had:

  • Concurrency: 10
  • ramp-up: 5s
  • hold-for: 30s

Then my test will start sending requests as defined in the getting-started scenario. Over the first 5 seconds it will ramp up from 1 to 10 parallel users making requests, then hold the full load of 10 users for 30 seconds. (Breaking down that timeline: 0-5 seconds builds up from 1 to 10 users, then 5-35 seconds is the full 10 users, since hold-for counts from the end of the ramp-up.)

The scenario section lists the url to make the request with, along with any headers or request body needed. Again, I have used a placeholder for the authorization header.

The reporting section tells the tool what it should consider a failure when analysing the test results. Without this, every single request can fail and the test will still not fail. There are many options for your failure criteria, including average request time and the 90th percentile of results, and you can even require that a condition be met for a certain duration, e.g. average response time > 1s for 7 seconds = fail.

In my case, I have gone for the simple option: if any request is not successful, stop running the test and mark it as failed.

I’ve also added a Blazemeter module, which gives me a url in my test results that I can open to see Blazemeter graphs and stats analysing my test run in more detail. You can use the free version of Blazemeter (it will do this by default), which keeps your test results for 7 days; if you have a paid account you can pass your account token into the script and have your test results saved to your account.
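As a sketch of what that reporting setup can look like (the criteria strings follow the patterns in the Taurus docs, and the token is of course a placeholder):

```yaml
reporting:
- module: passfail
  criteria:
  - avg-rt>1s for 7s, stop as failed   # sustained slow average response time
  - p90>2s, continue as failed         # 90th percentile over threshold
- module: blazemeter
  report-name: getting-started-load-test

modules:
  blazemeter:
    token: __BLAZEMETER_TOKEN__   # only needed for a paid account
```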


It will be handy to confirm that your script works on its own, before adding the extra complexities of Docker or a CI build. So replace all your variables with real values. You can use “” as the test url if you like. Then run the test by typing:

bzt getting-started-test.yml

in a command line tool, assuming you have first installed Python, ‘pip’ and Taurus on your machine, using the instructions here (or just run “pip install bzt”).

If the test is successful, you should get output in your terminal similar to this:

[screenshot: Taurus console output]

2 – Setting up TeamCity to run your Test

Now we have our test running locally, the next challenge is to get it running in TeamCity (or your CI tool of choice) so it can be run regularly, or as part of your deployment process, to help detect problems over time. I will assume basic knowledge of TeamCity for this explanation, which will also help keep this part transferable to other tools.

I’ll run through each of the build configurations:

1. Create a new build in TeamCity to run your tests

2. Attach the VCS root for the git repo you set up containing your test file, so the build has access to the .yml test file and also knows when you have committed changes to it.

3. Add a build step which will run docker, choosing an image with Taurus installed, and pass in your .yml file to be run. (reference: Steps for using the Taurus Docker image on their website)

The build step to run a Docker image is remarkably simple. You want to add a command line build step, with script like this:

[screenshot: command line build step script]
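The script was along these lines. This is a sketch based on the Taurus Docker instructions; blazemeter/taurus is the official image, and the test file name matches the one in our repo:

```shell
#!/bin/bash
# Run the official Taurus Docker image:
#  --rm   removes the container once the run finishes
#  -v     mounts the build's working directory (our git repo checkout)
#         into /bzt-configs, where the image expects scripts to live
# The final argument is the test file to execute from that folder.
docker run --rm -v "$(pwd)":/bzt-configs blazemeter/taurus getting-started-test.yml
```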


The first line says we are using a bash script. The second line runs a docker image. The parameters say that we will clean up the container when we are done; that we will mount the current working directory of the build (which contains the contents of our git repo, including our test .yml file) into the docker image, into a folder called ‘bzt-configs’ (this is a requirement of the Taurus Docker image, which expects your scripts to be in that folder); and that we will then run the ‘getting-started-test.yml’ file from that folder.

4. Add any triggers you want for when this build should run. I chose to run the build any time I made a commit to the git project to change the test script (so I would know straight away if a change I made caused the test to fail). I also run the tests every time we commit to master, and every day at 12.30pm over lunch, so we get continual feedback as well as feedback specific to a release. This schedule will be updated over time as we see fit.

5. The next important setting is Build Features. This is where I replace each of the variables in my .yml file with the actual test conditions and insert API keys.

You will want to add a “File Content Replacer” build feature for each variable you have in your code; these automatically run before any build steps. You can do a regex search or a straight text search. There are other ways to replace values in a file, e.g. a command line script, so feel free to use whatever approach you are most comfortable with; I wanted to try out an inbuilt feature of TeamCity I hadn’t used before.
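For comparison, the command-line alternative mentioned above can be as simple as a sed one-liner. The __CONCURRENCY__ token and the variable name here are my own illustration:

```shell
# Simulate a checked-out test file containing a placeholder token.
echo "concurrency: __CONCURRENCY__" > getting-started-test.yml

# In TeamCity this value would come from an environment variable
# (e.g. env.CONCURRENCY); it is hard-coded here for the example.
CONCURRENCY=10

# Replace the token in place before the test runs.
sed -i "s/__CONCURRENCY__/${CONCURRENCY}/" getting-started-test.yml

cat getting-started-test.yml   # prints: concurrency: 10
```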

[screenshot: File Content Replacer settings]

You will also notice that I have used an environment variable within TeamCity as the value, so that I can define all the parameters in one place, on the Parameters build configuration page.

6. Set your environment variables for each of the parameters you have now declared in your test script and in your variable replacement script like this:

[screenshot: environment variable parameters]

7. Your last step is to make sure the build agents that run this build have Docker installed. Specify any limiting rules on your build agent requirements so that the build always runs on a build agent with Docker. (I didn’t set the agents up, but if you need to set up a build agent with Docker, I don’t imagine there is much more involved than taking a machine with an appropriate amount of RAM and CPU power, installing Docker, and away you go. I’m sure there are walkthroughs available online.)

You are now ready to go! Run your test build to make sure it’s all working as expected, and you now have your very own performance test against an HTTP endpoint running in your CI pipeline and giving you feedback on your builds over time.

Next Steps / Still missing

Some of the things that are still to be explored include:

  • Find a way to report on performance over time in TeamCity between builds. Probably needs a custom report tool extracting the data out of the build log.
  • Add more failure criteria to the build regarding response time or other interesting factors to observe.
  • Find the optimum balance of concurrent users, duration of test and ramp up time to adequately test the system, without adversely impacting other users of the system.
  • Add more HTTP endpoints to test against, and explore whether this can/should be done within the same script or not. It could certainly be done from the same repo, just targeting a different file in the TeamCity build.

I’m interested to hear anyone else’s experience with Taurus or other Performance testing tools or ways to improve my approach described here.

Australian Testing Days 2016 Reflection – Day 1

On May 20-21 I went to the inaugural Australian Testing Days Conference in Melbourne. The first day involved a series of talks, mostly sharing experiences people have had in testing, and the second day was an all-day workshop on test leadership. This post outlines the key messages from the sessions I attended and the key things I learnt from each one.

Day 1

Part 1 – What you meant to say (keynote)

First up, Michael Bolton discussed how the language we, as testers, use around testing can be quite unhelpful and cause confusion for those involved. For example, automated testing does not exist. You can certainly automate checking, which is mainly regression tests of existing behaviour. But you cannot automate testing, which is everything someone does to understand more about a feature, giving them knowledge to decide on how risky it is to release it. The way we communicate what we do will impact what others understand it as and then expect of us. Similarly, it is important to understand the language others use when asking us to do something.

Another helpful lesson was that customer desires are more important than customer expectations. If they are happy with your product, it doesn’t matter if it met expectations. If I don’t expect the Apple Watch will be of much use to me, but then I try it and discover that I love it, my expectations weren’t met, but my desires were, and it’s a good result. Similarly, users might expect something totally different to what you produce, but if they discover that what you made is actually better than their expectations, it is likewise a good result.

Lessons learnt:

  • Be clear in communication of testing activities to avoid ambiguity and misalignment.
  • Seek the underlying mentality behind people’s testing questions.

Part 2 – Transforming an offshore QA team (elective)

Next up, Michele Cross shared the challenges she is facing in transforming an offshore, traditional and highly structured testing team into a more agile, context-driven testing team. The primary way to achieve any big change like this is to create an environment of trust, which comes in two forms. Cognitive trust is based on ability: you trust someone because of their skills and attributes, for example trusting a doctor you have just met to diagnose you because they have been studying and practising medicine for 20 years. Affective trust is based on relationships: you trust someone because of how well you know them and how you have interacted with them in the past, for example trusting a friend’s movie recommendation because of shared interests and experiences, not their skills as a movie reviewer.

To help establish this trust and initiate change, three C’s were discussed.

  • Culture – Understanding people and their differences. The context that has brought people to where they are now will greatly shape how they interact with people. Do they desire structure or independence? Are they open to conflict or desire harmony? Knowing this can help inform decisions and approaches.
  • Communication – Relating to other people can be just as hard as it is important. Large, distributed teams bring with them challenges of language, timezones, video conferencing etc… It is crucial to find ways to address these concerns so that everyone is kept in the loop, aligned on direction and is able to build relationships with each other.
  • Coaching – Teaching new skills through example and instruction. Create an environment where it is safe to fail so people feel comfortable to grow. Use practical scenarios to teach skills and get involved yourself.

Lessons learnt:

  • Consider the cultural context of the people you are interacting with, as it will shape how to be most effective in those interactions
  • Learning through doing, and doing alongside someone, is a great way to learn
  • Trust is built by a combination of personal relationships and technical abilities

Part 3 – It takes a village to raise a tester (elective)

Catherine Karena works at WorkVentures, which is all about helping underprivileged people develop life skills and technology skills to help them enter the tech workforce. She talked about figuring out which skills to teach by looking at where the most jobs are in the market and the common skills they require. This includes both technical and relational skills, as trainees will interact with structures and other staff in companies.

On a more general note, a number of characteristics of what makes a great tester were highlighted to focus on teaching these skills as well. A great tester is: curious, a learner, an advocate, a good communicator, tech savvy, a critical thinker, accountable and a high achiever. When it comes to the learning side, a few more tips were shared around teaching through doing as much as possible, making it safe to fail, using industry experts and building up the learning over time.

Some interesting statistics were raised showing that those trained by WorkVentures over 6 months were rated by employers as equal to or greater in performance and value than comparable university graduates.

Lessons learnt:

  • Relational skills can be just as, if not more, important than technical skills in hiring new talent.
  • Learning in small steps with practical examples greatly improves the outcome.

Part 4 – Context Driven Testing: Uncut (elective)

Brian Osman talked about his experience growing in knowledge and ability as a tester, and how greatly that experience was shaped by testing communities. He explained how a community of like-minded people can help drive learning as they challenge each other and bring different viewpoints.

A side note he introduced was the term ‘Possum Testing’, which he described as “testing that you don’t value, motivated by a fear of some kind”, for example avoiding a form of testing because you don’t understand it or how to use it. This is an idea that many people would recognise, but could perhaps find hard to articulate and discuss. Giving it a name instantly provides a means to bring it up in conversation, with people already sharing a good idea of the context and any common ground in thinking.

Lessons learnt:

  • Naming ideas or common problems is a helpful way to direct future conversations and bring along the original context
  • When looking to improve in a certain area/skill, find a community of others looking to do the same thing.
  • Use these communities to present ideas, defend them and challenge others’ ideas. Debates are encouraged

Part 5 – Testing web services and microservices (elective)

Katrina Clokie (who also mentors me in conference speaking) spoke about her experience testing web services and microservices; a previous version of the talk is available online if you are interested. Starting with web services, she pointed out that each service will have different test needs based on who uses it and how they use it. Service virtualisation is a common technique used in service testing to isolate the front-end from inconsistent and potentially unstable back-ends. Microservice testing adds another layer to this model.

Some key guidelines for creating microservices automation were presented, claiming that it should be fit for purpose, remove duplication, be easy to merge changes, have continuous execution and be visible across teams.

An interesting learning technique called Pathways was presented, which can be found on her website; each pathway lists a whole bunch of resources for learning about a new topic. They are a helpful way of directing your learning time with a specific goal in mind.

Lessons learnt:

  • Make use of Katrina’s pathways for learning about a new area (for myself or as recommendations to others)
  • Get involved in code creation as early as possible to help influence a culture of testability
  • Write any automation with re-usability and visibility in mind

Part 6 – Test Management Revisited (keynote)

Anne-Marie Charrett finished up the day sharing some reflections and approaches she implemented during her time as Test Lead at Tyro Payments. She started by asking a question that has been asked a few times in the community already: do we still need test management? Her response was that we do need a testing voice to go with all the new roles and technology coming through, like microservices and DevOps. This doesn’t mean we need Test Managers who provide stability, but rather Test Leaders who can direct change. She talked about using the Satir Change Model to describe the process of change and its effect on performance.

She brought to Tyro a mentality of transforming it to have the best test practice in Australia, and was not interested in blindly copying others. There are certainly benefits to learning from the approaches others take, but they should be assessed against your company’s environment. She discussed a number of testing-related strategies that you might have to deal with: continuous delivery, testing in production, microservices, risk-based automation, business engagement, embedding testing, performance testing, operational testing, test environments, training and growth.

The next question was how to motivate people to learn? Hand-holding certainly isn’t ideal, but you also probably can’t expect people to spontaneously learn all the skills you’d like them to have. This needs coaching! And the coaching should be focused around a task that you can then offer feedback on afterwards. Then challenge them to try it again on their own.

An important question to ask in identifying what skills to teach is in highlighting what makes a good tester at your company, because your needs will be different to other places. She then finished with a few guidelines around coaching based around giving people responsibilities, improving the environment they work in and continuing to adapt as different needs and challenges arise.

Lessons learnt:

  • Any practice/process being used by others should be analysed and adapted to fit your context, not blindly copied.
  • Be a voice for testing and lead others to make changes in areas they need to improve on
  • Think about what makes a good tester at my company and how I measure up
  • Help prepare the organization/team for change and help them cope as they struggle through it

That’s a wrap for Day 1. Find my review of Day 2 here, where I took part in a workshop on Coaching Testers with Anne-Marie Charrett.

Getting started testing Microservices


Microservices involve breaking down functional areas of code into separate services that run independently of each other. However, there is still a dependency in the type and format of the data that is getting passed around that we need to take into consideration. If that data changes, and other services were depending on the previous format, then they will break. So, you need to test changes between services!

To do this, you can either have brittle end-to-end integration tests that regularly need updating and are a step removed from the services themselves, or you can be smarter and just test that the individual services continue to provide and accept data as expected, to highlight when changes are needed. This approach leads to much quicker identification of problems, is more adaptive, won’t be as brittle as integration tests, and should be a lot faster to run as well.

The Solution

What I’m proposing is to integrate contract-based testing. (Note, we are only in the early stages of trying this out at my work)
Here’s how it works:

Service A -- Data X --> Service B

Service A is providing Service B with some sort of data payload in JSON format, X. We call Service A the provider and Service B the consumer. We start with the consumer (Service B) and determine what expectations it has of the data package X; we call this the contract. We would then have standard unit tests that run with every build on that service, stubbing out the data coming in from a pretend ‘Service A’. This means that as long as Service B gets its data X in the format it expects, it will do what it should with it.

The next part is to make sure that Service A knows that Service B is depending on it to provide data X in a given format or with given data, so that if a change is needed, Service B (or any other services dependent on X) can be updated in line with it, or a non-breaking change can be made instead.

This is Consumer-Driven contract testing. It means that we can guarantee that Service A is providing the kind of data that Service B is expecting, without having to test their actual connections. Spread this out to a larger scale, with 5 services dependent on Service A’s data, and Service A giving out 5 types of data to different subsets of services, and you can see how this makes things simpler without compromising effectiveness.
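As a toy sketch of the core idea (real tools like Pact automate this, and use matchers rather than exact comparison; the payload here is invented):

```shell
# The consumer's expectations, captured as a saved contract.
cat > contract.json <<'EOF'
{"orderId": 1, "status": "shipped"}
EOF

# In a real setup this would be Service A's live response;
# here it is a canned payload standing in for it.
cat > actual.json <<'EOF'
{"orderId": 1, "status": "shipped"}
EOF

# A breaking change shows up as any difference from the contract.
if diff -q contract.json actual.json > /dev/null; then
  echo "contract satisfied"
else
  echo "breaking change detected"
fi
```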

A variation of this is to have Service B continue to stub out actually getting data from Service A for the CI builds. But instead of testing on Service A that it still meets the expected data format of Service B, we can put that Test on Service B as well, so it also checks the Stub being used to simulate Service A against what is actually coming in from Service A on a daily basis. When it finds a change, the stub is updated, and/or a change request is made to service A.
Both types have advantages and disadvantages.

In Practice

Writing these sorts of tests can be done manually, but there are tools which make it easier. Two such products are Pacto and Pact. They are both written in Ruby; Pacto is by Thoughtworks and Pact by REA. Of these two I think Pact is the better option, as it appears to be more regularly updated and better documented. PactNet is a .Net version of Pact written by SEEK Jobs; .Net is the language used at my work, and so it is the solution we’re looking into.

These tools provide a few different options along the lines of the concepts described above. One such use case is that you provide the tool with an http endpoint; it hits the endpoint and makes a contract out of the response (saying what a response should look like). Then in subsequent tests the same endpoint is hit and the result compared with the saved contract, so it can tell if there have been any breaking changes.

I’m not sure how well these tools handle specifying that there might only be part of the response that you really care about, while the rest can change without breaking anything. That would be a more useful implementation.

Further reading

Note that most of the writing available online about these tools is referring to the Ruby implementation, but it’s transferable to the .Net version.

Influential people

People that are big contributors to this space worth following or listening to:

  • Beth Skurrie – Major contributor and speaker on Pact from REA
  • Martin Fowler – Writes a lot on microservices, how to build them and test them, on a theory level, not about particular tools.
  • Neil Campbell – Works on the PactNet library

Got any experience testing microservices and lessons to share? Other resources worth including? Please comment below and I’ll include them

Running Parallel Automation Tests Using NUnit v3

With the version 3 release of NUnit, the ability to run your automated tests in parallel was introduced (a long-running feature request). This brings with it the power to greatly speed up your test execution time, by 2-3 times depending on the average length of your tests. Faster feedback is crucial in keeping tests relevant and useful as part of your software development cycle.

As parallel test execution is new to v3, the support is still somewhat limited, but I’ve managed to set up our automated test solution at my work, Campaign Monitor, to run tests in parallel, and wanted to share my findings.


We are using the following technologies, so you may have to change some factors to match your setup, but it should provide a good starting point.

  • Visual Studio
  • C#
  • Selenium Webdriver
  • TeamCity

The Setup

Step 1 – Install the latest NUnit package

Within Visual Studio, install or upgrade the NUnit v3 package for your solution via Nuget Package Manager (or your choice of package management tool) using:

Install-Package NUnit
OR Upgrade-Package NUnit

Step 2 – Choose the tests that will be run in parallel

In the current release, tests can be configured to run in parallel at the TestFixture level only. So to set your tests up to run in parallel, choose the fixtures that you want to run in parallel and put a [Parallelizable] attribute at the start of each TestFixture.

Step 3 – Choose the number of parallel threads to run

To specify how many threads will run in parallel to execute your tests, add an assembly-level ‘LevelOfParallelism’ attribute in each project (e.g. in AssemblyInfo.cs) with whatever value you desire. How many threads your tests can handle will be linked to the number of cores on the machine running the tests. I recommend either 3 or 4.
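Putting steps 2 and 3 together, the attributes look something like this (the fixture, namespace and test names are invented for illustration):

```csharp
using NUnit.Framework;

// Assembly-level attribute: the number of worker threads NUnit may use.
// Conventionally this lives in AssemblyInfo.cs, but any one source file works.
[assembly: LevelOfParallelism(3)]

namespace MyApp.Tests
{
    [TestFixture]
    [Parallelizable] // this fixture may run alongside other parallelizable fixtures
    public class LoginPageTests
    {
        [Test]
        public void LoginButton_IsVisible()
        {
            // ...drive the browser with Selenium WebDriver here...
            Assert.IsTrue(true);
        }
    }
}
```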

Step 4 – Install a test runner to run the tests in parallel

Since this is new to NUnit version 3, some test runners do not support it yet. There are two methods I’ve found which work.

1 – NUnit Console is a Nuget package that can be installed and run from the command line. Install it using:

Install-Package NUnit.Console

then open a cmd window and navigate to the location of the nunit-console.exe file installed with the package. Run the tests with this command:

nunit-console.exe <path_to_project_dll> --workers=<number_of_threads>

This will then run the tests in the location specified, using the number of worker threads specified, and output the results in the command window.

2 – NUnit Test Runner is an extension for Visual Studio. Search for ‘NUnit3 Test Adapter’ (I used version 3.0.4, created by Charlie Poole). Once installed, build your solution to populate the tests into the Test Explorer view. You can filter the visible tests using the search bar to find the subset of tests you want (as decided in Step 2), then right-click and run the tests. This will also run your tests in parallel, using the ‘LevelOfParallelism’ attribute defined in Step 3 to determine the number of worker threads. This gives you nicer output to digest than the console runner, but still feels a bit clunky to use.

You’re now set up and running your tests in parallel! Pretty easy, right? The tricky part I found was then getting these tests to run in parallel through TeamCity, our continuous integration software.

(Optional) Step 5 – Configure TeamCity to run your tests in parallel

We use TeamCity to run our automated tests against our continuous integration builds, so the biggest benefit of this project was enabling the TeamCity builds to run in parallel. Here’s how I did it.

Note: First, I tried using MSBuild to run the NUnit tests as detailed here, since this was the way we ran our tests before the NUnit v3 beta. However this didn’t work, as it requires you to supply the NUnit version in the build script, and that doesn’t support NUnit 3, or 3.0.0, or 3.0.0-beta-4, or any other variation I tried. So that was a no-go.

Second, I tried using the NUnit test build step and choosing the v3 type (only available in TeamCity v9 onwards). This led me through a whole string of errors with conflicting references and unavailable methods and, despite my best efforts, would not run the tests. So that was a no-go as well.

The method I decided upon was to use a command line step and run the NUnit console exe directly. I first set up an MSBuild step that would just copy the NUnit console files and the test project files to a local directory on the build agent running the tests. Then I set up a command line step with these settings:

Run: Executable with parameters
Command Executable: <path_to_nunit-console.exe>
Command parameters: <path_to_test_dlls> --workers=<thread_count>

And with this, I was able to run parallel tests through TeamCity! I’m sure the setup will get easier once support improves, but for now it’s a good solution that gives our test suite results 3-5 times faster 🙂

Did you find this helpful, or have any tips you want to share? Please comment below!