Handling Increased Test Failures and Detecting Longer Test Durations
We all want to deliver our products quickly in order to stay relevant in a crowded market. But churning out new releases often compromises quality. In order to balance efficiency and efficacy, we use tests to ensure that our applications will meet the high standards of our customers.
When we first release a product, it isn’t burdened by a lot of tests, so the pipeline works and completes the builds in a matter of minutes. But as we add more features and more capabilities are integrated into the product, our tests start to take a lot longer. This is especially true if you’ve integrated end-to-end tests into your CI/CD pipeline.
If we can’t optimize the time it takes to run our tests, we can’t expect to deliver the software on time. And with so many competitors in the market, every delay could mean a loss in customers.
So, how do we detect these long-running tests and handle the increasing number of failed tests in our workflow? In this article, we’ll address these issues and give you solutions based on real-world examples.
Growing Complexity in Tests
There are many reasons why the complexity of your automated tests is increasing in your CI pipeline. To help you identify what’s causing your tests to fail or take a long time, we’ve grouped these causes into a few useful categories and included examples.
Application or Test Code Changes
Let’s say our application has an exploration feature that allows users to get book data from a MySQL database. As our application gains popularity, the increased traffic to the system slows down the query response. So, to help with this, we decide to cache query results with Redis.
The exploration feature in our application has a rule that only a fully authorized user can access certain data, like pricing and summary. Regular users don’t have permission to view that information, but they may eventually upgrade in order to gain access to it. However, because responses are now cached in Redis, a user who has just upgraded will still be served the old cached data, without the pricing and summary content.
To resolve this issue, we can add a mechanism that checks the user’s permissions. If the user has upgraded, we bypass the Redis cache and run the real query instead. We can then cache the fresh result so that it is served the next time.
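The permission check described above can be sketched roughly as follows. This is a minimal illustration, not the application’s real code: a plain in-memory map stands in for Redis, and the names BookView, BookRepository, and BookExplorer are hypothetical. It also assumes that each cached entry records the access level it was built with, which is one possible way to tell that an entry is stale after an upgrade.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical view of a book; price and summary stay null for regular users.
class BookView {
    final String title;
    final String price;
    final String summary;
    final boolean fullAccess;

    BookView(String title, String price, String summary, boolean fullAccess) {
        this.title = title;
        this.price = price;
        this.summary = summary;
        this.fullAccess = fullAccess;
    }
}

// Stand-in for the MySQL query; counts calls so cache hits are visible.
class BookRepository {
    int queries = 0;

    BookView load(String bookId, boolean fullAccess) {
        queries++;
        return fullAccess
                ? new BookView("Dune", "9.99", "A desert planet epic", true)
                : new BookView("Dune", null, null, false);
    }
}

class BookExplorer {
    // Assumption: a plain map stands in for Redis; keys are "userId:bookId".
    private final Map<String, BookView> cache = new HashMap<>();
    private final BookRepository repository;

    BookExplorer(BookRepository repository) {
        this.repository = repository;
    }

    BookView getBook(String userId, String bookId, boolean userIsUpgraded) {
        String key = userId + ":" + bookId;
        BookView cached = cache.get(key);
        // Serve from the cache only if the entry matches the user's current access level.
        if (cached != null && cached.fullAccess == userIsUpgraded) {
            return cached;
        }
        // Otherwise bypass the cache, run the real query, and cache it for next time.
        BookView fresh = repository.load(bookId, userIsUpgraded);
        cache.put(key, fresh);
        return fresh;
    }
}
```

With this sketch, the first request after an upgrade goes to the database, and later requests from the same user are served from the cache again.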
Changes to test code can cause similar problems. Suppose we have an automated test that creates a new book and checks that the book content has loaded. Later, we offer a new feature that allows users to delete books, so we add an automated test that deletes all the books. If we then run the tests in parallel to make them faster, our test for creating a new book becomes flaky.
This is because the first test is asserting whether the new book was created while the second test is deleting all the books. Whenever the delete runs between the creation and the assertion, the first test fails because there is no book, while the second test passes because all books are deleted.
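One common way to avoid this kind of interference is to have each test work only with data it created itself. The sketch below illustrates that idea under stated assumptions: BookStore is a hypothetical in-memory stand-in for the book API, and the two "tests" are plain static methods rather than tests from a real framework. The delete test removes only its own books instead of deleting everything.

```java
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical in-memory stand-in for the book API shared by both tests.
class BookStore {
    private final Map<String, String> books = new ConcurrentHashMap<>();

    String create(String title) {
        String id = UUID.randomUUID().toString();
        books.put(id, title);
        return id;
    }

    boolean exists(String id) {
        return books.containsKey(id);
    }

    void delete(String id) {
        books.remove(id);
    }
}

class IsolatedBookTests {
    // Creates one book and checks only that book.
    static boolean createBookTest(BookStore store) {
        String id = store.create("created-by-create-test");
        return store.exists(id);
    }

    // Creates its own books and deletes only those, never "all books".
    static boolean deleteBookTest(BookStore store) {
        String first = store.create("created-by-delete-test-1");
        String second = store.create("created-by-delete-test-2");
        store.delete(first);
        store.delete(second);
        return !store.exists(first) && !store.exists(second);
    }

    // Runs both tests in parallel against the same store; true if both pass.
    static boolean runInParallel() {
        BookStore store = new BookStore();
        ExecutorService pool = Executors.newFixedThreadPool(2);
        try {
            Future<Boolean> create = pool.submit(() -> createBookTest(store));
            Future<Boolean> delete = pool.submit(() -> deleteBookTest(store));
            return create.get() && delete.get();
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException("parallel run failed", e);
        } finally {
            pool.shutdown();
        }
    }
}
```

Because neither test touches the other’s data, the order in which the two threads run no longer affects the result.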
Test Data Changes
Sometimes, our automated tests need access to specific data, such as a user account with admin permissions. But if folks from the QA team don’t know that, they might update the permissions of that user and remove its admin status. As a result, all the tests that depend on admin access will fail.
To prevent this from happening, we can establish practices to communicate to other users that this data is for automated testing only. For example, the username for automated testing should start with "Automation."
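A naming convention like this can also be checked in code. The following is a small sketch, assuming a hypothetical TestAccountGuard helper that an admin tool might call before a manual permission change; the only detail taken from the text is the "Automation" prefix.

```java
class TestAccountGuard {
    // Convention from the text: automation accounts start with "Automation".
    static final String AUTOMATION_PREFIX = "Automation";

    static boolean isReservedForAutomation(String username) {
        return username != null && username.startsWith(AUTOMATION_PREFIX);
    }

    // Hypothetical hook: rejects a manual permission change on a reserved account.
    static void checkManualEditAllowed(String username) {
        if (isReservedForAutomation(username)) {
            throw new IllegalStateException(
                    "Account '" + username + "' is reserved for automated tests");
        }
    }
}
```

A guard like this does not replace communication within the team, but it turns an accidental edit into an immediate, readable error.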
Topology or Flow Changes
For our application, we have an end-to-end test case for creating new user accounts. After a while, the business team decides that users have to agree to our terms and conditions in order to create a new account. Since our end-to-end test does not have a step for checking the box that shows a user agrees to the terms and conditions, it now fails.
This is a simple error that can be spotted right away. To fix the test, we just need to add a step that checks the box to agree to the terms and conditions.
Another issue could come from new features. For example, say we have a test in which a user buys an item, and it runs quickly and performs just as we expect. We then decide to add a module that compares prices between different stores in town. Suddenly, this test takes twice as long to run. As usual, we check the logs, take the time to debug, and find out what went wrong. It turns out that our new module is making the web application much slower.
Since this is related to the application’s behavior, we should check the production code and optimize the performance of the price comparison. This can be done by adding more nodes for our databases, or by caching data so that query results are returned faster.
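A slowdown like this one can also be flagged automatically instead of being noticed by chance. The sketch below shows one simple approach: compare the latest duration of a test against the median of its previous runs. The class name DurationRegressionDetector, the use of the median, and the threshold factor are all assumptions made for illustration; this is not how any particular monitoring tool works.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class DurationRegressionDetector {
    // Median of previous durations in milliseconds; expects a non-empty list.
    static double median(List<Long> durationsMs) {
        List<Long> sorted = new ArrayList<>(durationsMs);
        Collections.sort(sorted);
        int mid = sorted.size() / 2;
        return sorted.size() % 2 == 1
                ? sorted.get(mid)
                : (sorted.get(mid - 1) + sorted.get(mid)) / 2.0;
    }

    // Flags the latest run if it is more than `factor` times the historical median.
    static boolean isRegression(List<Long> historyMs, long latestMs, double factor) {
        if (historyMs.isEmpty()) {
            return false; // nothing to compare against yet
        }
        return latestMs > median(historyMs) * factor;
    }
}
```

For example, with previous runs of around one second and a factor of 1.5, a run that takes about two seconds would be flagged, while normal run-to-run variation would not.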
Finally, if our application has advertising pop-ups, it’s hard to predict when one will be shown. We might have tests that need to download or view part of the application, but we can’t know in advance when a pop-up will appear and need to be closed. Though this is challenging, a quick solution is to use a browser extension that blocks all advertising.
Demonstration: Testing a Simple Service
To illustrate some of the issues we described above, we’ll demonstrate how simple test cases that run just fine at first can develop problems as we integrate more features into our application.
To get started, we’ll create a service that manages movies using the Java Spring Boot framework and a MySQL database. You can refer to this GitHub repository for the full code.
We’ll start the service by running the Maven command shown below:
Now, let’s try to create a new movie so we can test the service using Postman.
Apply Foresight to Monitor the Tests
In order to monitor the tests using Foresight, we’ll need to upload our test reports by updating our YAML file with the following.
We’ll then run the test with Maven using this command:
Alright, now we can go to the Foresight dashboard to see the test result.
The test passed, and if we click on the test’s details, we can see the history of this test in our recent runs.
To manage the movies in our application, we need to add features to retrieve movie content and to delete outdated movies. With these features in place, our test can create a movie, retrieve its content, and then delete the movie through the API to keep the environment clean. See this gist for the test that creates a new movie.
Let’s run the test again.
This time, the test failed. Let’s see what went wrong.
We’ll click on the detailed result for the failed test to get more information.
The detailed result shows failed assertions on two fields: movie title and movie content. Let’s take a look at our test code again.
In the test code, we can see that we accidentally placed the “delete” step before the call to the “retrieve movie” API, so the movie no longer existed when the test tried to read it. What a mistake!
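The effect of the step order can be reproduced with a small, self-contained sketch. It does not use the article’s real service or test code: InMemoryMovieApi is a hypothetical stand-in for the movie API, and the two methods model the wrong and the corrected order of steps as plain Java rather than as tests in a real framework.

```java
import java.util.HashMap;
import java.util.Map;

class Movie {
    final String title;
    final String content;

    Movie(String title, String content) {
        this.title = title;
        this.content = content;
    }
}

// Hypothetical in-memory stand-in for the create/retrieve/delete movie API.
class InMemoryMovieApi {
    private final Map<Integer, Movie> movies = new HashMap<>();
    private int nextId = 1;

    int create(String title, String content) {
        int id = nextId++;
        movies.put(id, new Movie(title, content));
        return id;
    }

    // Returns null when the movie does not exist.
    Movie retrieve(int id) {
        return movies.get(id);
    }

    void delete(int id) {
        movies.remove(id);
    }
}

class CreateMovieTestFlow {
    // The two assertions from the text: movie title and movie content.
    static boolean matches(Movie movie, String title, String content) {
        return movie != null && title.equals(movie.title) && content.equals(movie.content);
    }

    // Wrong order: the movie is deleted before it is retrieved, so the checks fail.
    static boolean deleteBeforeRetrieve(InMemoryMovieApi api) {
        int id = api.create("Inception", "A heist inside dreams");
        api.delete(id);
        return matches(api.retrieve(id), "Inception", "A heist inside dreams");
    }

    // Corrected order: create, retrieve and check, then delete to clean up.
    static boolean retrieveBeforeDelete(InMemoryMovieApi api) {
        int id = api.create("Inception", "A heist inside dreams");
        Movie movie = api.retrieve(id);
        api.delete(id);
        return matches(movie, "Inception", "A heist inside dreams");
    }
}
```

Keeping the cleanup step last means the environment is still left clean, while the assertions run against data that actually exists.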
With Foresight, you can monitor your tests, review logs, and easily learn why your tests failed. Foresight also helps you monitor everything, from integration tests to end-to-end tests.
Creating automated tests and integrating them into the CI/CD pipeline is not so hard. However, maintaining a growing number of tests over time is a real challenge. Without a proper tool for monitoring tests, your team can spend days debugging failures and trying to reduce test duration.
Combining and implementing open-source tools is always an option, but it’s a lot harder than using Foresight as a one-stop shop. Managing a hodgepodge of tools also takes a lot longer and pulls resources from the whole team. Foresight, on the other hand, can be integrated into the system in a matter of minutes.
With Foresight in hand, you can spend more time writing tests and less time debugging failed and long-running tests. Sign up for a free Foresight account and start improving your tests’ performance today.