How to Fix Your Failing End-to-End Tests
If you follow software development best practices, you’ve probably written quite a few tests for your code. Unit tests, integration tests, smoke tests, black-box tests, and maybe even end-to-end (E2E) tests.
What Is E2E Testing?
While a unit test checks the correctness of one encapsulated part of your code, E2E tests check that every part of your system works together correctly. In short, E2E tests cover your whole application from front to back.
You can perform an E2E test by using your system as if you were a real user. Depending on the kind of software you’ve created, this might mean clicking through a GUI, calling a CLI, or sending requests to API endpoints. You can do this manually, but it’s most effective if done automatically by an E2E testing framework that pretends to be a real person.
The purpose of E2E tests is to exercise a system as it would be used in production. Some of the tasks such a test checks are:
- Clicking on a button
- Activating a loading indicator
- Sending a request to an API
- Accessing the database
- Sending the database records via API response back to the GUI
- Deactivating the loading indicator again
As you can imagine, using every feature of an application can take quite some time, even if it’s automated. Don’t be surprised if your E2E test suites take longer than an hour. Long-running tests are expensive, so it’s important to find errors quickly. Especially in modern microservice architectures, you end up with many moving parts, each of which could be carrying a bug.
How can you understand why your tests fail? Let’s look at the most common issues that arise when you’ve built a sufficiently complex E2E testing suite.
Why Do E2E Tests Fail, and How Can You Fix Them?
Contemporary software systems are often built on a microservices architecture pattern. In this pattern, rather than build one huge code-base for a single application, systems are separated into multiple smaller code-bases. These implement services that provide a specific feature, like authentication or monitoring, and those services can have performance problems, network issues, or bugs in their code—any of which can lead to failing tests.
Inter-Service Communication
Communication between services is one of the biggest reasons for E2E test failures. If you have multiple microservices calling each other, many things can go wrong along the way. You might load a shopping cart and then start a checkout, each handled by a different service. In the end, though, you’re presented with a generic “500 Internal Server Error” that doesn’t tell you anything about what went wrong, let alone where it happened.
Let’s look at an example of a GUI and two services that call each other.
The frontend calls an API endpoint:
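A minimal sketch of such a call might look like this; the `loadCart` name, the `Cart` shape, and the injected `api` stub are illustrative assumptions that keep the snippet self-contained:

```typescript
// Hypothetical frontend logic: toggle a loading indicator around an API
// call and hand the result to the UI. The API is injected as a function
// so the sketch runs without a real server.
type Cart = { items: string[] };

async function loadCart(
  api: () => Promise<Cart>,
  setLoading: (on: boolean) => void
): Promise<Cart> {
  setLoading(true); // activate the loading indicator
  try {
    return await api();
  } finally {
    setLoading(false); // deactivate it again, even on failure
  }
}

// Demo with a stubbed API response:
const states: boolean[] = [];
loadCart(async () => ({ items: ["book", "pen"] }), (on) => states.push(on))
  .then((cart) => console.log(cart.items, states));
```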
The API endpoint, in turn, calls a backend service:
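A sketch of what that endpoint might do; the handler shape and names are assumptions, with the backend service injected so the snippet runs on its own:

```typescript
// Hypothetical API layer: forward the request to a backend service and
// translate any failure into the generic 500 the article mentions.
type ApiResponse = { status: number; body?: unknown };

async function getCartEndpoint(
  service: () => Promise<unknown>
): Promise<ApiResponse> {
  try {
    return { status: 200, body: await service() };
  } catch {
    // The caller only ever sees this generic error, no matter
    // which layer below actually failed.
    return { status: 500, body: "Internal Server Error" };
  }
}
```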
The backend service loads a record from a database:
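A sketch of this last hop; a `Map` stands in for a real database driver here, and all names are illustrative:

```typescript
// Hypothetical backend service: look a record up through an injected
// database client. The in-memory stand-in keeps the sketch runnable.
type Db = {
  findById: (id: number) => Promise<Record<string, unknown> | undefined>;
};

async function getCartRecord(db: Db, id: number) {
  const record = await db.findById(id);
  if (record === undefined) {
    throw new Error(`cart ${id} not found`); // one of many ways this layer can fail
  }
  return record;
}

// In-memory stand-in for the database:
const rows = new Map<number, Record<string, unknown>>([[1, { items: ["book"] }]]);
const fakeDb: Db = { findById: async (id) => rows.get(id) };
```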
Any of these parts can fail, leaving you to wonder if the database is down or if the API can’t reach the service because of network issues. If you want to solve such problems, you have to go a step further than just logging your errors—you need test monitoring that provides you with distributed tracing.
Such a tracing system will mark all events in your system with a unique ID related to the action you’ve taken. This way, you can see where the chain of events stopped inside your microservice architecture so that you can locate the culprit of your failing tests deeper in your stack.
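As a bare-bones illustration of the idea, the sketch below tags every log entry with a per-action trace ID; a real system would use a standard like W3C Trace Context or a library such as OpenTelemetry instead of this hand-rolled version:

```typescript
// Minimal correlation-ID sketch: one ID per user action, attached to
// every log entry so a failing chain of events can be followed
// across services.
const log: { traceId: string; service: string; message: string }[] = [];

function traced(service: string, traceId: string, message: string): void {
  log.push({ traceId, service, message });
}

function checkout(traceId: string): void {
  traced("frontend", traceId, "checkout clicked");
  traced("api", traceId, "POST /checkout received");
  traced("cart-service", traceId, "loading cart failed"); // the culprit
}

checkout("a1b2c3"); // in practice, a random UUID per action

// All events for one action can now be filtered by its trace ID,
// and the last entry shows where the chain of events stopped:
const chain = log.filter((e) => e.traceId === "a1b2c3");
console.log(chain[chain.length - 1].service);
```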
Static Wait Times
If you have multiple services communicating over the network, responses won’t be instantaneous, so you’ll have to implement some waiting mechanism in your code.
The naive solution to this is waiting for a specific number of seconds. You see that the service usually completes in 500 milliseconds, so you set the waiting time to two seconds—just to be safe—and call it a day.
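In code, the naive approach boils down to a fixed sleep, sketched here (the two-second figure is the one from the scenario above):

```typescript
// Fixed wait: sleep for two seconds no matter when the service
// actually finishes. Simple, but both flaky and wasteful.
const sleep = (ms: number): Promise<void> =>
  new Promise((resolve) => setTimeout(resolve, ms));

async function checkoutTest(): Promise<void> {
  // ...trigger the checkout here...
  await sleep(2000); // hope the response has arrived by now
  // ...assert on the result here...
}
```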
After a few days or weeks, your tests become flaky, meaning they sometimes fail without any change to the code they cover. Your network is slower than usual, and the two-second wait isn’t enough anymore. Conversely, if performance improves, your tests will still wait the full two seconds, so your test suite gains nothing from the improvement; long-running tests would profit from those gains the most.
A better solution is to make your waiting time dynamic: don’t wait for two seconds, wait for a specific event, like a response from an API endpoint.
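For instance, a small polling helper can stand in for framework features like Cypress’s `cy.wait()` or Playwright’s `page.waitForResponse()`; this hand-rolled sketch only shows the principle:

```typescript
// Poll for a condition instead of sleeping a fixed time: return as soon
// as the event happens, and fail loudly if it never does.
async function waitFor(
  condition: () => boolean,
  timeoutMs = 2000,
  intervalMs = 50
): Promise<void> {
  const start = Date.now();
  while (!condition()) {
    if (Date.now() - start > timeoutMs) {
      throw new Error(`condition not met within ${timeoutMs} ms`);
    }
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }
}

// Usage: resolve as soon as a (simulated) response arrives.
let responseArrived = false;
setTimeout(() => { responseArrived = true; }, 100);
waitFor(() => responseArrived).then(() => console.log("done"));
```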
It’s also a good idea to include historical execution time as a success criterion. While you want to make your E2E tests as robust as possible, a test that suddenly takes ten times as long could indicate a performance regression and should be treated as a real issue by your development team.
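One way to sketch such a check: compare a run’s duration against its historical average and flag a sudden slowdown. The ten-times factor and the function name are illustrative assumptions; real suites would tune the threshold or use percentiles:

```typescript
// Flag a test whose duration is far above its historical average.
function isPerformanceRegression(
  durationMs: number,
  historyMs: number[],
  factor = 10
): boolean {
  const avg = historyMs.reduce((sum, d) => sum + d, 0) / historyMs.length;
  return durationMs > avg * factor;
}

console.log(isPerformanceRegression(6000, [400, 500, 600])); // avg 500 ms → true
```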
Dependencies Between Tests
A test that requires the successful execution of another test can be a huge source of frustration. If one test fails due to a bug, ten unrelated tests that depend on it will also fail.
In the following example, the second test depends on the first to succeed because they both use the record variable. However, only the first test sets the variable to an actual record from the database.
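Reconstructed as a sketch, the anti-pattern might look like this; the tiny inline `test` runner and all names are stand-ins so the snippet runs on its own:

```typescript
// Two tests coupled through a shared variable: the second one only
// works if the first ran (and passed) before it.
type DbRecord = { id: number };
let record: DbRecord | undefined; // shared mutable state: the root problem

const results: Record<string, boolean> = {};
function test(name: string, fn: () => void): void {
  try { fn(); results[name] = true; } catch { results[name] = false; }
}

test("loads a record", () => {
  record = { id: 1 }; // stands in for a real database query
});

test("updates the record", () => {
  if (record === undefined) throw new Error("no record loaded");
  record.id += 1; // depends entirely on the previous test
});

console.log(results);
```

Run in order, both tests pass; run the second one on its own and it fails for a reason that has nothing to do with the code it is supposed to verify.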
It may also be that a test works as it should but doesn’t leave the environment in an acceptable state. Now you have a bug that isn’t directly related to the test that’s marked as failed.
To address this, try to encapsulate your tests as much as possible. Use the setup and teardown functionality of your testing framework so you can create a clean environment for every test that runs. It’s a bit more work to get started, but it can save you days or even weeks when you have to debug your code.
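A minimal sketch of the pattern, with an in-memory `Map` standing in for the real environment; most frameworks expose this as `beforeEach`/`afterEach` hooks:

```typescript
// Per-test setup and teardown: every test gets a fresh record, and the
// environment is cleaned afterwards, so no test depends on another.
const db = new Map<number, { items: string[] }>();

function setup(): number {
  const id = db.size + 1;
  db.set(id, { items: [] }); // create a clean record for this test only
  return id;
}

function teardown(id: number): void {
  db.delete(id); // leave the environment as we found it
}

function runTest(body: (id: number) => void): void {
  const id = setup();
  try {
    body(id);
  } finally {
    teardown(id); // runs even when the test fails
  }
}

runTest((id) => db.get(id)!.items.push("book")); // isolated from other tests
console.log(db.size); // prints 0: the environment is clean again
```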
Conflating Application Errors and Infrastructure Errors
There are multiple reasons a test fails. It could be related to the actual application code your tests try to validate, or to the infrastructure they’re running on.
Computers don’t have infinite memory, CPU time, or disk space, so when your test suite grows or other processes run on your testing hardware, these resources can become exhausted.
Troubleshooting test failures is much harder if you first have to figure out whether a test failed because of memory exhaustion or an actual bug. You can alleviate this by splitting your error reporting into application errors and infrastructure errors:
- Did everything go well? Mark the test green.
- Did you hit an infrastructure-related issue, like a network timeout or a full disk? Mark the test yellow. You might have a regression and your services suddenly write more data than they should, but it could simply be that you forgot to delete old data or that someone else is writing to that disk.
- Did anything else go wrong? Mark the test red. That means it’s time to investigate your own bugs.
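Sketched as a traffic-light classifier, this could look like the snippet below; the error-message patterns are illustrative assumptions, and a real setup would inspect typed errors or exit codes instead:

```typescript
type Verdict = "green" | "yellow" | "red";

// Split failures into infrastructure problems (yellow) and application
// bugs (red). The patterns below are examples, not an exhaustive list.
const infrastructurePatterns = [/timeout/i, /no space left/i, /connection refused/i];

function classify(error: Error | null): Verdict {
  if (error === null) return "green";
  if (infrastructurePatterns.some((p) => p.test(error.message))) return "yellow";
  return "red";
}

console.log(classify(null)); // green
console.log(classify(new Error("network timeout after 30s"))); // yellow
console.log(classify(new Error("expected 3 items, got 2"))); // red
```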
This structured approach will make fixing test errors much easier in the future because you know where to look for a problem at a glance.
With E2E testing, you touch multiple parts of your application at once. In the age of microservice architectures, this can mean touching multiple services on multiple distributed machines with every test. This can lead to problems that aren’t necessarily related to your application code but to the infrastructure underneath it.
That said, E2E tests are the closest you’ll get to running your system with real users in production. Getting your E2E tests under control will give you confidence that your system works as intended.
So, how can you find test failures faster? Foresight offers comprehensive visibility into your tests with its testing insights and analytics module, which helps you identify slow and unreliable tests. Sign up to learn what’s really breaking your tests.