The Known and Unknown Costs of CI Testing
The purpose of software testing is to reduce the risk of negative effects resulting from unexpected behaviors of your application at the lowest possible cost. So let’s talk about the various costs associated with testing code in cloud applications.
Why Is Continuous Integration (CI) Testing Important?
Testing during the continuous integration phase of the application development life cycle is a lifesaver: it enables you to catch and fix problems before they reach production, which prevents issues that would affect your end users, customers, or business. Additionally, testing lets you speed up the continuous delivery process and reduce integration failures.
Known Costs of CI Testing
Issues in production can be extremely costly, and no one wants their end users to be affected by issues in their applications. That’s why we use software tests: to detect errors, defects, and bugs as early in the development cycle as possible, which minimizes cost and creates a better experience for end users. However, tests do have some known costs, such as the following:
- Building cost: This is the combined cost of time, money, and effort being spent to create tests specific to your applications. It also includes the cost of making your system available for running tests.
- Running cost: The cost of the compute resources during the runtime duration of the tests.
- Sustainability cost: The cost of the changes needed to adapt the tests as the software changes.
Software teams spend many man-hours and cloud resources on running their tests, meaning the cost of software tests should be viewed as just another cost item in the software development process.
It is possible to reduce execution costs in several ways. For example, you can run only the tests affected by the code that has changed, decouple tests that overlap with each other, replace slow input data with mocks or test doubles to shorten test times, or apply a fail-fast approach by stopping a test suite as soon as it reports a failure.
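Two of these ideas, running only the affected tests and failing fast, can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the file-to-test mapping, the file paths, and the test names are all hypothetical, and a real project would more likely derive such a mapping from coverage data or from its build tool.

```python
from typing import Callable, Dict, Iterable, List, Set

# Hypothetical mapping from source files to the tests that exercise them.
# It is hand-written here purely for illustration.
TESTS_BY_SOURCE: Dict[str, Set[str]] = {
    "billing/invoice.py": {"test_invoice_total", "test_invoice_tax"},
    "billing/tax.py": {"test_invoice_tax"},
    "auth/login.py": {"test_login_rejects_bad_password"},
}


def select_affected_tests(changed_files: Iterable[str]) -> List[str]:
    """Return the sorted names of tests mapped to any changed file."""
    selected: Set[str] = set()
    for path in changed_files:
        # Files without a mapping contribute no tests in this sketch.
        selected |= TESTS_BY_SOURCE.get(path, set())
    return sorted(selected)


def run_fail_fast(
    tests: Dict[str, Callable[[], None]], names: Iterable[str]
) -> List[str]:
    """Run tests in order and stop at the first assertion failure.

    Returns the names of the tests that were actually executed.
    """
    executed: List[str] = []
    for name in names:
        executed.append(name)
        try:
            tests[name]()
        except AssertionError:
            break  # fail fast: the remaining tests are skipped
    return executed
```

Note that a file missing from the mapping selects no tests here; a more cautious policy would fall back to running the full suite for unmapped files.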
Some tests can both pass and fail across different runs of exactly the same codebase. These are known as flaky tests, and they are more costly than others because they introduce ambiguity: a failure may or may not point to a real defect, which increases the risk that a genuine crash or slowdown goes unnoticed. If an organization does not deal with flaky tests, developers start to feel the ramifications of ignoring them.
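The definition above translates directly into a check over test run history: a test is flaky if it has both passed and failed on the same commit. The sketch below assumes a hypothetical record format of (commit, test name, passed); real CI systems expose this data in their own formats.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Set, Tuple

# Each record is (commit_sha, test_name, passed). This shape is an
# assumption made for the example, not a format from any specific CI tool.
RunRecord = Tuple[str, str, bool]


def find_flaky_tests(records: Iterable[RunRecord]) -> List[str]:
    """Return the tests that both passed and failed on the same commit."""
    outcomes: Dict[Tuple[str, str], Set[bool]] = defaultdict(set)
    for commit, test, passed in records:
        outcomes[(commit, test)].add(passed)
    flaky = {
        test for (_, test), seen in outcomes.items() if seen == {True, False}
    }
    return sorted(flaky)
```

A test that fails on one commit and passes on a later one is not reported, because the code changed between the two runs.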
Since flaky tests have a high potential to badly affect your end users, you should measure them and reduce them to a minimum when possible. Tests can be flaky to a certain degree when, for example, they require I/O operations to run. In these cases, rewriting tests to prevent flakiness can be very costly and complex, yet also necessary. To prevent or minimize flakiness, you can add retries or harden the I/O operations against transient failures. Additionally, you can speed up your tests by recording the traffic to external dependencies and replaying it through mocks.
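One way to add retries is a small decorator that re-runs an I/O operation when it raises an error considered transient. The following is a minimal sketch, not a recommended production implementation: the decorator name, the default attempt count, the delay, and the choice of `ConnectionError` and `TimeoutError` as the transient errors are all assumptions for illustration.

```python
import time
from functools import wraps


def retry(attempts=3, base_delay=0.1, transient=(ConnectionError, TimeoutError)):
    """Retry a function on transient errors with exponential backoff.

    Exceptions outside `transient` are raised immediately, so genuine
    bugs are not hidden behind retries.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")

    def decorator(func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            for attempt in range(1, attempts + 1):
                try:
                    return func(*args, **kwargs)
                except transient:
                    if attempt == attempts:
                        raise  # out of attempts: surface the failure
                    # Wait base_delay, then 2x, 4x, ... before retrying.
                    time.sleep(base_delay * 2 ** (attempt - 1))

        return wrapper

    return decorator
```

Limiting retries to a narrow set of exception types matters here: retrying on every exception would also mask real defects, which is the opposite of what the tests are for.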
Unknown Costs of CI Testing
Testing will never be able to tell us that an application is 100% reliable because testing is about managing risk. The coverage of traditional tests should not be a target that an engineering team focuses on, as it cannot indicate the quality of service but instead only calculates how extensive unit tests are.
The chart below reflects the common problems encountered during software testing. Long-running tests, flaky tests, and tests that fail can result from unknown causes and can therefore lead to unknown costs.
How to Optimize Cost
One way to optimize a system is by balancing execution cost with feedback time. For example, if we reduce the execution cost, then it will take us longer to determine whether or not our work has been successful.
If you run tests less frequently than on every code commit, then the gaps in the test execution map will increase. To ensure a project's success, you need to strike a balance between your tolerance for risk and the cost of testing.
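The trade-off can be made concrete with a deliberately simple model. It assumes that every test run costs the same amount and that, when a run fails, any commit since the previous run may be the culprit. The function name, its parameters, and the numbers used with it are hypothetical.

```python
import math
from typing import Dict


def batching_tradeoff(
    total_commits: int, run_every: int, cost_per_run: float
) -> Dict[str, float]:
    """Estimate cost and feedback gap when tests run every N-th commit.

    A toy model: real costs also depend on caching, parallelism, and
    how long the suite takes to run.
    """
    if run_every < 1:
        raise ValueError("run_every must be at least 1")
    runs = math.ceil(total_commits / run_every)
    return {
        "runs": runs,
        "execution_cost": runs * cost_per_run,
        # Upper bound on the commits to inspect after a failed run.
        "max_suspect_commits": min(run_every, total_commits),
    }
```

Under these assumptions, testing 100 commits one by one at a cost of 2.0 per run costs 200.0 with a single suspect commit per failure, while testing every fifth commit costs 40.0 but leaves up to five suspects to investigate.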
While some parts of an application may need to be tested thoroughly, others may not need to be tested as extensively. You should always think about the best way to test your scenarios, such as new features, bug fixes, or dependency failure handlers, to keep testing costs as low as possible.
For example: Would it be enough to use a unit test? Is it okay to use test doubles and mocked dependencies in tests, or should tests run against real cloud resources? Should applications be instrumented and undergo canary deployment in production in order to make tests flawless?
A developer or a development team should evaluate these questions on a case-by-case basis to determine whether a given approach is a valid enhancement. Moreover, developers should be familiar with a wide range of testing methodologies and be confident enough to choose one or more of them to keep tests effective in both cost and performance.
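To make the test-double option concrete, here is a minimal sketch using `Mock` from Python's standard `unittest.mock` module. The `archive_report` function and the `put(key, body)` storage interface are hypothetical stand-ins for application code that talks to a cloud storage service.

```python
from unittest.mock import Mock


def archive_report(storage, name: str, body: bytes) -> str:
    """Upload a report and return its storage key.

    `storage` is any object with a `put(key, body)` method. The interface
    is hypothetical and stands in for a real cloud storage client.
    """
    if not body:
        raise ValueError("report body is empty")
    key = f"reports/{name}.txt"
    storage.put(key, body)
    return key


def test_archive_report_uploads_under_reports_prefix() -> None:
    storage = Mock()  # test double: no network call, no cloud cost
    key = archive_report(storage, "daily", b"42 tests passed")
    assert key == "reports/daily.txt"
    storage.put.assert_called_once_with("reports/daily.txt", b"42 tests passed")
```

A test like this is cheap and fast, but it only verifies how the code calls the dependency. Whether the real service accepts that call is a separate question, which is why some scenarios still justify tests against real cloud resources.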
How Do We Plan for Tests That May or May Not Pass?
One way that we can test for unknown problems is by introducing random data to an application. Randomized testing, as it is sometimes called, introduces data that the developer has not generated intentionally and utilizes existing tests for expected behaviors. See the chart below for more information:
When testing our application, our goal is to move as much information to the “Known Knowns” quadrant as we possibly can. On the other hand, every change we make in our application invalidates the information in the "Known Knowns" quadrant.
Testing allows us to move information from the "Known Unknowns" to "Known Knowns," because we know which questions we should ask in the "Known Unknowns" quadrant. So, when we run tests with those questions in mind, we have a good chance of getting our answers.
This can be done in a quick, automated way. Some examples of questions you might ask are, "Will this function provide the expected outcome when these specific arguments are provided?", "Does my service return an HTTP 400 status when given an invalid payload?", or "Does the UI show me an error message when I enter a password that doesn’t match what is provided on sign-up?"
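Questions like these map onto small automated tests, and the randomized testing described earlier can reuse the same expected behavior with generated inputs. The sketch below is illustrative only: `status_for_signup` is a hypothetical piece of handler logic, and its validation rules (an email must contain "@", a password needs at least eight characters) are assumptions made for the example.

```python
import random
import string


def status_for_signup(payload: dict) -> int:
    """Hypothetical handler logic: 200 for a valid payload, 400 otherwise."""
    email = payload.get("email")
    password = payload.get("password")
    if not isinstance(email, str) or "@" not in email:
        return 400
    if not isinstance(password, str) or len(password) < 8:
        return 400
    return 200


def test_known_questions() -> None:
    # Specific arguments with a specific expected outcome.
    valid = {"email": "a@example.com", "password": "s3cretpass"}
    assert status_for_signup(valid) == 200
    assert status_for_signup({"email": "not-an-email", "password": "s3cretpass"}) == 400
    assert status_for_signup({}) == 400


def test_randomized_invalid_emails(seed: int = 0, rounds: int = 200) -> None:
    # The alphabet never contains "@", so every generated email is invalid.
    rng = random.Random(seed)  # fixed seed keeps failures reproducible
    alphabet = string.ascii_letters + string.digits + " ._-"
    for _ in range(rounds):
        length = rng.randint(0, 20)
        email = "".join(rng.choice(alphabet) for _ in range(length))
        payload = {"email": email, "password": "s3cretpass"}
        assert status_for_signup(payload) == 400, email
```

Seeding the generator is a deliberate choice: a randomized test that cannot be replayed would itself behave like a flaky test.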
"Unknown Unknowns" are the issues that become apparent only when they arise: we are not aware of them beforehand and, even once they appear, we may not easily understand them.
We cannot predict which components of our application will fail without knowing what can actually fail. The best solution to deal with this problem is to have adequate instrumentation and good debugging tools in place.
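As a very small example of instrumentation, the decorator below records how long a call took and logs the full traceback when it fails, using only Python's standard `logging` module. The logger name and the decorator name are hypothetical, and real systems typically rely on dedicated tracing and monitoring tools rather than a hand-rolled wrapper like this one.

```python
import logging
import time
from functools import wraps

# The logger name is an arbitrary choice for this sketch.
logger = logging.getLogger("instrumentation")


def instrumented(func):
    """Log the duration of each call, and the traceback if it raises."""

    @wraps(func)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        try:
            result = func(*args, **kwargs)
        except Exception:
            elapsed_ms = (time.perf_counter() - start) * 1000
            # logger.exception logs at ERROR level and attaches the traceback.
            logger.exception("%s failed after %.1f ms", func.__name__, elapsed_ms)
            raise  # instrumentation must not swallow the error
        elapsed_ms = (time.perf_counter() - start) * 1000
        logger.info("%s succeeded in %.1f ms", func.__name__, elapsed_ms)
        return result

    return wrapper
```

The wrapper re-raises the original exception, so it changes what is recorded about a failure without changing how the application behaves.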
It's always a good idea to go for the least costly tests to avoid regressions and to move information to the "Known Unknowns" quadrant, especially if the root cause is not just a transitory operational issue, but something that we are likely to encounter again in the future.
As organizations move to the cloud and adopt Agile methodologies and DevOps, developers are increasingly responsible for more tasks. With dedicated QA teams no longer a given, developers write more tests, operate the services they build ("you build it, you run it," or YBIYRI), and are expected to do more for testing, security, and so on.
The most efficient way to build applications that are resilient, performant, easy to debug, and easy to maintain is to test them. Testing enables organizations to ship code faster, but when it is not done carefully, it can be costly in many ways. Modern applications are distributed, dynamic, and remote. As a result, new failure modes have entered the scene, and they are increasingly difficult to anticipate and troubleshoot.
The best way to avoid software faults is to use more than one quality control strategy. To identify, diagnose, and correct problems more accurately and effectively, it's important to review traditional testing techniques and adopt new ones for every stage of the development life cycle. Testing will not be an effective means of low-defect software production until this is implemented.
At every testing stage of every software development project, you are highly likely to face “Unknown Unknowns.” You need to have the proper tooling in place to tame those monsters immediately. Foresight enables you to monitor your CI pipelines and to debug and troubleshoot your tests.
Detecting the root causes of erroneous tests and understanding the reasons behind slow builds is easy with Foresight. Sign up for Foresight and try it yourself.