Rebuild Confidence in Your Test Automation

DavidTz40 - Jan 16 '23 - Dev Community

These days, development teams depend heavily on feedback from automated tests to evaluate the quality of the system they are working on.

Depending on the type of software testing and the development process used, running automated tests every night or after every commit gives teams insight into how changes to the code affect its overall quality. With great power, however, comes great responsibility. The more you rely on feedback from your automated tests to decide whether a build may proceed to the next step in your development pipeline, the more you must be able to rely on the quality and defect-detection effectiveness of those very tests.

In other words, confidence in the quality of a system is essential, so if the system is tested by an automated process at any stage, you should also be able to trust the integrity of those automated tests. Unfortunately, this is where test automation all too frequently falls short of its potential. Instead of being the solid and dependable guardians of product quality that they should be, automated tests are frequently a source of deception, annoyance, and ambiguity. They end up damaging the very trust they were supposed to provide in the first place.

How can we regain confidence in our automated tests? Let’s look at two ways automated tests might erode confidence rather than build it, and then consider what you can do to rectify the situation — or, better yet, prevent it from happening in the first place.


False Positives

Let us begin with a short explanation of false positives:

A false positive is an error in binary classification in which a test result incorrectly indicates the presence of a condition (such as a disease when the disease is not present), while a false negative is the opposite error, where the test result incorrectly indicates the absence of a condition when it is actually present. These are the two kinds of errors in a binary test, in contrast to the two kinds of correct result (a true positive and a true negative). They are also known in medicine as a false positive (or false negative) diagnosis, and in statistical classification as a false positive (or false negative) error. — Wikipedia

This kind of trust-destroying test result occurs most frequently with user interface-driven testing, mainly because these tests have the highest likelihood of failing for reasons unrelated to product defects (synchronization and timeout issues, exception handling, uncommunicated changes in the product, and so on). If your timeout and exception handling aren't correctly designed, false positives can be quite time-consuming in terms of root cause analysis and future prevention. When these false positives recur on an irregular basis, it can be even more difficult to determine what is causing the problem.

When your team or business uses a continuous integration or continuous deployment strategy for software development, false positives can be very aggravating. When your build pipeline contains tests that occasionally produce false positives, your build may fail intermittently because your tests aren't capable of handling exceptions and timeouts properly. This may eventually lead you to eliminate the tests from your pipeline entirely. Although this may temporarily alleviate the problem of broken builds, it is not a long-term approach. There's a reason you're putting in the effort to create these tests (right?), so they should be included in the automated testing and delivery process. Instead, you should investigate the underlying cause of these false positives as soon as they arise and rectify them as quickly as possible.

To avoid false positives in the first place, you should invest effort into developing strong and reliable tests, including adequate exception handling and synchronization procedures. This will surely require time, effort, and expertise up front, but a solid foundation will pay off in the long term by eliminating these incredibly annoying false positives.
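To make that concrete, here is a minimal sketch of such a synchronization procedure using Selenium's Python bindings: an explicit wait with a clear failure message instead of a fixed sleep. The URL and element locator are hypothetical, chosen only for illustration.

```python
# Minimal sketch: explicit wait instead of a fixed sleep.
# The URL and element ID are placeholders for illustration.
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.common.exceptions import TimeoutException

driver = webdriver.Chrome()
try:
    driver.get("https://example.com/login")  # hypothetical page
    # Wait until the element is actually clickable rather than
    # sleeping a fixed number of seconds; this removes most
    # synchronization-related flakiness from UI-driven tests.
    button = WebDriverWait(driver, timeout=10).until(
        EC.element_to_be_clickable((By.ID, "submit"))
    )
    button.click()
except TimeoutException:
    # Fail with a descriptive message instead of an obscure stack
    # trace, which makes root cause analysis much faster.
    raise AssertionError("Login button not clickable within 10 seconds")
finally:
    driver.quit()
```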


False Negatives

Let us begin with a short explanation of false negatives:

A false negative error, or false negative, is a test result which wrongly indicates that a condition does not hold. — Wikipedia

While false positives might be quite annoying, at least they make themselves known through error warnings or unsuccessful CI builds. The significant threat of decaying trust in test automation lies in negative cases: tests that pass when they should not. The more you depend on the outcome of your automated test runs when making procedural decisions, like a deployment into production, the more crucial it is that you can trust that your test results are an accurate reflection of the quality of your application under test, rather than a hollow shell of tests that pass but do not carry out the verifications they are supposed to.
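As a concrete illustration, here is what such a hollow test can look like in Python; the function and names are invented for the example. It runs green on every build while verifying nothing meaningful.

```python
# A deliberately hollow test, with hypothetical names, showing how a
# false negative hides in plain sight: it runs, it passes, and it
# verifies nothing.
def apply_discount(total):
    # Suppose a refactoring broke the discount logic, so the
    # function now always returns the unchanged total.
    return total

def test_discount_is_applied():
    discounted = apply_discount(200)
    # This assertion can never fail, so the broken logic above
    # still produces a green test run.
    assert discounted is not None
```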

False negatives are particularly difficult to identify and manage for two primary reasons:

  1. As previously stated, they do not actively announce their presence, for example through an error message, the way true positives and false positives do.

  2. While certain tests may be false negatives from the beginning, a large portion of false negatives emerges over time, as the product under test is continuously revised.

Along with creating new tests and upgrading outdated ones, a critical component of your test automation approach should be to regularly assess the defect-detection performance of your existing test cases. This is especially true for tests that have been running smoothly and successfully ever since they were first created. Are you certain they still carry out the correct assertions? Or would they also pass if the application under test behaved incorrectly? The following is a sensible plan for continually evaluating whether your tests still deserve the confidence you put in them:

  1. When you develop your tests, test them. Depending on the type of test and assertion, this can be as easy as negating the assertion stated at the end of the test and checking that the test now fails (see the sketch after this list). When you employ test-driven development, you're automatically doing something similar, because your tests won't pass until you add actual production code.

  2. Regularly review your tests to ensure they still have their original defect-detection power and have not become redundant due to changes made to your application since the tests were developed. A passing test that is meaningless might look great in the statistics, but it adds maintenance work you can probably do without.
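Here is a minimal sketch of the first point, with hypothetical names: after writing a test, temporarily flip its expected value and confirm that the test fails before trusting its green result.

```python
# Step 1 in practice: prove that a new test can actually fail.
# Function and values are invented for the example.
def total_price(items):
    return sum(price for _, price in items)

def test_total_price():
    items = [("book", 12.50), ("pen", 1.50)]
    assert total_price(items) == 14.00
    # Sanity check while developing: change the expected value
    # (e.g. to 15.00) and run the test once. If it still passes,
    # you have created a false negative, not a safety net.
```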

I propose looking at mutation testing as a technique for establishing and maintaining a high-quality, robust unit test suite. Here's what it means:

Mutation testing (or mutation analysis or program mutation) is used to design new software tests and evaluate the quality of existing software tests. Mutation testing involves modifying a program in small ways. Each mutated version is called a mutant and tests detect and reject mutants by causing the behavior of the original version to differ from the mutant. This is called killing the mutant. Test suites are measured by the percentage of mutants that they kill. New tests can be designed to kill additional mutants. Mutants are based on well-defined mutation operators that either mimic typical programming errors (such as using the wrong operator or variable name) or force the creation of valuable tests (such as dividing each expression by zero). The purpose is to help the tester develop effective tests or locate weaknesses in the test data used for the program or in sections of the code that are seldom or never accessed during execution. Mutation testing is a form of white-box testing. — Wikipedia
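To make the concept tangible, here is a hand-rolled Python sketch of a single mutant and the boundary test that kills it; in practice, tools such as mutmut (for Python) or PIT (for Java) generate and run mutants automatically. The function names are invented for the example.

```python
# Hand-rolled illustration of mutation testing with invented names;
# real tools generate and execute mutants like this automatically.
def is_adult(age):
    return age >= 18      # original implementation

def is_adult_mutant(age):
    return age > 18       # mutant: >= replaced by >

def test_is_adult_boundary():
    # This boundary test "kills" the mutant: it passes against the
    # original but would fail against the mutated version.
    assert is_adult(18) is True

# A suite that only checked ages like 30 and 5 would let the mutant
# survive, exposing a gap in its defect-detection power.
```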

Although a mutation testing run can be time-consuming, especially for large systems, it may provide significant insight into the quality and defect-detection capability of your unit test suite. Building reliable automated tests requires programming skills, but it may be even more crucial to be able to determine whether to automate a test at all. Determining the most effective technique for automating a specific test also requires skill. In my experience, the simplest answer to a test automation problem is a clear plan: if the road I'm on doesn't lead to a straightforward answer, it is probably not the most efficient road either.

