Mutation Testing in Patterns¶
Mutation testing is a technique used to evaluate the quality of existing software
tests. Mutation testing involves modifying a program in small ways, for example
True constants with
False and re-running its test suite.
When the test suite fails the mutant is killed. This tells us how good the
test suite is. The goal of this paper is to describe different software and
testing patterns related using practical examples.
Some of them are language specific so please see the relevant sections for information about installing and running the necessary tools and examples.
Mutation testing tools¶
This is a list of mutation testing tools which are under active use and maintenance from the community:
For LLVM-based languages such as C, C++, Rust and Objective-C checkout [mull](https://github.com/mull-project/mull), which looks like a nice project but may not be ready for production use!
Make sure your tools work¶
Mutation testing relies on dynamically modifying program modules and loading the mutated instance from memory. Depending on the language specifics there may be several ways to refer to the same module. In Python the following are equivalent
import sandwich.ham.ham obj = sandwich.ham.ham.SomeClass() from sandwich.ham import ham obj = ham.SomeClass()
The equivalency here is in terms of having access to the same module API.
When we mutation test the right-most
ham module our tools may not
be able to resolve to the same module if various importing styles are used.
For example see Mutant not killed due to module import issue.
Another possible issue is with programs that load modules dynamically or change the module search path at runtime. Depending on how the mutation testing tool works these operations may interfere with it. For example see Mutant not killed when dynamically importing module.
Make sure your tests work¶
Mutation testing relies on the fact that your test suite will fail when a mutation is introduced. In turn any kind of failure will kill the mutant! The mutation test tool has no way of knowing whether your test suite failed because the mutant tripped one of the assertions or whether it failed due to other reasons.
Make sure your test suite is robust and doesn’t randomly fail due to external factors! For example see Mutant killed due to flaky test.
Divide and conquer¶
The basic mutation test algorithm is this
for operator in mutation-operators: for site in operator.sites(code): operator.mutate(site) run_tests()
- mutation-operators are the things that make small changes to your code
- operator.sites are the places in your code where operators can be applied
As you can see mutation testing is a very expensive operation. For example the pykickstart project started with 5523 possible mutations and 347 tests, which took on average 100 seconds to execute. A full mutation testing execution needs more than 6 days to complete!
In practice however not all tests are related to, or even make use of all program modules. This means that mutated operators are only tested via subset of the entire test suite. This fact can be used to reduce execution time by scheduling mutation tests against each individual file/module using only the tests which are related to it. The best case scenario is when your source file names map directly to test file names.
For example something like this
for f in `find ./src -type f -name "*.py" | sort`; do TEST_NAME="tests/$f" runTests $f $TEST_NAME done
Where runTests executes the mutation testing tool against a single file and executes only the test which is related to this file. For pykickstart this approach reduced the entire execution time to little over 6 hours!
Other tools and languages may have a convention of how tests are organized or which tests are executed by the mutation testing tool. For example in Ruby the convention is to have all tests under spec/*_spec.rb which maps with the idea proposed above. Mutant, the Ruby mutation testing tool, uses this convention to find the tests it needs. For Python, on the other hand, the user needs to manually specify which tests should be executed!
Mutation testing relies on your test suite failing when it detects a faulty mutation. It doesn’t matter which particular test has failed because most of the tools have no way of telling whether or not the failed test is related to the mutated code. That means it also doesn’t matter if there are more than one failing tests so you can use this to your advantage.
Whenever your test tools and framework support the fail fast option make use of it to reduce test execution time even more!
Refactor comparison to empty string¶
Comparison operators may be mutated with each other which gives, depending on the langauge, about 10 possible mutations.
S is not an empty string the following 3 variants
are evaluated to
if S != ""
if S > ""
if S not in ""
The existing test cases pass and these mutations are never killed. In languages like Python, non-empty sequences are evaluated to True in boolean context and you don’t need to use comparisons. This reduces the number of possible mutations.
For Python you may use the emptystring extension of pylint
pylint a.py --load-plugins=pylint.extensions.emptystring
In some cases empty string is an acceptable value and refactoring will change the behavior of the program! Be careful when doing this.
Refactor comparison to zero¶
This is similar to the previous section but for integer values. For Python use the comparetozero extension to detect possible offenses.
pylint a.py --load-plugins=pylint.extensions.comparetozero
See pylint #1243 for more info.
Python: Refactor len(X) comparisons to zero¶
X is not an empty sequence the following variants
are evaluated to
True and result in surviving mutants:
if len(X) != 0
if len(X) > 0
Additionally if we don’t have a test to validate the
for example that it raises an exception, then the following mutation
will also survive:
if len(X) < 0
Refactoring this to
if X: do_something()
is the best way to go about it. This also reduces the total number of possible mutations. A more complicated example, using two lists and boolean operation can be seen below.
- if len(self.disabled) == 0 and len(self.enabled) == 0: + if not (self.disabled or self.enabled):
Consider the following example
# All the port:proto strings go into a comma-separated list. portstr = ",".join(filteredPorts) if len(portstr) > 0: portstr = " --port=" + portstr else: portstr = ""
Similar to previous examples the
len() > 0 expression can be refactored.
Since joining an empty list will produce an empty string the
is not necessary. The example can be re-written as
# All the port:proto strings go into a comma-separated list. portstr = ",".join(filteredPorts) if portstr: portstr = " --port=" + portstr
In pylint 2.0 there is a new checker called len-as-condition which will warn you about code snippets that compare the result of a len() call to zero. For more information see pylint #1154.
For practical example see Refactor if len(list) != 0.
Python: Refactor if len(list) == 1¶
The following code
if len(ns.password) == 1: self.password = ns.password else: self.password = ""
can be refactored into this
if ns.password: self.password = ns.password else: self.password = ""
This refactoring may have side effects when the list length is greater than 1, e.g. 2. Depending on your program this may ot may-not be the case.
Testing for X != 1¶
When testing the not equals condition we need at least 3 test cases:
- Test with value smaller than the condition
- Test with value that equals the condition
- Test with value greater than the condition
Most often we do test with value that equals the condition (the golden scenario) and either one of the other bordering values but not both. This leads to mutations which are not killed.
Example Testing for X != 1.
Python: Refactor if X is None¶
When X has a value of None the following mutations are equivalent are will survive:
if X is None:
if X == None:
in addition static analyzers may report comparison to None
as an offence. To handle this refactor
if X is None: to
if not X: when possible.
For example see Refactor if X is None.
Python: Refactor if X is not None¶
This is the opposite of the previous section.
if X is not None: to
For example see Refactor if X is not None.
Python: Testing __eq__ and __ne__¶
When objects are compared by comparing their attributes then full mutation test coverage can be achieved by comparing the object to itself, comparing to None, comparing two objects with the same attribute values and then test by changing the attributes one by one.
For example see Testing __eq__ & __ne__.
Consider if there is the following mistake in the example:
def __eq__(self, other): if not y: return False return self.device and self.device == y.device
Notice the redundant self.device and in the expression above! When self.device contains a value (string in this case) the expression is equivalent to self.device == other.device. On the other hand when self.device is None or an empty string the expression will always return False!
If we have all of the above tests (which mutation testing has identified) then our test suite will fail and properly detect the defect
$ python -m nose -- tests.py F..... ====================================================================== FAIL: Newly created objects with the same attribute values ---------------------------------------------------------------------- Traceback (most recent call last): File "~/example_07/tests.py", line 15, in test_default_objects_are_always_equal self.assertEqual(self.sandwich_1, self.sandwich_2) AssertionError: <sandwich.Sandwich object at 0x7f4603cece80> != <sandwich.Sandwich object at 0x7f4603ceceb8> ---------------------------------------------------------------------- Ran 6 tests in 0.001s FAILED (failures=1)
Python: Testing sequence of if == int¶
To completely test the following pattern
if X == int_1: pass elif X == int_2: pass elif X == int_3: pass
you need to test with all descrete values plus values outside the allowed set. For example see Testing sequence of if == int statements
Python: Testing sequence of if == string¶
To fully test the following pattern
if X == "string_1": pass elif X == "string_2": pass elif X == "string_3": pass
you need to test with all possible string values as well as with values outside the allowed set. For example see Testing sequence of if == string statements.
Python: Missing or extra parameters¶
Depending on how your method signature is defined it is possible to either accept additional parameters which are not needed or forget to pass along parameters which control internal behavior. Mutation testing helps you identify those cases and adjust your code accordingly.
For example see Missing or extra method parameters.
Python: Testing for 0 <= X < 100¶
When testing numerical ranges we need at least 4 tests:
- Test with both border values
- Test with values outside the range, ideally +1/-1
- Testing with a value in the middle of the range is not required for full mutation coverage!
For example see Testing for 0 <= X <= 100.
Python: On boolean expressions¶
When dealing with non-trivial boolean expressions mutation testing often helps put things into perspective. It causes you to rethink the expression which often leads to refactoring and killing mutants. For example see Testing and refactoring boolean expressions.
Refactor multiple boolean expressions¶
Consider the following code where the expression left of
is always the same
if name == "method": self._clear_seen() if name == "method" and value == "cdrom": setattr(self.handler.cdrom, "seen", True) elif name == "method" and value == "harddrive": setattr(self.handler.harddrive, "seen", True) elif name == "method" and value == "nfs": setattr(self.handler.nfs, "seen", True) elif name == "method" and value == "url": setattr(self.handler.url, "seen", True)
This can easily be refactored by removing the
name == "method" expression and making the subsequent if
statements nested under the first one.
if name == "method": self._clear_seen() if value == "cdrom": setattr(self.handler.cdrom, "seen", True) elif value == "harddrive": setattr(self.handler.harddrive, "seen", True) elif value == "nfs": setattr(self.handler.nfs, "seen", True) elif value == "url": setattr(self.handler.url, "seen", True)
The refactored code is shorter and provides less mutation sites thus reducing overall mutation test execution time. This code can be refactored even more aggressively into
if name == "method": self._clear_seen() if value in ["cdrom", "harddrive", "nfs", "url"]: setattr(getattr(self.handler, value), "seen", True)