ASPECT makes use of a large suite of tests to ensure correct behavior. The test suite is run automatically for every change to the GitHub repository, and it is good practice to add new tests for any new functionality.
In order to run the tests, it is necessary to have either diff or numdiff available to compare the results against the known good output. diff is installed by default on most Linux systems, and numdiff is usually available as a package, so this requirement is not a severe limitation. While it is possible to use diff, numdiff is preferred because it can more reliably determine whether a variation in numerical output is significant. The test suite is run using the ctest program that comes with cmake, and should therefore be available on all systems that have compiled ASPECT.
After running cmake and then compiling ASPECT, you can run the test suite by saying ctest. By default, this will only run a small subset of all tests, because setting up all tests (several hundred) and running them both take a non-trivial amount of time. To set up the full test suite, you can run
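the corresponding make target (the target name setup_tests used here reflects recent versions of ASPECT and may differ in older ones):

    make setup_tests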
in the build directory. To then run the entire set of tests, execute
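the same ctest command as above, which now picks up all of the newly configured tests:

    ctest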
Unless you have a very fast machine with lots of processors, running the entire test suite will take hours, though it can be made substantially faster if you use
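ctest's built-in support for running tests in parallel:

    ctest -j <N>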
where <N> is the number of tests you want ctest to run in parallel; you may want to choose <N> equal to, or slightly smaller than, the number of processors you have. Alternatively, you can run only a subset of all tests by saying
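something along the lines of

    ctest -R <regex>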
where <regex> is a regular expression; only those tests whose names match the expression will be run.
When ctest runs a test, it will ultimately output results of the form
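For example, a run might produce lines similar to the following (the test names, numbers, and timings shown here are purely illustrative):

        Start  42: box_simple
    1/3 Test  #42: box_simple .......................   Passed    1.42 sec
        Start  43: sol_cx_2
    2/3 Test  #43: sol_cx_2 .........................***Failed    2.31 sec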
While the small default subset of tests should work on almost all platforms, you will find that some of the tests fail on your machine when you run the entire test suite. This is because success or failure of a test is determined by checking whether its output matches, to the last digit, the output saved at the time the test was written, both for numerical output in floating point precision (e.g., heat fluxes or other quantities we compute via postprocessors) and for integers such as the number of iterations printed in the screen output.44 Unfortunately, systems almost always differ by compiler version, processor type and version, system libraries, etc., all of which can lead to small changes in output: generally (and hopefully!) not large enough to produce qualitatively different results, but quantitatively large enough to change the number of iterations necessary to reach a specific tolerance, or to change the computed heat flux by one part in a million. This leads to ctest reporting that a test failed, when in reality it produced output that is qualitatively correct.
The fact that some tests are expected to fail on any given system raises the question of why it makes sense to have tests at all. The answer is that there is one system on which all tests are supposed to succeed: a machine that churns through all tests every time someone proposes a change to the ASPECT code base via the ASPECT GitHub page.45 Upon completion of the test suite, both the general summary (pass/fail) and a full verbose log will be available from the GitHub page. Because the official test setup lives in a Docker container, it is simple to replicate the results on a local machine. To this end, follow the instructions in Section 3.1 to set up Docker, and then run the following command in any terminal (replace ASPECT_SOURCE_DIR with the path to your ASPECT directory):
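A sketch of this command is given below; the image name and tag (geodynamics/aspect-tester:latest) and the mount point inside the container (/home/dealii/aspect) are assumptions and should be checked against the configuration currently used by the official tester:

    # mount the local ASPECT source tree into the container and run the test script
    docker run --rm -it -v "ASPECT_SOURCE_DIR:/home/dealii/aspect" \
      geodynamics/aspect-tester:latest \
      bash /home/dealii/aspect/cmake/compile_and_update_tests.sh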
This command executes the shell script cmake/compile_and_update_tests.sh inside the Docker container that contains the official ASPECT test system. Note that by mounting your ASPECT folder into the container you are actually updating the reference test results on the host system (i.e., your computer).
To write a test for a new feature, take one of the existing parameter files in the tests/ directory, modify it to use the new feature, check that the new feature does what it is supposed to do, and then just add the parameter file to the tests directory. You will then need to add another folder to that directory that is named after the parameter file, and add the model output files that prove that the feature is working (rename log.txt to screen-output for historical reasons). The test and output files should be as small and quick to run as possible.
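As an illustration, the layout of a new test might look as follows (the name my_new_feature is hypothetical, and the exact set of output files depends on the postprocessors the test uses):

    tests/my_new_feature.prm     # parameter file that exercises the new feature
    tests/my_new_feature/        # folder named after the parameter file
        screen-output            #   the run's log.txt, renamed
        statistics               #   further output files documenting correct behavior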
When you make a new test part of a pull request on GitHub, then, as explained above, all tests (including your new one) will be run on a “reference machine”. The reference machine that runs the tests may of course produce slightly different results than the machine on which the pull request was developed and from which the output was taken. If this has been confirmed to be the source of a failed test run, a file that contains the differences between the patch content and the tester output will be available from the GitHub page and can be applied to the patch to make it pass the tester. On the other hand, if a change leads to even a single existing test failing on that system, then we know that some more investigation into the causes is necessary.
44This is not actually completely true. Rather, if cmake finds a program called numdiff on your system, it uses numdiff to compare the output of a test run against the saved output, and considers two files to be the same if all numbers differ by no more than some tolerance.
45This is again not completely true: The test machine will only go to work for pull requests by a set of trusted maintainers, since it executes code that is proposed as part of the pull request, which would pose a security risk if anyone's patches were allowed to run on that system. For pull requests by others, one of the trusted maintainers has to specifically request a test run, and this usually happens as soon as the patch developer indicates that the patch is ready for review.