Alois Mahdal 2defa7f607 Update TFKit to v0.0.16 7 years ago

TFKit

Installation

The easiest way is to embed TFKit within your repo, i.e. clone TFKit and install it using:

make install DESTDIR=/path/to/your/repo

Now you can run your test suite using the runtests executable:

$ cd /path/to/your/repo
$ utils/tfkit/runtests

Note that the above probably won't return any useful results as you still don't have any tests.

Writing tests

Tests can be written in any scripting language, although the built-in framework, written in Bash, provides some useful features for writing certain kinds of relatively simple tests.

The harness, though, assumes that:

  • Any direct sub-directory of the $TF_SUITE directory ("tests" by default) that contains at least a TF_RUN executable becomes a test,

  • the basename of this directory becomes the name of the test,

  • and the exit status of the executable is reported as the result of the test, according to the "Exit status" section below.
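
The discovery rules above can be sketched roughly as follows. This is an illustration only, not the actual harness.sh code: the function name list_tests is made up, and TF_RUN is assumed to be the literal file name of the executable.

```shell
#!/bin/sh
# Hypothetical sketch of test discovery; not TFKit's harness.sh.

# Print the name of every direct sub-directory of the suite directory
# (first argument, "tests" by default) that holds an executable TF_RUN.
list_tests() {
    suite=${1:-tests}
    for dir in "$suite"/*/; do
        # Skip directories without an executable TF_RUN file.
        [ -f "${dir}TF_RUN" ] && [ -x "${dir}TF_RUN" ] || continue
        # The basename of the directory is the name of the test.
        basename "$dir"
    done
}
```

Calling list_tests with no argument would look into ./tests, mirroring the default mentioned above.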

Naming

Test names should start with the name of the module being tested, followed by an underscore. If the module name contains dots, they should be replaced with underscores as well.

core_sanity
mod_submod_function
ini_iniread

are valid test names.
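
As a small illustration of the naming rule, the helper below builds a test name from a module name and a short label. The function test_name is hypothetical and not part of TFKit.

```shell
# Hypothetical helper: build a test name from a module name and a
# short label, replacing dots in the module name with underscores.
test_name() {
    module=$(printf '%s' "$1" | tr . _)
    printf '%s_%s\n' "$module" "$2"
}

test_name core sanity            # prints "core_sanity"
test_name mod.submod function    # prints "mod_submod_function"
```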

Data

Should the test need any data, just leave it around in the test directory along with TF_RUN.

Note that before running, the whole test directory is automatically copied to a temporary location (one per test) and, should the test fail, copied back as a debugging relic. For this reason, do not store huge amounts of data here. If you really need huge data, consider obtaining it (and throwing it away) while TF_RUN is running.
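
If a test needs bulky input, one option is to produce it while TF_RUN runs and drop it right afterwards. The following is a minimal sketch, assuming the data can be generated locally (here by awk); the helper name run_with_generated_data is made up for illustration.

```shell
# Hypothetical helper, not part of TFKit: feed generated data to a
# command on STDIN and delete the data when done.
run_with_generated_data() (
    data=$(mktemp) || exit 3       # 3 = Error, see "Exit status"
    trap 'rm -f "$data"' EXIT      # throw the data away on exit
    awk 'BEGIN { for (i = 1; i <= 100000; i++) print i }' >"$data"
    "$@" <"$data"
)

# Example: count the generated lines.
run_with_generated_data wc -l
```

The function body runs in a subshell, so the EXIT trap fires as soon as the command finishes and does not interfere with traps of the calling script.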

Exit status

We try hard to follow these semantics:

  • Zero means OK -- test has been run and passed.

  • One means Failure -- test has been run but failed (e.g. found a bug).

  • Two means Bailout -- test has decided not to run at all.

  • Three means Error -- an error was detected during execution, but the script was able to clean up properly.

  • Four means Panic -- there was some other error and the script was not able to clean up properly.

  • Anything else should indicate other uncaught errors, including those outside control of the program such as segfaults in the test code or test being SIGKILLed.

Note that the higher the value, the worse the situation it indicates. Thus, if a test is composed of several sub-tests, make sure to always exit with the highest value encountered (subtest.sh takes care of this).
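
For a TF_RUN that does not use subtest.sh, "exit with the highest value" can be sketched like this. The names worst and note_status are hypothetical.

```shell
# Hypothetical sketch: remember the highest ("worst") exit status seen.
worst=0
note_status() {
    [ "$1" -gt "$worst" ] && worst=$1
    return 0
}

note_status 0    # first sub-test passed
note_status 3    # second sub-test hit an error
note_status 1    # third sub-test failed
echo "$worst"    # prints 3
```

A real script would end with exit "$worst" so that the harness sees the worst result.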

See common.sh for functions and variables that help with handling exit statuses following these semantics.

Also see the Notes section for more details on exit statuses, including a cheat sheet and discussion.

Framework

harness.sh

This part is not intended to be used in tests; rather, it contains functions that help govern test discovery, preparation and execution as described in the previous sections. Feel free to poke around, of course.

subtest.sh

As the name suggests, this file defines a few functions to handle subtests in TF_RUN.

In order to make use of the subtest functionality, you need to define two functions yourself: tf_enum_subtests to enumerate the names of the subtests you want to run, and tf_do_subtest with the actual test implementation.

A minimal TF_RUN with two subtests, plus a third one that is enumerated only conditionally, could look like this:

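#!/bin/bash

. "$TF_DIR/include/subtest.sh"

# Print the name of each subtest to run, one per line.
tf_enum_subtests() {
    echo test1
    echo test2
    # test3 is enumerated only if the (placeholder) command succeeds.
    if something; then
        echo test3
    fi
}

# Run a single subtest; its name is passed as the only argument.
tf_do_subtest() {
    case $1 in
        test1)  myprog foo ;;
        test2)  myprog bar ;;
        test3)  myprog baz ;;
    esac
}

tf_do_subtests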

At the end, tf_do_subtests acts as a launcher of the actual test. In short, it will

  1. run tf_enum_subtests, taking each line as the name of a subtest; for each subtest:

    1. source TF_SETUP, if such a file is found,
    2. launch the tf_do_subtest function with the subtest name as the only argument,
    3. source TF_CLEANUP, if such a file is found,

  2. and finally, report the "worst" exit status encountered.

Note that subtest names need to be single words consisting only of the characters [a-zA-Z0-9_].
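
The launcher loop described above can be approximated by the sketch below. It is a simplified re-implementation for illustration, not the code from subtest.sh: the name sketch_do_subtests is made up, TF_SETUP and TF_CLEANUP are assumed to sit in the current directory, and the handling of setup and cleanup failures is left out.

```shell
# Hypothetical re-implementation for illustration; not TFKit's subtest.sh.
sketch_do_subtests() {
    worst=0
    for name in $(tf_enum_subtests); do
        # Source the optional setup file.
        [ -f TF_SETUP ] && . ./TF_SETUP
        # Run the subtest and remember its exit status.
        tf_do_subtest "$name" && es=0 || es=$?
        # Source the optional cleanup file.
        [ -f TF_CLEANUP ] && . ./TF_CLEANUP
        # Keep the highest exit status seen so far.
        [ "$es" -gt "$worst" ] && worst=$es
    done
    return "$worst"
}

# Two toy subtests: the first passes, the second fails.
tf_enum_subtests() { echo test1; echo test2; }
tf_do_subtest() {
    case "$1" in
        test1) true ;;
        test2) false ;;
    esac
}

sketch_do_subtests && result=0 || result=$?
echo "worst exit status: $result"    # prints "worst exit status: 1"
```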

tools.sh

This file contains various tools and utilities to help with testing.

Currently there is only one function, tf_testflt, designed to help write tests for simple Unix filters.

tf_testflt

The idea is that the tester specifies

  • test name,
  • command to launch the system under test,
  • a data stream to use as STDIN,
  • and expected STDOUT, STDERR, and exit status.

and tf_testflt launches the command, collects the data, evaluates the result and reports it using a unified diff.

In its simplest form:

tf_testflt -n foo my_command arg

the function will run my_command arg (not piping anything to it), and will expect it to finish with exit status 0 and with both STDOUT and STDERR empty.

An example of the full form,

tf_testflt -n foo -i foo.in -O foo.stdout -E foo.stderr -S 2 myprog

will pipe foo.in into myprog, expecting an exit status of 2, STDOUT matching foo.stdout and STDERR matching foo.stderr. Notice that parameters specifying expected values are uppercase, and those specifying input values are lowercase.

Specifying the name is mandatory, because it is used in report messages and as the basis for naming temporary result files: these are saved in the results subdirectory and kept for further reference.
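
To make the idea more concrete, here is a rough, self-contained approximation of what such a filter test does. It is not the implementation of tf_testflt: the function sketch_testflt and its positional arguments are made up for this example, and unlike tf_testflt it does not keep the result files.

```shell
# Hypothetical sketch of a filter test; not TFKit's tf_testflt.
# usage: sketch_testflt NAME STDIN EXP_STDOUT EXP_STDERR EXP_STATUS CMD [ARG...]
sketch_testflt() {
    name=$1 input=$2 exp_out=$3 exp_err=$4 exp_es=$5
    shift 5
    tmp=$(mktemp -d) || return 3
    # Run the command, capturing both output streams and the exit status.
    "$@" <"$input" >"$tmp/out" 2>"$tmp/err" && es=0 || es=$?
    result=0
    # Any difference is shown as a unified diff and fails the test.
    diff -u "$exp_out" "$tmp/out" || result=1
    diff -u "$exp_err" "$tmp/err" || result=1
    if [ "$es" -ne "$exp_es" ]; then
        echo "$name: exit status $es, expected $exp_es" >&2
        result=1
    fi
    rm -rf "$tmp"
    return "$result"
}

# Example: sort must turn "b a" into "a b", silently, with status 0.
work=$(mktemp -d)
printf 'b\na\n' >"$work/in"
printf 'a\nb\n' >"$work/out"
: >"$work/err"
sketch_testflt sort_basic "$work/in" "$work/out" "$work/err" 0 sort && echo PASS
rm -rf "$work"
```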

common.sh

This includes simple functions and variables shared between both libraries mentioned above.

The first group is designed to help support the exit status semantics:

  • The functions are tf_exit_pass, tf_exit_fail, tf_exit_bailout, tf_exit_error and tf_exit_panic; each takes any number of parameters, which are printed on STDERR.

  • The variables are TF_ES_OK, TF_ES_FAIL, TF_ES_BAILOUT, TF_ES_ERROR and TF_ES_PANIC and are meant to be used with the return builtin, e.g. to return from tf_exit_error.

The second group helps to control output: the functions tf_warn, tf_debug and tf_think print messages on STDERR. The purpose of tf_warn and tf_debug is self-explanatory; the latter is muted if TF_DEBUG is set to false (set it to true to turn on debugging).

tf_think is used for progress info, and is muted unless TF_VERBOSE is set to true, which it is by default.
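
The muting behavior of the output functions can be pictured with the stand-ins below. These are hypothetical functions with a sketch_ prefix, not the ones from common.sh; the "debug: " prefix and the default of false for TF_DEBUG are assumptions.

```shell
# Hypothetical stand-ins; TFKit's common.sh may differ in details.
TF_DEBUG=${TF_DEBUG:-false}     # assumed default
TF_VERBOSE=${TF_VERBOSE:-true}  # default stated above

# Always print the arguments on STDERR, one per line.
sketch_warn() {
    printf '%s\n' "$@" >&2
}

# Print on STDERR only if TF_DEBUG is true.
sketch_debug() {
    [ "$TF_DEBUG" = true ] || return 0
    printf 'debug: %s\n' "$@" >&2
}

# Print on STDERR only if TF_VERBOSE is true.
sketch_think() {
    [ "$TF_VERBOSE" = true ] || return 0
    printf '%s\n' "$@" >&2
}
```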

Setup and cleanup

Special files TF_SETUP and TF_CLEANUP (one of them or both) can be added along with TF_RUN. TF_SETUP is sourced before every subtest and TF_CLEANUP after it.

First, if either of these files is missing, the respective phase is considered successful. Second, if the setup phase fails, the test is skipped and the subtest exit status is TF_ES_BAILOUT. Last, if cleanup fails (no matter the result of setup), the subtest aborts and TF_ES_PANIC is returned. Be aware that in this case the actual test status, albeit useful, is lost.

When coming from other test frameworks, this may feel harsh, but it has been designed with the idea that a failed cleanup may render all further tests unsafe, because the environment is not as expected.
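
The three rules can be condensed into a small decision function. This is a sketch of the described behavior, not code taken from TFKit; the function subtest_status is hypothetical, and the numeric values 2 and 4 are the Bailout and Panic statuses from the "Exit status" section.

```shell
# Hypothetical sketch: combine the results of the three phases of one
# subtest into the exit status described above (2 = bailout, 4 = panic).
# A missing TF_SETUP or TF_CLEANUP counts as status 0 for that phase.
# usage: subtest_status SETUP_ES TEST_ES CLEANUP_ES
subtest_status() {
    setup_es=$1 test_es=$2 cleanup_es=$3
    if [ "$cleanup_es" -ne 0 ]; then
        echo 4              # cleanup failed: panic, test status is lost
    elif [ "$setup_es" -ne 0 ]; then
        echo 2              # setup failed: test skipped, bailout
    else
        echo "$test_es"     # both phases fine: report the test itself
    fi
}

subtest_status 0 1 0    # prints 1
subtest_status 1 0 0    # prints 2
subtest_status 1 0 1    # prints 4
subtest_status 0 0 1    # prints 4
```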

To cope with this behavior, bear in mind the following advice:

  1. Make sure you write setup/cleanup procedures with extreme care and test them well.

  2. Do not do complicated and risky things in the setup/cleanup phases.

  3. If you need to do such things, consider doing them in the TF_RUN instead of doing them for all subtests.

  4. You don't need to clean up everything; the contents of the testing directory will be moved out of the test system.

  5. If there are scenarios you can safely fix or ignore, handle them in a robust manner.
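
As an example of the last point, a cleanup step that tolerates an already clean environment might look like the sketch below. The scratch path is made up for this example.

```shell
# Hypothetical TF_CLEANUP content. The scratch path is invented for
# this example.
scratch="${TMPDIR:-/tmp}/tfkit_sketch.$$"

# "rm -f" succeeds whether or not the subtest created the file, so an
# already clean environment is not reported as a failed cleanup.
rm -f "$scratch"
```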

Notes

bailout vs. tf_enum_subtests

One more note to clarify the relation between bailout and tf_enum_subtests. As you may have noticed, there are two ways to skip a test: return prematurely with TF_ES_BAILOUT, or suppress enumeration in tf_enum_subtests. The problem is that the latter does nothing to inform the layers higher up the stack that a test has been skipped, which seems to break the principle described in the previous sections.

Don't confuse these mechanisms, though. Each is supposed to be used for a distinct purpose. Compare: by omitting a test from tf_enum_subtests you are saying that you did not even want to run the test in the first place. By using TF_ES_BAILOUT, you are saying that you wanted to run the test but could not.

A few common cases, in case that helps:

  • If during the test you find out that for some reason it can't be carried out (e.g. an external resource is not available, or something outside the SUT is broken), use TF_ES_BAILOUT.

    tf_enum_subtests() {
        echo test1
        echo test2
        echo test3
    }
    
    tf_do_subtest() {
        case $1 in
            test1) do_stuff  ;;
            test2) do_other_stuff ;;
            test3) curl -s http://www.example.com/ >file \
                    || return $TF_ES_BAILOUT
                   do_stuff_with file ;;
        esac
    }
    
  • If you want to filter out some sub-tests for some platforms, e.g. a test for only 64-bit architectures, or a test only for Mac OS (in other words, you can safely say that running this sub-test would be totally pointless on this box), use tf_enum_subtests: just omit this test from the enumeration.

    tf_enum_subtests() {
        echo test1
        echo test2
        if this_is_macos_x; then
            echo test3
        fi
    }
    
  • If you want to disable (comment out) a test that is not implemented yet or is broken (and for some reason you still want it to haunt the test code), or that cannot run because something else outside the SUT is broken, use tf_enum_subtests and properly comment the reasons in the code.

    tf_enum_subtests() {
        echo test1
        echo test2
    #   echo test3      #FIXME: implement after bz1234
    }
    
  • If in doubt, use TF_ES_BAILOUT.

On exit statuses: three and above

The difference between error, panic and higher values is subtle but important. Follow me as I try to explain:

  1. If the script has changed something on the system outside the working directory, it is expected to revert that change.

  2. Now if an error occurs, but the code responsible for cleaning up runs safely, you can say that there was an error but the script has recovered.

  3. But if the change can't be reverted safely, we know that we have broken something and later code may lead to weird results (including masking bugs(!)), so it's time to panic (in the code, not in real life ;))

  4. And then there are corner cases like a bug in the script, an OOM kill or a timeout, when the status will be different and not really controlled by the script. Such cases will have to be treated the same way as the "panic" case, but...

  5. the use of panic adds a hint that the status has been set consciously by the script, albeit while exiting "in a hurry", without proper cleanup.

Unfortunately there will be cases like the above but with an exit status lower than four. Examples are a Bash script syntax error, which returns 2, or a Python exception, which returns 1. Yes, in such cases the information conveyed by the exit status is wrong and you should do everything to avoid it.

Possibilities like "test has passed but then something blew up" exist, but conveying this information is the responsibility of the test output.

The following table can be used as a cheat sheet:

.---------------------------------------------------------------.
| e |    state of         |                                     |
| s |---------------------| script says                         |
|   | SUT   | environment |                                     |
|---|-------|-------------|-------------------------------------|
| 0 | OK    | safe        | test passed, everything worked fine |
| 1 | buggy | safe        | test failed, everything worked fine |
| 2 | ???   | safe        | I decided not to run the test       |
| 3 | ???   | safe        | Something blew up but I managed to  |
|   |       |             | clean up (I promise!)               |
| 4 | ???   | broken      | Something blew up and I rushed out  |
|   |       |             | in panic                            |
| * | ???   | broken      | ...nothing (is dead)                |
'---------------------------------------------------------------'

As you can see, following these semantics allows us to see both the state of the system under test (SUT) and the state of the environment.

The following table illustrates how different statuses map to different scenarios with regard to the test result as well as the state of the environment:

.--------------------------------------------------.
| environment |  test result   |  test result      |
|             | pass fail unkn | pass fail unkn    |
|-------------|----------------|-------------------|
| clean(ed)   |  0    1    3   |  OK  FAIL ERROR   |
| untouched   |  ~    ~    2   |  ~    ~   BAILOUT |
| mess        |  ~    ~    4   |  ~    ~   PANIC   |
| ?! (trap)   |  ~    ~    5   |  ~    ~   ~       |
| ?! (sig 9)  |  ~    ~    137 |  ~    ~   ~       |
| ?! (aliens) |  ~    ~    ?   |  ~    ~   ~       |
'-------------|----------------|-------------------|
              |  exit status   |  human-readable   |
              |                |  name (TF_ES_*)   |
              '------------------------------------'
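
For reporting purposes, the mapping from exit status to human-readable name shown in the tables can be expressed as a small lookup. The function es_name is hypothetical, and so is the UNKNOWN label used for statuses that have no TF_ES_* name.

```shell
# Hypothetical helper: translate an exit status into its TF_ES_* name.
es_name() {
    case "$1" in
        0) echo OK ;;
        1) echo FAIL ;;
        2) echo BAILOUT ;;
        3) echo ERROR ;;
        4) echo PANIC ;;
        *) echo UNKNOWN ;;   # e.g. 137 after SIGKILL; no TF_ES_* name
    esac
}

es_name 3      # prints ERROR
es_name 137    # prints UNKNOWN
```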