TFKit
=====


Installation
------------

The easiest way is to embed TFKit within your repo, i.e. clone TFKit and
install it using:

    make install DESTDIR=/path/to/your/repo

Now you can run your test suite using the *runtests* binary:

    $ cd /path/to/your/repo
    $ utils/tfkit/runtests

Note that the above probably won't return any useful results as you still
don't have any tests.


Writing tests
-------------

Tests can be written in any scripting language, although the built-in
framework, written in Bash, provides some useful features for writing
certain kinds of relatively simple tests.

The harness, though, assumes that:

 *  Any direct sub-directory of the `$TF_SUITE` directory ("tests" by
    default) that contains at least a *TF_RUN* executable becomes a test,

 *  the basename of this directory becomes the name of the test,

 *  and the return code from running the executable is reported
    as the result of the test, according to the "Exit status" section below.
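
As a minimal sketch of these conventions, the helper below creates a
test directory holding a trivial *TF_RUN*.  The helper name
`make_sample_test` is made up for this illustration and is not part of
TFKit; only the directory layout and the *TF_RUN* file name come from
the rules above.

```bash
# Hypothetical helper (not part of TFKit): create SUITE/NAME/TF_RUN
# containing a test that always passes.
make_sample_test() {
    local suite=$1 name=$2
    mkdir -p "$suite/$name" || return 1
    cat >"$suite/$name/TF_RUN" <<'EOF'
#!/bin/bash
# Trivial test body: exit status 0 is reported as OK.
exit 0
EOF
    chmod +x "$suite/$name/TF_RUN"
}
```

Calling `make_sample_test tests core_sanity` from the root of your repo
should give the harness one test named `core_sanity` to discover.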


Naming
------

A test name should start with the name of the module being tested,
followed by an underscore.  If the module name contains dots, they
should be replaced with underscores as well.

    core_sanity
    mod_submod_function
    ini_iniread

are valid test names.
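
If you generate test directories from a script, the rule can be
expressed in a few lines.  This is only a sketch; the function name
`mk_test_name` is an assumption made for the example and does not exist
in TFKit.

```bash
# Hypothetical helper: build a test name from a module name and a suffix.
# Dots in the module name are replaced with underscores.
mk_test_name() {
    local module=$1 suffix=$2
    printf '%s_%s\n' "$(printf '%s' "$module" | tr . _)" "$suffix"
}
```

For example, `mk_test_name mod.submod function` prints
`mod_submod_function`.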


Data
----

Should the test need any data, just leave it around in the test directory
along with *TF_RUN*.

Note that before running, the whole test directory is automatically
copied to a temporary location (one per test), and should the test fail,
copied back as a debugging relic.  For this reason, *do not store
huge amounts of data here*.  If you really need huge data, consider
obtaining it (and throwing it away) within runtime of *TF_RUN*.
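
One possible way to follow this advice is sketched below: the data is
generated into a temporary file outside the test directory, fed to the
command, and removed again.  The helper name `with_generated_data` and
the use of zero-filled data are assumptions made for the example; a real
test would more likely download or build its own input.

```bash
# Hypothetical helper: run a command with KIB kibibytes of generated
# data on STDIN, without ever storing the data in the test directory.
with_generated_data() {
    local kib=$1 tmp es
    shift
    tmp=$(mktemp) || return 3       # 3 = Error (see "Exit status")
    dd if=/dev/zero of="$tmp" bs=1024 count="$kib" 2>/dev/null
    "$@" <"$tmp"
    es=$?
    rm -f "$tmp"                    # throw the data away again
    return $es
}
```

Inside *TF_RUN* this could be used as `with_generated_data 4096 myprog`,
where `myprog` stands for whatever you are testing.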


Exit status
-----------

We try hard to follow this semantic:

 *  *Zero* means *OK* -- test has been run and passed.

 *  *One* means *Failure* -- test has been run but failed (e.g. found
     a bug).

 *  *Two* means *Bailout* --  test has decided not to run at all.

 *  *Three* means *Error* -- an error was detected during execution,
     but the script was able to clean up properly.

 *  *Four* means *Panic* -- there was another error, but the script
     *was not* able to clean up properly.

 *  Anything else should indicate other uncaught errors, including those
    outside control of the program such as segfaults in the test code
    or test being SIGKILLed.

Notice that the higher the value, the worse the situation it indicates.
Thus, if a test is composed of several sub-tests, you need to make sure
to always **exit with the highest value** (subtest.sh does take care
of this).
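
If you combine sub-results by hand, without subtest.sh, the "highest
value wins" rule could be sketched like this.  The function name
`worst_status` is hypothetical and is not provided by TFKit.

```bash
# Hypothetical helper: print the highest ("worst") of the given exit
# statuses; with no arguments it prints 0.
worst_status() {
    local worst=0 es
    for es in "$@"; do
        [ "$es" -gt "$worst" ] && worst=$es
    done
    echo "$worst"
}
```

A *TF_RUN* that collected the statuses 0, 3 and 1 from its parts would
then end with `exit "$(worst_status 0 3 1)"`, that is, with status 3.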

See *common.sh* for functions and variables to help with handling exit
statuses with this semantic.

Also see the Notes section for more details on exit statuses, including
a cheat sheet and discussion.


Framework
---------


### harness.sh ###

This part is not intended to be used in tests, but rather contains
functions that help govern test discovery, preparation and execution as
described in the previous sections.  Feel free to poke around, of course.


### subtest.sh ###

As the name suggests, this file defines a few functions to handle
subtests in *TF_RUN*.

In order to make use of the subtests functionality, you will need to
define two functions yourself:  `tf_enum_subtests` to enumerate the names
of the subtests you want to run, and `tf_do_subtest` with the actual test
implementation.

A minimal *TF_RUN* with two subtests, plus a third one that is enumerated
only if `something` succeeds, could look like this:

    #!/bin/bash

    . $TF_DIR/include/subtest.sh

    tf_enum_subtests() {
        echo test1
        echo test2
        something && echo test3
    }

    tf_do_subtest() {
        case $1 in
            test1)  myprog foo ;;
            test2)  myprog bar ;;
            test3)  myprog baz ;;
        esac
    }

    tf_do_subtests

At the end, `tf_do_subtests` acts as a launcher of the actual test.
In short, it will

 1. run `tf_enum_subtests`, taking each line as name of a subtest;
    for each subtest:

     1. source *TF_SETUP*, if such file is found,
     2. launch the `tf_do_subtest()` function with subtest name as
        the only argument,
     3. source *TF_CLEANUP*, if such file is found,

 2. and finally, report "worst" exit status encountered.

Note that subtest names need to be single words (`[a-zA-Z0-9_]`).
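
If your `tf_enum_subtests` builds names dynamically, you may want to
check them before echoing them.  The sketch below is one way to do so;
`is_valid_subtest_name` is a made-up name and TFKit itself may or may
not perform a similar check.

```bash
# Hypothetical helper: succeed only if the name is a non-empty string
# made of the characters [a-zA-Z0-9_].
is_valid_subtest_name() {
    case $1 in
        ''|*[!a-zA-Z0-9_]*) return 1 ;;
        *)                  return 0 ;;
    esac
}
```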


### tools.sh ###

This file contains various tools and utilities to help with testing.

Currently there is only one function, `tf_testflt`, designed to help
write tests for simple Unix filters.


#### tf_testflt ####

The idea is that the tester specifies

 *  test name,
 *  command to launch the system under test,
 *  a data stream to use as STDIN,
 *  and expected STDOUT, STDERR, and exit status.

and `tf_testflt` launches the command, collects the data, and evaluates
and reports the result using a unified diff.

In its simplest form:

    tf_testflt -n foo my_command arg

the function will run `my_command arg` (not piping anything to it),
and will expect it to finish with exit status 0 and with both STDOUT
and STDERR empty.

An example of the full form,

    tf_testflt -n foo -i foo.in -O foo.stdout -E foo.stderr -S 2 myprog

will pipe foo.in into `myprog`, expecting an exit status of 2, and STDOUT
and STDERR matching the contents of foo.stdout and foo.stderr
respectively.  Notice that parameters specifying expected values are
uppercase, and those specifying input values are lowercase.

Specifying the name is mandatory, because it's used in reporting
messages, and as a basis for naming temporary result files: these are
saved in the *results* subdirectory and kept for further reference.
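
To make the idea more concrete, here is a rough sketch of what such
a filter check involves.  It is *not* the real `tf_testflt`: the
function name `check_filter`, its positional arguments and its use of
temporary files are assumptions made purely for illustration.

```bash
# Illustrative sketch only, not the real tf_testflt.
# usage: check_filter STDIN_FILE EXP_STDOUT EXP_STDERR EXP_STATUS CMD [ARG...]
check_filter() {
    local input=$1 exp_out=$2 exp_err=$3 exp_es=$4 out err es result=0
    shift 4
    out=$(mktemp) && err=$(mktemp) || return 3   # 3 = Error
    "$@" <"$input" >"$out" 2>"$err"
    es=$?
    # Report any difference as a unified diff on STDERR.
    diff -u "$exp_out" "$out" >&2 || result=1
    diff -u "$exp_err" "$err" >&2 || result=1
    [ "$es" -eq "$exp_es" ] || result=1
    rm -f "$out" "$err"
    return $result                               # 0 = OK, 1 = Failure
}
```

Unlike this sketch, `tf_testflt` keeps the collected results for later
reference, as described above.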


### common.sh ###

This file includes simple functions and variables shared by the
libraries mentioned above.

The first group is designed to help support the exit status semantic:

 *  The functions are `tf_exit_pass`, `tf_exit_fail`, `tf_exit_bailout`,
    `tf_exit_error` and `tf_exit_panic`; each takes any number of
    parameters, which are printed on STDERR.

 *  The variables are `TF_ES_OK`, `TF_ES_FAIL`, `TF_ES_BAILOUT`,
    `TF_ES_ERROR` and `TF_ES_PANIC`; they are meant to be used with
    the `return` builtin, e.g. to return from `tf_do_subtest`.
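
As a small sketch of how the variables might be used with `return`,
consider the check below.  The function `check_contains` is
hypothetical, and the fallback assignments exist only so that the
snippet runs on its own; their values are taken from the "Exit status"
section, and with TFKit the variables come from common.sh.

```bash
# Fallback values so that this sketch runs stand-alone; with TFKit,
# common.sh provides these variables.
: "${TF_ES_OK:=0}" "${TF_ES_FAIL:=1}" "${TF_ES_ERROR:=3}"

# Hypothetical check: does FILE contain a line matching PATTERN?
check_contains() {
    local file=$1 pattern=$2
    [ -r "$file" ] || return "$TF_ES_ERROR"     # cannot even look
    grep -q -- "$pattern" "$file" || return "$TF_ES_FAIL"
    return "$TF_ES_OK"
}
```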

The second group is useful to better control output:  the functions
`tf_warn`, `tf_debug` and `tf_think` print to STDERR.  The use of
`tf_warn` is self-explanatory, as is that of `tf_debug`; the latter is
muted if `TF_DEBUG` is set to `false` (set it to `true` to turn on
debugging).

`tf_think` is used for progress info, and is muted unless `TF_VERBOSE`
is set to `true`, which is the default.


### Setup and cleanup ###

Special files *TF_SETUP* and *TF_CLEANUP* (one of them or both) can be
added along with *TF_RUN*.  These will be sourced before (*TF_SETUP*)
and after (*TF_CLEANUP*) every subtest.

First, if either of these files is missing, the respective phase is
considered to have succeeded.  Second, if the setup phase fails, the test
will be skipped and the subtest exit status will be *TF_ES_BAILOUT*.
Last, if cleanup fails (no matter the result of setup), the subtest
aborts with *TF_ES_PANIC* returned.  Be aware that in this case the
actual test status, albeit useful, is lost.
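
These three rules can be summarized as a small decision function.
This is only a sketch of the rules as described here, not code taken
from subtest.sh; the name `resolve_subtest_status` is made up, and the
numeric values 2 and 4 are the Bailout and Panic statuses from the
"Exit status" section.

```bash
# Illustrative only: derive the status of one subtest from the exit
# statuses of its setup, test and cleanup phases.  A missing setup or
# cleanup file counts as status 0 for that phase.
resolve_subtest_status() {
    local setup_es=$1 test_es=$2 cleanup_es=$3
    if [ "$cleanup_es" -ne 0 ]; then
        echo 4              # Panic: cleanup failed, test status is lost
    elif [ "$setup_es" -ne 0 ]; then
        echo 2              # Bailout: setup failed, test was skipped
    else
        echo "$test_es"     # both phases fine: the test status stands
    fi
}
```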

When coming from other test frameworks, this may feel harsh, but note
that this has been designed with the idea that if a cleanup fails,
it may render all further tests unsafe, because the environment is not
as expected.

To cope with this behavior, try to bear in mind the following advice:

 1. Make sure you write setup/cleanup procedures with extreme care and
    test them well.

 2. Do not do complicated and risky things in the setup/cleanup phases.

 3. If you need to do such things, consider doing them in the *TF_RUN*
    instead of doing them for all subtests.

 4. You don't need to clean up everything; the contents of the testing dir
    will be moved out from the test system.

 5. If there are scenarios you can safely fix or ignore, handle them in
    a robust manner.


Notes
-----


### bailout vs. `tf_enum_subtests` ###

One more note to clarify the relation between bailout and
`tf_enum_subtests`.  As you may have noticed, there are two ways to skip
a test: return prematurely with `TF_ES_BAILOUT`, or suppress enumeration
in `tf_enum_subtests`.  The problem is that the latter does nothing to
inform the layers higher up the stack that a test has been skipped, which
seems to break the principle described in previous sections.

Don't confuse these mechanisms, though.  Each is supposed to be used
for a distinct purpose.  Compare: by suppressing enumeration in
`tf_enum_subtests` you are saying that you actually **did not even want**
to run the test in the first place.  By using `TF_ES_BAILOUT`, you are
saying that you **wanted** to run the test but could not.

A few common cases, if that helps:

 *  If during the test you find out that for some reason it can't be
    carried out (e.g. an external resource is not available, or
    something outside the SUT is broken), use `TF_ES_BAILOUT`.

        tf_enum_subtests() {
            echo test1
            echo test2
            echo test3
        }

        tf_do_subtest() {
            case $1 in
                test1) do_stuff  ;;
                test2) do_other_stuff ;;
                test3) curl -s http://www.example.com/ >file \
                        || return $TF_ES_BAILOUT
                       do_stuff_with file ;;
            esac
        }

 *  If you want to filter out some sub-tests for some platforms, e.g. a
    test for only 64-bit architectures, or a test only for Mac OS (IOW,
    you can safely say that running this sub-test would be totally
    pointless on this box), use `tf_enum_subtests`--just omit this test
    from enumeration.

        tf_enum_subtests() {
            echo test1
            echo test2
            if this_is_macos_x; then
                echo test3
            fi
        }

 *  If you want to disable (comment out) a test that you might not have
    implemented yet or that is broken (and for some reason you still want
    it to haunt the test code), or if something else outside the SUT is
    broken and prevents you from running the test, use `tf_enum_subtests`
    and properly comment the reasons in the code.

        tf_enum_subtests() {
            echo test1
            echo test2
        #   echo test3      #FIXME: implement after bz1234
        }

 *  If in doubt, use `TF_ES_BAILOUT`.


### On exit statuses: three and above ###

The difference between *error*, *panic* and higher values is subtle but
important.  Follow me as I try to explain:

 1. If the script has changed something on the system outside the working
    directory, it is obviously expected to revert that change.

 2. Now if an error occurs, but the code responsible for cleaning up is
    safely run, you can say there was *error but we have recovered*.

 3. But if the change can't be reverted safely, we know that we have
    broken something and later code may lead to weird results (including
    masking bugs(!)), so it's time to *panic* (in the code, not in real
    life ;))

 4. And then there are corner cases like a bug in the script, OOM kill
    or timeout, where the status will be different and not really
    controlled by the script.  Such cases will have to be treated the same
    way as the "panic" case, but...

 5. the use of *panic* adds a hint that the status has been set
    consciously by the script, albeit exiting "in a hurry"--without
    proper cleanup.

Unfortunately there will be cases like the above but with an exit status
lower than four.  Examples are a Bash script syntax error, which returns
2, or an uncaught Python exception, which returns 1.  Yes, in such cases
the information conveyed by the exit status is wrong and you should do
everything to avoid it.
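
One way to reduce this risk, sketched here under the assumption that
the helper in question only prepares things for the test and changes
nothing that would need cleaning up, is to never pass a helper's raw
exit status on.  The wrapper name `run_helper` and its message are
hypothetical.

```bash
# Hypothetical wrapper: run a helper command and report any non-zero
# status as 3 (Error), so that e.g. an uncaught Python exception
# (status 1) is not mistaken for a test failure.
run_helper() {
    "$@" && return 0
    echo "helper failed: $*" >&2
    return 3
}
```

Whether Error is the right translation depends on what the helper does;
if it may leave the environment in a broken state, Panic would be the
more honest answer.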

Possibilities like "test has passed but then something blew up" exist,
but conveying this information is the responsibility of the test output.

The following table can be used as a cheat sheet:

    .---------------------------------------------------------------.
    | e |    state of         |                                     |
    | s |---------------------| script says                         |
    |   | SUT   | environment |                                     |
    |---|-------|-------------|-------------------------------------|
    | 0 | OK    | safe        | test passed, everything worked fine |
    | 1 | buggy | safe        | test failed, everything worked fine |
    | 2 | ???   | safe        | I decided not to run the test       |
    | 3 | ???   | safe        | Something blew up but I managed to  |
    |   |       |             | clean up (I promise!)               |
    | 4 | ???   | broken      | Something blew up and I rushed out  |
    |   |       |             | in panic                            |
    | * | ???   | broken      | ...nothing (is dead)                |
    '---------------------------------------------------------------'

As you can see, following this semantic allows us to see both the state
of the system under test (SUT) *and* the environment.

The following table illustrates how different statuses map to different
scenarios with regard to the test result as well as the state of the
environment:

    .--------------------------------------------------.
    | environment |  test result   |  test result      |
    |             | pass fail unkn | pass fail unkn    |
    |-------------|----------------|-------------------|
    | clean(ed)   |  0    1    3   |  OK  FAIL ERROR   |
    | untouched   |  ~    ~    2   |  ~    ~   BAILOUT |
    | mess        |  ~    ~    4   |  ~    ~   PANIC   |
    | ?! (trap)   |  ~    ~    5   |  ~    ~   ~       |
    | ?! (sig 9)  |  ~    ~    137 |  ~    ~   ~       |
    | ?! (aliens) |  ~    ~    ?   |  ~    ~   ~       |
    '-------------|----------------|-------------------|
                  |  exit status   |  human-readable   |
                  |                |  name (TF_ES_*)   |
                  '------------------------------------'
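
When post-processing results outside TFKit, the mapping from the table
can be written down as a simple lookup.  This is a sketch only: the
function name `status_name` and the `UNKNOWN(...)` label for values
above four are assumptions made for the example.

```bash
# Hypothetical helper: translate an exit status into the human-readable
# name used in the table above.
status_name() {
    case $1 in
        0) echo OK ;;
        1) echo FAIL ;;
        2) echo BAILOUT ;;
        3) echo ERROR ;;
        4) echo PANIC ;;
        *) echo "UNKNOWN($1)" ;;    # anything else: not set by the script
    esac
}
```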