shell dot on steroids https://pagure.io/shellfu


Tests

Running tests is handled by test/runtests:

$ test/runtests [filter]

filter is a regular expression applied to sub-test names; only the matching sub-tests are run.

Tests can be written in any scripting language--they just need to be put into the test/scripts directory, equipped with a proper shebang line and marked as executable.

Naming

The test filename should start with the name of the module being tested, followed by an underscore. If the module name contains dots, they should be replaced with underscores as well.

core_sanity
mod_submod_function
ini_iniread

are valid test names.
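The renaming rule is mechanical, so it can be sketched in a few lines of shell. The helper name test_name_for is hypothetical and not part of the framework:

```bash
# Hypothetical helper, not part of the framework: build a test filename
# from a module name (dots allowed) and a short test name.
test_name_for() {
    local module=$1 name=$2
    # Dots in the module name become underscores.
    printf '%s_%s\n' "$(printf '%s' "$module" | tr '.' '_')" "$name"
}

test_name_for core sanity            # prints: core_sanity
test_name_for mod.submod function    # prints: mod_submod_function
```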

Data

If a test needs data, place it under test/data/NAME, where NAME is the exact filename of the test. The data will then be available in the working directory where the test is started (a temporary directory).

Exit status

We try hard to follow this semantic:

Zero means that the test has been run and passed.
One means that the test has been run but failed.
Two means that the test has bailed out.
Three means that an error was detected during execution.
Four or anything else means that there were other errors that the script was not able to detect.

Notice that the higher the value is, the worse situation it signifies. Thus, if a test is composed of several sub-tests, always exit with the highest value. The same applies for the run_tests routine itself.
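A minimal sketch of the "always exit with the highest value" rule follows. The function run_all_keep_worst and the st_* stand-in sub-tests are hypothetical names used only for illustration:

```bash
# Hypothetical sketch: run every given command and report the highest
# (worst) exit status seen, as the rule above requires.
run_all_keep_worst() {
    local worst=0 es cmd
    for cmd in "$@"; do
        es=0
        "$cmd" || es=$?
        if [ "$es" -gt "$worst" ]; then
            worst=$es
        fi
    done
    return "$worst"
}

# Stand-in sub-tests that only return a status.
st_pass()    { return 0; }
st_fail()    { return 1; }
st_bailout() { return 2; }

overall=0
run_all_keep_worst st_pass st_bailout st_fail || overall=$?
echo "overall status: $overall"    # prints: overall status: 2
```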

Three and above

The subtle but important difference between three, four and "anything else" is this: if the script has changed something on the system outside the working directory, it is obviously expected to revert that change. If an error occurs but the code responsible for cleaning up has run safely, you can use three. If the change can't be reverted safely, use four.
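The choice between three and four can be sketched as follows. All function names here (apply_change, run_checks, revert_change, run_guarded) are hypothetical stand-ins, and the sketch assumes that a failed apply_change leaves nothing behind:

```bash
# Hypothetical stand-ins for what a test does outside its working directory.
apply_change()  { return 0; }
run_checks()    { return 0; }
revert_change() { return 0; }

run_guarded() {
    local es=0
    # Assumption: a failed apply_change leaves nothing behind to revert.
    apply_change || return 3
    run_checks || es=$?
    if ! revert_change; then
        return 4        # change could not be reverted: environment is broken
    fi
    case $es in
        0|1|2) return "$es" ;;  # pass, fail or bailout, and clean-up ran
        *)     return 3 ;;      # some other error, but clean-up ran
    esac
}

status=0
run_guarded || status=$?
echo "status: $status"    # prints: status: 0
```

Note that a failed clean-up wins over everything else, even over a passing result, because the state of the environment is what the caller needs to know first.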

Obviously there may be corner cases, like a bug in the script, an OOM kill or a timeout, where the status will be different and not really controlled by the script. The underlying assumption, though, is that such exit statuses are higher than four. So from the perspective of environment safety, four has the same meaning as anything higher ("something blew up and system is broken"), but the use of four adds a hint that the status has been set consciously by the script, albeit exiting "in a hurry"--without proper clean up.

Unfortunately there will be cases like the above but with an exit status lower than four. Examples are a bash script syntax error, which returns 2, or an uncaught Python exception, which returns 1. Yes, in such cases the information conveyed by the exit status is wrong, and you should do everything you can to avoid them.

Possibilities like "test has passed but then something blew up" exist, but conveying this information is the responsibility of the test output.

The following table can be used as a cheat-sheet:

.---------------------------------------------------------------.
| e |    state of         |                                     |
| s |---------------------| script says                         |
|   | SUT   | environment |                                     |
|---|-------|-------------|-------------------------------------|
| 0 | OK    | safe        | test passed, everything worked fine |
| 1 | buggy | safe        | test failed, everything worked fine |
| 2 | ???   | safe        | I decided not to run the test       |
| 3 | ???   | safe        | Something blew up but I managed to  |
|   |       |             | clean up (I promise!)               |
| 4 | ???   | broken      | Something blew up and I rushed out  |
|   |       |             | in panic                            |
| * | ???   | broken      | ...nothing (is dead)                |
'---------------------------------------------------------------'

As you can see, following this semantic allows us to see both the state of the system under test (SUT) and the environment.
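From the caller's side, the environment column of the table boils down to one comparison. The helper environment_is_safe is a hypothetical name, shown only to illustrate how a wrapper might decide whether to keep going:

```bash
# Hypothetical helper: succeeds if the exit status promises a safe
# environment (0 to 3 in the table above).
environment_is_safe() {
    [ "$1" -ge 0 ] && [ "$1" -le 3 ]
}

for es in 0 1 2 3 4 137; do
    if environment_is_safe "$es"; then
        echo "$es: environment safe, carry on"
    else
        echo "$es: environment may be broken, stop"
    fi
done
```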

Framework

If you write your tests in Bash, you are encouraged to make use of the functions and variables provided by the default test framework in test/include/simple.sh. These provide a simple structure for defining and running tests and oracles. The framework is focused on filter-like interfaces, so definitions where STDIN, a command and the expected STDOUT are all you need will be the easiest to write.

However, you can re-implement any part of the framework so you can always make things a little bit more dynamic.

harness.sh

This part is not intended to be used in tests, but rather contains functions that help govern test discovery, preparation and execution as is described in previous chapters. Feel free to poke around, of course.

simple.sh

This includes functions that you can directly use in your tests.

A typical filter test using all features will look something like:

#!/bin/bash

. include/simple.sh

tf_enum_subtests() {
    echo test1
    echo test2
    something && echo test3
}

tf_do_subtests

and will be accompanied by 6 files in its data storage. These 6 files come in pairs, one pair per sub-test: one file, named just NAME, defines how the test will run and what data will be fed into STDIN; the other, named NAME.oracle, defines exactly what the STDOUT must look like.

For example, test1 could be

# desc: test one
# cmd: grep sausage

apple
sausage
banana
german sausage

and test1.oracle

sausage
german sausage

When the test is executed, each name printed by tf_enum_subtests is taken as the basis for the two filenames. The first file is scanned for some meta-data, of which cmd is the most important. cmd makes up the command that is launched, and the rest of the file--after the first empty line--is then fed into this command's STDIN. The result is saved as NAME.result and compared with NAME.oracle (by diff -u).
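The mechanism can be re-created roughly like this. It is a sketch under assumptions, not the code from simple.sh: the function name sketch_do_subtest is hypothetical, the command is run through sh -c, and the status mapping (0 pass, 1 fail, 3 error) follows the exit status section above:

```bash
# Hypothetical re-creation of the mechanism described above; this is not
# the code from simple.sh.
sketch_do_subtest() {
    local name=$1 cmd es=0
    # Meta-data: take the first "# cmd: ..." line.
    cmd=$(sed -n 's/^# cmd: //p' "$name" | head -n 1)
    [ -n "$cmd" ] || return 3
    # Everything after the first empty line goes to the command's STDIN.
    awk 'body { print } /^$/ { body = 1 }' "$name" \
        | sh -c "$cmd" > "$name.result"
    # diff exits with 0 (same), 1 (different) or 2 (trouble).
    diff -u "$name.oracle" "$name.result" || es=$?
    case $es in
        0|1) return "$es" ;;
        *)   return 3 ;;
    esac
}

# Try it on the test1 example from above, in a throw-away directory.
workdir=$(mktemp -d)
cat > "$workdir/test1" <<'EOF'
# desc: test one
# cmd: grep sausage

apple
sausage
banana
german sausage
EOF
printf 'sausage\ngerman sausage\n' > "$workdir/test1.oracle"

status=0
sketch_do_subtest "$workdir/test1" || status=$?
echo "status: $status"    # prints: status: 0
```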

This makes it easy to define various test cases and inspect them, always seeing the commands cleanly tied together with the expected output. However, if this is limiting, you can tailor the whole process and still use some of the goodies.

The above "magic" is performed by three functions. tf_do_subtests uses tf_enum_subtests to go through the list of sub-tests you want to execute, and passes each name to tf_do_subtest. tf_do_subtest does the file reading, decoding and executing and also accounts for some errors.

tf_do_subtests does the rest: it maintains the rule that its final exit status needs to be the highest of all tf_do_subtest calls, plus adds some output sugar.

The combination of tf_enum_subtests and tf_do_subtests is almost always useful: the former lets you easily and dynamically say what tests you want to run (leaving a lot of space for commenting it), and the latter takes care of the boring tasks like informing what is happening or maintaining the highest exit status.

So if you want to handle your tests yourself, you most likely want to re-implement tf_do_subtest. The interface is the simplest possible: all it takes is a single argument, which is the test name as enumerated by tf_enum_subtests, and of course, it needs to return proper status.

It's up to you what you do with this. You can use a data-driven approach similar to the original version, make it a wrapper around your own tool, create case/esac routers, or even pass arguments in the name.
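A case/esac router might look like the sketch below. The sub-test names and checks are made up for illustration, and it is an assumption that definitions placed after sourcing include/simple.sh override the framework's defaults:

```bash
# In a real test script the framework would be sourced first, so that the
# definitions below override its defaults (assumed):
#   . include/simple.sh

tf_enum_subtests() {
    echo upper
    echo sorted
}

# Re-implemented tf_do_subtest: takes one sub-test name, returns a status.
tf_do_subtest() {
    case $1 in
        upper)  [ "$(echo abc | tr 'a-z' 'A-Z')" = "ABC" ] ;;
        sorted) [ "$(printf 'b\na\n' | sort | head -n 1)" = "a" ] ;;
        *)      return 3 ;;     # unknown name: report an error
    esac
}

# In a real test script, the last line would then be:
#   tf_do_subtests
```

Each branch ends in a test that yields 0 or 1, which already matches the pass and fail statuses, so no extra mapping is needed.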

common.sh

This includes simple functions and variables shared between both mentioned libraries.

First group is designed to help support the exit status semantic:

The functions are tf_exit_pass, tf_exit_fail, tf_exit_bailout, tf_exit_error and tf_exit_panic; each takes any number of parameters, which are printed on STDERR.
The variables are TF_ES_OK, TF_ES_FAIL, TF_ES_BAILOUT, TF_ES_ERROR and TF_ES_PANIC and are supposed to be used with the return builtin, e.g. to return from tf_do_subtest.
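As a sketch of how the variables combine with return: the values assigned below are an assumption derived from the exit status table (the framework provides the real ones), and check_equal is a hypothetical helper:

```bash
# Stand-in values, assumed from the exit status table; in a real test the
# framework provides them, so they are only set here if missing.
: "${TF_ES_OK:=0}" "${TF_ES_FAIL:=1}" "${TF_ES_BAILOUT:=2}"
: "${TF_ES_ERROR:=3}" "${TF_ES_PANIC:=4}"

# Hypothetical helper for use inside a sub-test.
check_equal() {
    local expected=$1 actual=$2
    if [ "$expected" = "$actual" ]; then
        return "$TF_ES_OK"
    fi
    echo "expected '$expected', got '$actual'" >&2
    return "$TF_ES_FAIL"
}

status=0
check_equal ABC "$(echo abc | tr 'a-z' 'A-Z')" || status=$?
echo "status: $status"    # prints: status: 0
```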

The second group is useful to better control output: functions tf_warn, tf_debug and tf_think are used to print stuff on STDERR. The use of tf_warn is obvious, just as that of tf_debug, the latter being muted if TF_DEBUG is set to false (set it to true to turn on debugging).

tf_think is used for progress info, and is muted unless TF_VERBOSE is set to true, which it is by default.

A note on bailout vs. tf_enum_subtests

One more note to clarify the relation between bailout and tf_enum_subtests. As you may have noticed, there are two ways to skip a test: return prematurely with TF_ES_BAILOUT, or suppress enumeration in tf_enum_subtests. The problem is that the latter does nothing to inform the upper layers of the stack that a test has been skipped, which seems to break the principle described in the previous chapters.

Don't confuse these mechanisms, though. Each is supposed to be used for a distinct purpose. Compare: by leaving a test out of tf_enum_subtests you are saying that you did not even want to run the test in the first place. By using TF_ES_BAILOUT, you are saying that you wanted to run the test but could not.

A few common cases if that helps you:

If during the test you find out that for some reason it can't be carried out (e.g. an external resource is not available, or something outside the SUT is broken), use TF_ES_BAILOUT.
If you want to disable the test because of some long-term condition, e.g. a known bug that prevents execution of the test and is not fixed yet, use tf_enum_subtests.
If you want to run some sub-tests only on some platforms, e.g. 64-bit architecture (in other words, you can safely check that a sub-test would be totally pointless if run on this box), use tf_enum_subtests.
If you want to disable (comment out) a test that you have not implemented yet or that is broken (and for some reason you still want it to pollute the test case), use tf_enum_subtests and properly comment the reasons in code.
If in doubt, use TF_ES_BAILOUT.
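The cases above could be combined in one test roughly as follows. Everything specific here is hypothetical: the sub-test names, the is_64bit check, the FIXTURE_FILE variable, and the stand-in value for TF_ES_BAILOUT (assumed from the exit status table):

```bash
: "${TF_ES_BAILOUT:=2}"     # stand-in, assumed from the exit status table

# Hypothetical platform check.
is_64bit() {
    case $(uname -m) in
        x86_64|aarch64|arm64|ppc64*|s390x|riscv64) return 0 ;;
        *) return 1 ;;
    esac
}

tf_enum_subtests() {
    echo basic
    echo needs_fixture
    # Pointless on other platforms, so it is not even enumerated there.
    if is_64bit; then
        echo wide_offsets
    fi
    # Disabled until the (hypothetical) known bug is fixed:
    # echo broken_feature
}

tf_do_subtest() {
    case $1 in
        needs_fixture)
            # We wanted to run this, but an external resource is missing.
            [ -r "${FIXTURE_FILE:-}" ] || return "$TF_ES_BAILOUT"
            grep -q . "$FIXTURE_FILE"
            ;;
        *)
            true    # real checks would go here
            ;;
    esac
}
```

The enumeration decisions leave no trace in the exit status, which is why the reasons belong in comments next to them, while the bailout is visible to whoever runs the test.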

Setup and cleanup

Special files TF_SETUP and TF_CLEANUP (one of them or both) can be added along with data. These must be valid Bash scripts and will be executed before every subtest (TF_SETUP) and after every subtest (TF_CLEANUP) with execfail option enabled.

If the setup phase fails, the test is skipped and the subtest exit status is TF_ES_BAILOUT. If cleanup fails (no matter the result of setup), the subtest aborts and TF_ES_PANIC is returned. Be aware that this also happens if the test ran and reported a useful status, which is lost in the process. To address this issue, make sure you write setup/cleanup procedures with extreme care and test them well.

If either of these files is missing, the respective phase is considered to have succeeded.
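The rules above can be sketched as one wrapper. This is not the harness code: run_with_hooks is a hypothetical name, the hooks are run with plain bash without reproducing any shell options the real harness sets, and the literal statuses 2 and 4 are assumed to match TF_ES_BAILOUT and TF_ES_PANIC:

```bash
# Hypothetical sketch of the setup/cleanup rules, not the harness code.
# $1 is the directory holding the optional hooks, the rest is the sub-test
# command to run.
run_with_hooks() {
    local dir=$1 es
    shift
    if [ -f "$dir/TF_SETUP" ] && ! bash "$dir/TF_SETUP"; then
        es=2                    # setup failed: skip the sub-test, bail out
    else
        es=0                    # a missing TF_SETUP counts as success
        "$@" || es=$?
    fi
    if [ -f "$dir/TF_CLEANUP" ] && ! bash "$dir/TF_CLEANUP"; then
        es=4                    # cleanup failed: panic, earlier status lost
    fi
    return "$es"
}

# A passing sub-test whose cleanup fails still ends up as a panic.
datadir=$(mktemp -d)
echo 'exit 1' > "$datadir/TF_CLEANUP"
status=0
run_with_hooks "$datadir" true || status=$?
echo "status: $status"    # prints: status: 4
```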
