Short user guide to integration tests in ICARUS
This page was adapted from the SBND one, and the adaptation is not complete yet.
Note: when the text refers to “now” or “as of today” or “at the time of writing”, you can use the “blame” feature of GIT to find out when that was.
What are integration tests
ICARUS has two levels of tests:
- unit tests are small tests targeting a single feature or module; you run them with
  mrb test -j16
  or equivalent¹
- integration tests exercise a complete chain of processing
Before pushing code that has any remote chance of changing existing results, you should run both. This guide is about the latter.
Once the release schedule becomes reasonably synchronous with that of LArSoft, an automatic trigger will be activated so that every time code is pushed to the icaruscode develop branch, tests are triggered automatically. This is not yet the case at the time of writing.
What an integration test does
An integration test does what it is asked to by its configuration at test/ci/ci_tests.cfg.
Each test may:
- run a LArSoft job
- compare the size of the output (in particular, the length of the data product collections) with a reference result
- compare selected histograms from the output with reference results
If any of the steps fails, the test fails and the failure is reported. For failures in running the job, the solution is for the author to just fix the code. For failures in the comparison with the reference results, the mismatch must be studied to determine whether the change is acceptable or not. If it is acceptable, the reference results need to be updated; if it is not, the code must be fixed.
Introduction to test running
For this introduction we run the tests in the local area (as in the instructions below).
The script test_runner will execute the requested integration tests.
Since it relies on the settings from the current UPS environment, no
particular setup is needed, except for the UPS product containing the
script. So:
setup lar_ci
test_runner develop_test_icaruscode
will run the tests designed to be run during development.
The official documentation is in the Fermilab Redmine wiki of the lar_ci package (see also Further resources at the bottom of this page).
Currently the tests read all the input they need from unprivileged areas (the ICARUS StashCache, accessed via CVMFS at /cvmfs/icarus.osgstorage.org/pnfs/fnal.gov/usr/icarus/persistent/stash). Even so, if you run into file access issues that may have sneaked in, getting a certificate proxy is the suggested first step. Having a valid certificate proxy has no drawback.
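When a test fails on file access, a quick check that the input area is visible from the current node can help separate a missing CVMFS mount from an authentication problem. The following is only a minimal sketch: the helper name check_input_area is made up for this example and is not part of lar_ci.

```sh
# Hypothetical helper (not part of lar_ci): report whether a directory
# exists and can be read by the current user.
check_input_area() {
  if [ -d "$1" ] && [ -r "$1" ]; then
    echo "readable: $1"
  else
    echo "NOT readable: $1"
    return 1
  fi
}

# Path quoted from the text above; '|| true' keeps a script going on failure.
check_input_area /cvmfs/icarus.osgstorage.org/pnfs/fnal.gov/usr/icarus/persistent/stash || true
```

If the area is reported as not readable, the problem is with the mount rather than with the certificate proxy.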
TODO The output of a test in detail
The information in this section depends to some extent on both the version of the tests that SBND provides and the version of lar_ci. The following information was compiled using the latest icaruscode, v09_02_00, and the current lar_ci, v3_7_0.
We use as example the detector propagation test. The list of available tests is:
$ test_runner -l
[...]
The current parallel limit is: 5
suite single_quick_test_icaruscode: (4 tests)
single_detsim_quick_test_icaruscode
single_g4_quick_test_icaruscode
single_gen_quick_test_icaruscode
single_reco_quick_test_icaruscode
suite single_seq_test_icaruscode: (4 tests)
single_detsim_seq_test_icaruscode
single_g4_seq_test_icaruscode
single_gen_seq_test_icaruscode
single_reco_seq_test_icaruscode
suite generate_reference_single_test_icaruscode: (4 tests)
single_detsim_seq_test_icaruscode
single_g4_seq_test_icaruscode
single_gen_seq_test_icaruscode
single_reco_seq_test_icaruscode
suite nucosmics_quick_test_icaruscode: (4 tests)
nucosmics_detsim_quick_test_icaruscode
nucosmics_g4_quick_test_icaruscode
nucosmics_gen_quick_test_icaruscode
nucosmics_reco_quick_test_icaruscode
[...]
We have about 10 tests, and loads of test suites that may share them. We can ask test_runner to run a single test, or a whole suite, or many of the above.
The detector propagation test is ci_g4_regression_quick_test_icaruscode:
$ setup lar_ci
$ mkdir -p run_ci
$ cd run_ci
$ test_runner --statistics ci_g4_regression_quick_test_sbndcode
The current parallel limit is: 5
Test Suite: ci_g4_regression_quick_test_sbndcode
Expanded: ci_g4_regression_quick_test_sbndcode
Statistic: ci_g4_regression_quick_test_sbndcode exitcode 5120
Statistic: ci_g4_regression_quick_test_sbndcode rusage_user_cpu 21.190000
Statistic: ci_g4_regression_quick_test_sbndcode rusage_scaled_user_cpu 118.647048
Statistic: ci_g4_regression_quick_test_sbndcode rusage_system_cpu 5.180000
Statistic: ci_g4_regression_quick_test_sbndcode rusage_scaled_system_cpu 29.003856
Statistic: ci_g4_regression_quick_test_sbndcode rusage_elapsed 213.580000
Statistic: ci_g4_regression_quick_test_sbndcode rusage_%cpu 12.000000
Statistic: ci_g4_regression_quick_test_sbndcode rusage_avgtext 0.000000
Statistic: ci_g4_regression_quick_test_sbndcode rusage_avgdata 0.000000
Statistic: ci_g4_regression_quick_test_sbndcode rusage_maxrss 221176.000000
Statistic: ci_g4_regression_quick_test_sbndcode rusage_inputs 3982312.000000
Statistic: ci_g4_regression_quick_test_sbndcode rusage_outputs 2368.000000
Statistic: ci_g4_regression_quick_test_sbndcode rusage_major_faults 17999.000000
Statistic: ci_g4_regression_quick_test_sbndcode rusage_minor_faults 495733.000000
Statistic: ci_g4_regression_quick_test_sbndcode rusage_swaps 0.000000
Statistic: ci_g4_regression_quick_test_sbndcode valerrs 0
Statistic: ci_g4_regression_quick_test_sbndcode success False
Fcopy_out_results check point None false None None False
0 tests passed (0%), 1 tests failed, 0 tests with warnings, 0 tests skipped, out of 1
Not updating any reference files
It shows a job failure, and the errors.log explains (in its own way) that there is an authentication error… I forgot to get a certificate proxy! I reran after getting one, to get:
$ test_runner --statistics ci_g4_regression_quick_test_sbndcode
The current parallel limit is: 5
Test Suite: ci_g4_regression_quick_test_sbndcode
Expanded: ci_g4_regression_quick_test_sbndcode
Statistic: ci_g4_regression_quick_test_sbndcode exitcode 0
Statistic: ci_g4_regression_quick_test_sbndcode rusage_user_cpu 19.650000
Statistic: ci_g4_regression_quick_test_sbndcode rusage_scaled_user_cpu 110.024280
Statistic: ci_g4_regression_quick_test_sbndcode rusage_system_cpu 2.750000
Statistic: ci_g4_regression_quick_test_sbndcode rusage_scaled_system_cpu 15.397800
Statistic: ci_g4_regression_quick_test_sbndcode rusage_elapsed 25.810000
Statistic: ci_g4_regression_quick_test_sbndcode rusage_%cpu 86.000000
Statistic: ci_g4_regression_quick_test_sbndcode rusage_avgtext 0.000000
Statistic: ci_g4_regression_quick_test_sbndcode rusage_avgdata 0.000000
Statistic: ci_g4_regression_quick_test_sbndcode rusage_maxrss 285528.000000
Statistic: ci_g4_regression_quick_test_sbndcode rusage_inputs 1708616.000000
Statistic: ci_g4_regression_quick_test_sbndcode rusage_outputs 4800.000000
Statistic: ci_g4_regression_quick_test_sbndcode rusage_major_faults 7626.000000
Statistic: ci_g4_regression_quick_test_sbndcode rusage_minor_faults 267986.000000
Statistic: ci_g4_regression_quick_test_sbndcode rusage_swaps 0.000000
Statistic: ci_g4_regression_quick_test_sbndcode valerrs 2
Statistic: ci_g4_regression_quick_test_sbndcode success True
.copy_out_results check point None false None None False
1 tests passed (100%), 0 tests failed, 0 tests with warnings, 0 tests skipped, out of 1
Not updating any reference files
This is a winner. A local directory ci_g4_regression_quick_test_sbndcode was created, with:
file | description |
---|---|
histocmp.html | would contain rendering of comparison plots; empty because this test does not extract plots |
memory.db | art output (from MemoryTracker service) |
cputime.db | art output (from TimeTracker service) |
hist-sbnd_ci_sbndcode_quick_single_g4.root | art output (from TFileService service) |
messages.log | main output of LArSoft job on screen (art output from message facility service, as configured in SBND) |
errors.log | error output stream of LArSoft job (art output from message facility service, as configured in SBND) |
stdout.log | output to screen of (most of) the test_runner script, including also "event dumps" for both new and reference results, and some error messages from LArSoft job |
data_production_stats.log | encoded error information (only on error) |
stderr.log | error output stream of (most of) the test_runner script |
result.log | test validation output |
avgevent_times.log | average CPU times (broken when using TimeTracker) |
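The Statistic lines printed with --statistics in the runs above follow a regular layout (the word Statistic:, the test name, a key and a value), so a single value can be extracted from saved output with a short filter. This is a sketch under that assumption only. The function name get_statistic and the idea of saving the screen output to a file are made up for this example, and the sample lines are copied from the successful run above.

```sh
# Print the value of one statistic key from saved test_runner output.
# Assumes lines of the form: Statistic: <test name> <key> <value>
get_statistic() {
  awk -v key="$1" '$1 == "Statistic:" && $3 == key { print $4 }' "$2"
}

# Demonstration on a few lines copied from the successful run shown above.
sample=$(mktemp)
cat > "$sample" <<'EOF'
Statistic: ci_g4_regression_quick_test_sbndcode exitcode 0
Statistic: ci_g4_regression_quick_test_sbndcode rusage_maxrss 285528.000000
Statistic: ci_g4_regression_quick_test_sbndcode success True
EOF
get_statistic rusage_maxrss "$sample"
get_statistic success "$sample"
rm -f "$sample"
```

Values such as rusage_maxrss are the ones to look at when retuning the expected resource ranges of a test.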
TODO Example of failure from resource usage
This is the output of test_runner after a successful LArSoft job has used more resources (or fewer!) than expected:
The current parallel limit is: 5
Test Suite: ci_g4_regression_quick_test_sbndcode
Expanded: ci_g4_regression_quick_test_sbndcode
Memory usage 285648.000000 not in 200000.000000:210000.000000 range
Statistic: ci_g4_regression_quick_test_sbndcode exitcode 102
[...]
To obtain it, I artificially decreased the limit of (resident size) memory for the test to about 210 MB.
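The message above amounts to a range check on the measured memory usage. The actual check is done inside test_runner and its implementation is not shown here; the following sketch only mimics it, assuming the bounds are inclusive. The function name in_range is made up for this example.

```sh
# Succeed if value ($1) lies within [low ($2), high ($3)].
# Inclusive bounds are an assumption; awk handles the floating-point values.
in_range() {
  awk -v v="$1" -v lo="$2" -v hi="$3" \
    'BEGIN { exit !(v + 0 >= lo + 0 && v + 0 <= hi + 0) }'
}

# Values copied from the failure message above.
if in_range 285648.000000 200000.000000 210000.000000; then
  echo "memory usage within the expected range"
else
  echo "memory usage outside the expected range"
fi
```

A value below the lower bound fails in the same way, which is why using fewer resources than expected is also reported.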
Running the integration test with the Continuous Integration system
There are mainly three ways to run an integration test:
- automatically, by pushing into a develop branch
- locally, testing the code in the local development area
- remotely, testing the code in the GIT remote repositories
Currently tests are run on SLF7 with GCC compiler (Clang compiler is also potentially available).
Automatic testing
Whenever a commit is pushed into a develop branch, a test is automatically triggered. The test starts 15 minutes after the first push to develop, to give the user the time to push into different repositories as needed.
- if the push is in icaruscode, only icaruscode is built and tested
- if the push is (also) in the develop branch of a LArSoft repository, the C.I. system will check out and build all LArSoft, icaruscode and also the code from the other LArSoft-based experiments; this is to ensure that one experiment's change does not disrupt the others
The result of the test can be checked in the C.I. dashboard (on the top row, you can select LArSoft or ICARUS to monitor the respective tests).
In this case, the quick test suite is executed (quick_test_icaruscode, chosen by the ICARUS lar_ci workflow configuration).
Testing of the code in the local working area
Before pushing the code anywhere, integration tests may be executed locally from the MRB area² where the code has just been compiled (see also the introduction to test running above). The commands to do so are:
setup lar_ci
test_runner --verbose develop_test_icaruscode
Among the useful options: --parallel-limit=12 will run at most 12 tests in parallel, instead of one after the other.
TODO Remote testing of published code
The Continuous Integration system can build and test any publicly available branch.
Tests are triggered by a command called trigger from lar_ci. The full documentation of trigger and lar_ci is maintained in Fermilab Redmine.
To ask for the ICARUS integration tests:
setup lar_ci
CI_CERT=/tmp/ci_cert.pem
kx509 -o "$CI_CERT"
trigger --build-delay 0 --cert "$CI_CERT"
This will run the “quick” test, just as if it had been triggered automatically. The --build-delay 0 option tells the system to start as soon as possible (instead of waiting for 15 minutes, which would be pointless since we are not pushing anything any more).
That command runs the icaruscode_wf workflow, which is what we usually want. The supported workflows are also listed in the official lar_ci documentation.
Running on specific branches is also possible (--revisions) and is also documented in the lar_ci wiki.
To use branches from GitHub the syntax is not straightforward. For example:
trigger --build-delay 0 --cert "$CI_CERT" --revisions 'SBNSoftware/icaruscode@feature/myBranch,SBNSoftware/icarus_signal_processing@v09_22_01'
will test the develop branches for everything except icaruscode, which will use the feature/myBranch branch, and icarus_signal_processing, which will use the v09_22_01 tag. Both repositories are taken from GitHub (SBNSoftware/icaruscode and SBNSoftware/icarus_signal_processing).
The reference result files
The reference result files are currently stored in dCache: /pnfs/icarus/persistent/ContinuousIntegration/reference (see the XROOTD_REFERENCEFILEDIR_ICARUSCODE key in the DEFAULT section of test/ci/ci_tests.cfg and the related keys BASEFILERELPATH_ICARUSCODE and XROOTD_BASEFILEDIR_ICARUSCODE).
Updating the reference files
When changes render the reference files obsolete, it’s time to generate new ones. It is possible to do this either with a remote test trigger or locally. Remember that this action will affect the whole collaboration.
Generating the reference files with a remote “trigger”
After having concluded that the reference files need to be updated, a single command (together with all due certifology) will do the trick:
CI_CERT=/tmp/${USER}/ci_cert.pem
mkdir -p "$(dirname "$CI_CERT")"
kx509 -o "$CI_CERT"
trigger --build-delay 0 --cert "$CI_CERT" --workflow Update_ref_files_ICARUSCODE_wf
(in this case, the “standard” grid certificate proxy seems not to be enough; with a current Kerberos ticket, the command kx509 will create the proper certificate).
Update_ref_files_ICARUSCODE_wf is a special workflow used for this purpose only.
At the end the temporary file with the certificate can be deleted:
rm "$CI_CERT"
Generating the reference files from a local working area
If a test (e.g. single_g4_seq_test_icaruscode) requires a new reference file, a few steps need to be taken.
First, notice that the test may be part of a sequence of tests; for example, single_g4_seq_test_icaruscode is part of one starting with single_gen_seq_test_icaruscode and going forth through a “detsim” and a “reco” stage.
A look into the reference file repository (which can be found in pieces in the usual test/ci/ci_tests.cfg file as REFERENCEFILEDIR_ICARUSCODE; at the time of writing, it points to /pnfs/icarus/persistent/stash/ContinuousIntegration/reference) shows a structure with a tag (e.g. standard) and within it test stages. Within each stage directory there are reference files, a group for each test, which for the most part carry timestamps. One file for each group does not carry any timestamp, and that one is the current reference file.
Reference files are generated with:
export build_identifier="-$(date +%Y%m%d%H%M)"
test_runner --expcode=ICARUSCODE --statistics --update test_suite [test_suite2 ...]
The first line defines the suffix for each reference file (test_runner will not accept an empty one). In the example, a timestamp is used. The other line asks for updating the reference files, and in addition requests extended resource usage statistics (good if retuning the expected ranges in test/ci/ci_tests.cfg) and sets the experiment code name tag (mandatory).
Note that the new reference files will be copied all together and all at
the end of the job, and that if one copy fails the script won’t attempt
to copy the remaining new reference files.
At this point, if the new reference files are blessed, they need to be made current. To do so, each one of the new files must be copied to a file without timestamp, e.g.:
cp -vaf /pnfs/icarus/persistent/stash/ContinuousIntegration/reference/standard/reco/single_reco_test_icaruscode_Reference{-202009212301,}.root
will make single_reco_test_icaruscode_Reference-202009212301.root the new reco reference file.
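When several new files need to be made current, the name of the current reference file can be derived from the timestamped one by dropping the suffix. The sketch below assumes the suffix has the -YYYYMMDDhhmm form produced by the build_identifier definition above; the function name current_reference_name is made up for this example, and the copy itself is only printed, not executed.

```sh
# Strip a 12-digit timestamp suffix ("-YYYYMMDDhhmm") before the .root extension.
# Names without such a suffix are returned unchanged.
current_reference_name() {
  printf '%s\n' "$1" | sed -E 's/-[0-9]{12}\.root$/.root/'
}

# Dry run: show the copy that would make the new file current.
new=single_reco_test_icaruscode_Reference-202009212301.root
echo cp -vaf "$new" "$(current_reference_name "$new")"
```

Printing the command first gives a chance to review it, which is prudent since the reference files are shared by the whole collaboration.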
Note: so far it has proven to be a good idea to use the test sequence to generate the reference files. Quick tests are expected to produce the same results, except that they may process fewer events.
TODO Available tests
The available tests can be printed with test_runner --list-tests (-l for short).
Test suites should be documented at the beginning of the test/ci/ci_tests.cfg configuration file.
A summary of the test suites (which will fall sadly out of date with time):
test suite name | description | run time |
---|---|---|
develop_test_sbndcode | tests intended to be run during code development | 4300/1200 kVs |
complete_test_sbndcode | tests intended to be run before final push (take longer than develop_test_sbndcode) | 12000/6200 kVs |
quick_test_sbndcode | includes both single particle and data-like event quick tests | 4300/1200 kVs |
seq_test_sbndcode | includes both single particle and data-like event sequences tests | 7600/6200 kVs |
single_quick_test_sbndcode | single particle (prodsingle_sbnd.fcl) 5-stage chain, each step from reference file | 800/450 kVs |
nucosmics_quick_test_sbndcode | data-like neutrino and background (GENIE+CORSIKA) 5-stage chain, each step from reference file | 3500/1200 kVs |
single_seq_test_sbndcode | data-like single particle (prodsingle_sbnd.fcl) 5-stage chain, in sequence | 1400/1400 kVs |
nucosmics_seq_test_sbndcode | neutrino and background (GENIE+CORSIKA) 5-stage chain, in sequence | 6200/6200 kVs |
gallery_test_sbndcode | runs tests related to the gallery examples in sbndcode | |
generate_reference_sbndcode | reruns all the jobs generating output files that can be used as reference³ | 5800/5800 kVs |
all_tests_sbndcode | reruns all tests (used for maintenance only) | 12000/6200 kVs |
Most quick tests typically run just one or two events.
The run time in the table is the normalised one as reported by the C.I. scripts, and it is very approximate. The first figure is the integrated time, while the second is the ideal run time when all tests are run in parallel (for example, sequential tests can’t be parallelised). The figures were obtained on sbndbuild01.fnal.gov; for reference, 1000 kVs on that machine take about 3 CPU minutes.
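Using the quoted rate of about 3 CPU minutes per 1000 kVs, the figures in the table can be turned into rough CPU minutes. This is plain arithmetic on the numbers above and is valid only for that machine; the function name kvs_to_minutes is made up for this example.

```sh
# Convert normalised run time (kVs) into approximate CPU minutes,
# using the rate quoted above: 1000 kVs is about 3 CPU minutes.
kvs_to_minutes() {
  awk -v kvs="$1" 'BEGIN { printf "%.1f\n", kvs * 3 / 1000 }'
}

kvs_to_minutes 4300   # develop_test_sbndcode, integrated time
kvs_to_minutes 1200   # develop_test_sbndcode, ideal parallel time
```

With these inputs the two results are 12.9 and 3.6 CPU minutes respectively.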
TODO Investigating test failures
TODO Results are different from the reference
The reference files are normally generated with a special trigger, as described above. These special jobs are shown in the dashboard as icarus_ci builds with workflow Update_ref_files_ICARUSCODE, and can be recognised because they have a table header with more columns, including a column gen_ref_files. A completed reference file job replaces the previous reference files, so the current files come from the most recent reference file job.
When comparing the result of a failed test, it may be useful to know precisely the difference in the source code. The exact commit of the code used to generate the reference result can be seen by hovering the mouse over the checkout column of the most recent reference file job. For example, sbnd_ci/32, generating reference files on 2018-03-09 09:02, had:
larana LARSOFT_SUITE_v06_70_01
larcore LARSOFT_SUITE_v06_70_01
larcorealg LARSOFT_SUITE_v06_70_01
larcoreobj LARSOFT_SUITE_v06_70_01
lardata LARSOFT_SUITE_v06_70_01-6-g00d39317
lardataobj LARSOFT_SUITE_v06_70_01
lareventdisplay LARSOFT_SUITE_v06_70_01
larevt LARSOFT_SUITE_v06_70_01
larexamples LARSOFT_SUITE_v06_70_01
larpandora LARSOFT_SUITE_v06_70_01
larpandoracontent LARSOFT_SUITE_v06_70_01
larreco LARSOFT_SUITE_v06_70_01-2-g3db95cef
larsim LARSOFT_SUITE_v06_70_01
larsoft LARSOFT_SUITE_v06_70_01
larsoftobj LARSOFT_SUITE_v06_70_01
larwirecell LARSOFT_SUITE_v06_70_01
sbndcode v06_70_01-3-g28fc0d2
The codes shown there are from git describe, and git usually accepts them where a commit hash or a tag would be required. If the failure was from a remote job, you will get an e-mail with a history of recent commits (for sbndcode only), and comparing to the reference tag for sbndcode you can see which additional commits were used in your test.
In the example above, v06_70_01-3-g28fc0d2 is the reference tag, which is formed by an actual base tag (v06_70_01), the number of commits beyond it (3) and the hash of the head commit (28fc0d2, introduced by the g letter). In this case, the correct commit is 28fc0d2.
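The decomposition just described can be done mechanically. The sketch below assumes the tag-count-ghash layout explained above and treats anything else as a bare tag; the function name parse_describe is made up for this example.

```sh
# Split a "git describe" string such as v06_70_01-3-g28fc0d2 into base tag,
# number of commits beyond the tag, and abbreviated commit hash.
# A string without the -<count>-g<hash> suffix is reported as a bare tag.
parse_describe() {
  describe=$1
  case "$describe" in
    *-[0-9]*-g[0-9a-f]*)
      hash=${describe##*-g}     # text after the last "-g"
      rest=${describe%-g*}      # everything before that "-g"
      commits=${rest##*-}       # number after the last "-"
      tag=${rest%-*}            # what remains is the base tag
      ;;
    *)
      tag=$describe; commits=0; hash=
      ;;
  esac
  printf 'tag=%s commits=%s hash=%s\n' "$tag" "$commits" "$hash"
}

parse_describe v06_70_01-3-g28fc0d2
parse_describe LARSOFT_SUITE_v06_70_01
```

The hash obtained this way can then be given to commands such as git log or git diff to compare against the code under test.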
Further resources
The hub for information on integration tests is the project lar_ci.
The contact for that system is, at the time of writing, Vito Di Benedetto.
Footnotes
[1] If the working area is already built, you can run all the tests with:
cd "$MRB_BUILDDIR"
ctest -j16
You can also go to the subdirectory of $MRB_BUILDDIR that contains the tests you care about, instead, and you'll end up running only the tests under that directory.
[2] In fact, they can be run even without an MRB area, in an environment where icaruscode is already set up.
[3] The test suite generate_reference_sbndcode is used by default by the workflow to update reference files (Update_ref_files_SBNDCODE_wf). To use a different one, trigger should be explicitly provided with a --gen-ref-tests option.
[For questions, contact Gianluca Petrillo.]