update kselftest, use Linaro test-definitions #445
e9412c2 to
73f1f54
|
FYI: still using |
73f1f54 to
1548086
|
Update includes:
Successful LAVA job: https://lava.baylibre.com/scheduler/job/1757#bottom Removing |
|
Looks good in staging:
However, while the LAVA job runs successfully, I don't see any kselftest results in the staging frontend: https://staging.kernelci.org/test/ @gctucker any idea what might be going on here? |
|
Looking at the backend logs, it seems to be some bug in the log line number translation: I guess the |
|
FYI @mgalka |
1548086 to
aaf3b56
|
The backend exception has been fixed: kernelci/kernelci-backend#251 |
gctucker
left a comment
What about the other template variants though (barebox, grub)? Even if they can't all be tested right now, I think they should be updated to also include kselftest/kselftest.jinja2. And the ipxe one should probably be dropped altogether as nothing is using that any more.
|
|
||
| - repository: https://github.com/Linaro/test-definitions.git | ||
| from: git | ||
| revision: master |
One issue with using the head of the master branch is that it makes it harder to reproduce results. There don't seem to be any regular release tags in the test-definitions repository, so we should either ask for tags to be created or keep a mirror and add KernelCI tags for each production update.
As discussed, we will need a fork of test-definitions for KernelCI in any case to have a staging branch. We could use that fork with a kernelci.org branch used in production too.
| name: kselftest | ||
| parameters: | ||
| TESTPROG_URL: {{ kselftests_url }} | ||
| SKIPFILE: skipfile-lkft.yaml |
Some skips will be platform-dependent, and there are many more platforms being tested in KernelCI than LKFT. So we'll need to either have a common file for everyone, or skipfile-kernelci.yaml added to the repo, or have a separate file (probably in a fork).
Also worth mentioning that KernelCI tracks and reports regressions only, not test cases that have never passed. So we may be OK without maintaining a full list of tests that won't be fixed, only the broken ones that really don't work or cause some bad side-effects and prevent other tests from running.
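As a rough sketch of the "regressions only" idea (this is not the KernelCI backend's actual implementation, and the `find_regressions` helper and the status strings are assumptions for illustration), a test counts as a regression only if it passed before and fails now:

```python
def find_regressions(previous, current):
    """Return names of tests that passed in `previous` and fail in `current`.

    Both arguments map a test name to a status string ("pass" or "fail").
    Tests that were not passing before are ignored: they are not regressions.
    """
    return sorted(
        name
        for name, status in current.items()
        if status == "fail" and previous.get(name) == "pass"
    )
```

Under this rule a test that has never passed is never reported, which is why a full list of known-broken tests may not be needed.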
| filters: | ||
| - passlist: {defconfig: ['kselftest']} | ||
| params: | ||
| job_timeout: '60' |
This reminds me, we should change the parameter name to make it clear the timeout is a number of minutes...
|
Other than the comments above, this was tested OK on staging: |
|
@khilman Please also drop the "WIP" from the PR title if/when you think it's ready to be merged. |
|
Yay; this looks like great progress! Selftest tarball built, saved, fetched, and run! :) Next seems to be fixing the execution environment? It seems like something is mangling the log output (and likely interfering with the tests): All the weird "$word: _ $truncated_later_words" lines are coming from somewhere external to the kselftest run. Is that Lava getting in the way? e.g. "13: _ TAP", etc. And maybe some portion of the test's stdout/stderr is missing? This lkdtm "BUG" test can't actually fail: something central seems to be missing. |
|
@kees Thanks a lot for checking this out! Yes LAVA adds some noise because it uses the serial console to communicate with the platform. Then there is some logic in KernelCI backend to reduce the noise but maybe it's not tuned correctly. Let's dig out the raw log and what the filters produced to see what's going on. This is actually not a kselftest specific issue, the same logic is used for all LAVA tests. Maybe it's just more visible with kselftest. |
Removed the WIP. I think the issues with the output mangling are not directly related to this PR. Some tweaking of the test-definition itself to be sure that the results are parsed from a log file and not from stdout should probably be explored. |
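As a sketch of what parsing results from a saved log could look like, assuming the log contains plain TAP result lines (`ok N name` / `not ok N name`, optionally followed by `# SKIP`), something along these lines might work. The `parse_tap` helper and the status names are hypothetical, and this is not how the Linaro test-definition currently parses results.

```python
import re

# Matches TAP result lines such as "ok 1 some test name" or
# "not ok 2 some test name # SKIP reason".
TAP_LINE = re.compile(
    r"^\s*(?P<status>ok|not ok)\s+(?P<num>\d+)\s*(?:-\s*)?"
    r"(?P<name>[^#]*?)\s*(?:#\s*(?P<directive>\w+).*)?$"
)


def parse_tap(text):
    """Map each test name found in TAP output to "pass", "fail" or "skip"."""
    results = {}
    for line in text.splitlines():
        match = TAP_LINE.match(line)
        if not match:
            # Not a result line (plan, diagnostics, console noise): ignore it.
            continue
        directive = (match.group("directive") or "").upper()
        if directive == "SKIP":
            status = "skip"
        elif match.group("status") == "ok":
            status = "pass"
        else:
            status = "fail"
        results[match.group("name")] = status
    return results
```

Reading from a file written on the target, rather than from the serial console, would keep unrelated console output from being interleaved with the lines this parser looks at.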
|
I wonder if @danrue might have any feedback on the output mangling issues above. We're using the LKFT LAVA test-definition un-modified. |
|
Should I open a separate bug for the "log mangling"? I can't quite figure out yet how to reproduce the runtime environment the tests are in. (As in, how do I set up Lava locally to try to debug this myself, etc.) |
|
@kees setting up LAVA might be a bit overkill, but what you could try is running the wrapper scripts that the Linaro test-definition runs. Here's the test-definition: https://github.com/Linaro/test-definitions/tree/master/automated/linux/kselftest It does |
|
@kees @khilman I've captured the raw log produced by LAVA and dumped all the "target" messages in a plain text file: The only difference between that and what is stored by KernelCI is some LAVA signals; here's the diff with what the filters did:
--- kselftest-raw-lava-target-log.txt 2020-08-28 09:29:10.661301920 +0100
+++ kselftest-meson-g12b-odroid-n2.txt 2020-08-28 09:14:01.000000000 +0100
@@ -1975,17 +1975,14 @@
+ cat uuid
+ UUID=31179_1.6.2.4.1
+ set +x
-<LAVA_SIGNAL_STARTRUN 0_timesync-off 31179_1.6.2.4.1>
+ systemctl stop systemd-timesyncd
[0;1;31mWarning:[0m The unit file, source configuration file or drop-ins of systemd-timesyncd.service changed on disk. Run 'systemctl daemon-reload' to reload units.
+ set +x
-<LAVA_SIGNAL_ENDRUN 0_timesync-off 31179_1.6.2.4.1>
+ export TESTRUN_ID=1_kselftest
+ cd /lava-31179/0/tests/1_kselftest
+ cat uuid
+ UUID=31179_1.6.2.4.5
+ set +x
-<LAVA_SIGNAL_STARTRUN 1_kselftest 31179_1.6.2.4.5>
+ cd ./automated/linux/kselftest/
+ ./kselftest.sh -t kselftest_armhf.tar.gz -s false -u http://storage.staging.kernelci.org/kernelci/staging.kernelci.org/staging-20200825.2/arm64/defconfig+kselftest/gcc-8/kselftest.tar.gz -L -S skipfile-lkft.yaml -b -g -e -p /opt/kselftests/mainline/
INFO: Generating a skipfile based on /lava-31179/0/tests/1_kselftest/automated/linux/kselftest/skipfile-lkft.yaml
So if the kselftest log output doesn't look right, it's indeed a problem with how the tests are being run in LAVA. So it's not a KernelCI issue per se but rather something with either kselftest or the Linaro test definitions. Let's see if @danrue has some thoughts on that; we can also check what the LKFT logs and results look like. |
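The filtering visible in the diff above, where only the `<LAVA_SIGNAL_...>` lines are dropped, can be sketched in Python as follows. This is an illustration under that assumption and not the KernelCI backend's actual filter code; `strip_lava_signals` is a hypothetical name.

```python
import re

# A LAVA signal occupies a whole line, e.g.
# "<LAVA_SIGNAL_STARTRUN 0_timesync-off 31179_1.6.2.4.1>".
LAVA_SIGNAL = re.compile(r"^<LAVA_SIGNAL_\w+[^>]*>\s*$")


def strip_lava_signals(lines):
    """Return the log lines with LAVA signal lines removed, order preserved."""
    return [line for line in lines if not LAVA_SIGNAL.match(line)]
```

Everything else in the log passes through unchanged, which matches the observation that the stored log differs from the raw one only by those signal lines.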
|
FYI, some LKFT results and sample log: Looks similar to what we get (with apparently an extra bug when calling |
|
I haven't looked at LKFT's kselftest implementation details in a while, but I know that we often have problems getting results from kselftest runs. The diff thing looks like a busybox vs gnu diff type problem. |
The kselftest test-plan has been long out of date and mostly unused. Remove all current users since results have not been looked at, and because the test-plan is changing to require NFSroot, which not all devices support. Devices can be added back as they are validated. Signed-off-by: Kevin Hilman <[email protected]>
Update the kselftest testplan to use the kselftest tarball that is generated during the kernel build process. Also use the Linaro test-definitions repo for the test-plan. Signed-off-by: Kevin Hilman <[email protected]>
Signed-off-by: Kevin Hilman <[email protected]>
aaf3b56 to
fb3bc4b
| @@ -10,6 +10,7 @@ labs: | |||
| plan: | |||
| - baseline | |||
| - baseline-fastboot | |||
I believe it's worth adding baseline-nfs since kselftest uses nfsroot, to check it's working with regular kernel configs:
| - baseline-fastboot | |
| - baseline-nfs | |
| - baseline-fastboot |
I left that out on purpose because it's an extra test that doesn't add any value, other than sanity checking NFS.
Since there's no way to check the status of baseline-nfs before running kselftest, I don't see any extra value here.
Well it can be used to verify manually that nfsroot works as expected, when kselftest jobs fail to run.
Sure, but that's orthogonal to this patch, and should be up to the lab owner to decide. Let's not hold this series up for unrelated issues.
Sure, this is not what's holding this PR. It just seemed like an obvious thing to me when I started enabling kselftests on more devices.
Well this looks like exactly the problem: I assume LAVA doesn't actually parse TAP? Hmmm |
|
Right, so that's an issue with the Linaro test definitions rather than KernelCI which runs them. I've created a fork in kernelci/test-definitions so we can work on things like that, and a first issue to follow up from this PR: kernelci/test-definitions#1 |
Update the kselftest test-plan:
Sharing test-definitions with Linaro/LKFT will allow better collaboration on keeping kselftests running/reliable.
NOTE: this draft PR is based on staging due to dependencies on existing PRs not yet merged into master.