It seems that any crash in APITESTS on the vbox testbot causes rosautotest to start over from the beginning. We can see this in the stdio log: http://build.reactos.org/builders/Windows_AMD64_1%20VBox-Test/builds/6323/st... (just search for the word "restarting"). The APITESTS are repeated, and the resulting log confuses testman, which treats it as two submissions for the same revision. It accepts the first one (incomplete due to the crash) and rejects the second one, reporting a duplicate: "ERROR! submit(13111, 518, ...) - We already have a result for this test suite in this test run! "Windows_AMD64_1 VBox-Test""
As we all know, we should expect tests to fail and even crash (this is what they are designed for), hence the error is in rosautotest and testman.
Strangely, rosautotest is affected only on vbox; on kvm an apitest crash does not cause a restart of the test run. Winetests, if I recall correctly, were never affected in this way.
Testman should not reject a submission once an earlier one has been saved and the revision blocked.
I am aware that some will be tempted to comment out the test in question, but this is not a proper solution. Other apitests will crash in the future, bringing the whole problem back. As with regressions, a lack of tests due to such an issue will come back to bite us when least expected and when the tests are most needed. This is why I humbly request that this issue be solved properly, once and for all.
With best regards
----- Original message -----
From: buildbot@reactos.org
To: caemyr@myopera.com
Subject: buildbot failure in ReactOS on Windows_AMD64_1 VBox-Test
Date: Tue, 14 Aug 2012 00:23:24 +0000
The Buildbot has finished a build on builder Windows_AMD64_1 VBox-Test while building ReactOS. Full details are available at: http://build.reactos.org/builders/Windows_AMD64_1%20VBox-Test/builds/6324
Buildbot URL: http://build.reactos.org/
Buildslave for this Build: Windows_AMD64_2
Build Reason: Triggerable(Windows_AMD64_1 VBox-Test Trigger)
Build Source Stamp: 57075
Blamelist: akhaldi
BUILD FAILED: failed Cleanup
sincerely, -The Buildbot
Hi,
On Tuesday, 14 August 2012 at 02:48 +0200, caemyr@myopera.com wrote:
Strangely, rosautotest is affected only on vbox; on kvm an apitest crash does not cause a restart of the test run. Winetests, if I recall correctly, were never affected in this way.
Nope. KVM exhibits the same issue. Well, at least intermittently.
Testman should not reject a submission once an earlier one has been saved and the revision blocked.
It has to. This prevents storing partial or inconsistent runs. Furthermore, I'm against hacking testman because rosautotest is failing. That's nonsense.
With my best regards,
On Tue, Aug 14, 2012, at 08:02 AM, Pierre Schweitzer wrote:
Nope. KVM exhibits the same issue. Well, at least intermittently.
Are you sure? I have never seen KVM fail on test log upload.
It has to. This prevents storing partial or inconsistent runs. Furthermore, I'm against hacking testman because rosautotest is failing. That's nonsense.
But the whole problem with testman now is that it actually is storing partial/incomplete runs: http://www.reactos.org/testman/compare.php?ids=13111 http://www.reactos.org/testman/compare.php?ids=13114
By doing that, it prevents the full test run from being uploaded. Hence it's not just rosautotest; testman is partly to blame as well.
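To make the interaction concrete, here is a toy model in Python of a "first submission wins" store. It is only a sketch under assumptions: the class `ResultStore`, the exception `DuplicateResult`, and keying results by (test run id, suite id) are hypothetical names inferred from the wording of the error message, not testman's actual code.

```python
class DuplicateResult(Exception):
    """Raised when a suite already has a stored result in the given test run."""


class ResultStore:
    """Hypothetical stand-in for the submission logic; not testman's real code."""

    def __init__(self):
        # Maps (test_run_id, suite_id) -> stored log text.
        self._results = {}

    def submit(self, test_run_id, suite_id, log):
        key = (test_run_id, suite_id)
        if key in self._results:
            # The first submission wins, even if it was cut short by a crash.
            raise DuplicateResult(
                "We already have a result for this test suite in this test run!"
            )
        self._results[key] = log

    def stored(self, test_run_id, suite_id):
        return self._results[(test_run_id, suite_id)]
```

With such a rule, a partial result from the crashed pass occupies the slot, and the complete result from the repeated pass is rejected as a duplicate.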
caemyr@myopera.com wrote:
It seems that any crash in APITESTS on vbox testbot is causing rosautotest to redo from start.
Apparently, we're dealing with lost cache contents or even filesystem corruption here. If you look for "Writing initial journal file", you'll see that it appears multiple times in the log while this should only happen once at the very beginning.
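For anyone who wants to check a saved stdio log for this symptom, a minimal Python sketch follows. The marker string is the one quoted above; the function names are made up for illustration, and the log is assumed to be available as plain text.

```python
# Marker quoted above; a healthy run is expected to print it exactly once.
JOURNAL_MARKER = "Writing initial journal file"


def count_journal_initializations(log_text):
    """Return how many times the journal was created from scratch."""
    return log_text.count(JOURNAL_MARKER)


def run_was_restarted(log_text):
    """True if the journal was initialized more than once in the same log."""
    return count_journal_initializations(log_text) > 1
```

A count above one means rosautotest did not find its journal after the reboot and started the test list over.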
If somebody has a few minutes to spend, please take a look at CJournaledTestList.cpp in rosautotest. Maybe we can add a manual flush command here to ensure that the journal is written back to the disk before we start testing.
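The general idea of a manual flush can be sketched as below. This is Python for brevity, not rosautotest's C++ code; `write_journal_durably` and the one-test-per-line journal layout are assumptions made for the example. In Win32 code the corresponding call would be `FlushFileBuffers` on the journal's file handle.

```python
import os


def write_journal_durably(path, test_names):
    """Write the journal, then ask the OS to push it to the storage device."""
    with open(path, "w", encoding="utf-8") as journal:
        for name in test_names:
            journal.write(name + "\n")
        journal.flush()             # drain the userspace buffer
        os.fsync(journal.fileno())  # ask the OS to flush its cache for this file
```

Note that this only covers the guest's own caches. Whether the data survives when the VM is terminated also depends on how the hypervisor handles the disk image.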
Cheers,
Colin
Colin Finck colin@reactos.org wrote:
If somebody has a few minutes to spend, please take a look at CJournaledTestList.cpp in rosautotest.
I've added flags to bypass caches in r57077, but this doesn't fix the problem unfortunately. I'm not sure whether any other (proper) fix is even doable in rosautotest itself. We're terminating the VM right after a crash, so what assumptions can we make about its filesystem contents? So far, the journal has been properly written back to disk for most VMs, but can we blame the VM if this doesn't happen?
This also seems to be a recent issue if I'm not mistaken. Older test runs on VirtualBox don't suffer from this problem when looking at BuildBot.
Cheers,
Colin
This issue comes and goes. From what I have noticed, it is triggered by any crash in APITESTS. Crashes in WINETESTS do not trigger it.
On Tue, Aug 14, 2012, at 11:28 PM, Colin Finck wrote:
I've added flags to bypass caches in r57077, but this doesn't fix the problem unfortunately. I'm not sure whether any other (proper) fix is even doable in rosautotest itself. We're terminating the VM right after a crash, so what assumptions can we make about its filesystem contents? So far, the journal has been properly written back to disk for most VMs, but can we blame the VM if this doesn't happen?
This also seems to be a recent issue if I'm not mistaken. Older test runs on VirtualBox don't suffer from this problem when looking at BuildBot.
Cheers,
Colin
Guys, this is serious: the crash is still present. Without test coverage we risk letting regressions slip in, and the KVM testbot does not cover the ahk app tests.
Hi, I'm really sorry about the inconvenience. When I committed the test case, I didn't expect it to cause such big problems and thought that the testbot would continue gracefully. So I have deactivated the tests until either it no longer crashes in the kernel or the test process is more robust.
Now that we have hit this problem, we should try to fix it before we forget about it again. One possible solution would be for rosautotest to read a text file listing tests that may cause a crash and to run those tests at the end of the test process. Another would be to reboot the VM right after creating the journal. I think either solution could give the VM enough time to write the journal to disk.
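The first suggestion could look roughly like this, sketched in Python rather than in rosautotest's C++. The list format (one test name per line, "#" for comments) and the function names are assumptions for illustration only.

```python
def load_crash_prone(lines):
    """Parse the list: one test name per line; '#' comments and blanks ignored."""
    names = set()
    for line in lines:
        entry = line.split("#", 1)[0].strip()
        if entry:
            names.add(entry)
    return names


def order_tests(all_tests, crash_prone):
    """Stable partition: regular tests first, crash-prone tests last."""
    safe = [t for t in all_tests if t not in crash_prone]
    risky = [t for t in all_tests if t in crash_prone]
    return safe + risky
```

Both groups keep their original relative order, so the journal can still be replayed in a predictable sequence after a reboot.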
ReactOS Development List ros-dev@reactos.org wrote on Thu, August 16th, 2012, 10:24 AM:
Guys, this is serious: the crash is still present. Without test coverage we risk letting regressions slip in, and the KVM testbot does not cover the ahk app tests.