svn commit: r278323 - in head: etc/rc.d usr.sbin/jail

Garrett Cooper yaneurabeya at gmail.com
Thu Feb 12 21:57:36 UTC 2015


On Feb 12, 2015, at 13:12, Garrett Cooper <yaneurabeya at gmail.com> wrote:

> On Feb 9, 2015, at 18:51, James Gritton <jamie at freebsd.org> wrote:
> 
>> On 2015-02-06 22:23, Garrett Cooper wrote:
>>> On Feb 6, 2015, at 18:38, James Gritton <jamie at freebsd.org> wrote:
>>>> On 2015-02-06 19:23, Garrett Cooper wrote:
>>>>> I think you broke the Jenkins test runs, and potentially jail support
>>>>> in some edge cases:
>>>>> https://jenkins.freebsd.org/job/FreeBSD_HEAD-tests2/651/
>>>> Where do I go from here?  The error you refer to certainly seems jail-related, which leads me to guess at a mismatch between the matching rc.d/jail and jail(8) changes (i.e. using the new rc file with the old jail program).  But that's really just a wild guess.  Is there somewhere I can look for more information?  For example, where does Jenkins actually do its thing?
>>>> Sorry for being so stupid in this - Jenkins has only been on the very edge of my awareness until now.
>>> I honestly don’t think it’s Jenkins because Jenkins runs in bhyve. I
>>> think you accidentally broke option handling in the jail configuration
>>> (please see my other reply about added “break;” statements).
>>> ...
>>> You can verify your changes by doing:
>>> % (cd /usr/tests/bin/pkill; sudo kyua test)
>> 
>> After some testing and looking around, I've decided the problem definitely isn't in rc.d where I thought it might be.  I've also decided it's probably not in my patch either.
>> 
>> I've run this kyua test on a 10 system (don't have current handy for such things at the moment), and sometimes I would see a failure and sometimes I wouldn't.  This was whether I was using the new or old jail code.  Later in the day, when the box was less loaded, it seemed to always pass.  Looking at the pkill-j_test script, I see jails being created with sleep commands both inside and outside the jail around its creation.  I'm guessing this script is very sensitive to timing issues that could be caused by (among other things) system load.  The jail commands in this script were also very simple, with the only parameters used being: path, name, ip4.addr, and command.  This isn't some kind of esoteric exercising of the jail(8) options, and I would expect that if it works at one time it would work at another.  I've "hand-run" these particular jail commands and couldn't get them to fail (and the actual content of the jail(8) changes was tested already).
>> 
>> I looked at the freebsd-current (I think) list where the Jenkins errors are posted, and it's true it started failing the pkill-j test at the time I made my change. But it's also true that it had failed that test once the day before my change, and then started passing it again.  This particular test just seems to be fragile.
>> 
>> So I don't have anywhere else to go with this.  I'm going to assume jail(8) isn't the problem here.
> 
> The tests are racy and make some interesting assumptions. It appears that WITNESS plays a part in it, and I bet VIMAGE (something that I don’t have in my kernel config) plays a part in it too. I say this because I just ran into the issue when running the tests in a tight loop on my VMware workstation 7 instance with code from r278636.
> 
> That doesn’t surprise me: before r272305 it was failing consistently on head, so what Craig did in that commit helped, but it didn’t fully fix the raciness of the tests.
> 
> I’m going to recompile my system with VIMAGE and see if that impacts performance of the tests, and if so, I’ll adjust the sleep between setting up the jailed instances, and waiting for them to be fully formed.
> 
> Thanks!
> 
> $ while : ; do sudo prove -rv pgrep-j_test.sh || break; done
> pgrep-j_test.sh .. 
> 1..3
> usage: pgrep [-LSfilnoqvx] [-d delim] [-F pidfile] [-G gid] [-M core] [-N system]
>             [-P ppid] [-U uid] [-c class] [-g pgrp] [-j jid]
>             [-s sid] [-t tty] [-u euid] pattern ...
> not ok 1 - pgrep -j <jid> # pgrep output: '', pidfile output: '74275 74278'
> ok 2 - pgrep -j any
> ok 3 - pgrep -j none
> Failed 1/3 subtests 
> 
> Test Summary Report
> -------------------
> pgrep-j_test.sh (Wstat: 0 Tests: 3 Failed: 1)
>  Failed test:  1
> Files=1, Tests=3,  5 wallclock secs ( 0.04 usr  0.02 sys +  0.02 cusr  0.55 csys =  0.63 CPU)
> Result: FAIL

This Jenkins run is interesting: https://jenkins.freebsd.org/job/FreeBSD_HEAD-tests2/686/testReport/junit/bin.pkill/pgrep-j_test/main/ . The first run passed, but the second one didn’t (more output than expected). This particular error shouldn’t occur after r278636, but it does confirm that the test is racy in other ways as well.
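
For what it's worth, one way to make this kind of test less timing-sensitive is to poll for a condition with a bounded number of retries instead of relying on a fixed sleep. The following is only a rough sketch of that idea, not what the pkill tests currently do: wait_for and WAIT_INTERVAL are hypothetical names made up for illustration.

```shell
#!/bin/sh
# wait_for: hypothetical helper, not part of the existing pkill tests.
# Usage: wait_for <attempts> <command> [args...]
# Runs the command repeatedly until it succeeds or the attempts run out.
# WAIT_INTERVAL (assumed name) is the pause between attempts, in seconds.
wait_for()
{
	attempts=$1
	shift
	while [ "$attempts" -gt 0 ]; do
		# Stop as soon as the condition holds.
		if "$@"; then
			return 0
		fi
		attempts=$((attempts - 1))
		sleep "${WAIT_INTERVAL:-1}"
	done
	# The condition never became true within the allowed attempts.
	return 1
}
```

A test could then do something like `wait_for 10 test -s "$pidfile"` before calling pgrep, where `$pidfile` stands in for whatever file the test expects the jailed process to write. That only bounds the wait; it would not help if the real problem is elsewhere.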