RPI3 swap experiments, was Re: GPT vs MBR for swap devices
Mark Millard
marklmi at yahoo.com
Wed Jun 27 06:30:58 UTC 2018
On 2018-Jun-26, at 10:40 PM, bob prohaska <fbsd at www.zefox.net> wrote:
> On Tue, Jun 26, 2018 at 07:09:09PM -0700, Mark Millard wrote:
>>
>>
>> On 2018-Jun-26, at 3:28 PM, bob prohaska <fbsd at www.zefox.net> wrote:
>>
>>> On Tue, Jun 26, 2018 at 01:15:54PM -0700, Mark Millard wrote:
>>>> On 2018-Jun-26, at 8:18 AM, bob prohaska <fbsd at www.zefox.net> wrote:
>>>>
>>>>> On Tue, Jun 26, 2018 at 07:37:59AM -0700, Mark Millard wrote:
>>>>>>
>>>>>>
>>>>>> . . .
>>>>>>
>>>>>> As I remember, Bob P. did reproduce drive errors even without
>>>>>> the problem drive being used for swapping. This too suggests
>>>>>> (A) as separate activity.
>>>>>>
>>>>> Indeed, it is a requirement. If the suspect device is used for swapping
>>>>> OOMA kills prevent the test from progressing to the point of failure.
>>>>>
>>>>
>>>> Looking back at http://www.zefox.net/~fbsd/rpi3/swaptests/
>>>> and information about /dev/da0 drive errors it does not
>>>> appear that a combination with:
>>>>
>>>> A) sufficient swap (> 1.5 GiByte total?) but no use of swap on
>>>> any partition on /dev/da0
>>>> and:
>>>> B) use of /dev/da0 for /usr/ and /var/
>>>> and:
>>>> C) Records from the console showing errors (or notes
>>>> indicating lack of such errors).
>>>>
>>>> exists. So I was remembering incorrectly.
>>>>
>>>> I'm not claiming such a combination is the best direction for
>>>> the next tests, but absent such tests there is no
>>>> compare/contrast to know if /dev/da0 would still get errors
>>>> despite the system having sufficient swap present on other
>>>> drives. Thus, I would not go so far as "is a requirement" on
>>>> the evidence available.
>>>>
>>>
>>> I just didn't bother to record successful runs. I'm logging one now.
>>>
>>>> We do have evidence for the system having insufficient swap
>>>> space: this context seems to have the current status "is
>>>> sufficient but might not be necessary" for /dev/da0
>>>> getting drive errors.
>>>>
>>> Not sure I understand here. Basically there seem to be three cases:
>>> Enough swap not on da0, -j4 buildworld completes.
>>> Any swap on da0, -j4 buildworld is killed by OOMA
>>> Not enough swap not on da0, -j4 buildworld crashes the machine eventually.
> ^^^^^^^^^^
> OK, here's my error. The third case should have been
> "not enough swap on mmcsd0".
>
>
>>>
>>> Are there other combinations I've overlooked? The first two don't seem
>>> worth repeating, at least not often.
>>
>> "buildworld completes with /dev/da0 errors" vs. "buildworld completes
>> without /dev/da0 errors" (for: enough swap not on /dev/da0 with no
>> swap on /dev/da0 ).
>>
>> That is a little simplistic, as there can be multiple retries
>> before FreeBSD gives up. Normal is no retries needed. Going
>> from rare single retries to frequent multiple retries but no
>> giving-up to it giving up sometimes is all abnormal as I
>> understand. But there are degrees of abnormal.
>>
>> And, yes, I have had past examples of significant drive reports
>> during buildworld that let buildworld appear to complete. (Not
>> that I trusted the result or the drive involved after such, at
>> least as the drive was powered/connected at the time.)
>>
>> For "any swap on da0" and "not enough swap not on da0" (with
>> no swap on da0) I'd add to your descriptions: "with /dev/da0
>> errors" (again simplistic).
>
> The only case where I've seen crashes and /dev/da0 errors is with
> insufficient swap on mmcsd0. I've come to ignore OOMA kills as
> too familiar to be interesting.
"crashes and /dev/da0 errors":
A) Any examples of /dev/da0 errors without crashes?
B) Any examples of crashes without /dev/da0 errors?
C) All examples that do either also do the other
(so both always go together)?
(I'm having trouble parsing a specific meaning for
the reference. I did not go back through all the logs
again to identify the combinations recorded.)
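
One way to keep the (A)/(B)/(C) bookkeeping straight is a small
tally over per-run records. The following is only a sketch: the
field names da0_errors and crashed, the category labels, and the
sample records are hypothetical and are not taken from the logs.

```python
from collections import Counter

# Labels mirror the (A)/(B)/(C) questions; the names are illustrative only.
A_ERRORS_ONLY = "A: da0 errors, no crash"
B_CRASH_ONLY = "B: crash, no da0 errors"
BOTH = "both: da0 errors and crash"
NEITHER = "neither"


def classify(da0_errors, crashed):
    """Map one run's two observations onto a category label."""
    if da0_errors and crashed:
        return BOTH
    if da0_errors:
        return A_ERRORS_ONLY
    if crashed:
        return B_CRASH_ONLY
    return NEITHER


def tally(runs):
    """Count runs per category; each run is a dict with two boolean fields."""
    return Counter(classify(r["da0_errors"], r["crashed"]) for r in runs)


def always_together(runs):
    """True when (C) holds: no run shows one symptom without the other."""
    counts = tally(runs)
    return counts[A_ERRORS_ONLY] == 0 and counts[B_CRASH_ONLY] == 0


if __name__ == "__main__":
    # Hypothetical placeholder records, NOT data from the actual test logs.
    example_runs = [
        {"da0_errors": True, "crashed": True},
        {"da0_errors": False, "crashed": False},
    ]
    print(tally(example_runs))
    print("C holds:", always_together(example_runs))
```

A single run landing in the (A) or (B) bucket would be enough to
show that the two symptoms do not always go together.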
For (A), have you tried any examples of:
sufficient swap on mmcsd0 (or other such) with no swap
on da0 (but /usr and /var on /dev/da0)?
If yes, did you check whether there were /dev/da0 errors
logged? What, if any, /dev/da0 errors were logged?
None?
For (B), have you tried any examples of:
insufficient swap on (say) mmcsd0 and no use of the
/dev/da0 drive that has reported errors at all,
/usr/ and /var not on mmcsd0 (or whatever was used
for swap) either? Did some drive end up reporting
errors? Which? Did the system still crash as well?
Have such test-context combinations been tried?
Without any logs to look at for such alternatives, I
cannot compare/contrast them with the other
examples.
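
To see which test-context combinations still lack logs, the
contexts can be enumerated mechanically. This is a rough sketch
under simplifying assumptions: it reduces each context to three
hypothetical fields (swap amount, swap on da0, /usr and /var on
da0) and ignores finer points such as which other drive holds
the swap. The Context type and function names are made up for
illustration.

```python
from itertools import product
from typing import NamedTuple


class Context(NamedTuple):
    """One simplified test context (field names are illustrative)."""
    swap_amount: str       # "sufficient" or "insufficient"
    swap_on_da0: bool      # any swap partition in use on /dev/da0
    usr_var_on_da0: bool   # /usr and /var located on /dev/da0


def all_contexts():
    """Every combination of the three simplified factors."""
    amounts = ("sufficient", "insufficient")
    flags = (False, True)
    return [Context(a, s, u) for a, s, u in product(amounts, flags, flags)]


def untested(logged):
    """Contexts for which no console log has been recorded yet."""
    seen = set(logged)
    return [c for c in all_contexts() if c not in seen]


if __name__ == "__main__":
    # Hypothetical: pretend only one context has a recorded log.
    logged = [Context("sufficient", False, True)]
    for ctx in untested(logged):
        print(ctx)
```

The context asked about for (A) corresponds to
Context("sufficient", False, True) in this simplified scheme.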
>>
>> This goes along with my suggestion to split the /dev/da0
>> error investigation from the investigations of OOMA behavior
>> and crashing-the-machine: avoiding any confounding.
>>
> From what I've seen, OOMA isn't associated with da0 errors and crashes.
> To see the latter, OOMA must be avoided.
===
Mark Millard
marklmi at yahoo.com
( dsl-only.net went
away in early 2018-Mar)