GPT vs MBR for swap devices

Jukka Ukkonen jau789 at gmail.com
Tue Jun 19 04:06:23 UTC 2018


Are you sure it is not /usr/obj activity you are seeing when
there are large write delays?
On systems using traditional spinning disks for everything else,
it really makes sense to put /usr/obj on its own SSD, making sure
the SSD does not share an I/O channel with any other device.
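For example, a minimal sketch (ada1p1 is an assumed device name for
the SSD's filesystem, not taken from this thread):

  # /etc/fstab entry putting build output on a dedicated SSD;
  # noatime avoids extra metadata writes on every access
  /dev/ada1p1   /usr/obj   ufs   rw,noatime   2   2

That keeps /usr/obj traffic off the channel serving / and /usr.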

--jau


> On 19 Jun 2018, at 6.42, bob prohaska <fbsd at www.zefox.net> wrote:
> 
>> On Mon, Jun 18, 2018 at 06:31:40PM -0700, Mark Millard wrote:
>>> On 2018-Jun-18, at 5:55 PM, bob prohaska <fbsd at www.zefox.net> wrote:
>>> 
>>>> On Mon, Jun 18, 2018 at 04:42:21PM -0700, Mark Millard wrote:
>>>> 
>>>> 
>>>>> On 2018-Jun-18, at 4:04 PM, bob prohaska <fbsd at www.zefox.net> wrote:
>>>>> 
>>>>>> On Sat, Jun 16, 2018 at 04:03:06PM -0700, Mark Millard wrote:
>>>>>> 
>>>>>> Since the "multiple swap partitions across multiple
>>>>>> devices" context (my description) is what has problems,
>>>>>> it would be interesting to see swapinfo information
>>>>>> from around the time frame of the failures: how much is
>>>>>> used vs. available on each swap partition? Is only one
>>>>>> being (significantly) used? The small one (1 GiByte)?
>>>>>> 
>>>>> There are some preliminary observations at
>>>>> 
>>>>> http://www.zefox.net/~fbsd/rpi3/swaptests/newtests/1gbusbflash_1gbsdflash_swapinfo/1gbusbflash_1gbsdflash_swapinfo.log
>>>>> 
>>>>> If you search for 09:44: (the time of the OOM kills) it looks like
>>>>> both swap partitions are equally used, but only 8% full.
>>>>> 
>>>>> At this point I'm wondering if the gstat interval (presently 10 seconds)
>>>>> might well be shortened and the ten second sleep eliminated. On the runs
>>>>> that succeed, swap usage changes little in twenty seconds, but the failures
>>>>> seem to culminate rather briskly.
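>>>>> A minimal sketch of such a tightened logging loop (the actual
>>>>> script is not shown in this thread, so the interval and log file
>>>>> name here are assumptions):
>>>>>
>>>>> #!/bin/sh
>>>>> # Record a timestamp, swap usage, and one batch of per-device
>>>>> # I/O statistics every 2 seconds, with no extra sleep in between.
>>>>> while : ; do
>>>>>     date
>>>>>     swapinfo
>>>>>     gstat -b -I 2s    # -b: collect one interval, print, and exit
>>>>> done >> swaptest.log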
>>>> 
>>>> One thing I find interesting somewhat before the OOM activity is
>>>> the 12355 ms/w and 12318 ms/w on da0 and da0d, which go along
>>>> with L(q) values of 46 and 33 and large %busy figures on the same
>>>> lines -- and 0 w/s on every line:
>>>> 
>>>> Mon Jun 18 09:42:05 PDT 2018
>>>> Device          1K-blocks     Used    Avail Capacity
>>>> /dev/da0b         1048576     3412  1045164     0%
>>>> /dev/mmcsd0s3b    1048576     3508  1045068     0%
>>>> Total             2097152     6920  2090232     0%
>>>> dT: 10.043s  w: 10.000s
>>>> L(q)  ops/s    r/s   kBps   ms/r    w/s   kBps   ms/w    d/s   kBps   ms/d   %busy Name
>>>>   0      0      0      0    0.0      0      9   10.8      0      0    0.0    0.1  mmcsd0
>>>>  46      0      0      0    0.0      0     16  12355      0      0    0.0   85.9  da0
>>>>   0      0      0      0    0.0      0      9   10.8      0      0    0.0    0.1  mmcsd0s3
>>>>   0      0      0      0    0.0      0      9   10.8      0      0    0.0    0.1  mmcsd0s3a
>>>>  33      0      0      0    0.0      0     22  12318      0      0    0.0  114.1  da0d
>>>> Mon Jun 18 09:42:25 PDT 2018
>>>> Device          1K-blocks     Used    Avail Capacity
>>>> /dev/da0b         1048576     3412  1045164     0%
>>>> /dev/mmcsd0s3b    1048576     3508  1045068     0%
>>>> Total             2097152     6920  2090232     0%
>>>> 
>>>> 
>>>> The kBps figures for the writes are not very big above.
>>>> 
>>> 
>>> If it takes 12 seconds to write, I can understand the swapper getting impatient....
>>> However, the delay is on /usr, not swap.
>>> 
>>> In the subsequent 1 GB USB flash-alone test case at
>>> http://www.zefox.net/~fbsd/rpi3/swaptests/newtests/1gbusbflash_swapinfo/1gbusbflash_swapinfo.log
>>> the worst case seems to be at time 13:45:00
>>> 
>>> dT: 13.298s  w: 10.000s
>>> L(q)  ops/s    r/s   kBps   ms/r    w/s   kBps   ms/w    d/s   kBps   ms/d   %busy Name
>>>   0      0      0      0    0.0      0      5    5.5      0      0    0.0    0.1  mmcsd0
>>>   9     84      0      0    0.0     84   1237   59.6      0      0    0.0   94.1  da0
>>>   0      0      0      0    0.0      0      5    5.5      0      0    0.0    0.1  mmcsd0s3
>>>   0      0      0      0    0.0      0      5    5.6      0      0    0.0    0.1  mmcsd0s3a
>>>   5     80      0      0    0.0     80   1235   47.2      0      0    0.0   94.1  da0b
>>>   4      0      0      0    0.0      0      1   88.1      0      0    0.0    0.7  da0d
>>> Mon Jun 18 13:45:00 PDT 2018
>>> Device          1K-blocks     Used    Avail Capacity
>>> /dev/da0b         1048576    22872  1025704     2%
>>> 
>>> 1.2 MB/s writing to swap seems not too shabby, hardly reason to kill a process.
>> 
>> That figure is kBps (throughput), not ms/w.
>> 
>> I see a ms/w (and ms/r) that is fairly large (but notably
>> smaller than the ms/w of over 12000):
>> 
>> Mon Jun 18 13:12:58 PDT 2018
>> Device          1K-blocks     Used    Avail Capacity
>> /dev/da0b         1048576        0  1048576     0%
>> dT: 10.400s  w: 10.000s
>> L(q)  ops/s    r/s   kBps   ms/r    w/s   kBps   ms/w    d/s   kBps   ms/d   %busy Name
>>    0      4      0      0    0.0      4     66    3.4      0      0    0.0    1.3  mmcsd0
>>    8     18      1     32   1991     17    938   2529      0      0    0.0   88.1  da0
>>    0      4      0      0    0.0      4     63    3.5      0      0    0.0    1.3  mmcsd0s3
>>    0      4      0      0    0.0      4     63    3.5      0      0    0.0    1.3  mmcsd0s3a
>>    6     11      1     32   1991     10    938   3207      0      0    0.0   94.7  da0d
>> Mon Jun 18 13:13:19 PDT 2018
>> Device          1K-blocks     Used    Avail Capacity
>> /dev/da0b         1048576        0  1048576     0%
>> 
>> 
> Yes, but again, it's on /usr, not swap. One could argue that there are other
> write delays, not seen here, that do affect swap. To forestall that objection,
> I'll get rid of the ten-second sleep in the script when the present test run
> finishes.
> 
> 
>> Going in a different direction: I believe you have reported
>> needing more than 1 GiByte of swap space, so the 1048576
>> "1K-blocks" would not be expected to be sufficient. The specific
>> failing point may well be odd, but if I understand right, the
>> build would not be expected to finish without an OOM in this
>> context.
>> 
> Yes, the actual swap requirement seems to be slightly over 1.4 GB
> at the peak, based on other tests. I fully expected a failure, but
> at a much higher swap utilization.
> 
> 
>>> Thus far I'm baffled. Any suggestions?
>> 
>> Can you get a failure without involving da0, the drive that is
>> sometimes showing these huge ms/w (and ms/r) figures? (This question
>> presumes having sufficient swap space, so, say, 1.5 GiByte or more
>> total.)
>> 
> If you mean not using da0 at all, no; it holds /usr. If you mean not
> swapping to da0, yes, that's been done: 3 GB of swap on microSD works.
> Which suggests an experiment: use 1 GB of SD swap and 1.3 GB of
> mechanical USB swap. That's easy to try.
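> A sketch of that configuration (da1b is an assumed name for the
> mechanical drive's swap partition; the other names match the logs
> above):
> 
>   swapoff /dev/da0b   # drop the 1 GB swap on the USB flash drive
>   swapon /dev/da1b    # add ~1.3 GB swap on the mechanical USB drive
>   swapinfo            # confirm mmcsd0s3b and da1b are both active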
> 
>> My original "additional" suggestion was to have each partition be
>> sufficiently sized while keeping the total small enough not to
>> trigger the notice about too large a swap space. I still want to
>> see what that does as a variation on a failing context.
> 
> I'm afraid you've lost me here. With two partitions, one USB and
> the other SD, of 1 GB each, OOM kills happen at 8% utilization,
> spread evenly across both. Does the size of a partition affect
> its speed? Capacity does not seem to be the problem.
> 
>> It would seem to be a good idea to avoid da0 and its sometimes
>> large ms/w and ms/r figures.
>> 
> 
> I think the next experiment will be to use 1 GB of SD swap and
> 1.3 GB of mechanical USB swap. We know the SD swap is fast enough,
> and we know the USB mechanical swap is fast enough. If that
> combination works, maybe the trouble is congestion on da0. If the combo
> fails as before, I'll be tempted to think it's USB or the swapper.
> 
> Thanks for reading!
> 
> 
> bob prohaska

