deadlock or bad disk ? RELENG_8

Jeremy Chadwick freebsd at jdc.parodius.com
Mon Jul 19 02:34:21 UTC 2010


On Sun, Jul 18, 2010 at 05:42:14PM -0400, Mike Tancsa wrote:
> At 05:14 PM 7/18/2010, Jeremy Chadwick wrote:
> 
> >Where exactly is your swap partition?
> 
> On one of the areca raidsets.
> 
> # swapctl -l
> Device:       1024-blocks     Used:
> /dev/da0s1b    10485760       108

So is da0 actually a RAID volume "behind the scenes" on the Areca
controller?  How many disks are involved in that set?

> >If you Google for "swap_pager: indefinite wait buffer: bufobj" you'll
> >find this is a pretty well-established problem, but the situation varies
> >per person.  A common one is here (read the entire thread):
> >
> >http://www.mail-archive.com/freebsd-questions@freebsd.org/msg192481.html
> >
> >I have no advice as far as how to solve this problem.
> 
> It feels like a disk issue, but SMART values all seem ok

Well, the thread I linked states that the problem has to do with a
controller or disk "taking too long".  I have no idea what the threshold
is.  I suppose it could also indicate that your system is running low
on resources (RAM); I would imagine swap_pager would get called if a
process needed to be offloaded to swap.  So maybe this is a system
tuning issue more than a hardware one.

You should probably set up a series of monitoring scripts that track
things like the interrupt rate on devices, I/O statistics, and some
general memory statistics to determine if processes are being swapped
out excessively.  vmstat and iostat would help here; see the man pages
for the relevant options (for swap statistics, vmstat -s).  There's
also systat with the -vmstat flag.
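
A minimal sketch of what such a monitoring script might look like,
assuming a stock FreeBSD /bin/sh with vmstat, iostat and swapctl in the
path.  The function names swap_counters and take_sample, and the idea of
logging to a single file, are made up for illustration and are not from
any existing tool:

```sh
#!/bin/sh
# Hypothetical helper: read `vmstat -s` output on stdin and print only
# the swap pager counters, one per line, as "count<TAB>description".
swap_counters() {
    awk '/swap pager/ {
        count = $1
        $1 = ""              # drop the count field, keep the description
        sub(/^ +/, "")       # trim the leading space left behind
        printf "%s\t%s\n", count, $0
    }'
}

# Hypothetical helper: append one timestamped sample of the statistics
# discussed above to the log file named in $1.
take_sample() {
    {
        echo "=== $(date -u '+%Y-%m-%dT%H:%M:%SZ') ==="
        vmstat -s | swap_counters   # swap pager pageins/pageouts
        vmstat -i                   # per-device interrupt totals and rates
        iostat -x                   # extended per-device I/O statistics
        swapctl -l                  # current swap usage
    } >> "$1" 2>&1
}
```

Calling take_sample with a log path of your choosing from a loop or
from cron (say, once a minute) would give you something to compare
against the timestamps of the swap_pager messages.  The vmstat -s
counters are cumulative, so it's the difference between successive
samples that matters, not the absolute numbers.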

> CLI> disk smart drv=1
> [...]

Unrelated to the problem, but important to note:

The SMART output from the Areca CLI is hardly useful (bordering on
worthless); it only shows the adjusted/calculated values and not the
actual raw values.  Even if the CLI lets you print this information, I
would still strongly suggest using smartctl.  There's no indication
the Areca CLI has a quirks database for each drive model/type.  I'm also
not sure if the Areca CLI can provide the SMART error log, self-test
log, or the selective self-test log.

>  smartctl -a -d 3ware,1 /dev/twa0

Now I'm confused -- this indicates twa(4) is involved, not arcmsr(4).

Can you please provide a verbose explanation of the configuration of the
disks and controllers in this machine, including device and disk names
and what they're associated with, and whether they're RAIDed in any way?

Thanks.

-- 
| Jeremy Chadwick                                   jdc at parodius.com |
| Parodius Networking                       http://www.parodius.com/ |
| UNIX Systems Administrator                  Mountain View, CA, USA |
| Making life hard for others since 1977.              PGP: 4BD6C0CB |



More information about the freebsd-stable mailing list