Tracking down em problem
Sven Willenberger
sven at dmv.com
Wed Nov 2 07:36:36 PST 2005
FreeBSD 6.0-RC1 (Wed Oct 26 13:31:21 EDT 2005)
I seem to be losing connections to an em interface during periods of
heavy I/O load. There are several variables here, so I am hoping for
some guidelines to help troubleshoot this.
I have a postgresql server (8.0.4) set up on an i386 system. The data
directory is on its own partition (which is actually a gstripe/gmirror
setup -- see the footnote after my problem description).
I have enabled a replication system from another server. When I started
replication there was a large amount of data that had to be fed to this
server via the em0 interface. During this process, while ssh'ed to the
box, my connection would just hang for a few moments, then it would
recover. However, if I cd to the data directory (stripe/mirror) and
run ls -alrt several times, the connections actually get broken: not
only my ssh connection but also the replication connection from the
master server.
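One way to put timestamps on these stalls would be to probe the box from
a second machine while the disk load is running. The following is only a
minimal sketch of that idea, not something that was run on this server:
the function names probe and watch, and the address in the usage comment,
are made up for illustration.

```python
import socket
import time


def probe(host, port, timeout=2.0):
    """Return True if a TCP connection to (host, port) opens within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, unreachable hosts, and timeouts.
        return False


def watch(host, port, interval=1.0, rounds=60):
    """Probe repeatedly; print and return the times of failed attempts."""
    failures = []
    for _ in range(rounds):
        if not probe(host, port):
            failures.append(time.time())
            print(time.strftime("%H:%M:%S"), "no answer from", host, port)
        time.sleep(interval)
    return failures


# Hypothetical usage: substitute the em0 address of the database server.
# watch("192.0.2.10", 22)
```

The failure times can then be lined up against when the ls -alrt runs
were started on the stripe/mirror partition.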
I have tried to set debug.mpsafenet=0 in /boot/loader.conf to no avail
-- the same issue happens. Preemption is enabled in the kernel, as is
sched_4bsd. I don't really know how to proceed at this point to try and
troubleshoot this issue: as it stands now, it is most definitely a show
stopper for the purposes of this server.
Thanks,
Sven
*footnote: here is the gstripe/gmirror config:
a) the mirrors:
Geom name: pg1
State: COMPLETE
Components: 2
Balance: split
Slice: 8192
Flags: NONE
GenID: 0
SyncID: 1
ID: 1606567834
Providers:
1. Name: mirror/pg1
   Mediasize: 36703949312 (34G)
   Sectorsize: 512
   Mode: r1w1e2
Consumers:
1. Name: da1
   Mediasize: 36703949824 (34G)
   Sectorsize: 512
   Mode: r1w1e1
   State: ACTIVE
   Priority: 0
   Flags: DIRTY
   GenID: 0
   SyncID: 1
   ID: 2976581887
2. Name: da2
   Mediasize: 36703949824 (34G)
   Sectorsize: 512
   Mode: r1w1e1
   State: ACTIVE
   Priority: 1
   Flags: DIRTY
   GenID: 0
   SyncID: 1
   ID: 3738898587

Geom name: pg2
State: COMPLETE
Components: 2
Balance: split
Slice: 8192
Flags: NONE
GenID: 0
SyncID: 1
ID: 2419201320
Providers:
1. Name: mirror/pg2
   Mediasize: 36703949312 (34G)
   Sectorsize: 512
   Mode: r1w1e2
Consumers:
1. Name: da3
   Mediasize: 36703949824 (34G)
   Sectorsize: 512
   Mode: r1w1e1
   State: ACTIVE
   Priority: 0
   Flags: DIRTY
   GenID: 0
   SyncID: 1
   ID: 4053765902
2. Name: da4
   Mediasize: 36703949824 (34G)
   Sectorsize: 512
   Mode: r1w1e1
   State: ACTIVE
   Priority: 1
   Flags: DIRTY
   GenID: 0
   SyncID: 1
   ID: 2784554060

b) the stripes (using the mirrors):
Geom name: pgdata
State: UP
Status: Total=2, Online=2
Type: AUTOMATIC
Stripesize: 65536
ID: 2329725949
Providers:
1. Name: stripe/pgdata
   Mediasize: 73407791104 (68G)
   Sectorsize: 512
   Mode: r1w1e1
Consumers:
1. Name: mirror/pg1
   Mediasize: 36703949312 (34G)
   Sectorsize: 512
   Mode: r1w1e2
   Number: 0
2. Name: mirror/pg2
   Mediasize: 36703949312 (34G)
   Sectorsize: 512
   Mode: r1w1e2
   Number: 1

This is then mounted as:
/dev/stripe/pgdata  /usr/local/pgsql  ufs  rw,noatime  2  2
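
As a sanity check on the layering, the Mediasize values in the listings
can be reproduced with a little arithmetic. This is a rough sketch under
two assumptions that are not stated anywhere in the output itself: that
gmirror keeps its metadata in the last sector of each consumer, and that
an AUTOMATIC gstripe does the same and then rounds each consumer down to
a whole number of stripes. The function names are made up for
illustration.

```python
SECTOR = 512  # Sectorsize reported for every provider and consumer


def mirror_size(consumer_bytes, sector=SECTOR):
    # Assumption: one trailing sector per consumer holds gmirror metadata.
    return consumer_bytes - sector


def stripe_size(consumer_bytes, count, stripesize, sector=SECTOR):
    # Assumption: one trailing sector per consumer holds gstripe metadata,
    # and the remainder is rounded down to a multiple of the stripe size.
    usable = consumer_bytes - sector
    return (usable // stripesize) * stripesize * count


pg = mirror_size(36703949824)      # da1..da4 Mediasize
print(pg)                          # 36703949312, as shown for mirror/pg1
print(stripe_size(pg, 2, 65536))   # 73407791104, as shown for stripe/pgdata
```

Both results agree with the figures reported above, so the sizes at
least are consistent with a stripe built on top of the two mirrors.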