How to report bugs (Re: VERY frustrated with FreeBSD/UFS stability - please help or comment...)

Eric Anderson anderson at freebsd.org
Tue May 22 12:30:19 UTC 2007


On 05/21/07 15:37, Gore Jarold wrote:
> --- Kris Kennaway <kris at obsecurity.org> wrote:
> 
>> On Mon, May 21, 2007 at 12:16:33PM -0700, Gore Jarold wrote:
>>
>>>>> a) am I really the only person in the world that moves
>>>>> around millions of inodes throughout the day ?  Am I
>>>>> the only person in the world that has ever filled up a
>>>>> snapshotted FS (or a quota'd FS, for that matter) ?
> 
> 
> (snip)
> 
> 
>> You are certainly not the only person who operates on millions of
>> inodes, but it is disingenuous to suggest that this is either a
>> "mainstream" or "simple" workload.  Also, I personally know of
>> several people who do this without apparent problem, so that is
>> further evidence that whatever problems you are seeing are something
>> specific to your workload or configuration, or you are just unlucky.
> 
> 
> Ok.  In my defense, I have to say that as a
> non-developer end user, it's hard to watch people
> installing ZFS on FreeBSD, running with journaling,
> newfs'ing raw disks with 7.0-CURRENT, etc., and not
> feel like I am an extremely pedestrian use case.
> 
> I had no idea I was so cutting edge :)
> 
> 
> 
>> The larger issue here is that apparently you have been
>> suffering in silence for many years with your various
>> frustrations, and they have finally exploded into this email.
>> This is really a poor way to approach the goal of getting your
>> problems solved: it is fundamentally a failure of your
>> expectations to think that, without adequately reporting your
>> bugs, they will somehow get fixed.
> 
> 
> I need to clarify and respond to this ... my point was
> that every release since 5.0 has had some new and
> interesting instability in this regard.  Every time a
> new release comes out, it seems to be "fixed", only to
> reveal some other new and interesting instability.
> 
> So, no, I have not silently suffered with _any one_
> particular problem - they never seem to last more than
> one release or two.  It is only now, however, that I
> have come to realize that I am in the same spot
> (overall) today as I was in early 2004.  The details
> are slightly different, but the end result is that my
> rsyncs and cps and rms are too much for FreeBSD, and
> have been for 3 years now.
> 
> So what I am saying is: individual causes of
> instability (seem to) come and go, but I am not any
> better off today than I was with 5.0.  I have just
> realized this, and that is why I make my frustration
> known today.
> 
> 
>> Without these two things there is really very little that a
>> developer can do to try and guess what might possibly be
>> happening on your system.  However, it appears that we might
>> now be making some progress:
>>
>>> ssh user at host rm -rf backup.2
>>> ssh user at host mv backup.1 backup.2
>>> ssh user at host cp -al backup.0 backup.1
>>> rsync /files user at host:/backup.0
>>>
>>> The /files in question range from .2 to 2.2 million
>>> files, all told.  This means that when this script
>>> runs, it first either deletes OR unlinks up to 2
>>> million items.  Then it does a (presumably) zero-cost
>>> move operation.  Then it does a hard-link-creating cp
>>> of the same (up to 2 million) items.
>>
>> Please provide additional details of how the
>> filesystems in question are configured, your kernel
>> configuration, hardware configuration, and the
>> debugging data referred to in 2) above.
> 
> 
> I will collect all of this and submit it the next time
> the system crashes...

For whatever it might be worth, I'm doing a very similar task (using 
rsnapshot), backing up a decent amount of data, with a nightly 
difference of a million or so files, touching ~200 million files 
nightly.  I currently have five 10TB filesystems running, with 20TB 
more coming online today or tomorrow.
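rsnapshot drives the same rm/mv/cp -al/rsync cycle from a config file; a minimal sketch might look like the following (paths, retention counts, and the host/module names here are hypothetical, and note that rsnapshot.conf requires tabs, not spaces, between fields):

```
# rsnapshot.conf sketch -- fields must be tab-separated
config_version	1.2
snapshot_root	/vol1/snapshots/
cmd_cp	/bin/cp
cmd_rsync	/usr/local/bin/rsync
interval	daily	7
interval	weekly	4
backup	user@host:/files/	host/
```

Running `rsnapshot daily` then performs the rotate-and-rsync pass for each configured backup point.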

Here's a df output:

# df -ilk
Filesystem     1024-blocks       Used      Avail Capacity     iused      ifree %iused  Mounted on
/dev/amrd0s3a     20308398    1114658   17569070     6%      23080    2614742    1%   /
devfs                    1          1          0   100%          0          0  100%   /dev
/dev/amrd0s3e     18441132    3794030   13171812    22%         96    2402206    0%   /tmp
/dev/amrd0s3d     20308398    3164982   15518746    17%     250111    2387711    9%   /usr
/dev/ufs/vol1   9926678106 5030793092 4101750766    55%   38875256 1244237702    3%   /vol1
/dev/ufs/vol2   9926678106 3668501950 5464041908    40%   67622249 1215490709    5%   /vol2
/dev/ufs/vol3   9926678106 5937797134 3194746724    65%      97153 1283015805    0%   /vol3
/dev/ufs/vol10  9925732858 8054663156 1077011074    88%   97873355 1185121843    8%   /vol10
/dev/ufs/vol11  9925732858 7288510876 1843163354    80%  126038333 1156956865   10%   /vol11


I have roughly 50-60 hardlinks per file (for about 80% of the files).
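For anyone wanting to check their own link counts, find(1)'s -links test does this per file; a small sketch using a hypothetical demo directory:

```shell
#!/bin/sh
# Demonstrate hard-link counting: every name for a file shares one
# inode and reports the same link count.
mkdir -p /tmp/linkdemo
echo data > /tmp/linkdemo/orig
ln /tmp/linkdemo/orig /tmp/linkdemo/copy1   # hard link, not a copy
ln /tmp/linkdemo/orig /tmp/linkdemo/copy2

# List every file carrying more than 2 links (all three names here,
# since each reports a link count of 3).
find /tmp/linkdemo -type f -links +2 | wc -l
```

Against a real snapshot tree, replacing /tmp/linkdemo with the snapshot root and raising the -links threshold shows how widely the hard-link sharing reaches.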

Obviously fsck is not an option (due to memory/time constraints), so 
I've been using gjournal (thanks to PJD).
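For anyone wanting to try the same setup, provisioning a gjournal-backed UFS volume on FreeBSD looked roughly like this at the time (a sketch from memory with a hypothetical device name, not meant to be run verbatim; consult gjournal(8) and newfs(8) first):

```
# Load the journaling GEOM class and label the provider
gjournal load
gjournal label /dev/da0        # creates /dev/da0.journal

# newfs with the gjournal flag, then mount async -- the journal
# supplies the write-ordering guarantees that sync/soft updates
# would otherwise provide
newfs -J /dev/da0.journal
mount -o async /dev/da0.journal /vol1
```

With the journal in place, recovery after a crash replays the journal instead of walking every inode, which is what makes multi-TB filesystems like these practical without fsck.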

I *do* have one issue that has cropped up on one of these file systems, 
which I just recently found. I'll send a separate email with details.

I don't use snapshots, or background fsck on these at all, nor do I use 
quotas.

So, it can be done, and it can be done with pretty good reliability 
(whatever that might mean to any particular person).


Eric

More information about the freebsd-fs mailing list