[Bug 228174] [dump] dump(8) can read garbage and loop forever
bugzilla-noreply at freebsd.org
bugzilla-noreply at freebsd.org
Fri May 11 22:21:22 UTC 2018
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=228174
Bug ID: 228174
Summary: [dump] dump(8) can read garbage and loop forever
Product: Base System
Version: 11.1-STABLE
Hardware: Any
OS: Any
Status: New
Severity: Affects Some People
Priority: ---
Component: bin
Assignee: stable at FreeBSD.org
Reporter: eugen at freebsd.org
CC: fs at FreeBSD.org, longwitz at incore.de,
mckusick at FreeBSD.org
Hi!
I have several FreeBSD boxes with UFS2 and soft-updates enabled (no SU+J)
mounted as (ufs, local, soft-updates). They periodically run dump(8) to make
zero-level backups once a week and first level incremental backup on other days
of week.
The command line is like: dump -$level -C 32 -h 0 -au -f - /home
That is, not snapshots used and cache size is big.
Sometimes it finished just fine. Sometimes it its four children start to eat
CPU at max and loop forever not making progress and never finishing until
killed manually.
I've rebuilt /sbin/dump with debugging options -O0 -g and installed unstripped
binary and when the problem reproduced, I ran sysctl
kern.corefile=/var/tmp/%N.%P.core && killall -QUIT dump. So, I have five core
files with debug info.
I've digged it a bit and I think I found the problem. Here is what do I get for
FreeBSD 11.1-STABLE/amd64 r332356 and file system having 41943040 sectors, 512
bytes per sector and standard bsize=32768, fsize=4096:
Core was generated by `dump: /dev/mirror/gm0s1g: pass 4: 60.53% done, finished
in 0:03 at Fri May 11 03'.
Program terminated with signal 3, Quit.
Reading symbols from /lib/libc.so.7...done.
Loaded symbols for /lib/libc.so.7
Reading symbols from /libexec/ld-elf.so.1...done.
Loaded symbols for /libexec/ld-elf.so.1
#0 0x000000080098bfea in _read () from /lib/libc.so.7
(gdb) bt
#0 0x000000080098bfea in _read () from /lib/libc.so.7
#1 0x0000000000408806 in atomic (func=0x4021f4 <read at plt>, fd=5,
buf=0x7ffffffedb44 "", count=4)
at /usr/local/src/sbin/dump/tape.c:877
#2 0x0000000000407d75 in flushtape () at /usr/local/src/sbin/dump/tape.c:245
#3 0x000000000040814b in dumpblock (blkno=3906360519982919218, size=32768)
at /usr/local/src/sbin/dump/tape.c:188
#4 0x000000000040cd38 in ufs2_blksout (dp=0x8011a9fc0, blkp=0x7ffffffedc58,
frags=679,
ino=522029, last=1) at /usr/local/src/sbin/dump/traverse.c:704
#5 0x000000000040cfea in dmpindir (dp=0x8011a9fc0, ino=522029, blk=5832576,
ind_level=0,
size=0x7fffffffdd58) at /usr/local/src/sbin/dump/traverse.c:609
#6 0x000000000040c322 in dumpino (dp=0x8011a9fc0, ino=522029)
at /usr/local/src/sbin/dump/traverse.c:573
#7 0x0000000000404cd1 in main (argc=0, argv=0x7fffffffec60)
at /usr/local/src/sbin/dump/main.c:579
Note insanely large blkno for frame 3.
I think the problem is in the dmpindir() function, frame 5.
Note ino=522029. It corresponds to some large text CSV file this box produces
periodically and "find /home -inode 522029 -ls" shows its pathname and I can
see its contents to the end with "less" command just fine.
However:
(gdb) frame 5
#5 0x000000000040cfea in dmpindir (dp=0x8011a9fc0, ino=522029, blk=5832576,
ind_level=0,
size=0x7fffffffdd58) at /usr/local/src/sbin/dump/traverse.c:609
(gdb) l dmpindir
580 * Read indirect blocks, and pass the data blocks to be dumped.
581 */
582 static void
583 dmpindir(union dinode *dp, ino_t ino, ufs2_daddr_t blk, int ind_level,
584 off_t *size)
585 {
586 union {
587 ufs1_daddr_t ufs1[MAXBSIZE / sizeof(ufs1_daddr_t)];
588 ufs2_daddr_t ufs2[MAXBSIZE / sizeof(ufs2_daddr_t)];
589 } idblk;
590 int i, cnt, last;
591
592 if (blk != 0)
593 bread(fsbtodb(sblock, blk), (char *)&idblk,
594 (int)sblock->fs_bsize);
595 else
596 memset(&idblk, 0, sblock->fs_bsize);
597 if (ind_level <= 0) {
598 if (*size > NINDIR(sblock) * sblock->fs_bsize) {
599 cnt = NINDIR(sblock) * sblock->fs_frag;
(gdb) p blk
$1 = 5832576
(gdb) p idblk.ufs2[0]
$2 = 3329910757767148590
The code bread's garbage from the file system (line 593) and uses it later
without any sanity checks.
--
You are receiving this mail because:
You are the assignee for the bug.
More information about the freebsd-stable
mailing list